* [RFC PATCH v2 00/14] New TM Model
@ 2018-11-06 12:40 Breno Leitao
  2018-11-06 12:40 ` [RFC PATCH 01/14] powerpc/tm: Reclaim transaction on kernel entry Breno Leitao
                   ` (14 more replies)
  0 siblings, 15 replies; 26+ messages in thread
From: Breno Leitao @ 2018-11-06 12:40 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, gromero, ldufour, cyrilbur

This patchset for the hardware transactional memory (TM) subsystem
aims to avoid spending long periods in TM suspended mode in kernel
space. It essentially changes where the reclaim/recheckpoint are
executed.

The hardware is designed so that once a CPU enters transactional state it
uses a footprint area to track the loads/stores performed in the
transaction, so they can be verified later to decide whether a conflict
happened due to a change made in that state by another thread.  If a
transaction is active in userspace and an exception takes the CPU into
kernel space, the CPU moves the transaction to suspended state but
does not discard the registers (GPR, VEC, VSX, FP) from the footprint
area, although the memory footprint might be discarded.

POWER9 has a known problem [1, 2]: it does not have enough room in the
footprint area for several transactions to be suspended at the same time
on various CPUs, which leads to CPU stalls.

This new model, together with a future 'fake userspace suspended'
implementation, may work around the POWER9 hardware issue.

This patchset aims to reclaim the checkpointed registers as soon as the
kernel is invoked, at the beginning of the exception handlers, thus freeing
room for other CPUs to enter suspended mode for a short period of time,
and avoiding having so many CPUs in suspended state that they stall. The
same mechanism is applied on kernel exit: the recheckpoint (which reloads
the checkpointed registers into the CPU's checkpoint area) is done as late
as possible, in the exception return path.

The way to achieve this goal is to create a macro (TM_KERNEL_ENTRY) that
checks whether userspace was in an active transaction just after getting
into kernel space, and reclaims the transaction if that is the case. All
exception handlers will call this macro as soon as possible.

All exceptions should reclaim (if necessary) at this stage, and only
recheckpoint if the task is tagged TIF_RESTORE_TM (i.e. it was in
transactional state before being interrupted), which is done at
restore_tm_state().
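As a rough illustration, the decision TM_KERNEL_ENTRY makes can be sketched
in portable C (the real implementation is powerpc assembly; the bit
positions below follow arch/powerpc/include/asm/reg.h and should be
treated as assumptions, not part of this series):

```c
#include <assert.h>

/* Assumed MSR bit positions (MSR_PR_LG = 14, MSR_TM_LG = 32,
 * MSR_TS_S_LG = 33, MSR_TS_T_LG = 34), per asm/reg.h. */
#define MSR_PR      (1ULL << 14)            /* interrupted state was userspace */
#define MSR_TM      (1ULL << 32)            /* TM facility enabled */
#define MSR_TS_S    (1ULL << 33)            /* transaction suspended */
#define MSR_TS_T    (1ULL << 34)            /* transaction active */
#define MSR_TS_MASK (MSR_TS_S | MSR_TS_T)

/* Returns 1 if TM_KERNEL_ENTRY would reclaim (mirrors r3 = 1/0). */
static int tm_entry_should_reclaim(unsigned long long msr)
{
	if (!(msr & MSR_PR))        /* came from kernel: skip */
		return 0;
	if (!(msr & MSR_TM))        /* TM disabled: skip */
		return 0;
	if (!(msr & MSR_TS_MASK))   /* no transaction active/suspended: skip */
		return 0;
	return 1;                   /* reclaim the transaction */
}
```

Only when all three checks pass does the macro save the non-volatile GPRs,
reconcile the IRQ state and call tm_reclaim_current().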

Hence, by allowing the CPU to stay in suspended state for only a brief
period, it is possible to create the initial infrastructure that copes with
the TM hardware limitations.

This patchset was tested in different scenarios using different test
suites, such as the kernel selftests, the OpenJDK TM tests, and htm-torture
[3], in the following configurations:

 * POWER8/pseries LE and BE
 * POWER8/powernv LE
 * POWER9/pseries LE 
 * POWER8/powernv LE hosting KVM guests running TM tests

This patchset is based on initial work done by Cyril Bur:
    https://patchwork.ozlabs.org/cover/875341/

V1 patchset URL: https://patchwork.ozlabs.org/cover/969173/

Major changes from v1:
 
 * restore_tm_state() is now called later in the kernel exit path, so no
   IRQ can be replayed while TM is in suspended state. This is mostly
   described in the 'Recheckpoint at exit path' patch.

 * No need to force the TEXASR[FS] bit explicitly. This was required
   because, in a very specific case, the TEXASR SPR was not being restored
   properly while MSR[TM] was set. Fixed in the 'Do not restore TM without
   SPRs' patch.

 * All treclaim/trecheckpoint calls now WARN_ON() if not made on the
   kernel entry or exit path. tm_reclaim() is only called by
   TM_KERNEL_ENTRY and tm_recheckpoint() is only called by
   restore_tm_state(); everything else causes a warning.
 
Regards,
Breno

[1] Documentation/powerpc/transactional_memory.txt
[2] commit 4bb3c7a0208fc13ca70598efd109901a7cd45ae7
[3] https://github.com/leitao/htm_torture/

Breno Leitao (14):
  powerpc/tm: Reclaim transaction on kernel entry
  powerpc/tm: Reclaim on unavailable exception
  powerpc/tm: Recheckpoint when exiting from kernel
  powerpc/tm: Always set TIF_RESTORE_TM on reclaim
  powerpc/tm: Refactor the __switch_to_tm code
  powerpc/tm: Do not recheckpoint at sigreturn
  powerpc/tm: Do not reclaim on ptrace
  powerpc/tm: Recheckpoint at exit path
  powerpc/tm: Warn if state is transactional
  powerpc/tm: Improve TM debug information
  powerpc/tm: Save MSR to PACA before RFID
  powerpc/tm: Restore transactional SPRs
  powerpc/tm: Do not restore TM without SPRs
  selftests/powerpc: Adapt tm-syscall test to no suspend

 arch/powerpc/include/asm/exception-64s.h      |  50 ++++
 arch/powerpc/include/asm/thread_info.h        |   2 +-
 arch/powerpc/kernel/asm-offsets.c             |   4 +
 arch/powerpc/kernel/entry_64.S                |  37 ++-
 arch/powerpc/kernel/exceptions-64s.S          |  15 +-
 arch/powerpc/kernel/process.c                 | 242 ++++++++++--------
 arch/powerpc/kernel/ptrace.c                  |  16 +-
 arch/powerpc/kernel/signal.c                  |   2 +-
 arch/powerpc/kernel/signal_32.c               |  38 +--
 arch/powerpc/kernel/signal_64.c               |  42 ++-
 arch/powerpc/kernel/tm.S                      |  19 +-
 arch/powerpc/kernel/traps.c                   |  22 +-
 .../testing/selftests/powerpc/tm/tm-syscall.c |   6 -
 13 files changed, 293 insertions(+), 202 deletions(-)

-- 
2.19.0



* [RFC PATCH 01/14] powerpc/tm: Reclaim transaction on kernel entry
  2018-11-06 12:40 [RFC PATCH v2 00/14] New TM Model Breno Leitao
@ 2018-11-06 12:40 ` Breno Leitao
  2018-11-15  0:06   ` Michael Neuling
  2018-11-15  0:51   ` Nicholas Piggin
  2018-11-06 12:40 ` [RFC PATCH 02/14] powerpc/tm: Reclaim on unavailable exception Breno Leitao
                   ` (13 subsequent siblings)
  14 siblings, 2 replies; 26+ messages in thread
From: Breno Leitao @ 2018-11-06 12:40 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, gromero, Breno Leitao, ldufour, cyrilbur

This patch creates a macro that will be invoked on every entrance to the
kernel, so that in kernel space the transaction is completely reclaimed
and no longer left suspended.

This macro checks if we are coming from PR (userspace); if not, the
reclaim is skipped. This is useful when irq_replay() is called after a
recheckpoint, when IRQs are re-enabled. In that case, we do not want to
re-reclaim and re-recheckpoint, so if we are not coming from PR, skip it
completely.

This macro also does not touch the TM SPRs; from now on, they are only
saved and restored in the context switch code.

This macro returns 0 or 1 in the r3 register, indicating whether a reclaim
was executed or not.

This patchset is based on initial work done by Cyril:
https://patchwork.ozlabs.org/cover/875341/

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/powerpc/include/asm/exception-64s.h | 46 ++++++++++++++++++++++++
 arch/powerpc/kernel/entry_64.S           | 10 ++++++
 arch/powerpc/kernel/exceptions-64s.S     | 12 +++++--
 3 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 3b4767ed3ec5..931a74ba037b 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -36,6 +36,7 @@
  */
 #include <asm/head-64.h>
 #include <asm/feature-fixups.h>
+#include <asm/tm.h>
 
 /* PACA save area offsets (exgen, exmc, etc) */
 #define EX_R9		0
@@ -677,10 +678,54 @@ BEGIN_FTR_SECTION				\
 	beql	ppc64_runlatch_on_trampoline;	\
 END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+
+/*
+ * This macro will reclaim a transaction if called when coming from userspace
+ * (MSR.PR = 1) and if the transaction state is active or suspended.
+ *
+ * Since we don't want to reclaim when coming from the kernel, for instance
+ * after a trecheckpoint or an IRQ replay, the live MSR is not useful; the
+ * MSR from the thread stack is used to check the MSR.PR bit instead.
+ * This macro takes one argument, the cause that will be used by treclaim,
+ * and returns '1' in r3 if the reclaim happened or '0' if it didn't,
+ * which is useful to know which registers were clobbered.
+ *
+ * NOTE: If additional registers are clobbered here, make sure the callee
+ * function restores them before proceeding.
+ */
+#define TM_KERNEL_ENTRY(cause)						\
+	ld      r3, _MSR(r1);						\
+	andi.   r0, r3, MSR_PR;	/* Coming from userspace? */		\
+	beq     1f;		/* Skip reclaim if MSR.PR != 1 */	\
+	rldicl. r0, r3, (64-MSR_TM_LG), 63; /* Is TM enabled? */	\
+	beq     1f;		/* Skip reclaim if TM is off */		\
+	rldicl. r0, r3, (64-MSR_TS_LG), 62;	/* Is active */		\
+	beq     1f;		/* Skip reclaim if neither */		\
+	/*								\
+	 * If there is a transaction active or suspended, save the	\
+	 * non-volatile GPRs if they are not already saved.		\
+	 */								\
+	bl      save_nvgprs;						\
+	/*								\
+	 * Soft disable the IRQs, otherwise it might cause a CPU hang.	\
+	 */								\
+	RECONCILE_IRQ_STATE(r10, r11);					\
+	li      r3, cause;						\
+	bl      tm_reclaim_current;					\
+	li      r3, 1;		/* Reclaim happened */			\
+	b       2f;							\
+1:	li      r3, 0;		/* Reclaim didn't happen */		\
+2:
+#else
+#define TM_KERNEL_ENTRY(cause)
+#endif
+
 #define EXCEPTION_COMMON(area, trap, label, hdlr, ret, additions) \
 	EXCEPTION_PROLOG_COMMON(trap, area);			\
 	/* Volatile regs are potentially clobbered here */	\
 	additions;						\
+	TM_KERNEL_ENTRY(TM_CAUSE_MISC);					\
 	addi	r3,r1,STACK_FRAME_OVERHEAD;			\
 	bl	hdlr;						\
 	b	ret
@@ -695,6 +740,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
 	EXCEPTION_PROLOG_COMMON_3(trap);			\
 	/* Volatile regs are potentially clobbered here */	\
 	additions;						\
+	TM_KERNEL_ENTRY(TM_CAUSE_MISC);				\
 	addi	r3,r1,STACK_FRAME_OVERHEAD;			\
 	bl	hdlr
 
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 7b1693adff2a..17484ebda66c 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -131,6 +131,16 @@ BEGIN_FW_FTR_SECTION
 END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR)
 #endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE && CONFIG_PPC_SPLPAR */
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+	TM_KERNEL_ENTRY(TM_CAUSE_SYSCALL)
+	cmpdi	r3, 0x1
+	bne	44f
+	/* Restore from r4 to r12 */
+	REST_8GPRS(4,r1)
+44:	/* treclaim was not called, just restore r3 and r0 */
+	REST_GPR(3, r1)
+	REST_GPR(0, r1)
+#endif
 	/*
 	 * A syscall should always be called with interrupts enabled
 	 * so we just unconditionally hard-enable here. When some kind
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 89d32bb79d5e..5c685a46202d 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -717,6 +717,7 @@ EXC_COMMON_BEGIN(alignment_common)
 	std	r3,_DAR(r1)
 	std	r4,_DSISR(r1)
 	bl	save_nvgprs
+	TM_KERNEL_ENTRY(TM_CAUSE_ALIGNMENT)
 	RECONCILE_IRQ_STATE(r10, r11)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	alignment_exception
@@ -751,6 +752,8 @@ EXC_COMMON_BEGIN(program_check_common)
 	b 3f				/* Jump into the macro !!	*/
 1:	EXCEPTION_PROLOG_COMMON(0x700, PACA_EXGEN)
 	bl	save_nvgprs
+	ld      r3, _MSR(r1)
+	TM_KERNEL_ENTRY(TM_CAUSE_FAC_UNAV)
 	RECONCILE_IRQ_STATE(r10, r11)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	program_check_exception
@@ -1650,7 +1653,9 @@ do_hash_page:
 
 /* Here we have a page fault that hash_page can't handle. */
 handle_page_fault:
-11:	andis.  r0,r4,DSISR_DABRMATCH@h
+11:	TM_KERNEL_ENTRY(TM_CAUSE_TLBI)
+	ld      r4,_DSISR(r1)
+	andis.  r0,r4,DSISR_DABRMATCH@h
 	bne-    handle_dabr_fault
 	ld	r4,_DAR(r1)
 	ld	r5,_DSISR(r1)
@@ -1681,6 +1686,8 @@ handle_dabr_fault:
  */
 13:	bl	save_nvgprs
 	mr	r5,r3
+	TM_KERNEL_ENTRY(TM_CAUSE_TLBI)
+	REST_GPR(3,r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	ld	r4,_DAR(r1)
 	bl	low_hash_fault
@@ -1695,7 +1702,8 @@ handle_dabr_fault:
  * the access, or panic if there isn't a handler.
  */
 77:	bl	save_nvgprs
-	mr	r4,r3
+	TM_KERNEL_ENTRY(TM_CAUSE_TLBI)
+	ld      r4,_DAR(r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	li	r5,SIGSEGV
 	bl	bad_page_fault
-- 
2.19.0



* [RFC PATCH 02/14] powerpc/tm: Reclaim on unavailable exception
  2018-11-06 12:40 [RFC PATCH v2 00/14] New TM Model Breno Leitao
  2018-11-06 12:40 ` [RFC PATCH 01/14] powerpc/tm: Reclaim transaction on kernel entry Breno Leitao
@ 2018-11-06 12:40 ` Breno Leitao
  2018-11-15  0:06   ` Michael Neuling
  2018-11-06 12:40 ` [RFC PATCH 03/14] powerpc/tm: Recheckpoint when exiting from kernel Breno Leitao
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 26+ messages in thread
From: Breno Leitao @ 2018-11-06 12:40 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, gromero, Breno Leitao, ldufour, cyrilbur

If FP/VEC/VSX is touched inside a transaction while the facility is
disabled, a facility unavailable exception is raised and ends up calling
{fp,vec,vsx}_unavailable_tm(), which used to reclaim and recheckpoint.

This is not required anymore, since the checkpointed state is reclaimed on
exception entry and will be recheckpointed later by restore_tm_state().

Add a WARN_ON() if we hit _unavailable_tm() in suspended mode, i.e. if the
reclaim was somehow not executed on trap entry, which would be a bug.
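The WARN_ON() relies on decoding the two-bit MSR[TS] field, which
distinguishes suspended from transactional state. A portable C sketch of
those predicates (bit positions taken from asm/reg.h and assumed here; the
kernel's own MSR_TM_* macros are the authority):

```c
#include <assert.h>

/* Assumed TS field encoding, per asm/reg.h: TS = 0b01 suspended,
 * TS = 0b10 transactional, TS = 0b00 non-transactional. */
#define MSR_TS_S    (1ULL << 33)            /* suspended */
#define MSR_TS_T    (1ULL << 34)            /* transactional */
#define MSR_TS_MASK (MSR_TS_S | MSR_TS_T)

/* A transaction exists (either suspended or transactional). */
#define MSR_TM_ACTIVE(x)    (((x) & MSR_TS_MASK) != 0)
/* The transaction is specifically in suspended state. */
#define MSR_TM_SUSPENDED(x) (((x) & MSR_TS_MASK) == MSR_TS_S)
```

After this series, WARN_ON(MSR_TM_SUSPENDED(mfmsr())) firing in an
_unavailable_tm() handler means the entry-time reclaim was missed.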

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/powerpc/include/asm/exception-64s.h |  4 ++++
 arch/powerpc/kernel/exceptions-64s.S     |  3 +++
 arch/powerpc/kernel/traps.c              | 22 ++++------------------
 3 files changed, 11 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 931a74ba037b..80f01d5683c3 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -711,6 +711,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
 	 * Soft disable the IRQs, otherwise it might cause a CPU hang.	\
 	 */								\
 	RECONCILE_IRQ_STATE(r10, r11);					\
+	/*								\
+	 * Although this cause will be set initially, it might be	\
+	 * updated later, once the exception is better understood	\
+	 */								\
 	li      r3, cause;						\
 	bl      tm_reclaim_current;					\
 	li      r3, 1;		/* Reclaim happened */			\
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 5c685a46202d..47e05b09eed6 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -786,6 +786,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM)
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 2:	/* User process was in a transaction */
 	bl	save_nvgprs
+	TM_KERNEL_ENTRY(TM_CAUSE_FAC_UNAV)
 	RECONCILE_IRQ_STATE(r10, r11)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	fp_unavailable_tm
@@ -1128,6 +1129,7 @@ BEGIN_FTR_SECTION
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 2:	/* User process was in a transaction */
 	bl	save_nvgprs
+	TM_KERNEL_ENTRY(TM_CAUSE_FAC_UNAV)
 	RECONCILE_IRQ_STATE(r10, r11)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	altivec_unavailable_tm
@@ -1164,6 +1166,7 @@ BEGIN_FTR_SECTION
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 2:	/* User process was in a transaction */
 	bl	save_nvgprs
+	TM_KERNEL_ENTRY(TM_CAUSE_FAC_UNAV)
 	RECONCILE_IRQ_STATE(r10, r11)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	vsx_unavailable_tm
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 9a86572db1ef..e74b735e974c 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1742,23 +1742,10 @@ void fp_unavailable_tm(struct pt_regs *regs)
          * transaction, and probably retry but now with FP enabled.  So the
          * checkpointed FP registers need to be loaded.
 	 */
-	tm_reclaim_current(TM_CAUSE_FAC_UNAV);
-
-	/*
-	 * Reclaim initially saved out bogus (lazy) FPRs to ckfp_state, and
-	 * then it was overwrite by the thr->fp_state by tm_reclaim_thread().
-	 *
-	 * At this point, ck{fp,vr}_state contains the exact values we want to
-	 * recheckpoint.
-	 */
+	WARN_ON(MSR_TM_SUSPENDED(mfmsr()));
 
 	/* Enable FP for the task: */
 	current->thread.load_fp = 1;
-
-	/*
-	 * Recheckpoint all the checkpointed ckpt, ck{fp, vr}_state registers.
-	 */
-	tm_recheckpoint(&current->thread);
 }
 
 void altivec_unavailable_tm(struct pt_regs *regs)
@@ -1770,10 +1757,10 @@ void altivec_unavailable_tm(struct pt_regs *regs)
 	TM_DEBUG("Vector Unavailable trap whilst transactional at 0x%lx,"
 		 "MSR=%lx\n",
 		 regs->nip, regs->msr);
-	tm_reclaim_current(TM_CAUSE_FAC_UNAV);
+	WARN_ON(MSR_TM_SUSPENDED(mfmsr()));
 	current->thread.load_vec = 1;
-	tm_recheckpoint(&current->thread);
 	current->thread.used_vr = 1;
+
 }
 
 void vsx_unavailable_tm(struct pt_regs *regs)
@@ -1792,12 +1779,11 @@ void vsx_unavailable_tm(struct pt_regs *regs)
 	current->thread.used_vsr = 1;
 
 	/* This reclaims FP and/or VR regs if they're already enabled */
-	tm_reclaim_current(TM_CAUSE_FAC_UNAV);
+	WARN_ON(MSR_TM_SUSPENDED(mfmsr()));
 
 	current->thread.load_vec = 1;
 	current->thread.load_fp = 1;
 
-	tm_recheckpoint(&current->thread);
 }
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 
-- 
2.19.0



* [RFC PATCH 03/14] powerpc/tm: Recheckpoint when exiting from kernel
  2018-11-06 12:40 [RFC PATCH v2 00/14] New TM Model Breno Leitao
  2018-11-06 12:40 ` [RFC PATCH 01/14] powerpc/tm: Reclaim transaction on kernel entry Breno Leitao
  2018-11-06 12:40 ` [RFC PATCH 02/14] powerpc/tm: Reclaim on unavailable exception Breno Leitao
@ 2018-11-06 12:40 ` Breno Leitao
  2018-11-15  0:11   ` Michael Neuling
  2018-11-06 12:40 ` [RFC PATCH 04/14] powerpc/tm: Always set TIF_RESTORE_TM on reclaim Breno Leitao
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 26+ messages in thread
From: Breno Leitao @ 2018-11-06 12:40 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, gromero, Breno Leitao, ldufour, cyrilbur

This is the only place where we recheckpoint now. The task needs to have
the TIF_RESTORE_TM flag set, so it gets into restore_tm_state() in the
exception exit path, which executes the recheckpoint depending on the MSR.

Every time a task needs to be recheckpointed, or just needs its TM SPRs
restored, the TIF_RESTORE_TM flag should be set and the task MSR should
properly be in a transactional state; both are checked by
restore_tm_state().

Recheckpointing clobbers the live facility registers with the checkpointed
values (which are now also in the checkpoint area).

If a facility is enabled in the MSR being returned to userspace, then that
facility's registers need to be restored, otherwise userspace will see
invalid values.

This patch simplifies restore_tm_state() to restore only the facility
registers that are enabled when returning to userspace, i.e. according to
the MSR that will be put into SRR1 and thus become the MSR after RFID.
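The restore decision above can be sketched in C: restore exactly the
facilities enabled in the exit MSR, and warn if VSX is enabled without FP
and VEC. Bit positions (MSR_FP_LG = 13, MSR_VSX_LG = 23, MSR_VEC_LG = 25)
are assumed from asm/reg.h; this is an illustration, not the patch itself:

```c
#include <assert.h>

#define MSR_FP  (1ULL << 13)   /* floating point (assumed bit position) */
#define MSR_VSX (1ULL << 23)   /* vector-scalar (assumed bit position) */
#define MSR_VEC (1ULL << 25)   /* Altivec (assumed bit position) */

/* Which facility register sets restore_tm_state() would reload for a
 * given MSR being returned to userspace. */
static unsigned long long facilities_to_restore(unsigned long long msr)
{
	return msr & (MSR_FP | MSR_VEC | MSR_VSX);
}

/* Mirrors the WARN_ON(): MSR_VSX set implies MSR_FP and MSR_VEC set,
 * since the VSX register set overlaps the FP and VEC register sets. */
static int vsx_msr_consistent(unsigned long long msr)
{
	if (!(msr & MSR_VSX))
		return 1;
	return ((msr & MSR_FP) && (msr & MSR_VEC)) ? 1 : 0;
}
```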

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/powerpc/kernel/process.c | 38 ++++++++++++++++++++++++-----------
 1 file changed, 26 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 4d5322cfad25..c7e758a42b8f 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1049,8 +1049,6 @@ static inline void __switch_to_tm(struct task_struct *prev,
  */
 void restore_tm_state(struct pt_regs *regs)
 {
-	unsigned long msr_diff;
-
 	/*
 	 * This is the only moment we should clear TIF_RESTORE_TM as
 	 * it is here that ckpt_regs.msr and pt_regs.msr become the same
@@ -1061,19 +1059,35 @@ void restore_tm_state(struct pt_regs *regs)
 	if (!MSR_TM_ACTIVE(regs->msr))
 		return;
 
-	msr_diff = current->thread.ckpt_regs.msr & ~regs->msr;
-	msr_diff &= MSR_FP | MSR_VEC | MSR_VSX;
+	tm_enable();
+	/* The only place we recheckpoint */
+	tm_recheckpoint(&current->thread);
 
-	/* Ensure that restore_math() will restore */
-	if (msr_diff & MSR_FP)
-		current->thread.load_fp = 1;
+	/*
+	 * Restore the facility registers that were clobbered during
+	 * recheckpoint.
+	 */
+	if (regs->msr & MSR_FP) {
+		/*
+		 * Using load_fp_state() instead of restore_fp() because we
+		 * want to force the restore, independent of
+		 * tsk->thread.load_fp. Same for other cases below.
+		 */
+		load_fp_state(&current->thread.fp_state);
+	}
 #ifdef CONFIG_ALTIVEC
-	if (cpu_has_feature(CPU_FTR_ALTIVEC) && msr_diff & MSR_VEC)
-		current->thread.load_vec = 1;
+	if (cpu_has_feature(CPU_FTR_ALTIVEC) && regs->msr & MSR_VEC)
+		load_vr_state(&current->thread.vr_state);
+#endif
+#ifdef CONFIG_VSX
+	if (cpu_has_feature(CPU_FTR_VSX) && regs->msr & MSR_VSX) {
+		/*
+		 * If VSX is enabled, it is expected that VEC and FP are
+		 * also enabled and already restored the full register set.
+		 * Cause a warning if that is not the case.
+		 */
+		WARN_ON(!(regs->msr & MSR_VEC) || !(regs->msr & MSR_FP)); }
 #endif
-	restore_math(regs);
-
-	regs->msr |= msr_diff;
 }
 
 #else
-- 
2.19.0



* [RFC PATCH 04/14] powerpc/tm: Always set TIF_RESTORE_TM on reclaim
  2018-11-06 12:40 [RFC PATCH v2 00/14] New TM Model Breno Leitao
                   ` (2 preceding siblings ...)
  2018-11-06 12:40 ` [RFC PATCH 03/14] powerpc/tm: Recheckpoint when exiting from kernel Breno Leitao
@ 2018-11-06 12:40 ` Breno Leitao
  2018-11-06 12:40 ` [RFC PATCH 05/14] powerpc/tm: Refactor the __switch_to_tm code Breno Leitao
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Breno Leitao @ 2018-11-06 12:40 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, gromero, Breno Leitao, ldufour, cyrilbur

If the task data was reclaimed (through TM_KERNEL_ENTRY), it needs to be
recheckpointed later, when exiting to userspace. The recheckpoint is done
by restore_tm_state(), which is called on the way to userspace if the task
has the TIF_RESTORE_TM flag set.

This patch makes sure the task has TIF_RESTORE_TM set every time there is
a reclaim, so a recheckpoint will be executed later.
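The reclaim/recheckpoint handshake can be modelled as a tiny state machine;
the sketch below is hypothetical and simplified (the kernel uses
set_thread_flag() and the real exit path, not these helper names):

```c
#include <assert.h>

/* Illustrative model of a task's TM bookkeeping. */
struct task_model {
	unsigned long flags;
	int reclaimed;        /* checkpointed state moved to memory */
	int recheckpointed;   /* checkpointed state back in the CPU */
};

#define TIF_RESTORE_TM (1UL << 0)   /* illustrative bit position */

/* Models tm_reclaim_thread(): reclaim and always tag the task. */
static void tm_reclaim_model(struct task_model *t)
{
	t->reclaimed = 1;
	t->flags |= TIF_RESTORE_TM;   /* this patch: set on every reclaim */
}

/* Models the exit-to-userspace path calling restore_tm_state(). */
static void exit_to_user_model(struct task_model *t)
{
	if (t->flags & TIF_RESTORE_TM) {
		t->flags &= ~TIF_RESTORE_TM;
		t->recheckpointed = 1;
	}
}

/* Run one reclaim/exit round trip; returns 1 if the invariant holds:
 * a reclaimed task is recheckpointed and the flag is consumed. */
static int model_roundtrip(void)
{
	struct task_model t = {0, 0, 0};

	tm_reclaim_model(&t);
	exit_to_user_model(&t);
	return t.recheckpointed && !(t.flags & TIF_RESTORE_TM);
}
```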

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/powerpc/kernel/process.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index c7e758a42b8f..1842fd96b123 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -883,6 +883,9 @@ static void tm_reclaim_thread(struct thread_struct *thr, uint8_t cause)
 
 	tm_reclaim(thr, cause);
 
+	/* Tag it so restore_tm_state will pay attention to this task */
+	set_thread_flag(TIF_RESTORE_TM);
+
 	/*
 	 * If we are in a transaction and FP is off then we can't have
 	 * used FP inside that transaction. Hence the checkpointed
-- 
2.19.0



* [RFC PATCH 05/14] powerpc/tm: Refactor the __switch_to_tm code
  2018-11-06 12:40 [RFC PATCH v2 00/14] New TM Model Breno Leitao
                   ` (3 preceding siblings ...)
  2018-11-06 12:40 ` [RFC PATCH 04/14] powerpc/tm: Always set TIF_RESTORE_TM on reclaim Breno Leitao
@ 2018-11-06 12:40 ` Breno Leitao
  2018-11-15  1:04   ` Michael Neuling
  2018-11-06 12:40 ` [RFC PATCH 06/14] powerpc/tm: Do not recheckpoint at sigreturn Breno Leitao
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 26+ messages in thread
From: Breno Leitao @ 2018-11-06 12:40 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, gromero, Breno Leitao, ldufour, cyrilbur

__switch_to_tm() is the function that switches between two tasks, either
of which might have TM enabled. It is clearly split in two parts: handling
the task that is leaving the CPU, known as 'prev', and the task that is
being scheduled in, known as 'new'.

It starts by checking if the previous task had TM enabled; if so, it
increments load_tm (this is the only place load_tm is incremented). It
also saves the TM SPRs here.

If the previous task was scheduled out with a transaction active, the
failure cause needs to be updated, since it might still contain the cause
set at exception entry, such as TM_CAUSE_MISC. In this case, since there
was a context switch, the failure cause is overwritten.
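That failure-cause update amounts to replacing the failure-code byte held
in the top bits of TEXASR. A hedged C sketch (TEXASR_FC_LG = 56 and
TM_CAUSE_RESCHED = 0xde are assumed from the kernel headers; the real code
is the tm_change_failure_cause() helper in the diff below):

```c
#include <assert.h>
#include <stdint.h>

#define TEXASR_FC_LG 56                       /* assumed failure-code shift */
#define TEXASR_FC    (0xffULL << TEXASR_FC_LG)
#define TM_CAUSE_RESCHED 0xde                 /* assumed, from uapi/asm/tm.h */

/* Clear the old failure-code byte and install the new cause,
 * preserving every other TEXASR bit. */
static uint64_t tm_set_failure_cause(uint64_t texasr, uint8_t cause)
{
	texasr &= ~TEXASR_FC;
	texasr |= (uint64_t)cause << TEXASR_FC_LG;
	return texasr;
}
```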

If the previous task's load_tm counter has overflowed, TM is disabled,
putting the facility save/restore mechanism back into lazy mode.

Regarding the 'new' task being scheduled in, restoring the TM SPRs is
enough if the task had TM enabled when it was descheduled. (Checking
whether a recheckpoint is required will be done later, at the
restore_tm_state() stage.)

On top of that, the tm_reclaim_task() and tm_recheckpoint_new_task()
functions are no longer used, so remove them.

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/powerpc/kernel/process.c | 167 ++++++++++++++++------------------
 1 file changed, 78 insertions(+), 89 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 1842fd96b123..73872f751b33 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -912,48 +912,6 @@ void tm_reclaim_current(uint8_t cause)
 	tm_reclaim_thread(&current->thread, cause);
 }
 
-static inline void tm_reclaim_task(struct task_struct *tsk)
-{
-	/* We have to work out if we're switching from/to a task that's in the
-	 * middle of a transaction.
-	 *
-	 * In switching we need to maintain a 2nd register state as
-	 * oldtask->thread.ckpt_regs.  We tm_reclaim(oldproc); this saves the
-	 * checkpointed (tbegin) state in ckpt_regs, ckfp_state and
-	 * ckvr_state
-	 *
-	 * We also context switch (save) TFHAR/TEXASR/TFIAR in here.
-	 */
-	struct thread_struct *thr = &tsk->thread;
-
-	if (!thr->regs)
-		return;
-
-	if (!MSR_TM_ACTIVE(thr->regs->msr))
-		goto out_and_saveregs;
-
-	WARN_ON(tm_suspend_disabled);
-
-	TM_DEBUG("--- tm_reclaim on pid %d (NIP=%lx, "
-		 "ccr=%lx, msr=%lx, trap=%lx)\n",
-		 tsk->pid, thr->regs->nip,
-		 thr->regs->ccr, thr->regs->msr,
-		 thr->regs->trap);
-
-	tm_reclaim_thread(thr, TM_CAUSE_RESCHED);
-
-	TM_DEBUG("--- tm_reclaim on pid %d complete\n",
-		 tsk->pid);
-
-out_and_saveregs:
-	/* Always save the regs here, even if a transaction's not active.
-	 * This context-switches a thread's TM info SPRs.  We do it here to
-	 * be consistent with the restore path (in recheckpoint) which
-	 * cannot happen later in _switch().
-	 */
-	tm_save_sprs(thr);
-}
-
 extern void __tm_recheckpoint(struct thread_struct *thread);
 
 void tm_recheckpoint(struct thread_struct *thread)
@@ -980,59 +938,91 @@ void tm_recheckpoint(struct thread_struct *thread)
 	local_irq_restore(flags);
 }
 
-static inline void tm_recheckpoint_new_task(struct task_struct *new)
+static void tm_change_failure_cause(struct task_struct *task, uint8_t cause)
 {
-	if (!cpu_has_feature(CPU_FTR_TM))
-		return;
-
-	/* Recheckpoint the registers of the thread we're about to switch to.
-	 *
-	 * If the task was using FP, we non-lazily reload both the original and
-	 * the speculative FP register states.  This is because the kernel
-	 * doesn't see if/when a TM rollback occurs, so if we take an FP
-	 * unavailable later, we are unable to determine which set of FP regs
-	 * need to be restored.
-	 */
-	if (!tm_enabled(new))
-		return;
-
-	if (!MSR_TM_ACTIVE(new->thread.regs->msr)){
-		tm_restore_sprs(&new->thread);
-		return;
-	}
-	/* Recheckpoint to restore original checkpointed register state. */
-	TM_DEBUG("*** tm_recheckpoint of pid %d (new->msr 0x%lx)\n",
-		 new->pid, new->thread.regs->msr);
-
-	tm_recheckpoint(&new->thread);
-
-	/*
-	 * The checkpointed state has been restored but the live state has
-	 * not, ensure all the math functionality is turned off to trigger
-	 * restore_math() to reload.
-	 */
-	new->thread.regs->msr &= ~(MSR_FP | MSR_VEC | MSR_VSX);
-
-	TM_DEBUG("*** tm_recheckpoint of pid %d complete "
-		 "(kernel msr 0x%lx)\n",
-		 new->pid, mfmsr());
+	task->thread.tm_texasr &= ~TEXASR_FC;
+	task->thread.tm_texasr |= (unsigned long) cause << TEXASR_FC_LG;
 }
 
 static inline void __switch_to_tm(struct task_struct *prev,
 		struct task_struct *new)
 {
-	if (cpu_has_feature(CPU_FTR_TM)) {
-		if (tm_enabled(prev) || tm_enabled(new))
-			tm_enable();
+	if (!cpu_has_feature(CPU_FTR_TM))
+		return;
+
+	/* The task leaving the CPU was using TM, let's handle it */
+	if (tm_enabled(prev)) {
+		/*
+		 * Load_tm is incremented only when the task is scheduled out
+		 */
+		prev->thread.load_tm++;
+
+		/*
+		 * If TM is enabled for the thread, it needs to, at least,
+		 * save the SPRs
+		 */
+		tm_enable();
+		tm_save_sprs(&prev->thread);
+
+		/*
+		 * If the task being descheduled had an active transaction
+		 * during the exception that brought it here, the
+		 * transaction was reclaimed by TM_KERNEL_ENTRY.
+		 *
+		 * Independently of the TEXASR failure code set at
+		 * TM_KERNEL_ENTRY time, the task is being descheduled, and
+		 * the failure code needs to be updated to reflect it.
+		 */
+		if (MSR_TM_ACTIVE(prev->thread.regs->msr)) {
+			/*
+			 * If there was an IRQ during trecheckpoint, it will
+			 * cause an IRQ to be replayed. This replayed IRQ can
+			 * invoke SCHEDULE_USER, thus, we arrive here with a TM
+			 * active transaction.
+			 * I.e, the task was leaving kernelspace to userspace,
+			 * already trecheckpointed, but there was a IRQ during
+			 * the trecheckpoint process (soft irq disabled), and
+			 * on the IRQ replay, the process was de-scheduled, so,
+			 * SCHEDULE_USER was called and here we are.
+			 *
+			 */
+			if (WARN_ON(MSR_TM_ACTIVE(mfmsr())))
+				tm_reclaim_current(TM_CAUSE_RESCHED);
 
-		if (tm_enabled(prev)) {
-			prev->thread.load_tm++;
-			tm_reclaim_task(prev);
-			if (!MSR_TM_ACTIVE(prev->thread.regs->msr) && prev->thread.load_tm == 0)
+			/*
+			 * If rescheduled with TM active, update the
+			 * failure cause
+			 */
+			tm_change_failure_cause(prev, TM_CAUSE_RESCHED);
+		} else {
+			/*
+			 * TM enabled but not transactional. Just disable TM
+			 * if load_tm overflows. This should be the only place
+			 * that disables the TM and reenables the laziness
+			 * save/restore
+			 */
+			if (prev->thread.load_tm == 0)
 				prev->thread.regs->msr &= ~MSR_TM;
 		}
+	}
 
-		tm_recheckpoint_new_task(new);
+	/*
+	 * It is a *bug* if we arrived so late with a transaction active
+	 * (more precisely suspended)
+	 */
+	if (WARN_ON(MSR_TM_ACTIVE(mfmsr()))) {
+		/* Recovery path. 0x99 shouldn't be exported to UAPI */
+		tm_reclaim_current(0x99);
+	}
+
+	/*
+	 * If the next task has TM enabled, restore the SPRs. Do not need to
+	 * care about recheckpoint at this time. It will be done later if
+	 * TIF_RESTORE_TM was set when the task was scheduled out
+	 */
+	if (tm_enabled(new)) {
+		tm_enable();
+		tm_restore_sprs(&new->thread);
 	}
 }
 
@@ -1094,7 +1084,6 @@ void restore_tm_state(struct pt_regs *regs)
 }
 
 #else
-#define tm_recheckpoint_new_task(new)
 #define __switch_to_tm(prev, new)
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 
@@ -1599,9 +1588,9 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 	/*
 	 * Flush TM state out so we can copy it.  __switch_to_tm() does this
 	 * flush but it removes the checkpointed state from the current CPU and
-	 * transitions the CPU out of TM mode.  Hence we need to call
-	 * tm_recheckpoint_new_task() (on the same task) to restore the
-	 * checkpointed state back and the TM mode.
+	 * transitions the CPU out of TM mode.  Hence we need to make sure
+	 * TIF_RESTORE_TM is set so restore_tm_state is called to restore the
+	 * checkpointed state and back to TM mode.
 	 *
 	 * Can't pass dst because it isn't ready. Doesn't matter, passing
 	 * dst is only important for __switch_to()
-- 
2.19.0



* [RFC PATCH 06/14] powerpc/tm: Do not recheckpoint at sigreturn
  2018-11-06 12:40 [RFC PATCH v2 00/14] New TM Model Breno Leitao
                   ` (4 preceding siblings ...)
  2018-11-06 12:40 ` [RFC PATCH 05/14] powerpc/tm: Refactor the __switch_to_tm code Breno Leitao
@ 2018-11-06 12:40 ` Breno Leitao
  2018-11-06 12:40 ` [RFC PATCH 07/14] powerpc/tm: Do not reclaim on ptrace Breno Leitao
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Breno Leitao @ 2018-11-06 12:40 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, gromero, Breno Leitao, ldufour, cyrilbur

Do not recheckpoint at signal return. Just make sure TIF_RESTORE_TM is
set, so that the state is restored by restore_tm_state() on exit to
userspace.

The lazy FP and VEC restore was already done by tm_reclaim_current(),
which checked whether FP/VEC was enabled and filled out the ckfp and ckvr
register areas with the expected values.

load_fp_state() and load_vr_state() were being called to restore the
registers clobbered by tm_recheckpoint(). This is not necessary anymore,
since restore_tm_state() already does it; this restore is only required
after trecheckpoint(), and the only place calling that is now
restore_tm_state().
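The deferred-restore flow above can be sketched in plain C. This is an
illustrative userspace model, not kernel code: tm_model,
sigreturn_restore() and exit_to_user() are hypothetical stand-ins for the
thread flags, TEXASR[FS] and restore_tm_state().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the deferred-recheckpoint flow. */
struct tm_model {
	bool tif_restore_tm;	/* models TIF_RESTORE_TM */
	bool texasr_fs;		/* models TEXASR[FS], "failure summary" */
	int  recheckpoint_count;
};

/* New behaviour: sigreturn only marks the transaction failed and
 * defers the recheckpoint to the exit-to-userspace path. */
static void sigreturn_restore(struct tm_model *m)
{
	m->texasr_fs = true;		/* transaction marked as failed */
	m->tif_restore_tm = true;	/* restore_tm_state() will run later */
}

/* Single restore point, modelling restore_tm_state() at kernel exit. */
static void exit_to_user(struct tm_model *m)
{
	if (m->tif_restore_tm) {
		m->recheckpoint_count++;	/* the only "recheckpoint" */
		m->tif_restore_tm = false;
	}
}
```

The point of the model is that there is exactly one recheckpoint site, no
matter how the kernel was entered.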

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/powerpc/kernel/signal_32.c | 38 +++++++++------------------------
 arch/powerpc/kernel/signal_64.c | 30 ++++++++------------------
 2 files changed, 19 insertions(+), 49 deletions(-)

diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
index e6474a45cef5..046030d7921e 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -850,28 +850,11 @@ static long restore_tm_user_regs(struct pt_regs *regs,
 		return 1;
 	/* Pull in the MSR TM bits from the user context */
 	regs->msr = (regs->msr & ~MSR_TS_MASK) | (msr_hi & MSR_TS_MASK);
-	/* Now, recheckpoint.  This loads up all of the checkpointed (older)
-	 * registers, including FP and V[S]Rs.  After recheckpointing, the
-	 * transactional versions should be loaded.
-	 */
-	tm_enable();
+
 	/* Make sure the transaction is marked as failed */
 	current->thread.tm_texasr |= TEXASR_FS;
-	/* This loads the checkpointed FP/VEC state, if used */
-	tm_recheckpoint(&current->thread);
-
-	/* This loads the speculative FP/VEC state, if used */
-	msr_check_and_set(msr & (MSR_FP | MSR_VEC));
-	if (msr & MSR_FP) {
-		load_fp_state(&current->thread.fp_state);
-		regs->msr |= (MSR_FP | current->thread.fpexc_mode);
-	}
-#ifdef CONFIG_ALTIVEC
-	if (msr & MSR_VEC) {
-		load_vr_state(&current->thread.vr_state);
-		regs->msr |= MSR_VEC;
-	}
-#endif
+	/* Make sure restore_tm_state will be called */
+	set_thread_flag(TIF_RESTORE_TM);
 
 	return 0;
 }
@@ -1156,16 +1139,15 @@ SYSCALL_DEFINE0(rt_sigreturn)
 
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	/*
-	 * If there is a transactional state then throw it away.
-	 * The purpose of a sigreturn is to destroy all traces of the
-	 * signal frame, this includes any transactional state created
-	 * within in. We only check for suspended as we can never be
-	 * active in the kernel, we are active, there is nothing better to
-	 * do than go ahead and Bad Thing later.
-	 * The cause is not important as there will never be a
+	 * If the transaction is active at this point, it means that
+	 * TM_KERNEL_ENTRY was not invoked properly and it is a bug.
+	 * If this is the case, print a warning and try to work around
+	 * it by calling tm_reclaim_current() to discard the footprint.
+	 *
+	 * The failure cause is not important as there will never be a
 	 * recheckpoint so it's not user visible.
 	 */
-	if (MSR_TM_SUSPENDED(mfmsr()))
+	if (WARN_ON(MSR_TM_SUSPENDED(mfmsr())))
 		tm_reclaim_current(0);
 
 	if (__get_user(tmp, &rt_sf->uc.uc_link))
diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index 83d51bf586c7..487c3b6aa2e3 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -569,21 +569,10 @@ static long restore_tm_sigcontexts(struct task_struct *tsk,
 		}
 	}
 #endif
-	tm_enable();
 	/* Make sure the transaction is marked as failed */
 	tsk->thread.tm_texasr |= TEXASR_FS;
-	/* This loads the checkpointed FP/VEC state, if used */
-	tm_recheckpoint(&tsk->thread);
-
-	msr_check_and_set(msr & (MSR_FP | MSR_VEC));
-	if (msr & MSR_FP) {
-		load_fp_state(&tsk->thread.fp_state);
-		regs->msr |= (MSR_FP | tsk->thread.fpexc_mode);
-	}
-	if (msr & MSR_VEC) {
-		load_vr_state(&tsk->thread.vr_state);
-		regs->msr |= MSR_VEC;
-	}
+	/* Guarantee that restore_tm_state() will be called */
+	set_thread_flag(TIF_RESTORE_TM);
 
 	return err;
 }
@@ -717,16 +706,15 @@ SYSCALL_DEFINE0(rt_sigreturn)
 
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	/*
-	 * If there is a transactional state then throw it away.
-	 * The purpose of a sigreturn is to destroy all traces of the
-	 * signal frame, this includes any transactional state created
-	 * within in. We only check for suspended as we can never be
-	 * active in the kernel, we are active, there is nothing better to
-	 * do than go ahead and Bad Thing later.
-	 * The cause is not important as there will never be a
+	 * If the transaction is active at this point, it means that
+	 * TM_KERNEL_ENTRY was not invoked properly and it is a bug.
+	 * If this is the case, print a warning and try to work around
+	 * it by calling tm_reclaim_current() to discard the footprint.
+	 *
+	 * The failure cause is not important as there will never be a
 	 * recheckpoint so it's not user visible.
 	 */
-	if (MSR_TM_SUSPENDED(mfmsr()))
+	if (WARN_ON(MSR_TM_SUSPENDED(mfmsr())))
 		tm_reclaim_current(0);
 
 	if (__get_user(msr, &uc->uc_mcontext.gp_regs[PT_MSR]))
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH 07/14] powerpc/tm: Do not reclaim on ptrace
  2018-11-06 12:40 [RFC PATCH v2 00/14] New TM Model Breno Leitao
                   ` (5 preceding siblings ...)
  2018-11-06 12:40 ` [RFC PATCH 06/14] powerpc/tm: Do not recheckpoint at sigreturn Breno Leitao
@ 2018-11-06 12:40 ` Breno Leitao
  2018-11-06 12:40 ` [RFC PATCH 08/14] powerpc/tm: Recheckpoint at exit path Breno Leitao
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Breno Leitao @ 2018-11-06 12:40 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, gromero, Breno Leitao, ldufour, cyrilbur

Make sure that we are not in suspended state on ptrace and that the
registers were already reclaimed.

Since the data was already reclaimed on kernel entry, there is nothing to
be done here other than flushing the TM SPRs to the thread struct.
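The new decision in flush_tmregs_to_thread() (flush the TM SPRs only when
MSR[TM] is still enabled, since with lazy disable the thread struct
already holds the latest values otherwise) can be sketched in C. The bit
value below is made up for the model; only the logic mirrors the patch.

```c
#include <assert.h>
#include <stdint.h>

#define M_MSR_TM	(1u << 5)	/* hypothetical "TM enabled" bit */

struct thread_model {
	uint32_t msr;
	int sprs_saved;		/* counts tm_save_sprs() calls */
};

/* With lazy TM disable, the thread struct already contains the latest
 * SPR values when MSR[TM] is clear, so flushing is only needed when
 * the facility is still enabled. */
static void flush_tmregs(struct thread_model *t)
{
	if (t->msr & M_MSR_TM)
		t->sprs_saved++;	/* models tm_enable(); tm_save_sprs() */
}
```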

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/powerpc/kernel/ptrace.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index afb819f4ca68..4faf0612d58c 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -136,9 +136,19 @@ static void flush_tmregs_to_thread(struct task_struct *tsk)
 	if ((!cpu_has_feature(CPU_FTR_TM)) || (tsk != current))
 		return;
 
-	if (MSR_TM_SUSPENDED(mfmsr())) {
-		tm_reclaim_current(TM_CAUSE_SIGNAL);
-	} else {
+	if (WARN_ON(MSR_TM_SUSPENDED(mfmsr()))) {
+		/*
+		 * We should never be here in suspended state. This is a
+		 * bug!
+		 */
+		tm_reclaim_current(0x99);
+	}
+	if (tsk->thread.regs->msr & MSR_TM) {
+		/*
+		 * Only flush TM SPRs to the thread if TM was enabled,
+		 * otherwise (TM lazily disabled), the thread already
+		 * contains the latest SPR value
+		 */
 		tm_enable();
 		tm_save_sprs(&(tsk->thread));
 	}
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH 08/14] powerpc/tm: Recheckpoint at exit path
  2018-11-06 12:40 [RFC PATCH v2 00/14] New TM Model Breno Leitao
                   ` (6 preceding siblings ...)
  2018-11-06 12:40 ` [RFC PATCH 07/14] powerpc/tm: Do not reclaim on ptrace Breno Leitao
@ 2018-11-06 12:40 ` Breno Leitao
  2018-11-15  2:45   ` Michael Neuling
  2018-11-06 12:40 ` [RFC PATCH 09/14] powerpc/tm: Warn if state is transactional Breno Leitao
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 26+ messages in thread
From: Breno Leitao @ 2018-11-06 12:40 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, gromero, Breno Leitao, ldufour, cyrilbur

Previously, TIF_RESTORE_TM was handled together with the rest of the TIF
work, but that was too early and could cause IRQs to be replayed in
suspended state (i.e. after the recheckpoint).

This patch moves the TIF_RESTORE_TM handling to as late as possible. It
also forces IRQs to be disabled, and they remain disabled until the RFID,
so no IRQ can be replayed at all. That is, once trecheckpoint happens, the
next step is the RFID to userspace.

This makes TIF_RESTORE_TM a special case that should not be handled
similarly to the _TIF_USER_WORK_MASK.

Since _TIF_RESTORE_TM is not part of _TIF_USER_WORK_MASK anymore, we
need to force system_call_exit to continue to leave through
fast_exception_return, so the flag is tested together with
_TIF_USER_WORK_MASK in the system_call_exit path.

If this flag is set at system_call_exit, it means that recheckpoint
will be called, and doing it through fast_exception_return is the only
way to do so.
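The two conditions guarding the new restore_tm_state() call at
fast_exception_return (the TIF flag is set, and MSR[PR] is set, i.e. we
are returning to userspace rather than a kernel thread) can be sketched as
a plain C predicate. The bit value is invented for the model and is not
POWER's actual MSR layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define M_MSR_PR	(1u << 14)	/* hypothetical "problem state" bit */

/* Models the check added before the final register reload: restore TM
 * state only for a pending deferred recheckpoint on a userspace return. */
static bool should_restore_tm(uint32_t tif_restore_tm, uint32_t srr1)
{
	if (!tif_restore_tm)
		return false;		/* no deferred recheckpoint pending */
	if (!(srr1 & M_MSR_PR))
		return false;		/* kernel return: skip */
	return true;			/* call restore_tm_state() */
}
```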

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/powerpc/include/asm/thread_info.h |  2 +-
 arch/powerpc/kernel/entry_64.S         | 23 ++++++++++++++---------
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index 544cac0474cb..2835d60bc9ef 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -139,7 +139,7 @@ void arch_setup_new_exec(void);
 
 #define _TIF_USER_WORK_MASK	(_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
 				 _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
-				 _TIF_RESTORE_TM | _TIF_PATCH_PENDING | \
+				 _TIF_PATCH_PENDING | \
 				 _TIF_FSCHECK)
 #define _TIF_PERSYSCALL_MASK	(_TIF_RESTOREALL|_TIF_NOERROR)
 
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 17484ebda66c..a86619edf29d 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -255,7 +255,7 @@ system_call_exit:
 
 	ld	r9,TI_FLAGS(r12)
 	li	r11,-MAX_ERRNO
-	andi.	r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
+	andi.   r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK |_TIF_RESTORE_TM)
 	bne-	.Lsyscall_exit_work
 
 	andi.	r0,r8,MSR_FP
@@ -784,14 +784,6 @@ _GLOBAL(ret_from_except_lite)
 	SCHEDULE_USER
 	b	ret_from_except_lite
 2:
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-	andi.	r0,r4,_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM
-	bne	3f		/* only restore TM if nothing else to do */
-	addi	r3,r1,STACK_FRAME_OVERHEAD
-	bl	restore_tm_state
-	b	restore
-3:
-#endif
 	bl	save_nvgprs
 	/*
 	 * Use a non volatile GPR to save and restore our thread_info flags
@@ -938,6 +930,19 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
 	 */
 	.globl	fast_exception_return
 fast_exception_return:
+
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+	CURRENT_THREAD_INFO(r4, r1)
+	ld	r4,TI_FLAGS(r4)
+	andi.   r0,r4,_TIF_RESTORE_TM
+	beq	22f
+	ld	r4,_MSR(r1) 	/* TODO: MSR[!PR] shouldn't be here */
+	andi.	r0,r4,MSR_PR
+	beq	22f  /* Skip if Kernel thread */
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+	bl	restore_tm_state
+22:
+#endif
 	ld	r3,_MSR(r1)
 	ld	r4,_CTR(r1)
 	ld	r0,_LINK(r1)
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH 09/14] powerpc/tm: Warn if state is transactional
  2018-11-06 12:40 [RFC PATCH v2 00/14] New TM Model Breno Leitao
                   ` (7 preceding siblings ...)
  2018-11-06 12:40 ` [RFC PATCH 08/14] powerpc/tm: Recheckpoint at exit path Breno Leitao
@ 2018-11-06 12:40 ` Breno Leitao
  2018-11-15  2:48   ` Michael Neuling
  2018-11-06 12:40 ` [RFC PATCH 10/14] powerpc/tm: Improve TM debug information Breno Leitao
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 26+ messages in thread
From: Breno Leitao @ 2018-11-06 12:40 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, gromero, Breno Leitao, ldufour, cyrilbur

Since every kernel entry point calls TM_KERNEL_ENTRY, it is not
expected to arrive at this point with a suspended transaction.

If that happens anyway, emit a warning and reclaim the current thread's
transaction in order to avoid a TM Bad Thing.
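The warn-and-recover idiom used throughout this series can be modelled in
userspace C. model_warn_on() mimics the kernel's WARN_ON(), which
evaluates to its condition so the recovery path can hang off it; the
names and counters are illustrative, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static int warnings;
static int reclaims;

/* Like WARN_ON(cond): flags the unexpected case, returns the condition. */
static bool model_warn_on(bool cond)
{
	if (cond) {
		warnings++;	/* models the kernel's stack-trace warning */
		fprintf(stderr, "WARNING: transaction still active\n");
	}
	return cond;
}

static void kernel_point_check(bool tm_suspended)
{
	/* It is a bug to get here suspended; warn, then work around. */
	if (model_warn_on(tm_suspended))
		reclaims++;	/* models tm_reclaim_current(0) */
}
```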

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/powerpc/kernel/process.c | 7 +++----
 arch/powerpc/kernel/signal.c  | 2 +-
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 73872f751b33..849591bf0881 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1752,11 +1752,10 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
 
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	/*
-	 * Clear any transactional state, we're exec()ing. The cause is
-	 * not important as there will never be a recheckpoint so it's not
-	 * user visible.
+	 * It is a bug if the transaction was not reclaimed before this
+	 * point. Warn and try to work around it by calling tm_reclaim().
 	 */
-	if (MSR_TM_SUSPENDED(mfmsr()))
+	if (WARN_ON(MSR_TM_SUSPENDED(mfmsr())))
 		tm_reclaim_current(0);
 #endif
 
diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index b3e8db376ecd..cbaccc2be0fb 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -203,7 +203,7 @@ unsigned long get_tm_stackpointer(struct task_struct *tsk)
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	BUG_ON(tsk != current);
 
-	if (MSR_TM_ACTIVE(tsk->thread.regs->msr)) {
+	if (WARN_ON(MSR_TM_ACTIVE(mfmsr()))) {
 		tm_reclaim_current(TM_CAUSE_SIGNAL);
 		if (MSR_TM_TRANSACTIONAL(tsk->thread.regs->msr))
 			return tsk->thread.ckpt_regs.gpr[1];
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH 10/14] powerpc/tm: Improve TM debug information
  2018-11-06 12:40 [RFC PATCH v2 00/14] New TM Model Breno Leitao
                   ` (8 preceding siblings ...)
  2018-11-06 12:40 ` [RFC PATCH 09/14] powerpc/tm: Warn if state is transactional Breno Leitao
@ 2018-11-06 12:40 ` Breno Leitao
  2018-11-06 12:40 ` [RFC PATCH 11/14] powerpc/tm: Save MSR to PACA before RFID Breno Leitao
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Breno Leitao @ 2018-11-06 12:40 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, gromero, Breno Leitao, ldufour, cyrilbur

Add some debug information to the TM subsystem. When enabled, it now
prints when there is a reclaim, a recheckpoint, or a lazy TM disable.

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/powerpc/kernel/process.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 849591bf0881..8a9c298928f9 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -879,6 +879,8 @@ static void tm_reclaim_thread(struct thread_struct *thr, uint8_t cause)
 	if (!MSR_TM_SUSPENDED(mfmsr()))
 		return;
 
+	TM_DEBUG("TM reclaim thread at 0x%lx, MSR=%lx\n", thr->regs->nip,
+		thr->regs->msr);
 	giveup_all(container_of(thr, struct task_struct, thread));
 
 	tm_reclaim(thr, cause);
@@ -921,6 +923,8 @@ void tm_recheckpoint(struct thread_struct *thread)
 	if (!(thread->regs->msr & MSR_TM))
 		return;
 
+	TM_DEBUG("TM recheckpoint at 0x%lx, MSR=%lx\n", thread->regs->nip,
+		thread->regs->msr);
 	/* We really can't be interrupted here as the TEXASR registers can't
 	 * change and later in the trecheckpoint code, we have a userspace R1.
 	 * So let's hard disable over this region.
@@ -1001,8 +1005,11 @@ static inline void __switch_to_tm(struct task_struct *prev,
 			 * that disables the TM and reenables the laziness
 			 * save/restore
 			 */
-			if (prev->thread.load_tm == 0)
+			if (prev->thread.load_tm == 0) {
 				prev->thread.regs->msr &= ~MSR_TM;
+				TM_DEBUG("Disabling TM facility for process %s (%lx)\n",
+					 prev->comm, prev->pid);
+			}
 		}
 	}
 
@@ -1052,6 +1059,7 @@ void restore_tm_state(struct pt_regs *regs)
 	if (!MSR_TM_ACTIVE(regs->msr))
 		return;
 
+	TM_DEBUG("Restore TM state at 0x%lx, MSR=%lx\n", regs->nip, regs->msr);
 	tm_enable();
 	/* The only place we recheckpoint */
 	tm_recheckpoint(&current->thread);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH 11/14] powerpc/tm: Save MSR to PACA before RFID
  2018-11-06 12:40 [RFC PATCH v2 00/14] New TM Model Breno Leitao
                   ` (9 preceding siblings ...)
  2018-11-06 12:40 ` [RFC PATCH 10/14] powerpc/tm: Improve TM debug information Breno Leitao
@ 2018-11-06 12:40 ` Breno Leitao
  2018-11-06 12:40 ` [RFC PATCH 12/14] powerpc/tm: Restore transactional SPRs Breno Leitao
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Breno Leitao @ 2018-11-06 12:40 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, gromero, Breno Leitao, ldufour, cyrilbur

As at other exit points, move SRR1 (MSR) into paca->tm_scratch so that,
if a TM Bad Thing happens at RFID, it is easy to see which SRR1 value
was being used.
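The diagnostic idea can be sketched as a tiny C model: stash the
about-to-be-used MSR in a per-CPU scratch slot so a later crash dump shows
exactly what was in flight. paca_model and exit_path() are illustrative
stand-ins, not the real PACA layout.

```c
#include <assert.h>
#include <stdint.h>

struct paca_model {
	uint64_t tm_scratch;	/* stand-in for paca->tm_scratch */
};

static void exit_path(struct paca_model *paca, uint64_t srr1)
{
	/* models: std r8, PACATMSCRATCH(r13), just before the rfid */
	paca->tm_scratch = srr1;
	/* ... rfid would happen here; on a TM Bad Thing, the crash
	 * handler can dump tm_scratch to show the SRR1 in use ... */
}
```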

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/powerpc/kernel/entry_64.S | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index a86619edf29d..19e460b893e8 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -296,6 +296,10 @@ BEGIN_FTR_SECTION
 	HMT_MEDIUM_LOW
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+	std	r8, PACATMSCRATCH(r13)
+#endif
+
 	ld	r13,GPR13(r1)	/* only restore r13 if returning to usermode */
 	ld	r2,GPR2(r1)
 	ld	r1,GPR1(r1)
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH 12/14] powerpc/tm: Restore transactional SPRs
  2018-11-06 12:40 [RFC PATCH v2 00/14] New TM Model Breno Leitao
                   ` (10 preceding siblings ...)
  2018-11-06 12:40 ` [RFC PATCH 11/14] powerpc/tm: Save MSR to PACA before RFID Breno Leitao
@ 2018-11-06 12:40 ` Breno Leitao
  2018-11-06 12:40 ` [RFC PATCH 13/14] powerpc/tm: Do not restore TM without SPRs Breno Leitao
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Breno Leitao @ 2018-11-06 12:40 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, gromero, Breno Leitao, ldufour, cyrilbur

In the previous TM code, trecheckpoint was executed in the middle of an
exception, so SPRs such as DSCR were restored to their default kernel
values after trecheckpoint was done.

With this patchset, trecheckpoint is executed just before returning to
userspace, at ret_from_except_lite for example. Thus, the SPRs need to be
restored to their transactional values.

In order to do so, a function called restore_sprs_after_recheckpoint()
was created to restore the SPRs after recheckpoint. It should be called
after trecheckpoint to restore the transactional SPRs; otherwise the SPRs
would be left with checkpointed values after the recheckpoint. It is
preferable to have this done in a C function instead of embedding it in
the assembly code.

This patch also changes tm_reclaim(), which now always returns with the
transactional SPR values (instead of the checkpointed ones), matching the
NVGPRs. Previously, tm_reclaim() returned with transactional NVGPRs but
checkpointed SPRs, intermixing them. With this patch, tm_reclaim() just
fills out the thread->ck* and thread->tm* values and returns with
transactional values in a uniform way (for both SPRs and NVGPRs). A
later save_sprs() at switch_to() will therefore save the transactional
values into the thread struct, not the checkpointed ones.

Facility registers, such as VEC and FP, continue to be clobbered after
tm_reclaim(), though.

These are the SPRs that are checkpointed by the hardware: CR fields other
than CR0, LR, CTR, FPSCR, AMR, PPR, VRSAVE, VSCR, DSCR, and TAR.

This patch only cares about PPR, TAR and DSCR, because the other SPRs are
either volatile, restored as part of a facility, or not currently handled
(as with AMR and the CR fields).
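The problem and fix can be modelled in plain C: after a recheckpoint the
live SPRs hold the checkpointed values, and
restore_sprs_after_recheckpoint() puts the transactional ones back. All
structures below are illustrative, not the kernel's thread_struct.

```c
#include <assert.h>
#include <stdint.h>

struct sprs { uint64_t dscr, tar, ppr; };

struct thread_model {
	struct sprs tm;	/* transactional values (thread->dscr etc.)   */
	struct sprs ck;	/* checkpointed values (thread->tm_dscr etc.) */
};

static void recheckpoint(struct thread_model *t, struct sprs *live)
{
	/* trecheckpoint: hardware reloads the checkpointed SPRs */
	*live = t->ck;
	/* the fix: restore the transactional values into the live regs,
	 * since that is what userspace must see until the transaction
	 * aborts or commits */
	live->dscr = t->tm.dscr;
	live->tar  = t->tm.tar;
	live->ppr  = t->tm.ppr;
}
```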

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/powerpc/kernel/asm-offsets.c |  4 ++++
 arch/powerpc/kernel/process.c     | 19 +++++++++++++++++++
 arch/powerpc/kernel/tm.S          | 19 ++++++++++++-------
 3 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 9ffc72ded73a..93def2e23e68 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -109,6 +109,9 @@ int main(void)
 	OFFSET(THREAD_FPSAVEAREA, thread_struct, fp_save_area);
 	OFFSET(FPSTATE_FPSCR, thread_fp_state, fpscr);
 	OFFSET(THREAD_LOAD_FP, thread_struct, load_fp);
+	OFFSET(THREAD_DSCR, thread_struct, dscr);
+	OFFSET(THREAD_TAR, thread_struct, tar);
+
 #ifdef CONFIG_ALTIVEC
 	OFFSET(THREAD_VRSTATE, thread_struct, vr_state.vr);
 	OFFSET(THREAD_VRSAVEAREA, thread_struct, vr_save_area);
@@ -311,6 +314,7 @@ int main(void)
 	STACK_PT_REGS_OFFSET(ORIG_GPR3, orig_gpr3);
 	STACK_PT_REGS_OFFSET(RESULT, result);
 	STACK_PT_REGS_OFFSET(_TRAP, trap);
+	STACK_PT_REGS_OFFSET(_PPR, ppr);
 #ifndef CONFIG_PPC64
 	/*
 	 * The PowerPC 400-class & Book-E processors have neither the DAR
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 8a9c298928f9..3937408ff339 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -79,6 +79,9 @@
 #endif
 
 extern unsigned long _get_SP(void);
+static inline void save_sprs(struct thread_struct *t);
+static inline void restore_sprs_after_recheckpoint(struct thread_struct
+						   *thread);
 
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 /*
@@ -883,6 +886,8 @@ static void tm_reclaim_thread(struct thread_struct *thr, uint8_t cause)
 		thr->regs->msr);
 	giveup_all(container_of(thr, struct task_struct, thread));
 
+	/* Save SPRS before reclaim */
+	save_sprs(thr);
 	tm_reclaim(thr, cause);
 
 	/* Tag it so restore_tm_state will pay attention to this task */
@@ -939,6 +944,7 @@ void tm_recheckpoint(struct thread_struct *thread)
 
 	__tm_recheckpoint(thread);
 
+	restore_sprs_after_recheckpoint(thread);
 	local_irq_restore(flags);
 }
 
@@ -1166,6 +1172,19 @@ static inline void restore_sprs(struct thread_struct *old_thread,
 	thread_pkey_regs_restore(new_thread, old_thread);
 }
 
+static inline void restore_sprs_after_recheckpoint(struct thread_struct *thread)
+{
+#ifdef CONFIG_PPC_BOOK3S_64
+	if (cpu_has_feature(CPU_FTR_DSCR))
+		mtspr(SPRN_DSCR, thread->dscr);
+
+	if (cpu_has_feature(CPU_FTR_ARCH_207S)) {
+		mtspr(SPRN_TAR, thread->tar);
+		mtspr(SPRN_FSCR, thread->fscr);
+	}
+#endif
+}
+
 #ifdef CONFIG_PPC_BOOK3S_64
 #define CP_SIZE 128
 static const u8 dummy_copy_buffer[CP_SIZE] __attribute__((aligned(CP_SIZE)));
diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S
index 9fabdce255cd..e3090502061e 100644
--- a/arch/powerpc/kernel/tm.S
+++ b/arch/powerpc/kernel/tm.S
@@ -326,9 +326,18 @@ _GLOBAL(tm_reclaim)
 	mtlr	r0
 	ld	r2, STK_GOT(r1)
 
-	/* Load CPU's default DSCR */
-	ld	r0, PACA_DSCR_DEFAULT(r13)
-	mtspr	SPRN_DSCR, r0
+	/*
+	 * Load transactional SPRs, as all other registers have the
+	 * transactional value, which will be seen by save_sprs and so
+	 * forth. Checkpointed SPRs are in the thread->tm_tar/dscr.
+	 */
+	ld      r0, THREAD_DSCR(r12)
+	mtspr   SPRN_DSCR, r0
+	ld      r0, THREAD_TAR(r12)
+	mtspr   SPRN_TAR, r0
+	ld      r0, _PPR(r7)
+	mtspr   SPRN_PPR, r0
+
 
 	blr
 
@@ -518,10 +527,6 @@ restore_gprs:
 	mtlr	r0
 	ld	r2, STK_GOT(r1)
 
-	/* Load CPU's default DSCR */
-	ld	r0, PACA_DSCR_DEFAULT(r13)
-	mtspr	SPRN_DSCR, r0
-
 	blr
 
 	/* ****************************************************************** */
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH 13/14] powerpc/tm: Do not restore TM without SPRs
  2018-11-06 12:40 [RFC PATCH v2 00/14] New TM Model Breno Leitao
                   ` (11 preceding siblings ...)
  2018-11-06 12:40 ` [RFC PATCH 12/14] powerpc/tm: Restore transactional SPRs Breno Leitao
@ 2018-11-06 12:40 ` Breno Leitao
  2018-11-15  3:02   ` Michael Neuling
  2018-11-06 12:40 ` [RFC PATCH 14/14] selftests/powerpc: Adapt tm-syscall test to no suspend Breno Leitao
  2018-11-06 18:32 ` [RFC PATCH v2 00/14] New TM Model Florian Weimer
  14 siblings, 1 reply; 26+ messages in thread
From: Breno Leitao @ 2018-11-06 12:40 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, gromero, Breno Leitao, ldufour, cyrilbur

Currently the signal context restore code enables the MSR[TM] bit without
restoring the TM SPRs, which can cause undesired side effects.

This is not correct because MSR[TM] being set means the live TM SPR
registers are valid and up to date, which is not guaranteed here. In
fact, the live registers may contain the previous thread's SPRs.

Functions check whether the register values are valid by looking at
whether the facility is enabled in the MSR, since MSR[TM] set means the
TM SPRs are hot.

Enabling MSR[TM] without updating the live SPRs can cause a crash, since
the TM SPRs left over from the previous thread will later be saved into
the current thread, and might not have TEXASR[FS] set, for example.
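The invariant the patch enforces, namely that MSR[TM] set implies the
live TM SPRs belong to the current thread, can be sketched in C. The bit
value and fields are illustrative, not the real MSR layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define M_MSR_TM (1u << 5)	/* hypothetical "TM enabled" bit */

struct tm_ctx {
	uint32_t msr;
	bool sprs_valid;	/* "live TEXASR/TFHAR/TFIAR are this thread's" */
};

/* The two updates must happen together: never set MSR[TM] without
 * also restoring the live SPRs (models tm_enable(); tm_restore_sprs()). */
static void enable_tm(struct tm_ctx *c)
{
	if (!(c->msr & M_MSR_TM)) {
		c->msr |= M_MSR_TM;
		c->sprs_valid = true;
	}
}

/* Invariant: whenever TM is enabled, the SPRs must be valid. */
static bool tm_state_consistent(const struct tm_ctx *c)
{
	return !(c->msr & M_MSR_TM) || c->sprs_valid;
}
```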

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/powerpc/kernel/signal_64.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index 487c3b6aa2e3..156b90e8ee78 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -478,8 +478,18 @@ static long restore_tm_sigcontexts(struct task_struct *tsk,
 	 * happened whilst in the signal handler and load_tm overflowed,
 	 * disabling the TM bit. In either case we can end up with an illegal
 	 * TM state leading to a TM Bad Thing when we return to userspace.
+	 *
+	 * Every time MSR_TM is enabled, mainly for the b) case, the TM SPRs
+	 * must be restored in the live registers, since the live registers
+	 * could contain garbage and later we want to read from live, since
+	 * MSR_TM is enabled, and MSR[TM] is what is used to check if the
+	 * TM SPRs live registers are valid or not.
 	 */
-	regs->msr |= MSR_TM;
+	if ((regs->msr & MSR_TM) == 0) {
+		regs->msr |= MSR_TM;
+		tm_enable();
+		tm_restore_sprs(&tsk->thread);
+	}
 
 	/* pull in MSR LE from user context */
 	regs->msr = (regs->msr & ~MSR_LE) | (msr & MSR_LE);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH 14/14] selftests/powerpc: Adapt tm-syscall test to no suspend
  2018-11-06 12:40 [RFC PATCH v2 00/14] New TM Model Breno Leitao
                   ` (12 preceding siblings ...)
  2018-11-06 12:40 ` [RFC PATCH 13/14] powerpc/tm: Do not restore TM without SPRs Breno Leitao
@ 2018-11-06 12:40 ` Breno Leitao
  2018-11-06 18:32 ` [RFC PATCH v2 00/14] New TM Model Florian Weimer
  14 siblings, 0 replies; 26+ messages in thread
From: Breno Leitao @ 2018-11-06 12:40 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, gromero, Breno Leitao, ldufour, cyrilbur

The Documentation/powerpc/transactional_memory.txt says:

 "Syscalls made from within a suspended transaction are performed as normal
  and the transaction is not explicitly doomed by the kernel.  However,
  what the kernel does to perform the syscall may result in the transaction
  being doomed by the hardware."

With this new TM mechanism, a syscall made inside a suspended transaction
will still be executed, but the transaction will now always be doomed,
because it is reclaimed and recheckpointed on kernel entry, and that
reclaim/recheckpoint dooms the transaction.

This test expects that the transaction is not doomed, calling
getppid_tm_suspended(), which has the following body:

	FUNC_START(getppid_tm_suspended)
		tbegin.
		beq 1f
		li      r0, __NR_getppid
		tsuspend.
		sc
		tresume.
		tend.
		blr
	1:
		li      r3, -1
		blr

This can never succeed and return the syscall's output, because tresume
aborts the doomed transaction and jumps to the failure handler, which
returns r3 := -1.

This patch updates the test case so it no longer assumes that a syscall
inside a suspended transaction will not be doomed, because from now on
kernel entry dooms any suspended transaction.
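The behavioural change the test now encodes can be modelled as a toy
state machine in C: the syscall itself still succeeds, but the
transaction is doomed at kernel entry and therefore aborts at resume.
Everything here is illustrative, not the selftest's actual code.

```c
#include <assert.h>
#include <stdbool.h>

struct txn { bool active; bool doomed; };

static int syscall_in_suspended_txn(struct txn *t, int syscall_ret)
{
	/* kernel entry: TM_KERNEL_ENTRY reclaims, dooming the transaction */
	t->doomed = true;
	return syscall_ret;	/* the syscall itself still succeeds */
}

static int resume_and_commit(struct txn *t, int result)
{
	if (t->doomed) {
		/* tresume aborts; control goes to the failure handler */
		t->active = false;
		return -1;	/* mirrors "li r3, -1" in the handler */
	}
	return result;
}
```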

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 tools/testing/selftests/powerpc/tm/tm-syscall.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/tools/testing/selftests/powerpc/tm/tm-syscall.c b/tools/testing/selftests/powerpc/tm/tm-syscall.c
index 454b965a2db3..1439a87eba3a 100644
--- a/tools/testing/selftests/powerpc/tm/tm-syscall.c
+++ b/tools/testing/selftests/powerpc/tm/tm-syscall.c
@@ -78,12 +78,6 @@ int tm_syscall(void)
 	timeradd(&end, &now, &end);
 
 	for (count = 0; timercmp(&now, &end, <); count++) {
-		/*
-		 * Test a syscall within a suspended transaction and verify
-		 * that it succeeds.
-		 */
-		FAIL_IF(getppid_tm(true) == -1); /* Should succeed. */
-
 		/*
 		 * Test a syscall within an active transaction and verify that
 		 * it fails with the correct failure code.
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 00/14] New TM Model
  2018-11-06 12:40 [RFC PATCH v2 00/14] New TM Model Breno Leitao
                   ` (13 preceding siblings ...)
  2018-11-06 12:40 ` [RFC PATCH 14/14] selftests/powerpc: Adapt tm-syscall test to no suspend Breno Leitao
@ 2018-11-06 18:32 ` Florian Weimer
  2018-11-06 19:31   ` Breno Leitao
  14 siblings, 1 reply; 26+ messages in thread
From: Florian Weimer @ 2018-11-06 18:32 UTC (permalink / raw)
  To: Breno Leitao; +Cc: mikey, ldufour, linuxppc-dev, cyrilbur, gromero

* Breno Leitao:

> This  patchset for the hardware transactional memory (TM) subsystem
> aims to avoid spending a lot of time on TM suspended mode in kernel
> space.  It basically changes where the reclaim/recheckpoint will be
> executed.

I assumed that we want to abort on every system call these days?

We have this commit in glibc:

commit f0458cf4f9ff3d870c43b624e6dccaaf657d5e83
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Mon Aug 27 09:42:50 2018 -0300

    powerpc: Only enable TLE with PPC_FEATURE2_HTM_NOSC
    
    Linux from 3.9 through 4.2 does not abort HTM transaction on syscalls,
    instead it suspend and resume it when leaving the kernel.  The
    side-effects of the syscall will always remain visible, even if the
    transaction is aborted.  This is an issue when transaction is used along
    with futex syscall, on pthread_cond_wait for instance, where the futex
    call might succeed but the transaction is rolled back leading the
    pthread_cond object in an inconsistent state.
    
    Glibc used to prevent it by always aborting a transaction before issuing
    a syscall.  Linux 4.2 also decided to abort active transaction in
    syscalls which makes the glibc workaround superfluous.  Worse, glibc
    transaction abortion leads to a performance issue on recent kernels
    where the HTM state is saved/restore lazily (v4.9).  By aborting a
    transaction on every syscalls, regardless whether a transaction has being
    initiated before, GLIBS makes the kernel always save/restore HTM state
    (it can not even lazily disable it after a certain number of syscall
    iterations).
    
    Because of this shortcoming, Transactional Lock Elision is just enabled
    when it has been explicitly set (either by tunables of by a configure
    switch) and if kernel aborts HTM transactions on syscalls
    (PPC_FEATURE2_HTM_NOSC).  It is reported that using simple benchmark [1],
    the context-switch is about 5% faster by not issuing a tabort in every
    syscall in newer kernels.

I wonder how the new TM model interacts with the assumption we currently
have in glibc.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 00/14] New TM Model
  2018-11-06 18:32 ` [RFC PATCH v2 00/14] New TM Model Florian Weimer
@ 2018-11-06 19:31   ` Breno Leitao
  2018-11-07  0:39     ` Michael Neuling
  0 siblings, 1 reply; 26+ messages in thread
From: Breno Leitao @ 2018-11-06 19:31 UTC (permalink / raw)
  To: Florian Weimer
  Cc: mikey, gromero, Adhemerval Zanella, ldufour, linuxppc-dev, cyrilbur

hi Florian,

On 11/06/2018 04:32 PM, Florian Weimer wrote:
> * Breno Leitao:
> 
>> This  patchset for the hardware transactional memory (TM) subsystem
>> aims to avoid spending a lot of time on TM suspended mode in kernel
>> space.  It basically changes where the reclaim/recheckpoint will be
>> executed.
> 
> I assumed that we want to abort on every system call these days?
> 
> We have this commit in glibc:
> 
> commit f0458cf4f9ff3d870c43b624e6dccaaf657d5e83
> Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
> Date:   Mon Aug 27 09:42:50 2018 -0300
> 
>     powerpc: Only enable TLE with PPC_FEATURE2_HTM_NOSC
>     
>     Linux from 3.9 through 4.2 does not abort HTM transactions on syscalls;
>     instead it suspends and resumes them when leaving the kernel.  The
>     side-effects of the syscall will always remain visible, even if the
>     transaction is aborted.  This is an issue when a transaction is used
>     along with a futex syscall, on pthread_cond_wait for instance, where
>     the futex call might succeed but the transaction is rolled back,
>     leaving the pthread_cond object in an inconsistent state.
>     
>     Glibc used to prevent this by always aborting a transaction before
>     issuing a syscall.  Linux 4.2 also decided to abort active transactions
>     in syscalls, which makes the glibc workaround superfluous.  Worse, the
>     glibc transaction abort leads to a performance issue on recent kernels
>     where the HTM state is saved/restored lazily (v4.9).  By aborting a
>     transaction on every syscall, regardless of whether a transaction has
>     been initiated before, glibc makes the kernel always save/restore HTM
>     state (it cannot even lazily disable it after a certain number of
>     syscall iterations).
>     
>     Because of this shortcoming, Transactional Lock Elision is only
>     enabled when it has been explicitly set (either by tunables or by a
>     configure switch) and if the kernel aborts HTM transactions on
>     syscalls (PPC_FEATURE2_HTM_NOSC).  It is reported that with a simple
>     benchmark [1], context switching is about 5% faster when not issuing
>     a tabort on every syscall on newer kernels.
> 
> I wonder how the new TM model interacts with the assumption we currently
> have in glibc.

This new TM model is almost transparent to userspace. My patchset basically
affects where reclaim and recheckpoint happen inside kernel space, and
should not change userspace behavior.

I say "almost transparent" because it might cause some very specific
transactions to have a higher doom rate; see patch 14/14 for more detailed
information, and also for a reference to glibc's "tabort prior to system
calls" behavior.

Regarding Adhemerval's patch, it is unaffected by this new model. Prior to
kernel 4.2, the kernel executed a syscall independently of the TM state,
which caused undesired side effects, hence glibc's decision to abort a
transaction before issuing a syscall.

Later, the kernel's system call mechanism became aware of the TM state, and
this glibc workaround was no longer necessary.

More than that, this workaround started to cause performance degradation on
context switches, mainly once the TM facility became lazily enabled, i.e.,
the TM facility would only be enabled on demand (when a task uses TM
explicitly). This happens because the "abort prior to every system call"
workaround caused the TM facility to be enabled for every task that makes
system calls.

In fact, I was the one who identified this performance degradation issue,
and reported it to Adhemerval, who kindly fixed it with
f0458cf4f9ff3d870c43b624e6dccaaf657d5e83.

Anyway, I think we are safe here.

Thanks for bringing this up.
Breno

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 00/14] New TM Model
  2018-11-06 19:31   ` Breno Leitao
@ 2018-11-07  0:39     ` Michael Neuling
  0 siblings, 0 replies; 26+ messages in thread
From: Michael Neuling @ 2018-11-07  0:39 UTC (permalink / raw)
  To: Breno Leitao, Florian Weimer
  Cc: ldufour, linuxppc-dev, cyrilbur, Adhemerval Zanella, gromero


> In fact, I was the one that identified this performance degradation issue,
> and reported to Adhemerval who kindly fixed it with
> f0458cf4f9ff3d870c43b624e6dccaaf657d5e83.
> 
> Anyway, I think we are safe here.

FWIW agreed. PPC_FEATURE2_HTM_NOSC should be preserved by this series.

Mikey

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH 02/14] powerpc/tm: Reclaim on unavailable exception
  2018-11-06 12:40 ` [RFC PATCH 02/14] powerpc/tm: Reclaim on unavailable exception Breno Leitao
@ 2018-11-15  0:06   ` Michael Neuling
  0 siblings, 0 replies; 26+ messages in thread
From: Michael Neuling @ 2018-11-15  0:06 UTC (permalink / raw)
  To: Breno Leitao, linuxppc-dev; +Cc: ldufour, cyrilbur, gromero

On Tue, 2018-11-06 at 10:40 -0200, Breno Leitao wrote:
> If there is an FP/VEC/Altivec touch inside a transaction and the facility
> is disabled, then a facility unavailable exception is raised and ends up
> calling {fp,vec,vsx}_unavailable_tm, which was reclaiming and
> recheckpointing.
> 
> This is not required anymore, since the checkpointed state was reclaimed
> at exception entry, and it will be recheckpointed by restore_tm_state
> later.
> 
> Add a WARN_ON() in case we hit _unavailable_tm() in suspended mode, i.e.,
> the reclaim somehow was not executed at trap entry, which would be a bug.

The "why" above is good and is the important part of the commit, but can
you also add what you're doing? The above would suggest you're just
removing things, but you're actually adding the TM_KERNEL_ENTRY() macro too.

Mikey

> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
>  arch/powerpc/include/asm/exception-64s.h |  4 ++++
>  arch/powerpc/kernel/exceptions-64s.S     |  3 +++
>  arch/powerpc/kernel/traps.c              | 22 ++++------------------
>  3 files changed, 11 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/exception-64s.h
> b/arch/powerpc/include/asm/exception-64s.h
> index 931a74ba037b..80f01d5683c3 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -711,6 +711,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
>  	 * Soft disable the IRQs, otherwise it might cause a CPU hang.	\
>  	 */								\
>  	RECONCILE_IRQ_STATE(r10, r11);					\
> +	/*								\
> +	 * Although this cause will be set initially, it might be	\
> +	 * updated later, once the exception is better understood	\
> +	 */								\
>  	li      r3, cause;						\
>  	bl      tm_reclaim_current;					\
>  	li      r3, 1;		/* Reclaim happened */			\
> diff --git a/arch/powerpc/kernel/exceptions-64s.S
> b/arch/powerpc/kernel/exceptions-64s.S
> index 5c685a46202d..47e05b09eed6 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -786,6 +786,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM)
>  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>  2:	/* User process was in a transaction */
>  	bl	save_nvgprs
> +	TM_KERNEL_ENTRY(TM_CAUSE_FAC_UNAV)
>  	RECONCILE_IRQ_STATE(r10, r11)
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
>  	bl	fp_unavailable_tm
> @@ -1128,6 +1129,7 @@ BEGIN_FTR_SECTION
>  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>  2:	/* User process was in a transaction */
>  	bl	save_nvgprs
> +	TM_KERNEL_ENTRY(TM_CAUSE_FAC_UNAV)
>  	RECONCILE_IRQ_STATE(r10, r11)
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
>  	bl	altivec_unavailable_tm
> @@ -1164,6 +1166,7 @@ BEGIN_FTR_SECTION
>  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>  2:	/* User process was in a transaction */
>  	bl	save_nvgprs
> +	TM_KERNEL_ENTRY(TM_CAUSE_FAC_UNAV)
>  	RECONCILE_IRQ_STATE(r10, r11)
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
>  	bl	vsx_unavailable_tm
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 9a86572db1ef..e74b735e974c 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -1742,23 +1742,10 @@ void fp_unavailable_tm(struct pt_regs *regs)
>           * transaction, and probably retry but now with FP enabled.  So the
>           * checkpointed FP registers need to be loaded.
>  	 */
> -	tm_reclaim_current(TM_CAUSE_FAC_UNAV);
> -
> -	/*
> -	 * Reclaim initially saved out bogus (lazy) FPRs to ckfp_state, and
> -	 * then it was overwrite by the thr->fp_state by tm_reclaim_thread().
> -	 *
> -	 * At this point, ck{fp,vr}_state contains the exact values we want to
> -	 * recheckpoint.
> -	 */
> +	WARN_ON(MSR_TM_SUSPENDED(mfmsr()));
>  
>  	/* Enable FP for the task: */
>  	current->thread.load_fp = 1;
> -
> -	/*
> -	 * Recheckpoint all the checkpointed ckpt, ck{fp, vr}_state registers.
> -	 */
> -	tm_recheckpoint(&current->thread);
>  }
>  
>  void altivec_unavailable_tm(struct pt_regs *regs)
> @@ -1770,10 +1757,10 @@ void altivec_unavailable_tm(struct pt_regs *regs)
>  	TM_DEBUG("Vector Unavailable trap whilst transactional at 0x%lx,"
>  		 "MSR=%lx\n",
>  		 regs->nip, regs->msr);
> -	tm_reclaim_current(TM_CAUSE_FAC_UNAV);
> +	WARN_ON(MSR_TM_SUSPENDED(mfmsr()));
>  	current->thread.load_vec = 1;
> -	tm_recheckpoint(&current->thread);
>  	current->thread.used_vr = 1;
> +
>  }
>  
>  void vsx_unavailable_tm(struct pt_regs *regs)
> @@ -1792,12 +1779,11 @@ void vsx_unavailable_tm(struct pt_regs *regs)
>  	current->thread.used_vsr = 1;
>  
>  	/* This reclaims FP and/or VR regs if they're already enabled */
> -	tm_reclaim_current(TM_CAUSE_FAC_UNAV);
> +	WARN_ON(MSR_TM_SUSPENDED(mfmsr()));
>  
>  	current->thread.load_vec = 1;
>  	current->thread.load_fp = 1;
>  
> -	tm_recheckpoint(&current->thread);
>  }
>  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
>  

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH 01/14] powerpc/tm: Reclaim transaction on kernel entry
  2018-11-06 12:40 ` [RFC PATCH 01/14] powerpc/tm: Reclaim transaction on kernel entry Breno Leitao
@ 2018-11-15  0:06   ` Michael Neuling
  2018-11-15  0:51   ` Nicholas Piggin
  1 sibling, 0 replies; 26+ messages in thread
From: Michael Neuling @ 2018-11-15  0:06 UTC (permalink / raw)
  To: Breno Leitao, linuxppc-dev; +Cc: Nicholas Piggin, gromero, ldufour, cyrilbur

On Tue, 2018-11-06 at 10:40 -0200, Breno Leitao wrote:
> This patch creates a macro that will be invoked on all entrance to the
> kernel, so, in kernel space the transaction will be completely reclaimed
> and not suspended anymore.
> 
> This patchset checks if we are coming from PR, if not, skip. 

Remove the double negative here, i.e.:

"This skips when coming from the OS", or "Only happens when coming from PR".

> This is useful
> when there is a irq_replay() being called after recheckpoint, when the IRQ
> is re-enable. 

So we are talking about tm_recheckpoint on exit? On exit, we do:
   tm_recheckpoint -> irq_replay -> rfid?

Why not swap the order of the recheckpoint and the replay to avoid this problem?

> In this case, we do not want to re-reclaim and
> re-recheckpoint, thus, if not coming from PR, skip it completely.

More double negatives... Try: "if coming from the OS, skip" or "only do it
when coming from userspace".

> This macro does not care about TM SPR also, it will only be saved and
> restore in the context switch code now on.
> This macro will return 0 or 1 in r3 register, to specify if a reclaim was
> executed or not.
> 
> This patchset is based on initial work done by Cyril:
> https://patchwork.ozlabs.org/cover/875341/
> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
>  arch/powerpc/include/asm/exception-64s.h | 46 ++++++++++++++++++++++++
>  arch/powerpc/kernel/entry_64.S           | 10 ++++++
>  arch/powerpc/kernel/exceptions-64s.S     | 12 +++++--
>  3 files changed, 66 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> index 3b4767ed3ec5..931a74ba037b 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -36,6 +36,7 @@
>   */
>  #include <asm/head-64.h>
>  #include <asm/feature-fixups.h>
> +#include <asm/tm.h>
>  
>  /* PACA save area offsets (exgen, exmc, etc) */
>  #define EX_R9		0
> @@ -677,10 +678,54 @@ BEGIN_FTR_SECTION				\
>  	beql	ppc64_runlatch_on_trampoline;	\
>  END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
>  
> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> +
> +/*
> + * This macro will reclaim a transaction if called when coming from userspace
> + * (MSR.PR = 1) and if the transaction state is active or suspended.
> + *
> + * Since we don't want to reclaim when coming from kernel, for instance after
> + * a trechkpt. or a IRQ replay, the live MSR is not useful and instead of it the
> + * MSR from thread stack is used to check the MSR.PR bit.
> + * This macro has one argument which is the cause that will be used by treclaim.
> + * and returns in r3 '1' if the reclaim happens or '0' if reclaim didn't
> + * happen, which is useful to know what registers were clobbered.
> + *
> + * NOTE: If addition registers are clobbered here, make sure the callee
> + * function restores them before proceeding.
> + */
> +#define TM_KERNEL_ENTRY(cause)						\
> +	ld      r3, _MSR(r1);						\
> +	andi.   r0, r3, MSR_PR;	/* Coming from userspace? */		\
> +	beq     1f;		/* Skip reclaim if MSR.PR != 1 */	\
> +	rldicl. r0, r3, (64-MSR_TM_LG), 63; /* Is TM enabled? */	\
> +	beq     1f;		/* Skip reclaim if TM is off */		\
> +	rldicl. r0, r3, (64-MSR_TS_LG), 62;	/* Is active */		\
> +	beq     1f;		/* Skip reclaim if neither */		\
> +	/*								\
> +	 * If there is a transaction active or suspended, save the	\
> +	 * non-volatile GPRs if they are not already saved.		\
> +	 */								\
> +	bl      save_nvgprs;						\
> +	/*								\
> +	 * Soft disable the IRQs, otherwise it might cause a CPU hang.	\
> +	 */								\
> +	RECONCILE_IRQ_STATE(r10, r11);					\
> +	li      r3, cause;						\
> +	bl      tm_reclaim_current;					\

Are we ready to call out to C at this point in the exception handlers?

> +	li      r3, 1;		/* Reclaim happened */			\
> +	b       2f;							\
> +1:	li      r3, 0;		/* Reclaim didn't happen */             \
> +2:
> +#else
> +#define TM_KERNEL_ENTRY(cause)
> +#endif
> +
>  #define EXCEPTION_COMMON(area, trap, label, hdlr, ret, additions) \
>         EXCEPTION_PROLOG_COMMON(trap, area);                    \
>         /* Volatile regs are potentially clobbered here */      \
>         additions;                                              \
> +       TM_KERNEL_ENTRY(TM_CAUSE_MISC);                                 \
>         addi    r3,r1,STACK_FRAME_OVERHEAD;                     \
>         bl      hdlr;                                           \
>         b       ret
> @@ -695,6 +740,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
>         EXCEPTION_PROLOG_COMMON_3(trap);                        \
>         /* Volatile regs are potentially clobbered here */      \
>         additions;                                              \
> +       TM_KERNEL_ENTRY(TM_CAUSE_MISC);                         \
>         addi    r3,r1,STACK_FRAME_OVERHEAD;                     \
>         bl      hdlr
>  
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 7b1693adff2a..17484ebda66c 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -131,6 +131,16 @@ BEGIN_FW_FTR_SECTION
>  END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR)
>  #endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE && CONFIG_PPC_SPLPAR */
>  
> +#if CONFIG_PPC_TRANSACTIONAL_MEM
> +	TM_KERNEL_ENTRY(TM_CAUSE_SYSCALL)
> +	cmpdi	r3, 0x1
> +	bne	44f
> +	/* Restore from r4 to r12 */
> +	REST_8GPRS(4,r1)
> +44:	/* treclaim was not called, just restore r3 and r0 */
> +	REST_GPR(3, r1)
> +	REST_GPR(0, r1)
> +#endif
>  	/*
>  	 * A syscall should always be called with interrupts enabled
>  	 * so we just unconditionally hard-enable here. When some kind
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 89d32bb79d5e..5c685a46202d 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -717,6 +717,7 @@ EXC_COMMON_BEGIN(alignment_common)
>  	std	r3,_DAR(r1)
>  	std	r4,_DSISR(r1)
>  	bl	save_nvgprs
> +	TM_KERNEL_ENTRY(TM_CAUSE_ALIGNMENT)
>  	RECONCILE_IRQ_STATE(r10, r11)
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
>  	bl	alignment_exception
> @@ -751,6 +752,8 @@ EXC_COMMON_BEGIN(program_check_common)
>  	b 3f				/* Jump into the macro !!	*/
>  1:	EXCEPTION_PROLOG_COMMON(0x700, PACA_EXGEN)
>  	bl	save_nvgprs
> +	ld      r3, _MSR(r1)
> +	TM_KERNEL_ENTRY(TM_CAUSE_FAC_UNAV)
>  	RECONCILE_IRQ_STATE(r10, r11)
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
>  	bl	program_check_exception
> @@ -1650,7 +1653,9 @@ do_hash_page:
>  
>  /* Here we have a page fault that hash_page can't handle. */
>  handle_page_fault:
> -11:	andis.  r0,r4,DSISR_DABRMATCH@h
> +11:	TM_KERNEL_ENTRY(TM_CAUSE_TLBI)
> +	ld      r4,_DSISR(r1)
> +	andis.  r0,r4,DSISR_DABRMATCH@h
>  	bne-    handle_dabr_fault
>  	ld	r4,_DAR(r1)
>  	ld	r5,_DSISR(r1)
> @@ -1681,6 +1686,8 @@ handle_dabr_fault:
>   */
>  13:	bl	save_nvgprs
>  	mr	r5,r3
> +	TM_KERNEL_ENTRY(TM_CAUSE_TLBI)
> +	REST_GPR(3,r1)
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
>  	ld	r4,_DAR(r1)
>  	bl	low_hash_fault
> @@ -1695,7 +1702,8 @@ handle_dabr_fault:
>   * the access, or panic if there isn't a handler.
>   */
>  77:	bl	save_nvgprs
> -	mr	r4,r3
> +	TM_KERNEL_ENTRY(TM_CAUSE_TLBI)
> +	ld      r4,_DAR(r1)
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
>  	li	r5,SIGSEGV
>  	bl	bad_page_fault

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH 03/14] powerpc/tm: Recheckpoint when exiting from kernel
  2018-11-06 12:40 ` [RFC PATCH 03/14] powerpc/tm: Recheckpoint when exiting from kernel Breno Leitao
@ 2018-11-15  0:11   ` Michael Neuling
  0 siblings, 0 replies; 26+ messages in thread
From: Michael Neuling @ 2018-11-15  0:11 UTC (permalink / raw)
  To: Breno Leitao, linuxppc-dev; +Cc: ldufour, cyrilbur, gromero

On Tue, 2018-11-06 at 10:40 -0200, Breno Leitao wrote:
> This is the only place we are going to recheckpoint now. Now the task
> needs to have TIF_RESTORE_TM flag set, which will get into
> restore_tm_state() at exception exit path, and execute the recheckpoint
> depending on the MSR.
> 
> Every time a task is required to recheckpoint, or just have the TM SPRs
> restore, the TIF_RESTORE_TM flag should be set and the task MSR should
> properly be in a transactional state, which will be checked by
> restore_tm_state().
> 
> After the facility registers are recheckpointed, they are clobbered with
> the values that were recheckpointed (and are now also in the checkpoint
> area).

Which facility registers? I don't understand this.

> If facility is enabled at MSR that is being returned to user space, then
> the facility registers need to be restored, otherwise userspace will see
> invalid values.
> 
> This patch simplify the restore_tm_state() to just restore the facility
> registers that are enabled when returning to userspace, i.e. the MSR will
> be the same that will be put into SRR1, which will be the MSR after RFID.
> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
>  arch/powerpc/kernel/process.c | 38 ++++++++++++++++++++++++-----------
>  1 file changed, 26 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 4d5322cfad25..c7e758a42b8f 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -1049,8 +1049,6 @@ static inline void __switch_to_tm(struct task_struct
> *prev,
>   */
>  void restore_tm_state(struct pt_regs *regs)
>  {
> -	unsigned long msr_diff;
> -
>  	/*
>  	 * This is the only moment we should clear TIF_RESTORE_TM as
>  	 * it is here that ckpt_regs.msr and pt_regs.msr become the same
> @@ -1061,19 +1059,35 @@ void restore_tm_state(struct pt_regs *regs)
>  	if (!MSR_TM_ACTIVE(regs->msr))
>  		return;
>  
> -	msr_diff = current->thread.ckpt_regs.msr & ~regs->msr;
> -	msr_diff &= MSR_FP | MSR_VEC | MSR_VSX;
> +	tm_enable();
> +	/* The only place we recheckpoint */
> +	tm_recheckpoint(&current->thread);
>  
> -	/* Ensure that restore_math() will restore */
> -	if (msr_diff & MSR_FP)
> -		current->thread.load_fp = 1;
> +	/*
> +	 * Restore the facility registers that were clobbered during
> +	 * recheckpoint.
> +	 */
> +	if (regs->msr & MSR_FP) {
> +		/*
> +		 * Using load_fp_state() instead of restore_fp() because we
> +		 * want to force the restore, independent of
> +		 * tsk->thread.load_fp. Same for other cases below.
> +		 */
> +		load_fp_state(&current->thread.fp_state);
> +	}
>  #ifdef CONFIG_ALTIVEC
> -	if (cpu_has_feature(CPU_FTR_ALTIVEC) && msr_diff & MSR_VEC)
> -		current->thread.load_vec = 1;
> +	if (cpu_has_feature(CPU_FTR_ALTIVEC) && regs->msr & MSR_VEC)
> +		load_vr_state(&current->thread.vr_state);
> +#endif
> +#ifdef CONFIG_VSX
> +	if (cpu_has_feature(CPU_FTR_VSX) && regs->msr & MSR_VSX) {
> +		/*
> +		 * If VSX is enabled, it is expected that VEC and FP are
> +		 * also enabled and already restored the full register set.
> +		 * Cause a warning if that is not the case.
> +		 */
> +		WARN_ON(!(regs->msr & MSR_VEC) || !(regs->msr & MSR_FP)); }
>  #endif
> -	restore_math(regs);
> -
> -	regs->msr |= msr_diff;
>  }
>  
>  #else

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH 01/14] powerpc/tm: Reclaim transaction on kernel entry
  2018-11-06 12:40 ` [RFC PATCH 01/14] powerpc/tm: Reclaim transaction on kernel entry Breno Leitao
  2018-11-15  0:06   ` Michael Neuling
@ 2018-11-15  0:51   ` Nicholas Piggin
  1 sibling, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2018-11-15  0:51 UTC (permalink / raw)
  To: Breno Leitao; +Cc: mikey, ldufour, linuxppc-dev, cyrilbur, gromero

On Tue,  6 Nov 2018 10:40:15 -0200
Breno Leitao <leitao@debian.org> wrote:

> This patch creates a macro that will be invoked on all entrance to the
> kernel, so, in kernel space the transaction will be completely reclaimed
> and not suspended anymore.

This doesn't get invoked on _all_ kernel entries, by the looks (SLB
miss or early machine check, for example). And of course we always
have to run _some_ MSR[PR]=0 code before it is reclaimed. So it is
important to document the rules for what code must not run with TM
suspended now, and why.

> 
> This patchset checks if we are coming from PR, if not, skip. This is useful
> when there is a irq_replay() being called after recheckpoint, when the IRQ
> is re-enable. In this case, we do not want to re-reclaim and
> re-recheckpoint, thus, if not coming from PR, skip it completely.

I really should learn a bit more about TM but I've been trying not to.
Seeing as I don't, I don't really understand this comment. Why don't
we want to reclaim?

> 
> This macro does not care about TM SPR also, it will only be saved and
> restore in the context switch code now on.
> 
> This macro will return 0 or 1 in r3 register, to specify if a reclaim was
> executed or not.

We want to be careful about efficiency here, so I think this macro
should be tightened up. A lot of code doesn't seem to care about the
return value, for example, so you could have two macros: one which
cares about the return value, another which doesn't. Instead of setting
a value via branches which you then test and branch on again, the macro
could perhaps accept branch labels to jump to.

It would be good to move the TM reclaim path out of line and make the
common case a not taken branch. Don't know how feasible that will be.

> 
> This patchset is based on initial work done by Cyril:
> https://patchwork.ozlabs.org/cover/875341/
> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
>  arch/powerpc/include/asm/exception-64s.h | 46 ++++++++++++++++++++++++
>  arch/powerpc/kernel/entry_64.S           | 10 ++++++
>  arch/powerpc/kernel/exceptions-64s.S     | 12 +++++--
>  3 files changed, 66 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> index 3b4767ed3ec5..931a74ba037b 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -36,6 +36,7 @@
>   */
>  #include <asm/head-64.h>
>  #include <asm/feature-fixups.h>
> +#include <asm/tm.h>
>  
>  /* PACA save area offsets (exgen, exmc, etc) */
>  #define EX_R9		0
> @@ -677,10 +678,54 @@ BEGIN_FTR_SECTION				\
>  	beql	ppc64_runlatch_on_trampoline;	\
>  END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
>  
> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> +
> +/*
> + * This macro will reclaim a transaction if called when coming from userspace
> + * (MSR.PR = 1) and if the transaction state is active or suspended.
> + *
> + * Since we don't want to reclaim when coming from kernel, for instance after
> + * a trechkpt. or a IRQ replay, the live MSR is not useful and instead of it the
> + * MSR from thread stack is used to check the MSR.PR bit.
> + * This macro has one argument which is the cause that will be used by treclaim.
> + * and returns in r3 '1' if the reclaim happens or '0' if reclaim didn't
> + * happen, which is useful to know what registers were clobbered.
> + *
> + * NOTE: If addition registers are clobbered here, make sure the callee
> + * function restores them before proceeding.
> + */
> +#define TM_KERNEL_ENTRY(cause)						\
> +	ld      r3, _MSR(r1);						\
> +	andi.   r0, r3, MSR_PR;	/* Coming from userspace? */		\
> +	beq     1f;		/* Skip reclaim if MSR.PR != 1 */	\

I wonder if this can be put with the other userspace entry code?
Maybe it's too difficult.

> +	rldicl. r0, r3, (64-MSR_TM_LG), 63; /* Is TM enabled? */	\
> +	beq     1f;		/* Skip reclaim if TM is off */		\
> +	rldicl. r0, r3, (64-MSR_TS_LG), 62;	/* Is active */		\
> +	beq     1f;		/* Skip reclaim if neither */		\

Can this be merged into a single test?

And/or can these branches be rearranged so the one most likely to
go to skip happens first? (I assume TM being active is less likely
than being enabled).

> +	/*								\
> +	 * If there is a transaction active or suspended, save the	\
> +	 * non-volatile GPRs if they are not already saved.		\
> +	 */								\
> +	bl      save_nvgprs;						\
> +	/*								\
> +	 * Soft disable the IRQs, otherwise it might cause a CPU hang.	\
> +	 */								\
> +	RECONCILE_IRQ_STATE(r10, r11);					\

How might this cause a CPU hang? IRQ state must be reconciled before
enabling MSR[EE] and also before doing any local_irq_disable, etc. But
if we do neither of those things, it should be okay to run C code
without this.

Another option may be to always call this macro after the irq state is
reconciled, although that may not work depending on what your rules
are for reclaiming a txn. Maybe two versions of the macro, though.

> +	li      r3, cause;						\
> +	bl      tm_reclaim_current;					\
> +	li      r3, 1;		/* Reclaim happened */			\
> +	b       2f;							\
> +1:	li      r3, 0;		/* Reclaim didn't happen */		\
> +2:
> +#else
> +#define TM_KERNEL_ENTRY(cause)
> +#endif
> +
>  #define EXCEPTION_COMMON(area, trap, label, hdlr, ret, additions) \
>  	EXCEPTION_PROLOG_COMMON(trap, area);			\
>  	/* Volatile regs are potentially clobbered here */	\
>  	additions;						\
> +	TM_KERNEL_ENTRY(TM_CAUSE_MISC);					\
>  	addi	r3,r1,STACK_FRAME_OVERHEAD;			\
>  	bl	hdlr;						\
>  	b	ret
> @@ -695,6 +740,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
>  	EXCEPTION_PROLOG_COMMON_3(trap);			\
>  	/* Volatile regs are potentially clobbered here */	\
>  	additions;						\
> +	TM_KERNEL_ENTRY(TM_CAUSE_MISC);				\
>  	addi	r3,r1,STACK_FRAME_OVERHEAD;			\
>  	bl	hdlr
>  
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 7b1693adff2a..17484ebda66c 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -131,6 +131,16 @@ BEGIN_FW_FTR_SECTION
>  END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR)
>  #endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE && CONFIG_PPC_SPLPAR */
>  
> +#if CONFIG_PPC_TRANSACTIONAL_MEM
> +	TM_KERNEL_ENTRY(TM_CAUSE_SYSCALL)
> +	cmpdi	r3, 0x1
> +	bne	44f
> +	/* Restore from r4 to r12 */
> +	REST_8GPRS(4,r1)
> +44:	/* treclaim was not called, just restore r3 and r0 */
> +	REST_GPR(3, r1)
> +	REST_GPR(0, r1)
> +#endif

AFAIKS this is the only place the return value is used? Unless future
patches use it more.

The comments are not all that helpful. It would be good to add a bit
more explanation. Are these REST_ purely to restore the registers we
clobbered in the process of testing and reclaiming?

Normally the caller passes in registers to the macros which use them
for testing things like this, then it's a bit easier to see what's
going on.

>  	/*
>  	 * A syscall should always be called with interrupts enabled
>  	 * so we just unconditionally hard-enable here. When some kind
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 89d32bb79d5e..5c685a46202d 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -717,6 +717,7 @@ EXC_COMMON_BEGIN(alignment_common)
>  	std	r3,_DAR(r1)
>  	std	r4,_DSISR(r1)
>  	bl	save_nvgprs
> +	TM_KERNEL_ENTRY(TM_CAUSE_ALIGNMENT)
>  	RECONCILE_IRQ_STATE(r10, r11)
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
>  	bl	alignment_exception
> @@ -751,6 +752,8 @@ EXC_COMMON_BEGIN(program_check_common)
>  	b 3f				/* Jump into the macro !!	*/
>  1:	EXCEPTION_PROLOG_COMMON(0x700, PACA_EXGEN)
>  	bl	save_nvgprs
> +	ld      r3, _MSR(r1)

        ^^^ not required

> +	TM_KERNEL_ENTRY(TM_CAUSE_FAC_UNAV)
>  	RECONCILE_IRQ_STATE(r10, r11)

See, these could all use a much smaller macro which does not produce the
return value, does not reconcile, does not save nvgprs, etc.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH 05/14] powerpc/tm: Refactor the __switch_to_tm code
  2018-11-06 12:40 ` [RFC PATCH 05/14] powerpc/tm: Refactor the __switch_to_tm code Breno Leitao
@ 2018-11-15  1:04   ` Michael Neuling
  0 siblings, 0 replies; 26+ messages in thread
From: Michael Neuling @ 2018-11-15  1:04 UTC (permalink / raw)
  To: Breno Leitao, linuxppc-dev; +Cc: ldufour, cyrilbur, gromero

On Tue, 2018-11-06 at 10:40 -0200, Breno Leitao wrote:
> __switch_to_tm is the function that switches between two tasks which might
> have TM enabled. This function is clearly split in two parts, the task that
> is leaving the CPU, known as 'prev' and the task that is being scheduled,
> known as 'new'.
> 
> It starts checking if the previous task had TM enable, if so, it increases
> the load_tm (this is the only place we increment load_tm). It also saves
> the TM SPRs here.
> 
> If the previous task was scheduled out with a transaction active, the
> failure cause needs to be updated, since it might contain the failure cause
> that caused the exception, as TM_CAUSE_MISC. In this case, since there was
> a context switch, overwrite the failure cause.
> 
> If the previous task has overflowed load_tm, disable TM, putting the
> facility save/restore lazy mechanism at lazy mode.
> 
> Regarding the 'new' task being scheduled, restoring TM SPRs is enough if
> the task had TM enabled when it was de-scheduled. (Checking if a
> recheckpoint would be required will be done later, at restore_tm_state()
> stage.)
> 
> On top of that, both tm_reclaim_task() and tm_recheckpoint_new_task()
> functions are not used anymore, removing them.

Is the above describing the previous functionality or the refactored
functionality?

> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
>  arch/powerpc/kernel/process.c | 167 ++++++++++++++++------------------
>  1 file changed, 78 insertions(+), 89 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 1842fd96b123..73872f751b33 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -912,48 +912,6 @@ void tm_reclaim_current(uint8_t cause)
>  	tm_reclaim_thread(&current->thread, cause);
>  }
>  
> -static inline void tm_reclaim_task(struct task_struct *tsk)
> -{
> -	/* We have to work out if we're switching from/to a task that's in the
> -	 * middle of a transaction.
> -	 *
> -	 * In switching we need to maintain a 2nd register state as
> -	 * oldtask->thread.ckpt_regs.  We tm_reclaim(oldproc); this saves the
> -	 * checkpointed (tbegin) state in ckpt_regs, ckfp_state and
> -	 * ckvr_state
> -	 *
> -	 * We also context switch (save) TFHAR/TEXASR/TFIAR in here.
> -	 */
> -	struct thread_struct *thr = &tsk->thread;
> -
> -	if (!thr->regs)
> -		return;
> -
> -	if (!MSR_TM_ACTIVE(thr->regs->msr))
> -		goto out_and_saveregs;
> -
> -	WARN_ON(tm_suspend_disabled);
> -
> -	TM_DEBUG("--- tm_reclaim on pid %d (NIP=%lx, "
> -		 "ccr=%lx, msr=%lx, trap=%lx)\n",
> -		 tsk->pid, thr->regs->nip,
> -		 thr->regs->ccr, thr->regs->msr,
> -		 thr->regs->trap);
> -
> -	tm_reclaim_thread(thr, TM_CAUSE_RESCHED);
> -
> -	TM_DEBUG("--- tm_reclaim on pid %d complete\n",
> -		 tsk->pid);
> -
> -out_and_saveregs:
> -	/* Always save the regs here, even if a transaction's not active.
> -	 * This context-switches a thread's TM info SPRs.  We do it here to
> -	 * be consistent with the restore path (in recheckpoint) which
> -	 * cannot happen later in _switch().
> -	 */
> -	tm_save_sprs(thr);
> -}
> -
>  extern void __tm_recheckpoint(struct thread_struct *thread);
>  
>  void tm_recheckpoint(struct thread_struct *thread)
> @@ -980,59 +938,91 @@ void tm_recheckpoint(struct thread_struct *thread)
>  	local_irq_restore(flags);
>  }
>  
> -static inline void tm_recheckpoint_new_task(struct task_struct *new)
> +static void tm_change_failure_cause(struct task_struct *task, uint8_t cause)
>  {
> -	if (!cpu_has_feature(CPU_FTR_TM))
> -		return;
> -
> -	/* Recheckpoint the registers of the thread we're about to switch to.
> -	 *
> -	 * If the task was using FP, we non-lazily reload both the original and
> -	 * the speculative FP register states.  This is because the kernel
> -	 * doesn't see if/when a TM rollback occurs, so if we take an FP
> -	 * unavailable later, we are unable to determine which set of FP regs
> -	 * need to be restored.
> -	 */
> -	if (!tm_enabled(new))
> -		return;
> -
> -	if (!MSR_TM_ACTIVE(new->thread.regs->msr)){
> -		tm_restore_sprs(&new->thread);
> -		return;
> -	}
> -	/* Recheckpoint to restore original checkpointed register state. */
> -	TM_DEBUG("*** tm_recheckpoint of pid %d (new->msr 0x%lx)\n",
> -		 new->pid, new->thread.regs->msr);
> -
> -	tm_recheckpoint(&new->thread);
> -
> -	/*
> -	 * The checkpointed state has been restored but the live state has
> -	 * not, ensure all the math functionality is turned off to trigger
> -	 * restore_math() to reload.
> -	 */
> -	new->thread.regs->msr &= ~(MSR_FP | MSR_VEC | MSR_VSX);
> -
> -	TM_DEBUG("*** tm_recheckpoint of pid %d complete "
> -		 "(kernel msr 0x%lx)\n",
> -		 new->pid, mfmsr());
> +	task->thread.tm_texasr &= ~TEXASR_FC;
> +	task->thread.tm_texasr |= (unsigned long) cause << TEXASR_FC_LG;
>  }
>  
>  static inline void __switch_to_tm(struct task_struct *prev,
>  		struct task_struct *new)
>  {
> -	if (cpu_has_feature(CPU_FTR_TM)) {
> -		if (tm_enabled(prev) || tm_enabled(new))
> -			tm_enable();
> +	if (!cpu_has_feature(CPU_FTR_TM))
> +		return;
> +
> +	/* The task leaving the CPU was using TM, let's handle it */
> +	if (tm_enabled(prev)) {
> +		/*
> +		 * Load_tm is incremented only when the task is scheduled out
> +		 */
> +		prev->thread.load_tm++;
> +
> +		/*
> +		 * If TM is enabled for the thread, it needs to, at least,
> +		 * save the SPRs
> +		 */
> +		tm_enable();
> +		tm_save_sprs(&prev->thread);
> +
> +		/*
> +		 * If the task being descheduled had an active transaction
> +		 * during the exception that brought it here, the
> +		 * transaction was reclaimed by TM_KERNEL_ENTRY.
> +		 *
> +		 * Independently of the TEXASR failure code set at
> +		 * TM_KERNEL_ENTRY time, the task is being descheduled, and
> +		 * the failure code needs to be updated to reflect it.
> +		 */
> +		if (MSR_TM_ACTIVE(prev->thread.regs->msr)) {
> +			/*
> +			 * If there was an IRQ during trecheckpoint, it will
> +			 * cause an IRQ to be replayed. This replayed IRQ can
> +			 * invoke SCHEDULE_USER, thus, we arrive here with a TM
> +			 * active transaction.
> +			 * I.e., the task was leaving kernelspace to userspace,
> +			 * already trecheckpointed, but there was an IRQ during
> +			 * the trecheckpoint process (soft irq disabled), and
> +			 * on the IRQ replay, the process was de-scheduled, so,
> +			 * SCHEDULE_USER was called and here we are.
> +			 *
> +			 */
> +			if (WARN_ON(MSR_TM_ACTIVE(mfmsr())))
> +				tm_reclaim_current(TM_CAUSE_RESCHED);

Do we need the MSR_TM_ACTIVE() check gating this? If MSR_TM_ACTIVE(mfmsr()) is
true after TM_KERNEL_ENTRY, something is pretty broken. (In fact, you have this
check later... so just remove this one, I think.)


>  
> -		if (tm_enabled(prev)) {
> -			prev->thread.load_tm++;
> -			tm_reclaim_task(prev);
> -			if (!MSR_TM_ACTIVE(prev->thread.regs->msr) && prev->thread.load_tm == 0)
> +			/*
> +			 * If rescheduled with TM active, update the
> +			 * failure cause
> +			 */
> +			tm_change_failure_cause(prev, TM_CAUSE_RESCHED);
> +		} else {

else if to avoid more indenting...

> +			/*
> +			 * TM enabled but not transactional. Just disable TM
> +			 * if load_tm overflows. This should be the only place
> +			 * that disables the TM and reenables the laziness
> +			 * save/restore
> +			 */
> +			if (prev->thread.load_tm == 0)
>  				prev->thread.regs->msr &= ~MSR_TM;
>  		}
> +	}
>  
> -		tm_recheckpoint_new_task(new);
> +	/*
> +	 * It is a *bug* if we arrived so late with a transaction active
> +	 * (more precisely suspended)
> +	 */
> +	if (WARN_ON(MSR_TM_ACTIVE(mfmsr()))) {
> +		/* Recovery path. 0x99 shouldn't be exported to UAPI */
> +		tm_reclaim_current(0x99);
> +	}

As discussed privately, if we hit this case, either we BUG_ON() or we WARN_ON()
AND kill the process. Something is bust in the kernel at this point so we need
to do something more than just warn and limp along with a reclaim.

Also, no {} needed for single line if statements.

> +
> +	/*
> +	 * If the next task has TM enabled, restore the SPRs. Do not need to
> +	 * care about recheckpoint at this time. It will be done later if
> +	 * TIF_RESTORE_TM was set when the task was scheduled out
> +	 */
> +	if (tm_enabled(new)) {
> +		tm_enable();
> +		tm_restore_sprs(&new->thread);
>  	}
>  }
>  
> @@ -1094,7 +1084,6 @@ void restore_tm_state(struct pt_regs *regs)
>  }
>  
>  #else
> -#define tm_recheckpoint_new_task(new)
>  #define __switch_to_tm(prev, new)
>  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
>  
> @@ -1599,9 +1588,9 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
>  	/*
>  	 * Flush TM state out so we can copy it.  __switch_to_tm() does this
>  	 * flush but it removes the checkpointed state from the current CPU and
> -	 * transitions the CPU out of TM mode.  Hence we need to call
> -	 * tm_recheckpoint_new_task() (on the same task) to restore the
> -	 * checkpointed state back and the TM mode.
> +	 * transitions the CPU out of TM mode.  Hence we need to make sure
> +	 * TIF_RESTORE_TM is set so restore_tm_state is called to restore the
> +	 * checkpointed state and back to TM mode.
>  	 *
>  	 * Can't pass dst because it isn't ready. Doesn't matter, passing
>  	 * dst is only important for __switch_to()


* Re: [RFC PATCH 08/14] powerpc/tm: Recheckpoint at exit path
  2018-11-06 12:40 ` [RFC PATCH 08/14] powerpc/tm: Recheckpoint at exit path Breno Leitao
@ 2018-11-15  2:45   ` Michael Neuling
  0 siblings, 0 replies; 26+ messages in thread
From: Michael Neuling @ 2018-11-15  2:45 UTC (permalink / raw)
  To: Breno Leitao, linuxppc-dev, Nicholas Piggin; +Cc: ldufour, cyrilbur, gromero

On Tue, 2018-11-06 at 10:40 -0200, Breno Leitao wrote:
> In the past, TIF_RESTORE_TM was handled together with the rest of the TIF
> workers, but that was too early and could cause an IRQ to be replayed in
> suspended state (after recheckpoint).
> 
> This patch moves the TIF_RESTORE_TM handler to as late as possible. It
> also forces IRQs to be disabled, and they remain disabled until RFID, so
> no IRQ will be replayed at all. I.e., once trecheckpoint happens, the
> next step is the RFID to userspace.
> 
> This makes TIF_RESTORE_TM a special case that should not be handled
> similarly to the flags in _TIF_USER_WORK_MASK.
> 
> Since _TIF_RESTORE_TM is not part of _TIF_USER_WORK_MASK anymore, we
> need to force system_call_exit to continue to leave through
> fast_exception_return, so we test this flag together with
> _TIF_USER_WORK_MASK in the system_call_exit path.
> 
> If this flag is set at system_call_exit, it means that a recheckpoint
> will be performed, and going through fast_exception_return is the only
> way to do so.
> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
>  arch/powerpc/include/asm/thread_info.h |  2 +-
>  arch/powerpc/kernel/entry_64.S         | 23 ++++++++++++++---------
>  2 files changed, 15 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
> index 544cac0474cb..2835d60bc9ef 100644
> --- a/arch/powerpc/include/asm/thread_info.h
> +++ b/arch/powerpc/include/asm/thread_info.h
> @@ -139,7 +139,7 @@ void arch_setup_new_exec(void);
>  
>  #define _TIF_USER_WORK_MASK	(_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
>  				 _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
> -				 _TIF_RESTORE_TM | _TIF_PATCH_PENDING | \
> +				 _TIF_PATCH_PENDING | \
>  				 _TIF_FSCHECK)
>  #define _TIF_PERSYSCALL_MASK	(_TIF_RESTOREALL|_TIF_NOERROR)
>  
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 17484ebda66c..a86619edf29d 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -255,7 +255,7 @@ system_call_exit:
>  
>  	ld	r9,TI_FLAGS(r12)
>  	li	r11,-MAX_ERRNO
> -	andi.	r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
> +	andi.   r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK |_TIF_RESTORE_TM)
>  	bne-	.Lsyscall_exit_work
>  
>  	andi.	r0,r8,MSR_FP
> @@ -784,14 +784,6 @@ _GLOBAL(ret_from_except_lite)
>  	SCHEDULE_USER
>  	b	ret_from_except_lite
>  2:
> -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> -	andi.	r0,r4,_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM
> -	bne	3f		/* only restore TM if nothing else to do */
> -	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	bl	restore_tm_state
> -	b	restore
> -3:
> -#endif
>  	bl	save_nvgprs
>  	/*
>  	 * Use a non volatile GPR to save and restore our thread_info flags
> @@ -938,6 +930,19 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>  	 */
>  	.globl	fast_exception_return
>  fast_exception_return:
> +
> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> +	CURRENT_THREAD_INFO(r4, r1)
> +	ld	r4,TI_FLAGS(r4)
> +	andi.   r0,r4,_TIF_RESTORE_TM
> +	beq	22f
> +	ld	r4,_MSR(r1) 	/* TODO: MSR[!PR] shouldn't be here */
> +	andi.	r0,r4,MSR_PR
> +	beq	22f  /* Skip if Kernel thread */
> +	addi	r3,r1,STACK_FRAME_OVERHEAD
> +	bl	restore_tm_state

Calling out to C here is a bit concerning this late.

The main thing you are calling out to is asm anyway (with a small C wrapper).
We might want to adapt the asm in this case, rather than the old model of
getting it called from __switch_to(). We shouldn't need to call
tm_reclaim/tm_recheckpoint from C anymore (especially if we use BUG_ON()
rather than WARN_ON() in the cases already mentioned).

Same for patch 1.

> +22:
> +#endif
>  	ld	r3,_MSR(r1)
>  	ld	r4,_CTR(r1)
>  	ld	r0,_LINK(r1)


* Re: [RFC PATCH 09/14] powerpc/tm: Warn if state is transactional
  2018-11-06 12:40 ` [RFC PATCH 09/14] powerpc/tm: Warn if state is transactional Breno Leitao
@ 2018-11-15  2:48   ` Michael Neuling
  0 siblings, 0 replies; 26+ messages in thread
From: Michael Neuling @ 2018-11-15  2:48 UTC (permalink / raw)
  To: Breno Leitao, linuxppc-dev; +Cc: ldufour, cyrilbur, gromero



On Tue, 2018-11-06 at 10:40 -0200, Breno Leitao wrote:
> Since every kernel entry point calls TM_KERNEL_ENTRY, it is not
> expected to arrive at this point with a suspended transaction.
> 
> If that is the case, issue a warning and reclaim the current thread in
> order to avoid a TM Bad Thing.
> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
>  arch/powerpc/kernel/process.c | 7 +++----
>  arch/powerpc/kernel/signal.c  | 2 +-
>  2 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 73872f751b33..849591bf0881 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -1752,11 +1752,10 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
>  
>  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>  	/*
> -	 * Clear any transactional state, we're exec()ing. The cause is
> -	 * not important as there will never be a recheckpoint so it's not
> -	 * user visible.
> +	 * It is a bug if the transaction was not reclaimed until this
> +	 * point. Warn and try to work around it by calling tm_reclaim().
>  	 */
> -	if (MSR_TM_SUSPENDED(mfmsr()))
> +	if (WARN_ON(MSR_TM_SUSPENDED(mfmsr())))
>  		tm_reclaim_current(0);
>  #endif

Let's turn these into BUG_ON()

Mikey
 
> diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
> index b3e8db376ecd..cbaccc2be0fb 100644
> --- a/arch/powerpc/kernel/signal.c
> +++ b/arch/powerpc/kernel/signal.c
> @@ -203,7 +203,7 @@ unsigned long get_tm_stackpointer(struct task_struct *tsk)
>  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>  	BUG_ON(tsk != current);
>  
> -	if (MSR_TM_ACTIVE(tsk->thread.regs->msr)) {
> +	if (WARN_ON(MSR_TM_ACTIVE(mfmsr()))) {
>  		tm_reclaim_current(TM_CAUSE_SIGNAL);
>  		if (MSR_TM_TRANSACTIONAL(tsk->thread.regs->msr))
>  			return tsk->thread.ckpt_regs.gpr[1];


* Re: [RFC PATCH 13/14] powerpc/tm: Do not restore TM without SPRs
  2018-11-06 12:40 ` [RFC PATCH 13/14] powerpc/tm: Do not restore TM without SPRs Breno Leitao
@ 2018-11-15  3:02   ` Michael Neuling
  0 siblings, 0 replies; 26+ messages in thread
From: Michael Neuling @ 2018-11-15  3:02 UTC (permalink / raw)
  To: Breno Leitao, linuxppc-dev; +Cc: ldufour, cyrilbur, gromero

On Tue, 2018-11-06 at 10:40 -0200, Breno Leitao wrote:
> Currently the signal context restore code enables the TM bit in the MSR
> without restoring the TM SPRs, which can cause undesired side effects.
> 
> This is not correct because TM being enabled in the MSR means the TM SPR
> registers are valid and up to date, which is not the case here. In fact,
> the live registers may contain the previous thread's SPRs.
> 
> Functions check whether the register values are valid by looking at
> whether the facility is enabled in the MSR, since MSR[TM] set means that
> the TM SPRs are hot.
> 
> Enabling MSR[TM] without updating the live SPRs can cause a crash, since
> the live TM SPRs from the previous thread will be saved into the current
> thread, and might not have TEXASR[FS] set, for example.
> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
>  arch/powerpc/kernel/signal_64.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
> index 487c3b6aa2e3..156b90e8ee78 100644
> --- a/arch/powerpc/kernel/signal_64.c
> +++ b/arch/powerpc/kernel/signal_64.c
> @@ -478,8 +478,18 @@ static long restore_tm_sigcontexts(struct task_struct *tsk,
>  	 * happened whilst in the signal handler and load_tm overflowed,
>  	 * disabling the TM bit. In either case we can end up with an illegal
>  	 * TM state leading to a TM Bad Thing when we return to userspace.
> +	 *
> +	 * Every time MSR_TM is enabled, mainly for the b) case, the TM SPRs
> +	 * must be restored into the live registers, since the live registers
> +	 * could contain garbage and will be read later, and MSR[TM] set is
> +	 * what is used to check whether the live TM SPRs are valid or not.
>  	 */
> -	regs->msr |= MSR_TM;
> +	if ((regs->msr & MSR_TM) == 0) {
> +		regs->msr |= MSR_TM;
> +		tm_enable();
> +		tm_restore_sprs(&tsk->thread);
> +	}

I'm wondering if we should put the TM register save/restore in the early
entry/exit code too. We'd need to add the check on MSR[TM]/load_tm.

Distributing the SPR save/restore throughout the kernel is just going to lead us
to similar problems that we are having now with reclaim/recheckpoint.

Mikey


>  
>  	/* pull in MSR LE from user context */
>  	regs->msr = (regs->msr & ~MSR_LE) | (msr & MSR_LE);
