stable.vger.kernel.org archive mirror
* [PATCH v8 01/26] arm64: Fix HCR.TGE status for NMI contexts
       [not found] <1546956464-48825-1-git-send-email-julien.thierry@arm.com>
@ 2019-01-08 14:07 ` Julien Thierry
  2019-01-14 15:56   ` Catalin Marinas
  2019-01-28  9:16   ` Marc Zyngier
  0 siblings, 2 replies; 5+ messages in thread
From: Julien Thierry @ 2019-01-08 14:07 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, daniel.thompson, joel, marc.zyngier,
	christoffer.dall, james.morse, catalin.marinas, will.deacon,
	mark.rutland, Julien Thierry, Arnd Bergmann, linux-arch, stable

When using VHE, the host needs to clear the HCR_EL2.TGE bit in order
to interact with guest TLBs, switching from the EL2&0 translation regime
to EL1&0.

However, a non-maskable asynchronous event, such as SDEI, could happen
while TGE is cleared. Address translation operations that rely on the
EL2&0 translation regime (TLB invalidation, userspace access, ...) could
then fail.

Fix this by setting HCR_EL2.TGE when entering NMI context and clearing
it again, if necessary, when returning to the interrupted context.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Suggested-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: linux-arch@vger.kernel.org
Cc: stable@vger.kernel.org
---
 arch/arm64/include/asm/hardirq.h | 28 ++++++++++++++++++++++++++++
 arch/arm64/kernel/irq.c          |  3 +++
 include/linux/hardirq.h          |  7 +++++++
 3 files changed, 38 insertions(+)

diff --git a/arch/arm64/include/asm/hardirq.h b/arch/arm64/include/asm/hardirq.h
index 1473fc2..94b7481 100644
--- a/arch/arm64/include/asm/hardirq.h
+++ b/arch/arm64/include/asm/hardirq.h
@@ -19,6 +19,7 @@
 #include <linux/cache.h>
 #include <linux/threads.h>
 #include <asm/irq.h>
+#include <asm/kvm_arm.h>
 
 #define NR_IPI	7
 
@@ -37,6 +38,33 @@
 
 #define __ARCH_IRQ_EXIT_IRQS_DISABLED	1
 
+struct nmi_ctx {
+	u64 hcr;
+};
+
+DECLARE_PER_CPU(struct nmi_ctx, nmi_contexts);
+
+#define arch_nmi_enter()							\
+	do {									\
+		if (is_kernel_in_hyp_mode()) {					\
+			struct nmi_ctx *nmi_ctx = this_cpu_ptr(&nmi_contexts);	\
+			nmi_ctx->hcr = read_sysreg(hcr_el2);			\
+			if (!(nmi_ctx->hcr & HCR_TGE)) {			\
+				write_sysreg(nmi_ctx->hcr | HCR_TGE, hcr_el2);	\
+				isb();						\
+			}							\
+		}								\
+	} while (0)
+
+#define arch_nmi_exit()								\
+	do {									\
+		if (is_kernel_in_hyp_mode()) {					\
+			struct nmi_ctx *nmi_ctx = this_cpu_ptr(&nmi_contexts);	\
+			if (!(nmi_ctx->hcr & HCR_TGE))				\
+				write_sysreg(nmi_ctx->hcr, hcr_el2);		\
+		}								\
+	} while (0)
+
 static inline void ack_bad_irq(unsigned int irq)
 {
 	extern unsigned long irq_err_count;
diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 780a12f..92fa817 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -33,6 +33,9 @@
 
 unsigned long irq_err_count;
 
+/* Only access this in an NMI enter/exit */
+DEFINE_PER_CPU(struct nmi_ctx, nmi_contexts);
+
 DEFINE_PER_CPU(unsigned long *, irq_stack_ptr);
 
 int arch_show_interrupts(struct seq_file *p, int prec)
diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index 0fbbcdf..da0af63 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -60,8 +60,14 @@ static inline void rcu_nmi_exit(void)
  */
 extern void irq_exit(void);
 
+#ifndef arch_nmi_enter
+#define arch_nmi_enter()	do { } while (0)
+#define arch_nmi_exit()		do { } while (0)
+#endif
+
 #define nmi_enter()						\
 	do {							\
+		arch_nmi_enter();				\
 		printk_nmi_enter();				\
 		lockdep_off();					\
 		ftrace_nmi_enter();				\
@@ -80,6 +86,7 @@ static inline void rcu_nmi_exit(void)
 		ftrace_nmi_exit();				\
 		lockdep_on();					\
 		printk_nmi_exit();				\
+		arch_nmi_exit();				\
 	} while (0)
 
 #endif /* LINUX_HARDIRQ_H */
-- 
1.9.1
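
For readers following the thread, the save/set/restore behaviour of the two hooks can be modelled in plain user-space C. This is only a sketch under stated assumptions: `fake_hcr_el2`, `fake_nmi_ctx` and every `model_*` function are hypothetical stand-ins for the system register, the per-CPU slot and the macros in the patch, and the `is_kernel_in_hyp_mode()` check and the `isb()` barrier are left out because they have no user-space equivalent.

```c
#include <stdbool.h>
#include <stdint.h>

/* HCR_EL2.TGE is bit 27 of the register. */
#define HCR_TGE (UINT64_C(1) << 27)

/* Hypothetical stand-in for the HCR_EL2 system register. */
static uint64_t fake_hcr_el2;

/* Mirrors struct nmi_ctx from the patch; a single slot instead of per-CPU. */
struct nmi_ctx {
	uint64_t hcr;
};
static struct nmi_ctx fake_nmi_ctx;

static uint64_t model_read_hcr(void)
{
	return fake_hcr_el2;
}

static void model_write_hcr(uint64_t val)
{
	fake_hcr_el2 = val;
}

/* Save the interrupted value, then force TGE on if it was clear. */
static void model_arch_nmi_enter(void)
{
	fake_nmi_ctx.hcr = model_read_hcr();
	if (!(fake_nmi_ctx.hcr & HCR_TGE))
		model_write_hcr(fake_nmi_ctx.hcr | HCR_TGE);
}

/* Write the saved value back only if enter had to change it. */
static void model_arch_nmi_exit(void)
{
	if (!(fake_nmi_ctx.hcr & HCR_TGE))
		model_write_hcr(fake_nmi_ctx.hcr);
}

/*
 * Simulate an NMI arriving while the register holds interrupted_hcr.
 * Returns true if the handler body would have observed TGE set.
 */
static bool model_nmi(uint64_t interrupted_hcr)
{
	bool tge_seen;

	model_write_hcr(interrupted_hcr);
	model_arch_nmi_enter();
	tge_seen = (model_read_hcr() & HCR_TGE) != 0;
	model_arch_nmi_exit();
	return tge_seen;
}
```

In this model the handler always observes TGE set, and the interrupted value is back in place once `model_arch_nmi_exit()` returns, whether or not TGE was set beforehand.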



* Re: [PATCH v8 01/26] arm64: Fix HCR.TGE status for NMI contexts
  2019-01-08 14:07 ` [PATCH v8 01/26] arm64: Fix HCR.TGE status for NMI contexts Julien Thierry
@ 2019-01-14 15:56   ` Catalin Marinas
  2019-01-14 16:12     ` Julien Thierry
  2019-01-28  9:16   ` Marc Zyngier
  1 sibling, 1 reply; 5+ messages in thread
From: Catalin Marinas @ 2019-01-14 15:56 UTC (permalink / raw)
  To: Julien Thierry
  Cc: linux-arm-kernel, mark.rutland, linux-arch, daniel.thompson,
	Arnd Bergmann, marc.zyngier, will.deacon, linux-kernel, stable,
	christoffer.dall, james.morse, joel

On Tue, Jan 08, 2019 at 02:07:19PM +0000, Julien Thierry wrote:
> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
> to interact with guest TLBs, switching from EL2&0 translation regime
> to EL1&0.
> 
> However, some non-maskable asynchronous event could happen while TGE is
> cleared like SDEI. Because of this address translation operations
> relying on EL2&0 translation regime could fail (tlb invalidation,
> userspace access, ...).

Why would an NMI context need to access user space? (just curious what
breaks exactly without this patch; otherwise it looks fine)

-- 
Catalin


* Re: [PATCH v8 01/26] arm64: Fix HCR.TGE status for NMI contexts
  2019-01-14 15:56   ` Catalin Marinas
@ 2019-01-14 16:12     ` Julien Thierry
  2019-01-14 17:25       ` James Morse
  0 siblings, 1 reply; 5+ messages in thread
From: Julien Thierry @ 2019-01-14 16:12 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arm-kernel, mark.rutland, linux-arch, daniel.thompson,
	Arnd Bergmann, marc.zyngier, will.deacon, linux-kernel, stable,
	christoffer.dall, james.morse, joel



On 14/01/2019 15:56, Catalin Marinas wrote:
> On Tue, Jan 08, 2019 at 02:07:19PM +0000, Julien Thierry wrote:
>> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
>> to interact with guest TLBs, switching from EL2&0 translation regime
>> to EL1&0.
>>
>> However, some non-maskable asynchronous event could happen while TGE is
>> cleared like SDEI. Because of this address translation operations
>> relying on EL2&0 translation regime could fail (tlb invalidation,
>> userspace access, ...).
> 
> Why would an NMI context need to access user space? (just curious what
> breaks exactly without this patch; otherwise it looks fine)

If I remember correctly, the SDEI interrupt might perform cache
maintenance with EL2&0 translation regime, but James can probably give
more detail (or correct me if I'm wrong).

Otherwise, if we decide to use the pseudo NMI for profiling with perf, I
believe the perf interrupt can access user space (although I'm not
completely sure whether that might be to record profiling data in
buffers shared with user space or something else).

Thanks,

-- 
Julien Thierry


* Re: [PATCH v8 01/26] arm64: Fix HCR.TGE status for NMI contexts
  2019-01-14 16:12     ` Julien Thierry
@ 2019-01-14 17:25       ` James Morse
  0 siblings, 0 replies; 5+ messages in thread
From: James Morse @ 2019-01-14 17:25 UTC (permalink / raw)
  To: Julien Thierry, Catalin Marinas
  Cc: linux-arm-kernel, mark.rutland, linux-arch, daniel.thompson,
	Arnd Bergmann, marc.zyngier, will.deacon, linux-kernel, stable,
	christoffer.dall, joel

Hi guys,

On 14/01/2019 16:12, Julien Thierry wrote:
> On 14/01/2019 15:56, Catalin Marinas wrote:
>> On Tue, Jan 08, 2019 at 02:07:19PM +0000, Julien Thierry wrote:
>>> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
>>> to interact with guest TLBs, switching from EL2&0 translation regime
>>> to EL1&0.
>>>
>>> However, some non-maskable asynchronous event could happen while TGE is
>>> cleared like SDEI. Because of this address translation operations
>>> relying on EL2&0 translation regime could fail (tlb invalidation,
>>> userspace access, ...).
>>
>> Why would an NMI context need to access user space? (just curious what
>> breaks exactly without this patch; otherwise it looks fine)
> 
> If I remember correctly, the SDEI interrupt might perform cache
> maintenance with EL2&0 translation regime, but James can probably give
> more detail (or correct me if I'm wrong).

Yup, spot on.
The APEI driver has to map/unmap memory using the fixmap. If it interrupts a
guest, the TLB maintenance would affect EL1&0 instead.
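
The effect described here can be pictured with a toy model of which translation regime an EL1-style TLB invalidation issued at EL2 ends up targeting. This is a sketch based on my reading of the architecture, not kernel code: `enum regime` and `model_el1_tlbi_target()` are hypothetical names, and only the two HCR_EL2 bits relevant to this thread are considered.

```c
#include <stdint.h>

#define HCR_TGE (UINT64_C(1) << 27)	/* HCR_EL2.TGE */
#define HCR_E2H (UINT64_C(1) << 34)	/* HCR_EL2.E2H, set when running VHE */

enum regime {
	REGIME_EL1_0,	/* guest translation regime */
	REGIME_EL2_0,	/* VHE host translation regime */
};

/*
 * Which regime an EL1-style TLB invalidation issued at EL2 applies to,
 * given the current HCR_EL2 value. Only E2H and TGE both being set
 * redirects it to the host's EL2&0 regime.
 */
static enum regime model_el1_tlbi_target(uint64_t hcr)
{
	if ((hcr & HCR_E2H) && (hcr & HCR_TGE))
		return REGIME_EL2_0;
	return REGIME_EL1_0;
}
```

With TGE cleared for guest TLB work, `model_el1_tlbi_target(HCR_E2H)` yields `REGIME_EL1_0`, which is the case where the fixmap invalidation would miss the host's mappings.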


> Otherwise, if we decide to use the pseudo NMI for profiling with perf, I
> believe the perf interrupt can access user space (although I'm not
> completely sure whether that might be to record profiling data in
> buffers shared with user space or something else).

It does a stack walk. I think it's the PERF_SAMPLE_CALLCHAIN feature, and the
code is:
arch/arm64/kernel/perf_callchain.c::user_backtrace()
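
As a rough illustration of what such a stack walk involves, the sketch below follows a chain of AArch64-style frame records (saved frame pointer, then saved link register) and collects the return addresses. It is not the kernel function: `struct frame_record` and `model_user_backtrace()` are hypothetical names, and the records are read from ordinary memory, whereas the kernel has to fetch them through user-space accessors, which is where the EL2&0 regime matters.

```c
#include <stddef.h>
#include <stdint.h>

/* AArch64 frame record: saved frame pointer followed by saved link register. */
struct frame_record {
	const struct frame_record *fp;
	uintptr_t lr;
};

/*
 * Follow the frame-pointer chain starting at fp and store up to max
 * return addresses in out. Returns the number of entries stored.
 * The walk stops at a NULL frame pointer, or when the next record does
 * not sit at a higher address, since the stack grows down and callers
 * therefore live above their callees.
 */
static size_t model_user_backtrace(const struct frame_record *fp,
				   uintptr_t *out, size_t max)
{
	size_t n = 0;

	while (fp && n < max) {
		out[n++] = fp->lr;
		if (fp->fp && (uintptr_t)fp->fp <= (uintptr_t)fp)
			break;	/* chain loops or goes backwards: give up */
		fp = fp->fp;
	}
	return n;
}
```

The address-ordering check is a cheap guard against a corrupted or looping chain; a real implementation would also have to cope with faults on each read.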


Thanks,

James


* Re: [PATCH v8 01/26] arm64: Fix HCR.TGE status for NMI contexts
  2019-01-08 14:07 ` [PATCH v8 01/26] arm64: Fix HCR.TGE status for NMI contexts Julien Thierry
  2019-01-14 15:56   ` Catalin Marinas
@ 2019-01-28  9:16   ` Marc Zyngier
  1 sibling, 0 replies; 5+ messages in thread
From: Marc Zyngier @ 2019-01-28  9:16 UTC (permalink / raw)
  To: Julien Thierry
  Cc: linux-arm-kernel, linux-kernel, daniel.thompson, joel,
	christoffer.dall, james.morse, catalin.marinas, will.deacon,
	mark.rutland, Arnd Bergmann, linux-arch, stable

On Tue, 08 Jan 2019 14:07:19 +0000,
Julien Thierry <julien.thierry@arm.com> wrote:
> 
> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
> to interact with guest TLBs, switching from EL2&0 translation regime
> to EL1&0.
> 
> However, some non-maskable asynchronous event could happen while TGE is
> cleared like SDEI. Because of this address translation operations
> relying on EL2&0 translation regime could fail (tlb invalidation,
> userspace access, ...).
> 
> Fix this by properly setting HCR_EL2.TGE when entering NMI context and
> clear it if necessary when returning to the interrupted context.
> 
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> Suggested-by: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: linux-arch@vger.kernel.org
> Cc: stable@vger.kernel.org

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

Thanks,

	M.

-- 
Jazz is not dead, it just smell funny.

