stable.vger.kernel.org archive mirror
* [PATCH v9 01/26] arm64: Fix HCR.TGE status for NMI contexts
       [not found] <1548084825-8803-1-git-send-email-julien.thierry@arm.com>
@ 2019-01-21 15:33 ` Julien Thierry
  2019-01-28 11:48   ` James Morse
  0 siblings, 1 reply; 9+ messages in thread
From: Julien Thierry @ 2019-01-21 15:33 UTC
  To: linux-arm-kernel
  Cc: linux-kernel, daniel.thompson, joel, marc.zyngier,
	christoffer.dall, james.morse, catalin.marinas, will.deacon,
	mark.rutland, Julien Thierry, Arnd Bergmann, linux-arch, stable

When using VHE, the host needs to clear HCR_EL2.TGE bit in order
to interract with guest TLBs, switching from EL2&0 translation regime
to EL1&0.

However, some non-maskable asynchronous event could happen while TGE is
cleared like SDEI. Because of this address translation operations
relying on EL2&0 translation regime could fail (tlb invalidation,
userspace access, ...).

Fix this by properly setting HCR_EL2.TGE when entering NMI context and
clear it if necessary when returning to the interrupted context.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Suggested-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: linux-arch@vger.kernel.org
Cc: stable@vger.kernel.org
---
 arch/arm64/include/asm/hardirq.h | 28 ++++++++++++++++++++++++++++
 arch/arm64/kernel/irq.c          |  3 +++
 include/linux/hardirq.h          |  7 +++++++
 3 files changed, 38 insertions(+)

diff --git a/arch/arm64/include/asm/hardirq.h b/arch/arm64/include/asm/hardirq.h
index 1473fc2..94b7481 100644
--- a/arch/arm64/include/asm/hardirq.h
+++ b/arch/arm64/include/asm/hardirq.h
@@ -19,6 +19,7 @@
 #include <linux/cache.h>
 #include <linux/threads.h>
 #include <asm/irq.h>
+#include <asm/kvm_arm.h>
 
 #define NR_IPI	7
 
@@ -37,6 +38,33 @@
 
 #define __ARCH_IRQ_EXIT_IRQS_DISABLED	1
 
+struct nmi_ctx {
+	u64 hcr;
+};
+
+DECLARE_PER_CPU(struct nmi_ctx, nmi_contexts);
+
+#define arch_nmi_enter()							\
+	do {									\
+		if (is_kernel_in_hyp_mode()) {					\
+			struct nmi_ctx *nmi_ctx = this_cpu_ptr(&nmi_contexts);	\
+			nmi_ctx->hcr = read_sysreg(hcr_el2);			\
+			if (!(nmi_ctx->hcr & HCR_TGE)) {			\
+				write_sysreg(nmi_ctx->hcr | HCR_TGE, hcr_el2);	\
+				isb();						\
+			}							\
+		}								\
+	} while (0)
+
+#define arch_nmi_exit()								\
+	do {									\
+		if (is_kernel_in_hyp_mode()) {					\
+			struct nmi_ctx *nmi_ctx = this_cpu_ptr(&nmi_contexts);	\
+			if (!(nmi_ctx->hcr & HCR_TGE))				\
+				write_sysreg(nmi_ctx->hcr, hcr_el2);		\
+		}								\
+	} while (0)
+
 static inline void ack_bad_irq(unsigned int irq)
 {
 	extern unsigned long irq_err_count;
diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 780a12f..92fa817 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -33,6 +33,9 @@
 
 unsigned long irq_err_count;
 
+/* Only access this in an NMI enter/exit */
+DEFINE_PER_CPU(struct nmi_ctx, nmi_contexts);
+
 DEFINE_PER_CPU(unsigned long *, irq_stack_ptr);
 
 int arch_show_interrupts(struct seq_file *p, int prec)
diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index 0fbbcdf..da0af63 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -60,8 +60,14 @@ static inline void rcu_nmi_exit(void)
  */
 extern void irq_exit(void);
 
+#ifndef arch_nmi_enter
+#define arch_nmi_enter()	do { } while (0)
+#define arch_nmi_exit()		do { } while (0)
+#endif
+
 #define nmi_enter()						\
 	do {							\
+		arch_nmi_enter();				\
 		printk_nmi_enter();				\
 		lockdep_off();					\
 		ftrace_nmi_enter();				\
@@ -80,6 +86,7 @@ static inline void rcu_nmi_exit(void)
 		ftrace_nmi_exit();				\
 		lockdep_on();					\
 		printk_nmi_exit();				\
+		arch_nmi_exit();				\
 	} while (0)
 
 #endif /* LINUX_HARDIRQ_H */
-- 
1.9.1
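
The save/force/restore logic of the two macros above can be exercised in
user space. The following is only a sketch under stated assumptions:
fake_hcr_el2, nmi_context, model_nmi_enter() and model_nmi_exit() are
hypothetical stand-ins for the real system register, the per-CPU slot and
the arch macros, and the isb() has no user-space equivalent, so it is
omitted. HCR_TGE is bit 27 of HCR_EL2.

```c
#include <assert.h>
#include <stdint.h>

#define HCR_TGE	(UINT64_C(1) << 27)	/* HCR_EL2.TGE is bit 27 */

/* Hypothetical stand-in for the HCR_EL2 system register. */
static uint64_t fake_hcr_el2;

/* Mirrors struct nmi_ctx from the patch; one slot instead of per-CPU. */
struct nmi_ctx {
	uint64_t hcr;
};
static struct nmi_ctx nmi_context;

/* Model of arch_nmi_enter(): save HCR, then force TGE if it was clear. */
static void model_nmi_enter(void)
{
	nmi_context.hcr = fake_hcr_el2;
	if (!(nmi_context.hcr & HCR_TGE))
		fake_hcr_el2 = nmi_context.hcr | HCR_TGE;
}

/* Model of arch_nmi_exit(): write HCR back only if TGE had been clear. */
static void model_nmi_exit(void)
{
	if (!(nmi_context.hcr & HCR_TGE))
		fake_hcr_el2 = nmi_context.hcr;
}

/* Returns the register value seen after one enter/exit round trip. */
static uint64_t model_round_trip(uint64_t initial)
{
	fake_hcr_el2 = initial;
	model_nmi_enter();
	assert(fake_hcr_el2 & HCR_TGE);	/* handler always sees TGE set */
	model_nmi_exit();
	return fake_hcr_el2;
}
```

In this model a round trip leaves the register unchanged whether or not TGE
was set on entry, while the handler body always runs with TGE set.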



* Re: [PATCH v9 01/26] arm64: Fix HCR.TGE status for NMI contexts
  2019-01-21 15:33 ` [PATCH v9 01/26] arm64: Fix HCR.TGE status for NMI contexts Julien Thierry
@ 2019-01-28 11:48   ` James Morse
  2019-01-28 15:42     ` Julien Thierry
  0 siblings, 1 reply; 9+ messages in thread
From: James Morse @ 2019-01-28 11:48 UTC
  To: Julien Thierry, linux-arm-kernel
  Cc: linux-kernel, daniel.thompson, joel, marc.zyngier,
	christoffer.dall, catalin.marinas, will.deacon, mark.rutland,
	Arnd Bergmann, linux-arch, stable

Hi Julien,

On 21/01/2019 15:33, Julien Thierry wrote:
> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
> to interract with guest TLBs, switching from EL2&0 translation regime

(interact)


> to EL1&0.
> 
> However, some non-maskable asynchronous event could happen while TGE is
> cleared like SDEI. Because of this address translation operations
> relying on EL2&0 translation regime could fail (tlb invalidation,
> userspace access, ...).
> 
> Fix this by properly setting HCR_EL2.TGE when entering NMI context and
> clear it if necessary when returning to the interrupted context.

Yes please. This would not have been fun to debug!

Reviewed-by: James Morse <james.morse@arm.com>



I was looking for why we need core code to do this, instead of updating the
arch's call sites. Your 'irqdesc: Add domain handlers for NMIs' patch (pointed
to from the cover letter) is the reason: core-code calls nmi_enter()/nmi_exit()
itself.
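
The override pattern this relies on can be sketched in plain C. This is a
minimal user-space illustration, not the kernel code: note(), trace,
model_nmi_enter(), model_nmi_exit() and run_nested_trace() are hypothetical
names, and the "c+"/"c-" markers stand in for the existing core steps
(printk, lockdep, ftrace, ...). It shows the arch hook running first on
entry and last on exit, as in the hunk for include/linux/hardirq.h.

```c
#include <string.h>

/* Hypothetical trace buffer recording the order of the hooks. */
static char trace[64];

static void note(const char *s)
{
	strcat(trace, s);
}

/* "Arch header": defines the hooks before the generic header is seen. */
#define arch_nmi_enter()	note("A+")
#define arch_nmi_exit()		note("A-")

/* "Generic header": falls back to no-ops when the arch defines nothing. */
#ifndef arch_nmi_enter
#define arch_nmi_enter()	do { } while (0)
#define arch_nmi_exit()		do { } while (0)
#endif

/* Core macros: arch hook first on entry, last on exit. */
#define model_nmi_enter()	do { arch_nmi_enter(); note("c+"); } while (0)
#define model_nmi_exit()	do { note("c-"); arch_nmi_exit(); } while (0)

/* Runs one enter/handler/exit sequence and returns the recorded order. */
static const char *run_nested_trace(void)
{
	trace[0] = '\0';
	model_nmi_enter();
	note("H");		/* handler body */
	model_nmi_exit();
	return trace;
}
```

Because the hook sits inside the core macros, any caller of the core
enter/exit pair picks it up without its own call site being touched.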


Thanks,

James


> diff --git a/arch/arm64/include/asm/hardirq.h b/arch/arm64/include/asm/hardirq.h
> index 1473fc2..94b7481 100644
> --- a/arch/arm64/include/asm/hardirq.h
> +++ b/arch/arm64/include/asm/hardirq.h
> @@ -19,6 +19,7 @@
>  #include <linux/cache.h>
>  #include <linux/threads.h>
>  #include <asm/irq.h>
> +#include <asm/kvm_arm.h>

percpu.h?
sysreg.h?
barrier.h?


> @@ -37,6 +38,33 @@
>  
>  #define __ARCH_IRQ_EXIT_IRQS_DISABLED	1
>  
> +struct nmi_ctx {
> +	u64 hcr;
> +};
> +
> +DECLARE_PER_CPU(struct nmi_ctx, nmi_contexts);
> +
> +#define arch_nmi_enter()							\
> +	do {									\
> +		if (is_kernel_in_hyp_mode()) {					\
> +			struct nmi_ctx *nmi_ctx = this_cpu_ptr(&nmi_contexts);	\
> +			nmi_ctx->hcr = read_sysreg(hcr_el2);			\
> +			if (!(nmi_ctx->hcr & HCR_TGE)) {			\
> +				write_sysreg(nmi_ctx->hcr | HCR_TGE, hcr_el2);	\
> +				isb();						\
> +			}							\
> +		}								\
> +	} while (0)
> +
> +#define arch_nmi_exit()								\
> +	do {									\
> +		if (is_kernel_in_hyp_mode()) {					\
> +			struct nmi_ctx *nmi_ctx = this_cpu_ptr(&nmi_contexts);	\
> +			if (!(nmi_ctx->hcr & HCR_TGE))				\
> +				write_sysreg(nmi_ctx->hcr, hcr_el2);		\
> +		}								\
> +	} while (0)
> +
>  static inline void ack_bad_irq(unsigned int irq)
>  {
>  	extern unsigned long irq_err_count;



> diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
> index 0fbbcdf..da0af63 100644
> --- a/include/linux/hardirq.h
> +++ b/include/linux/hardirq.h
> @@ -60,8 +60,14 @@ static inline void rcu_nmi_exit(void)
>   */
>  extern void irq_exit(void);
>  
> +#ifndef arch_nmi_enter
> +#define arch_nmi_enter()	do { } while (0)
> +#define arch_nmi_exit()		do { } while (0)
> +#endif
> +
>  #define nmi_enter()						\
>  	do {							\
> +		arch_nmi_enter();				\
>  		printk_nmi_enter();				\
>  		lockdep_off();					\
>  		ftrace_nmi_enter();				\
> @@ -80,6 +86,7 @@ static inline void rcu_nmi_exit(void)
>  		ftrace_nmi_exit();				\
>  		lockdep_on();					\
>  		printk_nmi_exit();				\
> +		arch_nmi_exit();				\
>  	} while (0)
>  
>  #endif /* LINUX_HARDIRQ_H */
> 



* Re: [PATCH v9 01/26] arm64: Fix HCR.TGE status for NMI contexts
  2019-01-28 11:48   ` James Morse
@ 2019-01-28 15:42     ` Julien Thierry
  2019-01-31  8:19       ` Christoffer Dall
  0 siblings, 1 reply; 9+ messages in thread
From: Julien Thierry @ 2019-01-28 15:42 UTC
  To: James Morse, linux-arm-kernel
  Cc: linux-kernel, daniel.thompson, joel, marc.zyngier,
	christoffer.dall, catalin.marinas, will.deacon, mark.rutland,
	Arnd Bergmann, linux-arch, stable

Hi James,

On 28/01/2019 11:48, James Morse wrote:
> Hi Julien,
> 
> On 21/01/2019 15:33, Julien Thierry wrote:
>> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
>> to interract with guest TLBs, switching from EL2&0 translation regime
> 
> (interact)
> 
> 
>> to EL1&0.
>>
>> However, some non-maskable asynchronous event could happen while TGE is
>> cleared like SDEI. Because of this address translation operations
>> relying on EL2&0 translation regime could fail (tlb invalidation,
>> userspace access, ...).
>>
>> Fix this by properly setting HCR_EL2.TGE when entering NMI context and
>> clear it if necessary when returning to the interrupted context.
> 
> Yes please. This would not have been fun to debug!
> 
> Reviewed-by: James Morse <james.morse@arm.com>
> 
> 

Thanks.

> 
> I was looking for why we need core code to do this, instead of updating the
> arch's call sites. Your 'irqdesc: Add domain handlers for NMIs' patch (pointed
> to from the cover letter) is the reason: core-code calls nmi_enter()/nmi_exit()
> itself.
> 

Yes, that's the main reason.

> 
>> diff --git a/arch/arm64/include/asm/hardirq.h b/arch/arm64/include/asm/hardirq.h
>> index 1473fc2..94b7481 100644
>> --- a/arch/arm64/include/asm/hardirq.h
>> +++ b/arch/arm64/include/asm/hardirq.h
>> @@ -19,6 +19,7 @@
>>  #include <linux/cache.h>
>>  #include <linux/threads.h>
>>  #include <asm/irq.h>
>> +#include <asm/kvm_arm.h>
> 
> percpu.h?
> sysreg.h?
> barrier.h?
> 

Good point, I'll add those.

Thanks,

-- 
Julien Thierry


* Re: [PATCH v9 01/26] arm64: Fix HCR.TGE status for NMI contexts
  2019-01-28 15:42     ` Julien Thierry
@ 2019-01-31  8:19       ` Christoffer Dall
  2019-01-31  8:56         ` Julien Thierry
  0 siblings, 1 reply; 9+ messages in thread
From: Christoffer Dall @ 2019-01-31  8:19 UTC
  To: Julien Thierry
  Cc: James Morse, linux-arm-kernel, linux-kernel, daniel.thompson,
	joel, marc.zyngier, catalin.marinas, will.deacon, mark.rutland,
	Arnd Bergmann, linux-arch, stable

On Mon, Jan 28, 2019 at 03:42:42PM +0000, Julien Thierry wrote:
> Hi James,
> 
> On 28/01/2019 11:48, James Morse wrote:
> > Hi Julien,
> > 
> > On 21/01/2019 15:33, Julien Thierry wrote:
> >> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
> >> to interract with guest TLBs, switching from EL2&0 translation regime
> > 
> > (interact)
> > 
> > 
> >> to EL1&0.
> >>
> >> However, some non-maskable asynchronous event could happen while TGE is
> >> cleared like SDEI. Because of this address translation operations
> >> relying on EL2&0 translation regime could fail (tlb invalidation,
> >> userspace access, ...).
> >>
> >> Fix this by properly setting HCR_EL2.TGE when entering NMI context and
> >> clear it if necessary when returning to the interrupted context.
> > 
> > Yes please. This would not have been fun to debug!
> > 
> > Reviewed-by: James Morse <james.morse@arm.com>
> > 
> > 
> 
> Thanks.
> 
> > 
> > I was looking for why we need core code to do this, instead of updating the
> > arch's call sites. Your 'irqdesc: Add domain handlers for NMIs' patch (pointed
> > to from the cover letter) is the reason: core-code calls nmi_enter()/nmi_exit()
> > itself.
> > 
> 
> Yes, that's the main reason.
> 
I wondered the same thing, but I don't understand the explanation :(

Why can't we do a local_daif_mask() around the (very small) calls that
clear TGE instead?


Thanks,

    Christoffer


* Re: [PATCH v9 01/26] arm64: Fix HCR.TGE status for NMI contexts
  2019-01-31  8:19       ` Christoffer Dall
@ 2019-01-31  8:56         ` Julien Thierry
  2019-01-31  9:27           ` Christoffer Dall
  0 siblings, 1 reply; 9+ messages in thread
From: Julien Thierry @ 2019-01-31  8:56 UTC
  To: Christoffer Dall
  Cc: James Morse, linux-arm-kernel, linux-kernel, daniel.thompson,
	joel, marc.zyngier, catalin.marinas, will.deacon, mark.rutland,
	Arnd Bergmann, linux-arch, stable



On 31/01/2019 08:19, Christoffer Dall wrote:
> On Mon, Jan 28, 2019 at 03:42:42PM +0000, Julien Thierry wrote:
>> Hi James,
>>
>> On 28/01/2019 11:48, James Morse wrote:
>>> Hi Julien,
>>>
>>> On 21/01/2019 15:33, Julien Thierry wrote:
>>>> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
>>>> to interract with guest TLBs, switching from EL2&0 translation regime
>>>
>>> (interact)
>>>
>>>
>>>> to EL1&0.
>>>>
>>>> However, some non-maskable asynchronous event could happen while TGE is
>>>> cleared like SDEI. Because of this address translation operations
>>>> relying on EL2&0 translation regime could fail (tlb invalidation,
>>>> userspace access, ...).
>>>>
>>>> Fix this by properly setting HCR_EL2.TGE when entering NMI context and
>>>> clear it if necessary when returning to the interrupted context.
>>>
>>> Yes please. This would not have been fun to debug!
>>>
>>> Reviewed-by: James Morse <james.morse@arm.com>
>>>
>>>
>>
>> Thanks.
>>
>>>
>>> I was looking for why we need core code to do this, instead of updating the
>>> arch's call sites. Your 'irqdesc: Add domain handlers for NMIs' patch (pointed
>>> to from the cover letter) is the reason: core-code calls nmi_enter()/nmi_exit()
>>> itself.
>>>
>>
>> Yes, that's the main reason.
>>
> I wondered the same thing, but I don't understand the explanation :(
> 
> Why can't we do a local_daif_mask() around the (very small) calls that
> clear TGE instead?
> 

That would protect against the pseudo-NMIs, but you can still get an
SDEI at that point even with all daif bits set. Or did I misunderstand
how SDEI works?

Thanks,

-- 
Julien Thierry


* Re: [PATCH v9 01/26] arm64: Fix HCR.TGE status for NMI contexts
  2019-01-31  8:56         ` Julien Thierry
@ 2019-01-31  9:27           ` Christoffer Dall
  2019-01-31  9:40             ` Julien Thierry
  0 siblings, 1 reply; 9+ messages in thread
From: Christoffer Dall @ 2019-01-31  9:27 UTC
  To: Julien Thierry
  Cc: James Morse, linux-arm-kernel, linux-kernel, daniel.thompson,
	joel, marc.zyngier, catalin.marinas, will.deacon, mark.rutland,
	Arnd Bergmann, linux-arch, stable

On Thu, Jan 31, 2019 at 08:56:04AM +0000, Julien Thierry wrote:
> 
> 
> On 31/01/2019 08:19, Christoffer Dall wrote:
> > On Mon, Jan 28, 2019 at 03:42:42PM +0000, Julien Thierry wrote:
> >> Hi James,
> >>
> >> On 28/01/2019 11:48, James Morse wrote:
> >>> Hi Julien,
> >>>
> >>> On 21/01/2019 15:33, Julien Thierry wrote:
> >>>> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
> >>>> to interract with guest TLBs, switching from EL2&0 translation regime
> >>>
> >>> (interact)
> >>>
> >>>
> >>>> to EL1&0.
> >>>>
> >>>> However, some non-maskable asynchronous event could happen while TGE is
> >>>> cleared like SDEI. Because of this address translation operations
> >>>> relying on EL2&0 translation regime could fail (tlb invalidation,
> >>>> userspace access, ...).
> >>>>
> >>>> Fix this by properly setting HCR_EL2.TGE when entering NMI context and
> >>>> clear it if necessary when returning to the interrupted context.
> >>>
> >>> Yes please. This would not have been fun to debug!
> >>>
> >>> Reviewed-by: James Morse <james.morse@arm.com>
> >>>
> >>>
> >>
> >> Thanks.
> >>
> >>>
> >>> I was looking for why we need core code to do this, instead of updating the
> >>> arch's call sites. Your 'irqdesc: Add domain handlers for NMIs' patch (pointed
> >>> to from the cover letter) is the reason: core-code calls nmi_enter()/nmi_exit()
> >>> itself.
> >>>
> >>
> >> Yes, that's the main reason.
> >>
> > I wondered the same thing, but I don't understand the explanation :(
> > 
> > Why can't we do a local_daif_mask() around the (very small) calls that
> > clear TGE instead?
> > 
> 
> That would protect against the pseudo-NMIs, but you can still get an
> SDEI at that point even with all daif bits set. Or did I misunderstand
> how SDEI works?
> 

I don't know the details of SDEI.  From looking at this patch, the
logical conclusion would be that SDEIs can then only be delivered once
we've called nmi_enter, but since we don't call this directly from the
code that clears TGE for doing guest TLB invalidation (or do we?) then
masking interrupts at the PSTATE level should be sufficient.

Surely I'm missing some part of the bigger picture here.

Thanks,

    Christoffer


* Re: [PATCH v9 01/26] arm64: Fix HCR.TGE status for NMI contexts
  2019-01-31  9:27           ` Christoffer Dall
@ 2019-01-31  9:40             ` Julien Thierry
  2019-01-31  9:48               ` Christoffer Dall
  2019-01-31  9:53               ` Marc Zyngier
  0 siblings, 2 replies; 9+ messages in thread
From: Julien Thierry @ 2019-01-31  9:40 UTC
  To: Christoffer Dall
  Cc: James Morse, linux-arm-kernel, linux-kernel, daniel.thompson,
	joel, marc.zyngier, catalin.marinas, will.deacon, mark.rutland,
	Arnd Bergmann, linux-arch, stable



On 31/01/2019 09:27, Christoffer Dall wrote:
> On Thu, Jan 31, 2019 at 08:56:04AM +0000, Julien Thierry wrote:
>>
>>
>> On 31/01/2019 08:19, Christoffer Dall wrote:
>>> On Mon, Jan 28, 2019 at 03:42:42PM +0000, Julien Thierry wrote:
>>>> Hi James,
>>>>
>>>> On 28/01/2019 11:48, James Morse wrote:
>>>>> Hi Julien,
>>>>>
>>>>> On 21/01/2019 15:33, Julien Thierry wrote:
>>>>>> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
>>>>>> to interract with guest TLBs, switching from EL2&0 translation regime
>>>>>
>>>>> (interact)
>>>>>
>>>>>
>>>>>> to EL1&0.
>>>>>>
>>>>>> However, some non-maskable asynchronous event could happen while TGE is
>>>>>> cleared like SDEI. Because of this address translation operations
>>>>>> relying on EL2&0 translation regime could fail (tlb invalidation,
>>>>>> userspace access, ...).
>>>>>>
>>>>>> Fix this by properly setting HCR_EL2.TGE when entering NMI context and
>>>>>> clear it if necessary when returning to the interrupted context.
>>>>>
>>>>> Yes please. This would not have been fun to debug!
>>>>>
>>>>> Reviewed-by: James Morse <james.morse@arm.com>
>>>>>
>>>>>
>>>>
>>>> Thanks.
>>>>
>>>>>
>>>>> I was looking for why we need core code to do this, instead of updating the
>>>>> arch's call sites. Your 'irqdesc: Add domain handlers for NMIs' patch (pointed
>>>>> to from the cover letter) is the reason: core-code calls nmi_enter()/nmi_exit()
>>>>> itself.
>>>>>
>>>>
>>>> Yes, that's the main reason.
>>>>
>>> I wondered the same thing, but I don't understand the explanation :(
>>>
>>> Why can't we do a local_daif_mask() around the (very small) calls that
>>> clear TGE instead?
>>>
>>
>> That would protect against the pseudo-NMIs, but you can still get an
>> SDEI at that point even with all daif bits set. Or did I misunderstand
>> how SDEI works?
>>
> 
> I don't know the details of SDEI.  From looking at this patch, the
> logical conclusion would be that SDEIs can then only be delivered once
> we've called nmi_enter, but since we don't call this directly from the
> code that clears TGE for doing guest TLB invalidation (or do we?) then
> masking interrupts at the PSTATE level should be sufficient.
> 
> Surely I'm missing some part of the bigger picture here.
> 

I'm not sure I understand. SDEI uses the NMI context and AFAIU, it is an
interrupt that the firmware sends to the OS, and it is sent regardless
of the PSTATE at the OS EL.

So, the worrying part is:
- Hyp clears TGE
- Exception/interrupt taken to EL3
- Firmware decides it's a good time to send an SDEI to the OS
- SDEI handler (at EL2 for VHE) does nmi_enter()
- SDEI handler needs to do cache invalidation or something with the
EL2&0 translation regime but TGE is cleared

We don't expect the code that clears TGE to call nmi_enter().
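
The sequence above can be walked through with a small user-space model.
This is a sketch only: hcr_el2 is an ordinary variable here,
host_translation_op_ok() is a hypothetical stand-in for any operation that
needs the EL2&0 regime, and the with_fix flag switches the modelled
nmi_enter()/nmi_exit() behaviour from the patch on or off.

```c
#include <stdbool.h>
#include <stdint.h>

#define HCR_TGE	(UINT64_C(1) << 27)	/* HCR_EL2.TGE is bit 27 */

/* Modelled register: a VHE host normally runs with TGE set. */
static uint64_t hcr_el2 = HCR_TGE;
static uint64_t saved_hcr;

/* Hypothetical host operation that only works in the EL2&0 regime. */
static bool host_translation_op_ok(void)
{
	return (hcr_el2 & HCR_TGE) != 0;
}

/* Modelled SDEI handler; with_fix applies the enter/exit hooks. */
static bool sdei_handler(bool with_fix)
{
	bool ok;

	if (with_fix) {			/* nmi_enter(): save, force TGE */
		saved_hcr = hcr_el2;
		hcr_el2 |= HCR_TGE;
	}
	ok = host_translation_op_ok();	/* work relying on EL2&0 */
	if (with_fix)			/* nmi_exit(): restore */
		hcr_el2 = saved_hcr;
	return ok;
}

/* Hyp clears TGE, firmware delivers an SDEI, then hyp sets TGE again. */
static bool sdei_during_guest_tlb_op(bool with_fix)
{
	bool ok;

	hcr_el2 &= ~HCR_TGE;		/* hyp clears TGE */
	ok = sdei_handler(with_fix);	/* SDEI arrives in the window */
	hcr_el2 |= HCR_TGE;		/* guest TLB work finished */
	return ok;
}
```

In this model the host operation fails when the handler runs without the
hooks and succeeds with them, and the interrupted code still sees TGE
cleared when the handler returns.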

-- 
Julien Thierry


* Re: [PATCH v9 01/26] arm64: Fix HCR.TGE status for NMI contexts
  2019-01-31  9:40             ` Julien Thierry
@ 2019-01-31  9:48               ` Christoffer Dall
  2019-01-31  9:53               ` Marc Zyngier
  1 sibling, 0 replies; 9+ messages in thread
From: Christoffer Dall @ 2019-01-31  9:48 UTC
  To: Julien Thierry
  Cc: James Morse, linux-arm-kernel, linux-kernel, daniel.thompson,
	joel, marc.zyngier, catalin.marinas, will.deacon, mark.rutland,
	Arnd Bergmann, linux-arch, stable

On Thu, Jan 31, 2019 at 09:40:02AM +0000, Julien Thierry wrote:
> 
> 
> On 31/01/2019 09:27, Christoffer Dall wrote:
> > On Thu, Jan 31, 2019 at 08:56:04AM +0000, Julien Thierry wrote:
> >>
> >>
> >> On 31/01/2019 08:19, Christoffer Dall wrote:
> >>> On Mon, Jan 28, 2019 at 03:42:42PM +0000, Julien Thierry wrote:
> >>>> Hi James,
> >>>>
> >>>> On 28/01/2019 11:48, James Morse wrote:
> >>>>> Hi Julien,
> >>>>>
> >>>>> On 21/01/2019 15:33, Julien Thierry wrote:
> >>>>>> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
> >>>>>> to interract with guest TLBs, switching from EL2&0 translation regime
> >>>>>
> >>>>> (interact)
> >>>>>
> >>>>>
> >>>>>> to EL1&0.
> >>>>>>
> >>>>>> However, some non-maskable asynchronous event could happen while TGE is
> >>>>>> cleared like SDEI. Because of this address translation operations
> >>>>>> relying on EL2&0 translation regime could fail (tlb invalidation,
> >>>>>> userspace access, ...).
> >>>>>>
> >>>>>> Fix this by properly setting HCR_EL2.TGE when entering NMI context and
> >>>>>> clear it if necessary when returning to the interrupted context.
> >>>>>
> >>>>> Yes please. This would not have been fun to debug!
> >>>>>
> >>>>> Reviewed-by: James Morse <james.morse@arm.com>
> >>>>>
> >>>>>
> >>>>
> >>>> Thanks.
> >>>>
> >>>>>
> >>>>> I was looking for why we need core code to do this, instead of updating the
> >>>>> arch's call sites. Your 'irqdesc: Add domain handlers for NMIs' patch (pointed
> >>>>> to from the cover letter) is the reason: core-code calls nmi_enter()/nmi_exit()
> >>>>> itself.
> >>>>>
> >>>>
> >>>> Yes, that's the main reason.
> >>>>
> >>> I wondered the same thing, but I don't understand the explanation :(
> >>>
> >>> Why can't we do a local_daif_mask() around the (very small) calls that
> >>> clear TGE instead?
> >>>
> >>
> >> That would protect against the pseudo-NMIs, but you can still get an
> >> SDEI at that point even with all daif bits set. Or did I misunderstand
> >> how SDEI works?
> >>
> > 
> > I don't know the details of SDEI.  From looking at this patch, the
> > logical conclusion would be that SDEIs can then only be delivered once
> > we've called nmi_enter, but since we don't call this directly from the
> > code that clears TGE for doing guest TLB invalidation (or do we?) then
> > masking interrupts at the PSTATE level should be sufficient.
> > 
> > Surely I'm missing some part of the bigger picture here.
> > 
> 
> I'm not sure I understand. SDEI uses the NMI context and AFAIU, it is an
> interrupt that the firmware sends to the OS, and it is sent regardless
> of the PSTATE at the OS EL.
> 
> So, the worrying part is:
> - Hyp clears TGE
> - Exception/interrupt taken to EL3
> - Firmware decides it's a good time to send an SDEI to the OS
> - SDEI handler (at EL2 for VHE) does nmi_enter()
> - SDEI handler needs to do cache invalidation or something with the
> EL2&0 translation regime but TGE is cleared
> 
> We don't expect the code that clears TGE to call nmi_enter().
> 

You do understand :)

I didn't understand that the SDEI handler calls nmi_enter() -- and to be
fair the commit message didn't really provide that link -- but it
makes perfect sense now.  I naively thought that SDEI had respected the
pstate bits setting before, and that this was becoming a problem with
the introduction of pseudo-NMIs, but I clearly came at this from the
wrong direction.


Thanks for the explanation!

    Christoffer


* Re: [PATCH v9 01/26] arm64: Fix HCR.TGE status for NMI contexts
  2019-01-31  9:40             ` Julien Thierry
  2019-01-31  9:48               ` Christoffer Dall
@ 2019-01-31  9:53               ` Marc Zyngier
  1 sibling, 0 replies; 9+ messages in thread
From: Marc Zyngier @ 2019-01-31  9:53 UTC
  To: Julien Thierry, Christoffer Dall
  Cc: James Morse, linux-arm-kernel, linux-kernel, daniel.thompson,
	joel, catalin.marinas, will.deacon, mark.rutland, Arnd Bergmann,
	linux-arch, stable

On 31/01/2019 09:40, Julien Thierry wrote:
> 
> 
> On 31/01/2019 09:27, Christoffer Dall wrote:
>> On Thu, Jan 31, 2019 at 08:56:04AM +0000, Julien Thierry wrote:
>>>
>>>
>>> On 31/01/2019 08:19, Christoffer Dall wrote:
>>>> On Mon, Jan 28, 2019 at 03:42:42PM +0000, Julien Thierry wrote:
>>>>> Hi James,
>>>>>
>>>>> On 28/01/2019 11:48, James Morse wrote:
>>>>>> Hi Julien,
>>>>>>
>>>>>> On 21/01/2019 15:33, Julien Thierry wrote:
>>>>>>> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
>>>>>>> to interract with guest TLBs, switching from EL2&0 translation regime
>>>>>>
>>>>>> (interact)
>>>>>>
>>>>>>
>>>>>>> to EL1&0.
>>>>>>>
>>>>>>> However, some non-maskable asynchronous event could happen while TGE is
>>>>>>> cleared like SDEI. Because of this address translation operations
>>>>>>> relying on EL2&0 translation regime could fail (tlb invalidation,
>>>>>>> userspace access, ...).
>>>>>>>
>>>>>>> Fix this by properly setting HCR_EL2.TGE when entering NMI context and
>>>>>>> clear it if necessary when returning to the interrupted context.
>>>>>>
>>>>>> Yes please. This would not have been fun to debug!
>>>>>>
>>>>>> Reviewed-by: James Morse <james.morse@arm.com>
>>>>>>
>>>>>>
>>>>>
>>>>> Thanks.
>>>>>
>>>>>>
>>>>>> I was looking for why we need core code to do this, instead of updating the
>>>>>> arch's call sites. Your 'irqdesc: Add domain handlers for NMIs' patch (pointed
>>>>>> to from the cover letter) is the reason: core-code calls nmi_enter()/nmi_exit()
>>>>>> itself.
>>>>>>
>>>>>
>>>>> Yes, that's the main reason.
>>>>>
>>>> I wondered the same thing, but I don't understand the explanation :(
>>>>
>>>> Why can't we do a local_daif_mask() around the (very small) calls that
>>>> clear TGE instead?
>>>>
>>>
>>> That would protect against the pseudo-NMIs, but you can still get an
>>> SDEI at that point even with all daif bits set. Or did I misunderstand
>>> how SDEI works?
>>>
>>
>> I don't know the details of SDEI.  From looking at this patch, the
>> logical conclusion would be that SDEIs can then only be delivered once
>> we've called nmi_enter, but since we don't call this directly from the
>> code that clears TGE for doing guest TLB invalidation (or do we?) then
>> masking interrupts at the PSTATE level should be sufficient.
>>
>> Surely I'm missing some part of the bigger picture here.
>>
> 
> I'm not sure I understand. SDEI uses the NMI context and AFAIU, it is an
> interrupt that the firmware sends to the OS, and it is sent regardless
> of the PSTATE at the OS EL.

I don't think we can describe SDEI as an interrupt. It is not even an
exception. It is just EL3 ERET-ing to a pre-defined location. And yes,
it will completely ignore any form of mask bit.
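
Why masking at the PSTATE level does not close the window can be put as a
toy model. Everything here is an assumption made for illustration:
daif_masked stands for "all DAIF bits set", and the event names and
event_reaches_handler() are hypothetical, not kernel or firmware
interfaces.

```c
#include <stdbool.h>

/* Hypothetical event kinds used only for this illustration. */
enum event {
	EVENT_IRQ,
	EVENT_PSEUDO_NMI,
	EVENT_SDEI,
};

/* Stand-in for "all PSTATE.DAIF bits are set". */
static bool daif_masked;

/* Does the event reach its handler given the current mask state? */
static bool event_reaches_handler(enum event ev)
{
	/* SDEI: EL3 simply ERETs to the registered entry point. */
	if (ev == EVENT_SDEI)
		return true;

	/* IRQs and pseudo-NMIs are held off while DAIF is fully masked. */
	return !daif_masked;
}
```

With daif_masked set, only EVENT_SDEI still gets through in this model,
which is the case the nmi_enter() hook has to cover.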

> 
> So, the worrying part is:
> - Hyp clears TGE
> - Exception/interrupt taken to EL3
> - Firmware decides it's a good time to send an SDEI to the OS
> - SDEI handler (at EL2 for VHE) does nmi_enter()
> - SDEI handler needs to do cache invalidation or something with the
> EL2&0 translation regime but TGE is cleared
> 
> We don't expect the code that clears TGE to call nmi_enter().

Indeed. Without this patch, SDEI is already broken. Pseudo-NMIs only
make the bug easier to trigger.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


Thread overview: 9+ messages
     [not found] <1548084825-8803-1-git-send-email-julien.thierry@arm.com>
2019-01-21 15:33 ` [PATCH v9 01/26] arm64: Fix HCR.TGE status for NMI contexts Julien Thierry
2019-01-28 11:48   ` James Morse
2019-01-28 15:42     ` Julien Thierry
2019-01-31  8:19       ` Christoffer Dall
2019-01-31  8:56         ` Julien Thierry
2019-01-31  9:27           ` Christoffer Dall
2019-01-31  9:40             ` Julien Thierry
2019-01-31  9:48               ` Christoffer Dall
2019-01-31  9:53               ` Marc Zyngier
