* [PATCH] irqchip/gic-v3: Workaround inconsistent PMR setting on NMI entry
@ 2021-06-10 14:57 ` Marc Zyngier
  0 siblings, 0 replies; 8+ messages in thread
From: Marc Zyngier @ 2021-06-10 14:57 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: Mark Rutland, Will Deacon, Catalin Marinas, Alexandru Elisei,
	Thomas Gleixner, valentin.schneider, kernel-team

The arm64 entry code suffers from an annoying issue on taking
an NMI, as it sets PMR to a value that actually allows IRQs
to be acknowledged. This is done for consistency with other parts
of the code, and is in the process of being fixed. This shouldn't
be a problem, as we are not enabling interrupts whilst in NMI
context.

However, in the unfortunate scenario that we took a spurious NMI
(retired before the read of IAR) *and* that there is an IRQ pending
at the same time, we'll ack the IRQ in NMI context. Too bad.

In order to avoid deadlocks while running something like perf,
teach the GICv3 driver about this situation: if we were in
a context where no interrupt should have fired, transiently
set PMR to a value that only allows NMIs before acking the pending
interrupt, and restore the original value after that.

This papers over the core issue for the time being, and makes
NMIs great again. Sort of.

Co-developed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 drivers/irqchip/irq-gic-v3.c | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 37a23aa6de37..3d3502efb807 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -642,11 +642,45 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs)
 		nmi_exit();
 }
 
+static u32 do_read_iar(struct pt_regs *regs)
+{
+	u32 iar;
+
+	if (gic_supports_nmi() && unlikely(!interrupts_enabled(regs))) {
+		u64 pmr;
+
+		/*
+		 * We were in a context with interrupt disabled. However,
+		 * the entry code has set PMR to a value that allows any
+		 * interrupt to be acknowledged, and not just NMIs. This can
+		 * lead to surprising effects if the NMI has been retired in
+		 * the meantime, and that there is an IRQ pending. The IRQ
+		 * would then be taken in NMI context, something that nobody
+		 * wants to debug a second time.
+		 *
+		 * Until we sort this, drop PMR again to a level that will
+		 * actually only allow NMIs before reading IAR, and then
+		 * restore it to what it was.
+		 */
+		pmr = gic_read_pmr();
+		gic_pmr_mask_irqs();
+		isb();
+
+		iar = gic_read_iar();
+
+		gic_write_pmr(pmr);
+	} else {
+		iar = gic_read_iar();
+	}
+
+	return iar;
+}
+
 static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
 {
 	u32 irqnr;
 
-	irqnr = gic_read_iar();
+	irqnr = do_read_iar(regs);
 
 	/* Check for special IDs first */
 	if ((irqnr >= 1020 && irqnr <= 1023))
-- 
2.30.2
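
To make the failure mode and the workaround concrete, here is a minimal
userspace sketch in C. It is not kernel code: the model_* names, the
struct model_gic type and the numeric priority values are assumptions
chosen for illustration, and the model leaves out gic_supports_nmi() and
the isb() that the real patch needs. It only captures the rule that an
interrupt can be acknowledged when its priority value is numerically
lower than PMR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_SPURIOUS_INTID	1023u
#define MODEL_MAX_PENDING	4

/* Illustrative values only (assumed); lower value means higher priority. */
enum {
	MODEL_PRIO_NMI      = 0x20,
	MODEL_PRIO_IRQ      = 0xa0,
	MODEL_PMR_NMI_ONLY  = 0x60,	/* only MODEL_PRIO_NMI gets through */
	MODEL_PMR_ALLOW_ALL = 0xf0,	/* what the entry code leaves behind */
};

struct model_irq {
	uint32_t intid;
	uint8_t prio;
	int pending;
};

struct model_gic {
	uint8_t pmr;
	struct model_irq irqs[MODEL_MAX_PENDING];
	size_t nr_irqs;
};

/*
 * Model of an IAR read: acknowledge the highest-priority pending
 * interrupt that beats PMR, or report the spurious INTID if none does.
 */
static uint32_t model_read_iar(struct model_gic *gic)
{
	struct model_irq *best = NULL;

	assert(gic->nr_irqs <= MODEL_MAX_PENDING);

	for (size_t i = 0; i < gic->nr_irqs; i++) {
		struct model_irq *irq = &gic->irqs[i];

		if (!irq->pending || irq->prio >= gic->pmr)
			continue;
		if (!best || irq->prio < best->prio)
			best = irq;
	}

	if (!best)
		return MODEL_SPURIOUS_INTID;

	best->pending = 0;	/* acknowledged: no longer pending */
	return best->intid;
}

/*
 * Model of the workaround: if the interrupted context had IRQs masked,
 * tighten PMR around the IAR read so that only an NMI can be acked,
 * then put PMR back.
 */
static uint32_t model_do_read_iar(struct model_gic *gic, int irqs_were_enabled)
{
	uint32_t iar;

	if (!irqs_were_enabled) {
		uint8_t pmr = gic->pmr;

		gic->pmr = MODEL_PMR_NMI_ONLY;
		iar = model_read_iar(gic);
		gic->pmr = pmr;
	} else {
		iar = model_read_iar(gic);
	}

	return iar;
}
```

In this model, with only an IRQ pending and PMR left wide open, a plain
model_read_iar() hands back the IRQ even though the interrupted context
had IRQs masked. model_do_read_iar(gic, 0) instead returns the spurious
INTID 1023 and leaves the IRQ pending for later.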



* Re: [PATCH] irqchip/gic-v3: Workaround inconsistent PMR setting on NMI entry
  2021-06-10 14:57 ` Marc Zyngier
@ 2021-06-10 15:59   ` Mark Rutland
  -1 siblings, 0 replies; 8+ messages in thread
From: Mark Rutland @ 2021-06-10 15:59 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-kernel, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Alexandru Elisei, Thomas Gleixner, valentin.schneider,
	kernel-team

Hi Marc,

On Thu, Jun 10, 2021 at 03:57:31PM +0100, Marc Zyngier wrote:
> The arm64 entry code suffers from an annoying issue on taking
> an NMI, as it sets PMR to a value that actually allows IRQs
> to be acknowledged. This is done for consistency with other parts
> of the code, and is in the process of being fixed. This shouldn't
> be a problem, as we are not enabling interrupts whilst in NMI
> context.
> 
> However, in the unfortunate scenario that we took a spurious NMI
> (retired before the read of IAR) *and* that there is an IRQ pending
> at the same time, we'll ack the IRQ in NMI context. Too bad.
> 
> In order to avoid deadlocks while running something like perf,
> teach the GICv3 driver about this situation: if we were in
> a context where no interrupt should have fired, transiently
> set PMR to a value that only allows NMIs before acking the pending
> interrupt, and restore the original value after that.
> 
> This papers over the core issue for the time being, and makes
> NMIs great again. Sort of.
> 
> Co-developed-by: Mark Rutland <mark.rutland@arm.com>

According to the kernel documentation, a Co-developed-by should be
immediately followed by that developer's Signed-off-by, so FWIW:

Signed-off-by: Mark Rutland <mark.rutland@arm.com>

... unless you want to downgrade that to a Suggested-by, which is also
fine by me!
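
The trailer rule described above lends itself to a mechanical check.
Here is a minimal sketch in C, assuming the trailers are already split
into one string per line; trailer_value() and check_co_developed_by()
are hypothetical helper names, not part of any kernel tool.

```c
#include <stddef.h>
#include <string.h>

/* Return the text after "tag" if the line starts with it, else NULL. */
static const char *trailer_value(const char *line, const char *tag)
{
	size_t n = strlen(tag);

	return strncmp(line, tag, n) == 0 ? line + n : NULL;
}

/*
 * Return the index of the first Co-developed-by: line that is not
 * immediately followed by a Signed-off-by: from the same person,
 * or -1 if every Co-developed-by: is properly paired.
 */
static int check_co_developed_by(const char *const *lines, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		const char *who = trailer_value(lines[i], "Co-developed-by: ");
		const char *sob;

		if (!who)
			continue;
		if (i + 1 >= nr)
			return (int)i;

		sob = trailer_value(lines[i + 1], "Signed-off-by: ");
		if (!sob || strcmp(who, sob) != 0)
			return (int)i;
	}

	return -1;
}
```

Fed the two trailers of the patch as posted, this sketch flags index 0;
with the extra Signed-off-by inserted right after the Co-developed-by
line, it returns -1.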

> Signed-off-by: Marc Zyngier <maz@kernel.org>

Having played about with a few options, I think this is the
simplest/cleanest thing we can do for now, and given it's all in one
place and "obviously correct", I think there's little risk that this
will break something else. So:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

We should probably also give this:

Fixes: 4d6a38da8e79e94c ("arm64: entry: always set GIC_PRIO_PSR_I_SET during entry")

... since prior to that commit the `gic_prio_irq_setup` gunk would
prevent this specific problem (though other bits like
local_daif_{save,restore}() would be broken in NMI paths).

Thanks,
Mark.

> ---
>  drivers/irqchip/irq-gic-v3.c | 36 +++++++++++++++++++++++++++++++++++-
>  1 file changed, 35 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 37a23aa6de37..3d3502efb807 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -642,11 +642,45 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs)
>  		nmi_exit();
>  }
>  
> +static u32 do_read_iar(struct pt_regs *regs)
> +{
> +	u32 iar;
> +
> +	if (gic_supports_nmi() && unlikely(!interrupts_enabled(regs))) {
> +		u64 pmr;
> +
> +		/*
> +		 * We were in a context with interrupt disabled. However,
> +		 * the entry code has set PMR to a value that allows any
> +		 * interrupt to be acknowledged, and not just NMIs. This can
> +		 * lead to surprising effects if the NMI has been retired in
> +		 * the meantime, and that there is an IRQ pending. The IRQ
> +		 * would then be taken in NMI context, something that nobody
> +		 * wants to debug a second time.
> +		 *
> +		 * Until we sort this, drop PMR again to a level that will
> +		 * actually only allow NMIs before reading IAR, and then
> +		 * restore it to what it was.
> +		 */
> +		pmr = gic_read_pmr();
> +		gic_pmr_mask_irqs();
> +		isb();
> +
> +		iar = gic_read_iar();
> +
> +		gic_write_pmr(pmr);
> +	} else {
> +		iar = gic_read_iar();
> +	}
> +
> +	return iar;
> +}
> +
>  static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
>  {
>  	u32 irqnr;
>  
> -	irqnr = gic_read_iar();
> +	irqnr = do_read_iar(regs);
>  
>  	/* Check for special IDs first */
>  	if ((irqnr >= 1020 && irqnr <= 1023))
> -- 
> 2.30.2
> 


* Re: [PATCH] irqchip/gic-v3: Workaround inconsistent PMR setting on NMI entry
  2021-06-10 15:59   ` Mark Rutland
@ 2021-06-10 16:26     ` Marc Zyngier
  -1 siblings, 0 replies; 8+ messages in thread
From: Marc Zyngier @ 2021-06-10 16:26 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Alexandru Elisei, Thomas Gleixner, valentin.schneider,
	kernel-team

On Thu, 10 Jun 2021 16:59:30 +0100,
Mark Rutland <mark.rutland@arm.com> wrote:
> 
> Hi Marc,
> 
> On Thu, Jun 10, 2021 at 03:57:31PM +0100, Marc Zyngier wrote:
> > The arm64 entry code suffers from an annoying issue on taking
> > an NMI, as it sets PMR to a value that actually allows IRQs
> > to be acknowledged. This is done for consistency with other parts
> > of the code, and is in the process of being fixed. This shouldn't
> > be a problem, as we are not enabling interrupts whilst in NMI
> > context.
> > 
> > However, in the unfortunate scenario that we took a spurious NMI
> > (retired before the read of IAR) *and* that there is an IRQ pending
> > at the same time, we'll ack the IRQ in NMI context. Too bad.
> > 
> > In order to avoid deadlocks while running something like perf,
> > teach the GICv3 driver about this situation: if we were in
> > a context where no interrupt should have fired, transiently
> > set PMR to a value that only allows NMIs before acking the pending
> > interrupt, and restore the original value after that.
> > 
> > This papers over the core issue for the time being, and makes
> > NMIs great again. Sort of.
> > 
> > Co-developed-by: Mark Rutland <mark.rutland@arm.com>
> 
> According to the kernel documentation, a Co-developed-by should be
> immediately followed by that developer's Signed-off-by, so FWIW:
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> 
> ... unless you want to downgrade that to a Suggested-by, which is also
> fine by me!

Nah, we both wasted too many grey bits on this one, and I want shared
responsibility for it!

> 
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> 
> Having played about with a few options, I think this is the
> simplest/cleanest thing we can do for now, and given it's all in one
> place and "obviously correct", I think there's little risk that this
> will break something else. So:
> 
> Reviewed-by: Mark Rutland <mark.rutland@arm.com>
> 
> We should probably also give this:
> 
> Fixes: 4d6a38da8e79e94c ("arm64: entry: always set GIC_PRIO_PSR_I_SET during entry")
> 
> ... since prior to that commit the `gic_prio_irq_setup` gunk would
> prevent this specific problem (though other bits like
> local_daif_{save,restore}() would be broken in NMI paths).

Yup. I'll add that too and send it as a fix for -rc6.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* [irqchip: irq/irqchip-fixes] irqchip/gic-v3: Workaround inconsistent PMR setting on NMI entry
  2021-06-10 14:57 ` Marc Zyngier
  (?)
  (?)
@ 2021-06-10 16:45 ` irqchip-bot for Marc Zyngier
  -1 siblings, 0 replies; 8+ messages in thread
From: irqchip-bot for Marc Zyngier @ 2021-06-10 16:45 UTC (permalink / raw)
  To: linux-kernel; +Cc: Mark Rutland, Marc Zyngier, tglx

The following commit has been merged into the irq/irqchip-fixes branch of irqchip:

Commit-ID:     f6f3e6e9b362363d6eb303982d6302a1d43312c9
Gitweb:        https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms/f6f3e6e9b362363d6eb303982d6302a1d43312c9
Author:        Marc Zyngier <maz@kernel.org>
AuthorDate:    Thu, 10 Jun 2021 15:13:46 +01:00
Committer:     Marc Zyngier <maz@kernel.org>
CommitterDate: Thu, 10 Jun 2021 17:42:21 +01:00

irqchip/gic-v3: Workaround inconsistent PMR setting on NMI entry

The arm64 entry code suffers from an annoying issue on taking
an NMI, as it sets PMR to a value that actually allows IRQs
to be acknowledged. This is done for consistency with other parts
of the code, and is in the process of being fixed. This shouldn't
be a problem, as we are not enabling interrupts whilst in NMI
context.

However, in the unfortunate scenario that we took a spurious NMI
(retired before the read of IAR) *and* that there is an IRQ pending
at the same time, we'll ack the IRQ in NMI context. Too bad.

In order to avoid deadlocks while running something like perf,
teach the GICv3 driver about this situation: if we were in
a context where no interrupt should have fired, transiently
set PMR to a value that only allows NMIs before acking the pending
interrupt, and restore the original value after that.

This papers over the core issue for the time being, and makes
NMIs great again. Sort of.

Fixes: 4d6a38da8e79e94c ("arm64: entry: always set GIC_PRIO_PSR_I_SET during entry")
Co-developed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/lkml/20210610145731.1350460-1-maz@kernel.org
---
 drivers/irqchip/irq-gic-v3.c | 36 ++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 37a23aa..66d623f 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -642,11 +642,45 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs)
 		nmi_exit();
 }
 
+static u32 do_read_iar(struct pt_regs *regs)
+{
+	u32 iar;
+
+	if (gic_supports_nmi() && unlikely(!interrupts_enabled(regs))) {
+		u64 pmr;
+
+		/*
+		 * We were in a context with IRQs disabled. However, the
+		 * entry code has set PMR to a value that allows any
+		 * interrupt to be acknowledged, and not just NMIs. This can
+		 * lead to surprising effects if the NMI has been retired in
+		 * the meantime, and that there is an IRQ pending. The IRQ
+		 * would then be taken in NMI context, something that nobody
+		 * wants to debug twice.
+		 *
+		 * Until we sort this, drop PMR again to a level that will
+		 * actually only allow NMIs before reading IAR, and then
+		 * restore it to what it was.
+		 */
+		pmr = gic_read_pmr();
+		gic_pmr_mask_irqs();
+		isb();
+
+		iar = gic_read_iar();
+
+		gic_write_pmr(pmr);
+	} else {
+		iar = gic_read_iar();
+	}
+
+	return iar;
+}
+
 static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
 {
 	u32 irqnr;
 
-	irqnr = gic_read_iar();
+	irqnr = do_read_iar(regs);
 
 	/* Check for special IDs first */
 	if ((irqnr >= 1020 && irqnr <= 1023))


* [irqchip: irq/irqchip-fixes] irqchip/gic-v3: Workaround inconsistent PMR setting on NMI entry
  2021-06-10 14:57 ` Marc Zyngier
                   ` (2 preceding siblings ...)
  (?)
@ 2021-06-10 16:57 ` irqchip-bot for Marc Zyngier
  -1 siblings, 0 replies; 8+ messages in thread
From: irqchip-bot for Marc Zyngier @ 2021-06-10 16:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Mark Rutland, Marc Zyngier, tglx

The following commit has been merged into the irq/irqchip-fixes branch of irqchip:

Commit-ID:     382e6e177bc1c02473e56591fe5083ae1e4904f6
Gitweb:        https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms/382e6e177bc1c02473e56591fe5083ae1e4904f6
Author:        Marc Zyngier <maz@kernel.org>
AuthorDate:    Thu, 10 Jun 2021 15:13:46 +01:00
Committer:     Marc Zyngier <maz@kernel.org>
CommitterDate: Thu, 10 Jun 2021 17:54:34 +01:00

irqchip/gic-v3: Workaround inconsistent PMR setting on NMI entry

The arm64 entry code suffers from an annoying issue on taking
an NMI, as it sets PMR to a value that actually allows IRQs
to be acknowledged. This is done for consistency with other parts
of the code, and is in the process of being fixed. This shouldn't
be a problem, as we are not enabling interrupts whilst in NMI
context.

However, in the unfortunate scenario that we took a spurious NMI
(retired before the read of IAR) *and* that there is an IRQ pending
at the same time, we'll ack the IRQ in NMI context. Too bad.

In order to avoid deadlocks while running something like perf,
teach the GICv3 driver about this situation: if we were in
a context where no interrupt should have fired, transiently
set PMR to a value that only allows NMIs before acking the pending
interrupt, and restore the original value after that.

This papers over the core issue for the time being, and makes
NMIs great again. Sort of.

Fixes: 4d6a38da8e79e94c ("arm64: entry: always set GIC_PRIO_PSR_I_SET during entry")
Co-developed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/lkml/20210610145731.1350460-1-maz@kernel.org
---
 drivers/irqchip/irq-gic-v3.c | 36 ++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 37a23aa..66d623f 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -642,11 +642,45 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs)
 		nmi_exit();
 }
 
+static u32 do_read_iar(struct pt_regs *regs)
+{
+	u32 iar;
+
+	if (gic_supports_nmi() && unlikely(!interrupts_enabled(regs))) {
+		u64 pmr;
+
+		/*
+		 * We were in a context with IRQs disabled. However, the
+		 * entry code has set PMR to a value that allows any
+		 * interrupt to be acknowledged, and not just NMIs. This can
+		 * lead to surprising effects if the NMI has been retired in
+		 * the meantime, and that there is an IRQ pending. The IRQ
+		 * would then be taken in NMI context, something that nobody
+		 * wants to debug twice.
+		 *
+		 * Until we sort this, drop PMR again to a level that will
+		 * actually only allow NMIs before reading IAR, and then
+		 * restore it to what it was.
+		 */
+		pmr = gic_read_pmr();
+		gic_pmr_mask_irqs();
+		isb();
+
+		iar = gic_read_iar();
+
+		gic_write_pmr(pmr);
+	} else {
+		iar = gic_read_iar();
+	}
+
+	return iar;
+}
+
 static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
 {
 	u32 irqnr;
 
-	irqnr = gic_read_iar();
+	irqnr = do_read_iar(regs);
 
 	/* Check for special IDs first */
 	if ((irqnr >= 1020 && irqnr <= 1023))


end of thread, other threads:[~2021-06-10 16:57 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-10 14:57 [PATCH] irqchip/gic-v3: Workaround inconsistent PMR setting on NMI entry Marc Zyngier
2021-06-10 14:57 ` Marc Zyngier
2021-06-10 15:59 ` Mark Rutland
2021-06-10 15:59   ` Mark Rutland
2021-06-10 16:26   ` Marc Zyngier
2021-06-10 16:26     ` Marc Zyngier
2021-06-10 16:45 ` [irqchip: irq/irqchip-fixes] " irqchip-bot for Marc Zyngier
2021-06-10 16:57 ` irqchip-bot for Marc Zyngier
