* Query regarding pseudo nmi support on GIC V3 and request_nmi()
@ 2020-05-07 16:06 Neeraj Upadhyay
  2020-05-08 10:45 ` Marc Zyngier
  0 siblings, 1 reply; 9+ messages in thread
From: Neeraj Upadhyay @ 2020-05-07 16:06 UTC (permalink / raw)
  To: julien.thierry.kdev, maz; +Cc: linux-kernel

Hi,

I have a query regarding pseudo-NMI support on GICv3. From what I
understand, GICv3 supports pseudo-NMI setup for SPIs and PPIs.
However, request_nmi() in the irq framework requires an NMI to be a
per-CPU interrupt source (it checks for IRQF_PERCPU). Can you please
help me understand how SPIs can be configured as NMIs if there is a
per-CPU interrupt source restriction?



Thanks
Neeraj

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member of the Code Aurora Forum, hosted by The Linux Foundation


* Re: Query regarding pseudo nmi support on GIC V3 and request_nmi()
  2020-05-07 16:06 Query regarding pseudo nmi support on GIC V3 and request_nmi() Neeraj Upadhyay
@ 2020-05-08 10:45 ` Marc Zyngier
  2020-05-08 11:06   ` Neeraj Upadhyay
  0 siblings, 1 reply; 9+ messages in thread
From: Marc Zyngier @ 2020-05-08 10:45 UTC (permalink / raw)
  To: Neeraj Upadhyay; +Cc: julien.thierry.kdev, linux-kernel

On Thu, 07 May 2020 17:06:19 +0100,
Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
> 
> Hi,
> 
> I have one query regarding pseudo NMI support on GIC v3; from what I
> could understand, GIC v3 supports pseudo NMI setup for SPIs and PPIs.
> However the request_nmi() in irq framework requires NMI to be per cpu
> interrupt source (it checks for IRQF_PERCPU). Can you please help
> understand this part, how SPIs can be configured as NMIs, if there is
> a per cpu interrupt source restriction?

Let me answer your question with another question: what are the
semantics of an NMI if you can't associate it with a particular CPU?

We use pseudo-NMI to be able to profile (or detect lockups) within
sections where normal interrupts cannot fire. If the interrupt can
end up on a random CPU (with an unrelated PMU or one that hasn't
locked up), what have we achieved? Only confusion.

The whole point is that NMIs have to be tied to a given CPU. For
SGI/PPI, this is guaranteed by construction. For SPIs, this means that
the affinity cannot be changed from userspace. IRQF_PERCPU doesn't
mean much in this context as we don't "broadcast" interrupts, but is
an indication to the core kernel that the same interrupt cannot be
taken on another CPU.

The short of it is that NMIs are only for per-CPU sources. For SPIs,
that's for PMUs that use SPIs instead of PPIs. Don't use it for
anything else.
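
A minimal userspace sketch of the flag checks described above, added for
illustration. It is not the kernel's code: model_nmi_flags_ok() and
dummy_handler() are hypothetical names, the flag values are assumed to
match include/linux/interrupt.h, and the real request_nmi() performs
further checks on the irq descriptor that are not modelled here.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed to mirror include/linux/interrupt.h; illustrative only. */
#define IRQF_SHARED        0x00000080UL
#define IRQF_PERCPU        0x00000400UL
#define IRQF_IRQPOLL       0x00001000UL
#define IRQF_COND_SUSPEND  0x00040000UL

typedef void (*nmi_handler_t)(void);

/* Hypothetical stand-in for a real NMI handler. */
void dummy_handler(void)
{
}

/*
 * Hypothetical model of the flag validation: an NMI cannot be shared
 * or polled, must be marked per-CPU, and needs a handler.
 * Returns 0 if the request would pass these checks, -EINVAL otherwise.
 */
int model_nmi_flags_ok(unsigned long irqflags, nmi_handler_t handler)
{
    if (irqflags & (IRQF_SHARED | IRQF_COND_SUSPEND | IRQF_IRQPOLL))
        return -EINVAL;

    /* The restriction the original question is about. */
    if (!(irqflags & IRQF_PERCPU))
        return -EINVAL;

    if (handler == NULL)
        return -EINVAL;

    return 0;
}
```

In this model, a PMU driver whose interrupt is an SPI would pass
IRQF_PERCPU to state that the interrupt will only ever be taken on one
CPU, while a request without the flag is rejected.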

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: Query regarding pseudo nmi support on GIC V3 and request_nmi()
  2020-05-08 10:45 ` Marc Zyngier
@ 2020-05-08 11:06   ` Neeraj Upadhyay
  2020-05-08 12:27     ` Marc Zyngier
  0 siblings, 1 reply; 9+ messages in thread
From: Neeraj Upadhyay @ 2020-05-08 11:06 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: julien.thierry.kdev, linux-kernel

Hi Marc,

On 5/8/2020 4:15 PM, Marc Zyngier wrote:
> On Thu, 07 May 2020 17:06:19 +0100,
> Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
>>
>> Hi,
>>
>> I have one query regarding pseudo NMI support on GIC v3; from what I
>> could understand, GIC v3 supports pseudo NMI setup for SPIs and PPIs.
>> However the request_nmi() in irq framework requires NMI to be per cpu
>> interrupt source (it checks for IRQF_PERCPU). Can you please help
>> understand this part, how SPIs can be configured as NMIs, if there is
>> a per cpu interrupt source restriction?
> 
> Let me answer your question by another question: what is the semantic
> of a NMI if you can't associate it with a particular CPU?
>

I was actually thinking of a use case where we have a watchdog
interrupt (which is an SPI) that is used for detecting software hangs
and causing a device reset. If that interrupt's current CPU affinity is
on a core where interrupts are disabled, we won't be able to service
it, so we need to group that interrupt as an FIQ. I was wondering
whether it's feasible to mark that interrupt as a pseudo-NMI and route
it to EL1 as an IRQ. However, it looks like that is not the semantic of
an NMI, and we would need something like a pseudo-NMI IPI for this.

> We use pseudo-NMI to be able to profile (or detect lockups) within
> sections where normal interrupts cannot fire. If the interrupt can
> end-up on a random CPU (with an unrelated PMU or one that hasn't
> locked up), what have we achieved? Only confusion.
> 
> The whole point is that NMIs have to be tied to a given CPU. For
> SGI/PPI, this is guaranteed by construction. For SPIs, this means that
> the affinity cannot be changed from userspace. IRQF_PERCPU doesn't
> mean much in this context as we don't "broadcast" interrupts, but is
> an indication to the core kernel that the same interrupt cannot be
> taken on another CPU.
> 
> The short of it is that NMIs are only for per-CPU sources. For SPIs,
> that's for PMUs that use SPIs instead of PPIs. Don't use it for
> anything else.
> 

Thank you for the explanation!

> Thanks,
> 
> 	M.
> 

Thanks
Neeraj

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member of the Code Aurora Forum, hosted by The Linux Foundation


* Re: Query regarding pseudo nmi support on GIC V3 and request_nmi()
  2020-05-08 11:06   ` Neeraj Upadhyay
@ 2020-05-08 12:27     ` Marc Zyngier
  2020-05-08 12:39       ` Neeraj Upadhyay
  0 siblings, 1 reply; 9+ messages in thread
From: Marc Zyngier @ 2020-05-08 12:27 UTC (permalink / raw)
  To: Neeraj Upadhyay; +Cc: julien.thierry.kdev, linux-kernel

On Fri, 8 May 2020 16:36:42 +0530
Neeraj Upadhyay <neeraju@codeaurora.org> wrote:

> Hi Marc,
> 
> On 5/8/2020 4:15 PM, Marc Zyngier wrote:
> > On Thu, 07 May 2020 17:06:19 +0100,
> > Neeraj Upadhyay <neeraju@codeaurora.org> wrote:  
> >>
> >> Hi,
> >>
> >> I have one query regarding pseudo NMI support on GIC v3; from what I
> >> could understand, GIC v3 supports pseudo NMI setup for SPIs and PPIs.
> >> However the request_nmi() in irq framework requires NMI to be per cpu
> >> interrupt source (it checks for IRQF_PERCPU). Can you please help
> >> understand this part, how SPIs can be configured as NMIs, if there is
> >> a per cpu interrupt source restriction?  
> > 
> > Let me answer your question by another question: what is the semantic
> > of a NMI if you can't associate it with a particular CPU?
> >  
> 
> I was actually thinking of a use case, where, we have a watchdog
> interrupt (which is a SPI), which is used for detecting software
> hangs and cause device reset; If that interrupt's current cpu
> affinity is on a core, where interrupts are disabled, we won't be
> able to serve it; so, we need to group that interrupt as an fiq; 

Linux doesn't use Group-0 interrupts, as they are strictly secure
(unless your SoC doesn't have EL3, which I doubt).

> I was thinking, if its feasible to mark that interrupt as pseudo NMI
> and route it to EL1 as irq. However, looks like that is not the
> semantic of a NMI and we would need something like pseudo NMI ipi for
> this.

Sending a NMI IPI from another NMI handler? Even once I've added these,
there is no way this will work for that particular scenario. Just look
at the restrictions we impose on NMIs.

Frankly, if all you need to do is to reset the SoC, use EL3 firmware.
That is what it is for.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


* Re: Query regarding pseudo nmi support on GIC V3 and request_nmi()
  2020-05-08 12:27     ` Marc Zyngier
@ 2020-05-08 12:39       ` Neeraj Upadhyay
  2020-05-08 12:53         ` Marc Zyngier
  0 siblings, 1 reply; 9+ messages in thread
From: Neeraj Upadhyay @ 2020-05-08 12:39 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: julien.thierry.kdev, linux-kernel

Hi Marc,

On 5/8/2020 5:57 PM, Marc Zyngier wrote:
> On Fri, 8 May 2020 16:36:42 +0530
> Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
> 
>> Hi Marc,
>>
>> On 5/8/2020 4:15 PM, Marc Zyngier wrote:
>>> On Thu, 07 May 2020 17:06:19 +0100,
>>> Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I have one query regarding pseudo NMI support on GIC v3; from what I
>>>> could understand, GIC v3 supports pseudo NMI setup for SPIs and PPIs.
>>>> However the request_nmi() in irq framework requires NMI to be per cpu
>>>> interrupt source (it checks for IRQF_PERCPU). Can you please help
>>>> understand this part, how SPIs can be configured as NMIs, if there is
>>>> a per cpu interrupt source restriction?
>>>
>>> Let me answer your question by another question: what is the semantic
>>> of a NMI if you can't associate it with a particular CPU?
>>>   
>>
>> I was actually thinking of a use case, where, we have a watchdog
>> interrupt (which is a SPI), which is used for detecting software
>> hangs and cause device reset; If that interrupt's current cpu
>> affinity is on a core, where interrupts are disabled, we won't be
>> able to serve it; so, we need to group that interrupt as an fiq;
> 
> Linux doesn't use Group-0 interrupts, as they are strictly secure
> (unless your SoC doesn't have EL3, which I doubt).

Yes, we handle that watchdog interrupt as a Group-0 interrupt, which is 
handled as fiq in EL3.

> 
>> I was thinking, if its feasible to mark that interrupt as pseudo NMI
>> and route it to EL1 as irq. However, looks like that is not the
>> semantic of a NMI and we would need something like pseudo NMI ipi for
>> this.
> 
> Sending a NMI IPI from another NMI handler? Even once I've added these,
> there is no way this will work for that particular scenario. Just look
> at the restrictions we impose on NMIs.
> 

I meant sending a pseudo-NMI IPI (to EL1) from the FIQ handler (which
runs in EL3). I will check, but do you think that might not work?

> Frankly, if all you need to do is to reset the SoC, use EL3 firmware.
> That is what it is for.
> 

Before triggering the SoC reset, we want to collect certain EL1 debug
information, such as stack traces for the CPUs and other debug state.

> Thanks,
> 
> 	M.
> 

Thanks
Neeraj

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member of the Code Aurora Forum, hosted by The Linux Foundation


* Re: Query regarding pseudo nmi support on GIC V3 and request_nmi()
  2020-05-08 12:39       ` Neeraj Upadhyay
@ 2020-05-08 12:53         ` Marc Zyngier
  2020-05-08 13:34           ` Neeraj Upadhyay
  0 siblings, 1 reply; 9+ messages in thread
From: Marc Zyngier @ 2020-05-08 12:53 UTC (permalink / raw)
  To: Neeraj Upadhyay; +Cc: julien.thierry.kdev, linux-kernel

On Fri, 8 May 2020 18:09:00 +0530
Neeraj Upadhyay <neeraju@codeaurora.org> wrote:

> Hi Marc,
> 
> On 5/8/2020 5:57 PM, Marc Zyngier wrote:
> > On Fri, 8 May 2020 16:36:42 +0530
> > Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
> >   
> >> Hi Marc,
> >>
> >> On 5/8/2020 4:15 PM, Marc Zyngier wrote:  
> >>> On Thu, 07 May 2020 17:06:19 +0100,
> >>> Neeraj Upadhyay <neeraju@codeaurora.org> wrote:  
> >>>>
> >>>> Hi,
> >>>>
> >>>> I have one query regarding pseudo NMI support on GIC v3; from what I
> >>>> could understand, GIC v3 supports pseudo NMI setup for SPIs and PPIs.
> >>>> However the request_nmi() in irq framework requires NMI to be per cpu
> >>>> interrupt source (it checks for IRQF_PERCPU). Can you please help
> >>>> understand this part, how SPIs can be configured as NMIs, if there is
> >>>> a per cpu interrupt source restriction?  
> >>>
> >>> Let me answer your question by another question: what is the semantic
> >>> of a NMI if you can't associate it with a particular CPU?  
> >>>   >>  
> >> I was actually thinking of a use case, where, we have a watchdog
> >> interrupt (which is a SPI), which is used for detecting software
> >> hangs and cause device reset; If that interrupt's current cpu
> >> affinity is on a core, where interrupts are disabled, we won't be
> >> able to serve it; so, we need to group that interrupt as an fiq;  
> > 
> > Linux doesn't use Group-0 interrupts, as they are strictly secure
> > (unless your SoC doesn't have EL3, which I doubt).  
> 
> Yes, we handle that watchdog interrupt as a Group-0 interrupt, which
> is handled as fiq in EL3.
> 
> >   
> >> I was thinking, if its feasible to mark that interrupt as pseudo
> >> NMI and route it to EL1 as irq. However, looks like that is not the
> >> semantic of a NMI and we would need something like pseudo NMI ipi
> >> for this.  
> > 
> > Sending a NMI IPI from another NMI handler? Even once I've added
> > these, there is no way this will work for that particular scenario.
> > Just look at the restrictions we impose on NMIs.
> >   
> 
> Sending a pseudo NMI IPI (to EL1) from fiq handler (which runs in
> EL3); I will check, but do you think, that might not work?

How do you know, from EL3, what to write in memory so that the NMI
handler knows what you want to do? Are you going to parse the S1 page
tables? Hard-code the behaviour of some random Linux version in your
legendary non-updatable firmware? This isn't an acceptable behaviour.

An IPI is between two CPUs used by the same SW entity. What runs at
EL3 is completely alien to Linux, and so is Linux to EL3. If you want
to IPI, send Group-0 IPIs that are private to the firmware.

If you want to inject NMI-type exceptions into EL1, you can always try
SDEI (did I actually write this? Help!). But given your use case below,
that wouldn't work either.

> > Frankly, if all you need to do is to reset the SoC, use EL3
> > firmware. That is what it is for.
> >   
> 
> Before triggering SoC reset, we want to collect certain  EL1 debug
> information like stack trace for CPUs and other debug information.

Frankly, if you are going to reset the SoC because EL1/EL2 has gone
bust, how can you trust it to do anything sensible when injecting an
interrupt? Once you take an SPI at EL3, gather whatever state you want
from EL3. Do not involve EL1 at all.

	M.
-- 
Jazz is not dead. It just smells funny...


* Re: Query regarding pseudo nmi support on GIC V3 and request_nmi()
  2020-05-08 12:53         ` Marc Zyngier
@ 2020-05-08 13:34           ` Neeraj Upadhyay
  2020-05-08 16:11             ` Marc Zyngier
  0 siblings, 1 reply; 9+ messages in thread
From: Neeraj Upadhyay @ 2020-05-08 13:34 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: julien.thierry.kdev, linux-kernel

Hi Marc,

On 5/8/2020 6:23 PM, Marc Zyngier wrote:
> On Fri, 8 May 2020 18:09:00 +0530
> Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
> 
>> Hi Marc,
>>
>> On 5/8/2020 5:57 PM, Marc Zyngier wrote:
>>> On Fri, 8 May 2020 16:36:42 +0530
>>> Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
>>>    
>>>> Hi Marc,
>>>>
>>>> On 5/8/2020 4:15 PM, Marc Zyngier wrote:
>>>>> On Thu, 07 May 2020 17:06:19 +0100,
>>>>> Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have one query regarding pseudo NMI support on GIC v3; from what I
>>>>>> could understand, GIC v3 supports pseudo NMI setup for SPIs and PPIs.
>>>>>> However the request_nmi() in irq framework requires NMI to be per cpu
>>>>>> interrupt source (it checks for IRQF_PERCPU). Can you please help
>>>>>> understand this part, how SPIs can be configured as NMIs, if there is
>>>>>> a per cpu interrupt source restriction?
>>>>>
>>>>> Let me answer your question by another question: what is the semantic
>>>>> of a NMI if you can't associate it with a particular CPU?
>>>>>    >>
>>>> I was actually thinking of a use case, where, we have a watchdog
>>>> interrupt (which is a SPI), which is used for detecting software
>>>> hangs and cause device reset; If that interrupt's current cpu
>>>> affinity is on a core, where interrupts are disabled, we won't be
>>>> able to serve it; so, we need to group that interrupt as an fiq;
>>>
>>> Linux doesn't use Group-0 interrupts, as they are strictly secure
>>> (unless your SoC doesn't have EL3, which I doubt).
>>
>> Yes, we handle that watchdog interrupt as a Group-0 interrupt, which
>> is handled as fiq in EL3.
>>
>>>    
>>>> I was thinking, if its feasible to mark that interrupt as pseudo
>>>> NMI and route it to EL1 as irq. However, looks like that is not the
>>>> semantic of a NMI and we would need something like pseudo NMI ipi
>>>> for this.
>>>
>>> Sending a NMI IPI from another NMI handler? Even once I've added
>>> these, there is no way this will work for that particular scenario.
>>> Just look at the restrictions we impose on NMIs.
>>>    
>>
>> Sending a pseudo NMI IPI (to EL1) from fiq handler (which runs in
>> EL3); I will check, but do you think, that might not work?
> 
> How do you know, from EL3, what to write in memory so that the NMI
> handler knows what you want to do? Are you going to parse the S1 page
> tables? Hard-code the behaviour of some random Linux version in your
> legendary non-updatable firmware? This isn't an acceptable behaviour.
> 

Ok, I understand;

The initial thought was to use the watchdog SPI as a pseudo-NMI. However,
as pseudo-NMIs are only for per-CPU sources, we were exploring the
possibility of using an unused IPI (using the work done in [1] and [2]
for SGIs) as a pseudo-NMI, which EL3 sends to EL1 on receiving the
watchdog FIQ. The pseudo-NMI handler would collect the required debug
information to help identify the lockup cause. We weren't thinking of
communicating any information from the EL3 FIQ handler to EL1.

However, from this discussion, I realize that calling an IRQ handler
from the FIQ handler would not be possible, so the approach looks flawed.

I believe allowing a non-per-CPU pseudo-NMI is not acceptable to the community?



[1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=irq/gic-sgi

[2] https://lkml.org/lkml/2020/4/25/135

> An IPI is between two CPUs used by the same SW entitiy. What runs at
> EL3 is completely alien to Linux, and so is Linux to EL3. If you want
> to IPI, send Group-0 IPIs that are private to the firmware.
> 

OK, got it. However, I wonder what the use case is for sending an
SGI to EL1 from the secure world using ICC_ASGI1R. I thought it
allowed communication between EL1 and EL3, but it looks like I
understood it wrong.

> If you want to inject NMI-type exceptions into EL1, you can always try
> SDEI (did I actually write this? Help!). But given your use case below,
> that wouldn't work either.
> 

Ok.

>>> Frankly, if all you need to do is to reset the SoC, use EL3
>>> firmware. That is what it is for.
>>>    
>>
>> Before triggering SoC reset, we want to collect certain  EL1 debug
>> information like stack trace for CPUs and other debug information.
> 
> Frankly, if you are going to reset the SoC because EL1/EL2 has gone
> bust, how can you trust it to do anything sensible when injecting an
> interrupt?. Once you take a SPI at EL3, gather whatever state you want
> from EL3. Do not involve EL1 at all.
> 
> 	M.
> 

Agreed that it might not work in all cases. But for cases where some
kernel code is stuck after disabling local IRQs, a pseudo-NMI might
still be able to run and capture the stack and other debug info to help
detect the cause of the lockup.


Thanks
Neeraj

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member of the Code Aurora Forum, hosted by The Linux Foundation


* Re: Query regarding pseudo nmi support on GIC V3 and request_nmi()
  2020-05-08 13:34           ` Neeraj Upadhyay
@ 2020-05-08 16:11             ` Marc Zyngier
  2020-05-08 16:16               ` Neeraj Upadhyay
  0 siblings, 1 reply; 9+ messages in thread
From: Marc Zyngier @ 2020-05-08 16:11 UTC (permalink / raw)
  To: Neeraj Upadhyay; +Cc: julien.thierry.kdev, linux-kernel

On Fri, 08 May 2020 14:34:10 +0100,
Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
> 
> Hi Marc,
> 
> On 5/8/2020 6:23 PM, Marc Zyngier wrote:
> > On Fri, 8 May 2020 18:09:00 +0530
> > Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
> > 
> >> Hi Marc,
> >> 
> >> On 5/8/2020 5:57 PM, Marc Zyngier wrote:
> >>> On Fri, 8 May 2020 16:36:42 +0530
> >>> Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
> >>>    
> >>>> Hi Marc,
> >>>> 
> >>>> On 5/8/2020 4:15 PM, Marc Zyngier wrote:
> >>>>> On Thu, 07 May 2020 17:06:19 +0100,
> >>>>> Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
> >>>>>> 
> >>>>>> Hi,
> >>>>>> 
> >>>>>> I have one query regarding pseudo NMI support on GIC v3; from what I
> >>>>>> could understand, GIC v3 supports pseudo NMI setup for SPIs and PPIs.
> >>>>>> However the request_nmi() in irq framework requires NMI to be per cpu
> >>>>>> interrupt source (it checks for IRQF_PERCPU). Can you please help
> >>>>>> understand this part, how SPIs can be configured as NMIs, if there is
> >>>>>> a per cpu interrupt source restriction?
> >>>>> 
> >>>>> Let me answer your question by another question: what is the semantic
> >>>>> of a NMI if you can't associate it with a particular CPU?
> >>>>>    >>
> >>>> I was actually thinking of a use case, where, we have a watchdog
> >>>> interrupt (which is a SPI), which is used for detecting software
> >>>> hangs and cause device reset; If that interrupt's current cpu
> >>>> affinity is on a core, where interrupts are disabled, we won't be
> >>>> able to serve it; so, we need to group that interrupt as an fiq;
> >>> 
> >>> Linux doesn't use Group-0 interrupts, as they are strictly secure
> >>> (unless your SoC doesn't have EL3, which I doubt).
> >> 
> >> Yes, we handle that watchdog interrupt as a Group-0 interrupt, which
> >> is handled as fiq in EL3.
> >> 
> >>>    
> >>>> I was thinking, if its feasible to mark that interrupt as pseudo
> >>>> NMI and route it to EL1 as irq. However, looks like that is not the
> >>>> semantic of a NMI and we would need something like pseudo NMI ipi
> >>>> for this.
> >>> 
> >>> Sending a NMI IPI from another NMI handler? Even once I've added
> >>> these, there is no way this will work for that particular scenario.
> >>> Just look at the restrictions we impose on NMIs.
> >>>    
> >> 
> >> Sending a pseudo NMI IPI (to EL1) from fiq handler (which runs in
> >> EL3); I will check, but do you think, that might not work?
> > 
> > How do you know, from EL3, what to write in memory so that the NMI
> > handler knows what you want to do? Are you going to parse the S1 page
> > tables? Hard-code the behaviour of some random Linux version in your
> > legendary non-updatable firmware? This isn't an acceptable behaviour.
> > 
> 
> Ok, I understand;
> 
> Initial thought was to use watchdog SPI as pseudo NMI; however, as
> pseudo NMIs are only per CPU sources, we were exploring the
> possibility of using an unused ipi (using the work which is done in
> [1] and [2]  for SGIs) as pseudo NMI, which EL3 sends to EL1, on
> receiving watchdog fiq. The pseudo NMI handler would collect required
> debug information, to help indentify the lockup cause. We weren't
> thinking of communicating any information from EL3 fiq handler to
> EL1.

What if the operating system running at EL1/EL2 is *not* Linux?

> 
> However, from this discussion, I realize that calling irq handler from
> fiq handler, would not be possible. So, the approach looks flawed.
> 
> I believe, allowing a non-percpu pseudo NMI is not acceptable to community?

No, I really don't want to entertain this idea, because the semantics
are way too loosely defined and you'd end up with everyone wanting
something mildly different.

> > An IPI is between two CPUs used by the same SW entitiy. What runs at
> > EL3 is completely alien to Linux, and so is Linux to EL3. If you want
> > to IPI, send Group-0 IPIs that are private to the firmware.
> > 
> 
> Ok got it; however, I wonder what's the use case of sending
> SGI to EL1, from secure world, using ICC_ASGI1R. I thought it
> allowed communication between EL1 and EL3; but, looks like I
> understood in wrong.

There is what the GIC architecture can do, and there is what is
sensible for Linux. The GIC allows IPIs from S-to-NS as well as the
opposite. This doesn't make it a good idea (it actually is a terrible
idea, and I really hope that future versions of the architecture will
simply kill the feature).

The idea was that you'd make SGIs a first-class ABI between S and
NS. Given that the two are developed separately and that nobody ever
standardised what the SGI numbers mean, this idea is completely dead.

> 
> > If you want to inject NMI-type exceptions into EL1, you can always try
> > SDEI (did I actually write this? Help!). But given your use case below,
> > that wouldn't work either.
> > 
> 
> Ok.
> 
> >>> Frankly, if all you need to do is to reset the SoC, use EL3
> >>> firmware. That is what it is for.
> >>>    
> >> 
> >> Before triggering SoC reset, we want to collect certain  EL1 debug
> >> information like stack trace for CPUs and other debug information.
> > 
> > Frankly, if you are going to reset the SoC because EL1/EL2 has gone
> > bust, how can you trust it to do anything sensible when injecting an
> > interrupt?. Once you take a SPI at EL3, gather whatever state you want
> > from EL3. Do not involve EL1 at all.
> > 
> > 	M.
> > 
> 
> Agree that it might not work for all cases. But, for the cases like,
> some kernel code is stuck after disabling local irqs; pseudo NMI might
> still be able to run and capture stack and other debug info, to help
> detect the cause of lockups.

And for that we'll have pseudo-NMI IPIs, initiated from the kernel
itself as part of the normal debugging infrastructure. It is the
EL3-initiated IPI to EL1 that I strongly oppose. Not to mention
that if the kernel locks up with PSTATE.I set (which still happens on
exception entry), the pseudo-NMI won't work either.
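
The PSTATE.I caveat can be illustrated with a small userspace model,
added here as a sketch under stated assumptions. It assumes pseudo-NMI
is built on GIC priority masking: "disabling interrupts" lowers the
priority mask (PMR) rather than setting PSTATE.I, and NMIs are given a
priority that still passes the lowered mask. model_irq_deliverable() is
a hypothetical name, and the priority and mask values are illustrative
rather than taken from any particular kernel version; the secure and
non-secure views of priority on real hardware are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values; on the GIC a lower number is a higher priority. */
#define PRIO_NORMAL_IRQ  0xa0u
#define PRIO_PSEUDO_NMI  0x20u
#define PMR_IRQS_ON      0xe0u  /* mask that lets normal IRQs and NMIs through */
#define PMR_IRQS_OFF     0x60u  /* mask that lets only NMIs through */

/*
 * Hypothetical model: an interrupt reaches the CPU only if PSTATE.I is
 * clear and its priority is strictly higher (numerically lower) than
 * the current priority mask.
 */
bool model_irq_deliverable(bool pstate_i, uint8_t pmr, uint8_t prio)
{
    if (pstate_i)
        return false;   /* hard mask: nothing gets in, NMI or not */

    return prio < pmr;
}
```

With the mask lowered, the model lets the pseudo-NMI through and blocks
the normal interrupt; with PSTATE.I set, it blocks both, which is the
case described above.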

As I said, you only have two options: either implement everything in
EL3 (and the NS OS doesn't need to know anything at all), or use SDEI
as the architected way to inject an exception into the NS world (and
Linux already supports it).

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: Query regarding pseudo nmi support on GIC V3 and request_nmi()
  2020-05-08 16:11             ` Marc Zyngier
@ 2020-05-08 16:16               ` Neeraj Upadhyay
  0 siblings, 0 replies; 9+ messages in thread
From: Neeraj Upadhyay @ 2020-05-08 16:16 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: julien.thierry.kdev, linux-kernel

Hi Marc,

Thanks a lot for your comments. I will work on exploring how SDEI can be 
used for it.



Thanks
Neeraj

On 5/8/2020 9:41 PM, Marc Zyngier wrote:
> On Fri, 08 May 2020 14:34:10 +0100,
> Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
>>
>> Hi Marc,
>>
>> On 5/8/2020 6:23 PM, Marc Zyngier wrote:
>>> On Fri, 8 May 2020 18:09:00 +0530
>>> Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
>>>
>>>> Hi Marc,
>>>>
>>>> On 5/8/2020 5:57 PM, Marc Zyngier wrote:
>>>>> On Fri, 8 May 2020 16:36:42 +0530
>>>>> Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
>>>>>     
>>>>>> Hi Marc,
>>>>>>
>>>>>> On 5/8/2020 4:15 PM, Marc Zyngier wrote:
>>>>>>> On Thu, 07 May 2020 17:06:19 +0100,
>>>>>>> Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have one query regarding pseudo NMI support on GIC v3; from what I
>>>>>>>> could understand, GIC v3 supports pseudo NMI setup for SPIs and PPIs.
>>>>>>>> However the request_nmi() in irq framework requires NMI to be per cpu
>>>>>>>> interrupt source (it checks for IRQF_PERCPU). Can you please help
>>>>>>>> understand this part, how SPIs can be configured as NMIs, if there is
>>>>>>>> a per cpu interrupt source restriction?
>>>>>>>
>>>>>>> Let me answer your question by another question: what is the semantic
>>>>>>> of a NMI if you can't associate it with a particular CPU?
>>>>>>>     >>
>>>>>> I was actually thinking of a use case, where, we have a watchdog
>>>>>> interrupt (which is a SPI), which is used for detecting software
>>>>>> hangs and cause device reset; If that interrupt's current cpu
>>>>>> affinity is on a core, where interrupts are disabled, we won't be
>>>>>> able to serve it; so, we need to group that interrupt as an fiq;
>>>>>
>>>>> Linux doesn't use Group-0 interrupts, as they are strictly secure
>>>>> (unless your SoC doesn't have EL3, which I doubt).
>>>>
>>>> Yes, we handle that watchdog interrupt as a Group-0 interrupt, which
>>>> is handled as fiq in EL3.
>>>>
>>>>>     
>>>>>> I was thinking, if its feasible to mark that interrupt as pseudo
>>>>>> NMI and route it to EL1 as irq. However, looks like that is not the
>>>>>> semantic of a NMI and we would need something like pseudo NMI ipi
>>>>>> for this.
>>>>>
>>>>> Sending a NMI IPI from another NMI handler? Even once I've added
>>>>> these, there is no way this will work for that particular scenario.
>>>>> Just look at the restrictions we impose on NMIs.
>>>>>     
>>>>
>>>> Sending a pseudo NMI IPI (to EL1) from fiq handler (which runs in
>>>> EL3); I will check, but do you think, that might not work?
>>>
>>> How do you know, from EL3, what to write in memory so that the NMI
>>> handler knows what you want to do? Are you going to parse the S1 page
>>> tables? Hard-code the behaviour of some random Linux version in your
>>> legendary non-updatable firmware? This isn't an acceptable behaviour.
>>>
>>
>> Ok, I understand;
>>
>> Initial thought was to use watchdog SPI as pseudo NMI; however, as
>> pseudo NMIs are only per CPU sources, we were exploring the
>> possibility of using an unused ipi (using the work which is done in
>> [1] and [2]  for SGIs) as pseudo NMI, which EL3 sends to EL1, on
>> receiving watchdog fiq. The pseudo NMI handler would collect required
>> debug information, to help indentify the lockup cause. We weren't
>> thinking of communicating any information from EL3 fiq handler to
>> EL1.
> 
> What if the operating system running at EL1/EL2 is *not* Linux?
> 
>>
>> However, from this discussion, I realize that calling irq handler from
>> fiq handler, would not be possible. So, the approach looks flawed.
>>
>> I believe, allowing a non-percpu pseudo NMI is not acceptable to community?
> 
> No, I really don't want to entertain this idea, because the semantics
> are way too loosely defined and you'd end up with everyone wanting
> something mildly different.
> 
>>> An IPI is between two CPUs used by the same SW entitiy. What runs at
>>> EL3 is completely alien to Linux, and so is Linux to EL3. If you want
>>> to IPI, send Group-0 IPIs that are private to the firmware.
>>>
>>
>> Ok got it; however, I wonder what's the use case of sending
>> SGI to EL1, from secure world, using ICC_ASGI1R. I thought it
>> allowed communication between EL1 and EL3; but, looks like I
>> understood in wrong.
> 
> There is what the GIC architecture can do, and there is what is
> sensible for Linux. The GIC allows IPIs from S-to-NS as well as the
> opposite. This doesn't make it a good idea (it actually is a terrible
> idea, and I really hope that future versions of the architecture will
> simply kill the feature).
> 
> The idea was that you'd make SGIs an first class ABI between S and
> NS. Given that the two are developed separately and that nobody ever
> standardised what the SGI numbers mean, this idea is completely dead.
> 
>>
>>> If you want to inject NMI-type exceptions into EL1, you can always try
>>> SDEI (did I actually write this? Help!). But given your use case below,
>>> that wouldn't work either.
>>>
>>
>> Ok.
>>
>>>>> Frankly, if all you need to do is to reset the SoC, use EL3
>>>>> firmware. That is what it is for.
>>>>>     
>>>>
>>>> Before triggering SoC reset, we want to collect certain  EL1 debug
>>>> information like stack trace for CPUs and other debug information.
>>>
>>> Frankly, if you are going to reset the SoC because EL1/EL2 has gone
>>> bust, how can you trust it to do anything sensible when injecting an
>>> interrupt?. Once you take a SPI at EL3, gather whatever state you want
>>> from EL3. Do not involve EL1 at all.
>>>
>>> 	M.
>>>
>>
>> Agree that it might not work for all cases. But, for the cases like,
>> some kernel code is stuck after disabling local irqs; pseudo NMI might
>> still be able to run and capture stack and other debug info, to help
>> detect the cause of lockups.
> 
> And for that we'll have pseudo-NMI IPIs, initiated from the kernel
> itself as part of the normal debugging infrastructure. It is the EL3
> initiated IPI to EL1 that I strongly oppose against. Not to mention
> that if the kernel locks up with PSTATE.I set (which still happens on
> exception entry), the pseudo-NMI won't work either.
> 
> As I said, you only have two options: either implement everything in
> EL3 (and the NS OS doesn't need to know anything at all), or use SDEI
> as the architected way to inject an exception into the NS world (and
> Linux already supports it).
> 
> 	M.
> 

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member of the Code Aurora Forum, hosted by The Linux Foundation

