linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] irqchip/gic-v3: Make gic_handle_irq() notrace
@ 2019-03-29 13:23 Zenghui Yu
  2019-03-29 13:58 ` Marc Zyngier
  0 siblings, 1 reply; 9+ messages in thread
From: Zenghui Yu @ 2019-03-29 13:23 UTC (permalink / raw)
  To: marc.zyngier, julien.thierry, rostedt
  Cc: mingo, catalin.marinas, will.deacon, mark.rutland, linux,
	james.morse, suzuki.poulose, wanghaibin.wang, linux-arm-kernel,
	linux-kernel, Zenghui Yu

Enable pseudo NMI together with function_graph tracer, will lead
the system to a hang. This is easy to reproduce,

  1) Set "irqchip.gicv3_pseudo_nmi=1" on the kernel command line
  2) echo function_graph > /sys/kernel/debug/tracing/current_tracer

This patch (RFC) set gic_handle_irq() as notrace and it seems works
fine now. But I have no idea about what the issue is exactly, and
you can regard this patch as a report then :)

Can someone give a look at it and provide some explanations ?

Thanks!

Cc: Julien Thierry <julien.thierry@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
---
 drivers/irqchip/irq-gic-v3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 15e55d3..8d0c25f 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -487,7 +487,7 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs)
 		gic_deactivate_unhandled(irqnr);
 }
 
-static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
+static asmlinkage notrace void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
 {
 	u32 irqnr;
 
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH] irqchip/gic-v3: Make gic_handle_irq() notrace
  2019-03-29 13:23 [RFC PATCH] irqchip/gic-v3: Make gic_handle_irq() notrace Zenghui Yu
@ 2019-03-29 13:58 ` Marc Zyngier
  2019-03-29 14:53   ` Steven Rostedt
  2019-03-29 15:02   ` Zenghui Yu
  0 siblings, 2 replies; 9+ messages in thread
From: Marc Zyngier @ 2019-03-29 13:58 UTC (permalink / raw)
  To: Zenghui Yu, julien.thierry, rostedt
  Cc: mingo, catalin.marinas, will.deacon, mark.rutland, linux,
	james.morse, suzuki.poulose, wanghaibin.wang, linux-arm-kernel,
	linux-kernel

Hi Zenghui,

On 29/03/2019 13:23, Zenghui Yu wrote:
> Enable pseudo NMI together with function_graph tracer, will lead
> the system to a hang. This is easy to reproduce,
> 
>   1) Set "irqchip.gicv3_pseudo_nmi=1" on the kernel command line
>   2) echo function_graph > /sys/kernel/debug/tracing/current_tracer
> 
> This patch (RFC) set gic_handle_irq() as notrace and it seems works
> fine now. But I have no idea about what the issue is exactly, and
> you can regard this patch as a report then :)
> 
> Can someone give a look at it and provide some explanations ?
> 
> Thanks!
> 
> Cc: Julien Thierry <julien.thierry@arm.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
> ---
>  drivers/irqchip/irq-gic-v3.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 15e55d3..8d0c25f 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -487,7 +487,7 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs)
>  		gic_deactivate_unhandled(irqnr);
>  }
>  
> -static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
> +static asmlinkage notrace void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
>  {
>  	u32 irqnr;
>  
> 


That's interesting. Do you have any out of tree patch that actually
makes use of the pseudo-NMI feature? Without those patches, the
behaviour should stay unchanged.

On the other hand, if you can generate pseudo-NMIs, you could end-up
tracing gic_handle_irq whilst being inside the tracing code with
interrupts being notionally disabled (and that could be pretty bad).

So, patches or no patches?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH] irqchip/gic-v3: Make gic_handle_irq() notrace
  2019-03-29 13:58 ` Marc Zyngier
@ 2019-03-29 14:53   ` Steven Rostedt
  2019-03-29 15:35     ` Zenghui Yu
  2019-03-29 15:02   ` Zenghui Yu
  1 sibling, 1 reply; 9+ messages in thread
From: Steven Rostedt @ 2019-03-29 14:53 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Zenghui Yu, julien.thierry, mingo, catalin.marinas, will.deacon,
	mark.rutland, linux, james.morse, suzuki.poulose,
	wanghaibin.wang, linux-arm-kernel, linux-kernel

On Fri, 29 Mar 2019 13:58:40 +0000
Marc Zyngier <marc.zyngier@arm.com> wrote:

> On the other hand, if you can generate pseudo-NMIs, you could end-up
> tracing gic_handle_irq whilst being inside the tracing code with
> interrupts being notionally disabled (and that could be pretty bad).

Actually, that should still be safe. The tracing code is expected to
trace NMIs.

Now the entry of an NMI can be an issue because of the way tracing is
enabled. But this would also cause function tracing to crash, which was
not stated. Does function tracing have the same issue, or is this just
with function_graph tracing?

This is because a breakpoint is added to all the places that are going
to be traced so that the "nops" at the beginning of function calls are
going to be converted to calls to the tracer. Now that means a
breakpoint is being added at the beginning of gic_handle_irq(). I don't
know how this pseudo NMI works, but we have notrace set for do_NMI
because breakpoints at the entry (before it can fix things) causes
issues. But this still doesn't make sense because the gic_handle_irq()
call doesn't fix things up either, so other functions that are traced
by gic_handle_irq() should blow up too, which appears (by the patch)
not to be the case.

-- Steve

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH] irqchip/gic-v3: Make gic_handle_irq() notrace
  2019-03-29 13:58 ` Marc Zyngier
  2019-03-29 14:53   ` Steven Rostedt
@ 2019-03-29 15:02   ` Zenghui Yu
  2019-04-02 14:00     ` Julien Thierry
  1 sibling, 1 reply; 9+ messages in thread
From: Zenghui Yu @ 2019-03-29 15:02 UTC (permalink / raw)
  To: Marc Zyngier, julien.thierry, rostedt
  Cc: mingo, catalin.marinas, will.deacon, mark.rutland, linux,
	james.morse, suzuki.poulose, wanghaibin.wang, linux-arm-kernel,
	linux-kernel

Hi Marc,

On 2019/3/29 21:58, Marc Zyngier wrote:
> Hi Zenghui,
> 
> On 29/03/2019 13:23, Zenghui Yu wrote:
>> Enable pseudo NMI together with function_graph tracer, will lead
>> the system to a hang. This is easy to reproduce,
>>
>>    1) Set "irqchip.gicv3_pseudo_nmi=1" on the kernel command line
>>    2) echo function_graph > /sys/kernel/debug/tracing/current_tracer
>>
>> This patch (RFC) set gic_handle_irq() as notrace and it seems works
>> fine now. But I have no idea about what the issue is exactly, and
>> you can regard this patch as a report then :)
>>
>> Can someone give a look at it and provide some explanations ?
>>
>> Thanks!
>>
>> Cc: Julien Thierry <julien.thierry@arm.com>
>> Cc: Steven Rostedt <rostedt@goodmis.org>
>> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>> ---
>>   drivers/irqchip/irq-gic-v3.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>> index 15e55d3..8d0c25f 100644
>> --- a/drivers/irqchip/irq-gic-v3.c
>> +++ b/drivers/irqchip/irq-gic-v3.c
>> @@ -487,7 +487,7 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs)
>>   		gic_deactivate_unhandled(irqnr);
>>   }
>>   
>> -static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
>> +static asmlinkage notrace void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
>>   {
>>   	u32 irqnr;
>>   
>>
> 
> 
> That's interesting. Do you have any out of tree patch that actually
> makes use of the pseudo-NMI feature? Without those patches, the
> behaviour should stay unchanged.

I am at commit 1a9df9e29c2afecf6e3089442d429b377279ca3c. No more
patches, and this is the most confusing. Just out of curiosity, I
wanted to run Julien's "Use NMI for perf interrupt" patch (posted
on the mailing list), so I have to enable NMI first.

That said, with
   1) Select Kernel Feature -> Support for NMI-like interrupts
   2) Set "irqchip.gicv3_pseudo_nmi=1" on the kernel command line
   3) No pseudo-NMIs have been generated at all
and this issue was hit.


thanks,

zenghui

> 
> On the other hand, if you can generate pseudo-NMIs, you could end-up
> tracing gic_handle_irq whilst being inside the tracing code with
> interrupts being notionally disabled (and that could be pretty bad).
> 
> So, patches or no patches?
> 
> Thanks,
> 
> 	M.
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH] irqchip/gic-v3: Make gic_handle_irq() notrace
  2019-03-29 14:53   ` Steven Rostedt
@ 2019-03-29 15:35     ` Zenghui Yu
  2019-03-29 15:45       ` Steven Rostedt
  0 siblings, 1 reply; 9+ messages in thread
From: Zenghui Yu @ 2019-03-29 15:35 UTC (permalink / raw)
  To: Steven Rostedt, Marc Zyngier
  Cc: julien.thierry, mingo, catalin.marinas, will.deacon,
	mark.rutland, linux, james.morse, suzuki.poulose,
	wanghaibin.wang, linux-arm-kernel, linux-kernel

Hi Steven,

On 2019/3/29 22:53, Steven Rostedt wrote:
> On Fri, 29 Mar 2019 13:58:40 +0000
> Marc Zyngier <marc.zyngier@arm.com> wrote:
> 
>> On the other hand, if you can generate pseudo-NMIs, you could end-up
>> tracing gic_handle_irq whilst being inside the tracing code with
>> interrupts being notionally disabled (and that could be pretty bad).
> 
> Actually, that should still be safe. The tracing code is expected to
> trace NMIs.
> 
> Now the entry of an NMI can be an issue because of the way tracing is
> enabled. But this would also cause function tracing to crash, which was
> not stated. Does function tracing have the same issue, or is this just
> with function_graph tracing?

This is just with function_graph tracing. Function tracing works fine.

> 
> This is because a breakpoint is added to all the places that are going
> to be traced so that the "nops" at the beginning of function calls are
> going to be converted to calls to the tracer. Now that means a
> breakpoint is being added at the beginning of gic_handle_irq(). I don't
> know how this pseudo NMI works, but we have notrace set for do_NMI
> because breakpoints at the entry (before it can fix things) causes
> issues. But this still doesn't make sense because the gic_handle_irq()
> call doesn't fix things up either, so other functions that are traced
> by gic_handle_irq() should blow up too, which appears (by the patch)
> not to be the case.

Thanks for your explanation. I will read the code around further, and
get back to you if there's any discovery.


thanks,

zenghui

> 
> -- Steve
> 
> .
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH] irqchip/gic-v3: Make gic_handle_irq() notrace
  2019-03-29 15:35     ` Zenghui Yu
@ 2019-03-29 15:45       ` Steven Rostedt
  2019-03-29 16:19         ` Zenghui Yu
  0 siblings, 1 reply; 9+ messages in thread
From: Steven Rostedt @ 2019-03-29 15:45 UTC (permalink / raw)
  To: Zenghui Yu
  Cc: Marc Zyngier, julien.thierry, mingo, catalin.marinas,
	will.deacon, mark.rutland, linux, james.morse, suzuki.poulose,
	wanghaibin.wang, linux-arm-kernel, linux-kernel

On Fri, 29 Mar 2019 23:35:59 +0800
Zenghui Yu <yuzenghui@huawei.com> wrote:

> Hi Steven,
> 
> On 2019/3/29 22:53, Steven Rostedt wrote:
> > On Fri, 29 Mar 2019 13:58:40 +0000
> > Marc Zyngier <marc.zyngier@arm.com> wrote:
> >   
> >> On the other hand, if you can generate pseudo-NMIs, you could end-up
> >> tracing gic_handle_irq whilst being inside the tracing code with
> >> interrupts being notionally disabled (and that could be pretty bad).  
> > 
> > Actually, that should still be safe. The tracing code is expected to
> > trace NMIs.
> > 
> > Now the entry of an NMI can be an issue because of the way tracing is
> > enabled. But this would also cause function tracing to crash, which was
> > not stated. Does function tracing have the same issue, or is this just
> > with function_graph tracing?  
> 
> This is just with function_graph tracing. Function tracing works fine.

Then we can rule out the break point issue.

> 
> > 
> > This is because a breakpoint is added to all the places that are going
> > to be traced so that the "nops" at the beginning of function calls are
> > going to be converted to calls to the tracer. Now that means a
> > breakpoint is being added at the beginning of gic_handle_irq(). I don't
> > know how this pseudo NMI works, but we have notrace set for do_NMI
> > because breakpoints at the entry (before it can fix things) causes
> > issues. But this still doesn't make sense because the gic_handle_irq()
> > call doesn't fix things up either, so other functions that are traced
> > by gic_handle_irq() should blow up too, which appears (by the patch)
> > not to be the case.  
> 
> Thanks for your explanation. I will read the code around further, and
> get back to you if there's any discovery.

Well, if it only affects function_graph tracing, then we don't need to
investigate the break point issue. If it was the break point issue, it
would cause function tracing to break too.

Also, which architecture is this for? ARM or ARM64?

-- Steve

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH] irqchip/gic-v3: Make gic_handle_irq() notrace
  2019-03-29 15:45       ` Steven Rostedt
@ 2019-03-29 16:19         ` Zenghui Yu
  0 siblings, 0 replies; 9+ messages in thread
From: Zenghui Yu @ 2019-03-29 16:19 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Zenghui Yu, Mark Rutland, julien.thierry, Marc Zyngier,
	catalin.marinas, suzuki.poulose, Will Deacon, linux, LKML,
	Ingo Molnar, james.morse, wanghaibin.wang, linux-arm-kernel

On Fri, Mar 29, 2019 at 11:46 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Fri, 29 Mar 2019 23:35:59 +0800
> Zenghui Yu <yuzenghui@huawei.com> wrote:
>
> > Hi Steven,
> >
> > On 2019/3/29 22:53, Steven Rostedt wrote:
> > > On Fri, 29 Mar 2019 13:58:40 +0000
> > > Marc Zyngier <marc.zyngier@arm.com> wrote:
> > >
> > >> On the other hand, if you can generate pseudo-NMIs, you could end-up
> > >> tracing gic_handle_irq whilst being inside the tracing code with
> > >> interrupts being notionally disabled (and that could be pretty bad).
> > >
> > > Actually, that should still be safe. The tracing code is expected to
> > > trace NMIs.
> > >
> > > Now the entry of an NMI can be an issue because of the way tracing is
> > > enabled. But this would also cause function tracing to crash, which was
> > > not stated. Does function tracing have the same issue, or is this just
> > > with function_graph tracing?
> >
> > This is just with function_graph tracing. Function tracing works fine.
>
> Then we can rule out the break point issue.
>
> >
> > >
> > > This is because a breakpoint is added to all the places that are going
> > > to be traced so that the "nops" at the beginning of function calls are
> > > going to be converted to calls to the tracer. Now that means a
> > > breakpoint is being added at the beginning of gic_handle_irq(). I don't
> > > know how this pseudo NMI works, but we have notrace set for do_NMI
> > > because breakpoints at the entry (before it can fix things) causes
> > > issues. But this still doesn't make sense because the gic_handle_irq()
> > > call doesn't fix things up either, so other functions that are traced
> > > by gic_handle_irq() should blow up too, which appears (by the patch)
> > > not to be the case.
> >
> > Thanks for your explanation. I will read the code around further, and
> > get back to you if there's any discovery.
>
> Well, if it only affects function_graph tracing, then we don't need to
> investigate the break point issue. If it was the break point issue, it
> would cause function tracing to break too.

OK.

>
> Also, which architecture is this for? ARM or ARM64?

on ARM64.


thanks,

zenghui

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH] irqchip/gic-v3: Make gic_handle_irq() notrace
  2019-03-29 15:02   ` Zenghui Yu
@ 2019-04-02 14:00     ` Julien Thierry
  2019-04-03 15:23       ` liwei (GF)
  0 siblings, 1 reply; 9+ messages in thread
From: Julien Thierry @ 2019-04-02 14:00 UTC (permalink / raw)
  To: Zenghui Yu, Marc Zyngier, rostedt
  Cc: mingo, catalin.marinas, will.deacon, mark.rutland, linux,
	james.morse, suzuki.poulose, wanghaibin.wang, linux-arm-kernel,
	linux-kernel

Hi,

On 29/03/2019 15:02, Zenghui Yu wrote:
> Hi Marc,
> 
> On 2019/3/29 21:58, Marc Zyngier wrote:
>> Hi Zenghui,
>>
>> On 29/03/2019 13:23, Zenghui Yu wrote:
>>> Enable pseudo NMI together with function_graph tracer, will lead
>>> the system to a hang. This is easy to reproduce,
>>>
>>>    1) Set "irqchip.gicv3_pseudo_nmi=1" on the kernel command line
>>>    2) echo function_graph > /sys/kernel/debug/tracing/current_tracer
>>>
>>> This patch (RFC) set gic_handle_irq() as notrace and it seems works
>>> fine now. But I have no idea about what the issue is exactly, and
>>> you can regard this patch as a report then :)
>>>
>>> Can someone give a look at it and provide some explanations ?
>>>

Thanks for testing this and reporting the issue.

>>> Thanks!
>>>
>>> Cc: Julien Thierry <julien.thierry@arm.com>
>>> Cc: Steven Rostedt <rostedt@goodmis.org>
>>> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>>> ---
>>>   drivers/irqchip/irq-gic-v3.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>>> index 15e55d3..8d0c25f 100644
>>> --- a/drivers/irqchip/irq-gic-v3.c
>>> +++ b/drivers/irqchip/irq-gic-v3.c
>>> @@ -487,7 +487,7 @@ static inline void gic_handle_nmi(u32 irqnr,
>>> struct pt_regs *regs)
>>>           gic_deactivate_unhandled(irqnr);
>>>   }
>>>   -static asmlinkage void __exception_irq_entry gic_handle_irq(struct
>>> pt_regs *regs)
>>> +static asmlinkage notrace void __exception_irq_entry
>>> gic_handle_irq(struct pt_regs *regs)
>>>   {
>>>       u32 irqnr;
>>>  
>>
>>
>> That's interesting. Do you have any out of tree patch that actually
>> makes use of the pseudo-NMI feature? Without those patches, the
>> behaviour should stay unchanged.
> 
> I am at commit 1a9df9e29c2afecf6e3089442d429b377279ca3c. No more
> patches, and this is the most confusing. Just out of curiosity, I
> wanted to run Julien's "Use NMI for perf interrupt" patch (posted
> on the mailing list), so I have to enable NMI first.
> 
> That said, with
>   1) Select Kernel Feature -> Support for NMI-like interrupts
>   2) Set "irqchip.gicv3_pseudo_nmi=1" on the kernel command line
>   3) No pseudo-NMIs have been generated at all
> and this issue was hit.
> 

I finally found out what happens.

When using interrupt priority masking, at the begining of
gic_handle_irq(), we are in this awkward state where we still have the I
bit set and PMR unmasked (this is because we cannot ack the interrupt
that was signaled if it has lower priority than the current priority mask).

To try and keep things simple, we decided that local_irq_*() would only
deal with PMR (when using priority masking). With one additional case
being that, if PSR.I is set when saving flags, we'd represent it in the
form of a value of PMR (i.e. GIC_PRIO_IRQOFF), so that irqs_disabled()
and such would still accurately state that no interrupt could happen in
those sections. The assumption was that in the few sections were we are
having the PSR.I set, we wouldn't care about having interrupts disabled
with PSR.I or PMR. And now that assumption appears to be wrong:

trace_graph_entry(), called at the begining of gic_handle_irq() when
enabling the tracer, does use local_irq_save/restore(). The save
operation returns flags GIC_PRIO_IRQOFF (because PSR.I is set when we
enter gic_handle_irq() ). The restore operation will then proceed to
mask PMR, once we get back to gic_handle_irq() we can't acknowledge the
interrupt we just received...

The function tracer does not appear to save/restore flags on function
entry (I saw save/restore operations in the stack tracer but for some
reason couldn't get them to trigger the issue).

To confirm this, I checked with the following diff (which is not a fix,
it is better to mark gic_handle_irq() as notrace if I don't find
something more suitable).

Cheers,

-- 
Julien Thierry

-->

diff --git a/kernel/trace/trace_functions_graph.c
b/kernel/trace/trace_functions_graph.c
index 69ebf3c..80b941e 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -133,6 +133,7 @@ int trace_graph_entry(struct ftrace_graph_ent *trace)
        int ret;
        int cpu;
        int pc;
+       bool restore = false;

        if (trace_recursion_test(TRACE_GRAPH_NOTRACE_BIT))
                return 0;
@@ -172,7 +173,13 @@ int trace_graph_entry(struct ftrace_graph_ent *trace)
        if (tracing_thresh)
                return 1;

-       local_irq_save(flags);
+       if (!irqs_disabled()) {
+               restore = true;
+               local_irq_save(flags);
+       } else {
+               local_save_flags(flags);
+       }
+
        cpu = raw_smp_processor_id();
        data = per_cpu_ptr(tr->trace_buffer.data, cpu);
        disabled = atomic_inc_return(&data->disabled);
@@ -184,7 +191,8 @@ int trace_graph_entry(struct ftrace_graph_ent *trace)
        }

        atomic_dec(&data->disabled);
-       local_irq_restore(flags);
+       if (restore)
+               local_irq_restore(flags);

        return ret;
 }


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH] irqchip/gic-v3: Make gic_handle_irq() notrace
  2019-04-02 14:00     ` Julien Thierry
@ 2019-04-03 15:23       ` liwei (GF)
  0 siblings, 0 replies; 9+ messages in thread
From: liwei (GF) @ 2019-04-03 15:23 UTC (permalink / raw)
  To: Julien Thierry, Zenghui Yu, Marc Zyngier, rostedt
  Cc: mark.rutland, suzuki.poulose, catalin.marinas, will.deacon,
	linux, linux-kernel, mingo, james.morse, wanghaibin.wang,
	linux-arm-kernel, guohanjun,
	huawei.libin@huawei.com >> libin 00196512

Hi Julien,

On 2019/4/2 22:00, Julien Thierry wrote:
I meet this issue by coincidence before too.

> I finally found out what happens.
> 
> When using interrupt priority masking, at the begining of
> gic_handle_irq(), we are in this awkward state where we still have the I
> bit set and PMR unmasked (this is because we cannot ack the interrupt
> that was signaled if it has lower priority than the current priority mask).
> 
> To try and keep things simple, we decided that local_irq_*() would only
> deal with PMR (when using priority masking). With one additional case
> being that, if PSR.I is set when saving flags, we'd represent it in the
> form of a value of PMR (i.e. GIC_PRIO_IRQOFF), so that irqs_disabled()
> and such would still accurately state that no interrupt could happen in
> those sections. The assumption was that in the few sections were we are
> having the PSR.I set, we wouldn't care about having interrupts disabled
> with PSR.I or PMR. And now that assumption appears to be wrong:
I am confused of the logic of the save and restore here, why can't we do the
save & restore exactly?

> trace_graph_entry(), called at the begining of gic_handle_irq() when
> enabling the tracer, does use local_irq_save/restore(). The save
> operation returns flags GIC_PRIO_IRQOFF (because PSR.I is set when we
> enter gic_handle_irq() ). The restore operation will then proceed to
> mask PMR, once we get back to gic_handle_irq() we can't acknowledge the
> interrupt we just received...
> 
> The function tracer does not appear to save/restore flags on function
> entry (I saw save/restore operations in the stack tracer but for some
> reason couldn't get them to trigger the issue).
> 
> To confirm this, I checked with the following diff (which is not a fix,
> it is better to mark gic_handle_irq() as notrace if I don't find
> something more suitable).
> 
I am convinced a notrace marking here will walkaround this issue,but i am
afraid there is still some trouble like this else.
I send my modification to solve this issue just now, it is tested ok so far,
but i am not particularly sure, could you please have a look?

Thanks,
Wei


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-04-03 15:24 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-29 13:23 [RFC PATCH] irqchip/gic-v3: Make gic_handle_irq() notrace Zenghui Yu
2019-03-29 13:58 ` Marc Zyngier
2019-03-29 14:53   ` Steven Rostedt
2019-03-29 15:35     ` Zenghui Yu
2019-03-29 15:45       ` Steven Rostedt
2019-03-29 16:19         ` Zenghui Yu
2019-03-29 15:02   ` Zenghui Yu
2019-04-02 14:00     ` Julien Thierry
2019-04-03 15:23       ` liwei (GF)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).