linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Julien Thierry <julien.thierry@arm.com>
To: Zenghui Yu <yuzenghui@huawei.com>,
	Marc Zyngier <marc.zyngier@arm.com>,
	rostedt@goodmis.org
Cc: mingo@redhat.com, catalin.marinas@arm.com, will.deacon@arm.com,
	mark.rutland@arm.com, linux@armlinux.org.uk, james.morse@arm.com,
	suzuki.poulose@arm.com, wanghaibin.wang@huawei.com,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] irqchip/gic-v3: Make gic_handle_irq() notrace
Date: Tue, 2 Apr 2019 15:00:05 +0100	[thread overview]
Message-ID: <aad77b33-7eb0-57a6-2d60-1cb717c1e1c7@arm.com> (raw)
In-Reply-To: <bf434b0c-2b17-b92d-3171-a57203486c84@huawei.com>

Hi,

On 29/03/2019 15:02, Zenghui Yu wrote:
> Hi Marc,
> 
> On 2019/3/29 21:58, Marc Zyngier wrote:
>> Hi Zenghui,
>>
>> On 29/03/2019 13:23, Zenghui Yu wrote:
>>> Enable pseudo NMI together with function_graph tracer, will lead
>>> the system to a hang. This is easy to reproduce,
>>>
>>>    1) Set "irqchip.gicv3_pseudo_nmi=1" on the kernel command line
>>>    2) echo function_graph > /sys/kernel/debug/tracing/current_tracer
>>>
>>> This patch (RFC) set gic_handle_irq() as notrace and it seems works
>>> fine now. But I have no idea about what the issue is exactly, and
>>> you can regard this patch as a report then :)
>>>
>>> Can someone give a look at it and provide some explanations ?
>>>

Thanks for testing this and reporting the issue.

>>> Thanks!
>>>
>>> Cc: Julien Thierry <julien.thierry@arm.com>
>>> Cc: Steven Rostedt <rostedt@goodmis.org>
>>> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>>> ---
>>>   drivers/irqchip/irq-gic-v3.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>>> index 15e55d3..8d0c25f 100644
>>> --- a/drivers/irqchip/irq-gic-v3.c
>>> +++ b/drivers/irqchip/irq-gic-v3.c
>>> @@ -487,7 +487,7 @@ static inline void gic_handle_nmi(u32 irqnr,
>>> struct pt_regs *regs)
>>>           gic_deactivate_unhandled(irqnr);
>>>   }
>>>   -static asmlinkage void __exception_irq_entry gic_handle_irq(struct
>>> pt_regs *regs)
>>> +static asmlinkage notrace void __exception_irq_entry
>>> gic_handle_irq(struct pt_regs *regs)
>>>   {
>>>       u32 irqnr;
>>>  
>>
>>
>> That's interesting. Do you have any out of tree patch that actually
>> makes use of the pseudo-NMI feature? Without those patches, the
>> behaviour should stay unchanged.
> 
> I am at commit 1a9df9e29c2afecf6e3089442d429b377279ca3c. No more
> patches, and this is the most confusing. Just out of curiosity, I
> wanted to run Julien's "Use NMI for perf interrupt" patch (posted
> on the mailing list), so I have to enable NMI first.
> 
> That said, with
>   1) Select Kernel Feature -> Support for NMI-like interrupts
>   2) Set "irqchip.gicv3_pseudo_nmi=1" on the kernel command line
>   3) No pseudo-NMIs have been generated at all
> and this issue was hit.
> 

I finally found out what happens.

When using interrupt priority masking, at the begining of
gic_handle_irq(), we are in this awkward state where we still have the I
bit set and PMR unmasked (this is because we cannot ack the interrupt
that was signaled if it has lower priority than the current priority mask).

To try and keep things simple, we decided that local_irq_*() would only
deal with PMR (when using priority masking). With one additional case
being that, if PSR.I is set when saving flags, we'd represent it in the
form of a value of PMR (i.e. GIC_PRIO_IRQOFF), so that irqs_disabled()
and such would still accurately state that no interrupt could happen in
those sections. The assumption was that in the few sections were we are
having the PSR.I set, we wouldn't care about having interrupts disabled
with PSR.I or PMR. And now that assumption appears to be wrong:

trace_graph_entry(), called at the begining of gic_handle_irq() when
enabling the tracer, does use local_irq_save/restore(). The save
operation returns flags GIC_PRIO_IRQOFF (because PSR.I is set when we
enter gic_handle_irq() ). The restore operation will then proceed to
mask PMR, once we get back to gic_handle_irq() we can't acknowledge the
interrupt we just received...

The function tracer does not appear to save/restore flags on function
entry (I saw save/restore operations in the stack tracer but for some
reason couldn't get them to trigger the issue).

To confirm this, I checked with the following diff (which is not a fix,
it is better to mark gic_handle_irq() as notrace if I don't find
something more suitable).

Cheers,

-- 
Julien Thierry

-->

diff --git a/kernel/trace/trace_functions_graph.c
b/kernel/trace/trace_functions_graph.c
index 69ebf3c..80b941e 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -133,6 +133,7 @@ int trace_graph_entry(struct ftrace_graph_ent *trace)
        int ret;
        int cpu;
        int pc;
+       bool restore = false;

        if (trace_recursion_test(TRACE_GRAPH_NOTRACE_BIT))
                return 0;
@@ -172,7 +173,13 @@ int trace_graph_entry(struct ftrace_graph_ent *trace)
        if (tracing_thresh)
                return 1;

-       local_irq_save(flags);
+       if (!irqs_disabled()) {
+               restore = true;
+               local_irq_save(flags);
+       } else {
+               local_save_flags(flags);
+       }
+
        cpu = raw_smp_processor_id();
        data = per_cpu_ptr(tr->trace_buffer.data, cpu);
        disabled = atomic_inc_return(&data->disabled);
@@ -184,7 +191,8 @@ int trace_graph_entry(struct ftrace_graph_ent *trace)
        }

        atomic_dec(&data->disabled);
-       local_irq_restore(flags);
+       if (restore)
+               local_irq_restore(flags);

        return ret;
 }


  reply	other threads:[~2019-04-02 14:00 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-29 13:23 [RFC PATCH] irqchip/gic-v3: Make gic_handle_irq() notrace Zenghui Yu
2019-03-29 13:58 ` Marc Zyngier
2019-03-29 14:53   ` Steven Rostedt
2019-03-29 15:35     ` Zenghui Yu
2019-03-29 15:45       ` Steven Rostedt
2019-03-29 16:19         ` Zenghui Yu
2019-03-29 15:02   ` Zenghui Yu
2019-04-02 14:00     ` Julien Thierry [this message]
2019-04-03 15:23       ` liwei (GF)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aad77b33-7eb0-57a6-2d60-1cb717c1e1c7@arm.com \
    --to=julien.thierry@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=james.morse@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=marc.zyngier@arm.com \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=suzuki.poulose@arm.com \
    --cc=wanghaibin.wang@huawei.com \
    --cc=will.deacon@arm.com \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).