All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Liao, Chang" <liaochang1@huawei.com>
To: "Gowans, James" <jgowans@amazon.com>,
	"maz@kernel.org" <maz@kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>
Cc: "zouyipeng@huawei.com" <zouyipeng@huawei.com>,
	"Raslan, KarimAllah" <karahmed@amazon.com>,
	"Woodhouse, David" <dwmw@amazon.co.uk>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Sironi, Filippo" <sironi@amazon.de>,
	"chris.zjh@huawei.com" <chris.zjh@huawei.com>
Subject: Re: [PATCH] irq: fasteoi handler re-runs on concurrent invoke
Date: Mon, 29 May 2023 10:47:14 +0800	[thread overview]
Message-ID: <516b8316-fba9-f3d9-2b3a-2536bd747a69@huawei.com> (raw)
In-Reply-To: <7de1c7f0c9ff4a84b3cc3fa15ee39596c08b665f.camel@amazon.com>

Hi, James.

在 2023/5/25 18:04, Gowans, James 写道:
> On Tue, 2023-05-23 at 11:16 +0800, liaochang (A) wrote:
>> Hi, James and Marc,
>>
>> After studying your discussions, I list some requirements need to satify for
>> the final practical solution:
>>
>> 1. Use the GIC to maintain the unhandled LPI.
>> 2. Do not change the semantics of set_irq_affinity, which means that the interrupt
>>    action must be performed on the new CPU when the next interrupt occurs after a
>>    successful set_irq_affinity operation.
>> 3. Minimize the cost, especially to other tasks running on CPUs, which means avoid
>>    a do/while loop on the original CPU and repeatedly resend interrupt on the new CPU.
> 
> Seems reasonable to me.
>>
>> Based on these requirements and Linux v6.3 rev, I propose the following hack:
>>
>> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
>> index 49e7bc871fec..1b49518b19bd 100644
>> --- a/kernel/irq/chip.c
>> +++ b/kernel/irq/chip.c
>> @@ -692,8 +692,14 @@ void handle_fasteoi_irq(struct irq_desc *desc)
>>
>>         raw_spin_lock(&desc->lock);
>>
>> -       if (!irq_may_run(desc))
>> +       /*
>> +        * Ack another interrupt from the same source can occurs on new
>> +        * CPU even before the first one is handled on original CPU.
>> +        */
>> +       if (!irq_may_run(desc)) {
>> +               desc->istate |= IRQS_PENDING;
>>                 goto out;
>> +       }
> 
> Wording may need a slight tweak, especially pointing out why PENDING is
> set.

Sure, see the comment below:

"Ack another interrupt from the same source can occurs on new CPU even before
the first one is handled on original CPU, IRQS_PENDING bit can be reused to
indicate this situation, which will defer the execution of the interrupt handler
function associated with the irq_desc until the first interrupt handler returns."

In summary, the IRQ_PENDINGS ensures that only one interrupt handler is ever
running for a particular source at a time, and the major usages of IRQS_PENDING
in kernel as follows:

1. Used in irq flow handler to indicate that an acknowledged interrupt cannot be
handled immediately due to three different reasons:
- Case1: the interrupt handler function has been unregistered via free_irq().
- Case2: the interrupt has been disabled via irq_disable().
- Case3: the interrupt is an edge-triggered interrupt and its handler is already
         running on the CPU.

In any of these cases, the kernel will defer the execution of the interrupt handler
until the interrupt is enabled and new handler is established again via check_irq_resend(),
or via the inside loop in handle_edge_irq() upon the previous handler returns.

2. Used in the spurious interrupt detector, a few systems with misdescribed IRQ
routing can cause an interrupt to be handled on the wrong CPU. In this situation,
the spurious interrupt detector searches for a recovery handler for the interrupt.
If the found handler is running on another CPU, the spurious interrupt detector
also defers the execution of the recovery handler, similar to case 3 in #1.

I hope this is helpful.

>>
>>         desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING);
>>
>> @@ -715,6 +721,8 @@ void handle_fasteoi_irq(struct irq_desc *desc)
>>
>>         cond_unmask_eoi_irq(desc, chip);
>>
>> +       check_irq_resend(desc, true);
>> +
> 
> I'm pretty sure the second param, "inject", must not be true. As is this
> will *always* resend the interrupt after every interrupt. Infinite
> interrupt loop! (did you test this?)
> I've changed it to false and tested, seems good to me.

Sorry, This code is a very trivial prototype and has not yet been tested.
You are correct that the "inject" parameter should be set to false, otherwise
the IRQS_PENDING check in check_irq_resend() will be skipped. Thank you for
pointing out my mistake.

> 
> I'd suggest adding a comment here which pairs with the comment above
> explaining why this is being called here. Also perhaps adding:
> if (unlikely((desc->istate & IRQS_PENDING))
> ...so that in general this function call will be skipped over.

Agree.

> 
> If you want I'm happy to make these tweaks and post as V2?
> There are also some other comments I'm keen to add to the flag enums to
> make it a bit clearer what some of the flags mean.

Don't see why not :)

Thanks.

> 
> Marc, what are your thoughts on this approach?
> The main (only?) downside I see compared to your suggestion is that in the
> irq_sw_resend() case the tasklet will be scheduled on the original CPU. As
> you mentioned in a previous reply, perhaps we shouldn't bother thinking
> about the sw_resend case now as only GIC is impacted. In future if
> necessary the sw_resend implementation could be tweaked to schedule the
> tasklet on the CPU which the interrupt is affined to.
> 
> JG
> 
>>         raw_spin_unlock(&desc->lock);
>>         return;
>>  out:
>>
>> Looking forward to your feedbacks, thanks.

-- 
BR
Liao, Chang

  reply	other threads:[~2023-05-29  2:47 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-17  9:53 [PATCH] irq: fasteoi handler re-runs on concurrent invoke James Gowans
2023-03-17 10:12 ` Yipeng Zou
2023-03-17 11:49   ` Gowans, James
2023-03-22  6:26     ` Yipeng Zou
2023-03-22  7:48       ` Gowans, James
2023-03-22 10:37         ` Thomas Gleixner
2023-04-03 13:17           ` zhangjianhua (E)
2023-04-03 13:19             ` Marc Zyngier
2023-04-03 13:39               ` Gowans, James
2023-03-22 10:38         ` Yipeng Zou
2023-04-09 11:41 ` Marc Zyngier
2023-04-11 10:27   ` Gowans, James
2023-04-12 13:32     ` Marc Zyngier
2023-04-18  2:39       ` Yipeng Zou
2023-04-18 10:56       ` Gowans, James
2023-04-19  3:08         ` Yipeng Zou
2023-05-02  8:43         ` Gowans, James
2023-05-23  3:16           ` liaochang (A)
2023-05-25 10:04             ` Gowans, James
2023-05-29  2:47               ` Liao, Chang [this message]
2023-05-30 21:47                 ` Gowans, James
     [not found]           ` <86sfcfghqh.wl-maz@kernel.org>
2023-05-23 12:47             ` Gowans, James
2023-05-25 12:31               ` Liao, Chang
2023-05-02 10:28         ` Marc Zyngier
2023-05-23  3:16       ` liaochang (A)
2023-05-23  3:15 ` liaochang (A)
2023-05-23 11:59   ` Gowans, James
2023-05-25 12:31     ` Liao, Chang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=516b8316-fba9-f3d9-2b3a-2536bd747a69@huawei.com \
    --to=liaochang1@huawei.com \
    --cc=chris.zjh@huawei.com \
    --cc=dwmw@amazon.co.uk \
    --cc=jgowans@amazon.com \
    --cc=karahmed@amazon.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=sironi@amazon.de \
    --cc=tglx@linutronix.de \
    --cc=zouyipeng@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.