From: Yipeng Zou <zouyipeng@huawei.com>
To: James Gowans <jgowans@amazon.com>, Thomas Gleixner <tglx@linutronix.de>
Cc: <linux-kernel@vger.kernel.org>,
	David Woodhouse <dwmw@amazon.co.uk>,
	Marc Zyngier <maz@kernel.org>,
	KarimAllah Raslan <karahmed@amazon.com>
Subject: Re: [PATCH] irq: fasteoi handler re-runs on concurrent invoke
Date: Fri, 17 Mar 2023 18:12:38 +0800
Message-ID: <f0879a30-6f88-30e0-ce30-e230df8f2936@huawei.com>
In-Reply-To: <20230317095300.4076497-1-jgowans@amazon.com>


On 2023/3/17 17:53, James Gowans wrote:
> Update the generic handle_fasteoi_irq to cater for the case when the
> next interrupt comes in while the previous handler is still running.
> Currently when that happens the irq_may_run() early out causes the next
> IRQ to be lost. Change the behaviour to mark the interrupt as pending
> and re-run the handler when handle_fasteoi_irq sees that the pending
> flag has been set again. This is largely inspired by handle_edge_irq.
>
> Generally it should not be possible for the next interrupt to arrive
> while the previous handler is still running: the next interrupt should
> only arrive after the EOI message has been sent and the previous handler
> has returned. However, there is a race where if the interrupt affinity
> is changed while the previous handler is running, then the next
> interrupt can arrive at a different CPU while the previous handler is
> still running. In that case there will be a concurrent invoke and the
> early out will be taken.
>
> For example:
>
>             CPU 0             |          CPU 1
> -----------------------------|-----------------------------
> interrupt start              |
>    handle_fasteoi_irq         | set_affinity(CPU 1)
>      handler                  |
>      ...                      | interrupt start
>      ...                      |   handle_fasteoi_irq -> early out
>    handle_fasteoi_irq return  | interrupt end
> interrupt end                |
>
> This issue was observed specifically on an arm64 system with a GIC-v3
> handling MSIs; GIC-v3 uses the handle_fasteoi_irq handler. The issue is
> that the global ITS is responsible for affinity but does not know
> whether interrupts are pending/running, only the CPU-local redistributor
> handles the EOI. Hence when the affinity is changed in the ITS, the new
> CPU's redistributor does not know that the original CPU is still running
> the handler.
>
> There are a few uncertainties about this implementation compared to the
> prior art in handle_edge_irq:
>
> 1. Do we need to mask the IRQ and then unmask it later? I don't think so
> but it's not entirely clear why handle_edge_irq does this anyway; it's
> an edge IRQ so not sure why it needs to be masked.
>
> 2. Should the EOI delivery be inside the do loop after every handler
> run? I think outside the loops is best; only inform the chip to deliver
> more IRQs once all pending runs have been finished.
>
> 3. Do we need to check that desc->action is still set? I don't know if
> it can be un-set while the IRQ is still enabled.
>
> 4. There is now more overlap with the handle_edge_eoi_irq handler;
> should we try to unify these?
>
> Signed-off-by: James Gowans <jgowans@amazon.com>
> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: KarimAllah Raslan <karahmed@amazon.com>
> ---
>   Documentation/core-api/genericirq.rst | 9 ++++++++-
>   kernel/irq/chip.c                     | 9 +++++++--
>   2 files changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/core-api/genericirq.rst b/Documentation/core-api/genericirq.rst
> index f959c9b53f61..b54485eca8b5 100644
> --- a/Documentation/core-api/genericirq.rst
> +++ b/Documentation/core-api/genericirq.rst
> @@ -240,7 +240,14 @@ which only need an EOI at the end of the handler.
>   
>   The following control flow is implemented (simplified excerpt)::
>   
> -    handle_irq_event(desc->action);
> +    if (desc->status & running) {
> +        desc->status |= pending;
> +        return;
> +    }
> +    do {
> +        desc->status &= ~pending;
> +        handle_irq_event(desc->action);
> +    } while (status & pending);
>       desc->irq_data.chip->irq_eoi();
>   
>   
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index 49e7bc871fec..4e5fc2b7e8a9 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -692,8 +692,10 @@ void handle_fasteoi_irq(struct irq_desc *desc)
>   
>   	raw_spin_lock(&desc->lock);
>   
> -	if (!irq_may_run(desc))
> +	if (!irq_may_run(desc)) {
> +		desc->istate |= IRQS_PENDING;
>   		goto out;
> +	}
>   
>   	desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING);
>   
> @@ -711,7 +713,10 @@ void handle_fasteoi_irq(struct irq_desc *desc)
>   	if (desc->istate & IRQS_ONESHOT)
>   		mask_irq(desc);
>   
> -	handle_irq_event(desc);
> +	do {
> +		handle_irq_event(desc);
> +	} while (unlikely((desc->istate & IRQS_PENDING) &&
> +			!irqd_irq_disabled(&desc->irq_data)));
>   
>   	cond_unmask_eoi_irq(desc, chip);
>   

Hi,

Finally, someone else has hit this problem as well.

I sent a patch for the same issue a few weeks ago [1]; it looks like we arrived at essentially the same solution (I introduced a new flow handler).

Hopefully this issue can be fixed soon.

[1] https://lore.kernel.org/all/20230310101417.1081434-1-zouyipeng@huawei.com/
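
To make the shared idea concrete, here is a minimal, purely illustrative userspace sketch of the pending/re-run flow shown in the documentation hunk quoted above. The names (fake_desc, FAKE_RUNNING, FAKE_PENDING) are invented for the example and are not kernel API; the real handler additionally runs under desc->lock, deals with masking/oneshot, and finishes with the chip's irq_eoi().

/*
 * Illustrative sketch only, not kernel code: a tiny single-threaded model
 * of the pending/re-run flow from the documentation excerpt quoted above.
 * All names (fake_desc, FAKE_RUNNING, FAKE_PENDING, ...) are invented for
 * this example; the real handler runs under desc->lock and finishes with
 * the chip's irq_eoi() callback.
 */
#include <stdio.h>

#define FAKE_RUNNING	0x1	/* a handler invocation is in progress  */
#define FAKE_PENDING	0x2	/* a concurrent invocation was deferred */

static struct fake_desc {
	unsigned int status;
	int handled;		/* how many times the "handler" ran */
} desc;

static void fake_handle_fasteoi(void);

static void fake_handler(void)
{
	desc.handled++;
	/*
	 * On the first run, pretend the affinity was just moved and "CPU 1"
	 * takes the next interrupt while "CPU 0" is still in the handler.
	 * The nested call stands in for that concurrent invocation.
	 */
	if (desc.handled == 1)
		fake_handle_fasteoi();
}

static void fake_handle_fasteoi(void)
{
	if (desc.status & FAKE_RUNNING) {
		/* Concurrent invoke: remember it instead of dropping it. */
		desc.status |= FAKE_PENDING;
		return;
	}

	desc.status |= FAKE_RUNNING;
	do {
		desc.status &= ~FAKE_PENDING;
		fake_handler();
	} while (desc.status & FAKE_PENDING);
	desc.status &= ~FAKE_RUNNING;
	/* Only here would the real flow handler send irq_eoi(). */
}

int main(void)
{
	fake_handle_fasteoi();
	printf("handler ran %d time(s)\n", desc.handled);	/* prints 2 */
	return 0;
}

Built with any C compiler this prints "handler ran 2 time(s)": the nested call models the concurrent invocation from the second CPU, which is now deferred and re-run instead of being lost at the early out.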

-- 
Regards,
Yipeng Zou


Thread overview: 28+ messages
2023-03-17  9:53 [PATCH] irq: fasteoi handler re-runs on concurrent invoke James Gowans
2023-03-17 10:12 ` Yipeng Zou [this message]
2023-03-17 11:49   ` Gowans, James
2023-03-22  6:26     ` Yipeng Zou
2023-03-22  7:48       ` Gowans, James
2023-03-22 10:37         ` Thomas Gleixner
2023-04-03 13:17           ` zhangjianhua (E)
2023-04-03 13:19             ` Marc Zyngier
2023-04-03 13:39               ` Gowans, James
2023-03-22 10:38         ` Yipeng Zou
2023-04-09 11:41 ` Marc Zyngier
2023-04-11 10:27   ` Gowans, James
2023-04-12 13:32     ` Marc Zyngier
2023-04-18  2:39       ` Yipeng Zou
2023-04-18 10:56       ` Gowans, James
2023-04-19  3:08         ` Yipeng Zou
2023-05-02  8:43         ` Gowans, James
2023-05-23  3:16           ` liaochang (A)
2023-05-25 10:04             ` Gowans, James
2023-05-29  2:47               ` Liao, Chang
2023-05-30 21:47                 ` Gowans, James
     [not found]           ` <86sfcfghqh.wl-maz@kernel.org>
2023-05-23 12:47             ` Gowans, James
2023-05-25 12:31               ` Liao, Chang
2023-05-02 10:28         ` Marc Zyngier
2023-05-23  3:16       ` liaochang (A)
2023-05-23  3:15 ` liaochang (A)
2023-05-23 11:59   ` Gowans, James
2023-05-25 12:31     ` Liao, Chang
