From: "Gowans, James" <jgowans@amazon.com>
To: "maz@kernel.org" <maz@kernel.org>
Cc: "tglx@linutronix.de" <tglx@linutronix.de>,
	"Raslan, KarimAllah" <karahmed@amazon.com>,
	"Woodhouse, David" <dwmw@amazon.co.uk>,
	"zouyipeng@huawei.com" <zouyipeng@huawei.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Sironi, Filippo" <sironi@amazon.de>,
	"chris.zjh@huawei.com" <chris.zjh@huawei.com>
Subject: Re: [PATCH] irq: fasteoi handler re-runs on concurrent invoke
Date: Tue, 18 Apr 2023 10:56:07 +0000
Message-ID: <7fdfb01590d8e502f384aa0bb0dc9c614caa5dfc.camel@amazon.com>
In-Reply-To: <86pm89kyyt.wl-maz@kernel.org>

On Wed, 2023-04-12 at 14:32 +0100, Marc Zyngier wrote:
> 
> > > > 1. Do we need to mask the IRQ and then unmask it later? I don't think so
> > > > but it's not entirely clear why handle_edge_irq does this anyway; it's
> > > > an edge IRQ so not sure why it needs to be masked.
> > > 
> > > Please measure that cost and weep, specially in the context of
> > > multiple concurrent interrupts serviced by a single ITS (cost of
> > > locking the command queue, of waiting for a full round trip to the ITS
> > > for a couple of commands...).
> > 
Fortunately this mask/unmasking would only happen in the rare(ish) case of the
> > race condition described here being hit. Exactly the same as
> > with handle_edge_irq(), the masking and later unmasking would only be done
> > when irq_may_run() == false due to the race being hit. Considering that this is
> > a rare occurrence, I think we could stomach the occasional overhead? I was more
> > asking if it's actually *necessary* to do this masking/unmasking. I'm not sure
> > it's necessary anyway, hence it wasn't implemented in my patch.
> 
> But does it solve anything? At the point where you mask the interrupt,
> you already have consumed it. You'd still need to make it pending
> somehow, which is what your patch somehow does.

I don't really know - the reason I asked the question is that the related
handle_edge_irq() does this mask/unmasking, and I wasn't quite sure why it
did that and hence if we needed to do something similar.
Anyway, let's focus on your patch instead - I think it's more compelling.
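For reference, the handle_edge_irq() path I had in mind is the irq_may_run()
failure case, which (paraphrasing kernel/irq/chip.c from memory, most of the
function elided) looks roughly like:

void handle_edge_irq(struct irq_desc *desc)
{
	raw_spin_lock(&desc->lock);
	...
	if (!irq_may_run(desc)) {
		/* Record the edge, then mask+ack; the in-progress
		 * invocation unmasks and loops while IRQS_PENDING. */
		desc->istate |= IRQS_PENDING;
		mask_ack_irq(desc);
		goto out_unlock;
	}
	...
}

My question was really whether the mask_ack_irq() there is load-bearing for
correctness, or whether setting IRQS_PENDING alone would have sufficed.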


> > Yes. This bothered me too initially, but on reflection I'm not sure it's
> > actually a problem. One possible issue that came to mind was around CPU
> > offlining, but in the event that a CPU being offlined was running interrupt
> > handlers it wouldn't be able to complete the offline anyway until the handlers
> > were finished, so I don't think this is an issue. Do you see any practical issue
> > with running the handler once more on the original CPU immediately after the
> > affinity has been changed?
> 
> My take on this is that we put the pressure on the CPU we want to move
> away from. I'd rather we put it on the GIC itself, and use its
> Turing-complete powers to force it to redeliver the interrupt at a
> more convenient time.

This idea and implementation looks and works great! It may need a few
tweaks; discussing below.

> 
> From c96d2ab37fe273724f1264fba5f4913259875d56 Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <maz@kernel.org>
> Date: Mon, 10 Apr 2023 10:56:32 +0100
> Subject: [PATCH] irqchip/gicv3-its: Force resend of LPIs taken while already in-progress

Perhaps you can pillage some of my commit message to explain the race here
when you send this patch?
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> 
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index b1b28affb32a..4b2a7cc96eb2 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -223,6 +223,8 @@ struct irq_data {
>   *                               irq_chip::irq_set_affinity() when deactivated.
>   * IRQD_IRQ_ENABLED_ON_SUSPEND - Interrupt is enabled on suspend by irq pm if
>   *                               irqchip have flag IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND set.
> + * IRQD_RESEND_WHEN_IN_PROGRESS - Interrupt may fire when already in progress,
> + *                               needs resending.
>   */
>  enum {
>         IRQD_TRIGGER_MASK               = 0xf,
> @@ -249,6 +251,7 @@ enum {
>         IRQD_HANDLE_ENFORCE_IRQCTX      = (1 << 28),
>         IRQD_AFFINITY_ON_ACTIVATE       = (1 << 29),
>         IRQD_IRQ_ENABLED_ON_SUSPEND     = (1 << 30),
> +       IRQD_RESEND_WHEN_IN_PROGRESS    = (1 << 31),
>  };

Do we really want a new flag here? I'd be keen to fix this race for all
drivers, not just those that know to set this flag. I think the patch
you're suggesting is pretty close to being safe to enable generally? If so,
my preference is for one less config option - just run it always.
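
(As an aside: irqd_needs_resend_when_in_progress(), used in the chip.c hunk
below, isn't defined in the quoted hunks. I assume the full patch adds the
usual accessor pair in include/linux/irq.h, along these lines:

static inline bool irqd_needs_resend_when_in_progress(struct irq_data *d)
{
	return __irqd_to_state(d) & IRQD_RESEND_WHEN_IN_PROGRESS;
}

static inline void irqd_set_resend_when_in_progress(struct irq_data *d)
{
	__irqd_to_state(d) |= IRQD_RESEND_WHEN_IN_PROGRESS;
}

plus a call to irqd_set_resend_when_in_progress() when the ITS maps the
LPI.)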

>  static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index 49e7bc871fec..73546ba8bc43 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -692,8 +692,11 @@ void handle_fasteoi_irq(struct irq_desc *desc)
> 
>         raw_spin_lock(&desc->lock);
> 
> -       if (!irq_may_run(desc))
> +       if (!irq_may_run(desc)) {
> +               if (irqd_needs_resend_when_in_progress(&desc->irq_data))
> +                       check_irq_resend(desc, true);
>                 goto out;
> +       }

This will run check_irq_resend() on the *newly affined* CPU, while the old
one is still running the original handler. AFAICT what will happen is:
check_irq_resend
  try_retrigger
    irq_chip_retrigger_hierarchy
      its_irq_retrigger
... which will cause the ITS to *immediately* re-trigger the IRQ. The
original CPU can still be running the handler in that case.

If that happens, consider what will happen in check_irq_resend() (sketched
below):
- first IRQ comes in, successfully runs try_retrigger and sets IRQS_REPLAY.
- it is *immediately* retriggered by the ITS, and because the original
handler on the other CPU is still running, comes into check_irq_resend again.
- check_irq_resend now observes that IRQS_REPLAY is set and early-outs.
- No more resends; the IRQ is still lost. :-(
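
To illustrate, here's the shape of check_irq_resend() (kernel/irq/resend.c,
paraphrased from memory; level-type and error handling elided):

int check_irq_resend(struct irq_desc *desc, bool inject)
{
	int err = 0;
	...
	/* The early-out that drops the second retrigger: */
	if (desc->istate & IRQS_REPLAY)
		return -EBUSY;

	if (!(desc->istate & IRQS_PENDING) && !inject)
		return 0;

	desc->istate &= ~IRQS_PENDING;

	if (!try_retrigger(desc))
		err = irq_sw_resend(desc);

	/* Mark as replayed on success; IRQS_REPLAY is only cleared once
	 * the interrupt is actually serviced again. */
	if (!err)
		desc->istate |= IRQS_REPLAY;
	return err;
}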

Now I admit the failure mode is getting a bit pathological: two re-triggers
while the original handler is still running. But I was able to hit this on
my test machine by intentionally slowing the handler down by a few dozen
microseconds. Should we cater for this?

I can see two possibilities:
- tweak check_irq_resend() to not early-out in this case but to keep
re-triggering until it eventually runs.
- move the check_irq_resend() call to only happen later, *after* the
original handler has finished running. This would be very similar to what I
suggested in my original patch, except instead of running a do/while loop,
the code would observe that the pending flag was set again and run
check_irq_resend() (rough sketch below).
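
For the second option, a rough sketch of what I have in mind (untested and
hypothetical; names as in your patch):

void handle_fasteoi_irq(struct irq_desc *desc)
{
	struct irq_chip *chip = desc->irq_data.chip;

	raw_spin_lock(&desc->lock);

	if (!irq_may_run(desc)) {
		/* Just record the edge; no retrigger from here. */
		desc->istate |= IRQS_PENDING;
		goto out;
	}
	...
	handle_irq_event(desc);

	/* A concurrent invoke was turned away while the handler ran;
	 * ask the chip to redeliver now that we're done. */
	if (unlikely((desc->istate & IRQS_PENDING) &&
		     irqd_needs_resend_when_in_progress(&desc->irq_data)))
		check_irq_resend(desc, false);

	cond_unmask_eoi_irq(desc, chip);
	raw_spin_unlock(&desc->lock);
	return;
out:
	...
}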

I'm also wondering what will happen for users who don't have the
chip->irq_retrigger callback set and fall back to the tasklet via
irq_sw_resend()... It looks like that will work fine. However, if we do my
suggestion and move check_irq_resend() to the end of handle_fasteoi_irq()
then the tasklet will be scheduled on the old CPU again, which may be
sub-optimal.
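
(For context on the "old CPU" point: irq_sw_resend(), paraphrased from
kernel/irq/resend.c from memory, ends with roughly:

	/* Set it pending and activate the softirq: */
	set_bit(irq, irqs_resend);
	tasklet_schedule(&resend_tasklet);

and tasklet_schedule() queues the tasklet on the calling CPU. So if we move
check_irq_resend() to the tail of handle_fasteoi_irq(), the sw resend would
run on the CPU we have just moved the interrupt away from.)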

JG
