From: Marc Zyngier <maz@kernel.org>
To: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	'Linux Samsung SOC' <linux-samsung-soc@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	John Garry <john.garry@huawei.com>,
	Xiongfeng Wang <wangxiongfeng2@huawei.com>,
	David Decotigny <ddecotig@google.com>,
	Krzysztof Kozlowski <krzk@kernel.org>
Subject: Re: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs
Date: Wed, 13 Apr 2022 18:26:33 +0100
Message-ID: <878rs8c2t2.wl-maz@kernel.org>
In-Reply-To: <4b7fc13c-887b-a664-26e8-45aed13f048a@samsung.com>

Hi Marek,

On Wed, 13 Apr 2022 15:59:21 +0100,
Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> 
> Hi Marc,
> 
> On 05.04.2022 20:50, Marc Zyngier wrote:
> > When booting with maxcpus=<small number> (or even loading a driver
> > while most CPUs are offline), it is pretty easy to observe managed
> > affinities containing a mix of online and offline CPUs being passed
> > to the irqchip driver.
> >
> > This means that the irqchip cannot trust the affinity passed down
> > from the core code, which is a bit annoying and requires (at least
> > in theory) all drivers to implement some sort of affinity narrowing.
> >
> > In order to address this, always limit the cpumask to the set of
> > online CPUs.
> >
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> 
> This patch landed in linux next-20220413 as commit 33de0aa4bae9 
> ("genirq: Always limit the affinity to online CPUs"). Unfortunately it 
> breaks booting of most 32-bit ARM Samsung Exynos-based boards.
> 
> I don't see anything specific in the log, though. Booting just hangs at 
> some point. The only Samsung Exynos boards that boot properly are the 
> Exynos4412-based ones.
> 
> I assume that this is related to the Multi Core Timer IRQ configuration 
> specific to those SoCs. Exynos4412 uses PPI interrupts, while all other 
> Exynos SoCs have separate IRQ lines for each CPU.
> 
> Let me know how I can help debug this issue.

Thanks for the heads up. Can you pick the last working kernel, enable
CONFIG_GENERIC_IRQ_DEBUGFS, and dump the /sys/kernel/debug/irq/irqs/
entries for the timer IRQs?

Also, see below.

> 
> > ---
> >   kernel/irq/manage.c | 25 +++++++++++++++++--------
> >   1 file changed, 17 insertions(+), 8 deletions(-)
> >
> > diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> > index c03f71d5ec10..f71ecc100545 100644
> > --- a/kernel/irq/manage.c
> > +++ b/kernel/irq/manage.c
> > @@ -222,11 +222,16 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
> >   {
> >   	struct irq_desc *desc = irq_data_to_desc(data);
> >   	struct irq_chip *chip = irq_data_get_irq_chip(data);
> > +	const struct cpumask  *prog_mask;
> >   	int ret;
> >   
> > +	static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
> > +	static struct cpumask tmp_mask;
> > +
> >   	if (!chip || !chip->irq_set_affinity)
> >   		return -EINVAL;
> >   
> > +	raw_spin_lock(&tmp_mask_lock);
> >   	/*
> >   	 * If this is a managed interrupt and housekeeping is enabled on
> >   	 * it check whether the requested affinity mask intersects with
> > @@ -248,24 +253,28 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
> >   	 */
> >   	if (irqd_affinity_is_managed(data) &&
> >   	    housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) {
> > -		const struct cpumask *hk_mask, *prog_mask;
> > -
> > -		static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
> > -		static struct cpumask tmp_mask;
> > +		const struct cpumask *hk_mask;
> >   
> >   		hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
> >   
> > -		raw_spin_lock(&tmp_mask_lock);
> >   		cpumask_and(&tmp_mask, mask, hk_mask);
> >   		if (!cpumask_intersects(&tmp_mask, cpu_online_mask))
> >   			prog_mask = mask;
> >   		else
> >   			prog_mask = &tmp_mask;
> > -		ret = chip->irq_set_affinity(data, prog_mask, force);
> > -		raw_spin_unlock(&tmp_mask_lock);
> >   	} else {
> > -		ret = chip->irq_set_affinity(data, mask, force);
> > +		prog_mask = mask;
> >   	}
> > +
> > +	/* Make sure we only provide online CPUs to the irqchip */
> > +	cpumask_and(&tmp_mask, prog_mask, cpu_online_mask);
> > +	if (!cpumask_empty(&tmp_mask))
> > +		ret = chip->irq_set_affinity(data, &tmp_mask, force);
> > +	else
> > +		ret = -EINVAL;

Can you also check that with the patch applied, it is this path that
is taken and that it is the timer interrupts that get rejected? If
that's the case, can you put a dump_stack() here and give me that
stack trace? The use of irq_force_affinity() in the driver looks
suspicious...

Finally, is there a QEMU emulation of one of these failing boards?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
