All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marc Zyngier <marc.zyngier@arm.com>
To: Yang Yingliang <yangyingliang@huawei.com>
Cc: <julien.thierry@arm.com>, <linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v2] irqchip/gic-v3-its: fix ITS queue timeout
Date: Wed, 06 Jun 2018 10:13:44 +0100	[thread overview]
Message-ID: <86a7s89t13.wl-marc.zyngier@arm.com> (raw)
In-Reply-To: <1528252824-15144-1-git-send-email-yangyingliang@huawei.com>

On Wed, 06 Jun 2018 03:40:24 +0100,
Yang Yingliang wrote:

[I'm travelling, so please do not expect any quick answer...]

> 
> When the kernel booted with maxcpus=x, 'x' is smaller
> than actual cpu numbers, the TAs of offline cpus won't

TA? Target Address? Target Affinity? Timing Advance? Terrible Acronym?

> be set to its->collection.
> 
> If LPI is bind to offline cpu, sync cmd will use zero TA,
> it leads to ITS queue timeout.  Fix this by choosing a
> online cpu, if there is no online cpu in cpu_mask.

So instead of fixing the emission of a sync command on a non-mapped
collection, you hack set_affinity? It doesn't feel like the right
thing to do.

It is also worth noticing that mapping an LPI to a collection that is
not mapped yet is perfectly legal.

> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 5416f2b..d8b9539 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -2309,7 +2309,9 @@ static int its_irq_domain_activate(struct irq_domain *domain,
>  		cpu_mask = cpumask_of_node(its_dev->its->numa_node);
>  
>  	/* Bind the LPI to the first possible CPU */
> -	cpu = cpumask_first(cpu_mask);
> +	cpu = cpumask_first_and(cpu_mask, cpu_online_mask);
> +	if (cpu >= nr_cpu_ids)
> +		cpu = cpumask_first(cpu_online_mask);

Now you're completely ignoring cpu_mask which constraints the NUMA
affinity. On some systems, this ends up with a deadlock (Cavium TX1,
if I remember well).

Wouldn't it be better to just return that the affinity setting request
is impossible to satisfy? And more to the point, how comes we end-up
in such a case?

Thanks,

	M.

-- 
Jazz is not dead, it just smell funny.

WARNING: multiple messages have this Message-ID (diff)
From: marc.zyngier@arm.com (Marc Zyngier)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2] irqchip/gic-v3-its: fix ITS queue timeout
Date: Wed, 06 Jun 2018 10:13:44 +0100	[thread overview]
Message-ID: <86a7s89t13.wl-marc.zyngier@arm.com> (raw)
In-Reply-To: <1528252824-15144-1-git-send-email-yangyingliang@huawei.com>

On Wed, 06 Jun 2018 03:40:24 +0100,
Yang Yingliang wrote:

[I'm travelling, so please do not expect any quick answer...]

> 
> When the kernel booted with maxcpus=x, 'x' is smaller
> than actual cpu numbers, the TAs of offline cpus won't

TA? Target Address? Target Affinity? Timing Advance? Terrible Acronym?

> be set to its->collection.
> 
> If LPI is bind to offline cpu, sync cmd will use zero TA,
> it leads to ITS queue timeout.  Fix this by choosing a
> online cpu, if there is no online cpu in cpu_mask.

So instead of fixing the emission of a sync command on a non-mapped
collection, you hack set_affinity? It doesn't feel like the right
thing to do.

It is also worth noticing that mapping an LPI to a collection that is
not mapped yet is perfectly legal.

> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 5416f2b..d8b9539 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -2309,7 +2309,9 @@ static int its_irq_domain_activate(struct irq_domain *domain,
>  		cpu_mask = cpumask_of_node(its_dev->its->numa_node);
>  
>  	/* Bind the LPI to the first possible CPU */
> -	cpu = cpumask_first(cpu_mask);
> +	cpu = cpumask_first_and(cpu_mask, cpu_online_mask);
> +	if (cpu >= nr_cpu_ids)
> +		cpu = cpumask_first(cpu_online_mask);

Now you're completely ignoring cpu_mask which constraints the NUMA
affinity. On some systems, this ends up with a deadlock (Cavium TX1,
if I remember well).

Wouldn't it be better to just return that the affinity setting request
is impossible to satisfy? And more to the point, how comes we end-up
in such a case?

Thanks,

	M.

-- 
Jazz is not dead, it just smell funny.

  reply	other threads:[~2018-06-06  9:13 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-06  2:40 [PATCH v2] irqchip/gic-v3-its: fix ITS queue timeout Yang Yingliang
2018-06-06  2:40 ` Yang Yingliang
2018-06-06  9:13 ` Marc Zyngier [this message]
2018-06-06  9:13   ` Marc Zyngier
2018-06-07 12:25   ` Hanjun Guo
2018-06-07 12:25     ` Hanjun Guo
2018-06-10 10:40     ` Marc Zyngier
2018-06-10 10:40       ` Marc Zyngier
2018-06-11  9:31 ` Marc Zyngier
2018-06-11  9:31   ` Marc Zyngier
2018-06-11 11:23   ` Yang Yingliang
2018-06-11 11:23     ` Yang Yingliang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=86a7s89t13.wl-marc.zyngier@arm.com \
    --to=marc.zyngier@arm.com \
    --cc=julien.thierry@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=yangyingliang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.