linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: John Garry <john.garry@huawei.com>
Cc: linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	Jason Cooper <jason@lakedaemon.net>,
	chenxiang <chenxiang66@hisilicon.com>,
	Robin Murphy <robin.murphy@arm.com>,
	luojiaxing@huawei.com, Ming Lei <ming.lei@redhat.com>,
	Zhou Wang <wangzhou1@hisilicon.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Will Deacon <will@kernel.org>
Subject: Re: [PATCH v3 0/2] irqchip/gic-v3-its: Balance LPI affinity across CPUs
Date: Fri, 15 May 2020 11:14:39 +0100	[thread overview]
Message-ID: <668f819c8747104814245cd6faebdd9a@kernel.org> (raw)
In-Reply-To: <7c05b08b-2edc-7f97-0175-898e9772673e@huawei.com>

Hi John,

On 2020-05-14 13:05, John Garry wrote:
>> 
>> +       its_inc_lpi_count(d, cpu);
>> +
>>          return IRQ_SET_MASK_OK_DONE;
>>   }
>> 
>> Results look ok:
>>      nvme.use_threaded_interrupts=1    =0*
>> Before    950K IOPs            1000K IOPs
>> After    1100K IOPs            1150K IOPs
>> 
>> * as mentioned before, this is quite unstable and causes lockups. 
>> JFYI, there was an attempt to fix this:
>> 
>> https://lore.kernel.org/linux-nvme/20191209175622.1964-1-kbusch@kernel.org/
>> 
> 
> Hi Marc,
> 
> Just wondering if we can try to get this series over the line?

Absolutely. Life has got in the way, so let me page it back in...

> So I tested the patches on v5.7-rc5, and get similar performance
> improvement to above.
> 
> I did apply a couple of patches, below, to remedy the issues I
> experienced for my D06CS.

Comments on that below.

> 
> Thanks,
> John
> 
> 
> ---->8
> 
> 
> [PATCH 1/2] irqchip/gic-v3-its: Don't double account for target CPU
>  assignment
> 
> In its_set_affinity(), when a managed irq is already assigned to a CPU,
> we may needlessly reassign the irq to another CPU.
> 
> This is because when selecting the target CPU, being the least loaded
> CPU in the mask, we account of that irq still being assigned to a CPU;
> thereby we may unfairly select another CPU.
> 
> Modify this behaviour to pre-decrement the current target CPU LPI count
> when finding the least loaded CPU.
> 
> Alternatively we may be able to just bail out early when the current
> target CPU already falls within the requested mask.
> 
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c 
> b/drivers/irqchip/irq-gic-v3-its.c
> index 73f5c12..2b18feb 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -1636,6 +1636,8 @@ static int its_set_affinity(struct irq_data *d,
> const struct cpumask *mask_val,
>  	if (irqd_is_forwarded_to_vcpu(d))
>  		return -EINVAL;
> 
> +	its_dec_lpi_count(d, its_dev->event_map.col_map[id]);
> +
>  	if (!force)
>  		cpu = its_select_cpu(d, mask_val);
>  	else
> @@ -1646,14 +1648,14 @@ static int its_set_affinity(struct irq_data
> *d, const struct cpumask *mask_val,
> 
>  	/* don't set the affinity when the target cpu is same as current one 
> */
>  	if (cpu != its_dev->event_map.col_map[id]) {
> -		its_inc_lpi_count(d, cpu);
> -		its_dec_lpi_count(d, its_dev->event_map.col_map[id]);
>  		target_col = &its_dev->its->collections[cpu];
>  		its_send_movi(its_dev, target_col, id);
>  		its_dev->event_map.col_map[id] = cpu;
>  		irq_data_update_effective_affinity(d, cpumask_of(cpu));
>  	}
> 
> +	its_inc_lpi_count(d, cpu);
> +

I'm OK with that change, as it removes unnecessary churn.

>  	return IRQ_SET_MASK_OK_DONE;
>  }
> 
> ---
> 
> 
> [PATCH 2/2] irqchip/gic-v3-its: Handle no overlap of non-managed irq
>  affinity mask
> 
> In selecting the target CPU for a non-managed interrupt, we may select 
> a
> target CPU outside the requested affinity mask.
> 
> This is because there may be no overlap of the ITS node mask and the
> requested CPU affinity mask. The requested affinity mask may be coming
> from userspace or some drivers which try to set irq affinity, see [0].
> 
> In this case, just ignore the ITS node cpumask. This is a deviation 
> from
> what Thomas described. Having said that, I am not sure if the
> interrupt is ever bound to a node for us.
> 
> [0] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/perf/hisilicon/hisi_uncore_pmu.c#n417
> 
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c 
> b/drivers/irqchip/irq-gic-v3-its.c
> index 2b18feb..12d5d4b4 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -1584,10 +1584,6 @@ static int its_select_cpu(struct irq_data *d,
>  			cpumask_and(tmpmask, cpumask_of_node(node), aff_mask);
>  			cpumask_and(tmpmask, tmpmask, cpu_online_mask);
> 
> -			/* If that doesn't work, try the nodemask itself */
> -			if (cpumask_empty(tmpmask))
> -				cpumask_and(tmpmask, cpumask_of_node(node), cpu_online_mask);
> -
>  			cpu = cpumask_pick_least_loaded(d, tmpmask);
>  			if (cpu < nr_cpu_ids)
>  				goto out;

I'm really not sure. Shouldn't we then drop the wider search on
cpu_inline_mask, because userspace could have given us something
that we cannot deal with?

What you are advocating for is a strict adherence to the provided
mask, and it doesn't seem to be what other architectures are providing.
I consider the userspace-provided affinity as a hint more that anything
else, as in this case the kernel does know better (routing the interrupt
to a foreign node might be costly, or even impossible, see the TX1
erratum).

 From what I remember of the earlier discussion, you saw an issue on
a system with two sockets and a single ITS, with the node mask limited
to the first socket. Is that correct?

I'll respin the series today and report it with you first patch
squased in.

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...

  reply	other threads:[~2020-05-15 10:14 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-16 11:54 [PATCH v3 0/2] irqchip/gic-v3-its: Balance LPI affinity across CPUs Marc Zyngier
2020-03-16 11:54 ` [PATCH v3 1/2] irqchip/gic-v3-its: Track LPI distribution on a per CPU basis Marc Zyngier
2020-03-16 11:54 ` [PATCH v3 2/2] irqchip/gic-v3-its: Balance initial LPI affinity across CPUs Marc Zyngier
2020-03-16 13:02   ` John Garry
2020-03-16 13:14     ` Marc Zyngier
2020-03-17 18:43       ` John Garry
2020-03-18 14:16         ` Marc Zyngier
2020-03-18 14:25           ` John Garry
2020-03-18 12:22   ` John Garry
2020-03-18 14:04     ` Marc Zyngier
2020-03-18 15:34       ` John Garry
2020-03-18 17:30         ` Marc Zyngier
2020-03-18 19:00           ` John Garry
2020-03-27 17:52   ` John Garry
2020-03-19 12:31 ` [PATCH v3 0/2] irqchip/gic-v3-its: Balance " John Garry
2020-03-27 17:47   ` John Garry
2020-04-01 11:33     ` John Garry
2020-05-14 12:05       ` John Garry
2020-05-15 10:14         ` Marc Zyngier [this message]
2020-05-15 11:50           ` John Garry
2020-05-15 15:37             ` Marc Zyngier
2020-05-15 16:15               ` John Garry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=668f819c8747104814245cd6faebdd9a@kernel.org \
    --to=maz@kernel.org \
    --cc=chenxiang66@hisilicon.com \
    --cc=jason@lakedaemon.net \
    --cc=john.garry@huawei.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luojiaxing@huawei.com \
    --cc=ming.lei@redhat.com \
    --cc=robin.murphy@arm.com \
    --cc=tglx@linutronix.de \
    --cc=wangzhou1@hisilicon.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).