From: Marc Zyngier <maz@kernel.org>
To: John Garry <john.garry@huawei.com>
Cc: linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
Jason Cooper <jason@lakedaemon.net>,
chenxiang <chenxiang66@hisilicon.com>,
Robin Murphy <robin.murphy@arm.com>,
luojiaxing@huawei.com, Ming Lei <ming.lei@redhat.com>,
Zhou Wang <wangzhou1@hisilicon.com>,
Thomas Gleixner <tglx@linutronix.de>,
Will Deacon <will@kernel.org>
Subject: Re: [PATCH v3 0/2] irqchip/gic-v3-its: Balance LPI affinity across CPUs
Date: Fri, 15 May 2020 11:14:39 +0100 [thread overview]
Message-ID: <668f819c8747104814245cd6faebdd9a@kernel.org> (raw)
In-Reply-To: <7c05b08b-2edc-7f97-0175-898e9772673e@huawei.com>
Hi John,
On 2020-05-14 13:05, John Garry wrote:
>>
>> + its_inc_lpi_count(d, cpu);
>> +
>> return IRQ_SET_MASK_OK_DONE;
>> }
>>
>> Results look ok:
>> nvme.use_threaded_interrupts=1 =0*
>> Before 950K IOPs 1000K IOPs
>> After 1100K IOPs 1150K IOPs
>>
>> * as mentioned before, this is quite unstable and causes lockups.
>> JFYI, there was an attempt to fix this:
>>
>> https://lore.kernel.org/linux-nvme/20191209175622.1964-1-kbusch@kernel.org/
>>
>
> Hi Marc,
>
> Just wondering if we can try to get this series over the line?
Absolutely. Life has got in the way, so let me page it back in...
> So I tested the patches on v5.7-rc5, and get similar performance
> improvement to above.
>
> I did apply a couple of patches, below, to remedy the issues I
> experienced for my D06CS.
Comments on that below.
>
> Thanks,
> John
>
>
> ---->8
>
>
> [PATCH 1/2] irqchip/gic-v3-its: Don't double account for target CPU
> assignment
>
> In its_set_affinity(), when a managed irq is already assigned to a CPU,
> we may needlessly reassign the irq to another CPU.
>
> This is because when selecting the target CPU, being the least loaded
> CPU in the mask, we account of that irq still being assigned to a CPU;
> thereby we may unfairly select another CPU.
>
> Modify this behaviour to pre-decrement the current target CPU LPI count
> when finding the least loaded CPU.
>
> Alternatively we may be able to just bail out early when the current
> target CPU already falls within the requested mask.
>
> ---
> drivers/irqchip/irq-gic-v3-its.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c
> b/drivers/irqchip/irq-gic-v3-its.c
> index 73f5c12..2b18feb 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -1636,6 +1636,8 @@ static int its_set_affinity(struct irq_data *d,
> const struct cpumask *mask_val,
> if (irqd_is_forwarded_to_vcpu(d))
> return -EINVAL;
>
> + its_dec_lpi_count(d, its_dev->event_map.col_map[id]);
> +
> if (!force)
> cpu = its_select_cpu(d, mask_val);
> else
> @@ -1646,14 +1648,14 @@ static int its_set_affinity(struct irq_data
> *d, const struct cpumask *mask_val,
>
> /* don't set the affinity when the target cpu is same as current one
> */
> if (cpu != its_dev->event_map.col_map[id]) {
> - its_inc_lpi_count(d, cpu);
> - its_dec_lpi_count(d, its_dev->event_map.col_map[id]);
> target_col = &its_dev->its->collections[cpu];
> its_send_movi(its_dev, target_col, id);
> its_dev->event_map.col_map[id] = cpu;
> irq_data_update_effective_affinity(d, cpumask_of(cpu));
> }
>
> + its_inc_lpi_count(d, cpu);
> +
I'm OK with that change, as it removes unnecessary churn.
> return IRQ_SET_MASK_OK_DONE;
> }
>
> ---
>
>
> [PATCH 2/2] irqchip/gic-v3-its: Handle no overlap of non-managed irq
> affinity mask
>
> In selecting the target CPU for a non-managed interrupt, we may select
> a
> target CPU outside the requested affinity mask.
>
> This is because there may be no overlap of the ITS node mask and the
> requested CPU affinity mask. The requested affinity mask may be coming
> from userspace or some drivers which try to set irq affinity, see [0].
>
> In this case, just ignore the ITS node cpumask. This is a deviation
> from
> what Thomas described. Having said that, I am not sure if the
> interrupt is ever bound to a node for us.
>
> [0]
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/perf/hisilicon/hisi_uncore_pmu.c#n417
>
> ---
> drivers/irqchip/irq-gic-v3-its.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c
> b/drivers/irqchip/irq-gic-v3-its.c
> index 2b18feb..12d5d4b4 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -1584,10 +1584,6 @@ static int its_select_cpu(struct irq_data *d,
> cpumask_and(tmpmask, cpumask_of_node(node), aff_mask);
> cpumask_and(tmpmask, tmpmask, cpu_online_mask);
>
> - /* If that doesn't work, try the nodemask itself */
> - if (cpumask_empty(tmpmask))
> - cpumask_and(tmpmask, cpumask_of_node(node), cpu_online_mask);
> -
> cpu = cpumask_pick_least_loaded(d, tmpmask);
> if (cpu < nr_cpu_ids)
> goto out;
I'm really not sure. Shouldn't we then drop the wider search on
cpu_inline_mask, because userspace could have given us something
that we cannot deal with?
What you are advocating for is a strict adherence to the provided
mask, and it doesn't seem to be what other architectures are providing.
I consider the userspace-provided affinity as a hint more that anything
else, as in this case the kernel does know better (routing the interrupt
to a foreign node might be costly, or even impossible, see the TX1
erratum).
From what I remember of the earlier discussion, you saw an issue on
a system with two sockets and a single ITS, with the node mask limited
to the first socket. Is that correct?
I'll respin the series today and report it with you first patch
squased in.
Thanks,
M.
--
Jazz is not dead. It just smells funny...
next prev parent reply other threads:[~2020-05-15 10:14 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-16 11:54 [PATCH v3 0/2] irqchip/gic-v3-its: Balance LPI affinity across CPUs Marc Zyngier
2020-03-16 11:54 ` [PATCH v3 1/2] irqchip/gic-v3-its: Track LPI distribution on a per CPU basis Marc Zyngier
2020-03-16 11:54 ` [PATCH v3 2/2] irqchip/gic-v3-its: Balance initial LPI affinity across CPUs Marc Zyngier
2020-03-16 13:02 ` John Garry
2020-03-16 13:14 ` Marc Zyngier
2020-03-17 18:43 ` John Garry
2020-03-18 14:16 ` Marc Zyngier
2020-03-18 14:25 ` John Garry
2020-03-18 12:22 ` John Garry
2020-03-18 14:04 ` Marc Zyngier
2020-03-18 15:34 ` John Garry
2020-03-18 17:30 ` Marc Zyngier
2020-03-18 19:00 ` John Garry
2020-03-27 17:52 ` John Garry
2020-03-19 12:31 ` [PATCH v3 0/2] irqchip/gic-v3-its: Balance " John Garry
2020-03-27 17:47 ` John Garry
2020-04-01 11:33 ` John Garry
2020-05-14 12:05 ` John Garry
2020-05-15 10:14 ` Marc Zyngier [this message]
2020-05-15 11:50 ` John Garry
2020-05-15 15:37 ` Marc Zyngier
2020-05-15 16:15 ` John Garry
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=668f819c8747104814245cd6faebdd9a@kernel.org \
--to=maz@kernel.org \
--cc=chenxiang66@hisilicon.com \
--cc=jason@lakedaemon.net \
--cc=john.garry@huawei.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luojiaxing@huawei.com \
--cc=ming.lei@redhat.com \
--cc=robin.murphy@arm.com \
--cc=tglx@linutronix.de \
--cc=wangzhou1@hisilicon.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).