linux-kernel.vger.kernel.org archive mirror
From: Robin Murphy <robin.murphy@arm.com>
To: Ming Lei <ming.lei@redhat.com>, John Garry <john.garry@huawei.com>
Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	iommu@lists.linux-foundation.org, Will Deacon <will@kernel.org>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [bug report] iommu_dma_unmap_sg() is very slow then running IO from remote numa node
Date: Thu, 22 Jul 2021 18:40:18 +0100
Message-ID: <0adbe03b-ce26-e4d3-3425-d967bc436ef5@arm.com>
In-Reply-To: <YPmUoBk9u+tU2rbS@T590>

On 2021-07-22 16:54, Ming Lei wrote:
[...]
>> If you are still keen to investigate more, then you can try either of these:
>>
>> - add iommu.strict=0 to the cmdline
>>
>> - use perf record+annotate to find the hotspot
>>    - For this you need to enable pseudo-NMI, which takes two steps:
>>      CONFIG_ARM64_PSEUDO_NMI=y in defconfig
>>      Add irqchip.gicv3_pseudo_nmi=1 to the cmdline
>>
>>      See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/Kconfig#n1745
>>      Your kernel log should show:
>>      [    0.000000] GICv3: Pseudo-NMIs enabled using forced ICC_PMR_EL1 synchronisation
> 
> OK, will try the above tomorrow.

Thanks, I was also going to suggest the latter, since what
arm_smmu_cmdq_issue_cmdlist() does with IRQs masked should be the best
indication of where the slowness stems from.
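
Before profiling, it is worth confirming that pseudo-NMI is really active, otherwise perf cannot sample the IRQs-masked sections. A minimal sketch of that check follows; the helper names are hypothetical, and the strings matched are only the cmdline option and log line quoted above.

```python
def pseudo_nmi_requested(cmdline: str) -> bool:
    """True if the kernel command line asks for GICv3 pseudo-NMIs."""
    # Options on the command line are whitespace-separated.
    return "irqchip.gicv3_pseudo_nmi=1" in cmdline.split()


def pseudo_nmi_enabled(kernel_log: str) -> bool:
    """True if the boot log reports that pseudo-NMIs were enabled."""
    return any("GICv3: Pseudo-NMIs enabled" in line
               for line in kernel_log.splitlines())
```

Feed the first helper the contents of /proc/cmdline and the second the saved output of dmesg. If the option is requested but the log line is missing, the kernel was most likely built without CONFIG_ARM64_PSEUDO_NMI=y.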

FWIW I would expect iommu.strict=0 to give a proportional reduction in 
SMMU overhead for both cases since it should effectively mean only 1/256 
as many invalidations are issued.
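
To put a rough number on that, here is a back-of-the-envelope sketch. It assumes that non-strict mode batches exactly 256 unmaps per invalidation, which is one reading of the 1/256 figure, and it ignores any timer-driven flushes or per-CPU queueing. The function name and the `batch` parameter are hypothetical.

```python
import math


def invalidations(unmaps: int, strict: bool, batch: int = 256) -> int:
    """Estimate how many invalidations a given number of unmaps triggers.

    strict=True models iommu.strict=1: one invalidation per unmap.
    strict=False models iommu.strict=0 under the assumption that
    unmaps are batched `batch` at a time before a single invalidation.
    """
    if strict:
        return unmaps
    return math.ceil(unmaps / batch)
```

Under these assumptions, a run that performs 1,000,000 unmaps issues 1,000,000 invalidations in strict mode and 3,907 in non-strict mode. If the local and remote node results do not both improve by a similar proportion, that would itself be a useful data point.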

Could you also check whether the SMMU platform devices have "numa_node" 
properties exposed in sysfs (and if so whether the values look right), 
and share all the SMMU output from the boot log?
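
Something along these lines would collect the sysfs values. It is only a sketch: the `*smmu*` glob is an assumption about how the SMMU platform devices are named on this system, so adjust it to match whatever shows up under /sys/bus/platform/devices.

```python
from pathlib import Path


def smmu_numa_nodes(root: str = "/sys/bus/platform/devices") -> dict:
    """Map each SMMU platform device name to its numa_node value.

    The value is the attribute's text (e.g. "0", or "-1" for no
    affinity), or None if the device exposes no numa_node file.
    """
    nodes = {}
    # Assumption: SMMU devices have "smmu" somewhere in their name.
    for dev in sorted(Path(root).glob("*smmu*")):
        attr = dev / "numa_node"
        nodes[dev.name] = attr.read_text().strip() if attr.is_file() else None
    return nodes
```

A value of -1 means no node affinity was described for that device, which is the case I am most interested in ruling out.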

I still suspect that the most significant bottleneck is MMIO access
across chips, which incurs the CML/CCIX latency twice for every single
read. It's also possible that the performance of the SMMU itself is
reduced if its NUMA affinity is not described and we end up allocating
things like pagetables on the wrong node as well.

>> But my impression is that this may be a HW implementation issue, considering
>> we don't see such a huge drop off on our HW.
> 
> Besides ampere-mtjade, we saw bad nvme performance on ThunderX2® CN99XX too,
> but I don't have a CN99XX system to check whether the issue is the same as
> this one.

I know Cavium's SMMU implementation didn't support MSIs, so that case 
would quite possibly lean towards the MMIO polling angle as well (albeit 
with a very different interconnect).

Robin.

