From: Will Deacon <will@kernel.org>
To: John Garry <john.garry@huawei.com>
Cc: Vijay Kilary <vkilari@codeaurora.org>,
Jean-Philippe Brucker <jean-philippe.brucker@arm.com>,
Jon Masters <jcm@redhat.com>, Jan Glauber <jglauber@marvell.com>,
Alex Williamson <alex.williamson@redhat.com>,
iommu@lists.linux-foundation.org,
Jayachandran Chandrasekharan Nair <jnair@marvell.com>,
Robin Murphy <robin.murphy@arm.com>
Subject: Re: [RFC PATCH v2 00/19] Try to reduce lock contention on the SMMUv3 command queue
Date: Wed, 24 Jul 2019 15:48:17 +0100
Message-ID: <20190724144817.kecc6kx7lhitaaac@willie-the-truck>
In-Reply-To: <085c4eb1-e385-04b7-e3ce-f290a80c1779@huawei.com>

On Wed, Jul 24, 2019 at 03:25:07PM +0100, John Garry wrote:
> On 24/07/2019 13:20, Will Deacon wrote:
> > On Wed, Jul 24, 2019 at 10:58:26AM +0100, John Garry wrote:
> > > On 11/07/2019 18:19, Will Deacon wrote:
> > > > This is a significant rework of the RFC I previously posted here:
> > > >
> > > > https://lkml.kernel.org/r/20190611134603.4253-1-will.deacon@arm.com
> > > >
> > > > But this time, it looks like it might actually be worthwhile according
> > > > to my perf profiles, where __iommu_unmap() falls a long way down the
> > > > profile for a multi-threaded netperf run. I'm still relying on others to
> > > > confirm this is useful, however.
> > > >
> > > > Some of the changes since last time are:
> > > >
> > > > * Support for constructing and submitting a list of commands in the
> > > > driver
> > > >
> > > > * Numerous changes to the IOMMU and io-pgtable APIs so that we can
> > > > submit commands in batches
> > > >
> > > > * Removal of cmpxchg() from cmdq_shared_lock() fast-path
> > > >
> > > > * Code restructuring and cleanups
> > > >
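The cmpxchg() removal can be illustrated with a small userspace sketch of the shared/exclusive lock idea (the names, the INT_MIN encoding and the trylock shape here are illustrative assumptions, not the driver's actual code): shared lockers take a single fetch-and-increment on the fast path and only fall back when an exclusive holder is present.

```c
#include <limits.h>
#include <stdatomic.h>

/* Shared/exclusive lock sketch: an exclusive holder parks the counter
 * at INT_MIN; shared holders are non-negative counts. */
static atomic_int lock_val;

static int shared_trylock(void)
{
	/* Fast path: blindly increment, no cmpxchg().  If the result was
	 * non-negative, no exclusive holder existed and we now hold the
	 * lock shared.  While INT_MIN is set, the increment is harmless
	 * because the value stays negative. */
	if (atomic_fetch_add_explicit(&lock_val, 1, memory_order_acquire) >= 0)
		return 0;

	/* Slow path (the real driver would spin with cmpxchg() here);
	 * this sketch just undoes the increment and reports failure. */
	atomic_fetch_sub_explicit(&lock_val, 1, memory_order_relaxed);
	return -1;
}

static void shared_unlock(void)
{
	atomic_fetch_sub_explicit(&lock_val, 1, memory_order_release);
}

static int exclusive_trylock(void)
{
	int expected = 0;

	/* Exclusive acquisition only succeeds from the idle state. */
	return atomic_compare_exchange_strong(&lock_val, &expected, INT_MIN) ? 0 : -1;
}
```

The point of the encoding is that concurrent shared lockers never need a compare-and-swap unless they race with an exclusive holder, so the common case is one atomic add.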
> > > > This currently applies against my iommu/devel branch that Joerg has pulled
> > > > for 5.3. If you want to test it out, I've put everything here:
> > > >
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=iommu/cmdq
> > > >
> > > > Feedback welcome. I appreciate that we're in the merge window, but I
> > > > wanted to get this on the list for people to look at as an RFC.
> > > >
> > >
> > > I tested storage performance on this series, which I think is a better
> > > scenario to test than network performance, since the latter is generally
> > > limited by the network link speed.
> >
> > Interesting, thanks for sharing. Do you also see a similar drop in CPU time
> > to the one reported by Ganapat?
>
> Not really, CPU load reported by fio is mostly the same.

That's a pity. Maybe the cmdq isn't actually getting hit very heavily by
fio.

> > > Baseline performance (will/iommu/devel, commit 9e6ea59f3)
> > > 8x SAS disks D05  839K IOPS
> > > 1x NVMe      D05  454K IOPS
> > > 1x NVMe      D06  442K IOPS
> > >
> > > Patchset performance (will/iommu/cmdq)
> > > 8x SAS disks D05  835K IOPS
> > > 1x NVMe      D05  472K IOPS
> > > 1x NVMe      D06  459K IOPS
> > >
> > > So we see a bit of an NVMe boost, but about the same for 8x disks.
> > > Performance without the IOMMU is about 918K IOPS for 8x disks, so we
> > > are not limited by the medium.
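For what it's worth, the gap against the no-IOMMU figure can be expressed as a relative overhead (an illustrative helper for the arithmetic, not anything from the series):

```c
/* Relative slowdown of a measured IOPS figure against a baseline,
 * as a truncated integer percentage. */
static int overhead_pct(int iops, int baseline_iops)
{
	return (baseline_iops - iops) * 100 / baseline_iops;
}
```

By that measure the 8x SAS case pays roughly 9% with the patchset (835K vs 918K) against roughly 8% on the baseline (839K vs 918K), so the translation cost, not the patchset, dominates that configuration.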
> >
> > It would be nice to know if this performance gap is because of Linux, or
> > simply because of the translation overhead in the SMMU hardware. Are you
> > able to get a perf profile to see where we're spending time?
>
> I'll look to do that, but I'd really expect it to be down to the time linux
> spends on the DMA map and unmaps.

Right, and it would be good to see how much of that is in SMMUv3-specific
code. Another interesting thing to try would be reducing the depth of the
io-pgtable. We currently key that off VA_BITS, which may be much larger
than you need (by virtue of being a compile-time value).
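To put numbers on the depth point: with 8-byte descriptors, each level of an ARM long-descriptor table resolves (page_shift - 3) bits of VA, so the walk depth is roughly ceil((va_bits - page_shift) / (page_shift - 3)). A sketch of that calculation (an illustrative helper, not an io-pgtable API):

```c
/* Estimate the number of page-table levels needed to map va_bits of
 * input address with the given granule, assuming 8-byte descriptors
 * (so each level resolves page_shift - 3 bits). */
static int pgtable_levels(int va_bits, int page_shift)
{
	int bits_per_level = page_shift - 3;    /* e.g. 9 bits for 4K granules */
	int bits_to_map = va_bits - page_shift; /* bits above the page offset */

	return (bits_to_map + bits_per_level - 1) / bits_per_level;
}
```

So dropping from 48-bit to 39-bit input addresses with 4K granules saves a whole level per walk (4 levels down to 3), and 64K granules at 42 bits need only 2.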

Will
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Thread overview: 37+ messages
2019-07-11 17:19 [RFC PATCH v2 00/19] Try to reduce lock contention on the SMMUv3 command queue Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 01/19] iommu: Remove empty iommu_tlb_range_add() callback from iommu_ops Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 02/19] iommu/io-pgtable-arm: Remove redundant call to io_pgtable_tlb_sync() Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 03/19] iommu/io-pgtable: Rename iommu_gather_ops to iommu_flush_ops Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 04/19] iommu: Introduce struct iommu_iotlb_gather for batching TLB flushes Will Deacon
2019-07-24 7:19 ` Joerg Roedel
2019-07-24 7:41 ` Will Deacon
2019-07-25 7:58 ` Joerg Roedel
2019-07-11 17:19 ` [RFC PATCH v2 05/19] iommu: Introduce iommu_iotlb_gather_add_page() Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 06/19] iommu: Pass struct iommu_iotlb_gather to ->unmap() and ->iotlb_sync() Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 07/19] iommu/io-pgtable: Introduce tlb_flush_walk() and tlb_flush_leaf() Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 08/19] iommu/io-pgtable: Hook up ->tlb_flush_walk() and ->tlb_flush_leaf() in drivers Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 09/19] iommu/io-pgtable-arm: Call ->tlb_flush_walk() and ->tlb_flush_leaf() Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 10/19] iommu/io-pgtable: Replace ->tlb_add_flush() with ->tlb_add_page() Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 11/19] iommu/io-pgtable: Remove unused ->tlb_sync() callback Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 12/19] iommu/io-pgtable: Pass struct iommu_iotlb_gather to ->unmap() Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 13/19] iommu/io-pgtable: Pass struct iommu_iotlb_gather to ->tlb_add_page() Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 14/19] iommu/arm-smmu-v3: Separate s/w and h/w views of prod and cons indexes Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 15/19] iommu/arm-smmu-v3: Drop unused 'q' argument from Q_OVF macro Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 16/19] iommu/arm-smmu-v3: Move low-level queue fields out of arm_smmu_queue Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 17/19] iommu/arm-smmu-v3: Operate directly on low-level queue where possible Will Deacon
2019-07-11 17:19 ` [RFC PATCH v2 18/19] iommu/arm-smmu-v3: Reduce contention during command-queue insertion Will Deacon
2019-07-19 11:04 ` John Garry
2019-07-24 12:15 ` Will Deacon
2019-07-24 14:03 ` John Garry
2019-07-24 14:07 ` Will Deacon
2019-07-24 8:20 ` John Garry
2019-07-24 14:33 ` Will Deacon
2019-07-25 11:31 ` John Garry
2019-07-11 17:19 ` [RFC PATCH v2 19/19] iommu/arm-smmu-v3: Defer TLB invalidation until ->iotlb_sync() Will Deacon
2019-07-19 4:25 ` [RFC PATCH v2 00/19] Try to reduce lock contention on the SMMUv3 command queue Ganapatrao Kulkarni
2019-07-24 12:28 ` Will Deacon
2019-07-24 9:58 ` John Garry
2019-07-24 12:20 ` Will Deacon
2019-07-24 14:25 ` John Garry
2019-07-24 14:48 ` Will Deacon [this message]
2019-07-25 10:11 ` John Garry