iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
From: John Garry <john.garry@huawei.com>
To: Will Deacon <will@kernel.org>, <iommu@lists.linux-foundation.org>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>,
	Vijay Kilary <vkilari@codeaurora.org>,
	Jon Masters <jcm@redhat.com>, Jan Glauber <jglauber@marvell.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Jayachandran Chandrasekharan Nair <jnair@marvell.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Robin Murphy <robin.murphy@arm.com>
Subject: Re: [PATCH 00/13] Rework IOMMU API to allow for batching of invalidation
Date: Thu, 15 Aug 2019 12:19:58 +0100	[thread overview]
Message-ID: <78d366b7-590f-b114-1a9a-91dea01cde4d@huawei.com> (raw)
In-Reply-To: <20190814175634.21081-1-will@kernel.org>

On 14/08/2019 18:56, Will Deacon wrote:
> Hi everybody,
>
> These are the core IOMMU changes that I have posted previously as part
> of my ongoing effort to reduce the lock contention of the SMMUv3 command
> queue. I thought it would be better to split this out as a separate
> series, since I think it's ready to go and all the driver conversions
> mean that it's quite a pain for me to maintain out of tree!
>
> The idea of the patch series is to allow TLB invalidation to be batched
> up into a new 'struct iommu_iotlb_gather' structure, which tracks the
> properties of the virtual address range being invalidated so that it
> can be deferred until the driver's ->iotlb_sync() function is called.
> This allows for more efficient invalidation on hardware that can submit
> multiple invalidations in one go.
>
> The previous series was included in:
>
>   https://lkml.kernel.org/r/20190711171927.28803-1-will@kernel.org
>
> The only real change since then is incorporating the newly merged
> virtio-iommu driver.
>
> If you'd like to play with the patches, then I've also pushed them here:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=iommu/unmap
>
> but they should behave as a no-op on their own.

Hi Will,

As anticipated, my storage testing scenarios roughly give parity 
throughput and CPU loading before and after this series.

Patches to convert the
> Arm SMMUv3 driver to the new API are here:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=iommu/cmdq

I quickly tested this again and now I see a performance lift:

			before (5.3-rc1)		after
D05 8x SAS disks	907K IOPS			970K IOPS
D05 1x NVMe		450K IOPS			466K IOPS
D06 1x NVMe		467K IOPS			466K IOPS

The CPU loading seems to track throughput, so nothing much to say there.

Note: From 5.2 testing, I was seeing >900K IOPS from that NVMe disk for 
!IOMMU.

BTW, what were your thoughts on changing 
arm_smmu_atc_inv_domain()->arm_smmu_atc_inv_master() to batching? It 
seems suitable, but looks untouched. Were you waiting for a resolution 
to the performance issue which Leizhen reported?

Thanks,
John

>
> Cheers,
>
> Will
>
> --->8
>
> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Jayachandran Chandrasekharan Nair <jnair@marvell.com>
> Cc: Jan Glauber <jglauber@marvell.com>
> Cc: Jon Masters <jcm@redhat.com>
> Cc: Eric Auger <eric.auger@redhat.com>
> Cc: Zhen Lei <thunder.leizhen@huawei.com>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Vijay Kilary <vkilari@codeaurora.org>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> Cc: David Woodhouse <dwmw2@infradead.org>
>
> Will Deacon (13):
>   iommu: Remove empty iommu_tlb_range_add() callback from iommu_ops
>   iommu/io-pgtable-arm: Remove redundant call to io_pgtable_tlb_sync()
>   iommu/io-pgtable: Rename iommu_gather_ops to iommu_flush_ops
>   iommu: Introduce struct iommu_iotlb_gather for batching TLB flushes
>   iommu: Introduce iommu_iotlb_gather_add_page()
>   iommu: Pass struct iommu_iotlb_gather to ->unmap() and ->iotlb_sync()
>   iommu/io-pgtable: Introduce tlb_flush_walk() and tlb_flush_leaf()
>   iommu/io-pgtable: Hook up ->tlb_flush_walk() and ->tlb_flush_leaf() in
>     drivers
>   iommu/io-pgtable-arm: Call ->tlb_flush_walk() and ->tlb_flush_leaf()
>   iommu/io-pgtable: Replace ->tlb_add_flush() with ->tlb_add_page()
>   iommu/io-pgtable: Remove unused ->tlb_sync() callback
>   iommu/io-pgtable: Pass struct iommu_iotlb_gather to ->unmap()
>   iommu/io-pgtable: Pass struct iommu_iotlb_gather to ->tlb_add_page()
>
>  drivers/gpu/drm/panfrost/panfrost_mmu.c |  24 +++++---
>  drivers/iommu/amd_iommu.c               |  11 ++--
>  drivers/iommu/arm-smmu-v3.c             |  52 +++++++++++-----
>  drivers/iommu/arm-smmu.c                | 103 ++++++++++++++++++++++++--------
>  drivers/iommu/dma-iommu.c               |   9 ++-
>  drivers/iommu/exynos-iommu.c            |   3 +-
>  drivers/iommu/intel-iommu.c             |   3 +-
>  drivers/iommu/io-pgtable-arm-v7s.c      |  57 +++++++++---------
>  drivers/iommu/io-pgtable-arm.c          |  48 ++++++++-------
>  drivers/iommu/iommu.c                   |  24 ++++----
>  drivers/iommu/ipmmu-vmsa.c              |  28 +++++----
>  drivers/iommu/msm_iommu.c               |  42 +++++++++----
>  drivers/iommu/mtk_iommu.c               |  45 +++++++++++---
>  drivers/iommu/mtk_iommu_v1.c            |   3 +-
>  drivers/iommu/omap-iommu.c              |   2 +-
>  drivers/iommu/qcom_iommu.c              |  44 +++++++++++---
>  drivers/iommu/rockchip-iommu.c          |   2 +-
>  drivers/iommu/s390-iommu.c              |   3 +-
>  drivers/iommu/tegra-gart.c              |  12 +++-
>  drivers/iommu/tegra-smmu.c              |   2 +-
>  drivers/iommu/virtio-iommu.c            |   5 +-
>  drivers/vfio/vfio_iommu_type1.c         |  27 +++++----
>  include/linux/io-pgtable.h              |  57 ++++++++++++------
>  include/linux/iommu.h                   |  92 +++++++++++++++++++++-------
>  24 files changed, 483 insertions(+), 215 deletions(-)
>


_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

  parent reply	other threads:[~2019-08-15 11:20 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-14 17:56 [PATCH 00/13] Rework IOMMU API to allow for batching of invalidation Will Deacon
2019-08-14 17:56 ` [PATCH 01/13] iommu: Remove empty iommu_tlb_range_add() callback from iommu_ops Will Deacon
2019-08-14 17:56 ` [PATCH 02/13] iommu/io-pgtable-arm: Remove redundant call to io_pgtable_tlb_sync() Will Deacon
2019-08-15 12:43   ` Robin Murphy
2019-08-15 13:57     ` Will Deacon
2019-08-15 14:23       ` Robin Murphy
2019-08-14 17:56 ` [PATCH 03/13] iommu/io-pgtable: Rename iommu_gather_ops to iommu_flush_ops Will Deacon
2019-08-14 17:56 ` [PATCH 04/13] iommu: Introduce struct iommu_iotlb_gather for batching TLB flushes Will Deacon
2019-08-14 17:56 ` [PATCH 05/13] iommu: Introduce iommu_iotlb_gather_add_page() Will Deacon
2019-08-14 17:56 ` [PATCH 06/13] iommu: Pass struct iommu_iotlb_gather to ->unmap() and ->iotlb_sync() Will Deacon
2019-08-14 17:56 ` [PATCH 07/13] iommu/io-pgtable: Introduce tlb_flush_walk() and tlb_flush_leaf() Will Deacon
2019-08-21 16:01   ` Robin Murphy
2019-08-14 17:56 ` [PATCH 08/13] iommu/io-pgtable: Hook up ->tlb_flush_walk() and ->tlb_flush_leaf() in drivers Will Deacon
2019-08-14 17:56 ` [PATCH 09/13] iommu/io-pgtable-arm: Call ->tlb_flush_walk() and ->tlb_flush_leaf() Will Deacon
2019-08-14 17:56 ` [PATCH 10/13] iommu/io-pgtable: Replace ->tlb_add_flush() with ->tlb_add_page() Will Deacon
2019-08-21 11:42   ` Robin Murphy
2019-08-21 12:05     ` Will Deacon
2019-08-21 12:33       ` Robin Murphy
2019-08-14 17:56 ` [PATCH 11/13] iommu/io-pgtable: Remove unused ->tlb_sync() callback Will Deacon
2019-08-14 17:56 ` [PATCH 12/13] iommu/io-pgtable: Pass struct iommu_iotlb_gather to ->unmap() Will Deacon
2019-08-14 17:56 ` [PATCH 13/13] iommu/io-pgtable: Pass struct iommu_iotlb_gather to ->tlb_add_page() Will Deacon
2019-08-15 11:19 ` John Garry [this message]
2019-08-15 13:55   ` [PATCH 00/13] Rework IOMMU API to allow for batching of invalidation Will Deacon
2019-08-16 10:11     ` John Garry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=78d366b7-590f-b114-1a9a-91dea01cde4d@huawei.com \
    --to=john.garry@huawei.com \
    --cc=alex.williamson@redhat.com \
    --cc=dwmw2@infradead.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jcm@redhat.com \
    --cc=jean-philippe@linaro.org \
    --cc=jglauber@marvell.com \
    --cc=jnair@marvell.com \
    --cc=robin.murphy@arm.com \
    --cc=vkilari@codeaurora.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).