From: Lu Baolu <baolu.lu@linux.intel.com>
To: "Isaac J. Manjarres" <isaacm@codeaurora.org>,
	iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org
Cc: robin.murphy@arm.com, will@kernel.org, pratikp@codeaurora.org
Subject: Re: [RFC PATCH v5 00/15] Optimizing iommu_[map/unmap] performance
Date: Fri, 11 Jun 2021 11:10:14 +0800
Message-ID: <405c06b2-0f5e-0d9e-5a11-1523522f9d55@linux.intel.com>
In-Reply-To: <20210408171402.12607-1-isaacm@codeaurora.org>

Hi Isaac,

Is there any update on this series? The iommu core part looks good to me,
and I also have some patches for the Intel IOMMU implementation of
[un]map_pages. I am just wondering when the iommu core could get this
optimization.

Best regards,
baolu

On 4/9/21 1:13 AM, Isaac J. Manjarres wrote:
> When unmapping a buffer from an IOMMU domain, the IOMMU framework unmaps
> the buffer at a granule of the largest page size that is supported by
> the IOMMU hardware and fits within the buffer. For every block that
> is unmapped, the IOMMU framework calls into the IOMMU driver, which
> then calls into the io-pgtable framework to walk the page tables, find
> the entry that corresponds to the IOVA, and unmap that entry.
> 
> This can be suboptimal in scenarios where a buffer or a piece of a
> buffer can be split into several contiguous page blocks of the same size.
> For example, consider an IOMMU that supports 4 KB, 2 MB, and 1 GB page
> blocks, and a 4 MB buffer that is being unmapped at IOVA 0. The current
> call-flow results in 4 indirect calls and 2 page table walks to unmap 2
> entries that are next to each other in the page tables, when both
> entries could have been cleared in one shot by the same call.
> 
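
The arithmetic in the example above can be checked with a small userspace
model. This is only a sketch, not kernel code: pick_pgsize(),
cost_per_block() and cost_batched() are hypothetical helpers, each unmapped
block (or run of blocks) is assumed to cost two indirect calls (IOMMU driver,
then io-pgtable) and one page table walk, and the model ignores the fact that
a real run of entries cannot cross a page table boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Page sizes from the example: 4 KB, 2 MB and 1 GB. */
#define SZ_4K (1ULL << 12)
#define SZ_2M (1ULL << 21)
#define SZ_1G (1ULL << 30)

struct unmap_cost {
	unsigned int indirect_calls;
	unsigned int table_walks;
};

/* Largest supported page size that fits in size and is aligned to iova. */
static uint64_t pick_pgsize(uint64_t pgsize_bitmap, uint64_t iova, uint64_t size)
{
	uint64_t best = 0;

	for (int bit = 0; bit < 64; bit++) {
		uint64_t pgsize = 1ULL << bit;

		if (pgsize > size)
			break;
		if (!(pgsize_bitmap & pgsize) || (iova & (pgsize - 1)))
			continue;
		best = pgsize;
	}
	return best;
}

/* One block per call: assumed 2 indirect calls and 1 walk per block. */
static struct unmap_cost cost_per_block(uint64_t bitmap, uint64_t iova, uint64_t size)
{
	struct unmap_cost c = { 0, 0 };

	while (size) {
		uint64_t pgsize = pick_pgsize(bitmap, iova, size);

		assert(pgsize != 0);
		c.indirect_calls += 2;
		c.table_walks += 1;
		iova += pgsize;
		size -= pgsize;
	}
	return c;
}

/* One run of same-sized blocks per call: same cost, but paid once per run. */
static struct unmap_cost cost_batched(uint64_t bitmap, uint64_t iova, uint64_t size)
{
	struct unmap_cost c = { 0, 0 };

	while (size) {
		uint64_t pgsize = pick_pgsize(bitmap, iova, size);

		assert(pgsize != 0);
		do {
			iova += pgsize;
			size -= pgsize;
		} while (size && pick_pgsize(bitmap, iova, size) == pgsize);
		c.indirect_calls += 2;
		c.table_walks += 1;
	}
	return c;
}
```

For the 4 MB buffer at IOVA 0, the per-block model gives 4 indirect calls and
2 walks, and the batched model gives 2 indirect calls and 1 walk.
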
> The same optimization is applicable to mapping buffers as well, so
> these patches add a pair of callbacks, unmap_pages and map_pages, to
> the io-pgtable code and IOMMU drivers. Each callback unmaps or maps
> an IOVA range that consists of a number of pages of the same
> page size that is supported by the IOMMU hardware, which allows for
> manipulating multiple page table entries in the same set of indirect
> calls. The reason for introducing new callbacks is to give other IOMMU
> drivers/io-pgtable formats time to switch to them, so that the
> transition to this approach can be done piecemeal.
> 
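
To illustrate the piecemeal transition described above, here is a toy
userspace sketch, assuming a single flat table of 4 KB entries. The names
toy_table, toy_ops, toy_map(), toy_map_pages() and toy_map_range() are all
hypothetical and do not match the kernel's structures or signatures. The
caller uses the batched op when a driver provides it and otherwise falls back
to one call per page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PGSIZE 4096u
#define TOY_NPTES  512u
#define TOY_VALID  1ULL	/* low bit marks an entry as mapped */

struct toy_table {
	uint64_t pte[TOY_NPTES];
	unsigned int op_calls;	/* number of indirect calls received */
};

struct toy_ops {
	int (*map)(struct toy_table *t, uint64_t iova, uint64_t paddr);
	int (*map_pages)(struct toy_table *t, uint64_t iova, uint64_t paddr,
			 size_t pgcount, size_t *mapped);
};

/* Legacy op: installs exactly one entry per call. */
static int toy_map(struct toy_table *t, uint64_t iova, uint64_t paddr)
{
	size_t idx = iova / TOY_PGSIZE;

	t->op_calls++;
	if (idx >= TOY_NPTES || (t->pte[idx] & TOY_VALID))
		return -1;
	t->pte[idx] = paddr | TOY_VALID;
	return 0;
}

/* Batched op: installs up to pgcount adjacent entries in one call. */
static int toy_map_pages(struct toy_table *t, uint64_t iova, uint64_t paddr,
			 size_t pgcount, size_t *mapped)
{
	size_t idx = iova / TOY_PGSIZE;

	t->op_calls++;
	*mapped = 0;
	for (size_t i = 0; i < pgcount; i++) {
		if (idx + i >= TOY_NPTES || (t->pte[idx + i] & TOY_VALID))
			return -1;	/* *mapped reports what was installed */
		t->pte[idx + i] = (paddr + i * TOY_PGSIZE) | TOY_VALID;
		*mapped += TOY_PGSIZE;
	}
	return 0;
}

/* Caller: requires at least one op, prefers the batched one. */
static int toy_map_range(const struct toy_ops *ops, struct toy_table *t,
			 uint64_t iova, uint64_t paddr, size_t pgcount)
{
	size_t mapped;

	if (!ops->map && !ops->map_pages)
		return -1;
	if (ops->map_pages)
		return ops->map_pages(t, iova, paddr, pgcount, &mapped);
	for (size_t i = 0; i < pgcount; i++)
		if (ops->map(t, iova + i * TOY_PGSIZE, paddr + i * TOY_PGSIZE))
			return -1;
	return 0;
}
```

Mapping 8 pages through a toy_ops that only sets .map costs 8 op calls in this
model, while one that also sets .map_pages produces the same entries with a
single call.
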
> Changes since V4:
> 
> * Fixed type for addr_merge from phys_addr_t to unsigned long so
>    that GENMASK() can be used.
> * Hooked up arm_v7s_[unmap/map]_pages to the io-pgtable ops.
> * Introduced a macro for calculating the number of page table entries
>    for the ARM LPAE io-pgtable format.
> 
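
As a rough illustration of why a GENMASK()-style mask is convenient for
choosing a page size from a bitmap, here is a userspace sketch. It is an
assumption-laden model rather than the code from the series:
SKETCH_GENMASK() is a stand-in for the kernel macro, sketch_pgsize() is a
hypothetical helper, and the bit searches use the GCC/Clang builtins
__builtin_clzl() and __builtin_ctzl().

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Userspace stand-in for GENMASK(h, l): bits l..h set, inclusive. */
#define SKETCH_GENMASK(h, l) \
	(((~0UL) << (l)) & (~0UL >> (BITS_PER_ULONG - 1 - (h))))

/*
 * Pick the largest page size in pgsize_bitmap that fits in size and to
 * which both iova and paddr are aligned. Returns 0 if none qualifies.
 */
static unsigned long sketch_pgsize(unsigned long pgsize_bitmap,
				   unsigned long iova, unsigned long paddr,
				   unsigned long size)
{
	unsigned long addr_merge = iova | paddr;
	unsigned long pgsizes;
	unsigned int top;

	assert(size != 0);

	/* Keep only page sizes that are not larger than size. */
	top = BITS_PER_ULONG - 1 - __builtin_clzl(size);
	pgsizes = pgsize_bitmap & SKETCH_GENMASK(top, 0);

	/* Keep only page sizes the merged address is aligned to. */
	if (addr_merge)
		pgsizes &= SKETCH_GENMASK(__builtin_ctzl(addr_merge), 0);

	if (!pgsizes)
		return 0;

	/* The highest remaining bit is the largest usable page size. */
	return 1UL << (BITS_PER_ULONG - 1 - __builtin_clzl(pgsizes));
}
```

With a bitmap of 4 KB, 2 MB and 1 GB, a 4 MB request at IOVA 0 and physical
address 0 selects 2 MB, while the same request at IOVA 0x1000 selects 4 KB.
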
> Changes since V3:
> 
> * Removed usage of ULL variants of bitops from Will's patches, as
>    they were not needed.
> * Instead of unmapping/mapping exactly pgcount pages, unmap_pages() and
>    map_pages() will unmap and map at most pgcount pages, allowing
>    only some of the pgcount pages to be mapped or unmapped in a call. This
>    was done to simplify the handling in the io-pgtable layer.
> * Extended the existing PTE manipulation methods in io-pgtable-arm
>    to handle multiple entries, per Robin's suggestion, eliminating
>    the need to add functions to clear multiple PTEs.
> * Implemented a naive form of [map/unmap]_pages() for ARM v7s io-pgtable
>    format.
> * arm_[v7s/lpae]_[map/unmap] will call
>    arm_[v7s/lpae]_[map_pages/unmap_pages] with an argument of 1 page.
> * The arm_smmu_[map/unmap] functions have been removed, since they
>    have been replaced by arm_smmu_[map/unmap]_pages.
> 
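
The "at most pgcount pages" semantics from the V3 changes imply that the
caller has to loop until the whole range is handled. A minimal sketch of such
a loop, assuming a toy unmap op that stops at the end of the current
512-entry table and returns the number of bytes it unmapped. The names
demo_unmap_pages(), demo_unmap_range() and demo_calls are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PGSIZE         4096ULL
#define DEMO_PTES_PER_TABLE 512ULL

static unsigned int demo_calls;	/* counts calls into the toy op */

/*
 * Toy op: unmaps at most pgcount pages, but never past the end of the
 * table that contains iova. Returns the number of bytes unmapped.
 */
static size_t demo_unmap_pages(uint64_t iova, size_t pgcount)
{
	uint64_t idx = (iova / DEMO_PGSIZE) % DEMO_PTES_PER_TABLE;
	uint64_t room = DEMO_PTES_PER_TABLE - idx;
	size_t done = pgcount < room ? pgcount : (size_t)room;

	demo_calls++;
	return (size_t)(done * DEMO_PGSIZE);
}

/* Caller: keeps calling the op until size bytes are unmapped. */
static size_t demo_unmap_range(uint64_t iova, size_t size)
{
	size_t unmapped = 0;

	while (unmapped < size) {
		size_t pgcount = (size - unmapped) / DEMO_PGSIZE;
		size_t done = demo_unmap_pages(iova + unmapped, pgcount);

		if (!done)
			break;	/* no progress, so give up */
		unmapped += done;
	}
	return unmapped;
}
```

Unmapping 8 pages that start 2 entries before a table boundary takes two
calls in this model (2 pages, then 6), while 4 pages at IOVA 0 take one.
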
> Changes since V2:
> 
> * Added a check in __iommu_map() for the existence of either the
>    map or map_pages callback, as per Lu's suggestion.
> 
> Changes since V1:
> 
> * Implemented the map_pages() callbacks
> * Integrated Will's patches into this series which
>    address several concerns about how iommu_pgsize() partitioned a
>    buffer (I made a minor change to the patch which changes
>    iommu_pgsize() to use bitmaps by using the ULL variants of
>    the bitops)
> 
> Isaac J. Manjarres (12):
>    iommu/io-pgtable: Introduce unmap_pages() as a page table op
>    iommu: Add an unmap_pages() op for IOMMU drivers
>    iommu/io-pgtable: Introduce map_pages() as a page table op
>    iommu: Add a map_pages() op for IOMMU drivers
>    iommu: Add support for the map_pages() callback
>    iommu/io-pgtable-arm: Prepare PTE methods for handling multiple
>      entries
>    iommu/io-pgtable-arm: Implement arm_lpae_unmap_pages()
>    iommu/io-pgtable-arm: Implement arm_lpae_map_pages()
>    iommu/io-pgtable-arm-v7s: Implement arm_v7s_unmap_pages()
>    iommu/io-pgtable-arm-v7s: Implement arm_v7s_map_pages()
>    iommu/arm-smmu: Implement the unmap_pages() IOMMU driver callback
>    iommu/arm-smmu: Implement the map_pages() IOMMU driver callback
> 
> Will Deacon (3):
>    iommu: Use bitmap to calculate page size in iommu_pgsize()
>    iommu: Split 'addr_merge' argument to iommu_pgsize() into separate
>      parts
>    iommu: Hook up '->unmap_pages' driver callback
> 
>   drivers/iommu/arm/arm-smmu/arm-smmu.c |  18 +--
>   drivers/iommu/io-pgtable-arm-v7s.c    |  50 ++++++-
>   drivers/iommu/io-pgtable-arm.c        | 189 +++++++++++++++++---------
>   drivers/iommu/iommu.c                 | 130 +++++++++++++-----
>   include/linux/io-pgtable.h            |   8 ++
>   include/linux/iommu.h                 |   9 ++
>   6 files changed, 289 insertions(+), 115 deletions(-)
> 

Thread overview: 20+ messages
2021-04-08 17:13 [RFC PATCH v5 00/15] Optimizing iommu_[map/unmap] performance Isaac J. Manjarres
2021-04-08 17:13 ` [RFC PATCH v5 01/15] iommu/io-pgtable: Introduce unmap_pages() as a page table op Isaac J. Manjarres
2021-04-08 17:13 ` [RFC PATCH v5 02/15] iommu: Add an unmap_pages() op for IOMMU drivers Isaac J. Manjarres
2021-04-08 17:13 ` [RFC PATCH v5 03/15] iommu/io-pgtable: Introduce map_pages() as a page table op Isaac J. Manjarres
2021-04-08 17:13 ` [RFC PATCH v5 04/15] iommu: Add a map_pages() op for IOMMU drivers Isaac J. Manjarres
2021-04-08 17:13 ` [RFC PATCH v5 05/15] iommu: Use bitmap to calculate page size in iommu_pgsize() Isaac J. Manjarres
2021-04-08 17:13 ` [RFC PATCH v5 06/15] iommu: Split 'addr_merge' argument to iommu_pgsize() into separate parts Isaac J. Manjarres
2021-04-08 17:13 ` [RFC PATCH v5 07/15] iommu: Hook up '->unmap_pages' driver callback Isaac J. Manjarres
2021-04-08 17:13 ` [RFC PATCH v5 08/15] iommu: Add support for the map_pages() callback Isaac J. Manjarres
2021-04-08 17:13 ` [RFC PATCH v5 09/15] iommu/io-pgtable-arm: Prepare PTE methods for handling multiple entries Isaac J. Manjarres
2021-04-08 17:13 ` [RFC PATCH v5 10/15] iommu/io-pgtable-arm: Implement arm_lpae_unmap_pages() Isaac J. Manjarres
2021-04-08 17:13 ` [RFC PATCH v5 11/15] iommu/io-pgtable-arm: Implement arm_lpae_map_pages() Isaac J. Manjarres
2021-04-20  5:59   ` chenxiang (M)
2021-04-20  5:59     ` chenxiang (M)
2021-04-08 17:13 ` [RFC PATCH v5 12/15] iommu/io-pgtable-arm-v7s: Implement arm_v7s_unmap_pages() Isaac J. Manjarres
2021-04-08 17:14 ` [RFC PATCH v5 13/15] iommu/io-pgtable-arm-v7s: Implement arm_v7s_map_pages() Isaac J. Manjarres
2021-04-08 17:14 ` [RFC PATCH v5 14/15] iommu/arm-smmu: Implement the unmap_pages() IOMMU driver callback Isaac J. Manjarres
2021-04-08 17:14 ` [RFC PATCH v5 15/15] iommu/arm-smmu: Implement the map_pages() " Isaac J. Manjarres
2021-06-11  3:10 ` Lu Baolu [this message]
2021-06-11  3:10   ` [RFC PATCH v5 00/15] Optimizing iommu_[map/unmap] performance Lu Baolu
