linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* Re: [RFC PATCH 2/5] iommu: Add an unmap_pages() op for IOMMU drivers
       [not found] ` <20210331030042.13348-3-isaacm@codeaurora.org>
@ 2021-03-31  4:47   ` Lu Baolu
       [not found]     ` <4c396e68a076f321ed3f406c2c875006@codeaurora.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Lu Baolu @ 2021-03-31  4:47 UTC (permalink / raw)
  To: Isaac J. Manjarres, iommu, linux-arm-kernel
  Cc: baolu.lu, will, robin.murphy, pratikp

On 3/31/21 11:00 AM, Isaac J. Manjarres wrote:
> Add a callback for IOMMU drivers to provide a path for the
> IOMMU framework to call into an IOMMU driver, which can call
> into the io-pgtable code, to unmap a virtually contiguous
> range of pages of the same size.
> 
> For IOMMU drivers that do not specify an unmap_pages() callback,
> the existing logic of unmapping memory one page block at a time
> will be used.
> 
> Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
> Suggested-by: Will Deacon <will@kernel.org>
> ---
>   include/linux/iommu.h | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 5e7fe519430a..9cf81242581a 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -193,6 +193,7 @@ struct iommu_iotlb_gather {
>    * @detach_dev: detach device from an iommu domain
>    * @map: map a physically contiguous memory region to an iommu domain
>    * @unmap: unmap a physically contiguous memory region from an iommu domain
> + * @unmap_pages: unmap a number of pages of the same size from an iommu domain
>    * @flush_iotlb_all: Synchronously flush all hardware TLBs for this domain
>    * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
>    * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
> @@ -245,6 +246,9 @@ struct iommu_ops {
>   		   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
>   	size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
>   		     size_t size, struct iommu_iotlb_gather *iotlb_gather);
> +	size_t (*unmap_pages)(struct iommu_domain *domain, unsigned long iova,
> +			      size_t pgsize, size_t pgcount,
> +			      struct iommu_iotlb_gather *iotlb_gather);

Is it possible to add an equivalent map_pages() callback?

>   	void (*flush_iotlb_all)(struct iommu_domain *domain);
>   	void (*iotlb_sync_map)(struct iommu_domain *domain, unsigned long iova,
>   			       size_t size);
> 

Best regards,
baolu

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 2/5] iommu: Add an unmap_pages() op for IOMMU drivers
       [not found]     ` <4c396e68a076f321ed3f406c2c875006@codeaurora.org>
@ 2021-03-31  5:39       ` Lu Baolu
       [not found]         ` <b9bbc82b3e405e5bc1d28ed073f6d5b4@codeaurora.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Lu Baolu @ 2021-03-31  5:39 UTC (permalink / raw)
  To: isaacm; +Cc: baolu.lu, iommu, linux-arm-kernel, will, robin.murphy, pratikp

On 3/31/21 1:36 PM, isaacm@codeaurora.org wrote:
> On 2021-03-30 21:47, Lu Baolu wrote:
>> On 3/31/21 11:00 AM, Isaac J. Manjarres wrote:
>>> Add a callback for IOMMU drivers to provide a path for the
>>> IOMMU framework to call into an IOMMU driver, which can call
>>> into the io-pgtable code, to unmap a virtually contiguous
>>> range of pages of the same size.
>>>
>>> For IOMMU drivers that do not specify an unmap_pages() callback,
>>> the existing logic of unmapping memory one page block at a time
>>> will be used.
>>>
>>> Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
>>> Suggested-by: Will Deacon <will@kernel.org>
>>> ---
>>>   include/linux/iommu.h | 4 ++++
>>>   1 file changed, 4 insertions(+)
>>>
>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>>> index 5e7fe519430a..9cf81242581a 100644
>>> --- a/include/linux/iommu.h
>>> +++ b/include/linux/iommu.h
>>> @@ -193,6 +193,7 @@ struct iommu_iotlb_gather {
>>>    * @detach_dev: detach device from an iommu domain
>>>    * @map: map a physically contiguous memory region to an iommu domain
>>>    * @unmap: unmap a physically contiguous memory region from an 
>>> iommu domain
>>> + * @unmap_pages: unmap a number of pages of the same size from an 
>>> iommu domain
>>>    * @flush_iotlb_all: Synchronously flush all hardware TLBs for this 
>>> domain
>>>    * @iotlb_sync_map: Sync mappings created recently using @map to 
>>> the hardware
>>>    * @iotlb_sync: Flush all queued ranges from the hardware TLBs and 
>>> empty flush
>>> @@ -245,6 +246,9 @@ struct iommu_ops {
>>>              phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
>>>       size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
>>>                size_t size, struct iommu_iotlb_gather *iotlb_gather);
>>> +    size_t (*unmap_pages)(struct iommu_domain *domain, unsigned long 
>>> iova,
>>> +                  size_t pgsize, size_t pgcount,
>>> +                  struct iommu_iotlb_gather *iotlb_gather);
>>
>> Is it possible to add an equivalent map_pages() callback?
> Yes, map_pages() can be implemented and can leverage a lot of the 
> implementation
> of unmap_pages(). The problem is that if you map several pages in one 
> call, and then
> encounter an error and have to rollback, you should do TLB maintenance, 
> as iommu_map
> does when it encounters an error. However, we can't call iommu_unmap 
> from io-pgtable-arm
> for example. We can call arm_lpae_unmap_pages() from the later patches, 
> but that doesn't
> solve the TLB maintenance issue. Do you have any thoughts on how to 
> address this?

Call unmap_pages() with the same pages and size to roll back. Does it
work?

Best regards,
baolu

>>
>>>       void (*flush_iotlb_all)(struct iommu_domain *domain);
>>>       void (*iotlb_sync_map)(struct iommu_domain *domain, unsigned 
>>> long iova,
>>>                      size_t size);
>>>
>>
>> Best regards,
>> baolu

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 0/5] Optimization for unmapping iommu mapped buffers
       [not found] <20210331030042.13348-1-isaacm@codeaurora.org>
       [not found] ` <20210331030042.13348-3-isaacm@codeaurora.org>
@ 2021-04-01  3:28 ` chenxiang (M)
  2021-04-01 15:33 ` Robin Murphy
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: chenxiang (M) @ 2021-04-01  3:28 UTC (permalink / raw)
  To: Isaac J. Manjarres, iommu, linux-arm-kernel
  Cc: will, robin.murphy, pratikp, linuxarm

Hi Isaac,


在 2021/3/31 11:00, Isaac J. Manjarres 写道:
> When unmapping a buffer from an IOMMU domain, the IOMMU framework unmaps
> the buffer at a granule of the largest page size that is supported by
> the IOMMU hardware and fits within the buffer. For every block that
> is unmapped, the IOMMU framework will call into the IOMMU driver, and
> then the io-pgtable framework to walk the page tables to find the entry
> that corresponds to the IOVA, and then unmaps the entry.
>
> This can be suboptimal in scenarios where a buffer or a piece of a
> buffer can be split into several contiguous page blocks of the same size.
> For example, consider an IOMMU that supports 4 KB page blocks, 2 MB page
> blocks, and 1 GB page blocks, and a buffer that is 4 MB in size is being
> unmapped at IOVA 0. The current call-flow will result in 4 indirect calls,
> and 2 page table walks, to unmap 2 entries that are next to each other in
> the page-tables, when both entries could have been unmapped in one shot
> by clearing both page table entries in the same call.
>
> These patches implement a callback called unmap_pages to the io-pgtable
> code and IOMMU drivers which unmaps an IOVA range that consists of a
> number of pages of the same page size that is supported by the IOMMU
> hardware, and allows for clearing multiple entries in the same set of
> indirect calls. The reason for introducing unmap_pages is to give
> other IOMMU drivers/io-pgtable formats time to change to using the new
> unmap_pages callback, so that the transition to using this approach can be
> done piecemeal.
>
> The same optimization is applicable for mapping buffers, however, the
> error handling in the io-pgtable layer couldn't be handled cleanly, as we
> would need to invoke iommu_unmap to unmap the parts of the buffer that
> were mapped, and then do any TLB maintenance. However, that seemed like a
> layering violation.
>
> Any feedback is very much appreciated.

I apply those patchset and implement it for smmuv3 on my kunpeng ARM64 
platform, and test the latency of map/unmap
with tool dma_map_benchmark (./dma_map_benchmark -g xxx),  it promotes 
much on the latency of unmap(us).
Maybe you can add the implement for smmuv3 also. The test result is as 
follows:

                         latency of map/unmap(before opt)    latency of 
map/unmap(after opt)
g=1(4K size)                    0.1/0.7     0.1/0.7
g=2(8K size)                    0.2/1.4     0.2/0.8
g=4(16K size)                  0.3/2.7     0.3/0.9
g=8(32K size)                  0.5/5.4     0.5/1.2
g=16(64K size)                1/10.7 1/1.8
g=32(128K size)              1.8/21.4 1.8/2.8
g=64(256K size)              3.6/42.9 3.6/5.1
g=128(512K size)            7/85.6 7/8.6
g=256(1M size)              13.9/171.1 13.9/15.5
g=512(2M size)              0.2/0.7 0.2/0.9
g=1024(4M size)            0.3/1.5 0.3/1.1


The change for smmuv3 for test are as follows:
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 8594b4a..e0268b1 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2292,6 +2292,20 @@ static size_t arm_smmu_unmap(struct iommu_domain 
*domain, unsigned long iova,
         return ops->unmap(ops, iova, size, gather);
  }

+static size_t arm_smmu_unmap_pages(struct iommu_domain *domain, 
unsigned long iova,
+                            size_t pgsize, size_t pgcount,
+                            struct iommu_iotlb_gather *gather)
+{
+       struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+       struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+
+       if (!ops)
+               return 0;
+
+       return ops->unmap_pages(ops, iova, pgsize, pgcount, gather);
+}
+
+
  static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
  {
         struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
@@ -2613,6 +2627,7 @@ static struct iommu_ops arm_smmu_ops = {
         .attach_dev             = arm_smmu_attach_dev,
         .map                    = arm_smmu_map,
         .unmap                  = arm_smmu_unmap,
+       .unmap_pages            = arm_smmu_unmap_pages,
         .flush_iotlb_all        = arm_smmu_flush_iotlb_all,
         .iotlb_sync             = arm_smmu_iotlb_sync,
         .iova_to_phys           = arm_smmu_iova_to_phys,

>
> Thanks,
> Isaac
>
> Isaac J. Manjarres (5):
>    iommu/io-pgtable: Introduce unmap_pages() as a page table op
>    iommu: Add an unmap_pages() op for IOMMU drivers
>    iommu: Add support for the unmap_pages IOMMU callback
>    iommu/io-pgtable-arm: Implement arm_lpae_unmap_pages()
>    iommu/arm-smmu: Implement the unmap_pages IOMMU driver callback
>
>   drivers/iommu/arm/arm-smmu/arm-smmu.c |  19 +++++
>   drivers/iommu/io-pgtable-arm.c        | 114 +++++++++++++++++++++-----
>   drivers/iommu/iommu.c                 |  44 ++++++++--
>   include/linux/io-pgtable.h            |   4 +
>   include/linux/iommu.h                 |   4 +
>   5 files changed, 159 insertions(+), 26 deletions(-)
>



_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 0/5] Optimization for unmapping iommu mapped buffers
       [not found] <20210331030042.13348-1-isaacm@codeaurora.org>
       [not found] ` <20210331030042.13348-3-isaacm@codeaurora.org>
  2021-04-01  3:28 ` [RFC PATCH 0/5] Optimization for unmapping iommu mapped buffers chenxiang (M)
@ 2021-04-01 15:33 ` Robin Murphy
       [not found] ` <20210331030042.13348-4-isaacm@codeaurora.org>
       [not found] ` <20210331030042.13348-5-isaacm@codeaurora.org>
  4 siblings, 0 replies; 8+ messages in thread
From: Robin Murphy @ 2021-04-01 15:33 UTC (permalink / raw)
  To: Isaac J. Manjarres, iommu, linux-arm-kernel; +Cc: will, pratikp

On 2021-03-31 04:00, Isaac J. Manjarres wrote:
> When unmapping a buffer from an IOMMU domain, the IOMMU framework unmaps
> the buffer at a granule of the largest page size that is supported by
> the IOMMU hardware and fits within the buffer. For every block that
> is unmapped, the IOMMU framework will call into the IOMMU driver, and
> then the io-pgtable framework to walk the page tables to find the entry
> that corresponds to the IOVA, and then unmaps the entry.
> 
> This can be suboptimal in scenarios where a buffer or a piece of a
> buffer can be split into several contiguous page blocks of the same size.
> For example, consider an IOMMU that supports 4 KB page blocks, 2 MB page
> blocks, and 1 GB page blocks, and a buffer that is 4 MB in size is being
> unmapped at IOVA 0. The current call-flow will result in 4 indirect calls,
> and 2 page table walks, to unmap 2 entries that are next to each other in
> the page-tables, when both entries could have been unmapped in one shot
> by clearing both page table entries in the same call.

s/unmap/map/ and s/clear/set/ and those two paragraphs are still just as 
valid. I'd say If it's worth doing anything at all then it's worth doing 
more than just half the job ;)

> These patches implement a callback called unmap_pages to the io-pgtable
> code and IOMMU drivers which unmaps an IOVA range that consists of a
> number of pages of the same page size that is supported by the IOMMU
> hardware, and allows for clearing multiple entries in the same set of
> indirect calls. The reason for introducing unmap_pages is to give
> other IOMMU drivers/io-pgtable formats time to change to using the new
> unmap_pages callback, so that the transition to using this approach can be
> done piecemeal.
> 
> The same optimization is applicable for mapping buffers, however, the
> error handling in the io-pgtable layer couldn't be handled cleanly, as we
> would need to invoke iommu_unmap to unmap the parts of the buffer that
> were mapped, and then do any TLB maintenance. However, that seemed like a
> layering violation.

Why couldn't it just return the partial mapping and let the caller roll 
it back?

Note that having a weird asymmetric interface was how things started out 
way back when - see bd13969b9524 ("iommu: Split iommu_unmaps") for context.

> Any feedback is very much appreciated.

Do you have any real-world performance figures? I proposed this as an 
approach because it was clear it could give *some* benefit for 
relatively low impact, but I'm curious to find out exactly how much, and 
in particular whether it appears to leave anything on the table vs. 
punting the entire operation down into the drivers.

Robin.

> Thanks,
> Isaac
> 
> Isaac J. Manjarres (5):
>    iommu/io-pgtable: Introduce unmap_pages() as a page table op
>    iommu: Add an unmap_pages() op for IOMMU drivers
>    iommu: Add support for the unmap_pages IOMMU callback
>    iommu/io-pgtable-arm: Implement arm_lpae_unmap_pages()
>    iommu/arm-smmu: Implement the unmap_pages IOMMU driver callback
> 
>   drivers/iommu/arm/arm-smmu/arm-smmu.c |  19 +++++
>   drivers/iommu/io-pgtable-arm.c        | 114 +++++++++++++++++++++-----
>   drivers/iommu/iommu.c                 |  44 ++++++++--
>   include/linux/io-pgtable.h            |   4 +
>   include/linux/iommu.h                 |   4 +
>   5 files changed, 159 insertions(+), 26 deletions(-)
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 3/5] iommu: Add support for the unmap_pages IOMMU callback
       [not found] ` <20210331030042.13348-4-isaacm@codeaurora.org>
@ 2021-04-01 15:34   ` Robin Murphy
  2021-04-01 16:37     ` Will Deacon
  0 siblings, 1 reply; 8+ messages in thread
From: Robin Murphy @ 2021-04-01 15:34 UTC (permalink / raw)
  To: Isaac J. Manjarres, iommu, linux-arm-kernel; +Cc: will, pratikp

On 2021-03-31 04:00, Isaac J. Manjarres wrote:
> The IOMMU framework currently unmaps memory one page block at a time,
> per the page block sizes that are supported by the IOMMU hardware.
> Now that IOMMU drivers can supply a callback for unmapping multiple
> in one call, add support in the IOMMU framework to calculate how many
> page mappings of the same size can be unmapped in one shot, and invoke the
> IOMMU driver's unmap_pages callback if it has one. Otherwise, the
> existing behavior will be used.
> 
> Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
> Suggested-by: Will Deacon <will@kernel.org>
> ---
>   drivers/iommu/iommu.c | 44 +++++++++++++++++++++++++++++++++++++------
>   1 file changed, 38 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index d0b0a15dba84..dc4295f6bc7f 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2356,8 +2356,8 @@ phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
>   }
>   EXPORT_SYMBOL_GPL(iommu_iova_to_phys);
>   
> -static size_t iommu_pgsize(struct iommu_domain *domain,
> -			   unsigned long addr_merge, size_t size)
> +static size_t __iommu_pgsize(struct iommu_domain *domain,
> +			     unsigned long addr_merge, size_t size)
>   {
>   	unsigned int pgsize_idx;
>   	size_t pgsize;
> @@ -2388,6 +2388,24 @@ static size_t iommu_pgsize(struct iommu_domain *domain,
>   	return pgsize;
>   }
>   
> +static size_t iommu_pgsize(struct iommu_domain *domain,
> +			   unsigned long addr_merge, size_t size,
> +			   size_t *pgcount)
> +{
> +	size_t pgsize = __iommu_pgsize(domain, addr_merge, size);
> +	size_t pgs = 0;
> +
> +	do {
> +		pgs++;
> +		size -= pgsize;
> +		addr_merge += pgsize;
> +	} while (size && __iommu_pgsize(domain, addr_merge, size) == pgsize);

This looks horrifically inefficient. As part of calculating the best 
current page size it should then be pretty trivial to calculate "(size & 
next_pgsize_up - 1) >> pgsize_idx" for the number of current-size pages 
up to the next-better-size boundary (with next_pgsize_up being 0 if 
pgsize is already the largest possible for the relative alignment of 
physical and virtual address). A loop is just... yuck :(

> +
> +	*pgcount = pgs;
> +
> +	return pgsize;
> +}
> +
>   static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
>   		       phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
>   {
> @@ -2422,7 +2440,7 @@ static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
>   	pr_debug("map: iova 0x%lx pa %pa size 0x%zx\n", iova, &paddr, size);
>   
>   	while (size) {
> -		size_t pgsize = iommu_pgsize(domain, iova | paddr, size);
> +		size_t pgsize = __iommu_pgsize(domain, iova | paddr, size);
>   
>   		pr_debug("mapping: iova 0x%lx pa %pa pgsize 0x%zx\n",
>   			 iova, &paddr, pgsize);
> @@ -2473,6 +2491,21 @@ int iommu_map_atomic(struct iommu_domain *domain, unsigned long iova,
>   }
>   EXPORT_SYMBOL_GPL(iommu_map_atomic);
>   
> +static size_t __iommu_unmap_pages(struct iommu_domain *domain, unsigned long iova,
> +				  size_t size, struct iommu_iotlb_gather *iotlb_gather)
> +{
> +	const struct iommu_ops *ops = domain->ops;
> +	size_t pgsize, pgcount;
> +
> +	if (ops->unmap_pages) {
> +		pgsize = iommu_pgsize(domain, iova, size, &pgcount);
> +		return ops->unmap_pages(domain, iova, pgsize, pgcount, iotlb_gather);
> +	}
> +
> +	pgsize = __iommu_pgsize(domain, iova, size);
> +	return ops->unmap(domain, iova, pgsize, iotlb_gather);
> +}
> +
>   static size_t __iommu_unmap(struct iommu_domain *domain,
>   			    unsigned long iova, size_t size,
>   			    struct iommu_iotlb_gather *iotlb_gather)
> @@ -2510,9 +2543,8 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
>   	 * or we hit an area that isn't mapped.
>   	 */
>   	while (unmapped < size) {
> -		size_t pgsize = iommu_pgsize(domain, iova, size - unmapped);
> -
> -		unmapped_page = ops->unmap(domain, iova, pgsize, iotlb_gather);
> +		unmapped_page = __iommu_unmap_pages(domain, iova, size - unmapped,
> +						    iotlb_gather);

I think it would make more sense to restructure the basic function 
around handling a page range, then just have a little inner loop to 
iterate over the individual pages if the driver doesn't provide the new 
callback.

Robin.

>   		if (!unmapped_page)
>   			break;
>   
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 3/5] iommu: Add support for the unmap_pages IOMMU callback
  2021-04-01 15:34   ` [RFC PATCH 3/5] iommu: Add support for the unmap_pages IOMMU callback Robin Murphy
@ 2021-04-01 16:37     ` Will Deacon
  0 siblings, 0 replies; 8+ messages in thread
From: Will Deacon @ 2021-04-01 16:37 UTC (permalink / raw)
  To: Robin Murphy; +Cc: Isaac J. Manjarres, iommu, linux-arm-kernel, pratikp

On Thu, Apr 01, 2021 at 04:34:37PM +0100, Robin Murphy wrote:
> On 2021-03-31 04:00, Isaac J. Manjarres wrote:
> > The IOMMU framework currently unmaps memory one page block at a time,
> > per the page block sizes that are supported by the IOMMU hardware.
> > Now that IOMMU drivers can supply a callback for unmapping multiple
> > in one call, add support in the IOMMU framework to calculate how many
> > page mappings of the same size can be unmapped in one shot, and invoke the
> > IOMMU driver's unmap_pages callback if it has one. Otherwise, the
> > existing behavior will be used.
> > 
> > Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
> > Suggested-by: Will Deacon <will@kernel.org>
> > ---
> >   drivers/iommu/iommu.c | 44 +++++++++++++++++++++++++++++++++++++------
> >   1 file changed, 38 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index d0b0a15dba84..dc4295f6bc7f 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -2356,8 +2356,8 @@ phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
> >   }
> >   EXPORT_SYMBOL_GPL(iommu_iova_to_phys);
> > -static size_t iommu_pgsize(struct iommu_domain *domain,
> > -			   unsigned long addr_merge, size_t size)
> > +static size_t __iommu_pgsize(struct iommu_domain *domain,
> > +			     unsigned long addr_merge, size_t size)
> >   {
> >   	unsigned int pgsize_idx;
> >   	size_t pgsize;
> > @@ -2388,6 +2388,24 @@ static size_t iommu_pgsize(struct iommu_domain *domain,
> >   	return pgsize;
> >   }
> > +static size_t iommu_pgsize(struct iommu_domain *domain,
> > +			   unsigned long addr_merge, size_t size,
> > +			   size_t *pgcount)
> > +{
> > +	size_t pgsize = __iommu_pgsize(domain, addr_merge, size);
> > +	size_t pgs = 0;
> > +
> > +	do {
> > +		pgs++;
> > +		size -= pgsize;
> > +		addr_merge += pgsize;
> > +	} while (size && __iommu_pgsize(domain, addr_merge, size) == pgsize);
> 
> This looks horrifically inefficient. As part of calculating the best current
> page size it should then be pretty trivial to calculate "(size &
> next_pgsize_up - 1) >> pgsize_idx" for the number of current-size pages up
> to the next-better-size boundary (with next_pgsize_up being 0 if pgsize is
> already the largest possible for the relative alignment of physical and
> virtual address). A loop is just... yuck :(

Hehe, I'm glad you said that as I was thinking the same thing. It's
surprisingly fiddly to get this right, but I spent some time hacking it
today and I _think_ I have something that Isaac could build on (and
hopefully won't be too hairy for ->map either).

I'll post it as an RFC in a sec...

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 4/5] iommu/io-pgtable-arm: Implement arm_lpae_unmap_pages()
       [not found] ` <20210331030042.13348-5-isaacm@codeaurora.org>
@ 2021-04-01 17:19   ` Robin Murphy
  0 siblings, 0 replies; 8+ messages in thread
From: Robin Murphy @ 2021-04-01 17:19 UTC (permalink / raw)
  To: Isaac J. Manjarres, iommu, linux-arm-kernel; +Cc: will, pratikp

On 2021-03-31 04:00, Isaac J. Manjarres wrote:
> Implement the unmap_pages() callback for the ARM LPAE io-pgtable
> format.
> 
> Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
> Suggested-by: Will Deacon <will@kernel.org>
> ---
>   drivers/iommu/io-pgtable-arm.c | 114 +++++++++++++++++++++++++++------
>   1 file changed, 94 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 87def58e79b5..6eccebf1744d 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -248,10 +248,26 @@ static void __arm_lpae_set_pte(arm_lpae_iopte *ptep, arm_lpae_iopte pte,
>   		__arm_lpae_sync_pte(ptep, cfg);
>   }
>   
> +static void __arm_lpae_sync_ptes(arm_lpae_iopte *ptep, size_t num_ptes,
> +				 struct io_pgtable_cfg *cfg)
> +{
> +	dma_sync_single_for_device(cfg->iommu_dev, __arm_lpae_dma_addr(ptep),
> +				   sizeof(*ptep) * num_ptes, DMA_TO_DEVICE);
> +}
> +
> +static void __arm_lpae_clear_ptes(arm_lpae_iopte *ptep, size_t num_ptes,
> +				  struct io_pgtable_cfg *cfg)
> +{
> +	memset(ptep, 0, sizeof(*ptep) * num_ptes);
> +
> +	if (!cfg->coherent_walk)
> +		__arm_lpae_sync_ptes(ptep, num_ptes, cfg);
> +}
> +

It seesm like overkill to add separate functions - the existing ones 
could easily just take an extra argument, like we do for the v7s format.

>   static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>   			       struct iommu_iotlb_gather *gather,
> -			       unsigned long iova, size_t size, int lvl,
> -			       arm_lpae_iopte *ptep);
> +			       unsigned long iova, size_t size, size_t pgcount,
> +			       int lvl, arm_lpae_iopte *ptep);
>   
>   static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
>   				phys_addr_t paddr, arm_lpae_iopte prot,
> @@ -289,7 +305,7 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
>   		size_t sz = ARM_LPAE_BLOCK_SIZE(lvl, data);
>   
>   		tblp = ptep - ARM_LPAE_LVL_IDX(iova, lvl, data);
> -		if (__arm_lpae_unmap(data, NULL, iova, sz, lvl, tblp) != sz) {
> +		if (__arm_lpae_unmap(data, NULL, iova, sz, 1, lvl, tblp) != sz) {
>   			WARN_ON(1);
>   			return -EINVAL;
>   		}
> @@ -516,14 +532,14 @@ static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
>   				       struct iommu_iotlb_gather *gather,
>   				       unsigned long iova, size_t size,
>   				       arm_lpae_iopte blk_pte, int lvl,
> -				       arm_lpae_iopte *ptep)
> +				       arm_lpae_iopte *ptep, size_t pgcount)
>   {
>   	struct io_pgtable_cfg *cfg = &data->iop.cfg;
>   	arm_lpae_iopte pte, *tablep;
>   	phys_addr_t blk_paddr;
>   	size_t tablesz = ARM_LPAE_GRANULE(data);
>   	size_t split_sz = ARM_LPAE_BLOCK_SIZE(lvl, data);
> -	int i, unmap_idx = -1;
> +	int i, unmap_idx_start = -1;
>   
>   	if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
>   		return 0;
> @@ -533,14 +549,14 @@ static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
>   		return 0; /* Bytes unmapped */
>   
>   	if (size == split_sz)
> -		unmap_idx = ARM_LPAE_LVL_IDX(iova, lvl, data);
> +		unmap_idx_start = ARM_LPAE_LVL_IDX(iova, lvl, data);
>   
>   	blk_paddr = iopte_to_paddr(blk_pte, data);
>   	pte = iopte_prot(blk_pte);
>   
>   	for (i = 0; i < tablesz / sizeof(pte); i++, blk_paddr += split_sz) {
>   		/* Unmap! */
> -		if (i == unmap_idx)
> +		if (i >= unmap_idx_start && i < (unmap_idx_start + pgcount))
>   			continue;
>   
>   		__arm_lpae_init_pte(data, blk_paddr, pte, lvl, &tablep[i]);
> @@ -558,20 +574,24 @@ static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
>   			return 0;
>   
>   		tablep = iopte_deref(pte, data);
> -	} else if (unmap_idx >= 0) {
> -		io_pgtable_tlb_add_page(&data->iop, gather, iova, size);
> -		return size;
> +	} else if (unmap_idx_start >= 0) {
> +		for (i = 0; i < pgcount; i++) {
> +			io_pgtable_tlb_add_page(&data->iop, gather, iova, size);
> +			iova += size;
> +		}
> +		return pgcount * size;
>   	}
>   
> -	return __arm_lpae_unmap(data, gather, iova, size, lvl, tablep);
> +	return __arm_lpae_unmap(data, gather, iova, size, pgcount, lvl, tablep);
>   }
>   
>   static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>   			       struct iommu_iotlb_gather *gather,
> -			       unsigned long iova, size_t size, int lvl,
> -			       arm_lpae_iopte *ptep)
> +			       unsigned long iova, size_t size, size_t pgcount,
> +			       int lvl, arm_lpae_iopte *ptep)
>   {
>   	arm_lpae_iopte pte;
> +	size_t i;
>   	struct io_pgtable *iop = &data->iop;
>   
>   	/* Something went horribly wrong and we ran out of page table */
> @@ -585,11 +605,11 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>   
>   	/* If the size matches this level, we're in the right place */
>   	if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) {
> -		__arm_lpae_set_pte(ptep, 0, &iop->cfg);
> +		__arm_lpae_clear_ptes(ptep, pgcount, &iop->cfg);
>   
>   		if (!iopte_leaf(pte, lvl, iop->fmt)) {
>   			/* Also flush any partial walks */
> -			io_pgtable_tlb_flush_walk(iop, iova, size,
> +			io_pgtable_tlb_flush_walk(iop, iova, pgcount * size,
>   						  ARM_LPAE_GRANULE(data));
>   			ptep = iopte_deref(pte, data);
>   			__arm_lpae_free_pgtable(data, lvl + 1, ptep);
> @@ -601,22 +621,25 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>   			 */
>   			smp_wmb();
>   		} else {
> -			io_pgtable_tlb_add_page(iop, gather, iova, size);
> +			for (i = 0; i < pgcount; i++) {
> +				io_pgtable_tlb_add_page(iop, gather, iova, size);
> +				iova += size;
> +			}
>   		}
>   
> -		return size;
> +		return pgcount * size;
>   	} else if (iopte_leaf(pte, lvl, iop->fmt)) {
>   		/*
>   		 * Insert a table at the next level to map the old region,
>   		 * minus the part we want to unmap
>   		 */
>   		return arm_lpae_split_blk_unmap(data, gather, iova, size, pte,
> -						lvl + 1, ptep);
> +						lvl + 1, ptep, pgcount);
>   	}
>   
>   	/* Keep on walkin' */
>   	ptep = iopte_deref(pte, data);
> -	return __arm_lpae_unmap(data, gather, iova, size, lvl + 1, ptep);
> +	return __arm_lpae_unmap(data, gather, iova, size, pgcount, lvl + 1, ptep);
>   }
>   
>   static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
> @@ -635,7 +658,57 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
>   	if (WARN_ON(iaext))
>   		return 0;
>   
> -	return __arm_lpae_unmap(data, gather, iova, size, data->start_level, ptep);
> +	return __arm_lpae_unmap(data, gather, iova, size, 1, data->start_level, ptep);
> +}
> +
> +static size_t arm_lpae_unmap_pages(struct io_pgtable_ops *ops, unsigned long iova,
> +				   size_t pgsize, size_t pgcount,
> +				   struct iommu_iotlb_gather *gather)
> +{
> +	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
> +	struct io_pgtable_cfg *cfg = &data->iop.cfg;
> +	arm_lpae_iopte *ptep = data->pgd;
> +	long iaext = (s64)iova >> cfg->ias;
> +	size_t unmapped = 0, unmapped_page;
> +	int last_lvl = data->start_level;
> +	size_t table_size, pages, tbl_offset, max_entries;
> +
> +	if (WARN_ON(!pgsize || (pgsize & cfg->pgsize_bitmap) != pgsize || !pgcount))
> +		return 0;
> +
> +	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
> +		iaext = ~iaext;
> +	if (WARN_ON(iaext))
> +		return 0;
> +
> +	/*
> +	 * Calculating the page table size here helps avoid situations where
> +	 * a page range that is being unmapped may be mapped at the same level
> +	 * but not mapped by the same tables. Allowing such a scenario to
> +	 * occur can complicate the logic in arm_lpae_split_blk_unmap().
> +	 */
> +	while (ARM_LPAE_BLOCK_SIZE(last_lvl, data) != pgsize)
> +		last_lvl++;
> +
> +	table_size = last_lvl == data->start_level ? ARM_LPAE_PGD_SIZE(data) :
> +		ARM_LPAE_GRANULE(data);
> +	max_entries = table_size / sizeof(*ptep);

I'm really struggling to understand what's going on here :/

When would table_size ever not be equal to page_size? (In a way that 
matters - manipulating top-level block entries in a concatenated stage 2 
table is hardly going to be common enough to deserve specific optimisation)

> +	while (pgcount) {
> +		tbl_offset = ARM_LPAE_LVL_IDX(iova, last_lvl, data);
> +		pages = min_t(size_t, pgcount, max_entries - tbl_offset);
> +		unmapped_page = __arm_lpae_unmap(data, gather, iova, pgsize,
> +						 pages, data->start_level,
> +						 ptep);

This again seems less efficient than it deserves to be - iterating 
within __arm_lpae_unmap() itself would seem to make more sense than 
recursing all the way in and out multiple times per operation.

> +		if (!unmapped_page)
> +			break;
> +
> +		unmapped += unmapped_page;
> +		iova += unmapped_page;
> +		pgcount -= pages;
> +	}
> +
> +	return unmapped;
>   }
>   
>   static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
> @@ -751,6 +824,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
>   	data->iop.ops = (struct io_pgtable_ops) {
>   		.map		= arm_lpae_map,
>   		.unmap		= arm_lpae_unmap,
> +		.unmap_pages	= arm_lpae_unmap_pages,

Why would we need to keep the old callback and have a bunch of 
duplicated code? Even fully converting all the users isn't _that_ 
involved, but having them just call .unmap_pages with n=1 is even less so.

Robin.

>   		.iova_to_phys	= arm_lpae_iova_to_phys,
>   	};
>   
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 2/5] iommu: Add an unmap_pages() op for IOMMU drivers
       [not found]         ` <b9bbc82b3e405e5bc1d28ed073f6d5b4@codeaurora.org>
@ 2021-04-03  1:35           ` Lu Baolu
  0 siblings, 0 replies; 8+ messages in thread
From: Lu Baolu @ 2021-04-03  1:35 UTC (permalink / raw)
  To: isaacm; +Cc: baolu.lu, iommu, linux-arm-kernel, will, robin.murphy, pratikp

Hi Isaac,

On 4/3/21 1:25 AM, isaacm@codeaurora.org wrote:
> On 2021-03-30 22:39, Lu Baolu wrote:
>> On 3/31/21 1:36 PM, isaacm@codeaurora.org wrote:
>>> On 2021-03-30 21:47, Lu Baolu wrote:
>>>> On 3/31/21 11:00 AM, Isaac J. Manjarres wrote:
>>>>> Add a callback for IOMMU drivers to provide a path for the
>>>>> IOMMU framework to call into an IOMMU driver, which can call
>>>>> into the io-pgtable code, to unmap a virtually contiguous
>>>>> range of pages of the same size.
>>>>>
>>>>> For IOMMU drivers that do not specify an unmap_pages() callback,
>>>>> the existing logic of unmapping memory one page block at a time
>>>>> will be used.
>>>>>
>>>>> Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
>>>>> Suggested-by: Will Deacon <will@kernel.org>
>>>>> ---
>>>>>   include/linux/iommu.h | 4 ++++
>>>>>   1 file changed, 4 insertions(+)
>>>>>
>>>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>>>>> index 5e7fe519430a..9cf81242581a 100644
>>>>> --- a/include/linux/iommu.h
>>>>> +++ b/include/linux/iommu.h
>>>>> @@ -193,6 +193,7 @@ struct iommu_iotlb_gather {
>>>>>    * @detach_dev: detach device from an iommu domain
>>>>>    * @map: map a physically contiguous memory region to an iommu 
>>>>> domain
>>>>>    * @unmap: unmap a physically contiguous memory region from an 
>>>>> iommu domain
>>>>> + * @unmap_pages: unmap a number of pages of the same size from an 
>>>>> iommu domain
>>>>>    * @flush_iotlb_all: Synchronously flush all hardware TLBs for 
>>>>> this domain
>>>>>    * @iotlb_sync_map: Sync mappings created recently using @map to 
>>>>> the hardware
>>>>>    * @iotlb_sync: Flush all queued ranges from the hardware TLBs 
>>>>> and empty flush
>>>>> @@ -245,6 +246,9 @@ struct iommu_ops {
>>>>>              phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
>>>>>       size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
>>>>>                size_t size, struct iommu_iotlb_gather *iotlb_gather);
>>>>> +    size_t (*unmap_pages)(struct iommu_domain *domain, unsigned 
>>>>> long iova,
>>>>> +                  size_t pgsize, size_t pgcount,
>>>>> +                  struct iommu_iotlb_gather *iotlb_gather);
>>>>
>>>> Is it possible to add an equivalent map_pages() callback?
>>> Yes, map_pages() can be implemented and can leverage a lot of the 
>>> implementation
>>> of unmap_pages(). The problem is that if you map several pages in one 
>>> call, and then
>>> encounter an error and have to rollback, you should do TLB 
>>> maintenance, as iommu_map
>>> does when it encounters an error. However, we can't call iommu_unmap 
>>> from io-pgtable-arm
>>> for example. We can call arm_lpae_unmap_pages() from the later 
>>> patches, but that doesn't
>>> solve the TLB maintenance issue. Do you have any thoughts on how to 
>>> address this?
>>
>> Call unmap_pages() with the same pages and size to roll back. Does it
>> work?
>>
>> Best regards,
>> baolu
> Hi Lu,
> 
> I've given map_pages() a shot. Here's the second version of the RFC 
> series: 
> https://lore.kernel.org/linux-iommu/20210402013452.4013-1-isaacm@codeaurora.org/T/#t. 

Thanks for doing that. I will look into it and try it with a VT-d
implementation.

Best regards,
baolu

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-04-03  1:47 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20210331030042.13348-1-isaacm@codeaurora.org>
     [not found] ` <20210331030042.13348-3-isaacm@codeaurora.org>
2021-03-31  4:47   ` [RFC PATCH 2/5] iommu: Add an unmap_pages() op for IOMMU drivers Lu Baolu
     [not found]     ` <4c396e68a076f321ed3f406c2c875006@codeaurora.org>
2021-03-31  5:39       ` Lu Baolu
     [not found]         ` <b9bbc82b3e405e5bc1d28ed073f6d5b4@codeaurora.org>
2021-04-03  1:35           ` Lu Baolu
2021-04-01  3:28 ` [RFC PATCH 0/5] Optimization for unmapping iommu mapped buffers chenxiang (M)
2021-04-01 15:33 ` Robin Murphy
     [not found] ` <20210331030042.13348-4-isaacm@codeaurora.org>
2021-04-01 15:34   ` [RFC PATCH 3/5] iommu: Add support for the unmap_pages IOMMU callback Robin Murphy
2021-04-01 16:37     ` Will Deacon
     [not found] ` <20210331030042.13348-5-isaacm@codeaurora.org>
2021-04-01 17:19   ` [RFC PATCH 4/5] iommu/io-pgtable-arm: Implement arm_lpae_unmap_pages() Robin Murphy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).