linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] iommu/iova: Support limiting IOVA alignment
@ 2020-02-14 20:30 Liam Mark
  2020-02-17 16:46 ` Robin Murphy
  0 siblings, 1 reply; 7+ messages in thread
From: Liam Mark @ 2020-02-14 20:30 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: kernel-team, Isaac J. Manjarres, Pratik Patel, linux-kernel, iommu


When the IOVA framework applies IOVA alignment it aligns all
IOVAs to the smallest PAGE_SIZE order which is greater than or
equal to the requested IOVA size.

We support use cases that requires large buffers (> 64 MB in
size) to be allocated and mapped in their stage 1 page tables.
However, with this alignment scheme we find ourselves running
out of IOVA space for 32 bit devices, so we are proposing this
config, along the similar vein as CONFIG_CMA_ALIGNMENT for CMA
allocations.

Add CONFIG_IOMMU_LIMIT_IOVA_ALIGNMENT to limit the alignment of
IOVAs to some desired PAGE_SIZE order, specified by
CONFIG_IOMMU_IOVA_ALIGNMENT. This helps reduce the impact of
fragmentation caused by the current IOVA alignment scheme, and
gives better IOVA space utilization.

Signed-off-by: Liam Mark <lmark@codeaurora.org>
---
 drivers/iommu/Kconfig | 31 +++++++++++++++++++++++++++++++
 drivers/iommu/iova.c  | 20 +++++++++++++++++++-
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index d2fade984999..9684a153cc72 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -3,6 +3,37 @@
 config IOMMU_IOVA
 	tristate
 
+if IOMMU_IOVA
+
+config IOMMU_LIMIT_IOVA_ALIGNMENT
+	bool "Limit IOVA alignment"
+	help
+	  When the IOVA framework applies IOVA alignment it aligns all
+	  IOVAs to the smallest PAGE_SIZE order which is greater than or
+	  equal to the requested IOVA size. This works fine for sizes up
+	  to several MiB, but for larger sizes it results in address
+	  space wastage and fragmentation. For example drivers with a 4
+	  GiB IOVA space might run out of IOVA space when allocating
+	  buffers great than 64 MiB.
+
+	  Enable this option to impose a limit on the alignment of IOVAs.
+
+	  If unsure, say N.
+
+config IOMMU_IOVA_ALIGNMENT
+	int "Maximum PAGE_SIZE order of alignment for IOVAs"
+	depends on IOMMU_LIMIT_IOVA_ALIGNMENT
+	range 4 9
+	default 9
+	help
+	  With this parameter you can specify the maximum PAGE_SIZE order for
+	  IOVAs. Larger IOVAs will be aligned only to this specified order.
+	  The order is expressed a power of two multiplied by the PAGE_SIZE.
+
+	  If unsure, leave the default value "9".
+
+endif
+
 # The IOASID library may also be used by non-IOMMU_API users
 config IOASID
 	tristate
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 0e6a9536eca6..259884c8dbd1 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -177,6 +177,24 @@ int init_iova_flush_queue(struct iova_domain *iovad,
 	rb_insert_color(&iova->node, root);
 }
 
+#ifdef CONFIG_IOMMU_LIMIT_IOVA_ALIGNMENT
+static unsigned long limit_align_shift(struct iova_domain *iovad,
+				       unsigned long shift)
+{
+	unsigned long max_align_shift;
+
+	max_align_shift = CONFIG_IOMMU_IOVA_ALIGNMENT + PAGE_SHIFT
+			- iova_shift(iovad);
+	return min_t(unsigned long, max_align_shift, shift);
+}
+#else
+static unsigned long limit_align_shift(struct iova_domain *iovad,
+				       unsigned long shift)
+{
+	return shift;
+}
+#endif
+
 static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 		unsigned long size, unsigned long limit_pfn,
 			struct iova *new, bool size_aligned)
@@ -188,7 +206,7 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 	unsigned long align_mask = ~0UL;
 
 	if (size_aligned)
-		align_mask <<= fls_long(size - 1);
+		align_mask <<= limit_align_shift(iovad, fls_long(size - 1));
 
 	/* Walk the tree backwards */
 	spin_lock_irqsave(&iovad->iova_rbtree_lock, flags);
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, 
a Linux Foundation Collaborative Project

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH] iommu/iova: Support limiting IOVA alignment
  2020-02-14 20:30 [RFC PATCH] iommu/iova: Support limiting IOVA alignment Liam Mark
@ 2020-02-17 16:46 ` Robin Murphy
  2020-02-19 12:37   ` Will Deacon
  0 siblings, 1 reply; 7+ messages in thread
From: Robin Murphy @ 2020-02-17 16:46 UTC (permalink / raw)
  To: Liam Mark, Joerg Roedel
  Cc: Isaac J. Manjarres, Pratik Patel, iommu, kernel-team, linux-kernel

On 14/02/2020 8:30 pm, Liam Mark wrote:
> 
> When the IOVA framework applies IOVA alignment it aligns all
> IOVAs to the smallest PAGE_SIZE order which is greater than or
> equal to the requested IOVA size.
> 
> We support use cases that requires large buffers (> 64 MB in
> size) to be allocated and mapped in their stage 1 page tables.
> However, with this alignment scheme we find ourselves running
> out of IOVA space for 32 bit devices, so we are proposing this
> config, along the similar vein as CONFIG_CMA_ALIGNMENT for CMA
> allocations.

As per [1], I'd really like to better understand the allocation patterns 
that lead to such a sparsely-occupied address space to begin with, given 
that the rbtree allocator is supposed to try to maintain locality as far 
as possible, and the rcaches should further improve on that. Are you 
also frequently cycling intermediate-sized buffers which are smaller 
than 64MB but still too big to be cached?  Are there a lot of 
non-power-of-two allocations?

> Add CONFIG_IOMMU_LIMIT_IOVA_ALIGNMENT to limit the alignment of
> IOVAs to some desired PAGE_SIZE order, specified by
> CONFIG_IOMMU_IOVA_ALIGNMENT. This helps reduce the impact of
> fragmentation caused by the current IOVA alignment scheme, and
> gives better IOVA space utilization.

Even if the general change did prove reasonable, this IOVA allocator is 
not owned by the DMA API, so entirely removing the option of strict 
size-alignment feels a bit uncomfortable. Personally I'd replace the 
bool argument with an actual alignment value to at least hand the 
authority out to individual callers.

Furthermore, even in DMA API terms, is anyone really ever going to 
bother tuning that config? Since iommu-dma is supposed to be a 
transparent layer, arguably it shouldn't behave unnecessarily 
differently from CMA, so simply piggy-backing off CONFIG_CMA_ALIGNMENT 
would seem logical.

Robin.

[1] 
https://lore.kernel.org/linux-iommu/1581721602-17010-1-git-send-email-isaacm@codeaurora.org/

> Signed-off-by: Liam Mark <lmark@codeaurora.org>
> ---
>   drivers/iommu/Kconfig | 31 +++++++++++++++++++++++++++++++
>   drivers/iommu/iova.c  | 20 +++++++++++++++++++-
>   2 files changed, 50 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index d2fade984999..9684a153cc72 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -3,6 +3,37 @@
>   config IOMMU_IOVA
>   	tristate
>   
> +if IOMMU_IOVA
> +
> +config IOMMU_LIMIT_IOVA_ALIGNMENT
> +	bool "Limit IOVA alignment"
> +	help
> +	  When the IOVA framework applies IOVA alignment it aligns all
> +	  IOVAs to the smallest PAGE_SIZE order which is greater than or
> +	  equal to the requested IOVA size. This works fine for sizes up
> +	  to several MiB, but for larger sizes it results in address
> +	  space wastage and fragmentation. For example drivers with a 4
> +	  GiB IOVA space might run out of IOVA space when allocating
> +	  buffers great than 64 MiB.
> +
> +	  Enable this option to impose a limit on the alignment of IOVAs.
> +
> +	  If unsure, say N.
> +
> +config IOMMU_IOVA_ALIGNMENT
> +	int "Maximum PAGE_SIZE order of alignment for IOVAs"
> +	depends on IOMMU_LIMIT_IOVA_ALIGNMENT
> +	range 4 9
> +	default 9
> +	help
> +	  With this parameter you can specify the maximum PAGE_SIZE order for
> +	  IOVAs. Larger IOVAs will be aligned only to this specified order.
> +	  The order is expressed a power of two multiplied by the PAGE_SIZE.
> +
> +	  If unsure, leave the default value "9".
> +
> +endif
> +
>   # The IOASID library may also be used by non-IOMMU_API users
>   config IOASID
>   	tristate
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index 0e6a9536eca6..259884c8dbd1 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -177,6 +177,24 @@ int init_iova_flush_queue(struct iova_domain *iovad,
>   	rb_insert_color(&iova->node, root);
>   }
>   
> +#ifdef CONFIG_IOMMU_LIMIT_IOVA_ALIGNMENT
> +static unsigned long limit_align_shift(struct iova_domain *iovad,
> +				       unsigned long shift)
> +{
> +	unsigned long max_align_shift;
> +
> +	max_align_shift = CONFIG_IOMMU_IOVA_ALIGNMENT + PAGE_SHIFT
> +			- iova_shift(iovad);
> +	return min_t(unsigned long, max_align_shift, shift);
> +}
> +#else
> +static unsigned long limit_align_shift(struct iova_domain *iovad,
> +				       unsigned long shift)
> +{
> +	return shift;
> +}
> +#endif
> +
>   static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>   		unsigned long size, unsigned long limit_pfn,
>   			struct iova *new, bool size_aligned)
> @@ -188,7 +206,7 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>   	unsigned long align_mask = ~0UL;
>   
>   	if (size_aligned)
> -		align_mask <<= fls_long(size - 1);
> +		align_mask <<= limit_align_shift(iovad, fls_long(size - 1));
>   
>   	/* Walk the tree backwards */
>   	spin_lock_irqsave(&iovad->iova_rbtree_lock, flags);
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH] iommu/iova: Support limiting IOVA alignment
  2020-02-17 16:46 ` Robin Murphy
@ 2020-02-19 12:37   ` Will Deacon
  2020-02-19 23:22     ` Liam Mark
  0 siblings, 1 reply; 7+ messages in thread
From: Will Deacon @ 2020-02-19 12:37 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Liam Mark, Joerg Roedel, Isaac J. Manjarres, Pratik Patel, iommu,
	kernel-team, linux-kernel

On Mon, Feb 17, 2020 at 04:46:14PM +0000, Robin Murphy wrote:
> On 14/02/2020 8:30 pm, Liam Mark wrote:
> > 
> > When the IOVA framework applies IOVA alignment it aligns all
> > IOVAs to the smallest PAGE_SIZE order which is greater than or
> > equal to the requested IOVA size.
> > 
> > We support use cases that requires large buffers (> 64 MB in
> > size) to be allocated and mapped in their stage 1 page tables.
> > However, with this alignment scheme we find ourselves running
> > out of IOVA space for 32 bit devices, so we are proposing this
> > config, along the similar vein as CONFIG_CMA_ALIGNMENT for CMA
> > allocations.
> 
> As per [1], I'd really like to better understand the allocation patterns
> that lead to such a sparsely-occupied address space to begin with, given
> that the rbtree allocator is supposed to try to maintain locality as far as
> possible, and the rcaches should further improve on that. Are you also
> frequently cycling intermediate-sized buffers which are smaller than 64MB
> but still too big to be cached?  Are there a lot of non-power-of-two
> allocations?

Right, information on the allocation pattern would help with this change
and also the choice of IOVA allocation algorithm. Without it, we're just
shooting in the dark.

> > Add CONFIG_IOMMU_LIMIT_IOVA_ALIGNMENT to limit the alignment of
> > IOVAs to some desired PAGE_SIZE order, specified by
> > CONFIG_IOMMU_IOVA_ALIGNMENT. This helps reduce the impact of
> > fragmentation caused by the current IOVA alignment scheme, and
> > gives better IOVA space utilization.
> 
> Even if the general change did prove reasonable, this IOVA allocator is not
> owned by the DMA API, so entirely removing the option of strict
> size-alignment feels a bit uncomfortable. Personally I'd replace the bool
> argument with an actual alignment value to at least hand the authority out
> to individual callers.
> 
> Furthermore, even in DMA API terms, is anyone really ever going to bother
> tuning that config? Since iommu-dma is supposed to be a transparent layer,
> arguably it shouldn't behave unnecessarily differently from CMA, so simply
> piggy-backing off CONFIG_CMA_ALIGNMENT would seem logical.

Agreed, reusing CONFIG_CMA_ALIGNMENT makes a lot of sense here as callers
relying on natural alignment of DMA buffer allocations already have to
deal with that limitation. We could fix it as an optional parameter at
init time (init_iova_domain()), and have the DMA IOMMU implementation
pass it in there.

Will

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH] iommu/iova: Support limiting IOVA alignment
  2020-02-19 12:37   ` Will Deacon
@ 2020-02-19 23:22     ` Liam Mark
  2020-02-20  2:38       ` Christoph Hellwig
  2020-02-20 17:35       ` Robin Murphy
  0 siblings, 2 replies; 7+ messages in thread
From: Liam Mark @ 2020-02-19 23:22 UTC (permalink / raw)
  To: Will Deacon
  Cc: Robin Murphy, Joerg Roedel, Isaac J. Manjarres, Pratik Patel,
	iommu, kernel-team, linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 4296 bytes --]

On Wed, 19 Feb 2020, Will Deacon wrote:

> On Mon, Feb 17, 2020 at 04:46:14PM +0000, Robin Murphy wrote:
> > On 14/02/2020 8:30 pm, Liam Mark wrote:
> > > 
> > > When the IOVA framework applies IOVA alignment it aligns all
> > > IOVAs to the smallest PAGE_SIZE order which is greater than or
> > > equal to the requested IOVA size.
> > > 
> > > We support use cases that requires large buffers (> 64 MB in
> > > size) to be allocated and mapped in their stage 1 page tables.
> > > However, with this alignment scheme we find ourselves running
> > > out of IOVA space for 32 bit devices, so we are proposing this
> > > config, along the similar vein as CONFIG_CMA_ALIGNMENT for CMA
> > > allocations.
> > 
> > As per [1], I'd really like to better understand the allocation patterns
> > that lead to such a sparsely-occupied address space to begin with, given
> > that the rbtree allocator is supposed to try to maintain locality as far as
> > possible, and the rcaches should further improve on that. Are you also
> > frequently cycling intermediate-sized buffers which are smaller than 64MB
> > but still too big to be cached?  Are there a lot of non-power-of-two
> > allocations?
> 
> Right, information on the allocation pattern would help with this change
> and also the choice of IOVA allocation algorithm. Without it, we're just
> shooting in the dark.
> 

Thanks for the responses.

I am looking into how much of our allocation pattern details I can share.

My general understanding is that this issue occurs on a 32bit devices 
which have additional restrictions on the IOVA range they can use within those 
32bits.

An example is a use case which involves allocating a lot of buffers ~80MB 
is size, the current algorithm will require an alignment of 128MB for 
those buffers. My understanding is that it simply can't accommodate the number of 80MB 
buffers that are required because the of amount of IOVA space which can't 
be used because of the 128MB alignment requirement.

> > > Add CONFIG_IOMMU_LIMIT_IOVA_ALIGNMENT to limit the alignment of
> > > IOVAs to some desired PAGE_SIZE order, specified by
> > > CONFIG_IOMMU_IOVA_ALIGNMENT. This helps reduce the impact of
> > > fragmentation caused by the current IOVA alignment scheme, and
> > > gives better IOVA space utilization.
> > 
> > Even if the general change did prove reasonable, this IOVA allocator is not
> > owned by the DMA API, so entirely removing the option of strict
> > size-alignment feels a bit uncomfortable. Personally I'd replace the bool
> > argument with an actual alignment value to at least hand the authority out
> > to individual callers.
> > 
> > Furthermore, even in DMA API terms, is anyone really ever going to bother
> > tuning that config? Since iommu-dma is supposed to be a transparent layer,
> > arguably it shouldn't behave unnecessarily differently from CMA, so simply
> > piggy-backing off CONFIG_CMA_ALIGNMENT would seem logical.
> 
> Agreed, reusing CONFIG_CMA_ALIGNMENT makes a lot of sense here as callers
> relying on natural alignment of DMA buffer allocations already have to
> deal with that limitation. We could fix it as an optional parameter at
> init time (init_iova_domain()), and have the DMA IOMMU implementation
> pass it in there.
> 

My concern with using CONFIG_CMA_ALIGNMENT alignment is that for us this 
would either involve further fragmenting our CMA regions (moving our CMA 
max alignment from 1MB to max 2MB) or losing so of our 2MB IOVA block 
mappings (changing our IOVA max alignment form 2MB to 1MB).

At least for us CMA allocations are often not DMA mapped into stage 1 page 
tables so moving the CMA max alignment to 2MB in our case would, I think, 
only provide the disadvantage of having to increase the size our CMA 
regions to accommodate this large alignment (which isn’t optimal for 
memory utilization since CMA regions can't satisfy unmovable page 
allocations).

As an alternative would it be possible for the dma-iommu layer to use the 
size of the allocation and the domain pgsize_bitmap field to pick a max 
IOVA alignment, which it can pass in for that IOVA allocation, which will 
maximize block mappings but not waste IOVA space?

Liam

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH] iommu/iova: Support limiting IOVA alignment
  2020-02-19 23:22     ` Liam Mark
@ 2020-02-20  2:38       ` Christoph Hellwig
  2020-02-20 17:35       ` Robin Murphy
  1 sibling, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2020-02-20  2:38 UTC (permalink / raw)
  To: Liam Mark
  Cc: Will Deacon, Isaac J. Manjarres, kernel-team, Robin Murphy,
	linux-kernel, iommu, Pratik Patel

On Wed, Feb 19, 2020 at 03:22:36PM -0800, Liam Mark wrote:
> I am looking into how much of our allocation pattern details I can share.

Well if you can't share the details it is by defintion completely
irrelevant.  Please don't waste our time.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH] iommu/iova: Support limiting IOVA alignment
  2020-02-19 23:22     ` Liam Mark
  2020-02-20  2:38       ` Christoph Hellwig
@ 2020-02-20 17:35       ` Robin Murphy
  2020-02-20 23:11         ` Liam Mark
  1 sibling, 1 reply; 7+ messages in thread
From: Robin Murphy @ 2020-02-20 17:35 UTC (permalink / raw)
  To: Liam Mark, Will Deacon
  Cc: Joerg Roedel, Isaac J. Manjarres, Pratik Patel, iommu,
	kernel-team, linux-kernel

On 19/02/2020 11:22 pm, Liam Mark wrote:
> On Wed, 19 Feb 2020, Will Deacon wrote:
> 
>> On Mon, Feb 17, 2020 at 04:46:14PM +0000, Robin Murphy wrote:
>>> On 14/02/2020 8:30 pm, Liam Mark wrote:
>>>>
>>>> When the IOVA framework applies IOVA alignment it aligns all
>>>> IOVAs to the smallest PAGE_SIZE order which is greater than or
>>>> equal to the requested IOVA size.
>>>>
>>>> We support use cases that requires large buffers (> 64 MB in
>>>> size) to be allocated and mapped in their stage 1 page tables.
>>>> However, with this alignment scheme we find ourselves running
>>>> out of IOVA space for 32 bit devices, so we are proposing this
>>>> config, along the similar vein as CONFIG_CMA_ALIGNMENT for CMA
>>>> allocations.
>>>
>>> As per [1], I'd really like to better understand the allocation patterns
>>> that lead to such a sparsely-occupied address space to begin with, given
>>> that the rbtree allocator is supposed to try to maintain locality as far as
>>> possible, and the rcaches should further improve on that. Are you also
>>> frequently cycling intermediate-sized buffers which are smaller than 64MB
>>> but still too big to be cached?  Are there a lot of non-power-of-two
>>> allocations?
>>
>> Right, information on the allocation pattern would help with this change
>> and also the choice of IOVA allocation algorithm. Without it, we're just
>> shooting in the dark.
>>
> 
> Thanks for the responses.
> 
> I am looking into how much of our allocation pattern details I can share.
> 
> My general understanding is that this issue occurs on a 32bit devices
> which have additional restrictions on the IOVA range they can use within those
> 32bits.
> 
> An example is a use case which involves allocating a lot of buffers ~80MB
> is size, the current algorithm will require an alignment of 128MB for
> those buffers. My understanding is that it simply can't accommodate the number of 80MB
> buffers that are required because the of amount of IOVA space which can't
> be used because of the 128MB alignment requirement.

OK, that's a case I can understand :)

>>>> Add CONFIG_IOMMU_LIMIT_IOVA_ALIGNMENT to limit the alignment of
>>>> IOVAs to some desired PAGE_SIZE order, specified by
>>>> CONFIG_IOMMU_IOVA_ALIGNMENT. This helps reduce the impact of
>>>> fragmentation caused by the current IOVA alignment scheme, and
>>>> gives better IOVA space utilization.
>>>
>>> Even if the general change did prove reasonable, this IOVA allocator is not
>>> owned by the DMA API, so entirely removing the option of strict
>>> size-alignment feels a bit uncomfortable. Personally I'd replace the bool
>>> argument with an actual alignment value to at least hand the authority out
>>> to individual callers.
>>>
>>> Furthermore, even in DMA API terms, is anyone really ever going to bother
>>> tuning that config? Since iommu-dma is supposed to be a transparent layer,
>>> arguably it shouldn't behave unnecessarily differently from CMA, so simply
>>> piggy-backing off CONFIG_CMA_ALIGNMENT would seem logical.
>>
>> Agreed, reusing CONFIG_CMA_ALIGNMENT makes a lot of sense here as callers
>> relying on natural alignment of DMA buffer allocations already have to
>> deal with that limitation. We could fix it as an optional parameter at
>> init time (init_iova_domain()), and have the DMA IOMMU implementation
>> pass it in there.
>>
> 
> My concern with using CONFIG_CMA_ALIGNMENT alignment is that for us this
> would either involve further fragmenting our CMA regions (moving our CMA
> max alignment from 1MB to max 2MB) or losing so of our 2MB IOVA block
> mappings (changing our IOVA max alignment form 2MB to 1MB).
> 
> At least for us CMA allocations are often not DMA mapped into stage 1 page
> tables so moving the CMA max alignment to 2MB in our case would, I think,
> only provide the disadvantage of having to increase the size our CMA
> regions to accommodate this large alignment (which isn’t optimal for
> memory utilization since CMA regions can't satisfy unmovable page
> allocations).
> 
> As an alternative would it be possible for the dma-iommu layer to use the
> size of the allocation and the domain pgsize_bitmap field to pick a max
> IOVA alignment, which it can pass in for that IOVA allocation, which will
> maximize block mappings but not waste IOVA space?

Given that we already have DMA_ATTR_ALOC_SINGLE_PAGES for video drivers 
and suchlike that know enough to know they want "large buffer" 
allocation behaviour, would it suffice to have a similar attribute that 
says "I'm not too fussed about alignment"? That way there's no visible 
change for anyone who doesn't opt in and might be relying on the 
existing behaviour, intentionally or otherwise.

Then if necessary, the implementation can consider both flags together 
to decide whether to try to round down to the next block size or just 
shove it in anywhere.

Robin.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH] iommu/iova: Support limiting IOVA alignment
  2020-02-20 17:35       ` Robin Murphy
@ 2020-02-20 23:11         ` Liam Mark
  0 siblings, 0 replies; 7+ messages in thread
From: Liam Mark @ 2020-02-20 23:11 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Will Deacon, Joerg Roedel, Isaac J. Manjarres, Pratik Patel,
	iommu, kernel-team, linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3686 bytes --]

On Thu, 20 Feb 2020, Robin Murphy wrote:

> > > > > Add CONFIG_IOMMU_LIMIT_IOVA_ALIGNMENT to limit the alignment of
> > > > > IOVAs to some desired PAGE_SIZE order, specified by
> > > > > CONFIG_IOMMU_IOVA_ALIGNMENT. This helps reduce the impact of
> > > > > fragmentation caused by the current IOVA alignment scheme, and
> > > > > gives better IOVA space utilization.
> > > > 
> > > > Even if the general change did prove reasonable, this IOVA allocator is
> > > > not
> > > > owned by the DMA API, so entirely removing the option of strict
> > > > size-alignment feels a bit uncomfortable. Personally I'd replace the
> > > > bool
> > > > argument with an actual alignment value to at least hand the authority
> > > > out
> > > > to individual callers.
> > > > 
> > > > Furthermore, even in DMA API terms, is anyone really ever going to
> > > > bother
> > > > tuning that config? Since iommu-dma is supposed to be a transparent
> > > > layer,
> > > > arguably it shouldn't behave unnecessarily differently from CMA, so
> > > > simply
> > > > piggy-backing off CONFIG_CMA_ALIGNMENT would seem logical.
> > > 
> > > Agreed, reusing CONFIG_CMA_ALIGNMENT makes a lot of sense here as callers
> > > relying on natural alignment of DMA buffer allocations already have to
> > > deal with that limitation. We could fix it as an optional parameter at
> > > init time (init_iova_domain()), and have the DMA IOMMU implementation
> > > pass it in there.
> > > 
> > 
> > My concern with using CONFIG_CMA_ALIGNMENT alignment is that for us this
> > would either involve further fragmenting our CMA regions (moving our CMA
> > max alignment from 1MB to max 2MB) or losing so of our 2MB IOVA block
> > mappings (changing our IOVA max alignment form 2MB to 1MB).
> > 
> > At least for us CMA allocations are often not DMA mapped into stage 1 page
> > tables so moving the CMA max alignment to 2MB in our case would, I think,
> > only provide the disadvantage of having to increase the size our CMA
> > regions to accommodate this large alignment (which isn’t optimal for
> > memory utilization since CMA regions can't satisfy unmovable page
> > allocations).
> > 
> > As an alternative would it be possible for the dma-iommu layer to use the
> > size of the allocation and the domain pgsize_bitmap field to pick a max
> > IOVA alignment, which it can pass in for that IOVA allocation, which will
> > maximize block mappings but not waste IOVA space?
> 
> Given that we already have DMA_ATTR_ALOC_SINGLE_PAGES for video drivers and
> suchlike that know enough to know they want "large buffer" allocation
> behaviour, would it suffice to have a similar attribute that says "I'm not too
> fussed about alignment"? That way there's no visible change for anyone who
> doesn't opt in and might be relying on the existing behaviour, intentionally
> or otherwise.
> 
> Then if necessary, the implementation can consider both flags together to
> decide whether to try to round down to the next block size or just shove it in
> anywhere.
> 

This should work for us.
My only concern is that many of our users would be using DMA-Buf memory, 
so DMA mapping would be done using dma_buf_map_attachment which I believe 
still doesn't support specifying any DMA attributes. 

I had previously tried to get support added upstream but wasn't 
successful.
https://lkml.org/lkml/2019/1/18/826
https://lkml.org/lkml/2019/1/18/827

But perhaps this new attribute will provide enough justification for DMA 
attributes (in some form, either explicitly or via flags) to be supported 
via dma_buf_map_attachment.

Liam

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-02-20 23:12 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-14 20:30 [RFC PATCH] iommu/iova: Support limiting IOVA alignment Liam Mark
2020-02-17 16:46 ` Robin Murphy
2020-02-19 12:37   ` Will Deacon
2020-02-19 23:22     ` Liam Mark
2020-02-20  2:38       ` Christoph Hellwig
2020-02-20 17:35       ` Robin Murphy
2020-02-20 23:11         ` Liam Mark

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).