* [PATCH] iommu/io-pgtable-arm: Make allocations NUMA-aware
@ 2018-05-21 18:12 ` Robin Murphy
  0 siblings, 0 replies; 6+ messages in thread
From: Robin Murphy @ 2018-05-21 18:12 UTC (permalink / raw)
  To: will.deacon@arm.com
  Cc: iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org

We would generally expect pagetables to be read by the IOMMU more than
written by the CPU, so in NUMA systems it would be preferable to avoid
the IOMMU making cross-node pagetable walks if possible. We already have
a handle on the IOMMU device for the sake of coherency management, so
it's trivial to grab the appropriate NUMA node when allocating new
pagetable pages.

Note that we drop the semantics of alloc_pages_exact(), but that's fine
since they have never been necessary: the only time we're allocating
more than one page is for stage 2 top-level concatenation, but since
that is based on the number of IPA bits, the size is always some exact
power of two anyway.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/io-pgtable-arm.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 39c2a056da21..e80ca386c5b4 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -231,12 +231,16 @@ static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
 				    struct io_pgtable_cfg *cfg)
 {
 	struct device *dev = cfg->iommu_dev;
+	int order = get_order(size);
+	struct page *p;
 	dma_addr_t dma;
-	void *pages = alloc_pages_exact(size, gfp | __GFP_ZERO);
+	void *pages;
 
-	if (!pages)
+	p = alloc_pages_node(dev_to_node(dev), gfp | __GFP_ZERO, order);
+	if (!p)
 		return NULL;
 
+	pages = page_address(p);
 	if (!(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA)) {
 		dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
 		if (dma_mapping_error(dev, dma))
@@ -256,7 +260,7 @@ static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
 	dev_err(dev, "Cannot accommodate DMA translation for IOMMU page tables\n");
 	dma_unmap_single(dev, dma, size, DMA_TO_DEVICE);
 out_free:
-	free_pages_exact(pages, size);
+	__free_pages(p, order);
 	return NULL;
 }
 
@@ -266,7 +270,7 @@ static void __arm_lpae_free_pages(void *pages, size_t size,
 	if (!(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA))
 		dma_unmap_single(cfg->iommu_dev, __arm_lpae_dma_addr(pages),
 				 size, DMA_TO_DEVICE);
-	free_pages_exact(pages, size);
+	free_pages((unsigned long)pages, get_order(size));
 }
 
 static void __arm_lpae_sync_pte(arm_lpae_iopte *ptep,
-- 
2.17.0.dirty



* Re: [PATCH] iommu/io-pgtable-arm: Make allocations NUMA-aware
  2018-05-21 18:12 ` Robin Murphy
@ 2018-05-21 18:47     ` Will Deacon
  -1 siblings, 0 replies; 6+ messages in thread
From: Will Deacon @ 2018-05-21 18:47 UTC (permalink / raw)
  To: Robin Murphy
  Cc: iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org

On Mon, May 21, 2018 at 07:12:40PM +0100, Robin Murphy wrote:
> We would generally expect pagetables to be read by the IOMMU more than
> written by the CPU, so in NUMA systems it would be preferable to avoid
> the IOMMU making cross-node pagetable walks if possible. We already have
> a handle on the IOMMU device for the sake of coherency management, so
> it's trivial to grab the appropriate NUMA node when allocating new
> pagetable pages.
> 
> Note that we drop the semantics of alloc_pages_exact(), but that's fine
> since they have never been necessary: the only time we're allocating
> more than one page is for stage 2 top-level concatenation, but since
> that is based on the number of IPA bits, the size is always some exact
> power of two anyway.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  drivers/iommu/io-pgtable-arm.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 39c2a056da21..e80ca386c5b4 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -231,12 +231,16 @@ static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
>  				    struct io_pgtable_cfg *cfg)
>  {
>  	struct device *dev = cfg->iommu_dev;
> +	int order = get_order(size);
> +	struct page *p;
>  	dma_addr_t dma;
> -	void *pages = alloc_pages_exact(size, gfp | __GFP_ZERO);
> +	void *pages;
>  
> -	if (!pages)
> +	p = alloc_pages_node(dev_to_node(dev), gfp | __GFP_ZERO, order);
> +	if (!p)
>  		return NULL;
>  
> +	pages = page_address(p);

Might be worth checking/masking out __GFP_HIGHMEM if we see it, since we
could theoretically run into trouble if we got back a highmem mapping here
and we're losing the check in __get_free_pages afaict.

Other than that, looks good:

Acked-by: Will Deacon <will.deacon@arm.com>

Will



* Re: [PATCH] iommu/io-pgtable-arm: Make allocations NUMA-aware
  2018-05-21 18:47     ` Will Deacon
@ 2018-05-22 10:58         ` Robin Murphy
  -1 siblings, 0 replies; 6+ messages in thread
From: Robin Murphy @ 2018-05-22 10:58 UTC (permalink / raw)
  To: Will Deacon
  Cc: iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org

On 21/05/18 19:47, Will Deacon wrote:
> On Mon, May 21, 2018 at 07:12:40PM +0100, Robin Murphy wrote:
>> We would generally expect pagetables to be read by the IOMMU more than
>> written by the CPU, so in NUMA systems it would be preferable to avoid
>> the IOMMU making cross-node pagetable walks if possible. We already have
>> a handle on the IOMMU device for the sake of coherency management, so
>> it's trivial to grab the appropriate NUMA node when allocating new
>> pagetable pages.
>>
>> Note that we drop the semantics of alloc_pages_exact(), but that's fine
>> since they have never been necessary: the only time we're allocating
>> more than one page is for stage 2 top-level concatenation, but since
>> that is based on the number of IPA bits, the size is always some exact
>> power of two anyway.
>>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> ---
>>   drivers/iommu/io-pgtable-arm.c | 12 ++++++++----
>>   1 file changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>> index 39c2a056da21..e80ca386c5b4 100644
>> --- a/drivers/iommu/io-pgtable-arm.c
>> +++ b/drivers/iommu/io-pgtable-arm.c
>> @@ -231,12 +231,16 @@ static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
>>   				    struct io_pgtable_cfg *cfg)
>>   {
>>   	struct device *dev = cfg->iommu_dev;
>> +	int order = get_order(size);
>> +	struct page *p;
>>   	dma_addr_t dma;
>> -	void *pages = alloc_pages_exact(size, gfp | __GFP_ZERO);
>> +	void *pages;
>>   
>> -	if (!pages)
>> +	p = alloc_pages_node(dev_to_node(dev), gfp | __GFP_ZERO, order);
>> +	if (!p)
>>   		return NULL;
>>   
>> +	pages = page_address(p);
> 
> Might be worth checking/masking out __GFP_HIGHMEM if we see it, since we
> could theoretically run into trouble if we got back a highmem mapping here
> and we're losing the check in __get_free_pages afaict.

True - the only callers are internal ones, and anyone trying to make 
inappropriate changes here should quickly discover why highmem doesn't 
work without significant surgery all over, but I don't see any harm in 
keeping an equivalent VM_BUG_ON as clear documentation.

> Other than that, looks good:
> 
> Acked-by: Will Deacon <will.deacon@arm.com>

Thanks!

Robin.


end of thread, other threads:[~2018-05-22 10:58 UTC | newest]

Thread overview: 6+ messages
2018-05-21 18:12 [PATCH] iommu/io-pgtable-arm: Make allocations NUMA-aware Robin Murphy
2018-05-21 18:47 ` Will Deacon
2018-05-22 10:58   ` Robin Murphy
