* [PATCH v2] IB/mlx5: Reduce max order of memory allocated for xlt update
@ 2021-03-16 13:09 Praveen Kumar Kannoju
2021-03-23 13:42 ` Praveen Kannoju
2021-03-23 16:07 ` Jason Gunthorpe
0 siblings, 2 replies; 8+ messages in thread
From: Praveen Kumar Kannoju @ 2021-03-16 13:09 UTC
To: leon, dledford, jgg, linux-rdma, linux-kernel
Cc: rajesh.sivaramasubramaniom, rama.nichanamatlu, aruna.ramakrishna,
jeffery.yoder, Praveen Kumar Kannoju
To update the xlt (during mlx5_ib_reg_user_mr()), the driver can request up
to 1 MB (order-8) of memory, depending on the size of the MR. This costly
allocation can sometimes take very long to return (a few seconds),
especially if the system is fragmented and does not have any free chunks
of order >= 3. This causes the calling application to hang for a long
time. To avoid these long latency spikes, limit the max order of allocation
to order 3, and reuse that buffer to populate_xlt() for that MR. This will
increase the latency slightly (on the order of microseconds) for each
mlx5_ib_update_xlt() call, especially for larger MRs (since we're making
multiple calls to populate_xlt()), but it's a small price to pay to avoid
the large latency spikes with higher-order allocations. The flag
__GFP_NORETRY is used while fetching the free pages to ensure that there
are no long compaction stalls when the system's memory is fragmented.
Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
---
drivers/infiniband/hw/mlx5/mr.c | 22 +++-------------------
1 file changed, 3 insertions(+), 19 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index db05b0e..dac19f0 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1004,9 +1004,7 @@ static struct mlx5_ib_mr *alloc_cacheable_mr(struct ib_pd *pd,
return mr;
}
-#define MLX5_MAX_UMR_CHUNK ((1 << (MLX5_MAX_UMR_SHIFT + 4)) - \
- MLX5_UMR_MTT_ALIGNMENT)
-#define MLX5_SPARE_UMR_CHUNK 0x10000
+#define MLX5_SPARE_UMR_CHUNK 0x8000
/*
* Allocate a temporary buffer to hold the per-page information to transfer to
@@ -1028,30 +1026,16 @@ static void *mlx5_ib_alloc_xlt(size_t *nents, size_t ent_size, gfp_t gfp_mask)
*/
might_sleep();
- gfp_mask |= __GFP_ZERO;
+ gfp_mask |= __GFP_ZERO | __GFP_NORETRY;
- /*
- * If the system already has a suitable high order page then just use
- * that, but don't try hard to create one. This max is about 1M, so a
- * free x86 huge page will satisfy it.
- */
size = min_t(size_t, ent_size * ALIGN(*nents, xlt_chunk_align),
- MLX5_MAX_UMR_CHUNK);
+ MLX5_SPARE_UMR_CHUNK);
*nents = size / ent_size;
res = (void *)__get_free_pages(gfp_mask | __GFP_NOWARN,
get_order(size));
if (res)
return res;
- if (size > MLX5_SPARE_UMR_CHUNK) {
- size = MLX5_SPARE_UMR_CHUNK;
- *nents = get_order(size) / ent_size;
- res = (void *)__get_free_pages(gfp_mask | __GFP_NOWARN,
- get_order(size));
- if (res)
- return res;
- }
-
*nents = PAGE_SIZE / ent_size;
res = (void *)__get_free_page(gfp_mask);
if (res)
--
1.8.3.1
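
The size-to-order arithmetic in the commit message can be checked with a small userspace sketch. It is not kernel code: the helper names are hypothetical, and it assumes 4 KiB pages and 8-byte translation entries, neither of which is stated in the patch itself.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages, as on x86. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical userspace stand-in for the kernel's get_order():
 * the smallest order such that (SKETCH_PAGE_SIZE << order) >= size.
 */
unsigned int sketch_get_order(size_t size)
{
	unsigned int order = 0;
	size_t span = SKETCH_PAGE_SIZE;

	assert(size > 0);
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/*
 * Number of times a buffer of buf_size bytes must be refilled to cover
 * total_ents entries of ent_size bytes each (rounded up).
 */
size_t sketch_passes(size_t total_ents, size_t ent_size, size_t buf_size)
{
	size_t ents_per_pass;

	assert(ent_size > 0 && buf_size >= ent_size);
	ents_per_pass = buf_size / ent_size;
	return (total_ents + ents_per_pass - 1) / ents_per_pass;
}
```

Under these assumptions, a 1 MiB request is order 8, the old 0x10000 spare chunk is order 4, and the new 0x8000 cap is order 3. A 1 GiB MR mapped with 4 KiB pages has 262144 entries, so the 0x8000 buffer would be refilled 64 times where a full 1 MiB buffer would need 2 passes, which is the extra per-call work the commit message trades against the allocation stalls.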
* RE: [PATCH v2] IB/mlx5: Reduce max order of memory allocated for xlt update
2021-03-16 13:09 [PATCH v2] IB/mlx5: Reduce max order of memory allocated for xlt update Praveen Kumar Kannoju
@ 2021-03-23 13:42 ` Praveen Kannoju
2021-03-23 16:07 ` Jason Gunthorpe
1 sibling, 0 replies; 8+ messages in thread
From: Praveen Kannoju @ 2021-03-23 13:42 UTC
To: Praveen Kannoju, leon, dledford, jgg, linux-rdma, linux-kernel
Cc: Rajesh Sivaramasubramaniom, Rama Nichanamatlu, Aruna Ramakrishna,
Jeffery Yoder
Ping.
Could the reviewers please go through the patch and let us know if you have any questions about it?
-
Praveen Kumar Kannoju.
* Re: [PATCH v2] IB/mlx5: Reduce max order of memory allocated for xlt update
2021-03-16 13:09 [PATCH v2] IB/mlx5: Reduce max order of memory allocated for xlt update Praveen Kumar Kannoju
2021-03-23 13:42 ` Praveen Kannoju
@ 2021-03-23 16:07 ` Jason Gunthorpe
[not found] ` <80966C8E-341B-4F5D-9DCA-C7D82AB084D5@oracle.com>
1 sibling, 1 reply; 8+ messages in thread
From: Jason Gunthorpe @ 2021-03-23 16:07 UTC
To: Praveen Kumar Kannoju
Cc: leon, dledford, linux-rdma, linux-kernel,
rajesh.sivaramasubramaniom, rama.nichanamatlu, aruna.ramakrishna,
jeffery.yoder
On Tue, Mar 16, 2021 at 01:09:01PM +0000, Praveen Kumar Kannoju wrote:
> To update the xlt (during mlx5_ib_reg_user_mr()), the driver can request up
> to 1 MB (order-8) of memory, depending on the size of the MR. This costly
> allocation can sometimes take very long to return (a few seconds),
> especially if the system is fragmented and does not have any free chunks
> of order >= 3. This causes the calling application to hang for a long
> time. To avoid these long latency spikes, limit the max order of allocation
> to order 3, and reuse that buffer to populate_xlt() for that MR. This will
> increase the latency slightly (on the order of microseconds) for each
> mlx5_ib_update_xlt() call, especially for larger MRs (since we're making
> multiple calls to populate_xlt()), but it's a small price to pay to avoid
> the large latency spikes with higher-order allocations. The flag
> __GFP_NORETRY is used while fetching the free pages to ensure that there
> are no long compaction stalls when the system's memory is fragmented.
>
> Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
> ---
> drivers/infiniband/hw/mlx5/mr.c | 22 +++-------------------
> 1 file changed, 3 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> index db05b0e..dac19f0 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -1004,9 +1004,7 @@ static struct mlx5_ib_mr *alloc_cacheable_mr(struct ib_pd *pd,
> return mr;
> }
>
> -#define MLX5_MAX_UMR_CHUNK ((1 << (MLX5_MAX_UMR_SHIFT + 4)) - \
> - MLX5_UMR_MTT_ALIGNMENT)
> -#define MLX5_SPARE_UMR_CHUNK 0x10000
> +#define MLX5_SPARE_UMR_CHUNK 0x8000
>
> /*
> * Allocate a temporary buffer to hold the per-page information to transfer to
> @@ -1028,30 +1026,16 @@ static void *mlx5_ib_alloc_xlt(size_t *nents, size_t ent_size, gfp_t gfp_mask)
> */
> might_sleep();
>
> - gfp_mask |= __GFP_ZERO;
> + gfp_mask |= __GFP_ZERO | __GFP_NORETRY;
>
> - /*
> - * If the system already has a suitable high order page then just use
> - * that, but don't try hard to create one. This max is about 1M, so a
> - * free x86 huge page will satisfy it.
> - */
> size = min_t(size_t, ent_size * ALIGN(*nents, xlt_chunk_align),
> - MLX5_MAX_UMR_CHUNK);
> + MLX5_SPARE_UMR_CHUNK);
> *nents = size / ent_size;
> res = (void *)__get_free_pages(gfp_mask | __GFP_NOWARN,
> get_order(size));
> if (res)
> return res;
>
> - if (size > MLX5_SPARE_UMR_CHUNK) {
> - size = MLX5_SPARE_UMR_CHUNK;
> - *nents = get_order(size) / ent_size;
> - res = (void *)__get_free_pages(gfp_mask | __GFP_NOWARN,
> - get_order(size));
> - if (res)
> - return res;
> - }
Why did you delete this and make the size smaller? Isn't GFP_NORETRY
enough?
Jason
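
To make the question above concrete, here is a userspace sketch of the attempt ladder as it stands after the patch: one capped high-order attempt, then a single page. It is an illustration only. The allocator model (a single "highest free order" number) and all ladder_* names are hypothetical, and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. */
#define LADDER_PAGE_SIZE 4096UL

/*
 * Hypothetical model of a fragmented system: requests succeed only up to
 * max_free_order. A negative value means no page is available at all.
 */
struct ladder_allocator {
	int max_free_order;
};

/* Smallest order such that (LADDER_PAGE_SIZE << order) >= size. */
unsigned int ladder_order(size_t size)
{
	unsigned int order = 0;
	size_t span = LADDER_PAGE_SIZE;

	assert(size > 0);
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/*
 * Mirror of the post-patch ladder: try min(want, cap) once, then fall
 * back to one page. Returns the usable buffer size in bytes (0 on
 * failure) and stores the number of entries that fit in *nents.
 */
size_t ladder_alloc(const struct ladder_allocator *a, size_t want,
		    size_t cap, size_t ent_size, size_t *nents)
{
	size_t size = want < cap ? want : cap;

	assert(ent_size > 0 && nents != NULL);

	/* First attempt: the capped size, no retries in this model. */
	if ((int)ladder_order(size) <= a->max_free_order) {
		*nents = size / ent_size;
		return size;
	}

	/* Fallback: a single page. */
	if (a->max_free_order >= 0) {
		*nents = LADDER_PAGE_SIZE / ent_size;
		return LADDER_PAGE_SIZE;
	}

	*nents = 0;
	return 0;
}
```

Passing cap = 0x8000 models the patched behaviour; passing a 1 MiB cap with the same model shows what a single non-retrying large attempt would do on a system with nothing free at order 3 or above, namely fail once and drop straight to the page-sized buffer. Whether that one failed attempt is cheap enough in practice is what the question is asking, and the sketch does not settle it.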
end of thread, other threads:[~2021-04-01 17:57 UTC | newest]
Thread overview: 8+ messages
2021-03-16 13:09 [PATCH v2] IB/mlx5: Reduce max order of memory allocated for xlt update Praveen Kumar Kannoju
2021-03-23 13:42 ` Praveen Kannoju
2021-03-23 16:07 ` Jason Gunthorpe
[not found] ` <80966C8E-341B-4F5D-9DCA-C7D82AB084D5@oracle.com>
2021-03-23 23:13 ` Jason Gunthorpe
2021-03-24 4:27 ` Aruna Ramakrishna
2021-03-25 14:39 ` Jason Gunthorpe
2021-03-31 17:53 ` Jason Gunthorpe
2021-04-01 15:56 ` Praveen Kannoju