linux-fsdevel.vger.kernel.org archive mirror
* [PATCH v3] fs/aio: Replace kmap{,_atomic}() with kmap_local_page()
@ 2023-01-19 16:20 Fabio M. De Francesco
  2023-03-03  5:23 ` Fabio M. De Francesco
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Fabio M. De Francesco @ 2023-01-19 16:20 UTC (permalink / raw)
  To: Alexander Viro, Benjamin LaHaise, linux-fsdevel, linux-aio, linux-kernel
  Cc: Fabio M. De Francesco, Venkataramanan, Anirudh, Ira Weiny,
	Jeff Moyer, Kent Overstreet

The use of kmap() and kmap_atomic() is being deprecated in favor of
kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
the mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread and CPU local, code
holding a mapping can take page faults, and the function can be called
from any context (including interrupts). It is faster than kmap() in
kernels with HIGHMEM enabled. Furthermore, tasks can be preempted and,
when they are scheduled to run again, the kernel virtual addresses are
restored and still valid.

The use of kmap_local_page() in fs/aio.c is "safe" in the sense that the
code does not hand the returned kernel virtual addresses to other threads
and there are no nested mappings that would have to follow the stack-based
(LIFO) map/unmap order. Furthermore, the code between the old
kmap_atomic()/kunmap_atomic() calls did not depend on page faults and/or
preemption being disabled, so there is no need to call pagefault_disable()
and/or preempt_disable() before the mappings.
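
To illustrate the LIFO rule mentioned above, here is a minimal userspace
sketch. It is not kernel code: struct local_map_stack, model_map() and
model_unmap() are hypothetical names that only model the requirement that
nested local mappings be released in reverse order of creation.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOCAL_MAPS 16	/* hypothetical per-thread slot count */

/* Hypothetical per-thread stack of currently active local mappings. */
struct local_map_stack {
	const void *slot[MAX_LOCAL_MAPS];
	int depth;
};

/* Models kmap_local_page(): push the mapping and return its "address". */
static const void *model_map(struct local_map_stack *s, const void *page)
{
	assert(s->depth < MAX_LOCAL_MAPS);
	s->slot[s->depth++] = page;
	return page;
}

/*
 * Models kunmap_local(): only the most recent mapping may be released.
 * Returns 0 on success, -1 if the caller violates the LIFO order.
 */
static int model_unmap(struct local_map_stack *s, const void *addr)
{
	if (s->depth == 0 || s->slot[s->depth - 1] != addr)
		return -1;
	s->depth--;
	return 0;
}
```

In this model, the map/access/unmap pattern used throughout fs/aio.c is a
single model_map() followed directly by model_unmap(), so the ordering
constraint never comes into play.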

Therefore, replace kmap() and kmap_atomic() with kmap_local_page() in
fs/aio.c.

Tested with xfstests on a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel
with HIGHMEM64GB enabled.

Cc: "Venkataramanan, Anirudh" <anirudh.venkataramanan@intel.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
---

I've tested with "./check -g aio". The tests in this group fail 3/26
times, with and without my patch. Therefore, these changes don't introduce
further errors. I'm not aware of any other tests I could run, so any
suggestions would be much appreciated :-)

I'm resending this patch because some recipients were missing in the
previous submissions. In the meantime I'm also adding some more information
in the commit message. There are no changes in the code.

Changes from v1:
        Add further information in the commit message, and the
        "Reviewed-by" tags from Ira and Jeff (thanks!).

Changes from v2:
	Rewrite a block of code between mapping/un-mapping to improve
	readability in aio_setup_ring() and add a missing call to
	flush_dcache_page() in ioctx_add_table() (thanks to Al Viro);
	Add a "Reviewed-by" tag from Kent Overstreet (thanks).
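
The aio_setup_ring() rewrite noted above relies on a C rule: members that
a compound literal with designated initializers does not name are
zero-initialized, so head and tail still end up as 0 without being
assigned. A minimal userspace sketch of that equivalence follows. The
struct ring_hdr type and the MODEL_* constants are stand-ins invented for
this example, not the kernel definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for the header fields of struct aio_ring. */
struct ring_hdr {
	uint32_t id, nr, head, tail;
	uint32_t magic, compat_features, incompat_features, header_length;
};

/* Placeholder values for illustration only (not the kernel constants). */
#define MODEL_MAGIC	0x1234u
#define MODEL_COMPAT	1u
#define MODEL_INCOMPAT	0u

/* Old style: assign every field explicitly through a pointer. */
static void init_fieldwise(struct ring_hdr *r, uint32_t nr_events)
{
	r->nr = nr_events;
	r->id = ~0U;
	r->head = r->tail = 0;
	r->magic = MODEL_MAGIC;
	r->compat_features = MODEL_COMPAT;
	r->incompat_features = MODEL_INCOMPAT;
	r->header_length = sizeof(struct ring_hdr);
}

/* New style: build a compound literal and copy it into the "page". */
static void init_literal(void *page, uint32_t nr_events)
{
	/* head and tail are not named, so they are zero-initialized. */
	const struct ring_hdr *src = &(struct ring_hdr) {
		.nr = nr_events, .id = ~0U, .magic = MODEL_MAGIC,
		.compat_features = MODEL_COMPAT,
		.incompat_features = MODEL_INCOMPAT,
		.header_length = sizeof(struct ring_hdr) };

	memcpy(page, src, sizeof(*src));
}
```

Both functions leave the same bytes in the destination, even when the
destination previously held garbage.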
 
 fs/aio.c | 46 +++++++++++++++++++++-------------------------
 1 file changed, 21 insertions(+), 25 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 562916d85cba..9b39063dc7ac 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -486,7 +486,6 @@ static const struct address_space_operations aio_ctx_aops = {
 
 static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events)
 {
-	struct aio_ring *ring;
 	struct mm_struct *mm = current->mm;
 	unsigned long size, unused;
 	int nr_pages;
@@ -567,16 +566,12 @@ static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events)
 	ctx->user_id = ctx->mmap_base;
 	ctx->nr_events = nr_events; /* trusted copy */
 
-	ring = kmap_atomic(ctx->ring_pages[0]);
-	ring->nr = nr_events;	/* user copy */
-	ring->id = ~0U;
-	ring->head = ring->tail = 0;
-	ring->magic = AIO_RING_MAGIC;
-	ring->compat_features = AIO_RING_COMPAT_FEATURES;
-	ring->incompat_features = AIO_RING_INCOMPAT_FEATURES;
-	ring->header_length = sizeof(struct aio_ring);
-	kunmap_atomic(ring);
-	flush_dcache_page(ctx->ring_pages[0]);
+	memcpy_to_page(ctx->ring_pages[0], 0, (const char *)&(struct aio_ring) {
+		       .nr = nr_events, .id = ~0U, .magic = AIO_RING_MAGIC,
+		       .compat_features = AIO_RING_COMPAT_FEATURES,
+		       .incompat_features = AIO_RING_INCOMPAT_FEATURES,
+		       .header_length = sizeof(struct aio_ring) },
+		       sizeof(struct aio_ring));
 
 	return 0;
 }
@@ -678,9 +673,10 @@ static int ioctx_add_table(struct kioctx *ctx, struct mm_struct *mm)
 					 * we are protected from page migration
 					 * changes ring_pages by ->ring_lock.
 					 */
-					ring = kmap_atomic(ctx->ring_pages[0]);
+					ring = kmap_local_page(ctx->ring_pages[0]);
 					ring->id = ctx->id;
-					kunmap_atomic(ring);
+					kunmap_local(ring);
+					flush_dcache_page(ctx->ring_pages[0]);
 					return 0;
 				}
 
@@ -1021,9 +1017,9 @@ static void user_refill_reqs_available(struct kioctx *ctx)
 		 * against ctx->completed_events below will make sure we do the
 		 * safe/right thing.
 		 */
-		ring = kmap_atomic(ctx->ring_pages[0]);
+		ring = kmap_local_page(ctx->ring_pages[0]);
 		head = ring->head;
-		kunmap_atomic(ring);
+		kunmap_local(ring);
 
 		refill_reqs_available(ctx, head, ctx->tail);
 	}
@@ -1129,12 +1125,12 @@ static void aio_complete(struct aio_kiocb *iocb)
 	if (++tail >= ctx->nr_events)
 		tail = 0;
 
-	ev_page = kmap_atomic(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
+	ev_page = kmap_local_page(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
 	event = ev_page + pos % AIO_EVENTS_PER_PAGE;
 
 	*event = iocb->ki_res;
 
-	kunmap_atomic(ev_page);
+	kunmap_local(ev_page);
 	flush_dcache_page(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
 
 	pr_debug("%p[%u]: %p: %p %Lx %Lx %Lx\n", ctx, tail, iocb,
@@ -1148,10 +1144,10 @@ static void aio_complete(struct aio_kiocb *iocb)
 
 	ctx->tail = tail;
 
-	ring = kmap_atomic(ctx->ring_pages[0]);
+	ring = kmap_local_page(ctx->ring_pages[0]);
 	head = ring->head;
 	ring->tail = tail;
-	kunmap_atomic(ring);
+	kunmap_local(ring);
 	flush_dcache_page(ctx->ring_pages[0]);
 
 	ctx->completed_events++;
@@ -1211,10 +1207,10 @@ static long aio_read_events_ring(struct kioctx *ctx,
 	mutex_lock(&ctx->ring_lock);
 
 	/* Access to ->ring_pages here is protected by ctx->ring_lock. */
-	ring = kmap_atomic(ctx->ring_pages[0]);
+	ring = kmap_local_page(ctx->ring_pages[0]);
 	head = ring->head;
 	tail = ring->tail;
-	kunmap_atomic(ring);
+	kunmap_local(ring);
 
 	/*
 	 * Ensure that once we've read the current tail pointer, that
@@ -1246,10 +1242,10 @@ static long aio_read_events_ring(struct kioctx *ctx,
 		avail = min(avail, nr - ret);
 		avail = min_t(long, avail, AIO_EVENTS_PER_PAGE - pos);
 
-		ev = kmap(page);
+		ev = kmap_local_page(page);
 		copy_ret = copy_to_user(event + ret, ev + pos,
 					sizeof(*ev) * avail);
-		kunmap(page);
+		kunmap_local(ev);
 
 		if (unlikely(copy_ret)) {
 			ret = -EFAULT;
@@ -1261,9 +1257,9 @@ static long aio_read_events_ring(struct kioctx *ctx,
 		head %= ctx->nr_events;
 	}
 
-	ring = kmap_atomic(ctx->ring_pages[0]);
+	ring = kmap_local_page(ctx->ring_pages[0]);
 	ring->head = head;
-	kunmap_atomic(ring);
+	kunmap_local(ring);
 	flush_dcache_page(ctx->ring_pages[0]);
 
 	pr_debug("%li  h%u t%u\n", ret, head, tail);
-- 
2.39.0



* Re: [PATCH v3] fs/aio: Replace kmap{,_atomic}() with kmap_local_page()
  2023-01-19 16:20 [PATCH v3] fs/aio: Replace kmap{,_atomic}() with kmap_local_page() Fabio M. De Francesco
@ 2023-03-03  5:23 ` Fabio M. De Francesco
  2023-03-27 10:08 ` Fabio M. De Francesco
  2023-06-09 15:04 ` Fabio M. De Francesco
  2 siblings, 0 replies; 7+ messages in thread
From: Fabio M. De Francesco @ 2023-03-03  5:23 UTC (permalink / raw)
  To: Alexander Viro
  Cc: Benjamin LaHaise, linux-fsdevel, linux-aio, linux-kernel,
	Venkataramanan, Anirudh, Ira Weiny, Jeff Moyer, Kent Overstreet

On Thursday, 19 January 2023 17:20:55 CET Fabio M. De Francesco wrote:
> The use of kmap() and kmap_atomic() is being deprecated in favor of
> kmap_local_page().
> 
> [...]

Hi Al,

I see that this patch has been pending since Jan 19, 2023.
Is there anything preventing it from being merged? Am I expected to make
further changes? Please note that it already has three "Reviewed-by:" tags
(again, thanks to Ira, Jeff and Kent).

Can you please take it in your tree?

Thanks,

Fabio





* Re: [PATCH v3] fs/aio: Replace kmap{,_atomic}() with kmap_local_page()
  2023-01-19 16:20 [PATCH v3] fs/aio: Replace kmap{,_atomic}() with kmap_local_page() Fabio M. De Francesco
  2023-03-03  5:23 ` Fabio M. De Francesco
@ 2023-03-27 10:08 ` Fabio M. De Francesco
  2023-03-27 13:22   ` Matthew Wilcox
  2023-06-09 15:04 ` Fabio M. De Francesco
  2 siblings, 1 reply; 7+ messages in thread
From: Fabio M. De Francesco @ 2023-03-27 10:08 UTC (permalink / raw)
  To: Alexander Viro
  Cc: Benjamin LaHaise, linux-fsdevel, linux-aio, linux-kernel,
	Venkataramanan, Anirudh, Ira Weiny, Jeff Moyer, Kent Overstreet

On Thursday, 19 January 2023 17:20:55 CEST Fabio M. De Francesco wrote:
> The use of kmap() and kmap_atomic() is being deprecated in favor of
> kmap_local_page().
> 
> [...]
>
Hi Al,

I see that this patch has been pending since Jan 19, 2023.
Is there anything preventing it from being merged? Am I expected to make
further changes? Please note that it already has three "Reviewed-by:" tags
(again, thanks to Ira, Jeff and Kent).

Can you please take it in your tree?

Thanks,

Fabio






* Re: [PATCH v3] fs/aio: Replace kmap{,_atomic}() with kmap_local_page()
  2023-03-27 10:08 ` Fabio M. De Francesco
@ 2023-03-27 13:22   ` Matthew Wilcox
  2023-03-27 18:37     ` Kent Overstreet
  2023-06-07 14:59     ` Fabio M. De Francesco
  0 siblings, 2 replies; 7+ messages in thread
From: Matthew Wilcox @ 2023-03-27 13:22 UTC (permalink / raw)
  To: Fabio M. De Francesco
  Cc: Alexander Viro, Benjamin LaHaise, linux-fsdevel, linux-aio,
	linux-kernel, Venkataramanan, Anirudh, Ira Weiny, Jeff Moyer,
	Kent Overstreet

On Mon, Mar 27, 2023 at 12:08:20PM +0200, Fabio M. De Francesco wrote:
> On Thursday, 19 January 2023 17:20:55 CEST Fabio M. De Francesco wrote:
> > The use of kmap() and kmap_atomic() is being deprecated in favor of
> > kmap_local_page().
> > 
> > [...]
> > 
> > Therefore, replace kmap() and kmap_atomic() with kmap_local_page() in
> > fs/aio.c.

Or should we just stop allocating aio rings from HIGHMEM and remove
the calls to kmap()?  How much memory are we talking about here?
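
As a rough guide to the amount of memory in question, the ring footprint
can be estimated from the event count. This is a simplified sketch under
stated assumptions: 4 KiB pages, a 32-byte ring header and 32-byte events.
The struct and function names are hypothetical stand-ins, and any
adjustment the kernel applies to the requested event count before sizing
the ring is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Stand-in for one completion event: four 64-bit fields, 32 bytes. */
struct model_io_event {
	uint64_t data, obj;
	int64_t res, res2;
};

/* Stand-in for the ring header: eight 32-bit fields, 32 bytes. */
struct model_ring_hdr {
	uint32_t id, nr, head, tail;
	uint32_t magic, compat_features, incompat_features, header_length;
};

/* Bytes needed for a ring holding nr_events completion events. */
static size_t ring_bytes(unsigned int nr_events)
{
	return sizeof(struct model_ring_hdr) +
	       (size_t)nr_events * sizeof(struct model_io_event);
}

/* The same figure rounded up to whole pages. */
static size_t ring_pages(unsigned int nr_events)
{
	return (ring_bytes(nr_events) + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
}
```

Under these assumptions a ring for 128 events needs 4128 bytes, i.e. two
pages, and one for 65536 events needs 513 pages (about 2 MiB) per context.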


* Re: [PATCH v3] fs/aio: Replace kmap{,_atomic}() with kmap_local_page()
  2023-03-27 13:22   ` Matthew Wilcox
@ 2023-03-27 18:37     ` Kent Overstreet
  2023-06-07 14:59     ` Fabio M. De Francesco
  1 sibling, 0 replies; 7+ messages in thread
From: Kent Overstreet @ 2023-03-27 18:37 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Fabio M. De Francesco, Alexander Viro, Benjamin LaHaise,
	linux-fsdevel, linux-aio, linux-kernel, Venkataramanan, Anirudh,
	Ira Weiny, Jeff Moyer

On Mon, Mar 27, 2023 at 02:22:46PM +0100, Matthew Wilcox wrote:
> Or should we just stop allocating aio rings from HIGHMEM and remove
> the calls to kmap()?  How much memory are we talking about here?

I don't think that should stop us from taking these patches, but yes.


* Re: [PATCH v3] fs/aio: Replace kmap{,_atomic}() with kmap_local_page()
  2023-03-27 13:22   ` Matthew Wilcox
  2023-03-27 18:37     ` Kent Overstreet
@ 2023-06-07 14:59     ` Fabio M. De Francesco
  1 sibling, 0 replies; 7+ messages in thread
From: Fabio M. De Francesco @ 2023-06-07 14:59 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Alexander Viro, Benjamin LaHaise, linux-fsdevel, linux-aio,
	linux-kernel, Venkataramanan, Anirudh, Ira Weiny, Jeff Moyer,
	Kent Overstreet

On Monday, 27 March 2023 15:22:46 CEST Matthew Wilcox wrote:
> On Mon, Mar 27, 2023 at 12:08:20PM +0200, Fabio M. De Francesco wrote:
> > On Thursday, 19 January 2023 17:20:55 CEST Fabio M. De Francesco wrote:
> > > The use of kmap() and kmap_atomic() is being deprecated in favor of
> > > kmap_local_page().
> > >
> > > [...]
> > >
> > > Therefore, replace kmap() and kmap_atomic() with kmap_local_page() in
> > > fs/aio.c.
>
> Or should we just stop allocating aio rings from HIGHMEM and remove
> the calls to kmap()?  How much memory are we talking about here?

Matthew,

Well, I'll do as you suggested. Actually, I should have made this change when
you first suggested it but... well, I think you can easily guess why I did not.

It seems that calling find_or_create_page() with the GFP_USER flag instead of
GFP_HIGHUSER is all that is required. I can then get rid of the mappings in
favor of plain page_address() calls.

I have only just looked at this again after months, so I could very well have
missed something else. If what I saw is all that must be changed, I'll send
the new patch by tomorrow.

Thanks,

Fabio

P.S.: I had sent other patches that must also be changed according to a
similar comment you made. Obviously, I'll work on them too (even if you
probably can't recall the short series for fs/ufs I'm referring to).





* Re: [PATCH v3] fs/aio: Replace kmap{,_atomic}() with kmap_local_page()
  2023-01-19 16:20 [PATCH v3] fs/aio: Replace kmap{,_atomic}() with kmap_local_page() Fabio M. De Francesco
  2023-03-03  5:23 ` Fabio M. De Francesco
  2023-03-27 10:08 ` Fabio M. De Francesco
@ 2023-06-09 15:04 ` Fabio M. De Francesco
  2 siblings, 0 replies; 7+ messages in thread
From: Fabio M. De Francesco @ 2023-06-09 15:04 UTC (permalink / raw)
  To: Alexander Viro, Benjamin LaHaise, Ira Weiny, Matthew Wilcox
  Cc: linux-fsdevel, linux-aio, linux-kernel, Jeff Moyer, Kent Overstreet

On Thursday, 19 January 2023 17:20:55 CEST Fabio M. De Francesco wrote:
> The use of kmap() and kmap_atomic() is being deprecated in favor of
> kmap_local_page().

Following Matthew's suggestion, I have just sent another patch which stops
allocating aio rings from ZONE_HIGHMEM.[1]

Therefore, please drop this patch.

Since the purpose of the new patch is entirely different from this one, I
changed the subject and reset the version number to v1.

Thanks,

Fabio

[1] https://lore.kernel.org/lkml/20230609145937.17610-1-fmdefrancesco@gmail.com/
 



