* [PATCH] use a nocache copy for bvecs in copy_from_iter_nocache()
@ 2016-10-07 15:55 ` Brian Boylston
  0 siblings, 0 replies; 8+ messages in thread
From: Brian Boylston @ 2016-10-07 15:55 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: oliver.moreno, x86, linux-kernel, Ingo Molnar, Al Viro,
	H. Peter Anvin, Thomas Gleixner

copy_from_iter_nocache() is only "nocache" for iovecs.  Enhance it to also
use a nocache copy for bvecs.  This improves performance by 2-3X when
splice()ing to a file in a DAX-mounted, pmem-backed file system.

Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <x86@kernel.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Brian Boylston <brian.boylston@hpe.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Reported-by: Oliver Moreno <oliver.moreno@hpe.com>
---
 arch/x86/include/asm/pmem.h |  6 +++---
 lib/iov_iter.c              | 11 +++++++++--
 2 files changed, 12 insertions(+), 5 deletions(-)
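
For readers following the pmem.h hunk below, here is a minimal userspace model of how the write-back check changes. It is only a sketch: the flag values and the way the old iter_is_iovec() test is modelled are assumptions, and struct iter_model, needs_pmem_wb_old() and needs_pmem_wb_new() are hypothetical names that do not appear in the patch.

```c
#include <stdbool.h>

/* Assumed flag values for this model; the real ones live in <linux/uio.h>. */
enum { ITER_IOVEC = 0, ITER_KVEC = 2, ITER_BVEC = 4 };

/* Minimal stand-in for struct iov_iter: only the type field matters here. */
struct iter_model {
	int type;
};

/* Old check (modelled): anything that is not an iovec needs a write-back. */
static inline bool needs_pmem_wb_old(const struct iter_model *i)
{
	return (i->type & (ITER_KVEC | ITER_BVEC)) != 0;
}

/* New check: only kvec iterators still go through a cached memcpy(). */
static inline bool needs_pmem_wb_new(const struct iter_model *i)
{
	return (i->type & ITER_KVEC) == ITER_KVEC;
}
```

In this model a bvec iterator is the only case whose answer changes, which matches the intent stated in the updated comment: bvec copies no longer need a separate cache write-back.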

diff --git a/arch/x86/include/asm/pmem.h b/arch/x86/include/asm/pmem.h
index 643eba4..d071f45c 100644
--- a/arch/x86/include/asm/pmem.h
+++ b/arch/x86/include/asm/pmem.h
@@ -73,12 +73,12 @@ static inline void arch_wb_cache_pmem(void *addr, size_t size)
 }
 
 /*
- * copy_from_iter_nocache() on x86 only uses non-temporal stores for iovec
- * iterators, so for other types (bvec & kvec) we must do a cache write-back.
+ * copy_from_iter_nocache() on x86 uses non-temporal stores for iovec and
+ * bvec iterators, but for kvec we must do a cache write-back.
  */
 static inline bool __iter_needs_pmem_wb(struct iov_iter *i)
 {
-	return iter_is_iovec(i) == false;
+	return (i->type & ITER_KVEC) == ITER_KVEC;
 }
 
 /**
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 7e3138c..df4cb00 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -342,6 +342,13 @@ static void memcpy_from_page(char *to, struct page *page, size_t offset, size_t
 	kunmap_atomic(from);
 }
 
+static void memcpy_from_page_nocache(char *to, struct page *page, size_t offset, size_t len)
+{
+	char *from = kmap_atomic(page);
+	__copy_from_user_inatomic_nocache(to, from, len);
+	kunmap_atomic(from);
+}
+
 static void memcpy_to_page(struct page *page, size_t offset, const char *from, size_t len)
 {
 	char *to = kmap_atomic(page);
@@ -392,8 +399,8 @@ size_t copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
 	iterate_and_advance(i, bytes, v,
 		__copy_from_user_nocache((to += v.iov_len) - v.iov_len,
 					 v.iov_base, v.iov_len),
-		memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
-				 v.bv_offset, v.bv_len),
+		memcpy_from_page_nocache((to += v.bv_len) - v.bv_len,
+					 v.bv_page, v.bv_offset, v.bv_len),
 		memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len)
 	)
 
-- 
1.8.3.1


* Re: [PATCH] use a nocache copy for bvecs in copy_from_iter_nocache()
  2016-10-07 15:55 ` Brian Boylston
@ 2016-10-07 17:08   ` Al Viro
  -1 siblings, 0 replies; 8+ messages in thread
From: Al Viro @ 2016-10-07 17:08 UTC (permalink / raw)
  To: Brian Boylston
  Cc: linux-nvdimm, oliver.moreno, x86, linux-kernel, Ingo Molnar,
	H. Peter Anvin, Thomas Gleixner

On Fri, Oct 07, 2016 at 10:55:11AM -0500, Brian Boylston wrote:
> copy_from_iter_nocache() is only "nocache" for iovecs.  Enhance it to also
> use a nocache copy for bvecs.  This improves performance by 2-3X when
> splice()ing to a file in a DAX-mounted, pmem-backed file system.

> +static void memcpy_from_page_nocache(char *to, struct page *page, size_t offset, size_t len)
> +{
> +	char *from = kmap_atomic(page);
> +	__copy_from_user_inatomic_nocache(to, from, len);
> +	kunmap_atomic(from);
> +}

At the very least, it will blow up on any architecture with split
userland and kernel MMU contexts.  You *can't* feed a kernel pointer
to things like that and expect it to work.  At the very least, you
need to add memcpy_nocache() and have it default to memcpy(), with
non-dummy version on x86.  And use _that_, rather than messing with
__copy_from_user_inatomic_nocache()
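
A rough sketch of the shape suggested above, offered as an illustration rather than code from this thread: a generic memcpy_nocache() that falls back to memcpy(), which an architecture could override. ARCH_HAS_MEMCPY_NOCACHE, arch_memcpy_nocache() and memcpy_from_buf_nocache() are hypothetical names. The buffer helper is a userspace stand-in for the kmap_atomic()-based page helper, and it adds `offset` to the source address, which the quoted helper takes as a parameter but does not use.

```c
#include <stddef.h>
#include <string.h>

/*
 * Illustrative only: ARCH_HAS_MEMCPY_NOCACHE and arch_memcpy_nocache() are
 * hypothetical names, not symbols from the patch or from a kernel tree.
 */
#ifdef ARCH_HAS_MEMCPY_NOCACHE
/* An architecture (e.g. x86) would provide a non-temporal-store version. */
void *arch_memcpy_nocache(void *dst, const void *src, size_t len);
#define memcpy_nocache(dst, src, len) arch_memcpy_nocache((dst), (src), (len))
#else
/* Default: no cache-bypassing copy is available, so use plain memcpy(). */
static inline void *memcpy_nocache(void *dst, const void *src, size_t len)
{
	return memcpy(dst, src, len);
}
#endif

/*
 * Userspace stand-in for the page helper: "base" plays the role of the
 * address returned by kmap_atomic(page). Both pointers are kernel-style
 * (non-user) addresses, so no user-copy primitive is involved.
 */
static inline void memcpy_from_buf_nocache(char *to, const char *base,
					   size_t offset, size_t len)
{
	memcpy_nocache(to, base + offset, len);
}
```

The point of the split is that the generic code only ever calls memcpy_nocache() on kernel addresses, so architectures with separate user and kernel address spaces simply get memcpy().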

* Re: [PATCH] use a nocache copy for bvecs in copy_from_iter_nocache()
  2016-10-07 17:08   ` Al Viro
@ 2016-10-10 15:26     ` Kani, Toshimitsu
  -1 siblings, 0 replies; 8+ messages in thread
From: Kani, Toshimitsu @ 2016-10-10 15:26 UTC (permalink / raw)
  To: viro, Boylston, Brian
  Cc: linux-kernel, tglx, x86, dan.j.williams, hpa,
	linux-nvdimm@lists.01.org, Moreno, Oliver, mingo, ross.zwisler

On Fri, 2016-10-07 at 18:08 +0100, Al Viro wrote:
> On Fri, Oct 07, 2016 at 10:55:11AM -0500, Brian Boylston wrote:
> > 
> > copy_from_iter_nocache() is only "nocache" for iovecs.  Enhance it
> > to also use a nocache copy for bvecs.  This improves performance by
> > 2-3X when splice()ing to a file in a DAX-mounted, pmem-backed file
> > system.
> 
> > 
> > +static void memcpy_from_page_nocache(char *to, struct page *page, size_t offset, size_t len)
> > +{
> > +	char *from = kmap_atomic(page);
> > +	__copy_from_user_inatomic_nocache(to, from, len);
> > +	kunmap_atomic(from);
> > +}
> 
> At the very least, it will blow up on any architecture with split
> userland and kernel MMU contexts.  You *can't* feed a kernel pointer
> to things like that and expect it to work.  At the very least, you
> need to add memcpy_nocache() and have it default to memcpy(), with
> non-dummy version on x86.  And use _that_, rather than messing with
> __copy_from_user_inatomic_nocache()

Good point.  I think we can add memcpy_nocache() which calls
__copy_from_user_inatomic_nocache() on x86 and defaults to memcpy() on
other architectures.

Thanks,
-Toshi

* RE: [PATCH] use a nocache copy for bvecs in copy_from_iter_nocache()
  2016-10-10 15:26     ` Kani, Toshimitsu
@ 2016-10-11 13:26       ` Boylston, Brian
  -1 siblings, 0 replies; 8+ messages in thread
From: Boylston, Brian @ 2016-10-11 13:26 UTC (permalink / raw)
  To: Kani, Toshimitsu, viro
  Cc: linux-kernel, tglx, x86, dan.j.williams, hpa,
	linux-nvdimm@lists.01.org, Moreno, Oliver, mingo, ross.zwisler

Kani, Toshimitsu wrote on 2016-10-10:
> On Fri, 2016-10-07 at 18:08 +0100, Al Viro wrote:
>> On Fri, Oct 07, 2016 at 10:55:11AM -0500, Brian Boylston wrote:
>>> 
>>> copy_from_iter_nocache() is only "nocache" for iovecs.  Enhance it
>>> to also use a nocache copy for bvecs.  This improves performance by
>>> 2-3X when splice()ing to a file in a DAX-mounted, pmem-backed file
>>> system.
>> 
>>> 
>>> +static void memcpy_from_page_nocache(char *to, struct page *page, size_t offset, size_t len)
>>> +{
>>> +	char *from = kmap_atomic(page);
>>> +	__copy_from_user_inatomic_nocache(to, from, len);
>>> +	kunmap_atomic(from);
>>> +}
>> 
>> At the very least, it will blow up on any architecture with split
>> userland and kernel MMU contexts.  You *can't* feed a kernel pointer
>> to things like that and expect it to work.  At the very least, you
>> need to add memcpy_nocache() and have it default to memcpy(), with
>> non-dummy version on x86.  And use _that_, rather than messing with
>> __copy_from_user_inatomic_nocache()
> 
> Good point.  I think we can add memcpy_nocache() which calls
> __copy_from_user_inatomic_nocache() on x86 and defaults to memcpy() on
> other architectures.

Thanks, Al and Toshi, for the feedback.  I'll re-work and come back.

Brian

