linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/3] use nocache copy in copy_from_iter_nocache()
@ 2016-10-26 15:50 Brian Boylston
  2016-10-26 15:50 ` [PATCH v2 1/3] introduce memcpy_nocache() Brian Boylston
                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Brian Boylston @ 2016-10-26 15:50 UTC
  To: linux-nvdimm
  Cc: linux-kernel, toshi.kani, oliver.moreno, Brian Boylston,
	Ross Zwisler, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Al Viro, Dan Williams

Currently, copy_from_iter_nocache() uses "nocache" copies only for
iovecs; bvecs and kvecs use normal (cached) copies.  As a result, x86's
arch_copy_from_iter_pmem() must issue cache flushes for bvecs and kvecs,
which hurts performance when splice()ing from a pipe to a pmem-backed
file on a DAX-mounted file system.
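
To illustrate the cost described above, here is a minimal userspace sketch
of the "normal copy, then flush" pattern.  It is not the kernel code:
copy_then_flush() and CACHE_LINE are hypothetical names, the 64-byte line
size is an assumption, and the flush uses the SSE2 intrinsics
_mm_clflush() and _mm_sfence() where available.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_clflush(), _mm_sfence() */
#endif

#define CACHE_LINE 64    /* assumed cache-line size, not queried from the CPU */

/*
 * Hypothetical helper: an ordinary (cached) copy followed by an explicit
 * flush of every cache line the destination range touches.  The second
 * pass over the destination is the extra work a nocache copy avoids.
 */
static void *copy_then_flush(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
#if defined(__SSE2__)
    if (n) {
        /* Round down to the start of the first cache line. */
        uintptr_t p = (uintptr_t)dst & ~(uintptr_t)(CACHE_LINE - 1);
        uintptr_t end = (uintptr_t)dst + n;

        for (; p < end; p += CACHE_LINE)
            _mm_clflush((const void *)p);
        _mm_sfence();    /* order the flushes before later stores */
    }
#endif
    return dst;
}
```

On builds without SSE2 the sketch degrades to a plain memcpy(), so it
only demonstrates the flush pass on x86.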

This patch set enables nocache copies in copy_from_iter_nocache() for
bvecs and kvecs for arches that support it (x86 initially).  This provides
a 2-3X improvement in splice() pipe-to-DAX-file throughput.

The first patch introduces memcpy_nocache(), which defaults to just
memcpy(), but for which an x86-specific implementation is provided.
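
The "generic default plus arch override" shape described above can be
sketched in userspace as follows.  This is an illustration under stated
assumptions, not the code from patch 1/3: the guard macro
HAVE_ARCH_MEMCPY_NOCACHE is a hypothetical name, and the x86 variant here
uses SSE2 non-temporal stores (_mm_stream_si128()), which may differ from
what the patch actually does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_loadu_si128(), _mm_stream_si128(), _mm_sfence() */

/* Hypothetical guard macro: tells the generic code an override exists. */
#define HAVE_ARCH_MEMCPY_NOCACHE 1

static inline void *memcpy_nocache(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: ordinary copy until the destination is 16-byte aligned. */
    size_t head = (16 - ((uintptr_t)d & 15)) & 15;
    if (head > n)
        head = n;
    memcpy(d, s, head);
    d += head; s += head; n -= head;

    /* Body: 16-byte non-temporal stores that bypass the cache. */
    while (n >= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_stream_si128((__m128i *)d, v);
        d += 16; s += 16; n -= 16;
    }

    /* Tail: ordinary copy.  Head and tail bytes still go through the cache. */
    memcpy(d, s, n);
    _mm_sfence();        /* make the streaming stores globally visible */
    return dst;
}
#endif

#ifndef HAVE_ARCH_MEMCPY_NOCACHE
/* Generic default: no special support, so fall back to memcpy(). */
static inline void *memcpy_nocache(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}
#endif
```

Note that in this sketch the unaligned head and tail are written with
ordinary stores, so only the aligned body actually avoids the cache.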

For this patch, I wanted to use a static inline function for x86, but
I could not find an obvious header file to put it in.  The build seemed
to work with it in arch/x86/include/asm/uaccess.h, but that didn't feel
completely right.  I also tried arch/x86/include/asm/pmem.h, but that
didn't feel right either, and it didn't build.  So, I offer it here in
arch/x86/lib/misc.c for discussion.

The second patch updates copy_from_iter_nocache() to use the new
memcpy_nocache().

The third patch removes the flushes from x86's arch_copy_from_iter_pmem().

For testing, I ran fio with the posixaio, mmap, sync, psync, vsync, pvsync,
and splice engines, against both ext4 and xfs.  Only the splice engine
showed any change in performance.  For example, for xfs:

Unpatched 4.8:

Run status group 2 (all jobs):
  WRITE: io=37602MB, aggrb=641724KB/s, minb=641724KB/s, maxb=641724KB/s, mint=60001msec, maxt=60001msec

Run status group 3 (all jobs):
  WRITE: io=36244MB, aggrb=618553KB/s, minb=618553KB/s, maxb=618553KB/s, mint=60001msec, maxt=60001msec

With this patch set:

Run status group 2 (all jobs):
  WRITE: io=128055MB, aggrb=2134.3MB/s, minb=2134.3MB/s, maxb=2134.3MB/s, mint=60001msec, maxt=60001msec

Run status group 3 (all jobs):
  WRITE: io=122586MB, aggrb=2043.8MB/s, minb=2043.8MB/s, maxb=2043.8MB/s, mint=60001msec, maxt=60001msec
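
As a quick check of the xfs numbers above: both runs lasted 60001 msec,
so the ratio of the io totals equals the throughput ratio and sidesteps
the KB/s versus MB/s unit difference.  A minimal sketch (speedup() and
print_speedups() are hypothetical helper names):

```c
#include <stdio.h>

/* Ratio of data written over the same wall-clock time. */
static double speedup(double io_before_mb, double io_after_mb)
{
    return io_after_mb / io_before_mb;
}

/* io totals (MB) copied from the fio output above. */
static void print_speedups(void)
{
    printf("group 2: %.2fx\n", speedup(37602.0, 128055.0));
    printf("group 3: %.2fx\n", speedup(36244.0, 122586.0));
}
```

For these two groups the ratio comes out at roughly 3.4x.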

Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <x86@kernel.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Brian Boylston <brian.boylston@hpe.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Reported-by: Oliver Moreno <oliver.moreno@hpe.com>

Changes in v2:
  - Split into multiple patches (Toshi Kani)
  - Introduce memcpy_nocache() (Al Viro)
  - Use nocache for kvecs as well

Brian Boylston (3):
  introduce memcpy_nocache()
  use a nocache copy for bvecs and kvecs in copy_from_iter_nocache()
  x86: remove unneeded flush in arch_copy_from_iter_pmem()

 arch/x86/include/asm/pmem.h      | 19 +------------------
 arch/x86/include/asm/string_32.h |  3 +++
 arch/x86/include/asm/string_64.h |  3 +++
 arch/x86/lib/misc.c              | 12 ++++++++++++
 include/linux/string.h           | 15 +++++++++++++++
 lib/iov_iter.c                   | 14 +++++++++++---
 6 files changed, 45 insertions(+), 21 deletions(-)

-- 
2.8.3



Thread overview: 26+ messages
2016-10-26 15:50 [PATCH v2 0/3] use nocache copy in copy_from_iter_nocache() Brian Boylston
2016-10-26 15:50 ` [PATCH v2 1/3] introduce memcpy_nocache() Brian Boylston
2016-10-26 19:30   ` Thomas Gleixner
2016-10-28  1:52     ` Boylston, Brian
2016-10-26 19:51   ` Boaz Harrosh
2016-10-28  1:54     ` Boylston, Brian
2016-11-01 14:25       ` Boaz Harrosh
2016-12-28 23:43         ` Al Viro
2016-12-29 18:23           ` Dan Williams
2016-12-30  3:52             ` Al Viro
2016-12-30  4:56               ` Dan Williams
2016-12-31  2:25                 ` [RFC] memcpy_nocache() and memcpy_writethrough() Al Viro
2017-01-02  2:35                   ` Elliott, Robert (Persistent Memory)
2017-01-02  5:09                     ` Al Viro
2017-01-03 21:14                       ` Dan Williams
2017-01-03 23:22                         ` Al Viro
2017-01-03 23:46                           ` Linus Torvalds
2017-01-04  0:57                             ` Dan Williams
2017-01-04  1:38                           ` Dan Williams
2017-01-04  1:59                             ` Al Viro
2017-01-04  2:14                               ` Dan Williams
2016-10-26 15:50 ` [PATCH v2 2/3] use a nocache copy for bvecs and kvecs in copy_from_iter_nocache() Brian Boylston
2016-10-27  4:46   ` Ross Zwisler
2016-10-26 15:50 ` [PATCH v2 3/3] x86: remove unneeded flush in arch_copy_from_iter_pmem() Brian Boylston
2016-10-26 19:57   ` Boaz Harrosh
2016-10-28  1:58     ` Boylston, Brian
