stable.vger.kernel.org archive mirror
* [PATCH] libnvdimm/pmem: Bypass CONFIG_HARDENED_USERCOPY overhead
From: Dan Williams @ 2019-05-17  0:33 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Jan Kara, stable, Jeff Moyer, Ingo Molnar, Christoph Hellwig,
	Al Viro, Thomas Gleixner, Matthew Wilcox, Jeff Smits,
	linux-fsdevel, linux-kernel

Jeff discovered that performance improves from ~375K iops to ~519K iops
on a simple psync-write fio workload when moving the location of 'struct
page' from the default PMEM location to DRAM. This result is surprising
because the expectation is that 'struct page' for dax is only needed for
third-party references to dax mappings. For example, a dax-mapped buffer
passed to another system call for direct-I/O requires 'struct page' for
sending the request down the driver stack and pinning the page. There is
no usage of 'struct page' for first-party access to a file via
read(2)/write(2) and friends.

However, this "no page needed" expectation is violated by
CONFIG_HARDENED_USERCOPY and the check_copy_size() performed in
copy_from_iter_flushcache() and copy_to_iter_mcsafe(). The
check_heap_object() helper routine assumes the buffer is backed by a
page-allocator DRAM page and applies some checks. Those checks are
invalid (dax pages are not from the heap) and redundant
(dax_iomap_actor() has already validated that the I/O is within bounds).

Bypass this overhead and call the 'no check' versions of the
copy_{to,from}_iter operations directly.

Fixes: 0aed55af8834 ("x86, uaccess: introduce copy_from_iter_flushcache...")
Cc: Jan Kara <jack@suse.cz>
Cc: <stable@vger.kernel.org>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Matthew Wilcox <willy@infradead.org>
Reported-and-tested-by: Jeff Smits <jeff.smits@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/pmem.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 845c5b430cdd..c894f45e5077 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -281,16 +281,21 @@ static long pmem_dax_direct_access(struct dax_device *dax_dev,
 	return __pmem_direct_access(pmem, pgoff, nr_pages, kaddr, pfn);
 }
 
+/*
+ * Use the 'no check' versions of copy_from_iter_flushcache() and
+ * copy_to_iter_mcsafe() to bypass HARDENED_USERCOPY overhead. Bounds
+ * checking is handled by dax_iomap_actor().
+ */
 static size_t pmem_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff,
 		void *addr, size_t bytes, struct iov_iter *i)
 {
-	return copy_from_iter_flushcache(addr, bytes, i);
+	return _copy_from_iter_flushcache(addr, bytes, i);
 }
 
 static size_t pmem_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff,
 		void *addr, size_t bytes, struct iov_iter *i)
 {
-	return copy_to_iter_mcsafe(addr, bytes, i);
+	return _copy_to_iter_mcsafe(addr, bytes, i);
 }
 
 static const struct dax_operations pmem_dax_ops = {
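
The wrapper pattern the patch relies on can be illustrated outside the
kernel. The following is a minimal userspace sketch, not kernel code: every
name in it (sketch_region, sketch_check_copy_size(), sketch_copy(),
sketch_copy_nocheck(), sketch_prevalidated_write(), sketch_checks_run) is
hypothetical, and the check is a plain range test standing in for the
hardened-usercopy object check. It only shows why a caller that has already
validated its bounds, as the changelog says dax_iomap_actor() does, can call
the unchecked variant without repeating the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical counter that makes the cost of the checked path observable. */
static unsigned long sketch_checks_run;

/* Hypothetical description of the memory the checked path knows about. */
struct sketch_region {
	const char *base;
	size_t len;
};

/*
 * Stand-in for the per-copy object check (an assumption, not the kernel's
 * check_copy_size()): accept only if [addr, addr + bytes) is inside 'r'.
 */
static bool sketch_check_copy_size(const struct sketch_region *r,
				   const void *addr, size_t bytes)
{
	uintptr_t start = (uintptr_t)addr;
	uintptr_t base = (uintptr_t)r->base;
	size_t off;

	sketch_checks_run++;
	if (start < base)
		return false;
	off = start - base;
	return off <= r->len && bytes <= r->len - off;
}

/* Unchecked variant: copies and reports the number of bytes copied. */
static size_t sketch_copy_nocheck(void *dst, const void *src, size_t bytes)
{
	assert(dst && src);
	memcpy(dst, src, bytes);
	return bytes;
}

/* Checked wrapper: run the check, then delegate. Returns 0 on rejection. */
static size_t sketch_copy(const struct sketch_region *r, void *dst,
			  const void *src, size_t bytes)
{
	if (!sketch_check_copy_size(r, dst, bytes))
		return 0;
	return sketch_copy_nocheck(dst, src, bytes);
}

/*
 * Caller that validates the I/O range itself, so it goes straight to the
 * unchecked variant instead of paying for a second, redundant check.
 */
static size_t sketch_prevalidated_write(char *dev, size_t dev_len, size_t off,
					const void *src, size_t bytes)
{
	if (off > dev_len || bytes > dev_len - off)
		return 0;
	return sketch_copy_nocheck(dev + off, src, bytes);
}
```

In this sketch the bounds test in sketch_prevalidated_write() plays the role
the changelog assigns to dax_iomap_actor(); the checked wrapper remains
available for callers that do not validate their own ranges.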


Thread overview: 13+ messages
2019-05-17  0:33 [PATCH] libnvdimm/pmem: Bypass CONFIG_HARDENED_USERCOPY overhead Dan Williams
2019-05-17  8:47 ` Jan Kara
2019-05-17  9:06   ` David Laight
2019-05-17 15:53     ` Kees Cook
2019-05-17 16:14       ` David Laight
2019-05-17 16:40         ` Kees Cook
2019-05-17 15:08   ` Dan Williams
2019-05-17 15:56     ` Kees Cook
2019-05-17 17:28       ` Dan Williams
2019-05-17 19:25         ` Kees Cook
2019-05-19  4:46           ` Dan Williams
2019-05-20  7:52             ` Jan Kara
2019-05-20 15:40               ` Dan Williams
