* [RFC PATCH 0/13 v2] dax, ext4: Synchronous page faults
From: Jan Kara @ 2017-08-17 16:08 UTC
  To: linux-fsdevel
  Cc: Christoph Hellwig, Boaz Harrosh, Jan Kara, linux-nvdimm,
	linux-xfs, Andy Lutomirski, linux-ext4

Hello,

here is the second version of my patches implementing synchronous page faults
for DAX mappings. The goal is to make flushing of DAX mappings possible from
userspace, so that they can be flushed at finer than page granularity and
without the overhead of a syscall.

We use a new mmap flag, MAP_SYNC, to indicate that page faults for the mapping
should be synchronous.  The guarantee provided by this flag is: while a block
is writably mapped into the page tables of this mapping, it is guaranteed to
be visible in the file at that offset even after a crash.
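
For illustration, a userspace caller might request such a mapping roughly as
follows. This is only a sketch: the flag values are the ones later mainline
kernels define (an assumption here, since the MAP_SYNC patch in this series is
explicitly provisional), and map_sync_or_fallback() and store_byte() are
hypothetical helper names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumed values (as in later mainline kernels); this RFC may differ. */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x080000
#endif

/*
 * Map len bytes of fd writably, asking for synchronous page faults.
 * *is_sync is set to 1 if MAP_SYNC was accepted, 0 if we fell back to a
 * plain shared mapping. Returns MAP_FAILED if both attempts fail.
 */
static void *map_sync_or_fallback(int fd, size_t len, int *is_sync)
{
	void *p;

	assert(is_sync != NULL);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
	if (p != MAP_FAILED) {
		*is_sync = 1;
		return p;
	}
	/* Not a DAX file, or no MAP_SYNC support in kernel / filesystem. */
	*is_sync = 0;
	return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}

/* Store one byte at offset 0 of path. Returns 0 on success, -1 on error. */
static int store_byte(const char *path, char value)
{
	int is_sync = 0, ret = -1;
	int fd = open(path, O_RDWR | O_CREAT, 0600);

	if (fd < 0)
		return -1;
	if (ftruncate(fd, 4096) == 0) {
		char *p = map_sync_or_fallback(fd, 4096, &is_sync);

		if (p != MAP_FAILED) {
			p[0] = value;
			/*
			 * msync() is correct in both cases. With is_sync set,
			 * a CPU cache flush of the touched lines would be the
			 * cheaper alternative (not shown here).
			 */
			ret = msync(p, 4096, MS_SYNC);
			munmap(p, 4096);
		}
	}
	close(fd);
	return ret;
}
```

A caller would check is_sync to decide whether it may skip the syscall and
flush from userspace instead.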

The way I implement this is that ->iomap_begin() indicates by a flag that the
inode's block mapping metadata is unstable and may need flushing (it uses the
same test as for whether fdatasync() has metadata to write). If so, the DAX
fault handler refrains from inserting / write-enabling the page table entry
and returns a special flag, VM_FAULT_NEEDDSYNC, together with the PFN to map,
to the filesystem fault handler. The handler then calls fdatasync()
(vfs_fsync_range()) for the affected range and after that calls DAX code to
update the page table entry appropriately.
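
The control flow described above can be modelled in plain userspace C. This is
a toy sketch, not kernel code: apart from the name VM_FAULT_NEEDDSYNC, all
types and functions below (toy_inode, toy_fault, toy_dax_fault, toy_fs_fault)
are hypothetical stand-ins, and the enum values are made up.

```c
#include <assert.h>
#include <stdbool.h>

/* Only the name VM_FAULT_NEEDDSYNC comes from the series. */
enum fault_result { FAULT_DONE, VM_FAULT_NEEDDSYNC };

struct toy_inode {
	bool metadata_dirty;	/* would fdatasync() have metadata to write? */
	int fsync_calls;	/* how often the range was synced */
};

struct toy_fault {
	bool write;		/* write fault? */
	bool map_sync;		/* mapping created with MAP_SYNC? */
	unsigned long pfn;	/* PFN handed back by the DAX layer */
	bool pte_writable;	/* has a writable entry been installed? */
};

/* DAX layer: either finishes the fault or defers to the filesystem. */
static enum fault_result toy_dax_fault(struct toy_inode *inode,
				       struct toy_fault *vmf,
				       unsigned long pfn)
{
	vmf->pfn = pfn;
	if (vmf->write && vmf->map_sync && inode->metadata_dirty)
		return VM_FAULT_NEEDDSYNC;	/* page table left untouched */
	vmf->pte_writable = vmf->write;
	return FAULT_DONE;
}

/* Filesystem fault handler: sync metadata first, then install the entry. */
static void toy_fs_fault(struct toy_inode *inode, struct toy_fault *vmf,
			 unsigned long pfn)
{
	if (toy_dax_fault(inode, vmf, pfn) != VM_FAULT_NEEDDSYNC)
		return;
	assert(!vmf->pte_writable);	/* nothing was mapped writably yet */
	/* Stands in for fdatasync() / vfs_fsync_range() on the range. */
	inode->fsync_calls++;
	inode->metadata_dirty = false;
	/* Only now is it safe to make the block writable through the mapping. */
	vmf->pte_writable = true;
}
```

The point of the ordering is that a writable entry never exists while the
block mapping metadata could still be lost in a crash.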

From my (fairly limited) knowledge of XFS, it seems XFS should be able to do
the same. It should even be possible for a filesystem to implement safe
remapping of a file offset to a different block (e.g. break reflink, do
defrag, or similar stuff) like this:

1) Block page faults
2) fdatasync() remapped range (there can be outstanding data modifications
   not yet flushed)
3) unmap_mapping_range()
4) Now remap blocks
5) Unblock page faults

Basically we do the same on events like punch hole so there is not much new
there.

Note that the implementation of the MAP_SYNC flag is pretty crude for now,
just to enable testing, since Dan is working in the same area to implement
another mmap flag. Once the decision on how to implement new mmap flags is
settled, I can clean up that patch.

I did some basic performance testing of the patches on a ramdisk: I timed the
latency of page faults when faulting 512 pages. I ran several tests: with the
file preallocated / with the file empty, with background file copying going
on / without it, and with / without MAP_SYNC (so that we get a comparison).
The results are (numbers are in microseconds):

File preallocated, no background load, no MAP_SYNC:
min=5 avg=6 max=42
4 - 7 us: 398
8 - 15 us: 110
16 - 31 us: 2
32 - 63 us: 2

File preallocated, no background load, MAP_SYNC:
min=10 avg=10 max=43
8 - 15 us: 509
16 - 31 us: 2
32 - 63 us: 1

File empty, no background load, no MAP_SYNC:
min=21 avg=23 max=76
16 - 31 us: 503
32 - 63 us: 8
64 - 127 us: 1

File empty, no background load, MAP_SYNC:
min=91 avg=108 max=234
64 - 127 us: 467
128 - 255 us: 45

File empty, background load, no MAP_SYNC:
min=21 avg=23 max=67
16 - 31 us: 507
32 - 63 us: 4
64 - 127 us: 1

File empty, background load, MAP_SYNC:
min=94 avg=112 max=181
64 - 127 us: 489
128 - 255 us: 23

So here we can see that when we need to wait for a transaction commit in this
setup, a MAP_SYNC fault takes about 100-200 us, compared to about 20-30 us
without MAP_SYNC. When the file is preallocated, the averages are 10 us vs
6 us.
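
For anyone wanting to collect similar numbers, here is one possible way to
time faults from userspace and bin them into power-of-two buckets like the
tables above. This is a sketch under assumptions: the cover letter does not
say how the latencies were measured, and fault_stats, stats_add() and
time_faults() are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define NBUCKETS 32	/* bucket i counts latencies in [2^i, 2^(i+1)) us */

struct fault_stats {
	uint64_t min, max, sum, count;
	uint64_t hist[NBUCKETS];
};

/* Record one latency sample, given in microseconds. */
static void stats_add(struct fault_stats *s, uint64_t us)
{
	unsigned int idx = 0;
	uint64_t v;

	for (v = us; v > 1; v >>= 1)	/* idx = floor(log2(us)); 0 and 1 -> 0 */
		idx++;
	assert(idx < NBUCKETS);
	if (s->count == 0 || us < s->min)
		s->min = us;
	if (us > s->max)
		s->max = us;
	s->sum += us;
	s->count++;
	s->hist[idx]++;
}

/*
 * Write to the first byte of each of npages pages and time every store.
 * On a freshly created mapping each store takes a write fault, so the
 * elapsed time is dominated by fault handling.
 */
static void time_faults(volatile char *map, size_t npages, size_t page_size,
			struct fault_stats *s)
{
	size_t i;

	for (i = 0; i < npages; i++) {
		struct timespec a, b;
		int64_t ns;

		clock_gettime(CLOCK_MONOTONIC, &a);
		map[i * page_size] = 1;
		clock_gettime(CLOCK_MONOTONIC, &b);
		ns = (int64_t)(b.tv_sec - a.tv_sec) * 1000000000 +
		     (b.tv_nsec - a.tv_nsec);
		stats_add(s, (uint64_t)ns / 1000);
	}
}
```

The average is sum / count, and hist[2], for example, corresponds to the
"4 - 7 us" line in the results.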

Anyway, here are the patches, comments are welcome.

Changes since v1:
* switched to using mmap flag MAP_SYNC
* cleaned up fault handlers to avoid passing pfn in vmf->orig_pte
* switched to not touching page tables before we are ready to insert final
  entry as it was unnecessary and not really simplifying anything
* renamed fault flag to VM_FAULT_NEEDDSYNC
* other smaller fixes found by reviewers

								Honza

Thread overview: 142+ messages
2017-08-17 16:08 [RFC PATCH 0/13 v2] dax, ext4: Synchronous page faults Jan Kara
2017-08-17 16:08 ` [PATCH 01/13] mm: Remove VM_FAULT_HWPOISON_LARGE_MASK Jan Kara
2017-08-17 16:08 ` [PATCH 02/13] dax: Simplify arguments of dax_insert_mapping() Jan Kara
2017-08-17 16:08 ` [PATCH 03/13] dax: Factor out getting of pfn out of iomap Jan Kara
2017-08-18 22:06   ` Ross Zwisler
2017-08-23 18:30   ` Christoph Hellwig
2017-08-17 16:08 ` [PATCH 04/13] dax: Create local variable for VMA in dax_iomap_pte_fault() Jan Kara
2017-08-18 22:08   ` Ross Zwisler
2017-08-23 18:30   ` Christoph Hellwig
2017-08-17 16:08 ` [PATCH 05/13] dax: Create local variable for vmf->flags & FAULT_FLAG_WRITE test Jan Kara
2017-08-18 22:08   ` Ross Zwisler
2017-08-23 18:31   ` Christoph Hellwig
2017-08-17 16:08 ` [PATCH 06/13] dax: Inline dax_insert_mapping() into the callsite Jan Kara
2017-08-18 22:10   ` Ross Zwisler
2017-08-23 18:31   ` Christoph Hellwig
2017-08-17 16:08 ` [PATCH 07/13] dax: Inline dax_pmd_insert_mapping() " Jan Kara
2017-08-18 22:12   ` Ross Zwisler
2017-08-23 18:32   ` Christoph Hellwig
2017-08-17 16:08 ` [PATCH 08/13] dax: Fix comment describing dax_iomap_fault() Jan Kara
2017-08-18 22:12   ` Ross Zwisler
2017-08-23 18:32   ` Christoph Hellwig
2017-08-17 16:08 ` [PATCH 09/13] dax: Allow dax_iomap_fault() to return pfn Jan Kara
2017-08-21 18:45   ` Ross Zwisler
2017-08-23 18:34   ` Christoph Hellwig
2017-08-24  7:26     ` Jan Kara
2017-08-17 16:08 ` [PATCH 10/13] mm: Wire up MAP_SYNC Jan Kara
2017-08-21 21:37   ` Ross Zwisler
2017-08-22  9:36     ` Jan Kara
2017-08-21 21:57   ` Ross Zwisler
2017-08-22  9:34     ` Jan Kara
2017-08-22 17:27     ` Dan Williams
2017-08-23 18:43   ` Christoph Hellwig
2017-08-24  7:16     ` Jan Kara
2017-08-17 16:08 ` [PATCH 11/13] dax, iomap: Add support for synchronous faults Jan Kara
2017-08-21 18:58   ` Ross Zwisler
2017-08-22  9:46     ` Jan Kara
2017-08-21 21:09   ` Ross Zwisler
2017-08-22 10:08     ` Jan Kara
2017-08-24 12:27   ` Christoph Hellwig
2017-08-24 12:34     ` Jan Kara
2017-08-24 13:38       ` Christoph Hellwig
2017-08-24 16:45         ` Jan Kara
2017-08-17 16:08 ` [PATCH 12/13] dax: Implement dax_insert_pfn_mkwrite() Jan Kara
2017-08-21 19:01   ` Ross Zwisler
2017-08-17 16:08 ` [PATCH 13/13] ext4: Support for synchronous DAX faults Jan Kara
2017-08-21 19:19   ` Ross Zwisler
2017-08-22 10:18     ` Jan Kara
2017-08-23 18:37   ` Christoph Hellwig
2017-08-24  7:18     ` Jan Kara
2017-08-24 12:31   ` Christoph Hellwig
2017-08-24 12:34     ` Christoph Hellwig
2017-08-24 12:36     ` Jan Kara
