linux-fsdevel.vger.kernel.org archive mirror
* [PATCH 0/15 v2] dax: Clear dirty bits after flushing caches
@ 2016-07-22 12:19 Jan Kara
From: Jan Kara @ 2016-07-22 12:19 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-fsdevel, linux-nvdimm, Dan Williams, Ross Zwisler, Jan Kara

Hello,

this is the second revision of my patches to clear dirty bits from the radix
tree of DAX inodes once caches for the corresponding pfns have been flushed.
This patch set is significantly larger than the previous version because I'm
changing how ->fault, ->page_mkwrite, and ->pfn_mkwrite handlers may choose to
handle the fault so that we don't have to leak details about DAX locking into
the generic code. In principle, these patches enable handlers to easily update
PTEs and do other work necessary to finish the fault without duplicating the
functionality present in the generic code. I'd be really interested in feedback
from mm folks on whether such changes to the fault handling code are fine or
what they'd do differently...

Changes since v1:
* make sure all PTE updates happen under radix tree entry lock to protect
  against races between faults & write-protecting code
* remove information about DAX locking from mm/memory.c
* smaller updates based on Ross' feedback

----
Background information regarding the motivation:

Currently we never clear dirty bits in the radix tree of a DAX inode. Thus
fsync(2) flushes all the dirty pfns again and again. These patches implement
clearing of the dirty tag in the radix tree so that we issue a flush only when
needed.

The difficulty with clearing the dirty tag is that we have to protect against
a concurrent page fault setting the dirty tag and writing new data into the
page. So we need a lock that serializes page faults against clearing of the
dirty tag and write-protecting of PTEs (PTEs are write-protected so that we get
another page fault when the pfn is written to again and can set the dirty tag
again).
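
The serialization described above can be illustrated with a small userspace
toy model. This is only a sketch under stated assumptions, not the kernel
code: struct toy_entry and the toy_* functions are hypothetical names, a
pthread mutex stands in for the radix tree entry lock, and a counter stands in
for the actual cache flush.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for one DAX radix tree entry. */
struct toy_entry {
	pthread_mutex_t lock;	/* plays the role of the entry lock */
	bool dirty;		/* plays the role of the dirty tag */
	bool pte_writable;	/* whether the mapping currently allows writes */
	unsigned int flushes;	/* number of (simulated) cache flushes */
};

void toy_entry_init(struct toy_entry *e)
{
	pthread_mutex_init(&e->lock, NULL);
	e->dirty = false;
	e->pte_writable = false;
	e->flushes = 0;
}

/* Write fault: set the dirty tag and make the PTE writable under the lock. */
void toy_write_fault(struct toy_entry *e)
{
	pthread_mutex_lock(&e->lock);
	e->dirty = true;
	e->pte_writable = true;
	pthread_mutex_unlock(&e->lock);
}

/*
 * fsync: only dirty entries are flushed. Write-protecting, flushing and
 * clearing the tag all happen under the same lock, so a concurrent write
 * fault either runs completely before or completely after this section.
 */
void toy_fsync(struct toy_entry *e)
{
	pthread_mutex_lock(&e->lock);
	if (e->dirty) {
		e->pte_writable = false;	/* next write must fault again */
		e->flushes++;			/* stands in for the cache flush */
		e->dirty = false;
	}
	pthread_mutex_unlock(&e->lock);
}
```

In this model a second toy_fsync() with no intervening toy_write_fault() does
no flushing, which is the behaviour the patch set aims for.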

The effect of the patch set is easily visible:

Writing 1 GB of data via mmap, then fsync twice.

Before this patch set, both fsyncs take ~205 ms on my test machine. After the
patch set, the first fsync takes ~283 ms (the additional cost of walking PTEs,
clearing dirty bits etc. is very noticeable) and the second fsync takes below
1 us.
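
A test of this shape could look roughly like the following userspace sketch.
It is not the program used for the numbers above; the helper names
timed_fsync_ms() and mmap_write_fsync_twice() are hypothetical, and the file
path and size are up to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Milliseconds taken by one fsync(fd) call, or -1.0 if fsync fails. */
static double timed_fsync_ms(int fd)
{
	struct timespec a, b;

	clock_gettime(CLOCK_MONOTONIC, &a);
	if (fsync(fd) != 0)
		return -1.0;
	clock_gettime(CLOCK_MONOTONIC, &b);
	return (b.tv_sec - a.tv_sec) * 1e3 + (b.tv_nsec - a.tv_nsec) / 1e6;
}

/*
 * Hypothetical helper: dirty len bytes of the file at path through a shared
 * mapping, then time two consecutive fsyncs. The two durations are stored in
 * ms[0] and ms[1]. Returns 0 on success, -1 on failure.
 */
int mmap_write_fsync_twice(const char *path, size_t len, double ms[2])
{
	char *p;
	int ret = -1;
	int fd = open(path, O_RDWR | O_CREAT, 0644);

	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) != 0)
		goto out_close;
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		goto out_close;

	memset(p, 0xab, len);		/* write through the mapping */
	ms[0] = timed_fsync_ms(fd);	/* has dirty data to flush */
	ms[1] = timed_fsync_ms(fd);	/* nothing was written in between */
	if (ms[0] >= 0.0 && ms[1] >= 0.0)
		ret = 0;

	munmap(p, len);
out_close:
	close(fd);
	return ret;
}
```

To approximate the measurement above, call it with a path on a DAX-mounted
filesystem and a len of 1 GB, then print ms[0] and ms[1]. On a non-DAX
filesystem the code still runs but exercises the page cache path instead.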

As a bonus, these patches make filesystem freezing for DAX filesystems
reliable because mappings are now properly write-protected while freezing the
fs.

Patches have passed xfstests for both xfs and ext4.

								Honza


Thread overview: 19+ messages
2016-07-22 12:19 [PATCH 0/15 v2] dax: Clear dirty bits after flushing caches Jan Kara
2016-07-22 12:19 ` [PATCH 01/15] mm: Create vm_fault structure earlier Jan Kara
2016-07-22 12:19 ` [PATCH 02/15] mm: Propagate original vm_fault into do_fault_around() Jan Kara
2016-07-22 12:19 ` [PATCH 03/15] mm: Add pmd and orig_pte fields to vm_fault Jan Kara
2016-07-22 12:19 ` [PATCH 04/15] mm: Allow full handling of COW faults in ->fault handlers Jan Kara
2016-07-22 12:19 ` [PATCH 05/15] mm: Factor out functionality to finish page faults Jan Kara
2016-07-22 12:19 ` [PATCH 06/15] mm: Move handling of COW faults into DAX code Jan Kara
2016-07-22 12:19 ` [PATCH 07/15] dax: Make cache flushing protected by entry lock Jan Kara
2016-07-22 12:19 ` [PATCH 08/15] mm: Export follow_pte() Jan Kara
2016-07-22 12:19 ` [PATCH 09/15] mm: Remove unnecessary vma->vm_ops check Jan Kara
2016-07-22 12:19 ` [PATCH 10/15] mm: Factor out common parts of write fault handling Jan Kara
2016-07-22 12:19 ` [PATCH 11/15] mm: Move part of wp_page_reuse() into the single call site Jan Kara
2016-07-22 12:19 ` [PATCH 12/15] mm: Lift vm_fault structure creation from do_page_mkwrite() Jan Kara
2016-07-22 12:19 ` [PATCH 13/15] mm: Provide helper for finishing mkwrite faults Jan Kara
2016-08-09 14:50   ` [lkp] [mm] 0c649028cd: vm-scalability.throughput 343.9% improvement kernel test robot
2016-07-22 12:19 ` [PATCH 14/15] dax: Protect PTE modification on WP fault by radix tree entry lock Jan Kara
2016-07-25 21:30   ` Ross Zwisler
2016-07-26 14:09     ` Jan Kara
2016-07-22 12:19 ` [PATCH 15/15] dax: Clear dirty entry tags on cache flush Jan Kara
