linux-mm.kvack.org archive mirror
* [GIT PULL] dax final updates and fixes for 4.10-rc2
From: Williams, Dan J @ 2017-01-01  2:57 UTC
  To: torvalds; +Cc: linux-mm, linux-nvdimm, jack, linux-fsdevel

Hi Linus, please pull from:

  git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm libnvdimm-fixes

...to receive the completion of Jan's DAX work for 4.10.

As I mentioned in the libnvdimm-for-4.10 pull request [1], these are
some final fixes for the DAX dirty-cacheline-tracking invalidation work
that was merged through the -mm, ext4, and xfs trees in -rc1. These
patches were prepared prior to the merge window, but we waited for
4.10-rc1 to have a stable merge base after all the prerequisites were
merged.

Quoting Jan on the overall changes in these patches:

    So I'd like all these 6 patches to go for rc2. The first three
    patches fix invalidation of exceptional DAX entries (a bug that has
    been there for a long time) - without these patches data loss can
    occur on power failure even though the user called fsync(2). The
    other three patches change locking of DAX faults so that
    ->iomap_begin() is called in a more relaxed locking context and we
    are safe to start a transaction there for ext4.

These have received a build success notification from the kbuild robot,
and pass the latest libnvdimm unit tests. There have not been any -next
releases since -rc1, so they have not appeared there.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2016-December/008279.html

---

The following changes since commit 7ce7d89f48834cefece7804d38fc5d85382edf77:

  Linux 4.10-rc1 (2016-12-25 16:13:08 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm libnvdimm-fixes

for you to fetch changes up to 1db175428ee374489448361213e9c3b749d14900:

  ext4: Simplify DAX fault path (2016-12-26 20:29:25 -0800)

----------------------------------------------------------------
Jan Kara (6):
      ext2: Return BH_New buffers for zeroed blocks
      mm: Invalidate DAX radix tree entries only if appropriate
      dax: Avoid page invalidation races and unnecessary radix tree traversals
      dax: Finish fault completely when loading holes
      dax: Call ->iomap_begin without entry lock during dax fault
      ext4: Simplify DAX fault path

 fs/dax.c            | 243 +++++++++++++++++++++++++++++++++-------------------
 fs/ext2/inode.c     |   3 +-
 fs/ext4/file.c      |  48 +++--------
 include/linux/dax.h |   3 +
 mm/truncate.c       |  75 +++++++++++++---
 5 files changed, 229 insertions(+), 143 deletions(-)

commit e568df6b84ff05a22467503afc11bee7a6ba0700
Author: Jan Kara <jack@suse.cz>
Date:   Wed Aug 10 16:42:53 2016 +0200

    ext2: Return BH_New buffers for zeroed blocks
    
    So far we did not return BH_New buffers from ext2_get_blocks() when we
    allocated and zeroed-out a block for a DAX inode, to avoid racy zeroing
    in DAX code. This zeroing is gone these days so we can remove the
    workaround.
    
    Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>

commit c6dcf52c23d2d3fb5235cec42d7dd3f786b87d55
Author: Jan Kara <jack@suse.cz>
Date:   Wed Aug 10 17:22:44 2016 +0200

    mm: Invalidate DAX radix tree entries only if appropriate
    
    Currently invalidate_inode_pages2_range() and invalidate_mapping_pages()
    just delete all exceptional radix tree entries they find. For DAX this
    is not desirable as we track cache dirtiness in these entries and when
    they are evicted, we may not flush caches although it is necessary. This
    can, for example, manifest when we write to the same block both via mmap
    and via write(2) (to different offsets) and fsync(2) then does not
    properly flush CPU caches when the modification via write(2) was the
    last one.
    
    Create appropriate DAX functions to handle invalidation of DAX entries
    for invalidate_inode_pages2_range() and invalidate_mapping_pages() and
    wire them up into the corresponding mm functions.
    
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
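
The access pattern described in this commit message can be sketched from
user space. This is a minimal illustration only, not part of the patch:
the helper name mixed_write_then_fsync(), the 4096-byte block size, and
the offsets are assumptions. On a non-DAX file it merely exercises the
same sequence of calls; it cannot demonstrate the data loss itself, which
needs a DAX mount and a power failure after fsync(2).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): dirty one block through a
 * shared mapping, modify a different offset of the same block with
 * pwrite(2), then call fsync(2). Returns 0 on success, -1 on any error.
 */
int mixed_write_then_fsync(const char *path)
{
    const size_t block = 4096;     /* assumed block size */
    const off_t write_off = 2048;  /* different offset, same block */
    char *map;
    int ret = -1;
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)block) != 0)
        goto out_close;

    map = mmap(NULL, block, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out_close;

    /* 1. Store via mmap; on DAX the dirtiness is tracked in the
     *    radix tree entry for this block. */
    memcpy(map, "mmap", 4);

    /* 2. write(2)-style modification of the same block, made last. */
    if (pwrite(fd, "write", 5, write_off) != 5)
        goto out_unmap;

    /* 3. fsync(2) is expected to make both modifications durable. */
    if (fsync(fd) != 0)
        goto out_unmap;

    /* Sanity check: both stores are visible through the mapping. */
    if (memcmp(map, "mmap", 4) == 0 &&
        memcmp(map + write_off, "write", 5) == 0)
        ret = 0;

out_unmap:
    munmap(map, block);
out_close:
    close(fd);
    return ret;
}
```

To try it, call the helper from a small main() with a path on the
filesystem under test, for example a file on a DAX-mounted ext4 or xfs.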

commit e3fce68cdbed297d927e993b3ea7b8b1cee545da
Author: Jan Kara <jack@suse.cz>
Date:   Wed Aug 10 17:10:28 2016 +0200

    dax: Avoid page invalidation races and unnecessary radix tree traversals
    
    Currently dax_iomap_rw() takes care of invalidating page tables and
    evicting hole pages from the radix tree when write(2) to the file
    happens. This invalidation is only necessary when there is some block
    allocation resulting from write(2). Furthermore, in its current place
    the invalidation is racy with respect to a page fault instantiating a
    hole page just after we have invalidated it.
    
    So perform the page invalidation inside dax_iomap_actor() where we can
    do it only when really necessary and after blocks have been allocated so
    nobody will be instantiating new hole pages anymore.
    
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>

commit f449b936f1aff7696b24a338f493d5cee8d48d55
Author: Jan Kara <jack@suse.cz>
Date:   Wed Oct 19 14:48:38 2016 +0200

    dax: Finish fault completely when loading holes
    
    The only case when we do not finish the page fault completely is when we
    are loading hole pages into a radix tree. Avoid this special case and
    finish the fault in that case as well inside the DAX fault handler. This
    will allow for easier iomap handling.
    
    Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>

commit 9f141d6ef6258a3a37a045842d9ba7e68f368956
Author: Jan Kara <jack@suse.cz>
Date:   Wed Oct 19 14:34:31 2016 +0200

    dax: Call ->iomap_begin without entry lock during dax fault
    
    Currently the ->iomap_begin() handler is called with the entry lock
    held. If the filesystem held any locks between ->iomap_begin() and
    ->iomap_end() (such as ext4, which will want to hold a transaction
    open), this would cause lock inversion with the iomap_apply() from the
    standard IO path, which first calls ->iomap_begin() and only then calls
    the ->actor() callback, which grabs entry locks for DAX (if it faults
    when copying from/to user provided buffers).
    
    Fix the problem by nesting the grabbing of the entry lock inside the
    ->iomap_begin() / ->iomap_end() pair.
    
    Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
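
The lock inversion described in this commit message can be modelled with
a toy lock-order checker. This is a sketch under stated assumptions, not
kernel code: LOCK_TRANSACTION and LOCK_ENTRY are hypothetical stand-ins
for the ext4 transaction and the DAX radix tree entry lock, and the
checker only detects pairwise (ABBA) inversions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the two resources named in the commit. */
enum lock_id { LOCK_TRANSACTION, LOCK_ENTRY, LOCK_COUNT };

struct order_graph {
    /* before[a][b] is true if some path held a while taking b. */
    bool before[LOCK_COUNT][LOCK_COUNT];
};

/* Record the acquisition order used by one code path. */
static void record_path(struct order_graph *g,
                        const enum lock_id *locks, int n)
{
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            g->before[locks[i]][locks[j]] = true;
}

/* True if any pair of locks is taken in both orders. */
static bool has_inversion(const struct order_graph *g)
{
    for (int a = 0; a < LOCK_COUNT; a++)
        for (int b = a + 1; b < LOCK_COUNT; b++)
            if (g->before[a][b] && g->before[b][a])
                return true;
    return false;
}

/*
 * Compare the standard IO path with the fault path.
 * fault_takes_entry_first == true models the old fault path, where
 * ->iomap_begin() ran with the entry lock already held.
 */
bool dax_order_conflicts(bool fault_takes_entry_first)
{
    /* IO path: ->iomap_begin() first, then ->actor() takes the entry. */
    const enum lock_id io_path[] = { LOCK_TRANSACTION, LOCK_ENTRY };
    const enum lock_id old_fault[] = { LOCK_ENTRY, LOCK_TRANSACTION };
    const enum lock_id new_fault[] = { LOCK_TRANSACTION, LOCK_ENTRY };
    struct order_graph g;

    memset(&g, 0, sizeof(g));
    record_path(&g, io_path, 2);
    record_path(&g, fault_takes_entry_first ? old_fault : new_fault, 2);
    return has_inversion(&g);
}
```

In this model dax_order_conflicts(true) reports the inversion of the old
ordering, while dax_order_conflicts(false) reports none once the entry
lock nests inside the ->iomap_begin() / ->iomap_end() pair.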

commit 1db175428ee374489448361213e9c3b749d14900
Author: Jan Kara <jack@suse.cz>
Date:   Fri Oct 21 11:33:49 2016 +0200

    ext4: Simplify DAX fault path
    
    Now that dax_iomap_fault() calls ->iomap_begin() without the entry
    lock, we can start the transaction in ext4_iomap_begin() and thus
    simplify ext4_dax_fault(). This also gives us proper retries in case
    of ENOSPC.
    
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>