From: Matthew Wilcox <mawilcox@microsoft.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Christoph Hellwig <hch@infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"linux-nvdimm@ml01.01.org" <linux-nvdimm@ml01.01.org>,
	Dave Chinner <david@fromorbit.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	"Jan Kara" <jack@suse.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Subject: RE: [PATCH v2 2/9] ext2: tell DAX the size of allocation holes
Date: Sat, 10 Sep 2016 08:15:06 +0000
Message-ID: <DM2PR21MB00899C835BC0AF476B6683CDCBFD0@DM2PR21MB0089.namprd21.prod.outlook.com>
In-Reply-To: <CAPcyv4hjna08+Yw23w_V2f-RbBE6ar220+YGCuBVA-TACKWNug@mail.gmail.com>

From: Dan Williams [mailto:dan.j.williams@intel.com]
> /me grumbles about top-posting...

Let's see if this does any better .. there are lots of new features, but I don't see a 'wrap lines at 80 columns' option.  Unfortunately.

> On Fri, Sep 9, 2016 at 1:35 PM, Matthew Wilcox <mawilcox@microsoft.com>
> wrote:
> > I thought after Storage Summit, we had broad agreement that we were
> > moving to a primary DAX API that was not BH (nor indeed iomap) based.  We
> > would still have DAX helpers for block based filesystems (because duplicating
> > all that code between filesystems is pointless), but I now know of three
> > filesystems which are not block based that are interested in using DAX.  Jared
> > Hulbert's AXFS is a nice public example.
> >
> > I posted a prototype of this here:
> >
> > https://groups.google.com/d/msg/linux.kernel/xFFHVCQM7Go/ZQeDVYTnFgAJ
> >
> > It is, of course, woefully out of date, but some of the principles in it are
> > still good (and I'm working to split it into digestible chunks).
> >
> > The essence:
> >
> > 1. VFS or VM calls filesystem (eg ->fault())
> > 2. Filesystem calls DAX (eg dax_fault())
> > 3. DAX looks in radix tree, finds no information.
> > 4. DAX calls (NEW!) mapping->a_ops->populate_pfns
> > 5a. Filesystem (if not block based) does its own thing to find out the PFNs
> >     corresponding to the requested range, then inserts them into the radix
> >     tree (possible helper in DAX code)
> > 5b. Filesystem (if block based) looks up its internal data structure (eg
> >     extent tree) and calls dax_create_pfns() (see giant patch from yesterday,
> >     only instead of passing a get_block_t, the filesystem has already filled
> >     in a bh which describes the entire extent that this access happens to
> >     land in).
> > 6b. DAX takes care of calling bdev_direct_access() from dax_create_pfns().
> >
> > Now, notice that there's no interaction with the rest of the filesystem here.
> > We can swap out BHs and iomaps relatively trivially; there's no call for
> > making grand changes, like converting ext2 over to iomap.  The BH or iomap is
> > only used for communicating the extent from the filesystem to DAX.
> >
> > Do we have agreement that this is the right way to go?
> 
> My $0.02...
> 
> So the current dax implementation is still struggling to get right (pmd faulting,
> dirty entry cleaning, etc) and this seems like a rewrite that sets us up for future
> features without addressing the current bugs and todo items.  In comparison
> the iomap conversion work seems incremental and conserving of current
> development momentum.

I believe your assessment is incorrect.  If converting the current DAX code to
use iomap forces converting ext2, then it's time to get rid of all the half-measures
currently in place.  You left off one todo item that this does get us a step closer to
fixing -- support for DMA to mmapped DAX files.  I think it also puts us in a better
position to fix the 2MB support, locking, and dirtiness tracking.  Oh, and it does
fix the multivolume support (because the sectors in the radix tree could be
interpreted as being from the wrong volume).

> I agree with you that continuing to touch ext2 is not a good idea, but I'm not
> yet convinced that now is the time to go do dax-2.0 when we haven't finished
> shipping dax-1.0.

dax-1.0 died long ago ... I think we're up to at least DAX version 4 by now.
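
For anyone who wants to see the shape of steps 1-6b from the quoted proposal,
here is a rough userspace sketch.  It is not kernel code and not the prototype
linked above: only the names populate_pfns, dax_fault, dax_create_pfns and
bdev_direct_access come from this thread.  Every struct, the fixed-size array
standing in for the radix tree, the two toy filesystems, the four-page extent
layout and the sector-to-PFN arithmetic are assumptions made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 16   /* toy stand-in for the per-mapping radix tree */
#define NO_PFN   0UL  /* 0 means "nothing cached yet" in this sketch */

struct mapping;

/* Hypothetical ops table; only the populate_pfns name is from the thread. */
struct a_ops {
	int (*populate_pfns)(struct mapping *m, size_t index);
};

struct mapping {
	const struct a_ops *a_ops;
	unsigned long pfns[NR_SLOTS];  /* page index -> PFN */
	int populate_calls;            /* instrumentation for the sketch only */
};

/* Hypothetical extent descriptor, playing the role of the bh or iomap. */
struct extent {
	size_t first_index;            /* first page index covered */
	size_t nr_pages;               /* number of pages in the extent */
	unsigned long first_sector;    /* starting sector on the device */
};

/* Stand-in for bdev_direct_access(); the mapping below is invented. */
static unsigned long fake_direct_access(unsigned long sector)
{
	return 0x8000UL + sector / 8;  /* assumes 512-byte sectors, 4 KiB pages */
}

/* Steps 5b/6b: DAX turns a filesystem-supplied extent into cached PFNs. */
static int dax_create_pfns_sketch(struct mapping *m, const struct extent *ext)
{
	size_t i;

	if (ext->first_index > NR_SLOTS ||
	    ext->nr_pages > NR_SLOTS - ext->first_index)
		return -1;
	for (i = 0; i < ext->nr_pages; i++)
		m->pfns[ext->first_index + i] =
			fake_direct_access(ext->first_sector + 8 * i);
	return 0;
}

/* Step 5a: a non-block-based filesystem finds the PFN its own way. */
static int toyfs_populate_pfns(struct mapping *m, size_t index)
{
	m->populate_calls++;
	m->pfns[index] = 0x1000UL + index;  /* made-up linear layout */
	return 0;
}

/* Step 5b: a block-based filesystem describes the whole extent to DAX. */
static int blockfs_populate_pfns(struct mapping *m, size_t index)
{
	struct extent ext;

	m->populate_calls++;
	ext.first_index = index & ~(size_t)3;  /* pretend extents are 4 pages */
	ext.nr_pages = 4;
	ext.first_sector = 64 + 8 * (unsigned long)ext.first_index;
	return dax_create_pfns_sketch(m, &ext);
}

/* Steps 3-4: DAX checks its cache and asks the filesystem only on a miss. */
static int dax_fault_sketch(struct mapping *m, size_t index, unsigned long *pfn)
{
	assert(m && m->a_ops && pfn);
	if (index >= NR_SLOTS)
		return -1;
	if (m->pfns[index] == NO_PFN) {
		int err = m->a_ops->populate_pfns(m, index);
		if (err)
			return err;
		if (m->pfns[index] == NO_PFN)
			return -1;  /* filesystem had nothing for this index */
	}
	*pfn = m->pfns[index];
	return 0;
}

/* Steps 1-2: the filesystem's fault handler simply delegates to DAX. */
static int fs_fault_sketch(struct mapping *m, size_t index, unsigned long *pfn)
{
	return dax_fault_sketch(m, index, pfn);
}

static const struct a_ops toyfs_a_ops = { .populate_pfns = toyfs_populate_pfns };
static const struct a_ops blockfs_a_ops = { .populate_pfns = blockfs_populate_pfns };
```

The thing to notice is that the DAX side only ever sees the ops table and the
extent descriptor.  Swapping the extent struct for a bh or an iomap would not
touch either toy filesystem's lookup logic, which is the point being argued
above.  In this sketch a second fault inside an already-populated extent is
served from the cache without calling back into the filesystem.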

