From: Jan Kara <jack@suse.cz>
To: Matthew Wilcox <willy@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	Jan Kara <jack@suse.cz>, Dave Chinner <david@fromorbit.com>,
	Christoph Hellwig <hch@infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jan Kara <jack@suse.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-nvdimm <linux-nvdimm@ml01.01.org>
Subject: Re: [PATCH 2/2] dax: fix bdev NULL pointer dereferences
Date: Wed, 3 Feb 2016 12:09:03 +0100
Message-ID: <20160203110903.GG12574@quack.suse.cz>
In-Reply-To: <20160202184643.GA3260@linux.intel.com>

On Tue 02-02-16 13:46:43, Matthew Wilcox wrote:
> On Tue, Feb 02, 2016 at 09:46:21AM -0800, Dan Williams wrote:
> > What about a super_operation?  That seems the right level, given
> > we're currently doing:
> > 
> > inode->i_sb->s_bdev
> > 
> > ...it does not seem terrible to instead do:
> > 
> > inode->i_sb->s_op->get_block()
> 
> The point is that filesystems have lots of different get_block operations,
> and the right one to use depends not just on the inode, but also upon
> what VFS function is being called, and in some filesystems the phase
> of the moon, or the file open flags (so even inode->i_ops->get_block is
> wrong; file->f_ops->get_block would be better, but of course we've lost
> that by the point we're doing writeback).

See what I wrote to Ross. I think this particular issue needs to be solved
by moving the flushing to the ->writepages() callback.

> I now realise that basing DAX around get_block & buffer_heads was a mistake.
> I think the Right Solution (not for 4.5) is to ask filesystems to populate
> the radix tree.  A flow somewhat like this:
> 
> 1. VFS or VM calls filesystem (eg ->fault())
> 2. Filesystem calls DAX (eg dax_fault())
> 3. DAX looks in radix tree, finds no information.
> 4. DAX calls (NEW!) mapping->a_ops->populate_pfns
> 5. Filesystem looks up its internal data structure (eg extent tree) and
>    calls dax_create_pfns() (see giant patch from yesterday, only instead of
>    passing a get_block_t, the filesystem has already filled in a bh which
>    describes the entire extent that this access happens to land in).
> 6. DAX continues to take care of calling bdev_direct_access() from
>    dax_create_pfns().
 
So I don't think that ->populate_pfns() is the right interface because it
doesn't really tell the filesystem what you want to do. It is essentially
like a get_blocks() callback, only you additionally ask the fs to fill in
the mapping information into the radix tree. So it has the same problems as
a get_blocks() callback in inode_operations (or superblock_operations,
aops, or anywhere else). History has taught us (there was a get_blocks()
callback in inode operations in ancient times ;) that the fs really needs
to know the wider context to decide how exactly to fulfill the request.
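
To make the objection concrete, here is a rough userspace mock of the
quoted six-step flow. It is only a sketch: every type and function name
below (struct mapping, mock_dax_fault(), mock_fs_populate_pfns(),
mock_dax_create_pfns()) is hypothetical, the "radix tree" is a plain
array, and the "extent tree" is a single extent. Note that the callback
is handed nothing but (mapping, index), so it cannot tell which VFS
operation triggered the lookup.

```c
/* Userspace mock only: none of this is real kernel code or a real API. */
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 16      /* toy "radix tree": one slot per page index */
#define NO_PFN   0UL     /* 0 means "no mapping information cached" */

struct mapping;

/* Hypothetical callback from step 4. It sees only (mapping, index). */
typedef int (*populate_pfns_t)(struct mapping *m, unsigned long index);

struct mapping {
	unsigned long pfns[NR_SLOTS];    /* stand-in for the radix tree */
	populate_pfns_t populate_pfns;   /* stand-in for a_ops->populate_pfns */
	unsigned long extent_start;      /* toy "extent tree": one extent */
	unsigned long extent_len;
	unsigned long extent_first_pfn;
	int populate_calls;              /* counts callbacks into the fs */
};

/* Steps 5-6: record a pfn for every page of the extent. */
void mock_dax_create_pfns(struct mapping *m, unsigned long start,
			  unsigned long len, unsigned long first_pfn)
{
	for (unsigned long i = 0; i < len && start + i < NR_SLOTS; i++)
		m->pfns[start + i] = first_pfn + i;
}

/* Step 5: the filesystem consults its own extent data. */
int mock_fs_populate_pfns(struct mapping *m, unsigned long index)
{
	m->populate_calls++;
	if (index < m->extent_start ||
	    index >= m->extent_start + m->extent_len)
		return -1;   /* hole: nothing to map */
	mock_dax_create_pfns(m, m->extent_start, m->extent_len,
			     m->extent_first_pfn);
	return 0;
}

/* Steps 2-4: look in the tree first, call the fs only on a miss. */
unsigned long mock_dax_fault(struct mapping *m, unsigned long index)
{
	if (index >= NR_SLOTS)
		return NO_PFN;
	if (m->pfns[index] == NO_PFN && m->populate_pfns(m, index) != 0)
		return NO_PFN;
	return m->pfns[index];
}
```

With one extent covering indexes 4-6, the first fault in that range calls
into the filesystem once and later faults in the same extent are served
from the array. What the mock cannot express is exactly the missing piece:
whether the caller is a read fault, a write fault, or writeback.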

I don't see anything obviously wrong with using the radix tree as the
primary source of mapping information for DAX (after all we do that for
page cache as well, where the mapping information is attached to pages in
the radix tree). But this seems independent of the get_blocks() vs
something else discussion.

And if your problem is with the vaguely defined meaning of buffer_head
flags returned from the get_blocks() callback, using the iomap interface
(which XFS currently uses for pNFS) would solve that.
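
As a loose illustration of that last point, the sketch below contrasts a
result described by independent flag bits with one described by a single
explicit type. It is not the real buffer_head or iomap definition: all
names (MOCK_BH_*, MOCK_EXT_*, struct mock_extent) are made up for this
example, and the precedence rule in mock_classify_flags() is an assumption
of the mock.

```c
/* Userspace mock only: not the kernel's buffer_head or iomap definitions. */
#include <assert.h>

/* Hypothetical flag bits: any combination can be set at once. */
enum mock_bh_flags {
	MOCK_BH_MAPPED    = 1 << 0,
	MOCK_BH_NEW       = 1 << 1,
	MOCK_BH_UNWRITTEN = 1 << 2,
};

/* Hypothetical extent types: exactly one applies to a given range. */
enum mock_extent_type {
	MOCK_EXT_HOLE,
	MOCK_EXT_MAPPED,
	MOCK_EXT_UNWRITTEN,
};

/* A range described by an explicit type instead of flag bits. */
struct mock_extent {
	unsigned long long offset;   /* byte offset in the file */
	unsigned long long length;   /* length of the range in bytes */
	unsigned long long block;    /* first block, if the range is backed */
	enum mock_extent_type type;
};

/*
 * With flags, the caller has to pick a precedence for combinations such
 * as MAPPED|UNWRITTEN. The order used here is an assumption of this mock.
 */
enum mock_extent_type mock_classify_flags(unsigned int flags)
{
	if (flags & MOCK_BH_UNWRITTEN)
		return MOCK_EXT_UNWRITTEN;
	if (flags & MOCK_BH_MAPPED)
		return MOCK_EXT_MAPPED;
	return MOCK_EXT_HOLE;
}

/* With an explicit type there is nothing to interpret. */
int mock_extent_is_unwritten(const struct mock_extent *e)
{
	return e->type == MOCK_EXT_UNWRITTEN;
}
```

Three flag bits allow eight combinations, and each caller has to decide
what the odd ones mean; a single type field leaves only the states the
filesystem chose to name.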

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
