All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jared Hulbert <jaredeh@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>, Matthew Wilcox <willy@linux.intel.com>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	Christoph Hellwig <hch@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dan Williams <dan.j.williams@intel.com>, Jan Kara <jack@suse.com>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	linux-nvdimm <linux-nvdimm@ml01.01.org>
Subject: Re: [PATCH 2/2] dax: fix bdev NULL pointer dereferences
Date: Mon, 1 Feb 2016 22:06:28 -0800	[thread overview]
Message-ID: <CA+ZsKJ7RVX6EiMoPd62pOjgKWuSxizn8nuZwVJHqNwx3gbHrJg@mail.gmail.com> (raw)
In-Reply-To: <20160201214730.GR20456@dastard>

On Mon, Feb 1, 2016 at 1:47 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Mon, Feb 01, 2016 at 03:51:47PM +0100, Jan Kara wrote:
>> On Sat 30-01-16 00:28:33, Matthew Wilcox wrote:
>> > On Fri, Jan 29, 2016 at 11:28:15AM -0700, Ross Zwisler wrote:
>> > > I guess I need to go off and understand if we can have DAX mappings on such a
>> > > device.  If we can, we may have a problem - we can get the block_device from
>> > > get_block() in I/O path and the various fault paths, but we don't have access
>> > > to get_block() when flushing via dax_writeback_mapping_range().  We avoid
>> > > needing it the normal case by storing the sector results from get_block() in
>> > > the radix tree.
>> >
>> > I think we're doing it wrong by storing the sector in the radix tree; we'd
>> > really need to store both the sector and the bdev which is too much data.
>> >
>> > If we store the PFN of the underlying page instead, we don't have this
>> > problem.  Instead, we have a different problem; of the device going
>> > away under us.  I'm trying to find the code which tears down PTEs when
>> > the device goes away, and I'm not seeing it.  What do we do about user
>> > mappings of the device?
>>
>> So I don't have a strong opinion whether storing PFN or sector is better.
>> Maybe PFN is somewhat more generic but OTOH turning DAX off for special
>> cases like inodes on XFS RT devices would be IMHO fine.
>
> We need to support alternate devices.

Embedded devices trying to use NOR Flash to free up RAM was
historically one of the more prevalent real world uses of the old
filemap_xip.c code although the users never made it to mainline.  So I
spent some time last week trying to figure out how to make a subset of
DAX not depend on CONFIG_BLOCK.  It was a very frustrating and
unfruitful experience.  I discarded my main conclusion as impractical,
but now that I see the difficultly DAX faces in dealing with
"alternate devices" especially some of the crazy stuff btrfs can do, I
wonder if it's not so crazy after all.

Lets stop calling bdev_direct_access() directly from DAX.  Let the
filesystems do it.

Sure we could enable generic_dax_direct_access() helper for the
filesystems that only support single devices to make it easy.  But XFS
and btrfs for example, have to do the work of figuring out what bdev
is required and then calling bdev_direct_access().

My reasoning is that the filesystem knows how to map inodes and
offsets to devices and sectors, no matter how complex that is.  It
would even enable a filesystem to intelligently use a mix of
direct_access and regular block devices down the road.  Of course it
would also make the block-less solution doable.

Good idea?  Stupid idea?

  reply	other threads:[~2016-02-02  6:06 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-28 19:35 [PATCH 1/2] block: fix pfn_mkwrite() DAX fault handler Ross Zwisler
2016-01-28 19:35 ` Ross Zwisler
2016-01-28 19:35 ` [PATCH 2/2] dax: fix bdev NULL pointer dereferences Ross Zwisler
2016-01-28 19:35   ` Ross Zwisler
2016-01-28 20:21   ` Dan Williams
2016-01-28 20:21     ` Dan Williams
2016-01-28 21:38   ` Christoph Hellwig
2016-01-29 18:28     ` Ross Zwisler
2016-01-29 23:34       ` Ross Zwisler
2016-01-30  0:18         ` Dan Williams
2016-01-31 22:44         ` Dave Chinner
2016-01-30  5:28       ` Matthew Wilcox
2016-01-30  6:01         ` Dan Williams
2016-01-30  7:08           ` Jared Hulbert
2016-01-31  2:32           ` Matthew Wilcox
2016-01-31  6:12             ` Ross Zwisler
2016-01-31 10:55               ` Matthew Wilcox
2016-01-31 16:38                 ` Dan Williams
2016-01-31 18:07                   ` Matthew Wilcox
2016-01-31 18:18                     ` Dan Williams
2016-01-31 18:27                       ` Matthew Wilcox
2016-01-31 18:50                         ` Dan Williams
2016-01-31 19:51                     ` Dan Williams
2016-02-01 13:44             ` Matthew Wilcox
2016-02-01 14:51         ` Jan Kara
2016-02-01 20:49           ` Matthew Wilcox
2016-02-01 21:47           ` Dave Chinner
2016-02-02  6:06             ` Jared Hulbert [this message]
2016-02-02  6:46               ` Dan Williams
2016-02-02  8:05                 ` Jared Hulbert
2016-02-02 16:51                   ` Dan Williams
2016-02-02 21:46                     ` Jared Hulbert
2016-02-03  0:34                       ` Matthew Wilcox
2016-02-03  1:21                         ` Jared Hulbert
2016-02-02 11:17             ` Jan Kara
2016-02-02 16:33               ` Dan Williams
2016-02-02 16:46                 ` Jan Kara
2016-02-02 17:10                   ` Dan Williams
2016-02-02 17:34                     ` Ross Zwisler
2016-02-02 17:46                       ` Dan Williams
2016-02-02 17:47                         ` Dan Williams
2016-02-02 18:24                           ` Ross Zwisler
2016-02-02 18:46                         ` Matthew Wilcox
2016-02-02 18:59                           ` Dan Williams
2016-02-02 20:14                             ` Matthew Wilcox
2016-02-03 11:09                           ` Jan Kara
2016-02-03 10:46                       ` Jan Kara
2016-02-03 20:13                         ` Ross Zwisler
2016-02-04  9:15                           ` Jan Kara
2016-02-04 23:38                             ` Ross Zwisler
2016-02-06 23:15                             ` Dave Chinner
2016-02-07  5:27                               ` Ross Zwisler
2016-02-04 19:56                         ` Ross Zwisler
2016-02-04 20:29                           ` Jan Kara
2016-02-04 22:19                             ` Ross Zwisler
2016-02-05 22:25                             ` Ross Zwisler
2016-02-06 23:40                               ` Dave Chinner
2016-02-07  6:43                                 ` Ross Zwisler
2016-02-08 13:48                                   ` Jan Kara
2016-02-07  8:38                               ` Christoph Hellwig
2016-02-08 15:55                                 ` Ross Zwisler
2016-02-02 18:41               ` Ross Zwisler
2016-02-02 18:53                 ` Ross Zwisler
2016-02-02  0:02     ` Ross Zwisler
2016-02-02  7:10       ` Dave Chinner
2016-02-02 10:34       ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA+ZsKJ7RVX6EiMoPd62pOjgKWuSxizn8nuZwVJHqNwx3gbHrJg@mail.gmail.com \
    --to=jaredeh@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@ml01.01.org \
    --cc=ross.zwisler@linux.intel.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.