Linux-NVDIMM Archive on lore.kernel.org
From: Vivek Goyal <vgoyal@redhat.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Jan Kara <jack@suse.cz>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Christoph Hellwig <hch@infradead.org>,
	Dave Chinner <david@fromorbit.com>,
	Miklos Szeredi <miklos@szeredi.hu>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	virtio-fs@redhat.com, Stefan Hajnoczi <stefanha@redhat.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 01/19] dax: remove block device dependencies
Date: Tue, 14 Jan 2020 16:28:05 -0500
Message-ID: <20200114212805.GB3145@redhat.com>
In-Reply-To: <CAPcyv4iXKFt207Pen+E1CnqCFtC1G85fxw5EXFVx+jtykGWMXA@mail.gmail.com>

On Tue, Jan 14, 2020 at 12:39:00PM -0800, Dan Williams wrote:
> On Tue, Jan 14, 2020 at 12:31 PM Vivek Goyal <vgoyal@redhat.com> wrote:
> >
> > On Thu, Jan 09, 2020 at 12:03:01PM -0800, Dan Williams wrote:
> > > On Thu, Jan 9, 2020 at 3:27 AM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > On Tue 07-01-20 10:49:55, Dan Williams wrote:
> > > > > On Tue, Jan 7, 2020 at 10:33 AM Vivek Goyal <vgoyal@redhat.com> wrote:
> > > > > > W.r.t partitioning, bdev_dax_pgoff() seems to be the pain point where
> > > > > > dax code refers back to block device to figure out partition offset in
> > > > > > dax device. If we create a dax object corresponding to "struct block_device"
> > > > > > and store sector offset in that, then we could pass that object to dax
> > > > > > code and not worry about referring back to bdev. I have written some
> > > > > > proof of concept code and called that object "dax_handle". I can post
> > > > > > that code if there is interest.
> > > > >
> > > > > I don't think it's worth it in the end especially considering
> > > > > filesystems are looking to operate on /dev/dax devices directly and
> > > > > remove block entanglements entirely.
> > > > >
> > > > > > IMHO, it feels useful to be able to partition and use a dax capable
> > > > > > block device in same way as non-dax block device. It will be really
> > > > > > odd to think that if filesystem is on /dev/pmem0p1, then dax can't
> > > > > > be enabled but if filesystem is on /dev/mapper/pmem0p1, then dax
> > > > > > will work.
> > > > >
> > > > > That can already happen today. If you do not properly align the
> > > > > partition then dax operations will be disabled. This proposal just
> > > > > extends that existing failure domain to make all partitions fail to
> > > > > support dax.
> > > >
> > > > Well, I have some sympathy with the sysadmin that has /dev/pmem0 device,
> > > > decides to create partitions on it for whatever (possibly misguided)
> > > > reason and then ponders why the hell DAX is not working? And PAGE_SIZE
> > > > partition alignment is so obvious and widespread that I don't count it as a
> > > > realistic error case sysadmins would be pondering about currently.
> > > >
> > > > So I'd find two options reasonably consistent:
> > > > 1) Keep status quo where partitions are created and support DAX.
> > > > 2) Stop partition creation altogether, if anyone wants to split pmem
> > > > device further, he can use dm-linear for that (i.e., kpartx).
> > > >
> > > > But I'm not sure if the ship hasn't already sailed for option 2) to be
> > > > feasible without angry users and Linus reverting the change.
> > >
> > > Christoph? I feel myself leaning more and more to the "keep pmem
> > > partitions" camp.
> > >
> > > I don't see "drop partition support" effort ending well given the long
> > > standing "ext4 fails to mount when dax is not available" precedent.
> > >
> > > I think the next least bad option is to have a dax_get_by_host()
> > > variant that passes an offset and length pair rather than requiring a
> > > later bdev_dax_pgoff() to recall the offset. This also prevents
> > > needing to add another dax-device object representation.
> >
> > I am wondering what the conclusion on this is. I want this to make
> > progress in some direction so that I can make progress on virtiofs DAX
> > support.
> 
> I think we should at least try to delete the partition support and see
> if anyone screams. Have a module option to revert the behavior so
> people are not stuck waiting for the revert to land, but if it stays
> quiet then we're in a better place with that support pushed out of the
> dax core.

Hi Dan,

So basically, keep the partition support code but disable it by default,
and let it be enabled by some knob, say a kernel command line option or
module option.

At what point will we remove that code completely? I mean, what if people
scream after two kernel releases, once we have already removed the code?

Also, from a distribution's perspective, we might not hear from our
customers for a very long time (until we backport that code into
existing releases or ship this new code in the next major release). From
that viewpoint, I would not like to break existing user-visible behavior.

How bad is it to keep partition support around? To me it feels reasonably
simple: we just have to store the offset into the dax device in another
dax object and pass that object around (instead of dax_device). If that's
the case, I am not sure why we should even venture in a direction where
some user's setup might be broken.
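
As a rough illustration of the "store the offset in another dax object"
idea, here is a minimal userspace sketch. It is not the proof-of-concept
code mentioned earlier in the thread. The struct layout, the function name
dax_handle_pgoff() and the fixed 4 KiB page size are all assumptions made
for illustration. The arithmetic is meant to roughly mirror what
bdev_dax_pgoff() computes, except that the partition start comes from the
handle rather than from the block device.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE     512u
#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Hypothetical object that carries the partition offset for dax code. */
struct dax_handle {
	uint64_t start_sect;	/* partition start, in 512-byte sectors */
};

/*
 * Translate a partition-relative sector into a page offset within the
 * whole dax device, using only the offset cached in the handle.
 * Returns 0 on success, -EINVAL if the range is not page aligned.
 */
int dax_handle_pgoff(const struct dax_handle *h, uint64_t sector,
		     size_t size, uint64_t *pgoff)
{
	uint64_t phys_off;

	assert(h != NULL);
	phys_off = (h->start_sect + sector) * SECTOR_SIZE;

	if (pgoff)
		*pgoff = phys_off / MODEL_PAGE_SIZE;
	if (phys_off % MODEL_PAGE_SIZE || size % MODEL_PAGE_SIZE)
		return -EINVAL;
	return 0;
}
```

With this shape, a caller would fill in start_sect once when the handle is
created, and later lookups would not need to refer back to the bdev. A
partition that does not start on a page boundary still fails with -EINVAL,
which matches the alignment failure case discussed above.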

Also, from an application's perspective, /dev/pmem is a block device, so it
should behave like a block device (including kernel partition table support).
From that view, dax looks like just an additional feature of that device
which can be enabled by passing the option "-o dax".

IOW, can we reconsider the idea of not supporting kernel partition tables
for dax capable block devices? I can only see downsides to removing kernel
partition table support, and the only upside seems to be a little cleanup
of dax core code.

Thanks
Vivek

