Linux-Fsdevel Archive on lore.kernel.org
From: Ira Weiny <ira.weiny@intel.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-kernel@vger.kernel.org,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Dave Chinner <david@fromorbit.com>,
	"Theodore Y. Ts'o" <tytso@mit.edu>, Jan Kara <jack@suse.cz>,
	linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, akpm@linux-foundation.org,
	torvalds@linux-foundation.org
Subject: Re: [PATCH V5 00/12] Enable per-file/per-directory DAX operations V5
Date: Mon, 9 Mar 2020 10:04:37 -0700
Message-ID: <20200309170437.GA271052@iweiny-DESK2.sc.intel.com> (raw)
In-Reply-To: <20200305155144.GA5598@lst.de>

On Thu, Mar 05, 2020 at 04:51:44PM +0100, Christoph Hellwig wrote:
> FYI, I still will fully NAK any series that adds additional locks
> and thus atomic instructions to basically every fs call, and grows
> the inode by a rw_semaphore plus an atomic64_t.  I also think the
> whole idea of switching operation vectors at runtime is fatally flawed
> and we should never add such code, never mind just for a fringe usecase
> of a fringe feature.

Being new to this area of the kernel I'm not clear on the history...

It was my understanding that the per-file flag support was a requirement for
removing the experimental designation from DAX.  Is this still the case?

Ira

> 
> On Wed, Feb 26, 2020 at 09:24:30PM -0800, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Changes from V4:
> > 	* Open code the aops lock rather than add it to the xfs_ilock()
> > 	  subsystem (Darrick's comments were obsoleted by this change)
> > 	* Fix lkp build suggestions and bugs
> > 
> > Changes from V3:
> > 	* Remove global locking...  :-D
> > 	* put back per inode locking and remove premature optimizations
> > 	* Fix issues with Directories having IS_DAX() set
> > 	* Fix kernel crash issues reported by Jeff
> > 	* Add some clean up patches
> > 	* Consolidate diflags to iflags functions
> > 	* Update/add documentation
> > 	* Reorder/rename patches quite a bit
> > 
> > Changes from V2:
> > 
> > 	* Move i_dax_sem to be a global percpu_rw_sem rather than per inode
> > 		Internal discussions with Dan determined this would be easier,
> > 		just as performant, and slightly less overhead than having it
> > 		in the SB as suggested by Jan
> > 	* Fix locking order in comments and throughout code
> > 	* Change "mode" to "state" throughout commits
> > 	* Add CONFIG_FS_DAX wrapper to disable inode_[un]lock_state() when not
> > 		configured
> > 	* Add static branch which is activated by a device which supports
> > 		DAX in XFS
> > 	* Change "lock/unlock" to up/down read/write as appropriate
> > 		Previous names were oversimplified
> > 	* Update comments/documentation
> > 
> > 	* Move the xfs specific lock to the vfs (global) layer.
> > 	* Fix i_dax_sem locking order and comments
> > 
> > 	* Move 'i_mapped' count from struct inode to struct address_space and
> > 		rename it to mmap_count
> > 	* Add inode_has_mappings() call
> > 
> > 	* Fix build issues
> > 	* Clean up syntax spacing and minor issues
> > 	* Update man page text for STATX_ATTR_DAX
> > 	* Add reviewed-by's
> > 	* Rebase to 5.6
> > 
> > 	Rename patch:
> > 		from: fs/xfs: Add lock/unlock state to xfs
> > 		to: fs/xfs: Add write DAX lock to xfs layer
> > 	Add patch:
> > 		fs/xfs: Clarify lockdep dependency for xfs_isilocked()
> > 	Drop patch:
> > 		fs/xfs: Fix truncate up
> > 
> > 
> > At LSF/MM'19 [1] [2] we discussed applications that overestimate memory
> > consumption due to their inability to detect whether the kernel will
> > instantiate page cache for a file, and cases where a global dax enable via a
> > mount option is too coarse.
> > 
> > The following patch series enables selecting the use of DAX on individual files
> > and/or directories on xfs, and lays some groundwork to do so in ext4.  In this
> > scheme the dax mount option can be omitted to allow the per-file property to
> > take effect.
> > 
> > The insight at LSF/MM was to separate the per-mount or per-file "physical"
> > capability switch from an "effective" attribute for the file.
> > 
> > At LSF/MM we discussed the difficulties of switching the DAX state of a file
> > with active mappings / page cache.  It was thought the races could be avoided
> > by limiting DAX state flips to 0-length files.
> > 
> > However, this turns out to not be true.[3] This is because address space
> > operations (a_ops) may be in use at any time the inode is referenced and users
> > have expressed a desire to be able to change the DAX state on a file with data
> > in it.  For those reasons this patch set allows changing the DAX state flag on
> > a file as long as it is not currently mapped.
> > 
> > Details of when and how DAX state can be changed on a file are included in a
> > documentation patch.
> > 
> > It should be noted that the physical DAX flag inheritance is not shown in this
> > patch set as it was maintained from previous work on XFS.  The physical DAX
> > flag and its inheritance will need to be added to other file systems for user
> > control.
> > 
> > As submitted, this series works in testing on real hardware.
> > 
> > 
> > [1] https://lwn.net/Articles/787973/
> > [2] https://lwn.net/Articles/787233/
> > [3] https://lkml.org/lkml/2019/10/20/96
> > [4] https://patchwork.kernel.org/patch/11310511/
> > 
> > 
> > To: linux-kernel@vger.kernel.org
> > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Dave Chinner <david@fromorbit.com>
> > Cc: Christoph Hellwig <hch@lst.de>
> > Cc: "Theodore Y. Ts'o" <tytso@mit.edu>
> > Cc: Jan Kara <jack@suse.cz>
> > Cc: linux-ext4@vger.kernel.org
> > Cc: linux-xfs@vger.kernel.org
> > Cc: linux-fsdevel@vger.kernel.org
> > 
> > 
> > Ira Weiny (12):
> >   fs/xfs: Remove unnecessary initialization of i_rwsem
> >   fs: Remove unneeded IS_DAX() check
> >   fs/stat: Define DAX statx attribute
> >   fs/xfs: Isolate the physical DAX flag from enabled
> >   fs/xfs: Create function xfs_inode_enable_dax()
> >   fs: Add locking for a dynamic address space operations state
> >   fs: Prevent DAX state change if file is mmap'ed
> >   fs/xfs: Hold off aops users while changing DAX state
> >   fs/xfs: Clean up locking in dax invalidate
> >   fs/xfs: Allow toggle of effective DAX flag
> >   fs/xfs: Remove xfs_diflags_to_linux()
> >   Documentation/dax: Update Usage section
> > 
> >  Documentation/filesystems/dax.txt | 84 +++++++++++++++++++++++++-
> >  Documentation/filesystems/vfs.rst | 16 +++++
> >  fs/attr.c                         |  1 +
> >  fs/inode.c                        | 16 ++++-
> >  fs/iomap/buffered-io.c            |  1 +
> >  fs/open.c                         |  4 ++
> >  fs/stat.c                         |  5 ++
> >  fs/xfs/xfs_icache.c               |  5 +-
> >  fs/xfs/xfs_inode.h                |  2 +
> >  fs/xfs/xfs_ioctl.c                | 98 +++++++++++++++----------------
> >  fs/xfs/xfs_iops.c                 | 69 +++++++++++++++-------
> >  include/linux/fs.h                | 73 ++++++++++++++++++++++-
> >  include/uapi/linux/stat.h         |  1 +
> >  mm/fadvise.c                      |  7 ++-
> >  mm/filemap.c                      |  4 ++
> >  mm/huge_memory.c                  |  1 +
> >  mm/khugepaged.c                   |  2 +
> >  mm/mmap.c                         | 19 +++++-
> >  mm/util.c                         |  9 ++-
> >  19 files changed, 328 insertions(+), 89 deletions(-)
> > 
> > -- 
> > 2.21.0
> ---end quoted text---
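
For readers following the statx part of the quoted cover letter: the sketch
below shows how an application might ask whether a file is currently in the
DAX state, which is the detection problem raised at LSF/MM.  It is not code
from this series.  It assumes the STATX_ATTR_DAX bit value found in later
mainline uapi headers (0x00200000), which may differ from the value proposed
in V5, and a glibc that provides the statx() wrapper (2.28 or newer).  The
names dax_state_from_attrs(), file_dax_state() and the DAX_* result codes are
made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>      /* AT_FDCWD */
#include <stddef.h>
#include <stdint.h>
#include <sys/stat.h>   /* statx(), struct statx */

/*
 * Assumption: bit value as found in later mainline include/uapi/linux/stat.h.
 * Defined here only for build environments whose headers predate it.
 */
#ifndef STATX_ATTR_DAX
#define STATX_ATTR_DAX 0x00200000U
#endif

/* Hypothetical result codes for this example. */
enum {
	DAX_ERROR   = -2,	/* statx() itself failed */
	DAX_UNKNOWN = -1,	/* kernel does not report the attribute */
	DAX_OFF     = 0,	/* file goes through the page cache */
	DAX_ON      = 1,	/* file is currently in the DAX state */
};

/*
 * Interpret the attribute words returned by statx().  A set attribute bit is
 * trusted on its own; a clear bit only means "not DAX" when the kernel also
 * advertises the attribute in the mask.
 */
int dax_state_from_attrs(uint64_t attributes, uint64_t attributes_mask)
{
	if (attributes & STATX_ATTR_DAX)
		return DAX_ON;
	if (attributes_mask & STATX_ATTR_DAX)
		return DAX_OFF;
	return DAX_UNKNOWN;
}

/* Query one path and classify it with the helper above. */
int file_dax_state(const char *path)
{
	struct statx stx;

	assert(path != NULL);
	if (statx(AT_FDCWD, path, 0, STATX_BASIC_STATS, &stx) != 0)
		return DAX_ERROR;
	return dax_state_from_attrs(stx.stx_attributes,
				    stx.stx_attributes_mask);
}
```

An application sizing its memory budget could call file_dax_state() on each
data file and account for page cache only when the result is not DAX_ON.
Treating DAX_UNKNOWN the same as DAX_OFF is the conservative choice on
kernels that do not report the attribute.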
