nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: jack@suse.cz, linux-nvdimm@lists.01.org,
	Dave Chinner <david@fromorbit.com>,
	linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH v6 14/15] xfs: prepare xfs_break_layouts() for another layout type
Date: Mon, 19 Mar 2018 10:45:59 -0700	[thread overview]
Message-ID: <20180319174559.GG1757@magnolia> (raw)
In-Reply-To: <152112916514.24669.8643877835071945330.stgit@dwillia2-desk3.amr.corp.intel.com>

On Thu, Mar 15, 2018 at 08:52:45AM -0700, Dan Williams wrote:
> When xfs is operating as the back-end of a pNFS block server, it prevents
> collisions between local and remote operations by requiring a lease to
> be held for remotely accessed blocks. Local filesystem operations break
> those leases before writing or mutating the extent map of the file.
> 
> A similar mechanism is needed to prevent operations on pinned dax
> mappings, like device-DMA, from colliding with extent unmap operations.
> 
> BREAK_WRITE and BREAK_TRUNCATE are introduced as two distinct levels of
> layout breaking.
> 
> Layouts are broken in the BREAK_WRITE case to ensure that layout-holders
> do not collide with local writes. Additionally, layouts are broken in
> the BREAK_TRUNCATE case to make sure the layout-holder has a consistent
> view of the file's extent map. While BREAK_WRITE breaks can be satisfied
> be recalling FL_LAYOUT leases, BREAK_TRUNCATE breaks additionally
> require waiting for busy dax-pages to go idle.
> 
> Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
> Reported-by: Dave Chinner <david@fromorbit.com>
> Reported-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  fs/xfs/xfs_file.c  |   23 +++++++++++++++++------
>  fs/xfs/xfs_inode.h |   18 ++++++++++++++++--
>  fs/xfs/xfs_ioctl.c |    2 +-
>  fs/xfs/xfs_iops.c  |    5 +++--
>  4 files changed, 37 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 5742d395a4e4..399c5221f101 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -352,7 +352,7 @@ xfs_file_aio_write_checks(
>  
>  	xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
>  	*iolock |= XFS_MMAPLOCK_EXCL;
> -	error = xfs_break_layouts(inode, iolock);
> +	error = xfs_break_layouts(inode, iolock, BREAK_WRITE);
>  	if (error) {
>  		*iolock &= ~XFS_MMAPLOCK_EXCL;
>  		xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
> @@ -762,7 +762,8 @@ xfs_file_write_iter(
>  int
>  xfs_break_layouts(
>  	struct inode		*inode,
> -	uint			*iolock)
> +	uint			*iolock,
> +	enum layout_break_reason reason)
>  {
>  	struct xfs_inode	*ip = XFS_I(inode);
>  	int			ret;
> @@ -770,9 +771,19 @@ xfs_break_layouts(
>  	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_SHARED | XFS_IOLOCK_EXCL
>  				| XFS_MMAPLOCK_EXCL));
>  
> -	ret = xfs_break_leased_layouts(inode, iolock);
> -	if (ret > 0)
> -		ret = 0;
> +	switch (reason) {
> +	case BREAK_TRUNCATE:
> +		/* fall through */
> +	case BREAK_WRITE:
> +		ret = xfs_break_leased_layouts(inode, iolock);
> +		if (ret > 0)
> +			ret = 0;
> +		break;
> +	default:
> +		ret = -EINVAL;
> +		break;
> +	}
> +
>  	return ret;
>  }
>  
> @@ -802,7 +813,7 @@ xfs_file_fallocate(
>  		return -EOPNOTSUPP;
>  
>  	xfs_ilock(ip, iolock);
> -	error = xfs_break_layouts(inode, &iolock);
> +	error = xfs_break_layouts(inode, &iolock, BREAK_TRUNCATE);
>  	if (error)
>  		goto out_unlock;
>  
> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> index 74c63f3a720f..1a66c7afcf45 100644
> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -379,6 +379,20 @@ static inline void xfs_ifunlock(struct xfs_inode *ip)
>  					>> XFS_ILOCK_SHIFT)
>  
>  /*
> + * Layouts are broken in the BREAK_WRITE case to ensure that
> + * layout-holders do not collide with local writes. Additionally,
> + * layouts are broken in the BREAK_TRUNCATE case to make sure the
> + * layout-holder has a consistent view of the file's extent map. While
> + * BREAK_WRITE breaks can be satisfied be recalling FL_LAYOUT leases,
> + * BREAK_TRUNCATE breaks additionally require waiting for busy dax-pages
> + * to go idle.
> + */
> +enum layout_break_reason {
> +        BREAK_WRITE,

I wonder if this belongs in a system header file since the same general
principle is going to apply to ext4 and others?  If this really is a
special xfs thing, then please use the "XFS_" prefix.

> +        BREAK_TRUNCATE,

The "truncate" part of the name misled me for a bit.  That operation is
really about "make sure there are no busy dax pages because we're about
to change some file pmem^Wblock mappings", right?  Truncation, hole
punching, reflinking, and zero_range (because it punches the range and
reallocates unwritten extents) all have to wait for busy dax pages to
become free before they can begin unmapping blocks.  So this isn't just
about truncation specifically; can I suggest "BREAK_UNMAPI" ?

(I also haven't figured out whether the same requirements apply to
things that *add* extent maps to the file, but I have to run to a
meeting in a few so wanted to get this email sent.)

--D

> +};
> +
> +/*
>   * For multiple groups support: if S_ISGID bit is set in the parent
>   * directory, group of new file is set to that of the parent, and
>   * new subdirectory gets S_ISGID bit from parent.
> @@ -447,8 +461,8 @@ int	xfs_zero_eof(struct xfs_inode *ip, xfs_off_t offset,
>  		     xfs_fsize_t isize, bool *did_zeroing);
>  int	xfs_zero_range(struct xfs_inode *ip, xfs_off_t pos, xfs_off_t count,
>  		bool *did_zero);
> -int	xfs_break_layouts(struct inode *inode, uint *iolock);
> -
> +int	xfs_break_layouts(struct inode *inode, uint *iolock,
> +		enum layout_break_reason reason);
>  
>  /* from xfs_iops.c */
>  extern void xfs_setup_inode(struct xfs_inode *ip);
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index d70a1919e787..847a67186d95 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -643,7 +643,7 @@ xfs_ioc_space(
>  		return error;
>  
>  	xfs_ilock(ip, iolock);
> -	error = xfs_break_layouts(inode, &iolock);
> +	error = xfs_break_layouts(inode, &iolock, BREAK_TRUNCATE);
>  	if (error)
>  		goto out_unlock;
>  
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 78eb56d447df..f9fcadb5b555 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -1026,13 +1026,14 @@ xfs_vn_setattr(
>  	int			error;
>  
>  	if (iattr->ia_valid & ATTR_SIZE) {
> -		struct xfs_inode	*ip = XFS_I(d_inode(dentry));
> +		struct inode		*inode = d_inode(dentry);
> +		struct xfs_inode	*ip = XFS_I(inode);
>  		uint			iolock;
>  
>  		xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
>  		iolock = XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL;
>  
> -		error = xfs_break_layouts(d_inode(dentry), &iolock);
> +		error = xfs_break_layouts(inode, &iolock, BREAK_TRUNCATE);
>  		if (error) {
>  			xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
>  			return error;
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

  parent reply	other threads:[~2018-03-19 17:40 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-15 15:51 [PATCH v6 00/15] dax: fix dma vs truncate/hole-punch Dan Williams
2018-03-15 15:51 ` [PATCH v6 01/15] dax: store pfns in the radix Dan Williams
2018-03-15 15:51 ` [PATCH v6 02/15] fs, dax: prepare for dax-specific address_space_operations Dan Williams
2018-03-16 18:59   ` Christoph Hellwig
2018-03-15 15:51 ` [PATCH v6 03/15] block, dax: remove dead code in blkdev_writepages() Dan Williams
2018-03-16 18:59   ` Christoph Hellwig
2018-03-15 15:51 ` [PATCH v6 04/15] xfs, dax: introduce xfs_dax_aops Dan Williams
2018-03-16 19:00   ` Christoph Hellwig
2018-03-15 15:51 ` [PATCH v6 05/15] ext4, dax: introduce ext4_dax_aops Dan Williams
2018-03-15 15:51 ` [PATCH v6 06/15] ext2, dax: introduce ext2_dax_aops Dan Williams
2018-03-18  4:02   ` kbuild test robot
2018-03-18  4:02   ` [RFC PATCH] ext2, dax: ext2_dax_aops can be static kbuild test robot
2018-03-15 15:52 ` [PATCH v6 07/15] fs, dax: use page->mapping to warn if truncate collides with a busy page Dan Williams
2018-03-18  6:26   ` kbuild test robot
2018-03-15 15:52 ` [PATCH v6 08/15] mm, dax: enable filesystems to trigger dev_pagemap ->page_free callbacks Dan Williams
2018-03-15 15:52 ` [PATCH v6 09/15] mm, dev_pagemap: introduce CONFIG_DEV_PAGEMAP_OPS Dan Williams
2018-03-15 15:52 ` [PATCH v6 10/15] memremap: mark devm_memremap_pages() EXPORT_SYMBOL_GPL Dan Williams
2018-03-15 15:52 ` [PATCH v6 11/15] mm, fs, dax: handle layout changes to pinned dax mappings Dan Williams
2018-03-16 19:01   ` Christoph Hellwig
2018-03-17 22:14   ` kbuild test robot
2018-03-15 15:52 ` [PATCH v6 12/15] xfs: require mmap lock for xfs_break_layouts() Dan Williams
2018-03-16 19:04   ` Christoph Hellwig
2018-03-16 19:10     ` Dan Williams
2018-03-19 17:33   ` Darrick J. Wong
2018-03-19 17:57     ` Dan Williams
2018-03-19 18:19       ` Darrick J. Wong
2018-03-19 18:34         ` Dan Williams
2018-03-19 19:45       ` Christoph Hellwig
2018-03-19 20:10         ` Dan Williams
2018-03-19 21:14           ` Christoph Hellwig
2018-03-15 15:52 ` [PATCH v6 13/15] xfs: communicate lock drop events from xfs_break_layouts() Dan Williams
2018-03-16 19:08   ` Christoph Hellwig
2018-03-15 15:52 ` [PATCH v6 14/15] xfs: prepare xfs_break_layouts() for another layout type Dan Williams
2018-03-16 19:08   ` Christoph Hellwig
2018-03-19 17:45   ` Darrick J. Wong [this message]
2018-03-19 18:09     ` Dan Williams
2018-03-15 15:52 ` [PATCH v6 15/15] xfs, dax: introduce xfs_break_dax_layouts() Dan Williams
2018-03-16 19:09   ` Christoph Hellwig
2018-03-17 22:11   ` kbuild test robot
2018-03-17 23:47   ` kbuild test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180319174559.GG1757@magnolia \
    --to=darrick.wong@oracle.com \
    --cc=dan.j.williams@intel.com \
    --cc=david@fromorbit.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).