All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: ira.weiny@intel.com
Cc: linux-kernel@vger.kernel.org,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Christoph Hellwig <hch@lst.de>,
	"Theodore Y. Ts'o" <tytso@mit.edu>, Jan Kara <jack@suse.cz>,
	linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 5/5] fs/xfs: Allow toggle of physical DAX flag
Date: Mon, 21 Oct 2019 11:45:36 +1100	[thread overview]
Message-ID: <20191021004536.GD8015@dread.disaster.area> (raw)
In-Reply-To: <20191020155935.12297-6-ira.weiny@intel.com>

On Sun, Oct 20, 2019 at 08:59:35AM -0700, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Switching between DAX and non-DAX on a file is racy with respect to data
> operations.  However, if no data is involved the flag is safe to switch.
> 
> Allow toggling the physical flag if a file is empty.  The file length
> check is not racy with respect to other operations as it is performed
> under XFS_MMAPLOCK_EXCL and XFS_IOLOCK_EXCL locks.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
>  fs/xfs/xfs_ioctl.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index d3a7340d3209..3839684c249b 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -33,6 +33,7 @@
>  #include "xfs_sb.h"
>  #include "xfs_ag.h"
>  #include "xfs_health.h"
> +#include "libxfs/xfs_dir2.h"
>  
>  #include <linux/mount.h>
>  #include <linux/namei.h>
> @@ -1232,12 +1233,10 @@ xfs_diflags_to_linux(
>  		inode->i_flags |= S_NOATIME;
>  	else
>  		inode->i_flags &= ~S_NOATIME;
> -#if 0	/* disabled until the flag switching races are sorted out */
>  	if (xflags & FS_XFLAG_DAX)
>  		inode->i_flags |= S_DAX;
>  	else
>  		inode->i_flags &= ~S_DAX;
> -#endif

This code has bit-rotted. See xfs_setup_iops(), where we now have a
different inode->i_mapping->a_ops for DAX inodes.

That, fundamentally, is the issue here - it's not setting/clearing
the DAX flag that is the issue, it's doing a swap of the
mapping->a_ops while there may be other code using that ops
structure.

IOWs, if there is any code anywhere in the kernel that
calls an address space op without holding one of the three locks we
hold here (i_rwsem, MMAPLOCK, ILOCK) then it can race with the swap
of the address space operations.

By limiting the address space swap to file sizes of zero, we rule
out the page fault path (mmap of a zero length file segv's with an
access beyond EOF on the first read/write page fault, right?).
However, other aops callers that might run unlocked and do the wrong
thing if the aops pointer is swapped between check of the aop method
existing and actually calling it even if the file size is zero?

A quick look shows that FIBMAP (ioctl_fibmap())) looks susceptible
to such a race condition with the current definitions of the XFS DAX
aops. I'm guessing there will be others, but I haven't looked
further than this...

>  	/* lock, flush and invalidate mapping in preparation for flag change */
>  	xfs_ilock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL);
> +
> +	if (i_size_read(inode) != 0) {
> +		error = -EOPNOTSUPP;
> +		goto out_unlock;
> +	}

Wrong error. Should be the same as whatever is returned when we try
to change the extent size hint and can't because the file is
non-zero in length (-EINVAL, I think). Also needs a comment
explainging why this check exists, and probably better written as
i_size_read() > 0 ....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2019-10-21  0:45 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-20 15:59 [PATCH 0/5] Enable per-file/directory DAX operations ira.weiny
2019-10-20 15:59 ` [PATCH 1/5] fs/stat: Define DAX statx attribute ira.weiny
2019-10-22 11:32   ` Boaz Harrosh
2019-10-22 16:51     ` Ira Weiny
2019-10-20 15:59 ` [PATCH 2/5] fs/xfs: Isolate the physical DAX flag from effective ira.weiny
2019-10-21  0:26   ` Dave Chinner
2019-10-21 17:40     ` Ira Weiny
2019-10-20 15:59 ` [PATCH 3/5] fs/xfs: Separate functionality of xfs_inode_supports_dax() ira.weiny
2019-10-20 15:59 ` [PATCH 4/5] fs/xfs: Clean up DAX support check ira.weiny
2019-10-20 15:59 ` [PATCH 5/5] fs/xfs: Allow toggle of physical DAX flag ira.weiny
2019-10-21  0:45   ` Dave Chinner [this message]
2019-10-21 22:49     ` Ira Weiny
2019-10-21 23:46       ` Dave Chinner
2019-11-08 13:12       ` Jan Kara
2019-11-08 13:46         ` Jan Kara
2019-11-08 19:36           ` Ira Weiny
2019-11-11 16:07             ` Jan Kara
2019-11-11 23:54               ` Ira Weiny
2019-10-22 11:21 ` [PATCH 0/5] Enable per-file/directory DAX operations Boaz Harrosh
2019-10-23 13:09   ` Boaz Harrosh
2019-10-23 22:13     ` Dave Chinner
2019-10-24  2:31       ` Boaz Harrosh
2019-10-24  7:34         ` Dave Chinner
2019-10-24 14:05           ` Boaz Harrosh
2019-10-24 21:35             ` Dave Chinner
2019-10-24 23:29               ` Boaz Harrosh
2019-10-25  0:36                 ` Dave Chinner
2019-10-25  1:15                   ` Boaz Harrosh
2019-10-25 20:49                   ` Ira Weiny
2019-10-27 22:10                     ` Dave Chinner
2019-10-31 16:17                       ` Ira Weiny
2019-11-01 22:47                         ` Dave Chinner
2019-11-02  4:25                           ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191021004536.GD8015@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=dan.j.williams@intel.com \
    --cc=darrick.wong@oracle.com \
    --cc=hch@lst.de \
    --cc=ira.weiny@intel.com \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.