All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ira Weiny <ira.weiny@intel.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Boaz Harrosh <boaz@plexistor.com>,
	linux-kernel@vger.kernel.org,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Christoph Hellwig <hch@lst.de>,
	"Theodore Y. Ts'o" <tytso@mit.edu>, Jan Kara <jack@suse.cz>,
	linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 0/5] Enable per-file/directory DAX operations
Date: Fri, 25 Oct 2019 13:49:26 -0700	[thread overview]
Message-ID: <20191025204926.GA26184@iweiny-DESK2.sc.intel.com> (raw)
In-Reply-To: <20191025003603.GE4614@dread.disaster.area>

On Fri, Oct 25, 2019 at 11:36:03AM +1100, Dave Chinner wrote:
> On Fri, Oct 25, 2019 at 02:29:04AM +0300, Boaz Harrosh wrote:
> > On 25/10/2019 00:35, Dave Chinner wrote:
> 
> If something like a find or backup program brings the inode into
> cache, the app may not even get the behaviour it wants, and it can't
> change it until the inode is evicted from cache, which may be never.

Why would this be never?

> Nobody wants implicit/random/uncontrollable/unchangeable behaviour
> like this.

I'm thinking this could work with a bit of effort on the users part.  While the
behavior does have a bit of uncertainty, I feel like there has to be a way to
get the inode to drop from the cache when a final iput() happens on the inode.

Admin programs should not leave files open forever, without the users knowing
about it.  So I don't understand why the inode could not be evicted from the
cache if the FS knew that this change had been made and the inode needs to be
"re-loaded".  See below...

> 
> > (And never change the flag on the fly)
> > (Just brain storming here)
> 
> We went over all this ground when we disabled the flag in the first
> place. We disabled the flag because we couldn't come up with a sane
> way to flip the ops vector short of tracking the number of aops
> calls in progress at any given time. i.e. reference counting the
> aops structure, but that's hard to do with a const ops structure,
> and so it got disabled rather than allowing users to crash
> kernels....

Agreed.  We can't change the a_ops without some guarantee that no one is using
the file.  Which means we need all fds to close and a final iput().  I thought
that would mean an eviction of the inode and a subsequent reload.

Yesterday I coded up the following (applies on top of this series) but I can't
seem to get it to work because I believe xfs is keeping a reference on the
inode.  What am I missing?  I think if I could get xfs to recognize that the
inode needs to be cleared from it's cache this would work, with some caveats.

Currently this works if I remount the fs or if I use <procfs>/drop_caches like
Boaz mentioned.

Isn't there a way to get xfs to do that on it's own?

Ira


From 7762afd95a3e21a782dffd2fd8e13ae4a23b5e4a Mon Sep 17 00:00:00 2001
From: Ira Weiny <ira.weiny@intel.com>
Date: Fri, 25 Oct 2019 13:32:07 -0700
Subject: [PATCH] squash: Allow phys change on any length file

delay the changing of the effective bit to when the inode is re-read
into the cache.

Currently a work in Progress because xfs seems to cache the inodes as
well and I'm not sure how to get xfs to release it's reference.
---
 fs/xfs/xfs_ioctl.c | 18 +++++++-----------
 include/linux/fs.h |  5 ++++-
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 89cf47ef273e..4d730d5781d9 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1233,10 +1233,13 @@ xfs_diflags_to_linux(
 		inode->i_flags |= S_NOATIME;
 	else
 		inode->i_flags &= ~S_NOATIME;
-	if (xflags & FS_XFLAG_DAX)
-		inode->i_flags |= S_DAX;
-	else
-		inode->i_flags &= ~S_DAX;
+	/* NOTE: we do not allow the physical DAX flag to immediately change
+	 * the effective IS_DAX() flag tell the VFS layer to remove the inode
+	 * from the cache on the final iput() to force recreation on the next
+	 * 'fresh' open */
+	if (((xflags & FS_XFLAG_DAX) && !IS_DAX(inode)) ||
+	    (!(xflags & FS_XFLAG_DAX) && IS_DAX(inode)))
+		inode->i_flags |= S_REVALIDATE;
 }
 
 static int
@@ -1320,13 +1323,6 @@ xfs_ioctl_setattr_dax_invalidate(
 	/* lock, flush and invalidate mapping in preparation for flag change */
 	xfs_ilock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL);
 
-	/* File size must be zero to avoid races with asynchronous page
-	 * faults */
-	if (i_size_read(inode) > 0) {
-		error = -EINVAL;
-		goto out_unlock;
-	}
-
 	error = filemap_write_and_wait(inode->i_mapping);
 	if (error)
 		goto out_unlock;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 0b4d8fc79e0f..4e9b7bf53c86 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1998,6 +1998,7 @@ struct super_operations {
 #define S_ENCRYPTED	16384	/* Encrypted file (using fs/crypto/) */
 #define S_CASEFOLD	32768	/* Casefolded file */
 #define S_VERITY	65536	/* Verity file (using fs/verity/) */
+#define S_REVALIDATE	131072	/* Drop inode from cache on final put */
 
 /*
  * Note that nosuid etc flags are inode-specific: setting some file-system
@@ -2040,6 +2041,7 @@ static inline bool sb_rdonly(const struct super_block *sb) { return sb->s_flags
 #define IS_ENCRYPTED(inode)	((inode)->i_flags & S_ENCRYPTED)
 #define IS_CASEFOLDED(inode)	((inode)->i_flags & S_CASEFOLD)
 #define IS_VERITY(inode)	((inode)->i_flags & S_VERITY)
+#define IS_REVALIDATE(inode)	((inode)->i_flags & S_REVALIDATE)
 
 #define IS_WHITEOUT(inode)	(S_ISCHR(inode->i_mode) && \
 				 (inode)->i_rdev == WHITEOUT_DEV)
@@ -3027,7 +3029,8 @@ extern int inode_needs_sync(struct inode *inode);
 extern int generic_delete_inode(struct inode *inode);
 static inline int generic_drop_inode(struct inode *inode)
 {
-	return !inode->i_nlink || inode_unhashed(inode);
+	return !inode->i_nlink || inode_unhashed(inode) ||
+		IS_REVALIDATE(inode);
 }
 
 extern struct inode *ilookup5_nowait(struct super_block *sb,
-- 
2.20.1



  parent reply	other threads:[~2019-10-25 20:49 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-20 15:59 [PATCH 0/5] Enable per-file/directory DAX operations ira.weiny
2019-10-20 15:59 ` [PATCH 1/5] fs/stat: Define DAX statx attribute ira.weiny
2019-10-22 11:32   ` Boaz Harrosh
2019-10-22 16:51     ` Ira Weiny
2019-10-20 15:59 ` [PATCH 2/5] fs/xfs: Isolate the physical DAX flag from effective ira.weiny
2019-10-21  0:26   ` Dave Chinner
2019-10-21 17:40     ` Ira Weiny
2019-10-20 15:59 ` [PATCH 3/5] fs/xfs: Separate functionality of xfs_inode_supports_dax() ira.weiny
2019-10-20 15:59 ` [PATCH 4/5] fs/xfs: Clean up DAX support check ira.weiny
2019-10-20 15:59 ` [PATCH 5/5] fs/xfs: Allow toggle of physical DAX flag ira.weiny
2019-10-21  0:45   ` Dave Chinner
2019-10-21 22:49     ` Ira Weiny
2019-10-21 23:46       ` Dave Chinner
2019-11-08 13:12       ` Jan Kara
2019-11-08 13:46         ` Jan Kara
2019-11-08 19:36           ` Ira Weiny
2019-11-11 16:07             ` Jan Kara
2019-11-11 23:54               ` Ira Weiny
2019-10-22 11:21 ` [PATCH 0/5] Enable per-file/directory DAX operations Boaz Harrosh
2019-10-23 13:09   ` Boaz Harrosh
2019-10-23 22:13     ` Dave Chinner
2019-10-24  2:31       ` Boaz Harrosh
2019-10-24  7:34         ` Dave Chinner
2019-10-24 14:05           ` Boaz Harrosh
2019-10-24 21:35             ` Dave Chinner
2019-10-24 23:29               ` Boaz Harrosh
2019-10-25  0:36                 ` Dave Chinner
2019-10-25  1:15                   ` Boaz Harrosh
2019-10-25 20:49                   ` Ira Weiny [this message]
2019-10-27 22:10                     ` Dave Chinner
2019-10-31 16:17                       ` Ira Weiny
2019-11-01 22:47                         ` Dave Chinner
2019-11-02  4:25                           ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191025204926.GA26184@iweiny-DESK2.sc.intel.com \
    --to=ira.weiny@intel.com \
    --cc=boaz@plexistor.com \
    --cc=dan.j.williams@intel.com \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.