From: "Ernesto A. Fernández" <ernesto.mnd.fernandez@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org,
"Darrick J. Wong" <darrick.wong@oracle.com>,
"Zorro Lang" <zlang@redhat.com>,
"Ernesto A. Fernández" <ernesto.mnd.fernandez@gmail.com>
Subject: Re: [PATCH] xfs: preserve i_mode if __xfs_set_acl() fails
Date: Tue, 15 Aug 2017 03:18:58 -0300 [thread overview]
Message-ID: <20170815061857.GA2803@debian.home> (raw)
In-Reply-To: <20170815020018.GF21024@dastard>
Hi Dave, thanks for the review!
On Tue, Aug 15, 2017 at 12:00:18PM +1000, Dave Chinner wrote:
> On Mon, Aug 14, 2017 at 05:18:10AM -0300, Ernesto A. Fernández wrote:
> > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> > index de7b9bd..c3193e1 100644
> > --- a/fs/xfs/libxfs/xfs_attr.c
> > +++ b/fs/xfs/libxfs/xfs_attr.c
> > @@ -395,6 +395,7 @@ xfs_attr_remove(
> > struct xfs_defer_ops dfops;
> > xfs_fsblock_t firstblock;
> > int error;
> > + int noattr = 0;
>
> bool.
>
> And make it "bool has_attr = true" to avoid the nasty double
> negative of "!noattr" later on.
I called it noattr because the error was -ENOATTR, but you are right,
has_attr is better. I assume you would prefer mode_changed to be a bool
as well?
>
> >
> > XFS_STATS_INC(mp, xs_attr_remove);
> >
> > @@ -438,7 +439,12 @@ xfs_attr_remove(
> > xfs_trans_ijoin(args.trans, dp, 0);
> >
> > if (!xfs_inode_hasattr(dp)) {
> > - error = -ENOATTR;
> > + /*
> > + * Commit even if there is no attr to remove, because when
> > + * setting ACLs the caller may be changing the mode in the
> > + * same transaction.
> > + */
> > + noattr = 1;
> > } else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
> > ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
> > error = xfs_attr_shortform_remove(&args);
> > @@ -458,7 +464,7 @@ xfs_attr_remove(
> > if (mp->m_flags & XFS_MOUNT_WSYNC)
> > xfs_trans_set_sync(args.trans);
> >
> > - if ((flags & ATTR_KERNOTIME) == 0)
> > + if ((flags & ATTR_KERNOTIME) == 0 && !noattr)
> > xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG);
>
> Why?
Because we don't want the ctime to change when you try to delete an
attribute that didn't exist in the first place. At least that's how
it went before, so I tried to keep it the same.
>
> >
> > /*
> > @@ -468,6 +474,9 @@ xfs_attr_remove(
> > error = xfs_trans_commit(args.trans);
> > xfs_iunlock(dp, XFS_ILOCK_EXCL);
> >
> > + if (!error && noattr)
> > + error = -ENOATTR;
> > +
> > return error;
> >
> > out:
> > diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
> > index 7034e17..1fe4363 100644
> > --- a/fs/xfs/xfs_acl.c
> > +++ b/fs/xfs/xfs_acl.c
> > @@ -226,28 +226,13 @@ __xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type)
> > return error;
> > }
> >
> > -static int
> > -xfs_set_mode(struct inode *inode, umode_t mode)
> > -{
> > - int error = 0;
> > -
> > - if (mode != inode->i_mode) {
> > - struct iattr iattr;
> > -
> > - iattr.ia_valid = ATTR_MODE | ATTR_CTIME;
> > - iattr.ia_mode = mode;
> > - iattr.ia_ctime = current_time(inode);
> > -
> > - error = xfs_setattr_nonsize(XFS_I(inode), &iattr, XFS_ATTR_NOACL);
> > - }
> > -
> > - return error;
> > -}
>
> Ok, so that does an atomic, transactional change to the inode
> mode....
>
> > int
> > xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type)
> > {
> > int error = 0;
> > + int mode_changed = 0;
> > + umode_t old_mode = inode->i_mode;
> > + struct timespec old_ctime = inode->i_ctime;
> >
> > if (!acl)
> > goto set_acl;
> > @@ -257,16 +242,20 @@ xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type)
> > return error;
> >
> > if (type == ACL_TYPE_ACCESS) {
> > - umode_t mode;
> > -
> > - error = posix_acl_update_mode(inode, &mode, &acl);
> > - if (error)
> > - return error;
> > - error = xfs_set_mode(inode, mode);
> > + error = posix_acl_update_mode(inode, &inode->i_mode, &acl);
> > if (error)
> > return error;
> > + inode->i_ctime = current_time(inode);
> > + XFS_STATS_INC(XFS_I(inode)->i_mount, xs_ig_attrchg);
> > + mode_changed = 1;
> > }
>
> And this doesn't.
>
> So this breaks all our rules for inode metadata updates, both for
> runtime access serialisation and for crash consistency.
>
I'm only making changes to the vfs inode, and xfs_set_acl() should always
be called with i_rwsem held for writing. I think that should be enough to
serialize access to the vfs inode fields.
I'm sorry, I'm not sure I follow the crash consistency part. This is not
writing anything to disk. My expectation was that the changes I made to
the vfs inode would be attached to the transaction later, when calling
xfs_trans_log_inode() in xfs_attr_set() or xfs_attr_remove(). Would that
not work as intended? I was mostly imitating how btrfs sets an ACL, so I
could be misunderstanding how xfs does journaling.
> > set_acl:
> > - return __xfs_set_acl(inode, acl, type);
> > + error = __xfs_set_acl(inode, acl, type);
> > + if (error && mode_changed) {
> > + inode->i_mode = old_mode;
> > + inode->i_ctime = old_ctime;
> > + XFS_STATS_DEC(XFS_I(inode)->i_mount, xs_ig_attrchg);
> > + }
>
> Same here.
i_rwsem would still be held here. My reasoning is that, since there was
an error, no changes were actually committed; so there would be no need
to commit the restoration either, just restore the old in-memory vfs inode
fields. Once again, this is inspired by btrfs, so I suppose it may fail
here in some way I can't see.
(I should probably clarify I did run tests before submitting)
> IOWs, you can't just remove a transactional modification of inode
> state and replace it with in-memory VFS inode level changes. You
> need to do the changes in transaction context and with the correct
> locks held.
>
> IOWs, if you need to undo the mode changes xfs_set_mode() did, you
> need to call xfs_set_mode() again. And because modifications have
But that could fail as well, and how would we handle that? If this bug
is to be fixed the whole thing needs to be a single transaction. That's
why I removed the xfs_set_mode() function in the first place.
> been made, the new ctime should remain in place, too.
>
> Of course, this still isn't atomic from a runtime access POV, but
> that's substantially harder to provide than just undoing the mode
> change on ACL attribute save error.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
Thanks,
Ernest
next prev parent reply other threads:[~2017-08-15 6:19 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-14 8:18 [PATCH] xfs: preserve i_mode if __xfs_set_acl() fails Ernesto A. Fernández
2017-08-14 22:28 ` Darrick J. Wong
2017-08-15 0:58 ` Ernesto A. Fernández
2017-08-15 2:00 ` Dave Chinner
2017-08-15 6:18 ` Ernesto A. Fernández [this message]
2017-08-15 8:44 ` Dave Chinner
2017-08-15 19:29 ` Ernesto A. Fernández
2017-08-16 0:25 ` Dave Chinner
2017-08-16 7:16 ` Ernesto A. Fernández
2017-08-16 13:35 ` Dave Chinner
2017-08-16 19:31 ` Ernesto A. Fernández
2017-08-17 0:00 ` Dave Chinner
2017-08-17 5:34 ` Ernesto A. Fernández
2017-08-17 6:34 ` Dave Chinner
2017-12-07 17:31 ` Bill O'Donnell
2017-12-07 17:38 ` Bill O'Donnell
2017-12-07 17:43 ` Darrick J. Wong
2017-12-07 17:51 ` Bill O'Donnell
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170815061857.GA2803@debian.home \
--to=ernesto.mnd.fernandez@gmail.com \
--cc=darrick.wong@oracle.com \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
--cc=zlang@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).