From: "Darrick J. Wong" <djwong@kernel.org>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Catherine Hoang <catherine.hoang@oracle.com>,
	linux-xfs@vger.kernel.org, Jan Kara <jack@suse.cz>
Subject: Re: [PATCH v1 2/4] xfs: implement custom freeze/thaw functions
Date: Mon, 13 Mar 2023 22:25:02 -0700
Message-ID: <20230314052502.GB11376@frogsfrogsfrogs>
In-Reply-To: <CAOQ4uxhKEhQ4X+rE4AYq70iEWKfqwQOZu47w_n1dbXd-wOeHTw@mail.gmail.com>

On Tue, Mar 14, 2023 at 07:11:56AM +0200, Amir Goldstein wrote:
> On Tue, Mar 14, 2023 at 6:25 AM Catherine Hoang
> <catherine.hoang@oracle.com> wrote:
> >
> > Implement internal freeze/thaw functions and prevent other threads from changing
> > the freeze level by adding a new SB_FREEZE_EXCLUSIVE level. This is required to
> 
> This looks troubling in several ways:
> - Layering violation
> - Duplication of subtle vfs code
> 
> > prevent concurrent transactions while we are updating the uuid.
> >
> 
> Wouldn't it be easier to hold s_umount while updating the uuid?

Why?  Userspace holds an open file descriptor, the fs won't get
unmounted.

> Let userspace freeze before XFS_IOC_SETFSUUID and let
> XFS_IOC_SETFSUUID take s_umount and verify that fs is frozen.

Ugh, no, I don't want *userspace* to have to know how to do that.

> > Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
> > ---
> >  fs/xfs/xfs_super.c | 112 +++++++++++++++++++++++++++++++++++++++++++++
> >  fs/xfs/xfs_super.h |   5 ++
> >  2 files changed, 117 insertions(+)
> >
> > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> > index 2479b5cbd75e..6a52ae660810 100644
> > --- a/fs/xfs/xfs_super.c
> > +++ b/fs/xfs/xfs_super.c
> > @@ -2279,6 +2279,118 @@ static inline int xfs_cpu_hotplug_init(void) { return 0; }
> >  static inline void xfs_cpu_hotplug_destroy(void) {}
> >  #endif
> >
> > +/*
> > + * We need to disable all writer threads, which means taking the first two
> > + * freeze levels to put userspace to sleep, and the third freeze level to
> > + * prevent background threads from starting new transactions.  Take one level
> > + * more to prevent other callers from unfreezing the filesystem while we run.
> > + */
> > +int
> > +xfs_internal_freeze(
> > +       struct xfs_mount        *mp)
> > +{
> > +       struct super_block      *sb = mp->m_super;
> > +       int                     level;
> > +       int                     error = 0;
> > +
> > +       /* Wait until we're ready to freeze. */
> > +       down_write(&sb->s_umount);
> > +       while (sb->s_writers.frozen != SB_UNFROZEN) {
> > +               up_write(&sb->s_umount);
> > +               delay(HZ / 10);
> > +               down_write(&sb->s_umount);
> > +       }
> 
> That can easily wait forever, without any task holding any lock.

Indeed, this needs at a bare minimum some kind of fatal_signal_pending
check every time through the loop.
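
To illustrate the shape of such a check, here is a rough userspace model
of the wait loop, not kernel code and not part of the patch.  Every name
in it (model_sb, model_fatal_signal_pending, model_unlock_sleep_relock,
model_wait_for_unfrozen) is a hypothetical stand-in: the `locked` flag
plays the role of s_umount, and the poll counters simulate another task
thawing the fs or a fatal signal arriving.  Returning -EINTR is also an
assumption; a real version might prefer a different errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these names exist in the kernel. */
enum { MODEL_UNFROZEN = 0, MODEL_FROZEN = 1 };

struct model_sb {
	int	frozen;			/* stands in for sb->s_writers.frozen */
	int	polls_until_thaw;	/* polls before someone thaws; <0 = never */
	int	polls_until_kill;	/* polls before a fatal signal; <0 = never */
	int	polls;			/* sleep/retry cycles completed so far */
	bool	locked;			/* stands in for holding s_umount */
};

/* Stand-in for fatal_signal_pending(current). */
static bool model_fatal_signal_pending(const struct model_sb *sb)
{
	return sb->polls_until_kill >= 0 && sb->polls >= sb->polls_until_kill;
}

/* Stand-in for up_write(); delay(); down_write() in the original loop. */
static void model_unlock_sleep_relock(struct model_sb *sb)
{
	sb->locked = false;
	sb->polls++;
	if (sb->polls_until_thaw >= 0 && sb->polls >= sb->polls_until_thaw)
		sb->frozen = MODEL_UNFROZEN;
	sb->locked = true;
}

/*
 * Wait for the fs to become unfrozen.  Returns 0 with the lock held, or
 * -EINTR with the lock dropped if a fatal signal shows up first.
 */
static int model_wait_for_unfrozen(struct model_sb *sb)
{
	assert(sb != NULL);

	sb->locked = true;
	while (sb->frozen != MODEL_UNFROZEN) {
		/* Bail out instead of spinning forever on a stuck freeze. */
		if (model_fatal_signal_pending(sb)) {
			sb->locked = false;
			return -EINTR;
		}
		model_unlock_sleep_relock(sb);
	}
	return 0;
}
```

The point of the model is only the control flow: the signal check runs
on every pass, and the error path drops the lock before returning so the
caller never has to guess whether it still holds it.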

> > +
> > +       if (sb_rdonly(sb)) {
> > +               sb->s_writers.frozen = SB_FREEZE_EXCLUSIVE;
> > +               goto done;
> > +       }
> > +
> > +       sb->s_writers.frozen = SB_FREEZE_WRITE;
> > +       /* Release s_umount to preserve sb_start_write -> s_umount ordering */
> > +       up_write(&sb->s_umount);
> > +       percpu_down_write(sb->s_writers.rw_sem + SB_FREEZE_WRITE - 1);
> > +       down_write(&sb->s_umount);
> > +
> > +       /* Now we go and block page faults... */
> > +       sb->s_writers.frozen = SB_FREEZE_PAGEFAULT;
> > +       percpu_down_write(sb->s_writers.rw_sem + SB_FREEZE_PAGEFAULT - 1);
> > +
> > +       /*
> > +        * All writers are done so after syncing there won't be dirty data.
> > +        * Let xfs_fs_sync_fs flush dirty data so the VFS won't start writeback
> > +        * and to disable the background gc workers.
> > +        */
> > +       error = sync_filesystem(sb);
> > +       if (error) {
> > +               sb->s_writers.frozen = SB_UNFROZEN;
> > +               for (level = SB_FREEZE_PAGEFAULT - 1; level >= 0; level--)
> > +                       percpu_up_write(sb->s_writers.rw_sem + level);
> > +               wake_up(&sb->s_writers.wait_unfrozen);
> > +               up_write(&sb->s_umount);
> > +               return error;
> > +       }
> > +
> > +       /* Now wait for internal filesystem counter */
> > +       sb->s_writers.frozen = SB_FREEZE_FS;
> > +       percpu_down_write(sb->s_writers.rw_sem + SB_FREEZE_FS - 1);
> > +
> > +       xfs_log_clean(mp);

Hmm... some of these calls really ought to be returning errors.

> > +
> > +       /*
> > +        * To prevent anyone else from unfreezing us, set the VFS freeze
> > +        * level to one higher than FREEZE_COMPLETE.
> > +        */
> > +       sb->s_writers.frozen = SB_FREEZE_EXCLUSIVE;
> > +       for (level = SB_FREEZE_LEVELS - 1; level >= 0; level--)
> > +               percpu_rwsem_release(sb->s_writers.rw_sem + level, 0,
> > +                               _THIS_IP_);
> 
> If you really must introduce a new freeze level, you should do it in vfs
> and not inside xfs, even if xfs is the only current user of the new level.

Luis is already trying to do something similar to this.  So far Jan and
I seem to be the only ones who have taken a look at this fs-internal
freeze...

https://lore.kernel.org/linux-fsdevel/20230114003409.1168311-4-mcgrof@kernel.org/

--D

> Thanks,
> Amir.
