From: Alex Lyakas <alex@zadara.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@infradead.org>,
Eric Sandeen <sandeen@sandeen.net>,
Brian Foster <bfoster@redhat.com>,
linux-xfs@vger.kernel.org
Subject: Re: [RFC-PATCH] xfs: do not update sunit/swidth in the superblock to match those provided during mount
Date: Mon, 2 Dec 2019 10:07:42 +0200 [thread overview]
Message-ID: <CAOcd+r2gaoZz4=xWAh+d=-TNTOtBDvbEezN9OH3oks2p1TNhBg@mail.gmail.com> (raw)
In-Reply-To: <20191201215732.GB2695@dread.disaster.area>
Hi Dave,
On Sun, Dec 1, 2019 at 11:57 PM Dave Chinner <david@fromorbit.com> wrote:
>
> On Sun, Dec 01, 2019 at 11:00:32AM +0200, Alex Lyakas wrote:
> > Hi Dave,
> >
> > Thank you for your response.
> >
> > On Sat, Nov 30, 2019 at 10:28 PM Dave Chinner <david@fromorbit.com> wrote:
> > >
> > > On Wed, Nov 27, 2019 at 06:19:29AM -0800, Christoph Hellwig wrote:
> > > > Can we all take a little step back and think about the implications
> > > > of the original patch from Alex? Because I think there is very little.
> > > > And updated sunit/swidth is just a little performance optimization,
> > > > and anyone who really cares about changing that after the fact can
> > > > trivially add those to fstab.
> > > >
> > > > So I think something like his original patch plus a message during
> > > > mount that the new values are not persisted should be perfectly fine.
> > >
> > > Well, the original purpose of the mount options was to persist a new
> > > sunit/swidth to the superblock...
> > >
> > > Let's ignore the fact that it was a result of a CXFS client mount
> > > bug trashing the existing sunit/swidth values, and instead focus on
> > > the fact we've been telling people for years that you "only need to
> > > set these once after a RAID reshape" and so we have a lot of users
> > > out there expecting it to persist the new values...
> > >
> > > I don't think we can just redefine the documented and expected
> > > behaviour of a mount option like this.
> > >
> > > With that in mind, the xfs(5) man page explicitly states this:
> > >
> > > The sunit and swidth parameters specified must be compatible
> > > with the existing filesystem alignment characteristics. In
> > > general, that means the only valid changes to sunit are
> > > increasing it by a power-of-2 multiple. Valid swidth values
> > > are any integer multiple of a valid sunit value.
> > >
> > > Note the comment about changes to sunit? What is being done here -
> > > halving the sunit from 64 to 32 blocks is invalid, documented as
> > > invalid, but the kernel does not enforce this. We should fix the
> > > kernel code to enforce the alignment rules that the mount option
> > > is documented to require.
> > >
> > > If we want to change the alignment characteristics after mkfs, then
> > > use su=1,sw=1 as the initial values, then the first mount can use
> > > the options to change it to whatever is present after mkfs has run.
> >
> > If I understand your response correctly:
> > - some sunit/swidth changes during mount are legal and some aren't
> > - the legal changes should be persisted in the superblock
>
> Yup.
>
> > What about the repair? Even if user performs a legal change, it still
> > breaks the repairability of the file system.
>
> It is not a legal ichange if it moves the root inode to a new
> location. IOWs, if the alignment mods will result in the root inode
> changing location, then it should be rejected by the kernel.
>
> Anyway, we need more details about your test environment, because
> the example you gave:
>
> | # mkfs
> | mkfs.xfs -f -K -p /etc/zadara/xfs.protofile -d sunit=64,swidth=64 -l sunit=32 /dev/vda
> |
> | #mount with a different sunit/swidth:
> | mount -onoatime,sync,nouuid,sunit=32,swidth=32 /dev/vda /mnt/xfs
> |
> | #umount
> | umount /mnt/xfs
> |
> | #xfs_repair
> | xfs_repair -n /dev/vda
> | # reports false corruption and eventually segfaults[1]
>
> Does not reproduce the reported failure on my workstation running
> v5.3.0 kernel and a v5.0.0 xfsprogs:
>
> $ sudo mkfs.xfs -f -d file,name=1t.img,size=1t,sunit=64,swidth=64 -l sunit=32
> meta-data=1t.img isize=512 agcount=32, agsize=8388608 blks
> = sectsz=512 attr=2, projid32bit=1
> = crc=1 finobt=1, sparse=1, rmapbt=0
> = reflink=0
> data = bsize=4096 blocks=268435456, imaxpct=5
> = sunit=8 swidth=8 blks
> naming =version 2 bsize=4096 ascii-ci=0, ftype=1
> log =internal log bsize=4096 blocks=131072, version=2
> = sectsz=512 sunit=4 blks, lazy-count=1
> realtime =none extsz=4096 blocks=0, rtextents=0
I believe in one of earlier emails in this thread, Brian mentioned
that this is related to "sparse inodes" being enabled:
"I couldn't reproduce this at first because sparse inodes is enabled
by default and that introduces more strict inode alignment requirements."
Can you please try mkfs'ing with spare inodes disabled?
Thanks,
Alex.
> $ sudo mount -o loop -o sunit=32,swidth=32 1t.img /mnt/1t
> $ sudo xfs_info /mnt/1t
> ....
> data = bsize=4096 blocks=268435456, imaxpct=5
> = sunit=4 swidth=4 blks
> ....
> $ sudo umount /mnt/1t
> $ sudo xfs_repair -f 1t.img
> Phase 1 - find and verify superblock...
> - reporting progress in intervals of 15 minutes
> Phase 2 - using internal log
> - zero log...
> - scan filesystem freespace and inode maps...
> - 08:42:14: scanning filesystem freespace - 32 of 32 allocation groups done
> - found root inode chunk
> Phase 3 - for each AG...
> - scan and clear agi unlinked lists...
> ....
> Phase 7 - verify and correct link counts...
> - 08:42:18: verify and correct link counts - 32 of 32 allocation groups done
> done
> $ echo $?
> 0
> $ sudo mount -o loop 1t.img /mnt/1t
> $ sudo xfs_info /mnt/1t
> ....
> data = bsize=4096 blocks=268435456, imaxpct=5
> = sunit=4 swidth=4 blks
> ....
> $
>
> So reducing the sunit doesn't necessarily change the root inode
> location, and so in some cases reducing the sunit doesn't change
> the root inode location, either.
>
> > For now, we made a local change to not persist sunit/swidth updates in
> > the superblock. Because we must have a working repair, and our kernel
> > (4.14 stable) allows any sunit/swidth changes.
>
> From the above, it's not clear where the problem lies - it may be
> that there's a bug in repair we've fixed since whatever version you
> are using....
>
> > We can definitely adhere to the recommended behavior of setting
> > sunit/swidth=1 during mkfs, provided the repair still works after
> > mounting with different sunit/swidth.
>
> ... hence I'd suggest that more investigation needs to be done
> before you do anything permanent...
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
next prev parent reply other threads:[~2019-12-02 8:07 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-21 18:08 [RFC-PATCH] xfs: do not update sunit/swidth in the superblock to match those provided during mount Alex Lyakas
2019-11-22 15:43 ` Brian Foster
2019-11-24 9:13 ` Alex Lyakas
2019-11-24 16:40 ` Darrick J. Wong
2019-11-24 17:38 ` Eric Sandeen
2019-11-25 13:07 ` Brian Foster
2019-11-26 8:50 ` Alex Lyakas
2019-11-25 13:07 ` Brian Foster
2019-11-26 8:49 ` Alex Lyakas
2019-11-26 11:54 ` Brian Foster
2019-11-26 13:37 ` Alex Lyakas
2019-11-26 16:53 ` Eric Sandeen
2019-11-27 14:19 ` Christoph Hellwig
2019-11-27 15:19 ` Brian Foster
2019-11-30 20:28 ` Dave Chinner
2019-12-01 9:00 ` Alex Lyakas
2019-12-01 21:57 ` Dave Chinner
2019-12-02 8:07 ` Alex Lyakas [this message]
2019-12-01 23:29 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAOcd+r2gaoZz4=xWAh+d=-TNTOtBDvbEezN9OH3oks2p1TNhBg@mail.gmail.com' \
--to=alex@zadara.com \
--cc=bfoster@redhat.com \
--cc=david@fromorbit.com \
--cc=hch@infradead.org \
--cc=linux-xfs@vger.kernel.org \
--cc=sandeen@sandeen.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).