linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Petr Vorel <pvorel@suse.cz>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: Dave Chinner <david@fromorbit.com>,
	linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
	Jan Kara <jack@suse.cz>, Hannes Reinecke <hare@suse.de>,
	linux-xfs@vger.kernel.org, ltp@lists.linux.it
Subject: Re: LTP test df01.sh detected different size of loop device in v5.19
Date: Thu, 18 Aug 2022 17:25:33 +0200	[thread overview]
Message-ID: <Yv5Z7eu5RGnutMly@pevik> (raw)
In-Reply-To: <8d33a7a0-7a7c-47a1-ed84-83fd25089897@sandeen.net>

Hi Eric, all,

> On 8/15/22 4:31 AM, Petr Vorel wrote:
> > Hi Dave,

> >> On Fri, Aug 12, 2022 at 03:20:37PM +0200, Petr Vorel wrote:
> >>> Hi all,

> >>> LTP test df01.sh found different size of loop device in v5.19.
> >>> Test uses loop device formatted on various file systems, only XFS fails.
> >>> It randomly fails during verifying that loop size usage changes:

> >>> grep ${TST_DEVICE} output | grep -q "${total}.*${used}" [1]

> >>> How to reproduce:
> >>> # PATH="/opt/ltp/testcases/bin:$PATH" df01.sh -f xfs # it needs several tries to hit

> >>> df saved output:
> >>> Filesystem     1024-blocks    Used Available Capacity Mounted on
> >>> ...
> >>> /dev/loop0          256672   16208    240464       7% /tmp/LTP_df01.1kRwoUCCR7/mntpoint
> >>> df output:
> >>> Filesystem     1024-blocks    Used Available Capacity Mounted on
> >>> ...
> >>> tmpfs               201780       0    201780       0% /run/user/0
> >>> /dev/loop0          256672   15160    241512       6% /tmp/LTP_df01.1kRwoUCCR7/mntpoint
> >>> => different size
> >>> df01 4 TFAIL: 'df -k -P' failed, not expected.

> >> Yup, most likely because we changed something in XFS related to
> >> internal block reservation spaces. That is, the test is making
> >> fundamentally flawed assumptions about filesystem used space
> >> accounting.

> >> It is wrong to assuming that the available capacity of a given empty
> >> filesystem will never change.  Assumptions like this have been
> >> invalid for decades because the available space can change based on
> >> the underlying configuration or the filesystem. e.g. different
> >> versions of mkfs.xfs set different default parameters and so simply
> >> changing the version of xfsprogs you use between the two comparision
> >> tests will make it fail....

> >> And, well, XFS also has XFS_IOC_{GS}ET_RESBLKS ioctls that allow
> >> userspace to change the amount of reserved blocks. They were
> >> introduced in 1997, and since then we've changed the default
> >> reservation the filesystem takes at least a dozen times.

> > Thanks a lot for valuable info.

> >>>> It might be a false positive / bug in the test, but it's at least a changed behavior.

> >> Yup, any test that assumes "available space" does not change from
> >> kernel version to kernel version is flawed. There is no guarantee
> >> that this ever stays the same, nor that it needs to stay the same.
> > I'm sorry I was not clear. Test [1] does not measure "available space" between
> > kernel releases. It just run df command with parameters, saves it's output
> > and compares "1024-blocks" and "Used" columns of df output with stat output:

> Annotating what these do...

> > 		local total=$(stat -f mntpoint --printf=%b)  # number of blocks allocated
> > 		local free=$(stat -f mntpoint --printf=%f)   # free blocks in filesystem
> > 		local used=$((total-free))                   # (number of blocks free)

> > 		local bsize=$(stat -f mntpoint --printf=%s)  # block size ("for faster transfers")
> > 		total=$((($total * $bsize + 512)/ 1024))     # number of 1k blocks allocated?
> > 		used=$((($used * $bsize + 512) / 1024))      # number of 1k blocks used?

> > And comparison with "$used" is what sometimes fails.

> > BTW this happens on both distros when loop device is on tmpfs. I'm trying to
> > trigger it on ext4 and btrfs, not successful so far. Looks like it's tmpfs
> > related.

> > If that's really expected, we might remove this check for used for XFS
> > (not sure if check only for total makes sense).

> It's kind of hard to follow this test, but it seems to be trying to
> ensure that df output is consistent with du (statfs) numbers, before
> and after creating and removing a 1MB file.  I guess it's literally
> testing df itself, i.e. that it actually presents the numbers it obtained
> from statfs.

> AFAICT the difference in the failure is 1024 1K blocks, which is the size
> the file that's been created and removed during the test.

> My best guess is that this is xfs inode deferred inode inactivation hanging
> onto the space a little longer than the test expects.

> This is probably because the new-ish xfs inode inactivation no longer blocks
> on inode garbage collection during statfs().

> IOWS, I think the test expects that free space is reflected in statfs numbers
> immediately after a file is removed, and that's no longer the case here. They
> change in between the df check and the statfs check.

> (The test isn't just checking that the values are correct, it is checking that
> the values are /immediately/ correct.)

> Putting a "sleep 1" after the "rm -f" in the test seems to fix it; IIRC
> the max time to wait for inodegc is 1s. This does slow the test down a bit.

Sure, it looks like we can sleep just 50ms on my hw (although better might be to
poll for the result [1]), I just wanted to make sure there is no bug/regression
before hiding it with sleep.

Thanks for your input!

Kind regards,
Petr

[1] https://people.kernel.org/metan/why-sleep-is-almost-never-acceptable-in-tests

> -Eric

+++ testcases/commands/df/df01.sh
@@ -63,6 +63,10 @@ df_test()
 		tst_res TFAIL "'$cmd' failed."
 	fi
 
+	if [ "$DF_FS_TYPE" = xfs ]; then
+		tst_sleep 50ms
+	fi
+
 	ROD_SILENT rm -rf mntpoint/testimg
 
 	# flush file system buffers, then we can get the actual sizes.

  reply	other threads:[~2022-08-18 15:25 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-12 13:20 LTP test df01.sh detected different size of loop device in v5.19 Petr Vorel
2022-08-12 13:24 ` Petr Vorel
2022-08-12 14:00   ` Petr Vorel
2022-08-14 22:44 ` Dave Chinner
2022-08-15  9:31   ` Petr Vorel
2022-08-15 20:09     ` Eric Sandeen
2022-08-18 15:25       ` Petr Vorel [this message]
2022-08-18 16:05         ` Eric Sandeen
2022-08-18 16:27           ` Darrick J. Wong
2022-08-18 17:01             ` Petr Vorel
2022-08-18 21:19               ` Eric Sandeen
2022-08-19 16:00                 ` Petr Vorel
2022-08-19 16:06                   ` [LTP] " Bird, Tim
2022-08-19 19:30                     ` Petr Vorel
2022-08-18 21:02           ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yv5Z7eu5RGnutMly@pevik \
    --to=pvorel@suse.cz \
    --cc=axboe@kernel.dk \
    --cc=david@fromorbit.com \
    --cc=hare@suse.de \
    --cc=jack@suse.cz \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=ltp@lists.linux.it \
    --cc=sandeen@sandeen.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).