From: "Darrick J. Wong" <djwong@kernel.org>
To: Hugh Dickins <hughd@google.com>
Cc: Lukas Czerner <lczerner@redhat.com>,
Mikulas Patocka <mpatocka@redhat.com>,
Zdenek Kabelac <zkabelac@redhat.com>,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: Re: unusual behavior of loop dev with backing file in tmpfs
Date: Wed, 12 Jan 2022 09:19:37 -0800 [thread overview]
Message-ID: <20220112171937.GA19154@magnolia> (raw)
In-Reply-To: <5e66a9-4739-80d9-5bb5-cbe2c8fef36@google.com>
On Tue, Jan 11, 2022 at 08:28:02PM -0800, Hugh Dickins wrote:
> On Fri, 26 Nov 2021, Lukas Czerner wrote:
> >
> > I've noticed an unusual test failure in the e2fsprogs testsuite
> > (m_assume_storage_prezeroed) where we use mke2fs to create a file system
> > on a loop device backed by a file on tmpfs. For some reason the number of
> > allocated blocks in the resulting file (stat -c '%b' /tmp/file) sometimes
> > differs, but it really should not.
> >
> > I was trying to create a simplified reproducer and noticed the following
> > behavior on mainline kernel (v5.16-rc2-54-g5d9f4cf36721)
> >
> > # truncate -s16M /tmp/file
> > # stat -c '%b' /tmp/file
> > 0
> >
> > # losetup -f /tmp/file
> > # stat -c '%b' /tmp/file
> > 672
> >
> > That alone is a little unexpected, since the file is really supposed to
> > be empty; and when copied out of the tmpfs, it really is empty. But the
> > following is even weirder.
> >
> > We have a loop setup from above, so let's assume it's /dev/loop0. The
> > following should be executed in quick succession, like in a script.
> >
> > # dd if=/dev/zero of=/dev/loop0 bs=4k
> > # blkdiscard -f /dev/loop0
> > # stat -c '%b' /tmp/file
> > 0
> > # sleep 1
> > # stat -c '%b' /tmp/file
> > 672
> >
> > Is that expected behavior? From what I've seen, when I use mkfs instead
> > of this simplified example, the number of blocks allocated as reported by
> > stat can vary quite a lot given the more complex operations. The file itself
> > does not seem to be corrupted in any way, so it is likely just an
> > accounting problem.
> >
> > Any idea what is going on there ?
>
> I have half an answer; but maybe you worked it all out meanwhile anyway.
>
> Yes, it happens like that for me too: 672 (but 216 on an old installation).
>
> Half the answer is that funny code at the head of shmem_file_read_iter():
> 	/*
> 	 * Might this read be for a stacking filesystem?  Then when reading
> 	 * holes of a sparse file, we actually need to allocate those pages,
> 	 * and even mark them dirty, so it cannot exceed the max_blocks limit.
> 	 */
> 	if (!iter_is_iovec(to))
> 		sgp = SGP_CACHE;
> which allocates pages to the tmpfs for reads from /dev/loop0; whereas
> normally a read of a sparse tmpfs file would just give zeroes without
> allocating.
>
> [Do we still need that code? Mikulas asked 18 months ago, and I never
> responded (sorry) because I failed to arrive at an informed answer.
> It comes from a time while unionfs on tmpfs was actively developing,
> and solved a real problem then; but by the time it went into tmpfs,
> unionfs had already been persuaded to proceed differently, and no
> longer needed it. I kept it in for indeterminate other stacking FSs,
> but it's probably just culted cargo, doing more harm than good. I
> suspect the best thing to do is, after the 5.17 merge window closes,
> revive Mikulas's patch to delete it and see if anyone complains.]
I for one wouldn't mind if tmpfs no longer instantiated cache pages for
a read from a hole -- it's a little strange, since most disk filesystems
(well ok xfs and ext4, haven't checked the others) don't do that.
Anyone who really wants a preallocated page should probably be using
fallocate or something...
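
For what it's worth, the explicit route is a one-liner with the fallocate(1)
utility; a minimal sketch (the path and size are illustrative, not from the
thread):

```shell
# Sketch: preallocate backing blocks explicitly, rather than relying
# on reads instantiating pages. Path and size are illustrative only.
f=/tmp/prealloc-demo
fallocate -l 16M "$f"   # reserve 16 MiB of backing blocks up front
stat -c '%b' "$f"       # nonzero now; typically 32768 512-byte units
rm -f "$f"
```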
--D
> But what is asynchronously reading /dev/loop0 (instantiating pages
> initially, and reinstantiating them after blkdiscard)? I assume it's
> some block device tracker, trying to read capacity and/or partition
> table; whether from inside or outside the kernel, I expect you'll
> guess much better than I can.
>
> Hugh
Thread overview: 5+ messages
2021-11-26 7:51 unusual behavior of loop dev with backing file in tmpfs Lukas Czerner
2022-01-12 4:28 ` Hugh Dickins
2022-01-12 12:29 ` Mikulas Patocka
2022-01-12 17:19 ` Darrick J. Wong [this message]
2022-01-12 17:46 ` Matthew Wilcox