All of lore.kernel.org
 help / color / mirror / Atom feed
From: Shyam Prasad N <nspmangalore@gmail.com>
To: tytso@mit.edu, David Howells <dhowells@redhat.com>,
	Steve French <smfrench@gmail.com>,
	linux-ext4@vger.kernel.org
Subject: Regarding ext4 extent allocation strategy
Date: Tue, 13 Jul 2021 12:22:14 +0530	[thread overview]
Message-ID: <CANT5p=o3i4kWQuMFF5zKQp04JnWEQnYuo+cvyH8asGMvTVBBkw@mail.gmail.com> (raw)

Hi,

Our team in Microsoft, which works on the Linux SMB3 client kernel
filesystem has recently been exploring the use of fscache on top of
ext4 for caching the network filesystem data for some customer
workloads.

However, the maintainer of fscache (David Howells) recently warned us
that a few other extent based filesystem developers pointed out a
theoretical bug in the current implementation of fscache/cachefiles.
It currently does not maintain a separate metadata for the cached data
it holds, but instead uses the sparseness of the underlying filesystem
to track the ranges of the data that is being cached.
The bug that has been pointed out with this is that the underlying
filesystems could bridge holes between data ranges with zeroes or
punch hole in data ranges that contain zeroes. (@David please add if I
missed something).

David has already begun working on the fix to this by maintaining the
metadata of the cached ranges in fscache itself.
However, since it could take some time for this fix to be approved and
then backported by various distros, I'd like to understand if there is
a potential problem in using fscache on top of ext4 without the fix.
If ext4 doesn't do any such optimizations on the data ranges, or has a
way to disable such optimizations, I think we'll be okay to use the
older versions of fscache even without the fix mentioned above.

Opinions?

-- 
Regards,
Shyam

             reply	other threads:[~2021-07-13  6:52 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-13  6:52 Shyam Prasad N [this message]
2021-07-13 11:39 ` Regarding ext4 extent allocation strategy Theodore Y. Ts'o
2021-07-13 12:57   ` Shyam Prasad N
2021-07-13 20:18     ` Theodore Y. Ts'o
2021-07-14  0:37       ` Shyam Prasad N
2022-02-18  3:18   ` Gao Xiang
2022-02-22  2:48     ` Gao Xiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CANT5p=o3i4kWQuMFF5zKQp04JnWEQnYuo+cvyH8asGMvTVBBkw@mail.gmail.com' \
    --to=nspmangalore@gmail.com \
    --cc=dhowells@redhat.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=smfrench@gmail.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.