linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Christoph Hellwig <hch@infradead.org>,
	Jane Chu <jane.chu@oracle.com>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 3/5] vfs: add a zero-initialization mode to fallocate
Date: Wed, 22 Sep 2021 14:27:25 -0700	[thread overview]
Message-ID: <20210922212725.GN570615@magnolia> (raw)
In-Reply-To: <20210922054931.GT1756565@dread.disaster.area>

On Wed, Sep 22, 2021 at 03:49:31PM +1000, Dave Chinner wrote:
> On Tue, Sep 21, 2021 at 09:13:54PM -0700, Darrick J. Wong wrote:
> > On Wed, Sep 22, 2021 at 01:59:07PM +1000, Dave Chinner wrote:
> > > On Tue, Sep 21, 2021 at 07:38:01PM -0700, Darrick J. Wong wrote:
> > > > On Tue, Sep 21, 2021 at 07:16:26PM -0700, Dan Williams wrote:
> > > > > On Tue, Sep 21, 2021 at 1:32 AM Christoph Hellwig <hch@infradead.org> wrote:
> > > > > >
> > > > > > On Tue, Sep 21, 2021 at 10:44:31AM +1000, Dave Chinner wrote:
> > > > > > > I think this wants to be a behavioural modifier for existing
> > > > > > > operations rather than an operation unto itself. i.e. similar to how
> > > > > > > KEEP_SIZE modifies ALLOC behaviour but doesn't fundamentally alter
> > > > > > > the guarantees ALLOC provides userspace.
> > > > > > >
> > > > > > > In this case, the change of behaviour over ZERO_RANGE is that we
> > > > > > > want physical zeros to be written instead of the filesystem
> > > > > > > optimising away the physical zeros by manipulating the layout
> > > > > > > of the file.
> > > > > >
> > > > > > Yes.
> > > > > >
> > > > > > > Then we have and API that looks like:
> > > > > > >
> > > > > > >       ALLOC           - allocate space efficiently
> > > > > > >       ALLOC | INIT    - allocate space by writing zeros to it
> > > > > > >       ZERO            - zero data and preallocate space efficiently
> > > > > > >       ZERO | INIT     - zero range by writing zeros to it
> > > > > > >
> > > > > > > Which seems to cater for all the cases I know of where physically
> > > > > > > writing zeros instead of allocating unwritten extents is the
> > > > > > > preferred behaviour of fallocate()....
> > > > > >
> > > > > > Agreed.  I'm not sure INIT is really the right name, but I can't come
> > > > > > up with a better idea offhand.
> > > > > 
> > > > > FUA? As in, this is a forced-unit-access zeroing all the way to media
> > > > > bypassing any mechanisms to emulate zero-filled payloads on future
> > > > > reads.
> > > 
> > > Yes, that's the semantic we want, but FUA already defines specific
> > > data integrity behaviour in the storage stack w.r.t. volatile
> > > caches.
> > > 
> > > Also, FUA is associated with devices - it's low level storage jargon
> > > and so is not really appropriate to call a user interface operation
> > > FUA where users have no idea what a "unit" or "access" actually
> > > means.
> > > 
> > > Hence we should not overload this name with some other operation
> > > that does not have (and should not have) explicit data integrity
> > > requirements. That will just cause confusion for everyone.
> > > 
> > > > FALLOC_FL_ZERO_EXISTING, because you want to zero the storage that
> > > > already exists at that file range?
> > > 
> > > IMO that doesn't work as a behavioural modifier for ALLOC because
> > > the ALLOC semantics are explicitly "don't touch existing user
> > > data"...
> > 
> > Well since you can't preallocate /and/ zerorange at the same time...
> > 
> > /* For FALLOC_FL_ZERO_RANGE, write zeroes to pre-existing mapped storage. */
> > #define FALLOC_FL_ZERO_EXISTING		(0x80)
> 
> Except we also want the newly allocated regions (i.e. where holes
> were) in that range being zeroed to have zeroes written to them as
> well, yes? Otherwise we end up with a combination of unwritten
> extents and physical zeroes, and you can't use
> ZERORANGE|EXISTING as a replacement for PUNCH + ALLOC|INIT
> 
> /*
>  * For preallocation and zeroing operations, force the filesystem to
>  * write zeroes rather than use unwritten extents to indicate the
>  * range contains zeroes.
>  *
>  * For filesystems that support unwritten extents, this trades off
>  * slow fallocate performance for faster first write performance as
>  * unwritten extent conversion on the first write to each block in
>  * the range is not needed.
>  *
>  * Care is required when using FALLOC_FL_ALLOC_INIT_DATA as it will
>  * be much slower overall for large ranges and/or slow storage
>  * compared to using unwritten extents.
>  */
> #define FALLOC_FL_ALLOC_INIT_DATA	(1 << 7)

I prefer FALLOC_FL_ZEROINIT_DATA here, because in the ZERO|INIT case
we're not allocating any new space, merely rewriting existing storage.
I also want to expand the description slightly:

/*
 * For preallocation, force the filesystem to write zeroes rather than
 * use unwritten extents to indicate the range contains zeroes.  For
 * zeroing operations, force the filesystem to write zeroes to existing
 * written extents.

--D

> 
> Cheers,
> 
> Dave.
> 
> -- 
> Dave Chinner
> david@fromorbit.com

  reply	other threads:[~2021-09-22 21:27 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-18  1:30 [PATCHSET RFC v2 jane 0/5] vfs: enable userspace to reset damaged file storage Darrick J. Wong
2021-09-18  1:30 ` [PATCH 1/5] dax: prepare pmem for use by zero-initializing contents and clearing poisons Darrick J. Wong
2021-09-18 16:54   ` riteshh
2021-09-20 17:22     ` Darrick J. Wong
2021-09-21  4:07       ` riteshh
2021-09-22 18:26         ` Darrick J. Wong
2021-09-22 19:47           ` riteshh
2021-09-22 20:26           ` Dan Williams
2021-09-21  8:34   ` Christoph Hellwig
2021-09-22 18:10     ` Darrick J. Wong
2021-09-18  1:30 ` [PATCH 2/5] iomap: use accelerated zeroing on a block device to zero a file range Darrick J. Wong
2021-09-18 16:55   ` riteshh
2021-09-21  8:29   ` Christoph Hellwig
2021-09-22 18:53     ` Darrick J. Wong
2021-09-21 22:33   ` Dave Chinner
2021-09-22 18:54     ` Darrick J. Wong
2021-09-18  1:31 ` [PATCH 3/5] vfs: add a zero-initialization mode to fallocate Darrick J. Wong
2021-09-18 16:58   ` riteshh
2021-09-20 17:52   ` Eric Biggers
2021-09-20 18:06     ` Darrick J. Wong
2021-09-21  0:44   ` Dave Chinner
2021-09-21  8:31     ` Christoph Hellwig
2021-09-22  2:16       ` Dan Williams
2021-09-22  2:38         ` Darrick J. Wong
2021-09-22  3:59           ` Dave Chinner
2021-09-22  4:13             ` Darrick J. Wong
2021-09-22  5:49               ` Dave Chinner
2021-09-22 21:27                 ` Darrick J. Wong [this message]
2021-09-23  0:02                   ` Darrick J. Wong
2021-09-23  0:44                     ` Darrick J. Wong
2021-09-23  1:42                     ` Dave Chinner
2021-09-23  2:43                       ` Dan Williams
2021-09-23  5:42                         ` Dan Williams
2021-09-23 22:54                           ` Dave Chinner
2021-09-24  1:18                             ` Dan Williams
2021-09-24  1:21                               ` Jane Chu
2021-09-24  1:35                                 ` Darrick J. Wong
2021-09-27 21:07                                   ` Dave Chinner
2021-09-27 21:57                                     ` Jane Chu
2021-09-28  0:08                                       ` Dan Williams
2021-09-22  5:28     ` riteshh
2021-09-18  1:31 ` [PATCH 4/5] xfs: implement FALLOC_FL_ZEROINIT_RANGE Darrick J. Wong
2021-09-18  1:31 ` [PATCH 5/5] ext4: " Darrick J. Wong
2021-09-18 17:07   ` riteshh
2021-09-20 18:11     ` Darrick J. Wong
2021-09-21  6:10       ` riteshh
2021-09-18 18:05 ` [PATCHSET RFC v2 jane 0/5] vfs: enable userspace to reset damaged file storage Dan Williams
2021-09-23  0:51 ` Darrick J. Wong
2021-09-23  1:17   ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210922212725.GN570615@magnolia \
    --to=djwong@kernel.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=jane.chu@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).