From: Dave Chinner <david@fromorbit.com>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: linux-xfs <linux-xfs@vger.kernel.org>
Subject: Re: [PATCH, RFC] libxfs: use FALLOC_FL_ZERO_RANGE in libxfs_device_zero
Date: Fri, 14 Feb 2020 10:48:18 +1100 [thread overview]
Message-ID: <20200213234818.GV10776@dread.disaster.area> (raw)
In-Reply-To: <4bc3be27-b09d-a708-f053-6f7240642667@sandeen.net>
On Thu, Feb 13, 2020 at 03:12:24PM -0600, Eric Sandeen wrote:
> I had a request from someone who cared about mkfs speed(!)
> over a slower network block device to look into using faster
> zeroing methods, particularly for the log, during mkfs.xfs.
>
> e2fsprogs already does this, thanks to some guy named Darrick:
>
> /*
> * If we know about ZERO_RANGE, try that before we try PUNCH_HOLE because
> * ZERO_RANGE doesn't unmap preallocated blocks. We prefer fallocate because
> * it always invalidates page cache, and libext2fs requires that reads after
> * ZERO_RANGE return zeroes.
> */
> static int __unix_zeroout(int fd, off_t offset, off_t len)
> {
> int ret = -1;
>
> #if defined(HAVE_FALLOCATE) && defined(FALLOC_FL_ZERO_RANGE)
> ret = fallocate(fd, FALLOC_FL_ZERO_RANGE, offset, len);
> if (ret == 0)
> return 0;
> #endif
> #if defined(HAVE_FALLOCATE) && defined(FALLOC_FL_PUNCH_HOLE) && defined(FALLOC_FL_KEEP_SIZE)
> ret = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> offset, len);
> if (ret == 0)
> return 0;
> #endif
> errno = EOPNOTSUPP;
> return ret;
> }
>
> and nobody has exploded so far, AFAIK. :) So, floating this idea
> for xfsprogs. I'm a little scared of the second #ifdef block above, but
> if that's really ok/consistent/safe we could add it too.
If FALLOC_FL_PUNCH_HOLE is defined, then FALLOC_FL_KEEP_SIZE is
guaranteed to be defined, so that condition check is somewhat
redundant. See commit 79124f18b335 ("fs: add hole punching to
fallocate")....
> The patch moves some defines around too, I could split that up and resend
> if this isn't laughed out of the room.
>
> Thanks,
> -Eric
>
> =====
>
> libxfs: use FALLOC_FL_ZERO_RANGE in libxfs_device_zero
>
> I had a request from someone who cared about mkfs speed(!)
> over a slower network block device to look into using faster
> zeroing methods, particularly for the log, during mkfs.
>
> Using FALLOC_FL_ZERO_RANGE is faster in this case than writing
> a bunch of zeros across a wire.
>
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> ---
>
> diff --git a/include/linux.h b/include/linux.h
> index 8f3c32b0..425badb5 100644
> --- a/include/linux.h
> +++ b/include/linux.h
> @@ -113,6 +113,26 @@ static __inline__ void platform_uuid_copy(uuid_t *dst, uuid_t *src)
> uuid_copy(*dst, *src);
> }
>
> +#ifndef FALLOC_FL_PUNCH_HOLE
> +#define FALLOC_FL_PUNCH_HOLE 0x02
> +#endif
> +
> +#ifndef FALLOC_FL_COLLAPSE_RANGE
> +#define FALLOC_FL_COLLAPSE_RANGE 0x08
> +#endif
> +
> +#ifndef FALLOC_FL_ZERO_RANGE
> +#define FALLOC_FL_ZERO_RANGE 0x10
> +#endif
> +
> +#ifndef FALLOC_FL_INSERT_RANGE
> +#define FALLOC_FL_INSERT_RANGE 0x20
> +#endif
> +
> +#ifndef FALLOC_FL_UNSHARE_RANGE
> +#define FALLOC_FL_UNSHARE_RANGE 0x40
> +#endif
These were added to allow xfs_io to test these operations before
there was userspace support for them. I do not think we should
propagate them outside of xfs_io - if they are not supported by the
distro at build time, we shouldn't attempt to use them in mkfs.
i.e. xfs_io is test-enablement code for developers, mkfs.xfs is
production code for users, so different rules kinda exist for
them...
> diff --git a/libxfs/Makefile b/libxfs/Makefile
> index fbcc963a..b4e8864b 100644
> --- a/libxfs/Makefile
> +++ b/libxfs/Makefile
> @@ -105,6 +105,10 @@ CFILES = cache.c \
> #
> #LCFLAGS +=
>
> +ifeq ($(HAVE_FALLOCATE),yes)
> +LCFLAGS += -DHAVE_FALLOCATE
> +endif
HAVE_FALLOCATE comes from an autoconf test. I suspect that this
needs to be made more finegrained, testing for fallocate() features,
not just just whether the syscall exists. And the above should
probably be in include/builddefs.in so that it's available to all
of xfsprogs, not just libxfs code...
> +
> FCFLAGS = -I.
>
> LTLIBS = $(LIBPTHREAD) $(LIBRT)
> diff --git a/libxfs/rdwr.c b/libxfs/rdwr.c
> index 0d9d7202..94f63bbf 100644
> --- a/libxfs/rdwr.c
> +++ b/libxfs/rdwr.c
> @@ -4,6 +4,9 @@
> * All Rights Reserved.
> */
>
> +#if defined(HAVE_FALLOCATE)
> +#include <linux/falloc.h>
> +#endif
Urk. That should be in include/linux.h, right?
>
> #include "libxfs_priv.h"
> #include "init.h"
> @@ -60,9 +63,21 @@ int
> libxfs_device_zero(struct xfs_buftarg *btp, xfs_daddr_t start, uint len)
> {
> xfs_off_t start_offset, end_offset, offset;
> - ssize_t zsize, bytes;
> + ssize_t zsize, bytes, len_bytes;
> char *z;
> - int fd;
> + int ret, fd;
> +
> + fd = libxfs_device_to_fd(btp->dev);
> + start_offset = LIBXFS_BBTOOFF64(start);
> + end_offset = LIBXFS_BBTOOFF64(start + len) - start_offset;
> +
> +#if defined(HAVE_FALLOCATE)
> + /* try to use special zeroing methods, fall back to writes if needed */
> + len_bytes = LIBXFS_BBTOOFF64(len);
> + ret = fallocate(fd, FALLOC_FL_ZERO_RANGE, start_offset, len_bytes);
> + if (ret == 0)
> + return 0;
> +#endif
I kinda dislike the "return if success" hidden inside the ifdef -
it's not a code pattern I'd expect to see. This is what I'd tend
to expect in include/linux.h:
#if defined(HAVE_FALLOCATE_ZERO_RANGE)
static inline int
platform_zero_range(
int fd,
xfs_off_t start_offset,
xfs_off_t end_offset)
{
int ret;
ret = fallocate(fd, FALLOC_FL_ZERO_RANGE, start_offset, len_bytes);
if (!ret)
return 0;
return -errno;
}
#else
#define platform_zero_range(fd, s, o) (true)
#endif
and then the code in libxfs_device_zero() does:
error = platform_zero_range(fd, start_offset, len_bytes);
if (!error)
return 0;
without adding nasty #defines...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2020-02-13 23:48 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-13 21:12 [PATCH, RFC] libxfs: use FALLOC_FL_ZERO_RANGE in libxfs_device_zero Eric Sandeen
2020-02-13 23:48 ` Dave Chinner [this message]
2020-02-13 23:57 ` Eric Sandeen
2020-02-14 0:25 ` Dave Chinner
2020-02-14 1:05 ` [PATCH V2] " Eric Sandeen
2020-02-14 1:34 ` Dave Chinner
2020-02-14 1:43 ` Eric Sandeen
2020-02-22 3:22 ` [PATCH V3] " Eric Sandeen
2020-02-22 7:24 ` Darrick J. Wong
2020-02-22 15:23 ` Eric Sandeen
2020-02-25 18:13 ` [PATCH V4] " Eric Sandeen
2020-02-25 18:46 ` Christoph Hellwig
2020-02-25 19:16 ` Darrick J. Wong
2020-02-25 23:33 ` Eric Sandeen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200213234818.GV10776@dread.disaster.area \
--to=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
--cc=sandeen@sandeen.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).