linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: linux-xfs <linux-xfs@vger.kernel.org>
Subject: Re: [PATCH, RFC] libxfs: use FALLOC_FL_ZERO_RANGE in libxfs_device_zero
Date: Fri, 14 Feb 2020 10:48:18 +1100	[thread overview]
Message-ID: <20200213234818.GV10776@dread.disaster.area> (raw)
In-Reply-To: <4bc3be27-b09d-a708-f053-6f7240642667@sandeen.net>

On Thu, Feb 13, 2020 at 03:12:24PM -0600, Eric Sandeen wrote:
> I had a request from someone who cared about mkfs speed(!)
> over a slower network block device to look into using faster
> zeroing methods, particularly for the log, during mkfs.xfs.
> 
> e2fsprogs already does this, thanks to some guy named Darrick:
> 
> /*
>  * If we know about ZERO_RANGE, try that before we try PUNCH_HOLE because
>  * ZERO_RANGE doesn't unmap preallocated blocks.  We prefer fallocate because
>  * it always invalidates page cache, and libext2fs requires that reads after
>  * ZERO_RANGE return zeroes.
>  */
> static int __unix_zeroout(int fd, off_t offset, off_t len)
> {
>         int ret = -1;
> 
> #if defined(HAVE_FALLOCATE) && defined(FALLOC_FL_ZERO_RANGE)
>         ret = fallocate(fd, FALLOC_FL_ZERO_RANGE, offset, len);
>         if (ret == 0)
>                 return 0;
> #endif
> #if defined(HAVE_FALLOCATE) && defined(FALLOC_FL_PUNCH_HOLE) && defined(FALLOC_FL_KEEP_SIZE)
>         ret = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
>                         offset,  len);
>         if (ret == 0)
>                 return 0;
> #endif
>         errno = EOPNOTSUPP;
>         return ret;
> }
> 
> and nobody has exploded so far, AFAIK.  :)  So, floating this idea
> for xfsprogs.  I'm a little scared of the second #ifdef block above, but
> if that's really ok/consistent/safe we could add it too.

If FALLOC_FL_PUNCH_HOLE is defined, then FALLOC_FL_KEEP_SIZE is
guaranteed to be defined, so that condition check is somewhat
redundant. See commit 79124f18b335 ("fs: add hole punching to
fallocate")....

> The patch moves some defines around too, I could split that up and resend
> if this isn't laughed out of the room.
> 
> Thanks,
> -Eric
> 
> =====
> 
> libxfs: use FALLOC_FL_ZERO_RANGE in libxfs_device_zero
> 
> I had a request from someone who cared about mkfs speed(!)
> over a slower network block device to look into using faster
> zeroing methods, particularly for the log, during mkfs.
> 
> Using FALLOC_FL_ZERO_RANGE is faster in this case than writing
> a bunch of zeros across a wire.
> 
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> ---
> 
> diff --git a/include/linux.h b/include/linux.h
> index 8f3c32b0..425badb5 100644
> --- a/include/linux.h
> +++ b/include/linux.h
> @@ -113,6 +113,26 @@ static __inline__ void platform_uuid_copy(uuid_t *dst, uuid_t *src)
>  	uuid_copy(*dst, *src);
>  }
>  
> +#ifndef FALLOC_FL_PUNCH_HOLE
> +#define FALLOC_FL_PUNCH_HOLE	0x02
> +#endif
> +
> +#ifndef FALLOC_FL_COLLAPSE_RANGE
> +#define FALLOC_FL_COLLAPSE_RANGE 0x08
> +#endif
> +
> +#ifndef FALLOC_FL_ZERO_RANGE
> +#define FALLOC_FL_ZERO_RANGE 0x10
> +#endif
> +
> +#ifndef FALLOC_FL_INSERT_RANGE
> +#define FALLOC_FL_INSERT_RANGE 0x20
> +#endif
> +
> +#ifndef FALLOC_FL_UNSHARE_RANGE
> +#define FALLOC_FL_UNSHARE_RANGE 0x40
> +#endif

These were added to allow xfs_io to test these operations before
there was userspace support for them. I do not think we should
propagate them outside of xfs_io - if they are not supported by the
distro at build time, we shouldn't attempt to use them in mkfs.

i.e. xfs_io is test-enablement code for developers, mkfs.xfs is
production code for users, so different rules kinda exist for
them...

> diff --git a/libxfs/Makefile b/libxfs/Makefile
> index fbcc963a..b4e8864b 100644
> --- a/libxfs/Makefile
> +++ b/libxfs/Makefile
> @@ -105,6 +105,10 @@ CFILES = cache.c \
>  #
>  #LCFLAGS +=
>  
> +ifeq ($(HAVE_FALLOCATE),yes)
> +LCFLAGS += -DHAVE_FALLOCATE
> +endif

HAVE_FALLOCATE comes from an autoconf test. I suspect that this
needs to be made more finegrained, testing for fallocate() features,
not just just whether the syscall exists. And the above should
probably be in include/builddefs.in so that it's available to all
of xfsprogs, not just libxfs code...

> +
>  FCFLAGS = -I.
>  
>  LTLIBS = $(LIBPTHREAD) $(LIBRT)
> diff --git a/libxfs/rdwr.c b/libxfs/rdwr.c
> index 0d9d7202..94f63bbf 100644
> --- a/libxfs/rdwr.c
> +++ b/libxfs/rdwr.c
> @@ -4,6 +4,9 @@
>   * All Rights Reserved.
>   */
>  
> +#if defined(HAVE_FALLOCATE)
> +#include <linux/falloc.h>
> +#endif

Urk. That should be in include/linux.h, right?

>  
>  #include "libxfs_priv.h"
>  #include "init.h"
> @@ -60,9 +63,21 @@ int
>  libxfs_device_zero(struct xfs_buftarg *btp, xfs_daddr_t start, uint len)
>  {
>  	xfs_off_t	start_offset, end_offset, offset;
> -	ssize_t		zsize, bytes;
> +	ssize_t		zsize, bytes, len_bytes;
>  	char		*z;
> -	int		fd;
> +	int		ret, fd;
> +
> +	fd = libxfs_device_to_fd(btp->dev);
> +	start_offset = LIBXFS_BBTOOFF64(start);
> +	end_offset = LIBXFS_BBTOOFF64(start + len) - start_offset;
> +
> +#if defined(HAVE_FALLOCATE)
> +	/* try to use special zeroing methods, fall back to writes if needed */
> +	len_bytes = LIBXFS_BBTOOFF64(len);
> +	ret = fallocate(fd, FALLOC_FL_ZERO_RANGE, start_offset, len_bytes);
> +	if (ret == 0)
> +		return 0;
> +#endif

I kinda dislike the "return if success" hidden inside the ifdef -
it's not a code pattern I'd expect to see. This is what I'd tend
to expect in include/linux.h:

#if defined(HAVE_FALLOCATE_ZERO_RANGE)
static inline int
platform_zero_range(
	int		fd,
	xfs_off_t	start_offset,
	xfs_off_t	end_offset)
{
	int		ret;

	ret = fallocate(fd, FALLOC_FL_ZERO_RANGE, start_offset, len_bytes);
	if (!ret)
		return 0;
	return -errno;
}
#else
#define platform_zero_range(fd, s, o)	(true)
#endif

and then the code in libxfs_device_zero() does:

	error = platform_zero_range(fd, start_offset, len_bytes);
	if (!error)
		return 0;

without adding nasty #defines...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2020-02-13 23:48 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-13 21:12 [PATCH, RFC] libxfs: use FALLOC_FL_ZERO_RANGE in libxfs_device_zero Eric Sandeen
2020-02-13 23:48 ` Dave Chinner [this message]
2020-02-13 23:57   ` Eric Sandeen
2020-02-14  0:25     ` Dave Chinner
2020-02-14  1:05 ` [PATCH V2] " Eric Sandeen
2020-02-14  1:34   ` Dave Chinner
2020-02-14  1:43     ` Eric Sandeen
2020-02-22  3:22 ` [PATCH V3] " Eric Sandeen
2020-02-22  7:24   ` Darrick J. Wong
2020-02-22 15:23     ` Eric Sandeen
2020-02-25 18:13 ` [PATCH V4] " Eric Sandeen
2020-02-25 18:46   ` Christoph Hellwig
2020-02-25 19:16   ` Darrick J. Wong
2020-02-25 23:33     ` Eric Sandeen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200213234818.GV10776@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=sandeen@sandeen.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).