From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [RFC PATCH 00/16] xfs: Block size > PAGE_SIZE support
Date: Thu, 8 Nov 2018 14:17:56 -0800
Message-ID: <20181108221756.GA15721@magnolia>
In-Reply-To: <20181108090432.GC19305@dastard>

On Thu, Nov 08, 2018 at 08:04:32PM +1100, Dave Chinner wrote:
> On Wed, Nov 07, 2018 at 05:38:43PM -0800, Darrick J. Wong wrote:
> > On Thu, Nov 08, 2018 at 09:04:41AM +1100, Dave Chinner wrote:
> > > On Wed, Nov 07, 2018 at 09:14:05AM -0800, Darrick J. Wong wrote:
> > > > On Wed, Nov 07, 2018 at 05:31:11PM +1100, Dave Chinner wrote:
> > > > > Hi folks,
> > > > > 
> > > > > We've had a fair number of problems reported on 64k block size
> > > > > filesystems of late, but none of the XFS developers have Power or
> > > > > ARM machines handy to reproduce them or even really test the fixes.
> > > > > 
> > > > > The iomap infrastructure we introduced a while back was designed
> > > > > > with the capability of block size > page size support in mind, but we
> > > > > hadn't tried to implement it.
> > > > > 
> > > > > So after another 64k block size bug report late last week I said to
> > > > > Darrick "How hard could it be"?
> > > > 
> > > > "Nothing is ever simple" :)
> > > 
> > > "It'll only take a couple of minutes!"
> > > 
> > > > > About 6 billion (yes, B) fsx ops later, I have most of the XFS
> > > > > functionality working on 64k block sizes on x86_64.  Buffered
> > > > > read/write, mmap read/write and direct IO read/write all work. All
> > > > > the fallocate() operations work correctly, as does truncate. xfsdump
> > > > > and xfs_restore are happy with it, as is xfs_repair. xfs-scrub
> > > > > needed some help, but I've tested Darrick's fixes for that quite a
> > > > > bit over the past few days.
> > > > > 
> > > > > It passes most of xfstests - there's some test failures that I have
> > > > > to determine whether they are code bugs or test problems (i.e. some
> > > > > tests don't deal with 64k block sizes correctly or assume block size
> > > > > <= page size).
> > > > > 
> > > > > What I haven't tested yet is shared extents - the COW path,
> > > > > clone_file_range and dedupe_file_range. I discovered earlier today
> > > > > > that fsx doesn't support copy/clone/dedupe_file_range
> > > > > > operations, so before I go any further I need to enhance fsx. Then
> > > > 
> > > > I assume that means you only tested this on reflink=0 filesystems?
> > > 
> > > Correct.
> > > 
> > > > Looking at fsstress, it looks like we don't test copy_file_range either.
> > > > I can try adding the missing clone/dedupe/copy to both programs, but
> > > > maybe you've already done that while I was asleep?
> > > 
> > > No, I haven't started on this yet. I've been sleeping. :P
> > 
> > I started wondering if we were missing anything from not having fsx
> > support clone/dedupe and ended up with:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=fsstress-clone
> 
> Some fixes to that below.
> 
> I haven't got to testing dedupe or clone - copy_file_range explodes
> in under 40 operations on generic/263. do_splice_direct() looks
> to be broken in several different ways at this point.
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
> fsx: clean up copy/dedupe file range support.
> 
> From: Dave Chinner <dchinner@redhat.com>
> 
> copy_file_range() needs to obey read/write constraints, otherwise it
> blows up when direct IO is used.
> 
> FIDEDUPERANGE has a completely screwed up API for error reporting.
> The ioctl succeeds even if dedupe fails, so you have to check
> every individual dedupe operation for failure. Without this, dedupe
> "succeeds" on kernels or filesystems that don't even support dedupe...
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  ltp/fsx.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/ltp/fsx.c b/ltp/fsx.c
> index fad50e0022af..b51910b8b2e1 100644
> --- a/ltp/fsx.c
> +++ b/ltp/fsx.c
> @@ -1382,7 +1382,11 @@ do_dedupe_range(unsigned offset, unsigned length, unsigned dest)
>  	fdr->info[0].dest_fd = fd;
>  	fdr->info[0].dest_offset = dest;
>  
> -	if (ioctl(fd, FIDEDUPERANGE, fdr) == -1) {
> +	if (ioctl(fd, FIDEDUPERANGE, fdr) == -1 ||
> +	    fdr->info[0].status < 0) {
> +		if (fdr->info[0].status < 0)
> +			errno = -fdr->info[0].status;
> +
>  		if (errno == EOPNOTSUPP || errno == ENOTTY) {
>  			if (!quiet && testcalls > simulatedopcount)
>  				prt("skipping unsupported dedupe range\n");
> @@ -1416,6 +1420,11 @@ do_copy_range(unsigned offset, unsigned length, unsigned dest)
>  	loff_t o1, o2;
>  	ssize_t nr;
>  
> +	offset -= offset % readbdy;
> +	dest -= dest % writebdy;
> +	if (o_direct)
> +		length -= length % readbdy;

Don't we want byte-granularity copies if we're doing buffered copies?

('Want' is such a strong word, maybe I don't want to find out what other
skeletons are lurking in do_splice_direct...)

--D

> +
>  	if (length == 0) {
>  		if (!quiet && testcalls > simulatedopcount)
>  			prt("skipping zero length copy range\n");
