All of lore.kernel.org
 help / color / mirror / Atom feed
From: Filipe Manana <fdmanana@kernel.org>
To: Eryu Guan <guaneryu@gmail.com>
Cc: fstests <fstests@vger.kernel.org>,
	linux-btrfs <linux-btrfs@vger.kernel.org>,
	Filipe Manana <fdmanana@suse.com>
Subject: Re: [PATCH] generic: test for deduplication between different files
Date: Sun, 19 Aug 2018 16:41:31 +0100	[thread overview]
Message-ID: <CAL3q7H7igoFqM+5sKUCVia_TK4Q9nR+4Ys7WuZ3eW1OiYY5_gA@mail.gmail.com> (raw)
In-Reply-To: <20180819140759.GG2791@desktop>

On Sun, Aug 19, 2018 at 3:07 PM, Eryu Guan <guaneryu@gmail.com> wrote:
> On Fri, Aug 17, 2018 at 09:39:24AM +0100, fdmanana@kernel.org wrote:
>> From: Filipe Manana <fdmanana@suse.com>
>>
>> Test that deduplication of an entire file that has a size that is not
>> aligned to the filesystem's block size into a different file does not
>> corrupt the destination's file data.
>>
>> This test is motivated by a bug found in Btrfs which is fixed by the
>> following patch for the linux kernel:
>>
>>   "Btrfs: fix data corruption when deduplicating between different files"
>>
>> XFS also fails this test, at least as of linux kernel 4.18-rc7, exactly
>> with the same corruption as in Btrfs - some bytes of a block get replaced
>> with zeroes after the deduplication.
>>
>> Signed-off-by: Filipe Manana <fdmanana@suse.com>
>> ---
>>  tests/generic/505     | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  tests/generic/505.out | 33 ++++++++++++++++++++
>>  tests/generic/group   |  1 +
>>  3 files changed, 118 insertions(+)
>>  create mode 100755 tests/generic/505
>>  create mode 100644 tests/generic/505.out
>>
>> diff --git a/tests/generic/505 b/tests/generic/505
>> new file mode 100755
>> index 00000000..5ee232a2
>> --- /dev/null
>> +++ b/tests/generic/505
>> @@ -0,0 +1,84 @@
>> +#! /bin/bash
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Copyright (C) 2018 SUSE Linux Products GmbH. All Rights Reserved.
>> +#
>> +# FS QA Test No. 505
>> +#
>> +# Test that deduplication of an entire file that has a size that is not aligned
>> +# to the filesystem's block size into a different file does not corrupt the
>> +# destination's file data.
>> +#
>> +seq=`basename $0`
>> +seqres=$RESULT_DIR/$seq
>> +echo "QA output created by $seq"
>> +tmp=/tmp/$$
>> +status=1     # failure is the default!
>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>> +
>> +_cleanup()
>> +{
>> +     cd /
>> +     rm -f $tmp.*
>> +}
>> +
>> +# get standard environment, filters and checks
>> +. ./common/rc
>> +. ./common/filter
>> +. ./common/reflink
>> +
>> +# real QA test starts here
>> +_supported_fs generic
>> +_supported_os Linux
>> +_require_scratch_dedupe
>> +
>> +rm -f $seqres.full
>> +
>> +_scratch_mkfs >>$seqres.full 2>&1
>> +_scratch_mount
>> +
>> +# The first byte with a value of 0xae starts at an offset (2518890) which is not
>> +# a multiple of the block size.
>> +$XFS_IO_PROG -f \
>> +     -c "pwrite -S 0x6b 0 2518890" \
>> +     -c "pwrite -S 0xae 2518890 102398" \
>> +     $SCRATCH_MNT/foo | _filter_xfs_io
>> +
>> +# Create a second file with a length not aligned to the block size, whose bytes
>> +# all have the value 0x6b, so that its extent(s) can be deduplicated with the
>> +# first file.
>> +$XFS_IO_PROG -f -c "pwrite -S 0x6b 0 557771" $SCRATCH_MNT/bar | _filter_xfs_io
>> +
>> +# The file is filled with bytes having the value 0x6b from offset 0 to offset
>> +# 2518889 and with the value 0xae from offset 2518890 to offset 2621287.
>> +echo "File content before deduplication:"
>> +od -t x1 $SCRATCH_MNT/foo
>> +
>> +# Now deduplicate the entire second file into a range of the first file that
>> +# also has all bytes with the value 0x6b. The destination range's end offset
>> +# must not be aligned to the block size and must be less then the offset of
>> +# the first byte with the value 0xae (byte at offset 2518890).
>> +$XFS_IO_PROG -c "dedupe $SCRATCH_MNT/bar 0 1957888 557771" $SCRATCH_MNT/foo \
>> +     | _filter_xfs_io
>> +
>> +# The bytes in the range starting at offset 2515659 (end of the deduplication
>> +# range) and ending at offset 2519040 (start offset rounded up to the block
>> +# size) must all have the value 0xae (and not replaced with 0x00 values).
>
> This doesn't seem right to me, range [2515659, 2518890) should be 0x6b
> not 0xae, while range [2518890, 2519040) indeed should contain 0xae.

Yes, indeed. My mistake (got it right in the comment before the first
call to "od").
Can you fix it up (if there's nothing else to fix), or do you need me
to send a new version?

Thanks!

>
> Thanks,
> Eryu
>
>> +# In other words, we should have exactly the same data we had before we asked
>> +# for deduplication.
>> +echo "File content after deduplication and before unmounting:"
>> +od -t x1 $SCRATCH_MNT/foo
>> +
>> +# Unmount the filesystem and mount it again. This guarantees any file data in
>> +# the page cache is dropped.
>> +_scratch_cycle_mount
>> +
>> +# The bytes in the range starting at offset 2515659 (end of the deduplication
>> +# range) and ending at offset 2519040 (start offset rounded up to the block
>> +# size) must all have the value 0xae (and not replaced with 0x00 values).
>> +# In other words, we should have exactly the same data we had before we asked
>> +# for deduplication.
>> +echo "File content after unmounting:"
>> +od -t x1 $SCRATCH_MNT/foo
>> +
>> +status=0
>> +exit
>> diff --git a/tests/generic/505.out b/tests/generic/505.out
>> new file mode 100644
>> index 00000000..7556b9fb
>> --- /dev/null
>> +++ b/tests/generic/505.out
>> @@ -0,0 +1,33 @@
>> +QA output created by 505
>> +wrote 2518890/2518890 bytes at offset 0
>> +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> +wrote 102398/102398 bytes at offset 2518890
>> +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> +wrote 557771/557771 bytes at offset 0
>> +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> +File content before deduplication:
>> +0000000 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
>> +*
>> +11467540 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ae ae ae ae ae ae
>> +11467560 ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae
>> +*
>> +11777540 ae ae ae ae ae ae ae ae
>> +11777550
>> +deduped 557771/557771 bytes at offset 1957888
>> +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> +File content after deduplication and before unmounting:
>> +0000000 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
>> +*
>> +11467540 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ae ae ae ae ae ae
>> +11467560 ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae
>> +*
>> +11777540 ae ae ae ae ae ae ae ae
>> +11777550
>> +File content after unmounting:
>> +0000000 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
>> +*
>> +11467540 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ae ae ae ae ae ae
>> +11467560 ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae
>> +*
>> +11777540 ae ae ae ae ae ae ae ae
>> +11777550
>> diff --git a/tests/generic/group b/tests/generic/group
>> index 55155de8..2ff1bd7e 100644
>> --- a/tests/generic/group
>> +++ b/tests/generic/group
>> @@ -507,3 +507,4 @@
>>  502 auto quick log
>>  503 auto quick dax punch collapse zero
>>  504 auto quick locks
>> +505 auto quick clone dedupe
>> --
>> 2.11.0
>>

  reply	other threads:[~2018-08-19 18:53 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-17  8:39 [PATCH] generic: test for deduplication between different files fdmanana
2018-08-19 14:07 ` Eryu Guan
2018-08-19 15:41   ` Filipe Manana [this message]
2018-08-19 16:19     ` Eryu Guan
2018-08-19 16:21       ` Filipe Manana
2018-08-19 23:11 ` Dave Chinner
2018-08-20  1:09   ` [patch] file dedupe (and maybe clone) data corruption (was Re: [PATCH] generic: test for deduplication between different files) Dave Chinner
2018-08-20 15:33     ` Darrick J. Wong
2018-08-21  0:49       ` Dave Chinner
2018-08-21  1:17         ` Eric Sandeen
2018-08-21  4:49           ` Dave Chinner
2018-08-23 12:58       ` Zygo Blaxell
2018-08-24  2:19         ` Zygo Blaxell
2018-08-30  6:27         ` Dave Chinner
2018-08-31  5:10           ` Zygo Blaxell
2018-09-06  8:38             ` Dave Chinner
2018-09-07  3:53               ` Zygo Blaxell
2018-09-10  9:06                 ` Dave Chinner
2018-09-19  4:12                   ` Zygo Blaxell
2018-09-21  2:59                     ` Dave Chinner
2018-09-21  4:40                       ` Zygo Blaxell
2018-10-01 20:34                         ` Andreas Dilger
2018-08-21 15:55     ` Filipe Manana
2018-08-21 15:57   ` [PATCH] generic: test for deduplication between different files Filipe Manana

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAL3q7H7igoFqM+5sKUCVia_TK4Q9nR+4Ys7WuZ3eW1OiYY5_gA@mail.gmail.com \
    --to=fdmanana@kernel.org \
    --cc=fdmanana@suse.com \
    --cc=fstests@vger.kernel.org \
    --cc=guaneryu@gmail.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.