Linux-BTRFS Archive on
 help / color / Atom feed
From: David Sterba <>
Subject: Re: [PATCH] Btrfs: fix corruption reading shared and compressed extents after hole punching
Date: Wed, 27 Feb 2019 15:29:04 +0100
Message-ID: <> (raw)
In-Reply-To: <>

On Thu, Feb 14, 2019 at 03:17:20PM +0000, wrote:
> From: Filipe Manana <>
> In the past we had data corruption when reading compressed extents that
> are shared within the same file and they are consecutive, this got fixed
> by commit 005efedf2c7d0 ("Btrfs: fix read corruption of compressed and
> shared extents") and by commit 808f80b46790f ("Btrfs: update fix for read
> corruption of compressed and shared extents"). However there was a case
> that was missing in those fixes, which is when the shared and compressed
> extents are referenced with a non-zero offset. The following shell script
> creates a reproducer for this issue:
>   #!/bin/bash
>   mkfs.btrfs -f /dev/sdc &> /dev/null
>   mount -o compress /dev/sdc /mnt/sdc
>   # Create a file with 3 consecutive compressed extents, each has an
>   # uncompressed size of 128Kb and a compressed size of 4Kb.
>   for ((i = 1; i <= 3; i++)); do
>       head -c 4096 /dev/zero
>       for ((j = 1; j <= 31; j++)); do
>           head -c 4096 /dev/zero | tr '\0' "\377"
>       done
>   done > /mnt/sdc/foobar
>   sync
>   echo "Digest after file creation:   $(md5sum /mnt/sdc/foobar)"
>   # Clone the first extent into offsets 128K and 256K.
>   xfs_io -c "reflink /mnt/sdc/foobar 0 128K 128K" /mnt/sdc/foobar
>   xfs_io -c "reflink /mnt/sdc/foobar 0 256K 128K" /mnt/sdc/foobar
>   sync
>   echo "Digest after cloning:         $(md5sum /mnt/sdc/foobar)"
>   # Punch holes into the regions that are already full of zeroes.
>   xfs_io -c "fpunch 0 4K" /mnt/sdc/foobar
>   xfs_io -c "fpunch 128K 4K" /mnt/sdc/foobar
>   xfs_io -c "fpunch 256K 4K" /mnt/sdc/foobar
>   sync
>   echo "Digest after hole punching:   $(md5sum /mnt/sdc/foobar)"
>   echo "Dropping page cache..."
>   sysctl -q vm.drop_caches=1
>   echo "Digest after hole punching:   $(md5sum /mnt/sdc/foobar)"
>   umount /dev/sdc
> When running the script we get the following output:
>   Digest after file creation:   5a0888d80d7ab1fd31c229f83a3bbcc8  /mnt/sdc/foobar
>   linked 131072/131072 bytes at offset 131072
>   128 KiB, 1 ops; 0.0033 sec (36.960 MiB/sec and 295.6830 ops/sec)
>   linked 131072/131072 bytes at offset 262144
>   128 KiB, 1 ops; 0.0015 sec (78.567 MiB/sec and 628.5355 ops/sec)
>   Digest after cloning:         5a0888d80d7ab1fd31c229f83a3bbcc8  /mnt/sdc/foobar
>   Digest after hole punching:   5a0888d80d7ab1fd31c229f83a3bbcc8  /mnt/sdc/foobar
>   Dropping page cache...
>   Digest after hole punching:   fba694ae8664ed0c2e9ff8937e7f1484  /mnt/sdc/foobar
> This happens because after reading all the pages of the extent in the
> range from 128K to 256K for example, we read the hole at offset 256K
> and then when reading the page at offset 260K we don't submit the
> existing bio, which is responsible for filling all the page in the
> range 128K to 256K only, therefore adding the pages from range 260K
> to 384K to the existing bio and submitting it after iterating over the
> entire range. Once the bio completes, the uncompressed data fills only
> the pages in the range 128K to 256K because there's no more data read
> from disk, leaving the pages in the range 260K to 384K unfilled. It is
> just a slightly different variant of what was solved by commit
> 005efedf2c7d0 ("Btrfs: fix read corruption of compressed and shared
> extents").
> Fix this by forcing a bio submit, during readpages(), whenever we find a
> compressed extent map for a page that is different from the extent map
> for the previous page or has a different starting offset (in case it's
> the same compressed extent), instead of the extent map's original start
> offset.
> A test case for fstests follows soon.
> Reported-by: Zygo Blaxell <>
> Tested-by: Zygo Blaxell <>
> Cc: # 4.3+
> Signed-off-by: Filipe Manana <>

Added to misc-next and scheduled for 5.1, thanks. And thanks to Zygo for
the report.

      reply index

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-14 15:17 fdmanana
2019-02-27 14:29 ` David Sterba [this message]

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-BTRFS Archive on

Archives are clonable:
	git clone --mirror linux-btrfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-btrfs linux-btrfs/ \
	public-inbox-index linux-btrfs

Newsgroup available over NNTP:

AGPL code for this site: git clone public-inbox