From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: Eryu Guan <eguan@redhat.com>
Cc: linux-btrfs@vger.kernel.org, fstests@vger.kernel.org
Subject: Re: [PATCH] fstests: btrfs: Test send on heavily deduped file
Date: Tue, 19 Jul 2016 13:42:03 +0800
Message-ID: <fb7acb1e-3835-1511-d66e-7d186c94e35e@cn.fujitsu.com>
In-Reply-To: <20160719043524.GL27776@eguan.usersys.redhat.com>



At 07/19/2016 12:35 PM, Eryu Guan wrote:
> On Tue, Jul 19, 2016 at 10:44:02AM +0800, Qu Wenruo wrote:
>> For a fully deduped file, whose file extents all point to the same
>> extent, a btrfs backref walk can be very time consuming, long enough to
>> trigger a soft lockup.
>>
>> Unfortunately, btrfs send is one of the callers of such a backref walk,
>> inside an O(n) loop, making the total time complexity O(n^3) or more.
>>
>> Even worse, btrfs send allocates memory inside that loop, which can
>> trigger OOM on systems with small memory (<4G).
>>
>> This test case will check if btrfs send will cause these problems.
>>
>> Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
>> To: Filipe Manana <fdmanana@gmail.com>
>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>> ---
>> To Filipe:
>>   For the soft lockup, I will try my best to figure out some method to
>>   avoid such lockup (but it will still be very time consuming though).
>>
>>   But for the OOM problem, would you mind disabling clone/reflink
>>   detection in btrfs send?
>>
>>   In fact we should really avoid doing a full backref walk inside an O(n)
>>   loop (just like the previous fiemap ioctl test case), and avoid any full
>>   backref walk if possible.
>>   So I'm afraid that's the only solution so far.
>>
>> Thanks,
>> Qu
>> ---
>>  tests/btrfs/127     | 89 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  tests/btrfs/127.out |  3 ++
>>  tests/btrfs/group   |  1 +
>>  3 files changed, 93 insertions(+)
>>  create mode 100755 tests/btrfs/127
>>  create mode 100644 tests/btrfs/127.out
>>
>> diff --git a/tests/btrfs/127 b/tests/btrfs/127
>> new file mode 100755
>> index 0000000..a31a653
>> --- /dev/null
>> +++ b/tests/btrfs/127
>> @@ -0,0 +1,89 @@
>> +#! /bin/bash
>> +# FS QA Test 127
>> +#
>> +# Check if btrfs send can handle a large deduped file, whose file extents
>> +# all point to one extent.
>> +# Such a file structure puts heavy pressure on any operation which
>> +# iterates over all backrefs of one extent.
>> +# Unfortunately, btrfs send is one of these operations, and will cause
>> +# a soft lockup or OOM on systems with small memory (<4G).
>> +#
>> +#-----------------------------------------------------------------------
>> +# Copyright (c) 2016 Fujitsu. All Rights Reserved.
>> +#
>> +# This program is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU General Public License as
>> +# published by the Free Software Foundation.
>> +#
>> +# This program is distributed in the hope that it would be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with this program; if not, write the Free Software Foundation,
>> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
>> +#-----------------------------------------------------------------------
>> +#
>> +
>> +seq=`basename $0`
>> +seqres=$RESULT_DIR/$seq
>> +echo "QA output created by $seq"
>> +
>> +here=`pwd`
>> +tmp=/tmp/$$
>> +status=1	# failure is the default!
>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>> +
>> +_cleanup()
>> +{
>> +	cd /
>> +	rm -f $tmp.*
>> +}
>> +
>> +# get standard environment, filters and checks
>> +. ./common/rc
>> +. ./common/filter
>> +. ./common/reflink
>> +
>> +# remove previous $seqres.full before test
>> +rm -f $seqres.full
>> +
>> +# real QA test starts here
>> +
>> +# Modify as appropriate.
>> +_supported_fs btrfs
>> +_supported_os Linux
>> +_require_scratch
>> +_require_scratch_reflink
>> +
>> +_scratch_mkfs > /dev/null 2>&1
>> +_scratch_mount
>> +
>> +nr_extents=$((4096 * $LOAD_FACTOR))
>> +
>> +# Use 128K blocksize, the default value of both duperemove and
>> +# inband dedupe
>> +blocksize=$((128 * 1024))
>> +file=$SCRATCH_MNT/foobar
>> +
>> +# create the initial file, whose file extents all point to one extent
>> +_pwrite_byte 0xcdcdcdcd 0 $blocksize  $file | _filter_xfs_io
>> +
>> +for i in $(seq 1 $(($nr_extents - 1))); do
>> +	_reflink_range $file 0 $file $(($i * $blocksize)) $blocksize \
>> +		> /dev/null 2>&1
>> +done
>> +
>> +# create a RO snapshot, so we can send out the snapshot
>> +_run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/ro_snap
>> +
>> +# send out the subvolume, and it will either:
>> +# 1) OOM, since memory is allocated inside an O(n^3) loop
>> +# 2) Soft lockup, since the time consuming backref walk runs without scheduling.
>> +# The send destination is not important; the send itself triggers the problem
>> +_run_btrfs_util_prog send $SCRATCH_MNT/ro_snap > /dev/null 2>&1
>> +
>> +# success, all done
>> +status=0
>> +exit
>> diff --git a/tests/btrfs/127.out b/tests/btrfs/127.out
>> new file mode 100644
>> index 0000000..8b08bf8
>> --- /dev/null
>> +++ b/tests/btrfs/127.out
>> @@ -0,0 +1,3 @@
>> +QA output created by 127
>> +wrote 131072/131072 bytes at offset 0
>> +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> diff --git a/tests/btrfs/group b/tests/btrfs/group
>> index a21a80a..d9174b5 100644
>> --- a/tests/btrfs/group
>> +++ b/tests/btrfs/group
>> @@ -129,3 +129,4 @@
>>  124 auto replace
>>  125 auto replace
>>  126 auto quick qgroup
>> +127 auto clone send
>
> This test uses $LOAD_FACTOR, so it should be in the 'stress' group. And it
> hangs the latest kernel, stopping other tests from running, so I think we
> can add it to the 'dangerous' group as well.
>

Thanks for this info.
I'm completely OK with adding this test to the 'stress' and 'dangerous'
groups.


However, I'm a little curious about the meaning/standard of these groups.

Does 'dangerous' conflict with 'auto'?
In most cases, a tester would just execute './check -g auto', and the
system would then hang at this test case.
So I'm a little confused about the 'auto' group.

BTW, I also hope there will be some documentation explaining the
standard of these groups, so that people like me can avoid wasting the
maintainers' time.
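
To make the question concrete, here is a rough standalone sketch of the
selection I would naively expect: run everything in 'auto' except what is
also tagged 'dangerous'. This is only an illustration, not how ./check
actually selects tests. The helper name select_tests and the group file
contents are made up, with 127 tagged the way Eryu suggested.

```shell
#!/bin/bash
# Hypothetical helper (not part of fstests): print the tests that belong to
# group $2 but not to group $3, reading a group file whose lines look like
# "<test> <group> <group> ...".
select_tests()
{
	local group_file=$1 include=$2 exclude=$3

	awk -v inc="$include" -v exc="$exclude" '
		NF == 0 || $1 ~ /^#/ { next }	# skip blank lines and comments
		{
			want = 0; skip = 0
			for (i = 2; i <= NF; i++) {
				if ($i == inc) want = 1
				if ($i == exc) skip = 1
			}
			if (want && !skip) print $1
		}' "$group_file"
}

# Made-up group file for illustration only
group_file=$(mktemp)
cat > "$group_file" <<EOF
124 auto replace
125 auto replace
126 auto quick qgroup
127 auto clone send stress dangerous
EOF

select_tests "$group_file" auto dangerous
rm -f "$group_file"
```

With the made-up file above this prints 124, 125 and 126, and leaves 127
out. Whether 'auto' is supposed to behave like this is exactly what I'm
unsure about.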

Thanks,
Qu

> I can fix them at merge time, if there are no other major updates to be
> done. (I'll let the patch sit on the list for more time, in case
> others have more review comments).
>
> Thanks,
> Eryu
>
>


