From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: fdmanana@gmail.com, Atemu <atemu.main@gmail.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: BUG: btrfs send: Kernel's memory usage rises until OOM kernel panic after sending ~37GiB
Date: Mon, 28 Oct 2019 20:36:09 +0800
Message-ID: <6364c263-0e47-9ff1-9288-7f6cadcc69bb@gmx.com>
In-Reply-To: <CAL3q7H4Wc0GnKNORVvwCOEk1QhzUweJr1JnN=+Scx5-TpQ5+yA@mail.gmail.com>


On 2019/10/28 7:30 PM, Filipe Manana wrote:
> On Sun, Oct 27, 2019 at 4:51 PM Atemu <atemu.main@gmail.com> wrote:
>>
>>> It's really hard to determine, you could try the following command to
>>> determine:
>>> # btrfs ins dump-tree -t extent --bfs /dev/nvme/btrfs |\
>>>   grep "(.*_ITEM.*)" | awk '{print $4" "$5" "$6" size "$10}'
>>>
>>> Then check which key shows up most often, and its size.
>>>
>>> If a key's objectid (the first value) shows up multiple times, it's a
>>> kinda heavily shared extent.
>>>
>>> Then search that objectid in the full extent tree dump, to find out how
>>> it's shared.
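
To make the counting step above concrete, here is a minimal sketch in Python. It assumes the pipeline's output lines look like "(<objectid> <TYPE> <offset>) size <itemsize>", which is what the awk expression should print for a dump-tree key line. The function name summarize() and the sample values are made up for illustration only.

```python
import re
from collections import Counter

# Assumed shape of one output line of the pipeline, e.g.
# "(13631488 EXTENT_ITEM 16384) size 53"
LINE_RE = re.compile(r"\((\d+) (\w+) (\d+)\) size (\d+)")

def summarize(lines):
    """Return (occurrences per objectid, largest item size per objectid)."""
    counts = Counter()
    max_size = {}
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # skip anything that is not a key line
        objectid, size = int(m.group(1)), int(m.group(4))
        counts[objectid] += 1
        max_size[objectid] = max(size, max_size.get(objectid, 0))
    return counts, max_size

# Made-up sample lines, not taken from any real filesystem.
sample = [
    "(13631488 EXTENT_ITEM 16384) size 53",
    "(13631488 EXTENT_ITEM 16384) size 82",
    "(13647872 METADATA_ITEM 0) size 33",
]
counts, max_size = summarize(sample)
# Print objectids from most to least frequently seen.
for objectid, n in counts.most_common():
    print(objectid, n, max_size[objectid])
```

The objectids at the top of that list are the ones worth searching for in the full extent tree dump.
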
>>
>> I analyzed it a bit differently but this should be the information we wanted:
>>
>> https://gist.github.com/Atemu/206c44cd46474458c083721e49d84a42
> 
> That's quite a lot of extents shared many times.
> That indeed slows backreference walking and therefore send which uses it.
> While the slowdown is known, the memory consumption I wasn't aware of,
> but from your logs, it's not clear
> where it comes exactly from, something to be looked at. There's also a
> significant number of data checksum errors.
> 
> I think in the meantime send can just skip backreference walking and
> the clone attempt whenever the number of
> backreferences for an inode exceeds some limit, in which case it would
> fall back to writes instead of cloning.

A long time ago I had a proposal to record sent extents in an rbtree and,
instead of doing the full backref walk, search that rbtree.
That should still be way faster than a full backref walk, and still have a
good enough hit rate.
(And of course, if the lookup fails, it falls back to a regular write.)
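
A rough userspace sketch of that idea, in Python rather than kernel C, and only under assumptions: a sorted list with bisect stands in for the rbtree, extents are keyed by their disk start offset, and the names SentExtentCache and send_extent are hypothetical, not anything in the send code.

```python
import bisect

class SentExtentCache:
    """Already-sent extents, kept sorted by disk start offset.

    A sorted list plus bisect is used here as a stand-in for an rbtree.
    """

    def __init__(self):
        self._starts = []   # sorted disk start offsets (the lookup keys)
        self._entries = []  # parallel list of (start, length, source)

    def record(self, start, length, source):
        """Remember that [start, start + length) was sent from `source`."""
        i = bisect.bisect_left(self._starts, start)
        if i < len(self._starts) and self._starts[i] == start:
            return  # already recorded; keep the first source
        self._starts.insert(i, start)
        self._entries.insert(i, (start, length, source))

    def lookup(self, start, length):
        """Return (source, offset into it) if the range was sent, else None."""
        i = bisect.bisect_right(self._starts, start) - 1
        if i < 0:
            return None
        e_start, e_len, source = self._entries[i]
        if start + length <= e_start + e_len:
            return source, start - e_start
        return None

def send_extent(cache, start, length, source):
    """Clone on a cache hit, otherwise fall back to a regular write."""
    hit = cache.lookup(start, length)
    if hit is not None:
        return ("clone", hit)
    cache.record(start, length, source)
    return ("write", None)

cache = SentExtentCache()
print(send_extent(cache, 4096, 8192, "file-a"))  # first time: write
print(send_extent(cache, 4096, 8192, "file-b"))  # seen before: clone
```

The lookup only ever touches the cache, so a miss costs one search and a regular write. Nothing here walks backrefs.
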

Thanks,
Qu

> 
> I'll look into it, thanks for the report (and Qu for telling how to
> get the backreference counts).
> 
>>
>> Yeah...
>>
>> Is there any way to "unshare" these worst cases without having to
>> btrfs defragment everything?
>>
>> I also uploaded the (compressed) extent tree dump if you want to take
>> a look yourself (205MB, expires in 7 days):
>>
>> https://send.firefox.com/download/a729c57a94fcd89e/#w51BjzRmGnCg2qKNs39UNw
>>
>> -Atemu
> 
> 
> 



