From: Filipe Manana <fdmanana@gmail.com>
To: Atemu <atemu.main@gmail.com>
Cc: Qu Wenruo <quwenruo.btrfs@gmx.com>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: BUG: btrfs send: Kernel's memory usage rises until OOM kernel panic after sending ~37GiB
Date: Mon, 28 Oct 2019 11:30:01 +0000
Message-ID: <CAL3q7H4Wc0GnKNORVvwCOEk1QhzUweJr1JnN=+Scx5-TpQ5+yA@mail.gmail.com>
In-Reply-To: <CAE4GHg=4S4KqzBGHo-7T3cmmgECZxWZ-vXJMq8SYnnwy16h3xg@mail.gmail.com>

On Sun, Oct 27, 2019 at 4:51 PM Atemu <atemu.main@gmail.com> wrote:
>
> > It's really hard to determine, you could try the following command to
> > determine:
> > # btrfs ins dump-tree -t extent --bfs /dev/nvme/btrfs |\
> >   grep "(.*_ITEM.*)" | awk '{print $4" "$5" "$6" size "$10}'
> >
> > Then which key is the most shown one and its size.
> >
> > If a key's objectid (the first value) shows up multiple times, it's a
> > kinda heavily shared extent.
> >
> > Then search that objectid in the full extent tree dump, to find out how
> > it's shared.
>
> I analyzed it a bit differently but this should be the information we wanted:
>
> https://gist.github.com/Atemu/206c44cd46474458c083721e49d84a42

That's quite a lot of extents, each shared many times.
That does slow down backreference walking, and therefore send, which
relies on it. The slowdown is a known issue, but I wasn't aware of the
memory consumption, and from your logs it's not clear exactly where it
comes from. That needs to be looked at. There's also a significant
number of data checksum errors.

I think in the meantime send can simply skip backreference walking,
and with it the attempt to clone, whenever the number of
backreferences for an inode exceeds some limit. In that case it would
fall back to writes instead of cloning.
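
As a rough illustration of that idea only (not the actual send code),
the decision could be sketched as below. The function name, the
returned labels and the limit of 64 are all hypothetical; the proposal
only says "some limit" and does not fix a value.

```python
# Hypothetical sketch of the proposed fallback. None of these names or
# values come from the btrfs source; the limit of 64 is an assumption.
BACKREF_LIMIT = 64


def choose_send_op(backref_count: int, limit: int = BACKREF_LIMIT) -> str:
    """Decide how send should emit data, given a backreference count.

    Returns "write" when the count exceeds the limit, meaning the
    backreference walk and the clone attempt are skipped entirely.
    Returns "try_clone" otherwise, meaning the walk is still done and a
    clone source is searched for as before.
    """
    if backref_count > limit:
        return "write"
    return "try_clone"


if __name__ == "__main__":
    for count in (3, 64, 65, 10_000):
        print(count, choose_send_op(count))
```

The trade-off is that heavily shared data is sent as plain writes, so
the stream gets larger and sharing is not preserved on the receiving
side, but the cost of the walk stays bounded.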

I'll look into it. Thanks for the report, and thanks to Qu for
explaining how to get the backreference counts.

>
> Yeah...
>
> Is there any way to "unshare" these worst cases without having to
> btrfs defragment everything?
>
> I also uploaded the (compressed) extent tree dump if you want to take
> a look yourself (205MB, expires in 7 days):
>
> https://send.firefox.com/download/a729c57a94fcd89e/#w51BjzRmGnCg2qKNs39UNw
>
> -Atemu



-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”
