From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Atemu <atemu.main@gmail.com>, linux-btrfs@vger.kernel.org
Subject: Re: BUG: btrfs send: Kernel's memory usage rises until OOM kernel panic after sending ~37GiB
Date: Sun, 27 Oct 2019 08:50:08 +0800
Message-ID: <cb5f9048-919f-0ff9-0765-d5a33e58afa7@gmx.com>
In-Reply-To: <CAE4GHg=W+a319=Ra_PNh3LV0hdD-Y12k-0N5ej72FSt=Fq520Q@mail.gmail.com>



On 2019/10/27 1:46 AM, Atemu wrote:
> Hi linux-btrfs,
> after btrfs sending ~37GiB of a snapshot of one of my subvolumes,
> btrfs send stalls (`pv` (which I'm piping it through) does not report
> any significant throughput anymore) and shortly after, the Kernel's
> memory usage starts to rise until it runs OOM and panics.
> 
> Here's the tail of dmesg I saved before such a Kernel panic:
> 
> https://gist.githubusercontent.com/Atemu/3af591b9fa02efee10303ccaac3b4a85/raw/f27c0c911f4a9839a6e59ed494ff5066c7754e07/btrfs%2520send%2520OOM%2520log
> 
> (I cancelled the first btrfs send in this example FYI, that's not part
> of nor required for this bug.)
> 
> And here's a picture of the screen after the Kernel panic:
> 
> https://photos.app.goo.gl/cEj5TA9B5V8eRXsy9
> 
> (This was recorded a while back but I am able to reproduce the same bug
> on archlinux-2019.10.01-x86_64.iso.)
> 
> The snapshot holds ~3.8TiB of data that has been compressed (ZSTD:3)
> and heavily deduplicated down to ~1.9TiB.

That's the problem.

Deduped files put a heavy load on the backref walk.
Send has to do a backref walk, so you can see the problem...
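
To give a rough feel for why sharing hurts, here is a toy model, not kernel code. It assumes (as a simplification) that each file extent item triggers one walk over every reference to its physical extent; the function name and the tuple layout are made up for illustration.

```python
from collections import defaultdict

def backref_walk_cost(file_extents):
    """file_extents: list of (inode, file_offset, physical_extent) tuples.

    Returns how many references get visited if every file extent item
    triggers a walk over all references to its physical extent.
    This is a hypothetical cost model, not what the kernel computes.
    """
    refs = defaultdict(list)
    for inode, offset, extent in file_extents:
        refs[extent].append((inode, offset))
    # One walk per file extent item; each walk visits every reference.
    return sum(len(refs[extent]) for _, _, extent in file_extents)

# No sharing: 1000 file extents, each with its own physical extent.
unique = [(1, i, i) for i in range(1000)]
# Heavy dedupe: 1000 file extents all pointing at one physical extent.
shared = [(1, i, 0) for i in range(1000)]

print(backref_walk_cost(unique))  # 1000
print(backref_walk_cost(shared))  # 1000000
```

In this model the work grows quadratically with the number of references to a single extent, which is the kind of blow-up heavy dedupe can cause.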

I'm very interested in how heavily deduped the file is.
If it's just all-zero pages, hole punching is more effective than dedupe
and causes zero backref overhead.
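
As a minimal sketch of how one could check for that, the snippet below finds runs of all-zero blocks in a buffer. The helper name `zero_runs` and the 4096-byte block size are assumptions for illustration. The ranges it reports are the candidates for hole punching, for example via fallocate(2) with FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, or with `fallocate --dig-holes` from util-linux, which does the detection itself.

```python
def zero_runs(data, block_size=4096):
    """Return (offset, length) pairs covering runs of all-zero blocks.

    Only full blocks are considered; a trailing partial block is
    never reported, even if it contains only zero bytes.
    """
    zero = bytes(block_size)
    runs = []
    start = None
    for off in range(0, len(data), block_size):
        if data[off:off + block_size] == zero:
            if start is None:
                start = off  # a new run of zero blocks begins here
        elif start is not None:
            runs.append((start, off - start))
            start = None
    if start is not None:
        # The run reaches the end of the data (last block was full).
        runs.append((start, len(data) - start))
    return runs

sample = b"\0" * 8192 + b"x" * 4096 + b"\0" * 4096
print(zero_runs(sample))  # [(0, 8192), (12288, 4096)]
```

For a real file one would read it in chunks instead of loading it whole; this only shows the detection logic.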

Thanks,
Qu

> For deduplication I used `bedup dedup` and `duperemove -x -r -h -A -b
> 32K --skip-zeroes --dedupe-options=same,fiemap,noblock` and IIRC it
> was mostly done around the time 4.19 and 4.20 were recent.
> 
> The Inode that btrfs reports as corrupt towards the end of the dmesg
> is a 37GiB 7z archive (size correlates) and can be read without errors
> on a live system where the bug hasn't been triggered yet. Since it
> happens to be a 7z archive, I can even confirm its integrity with `7z
> t`.
> A scrub and `btrfs check --check-data-csum` don't detect any errors either.
> 
> Please tell me what other information I could provide that might be
> useful/necessary for squashing this bug,
> Atemu
> 
> PS: I could spin up a VM with device mapper snapshots of the drives,
> destructive troubleshooting is possible if needed.
> 



