Linux-BTRFS Archive on lore.kernel.org
* many busy btrfs processes during heavy cpu and memory pressure
From: Chris Murphy @ 2019-08-12  2:27 UTC
  To: Btrfs BTRFS

I'm not sure this is a bug, but I'm also not sure if the behavior is expected.

Test system as follows:

Intel i7-2820QM, 4 cores / 8 threads
8 GiB RAM, 8 GiB swap on SSD plain partition
Samsung SSD 840 EVO 250GB
kernel 5.3.0-0.rc3.git0.1.fc31.x86_64+debug, but same behavior seen on 5.2.6

The test involves using a GNOME Shell desktop while building webkitgtk.
The build uses all available RAM, and eventually all available swap.

While the build fails on ext4 as well as on Btrfs, the difference on
Btrfs is that many btrfs processes take up quite a lot of CPU, and
iotop shows many processes with unexpectedly high read IO. I don't
have enough data collected to be certain, but it does seem that on
Btrfs the oom killer is substantially delayed. Realistically, by the
time the system is in this state, it's effectively lost.

Screenshot shows iotop and top state information for this system, at
the time sysrq+t is taken.

Full 'journalctl -k' output is rather large: 13MB uncompressed,
714K zstd-compressed
https://drive.google.com/open?id=1bYYedsj1O4pii51MUy-7cWhnWGXb67XE

Output from the last sysrq+t
https://drive.google.com/open?id=1vhnIki9lpiWK8T5Qsl81_RToQ8CFdnfU

last screenshot, matching above sysrq+t
https://drive.google.com/open?id=12jpQeskPsvHmfvDjWSPOwIWSz09JIUlk


-- 
Chris Murphy


* Re: many busy btrfs processes during heavy cpu and memory pressure
From: Qu Wenruo @ 2019-08-12  5:43 UTC
  To: Chris Murphy, Btrfs BTRFS

On 2019/8/12 10:27 AM, Chris Murphy wrote:
> last screenshot, matching above sysrq+t
> https://drive.google.com/open?id=12jpQeskPsvHmfvDjWSPOwIWSz09JIUlk

This shows it's the btrfs endio workqueue, which does the data
verification against the csum tree.

That's the difference: ext* just doesn't support data csum.
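
As a rough illustration of what per-block data checksum verification on
read involves, here is a minimal Python sketch. It is not btrfs code:
the 4096-byte block size, the function names, and the in-memory list
standing in for the csum tree are all assumptions made for the example.
It uses CRC-32C, the btrfs default checksum, in a slow pure-Python form.

```python
# Illustrative sketch only, not btrfs code. BLOCK_SIZE, the function
# names, and the list standing in for the csum tree are assumptions.

def _make_crc32c_table():
    """Build the 256-entry lookup table for CRC-32C (Castagnoli)."""
    poly = 0x82F63B78  # reflected Castagnoli polynomial
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC32C_TABLE = _make_crc32c_table()


def crc32c(data):
    """Compute the standard CRC-32C of a bytes-like object."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _CRC32C_TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


BLOCK_SIZE = 4096  # assumed block (sector) size


def checksum_blocks(data):
    """Return one checksum per BLOCK_SIZE chunk, as stored at write time."""
    return [crc32c(data[off:off + BLOCK_SIZE])
            for off in range(0, len(data), BLOCK_SIZE)]


def verify_read(data, stored_csums):
    """Recompute checksums for data just read and return the indices
    of blocks that do not match the stored values."""
    computed = checksum_blocks(data)
    return [i for i, (got, want) in enumerate(zip(computed, stored_csums))
            if got != want]
```

Every block read from disk has to pass through a step like
verify_read() before it is handed back, which is CPU work that a
filesystem without data checksums never does.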

Thanks,
Qu





* Re: many busy btrfs processes during heavy cpu and memory pressure
From: Chris Murphy @ 2019-08-12 16:00 UTC
  To: Qu Wenruo; +Cc: Chris Murphy, Btrfs BTRFS

On Sun, Aug 11, 2019 at 11:43 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
> This shows it's the btrfs endio workqueue, which does the data
> verification against the csum tree.
>
> That's the difference: ext* just doesn't support data csum.

But 10-17% CPU, times 8 processes? Even during a scrub at maximum SSD
read speed there isn't such a load from csum computations.

Take a look at this screenshot:
https://drive.google.com/file/d/1IDboR1fzP4onu_tzyZxsx7M5cT_RJ7Iz/view

That doesn't even make sense. How is it possible that Btrfs is using
100% CPU times 10 processes? There aren't even that many cores. And
then Firefox is using 800% CPU? That's another 8 cores that don't
exist. And then look at iotop, which is reporting 28G/s reads? This is
an ordinary SATA SSD that can't do more than maybe 600M/s reads.
Something is very weird and is being misreported. But again, only on
Btrfs. It doesn't happen with ext4, even though the user experience of
the system hang is the same, not worse, on Btrfs. It's just that the
system statistics seem much crazier on Btrfs.

The other time I've seen this behavior was when running Firefox under
gdb with certain kinds of crashes that have nothing to do with swap.
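
The plausibility argument above can be written down as a small Python
sketch. The function names are made up for the example, the figures are
the ones quoted in this message, and 28G/s is treated as roughly
28000M/s, which is an approximation.

```python
# Sanity-check sketch: function names are hypothetical, figures come
# from the message (8 logical CPUs, ~600M/s SATA ceiling, 28G/s reported).

def cpu_capacity_pct(logical_cpus):
    """Maximum total %CPU the machine can deliver, at 100% per logical CPU."""
    return logical_cpus * 100


def is_cpu_total_plausible(per_process_pct, logical_cpus):
    """True if the summed per-process %CPU fits within the machine's capacity."""
    return sum(per_process_pct) <= cpu_capacity_pct(logical_cpus)


def is_read_rate_plausible(rate_mb_s, device_limit_mb_s):
    """True if a reported read rate does not exceed the device's ceiling."""
    return rate_mb_s <= device_limit_mb_s


LOGICAL_CPUS = 8        # i7-2820QM: 4 cores, 8 threads
SATA_LIMIT_MB_S = 600   # rough ceiling quoted in the message

# Ten btrfs workers at 100% each plus Firefox at 800%.
reported_cpu = [100.0] * 10 + [800.0]
reported_read_mb_s = 28_000  # 28G/s, approximated as 28000M/s

print("CPU total plausible:", is_cpu_total_plausible(reported_cpu, LOGICAL_CPUS))
print("Read rate plausible:", is_read_rate_plausible(reported_read_mb_s, SATA_LIMIT_MB_S))
```

Both checks fail for the numbers in the screenshot, which points at the
accounting rather than at real work being done.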

-- 
Chris Murphy
