cpu bound I/O behaviour in linux 5.4 (possibly others)

From: Marc Lehmann @ 2020-02-14 11:30 UTC
  To: linux-btrfs

Hi!

I've upgraded a machine that runs a small netnews system to linux 5.4.15.
It normally pulls news at about 20MB/s. After upgrading, it seems that
this process is now CPU bound, and I get only about 10MB/s throughput.
Otherwise, everything seems fine - no obvious bugs, and no obvious
performance problems.

"CPU-bound" specifically means that the disk(s) seem pretty idle (it is a
6x10TB raid5) and I can do a lot of I/O without slowing down the transfer,
but there is always a single kworker which is constantly at 100% cpu (i.e.
one core) in top:

 8963 root      20   0       0      0      0 R 2 100.0   0.0   2:04 [kworker/u8:15+flush-btrfs-3]

When I cat /proc/8963/task/8963/stack regularly, I get either no output or
(most often) this single line:

   [<0>] tree_search_offset.isra.0+0x16a/0x1d0 [btrfs]
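To put numbers on "most often", the sampling could be automated. The following is only a minimal Python sketch: the function names are made up for illustration, the regular expression assumes the `[<0>] symbol+0xOFF/0xLEN [module]` line format shown above, and reading the stack file needs root.

```python
import re
import time
from collections import Counter

# Matches lines such as "[<0>] tree_search_offset.isra.0+0x16a/0x1d0 [btrfs]"
# and captures the symbol name before the "+0x..." offset.
FRAME_RE = re.compile(r"\[<[0-9a-fx]+>\]\s+(\S+?)\+0x")


def top_frame(stack_text):
    """Return the symbol of the first frame in a stack dump, or None if empty."""
    for line in stack_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            return match.group(1)
    return None


def sample_stack(pid, samples=100, interval=0.1):
    """Read /proc/<pid>/task/<pid>/stack repeatedly and count the top frames."""
    counts = Counter()
    path = f"/proc/{pid}/task/{pid}/stack"
    for _ in range(samples):
        with open(path) as f:
            counts[top_frame(f.read()) or "<empty>"] += 1
        time.sleep(interval)
    return counts
```

Calling `sample_stack(8963).most_common()` as root would then list how often each symbol (or an empty stack) was seen for the kworker above.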

It is possible that this is _not_ new behaviour with 5.4, but I often use
top, and I can't remember having a kworker stuck at 100% cpu for days.
(The fs is about a year old and had no issues so far, the last scrub is about
a week old).

Another symptom is that Dirty in /proc/meminfo is typically at 7-8GB,
which is more or less the limit implied by /proc/sys/vm/dirty_ratio.
Writeback is usually 0 or has small values, and running sync often takes
30m or more.
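These values can be logged over time instead of checked by hand. Below is a minimal Python sketch with made-up function names. It reads only the Dirty and Writeback fields of /proc/meminfo and the raw dirty_ratio value, which is a percentage, and it makes no attempt to compute the kernel's actual dirty threshold.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo content into {field: int}; most values are in kB."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and rest.split():
            fields[key.strip()] = int(rest.split()[0])
    return fields


def read_dirty_state():
    """Return (Dirty kB, Writeback kB, dirty_ratio percent) for this machine."""
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    with open("/proc/sys/vm/dirty_ratio") as f:
        ratio = int(f.read())
    return info["Dirty"], info["Writeback"], ratio


def watch(interval=5.0, rounds=12):
    """Print the dirty state every `interval` seconds, `rounds` times."""
    for _ in range(rounds):
        dirty_kb, writeback_kb, ratio = read_dirty_state()
        print(f"Dirty {dirty_kb / 2**20:.2f} GiB  "
              f"Writeback {writeback_kb / 2**20:.2f} GiB  "
              f"dirty_ratio {ratio}%")
        time.sleep(interval)
```

Running `watch()` while the news transfer is active and again after pausing it would show whether Dirty stays pinned near the limit only during the transfer.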

The 100% cpu is definitely caused by the news transfer - pausing it and
waiting a while makes it effectively disappear and everything goes back to
normal.

The news process effectively does this in multiple parallel loops:

   openat(AT_FDCWD, "/store/04267/26623~", O_WRONLY|O_CREAT|O_EXCL, 0600...
   write(75, "Path: ask005.abavia.com!"..., 656453...
   close(75)                   = 0
   renameat2(AT_FDCWD, "/store/04267/26623~", AT_FDCWD, "/store/04267/26623", 0 ...
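For anyone who wants to reproduce the pattern without the news software, here is a minimal Python sketch of the same sequence (exclusive create of a "~" temp name, one write, close, rename). It is not the actual news software; the function names and the demo directory are hypothetical, and `os.rename` is assumed to be an acceptable stand-in for `renameat2` with flags 0.

```python
import os
import tempfile


def store_article(directory, name, data):
    """Write `data` to `<name>~` with O_EXCL, then rename it to `<name>`."""
    tmp = os.path.join(directory, name + "~")
    final = os.path.join(directory, name)
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested; loop until done.
            view = view[os.write(fd, view):]
    finally:
        os.close(fd)
    os.rename(tmp, final)
    return final


def run_demo(count=1000, size=656453):
    """Store `count` files of `size` bytes in a temp dir; return the file count."""
    payload = b"x" * size
    with tempfile.TemporaryDirectory() as directory:
        for i in range(count):
            store_article(directory, str(i), payload)
        return len(os.listdir(directory))
```

The default size is the byte count from the write() call above. To test against btrfs, the temporary directory would have to be created on the filesystem in question (for example via the `dir=` argument of `tempfile.TemporaryDirectory`), and several such loops would need to run in parallel to match the real workload.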

The file layout is one layer of subdirectories with 100000 files inside
each, which has posed absolutely no problems with ext4/xfs in the past,
and also btrfs didn't seem to mind.

My question is, would this be expected behaviour? If yes, is it something
that can be influenced/improved on my side?

I can investigate and do some experiments, but I cannot easily update
kernels/do reboots on this system.

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      schmorp@schmorp.de
      -=====/_/_//_/\_,_/ /_/\_\
