linux-mm.kvack.org archive mirror
* Memory CG and 5.1 to 5.6 upgrade slows backup
@ 2020-04-09  9:25 Bruno Prémont
  2020-04-09  9:46 ` Michal Hocko
  2020-04-09 10:50 ` Chris Down
  0 siblings, 2 replies; 18+ messages in thread
From: Bruno Prémont @ 2020-04-09  9:25 UTC (permalink / raw)
  To: cgroups, linux-mm; +Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov

Hi,

After upgrading a production system from a 5.1 kernel to a 5.6 kernel, with
cgroups (v2) in use and the backup process running in a memory.high=2G
cgroup, the backup is heavily throttled (there are about 1.5T to be
backed up).

Most memory usage in that cgroup is for file cache.

Here are the memory details for the cgroup:
memory.current:2147225600
memory.events:low 0
memory.events:high 423774
memory.events:max 31131
memory.events:oom 0
memory.events:oom_kill 0
memory.events.local:low 0
memory.events.local:high 423774
memory.events.local:max 31131
memory.events.local:oom 0
memory.events.local:oom_kill 0
memory.high:2147483648
memory.low:33554432
memory.max:2415919104
memory.min:0
memory.oom.group:0
memory.pressure:some avg10=90.42 avg60=72.59 avg300=78.30 total=298252577711
memory.pressure:full avg10=90.32 avg60=72.53 avg300=78.24 total=295658626500
memory.stat:anon 10887168
memory.stat:file 2062102528
memory.stat:kernel_stack 73728
memory.stat:slab 76148736
memory.stat:sock 360448
memory.stat:shmem 0
memory.stat:file_mapped 12029952
memory.stat:file_dirty 946176
memory.stat:file_writeback 405504
memory.stat:anon_thp 0
memory.stat:inactive_anon 0
memory.stat:active_anon 10121216
memory.stat:inactive_file 1954959360
memory.stat:active_file 106418176
memory.stat:unevictable 0
memory.stat:slab_reclaimable 75247616
memory.stat:slab_unreclaimable 901120
memory.stat:pgfault 8651676
memory.stat:pgmajfault 2013
memory.stat:workingset_refault 8670651
memory.stat:workingset_activate 409200
memory.stat:workingset_nodereclaim 62040
memory.stat:pgrefill 1513537
memory.stat:pgscan 47519855
memory.stat:pgsteal 44933838
memory.stat:pgactivate 7986
memory.stat:pgdeactivate 1480623
memory.stat:pglazyfree 0
memory.stat:pglazyfreed 0
memory.stat:thp_fault_alloc 0
memory.stat:thp_collapse_alloc 0
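
(A dump in the name:value format above can be collected with something like
the following rough sketch; the cgroup path is an assumption and may differ
on other setups.)

  #!/usr/bin/env python3
  # Minimal sketch for producing a "memory.*:value" dump like the one above.
  import glob
  import os

  CGROUP = "/sys/fs/cgroup/system/backup"   # assumed path; adjust as needed

  for path in sorted(glob.glob(os.path.join(CGROUP, "memory.*"))):
      name = os.path.basename(path)
      try:
          with open(path) as f:
              for line in f:
                  print(f"{name}:{line.rstrip()}")
      except OSError:
          pass  # skip write-only or otherwise unreadable files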

The numbers that change most are pgscan/pgsteal.
The backup process regularly seems to be blocked for about 2s, but not
within a syscall according to strace.

Is there a way to tell the kernel that this cgroup should not be throttled
and that its inactive file cache should be given up (rather quickly)?

The aim here is to keep the backup from evicting the production tasks'
file cache, while not starving the backup itself.


If some useful info is missing, please let me know (ideally with a note on
how I can obtain it).


On a side note, I liked v1's soft/hard memory limits, where the memory
between the soft and the hard limit could be used if the system had enough
free memory. For v2 the difference between high and max seems to be of
almost no use.

A cgroup parameter for treating read-only file cache differently from
anonymous memory or otherwise dirty memory would be great too.


Thanks,
Bruno


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 upgrade slows backup
  2020-04-09  9:25 Memory CG and 5.1 to 5.6 upgrade slows backup Bruno Prémont
@ 2020-04-09  9:46 ` Michal Hocko
  2020-04-09 10:17   ` Bruno Prémont
  2020-04-09 10:50 ` Chris Down
  1 sibling, 1 reply; 18+ messages in thread
From: Michal Hocko @ 2020-04-09  9:46 UTC (permalink / raw)
  To: Bruno Prémont
  Cc: cgroups, linux-mm, Johannes Weiner, Vladimir Davydov, Chris Down

[Cc Chris]

On Thu 09-04-20 11:25:05, Bruno Prémont wrote:
> Hi,
> 
> Upgrading from 5.1 kernel to 5.6 kernel on a production system using
> cgroups (v2) and having backup process in a memory.high=2G cgroup
> sees backup being highly throttled (there are about 1.5T to be
> backuped).

What does /proc/sys/vm/dirty_* say? Is it possible that the reclaim is
not making progress on too many dirty pages and that triggers the back
off mechanism that has been implemented recently in 5.4? (Have a look at
0e4b01df8659 ("mm, memcg: throttle allocators when failing reclaim over
memory.high") and e26733e0d0ec ("mm, memcg: throttle allocators based on
ancestral memory.high").)

Keeping the rest of the email for reference.

> Most memory usage in that cgroup is for file cache.
> 
> Here are the memory details for the cgroup:
> memory.current:2147225600
> memory.events:low 0
> memory.events:high 423774
> memory.events:max 31131
> memory.events:oom 0
> memory.events:oom_kill 0
> memory.events.local:low 0
> memory.events.local:high 423774
> memory.events.local:max 31131
> memory.events.local:oom 0
> memory.events.local:oom_kill 0
> memory.high:2147483648
> memory.low:33554432
> memory.max:2415919104
> memory.min:0
> memory.oom.group:0
> memory.pressure:some avg10=90.42 avg60=72.59 avg300=78.30 total=298252577711
> memory.pressure:full avg10=90.32 avg60=72.53 avg300=78.24 total=295658626500
> memory.stat:anon 10887168
> memory.stat:file 2062102528
> memory.stat:kernel_stack 73728
> memory.stat:slab 76148736
> memory.stat:sock 360448
> memory.stat:shmem 0
> memory.stat:file_mapped 12029952
> memory.stat:file_dirty 946176
> memory.stat:file_writeback 405504
> memory.stat:anon_thp 0
> memory.stat:inactive_anon 0
> memory.stat:active_anon 10121216
> memory.stat:inactive_file 1954959360
> memory.stat:active_file 106418176
> memory.stat:unevictable 0
> memory.stat:slab_reclaimable 75247616
> memory.stat:slab_unreclaimable 901120
> memory.stat:pgfault 8651676
> memory.stat:pgmajfault 2013
> memory.stat:workingset_refault 8670651
> memory.stat:workingset_activate 409200
> memory.stat:workingset_nodereclaim 62040
> memory.stat:pgrefill 1513537
> memory.stat:pgscan 47519855
> memory.stat:pgsteal 44933838
> memory.stat:pgactivate 7986
> memory.stat:pgdeactivate 1480623
> memory.stat:pglazyfree 0
> memory.stat:pglazyfreed 0
> memory.stat:thp_fault_alloc 0
> memory.stat:thp_collapse_alloc 0
> 
> Numbers that change most are pgscan/pgsteal
> Regularly the backup process seems to be blocked for about 2s, but not
> within a syscall according to strace.
> 
> Is there a way to tell kernel that this cgroup should not be throttled
> and its inactive file cache given up (rather quickly).
> 
> The aim here is to avoid backup from killing production task file cache
> but not starving it.
> 
> 
> If there is some useful info missing, please tell (eventually adding how
> I can obtain it).
> 
> 
> On a side note, I liked v1's mode of soft/hard memory limit where the
> memory amount between soft and hard could be used if system has enough
> free memory. For v2 the difference between high and max seems almost of
> no use.
> 
> A cgroup parameter for impacting RO file cache differently than
> anonymous memory or otherwise dirty memory would be great too.
> 
> 
> Thanks,
> Bruno

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 upgrade slows backup
  2020-04-09  9:46 ` Michal Hocko
@ 2020-04-09 10:17   ` Bruno Prémont
  2020-04-09 10:34     ` Michal Hocko
  0 siblings, 1 reply; 18+ messages in thread
From: Bruno Prémont @ 2020-04-09 10:17 UTC (permalink / raw)
  To: Michal Hocko
  Cc: cgroups, linux-mm, Johannes Weiner, Vladimir Davydov, Chris Down

On Thu, 9 Apr 2020 11:46:15 Michal Hocko <mhocko@kernel.org> wrote:
> [Cc Chris]
> 
> On Thu 09-04-20 11:25:05, Bruno Prémont wrote:
> > Hi,
> > 
> > Upgrading from 5.1 kernel to 5.6 kernel on a production system using
> > cgroups (v2) and having backup process in a memory.high=2G cgroup
> > sees backup being highly throttled (there are about 1.5T to be
> > backuped).  
> 
> What does /proc/sys/vm/dirty_* say?

/proc/sys/vm/dirty_background_bytes:0
/proc/sys/vm/dirty_background_ratio:10
/proc/sys/vm/dirty_bytes:0
/proc/sys/vm/dirty_expire_centisecs:3000
/proc/sys/vm/dirty_ratio:20
/proc/sys/vm/dirty_writeback_centisecs:500

Captured after having restarted the backup task.
After the backup process restart the cgroup again has more free memory and
things run at normal speed (until the cgroup's memory gets "full" again).
Current cgroup stats while things run smoothly:

anon 176128
file 633012224
kernel_stack 73728
slab 47173632
sock 364544
shmem 0
file_mapped 10678272
file_dirty 811008
file_writeback 405504
anon_thp 0
inactive_anon 0
active_anon 0
inactive_file 552849408
active_file 79360000
unevictable 0
slab_reclaimable 46411776
slab_unreclaimable 761856
pgfault 8656857
pgmajfault 2145
workingset_refault 8672334
workingset_activate 410586
workingset_nodereclaim 92895
pgrefill 1516540
pgscan 48241750
pgsteal 45655752
pgactivate 7986
pgdeactivate 1483626
pglazyfree 0
pglazyfreed 0
thp_fault_alloc 0
thp_collapse_alloc 0


> Is it possible that the reclaim is not making progress on too many
> dirty pages and that triggers the back off mechanism that has been
> implemented recently in  5.4 (have a look at 0e4b01df8659 ("mm,
> memcg: throttle allocators when failing reclaim over memory.high")
> and e26733e0d0ec ("mm, memcg: throttle allocators based on
> ancestral memory.high").

Could be, though in that case it's throttling the wrong task/cgroup
as far as I can see (at least judging from the cgroup's memory stats), or
being blocked by state external to the cgroup.
I will have a look at those patches to get a better idea of what they
change.

System-wide memory is at least 10G/64G completely free (varies between
10G and 20G free - ~18G file cache, ~10G reclaimable slabs, ~5G
unreclaimable slabs and 7G otherwise in use).

> Keeping the rest of the email for reference.
> 
> > Most memory usage in that cgroup is for file cache.
> > 
> > Here are the memory details for the cgroup:
> > memory.current:2147225600
> > memory.events:low 0
> > memory.events:high 423774
> > memory.events:max 31131
> > memory.events:oom 0
> > memory.events:oom_kill 0
> > memory.events.local:low 0
> > memory.events.local:high 423774
> > memory.events.local:max 31131
> > memory.events.local:oom 0
> > memory.events.local:oom_kill 0
> > memory.high:2147483648
> > memory.low:33554432
> > memory.max:2415919104
> > memory.min:0
> > memory.oom.group:0
> > memory.pressure:some avg10=90.42 avg60=72.59 avg300=78.30 total=298252577711
> > memory.pressure:full avg10=90.32 avg60=72.53 avg300=78.24 total=295658626500
> > memory.stat:anon 10887168
> > memory.stat:file 2062102528
> > memory.stat:kernel_stack 73728
> > memory.stat:slab 76148736
> > memory.stat:sock 360448
> > memory.stat:shmem 0
> > memory.stat:file_mapped 12029952
> > memory.stat:file_dirty 946176
> > memory.stat:file_writeback 405504
> > memory.stat:anon_thp 0
> > memory.stat:inactive_anon 0
> > memory.stat:active_anon 10121216
> > memory.stat:inactive_file 1954959360
> > memory.stat:active_file 106418176
> > memory.stat:unevictable 0
> > memory.stat:slab_reclaimable 75247616
> > memory.stat:slab_unreclaimable 901120
> > memory.stat:pgfault 8651676
> > memory.stat:pgmajfault 2013
> > memory.stat:workingset_refault 8670651
> > memory.stat:workingset_activate 409200
> > memory.stat:workingset_nodereclaim 62040
> > memory.stat:pgrefill 1513537
> > memory.stat:pgscan 47519855
> > memory.stat:pgsteal 44933838
> > memory.stat:pgactivate 7986
> > memory.stat:pgdeactivate 1480623
> > memory.stat:pglazyfree 0
> > memory.stat:pglazyfreed 0
> > memory.stat:thp_fault_alloc 0
> > memory.stat:thp_collapse_alloc 0
> > 
> > Numbers that change most are pgscan/pgsteal
> > Regularly the backup process seems to be blocked for about 2s, but not
> > within a syscall according to strace.
> > 
> > Is there a way to tell kernel that this cgroup should not be throttled
> > and its inactive file cache given up (rather quickly).
> > 
> > The aim here is to avoid backup from killing production task file cache
> > but not starving it.
> > 
> > 
> > If there is some useful info missing, please tell (eventually adding how
> > I can obtain it).
> > 
> > 
> > On a side note, I liked v1's mode of soft/hard memory limit where the
> > memory amount between soft and hard could be used if system has enough
> > free memory. For v2 the difference between high and max seems almost of
> > no use.
> > 
> > A cgroup parameter for impacting RO file cache differently than
> > anonymous memory or otherwise dirty memory would be great too.
> > 
> > 
> > Thanks,
> > Bruno


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 upgrade slows backup
  2020-04-09 10:17   ` Bruno Prémont
@ 2020-04-09 10:34     ` Michal Hocko
  2020-04-09 15:09       ` Bruno Prémont
  0 siblings, 1 reply; 18+ messages in thread
From: Michal Hocko @ 2020-04-09 10:34 UTC (permalink / raw)
  To: Bruno Prémont
  Cc: cgroups, linux-mm, Johannes Weiner, Vladimir Davydov, Chris Down

On Thu 09-04-20 12:17:33, Bruno Prémont wrote:
> On Thu, 9 Apr 2020 11:46:15 Michal Hocko <mhocko@kernel.org> wrote:
> > [Cc Chris]
> > 
> > On Thu 09-04-20 11:25:05, Bruno Prémont wrote:
> > > Hi,
> > > 
> > > Upgrading from 5.1 kernel to 5.6 kernel on a production system using
> > > cgroups (v2) and having backup process in a memory.high=2G cgroup
> > > sees backup being highly throttled (there are about 1.5T to be
> > > backuped).  
> > 
> > What does /proc/sys/vm/dirty_* say?
> 
> /proc/sys/vm/dirty_background_bytes:0
> /proc/sys/vm/dirty_background_ratio:10
> /proc/sys/vm/dirty_bytes:0
> /proc/sys/vm/dirty_expire_centisecs:3000
> /proc/sys/vm/dirty_ratio:20
> /proc/sys/vm/dirty_writeback_centisecs:500

Sorry, but I forgot to ask for the total amount of memory. But it seems
this is 64GB, and a 10% dirty ratio might mean a lot of dirty memory.
Does the same happen if you reduce those knobs to something smaller than
2G? The _bytes alternatives should be useful for that purpose.
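
A minimal sketch of one way to try that, with example values of 256M/512M
(writing a *_bytes knob implicitly zeroes the corresponding *_ratio knob, so
only one of each pair is in effect):

  #!/usr/bin/env python3
  # Example values only; needs root.
  settings = {
      "/proc/sys/vm/dirty_background_bytes": 256 * 1024 * 1024,
      "/proc/sys/vm/dirty_bytes": 512 * 1024 * 1024,
  }
  for path, value in settings.items():
      with open(path, "w") as f:
          f.write(str(value))
      print(path, "=", value)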

[...]

> > Is it possible that the reclaim is not making progress on too many
> > dirty pages and that triggers the back off mechanism that has been
> > implemented recently in  5.4 (have a look at 0e4b01df8659 ("mm,
> > memcg: throttle allocators when failing reclaim over memory.high")
> > and e26733e0d0ec ("mm, memcg: throttle allocators based on
> > ancestral memory.high").
> 
> Could be though in that case it's throttling the wrong task/cgroup
> as far as I can see (at least from cgroup's memory stats) or being
> blocked by state external to the cgroup.
> Will have a look at those patches so get a better idea at what they
> change.

Could you check where the task of interest is throttled?
/proc/<pid>/stack should give you a clue.

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 upgrade slows backup
  2020-04-09  9:25 Memory CG and 5.1 to 5.6 upgrade slows backup Bruno Prémont
  2020-04-09  9:46 ` Michal Hocko
@ 2020-04-09 10:50 ` Chris Down
  2020-04-09 11:58   ` Bruno Prémont
  1 sibling, 1 reply; 18+ messages in thread
From: Chris Down @ 2020-04-09 10:50 UTC (permalink / raw)
  To: Bruno Prémont
  Cc: cgroups, linux-mm, Johannes Weiner, Michal Hocko, Vladimir Davydov

Hi Bruno,

Bruno Prémont writes:
>Upgrading from 5.1 kernel to 5.6 kernel on a production system using
>cgroups (v2) and having backup process in a memory.high=2G cgroup
>sees backup being highly throttled (there are about 1.5T to be
>backuped).

Before 5.4, memory usage with memory.high=N is essentially unbounded if the 
system is not able to reclaim pages for some reason. This is because all 
memory.high throttling before that point is just based on forcing direct 
reclaim for a cgroup, but there's no guarantee that we can actually reclaim 
pages, or that it will serve as a time penalty.

In 5.4, my patch 0e4b01df8659 ("mm, memcg: throttle allocators when failing 
reclaim over memory.high") changes kernel behaviour to actively penalise 
cgroups exceeding their memory.high by a large amount. That is, if reclaim 
fails to reclaim pages and bring the cgroup below the high threshold, we 
actively deschedule the process running for some number of jiffies that is 
exponential to the amount of overage incurred. This is so that cgroups using 
memory.high cannot simply have runaway memory usage without any consequences.
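
Very roughly, the idea looks like the sketch below. This is illustrative only,
not the actual kernel code in mem_cgroup_handle_over_high(), which differs in
details such as exact constants and fixed-point scaling; the only concrete
number, the 2 second cap, matches the limit mentioned further down.

  HZ = 100                       # assumed jiffies per second
  MAX_PENALTY_JIFFIES = 2 * HZ   # throttling per event is capped at ~2 seconds

  def high_overage_penalty(usage: int, high: int) -> int:
      """Penalty (in jiffies) growing rapidly with the relative overage
      of usage above memory.high, capped at 2 seconds (illustrative only)."""
      if high == 0 or usage <= high:
          return 0
      overage = (usage - high) / high          # relative overage
      penalty = int(overage * overage * HZ)    # grows with the square of overage
      return min(penalty, MAX_PENALTY_JIFFIES)

The point is only that the penalty ramps up quickly once the cgroup stays
above memory.high and reclaim cannot bring it back down.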

This is the patch that I'd particularly suspect is related to your problem. 
However:

>Most memory usage in that cgroup is for file cache.
>
>Here are the memory details for the cgroup:
>memory.current:2147225600
>[...]
>memory.events:high 423774
>memory.events:max 31131
>memory.high:2147483648
>memory.max:2415919104

Your high limit is being exceeded heavily and you are failing to reclaim. You 
have `max` events here, which means your application is at least at some point 
using over 268 *mega*bytes over its memory.high.

So yes, we will penalise this cgroup heavily since we cannot reclaim from it. 
The real question is why we can't reclaim from it :-)

>memory.low:33554432

You have a memory.low set, which will bias reclaim away from this cgroup based 
on overage. It's not very large, though, so it shouldn't change the semantics 
here, although it's worth noting since it also changed in another one of my 
patches, 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim"), 
which is also in 5.4.

In 5.1, as soon as you exceed memory.low, you immediately lose all protection.  
This is not ideal because it results in extremely binary, back-and-forth 
behaviour for cgroups using it (see the changelog for more information). This 
change means you will still receive some small amount of protection based on 
your overage, but it's fairly insignificant in this case (memory.current is 
about 64x larger than memory.low). What did you intend to do with this in 5.1? 
:-)
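
As a rough illustration of the proportional idea (not the exact kernel
formula): the fraction of the cgroup exposed to reclaim scales with how far
usage exceeds the protection, so with your numbers the remaining protection
is tiny.

  def approx_scan_fraction(usage: int, protection: int) -> float:
      """Fraction of the cgroup exposed to reclaim: roughly 1 - protection/usage
      once usage exceeds the protected amount (illustrative only)."""
      if usage <= protection:
          return 0.0
      return 1.0 - protection / usage

  # Numbers from the cgroup above: memory.current ~2147225600, memory.low 33554432
  print(approx_scan_fraction(2147225600, 33554432))   # ~0.98, i.e. barely protected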

>memory.stat:anon 10887168
>memory.stat:file 2062102528
>memory.stat:kernel_stack 73728
>memory.stat:slab 76148736
>memory.stat:sock 360448
>memory.stat:shmem 0
>memory.stat:file_mapped 12029952
>memory.stat:file_dirty 946176
>memory.stat:file_writeback 405504
>memory.stat:anon_thp 0
>memory.stat:inactive_anon 0
>memory.stat:active_anon 10121216
>memory.stat:inactive_file 1954959360
>memory.stat:active_file 106418176
>memory.stat:unevictable 0
>memory.stat:slab_reclaimable 75247616
>memory.stat:slab_unreclaimable 901120
>memory.stat:pgfault 8651676
>memory.stat:pgmajfault 2013
>memory.stat:workingset_refault 8670651
>memory.stat:workingset_activate 409200
>memory.stat:workingset_nodereclaim 62040
>memory.stat:pgrefill 1513537
>memory.stat:pgscan 47519855
>memory.stat:pgsteal 44933838
>memory.stat:pgactivate 7986
>memory.stat:pgdeactivate 1480623
>memory.stat:pglazyfree 0
>memory.stat:pglazyfreed 0
>memory.stat:thp_fault_alloc 0
>memory.stat:thp_collapse_alloc 0

It's hard to say exactly why we can't reclaim from these statistics; usually,
if anything, the kernel is *over*-eager to drop cache pages.

If the kernel thinks those file pages are too hot, though, it won't drop them. 
However, we only have 106M active file, compared to 2GB memory.current, so it 
doesn't look like this is the issue.

Can you please show io.pressure, io.stat, and cpu.pressure during these periods 
compared to baseline for this cgroup and globally (from /proc/pressure)? My 
suspicion is that we are not able to reclaim fast enough because memory 
management is getting stuck behind a slow disk.

Swap availability and usage information would also be helpful.
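
Something like this minimal sketch could grab all of those in one go; the
cgroup path is an assumption, adjust it to the cgroup hosting the backup.

  #!/usr/bin/env python3
  import time

  CGROUP = "/sys/fs/cgroup/system/backup"          # assumed path
  FILES = [
      f"{CGROUP}/io.pressure",
      f"{CGROUP}/io.stat",
      f"{CGROUP}/cpu.pressure",
      "/proc/pressure/io",
      "/proc/pressure/cpu",
      "/proc/pressure/memory",
  ]

  stamp = time.strftime("%Y-%m-%d %H:%M:%S")
  for path in FILES:
      try:
          with open(path) as f:
              for line in f:
                  print(f"{stamp} {path}: {line.rstrip()}")
      except OSError as err:
          print(f"{stamp} {path}: <unreadable: {err}>")

  # Swap availability/usage from /proc/meminfo
  with open("/proc/meminfo") as f:
      for line in f:
          if line.startswith(("SwapTotal", "SwapFree", "SwapCached")):
              print(f"{stamp} /proc/meminfo: {line.rstrip()}")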

>Regularly the backup process seems to be blocked for about 2s, but not
>within a syscall according to strace.

2 seconds is important, it's the maximum time we allow the allocator throttler 
to throttle for one allocation :-)

If you want to verify, you can look at /proc/pid/stack during these stalls -- 
they should be in mem_cgroup_handle_over_high, in an address related to 
allocator throttling.
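
A small sketch for catching that automatically (the pid is passed as the
first argument; reading /proc/<pid>/stack needs root):

  #!/usr/bin/env python3
  import sys
  import time

  pid = sys.argv[1]
  while True:
      try:
          with open(f"/proc/{pid}/stack") as f:
              stack = f.read()
      except OSError:
          break   # task exited
      if "mem_cgroup_handle_over_high" in stack:
          print(time.strftime("%H:%M:%S"), "throttled in mem_cgroup_handle_over_high:")
          print(stack)
      time.sleep(0.2)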

>Is there a way to tell kernel that this cgroup should not be throttled

Huh? That's what memory.high is for, so why are you using it if you don't want 
that?

>and its inactive file cache given up (rather quickly).

I suspect the kernel is reclaiming as far as it can, but is being stopped from 
doing so for some reason, which is why I'd like to see io.pressure and 
cpu.pressure.

>On a side note, I liked v1's mode of soft/hard memory limit where the
>memory amount between soft and hard could be used if system has enough
>free memory. For v2 the difference between high and max seems almost of
>no use.

For that use case, that's more or less what we've designed memory.low to do. 
The difference is that v1's soft limit almost never worked: the heuristics are 
extremely complicated, so complicated in fact that even we as memcg maintainers 
cannot reason about them. If we cannot reason about them, I'm quite sure it's 
not really doing what you expect :-)

In this case everything looks like it's working as intended, just this is all 
the result of memory.high becoming less broken in 5.4. From your description, 
I'm not sure that memory.high is what you want, either.

>A cgroup parameter for impacting RO file cache differently than
>anonymous memory or otherwise dirty memory would be great too.

We had vm.swappiness in v1 and it manifested extremely poorly. I won't go too 
much into the details of that here though, since we already discussed it fairly 
comprehensively here[0].

Please feel free to send over the io.pressure, io.stat, cpu.pressure, and swap 
metrics at baseline and during this when possible. Thanks!

0: https://lore.kernel.org/patchwork/patch/1172080/


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 upgrade slows backup
  2020-04-09 10:50 ` Chris Down
@ 2020-04-09 11:58   ` Bruno Prémont
  0 siblings, 0 replies; 18+ messages in thread
From: Bruno Prémont @ 2020-04-09 11:58 UTC (permalink / raw)
  To: Chris Down
  Cc: cgroups, linux-mm, Johannes Weiner, Michal Hocko, Vladimir Davydov

[-- Attachment #1: Type: text/plain, Size: 12574 bytes --]

Hi Chris,

Answering here (partially to cover Michal's questions further down the
thread as well).

On Thu, 9 Apr 2020 11:50:48 Chris Down <chris@chrisdown.name> wrote:
> Hi Bruno,
> 
> Bruno Prémont writes:
> >Upgrading from 5.1 kernel to 5.6 kernel on a production system using
> >cgroups (v2) and having backup process in a memory.high=2G cgroup
> >sees backup being highly throttled (there are about 1.5T to be
> >backuped).  
> 
> Before 5.4, memory usage with memory.high=N is essentially unbounded if the 
> system is not able to reclaim pages for some reason. This is because all 
> memory.high throttling before that point is just based on forcing direct 
> reclaim for a cgroup, but there's no guarantee that we can actually reclaim 
> pages, or that it will serve as a time penalty.
> 
> In 5.4, my patch 0e4b01df8659 ("mm, memcg: throttle allocators when failing 
> reclaim over memory.high") changes kernel behaviour to actively penalise 
> cgroups exceeding their memory.high by a large amount. That is, if reclaim 
> fails to reclaim pages and bring the cgroup below the high threshold, we 
> actively deschedule the process running for some number of jiffies that is 
> exponential to the amount of overage incurred. This is so that cgroups using 
> memory.high cannot simply have runaway memory usage without any consequences.

Thanks for the background-information!

> This is the patch that I'd particularly suspect is related to your problem. 
> However:
> 
> >Most memory usage in that cgroup is for file cache.
> >
> >Here are the memory details for the cgroup:
> >memory.current:2147225600
> >[...]
> >memory.events:high 423774
> >memory.events:max 31131
> >memory.high:2147483648
> >memory.max:2415919104  
> 
> Your high limit is being exceeded heavily and you are failing to reclaim. You 
> have `max` events here, which mean your application is at least at some point 
> using over 268 *mega*bytes over its memory.high.
> 
> So yes, we will penalise this cgroup heavily since we cannot reclaim from it. 
> The real question is why we can't reclaim from it :-)

That's the great question!

> >memory.low:33554432  
> 
> You have a memory.low set, which will bias reclaim away from this cgroup based 
> on overage. It's not very large, though, so it shouldn't change the semantics 
> here, although it's worth noting since it also changed in another one of my 
> patches, 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim"), 
> which is also in 5.4.
> 
> In 5.1, as soon as you exceed memory.low, you immediately lose all protection.  
> This is not ideal because it results in extremely binary, back-and-forth 
> behaviour for cgroups using it (see the changelog for more information). This 
> change means you will still receive some small amount of protection based on 
> your overage, but it's fairly insignificant in this case (memory.current is 
> about 64x larger than memory.low). What did you intend to do with this in 5.1? 
> :-)

Well, my intent was that it should have access to this low amount to
perform its work (e.g. for anonymous memory and active file [code and
minimal payload]) while the rest of the system is using its allowed, but
not guaranteed, memory resources up to the global system limits.

So it feels like your patch enforces this promise better.

> >memory.stat:anon 10887168
> >memory.stat:file 2062102528
> >memory.stat:kernel_stack 73728
> >memory.stat:slab 76148736
> >memory.stat:sock 360448
> >memory.stat:shmem 0
> >memory.stat:file_mapped 12029952
> >memory.stat:file_dirty 946176
> >memory.stat:file_writeback 405504
> >memory.stat:anon_thp 0
> >memory.stat:inactive_anon 0
> >memory.stat:active_anon 10121216
> >memory.stat:inactive_file 1954959360
> >memory.stat:active_file 106418176
> >memory.stat:unevictable 0
> >memory.stat:slab_reclaimable 75247616
> >memory.stat:slab_unreclaimable 901120
> >memory.stat:pgfault 8651676
> >memory.stat:pgmajfault 2013
> >memory.stat:workingset_refault 8670651
> >memory.stat:workingset_activate 409200
> >memory.stat:workingset_nodereclaim 62040
> >memory.stat:pgrefill 1513537
> >memory.stat:pgscan 47519855
> >memory.stat:pgsteal 44933838
> >memory.stat:pgactivate 7986
> >memory.stat:pgdeactivate 1480623
> >memory.stat:pglazyfree 0
> >memory.stat:pglazyfreed 0
> >memory.stat:thp_fault_alloc 0
> >memory.stat:thp_collapse_alloc 0  
> 
> Hard to say exactly why we can't reclaim using these statistics, usually if 
> anything the kernel is *over* eager to drop cache pages than anything.
> 
> If the kernel thinks those file pages are too hot, though, it won't drop them. 
> However, we only have 106M active file, compared to 2GB memory.current, so it 
> doesn't look like this is the issue.
> 
> Can you please show io.pressure, io.stat, and cpu.pressure during these periods 
> compared to baseline for this cgroup and globally (from /proc/pressure)? My 
> suspicion is that we are not able to reclaim fast enough because memory 
> management is getting stuck behind a slow disk.

The disk should not be too slow at writing (SAN for most of the data,
a local battery-backed RAID for logs).

The system's IO pressure is low (below 1 except for some random peaks going
up to 20).
The system's CPU pressure is similar (spikes happen at unrelated times).
The system's memory pressure, though, is most often high.

Prior to the kernel update it was mostly in the 5-10 range (short-term value,
with periods spiking to around 20), while the long-term value remained below 5.

Since the kernel upgrade things have changed quite a lot:
sometimes memory pressure is low, but it mostly ranges between 40 and
80.

I guess the attached PNG will give you a better idea than any textual
explanation (the reboot for the kernel upgrade happened during the night from
Friday to Saturday, around midnight).

Digging some deeper in the (highly affected) cg hierarchy:

CGv2:
  + workload
  | | ....
  + system
    | base    (has init, ntp and the like system daemons)
    | shell   (has tty gettys, ssh and the like)
    | backup  (has backup processes only)

system/:
	memory.current:8589053952
	memory.high:8589934592
	memory.low:134217728
	memory.max:9663676416
	memory.events:low 0
	memory.events:high 441886
	memory.events:max 31131
	memory.events:oom 0
	memory.events:oom_kill 0
	memory.stat:file 8346779648
	memory.stat:file_mapped 105971712
	memory.stat:file_dirty 2838528
	memory.stat:file_writeback 1486848
	memory.stat:inactive_file 6600683520
	memory.stat:active_file 1067331584
system/base:
	memory.current:7789477888
	memory.high:max
	memory.low:0
	memory.max:max
	memory.events:low 0
	memory.events:high 0
	memory.events:max 0
	memory.events:oom 0
	memory.events:oom_kill 0
	memory.stat:file 7586832384
	memory.stat:file_mapped 92995584
	memory.stat:file_dirty 1351680
	memory.stat:file_writeback 1081344
	memory.stat:inactive_file 6592962560
	memory.stat:active_file 946053120
system/shell:
	memory.current:638394368
	memory.high:max
	memory.low:0
	memory.max:max
	memory.events:low 0
	memory.events:high 0
	memory.events:max 0
	memory.events:oom 0
	memory.events:oom_kill 0
	memory.stat:file 637349888
	memory.stat:file_mapped 2568192
	memory.stat:file_dirty 405504
	memory.stat:file_writeback 0
	memory.stat:inactive_file 3645440
	memory.stat:active_file 6991872
system/backup:
	memory.current:160874496
	memory.high:2147483648
	memory.low:33554432
	memory.max:2415919104
	memory.events:low 0
	memory.events:high 425240
	memory.events:max 31131
	memory.events:oom 0
	memory.events:oom_kill 0
	memory.stat:file 122687488
	memory.stat:file_mapped 10678272
	memory.stat:file_dirty 675840
	memory.stat:file_writeback 405504
	memory.stat:inactive_file 10416128
	memory.stat:active_file 110329856

For tasks being throttled /proc/$pid/stack shows
	[<0>] mem_cgroup_handle_over_high+0x121/0x170
	[<0>] exit_to_usermode_loop+0x67/0xa0
	[<0>] do_syscall_64+0x149/0x170
	[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

It even hits my shell running over ssh. It turns out that for now (as the
backup processes have been restarted and are running mostly smoothly at the
moment) the throttling is caused by the parent system/ cgroup, which
reached its high limit with tons of inactive file cache.

Note: under workload there is a task walking through the whole SAN disk
      file tree counting file sizes at an hourly interval, thus keeping most
      inodes at least partially active.
      The major workload is a webserver (serving static files and running
      PHP-based CMSs).

> Swap availability and usage information would also be helpful.

There is no swap.

> >Regularly the backup process seems to be blocked for about 2s, but not
> >within a syscall according to strace.  
> 
> 2 seconds is important, it's the maximum time we allow the allocator throttler 
> to throttle for one allocation :-)
> 
> If you want to verify, you can look at /proc/pid/stack during these stalls -- 
> they should be in mem_cgroup_handle_over_high, in an address related to 
> allocator throttling.

Yes, I've seen that (see above).

> >Is there a way to tell kernel that this cgroup should not be throttled  
> 
> Huh? That's what memory.high is for, so why are you using if it you don't want 
> that?

Well, the remainder of the sentence is the important part.
The cgroup is expected to have short-lived cache usage, and thus caches
should not be a reason for throttling.

> >and its inactive file cache given up (rather quickly).  
> 
> I suspect the kernel is reclaiming as far as it can, but is being stopped from 
> doing so for some reason, which is why I'd like to see io.pressure and 
> cpu.pressure.

  io.pressure:some avg10=0.17 avg60=0.35 avg300=0.37 total=6479904094
  io.pressure:full avg10=0.16 avg60=0.32 avg300=0.33 total=6363939615
  backup/io.pressure:some avg10=0.00 avg60=0.00 avg300=0.00 total=3600665286
  backup/io.pressure:full avg10=0.00 avg60=0.00 avg300=0.00 total=3580320436
  base/io.pressure:some avg10=0.26 avg60=0.40 avg300=0.38 total=4584357682
  base/io.pressure:full avg10=0.25 avg60=0.37 avg300=0.35 total=4512115687
  shell/io.pressure:some avg10=0.00 avg60=0.00 avg300=0.00 total=7337275
  shell/io.pressure:full avg10=0.00 avg60=0.00 avg300=0.00 total=7329137

That's low I would say.

> >On a side note, I liked v1's mode of soft/hard memory limit where the
> >memory amount between soft and hard could be used if system has enough
> >free memory. For v2 the difference between high and max seems almost of
> >no use.  
> 
> For that use case, that's more or less what we've designed memory.low to do. 
> The difference is that v1's soft limit almost never worked: the heuristics are 
> extremely complicated, so complicated in fact that even we as memcg maintainers 
> cannot reason about them. If we cannot reason about them, I'm quite sure it's 
> not really doing what you expect :-)

Well, memory.low is great for the workload, but not really for the backup,
which should not "pollute" the system's file cache (about the same issue as
logs, which are almost write-only but still tend to fill the file cache,
throwing out more useful pages).

> In this case everything looks like it's working as intended, just this is all 
> the result of memory.high becoming less broken in 5.4. From your description, 
> I'm not sure that memory.high is what you want, either.
> 
> >A cgroup parameter for impacting RO file cache differently than
> >anonymous memory or otherwise dirty memory would be great too.  
> 
> We had vm.swappiness in v1 and it manifested extremely poorly. I won't go too 
> much into the details of that here though, since we already discussed it fairly 
> comprehensively here[0].
> 
> Please feel free to send over the io.pressure, io.stat, cpu.pressure, and swap 
> metrics at baseline and during this when possible. Thanks!

Current system-wide pressure metrics:
/proc/pressure/cpu:some avg10=0.05 avg60=0.08 avg300=0.07 total=965407160
/proc/pressure/io:some avg10=0.00 avg60=0.02 avg300=0.04 total=5674971954
/proc/pressure/io:full avg10=0.00 avg60=0.02 avg300=0.04 total=5492982327
/proc/pressure/memory:some avg10=33.21 avg60=21.28 avg300=21.06 total=166513106563
/proc/pressure/memory:full avg10=32.09 avg60=20.23 avg300=20.13 total=158792995733



In the end the big question is why the large amounts of inactive file cache
survive reclaim and thus cause cgroups to get starved.

> 0: https://lore.kernel.org/patchwork/patch/1172080/

Thanks,
Bruno

[-- Attachment #2: MemoryPressure_.png --]
[-- Type: image/png, Size: 25356 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 upgrade slows backup
  2020-04-09 10:34     ` Michal Hocko
@ 2020-04-09 15:09       ` Bruno Prémont
  2020-04-09 15:24         ` Chris Down
  2020-04-09 15:25         ` Michal Hocko
  0 siblings, 2 replies; 18+ messages in thread
From: Bruno Prémont @ 2020-04-09 15:09 UTC (permalink / raw)
  To: Michal Hocko
  Cc: cgroups, linux-mm, Johannes Weiner, Vladimir Davydov, Chris Down

On Thu, 9 Apr 2020 12:34:00 +0200 Michal Hocko wrote:

> On Thu 09-04-20 12:17:33, Bruno Prémont wrote:
> > On Thu, 9 Apr 2020 11:46:15 Michal Hocko wrote:  
> > > [Cc Chris]
> > > 
> > > On Thu 09-04-20 11:25:05, Bruno Prémont wrote:  
> > > > Hi,
> > > > 
> > > > Upgrading from 5.1 kernel to 5.6 kernel on a production system using
> > > > cgroups (v2) and having backup process in a memory.high=2G cgroup
> > > > sees backup being highly throttled (there are about 1.5T to be
> > > > backuped).    
> > > 
> > > What does /proc/sys/vm/dirty_* say?  
> > 
> > /proc/sys/vm/dirty_background_bytes:0
> > /proc/sys/vm/dirty_background_ratio:10
> > /proc/sys/vm/dirty_bytes:0
> > /proc/sys/vm/dirty_expire_centisecs:3000
> > /proc/sys/vm/dirty_ratio:20
> > /proc/sys/vm/dirty_writeback_centisecs:500  
> 
> Sorry, but I forgot ask for the total amount of memory. But it seems
> this is 64GB and 10% dirty ration might mean a lot of dirty memory.
> Does the same happen if you reduce those knobs to something smaller than
> 2G? _bytes alternatives should be useful for that purpose.

Well, tuning it to
/proc/sys/vm/dirty_background_bytes:268435456
/proc/sys/vm/dirty_background_ratio:0
/proc/sys/vm/dirty_bytes:536870912
/proc/sys/vm/dirty_expire_centisecs:3000
/proc/sys/vm/dirty_ratio:0
/proc/sys/vm/dirty_writeback_centisecs:500
does not make any difference.


From /proc/meminfo there is no indication of high amounts of dirty
memory either:
MemTotal:       65930032 kB
MemFree:        21237240 kB
MemAvailable:   51646528 kB
Buffers:          202692 kB
Cached:         21493120 kB
SwapCached:            0 kB
Active:         12875888 kB
Inactive:       11361852 kB
Active(anon):    2879072 kB
Inactive(anon):    20600 kB
Active(file):    9996816 kB
Inactive(file): 11341252 kB
Unevictable:        9396 kB
Mlocked:            9396 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:              3880 kB
Writeback:             0 kB
AnonPages:       2551776 kB
Mapped:           461816 kB
Shmem:            351144 kB
KReclaimable:   10012140 kB
Slab:           15673816 kB
SReclaimable:   10012140 kB
SUnreclaim:      5661676 kB
KernelStack:        7888 kB
PageTables:        24192 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    32965016 kB
Committed_AS:    4792440 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      126408 kB
VmallocChunk:          0 kB
Percpu:           136448 kB
HardwareCorrupted:     0 kB
AnonHugePages:    825344 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:         0 kB
FilePmdMapped:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
DirectMap4k:       27240 kB
DirectMap2M:     4132864 kB
DirectMap1G:    65011712 kB


> [...]
> 
> > > Is it possible that the reclaim is not making progress on too many
> > > dirty pages and that triggers the back off mechanism that has been
> > > implemented recently in  5.4 (have a look at 0e4b01df8659 ("mm,
> > > memcg: throttle allocators when failing reclaim over memory.high")
> > > and e26733e0d0ec ("mm, memcg: throttle allocators based on
> > > ancestral memory.high").  
> > 
> > Could be though in that case it's throttling the wrong task/cgroup
> > as far as I can see (at least from cgroup's memory stats) or being
> > blocked by state external to the cgroup.
> > Will have a look at those patches so get a better idea at what they
> > change.  
> 
> Could you check where is the task of your interest throttled?
> /proc/<pid>/stack should give you a clue.

As guessed by Chris, it's
[<0>] mem_cgroup_handle_over_high+0x121/0x170
[<0>] exit_to_usermode_loop+0x67/0xa0
[<0>] do_syscall_64+0x149/0x170
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9


And I know no way to tell kernel "drop all caches" for a specific cgroup
nor how to list the inactive files assigned to a given cgroup (knowing
which ones they are and their idle state could help understanding why
they aren't being reclaimed).



Could it be that cache is being prevented from being reclaimed by a task
in another cgroup?

e.g.
  cgroup/system/backup
    first reads $files (reads each once)
  cgroup/workload/bla
    second&more reads $files

Would $files remain associated to cgroup/system/backup and not
reclaimed there instead of being reassigned to cgroup/workload/bla?



Bruno


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 upgrade slows backup
  2020-04-09 15:09       ` Bruno Prémont
@ 2020-04-09 15:24         ` Chris Down
  2020-04-09 15:40           ` Bruno Prémont
  2020-04-09 15:25         ` Michal Hocko
  1 sibling, 1 reply; 18+ messages in thread
From: Chris Down @ 2020-04-09 15:24 UTC (permalink / raw)
  To: Bruno Prémont
  Cc: Michal Hocko, cgroups, linux-mm, Johannes Weiner, Vladimir Davydov

Bruno Prémont writes:
>Could it be that cache is being prevented from being reclaimed by a task
>in another cgroup?
>
>e.g.
>  cgroup/system/backup
>    first reads $files (reads each once)
>  cgroup/workload/bla
>    second&more reads $files
>
>Would $files remain associated to cgroup/system/backup and not
>reclaimed there instead of being reassigned to cgroup/workload/bla?

Yes, that's entirely possible. The first cgroup to fault in the pages is 
charged for the memory. Other cgroups may use them, but they are not accounted 
for as part of that other cgroup. They may also still be "active" as a result 
of use by another cgroup.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 upgrade slows backup
  2020-04-09 15:09       ` Bruno Prémont
  2020-04-09 15:24         ` Chris Down
@ 2020-04-09 15:25         ` Michal Hocko
  2020-04-10  7:15           ` Bruno Prémont
  2020-04-14 15:09           ` Bruno Prémont
  1 sibling, 2 replies; 18+ messages in thread
From: Michal Hocko @ 2020-04-09 15:25 UTC (permalink / raw)
  To: Bruno Prémont
  Cc: cgroups, linux-mm, Johannes Weiner, Vladimir Davydov, Chris Down

On Thu 09-04-20 17:09:26, Bruno Prémont wrote:
> On Thu, 9 Apr 2020 12:34:00 +0200 Michal Hocko wrote:
> 
> > On Thu 09-04-20 12:17:33, Bruno Prémont wrote:
> > > On Thu, 9 Apr 2020 11:46:15 Michal Hocko wrote:  
> > > > [Cc Chris]
> > > > 
> > > > On Thu 09-04-20 11:25:05, Bruno Prémont wrote:  
> > > > > Hi,
> > > > > 
> > > > > Upgrading from 5.1 kernel to 5.6 kernel on a production system using
> > > > > cgroups (v2) and having backup process in a memory.high=2G cgroup
> > > > > sees backup being highly throttled (there are about 1.5T to be
> > > > > backuped).    
> > > > 
> > > > What does /proc/sys/vm/dirty_* say?  
> > > 
> > > /proc/sys/vm/dirty_background_bytes:0
> > > /proc/sys/vm/dirty_background_ratio:10
> > > /proc/sys/vm/dirty_bytes:0
> > > /proc/sys/vm/dirty_expire_centisecs:3000
> > > /proc/sys/vm/dirty_ratio:20
> > > /proc/sys/vm/dirty_writeback_centisecs:500  
> > 
> > Sorry, but I forgot ask for the total amount of memory. But it seems
> > this is 64GB and 10% dirty ration might mean a lot of dirty memory.
> > Does the same happen if you reduce those knobs to something smaller than
> > 2G? _bytes alternatives should be useful for that purpose.
> 
> Well, tuning it to /proc/sys/vm/dirty_background_bytes:268435456
> /proc/sys/vm/dirty_background_ratio:0
> /proc/sys/vm/dirty_bytes:536870912
> /proc/sys/vm/dirty_expire_centisecs:3000
> /proc/sys/vm/dirty_ratio:0
> /proc/sys/vm/dirty_writeback_centisecs:500
> does not make any difference.

OK, it was a wild guess because cgroup v2 should be able to throttle
heavy writers and be memcg aware AFAIR. But good to have it confirmed.

[...]

> > > > Is it possible that the reclaim is not making progress on too many
> > > > dirty pages and that triggers the back off mechanism that has been
> > > > implemented recently in  5.4 (have a look at 0e4b01df8659 ("mm,
> > > > memcg: throttle allocators when failing reclaim over memory.high")
> > > > and e26733e0d0ec ("mm, memcg: throttle allocators based on
> > > > ancestral memory.high").  
> > > 
> > > Could be though in that case it's throttling the wrong task/cgroup
> > > as far as I can see (at least from cgroup's memory stats) or being
> > > blocked by state external to the cgroup.
> > > Will have a look at those patches so get a better idea at what they
> > > change.  
> > 
> > Could you check where is the task of your interest throttled?
> > /proc/<pid>/stack should give you a clue.
> 
> As guessed by Chris, it's
> [<0>] mem_cgroup_handle_over_high+0x121/0x170
> [<0>] exit_to_usermode_loop+0x67/0xa0
> [<0>] do_syscall_64+0x149/0x170
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> 
> And I know no way to tell kernel "drop all caches" for a specific cgroup
> nor how to list the inactive files assigned to a given cgroup (knowing
> which ones they are and their idle state could help understanding why
> they aren't being reclaimed).
> 
> 
> 
> Could it be that cache is being prevented from being reclaimed by a task
> in another cgroup?
> 
> e.g.
>   cgroup/system/backup
>     first reads $files (reads each once)
>   cgroup/workload/bla
>     second&more reads $files
> 
> Would $files remain associated to cgroup/system/backup and not
> reclaimed there instead of being reassigned to cgroup/workload/bla?

No, page cache is first-touch-gets-charged. But interference is certainly
possible if the memory is somehow pinned - e.g. mlock - by
a task from another cgroup or internally by the FS.

Your earlier stat snapshot doesn't indicate a big problem with the
reclaim though:

memory.stat:pgscan 47519855
memory.stat:pgsteal 44933838

This tells the overall reclaim effectiveness was 94%. Could you try to
gather snapshots with a 1s granularity starting before you run your
backup to see how those numbers evolve? Ideally with timestamps to
compare with the actual stall information.
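
A minimal sketch of such a sampler, assuming the cgroup path; it also prints
the per-interval pgsteal/pgscan ratio, i.e. the reclaim effectiveness:

  #!/usr/bin/env python3
  import time

  STAT = "/sys/fs/cgroup/system/backup/memory.stat"   # assumed path

  def read_counters():
      vals = {}
      with open(STAT) as f:
          for line in f:
              key, value = line.split()
              if key in ("pgscan", "pgsteal"):
                  vals[key] = int(value)
      return vals

  prev = read_counters()
  while True:
      time.sleep(1)
      cur = read_counters()
      d_scan = cur["pgscan"] - prev["pgscan"]
      d_steal = cur["pgsteal"] - prev["pgsteal"]
      eff = (100.0 * d_steal / d_scan) if d_scan else 100.0
      print(f"{time.strftime('%H:%M:%S')} pgscan +{d_scan} pgsteal +{d_steal} "
            f"({eff:.0f}% effective)")
      prev = cur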

Another option would be to enable vmscan tracepoints but let's try with
stats first.
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 upgrade slows backup
  2020-04-09 15:24         ` Chris Down
@ 2020-04-09 15:40           ` Bruno Prémont
  2020-04-09 17:50             ` Chris Down
  0 siblings, 1 reply; 18+ messages in thread
From: Bruno Prémont @ 2020-04-09 15:40 UTC (permalink / raw)
  To: Chris Down
  Cc: Michal Hocko, cgroups, linux-mm, Johannes Weiner, Vladimir Davydov

On Thu, 9 Apr 2020 16:24:17 +0100 Chris Down wrote:

> Bruno Prémont writes:
> >Could it be that cache is being prevented from being reclaimed by a task
> >in another cgroup?
> >
> >e.g.
> >  cgroup/system/backup
> >    first reads $files (reads each once)
> >  cgroup/workload/bla
> >    second&more reads $files
> >
> >Would $files remain associated to cgroup/system/backup and not
> >reclaimed there instead of being reassigned to cgroup/workload/bla?  
> 
> Yes, that's entirely possible. The first cgroup to fault in the pages is 
> charged for the memory. Other cgroups may use them, but they are not accounted 
> for as part of that other cgroup. They may also still be "active" as a result 
> of use by another cgroup.

But the memory would then be 'active' in the original cgroup, which is
not the case here, I feel.
If the pages remain inactive yet unreclaimable in the first cgroup due to use
in another cgroup, that would be at least surprising.

Doubling the high value helped (but for how long?); memory.current is back
around memory.high but there is no throttling yet. And from the
increase until now memory.pressure has been small/zero.

Capturing 
  memory.stat:pgscan 47519855
  memory.stat:pgsteal 44933838
over time for Michal and will report back later this evening.

When seen stuck backup was reading a multi-GiB file with
  open(, O_NOATIME)
  while (read()) {
    transform and write to network
  }
  close()
thus plain sequential file read through file cache (and for this
backup run, only files not in use by anyone else, or some being
just appended to by others).

Bruno


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 upgrade slows backup
  2020-04-09 15:40           ` Bruno Prémont
@ 2020-04-09 17:50             ` Chris Down
  2020-04-09 17:56               ` Chris Down
  0 siblings, 1 reply; 18+ messages in thread
From: Chris Down @ 2020-04-09 17:50 UTC (permalink / raw)
  To: Bruno Prémont
  Cc: Michal Hocko, cgroups, linux-mm, Johannes Weiner, Vladimir Davydov

Bruno Prémont writes:
>On Thu, 9 Apr 2020 16:24:17 +0100 wrote:
>
>> Bruno Prémont writes:
>> >Could it be that cache is being prevented from being reclaimed by a task
>> >in another cgroup?
>> >
>> >e.g.
>> >  cgroup/system/backup
>> >    first reads $files (reads each once)
>> >  cgroup/workload/bla
>> >    second&more reads $files
>> >
>> >Would $files remain associated to cgroup/system/backup and not
>> >reclaimed there instead of being reassigned to cgroup/workload/bla?
>>
>> Yes, that's entirely possible. The first cgroup to fault in the pages is
>> charged for the memory. Other cgroups may use them, but they are not accounted
>> for as part of that other cgroup. They may also still be "active" as a result
>> of use by another cgroup.
>
>But the memory would then be 'active' in the original cgroup? which is
>not the case here I feel.

Yes, that's correct. I don't think it's the case here (since active_file is not 
that large in the affected cgroup), but it's certainly generally a possibility.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 upgrade slows backup
  2020-04-09 17:50             ` Chris Down
@ 2020-04-09 17:56               ` Chris Down
  0 siblings, 0 replies; 18+ messages in thread
From: Chris Down @ 2020-04-09 17:56 UTC (permalink / raw)
  To: Bruno Prémont
  Cc: Michal Hocko, cgroups, linux-mm, Johannes Weiner, Vladimir Davydov

From my side, this looks like memory.high is working as intended and there is 
some other generic problem with reclaim happening here.

I think the data which Michal asked for would help a lot to narrow down what's 
going on.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 upgrade slows backup
  2020-04-09 15:25         ` Michal Hocko
@ 2020-04-10  7:15           ` Bruno Prémont
  2020-04-10  8:43             ` Bruno Prémont
  2020-04-14 15:09           ` Bruno Prémont
  1 sibling, 1 reply; 18+ messages in thread
From: Bruno Prémont @ 2020-04-10  7:15 UTC (permalink / raw)
  To: Michal Hocko
  Cc: cgroups, linux-mm, Johannes Weiner, Vladimir Davydov, Chris Down

[-- Attachment #1: Type: text/plain, Size: 1457 bytes --]

Hi Michal,

On Thu, 9 Apr 2020 17:25:40 Michal Hocko <mhocko@kernel.org> wrote:
> Your earlier stat snapshot doesn't indicate a big problem with the
> reclaim though:
> 
> memory.stat:pgscan 47519855
> memory.stat:pgsteal 44933838
> 
> This tells the overall reclaim effectiveness was 94%. Could you try to
> gather snapshots with a 1s granularity starting before your run your
> backup to see how those numbers evolve? Ideally with timestamps to
> compare with the actual stall information.

Attached is a long collection of
 date  memory.current   memory.stat[pgscan]  memory.stat[pgsteal]

It started while the backup was running more or less smoothly with its
memory.high set to 4294967296 (4G instead of 2G), until the backup finished
around 20:22.

From the system memory pressure RRD graph I see pressure (around 60)
between about 19:50 and 20:10, while it is very small the rest of the time
(below 1).



I started a new backup run this morning, grabbing full info snapshots of
the backup cgroup at a 1s interval in order to get a better/more complete
picture, with the CG's memory.high back at the 2G limit.


I have the impression that reclaim is somehow not triggered often enough or
not strongly enough compared to the IO performed within the CG
(the complete backup covers 130G of data, with data being read in blocks of
128kB at a smooth-running rate of ~7MiB/s).

> Another option would be to enable vmscan tracepoints but let's try with
> stats first.


Bruno

[-- Attachment #2: backup.cg_pg_.log.gz --]
[-- Type: application/gzip, Size: 123934 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 upgrade slows backup
  2020-04-10  7:15           ` Bruno Prémont
@ 2020-04-10  8:43             ` Bruno Prémont
       [not found]               ` <20200410115010.1d9f6a3f@hemera.lan.sysophe.eu>
  0 siblings, 1 reply; 18+ messages in thread
From: Bruno Prémont @ 2020-04-10  8:43 UTC (permalink / raw)
  To: Michal Hocko
  Cc: cgroups, linux-mm, Johannes Weiner, Vladimir Davydov, Chris Down

Hi Michal, Chris,

Well, tar made me unhappy; it just collected the list of files but not
their content from /sys/fs/cgroup/...

But if I set memory.max = memory.high reclaim seems to work and memory
pressure remains zero for the cg.
If I set memory.max = $((memory.high + 128M)) memory pressure rises
immediately (when memory.current ~= memory.high).

Returning to memory.max=memory.high gets things running again and
memory pressure starts dropping immediately.


Could it be that the wrong limit of high/max is being used for reclaim?


Bruno

On Fri, 10 Apr 2020 09:15:25 +0200
Bruno Prémont <bonbons@linux-vserver.org> wrote:
> Hi Michal,
> 
> On Thu, 9 Apr 2020 17:25:40 Michal Hocko <mhocko@kernel.org> wrote:
> > Your earlier stat snapshot doesn't indicate a big problem with the
> > reclaim though:
> > 
> > memory.stat:pgscan 47519855
> > memory.stat:pgsteal 44933838
> > 
> > This tells the overall reclaim effectiveness was 94%. Could you try to
> > gather snapshots with a 1s granularity starting before your run your
> > backup to see how those numbers evolve? Ideally with timestamps to
> > compare with the actual stall information.  
> 
> Attached is a long collection of
>  date  memory.current   memory.stat[pgscan]  memory.stat[pgsteal]
> 
> It started while backup was running +/- smoothly with its memory.high
> set to 4294967296 (4G instead of 2G) until backup finished around 20:22.
> 
> From system memory pressure RRD-graph I see pressure (around 60)
> between about 19:50 to 20:10 while very small the rest of the time
> (below 1).
> 
> 
> 
> I started a new backup run this morning grabbing full info snapshots of
> backup cgroup at 1s interval in order to get a better/more complete
> picture and CG's memory.high back to 2G limit.
> 
> 
> I have the impression as if reclaim was somehow triggered not enough or
> not strongly enough compared to the IO performed within the CG
> (complete backup covers 130G of data, data being read in blocks of
> 128kB at a smooth-running rate of ~7MiB/s).
> 
> > Another option would be to enable vmscan tracepoints but let's try with
> > stats first.  
> 
> 
> Bruno



-- 
Bruno Prémont <bruno.premont@restena.lu>
Ingénieur système et développements

Fondation RESTENA
2, avenue de l'Université
L-4365 Esch/Alzette

Tél: (+352) 424409
Fax: (+352) 422473
https://www.restena.lu     https://www.dns.lu


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 upgrade slows backup
  2020-04-09 15:25         ` Michal Hocko
  2020-04-10  7:15           ` Bruno Prémont
@ 2020-04-14 15:09           ` Bruno Prémont
  1 sibling, 0 replies; 18+ messages in thread
From: Bruno Prémont @ 2020-04-14 15:09 UTC (permalink / raw)
  To: Michal Hocko
  Cc: cgroups, linux-mm, Johannes Weiner, Vladimir Davydov, Chris Down

Hi Michal, Chris,

I can reproduce very easily with basic commands on an idle system with
just a reasonably filled partition and lots of (free) RAM, running:
  bash -c 'echo $$ > $path/to/cgroup/cgroup.procs; tar -zc -C /export . > /dev/null'
where tar is running all alone in its cgroup with
  memory.high = 1024M
  memory.max  = 1152M   (high + 128M)

At the start
  memory.stat:pgscan 0
  memory.stat:pgsteal 0
once pressure is "high" and tar gets throttled, both values increase
in lockstep by 64 every 2 seconds.

The cgroup's memory.current starts at 0 and grows up to memory.high, and then
the pressure starts.
  memory.stat:inactive_file 910192640
  memory.stat:active_file 61501440
active_file remains low (64M) while inactive_file is high (most of the
1024M allowed).

Somehow reclaim either does not consider the inactive_file pages or reclaims
in pieces too small compared to the memory turnover in the cgroup.


Even having memory.max just a single page (4096 bytes) larger
than memory.high brings the same throttling behavior.
Changing memory.max to match memory.high gets reclaim to work without
throttling.
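
For reference, a sketch of the same reproduction as a single script, under
the assumption that cgroup v2 is mounted at /sys/fs/cgroup with the memory
controller enabled for child cgroups (needs root; /export is just the example
path used above):

  #!/usr/bin/env python3
  # Run tar alone in a fresh cgroup with memory.high = 1024M and
  # memory.max = high + 128M, and watch memory.current, pgscan/pgsteal
  # and memory.pressure while it runs.
  import os
  import subprocess
  import time

  CG = "/sys/fs/cgroup/repro"                    # assumed scratch cgroup
  HIGH = 1024 * 1024 * 1024                      # 1024M
  MAX = HIGH + 128 * 1024 * 1024                 # high + 128M

  os.makedirs(CG, exist_ok=True)
  with open(f"{CG}/memory.high", "w") as f:
      f.write(str(HIGH))
  with open(f"{CG}/memory.max", "w") as f:
      f.write(str(MAX))

  # Move ourselves into the cgroup so the tar child is charged to it
  # (the small python process is charged too, which is negligible here).
  with open(f"{CG}/cgroup.procs", "w") as f:
      f.write(str(os.getpid()))

  tar = subprocess.Popen(["tar", "-zc", "-C", "/export", "."],
                         stdout=subprocess.DEVNULL)

  while tar.poll() is None:
      with open(f"{CG}/memory.current") as f:
          current = f.read().strip()
      with open(f"{CG}/memory.pressure") as f:
          pressure = f.readline().strip()        # the "some avg10=..." line
      stats = {}
      with open(f"{CG}/memory.stat") as f:
          for line in f:
              key, val = line.split()
              stats[key] = int(val)
      print(time.strftime("%H:%M:%S"), "current", current,
            "| pgscan", stats.get("pgscan", 0),
            "pgsteal", stats.get("pgsteal", 0),
            "|", pressure)
      time.sleep(2)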


Bruno


On Thu, 9 Apr 2020 17:25:40 Michal Hocko wrote:
> On Thu 09-04-20 17:09:26, Bruno Prémont wrote:
> > On Thu, 9 Apr 2020 12:34:00 +0200 Michal Hocko wrote:
> >   
> > > On Thu 09-04-20 12:17:33, Bruno Prémont wrote:  
> > > > On Thu, 9 Apr 2020 11:46:15 Michal Hocko wrote:    
> > > > > [Cc Chris]
> > > > > 
> > > > > On Thu 09-04-20 11:25:05, Bruno Prémont wrote:    
> > > > > > Hi,
> > > > > > 
> > > > > > Upgrading from 5.1 kernel to 5.6 kernel on a production system using
> > > > > > cgroups (v2) and having backup process in a memory.high=2G cgroup
> > > > > > sees backup being highly throttled (there are about 1.5T to be
> > > > > > backuped).      
> > > > > 
> > > > > What does /proc/sys/vm/dirty_* say?    
> > > > 
> > > > /proc/sys/vm/dirty_background_bytes:0
> > > > /proc/sys/vm/dirty_background_ratio:10
> > > > /proc/sys/vm/dirty_bytes:0
> > > > /proc/sys/vm/dirty_expire_centisecs:3000
> > > > /proc/sys/vm/dirty_ratio:20
> > > > /proc/sys/vm/dirty_writeback_centisecs:500    
> > > 
> > > Sorry, but I forgot ask for the total amount of memory. But it seems
> > > this is 64GB and 10% dirty ration might mean a lot of dirty memory.
> > > Does the same happen if you reduce those knobs to something smaller than
> > > 2G? _bytes alternatives should be useful for that purpose.  
> > 
> > Well, tuning it to /proc/sys/vm/dirty_background_bytes:268435456
> > /proc/sys/vm/dirty_background_ratio:0
> > /proc/sys/vm/dirty_bytes:536870912
> > /proc/sys/vm/dirty_expire_centisecs:3000
> > /proc/sys/vm/dirty_ratio:0
> > /proc/sys/vm/dirty_writeback_centisecs:500
> > does not make any difference.  
> 
> OK, it was a wild guess because cgroup v2 should be able to throttle
> heavy writers and be memcg aware AFAIR. But good to have it confirmed.
> 
> [...]
> 
> > > > > Is it possible that the reclaim is not making progress on too many
> > > > > dirty pages and that triggers the back off mechanism that has been
> > > > > implemented recently in  5.4 (have a look at 0e4b01df8659 ("mm,
> > > > > memcg: throttle allocators when failing reclaim over memory.high")
> > > > > and e26733e0d0ec ("mm, memcg: throttle allocators based on
> > > > > ancestral memory.high")).
> > > > 
> > > > Could be, though in that case it's throttling the wrong task/cgroup
> > > > as far as I can see (at least from the cgroup's memory stats), or it
> > > > is being blocked by state external to the cgroup.
> > > > Will have a look at those patches to get a better idea of what they
> > > > change.
> > > 
> > > Could you check where is the task of your interest throttled?
> > > /proc/<pid>/stack should give you a clue.  
> > 
> > As guessed by Chris, it's
> > [<0>] mem_cgroup_handle_over_high+0x121/0x170
> > [<0>] exit_to_usermode_loop+0x67/0xa0
> > [<0>] do_syscall_64+0x149/0x170
> > [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > 
> > 
> > And I know of no way to tell the kernel to "drop all caches" for a specific
> > cgroup, nor how to list the inactive files assigned to a given cgroup (knowing
> > which ones they are and their idle state could help in understanding why
> > they aren't being reclaimed).
> > 
> > 
> > 
> > Could it be that cache is being prevented from being reclaimed by a task
> > in another cgroup?
> > 
> > e.g.
> >   cgroup/system/backup
> >     first reads $files (reads each once)
> >   cgroup/workload/bla
> >     second&more reads $files
> > 
> > Would $files remain associated to cgroup/system/backup and not
> > reclaimed there instead of being reassigned to cgroup/workload/bla?  
> 
> No, page cache is first-touch-gets-charged. But interference is certainly
> possible if the memory is somehow pinned - e.g. mlock - by
> a task from another cgroup or internally by FS.
> 
> Your earlier stat snapshot doesn't indicate a big problem with the
> reclaim though:
> 
> memory.stat:pgscan 47519855
> memory.stat:pgsteal 44933838
> 
> This tells the overall reclaim effectiveness was 94%. Could you try to
> gather snapshots with a 1s granularity starting before you run your
> backup to see how those numbers evolve? Ideally with timestamps to
> compare with the actual stall information.
> 
> Another option would be to enable vmscan tracepoints but let's try with
> stats first.
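
(For reference: the 94% above is pgsteal/pgscan, i.e. 44933838 / 47519855 ≈ 0.945.
The requested per-second snapshots can be gathered with a small loop along
these lines; the cgroup path and output directory are assumptions:
  mkdir -p /tmp/memcg-snapshots
  while sleep 1; do
      d=/tmp/memcg-snapshots/$(date +%s); mkdir -p "$d"
      for f in memory.current memory.stat memory.pressure; do
          cat /sys/fs/cgroup/tar-test/$f > "$d/$f"
      done
  done
The per-second directory names double as the timestamps asked for above.)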



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 uprade slows backup
       [not found]                 ` <20200414163134.GQ4629@dhcp22.suse.cz>
@ 2020-04-15 10:17                   ` Bruno Prémont
  2020-04-15 10:24                     ` Michal Hocko
  0 siblings, 1 reply; 18+ messages in thread
From: Bruno Prémont @ 2020-04-15 10:17 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Chris Down, cgroups, linux-mm, Johannes Weiner, Vladimir Davydov

Hi Michal,

On Tue, 14 Apr 2020 18:31:34 Michal Hocko <mhocko@kernel.org> wrote:
> On Fri 10-04-20 11:50:10, Bruno Prémont wrote:
> > Hi Michal, Chris,
> > 
> > Sending ephemeral link to (now properly) captured cgroup details off-list.

Re-adding the list and other readers.

> > It contains:
> >   snapshots of the cgroup contents at 1s intervals
> > The backup was running through the full captured period.
> > 
> > You can see memory.max changes at
> >   26-21  (to high + 128M)
> > and
> >   30-24  (back to high)  
> 
> OK, so you have started with high = max and do not see any stalls. As
> soon as you activate the high limit reclaim by increasing the max limit,
> you get your stalls because those tasks are put to sleep. There are
> only 3 tasks in the cgroup and they seem to be a shell, tar and some
> subshell - or is there anything else that could charge any memory?

On the production system it's just the backup software (3-4 processes).

In the basic reproducer it's bash, tar and tar's gzip subprocess.

> Let's just focus on the time prior to switch and after
> * prior
> $ for i in $(seq -w 21); do cat 26-$i/memory.current ; done | calc_min_max.awk
> min: 2371895296.00 max: 2415874048.00 avg: 2404059136.00 std: 13469809.78 nr: 20
> 
> high = hard limit = 2415919104
> 
> * after
> $ for i in $(seq -w 22 59); do cat 26-$i/memory.current ; done | calc_min_max.awk
> min: 2409172992.00 max: 2415828992.00 avg: 2415420793.26 std: 1475181.24 nr: 38
> 
> high limit = 2415919104
> hard limit = 2550136832
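
(calc_min_max.awk itself is not included in the thread; an equivalent summary
of a column of numbers can be produced with a stock awk one-liner, e.g.:
  awk '{ n++; s += $1; ss += $1 * $1;
         if (n == 1 || $1 < min) min = $1;
         if (n == 1 || $1 > max) max = $1 }
       END { printf "min: %.2f max: %.2f avg: %.2f std: %.2f nr: %d\n",
             min, max, s / n, sqrt(ss / n - (s / n) ^ 2), n }'
This computes the population standard deviation, which may differ slightly
from whatever the original script computes.)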
> 
> Nothing interesting here. The charged memory stays below the high limit.
> This might be a matter of timing of course because your snapshot might
> hit the window when the situation was ok. But 90K is a larger margin
> than I would expect in such a case.
> 
> $ cat 27-*/memory.current | calc_min_max.awk 
> min: 2408161280.00 max: 2415910912.00 avg: 2415709583.19 std: 993983.28 nr: 59
> 
> Still under the high limit but closer (within 8K), so this looks much more
> like high limit reclaim in action.
> 
> $ cat 28-*/memory.current | calc_min_max.awk
> min: 2409123840.00 max: 2415910912.00 avg: 2415633019.59 std: 870311.11 nr: 58
> 
> same here.
> 
> $ cat 29-*/memory.current | calc_min_max.awk
> min: 2400464896.00 max: 2415828992.00 avg: 2414819015.59 std: 3133978.89 nr: 59
> 
> quite below high limit.
> 
> So I do not see any large high limit excess here but it could be a
> matter of timing as mentioned above. Let's have a look at the reclaim
> activity.
> 
> 27-00/memory.stat:pgscan 82811883
> 27-00/memory.stat:pgsteal 80223813
> 27-01/memory.stat:pgscan 82811883
> 27-01/memory.stat:pgsteal 80223813
> 
> No scanning 
> 
> 27-02/memory.stat:pgscan 82811947
> 27-02/memory.stat:pgsteal 80223877
> 
> 64 pages scanned and reclaimed
> 
> 27-03/memory.stat:pgscan 82811947
> 27-03/memory.stat:pgsteal 80223877
> 27-04/memory.stat:pgscan 82811947
> 27-04/memory.stat:pgsteal 80223877
> 27-05/memory.stat:pgscan 82811947
> 27-05/memory.stat:pgsteal 80223877
> 
> No scanning
> 
> 27-06/memory.stat:pgscan 82812011
> 27-06/memory.stat:pgsteal 80223941
> 
> 64 pages scanned and reclaimed
> 
> 27-07/memory.stat:pgscan 82812011
> 27-07/memory.stat:pgsteal 80223941
> 27-08/memory.stat:pgscan 82812011
> 27-08/memory.stat:pgsteal 80223941
> 27-09/memory.stat:pgscan 82812011
> 27-09/memory.stat:pgsteal 80223941
> 
> No scanning
> 
> 27-11/memory.stat:pgscan 82812075
> 27-11/memory.stat:pgsteal 80224005
> 
> 64 pages scanned
> 
> 27-12/memory.stat:pgscan 82812075
> 27-12/memory.stat:pgsteal 80224005
> 27-13/memory.stat:pgscan 82812075
> 27-13/memory.stat:pgsteal 80224005
> 27-14/memory.stat:pgscan 82812075
> 27-14/memory.stat:pgsteal 80224005
> 
> No scanning. etc...
> 
> So it seems there were two rounds of scanning (we usually do 32 pages per
> batch) and the reclaim was really effective at reclaiming that memory, but
> then the task is put to sleep for 2-3s. This is quite unexpected
> because the collected stats do not show the high limit excess during the
> sleeping time.
> 
> It would be interesting to see more detailed information on the
> throttling itself. Which kernel version are you testing this on?
> 5.6+ kernels need http://lkml.kernel.org/r/20200331152424.GA1019937@chrisdown.name
> but please note that e26733e0d0ec ("mm, memcg: throttle allocators based
> on ancestral memory.high") has been marked for stable, so 5.4+ kernels
> might have it as well and would need the same fix.
> I wouldn't be really surprised if this was the actual problem that you
> are hitting, because the reclaim could simply make usage < high and the
> math in calculate_high_delay doesn't work properly.

I'm on 5.6.2. Seems neither e26733e0d0ec nor the fix hit 5.6.2 (nor
current 5.6.4).

> Anyway the following simple tracing patch should give a better clue.
> The output will appear in the trace buffer (mount tracefs and read the
> trace_pipe file).
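
(Reading that output typically boils down to something like the following,
assuming tracefs is not already mounted:
  mount -t tracefs nodev /sys/kernel/tracing
  cat /sys/kernel/tracing/trace_pipe
On many systems it is already available under /sys/kernel/debug/tracing.)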

This is the output I get on 5.6.4 with simple tar -zc call (max=high+4096):
  tar-16943 [000] ....  1098.796955: mem_cgroup_handle_over_high: memcg_nr_pages_over_high:1 penalty_jiffies:200 current:262122 high:262144
  tar-16943 [000] ....  1100.876794: mem_cgroup_handle_over_high: memcg_nr_pages_over_high:1 penalty_jiffies:200 current:262122 high:262144
  tar-16943 [000] ....  1102.956636: mem_cgroup_handle_over_high: memcg_nr_pages_over_high:1 penalty_jiffies:200 current:262120 high:262144
  tar-16943 [000] ....  1105.037388: mem_cgroup_handle_over_high: memcg_nr_pages_over_high:1 penalty_jiffies:200 current:262121 high:262144
  tar-16943 [000] ....  1107.117246: mem_cgroup_handle_over_high: memcg_nr_pages_over_high:1 penalty_jiffies:200 current:262122 high:262144
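
(Worth noting: in every line above current is below high - 262122 vs 262144
pages - yet the maximum penalty of 200 jiffies (2s at HZ=100, matching the
~2s stalls seen earlier) is applied. That is consistent with the overage
being computed as an unsigned subtraction that wraps around when usage is
below high, e.g. 262122 - 262144 becoming 2^64 - 22 instead of 0, which is
what Michal later calls the underflow fix.)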

With 5.7-rc1 it runs just fine, pressure remains zero and no output in trace_pipe or throttling.

So the fixes that went in there do fix it.
Now it's a matter of cherry-picking the right ones... e26733e0d0ec and its
follow-up fix, maybe some others (will start with those tagged for stable).


Thanks,
Bruno

> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 05b4ec2c6499..dcee3030309d 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2417,6 +2417,10 @@ void mem_cgroup_handle_over_high(void)
>  	if (penalty_jiffies <= HZ / 100)
>  		goto out;
>  
> +	trace_printk("memcg_nr_pages_over_high:%d penalty_jiffies:%ld current:%lu high:%lu\n",
> +			nr_pages, penalty_jiffies,
> +			page_counter_read(&memcg->memory), READ_ONCE(memcg->high));
> +
>  	/*
>  	 * If we exit early, we're guaranteed to die (since
>  	 * schedule_timeout_killable sets TASK_KILLABLE). This means we don't



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 uprade slows backup
  2020-04-15 10:17                   ` Bruno Prémont
@ 2020-04-15 10:24                     ` Michal Hocko
  2020-04-15 11:37                       ` Bruno Prémont
  0 siblings, 1 reply; 18+ messages in thread
From: Michal Hocko @ 2020-04-15 10:24 UTC (permalink / raw)
  To: Bruno Prémont
  Cc: Chris Down, cgroups, linux-mm, Johannes Weiner, Vladimir Davydov

On Wed 15-04-20 12:17:53, Bruno Prémont wrote:
[...]
> > Anyway the following simple tracing patch should give a better clue.
> > The output will appear in the trace buffer (mount tracefs and read the
> > trace_pipe file).
> 
> This is the output I get on 5.6.4 with simple tar -zc call (max=high+4096):
>   tar-16943 [000] ....  1098.796955: mem_cgroup_handle_over_high: memcg_nr_pages_over_high:1 penalty_jiffies:200 current:262122 high:262144
>   tar-16943 [000] ....  1100.876794: mem_cgroup_handle_over_high: memcg_nr_pages_over_high:1 penalty_jiffies:200 current:262122 high:262144
>   tar-16943 [000] ....  1102.956636: mem_cgroup_handle_over_high: memcg_nr_pages_over_high:1 penalty_jiffies:200 current:262120 high:262144
>   tar-16943 [000] ....  1105.037388: mem_cgroup_handle_over_high: memcg_nr_pages_over_high:1 penalty_jiffies:200 current:262121 high:262144
>   tar-16943 [000] ....  1107.117246: mem_cgroup_handle_over_high: memcg_nr_pages_over_high:1 penalty_jiffies:200 current:262122 high:262144

OK, that points to the underflow fix.

> 
> With 5.7-rc1 it runs just fine, pressure remains zero and no output in trace_pipe or throttling.
> 
> So the fixes that went in there do fix it.
> Now it's a matter of cherry-picking the right ones... e26733e0d0ec and its
> follow-up fix, maybe some others (will start with those tagged for stable).

I have seen Greg picking up this for stable trees so it should show up
there soon.

Thanks!
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Memory CG and 5.1 to 5.6 uprade slows backup
  2020-04-15 10:24                     ` Michal Hocko
@ 2020-04-15 11:37                       ` Bruno Prémont
  0 siblings, 0 replies; 18+ messages in thread
From: Bruno Prémont @ 2020-04-15 11:37 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Chris Down, cgroups, linux-mm, Johannes Weiner, Vladimir Davydov

On Wed, 15 Apr 2020 12:24:42 Michal Hocko <mhocko@kernel.org> wrote:
> On Wed 15-04-20 12:17:53, Bruno Prémont wrote:
> [...]
> > > Anyway the following simple tracing patch should give a better clue.
> > > The output will appear in the trace buffer (mount tracefs and read the
> > > trace_pipe file).
> > 
> > This is the output I get on 5.6.4 with simple tar -zc call (max=high+4096):
> >   tar-16943 [000] ....  1098.796955: mem_cgroup_handle_over_high: memcg_nr_pages_over_high:1 penalty_jiffies:200 current:262122 high:262144
> >   tar-16943 [000] ....  1100.876794: mem_cgroup_handle_over_high: memcg_nr_pages_over_high:1 penalty_jiffies:200 current:262122 high:262144
> >   tar-16943 [000] ....  1102.956636: mem_cgroup_handle_over_high: memcg_nr_pages_over_high:1 penalty_jiffies:200 current:262120 high:262144
> >   tar-16943 [000] ....  1105.037388: mem_cgroup_handle_over_high: memcg_nr_pages_over_high:1 penalty_jiffies:200 current:262121 high:262144
> >   tar-16943 [000] ....  1107.117246: mem_cgroup_handle_over_high: memcg_nr_pages_over_high:1 penalty_jiffies:200 current:262122 high:262144  
> 
> OK, that points to the underflow fix.
> 
> > 
> > With 5.7-rc1 it runs just fine, pressure remains zero and no output in trace_pipe or throttling.
> > 
> > So the fixes that went in there do fix it.
> > Now it's a matter of cherry-picking the right ones... e26733e0d0ec and its
> > follow-up fix, maybe some others (will start with those tagged for stable).
> 
> I have seen Greg picking up this for stable trees so it should show up
> there soon.

Applying just 9b8b17541f13809d06f6f873325305ddbb760e3e, which went to
stable-rc for 5.6.5, gets things running fine here.
(e26733e0d0ec seems to have gone in shortly prior to the 5.6 release; I need
to improve my git-foo to locate commits between tags!)
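
(For the record, locating a commit relative to release tags can be done with,
for example:
  git describe --contains e26733e0d0ec                # nearest tag containing it
  git log --oneline v5.5..v5.6 -- mm/memcontrol.c     # memcontrol.c commits between two tags
which avoids having to guess from commit dates.)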

So yes it's the fix.

Thanks,
Bruno

> Thanks!


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2020-04-15 11:37 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-09  9:25 Memory CG and 5.1 to 5.6 uprade slows backup Bruno Prémont
2020-04-09  9:46 ` Michal Hocko
2020-04-09 10:17   ` Bruno Prémont
2020-04-09 10:34     ` Michal Hocko
2020-04-09 15:09       ` Bruno Prémont
2020-04-09 15:24         ` Chris Down
2020-04-09 15:40           ` Bruno Prémont
2020-04-09 17:50             ` Chris Down
2020-04-09 17:56               ` Chris Down
2020-04-09 15:25         ` Michal Hocko
2020-04-10  7:15           ` Bruno Prémont
2020-04-10  8:43             ` Bruno Prémont
     [not found]               ` <20200410115010.1d9f6a3f@hemera.lan.sysophe.eu>
     [not found]                 ` <20200414163134.GQ4629@dhcp22.suse.cz>
2020-04-15 10:17                   ` Bruno Prémont
2020-04-15 10:24                     ` Michal Hocko
2020-04-15 11:37                       ` Bruno Prémont
2020-04-14 15:09           ` Bruno Prémont
2020-04-09 10:50 ` Chris Down
2020-04-09 11:58   ` Bruno Prémont

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).