Date: Thu, 9 Apr 2020 11:46:15 +0200
From: Michal Hocko
To: Bruno Prémont
Cc: cgroups@vger.kernel.org, linux-mm@kvack.org, Johannes Weiner, Vladimir Davydov, Chris Down
Subject: Re: Memory CG and 5.1 to 5.6 uprade slows backup
Message-ID: <20200409094615.GE18386@dhcp22.suse.cz>
References: <20200409112505.2e1fc150@hemera.lan.sysophe.eu>
In-Reply-To: <20200409112505.2e1fc150@hemera.lan.sysophe.eu>

[Cc Chris]

On Thu 09-04-20 11:25:05, Bruno Prémont wrote:
> Hi,
>
> Upgrading from 5.1 kernel to 5.6 kernel on a production system using
> cgroups (v2) and having backup process in a memory.high=2G cgroup
> sees backup being highly throttled (there are about 1.5T to be
> backuped).

What does /proc/sys/vm/dirty_* say? Is it possible that the reclaim is
not making progress on too many dirty pages and that triggers the
backoff mechanism that was implemented recently in 5.4? Have a look at
0e4b01df8659 ("mm, memcg: throttle allocators when failing reclaim over
memory.high") and e26733e0d0ec ("mm, memcg: throttle allocators based on
ancestral memory.high").

Keeping the rest of the email for reference.

> Most memory usage in that cgroup is for file cache.
>
> Here are the memory details for the cgroup:
> memory.current:2147225600
> memory.events:low 0
> memory.events:high 423774
> memory.events:max 31131
> memory.events:oom 0
> memory.events:oom_kill 0
> memory.events.local:low 0
> memory.events.local:high 423774
> memory.events.local:max 31131
> memory.events.local:oom 0
> memory.events.local:oom_kill 0
> memory.high:2147483648
> memory.low:33554432
> memory.max:2415919104
> memory.min:0
> memory.oom.group:0
> memory.pressure:some avg10=90.42 avg60=72.59 avg300=78.30 total=298252577711
> memory.pressure:full avg10=90.32 avg60=72.53 avg300=78.24 total=295658626500
> memory.stat:anon 10887168
> memory.stat:file 2062102528
> memory.stat:kernel_stack 73728
> memory.stat:slab 76148736
> memory.stat:sock 360448
> memory.stat:shmem 0
> memory.stat:file_mapped 12029952
> memory.stat:file_dirty 946176
> memory.stat:file_writeback 405504
> memory.stat:anon_thp 0
> memory.stat:inactive_anon 0
> memory.stat:active_anon 10121216
> memory.stat:inactive_file 1954959360
> memory.stat:active_file 106418176
> memory.stat:unevictable 0
> memory.stat:slab_reclaimable 75247616
> memory.stat:slab_unreclaimable 901120
> memory.stat:pgfault 8651676
> memory.stat:pgmajfault 2013
> memory.stat:workingset_refault 8670651
> memory.stat:workingset_activate 409200
> memory.stat:workingset_nodereclaim 62040
> memory.stat:pgrefill 1513537
> memory.stat:pgscan 47519855
> memory.stat:pgsteal 44933838
> memory.stat:pgactivate 7986
> memory.stat:pgdeactivate 1480623
> memory.stat:pglazyfree 0
> memory.stat:pglazyfreed 0
> memory.stat:thp_fault_alloc 0
> memory.stat:thp_collapse_alloc 0
>
> Numbers that change most are pgscan/pgsteal
> Regularly the backup process seems to be blocked for about 2s, but not
> within a syscall according to strace.
>
> Is there a way to tell kernel that this cgroup should not be throttled
> and its inactive file cache given up (rather quickly).
>
> The aim here is to avoid backup from killing production task file cache
> but not starving it.
>
>
> If there is some useful info missing, please tell (eventually adding how
> I can obtain it).
>
>
> On a side note, I liked v1's mode of soft/hard memory limit where the
> memory amount between soft and hard could be used if system has enough
> free memory. For v2 the difference between high and max seems almost of
> no use.
>
> A cgroup parameter for impacting RO file cache differently than
> anonymous memory or otherwise dirty memory would be great too.
>
>
> Thanks,
> Bruno

-- 
Michal Hocko
SUSE Labs
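
For anyone wanting to gather the data asked for above, here is a minimal Python sketch. It is only an illustration and was not part of the original exchange. It reads the /proc/sys/vm/dirty_* knobs and derives two rough ratios from memory.stat counters. The helper names (read_dirty_sysctls, parse_memory_stat, reclaim_summary) and the choice of ratios are assumptions made for this example; the sample values are the ones quoted in the report.

```python
import glob
import os


def read_dirty_sysctls(base="/proc/sys/vm"):
    """Return {name: value} for every dirty_* knob under `base` (Linux only)."""
    values = {}
    for path in sorted(glob.glob(os.path.join(base, "dirty_*"))):
        try:
            with open(path) as handle:
                values[os.path.basename(path)] = handle.read().strip()
        except OSError:
            continue  # unreadable knob: skip it rather than fail
    return values


def parse_memory_stat(text):
    """Parse 'key value' lines as found in a cgroup v2 memory.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def reclaim_summary(stats):
    """Hypothetical helper: two ratios hinting at whether reclaim progresses."""
    scanned = stats.get("pgscan", 0)
    file_bytes = stats.get("file", 0)
    dirty = stats.get("file_dirty", 0) + stats.get("file_writeback", 0)
    return {
        # Fraction of scanned pages that were actually reclaimed.
        "steal_per_scan": stats.get("pgsteal", 0) / scanned if scanned else 0.0,
        # Share of the file cache that is dirty or under writeback.
        "dirty_share": dirty / file_bytes if file_bytes else 0.0,
    }


# Counters copied from the memory.stat values quoted in the report.
SAMPLE = """\
file 2062102528
file_dirty 946176
file_writeback 405504
pgscan 47519855
pgsteal 44933838
"""

if __name__ == "__main__":
    for name, value in read_dirty_sysctls().items():
        print(f"{name} = {value}")
    print(reclaim_summary(parse_memory_stat(SAMPLE)))
```

On a live system, the text passed to parse_memory_stat would come from the cgroup's own memory.stat file rather than from SAMPLE. The two ratios are only a rough first look and do not replace the raw dirty_* values.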