From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 012A9C433DB for ; Mon, 22 Mar 2021 15:54:06 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 2F6D76187E for ; Mon, 22 Mar 2021 15:54:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2F6D76187E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=cmpxchg.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id E2F7A6B00D7; Mon, 22 Mar 2021 11:35:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DDEFA6B00D9; Mon, 22 Mar 2021 11:35:13 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C80C56B00DB; Mon, 22 Mar 2021 11:35:13 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0235.hostedemail.com [216.40.44.235]) by kanga.kvack.org (Postfix) with ESMTP id A692B6B00D7 for ; Mon, 22 Mar 2021 11:35:13 -0400 (EDT) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 0438618019F35 for ; Mon, 22 Mar 2021 15:54:04 +0000 (UTC) X-FDA: 77947956408.24.FA0CB81 Received: from mail-qk1-f180.google.com (mail-qk1-f180.google.com [209.85.222.180]) by imf23.hostedemail.com (Postfix) with ESMTP id 6B52FA004CDD for ; Mon, 22 Mar 2021 15:53:56 +0000 (UTC) Received: by mail-qk1-f180.google.com with SMTP id y5so9473514qkl.9 for ; Mon, 22 Mar 2021 08:53:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cmpxchg-org.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=E5pR4bhxZETSxTpVA3uMOpN6g18ZTkn19HmQ/Nnh+4o=; b=EHV7q9LwYDxGRVPpLfLVmAGUPg55im2cpNJ3+h7ZlJ1Ip9KZXSuVb095YaAk3LoOe/ TgStUPZvFQicUOGPKSakFK2kCfiXea14bbW2LFqTMzJHAQ699cVZK8BAkp+LNTaytRUn Xp4dYojRqHJtOPYC1Y+13U+2/rglDmFPvm8uzwDef0JIWnPky/yXvAcfWJVfTw1hGDwY QkFU8zUYaLJLXnZzSQddyAL1Lp/z6sVM4sS/u2Utm0nw571hVtcoCo3bbORJzj5JccAO 05Tbf0Ttjbba+BA460+lrwKiAcZ1gqEgq2hUl1t2MsaiTHwbFZ74Gz6I60bixo9cDDrQ GvwQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=E5pR4bhxZETSxTpVA3uMOpN6g18ZTkn19HmQ/Nnh+4o=; b=NkF0yTdmpI9L7FiEYQ6JZofJ7v1Ec21uUMb8iCNH46qNbZOXdM6SpiuwqtUGkb3ddm G8c9BTlK8CAdsWeZuHHx05/C8y48ehn7GkOWTU1SoKaxEVpuDxagk0ZIuqOrHlnOxQ6r 1hNfBozq9tx02pazhPSHjmCYEt2gAh2S7Kv0x3mh8B+6YhOhyfkQT8S/Kh5Ecb+P6gc3 XWZo0KPKboKMlafS5jMb7cj3/DS/+qIlgD2LqYyDyULsnpv+pYByYlL9TsLq7AZoJH+/ m8obuKIkt9aMBQSg7UCSmhrpHk2IPmvGe4aadMpaULVzwlQ44+GA93ADe8Ls26ZkOa4K 5gOg== X-Gm-Message-State: AOAM533FBIivmXUzjjjB+UgCA4s+Tkb12QkNXEBpRCIZf0Hk18PWcRW0 0d8GHzvky/ZMa9+EETCDNe+xKg== X-Google-Smtp-Source: ABdhPJwOPDa0VeHX2SUaYbWSspS6E9g6n6b7KiPDGxc8+VLv8EfhQeQgWB9cMaa8ljeevDtYUcNFKA== X-Received: by 2002:a37:94f:: with SMTP id 76mr760874qkj.222.1616428423644; Mon, 22 Mar 2021 08:53:43 -0700 (PDT) Received: from localhost ([2620:10d:c091:480::1:b54e]) by smtp.gmail.com with ESMTPSA id m3sm10989585qkk.15.2021.03.22.08.53.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 22 Mar 2021 08:53:43 -0700 (PDT) Date: Mon, 22 Mar 2021 11:53:41 -0400 From: Johannes Weiner To: Hugh Dickins Cc: Andrew Morton , Shakeel Butt , Michal Hocko , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com Subject: Re: [PATCH 2/2] mm: memcontrol: deprecate swapaccounting=0 mode Message-ID: References: <20210319054944.50048-1-hannes@cmpxchg.org> <20210319054944.50048-2-hannes@cmpxchg.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Stat-Signature: 3udth4zb6anok6prp3kjkueha3zh1umy X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 6B52FA004CDD Received-SPF: none (cmpxchg.org>: No applicable sender policy available) receiver=imf23; identity=mailfrom; envelope-from=""; helo=mail-qk1-f180.google.com; client-ip=209.85.222.180 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1616428436-739440 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Mar 19, 2021 at 05:37:05PM -0700, Hugh Dickins wrote: > On Fri, 19 Mar 2021, Johannes Weiner wrote: > > > The swapaccounting= commandline option already does very little > > today. To close a trivial containment failure case, the swap ownership > > tracking part of the swap controller has recently become mandatory > > (see commit 2d1c498072de ("mm: memcontrol: make swap tracking an > > integral part of memory control") for details), which makes up the > > majority of the work during swapout, swapin, and the swap slot map. > > > > The only thing left under this flag is the page_counter operations and > > the visibility of the swap control files in the first place, which are > > rather meager savings. There also aren't many scenarios, if any, where > > controlling the memory of a cgroup while allowing it unlimited access > > to a global swap space is a workable resource isolation stragegy. > > > > On the other hand, there have been several bugs and confusion around > > the many possible swap controller states (cgroup1 vs cgroup2 behavior, > > memory accounting without swap accounting, memcg runtime disabled). > > > > This puts the maintenance overhead of retaining the toggle above its > > practical benefits. Deprecate it. > > > > Suggested-by: Shakeel Butt > > Signed-off-by: Johannes Weiner > > This crashes, and needs a fix: see below (plus some nits). > > But it's a very welcome cleanup: just getting rid of all those > !cgroup_memory_noswap double negatives is a relief in itself. > > It does suggest eliminating CONFIG_MEMCG_SWAP altogether (just > using #ifdef CONFIG_SWAP instead, in those parts of CONFIG_MEMCG code); > but you're right that's a separate cleanup, and not nearly so worthwhile > as this one (I notice CONFIG_MEMCG_SWAP in some of the arch defconfigs, > and don't know whether whoever removes CONFIG_MEMCG_SWAP would be > obligated to remove those too). 2d1c498072de ("mm: memcontrol: make swap tracking an integral part of memory control") made it invisible to the user already, I only kept the symbol for convenience in the Makefile: obj-$(CONFIG_MEMCG_SWAP) += swap_cgroup.o But I guess we could replace it with ifdef CONFIG_SWAP obj-$(CONFIG_MEMCG) += swap_cgroup.c endif and I agree, everywhere else it's currently used would be nicer without it. I'll send a separate patch. > > @@ -99,7 +92,11 @@ static DECLARE_WAIT_QUEUE_HEAD(memcg_cgwb_frn_waitq); > > /* Whether legacy memory+swap accounting is active */ > > static bool do_memsw_account(void) > > { > > - return !cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_noswap; > > + /* cgroup2 doesn't do mem+swap accounting */ > > + if (cgroup_subsys_on_dfl(memory_cgrp_subsys)) > > + return false; > > + > > + return true; > > Nit: I'm not fond of the "if (boolean()) return true; else return false;" > codestyle, and would prefer the straightforward > > return !cgroup_subsys_on_dfl(memory_cgrp_subsys); > > but you've chosen otherwise, so, okay. I prefer a series of branches if a single expression becomes unwieldy, and individual conditions deserve their own comments. But it's indeed pretty silly in this case, I'll make it a single line. > > @@ -7124,7 +7121,7 @@ long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg) > > { > > long nr_swap_pages = get_nr_swap_pages(); > > > > - if (cgroup_memory_noswap || !cgroup_subsys_on_dfl(memory_cgrp_subsys)) > > + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) > > That needs to check mem_cgroup_disabled() where it used to check > cgroup_memory_noswap. The convolutions are confusing (which is > precisely why this is such a welcome cleanup), but without a > mem_cgroup_disabled() (or NULL memcg) check there, the > cgroup_disable=memory case oopses on NULLish pointer dereference > when mem_cgroup_get_nr_swap_pages() is called from get_scan_count(). > > (This little function has been repeatedly troublesome that way.) Sigh, yes, this will hopefully be the last instance of this bug... Thanks for catching that, I'll fix it up. > > @@ -7141,7 +7138,7 @@ bool mem_cgroup_swap_full(struct page *page) > > > > if (vm_swap_full()) > > return true; > > - if (cgroup_memory_noswap || !cgroup_subsys_on_dfl(memory_cgrp_subsys)) > > + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) > > Would it now be better to say "if (do_memsw_account())" there? Yes, I'll change that. > Or would it be better to eliminate do_memsw_account() altogether, > and just use !cgroup_subsys_on_dfl(memory_cgrp_subsys) throughout? I have found do_memsw_account() useful in the past to find those related pieces. The details elude me know but I remember searching for this string often during the recent work in this area. > (Though I don't find "cgroup_subsys_on_dfl" very informative.) This routinely bothers me as well, but I have not been able to come up with a good replacement. memcg_v1()? memcg_legacy_mode()? memory_cgroup_in_bytes()? > > @@ -7291,27 +7287,13 @@ static struct cftype memsw_files[] = { > > { }, /* terminate */ > > }; > > > > -/* > > - * If mem_cgroup_swap_init() is implemented as a subsys_initcall() > > - * instead of a core_initcall(), this could mean cgroup_memory_noswap still > > - * remains set to false even when memcg is disabled via "cgroup_disable=memory" > > - * boot parameter. This may result in premature OOPS inside > > - * mem_cgroup_get_nr_swap_pages() function in corner cases. > > - */ > > static int __init mem_cgroup_swap_init(void) > > { > > - /* No memory control -> no swap control */ > > - if (mem_cgroup_disabled()) > > - cgroup_memory_noswap = true; > > - > > - if (cgroup_memory_noswap) > > - return 0; > > - > > Shakeel suggested "if (mem_cgroup_disabled()) return 0;" here, > and that was the first thing I tried when I got the crash. It really isn't obviously safe. I'll put it back in.