From: Shakeel Butt
Date: Mon, 22 Feb 2021 10:45:54 -0800
Subject: Re: [PATCH] memcg: charge before adding to swapcache on swapin
To: Johannes Weiner
Cc: Hugh Dickins, Roman Gushchin, Michal Hocko, Andrew Morton, Cgroups, Linux MM, LKML
On Fri, Feb 19, 2021 at 4:34 PM Johannes Weiner wrote:
>
> On Fri, Feb 19, 2021 at 02:44:05PM -0800, Shakeel Butt wrote:
> > Currently the kernel adds the page, allocated for swapin, to the
> > swapcache before charging the page. This is fine, but now we want a
> > per-memcg swapcache stat, which is essential for folks who want to
> > transparently migrate from cgroup v1's memsw to cgroup v2's memory and
> > swap counters.
> >
> > To correctly maintain the per-memcg swapcache stat, one option, which
> > this patch has adopted, is to charge the page before adding it to the
> > swapcache. One challenge with this option is the failure case of
> > add_to_swap_cache(), on which we need to undo the mem_cgroup_charge().
> > Specifically, undoing mem_cgroup_uncharge_swap() is not simple.
> >
> > This patch circumvents this specific issue by removing the failure path
> > of add_to_swap_cache(), by passing __GFP_NOFAIL. Please note that in
> > this specific situation ENOMEM was the only possible failure of
> > add_to_swap_cache(), which is removed by using __GFP_NOFAIL.
> >
> > Another option was to use __mod_memcg_lruvec_state(NR_SWAPCACHE) in
> > mem_cgroup_charge(), but then we need to take care of the do_swap_page()
> > case where synchronous swap devices bypass the swapcache.
> > do_swap_page() already does hackery to set and reset the PageSwapCache
> > bit to make mem_cgroup_charge() execute the swap accounting code, and
> > then we would need to add an additional parameter to tell it not to
> > touch the NR_SWAPCACHE stat, as that code path bypasses the swapcache.
> >
> > This patch adds a memcg charging API explicitly for swapin pages and
> > cleans up do_swap_page() to not set and reset the PageSwapCache bit.
> >
> > Signed-off-by: Shakeel Butt
>
> The patch makes sense to me. While it extends the charge interface, I
> actually quite like that it charges the page earlier - before putting
> it into wider circulation. It's a step in the right direction.
>
> But IMO the semantics of mem_cgroup_charge_swapin_page() are a bit too
> fickle: the __GFP_NOFAIL in add_to_swap_cache() works around it, but
> having a must-not-fail-after-this line makes the code tricky to work
> on and error prone.
>
> It would be nicer to do a proper transaction sequence.
>
> > @@ -497,16 +497,15 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> >  	__SetPageLocked(page);
> >  	__SetPageSwapBacked(page);
> >
> > -	/* May fail (-ENOMEM) if XArray node allocation failed. */
> > -	if (add_to_swap_cache(page, entry, gfp_mask & GFP_RECLAIM_MASK, &shadow)) {
> > -		put_swap_page(page, entry);
> > +	if (mem_cgroup_charge_swapin_page(page, NULL, gfp_mask, entry))
> >  		goto fail_unlock;
> > -	}
> >
> > -	if (mem_cgroup_charge(page, NULL, gfp_mask)) {
> > -		delete_from_swap_cache(page);
> > -		goto fail_unlock;
> > -	}
> > +	/*
> > +	 * Use __GFP_NOFAIL to not worry about undoing the changes done by
> > +	 * mem_cgroup_charge_swapin_page() on failure of add_to_swap_cache().
> > +	 */
> > +	add_to_swap_cache(page, entry,
> > +			  (gfp_mask|__GFP_NOFAIL) & GFP_RECLAIM_MASK, &shadow);
>
> How about:
>
>	mem_cgroup_charge_swapin_page()
>	add_to_swap_cache()
>	mem_cgroup_finish_swapin_page()
>
> where finish_swapin_page() only uncharges the swap entry (on cgroup1)
> once the swap->memory transition is complete?
>
> Otherwise the patch looks good to me.

Thanks for the review, and yes, this makes the code much more clear and
maintainable.