From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1ADB8C4320A for ; Tue, 17 Aug 2021 19:40:09 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A85C061051 for ; Tue, 17 Aug 2021 19:40:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org A85C061051 Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id E854A6B0071; Tue, 17 Aug 2021 15:40:07 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E36696B0072; Tue, 17 Aug 2021 15:40:07 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D24E06B0073; Tue, 17 Aug 2021 15:40:07 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0114.hostedemail.com [216.40.44.114]) by kanga.kvack.org (Postfix) with ESMTP id B93CF6B0071 for ; Tue, 17 Aug 2021 15:40:07 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 5E75518379561 for ; Tue, 17 Aug 2021 19:40:07 +0000 (UTC) X-FDA: 78485588454.06.7C7C01D Received: from mail-yb1-f201.google.com (mail-yb1-f201.google.com [209.85.219.201]) by imf28.hostedemail.com (Postfix) with ESMTP id 123FB9000E28 for ; Tue, 17 Aug 2021 19:40:06 +0000 (UTC) Received: by mail-yb1-f201.google.com with SMTP id c63-20020a25e5420000b0290580b26e708aso222243ybh.12 for ; Tue, 17 Aug 2021 12:40:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:message-id:mime-version:subject:from:to:cc; bh=Zhmz4tSi4Rbh2SiW1zcs8JSYVEpkkiLZgOLqRwEHjaU=; b=fbexivHCjvntQgkjo3I+gEVgsP4V9hqh6tDCH83I0jXI1pcqmMZlp/LyD9rLT4X7cg via0sfB3W+qysZ4h+voLgsl9vzTNGHXf4DWk/StmX65YobTQbDo8uN4xar8WA11xbz+W 05gI76lN3+KuCLWiTU1LOtl/mM+kJ39pia0WIy/6oTgLMuvAdgeQy4xxz/Z4/b6q3czq Pyaod5Bz7PJpvnHGOcCPVZH+FZOfR8Pn3f2GDGq2bjxnW0LnK9dpZKL6mYRixxjzCzZJ D4C6xl6IQrgBLWvU2s6GzY+DFuThP/K8CZ0NBsAAKvlM40xdaqL2vS8yb9AiEKMRx5r8 +1zw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:mime-version:subject:from:to:cc; bh=Zhmz4tSi4Rbh2SiW1zcs8JSYVEpkkiLZgOLqRwEHjaU=; b=nAgZbr5fbyHTHMc3h+5cs+DWXRrT6q0Un+ELkbbbu083IvoAsRNr0srjeHfvuz4mDH hlz2wre9MLdoqq12v3gEvKm15vE3fTDyPUJhPO1EYr+o990ZKyDhCnAe4DsHquo1RbPE 8/LDRqx86++zZ4adVsSvboR//C5kLiUcjjnUJKa9nTqJv6Q1wVGEPWiy0c0Gz+2pBk8/ XXYlsCsveMy8iLeTMzZOOJqi2VFbiOEo2wWn0MuRoVMVMHoPDM98Je+Ye1XTHisM+JN9 cpw1WC1HglANiyb45oMxvatrKkWNv5Z1MwnOH47C0TeCTct0g24U1Zy081YdHRSpUjPB RA2A== X-Gm-Message-State: AOAM532M21iqsRcMTFVWgw9dhz2Z9cYURWvQ+TYswvFE43S4E3Zou1DU BsWwBSXaAI2D39Tt+2UQRiZWQN4REL8= X-Google-Smtp-Source: ABdhPJzbIZL390hrvi8eavJF1sdqv/zqhJbfwZzHdC1o04WbVIqWvEX0yP6xtZUlb7MTycbQGwseSBQ5Bow= X-Received: from weiwan.svl.corp.google.com ([2620:15c:2c4:201:7b40:a358:5395:95a8]) (user=weiwan job=sendgmr) by 2002:a25:7354:: with SMTP id o81mr6239011ybc.195.1629229206243; Tue, 17 Aug 2021 12:40:06 -0700 (PDT) Date: Tue, 17 Aug 2021 12:40:03 -0700 Message-Id: <20210817194003.2102381-1-weiwan@google.com> Mime-Version: 1.0 X-Mailer: git-send-email 2.33.0.rc1.237.g0d66db33f3-goog Subject: [PATCH net-next] net-memcg: pass in gfp_t mask to mem_cgroup_charge_skmem() From: Wei Wang To: netdev@vger.kernel.org, "David S . Miller" , Jakub Kicinski , cgroups@vger.kernel.org, linux-mm@kvack.org Cc: Roman Gushchin , Eric Dumazet , Shakeel Butt Content-Type: text/plain; charset="UTF-8" X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 123FB9000E28 X-Stat-Signature: hs5qoazqm8hua3f79ny3skm75cokxiyr Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=google.com header.s=20161025 header.b=fbexivHC; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf28.hostedemail.com: domain of 3lhAcYQYKCDcpXbpTgZhhZeX.Vhfebgnq-ffdoTVd.hkZ@flex--weiwan.bounces.google.com designates 209.85.219.201 as permitted sender) smtp.mailfrom=3lhAcYQYKCDcpXbpTgZhhZeX.Vhfebgnq-ffdoTVd.hkZ@flex--weiwan.bounces.google.com X-HE-Tag: 1629229206-302340 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Add gfp_t mask as an input parameter to mem_cgroup_charge_skmem(), to give more control to the networking stack and enable it to change memcg charging behavior. In the future, the networking stack may decide to avoid oom-kills when fallbacks are more appropriate. One behavior change in mem_cgroup_charge_skmem() by this patch is to avoid force charging by default and let the caller decide when and if force charging is needed through the presence or absence of __GFP_NOFAIL. Signed-off-by: Wei Wang Reviewed-by: Shakeel Butt --- include/linux/memcontrol.h | 3 ++- include/net/sock.h | 5 +++++ mm/memcontrol.c | 24 +++++++++++------------- net/core/sock.c | 16 ++++++++++++---- net/ipv4/inet_connection_sock.c | 3 ++- net/ipv4/tcp_output.c | 3 ++- 6 files changed, 34 insertions(+), 20 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index bfe5c486f4ad..f0ee30881ca9 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -1581,7 +1581,8 @@ static inline void mem_cgroup_flush_foreign(struct bdi_writeback *wb) #endif /* CONFIG_CGROUP_WRITEBACK */ struct sock; -bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages); +bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages, + gfp_t gfp_mask); void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages); #ifdef CONFIG_MEMCG extern struct static_key_false memcg_sockets_enabled_key; diff --git a/include/net/sock.h b/include/net/sock.h index 6e761451c927..95b25777b53e 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2400,6 +2400,11 @@ static inline gfp_t gfp_any(void) return in_softirq() ? GFP_ATOMIC : GFP_KERNEL; } +static inline gfp_t gfp_memcg_charge(void) +{ + return in_softirq() ? GFP_NOWAIT : GFP_KERNEL; +} + static inline long sock_rcvtimeo(const struct sock *sk, bool noblock) { return noblock ? 0 : sk->sk_rcvtimeo; diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 8ef06f9e0db1..be585ceaba98 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -7048,14 +7048,14 @@ void mem_cgroup_sk_free(struct sock *sk) * mem_cgroup_charge_skmem - charge socket memory * @memcg: memcg to charge * @nr_pages: number of pages to charge + * @gfp_mask: reclaim mode * * Charges @nr_pages to @memcg. Returns %true if the charge fit within - * @memcg's configured limit, %false if the charge had to be forced. + * @memcg's configured limit, %false if it doesn't. */ -bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages) +bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages, + gfp_t gfp_mask) { - gfp_t gfp_mask = GFP_KERNEL; - if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) { struct page_counter *fail; @@ -7063,21 +7063,19 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages) memcg->tcpmem_pressure = 0; return true; } - page_counter_charge(&memcg->tcpmem, nr_pages); memcg->tcpmem_pressure = 1; + if (gfp_mask & __GFP_NOFAIL) { + page_counter_charge(&memcg->tcpmem, nr_pages); + return true; + } return false; } - /* Don't block in the packet receive path */ - if (in_softirq()) - gfp_mask = GFP_NOWAIT; - - mod_memcg_state(memcg, MEMCG_SOCK, nr_pages); - - if (try_charge(memcg, gfp_mask, nr_pages) == 0) + if (try_charge(memcg, gfp_mask, nr_pages) == 0) { + mod_memcg_state(memcg, MEMCG_SOCK, nr_pages); return true; + } - try_charge(memcg, gfp_mask|__GFP_NOFAIL, nr_pages); return false; } diff --git a/net/core/sock.c b/net/core/sock.c index aada649e07e8..950f1e70dbf5 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -2728,10 +2728,12 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind) { struct proto *prot = sk->sk_prot; long allocated = sk_memory_allocated_add(sk, amt); + bool memcg_charge = mem_cgroup_sockets_enabled && sk->sk_memcg; bool charged = true; - if (mem_cgroup_sockets_enabled && sk->sk_memcg && - !(charged = mem_cgroup_charge_skmem(sk->sk_memcg, amt))) + if (memcg_charge && + !(charged = mem_cgroup_charge_skmem(sk->sk_memcg, amt, + gfp_memcg_charge()))) goto suppress_allocation; /* Under limit. */ @@ -2785,8 +2787,14 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind) /* Fail only if socket is _under_ its sndbuf. * In this case we cannot block, so that we have to fail. */ - if (sk->sk_wmem_queued + size >= sk->sk_sndbuf) + if (sk->sk_wmem_queued + size >= sk->sk_sndbuf) { + /* Force charge with __GFP_NOFAIL */ + if (memcg_charge && !charged) { + mem_cgroup_charge_skmem(sk->sk_memcg, amt, + gfp_memcg_charge() | __GFP_NOFAIL); + } return 1; + } } if (kind == SK_MEM_SEND || (kind == SK_MEM_RECV && charged)) @@ -2794,7 +2802,7 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind) sk_memory_allocated_sub(sk, amt); - if (mem_cgroup_sockets_enabled && sk->sk_memcg) + if (memcg_charge && charged) mem_cgroup_uncharge_skmem(sk->sk_memcg, amt); return 0; diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c index 754013fa393b..f25d02ad4a8a 100644 --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -534,7 +534,8 @@ struct sock *inet_csk_accept(struct sock *sk, int flags, int *err, bool kern) atomic_read(&newsk->sk_rmem_alloc)); mem_cgroup_sk_alloc(newsk); if (newsk->sk_memcg && amt) - mem_cgroup_charge_skmem(newsk->sk_memcg, amt); + mem_cgroup_charge_skmem(newsk->sk_memcg, amt, + GFP_KERNEL | __GFP_NOFAIL); release_sock(newsk); } diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 29553fce8502..6d72f3ea48c4 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -3373,7 +3373,8 @@ void sk_forced_mem_schedule(struct sock *sk, int size) sk_memory_allocated_add(sk, amt); if (mem_cgroup_sockets_enabled && sk->sk_memcg) - mem_cgroup_charge_skmem(sk->sk_memcg, amt); + mem_cgroup_charge_skmem(sk->sk_memcg, amt, + gfp_memcg_charge() | __GFP_NOFAIL); } /* Send a FIN. The caller locks the socket for us. -- 2.33.0.rc1.237.g0d66db33f3-goog