From: Shakeel Butt <shakeelb@google.com>
Date: Sat, 17 Jul 2021 09:52:25 -0700
Subject: Re: [PATCH] memcg: charge fs_context and legacy_fs_context
To: Yutian Yang, Andrew Morton
Cc: Michal Hocko, Johannes Weiner, Vladimir Davydov, Cgroups, Linux MM, shenwenbo@zju.edu.cn
In-Reply-To: <1626517201-24086-1-git-send-email-nglaive@gmail.com>

+Andrew Morton

On Sat, Jul 17, 2021 at 3:23 AM Yutian Yang wrote:
>
> This patch adds accounting flags to fs_context and legacy_fs_context
> allocation sites so that kernel could correctly charge these objects.
>
> We have written a PoC to demonstrate the effect of the missing-charging
> bugs. The PoC takes around 1,200MB unaccounted memory, while it is charged
> for only 362MB memory usage. We evaluate the PoC on QEMU x86_64 v5.2.90
> + Linux kernel v5.10.19 + Debian buster. All the limitations including
> ulimits and sysctl variables are set as default. Specifically, the hard
> NOFILE limit and nr_open in sysctl are both 1,048,576.
>
> /*------------------------- POC code ----------------------------*/
>
> #define _GNU_SOURCE
> #include
> #include
> #include
> #include
> #include
> #include
> #include
> #include
> #include
> #include
> #include
> #include
>
> #define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \
>                         } while (0)
>
> #define STACK_SIZE (8 * 1024)
> #ifndef __NR_fsopen
> #define __NR_fsopen 430
> #endif
> static inline int fsopen(const char *fs_name, unsigned int flags)
> {
>         return syscall(__NR_fsopen, fs_name, flags);
> }
>
> static char thread_stack[512][STACK_SIZE];
>
> int thread_fn(void* arg)
> {
>         for (int i = 0; i< 800000; ++i) {
>                 int fsfd = fsopen("nfs", FSOPEN_CLOEXEC);
>                 if (fsfd == -1) {
>                         errExit("fsopen");
>                 }
>         }
>         while(1);
>         return 0;
> }
>
> int main(int argc, char *argv[]) {
>         int thread_pid;
>         for (int i = 0; i < 1; ++i) {
>                 thread_pid = clone(thread_fn, thread_stack[i] + STACK_SIZE, \
>                                    SIGCHLD, NULL);
>         }
>         while(1);
>         return 0;
> }
>
> /*-------------------------- end --------------------------------*/
>
>
> Thanks!
> Yutian Yang,
> Zhejiang University
>
>
> Signed-off-by: Yutian Yang

Reviewed-by: Shakeel Butt

I think this can go through the mm tree.
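For anyone reproducing this on a test machine, a bounded variant of the quoted PoC may be easier to handle than one that spins forever. The following is only a sketch under stated assumptions, not the submitter's code: `fsopen_raw`, `open_fs_contexts` and `close_fs_contexts` are hypothetical helper names, the header list is a guess at what such a program needs, and `FSOPEN_CLOEXEC` is defined locally in case the libc headers do not provide it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_fsopen
#define __NR_fsopen 430            /* same fallback as the quoted PoC */
#endif
#ifndef FSOPEN_CLOEXEC
#define FSOPEN_CLOEXEC 0x00000001  /* value from the kernel's uapi linux/mount.h */
#endif

/* Thin wrapper, as in the PoC: older libcs do not export fsopen(). */
static int fsopen_raw(const char *fs_name, unsigned int flags)
{
	return (int)syscall(__NR_fsopen, fs_name, flags);
}

/*
 * Hypothetical helper: open up to `max` filesystem contexts for `fs_name`
 * and store the fds in `fds`. Stops at the first failure (errno is left
 * set by the failing call) and returns how many contexts were opened.
 */
static size_t open_fs_contexts(const char *fs_name, int *fds, size_t max)
{
	size_t n = 0;

	assert(fds != NULL || max == 0);
	while (n < max) {
		int fd = fsopen_raw(fs_name, FSOPEN_CLOEXEC);
		if (fd < 0)
			break;
		fds[n++] = fd;
	}
	return n;
}

/* Hypothetical helper: close every context opened above. */
static void close_fs_contexts(const int *fds, size_t n)
{
	for (size_t i = 0; i < n; i++)
		close(fds[i]);
}
```

Calling `open_fs_contexts("nfs", fds, N)` inside a memory cgroup and comparing the cgroup's reported usage before and after should show whether the contexts are charged. Keeping the fds around and closing them afterwards lets the test clean up after itself.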
> ---
>  fs/fs_context.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/fs_context.c b/fs/fs_context.c
> index 2834d1afa..4858645ca 100644
> --- a/fs/fs_context.c
> +++ b/fs/fs_context.c
> @@ -231,7 +231,7 @@ static struct fs_context *alloc_fs_context(struct file_system_type *fs_type,
>         struct fs_context *fc;
>         int ret = -ENOMEM;
>
> -       fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL);
> +       fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL_ACCOUNT);
>         if (!fc)
>                 return ERR_PTR(-ENOMEM);
>
> @@ -631,7 +631,7 @@ const struct fs_context_operations legacy_fs_context_ops = {
>   */
>  static int legacy_init_fs_context(struct fs_context *fc)
>  {
> -       fc->fs_private = kzalloc(sizeof(struct legacy_fs_context), GFP_KERNEL);
> +       fc->fs_private = kzalloc(sizeof(struct legacy_fs_context), GFP_KERNEL_ACCOUNT);
>         if (!fc->fs_private)
>                 return -ENOMEM;
>         fc->ops = &legacy_fs_context_ops;
> --
> 2.25.1
>
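To make the effect of the one-flag change concrete, here is a toy userspace model. It is not kernel code. In the kernel, GFP_KERNEL_ACCOUNT is GFP_KERNEL with __GFP_ACCOUNT added, which asks the allocator to charge the object to the allocating task's memory cgroup. Everything in the sketch is made up for illustration: `toy_cgroup`, `toy_alloc`, `toy_run`, the `TOY_*` flag values and the object sizes are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for GFP_KERNEL and __GFP_ACCOUNT; the values are arbitrary. */
#define TOY_KERNEL   0x1u
#define TOY_ACCOUNT  0x2u

struct toy_cgroup {
	size_t limit;      /* bytes the group may be charged */
	size_t charged;    /* bytes charged to the group so far */
	size_t uncharged;  /* bytes allocated on its behalf but never charged */
};

/*
 * Pretend allocation: returns false when an accounted request would push
 * the group over its limit. Unaccounted requests always succeed, which is
 * the gap the patch closes for fs_context objects.
 */
static bool toy_alloc(struct toy_cgroup *cg, size_t size, unsigned int flags)
{
	assert(cg != NULL);
	if (!(flags & TOY_ACCOUNT)) {
		cg->uncharged += size;
		return true;
	}
	if (size > cg->limit - cg->charged)
		return false;
	cg->charged += size;
	return true;
}

/* Attempt `count` allocations of `size` bytes; return how many succeeded. */
static size_t toy_run(struct toy_cgroup *cg, size_t count, size_t size,
		      unsigned int flags)
{
	size_t ok = 0;

	while (ok < count && toy_alloc(cg, size, flags))
		ok++;
	return ok;
}
```

With a 1000-byte limit and 64-byte objects, `toy_run(&cg, 100, 64, TOY_KERNEL)` completes all 100 allocations and leaves `charged` at 0, while the same run with `TOY_KERNEL | TOY_ACCOUNT` stops after 15. The real kernel reacts to a limit with reclaim and possibly the OOM killer rather than a plain failure, so the model only shows which allocations count against the limit.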