From mboxrd@z Thu Jan 1 00:00:00 1970
From: Shakeel Butt
Date: Mon, 26 Apr 2021 12:39:44 -0700
Subject: Re: [PATCH v2 1/1] memcg: enable accounting for pids in nested pid namespaces
To: Vasily Averin
Cc: Michal Hocko, Cgroups, LKML, Roman Gushchin, Christian Brauner, Michal Koutný, Serge Hallyn
References: <7b777e22-5b0d-7444-343d-92cbfae5f8b4@virtuozzo.com> <8b6de616-fd1a-02c6-cbdb-976ecdcfa604@virtuozzo.com>
In-Reply-To: <8b6de616-fd1a-02c6-cbdb-976ecdcfa604@virtuozzo.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Apr 24, 2021 at 4:54 AM Vasily Averin wrote:
>
> Commit 5d097056c9a0 ("kmemcg: account certain kmem allocations to memcg")
> enabled memcg accounting for pids allocated from init_pid_ns.pid_cachep,
> but forgot to adjust the setting for nested pid namespaces.
> As a result, pid memory is not accounted exactly where it is really needed,
> inside memcg-limited containers with their own pid namespaces.
>
> Pid was one the first kernel objects enabled for memcg accounting.
> init_pid_ns.pid_cachep marked by SLAB_ACCOUNT and we can expect that
> any new pids in the system are memcg-accounted.
>
> Though recently I've noticed that it is wrong. nested pid namespaces creates
> own slab caches for pid objects, nested pids have increased size because contain
> id both for all parent and for own pid namespaces. The problem is that these slab
> caches are _NOT_ marked by SLAB_ACCOUNT, as a result any pids allocated in
> nested pid namespaces are not memcg-accounted.
>
> Pid struct in nested pid namespace consumes up to 500 bytes memory,
> 100000 such objects gives us up to ~50Mb unaccounted memory,
> this allow container to exceed assigned memcg limits.
>
> Fixes: 5d097056c9a0 ("kmemcg: account certain kmem allocations to memcg")
> Cc: stable@vger.kernel.org
> Signed-off-by: Vasily Averin
> Reviewed-by: Michal Koutný
> Acked-by: Christian Brauner
> Acked-by: Roman Gushchin

Reviewed-by: Shakeel Butt