From: Vasily Averin
Subject: [PATCH v2 1/1] memcg: enable accounting for pids in nested pid namespaces
To: Michal Hocko, cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Roman Gushchin, Christian Brauner, Michal Koutný, Serge Hallyn
Date: Sat, 24 Apr 2021 14:54:35 +0300
Message-ID: <8b6de616-fd1a-02c6-cbdb-976ecdcfa604@virtuozzo.com>
In-Reply-To: <7b777e22-5b0d-7444-343d-92cbfae5f8b4@virtuozzo.com>
References: <7b777e22-5b0d-7444-343d-92cbfae5f8b4@virtuozzo.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Commit 5d097056c9a0 ("kmemcg: account certain kmem allocations to memcg")
enabled memcg accounting for pids allocated from init_pid_ns.pid_cachep,
but forgot to adjust the setting for nested pid namespaces.
As a result, pid memory is not accounted exactly where it is most needed:
inside memcg-limited containers with their own pid namespaces.

Pid was one of the first kernel objects enabled for memcg accounting.
init_pid_ns.pid_cachep is marked with SLAB_ACCOUNT, so we could expect
any new pid in the system to be memcg-accounted.

However, I recently noticed that this is not the case. Nested pid
namespaces create their own slab caches for pid objects, because nested
pids are larger: each one holds an id for its own pid namespace and for
every parent namespace. The problem is that these slab caches are _NOT_
marked with SLAB_ACCOUNT, so pids allocated in nested pid namespaces are
not memcg-accounted.

A pid struct in a nested pid namespace consumes up to 500 bytes of memory,
so 100000 such objects add up to ~50 MB of unaccounted memory, which
allows a container to exceed its assigned memcg limits.

Fixes: 5d097056c9a0 ("kmemcg: account certain kmem allocations to memcg")
Cc: stable@vger.kernel.org
Signed-off-by: Vasily Averin
Reviewed-by: Michal Koutný
Acked-by: Christian Brauner
Acked-by: Roman Gushchin
---
 kernel/pid_namespace.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index 6cd6715..a46a372 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -51,7 +51,8 @@ static struct kmem_cache *create_pid_cachep(unsigned int level)
 	mutex_lock(&pid_caches_mutex);
 	/* Name collision forces to do allocation under mutex. */
 	if (!*pkc)
-		*pkc = kmem_cache_create(name, len, 0, SLAB_HWCACHE_ALIGN, 0);
+		*pkc = kmem_cache_create(name, len, 0,
+					 SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT, 0);
 	mutex_unlock(&pid_caches_mutex);
 	/* current can fail, but someone else can succeed. */
 	return READ_ONCE(*pkc);
-- 
1.8.3.1