From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9D09FC433EF for ; Fri, 20 May 2022 01:07:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344208AbiETBHQ (ORCPT ); Thu, 19 May 2022 21:07:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52928 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344201AbiETBHM (ORCPT ); Thu, 19 May 2022 21:07:12 -0400 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E0AFB91567 for ; Thu, 19 May 2022 18:07:10 -0700 (PDT) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-2ff37db94e8so59046237b3.16 for ; Thu, 19 May 2022 18:07:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=G6gtvqYNjnVwht1zFDxId2+CvU+SMvS0Nm0E4p7o9cM=; b=Gq0rw+MrvLWVPrWEBXryZB1jQV/V1G9kqtZ5HVH9URkE6LQh50hBnHJMNqkGngRE1B OKQFxrvMTob/C1c6Prc225WuKjE8WCTLx3fHin7rlnzNSeIOtlGLTu74nl6iciYuNa6V wKGClUZev9JgvVbePXUNUNBu7Uz1/J49jE9sWsl47yg7HwP0TiOqv882QwVKD+FEXZY5 VRHfGsaPeQ8uLVWv02vR138ZcXZ+G+eh5CxwWdQnlEieVAYcg1VZgbsIj+VW85zoNOBM wo1vIABj/7CfTuXAAnloVwGNjfYF36gl6mrEArXq+HVLQCpFYG1ejzHnmXk3KpUV9HEj Th3w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=G6gtvqYNjnVwht1zFDxId2+CvU+SMvS0Nm0E4p7o9cM=; b=B+OJdI2HUHnZWQkAWkWxSD7SNMVWvSLxVFvevv84i21+Uzdyma850EWLY0g6vY5Uk0 wD6HhHdpYyxixccEc5HXZnWdHESOew120pzU0FFK8IqdzeJ2Ip4Y9Yecvoa09qSxbRvl DxhWijCwbR9utvwoylFQf1dcK8t57GhGB4hlh+I5pyKlTu57aHU6NzA4A4nsVrWLvRy8 BUsBcaifb90G/ZQv1ISBGjFGjarDABoWz/1FHfBzZxEx/Cd7mOiFHn/5skjRtgZZK78n 8ih4VnWf/lXxaXk11jgsUyCVuumNoLLtCRwNdmI26nW0xfx56jGeRUA/SMd7SnP+A3/b QRgg== X-Gm-Message-State: AOAM533OCxzgZU2R8hHpYvD7obP+Hgs1dTHQn28fYemQu4nEeq/Ex3R4 bXVEZ8rNM2Ahox/DxxvjRHoFV3rVSDvNdA== X-Google-Smtp-Source: ABdhPJyR9Z86DoOKmunyVf1W/6GIyEHZTsm6haOiAELQ00SfT9U7iHEZekEdan2CrSC96iDk6GNAkAcC4XrGfQ== X-Received: from shakeelb.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:28b]) (user=shakeelb job=sendgmr) by 2002:a25:2f16:0:b0:64c:b761:b440 with SMTP id v22-20020a252f16000000b0064cb761b440mr7560583ybv.359.1653008830081; Thu, 19 May 2022 18:07:10 -0700 (PDT) Date: Fri, 20 May 2022 01:07:07 +0000 In-Reply-To: <212f1b74-7d4e-29f2-9e92-2a1820beff61@openvz.org> Message-Id: <20220520010707.77z5blxpbsddwmyo@google.com> Mime-Version: 1.0 References: <212f1b74-7d4e-29f2-9e92-2a1820beff61@openvz.org> Subject: Re: [PATCH 1/4] memcg: enable accounting for large allocations in mem_cgroup_css_alloc From: Shakeel Butt To: Vasily Averin Cc: Roman Gushchin , "Michal =?utf-8?Q?Koutn=C3=BD?=" , kernel@openvz.org, linux-kernel@vger.kernel.org, Vlastimil Babka , Michal Hocko , cgroups@vger.kernel.org Content-Type: text/plain; charset="us-ascii" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, May 13, 2022 at 06:51:41PM +0300, Vasily Averin wrote: > cgroup mkdir can be misused inside memcg limited container. It can allocate > a lot of host memory without memcg accounting, cause global memory shortage > and force OOM to kill random host process. > > Below [1] is result of mkdir /sys/fs/cgroup/test tracing on VM with 4 cpus > > number bytes $1*$2 sum note call_site > of alloc > allocs > ------------------------------------------------------------ > 1 14448 14448 14448 = percpu_alloc_percpu: > 1 8192 8192 22640 (mem_cgroup_css_alloc+0x54) > 49 128 6272 28912 (__kernfs_new_node+0x4e) > 49 96 4704 33616 (simple_xattr_alloc+0x2c) > 49 88 4312 37928 (__kernfs_iattrs+0x56) > 1 4096 4096 42024 (cgroup_mkdir+0xc7) > 1 3840 3840 45864 = percpu_alloc_percpu: > 4 512 2048 47912 (alloc_fair_sched_group+0x166) > 4 512 2048 49960 (alloc_fair_sched_group+0x139) > 1 2048 2048 52008 (mem_cgroup_css_alloc+0x109) > [smaller objects skipped] > --- > Total 61728 > > '=' -- accounted allocations > > This patch enabled accounting for one of the main memory hogs in this > experiment: allocation which are called inside mem_cgroup_css_alloc() > > Signed-off-by: Vasily Averin > Link: [1] https://lore.kernel.org/all/1aa4cd22-fcb6-0e8d-a1c6-23661d618864@openvz.org/ > Acked-by: Shakeel Butt From mboxrd@z Thu Jan 1 00:00:00 1970 From: Shakeel Butt Subject: Re: [PATCH 1/4] memcg: enable accounting for large allocations in mem_cgroup_css_alloc Date: Fri, 20 May 2022 01:07:07 +0000 Message-ID: <20220520010707.77z5blxpbsddwmyo@google.com> References: <212f1b74-7d4e-29f2-9e92-2a1820beff61@openvz.org> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Return-path: DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=G6gtvqYNjnVwht1zFDxId2+CvU+SMvS0Nm0E4p7o9cM=; b=Gq0rw+MrvLWVPrWEBXryZB1jQV/V1G9kqtZ5HVH9URkE6LQh50hBnHJMNqkGngRE1B OKQFxrvMTob/C1c6Prc225WuKjE8WCTLx3fHin7rlnzNSeIOtlGLTu74nl6iciYuNa6V wKGClUZev9JgvVbePXUNUNBu7Uz1/J49jE9sWsl47yg7HwP0TiOqv882QwVKD+FEXZY5 VRHfGsaPeQ8uLVWv02vR138ZcXZ+G+eh5CxwWdQnlEieVAYcg1VZgbsIj+VW85zoNOBM wo1vIABj/7CfTuXAAnloVwGNjfYF36gl6mrEArXq+HVLQCpFYG1ejzHnmXk3KpUV9HEj Th3w== In-Reply-To: <212f1b74-7d4e-29f2-9e92-2a1820beff61-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org> List-ID: Content-Transfer-Encoding: 7bit To: Vasily Averin Cc: Roman Gushchin , Michal =?utf-8?Q?Koutn=C3=BD?= , kernel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Vlastimil Babka , Michal Hocko , cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org On Fri, May 13, 2022 at 06:51:41PM +0300, Vasily Averin wrote: > cgroup mkdir can be misused inside memcg limited container. It can allocate > a lot of host memory without memcg accounting, cause global memory shortage > and force OOM to kill random host process. > > Below [1] is result of mkdir /sys/fs/cgroup/test tracing on VM with 4 cpus > > number bytes $1*$2 sum note call_site > of alloc > allocs > ------------------------------------------------------------ > 1 14448 14448 14448 = percpu_alloc_percpu: > 1 8192 8192 22640 (mem_cgroup_css_alloc+0x54) > 49 128 6272 28912 (__kernfs_new_node+0x4e) > 49 96 4704 33616 (simple_xattr_alloc+0x2c) > 49 88 4312 37928 (__kernfs_iattrs+0x56) > 1 4096 4096 42024 (cgroup_mkdir+0xc7) > 1 3840 3840 45864 = percpu_alloc_percpu: > 4 512 2048 47912 (alloc_fair_sched_group+0x166) > 4 512 2048 49960 (alloc_fair_sched_group+0x139) > 1 2048 2048 52008 (mem_cgroup_css_alloc+0x109) > [smaller objects skipped] > --- > Total 61728 > > '=' -- accounted allocations > > This patch enabled accounting for one of the main memory hogs in this > experiment: allocation which are called inside mem_cgroup_css_alloc() > > Signed-off-by: Vasily Averin > Link: [1] https://lore.kernel.org/all/1aa4cd22-fcb6-0e8d-a1c6-23661d618864-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org/ > Acked-by: Shakeel Butt