Date: Tue, 11 Aug 2020 11:27:37 -0400
From: Johannes Weiner
To: Roman Gushchin
Cc: Andrew Morton, Dennis Zhou, Tejun Heo, Christoph Lameter, Michal Hocko, Shakeel Butt, linux-mm@kvack.org, kernel-team@fb.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 4/5] mm: memcg: charge memcg percpu memory to the parent cgroup
Message-ID: <20200811152737.GB650506@cmpxchg.org>
References: <20200623184515.4132564-1-guro@fb.com> <20200623184515.4132564-5-guro@fb.com>
In-Reply-To: <20200623184515.4132564-5-guro@fb.com>

On Tue, Jun 23, 2020 at 11:45:14AM -0700, Roman Gushchin wrote:
> Memory cgroups are using large chunks of percpu memory to store vmstat
> data. Yet this memory is not accounted at all, so in the case when there
> are many (dying) cgroups, it's not exactly clear where all the memory is.
>
> Because the size of memory cgroup internal structures can dramatically
> exceed the size of object or page which is pinning it in the memory, it's
> not a good idea to simple ignore it. It actually breaks the isolation
> between cgroups.
>
> Let's account the consumed percpu memory to the parent cgroup.
>
> Signed-off-by: Roman Gushchin
> Acked-by: Dennis Zhou

Acked-by: Johannes Weiner

This makes sense, and the accounting is in line with how we track and
distribute child creation quotas (cgroup.max.descendants and
cgroup.max.depth) up the cgroup tree.

I have one minor comment that isn't a dealbreaker for me:

> @@ -5069,13 +5069,15 @@ static int alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
>  	if (!pn)
>  		return 1;
>
> -	pn->lruvec_stat_local = alloc_percpu(struct lruvec_stat);
> +	pn->lruvec_stat_local = alloc_percpu_gfp(struct lruvec_stat,
> +						 GFP_KERNEL_ACCOUNT);
>  	if (!pn->lruvec_stat_local) {
>  		kfree(pn);
>  		return 1;
>  	}
>
> -	pn->lruvec_stat_cpu = alloc_percpu(struct lruvec_stat);
> +	pn->lruvec_stat_cpu = alloc_percpu_gfp(struct lruvec_stat,
> +					       GFP_KERNEL_ACCOUNT);
>  	if (!pn->lruvec_stat_cpu) {
>  		free_percpu(pn->lruvec_stat_local);
>  		kfree(pn);
> @@ -5149,11 +5151,13 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
>  		goto fail;
>  	}
>
> -	memcg->vmstats_local = alloc_percpu(struct memcg_vmstats_percpu);
> +	memcg->vmstats_local = alloc_percpu_gfp(struct memcg_vmstats_percpu,
> +						GFP_KERNEL_ACCOUNT);
>  	if (!memcg->vmstats_local)
>  		goto fail;
>
> -	memcg->vmstats_percpu = alloc_percpu(struct memcg_vmstats_percpu);
> +	memcg->vmstats_percpu = alloc_percpu_gfp(struct memcg_vmstats_percpu,
> +						 GFP_KERNEL_ACCOUNT);
>  	if (!memcg->vmstats_percpu)
>  		goto fail;
>
> @@ -5202,7 +5206,9 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
>  	struct mem_cgroup *memcg;
>  	long error = -ENOMEM;
>
> +	memalloc_use_memcg(parent);
>  	memcg = mem_cgroup_alloc();
> +	memalloc_unuse_memcg();

The disconnect between 1) requesting accounting and 2) choosing which
cgroup to charge is making me uneasy. It makes mem_cgroup_alloc() a bit
of a hand grenade, because accounting to the current task is almost
guaranteed to be wrong if the memalloc_use_memcg() annotation were to
get lost in a refactor or not make it to a new caller of the function.

The saving grace is that mem_cgroup_alloc() is pretty unlikely to be
used elsewhere. And pretending it's an independent interface would be
overengineering.
But how about the following in mem_cgroup_alloc() and
alloc_mem_cgroup_per_node_info() to document that caller relationship:

	/* We charge the parent cgroup, never the current task */
	WARN_ON_ONCE(!current->active_memcg);
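
To make the concern concrete, here is a rough userspace model of that
caller relationship. It is only a sketch: struct memcg, struct task,
current_task, warn_count and every model_* helper are hypothetical
stand-ins invented for illustration, not the kernel's definitions. The
only assumption it encodes is the one described above: an accounted
allocation charges the scoped override if one is set and otherwise
falls back to the current task's own cgroup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a memory cgroup: just a charge counter. */
struct memcg {
	const char *name;
	size_t charged;			/* bytes charged to this group */
};

/* Hypothetical stand-in for the current task. */
struct task {
	struct memcg *own_memcg;	/* group the task itself runs in */
	struct memcg *active_memcg;	/* scoped override, NULL when unset */
};

static struct task current_task;
static int warn_count;			/* how often the model warning fired */

/* Model of WARN_ON_ONCE(): warn at most once per call site. */
static bool model_warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Models of the scoped override set and cleared by the caller. */
static void model_use_memcg(struct memcg *memcg)
{
	current_task.active_memcg = memcg;
}

static void model_unuse_memcg(void)
{
	current_task.active_memcg = NULL;
}

/* Accounted allocation: charge the override if set, else the task's group. */
static void model_charge_accounted(size_t bytes)
{
	struct memcg *target = current_task.active_memcg ?
			       current_task.active_memcg :
			       current_task.own_memcg;

	target->charged += bytes;
}

/* Model of mem_cgroup_alloc(): requests accounting, does not pick a target. */
static void model_memcg_alloc(size_t percpu_bytes)
{
	static bool warned;

	/* We charge the parent cgroup, never the current task */
	model_warn_on_once(!current_task.active_memcg, &warned,
			   "accounted memcg allocation without active_memcg");
	model_charge_accounted(percpu_bytes);
}

/* Model of mem_cgroup_css_alloc(): the one caller that sets the target. */
static void model_css_alloc(struct memcg *parent, size_t percpu_bytes)
{
	model_use_memcg(parent);
	model_memcg_alloc(percpu_bytes);
	model_unuse_memcg();
}
```

Going through model_css_alloc() charges the parent and stays silent.
Calling model_memcg_alloc() directly, as a refactor or a new caller
might, still charges the task's own group in this model, but the
warning fires once and points at the missing annotation.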