From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AC1AFC433F5 for ; Tue, 8 Feb 2022 12:13:18 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B840C6B0074; Tue, 8 Feb 2022 07:13:17 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id B34016B0075; Tue, 8 Feb 2022 07:13:17 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A22F46B0078; Tue, 8 Feb 2022 07:13:17 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0062.hostedemail.com [216.40.44.62]) by kanga.kvack.org (Postfix) with ESMTP id 929E46B0074 for ; Tue, 8 Feb 2022 07:13:17 -0500 (EST) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 45027181F1108 for ; Tue, 8 Feb 2022 12:13:17 +0000 (UTC) X-FDA: 79119502434.06.0032190 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by imf29.hostedemail.com (Postfix) with ESMTP id B24CA12000E for ; Tue, 8 Feb 2022 12:13:16 +0000 (UTC) Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id 7554C1F383; Tue, 8 Feb 2022 12:13:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1644322395; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=QLOYwEHWVSzw5MCgmxkNKcSwN+xc3CSk3rKsHEFXQ0w=; b=umK5/28r4Ry0plkxqSGj9jm/h56veWrz8pKMp5ppf56ETUdrsuH1hb86FdlcFXeoSoulaz lFOqeYg0yKPsQLWHc2dl8AHyW3Z0SGNQaJUUnjkpcUPE824yWe2B0B5R3iQOcbdFzLrhCJ ik3wNbCbeuoS3QIUfYl3YJI1RMqv5xo= Received: from suse.cz (unknown [10.100.201.86]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id D93CAA3B8C; Tue, 8 Feb 2022 12:13:14 +0000 (UTC) Date: Tue, 8 Feb 2022 13:13:14 +0100 From: Michal Hocko To: Waiman Long Cc: Johannes Weiner , Vladimir Davydov , Andrew Morton , Petr Mladek , Steven Rostedt , Sergey Senozhatsky , Andy Shevchenko , Rasmus Villemoes , linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org, Ira Weiny , Mike Rapoport , David Rientjes , Roman Gushchin , Rafael Aquini , Mike Rapoport Subject: Re: [PATCH v5 3/4] mm/page_owner: Print memcg information Message-ID: References: <20220208000532.1054311-1-longman@redhat.com> <20220208000532.1054311-4-longman@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20220208000532.1054311-4-longman@redhat.com> Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=suse.com header.s=susede1 header.b="umK5/28r"; spf=pass (imf29.hostedemail.com: domain of mhocko@suse.com designates 195.135.220.29 as permitted sender) smtp.mailfrom=mhocko@suse.com; dmarc=pass (policy=quarantine) header.from=suse.com X-Stat-Signature: tmtx7uhtptepkwin8psu6bh17tr6ea6i X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: B24CA12000E X-Rspam-User: X-HE-Tag: 1644322396-184144 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Mon 07-02-22 19:05:31, Waiman Long wrote: > It was found that a number of dying memcgs were not freed because > they were pinned by some charged pages that were present. Even "echo 1 > > /proc/sys/vm/drop_caches" wasn't able to free those pages. These dying > but not freed memcgs tend to increase in number over time with the side > effect that percpu memory consumption as shown in /proc/meminfo also > increases over time. I still believe that this is very suboptimal way to debug offline memcgs but memcg information can be useful in other contexts and it doesn't cost us anything except for an additional output so I am fine with this. > In order to find out more information about those pages that pin > dying memcgs, the page_owner feature is extended to print memory > cgroup information especially whether the cgroup is dying or not. > RCU read lock is taken when memcg is being accessed to make sure > that it won't be freed. > > Signed-off-by: Waiman Long > Acked-by: David Rientjes > Acked-by: Roman Gushchin > Acked-by: Mike Rapoport With few comments/questions below. > --- > mm/page_owner.c | 44 ++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 44 insertions(+) > > diff --git a/mm/page_owner.c b/mm/page_owner.c > index 28dac73e0542..d4c311455753 100644 > --- a/mm/page_owner.c > +++ b/mm/page_owner.c > @@ -10,6 +10,7 @@ > #include > #include > #include > +#include > #include > > #include "internal.h" > @@ -325,6 +326,47 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m, > seq_putc(m, '\n'); > } > > +/* > + * Looking for memcg information and print it out > + */ I am not sure this is particularly useful comment. > +static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret, > + struct page *page) > +{ > +#ifdef CONFIG_MEMCG > + unsigned long memcg_data; > + struct mem_cgroup *memcg; > + bool dying; > + > + rcu_read_lock(); > + memcg_data = READ_ONCE(page->memcg_data); > + if (!memcg_data) > + goto out_unlock; > + > + if (memcg_data & MEMCG_DATA_OBJCGS) > + ret += scnprintf(kbuf + ret, count - ret, > + "Slab cache page\n"); > + > + memcg = page_memcg_check(page); > + if (!memcg) > + goto out_unlock; > + > + dying = (memcg->css.flags & CSS_DYING); Is there any specific reason why you haven't used mem_cgroup_online? > + ret += scnprintf(kbuf + ret, count - ret, > + "Charged %sto %smemcg ", > + PageMemcgKmem(page) ? "(via objcg) " : "", > + dying ? "dying " : ""); > + > + /* Write cgroup name directly into kbuf */ > + cgroup_name(memcg->css.cgroup, kbuf + ret, count - ret); > + ret += strlen(kbuf + ret); cgroup_name should return the length of the path added to the buffer. > + ret += scnprintf(kbuf + ret, count - ret, "\n"); I do not see any overflow prevention here. I believe you really need to check ret >= count after each scnprintf/cgroup_name. -- Michal Hocko SUSE Labs