From mboxrd@z Thu Jan 1 00:00:00 1970
From: Muchun Song
Date: Tue, 15 Sep 2020 10:44:01 +0800
Subject: Re: [External] Re: [PATCH v3] mm: memcontrol: Add the missing numa_stat interface for cgroup v2
To: Randy Dunlap
Cc: tj@kernel.org, Zefan Li, Johannes Weiner, corbet@lwn.net, Michal Hocko,
 Vladimir Davydov, Andrew Morton, Shakeel Butt, Roman Gushchin, Cgroups,
 linux-doc@vger.kernel.org, LKML, Linux Memory Management List,
 kernel test robot
References: <20200913070010.44053-1-songmuchun@bytedance.com> <8387344f-0e43-9b6e-068d-b2c45bbda1de@infradead.org>
In-Reply-To: <8387344f-0e43-9b6e-068d-b2c45bbda1de@infradead.org>
Content-Type: text/plain; charset="UTF-8"
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 15, 2020 at 3:07 AM Randy Dunlap wrote:
>
> On 9/13/20 12:00 AM, Muchun Song wrote:
> > In the cgroup v1, we have a numa_stat interface. This is useful for
> > providing visibility into the numa locality information within an
> > memcg since the pages are allowed to be allocated from any physical
> > node. One of the use cases is evaluating application performance by
> > combining this information with the application's CPU allocation.
> > But the cgroup v2 does not. So this patch adds the missing information.
> >
> > Signed-off-by: Muchun Song
> > Suggested-by: Shakeel Butt
> > Reported-by: kernel test robot
> > ---
> > changelog in v3:
> >  1. Fix compiler error on powerpc architecture reported by kernel test robot.
> >  2. Fix a typo from "anno" to "anon".
> >
> > changelog in v2:
> >  1. Add memory.numa_stat interface in cgroup v2.
> >
> >  Documentation/admin-guide/cgroup-v2.rst |  72 ++++++++++++++++
> >  mm/memcontrol.c                         | 107 ++++++++++++++++++++++++
> >  2 files changed, 179 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> > index 6be43781ec7f..92207f0012e4 100644
> > --- a/Documentation/admin-guide/cgroup-v2.rst
> > +++ b/Documentation/admin-guide/cgroup-v2.rst
> > @@ -1368,6 +1368,78 @@ PAGE_SIZE multiple when read back.
> >          collapsing an existing range of pages. This counter is not
> >          present when CONFIG_TRANSPARENT_HUGEPAGE is not set.
> >
> > +  memory.numa_stat
> > +     A read-only flat-keyed file which exists on non-root cgroups.
> > +
> > +     This breaks down the cgroup's memory footprint into different
> > +     types of memory, type-specific details, and other information
> > +     per node on the state of the memory management system.
> > +
> > +     This is useful for providing visibility into the numa locality
>
> capitalize acronyms, please:  NUMA

OK, I will do that. Thanks.

> > +     information within an memcg since the pages are allowed to be
> > +     allocated from any physical node. One of the use cases is evaluating
> > +     application performance by combining this information with the
> > +     application's CPU allocation.
> > +
> > +     All memory amounts are in bytes.
> > +
> > +     The output format of memory.numa_stat is::
> > +
> > +       type N0=<node 0 pages> N1=<node 1 pages> ...
>
> Now I'm confused. 5 lines above here it says "All memory amounts are in bytes"
> but these appear to be in pages. Which is it? and what size pages if that matters?

Sorry. It's my mistake. I will fix it.

> Is it like this?
>         type N0=<bytes in node 0> N1=<bytes in node 1> ...

Thanks.

> > +     The entries are ordered to be human readable, and new entries
> > +     can show up in the middle. Don't rely on items remaining in a
> > +     fixed position; use the keys to look up specific values!
> > +
> > +       anon
> > +             Amount of memory per node used in anonymous mappings such
> > +             as brk(), sbrk(), and mmap(MAP_ANONYMOUS)
> > +
> > +       file
> > +             Amount of memory per node used to cache filesystem data,
> > +             including tmpfs and shared memory.
> > +
> > +       kernel_stack
> > +             Amount of memory per node allocated to kernel stacks.
> > +
> > +       shmem
> > +             Amount of cached filesystem data per node that is swap-backed,
> > +             such as tmpfs, shm segments, shared anonymous mmap()s
> > +
> > +       file_mapped
> > +             Amount of cached filesystem data per node mapped with mmap()
> > +
> > +       file_dirty
> > +             Amount of cached filesystem data per node that was modified but
> > +             not yet written back to disk
> > +
> > +       file_writeback
> > +             Amount of cached filesystem data per node that was modified and
> > +             is currently being written back to disk
> > +
> > +       anon_thp
> > +             Amount of memory per node used in anonymous mappings backed by
> > +             transparent hugepages
> > +
> > +       inactive_anon, active_anon, inactive_file, active_file, unevictable
> > +             Amount of memory, swap-backed and filesystem-backed,
> > +             per node on the internal memory management lists used
> > +             by the page reclaim algorithm.
> > +
> > +             As these represent internal list state (eg. shmem pages are on anon
>
> e.g.

Thanks.

> > +             memory management lists), inactive_foo + active_foo may not be equal to
> > +             the value for the foo counter, since the foo counter is type-based, not
> > +             list-based.
> > +
> > +       slab_reclaimable
> > +             Amount of memory per node used for storing in-kernel data
> > +             structures which might be reclaimed, such as dentries and
> > +             inodes.
> > +
> > +       slab_unreclaimable
> > +             Amount of memory per node used for storing in-kernel data
> > +             structures which cannot be reclaimed on memory pressure.
>
> Some of the descriptions above end with a '.' and some do not. Please be consistent.

Will do that.

> > +
> >   memory.swap.current
> >       A read-only single value file which exists on non-root
> >       cgroups.
>
> thanks.
> --
> ~Randy

--
Yours,
Muchun
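For anyone scripting against the interface discussed above, here is a minimal sketch (not part of the patch) of parsing the flat-keyed per-node format, assuming the byte-based output Randy suggests (`type N0=<bytes in node 0> N1=<bytes in node 1> ...`). As the documentation warns, entries are looked up by key rather than by position. The sample values are made up for illustration.

```python
# Parse cgroup v2 memory.numa_stat output: one counter per line, in the
# form "type N0=<bytes> N1=<bytes> ...". New counters may appear in the
# middle of the file, so we build {type: {node_id: bytes}} and always
# look values up by name, never by line position.

def parse_numa_stat(text):
    stats = {}
    for line in text.strip().splitlines():
        fields = line.split()
        counter, per_node = fields[0], {}
        for field in fields[1:]:
            node, value = field.split("=")
            per_node[int(node.lstrip("N"))] = int(value)
        stats[counter] = per_node
    return stats

# Hypothetical sample output; real values come from
# /sys/fs/cgroup/<group>/memory.numa_stat.
sample = """\
anon N0=6406144 N1=740352
file N0=1687552 N1=143360
kernel_stack N0=49152 N1=16384
"""

stats = parse_numa_stat(sample)
print(stats["anon"][0])             # anon bytes on node 0
print(sum(stats["file"].values()))  # file cache total across nodes
```

Combining these per-node figures with the application's CPU allocation (the use case named in the patch) is then a matter of comparing each node's share against where the workload's threads are pinned.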