From: Xin Hao
To: sj@kernel.org
Cc: xhao@linux.alibaba.com, rongwei.wang@linux.alibaba.com, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH V1 0/5] mm/damon: Add NUMA access statistics function support
Date: Wed, 16 Feb 2022 16:30:36 +0800

In today's cloud computing environments, NUMA (non-uniform memory access)
servers are deployed at large scale. DAMON can identify hot and cold memory
easily and with low overhead, but it cannot show how accesses are split
between local and remote NUMA memory.

The purpose of this patch series is to identify NUMA access patterns in
combination with DAMON, especially remote NUMA accesses to hot memory.
We hope to detect this situation in the data center and use page migration or
multi-backup page technology to optimize memory access behavior.

Next, we plan to further improve the DAMON NUMA function:

1. Support hugetlbfs NUMA access statistics.
2. Add support to the DAMO tool for parsing the NUMA local and remote fields
   of "damon_region".
3. For remote NUMA accesses to hot memory, support page migration or
   multi-backup pages.

About the correctness of DAMON NUMA access statistics

We wrote a test case that allocates about 1G of memory using numa_alloc(),
placing 512M on NUMA node0 and 512M on NUMA node1. The test case then
alternately accesses the 1G of memory.
We used the "perf record -e damon:damon_aggregated" and "perf script" commands
to obtain data like this:

kdamond.0 target_id=0 nr_regions=10 281473056325632-281473127964672: 12 0 5243 5513
kdamond.0 target_id=0 nr_regions=10 281473127964672-281473238028288: 8 1 5427 5399
...
kdamond.0 target_id=0 nr_regions=10 281473056325632-281473127964672: 9 3 7669 7632
kdamond.0 target_id=0 nr_regions=10 281473127964672-281473238028288: 7 2 7913 7892

We compared this with numastat output like this:

Per-node process memory usage (in MBs) for PID 111676 (lt-numademo)
                           Node 0          Node 1          Node 2
                  --------------- --------------- ---------------
Huge                         0.00            0.00            0.00
Heap                         0.02            0.00            0.00
Stack                        0.01            0.00            0.00
Private                    565.24          564.00            0.00
----------------  --------------- --------------- ---------------
Total                      565.27          564.00            0.00

This comparison can be used to check the accuracy of the DAMON NUMA memory
access statistics.

About the impact of DAMON NUMA access statistics on performance

During benchmark testing, we found that the memcpy test item of the MBW
benchmark shows about 3% performance degradation. There is no performance
degradation in the other benchmarks.

So we added a "numa_stat" switch to the DAMON dbgfs interface. Turn this
switch on when NUMA access statistics are required.
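As a rough illustration of how the damon:damon_aggregated lines shown above
could be post-processed, here is a minimal Python sketch. It assumes each
line ends with four integers in the order nr_accesses, age, local, remote.
The order of the last two fields is an assumption, since this cover letter
does not spell it out. The names RegionSample, parse_line and remote_ratio
are hypothetical and are not part of DAMO or of this series.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout of one "perf script" line with this series applied:
#   ... target_id=N nr_regions=N START-END: NR_ACCESSES AGE LOCAL REMOTE
# The LOCAL/REMOTE order is an assumption, not taken from the patches.
LINE_RE = re.compile(
    r"target_id=\d+\s+nr_regions=\d+\s+"
    r"(?P<start>\d+)-(?P<end>\d+):+\s*"
    r"(?P<nr_accesses>\d+)\s+(?P<age>\d+)\s+"
    r"(?P<local>\d+)\s+(?P<remote>\d+)"
)


class RegionSample(NamedTuple):
    """One aggregated region record (hypothetical helper type)."""
    start: int
    end: int
    nr_accesses: int
    age: int
    local: int
    remote: int

    def remote_ratio(self) -> float:
        """Fraction of counted NUMA accesses that were remote."""
        total = self.local + self.remote
        return self.remote / total if total else 0.0


def parse_line(line: str) -> Optional[RegionSample]:
    """Parse one trace line; return None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return RegionSample(*(int(m.group(k)) for k in RegionSample._fields))


if __name__ == "__main__":
    sample = parse_line(
        "kdamond.0 target_id=0 nr_regions=10 "
        "281473056325632-281473127964672: 12 0 5243 5513"
    )
    if sample is not None:
        print(f"remote ratio: {sample.remote_ratio():.3f}")
```

For the test case described earlier (512M on each of two nodes, accessed
alternately), a remote ratio near one half for each region would be the
expected picture under the field-order assumption above.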

Xin Hao (5):
  mm/damon: Add NUMA local and remote variables in 'damon_region'
  mm/damon: Add 'damon_region' NUMA fault simulation support
  mm/damon: Add 'damon_region' NUMA access statistics core implementation
  mm/damon/dbgfs: Add numa simulate switch
  mm/damon/tracepoint: Add 'damon_region' NUMA access statistics support

 include/linux/damon.h        | 25 ++++++++++
 include/trace/events/damon.h |  9 +++-
 mm/damon/core.c              | 94 +++++++++++++++++++++++++++++++++++-
 mm/damon/dbgfs.c             | 70 ++++++++++++++++++++++++---
 mm/damon/paddr.c             | 25 ++++++++--
 mm/damon/prmtv-common.c      | 44 +++++++++++++++++
 mm/damon/prmtv-common.h      |  3 ++
 mm/damon/vaddr.c             | 45 ++++++++++-------
 mm/huge_memory.c             |  5 ++
 mm/memory.c                  |  5 ++
 10 files changed, 292 insertions(+), 33 deletions(-)

-- 
2.27.0