From: "Du, Fan" <fan.du@intel.com>
To: SeongJae Park <sjpark@amazon.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>
Cc: SeongJae Park <sjpark@amazon.de>,
"Jonathan.Cameron@Huawei.com" <Jonathan.Cameron@Huawei.com>,
"aarcange@redhat.com" <aarcange@redhat.com>,
"acme@kernel.org" <acme@kernel.org>,
"alexander.shishkin@linux.intel.com"
<alexander.shishkin@linux.intel.com>,
"amit@kernel.org" <amit@kernel.org>,
"benh@kernel.crashing.org" <benh@kernel.crashing.org>,
"brendan.d.gregg@gmail.com" <brendan.d.gregg@gmail.com>,
"brendanhiggins@google.com" <brendanhiggins@google.com>,
"cai@lca.pw" <cai@lca.pw>,
"colin.king@canonical.com" <colin.king@canonical.com>,
"corbet@lwn.net" <corbet@lwn.net>,
"david@redhat.com" <david@redhat.com>,
"dwmw@amazon.com" <dwmw@amazon.com>,
"foersleo@amazon.de" <foersleo@amazon.de>,
"irogers@google.com" <irogers@google.com>,
"jolsa@redhat.com" <jolsa@redhat.com>,
"kirill@shutemov.name" <kirill@shutemov.name>,
"mark.rutland@arm.com" <mark.rutland@arm.com>,
"mgorman@suse.de" <mgorman@suse.de>,
"minchan@kernel.org" <minchan@kernel.org>,
"mingo@redhat.com" <mingo@redhat.com>,
"namhyung@kernel.org" <namhyung@kernel.org>,
"peterz@infradead.org" <peterz@infradead.org>,
"rdunlap@infradead.org" <rdunlap@infradead.org>,
"riel@surriel.com" <riel@surriel.com>,
"rientjes@google.com" <rientjes@google.com>,
"rostedt@goodmis.org" <rostedt@goodmis.org>,
"rppt@kernel.org" <rppt@kernel.org>,
"sblbir@amazon.com" <sblbir@amazon.com>,
"shakeelb@google.com" <shakeelb@google.com>,
"shuah@kernel.org" <shuah@kernel.org>,
"sj38.park@gmail.com" <sj38.park@gmail.com>,
"snu@amazon.de" <snu@amazon.de>,
"vbabka@suse.cz" <vbabka@suse.cz>,
"vdavydov.dev@gmail.com" <vdavydov.dev@gmail.com>,
"yang.shi@linux.alibaba.com" <yang.shi@linux.alibaba.com>,
"Huang, Ying" <ying.huang@intel.com>,
"linux-damon@amazon.com" <linux-damon@amazon.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"Du, Fan" <fan.du@intel.com>
Subject: RE: [RFC v5 10/11] tools/damon/record: Support NUMA specific recording
Date: Mon, 20 Jul 2020 06:53:14 +0000 [thread overview]
Message-ID: <DM5PR11MB1595FA5D5E3BDB77C1EFF50F997B0@DM5PR11MB1595.namprd11.prod.outlook.com> (raw)
In-Reply-To: <20200707144540.21216-11-sjpark@amazon.com>
>-----Original Message-----
>From: owner-linux-mm@kvack.org <owner-linux-mm@kvack.org> On Behalf
>Of SeongJae Park
>Sent: Tuesday, July 7, 2020 10:46 PM
>To: akpm@linux-foundation.org
>Cc: SeongJae Park <sjpark@amazon.de>; Jonathan.Cameron@Huawei.com;
>aarcange@redhat.com; acme@kernel.org; alexander.shishkin@linux.intel.com;
>amit@kernel.org; benh@kernel.crashing.org; brendan.d.gregg@gmail.com;
>brendanhiggins@google.com; cai@lca.pw; colin.king@canonical.com;
>corbet@lwn.net; david@redhat.com; dwmw@amazon.com;
>foersleo@amazon.de; irogers@google.com; jolsa@redhat.com;
>kirill@shutemov.name; mark.rutland@arm.com; mgorman@suse.de;
>minchan@kernel.org; mingo@redhat.com; namhyung@kernel.org;
>peterz@infradead.org; rdunlap@infradead.org; riel@surriel.com;
>rientjes@google.com; rostedt@goodmis.org; rppt@kernel.org;
>sblbir@amazon.com; shakeelb@google.com; shuah@kernel.org;
>sj38.park@gmail.com; snu@amazon.de; vbabka@suse.cz;
>vdavydov.dev@gmail.com; yang.shi@linux.alibaba.com; Huang, Ying
><ying.huang@intel.com>; linux-damon@amazon.com; linux-mm@kvack.org;
>linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org
>Subject: [RFC v5 10/11] tools/damon/record: Support NUMA specific
>recording
>
>From: SeongJae Park <sjpark@amazon.de>
>
>This commit updates the DAMON user space tool (damo-record) for NUMA
>specific physical memory monitoring. With this change, users can
>monitor accesses to physical memory of specific NUMA node.
>
>Signed-off-by: SeongJae Park <sjpark@amazon.de>
>---
> tools/damon/_paddr_layout.py | 158
>+++++++++++++++++++++++++++++++++++
> tools/damon/record.py | 21 ++++-
> 2 files changed, 178 insertions(+), 1 deletion(-)
> create mode 100644 tools/damon/_paddr_layout.py
>
>diff --git a/tools/damon/_paddr_layout.py b/tools/damon/_paddr_layout.py
>new file mode 100644
>index 000000000000..10056172db21
>--- /dev/null
>+++ b/tools/damon/_paddr_layout.py
>@@ -0,0 +1,158 @@
>+#!/usr/bin/env python3
>+# SPDX-License-Identifier: GPL-2.0
>+
>+import os
>+
>+class PaddrRange:
>+ start = None
>+ end = None
>+ nid = None
>+ state = None
>+ name = None
>+
>+ def __init__(self, start, end, nid, state, name):
>+ self.start = start
>+ self.end = end
>+ self.nid = nid
>+ self.state = state
>+ self.name = name
>+
>+ def interleaved(self, prange):
>+ if self.end <= prange.start:
>+ return None
>+ if prange.end <= self.start:
>+ return None
>+ return [max(self.start, prange.start), min(self.end, prange.end)]
>+
>+ def __str__(self):
>+ return '%x-%x, nid %s, state %s, name %s' % (self.start, self.end,
>+ self.nid, self.state, self.name)
>+
>+class MemBlock:
>+ nid = None
>+ index = None
>+ state = None
>+
>+ def __init__(self, nid, index, state):
>+ self.nid = nid
>+ self.index = index
>+ self.state = state
>+
>+ def __str__(self):
>+ return '%d (%s)' % (self.index, self.state)
>+
>+ def __repr__(self):
>+ return self.__str__()
>+
>+def readfile(file_path):
>+ with open(file_path, 'r') as f:
>+ return f.read()
>+
>+def collapse_ranges(ranges):
>+ ranges = sorted(ranges, key=lambda x: x.start)
>+ merged = []
>+ for r in ranges:
>+ if not merged:
>+ merged.append(r)
>+ continue
>+ last = merged[-1]
>+ if last.end != r.start or last.nid != r.nid or last.state != r.state:
>+ merged.append(r)
>+ else:
>+ last.end = r.end
>+ return merged
>+
>+def memblocks_to_ranges(blocks, block_size):
>+ ranges = []
>+ for b in blocks:
>+ ranges.append(PaddrRange(b.index * block_size,
>+ (b.index + 1) * block_size, b.nid, b.state, None))
>+
>+ return collapse_ranges(ranges)
>+
>+def memblock_ranges():
>+ SYSFS='/sys/devices/system/node'
>+ sz_block = int(readfile('/sys/devices/system/memory/block_size_bytes'),
>16)
>+ sys_nodes = [x for x in os.listdir(SYSFS) if x.startswith('node')]
>+
>+ blocks = []
>+ for sys_node in sys_nodes:
>+ nid = int(sys_node[4:])
>+
>+ sys_node_files = os.listdir(os.path.join(SYSFS, sys_node))
>+ for f in sys_node_files:
>+ if not f.startswith('memory'):
>+ continue
>+ index = int(f[6:])
>+ sys_state = os.path.join(SYSFS, sys_node, f, 'state')
>+ state = readfile(sys_state).strip()
>+
>+ blocks.append(MemBlock(nid, index, state))
>+
>+ return memblocks_to_ranges(blocks, sz_block)
>+
>+def iomem_ranges():
>+ ranges = []
>+
>+ with open('/proc/iomem', 'r') as f:
>+ # example of the line: '100000000-42b201fff : System RAM'
>+ for line in f:
>+ fields = line.split(':')
>+ if len(fields) < 2:
>+ continue
>+ name = ':'.join(fields[1:]).strip()
>+ addrs = fields[0].split('-')
>+ if len(addrs) != 2:
>+ continue
>+ start = int(addrs[0], 16)
>+ end = int(addrs[1], 16) + 1
>+ ranges.append(PaddrRange(start, end, None, None, name))
>+
>+ return ranges
Hi SeongJae
Here on system with persistent memory, user can plug {all or portion of }persistent memory into buddy system
with drivers from patchset[1] implemented. The persistent memory in this setup will be treated as normal system
RAM, and have valid full-fledged page structure as well.
For example, here is what /proc/iomem looks like in system with such configuration on my testbed.
1840000000-963fffffff : Persistent Memory
1840000000-18be1fffff : namespace0.0
18c0000000-37bfffffff : dax0.0
18c0000000-37bfffffff : System RAM <- first 128G of persistent memory corresponding to node 2
# numactl -H
available: 3 nodes (0-2)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
node 0 size: 95447 MB
node 0 free: 92391 MB
node 1 cpus: 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
node 1 size: 96733 MB
node 1 free: 95670 MB
node 2 cpus:
node 2 size: 126976 MB
node 2 free: 126935 MB
node distances:
node 0 1 2
0: 10 21 17
1: 21 10 28
2: 17 28 10
So here we have two ranges:
1840000000-963fffffff : Persistent Memory
18c0000000-37bfffffff : System RAM
[1]:
https://patchwork.kernel.org/cover/10829019/
>+def paddr_ranges():
>+ ranges1 = memblock_ranges()
>+ ranges2 = iomem_ranges()
>+ merged = []
>+
>+ for r in ranges1:
>+ subsets = []
>+ for r2 in ranges2:
>+ interleaved = r.interleaved(r2)
>+ if interleaved == None:
>+ continue
>+
>+ start, end = interleaved
>+ left = None
>+ if start > r.start:
>+ left = PaddrRange(r.start, start, r.nid, r.state, r.name)
>+ subsets.append(left)
>+
>+ middle = PaddrRange(start, end, r.nid, r.state, r.name)
>+ if r2.nid:
>+ middle.nid = r2.nid
>+ if r2.state:
>+ middle.state = r2.state
>+ if r2.name:
>+ middle.name = r2.name
Memory block from numa node 2 will match with range "1840000000-963fffffff : Persistent Memory"
But take the name "Persistent memory", expected "System RAM" here.
>+ subsets.append(middle)
>+ r.start = end
>+ if r.start < r.end:
>+ subsets = [r]
>+
>+ merged += subsets
>+ return merged
>+
>+def pr_ranges(ranges):
>+ print('#%12s %13s\tnode\tstate\tresource\tsize' % ('start', 'end'))
>+ for r in ranges:
>+ print('%13d %13d\t%s\t%s\t%s\t%d' % (r.start, r.end, r.nid,
>+ r.state, r.name, r.end - r.start))
>+
>+def main():
>+ ranges = paddr_ranges()
>+
>+ pr_ranges(ranges)
>+
>+if __name__ == '__main__':
>+ main()
>diff --git a/tools/damon/record.py b/tools/damon/record.py
>index 416dca940c1d..8440a9818810 100644
>--- a/tools/damon/record.py
>+++ b/tools/damon/record.py
>@@ -12,6 +12,7 @@ import subprocess
> import time
>
> import _damon
>+import _paddr_layout
>
> def do_record(target, is_target_cmd, init_regions, attrs, old_attrs):
> if os.path.isfile(attrs.rfile_path):
>@@ -70,6 +71,8 @@ def set_argparser(parser):
> help='the target command or the pid to record')
> parser.add_argument('-l', '--rbuf', metavar='<len>', type=int,
> default=1024*1024, help='length of record result buffer')
>+ parser.add_argument('--numa_node', metavar='<node id>', type=int,
>+ help='if target is \'paddr\', limit it to the numa node')
> parser.add_argument('-o', '--out', metavar='<file path>', type=str,
> default='damon.data', help='output file path')
>
>@@ -96,6 +99,18 @@ def default_paddr_region():
> ret = [start, end]
> return ret
>
>+def paddr_region_of(numa_node):
>+ regions = []
>+ default_region = default_paddr_region()
>+ paddr_ranges = _paddr_layout.paddr_ranges()
>+ for r in paddr_ranges:
>+ if r.end <= default_region[0] or default_region[1] <= r.start:
>+ continue
>+ if r.nid == numa_node and r.name == 'System RAM':
>+ regions.append([r.start, r.end])
To profile physical address for numa node 2, above checking will not return valid node 2 range
i.e., 18c0000000-37bfffffff : System RAM, but empty ones.
I tweaked the checking to match "Persistent Memory" also, it looks physical address monitoring with
numa node specified works from first glance.
I'm willing to give it a try once you posted next updated version.
>+ return regions
>+
> def main(args=None):
> global orig_attrs
> if not args:
>@@ -113,12 +128,16 @@ def main(args=None):
> args.schemes = ''
> new_attrs = _damon.cmd_args_to_attrs(args)
> init_regions = _damon.cmd_args_to_init_regions(args)
>+ numa_node = args.numa_node
> target = args.target
>
> target_fields = target.split()
> if target == 'paddr': # physical memory address space
> if not init_regions:
>- init_regions = [default_paddr_region()]
>+ if numa_node:
>+ init_regions = paddr_region_of(numa_node)
>+ else:
>+ init_regions = [default_paddr_region()]
> do_record(target, False, init_regions, new_attrs, orig_attrs)
> elif not subprocess.call('which %s > /dev/null' % target_fields[0],
> shell=True, executable='/bin/bash'):
>--
>2.17.1
>
next prev parent reply other threads:[~2020-07-20 6:55 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-07 14:45 [RFC v5 00/11] DAMON: Support Physical Memory Address Space Monitoring SeongJae Park
2020-07-07 14:45 ` [RFC v5 01/11] mm/damon/debugfs: Allow users to set initial monitoring target regions SeongJae Park
2020-07-07 14:45 ` [RFC v5 02/11] tools/damon: Support init target regions specification SeongJae Park
2020-07-07 14:45 ` [RFC v5 03/11] mm/damon-test: Add more unit tests for 'init_regions' SeongJae Park
2020-07-07 20:01 ` Brendan Higgins
2020-07-07 14:45 ` [RFC v5 04/11] selftests/damon/_chk_record: Do not check number of gaps SeongJae Park
2020-07-07 14:45 ` [RFC v5 05/11] Docs/damon: Document 'initial_regions' feature SeongJae Park
2020-07-07 14:45 ` [RFC v5 06/11] mm/rmap: Export essential functions for rmap_run SeongJae Park
2020-07-07 14:45 ` [RFC v5 07/11] mm/damon: Implement callbacks for physical memory monitoring SeongJae Park
2020-07-07 14:45 ` [RFC v5 08/11] mm/damon/debugfs: Support " SeongJae Park
2020-07-07 14:45 ` [RFC v5 09/11] tools/damon/record: " SeongJae Park
2020-07-07 14:45 ` [RFC v5 10/11] tools/damon/record: Support NUMA specific recording SeongJae Park
2020-07-20 6:53 ` Du, Fan [this message]
2020-07-20 7:48 ` SeongJae Park
2020-07-07 14:45 ` [RFC v5 11/11] Docs/damon: Document physical memory monitoring support SeongJae Park
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DM5PR11MB1595FA5D5E3BDB77C1EFF50F997B0@DM5PR11MB1595.namprd11.prod.outlook.com \
--to=fan.du@intel.com \
--cc=Jonathan.Cameron@Huawei.com \
--cc=aarcange@redhat.com \
--cc=acme@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=alexander.shishkin@linux.intel.com \
--cc=amit@kernel.org \
--cc=benh@kernel.crashing.org \
--cc=brendan.d.gregg@gmail.com \
--cc=brendanhiggins@google.com \
--cc=cai@lca.pw \
--cc=colin.king@canonical.com \
--cc=corbet@lwn.net \
--cc=david@redhat.com \
--cc=dwmw@amazon.com \
--cc=foersleo@amazon.de \
--cc=irogers@google.com \
--cc=jolsa@redhat.com \
--cc=kirill@shutemov.name \
--cc=linux-damon@amazon.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mark.rutland@arm.com \
--cc=mgorman@suse.de \
--cc=minchan@kernel.org \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
--cc=rdunlap@infradead.org \
--cc=riel@surriel.com \
--cc=rientjes@google.com \
--cc=rostedt@goodmis.org \
--cc=rppt@kernel.org \
--cc=sblbir@amazon.com \
--cc=shakeelb@google.com \
--cc=shuah@kernel.org \
--cc=sj38.park@gmail.com \
--cc=sjpark@amazon.com \
--cc=sjpark@amazon.de \
--cc=snu@amazon.de \
--cc=vbabka@suse.cz \
--cc=vdavydov.dev@gmail.com \
--cc=yang.shi@linux.alibaba.com \
--cc=ying.huang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).