* [PATCH 1/2] NUMA emulation x86_64: numa=fake parameter for custom nodes distance
@ 2011-11-18 11:55 Petr Holasek
2011-11-18 11:55 ` [PATCH 2/2] NUMA emulation x86_64: Documentation changes in boot-options.txt Petr Holasek
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Petr Holasek @ 2011-11-18 11:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andrew Morton
Cc: linux-kernel, x86, Anton Arapov, Petr Holasek
By default, when NUMA emulation is turned on, the node distance table
uses the physical distances, so for 4 nodes emulated on 1 physical node
the table is
node    0   1   2   3
   0:  10  10  10  10
   1:  10  10  10  10
   2:  10  10  10  10
   3:  10  10  10  10
This patch adds a new optional [distance] argument to
numa=fake=<number/size of nodes>[,distance]
When the distance argument is used, it sets a linear distance between
nodes, like this:
         __distance__
     ___|___     ____|___     ________     ________
    |       |   |        |   |        |   |        |
    | node1 |---| node 2 |---| node 3 |---| node 4 |
    |_______|   |________|   |________|   |________|
        |                        |             |
        |                        |             |
        |____distance * 2________|             |
        |                                      |
        |____________distance * 3______________|
This feature might be useful for testing some numa awareness features in
both user and kernel spaces.
Signed-off-by: Petr Holasek <pholasek@redhat.com>
---
arch/x86/mm/numa_emulation.c | 16 ++++++++++++++++
1 files changed, 16 insertions(+), 0 deletions(-)
diff --git a/arch/x86/mm/numa_emulation.c b/arch/x86/mm/numa_emulation.c
index d0ed086..1824972 100644
--- a/arch/x86/mm/numa_emulation.c
+++ b/arch/x86/mm/numa_emulation.c
@@ -309,6 +309,8 @@ void __init numa_emulation(struct numa_meminfo *numa_meminfo, int numa_dist_cnt)
 	u8 *phys_dist = NULL;
 	size_t phys_size = numa_dist_cnt * numa_dist_cnt * sizeof(phys_dist[0]);
 	int max_emu_nid, dfl_phys_nid;
+	unsigned long dist_level;
+	char *c;
 	int i, j, ret;
 
 	if (!emu_cmdline)
@@ -404,6 +406,17 @@ void __init numa_emulation(struct numa_meminfo *numa_meminfo, int numa_dist_cnt)
 		if (emu_nid_to_phys[i] == NUMA_NO_NODE)
 			emu_nid_to_phys[i] = dfl_phys_nid;
 
+	/* load distance level parameter */
+	dist_level = -1;
+	c = strchr(emu_cmdline, ',');
+	if (c) {
+		c++;
+		ret = kstrtoul(c, 10, &dist_level);
+		if (ret < 0 || dist_level < LOCAL_DISTANCE ||
+			dist_level * max_emu_nid > ULONG_MAX)
+			dist_level = -1;
+	}
+
 	/* transform distance table */
 	numa_reset_distance();
 	for (i = 0; i < max_emu_nid + 1; i++) {
@@ -418,6 +431,9 @@ void __init numa_emulation(struct numa_meminfo *numa_meminfo, int numa_dist_cnt)
 			else
 				dist = phys_dist[physi * numa_dist_cnt + physj];
 
+			if (dist_level != -1 && i != j)
+				dist = abs(i - j) * dist_level;
+
 			numa_set_distance(i, j, dist);
 		}
 	}
--
1.7.6.4
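
Editorial sketch (not part of the patch): the linear scheme described in
the changelog can be written as a small userspace C helper. The names
linear_distance() and MAX_DISTANCE are made up for illustration.
LOCAL_DISTANCE is 10, as in the table above, and the 255 limit assumes
the distance table holds u8 entries, as the "u8 *phys_dist" declaration
in the diff suggests. A bound of the form "x * n > ULONG_MAX" can never
be true for unsigned long operands, so the sketch divides instead of
multiplying.

```c
#include <assert.h>
#include <stdlib.h>

#define LOCAL_DISTANCE 10   /* distance of a node to itself */
#define MAX_DISTANCE   255  /* assumption: table entries are u8 */

/*
 * Hypothetical helper: distance between emulated nodes i and j when the
 * optional "distance" argument (dist_level) is given.
 * Returns -1 if dist_level is below LOCAL_DISTANCE or the result would
 * not fit into a u8 table entry.
 */
static int linear_distance(int i, int j, unsigned long dist_level)
{
	unsigned long hops;

	assert(i >= 0 && j >= 0);

	if (i == j)
		return LOCAL_DISTANCE;

	hops = (unsigned long)abs(i - j);
	/* Divide rather than multiply so the check itself cannot overflow. */
	if (dist_level < LOCAL_DISTANCE || dist_level > MAX_DISTANCE / hops)
		return -1;

	return (int)(hops * dist_level);
}
```

With a distance of 20, nodes 0 and 3 end up 60 apart. A distance of 100
is rejected for that pair because 300 does not fit into a u8.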
* [PATCH 2/2] NUMA emulation x86_64: Documentation changes in boot-options.txt
2011-11-18 11:55 [PATCH 1/2] NUMA emulation x86_64: numa=fake parameter for custom nodes distance Petr Holasek
@ 2011-11-18 11:55 ` Petr Holasek
2011-11-18 19:53 ` [PATCH 1/2] NUMA emulation x86_64: numa=fake parameter for custom nodes distance Andrew Morton
2011-11-20 2:06 ` [PATCH 1/2] " David Rientjes
2 siblings, 0 replies; 8+ messages in thread
From: Petr Holasek @ 2011-11-18 11:55 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andrew Morton
Cc: linux-kernel, x86, Anton Arapov, Petr Holasek
Signed-off-by: Petr Holasek <pholasek@redhat.com>
---
Documentation/x86/x86_64/boot-options.txt | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt
index c54b4f5..33c0c10 100644
--- a/Documentation/x86/x86_64/boot-options.txt
+++ b/Documentation/x86/x86_64/boot-options.txt
@@ -166,13 +166,15 @@ NUMA
 
   numa=noacpi   Don't parse the SRAT table for NUMA setup
 
-  numa=fake=<size>[MG]
+  numa=fake=<size>[MG][,distance]
 		If given as a memory unit, fills all system RAM with nodes of
 		size interleaved over physical nodes.
+		Optional distance sets linear distance between emulated nodes.
 
-  numa=fake=<N>
+  numa=fake=<N>[,distance]
 		If given as an integer, fills all system RAM with N fake nodes
 		interleaved over physical nodes.
+		Optional distance sets linear distance between emulated nodes.
 
 ACPI
 
--
1.7.6.4
* Re: [PATCH 1/2] NUMA emulation x86_64: numa=fake parameter for custom nodes distance
2011-11-18 11:55 [PATCH 1/2] NUMA emulation x86_64: numa=fake parameter for custom nodes distance Petr Holasek
2011-11-18 11:55 ` [PATCH 2/2] NUMA emulation x86_64: Documentation changes in boot-options.txt Petr Holasek
@ 2011-11-18 19:53 ` Andrew Morton
2011-11-19 0:31 ` Petr Holasek
2011-11-20 2:06 ` [PATCH 1/2] " David Rientjes
2 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2011-11-18 19:53 UTC (permalink / raw)
To: Petr Holasek
Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, linux-kernel, x86,
Anton Arapov
On Fri, 18 Nov 2011 12:55:07 +0100
Petr Holasek <pholasek@redhat.com> wrote:
> As default, when numa emulation is turned on, node distance table
> uses physical distance, so for 4 nodes emulated on 1 physical table is
>
> node    0   1   2   3
>    0:  10  10  10  10
>    1:  10  10  10  10
>    2:  10  10  10  10
>    3:  10  10  10  10
>
> This patch adds new [distance] argument to
>
> numa=fake=<number/size of nodes>[,distance]
>
> When distance argument is used, it sets linear distance between nodes
> like that:
>
>          __distance__
>      ___|___     ____|___     ________     ________
>     |       |   |        |   |        |   |        |
>     | node1 |---| node 2 |---| node 3 |---| node 4 |
>     |_______|   |________|   |________|   |________|
>         |                        |             |
>         |                        |             |
>         |____distance * 2________|             |
>         |                                      |
>         |____________distance * 3______________|
>
> This feature might be useful for testing some numa awareness features in
> both user and kernel spaces.
>
"might" is a red flag. We don't merge things which might be useful!
*Is* it useful? If so then please tell us why and explain how it might
be useful to others.
> @@ -404,6 +406,17 @@ void __init numa_emulation(struct numa_meminfo *numa_meminfo, int numa_dist_cnt)
>  		if (emu_nid_to_phys[i] == NUMA_NO_NODE)
>  			emu_nid_to_phys[i] = dfl_phys_nid;
>
> +	/* load distance level parameter */
> +	dist_level = -1;
> +	c = strchr(emu_cmdline, ',');
> +	if (c) {
> +		c++;
> +		ret = kstrtoul(c, 10, &dist_level);
> +		if (ret < 0 || dist_level < LOCAL_DISTANCE ||
> +			dist_level * max_emu_nid > ULONG_MAX)
> +			dist_level = -1;
If this happens, the user goofed and we should tell them, with a printk.
[patch 2/2] adds the documentation for the feature and should be
included in the same patch as the implementation.
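
Editorial sketch (not from the thread): one way the parsing step could
report a bad value, shown as a userspace analogue. strtoul() and
fprintf() stand in for kstrtoul() and printk(); parse_fake_distance(),
NO_DISTANCE and the wording of the message are hypothetical, and the
upper bound of 255 assumes u8 table entries.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LOCAL_DISTANCE 10
#define NO_DISTANCE    0UL  /* hypothetical sentinel: no usable distance */

/*
 * Parse the optional ",distance" suffix of a numa=fake= argument.
 * Returns the distance, or NO_DISTANCE if the suffix is absent or
 * invalid. An invalid suffix is reported on stderr.
 */
static unsigned long parse_fake_distance(const char *cmdline)
{
	const char *c;
	char *end;
	unsigned long val;

	assert(cmdline != NULL);

	c = strchr(cmdline, ',');
	if (!c)
		return NO_DISTANCE;	/* argument not given: nothing to report */
	c++;

	errno = 0;
	val = strtoul(c, &end, 10);
	if (errno || end == c || *end != '\0' ||
	    val < LOCAL_DISTANCE || val > 255) {
		fprintf(stderr, "numa=fake: ignoring invalid distance \"%s\"\n", c);
		return NO_DISTANCE;
	}
	return val;
}
```

A separate sentinel avoids storing -1 in an unsigned variable and makes
the "argument missing" and "argument rejected" paths easy to tell apart.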
* Re: NUMA emulation x86_64: numa=fake parameter for custom nodes distance
2011-11-18 19:53 ` [PATCH 1/2] NUMA emulation x86_64: numa=fake parameter for custom nodes distance Andrew Morton
@ 2011-11-19 0:31 ` Petr Holasek
2011-11-20 2:09 ` David Rientjes
0 siblings, 1 reply; 8+ messages in thread
From: Petr Holasek @ 2011-11-19 0:31 UTC (permalink / raw)
To: Andrew Morton
Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, linux-kernel, x86,
Anton Arapov
On Fri, 18 Nov 2011, Andrew Morton wrote:
> Date: Fri, 18 Nov 2011 11:53:36 -0800
> From: Andrew Morton <akpm@linux-foundation.org>
> To: Petr Holasek <pholasek@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>,
> "H. Peter Anvin" <hpa@zytor.com>, linux-kernel@vger.kernel.org,
> x86@kernel.org, Anton Arapov <anton@redhat.com>
> Subject: Re: [PATCH 1/2] NUMA emulation x86_64: numa=fake parameter for
> custom nodes distance
>
> On Fri, 18 Nov 2011 12:55:07 +0100
> Petr Holasek <pholasek@redhat.com> wrote:
>
> > As default, when numa emulation is turned on, node distance table
> > uses physical distance, so for 4 nodes emulated on 1 physical table is
> >
> > node    0   1   2   3
> >    0:  10  10  10  10
> >    1:  10  10  10  10
> >    2:  10  10  10  10
> >    3:  10  10  10  10
> >
> > This patch adds new [distance] argument to
> >
> > numa=fake=<number/size of nodes>[,distance]
> >
> > When distance argument is used, it sets linear distance between nodes
> > like that:
> >
> >          __distance__
> >      ___|___     ____|___     ________     ________
> >     |       |   |        |   |        |   |        |
> >     | node1 |---| node 2 |---| node 3 |---| node 4 |
> >     |_______|   |________|   |________|   |________|
> >         |                        |             |
> >         |                        |             |
> >         |____distance * 2________|             |
> >         |                                      |
> >         |____________distance * 3______________|
> >
> > This feature might be useful for testing some numa awareness features in
> > both user and kernel spaces.
> >
>
> "might" is a red flag. We don't merge things which might be useful!
>
> *Is* it useful? If so then please tell us why and explain how it might
> be useful to others.
Many developers still have no access to large NUMA machines, and the
possibility of NUMA emulation could get more of them thinking about the
NUMA awareness of their apps/kernel code.
>
> > @@ -404,6 +406,17 @@ void __init numa_emulation(struct numa_meminfo *numa_meminfo, int numa_dist_cnt)
> >  		if (emu_nid_to_phys[i] == NUMA_NO_NODE)
> >  			emu_nid_to_phys[i] = dfl_phys_nid;
> >
> > +	/* load distance level parameter */
> > +	dist_level = -1;
> > +	c = strchr(emu_cmdline, ',');
> > +	if (c) {
> > +		c++;
> > +		ret = kstrtoul(c, 10, &dist_level);
> > +		if (ret < 0 || dist_level < LOCAL_DISTANCE ||
> > +			dist_level * max_emu_nid > ULONG_MAX)
> > +			dist_level = -1;
>
> If this happens, the user goofed and we should tell them, with a printk.
>
>
> [patch 2/2] adds the documentation for the feature and should be
> included in the same patch as the implementation.
My apologies. If necessary, I'll send a v2 of the patch with the printk()
and the documentation all in one.
thanks,
Petr H
>
* Re: [PATCH 1/2] NUMA emulation x86_64: numa=fake parameter for custom nodes distance
2011-11-18 11:55 [PATCH 1/2] NUMA emulation x86_64: numa=fake parameter for custom nodes distance Petr Holasek
2011-11-18 11:55 ` [PATCH 2/2] NUMA emulation x86_64: Documentation changes in boot-options.txt Petr Holasek
2011-11-18 19:53 ` [PATCH 1/2] NUMA emulation x86_64: numa=fake parameter for custom nodes distance Andrew Morton
@ 2011-11-20 2:06 ` David Rientjes
2 siblings, 0 replies; 8+ messages in thread
From: David Rientjes @ 2011-11-20 2:06 UTC (permalink / raw)
To: Petr Holasek
Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andrew Morton,
linux-kernel, x86, Anton Arapov
On Fri, 18 Nov 2011, Petr Holasek wrote:
> As default, when numa emulation is turned on, node distance table
> uses physical distance, so for 4 nodes emulated on 1 physical table is
>
> node    0   1   2   3
>    0:  10  10  10  10
>    1:  10  10  10  10
>    2:  10  10  10  10
>    3:  10  10  10  10
>
That should only be true if you're booting on a system with one physical
node and an SRAT, otherwise the distance between fake nodes should be
representative of their physical distance. For example, if you boot
with numa=fake=4 on a symmetrical two-node box, you should get
something like
10 10 20 20
10 10 20 20
20 20 10 10
20 20 10 10
It's done like this intentionally so you can test NUMA without having many
nodes. What you're doing is changing the distance even though there is no
actual difference in latency on the hardware so it's an incorrect
representation.
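
Editorial sketch (not from the thread): the default behaviour described
above amounts to looking up each fake node's physical node in the
physical distance table, as the "phys_dist[physi * numa_dist_cnt +
physj]" line in the patch context does. The two-node table and the
{0, 0, 1, 1} mapping below are assumed values chosen to reproduce the
matrix above; emu_distance() is a made-up name.

```c
#include <assert.h>

#define NR_PHYS 2
#define NR_EMU  4

/* Assumed physical distance table of a symmetrical two-node box. */
static const unsigned char phys_dist[NR_PHYS * NR_PHYS] = {
	10, 20,
	20, 10,
};

/* Assumed mapping: fake nodes 0-1 live on physical node 0, 2-3 on node 1. */
static const int emu_nid_to_phys[NR_EMU] = { 0, 0, 1, 1 };

/* Distance between two fake nodes, inherited from their physical nodes. */
static int emu_distance(int i, int j)
{
	assert(i >= 0 && i < NR_EMU && j >= 0 && j < NR_EMU);

	return phys_dist[emu_nid_to_phys[i] * NR_PHYS + emu_nid_to_phys[j]];
}
```

Fake nodes that share a physical node report the local distance of 10,
while fake nodes on different physical nodes report 20.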
> This patch adds new [distance] argument to
>
> numa=fake=<number/size of nodes>[,distance]
>
> When distance argument is used, it sets linear distance between nodes
> like that:
>
>          __distance__
>      ___|___     ____|___     ________     ________
>     |       |   |        |   |        |   |        |
>     | node1 |---| node 2 |---| node 3 |---| node 4 |
>     |_______|   |________|   |________|   |________|
>         |                        |             |
>         |                        |             |
>         |____distance * 2________|             |
>         |                                      |
>         |____________distance * 3______________|
>
> This feature might be useful for testing some numa awareness features in
> both user and kernel spaces.
>
I don't see any use case for this other than testing if code can actually
order nodes correctly or not. The distances that you're now adding are,
by definition, incorrect since they aren't the same as exported by the
true SLIT (which is what happens by default now) so nothing other than
functional testing of node ordering is achieved with this patch.
So nack on this approach.
What you could do, however, and would be generally useful even outside of
NUMA emulation, is to add fake SLIT functionality so that you can define
it yourself on the command line. You could use that either with or
without NUMA emulation if you know the physical SLIT is incorrect in some
way. Then, you get the same functionality as your patch here by using it
in combination with numa=fake and the added bonus is that you don't need
any of the "distance * 2" or "distance * 3" limitations.
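
Editorial sketch (not from the thread): the fake-SLIT idea could take a
full distance table from the command line. No such option or syntax is
defined anywhere in this thread; the comma-separated, row-major format
and the name parse_fake_slit() are assumptions made purely for
illustration, written as userspace C.

```c
#include <assert.h>
#include <stdlib.h>

#define LOCAL_DISTANCE 10

/*
 * Parse nr * nr comma-separated distances (row-major) from s into table.
 * Hypothetical format. Returns 0 on success and -1 on malformed input,
 * on values above 255, or on a diagonal entry other than LOCAL_DISTANCE.
 */
static int parse_fake_slit(const char *s, int nr, unsigned char *table)
{
	int n = nr * nr;
	int k;

	assert(s != NULL && table != NULL && nr > 0);

	for (k = 0; k < n; k++) {
		char *end;
		unsigned long v;

		/* Require a digit so signs, spaces and empty fields are rejected. */
		if (*s < '0' || *s > '9')
			return -1;
		v = strtoul(s, &end, 10);
		if (v > 255)
			return -1;
		/* A node's distance to itself must be LOCAL_DISTANCE. */
		if (k / nr == k % nr && v != LOCAL_DISTANCE)
			return -1;
		table[k] = (unsigned char)v;

		if (k < n - 1) {
			if (*end != ',')
				return -1;	/* too few entries or stray text */
			s = end + 1;
		} else if (*end != '\0') {
			return -1;		/* trailing text after last entry */
		}
	}
	return 0;
}
```

Under this assumed format, "10,20,20,10" describes a two-node table and
"10,20,30,40,20,10,20,30,30,20,10,20,40,30,20,10" reproduces the linear
four-node layout of the patch with a distance of 10 between neighbours.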
* Re: NUMA emulation x86_64: numa=fake parameter for custom nodes distance
2011-11-19 0:31 ` Petr Holasek
@ 2011-11-20 2:09 ` David Rientjes
2011-11-21 20:41 ` Petr Holasek
0 siblings, 1 reply; 8+ messages in thread
From: David Rientjes @ 2011-11-20 2:09 UTC (permalink / raw)
To: Petr Holasek
Cc: Andrew Morton, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
linux-kernel, x86, Anton Arapov
On Sat, 19 Nov 2011, Petr Holasek wrote:
> A lot of developers still have no access to large NUMA machines and
> possibility of NUMA emulation could involve more of them to thinking
> about NUMA awareness of their apps/kernel code.
>
That's a bogus argument, numa=fake already allows you to construct as
large of a NUMA box as you want in a faked environment. The distances
have nothing to do with that.
The distances you're adding here are, by definition, incorrect because
they don't respect the actual distance between physical nodes that
numa=fake uses already. If you're using numa=fake on a UMA machine, then
the performance of the kernel will be just that: you won't actually see
any introduced latency between fake nodes just by changing the distance.
So you're completely invalidating what internode distances actually mean.
I'd much rather see an option to fake the SLIT that could do all of this
without limitation and would make it possible to debug issues in the
future.
* Re: NUMA emulation x86_64: numa=fake parameter for custom nodes distance
2011-11-20 2:09 ` David Rientjes
@ 2011-11-21 20:41 ` Petr Holasek
2011-11-21 22:24 ` David Rientjes
0 siblings, 1 reply; 8+ messages in thread
From: Petr Holasek @ 2011-11-21 20:41 UTC (permalink / raw)
To: David Rientjes
Cc: Andrew Morton, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
linux-kernel, x86, Anton Arapov
On Sat, 19 Nov 2011, David Rientjes wrote:
> Date: Sat, 19 Nov 2011 18:09:59 -0800 (PST)
> From: David Rientjes <rientjes@google.com>
> To: Petr Holasek <pholasek@redhat.com>
> cc: Andrew Morton <akpm@linux-foundation.org>, Thomas Gleixner
> <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin"
> <hpa@zytor.com>, linux-kernel@vger.kernel.org, x86@kernel.org, Anton
> Arapov <anton@redhat.com>
> Subject: Re: NUMA emulation x86_64: numa=fake parameter for custom nodes
> distance
>
> On Sat, 19 Nov 2011, Petr Holasek wrote:
>
> > A lot of developers still have no access to large NUMA machines and
> > possibility of NUMA emulation could involve more of them to thinking
> > about NUMA awareness of their apps/kernel code.
> >
>
> That's a bogus argument, numa=fake already allows you to construct as
> large of a NUMA box as you want in a faked environment. The distances
> have nothing to do with that.
>
> The distances you're adding here are, by definition, incorrect because it
> doesn't respect the actual distance between physical nodes that numa=fake
> uses already. If you're using numa=fake on an UMA machine, then the
> performance of the kernel will be just that, you won't actual see any
> introduced latency between fake nodes just by changing the distance. So
> you're completely invalidating what internode distances actually mean.
>
> I'd much rather see an option to fake the SLIT that could do all of this
> without limitation and would be possible to debug issues in the future.
This patch was designed as nothing more than a helper for debugging and
testing purposes, e.g. when it is useful to have more values in the
exported distance table than only LOCAL_DISTANCE. That is why it
disregards the existing distances between physical nodes.
Faking the SLIT is a really good point; if this patch is eventually
rejected, I will rework it in that manner.
* Re: NUMA emulation x86_64: numa=fake parameter for custom nodes distance
2011-11-21 20:41 ` Petr Holasek
@ 2011-11-21 22:24 ` David Rientjes
0 siblings, 0 replies; 8+ messages in thread
From: David Rientjes @ 2011-11-21 22:24 UTC (permalink / raw)
To: Petr Holasek
Cc: Andrew Morton, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
linux-kernel, x86, Anton Arapov
On Mon, 21 Nov 2011, Petr Holasek wrote:
> This patch was designed as nothing more than helper for debugging/testing
> purposes, e.g. when it is useful to have more values in exports than only
> LOCAL_DISTANCEs. So that's the reason why it disregards former distances
> between physical nodes.
>
I understand, but like I said: the only debugging and testing it would be
useful for is node ordering. The actual latency of memory accesses is
not going to be representative of the new distances and will lead to
confusion since they're wrong. It's also pretty limited in even that
regard because all nodes are now spaced by the same distance, so they're
just spread out linearly instead of actually representing a real NUMA
architecture.
> Faking the SLIT table is a really good point, if this patch would be
> eventually rejected, I will rework the patch in that manner.
>
That has applicability even outside of debugging: you could override your
own machine's SLIT if you know it's bogus. The way it's defined is very
lengthy, however, and would require (4 * nr_nodes^2) characters at maximum
since the max distance is three characters, 255 (unreachable node), and
you'd need to separate them by one character, a comma. That's 256 chars
for eight nodes!
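
Editorial sketch (not from the thread): the length estimate above as a
small C helper. fake_slit_max_chars() and CHARS_PER_ENTRY are made-up
names, and the bound follows the same counting as the message: three
digits plus one comma per table entry.

```c
#include <assert.h>

/* Worst case per entry: three digits ("255") plus one separating comma. */
#define CHARS_PER_ENTRY 4u

/* Upper bound on the length of a command-line SLIT for nr_nodes nodes. */
static unsigned int fake_slit_max_chars(unsigned int nr_nodes)
{
	assert(nr_nodes > 0);

	return CHARS_PER_ENTRY * nr_nodes * nr_nodes;
}
```

For eight nodes this gives 4 * 8 * 8 = 256 characters, and the bound
quadruples every time the node count doubles.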