* x86 NUMA error on OSSTest box
@ 2022-10-03 21:21 Andrew Cooper
  2022-10-04  8:04 ` Jan Beulich
  2022-10-04  9:25 ` Jan Beulich
  0 siblings, 2 replies; 3+ messages in thread
From: Andrew Cooper @ 2022-10-03 21:21 UTC
  To: xen-devel, Jan Beulich, Roger Pau Monne, Wei Liu; +Cc: Henry Wang

While working on another issue, I spotted this:

(XEN) ACPI: EINJ 6CB9D638, 0150 (r1 ORACLE     X7-2 41060300 INTL        1)
(XEN) System RAM: 32429MB (33208204kB)
(XEN) SRAT: Node 0 PXM 0 [0000000000000000, 000000007fffffff]
(XEN) SRAT: Node 0 PXM 0 [0000000100000000, 000000047fffffff]
(XEN) SRAT: Node 1 PXM 1 [0000000480000000, 000000087fffffff]
(XEN) NUMA: Using 19 for the hash shift.
(XEN) Your memory is not aligned you need to rebuild your hypervisor with a bigger NODEMAPSIZE shift=19
(XEN) SRAT: No NUMA node hash function found. Contact maintainer
(XEN) SRAT: SRAT not used.
(XEN) No NUMA configuration found
(XEN) Faking a node at 0000000000000000-0000000880000000
(XEN) Domain heap initialised

on sabro0 in OSSTest on current staging.  I do not know if it's a recent
regression or not.

The SRAT looks reasonable (in fact, far better than most I've seen). 
Given no legitimate requirement for aligned memory that I'm aware of, I
think Xen's behaviour here is buggy and wants resolving.

~Andrew


* Re: x86 NUMA error on OSSTest box
  2022-10-03 21:21 x86 NUMA error on OSSTest box Andrew Cooper
@ 2022-10-04  8:04 ` Jan Beulich
  2022-10-04  9:25 ` Jan Beulich
  1 sibling, 0 replies; 3+ messages in thread
From: Jan Beulich @ 2022-10-04  8:04 UTC
  To: Andrew Cooper; +Cc: Henry Wang, xen-devel, Roger Pau Monne, Wei Liu

On 03.10.2022 23:21, Andrew Cooper wrote:
> While working on another issue, I spotted this:
> 
> (XEN) ACPI: EINJ 6CB9D638, 0150 (r1 ORACLE     X7-2 41060300 INTL        1)
> (XEN) System RAM: 32429MB (33208204kB)
> (XEN) SRAT: Node 0 PXM 0 [0000000000000000, 000000007fffffff]
> (XEN) SRAT: Node 0 PXM 0 [0000000100000000, 000000047fffffff]
> (XEN) SRAT: Node 1 PXM 1 [0000000480000000, 000000087fffffff]
> (XEN) NUMA: Using 19 for the hash shift.
> (XEN) Your memory is not aligned you need to rebuild your hypervisor with a bigger NODEMAPSIZE shift=19
> (XEN) SRAT: No NUMA node hash function found. Contact maintainer
> (XEN) SRAT: SRAT not used.
> (XEN) No NUMA configuration found
> (XEN) Faking a node at 0000000000000000-0000000880000000
> (XEN) Domain heap initialised
> 
> on sabro0 in OSSTest on current staging.  I do not know if it's a recent
> regression or not.
> 
> The SRAT looks reasonable (in fact, far better than most I've seen). 
> Given no legitimate requirement for aligned memory that I'm aware of, I
> think Xen's behaviour here is buggy and wants resolving.

Judging from flight 173273's logs (on sabro1) this is a recent issue,
which then must result from one of my changes. There we simply have

Sep 22 01:54:39.843438 (XEN) SRAT: Node 0 PXM 0 [0000000000000000, 000000007fffffff]
Sep 22 01:54:39.915465 (XEN) SRAT: Node 0 PXM 0 [0000000100000000, 000000047fffffff]
Sep 22 01:54:39.927478 (XEN) SRAT: Node 1 PXM 1 [0000000480000000, 000000087fffffff]
Sep 22 01:54:39.927500 (XEN) NUMA: Using 19 for the hash shift.

For the moment I can't make the connection, as we still pick 19 for the
shift value. I'll take a closer look.
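
For readers following along, the value 19 can be reproduced from the three SRAT ranges in the log with a small sketch. This is an illustration only, not Xen's code: it assumes the shift is the lowest set bit of the OR of all range start addresses expressed as 4 KiB page frame numbers, and it ignores any address compression the hypervisor may apply. The names `lowest_set_bit`, `hash_shift_from_starts` and `srat_starts` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical helper: index of the lowest set bit, or 63 if none is set. */
static unsigned int lowest_set_bit(uint64_t v)
{
    unsigned int i = 0;

    if ( v == 0 )
        return 63;

    while ( !(v & 1) )
    {
        v >>= 1;
        i++;
    }

    return i;
}

/*
 * Hypothetical sketch: OR together the starting page frame numbers of all
 * ranges and take the lowest set bit as the shift. Every range then starts
 * on a multiple of (1 << shift) page frames.
 */
static unsigned int hash_shift_from_starts(const uint64_t *starts, size_t n)
{
    uint64_t bits = 0;
    size_t i;

    assert(starts != NULL && n > 0);

    for ( i = 0; i < n; i++ )
        bits |= starts[i] >> PAGE_SHIFT;

    return lowest_set_bit(bits);
}

/* Start addresses of the three SRAT ranges shown in the log above. */
static const uint64_t srat_starts[] = {
    0x0000000000000000ULL,
    0x0000000100000000ULL,
    0x0000000480000000ULL,
};
```

Calling `hash_shift_from_starts(srat_starts, 3)` ORs the frame numbers 0x0, 0x100000 and 0x480000 into 0x580000, whose lowest set bit is bit 19.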

Jan



* Re: x86 NUMA error on OSSTest box
  2022-10-03 21:21 x86 NUMA error on OSSTest box Andrew Cooper
  2022-10-04  8:04 ` Jan Beulich
@ 2022-10-04  9:25 ` Jan Beulich
  1 sibling, 0 replies; 3+ messages in thread
From: Jan Beulich @ 2022-10-04  9:25 UTC
  To: Andrew Cooper; +Cc: Henry Wang, xen-devel, Roger Pau Monne, Wei Liu

On 03.10.2022 23:21, Andrew Cooper wrote:
> While working on another issue, I spotted this:
> 
> (XEN) ACPI: EINJ 6CB9D638, 0150 (r1 ORACLE     X7-2 41060300 INTL        1)
> (XEN) System RAM: 32429MB (33208204kB)
> (XEN) SRAT: Node 0 PXM 0 [0000000000000000, 000000007fffffff]
> (XEN) SRAT: Node 0 PXM 0 [0000000100000000, 000000047fffffff]
> (XEN) SRAT: Node 1 PXM 1 [0000000480000000, 000000087fffffff]
> (XEN) NUMA: Using 19 for the hash shift.
> (XEN) Your memory is not aligned you need to rebuild your hypervisor with a bigger NODEMAPSIZE shift=19
> (XEN) SRAT: No NUMA node hash function found. Contact maintainer
> (XEN) SRAT: SRAT not used.
> (XEN) No NUMA configuration found
> (XEN) Faking a node at 0000000000000000-0000000880000000
> (XEN) Domain heap initialised
> 
> on sabro0 in OSSTest on current staging.  I do not know if it's a recent
> regression or not.
> 
> The SRAT looks reasonable (in fact, far better than most I've seen). 
> Given no legitimate requirement for aligned memory that I'm aware of, I
> think Xen's behaviour here is buggy and wants resolving.

That's yet another off-by-1 afaics, which did not matter until the
first off-by-1 was eliminated. I'll make a(nother) patch, but I first
want to figure out why I didn't see this issue myself (or whether I
merely overlooked it).
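
To illustrate how two off-by-1s of this kind can interact, here is a minimal sketch. It is not taken from Xen's source, and whether it matches the exact pair of errors meant above is an assumption. It uses the numbers from the log: a shift of 19 and an exclusive end of memory at page frame 0x880000 (address 0x880000000 with 4 KiB pages). The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Map slots needed to cover page frames [0, end_pfn), with end_pfn exclusive. */
static uint64_t map_slots(uint64_t end_pfn, unsigned int shift)
{
    assert(end_pfn > 0);
    return ((end_pfn - 1) >> shift) + 1;
}

/*
 * Bounds check that treats the exclusive end as if it were a frame in the
 * range. It only passes when the map has one slot more than strictly needed.
 */
static bool fits_exclusive_end(uint64_t end_pfn, unsigned int shift,
                               uint64_t slots)
{
    return (end_pfn >> shift) < slots;
}

/* Bounds check on the last frame the range actually covers. */
static bool fits_last_frame(uint64_t end_pfn, unsigned int shift,
                            uint64_t slots)
{
    assert(end_pfn > 0);
    return ((end_pfn - 1) >> shift) < slots;
}
```

With `end_pfn = 0x880000` and `shift = 19`, `map_slots` yields 17. The exclusive-end check then computes `0x880000 >> 19 == 17`, which is not below 17, so it rejects a layout that fits exactly. The last-frame check computes 16 and accepts it. A map sized one slot too large (18 here) would have hidden the faulty check, which is one way fixing the first error could expose the second.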

Jan


