linux-kernel.vger.kernel.org archive mirror
* 2 physical-cpu (like 2x6core) config and NUMA?
@ 2012-09-17 18:00 Linda Walsh
  2012-09-18  5:05 ` Mike Galbraith
  2012-09-18  6:55 ` Jike Song
  0 siblings, 2 replies; 4+ messages in thread
From: Linda Walsh @ 2012-09-17 18:00 UTC (permalink / raw)
  To: Linux-Kernel

I was wondering: on dual-processor motherboards, Intel uses dedicated
memory for each CPU (6 memory slots per socket in the X5xxx series), and
to access the memory attached to the other chip's cores, the data has to
be transferred over the QPI bus.

So wouldn't it be of benefit if such dual-chip configurations were
set up as 'NUMA', since there is a higher cost to migrating
memory/processes between cores on different chips vs. on the same chip?

I note from 'cpupower -c all frequency-info' that the "odd" cpu-cores
all have to run at the same clock frequency, and the "even" ones all
have to run together, which I take to mean that the odd-numbered cores
are on 1 chip and the even-numbered cores are on the other chip.
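
That guess can be checked directly, by the way: sysfs reports which
physical package each logical CPU sits on.  A minimal sketch with
hypothetical package IDs -- on the live box, read
/sys/devices/system/cpu/cpu*/topology/physical_package_id instead:

```shell
# Hypothetical per-CPU package IDs mimicking the odd/even layout guessed
# above (even CPUs on package 0, odd CPUs on package 1).  On real
# hardware, substitute the contents of
#   /sys/devices/system/cpu/cpu*/topology/physical_package_id
pkg_ids="0 1 0 1 0 1"
i=0
for p in $pkg_ids; do
    echo "cpu$i -> package $p"
    i=$((i + 1))
done
```

If the odd/even pattern really holds, the live sysfs values would pair
every even CPU with one package ID and every odd CPU with the other.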

Since the QPI path is limited and appears to be slower than the local
memory access rate, wouldn't it be appropriate if 2-chip setups were
configured as 2 NUMA nodes?

Although -- I have no clue how the memory space is divided between the
two chips -- i.e. I don't know, if say I have 24G on each, whether they
alternate 4G blocks in the physical address space or what (that would
all be handled (or mapped) before the chips come up.. so it could be
contiguous).


Does the kernel support scheduling based on the different speed of
memory between "on die" vs. "off die"?  I was surprised to see that
it viewed my system as 1 NUMA node with all 12 cores on 1 node -- when
I know that it is physically organized as 2x6.

Do I have to configure that manually or did I maybe turn off something
in my kernel config I should have turned on?
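
For reference, the kernel's view can be counted straight out of
`numactl --hardware`.  A sketch against a hypothetical single-node
output like the one this box produces (the cpu list and size are made
up; on a live system, pipe the real command instead):

```shell
# Hypothetical `numactl --hardware` output from a box whose firmware hid
# the second node; on the real machine, pipe the command itself instead.
sample='available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11
node 0 size: 48370 MB'
nodes=$(printf '%s\n' "$sample" | sed -n 's/^available: \([0-9][0-9]*\) nodes.*/\1/p')
echo "NUMA nodes reported: $nodes"   # a properly-detected 2-socket box reports 2
```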

(HW = 2 X5560 @ 2.8GHz w/6 cores/siblings per chip...)

And almost certainly, the 6 cores on each chip share a 12MB L3 cache
that wouldn't be "hot" for the other 6 cores.

Any suggestions would be appreciated.

Thanks,
Linda


* Re: 2 physical-cpu (like 2x6core) config and NUMA?
  2012-09-17 18:00 2 physical-cpu (like 2x6core) config and NUMA? Linda Walsh
@ 2012-09-18  5:05 ` Mike Galbraith
  2012-09-18  6:55 ` Jike Song
  1 sibling, 0 replies; 4+ messages in thread
From: Mike Galbraith @ 2012-09-18  5:05 UTC (permalink / raw)
  To: Linda Walsh; +Cc: Linux-Kernel

On Mon, 2012-09-17 at 11:00 -0700, Linda Walsh wrote: 
> [...]
>
> Does the kernel support scheduling based on the different speed of
> memory between "on die" vs. "off die"?  I was surprised to see that
> it viewed my system as 1 NUMA node with all 12 cores on 1 node -- when
> I know that it is physically organized as 2x6.

Yeah, the scheduler will set up for NUMA if the SRAT says the box is NUMA.

I have a 64-core DL980 box that numactl --hardware says is a single
node, but that's due to RAM truly _existing_ only on one node.  Not a
wonderful (or even supported) setup.

If RAM isn't physically plugged into the right spots, or some BIOS
option makes the box appear to be single-node, that's what you'll see
too: (SIBLING maybe,) MC and CPU domains, but no NUMA.
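
Worth ruling out the kernel config too -- NUMA domains only get built
if the NUMA options were compiled in.  A sketch of the check against a
hypothetical config excerpt (on the real box, grep
/boot/config-$(uname -r) or /proc/config.gz):

```shell
# Hypothetical kernel-config excerpt; substitute the box's real config file.
config='CONFIG_NUMA=y
CONFIG_ACPI_NUMA=y
# CONFIG_NUMA_EMU is not set'
for opt in CONFIG_NUMA CONFIG_ACPI_NUMA; do
    if printf '%s\n' "$config" | grep -q "^$opt=y"; then
        echo "$opt: enabled"
    else
        echo "$opt: missing"
    fi
done
```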

-Mike



* Re: 2 physical-cpu (like 2x6core) config and NUMA?
  2012-09-17 18:00 2 physical-cpu (like 2x6core) config and NUMA? Linda Walsh
  2012-09-18  5:05 ` Mike Galbraith
@ 2012-09-18  6:55 ` Jike Song
  2012-09-18 11:04   ` Linda Walsh
  1 sibling, 1 reply; 4+ messages in thread
From: Jike Song @ 2012-09-18  6:55 UTC (permalink / raw)
  To: Linda Walsh; +Cc: Linux-Kernel

On Tue, Sep 18, 2012 at 2:00 AM, Linda Walsh <lkml@tlinx.org> wrote:
> Does the kernel support scheduling based on the different speed of
> memory between "on die" vs. "off die"?  I was surprised to see that
> it viewed my system as 1 NUMA node with all 12 cores on 1 node -- when
> I know that it is physically organized as 2x6.
> Do I have to configure that manually or did I maybe turn off something
> in my kernel config I should have turned on?

Do you have anything printed with:

 # acpidump -a --table SRAT

?

I have a Dell box with 2 physical CPUs, each of which has 6 cores.  If
I enable "Node Interleaving" in the BIOS' Memory Settings and boot the
kernel, the SRAT table will be invisible to Linux -- resulting in the
same topology as you got.  If I disable "Node Interleaving" and boot
the kernel, the SRAT is present, and the topology is correct.
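
The difference also shows up at boot: when the table is there, the
kernel logs SRAT lines in dmesg.  A sketch of the check against a
hypothetical dmesg excerpt (the table address and APIC IDs are made
up; on the real box, run `dmesg | grep -i srat` or look for
/sys/firmware/acpi/tables/SRAT):

```shell
# Hypothetical dmesg lines from a boot where the SRAT was found.
dmesg_sample='ACPI: SRAT 00000000bf778fb8 00150 (v01 DELL   PE_SC3)
SRAT: PXM 0 -> APIC 0x00 -> Node 0
SRAT: PXM 1 -> APIC 0x20 -> Node 1'
if printf '%s\n' "$dmesg_sample" | grep -q '^ACPI: SRAT'; then
    echo "SRAT present"
else
    echo "SRAT missing (node interleaving enabled, or BIOS not exporting it?)"
fi
```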

Hope that helps.

-- 
Thanks,
Jike


* Re: 2 physical-cpu (like 2x6core) config and NUMA?
  2012-09-18  6:55 ` Jike Song
@ 2012-09-18 11:04   ` Linda Walsh
  0 siblings, 0 replies; 4+ messages in thread
From: Linda Walsh @ 2012-09-18 11:04 UTC (permalink / raw)
  To: Jike Song; +Cc: Linux-Kernel

Jike Song wrote:
> Do you have anything printed with:
>
>  # acpidump -a --table SRAT
>   
It prints out a bunch of stuff, but the word SRAT wasn't in the
output.

I don't remember a node-interleaving option in the BIOS -- there's an
ECC mode, where I only get 4/6 slots of usable memory, or an optimized
mode that uses 6/6 slots.  Each processor has its own set of 6 slots
for memory (PowerEdge 610, dual X5660's)...

I wonder if the optimized mode is node interleaving?  But the only
other mode results in taking a 1/3rd memory penalty... Hmm...

I'm wondering about this BIOS.. I had to use pcie_aspm=force to get
some parts to work at all -- like the on-board I/O DMA engine; Linux
didn't even see the hardware without the aspm=force.

Maybe a BIOS update is needed?  Will probe along those lines...

I can always call Dell tech support... still under warranty... ;-)

Thanks for the ideas and confirmation....


> Hope that helps.
>   
Yep!



