linux-kernel.vger.kernel.org archive mirror
* Numaq in 2.4 and 2.6
@ 2003-12-06  9:45 Mika Penttilä
  2003-12-06 11:23 ` William Lee Irwin III
  0 siblings, 1 reply; 8+ messages in thread
From: Mika Penttilä @ 2003-12-06  9:45 UTC
  To: linux-kernel; +Cc: William Lee Irwin III

While comparing NUMA-Q support in 2.4.23 and 2.6.0-test11, I came across 
the following...

In 2.4.23 mpparse.c we do:
    phys_cpu_present_map |= apicid_to_phys_cpu_present(m->mpc_apicid);

and then launch the cpus using NMI and logical addressing in the order 
phys_cpu_present_map indicates.


In 2.6.0-test11 mpparse.c we do:
    tmp = apicid_to_cpu_present(apicid);
    physids_or(phys_cpu_present_map, phys_cpu_present_map, tmp);

where apicid is the result of :
    static inline int generate_logical_apicid(int quad, int phys_apicid)
    {
        return (quad << 4) + (phys_apicid ? phys_apicid << 1 : 1);
    }

and phys_apicid == m->mpc_apicid

Again we launch the cpus using NMI and logical addressing.


So the set of apicids fed to do_boot_cpu() in 2.4 and 2.6 must be 
different when using the same MP table, even though both use logical 
addressing. It seems that 2.4 expects mpc_apicid to be something like 
(quad | cpu), while 2.6 expects only the cpu, with the quad coming from 
the translation table.

So the conclusion is that the same MP table can't work in both 2.4 and 2.6? No?

--Mika




* Re: Numaq in 2.4 and 2.6
  2003-12-06  9:45 Numaq in 2.4 and 2.6 Mika Penttilä
@ 2003-12-06 11:23 ` William Lee Irwin III
  2003-12-06 12:20   ` Mika Penttilä
       [not found]   ` <3FD1C94C.1020104@kolumbus.fi>
  0 siblings, 2 replies; 8+ messages in thread
From: William Lee Irwin III @ 2003-12-06 11:23 UTC
  To: Mika Penttilä; +Cc: linux-kernel

On Sat, Dec 06, 2003 at 11:45:51AM +0200, Mika Penttilä wrote:
> While comparing NUMA-Q support in 2.4.23 and 2.6.0-test11, I came across 
> the following...
> In 2.4.23 mpparse.c we do:
>    phys_cpu_present_map |= apicid_to_phys_cpu_present(m->mpc_apicid);
> and then launch the cpus using NMI and logical addressing in the order 
> phys_cpu_present_map indicates.
> In 2.6.0-test11 mpparse.c we do:
>    tmp = apicid_to_cpu_present(apicid);
>    physids_or(phys_cpu_present_map, phys_cpu_present_map, tmp);
> where apicid is the result of :
>    static inline int generate_logical_apicid(int quad, int phys_apicid)
>    {
>        return (quad << 4) + (phys_apicid ? phys_apicid << 1 : 1);
>    }
> and phys_apicid == m->mpc_apicid
> Again we launch the cpus using NMI and logical addressing.

The sole purposes of this (AFAICT) are for reassigning physical ID's of
the IO-APIC's, and cpu wakeup. You're noticing the first of several
inconsistencies:

(a) The NUMA-Q BIOS stores logical (clustered hierarchical) APIC ID's
	in the MP table instead of physical APIC ID's. This confuses
	various things.
(b) NUMA-Q's are P-III-based, i.e. serial APIC. The global
	phys_cpu_present_map does not suffice to represent them,
	though some sort of mangled physical APIC ID's are kept in it.
	To properly describe serial APIC systems of this kind, there
	needs to be one analogue of phys_cpu_present_map per node, as
	each node has a separate APIC bus with its own domain for
	physical APIC ID's. This explains (a), as it's impossible to
	have distinct physical APIC ID's for > 15 cpus on serial
	APIC-based systems.
(c) The 2.6 code actually decodes the logical APIC ID to generate a
	fake xAPIC-like physical APIC ID and uses that as an index
	into the phys_cpu_present_map. This is used essentially for
	cpu enumeration.
(d) The rest of the setup phys_cpu_present_map is used for is already
	done by the BIOS. The code cheats by ignoring the IO-APIC
	renumbering phase entirely for NUMA_Q. Granted, it's supposed
	to be there to doublecheck the BIOS.


On Sat, Dec 06, 2003 at 11:45:51AM +0200, Mika Penttilä wrote:
> So the set of apicids fed to do_boot_cpu() in 2.4 and 2.6 must be 
> different using the same mp table. And both use logical addressing. 
> Seems that 2.4 expects mpc_apicid to be something like (quad | cpu) and 
> 2.6 only cpu, the quad comes from the translation table.
> The conclusion is that the same mp table can't work in 2.4 and 2.6? No?

It's all okay, albeit ugly and obfuscated as the code doesn't truly
describe the hardware.


-- wli


* Re: Numaq in 2.4 and 2.6
  2003-12-06 11:23 ` William Lee Irwin III
@ 2003-12-06 12:20   ` Mika Penttilä
       [not found]   ` <3FD1C94C.1020104@kolumbus.fi>
  1 sibling, 0 replies; 8+ messages in thread
From: Mika Penttilä @ 2003-12-06 12:20 UTC
  To: William Lee Irwin III; +Cc: linux-kernel



Ok...the only thing that still confuses me is the apicid actually used 
to start the cpu. In the NUMA-Q case we don't program the LDRs in either 
2.4 or 2.6; the BIOS does this. So the NMI IPI must have the same 
destinations in both 2.4 and 2.6 in order to launch the same cpus.

In 2.4, the mpc_apicids are used as-is as NMI IPI destinations. In 
2.6, the mangled ones (by generate_logical_apicid()) are used as NMI IPI 
destinations. If the mpc_apicid is already in a sort of (cluster, cpu) 
format (and used in the 2.4 NMI IPI), it can't be the same after mangling?

--Mika





* Re: Numaq in 2.4 and 2.6
       [not found]   ` <3FD1C94C.1020104@kolumbus.fi>
@ 2003-12-06 12:36     ` William Lee Irwin III
  2003-12-06 13:09       ` Mika Penttilä
  0 siblings, 1 reply; 8+ messages in thread
From: William Lee Irwin III @ 2003-12-06 12:36 UTC
  To: Mika Penttilä; +Cc: linux-kernel

On Sat, Dec 06, 2003 at 02:19:24PM +0200, Mika Penttilä wrote:
> Ok...the only thing that still confuses me is the apicid actually used 
> to start the cpu. In NUMA-Q case we don't program the LDRs in either 2.4 
> or 2.6, the bios does this. So the NMI IPI must have the same 
> destinations in both 2.4 and 2.6 in order to launch the same cpus.

On Sat, Dec 06, 2003 at 02:19:24PM +0200, Mika Penttilä wrote:
> In 2.4, the mpc_apicids are used as such as NMI IPI destinations. In 
> 2.6, the mangled ones (by generate_logical_apicid()) are used as NMI IPI 
> destinations. If the mpc_apicid is already in sort of (cluster, cpu) 
> format (and used in 2.4 NMI IPI), it can't be the same after mangling?

The mangled physical APIC ID used as an index into phys_cpu_present_map
happens to determine the clustered hierarchical logical APIC ID, and so
wakeup_secondary_cpu() (switched via #ifdef) gets the right number.
There is a correspondence between (node, physical APIC ID) pairs and
logical APIC ID's that's part of the BIOS's bootstrap protocol. The
calculations you're looking at are based on that, and the logical APIC
ID's are encoded in that paired format by the BIOS, and in the mangled
format as indices into phys_cpu_present_map.

Both 2.4 and 2.6 use cpu_present_to_apicid() to do that translation on
the fly given an index into phys_cpu_present_map.


-- wli


* Re: Numaq in 2.4 and 2.6
  2003-12-06 13:09       ` Mika Penttilä
@ 2003-12-06 13:07         ` William Lee Irwin III
  2003-12-06 13:23           ` Mika Penttilä
  0 siblings, 1 reply; 8+ messages in thread
From: William Lee Irwin III @ 2003-12-06 13:07 UTC
  To: Mika Penttilä; +Cc: linux-kernel

On Sat, Dec 06, 2003 at 03:09:10PM +0200, Mika Penttilä wrote:
> and later in same function :
> phys_cpu_present_map |= apicid_to_phys_cpu_present(m->mpc_apicid);
> but _not_
> phys_cpu_present_map |= apicid_to_phys_cpu_present(logical_apicid);
> as one would expect (and would make it identical to 2.6 behaviour).... A 
> bug?

You may very well have just debugged an issue I've had with 2.4 on
the things. =)


-- wli


* Re: Numaq in 2.4 and 2.6
  2003-12-06 12:36     ` William Lee Irwin III
@ 2003-12-06 13:09       ` Mika Penttilä
  2003-12-06 13:07         ` William Lee Irwin III
  0 siblings, 1 reply; 8+ messages in thread
From: Mika Penttilä @ 2003-12-06 13:09 UTC
  To: William Lee Irwin III; +Cc: linux-kernel



Thanks, I understand what's happening in 2.6. So I think there might be 
a problem with 2.4.23 then. In mpparse.c:

void __init MP_processor_info (struct mpc_config_processor *m)
{
    int ver, quad, logical_apicid;

    if (!(m->mpc_cpuflag & CPU_ENABLED))
        return;

    logical_apicid = m->mpc_apicid;
    if (clustered_apic_mode == CLUSTERED_APIC_NUMAQ) {
        quad = translation_table[mpc_record]->trans_quad;
        logical_apicid = (quad << 4) +
            (m->mpc_apicid ? m->mpc_apicid << 1 : 1);
        printk("Processor #%d %s APIC version %d (quad %d, apic %d)\n",
            m->mpc_apicid,
            mpc_family((m->mpc_cpufeature & CPU_FAMILY_MASK)>>8 ,
                       (m->mpc_cpufeature & CPU_MODEL_MASK)>>4),
            m->mpc_apicver, quad, logical_apicid);
    .....
and later in the same function:

    phys_cpu_present_map |= apicid_to_phys_cpu_present(m->mpc_apicid);

but _not_

    phys_cpu_present_map |= apicid_to_phys_cpu_present(logical_apicid);

as one would expect (and which would make it identical to the 2.6 
behaviour). A bug?


--Mika










* Re: Numaq in 2.4 and 2.6
  2003-12-06 13:23           ` Mika Penttilä
@ 2003-12-06 13:23             ` William Lee Irwin III
  0 siblings, 0 replies; 8+ messages in thread
From: William Lee Irwin III @ 2003-12-06 13:23 UTC
  To: Mika Penttilä; +Cc: linux-kernel

William Lee Irwin III wrote:
>> You may very well have just debugged an issue I've had with 2.4 on
>> the things. =)

On Sat, Dec 06, 2003 at 03:23:59PM +0200, Mika Penttilä wrote:
> And that would explain my confusion in the first place...I don't have 
> the hardware, so feel free to submit a patch to someone responsible for 
> this area if you like... Anyway thanks for your explanations!

My testing bandwidth isn't particularly high at the moment, but when I
get a chance, I'll try it out and send it in to Marcelo.


-- wli


* Re: Numaq in 2.4 and 2.6
  2003-12-06 13:07         ` William Lee Irwin III
@ 2003-12-06 13:23           ` Mika Penttilä
  2003-12-06 13:23             ` William Lee Irwin III
  0 siblings, 1 reply; 8+ messages in thread
From: Mika Penttilä @ 2003-12-06 13:23 UTC
  To: William Lee Irwin III; +Cc: linux-kernel



And that would explain my confusion in the first place...I don't have 
the hardware, so feel free to submit a patch to someone responsible for 
this area if you like... Anyway thanks for your explanations!

--Mika



