* Numaq in 2.4 and 2.6
From: Mika Penttilä @ 2003-12-06 9:45 UTC
To: linux-kernel; +Cc: William Lee Irwin III

While comparing numaq support in 2.4.23 and 2.6.0-test11 I came across the
following...

In 2.4.23 mpparse.c we do:

	phys_cpu_present_map |= apicid_to_phys_cpu_present(m->mpc_apicid);

and then launch the cpus using NMI and logical addressing in the order
phys_cpu_present_map indicates.

In 2.6.0-test11 mpparse.c we do:

	tmp = apicid_to_cpu_present(apicid);
	physids_or(phys_cpu_present_map, phys_cpu_present_map, tmp);

where apicid is the result of:

	static inline int generate_logical_apicid(int quad, int phys_apicid)
	{
		return (quad << 4) + (phys_apicid ? phys_apicid << 1 : 1);
	}

and phys_apicid == m->mpc_apicid.

Again we launch the cpus using NMI and logical addressing.

So the set of apicids fed to do_boot_cpu() in 2.4 and 2.6 must be different
using the same mp table. And both use logical addressing. It seems that 2.4
expects mpc_apicid to be something like (quad | cpu) and 2.6 only cpu, with
the quad coming from the translation table.

The conclusion is that the same mp table can't work in both 2.4 and 2.6? No?

--Mika
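To make the arithmetic in the question concrete, here is a small sketch that evaluates generate_logical_apicid() exactly as quoted in the message. The sample inputs (quads 0 to 2, MP-table IDs 0, 1, 2 and 4) and the helper print_logical_apicids() are illustrative assumptions, not values read from a real NUMA-Q MP table.

```c
#include <stdio.h>

/* Copied from the 2.6.0-test11 snippet quoted in the message. */
static inline int generate_logical_apicid(int quad, int phys_apicid)
{
	return (quad << 4) + (phys_apicid ? phys_apicid << 1 : 1);
}

/* Hypothetical helper: print the result for a few assumed inputs. */
void print_logical_apicids(void)
{
	/* Assumed sample IDs; a real MP table may hold other values. */
	static const int sample_ids[] = { 0, 1, 2, 4 };
	size_t i;
	int quad;

	for (quad = 0; quad < 3; quad++)
		for (i = 0; i < sizeof(sample_ids) / sizeof(sample_ids[0]); i++)
			printf("quad %d, mpc_apicid %d -> 0x%02x\n", quad,
			       sample_ids[i],
			       generate_logical_apicid(quad, sample_ids[i]));
}
```

With these assumed inputs the low nibble of every result is one-hot (1, 2, 4 or 8) and the high nibble is the quad number, so quad 1 with mpc_apicid 2 yields 0x14.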
* Re: Numaq in 2.4 and 2.6
From: William Lee Irwin III @ 2003-12-06 11:23 UTC
To: Mika Penttilä; +Cc: linux-kernel

On Sat, Dec 06, 2003 at 11:45:51AM +0200, Mika Penttilä wrote:
> While comparing numaq support in 2.4.23 and 2.6.0-test11 I came across the
> following...
> In 2.4.23 mpparse.c we do:
>	phys_cpu_present_map |= apicid_to_phys_cpu_present(m->mpc_apicid);
> and then launch the cpus using NMI and logical addressing in the order
> phys_cpu_present_map indicates.
> In 2.6.0-test11 mpparse.c we do:
>	tmp = apicid_to_cpu_present(apicid);
>	physids_or(phys_cpu_present_map, phys_cpu_present_map, tmp);
> where apicid is the result of:
>	static inline int generate_logical_apicid(int quad, int phys_apicid)
>	{
>		return (quad << 4) + (phys_apicid ? phys_apicid << 1 : 1);
>	}
> and phys_apicid == m->mpc_apicid.
> Again we launch the cpus using NMI and logical addressing.

The sole purposes of this (AFAICT) are for reassigning physical ID's of
the IO-APIC's, and cpu wakeup. You're noticing the first of several
inconsistencies:

(a) The NUMA-Q BIOS stores logical (clustered hierarchical) APIC ID's
	in the MP table instead of physical APIC ID's. This confuses
	various things.
(b) NUMA-Q's are P-III-based, i.e. serial APIC. The global
	phys_cpu_present_map does not suffice to represent the things,
	though some sort of mangled physical APIC ID's are kept in it.
	To properly describe serial APIC systems of its kind, there
	needs to be one analogue of phys_cpu_present_map per node, as
	each node has a separate APIC bus with its own domain for
	physical APIC ID's. This explains (a), as it's impossible to
	have distinct physical APIC ID's for > 15 cpus on serial
	APIC-based systems.
(c) The 2.6 code actually decodes the logical APIC ID to generate a
	fake xAPIC-like physical APIC ID and uses that as an index
	into the phys_cpu_present_map. This is used essentially for
	cpu enumeration.
(d) The rest of the setup that phys_cpu_present_map is used for is
	already done by the BIOS. The code cheats by ignoring the IO-APIC
	renumbering phase entirely for NUMA-Q. Granted, it's supposed
	to be there to doublecheck the BIOS.

On Sat, Dec 06, 2003 at 11:45:51AM +0200, Mika Penttilä wrote:
> So the set of apicids fed to do_boot_cpu() in 2.4 and 2.6 must be
> different using the same mp table. And both use logical addressing.
> It seems that 2.4 expects mpc_apicid to be something like (quad | cpu) and
> 2.6 only cpu, with the quad coming from the translation table.
> The conclusion is that the same mp table can't work in both 2.4 and 2.6? No?

It's all okay, albeit ugly and obfuscated, as the code doesn't truly
describe the hardware.

-- wli
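Point (c) can be sketched roughly as follows. This is a minimal illustration, assuming the logical ID carries the node in its high nibble and a one-hot cpu bit in its low nibble, with four cpus per node. The names decode_logical_apicid() and fake_phys_index() are hypothetical and are not kernel functions.

```c
#define CPUS_PER_NODE 4	/* assumption: one quad holds four cpus */

/*
 * Hypothetical: split a clustered logical APIC ID into node and cpu.
 * Returns 0 on success, -1 if no cpu bit is set in the low nibble.
 */
int decode_logical_apicid(int logical_apicid, int *node, int *cpu)
{
	int low = logical_apicid & 0xf;
	int bit;

	if (!low)
		return -1;
	for (bit = 0; !(low & (1 << bit)); bit++)
		;	/* find the lowest set bit */
	*node = logical_apicid >> 4;
	*cpu = bit;
	return 0;
}

/* Hypothetical: flat, xAPIC-like index usable as a present-map bit. */
int fake_phys_index(int logical_apicid)
{
	int node, cpu;

	if (decode_logical_apicid(logical_apicid, &node, &cpu))
		return -1;
	return cpu + CPUS_PER_NODE * node;
}
```

Under these assumptions, logical ID 0x12 (node 1, second cpu) maps to index 5, so a single flat bitmap can enumerate cpus even though each node has its own physical APIC ID domain.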
* Re: Numaq in 2.4 and 2.6
From: Mika Penttilä @ 2003-12-06 12:20 UTC
To: William Lee Irwin III; +Cc: linux-kernel

William Lee Irwin III wrote:

>On Sat, Dec 06, 2003 at 11:45:51AM +0200, Mika Penttilä wrote:
>
>>While comparing numaq support in 2.4.23 and 2.6.0-test11 I came across the
>>following...
>>In 2.4.23 mpparse.c we do:
>>	phys_cpu_present_map |= apicid_to_phys_cpu_present(m->mpc_apicid);
>>and then launch the cpus using NMI and logical addressing in the order
>>phys_cpu_present_map indicates.
>>In 2.6.0-test11 mpparse.c we do:
>>	tmp = apicid_to_cpu_present(apicid);
>>	physids_or(phys_cpu_present_map, phys_cpu_present_map, tmp);
>>where apicid is the result of:
>>	static inline int generate_logical_apicid(int quad, int phys_apicid)
>>	{
>>		return (quad << 4) + (phys_apicid ? phys_apicid << 1 : 1);
>>	}
>>and phys_apicid == m->mpc_apicid.
>>Again we launch the cpus using NMI and logical addressing.
>
>The sole purposes of this (AFAICT) are for reassigning physical ID's of
>the IO-APIC's, and cpu wakeup. You're noticing the first of several
>inconsistencies:
>
>(a) The NUMA-Q BIOS stores logical (clustered hierarchical) APIC ID's
>	in the MP table instead of physical APIC ID's. This confuses
>	various things.
>(b) NUMA-Q's are P-III-based, i.e. serial APIC. The global
>	phys_cpu_present_map does not suffice to represent the things,
>	though some sort of mangled physical APIC ID's are kept in it.
>	To properly describe serial APIC systems of its kind, there
>	needs to be one analogue of phys_cpu_present_map per node, as
>	each node has a separate APIC bus with its own domain for
>	physical APIC ID's. This explains (a), as it's impossible to
>	have distinct physical APIC ID's for > 15 cpus on serial
>	APIC-based systems.
>(c) The 2.6 code actually decodes the logical APIC ID to generate a
>	fake xAPIC-like physical APIC ID and uses that as an index
>	into the phys_cpu_present_map. This is used essentially for
>	cpu enumeration.
>(d) The rest of the setup that phys_cpu_present_map is used for is
>	already done by the BIOS. The code cheats by ignoring the IO-APIC
>	renumbering phase entirely for NUMA-Q. Granted, it's supposed
>	to be there to doublecheck the BIOS.
>
>On Sat, Dec 06, 2003 at 11:45:51AM +0200, Mika Penttilä wrote:
>
>>So the set of apicids fed to do_boot_cpu() in 2.4 and 2.6 must be
>>different using the same mp table. And both use logical addressing.
>>It seems that 2.4 expects mpc_apicid to be something like (quad | cpu) and
>>2.6 only cpu, with the quad coming from the translation table.
>>The conclusion is that the same mp table can't work in both 2.4 and 2.6? No?
>
>It's all okay, albeit ugly and obfuscated, as the code doesn't truly
>describe the hardware.
>
>-- wli

Ok...the only thing that still confuses me is the apicid actually used
to start the cpu. In the NUMA-Q case we don't program the LDRs in either 2.4
or 2.6; the BIOS does this. So the NMI IPI must have the same
destinations in both 2.4 and 2.6 in order to launch the same cpus.

In 2.4, the mpc_apicids are used as such as NMI IPI destinations. In
2.6, the mangled ones (by generate_logical_apicid()) are used as NMI IPI
destinations. If the mpc_apicid is already in a sort of (cluster, cpu)
format (and used in the 2.4 NMI IPI), it can't be the same after mangling?

--Mika
* Re: Numaq in 2.4 and 2.6
From: William Lee Irwin III @ 2003-12-06 12:36 UTC
To: Mika Penttilä; +Cc: linux-kernel

On Sat, Dec 06, 2003 at 02:19:24PM +0200, Mika Penttilä wrote:
> Ok...the only thing that still confuses me is the apicid actually used
> to start the cpu. In the NUMA-Q case we don't program the LDRs in either 2.4
> or 2.6; the BIOS does this. So the NMI IPI must have the same
> destinations in both 2.4 and 2.6 in order to launch the same cpus.

On Sat, Dec 06, 2003 at 02:19:24PM +0200, Mika Penttilä wrote:
> In 2.4, the mpc_apicids are used as such as NMI IPI destinations. In
> 2.6, the mangled ones (by generate_logical_apicid()) are used as NMI IPI
> destinations. If the mpc_apicid is already in a sort of (cluster, cpu)
> format (and used in the 2.4 NMI IPI), it can't be the same after mangling?

The mangled physical APIC ID used as an index into phys_cpu_present_map
happens to determine the clustered hierarchical logical APIC ID, and so
wakeup_secondary_cpu() (switched via #ifdef) gets the right number.
There is a correspondence between (node, physical APIC ID) pairs and
logical APIC ID's that's part of the BIOS's bootstrap protocol. The
calculations you're looking at are based on that, and the logical APIC
ID's are encoded in that paired format by the BIOS, and in the mangled
format as indices into phys_cpu_present_map.

Both 2.4 and 2.6 use cpu_present_to_apicid() to do that translation on
the fly given an index into phys_cpu_present_map.

-- wli
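The on-the-fly translation from a present-map index back to a wakeup destination might look roughly like the sketch below. It assumes four cpus per node and a logical ID laid out as (node << 4) | one-hot cpu bit. The names index_to_logical_apicid() and collect_destinations() are hypothetical stand-ins, not the kernel's cpu_present_to_apicid().

```c
#define CPUS_PER_NODE 4	/* assumption: one quad holds four cpus */

/* Hypothetical: rebuild the clustered logical APIC ID from a flat index. */
int index_to_logical_apicid(int index)
{
	int node = index / CPUS_PER_NODE;
	int cpu = index % CPUS_PER_NODE;

	return (node << 4) | (1 << cpu);
}

/*
 * Hypothetical: walk a present map and record the logical APIC ID that
 * would be used as the IPI destination for each set bit.
 * Returns the number of destinations written to dest[].
 */
int collect_destinations(unsigned long present_map, int *dest, int max)
{
	int bits = (int)(sizeof(present_map) * 8);
	int index, n = 0;

	for (index = 0; index < bits && n < max; index++)
		if (present_map & (1UL << index))
			dest[n++] = index_to_logical_apicid(index);
	return n;
}
```

Under these assumptions, index 5 translates to 0x12, so the destination depends only on which bit is set in the map and not on the raw MP-table value that produced it.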
* Re: Numaq in 2.4 and 2.6
From: Mika Penttilä @ 2003-12-06 13:09 UTC
To: William Lee Irwin III; +Cc: linux-kernel

William Lee Irwin III wrote:

>On Sat, Dec 06, 2003 at 02:19:24PM +0200, Mika Penttilä wrote:
>
>>Ok...the only thing that still confuses me is the apicid actually used
>>to start the cpu. In the NUMA-Q case we don't program the LDRs in either 2.4
>>or 2.6; the BIOS does this. So the NMI IPI must have the same
>>destinations in both 2.4 and 2.6 in order to launch the same cpus.
>
>On Sat, Dec 06, 2003 at 02:19:24PM +0200, Mika Penttilä wrote:
>
>>In 2.4, the mpc_apicids are used as such as NMI IPI destinations. In
>>2.6, the mangled ones (by generate_logical_apicid()) are used as NMI IPI
>>destinations. If the mpc_apicid is already in a sort of (cluster, cpu)
>>format (and used in the 2.4 NMI IPI), it can't be the same after mangling?
>
>The mangled physical APIC ID used as an index into phys_cpu_present_map
>happens to determine the clustered hierarchical logical APIC ID, and so
>wakeup_secondary_cpu() (switched via #ifdef) gets the right number.
>There is a correspondence between (node, physical APIC ID) pairs and
>logical APIC ID's that's part of the BIOS's bootstrap protocol. The
>calculations you're looking at are based on that, and the logical APIC
>ID's are encoded in that paired format by the BIOS, and in the mangled
>format as indices into phys_cpu_present_map.
>
>Both 2.4 and 2.6 use cpu_present_to_apicid() to do that translation on
>the fly given an index into phys_cpu_present_map.
>
>-- wli

Thanks, I understand what's happening in 2.6. So I think there might be
a problem with 2.4.23 then.

In mpparse.c:

	void __init MP_processor_info (struct mpc_config_processor *m)
	{
		int ver, quad, logical_apicid;

		if (!(m->mpc_cpuflag & CPU_ENABLED))
			return;

		logical_apicid = m->mpc_apicid;
		if (clustered_apic_mode == CLUSTERED_APIC_NUMAQ) {
			quad = translation_table[mpc_record]->trans_quad;
			logical_apicid = (quad << 4) +
				(m->mpc_apicid ? m->mpc_apicid << 1 : 1);
			printk("Processor #%d %s APIC version %d (quad %d, apic %d)\n",
				m->mpc_apicid,
				mpc_family((m->mpc_cpufeature & CPU_FAMILY_MASK)>>8 ,
					   (m->mpc_cpufeature & CPU_MODEL_MASK)>>4),
				m->mpc_apicver, quad, logical_apicid);
	.....

and later in the same function:

	phys_cpu_present_map |= apicid_to_phys_cpu_present(m->mpc_apicid);

but _not_

	phys_cpu_present_map |= apicid_to_phys_cpu_present(logical_apicid);

as one would expect (and which would make it identical to the 2.6
behaviour).... A bug?

--Mika
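The effect of passing the raw mpc_apicid instead of logical_apicid can be illustrated with a small sketch. The real 2.4.23 apicid_to_phys_cpu_present() is not shown in the thread, so present_bit() below is an assumed stand-in that decodes a clustered logical ID (node in the high nibble, one-hot cpu bit in the low nibble, four cpus per node). Both function names here are hypothetical.

```c
#define CPUS_PER_NODE 4	/* assumption: one quad holds four cpus */

/*
 * Assumed stand-in for the present-map helper: turn a clustered
 * logical APIC ID into a single map bit. Returns 0 if the low nibble
 * has no cpu bit set.
 */
unsigned long present_bit(int apicid)
{
	int low = apicid & 0xf;
	int cpu = 0;

	if (!low)
		return 0;
	while (!(low & (1 << cpu)))
		cpu++;
	return 1UL << (cpu + CPUS_PER_NODE * (apicid >> 4));
}

/* Same arithmetic as the logical_apicid computation quoted above. */
int make_logical_apicid(int quad, int mpc_apicid)
{
	return (quad << 4) + (mpc_apicid ? mpc_apicid << 1 : 1);
}
```

Under these assumptions, a cpu on quad 1 with mpc_apicid 2 sets bit 1 when the raw value is passed but bit 6 when the logical value is passed, and a raw value of 0 sets no bit at all. Whether the real 2.4 helper behaves this way is not established by the thread.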
* Re: Numaq in 2.4 and 2.6
From: William Lee Irwin III @ 2003-12-06 13:07 UTC
To: Mika Penttilä; +Cc: linux-kernel

On Sat, Dec 06, 2003 at 03:09:10PM +0200, Mika Penttilä wrote:
> and later in the same function:
>	phys_cpu_present_map |= apicid_to_phys_cpu_present(m->mpc_apicid);
> but _not_
>	phys_cpu_present_map |= apicid_to_phys_cpu_present(logical_apicid);
> as one would expect (and which would make it identical to the 2.6
> behaviour).... A bug?

You may very well have just debugged an issue I've had with 2.4 on
the things. =)

-- wli
* Re: Numaq in 2.4 and 2.6
From: Mika Penttilä @ 2003-12-06 13:23 UTC
To: William Lee Irwin III; +Cc: linux-kernel

William Lee Irwin III wrote:

>On Sat, Dec 06, 2003 at 03:09:10PM +0200, Mika Penttilä wrote:
>
>>and later in the same function:
>>	phys_cpu_present_map |= apicid_to_phys_cpu_present(m->mpc_apicid);
>>but _not_
>>	phys_cpu_present_map |= apicid_to_phys_cpu_present(logical_apicid);
>>as one would expect (and which would make it identical to the 2.6
>>behaviour).... A bug?
>
>You may very well have just debugged an issue I've had with 2.4 on
>the things. =)

And that would explain my confusion in the first place...I don't have
the hardware, so feel free to submit a patch to someone responsible for
this area if you like... Anyway, thanks for your explanations!

--Mika
* Re: Numaq in 2.4 and 2.6
From: William Lee Irwin III @ 2003-12-06 13:23 UTC
To: Mika Penttilä; +Cc: linux-kernel

William Lee Irwin III wrote:
>> You may very well have just debugged an issue I've had with 2.4 on
>> the things. =)

On Sat, Dec 06, 2003 at 03:23:59PM +0200, Mika Penttilä wrote:
> And that would explain my confusion in the first place...I don't have
> the hardware, so feel free to submit a patch to someone responsible for
> this area if you like... Anyway, thanks for your explanations!

My testing bandwidth isn't particularly high at the moment, but when I
get a chance, I'll try it out and send it in to Marcelo.

-- wli