> In message <3FD7F1B9.5080100@cyberone.com.au> you write:
> > http://www.kerneltrap.org/~npiggin/w26/
> > Against 2.6.0-test11
> >
> > This includes the SMT description for P4. Initial results show
> > comparable performance to Ingo's shared runqueue patch on a dual
> > P4 Xeon.
>
> I'm still not convinced. Sharing runqueues is simple, and in fact
> exactly what you want for HT: you want to balance *runqueues*, not
> CPUs. In fact, it can be done without a CONFIG_SCHED_SMT addition.
>
> Your patch is more general, more complex, but doesn't actually seem
> to buy anything. It puts a general domain structure inside the
> scheduler, without putting it anywhere else which wants it (eg. slab
> cache balancing). My opinion is either (1) produce a general NUMA
> topology which can then be used by the scheduler, or (2) do the
> minimal change in the scheduler which makes HT work well.

FWIW, here is a patch I was working on a while back, to have
multilevel NUMA hierarchies (based on arch-specific NUMA topology)
and, more importantly, a runqueue-centric point of view for all the
load balance routines. The patch is quite rough, and I have not
looked at it in a while, but maybe it could help someone. Also, with
a runqueue-centric approach, shared runqueues should just "work", so
adding that to this patch should be fairly clean.

One more thing: we are missing some stuff in the NUMA topology, which
Rusty mentions in another email, like core arrangement, arch states,
and cache locations/types. All of that should eventually make it into
some sort of topology, be it the NUMA topology stuff or something
more generic like sysfs. Right now the scheduler is a bit limited in
what it can look at, just cpu_to_node type stuff...

-Andrew Theurer
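
P.S. To make "runqueue centric" a bit more concrete, here is a tiny
userspace sketch of the general shape, not the patch itself; all the
names here (balance_domain, find_busiest_rq, balance_rq) are made up
for illustration:

/*
 * Toy model of runqueue-centric, multilevel load balancing.
 * Userspace sketch only; compile with: gcc -Wall sketch.c
 */
#include <stdio.h>

#define NR_CPUS 4

struct runqueue {
        int cpu;
        int nr_running;             /* load metric: runnable tasks */
};

/*
 * One balance level. Each runqueue walks its domains bottom-up
 * (e.g. SMT siblings, then a node, then the whole box). Shared
 * runqueues fall out naturally: two siblings can simply point at
 * the same struct runqueue.
 */
struct balance_domain {
        const char *name;
        struct runqueue **rqs;      /* runqueues in this domain */
        int nr_rqs;
        struct balance_domain *parent;
};

/* Busiest runqueue in a domain, other than 'this_rq'. */
static struct runqueue *find_busiest_rq(struct balance_domain *dom,
                                        struct runqueue *this_rq)
{
        struct runqueue *busiest = NULL;
        int i;

        for (i = 0; i < dom->nr_rqs; i++) {
                struct runqueue *rq = dom->rqs[i];
                if (rq == this_rq)
                        continue;
                if (!busiest || rq->nr_running > busiest->nr_running)
                        busiest = rq;
        }
        return busiest;
}

/*
 * Pull one task's worth of load toward this_rq, trying the closest
 * (lowest) domain first and escalating to the parent only if nothing
 * local is imbalanced enough.
 */
static void balance_rq(struct runqueue *this_rq,
                       struct balance_domain *dom)
{
        for (; dom; dom = dom->parent) {
                struct runqueue *busiest = find_busiest_rq(dom, this_rq);

                if (busiest &&
                    busiest->nr_running > this_rq->nr_running + 1) {
                        busiest->nr_running--;
                        this_rq->nr_running++;
                        printf("pulled from cpu%d to cpu%d at level %s\n",
                               busiest->cpu, this_rq->cpu, dom->name);
                        return;
                }
        }
}

int main(void)
{
        struct runqueue rq[NR_CPUS] = {
                { 0, 0 }, { 1, 5 }, { 2, 2 }, { 3, 2 },
        };
        struct runqueue *all[] = { &rq[0], &rq[1], &rq[2], &rq[3] };
        struct runqueue *sibs[] = { &rq[0], &rq[1] };

        struct balance_domain sys  = { "system", all, 4, NULL };
        struct balance_domain core = { "smt",   sibs, 2, &sys };

        balance_rq(&rq[0], &core);  /* idle cpu0 pulls from its sibling first */
        return 0;
}

The point is that balancing is phrased as "this runqueue pulls from
the busiest runqueue in each enclosing domain, closest level first",
so whether two SMT siblings get their own runqueues or share one is
just a detail of how the bottom-level domain is wired up.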