Here is a simple benchmark which is NUMA sensitive and simulates a simple but normal situation in an environment running number crunching jobs. It starts N independent tasks which access a large array in a random manner. This is both bandwidth and latency sensitive. The output shows on which node(s) the tasks have spent their lives. Additionally it shows (on a NUMA scheduler kernel) the homenode (iSched). Could you please run it on the virgin kernel and on the "Both numasched patches, hack node IDs, alloc from current->node" one? Maybe we see what's wrong... Thanks, Erich