On Saturday 21 September 2002 18:46, Martin J. Bligh wrote:
> > Hmmm .... well I ran the One True Benchmark (tm). The patch
> > *increased* my kernel compile time from about 20s to about 28s.
> > Not sure I like that idea ;-) Anything you'd like tweaked, or
> > more info? Both user and system time were up ... I'll grab a
> > profile of kernel stuff.
>
> From the below, I'd suggest you're getting pages off the wrong
> nodes: do_anonymous_page is page zeroing, and rmqueue the buddy
> allocator. Are you sure the current->node thing is getting set
> correctly? I'll try backing out your alloc_pages tweaking, and
> see what happens.

Could you please check in dmesg whether the CPU pools are
initialised correctly? Maybe something goes wrong on your platform.

The node_distance is most probably non-optimal for NUMAQ and might
need some tuning. The default is set for a maximum of 8 nodes, with
nodes 1-4 and 5-8 in separate supernodes and latency ratios of
1:1.5:2.

You could use the attached patch to get an idea of the load
distribution. It's a quick&dirty hack which creates the files

/proc/sched/load/rqNN     : load of the RQs, including info on tasks
                            not running on their homenode
/proc/sched/history/ilbNN : history of the last 25 initial load
                            balancing decisions for runqueue NN
/proc/sched/history/lbNN  : last 25 load balancing decisions on rq NN

It should be possible to find the reason for the poor performance by
looking at the nr_homenode entries in /proc/sched/load/rqNN.

Thanks,
best regards,
Erich
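
For readers following the thread, the default node_distance layout described in the mail could be sketched roughly as below. This is not the code from the patch. It assumes that the ratios 1:1.5:2 mean same node : other node in the same supernode : node in the other supernode, that distances are scaled by 10 to stay in integers, and that nodes are numbered from 0 (the mail counts 1-8). The macro names and the function signature are hypothetical.

```c
/* Hypothetical sketch of the default described in the mail;
 * not taken from the actual patch. */
#define MAX_NODES        8   /* default maximum mentioned in the mail */
#define NODES_PER_SUPER  4   /* nodes 1-4 and 5-8 form one supernode each */

/* Ratios 1 : 1.5 : 2, scaled by 10 (the scaling is an assumption). */
#define DIST_LOCAL      10   /* same node */
#define DIST_SUPERNODE  15   /* different node, same supernode (assumed) */
#define DIST_REMOTE     20   /* node in the other supernode (assumed) */

/* Nodes are numbered 0..MAX_NODES-1 here. */
int node_distance(int a, int b)
{
    if (a == b)
        return DIST_LOCAL;
    if (a / NODES_PER_SUPER == b / NODES_PER_SUPER)
        return DIST_SUPERNODE;
    return DIST_REMOTE;
}
```

On a machine with a different topology, such as NUMAQ, the grouping and the three constants are the values that would presumably need tuning.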