> On 2012.06.14 08:43 - 0700, Charles Wang wrote:

Just trying to catch up...

> I re-create the problem by waiter, using following parameters:
> ./waiter 16 1800 900 9444 100 1

That lists output at a very high rate and has a sleep frequency of
about 5 kilohertz per process (at least on my computer).
I suggest something like this instead:

./waiter 16 1800 9000 94444 100 1

which still makes your point.

> 9444 and 100 is the key point that can make
> "processing time+sleeping time" less than 1 tick.
> We have 16 processors, and the load of every processor is about 0.25~0.3,
> so the actual load should be 16*0.25=4. But the loadavg calculated by the
> current logic is only about 1.

O.K., finally I understand. Your application has an extremely high rate
of switching between processing and sleeping or waiting for something:
on the order of thousands of hertz, far exceeding the basic tick rate
of default kernel compiles.

Yes, under such conditions it cannot be expected that Reported Load
Averages would be accurate. The kernel would have to be re-compiled
with a much higher basic tick rate.

In February/March, as part of testing the fix for the issue of reported
load averages being way, way, way too low, I tested operation under
conditions where even a tick-based kernel would be expected to start
breaking down. All I cared about in those tests was that the tickless
kernel started to degrade in a similar fashion to the tick-based kernel,
which it did [1]. I was only going to around 420 hertz per process, and
now you are going even higher.

However, based on your findings, I was able to re-create Reported Load
Averages that are too low under conditions (low enough frequencies)
where we would expect things to work properly. See the attached PNG
file, also posted at [2]. (Kernel 3.5 RC2 on an i7 processor with 8 cpus.)

I have back edited Peter's patch (from a subsequent e-mail) into my
working kernel (3.2 based), and will test over the weekend.
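
For reference, the arithmetic behind the "should be 4, reads about 1"
comparison can be sketched in a few lines. This is only a rough Python
model of the fixed-point exponential averaging that the kernel applies
to its sampled active task count every 5 seconds. The constants
(FSHIFT = 11, EXP_1 = 1884) are my understanding of the kernel's
calc_load() and should be checked against the tree in use; the helper
settle() and the constant task counts are hypothetical, for
illustration. It does not model the tickless sampling problem itself,
only what the 1 minute average settles to for a given sampled count.

```python
# Rough model of the kernel's fixed-point load average update.
# Constants are ASSUMED to match the kernel headers; verify against your tree.
FSHIFT = 11                 # bits of fixed-point precision
FIXED_1 = 1 << FSHIFT       # 1.0 in fixed point
EXP_1 = 1884                # assumed 1 minute decay factor per 5 s sample


def calc_load(load, exp, active):
    """One update step: decay the old value and blend in the new sample."""
    load = load * exp + active * (FIXED_1 - exp)
    load += 1 << (FSHIFT - 1)   # round to nearest
    return load >> FSHIFT


def settle(nr_active, samples=500):
    """Hypothetical helper: feed a constant sampled task count and return
    the 1 minute average (as a float) after `samples` updates."""
    load = 0
    for _ in range(samples):
        load = calc_load(load, EXP_1, nr_active * FIXED_1)
    return load / FIXED_1


if __name__ == "__main__":
    expected = 16 * 0.25    # 16 cpus at about 0.25 each, from Charles' numbers
    print("expected load:", expected)
    print("settles to (4 tasks sampled):", round(settle(4), 2))
    print("settles to (1 task sampled): ", round(settle(1), 2))
```

In this model the average settles within about 0.01 of whatever task
count the samples see. So if the sampling only ever catches about one
runnable task out of the expected four, the reported 1 minute average
sits near 1 instead of 4, which is the gap Charles describes.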

@Charles: Early next week, maybe we can compare results from your tests.
I'll also try your solutions on my computer if you post the code
(either on list or off list).

[1] http://www.smythies.com/~doug/network/load_average/original.html#higher_freq
[2] http://www.smythies.com/~doug/network/load_average/high_freq_35rc2.png