On 17.04.11 04:53:32, Peter Zijlstra wrote:
> On Sun, 2011-04-17 at 10:18 +0200, Ingo Molnar wrote:
> > So with 6 counters it would be a loop of 720, with 8 counters a loop of 40320,
> > with 10 counters a loop of 3628800 ... O(n!) is not fun.
>
> Right, and we'll hit this case at least once when scheduling a
> over-committed system. Intel Sandy Bridge can have 8 counters per core +
> 3 fixed counters, giving an n=11 situation. You do _NOT_ want to have
> one 39916800 cycle loop before we determine the PMU isn't schedulable,
> that's simply unacceptable.

Of course it is not that much, as the algorithm is already optimized and
we only walk through the possible ways. Also, the more constraints we
have, the less we have to walk. So let's assume a worst case of 8
unconstrained counters. I reimplemented the algorithm in the attached
Perl script and counted 251 loops. I got the following numbers,
depending on the number of counters:

 $ perl counter-scheduling.pl | grep Num
 Number of counters: 2, loops: 10, redos: 4, ratio: 2.5
 Number of counters: 3, loops: 26, redos: 7, ratio: 3.7
 Number of counters: 4, loops: 53, redos: 11, ratio: 4.8
 Number of counters: 5, loops: 89, redos: 15, ratio: 5.9
 Number of counters: 6, loops: 134, redos: 19, ratio: 7.1
 Number of counters: 7, loops: 188, redos: 23, ratio: 8.2
 Number of counters: 8, loops: 251, redos: 27, ratio: 9.3
 Number of counters: 9, loops: 323, redos: 31, ratio: 10.4
 Number of counters: 10, loops: 404, redos: 35, ratio: 11.5
 Number of counters: 11, loops: 494, redos: 39, ratio: 12.7
 Number of counters: 12, loops: 593, redos: 43, ratio: 13.8

It seems the algorithm is about number-of-counters times slower than the
current one. I think this is worth some further consideration. There is
also some room for improvement in my algorithm.

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center
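
For reference, the gap between the n! figures quoted in the thread and the loop counts reported above can be tabulated with a few lines of Python. This is only a sketch: it is not the attached Perl script and does not reimplement the scheduling algorithm. It copies the numbers from the output above and assumes, based on the values themselves, that the "ratio" column is loops divided by redos. The names MEASURED and compare are made up for this illustration.

```python
import math

# (counters, loops, redos, ratio) copied from the script output in this mail.
MEASURED = [
    (2, 10, 4, 2.5),
    (3, 26, 7, 3.7),
    (4, 53, 11, 4.8),
    (5, 89, 15, 5.9),
    (6, 134, 19, 7.1),
    (7, 188, 23, 8.2),
    (8, 251, 27, 9.3),
    (9, 323, 31, 10.4),
    (10, 404, 35, 11.5),
    (11, 494, 39, 12.7),
    (12, 593, 43, 13.8),
]


def compare(rows):
    """Return (n, n!, loops, loops/redos) for each measured row.

    Assumption: the reported ratio is loops / redos, rounded to one decimal.
    """
    return [
        (n, math.factorial(n), loops, round(loops / redos, 1))
        for n, loops, redos, _reported in rows
    ]


if __name__ == "__main__":
    for n, fact, loops, ratio in compare(MEASURED):
        print(f"{n:2d} counters: n! = {fact:>9d}, "
              f"loops = {loops:3d}, loops/redos = {ratio}")
```

Under that assumption the recomputed ratios agree with the reported column, and for n=11 the table sets 494 loops against the 39916800 mentioned in the quoted text.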