On Sat, 16 Sep 2017, Linus Torvalds wrote:
> On Sat, Sep 16, 2017 at 10:35 AM, Thomas Gleixner wrote:
> >
> > Don't bother. I found it already. On UP we have:
> >
> > #define for_each_cpu(cpu, mask)                 \
> >         for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)
> >
> > which is a total fail as it breaks any code which uses for_each_cpu() or
> > any of the other variants on UP by assuming that all cpumask have bit 0
> > set.
>
> It's fairly fundamental. UP assumes that all CPU masks are always that
> "one CPU set". Not just here - everywhere.
>
> I guess we could somehow try to move away from that, but really, the
> assumption of fixed masks ends up simplifying the code generation a
> lot, so it made tons of sense back when UP was a primary target.
>
> So it's an approach that is somewhat historical, but I'm not sure it's
> worth re-visiting that old decision. People should simply not expect
> to traverse over empty masks in anything that is UP.
>
> So I suspect your perf fix is the right one, and maybe we could/should
> just make people more aware of the empty cpumask issue with UP.

Right, I just got a bit frightened as I really was not aware of that
'optimization', which means that so far I was just lucky not to trip over it.

Thanks,

	tglx
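
For anyone who wants to see the effect outside the kernel, here is a minimal user-space sketch of the UP behaviour discussed above. The names `struct cpumask_sketch`, `for_each_cpu_up()`, `visits_up()` and `visits_guarded()` are made up for illustration and are not kernel API. The macro body is copied from the quoted UP definition. The guarded variant shows one possible caller-side check for an empty mask; it is an assumption, not necessarily what the perf fix does.

```c
/* Hypothetical stand-in for the kernel's struct cpumask: one word of bits. */
struct cpumask_sketch {
	unsigned long bits;
};

/*
 * Same shape as the UP definition quoted in the thread: the mask is
 * evaluated (to avoid unused warnings) but its bits are never tested.
 */
#define for_each_cpu_up(cpu, mask)			\
	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)

/* Count how often the loop body runs; the mask content is ignored. */
static int visits_up(const struct cpumask_sketch *mask)
{
	int cpu, visits = 0;

	for_each_cpu_up(cpu, mask)
		visits++;

	return visits;
}

/*
 * Illustrative caller-side guard (an assumption, not the actual fix):
 * bail out before iterating when no bit is set.
 */
static int visits_guarded(const struct cpumask_sketch *mask)
{
	int cpu, visits = 0;

	if (mask->bits == 0)
		return 0;

	for_each_cpu_up(cpu, mask)
		visits++;

	return visits;
}
```

With an all-zero mask, `visits_up()` still runs the body once for CPU 0, while `visits_guarded()` skips the loop. That is the empty cpumask issue on UP in a nutshell.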