On Sat, Mar 4, 2023 at 12:51 PM Yury Norov wrote:
>
> > That particular code sequence is arguably broken to begin with.
> > setall() should really only be used as a mask, most definitely not as
> > some kind of "all possible cpus".
>
> Sorry, don't understand this.

See the example patch I sent out.

Literally just make the rule be "we play games with cpumasks in that
they have two different 'sizes', so just make sure the bits in the
bigger and faster size are always clear".

That simple rule just means that we can then use that bigger constant
size in all cases where "upper bits zero" just doesn't matter.

Which is basically all of them.

Your for_each_cpu_not() example is actually a great example: it should
damn well not exist at all. I hadn't even noticed how broken it was.
Exactly like the other broken case (that I *did* notice -
cpumask_complement), it has no actual valid users. It _literally_ only
exists as a pointless test-case.

So this is *literally* what I'm talking about: you are making up silly
cases that then act as "arguments" for making all the _real_ cases
slower.

Stop it.

Silly useless cases are just that - silly and useless. They should not
be arguments against the real cases being optimized and simplified.

Updated patch to remove 'for_each_cpu_not()' attached.

It's still completely untested. Treat this very much as a "Let's make
the common cases faster, at least for !MAXSMP".

               Linus
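
For illustration only, the "two sizes, upper bits always clear" rule
above could be sketched roughly like this. This is a toy model, not the
kernel's cpumask code and not the attached patch: every toy_* name, the
256-bit constant size and the runtime toy_nr_ids variable are
assumptions made up for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model of the rule; none of these names are kernel API. */
#define TOY_NBITS         256   /* compile-time "bigger" size (think NR_CPUS) */
#define TOY_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define TOY_WORDS         ((TOY_NBITS + TOY_BITS_PER_WORD - 1) / TOY_BITS_PER_WORD)

struct toy_mask {
	unsigned long bits[TOY_WORDS];
};

/* Runtime "smaller" size (think nr_cpu_ids); assumed <= TOY_NBITS. */
static unsigned int toy_nr_ids = TOY_NBITS;

/*
 * The one place that must respect the small size: set only the bits
 * below toy_nr_ids, so everything above them stays zero.
 */
static void toy_setall(struct toy_mask *m)
{
	unsigned int i;

	assert(toy_nr_ids <= TOY_NBITS);
	memset(m, 0, sizeof(*m));
	for (i = 0; i < toy_nr_ids; i++)
		m->bits[i / TOY_BITS_PER_WORD] |= 1UL << (i % TOY_BITS_PER_WORD);
}

/*
 * AND over the full constant size: the loop bound is known at compile
 * time, and zero upper bits in the inputs give zero upper bits out.
 */
static void toy_and(struct toy_mask *dst, const struct toy_mask *a,
		    const struct toy_mask *b)
{
	size_t w;

	for (w = 0; w < TOY_WORDS; w++)
		dst->bits[w] = a->bits[w] & b->bits[w];
}

/* Population count over the full constant size; upper zeros add nothing. */
static unsigned int toy_weight(const struct toy_mask *m)
{
	unsigned int n = 0;
	size_t w;

	for (w = 0; w < TOY_WORDS; w++) {
		unsigned long v = m->bits[w];

		while (v) {
			v &= v - 1;	/* clear the lowest set bit */
			n++;
		}
	}
	return n;
}
```

In this toy model a plain word-wise complement would be the odd one out:
it would set the upper bits and break the invariant, which is the same
kind of problem as the complement-style helpers mentioned above.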