[ Adding a few more people that tend to be involved in signal handling.
  Just in case - even if they probably don't care ]

On Mon, Feb 24, 2020 at 12:09 PM Linus Torvalds wrote:
>
> TOTALLY UNTESTED patch attached. It may be completely buggy garbage,
> but it _looks_ trivial enough.

I've tested it, and the profiles on the silly microbenchmark look much
nicer. Now it's just the sigpending update that shows up. The refcount
case clearly still occasionally happens, but it's now in the noise.

I made slight changes to the __sigqueue_alloc() case to generate better
code: since we now use that atomic_inc_return() anyway, we might as well
then use the value that is returned for the RLIMIT_SIGPENDING check too,
instead of reading it again.

That might avoid another potential cacheline bounce, plus the generated
code just looks better.

Updated (and now slightly tested!) patch attached.

It would be interesting if this is noticeable on your benchmark numbers.
I didn't actually _time_ anything, I just looked at profiles. But my
setup clearly isn't going to see the horrible contention case anyway, so
my timing numbers wouldn't be all that interesting.

              Linus
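
To illustrate the __sigqueue_alloc() point above (reusing the value
returned by the atomic increment for the limit check rather than reading
the counter a second time), here is a minimal userspace sketch. It is
not the attached patch: it uses C11 <stdatomic.h> instead of the kernel's
atomic_t, and the names user_counts and reserve_sigpending() are
hypothetical, made up for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-user pending-signal counter. */
struct user_counts {
    atomic_long sigpending;
};

/*
 * Try to account for one more pending signal.
 *
 * atomic_fetch_add() returns the value from before the addition, so
 * adding 1 gives the post-increment value, which is what the kernel's
 * atomic_inc_return() hands back. That single result feeds the limit
 * check, so the counter is never loaded a second time.
 *
 * Returns true if the slot was reserved. On failure the increment is
 * undone and false is returned.
 */
static bool reserve_sigpending(struct user_counts *u, long limit,
                               bool override_limit)
{
    long pending = atomic_fetch_add(&u->sigpending, 1) + 1;

    if (override_limit || pending <= limit)
        return true;

    /* Over the limit: drop the reservation we just took. */
    atomic_fetch_sub(&u->sigpending, 1);
    return false;
}
```

The alternative, incrementing first and then doing a separate
atomic_load() for the check, touches the shared counter twice, and under
contention the second access can pull the cacheline back from another
CPU. Reusing the returned value avoids that extra access.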