On Sun, Dec 8, 2019 at 10:04 AM Linus Torvalds wrote:
>
> On Sun, Dec 8, 2019 at 8:46 AM Akemi Yagi wrote:
> >
> > I forgot to mention that the aforementioned bug in make was originally
> > reported for Fedora. They now have a patched version of make
> > (make-4.2.1-15.fc32) in Rawhide.
>
> Yes, as mentioned I do expect it's some jobserver interaction - I've
> seen bad jobserver behavior before. But my 'make' hasn't been upgraded
> since May, as far as I can tell from my logs, so the huge performance
> regression was new.

Interestingly, when I'm trying to do the non-thundering-herd pipe reads
and writes, a side effect of that is that the read target is "fair"
(ie when there are multiple concurrent waiting read() calls, it's
always the first one that gets it).

And that seems to really mess up the jobserver bug, and triggers it
every single time.

Fairness is often bad for locking throughput, but this is on another
level entirely, and yeah, I see a lot of "defunct" shell and make
processes, so it does seem to be the make bug.

(Adding scheduler people to the participants list, because this patch
is the "avoid using the return value of
wait_event_interruptible_exclusive()" version because I think
wait_event_interruptible_exclusive() is itself buggy).

I built a new version of 'make' from source, and that one seems to
work fine with the attached patch. But the standard Fedora 'make'
binary completely hates this patch.

Sad. The patch does seem to be the RightThing(tm) to do, but this make
bug makes it inadvisable to apply it unless you want to play with
things.

                Linus
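
For readers unfamiliar with the jobserver mentioned above: GNU make limits parallel jobs by sharing a pipe preloaded with one-byte tokens. A job reads a token before it starts and writes it back when it finishes, so many processes can end up blocked in read() on the same pipe, which is where the wakeup order discussed in this mail comes in. The following is only a rough Python sketch of that token protocol, not make's code and not the kernel patch. The name `run_jobs`, the "+" token byte, and the use of threads instead of separate processes are all assumptions made for illustration.

```python
import os
import threading


def run_jobs(num_tokens, num_jobs):
    """Run num_jobs workers, letting at most num_tokens run at once.

    Hypothetical helper: mimics a jobserver-style token pipe with threads.
    """
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"+" * num_tokens)  # preload the pipe with tokens

    lock = threading.Lock()
    state = {"running": 0, "peak": 0, "done": 0}

    def worker():
        # Blocks until a token is available. With many waiters on the same
        # pipe, the kernel decides which blocked reader gets the next byte.
        token = os.read(read_fd, 1)
        try:
            with lock:
                state["running"] += 1
                state["peak"] = max(state["peak"], state["running"])
            # ... the real job (compiler, sub-make, ...) would run here ...
            with lock:
                state["running"] -= 1
                state["done"] += 1
        finally:
            os.write(write_fd, token)  # always hand the token back

    threads = [threading.Thread(target=worker) for _ in range(num_jobs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    os.close(read_fd)
    os.close(write_fd)
    return state["peak"], state["done"]


if __name__ == "__main__":
    peak, done = run_jobs(num_tokens=2, num_jobs=8)
    print("peak concurrency:", peak, "jobs finished:", done)
```

The sketch only shows why the number of running jobs can never exceed the number of tokens. It does not reproduce the make bug itself, which involves how a particular make binary handles its token reads.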