Hello Christophe, On Sat, 2020-03-28 at 10:19 +0000, Christophe Leroy wrote: > Hi Leonardo, > > > > On 03/27/2020 03:51 PM, Leonardo Bras wrote: > > > > > [SNIP] > > - If the lock is already free, it would change nothing, > > - Otherwise, the lock will wait. > > - Waiting cycle just got bigger. > > - Worst case scenario: running one more cycle, given lock->slock can > > turn to 0 just after checking.Could you please point where I failed to see the performance penalty? > > (I need to get better at this :) ) > > You are right that when the lock is free, it changes nothing. However > when it is not, it is not just one cycle. Sorry, what I meant here is one "waiting cycle", meaning that in WCS there would be 1 extra iteration on that while. Or it would 'spin' one more time. > > Here is arch_spin_lock() without your patch: > > 00000440 : > 440: 39 40 00 01 li r10,1 > 444: 7d 20 18 28 lwarx r9,0,r3 > 448: 2c 09 00 00 cmpwi r9,0 > 44c: 40 82 00 10 bne 45c > 450: 7d 40 19 2d stwcx. r10,0,r3 > 454: 40 a2 ff f0 bne 444 > 458: 4c 00 01 2c isync > 45c: 2f 89 00 00 cmpwi cr7,r9,0 > 460: 4d be 00 20 bclr+ 12,4*cr7+eq > 464: 7c 21 0b 78 mr r1,r1 > 468: 81 23 00 00 lwz r9,0(r3) > 46c: 2f 89 00 00 cmpwi cr7,r9,0 > 470: 40 be ff f4 bne cr7,464 > 474: 7c 42 13 78 mr r2,r2 > 478: 7d 20 18 28 lwarx r9,0,r3 > 47c: 2c 09 00 00 cmpwi r9,0 > 480: 40 82 00 10 bne 490 > 484: 7d 40 19 2d stwcx. r10,0,r3 > 488: 40 a2 ff f0 bne 478 > 48c: 4c 00 01 2c isync > 490: 2f 89 00 00 cmpwi cr7,r9,0 > 494: 40 be ff d0 bne cr7,464 > 498: 4e 80 00 20 blr > > Here is arch_spin_lock() with your patch. I enclose with === what comes > in addition: > > 00000440 : > 440: 39 40 00 01 li r10,1 > 444: 7d 20 18 28 lwarx r9,0,r3 > 448: 2c 09 00 00 cmpwi r9,0 > 44c: 40 82 00 10 bne 45c > 450: 7d 40 19 2d stwcx. r10,0,r3 > 454: 40 a2 ff f0 bne 444 > 458: 4c 00 01 2c isync > 45c: 2f 89 00 00 cmpwi cr7,r9,0 > 460: 4d be 00 20 bclr+ 12,4*cr7+eq > ===================================================== > 464: 3d 40 00 00 lis r10,0 > 466: R_PPC_ADDR16_HA crash_skip_spinlock > 468: 39 4a 00 00 addi r10,r10,0 > 46a: R_PPC_ADDR16_LO crash_skip_spinlock > 46c: 39 00 00 01 li r8,1 > 470: 89 2a 00 00 lbz r9,0(r10) > 474: 2f 89 00 00 cmpwi cr7,r9,0 > 478: 4c 9e 00 20 bnelr cr7 > ===================================================== > 47c: 7c 21 0b 78 mr r1,r1 > 480: 81 23 00 00 lwz r9,0(r3) > 484: 2f 89 00 00 cmpwi cr7,r9,0 > 488: 40 be ff f4 bne cr7,47c > 48c: 7c 42 13 78 mr r2,r2 > 490: 7d 20 18 28 lwarx r9,0,r3 > 494: 2c 09 00 00 cmpwi r9,0 > 498: 40 82 00 10 bne 4a8 > 49c: 7d 00 19 2d stwcx. r8,0,r3 > 4a0: 40 a2 ff f0 bne 490 > 4a4: 4c 00 01 2c isync > 4a8: 2f 89 00 00 cmpwi cr7,r9,0 > 4ac: 40 be ff c4 bne cr7,470 > 4b0: 4e 80 00 20 blr > > > Christophe I agree. When there is waiting, it will usually add some time to it. Accounting that spinlocks are widely used, it will cause a slowdown in the whole system. Thanks for the feedback, Best regards,