* [GIT PULL] locking/urgent for v5.12 @ 2021-04-25 9:39 Borislav Petkov 2021-04-25 16:37 ` Linus Torvalds 2021-04-25 16:52 ` pr-tracker-bot 0 siblings, 2 replies; 6+ messages in thread From: Borislav Petkov @ 2021-04-25 9:39 UTC (permalink / raw) To: Linus Torvalds; +Cc: x86-ml, lkml Hi Linus, please pull one urgent locking fix for 5.12. Thx. --- The following changes since commit d434405aaab7d0ebc516b68a8fc4100922d7f5ef: Linux 5.12-rc7 (2021-04-11 15:16:13 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git tags/locking_urgent_for_v5.12 for you to fetch changes up to 84a24bf8c52e66b7ac89ada5e3cfbe72d65c1896: locking/qrwlock: Fix ordering in queued_write_lock_slowpath() (2021-04-17 13:40:50 +0200) ---------------------------------------------------------------- - Fix ordering in the queued writer lock's slowpath. ---------------------------------------------------------------- Ali Saidi (1): locking/qrwlock: Fix ordering in queued_write_lock_slowpath() kernel/locking/qrwlock.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) -- Regards/Gruss, Boris. SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [GIT PULL] locking/urgent for v5.12 2021-04-25 9:39 [GIT PULL] locking/urgent for v5.12 Borislav Petkov @ 2021-04-25 16:37 ` Linus Torvalds 2021-04-25 16:39 ` Linus Torvalds 2021-04-25 16:52 ` pr-tracker-bot 1 sibling, 1 reply; 6+ messages in thread From: Linus Torvalds @ 2021-04-25 16:37 UTC (permalink / raw) To: Borislav Petkov, Peter Zijlstra (Intel), Ali Saidi; +Cc: x86-ml, lkml [ Side note: this is cc'd to x86-ml, even though x86 is the _one_ architecture that was guaranteed to be not at all affected by the actual locking bug, since a locked op is always ordered on x86. ] On Sun, Apr 25, 2021 at 2:39 AM Borislav Petkov <bp@suse.de> wrote: > > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git tags/locking_urgent_for_v5.12 > > - Fix ordering in the queued writer lock's slowpath. So I'm looking at that change, because the code is confusing. Why did it add that "cnts" variable? We know it must have the value _QW_WAITING, since that's what the atomic_cond_read_relaxed() waits for. I'm assuming it's because of the switch to try_cmpxchg by PeterZ? HOWEVER. That actually just makes the code even MORE unreadable. That code was odd and hard to read even before, but now it's positively confusing. New confusion: - Why is the truly non-critical cmpxchg using "try_cmpxhg()", when the _first_ cmpxchg - above the loop - is not? Pre-existing confusion: - Why is the code using "atomic_add()" to set a bit? Yeah, yeah, neither of these are *bugs*, but Christ is that code hard to read. The "use add to set a bit" is valid because of the spinlock serialization (ie only one add can ever happen), and the cmpxchg-vs-try_cmpxchg confusion isn't buggy, it's just really really confusing that that same function is using two different - but equivalent - cmpxchg things on the same variable literally a couple of lines apart. I've pulled this, but can we please - make *both* of the cmpxchg's use "try_cmpxchg()" (and thus that "cnts" variable)? - add a comment about _why_ it's doing "atomic_add()" instead of the much more logical "atomic_or()", and about how the spinlock serializes it I'm assuming the "atomic_add()" is simply because many more architectures have that as an actual intrinsic atomic. I understand. But it's really really not obvious from the code. Linus ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [GIT PULL] locking/urgent for v5.12 2021-04-25 16:37 ` Linus Torvalds @ 2021-04-25 16:39 ` Linus Torvalds 2021-04-25 17:06 ` Waiman Long 0 siblings, 1 reply; 6+ messages in thread From: Linus Torvalds @ 2021-04-25 16:39 UTC (permalink / raw) To: Borislav Petkov, Peter Zijlstra (Intel), Ali Saidi, Steve Capper, Will Deacon, Waiman Long Cc: x86-ml, lkml Oh, and replying to myself only because I spazzed out and pressed "send" before I had filled out the full participants line. Sorry for the duplicate message quoted in full below. Linus On Sun, Apr 25, 2021 at 9:37 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > [ Side note: this is cc'd to x86-ml, even though x86 is the _one_ > architecture that was guaranteed to be not at all affected by the > actual locking bug, since a locked op is always ordered on x86. ] > > On Sun, Apr 25, 2021 at 2:39 AM Borislav Petkov <bp@suse.de> wrote: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git tags/locking_urgent_for_v5.12 > > > > - Fix ordering in the queued writer lock's slowpath. > > So I'm looking at that change, because the code is confusing. > > Why did it add that "cnts" variable? We know it must have the value > _QW_WAITING, since that's what the atomic_cond_read_relaxed() waits > for. > > I'm assuming it's because of the switch to try_cmpxchg by PeterZ? > > HOWEVER. > > That actually just makes the code even MORE unreadable. > > That code was odd and hard to read even before, but now it's > positively confusing. > > New confusion: > - Why is the truly non-critical cmpxchg using "try_cmpxhg()", when > the _first_ cmpxchg - above the loop - is not? > > Pre-existing confusion: > - Why is the code using "atomic_add()" to set a bit? > > Yeah, yeah, neither of these are *bugs*, but Christ is that code hard > to read. The "use add to set a bit" is valid because of the spinlock > serialization (ie only one add can ever happen), and the > cmpxchg-vs-try_cmpxchg confusion isn't buggy, it's just really really > confusing that that same function is using two different - but > equivalent - cmpxchg things on the same variable literally a couple of > lines apart. > > I've pulled this, but can we please > > - make *both* of the cmpxchg's use "try_cmpxchg()" (and thus that > "cnts" variable)? > > - add a comment about _why_ it's doing "atomic_add()" instead of the > much more logical "atomic_or()", and about how the spinlock serializes > it > > I'm assuming the "atomic_add()" is simply because many more > architectures have that as an actual intrinsic atomic. I understand. > But it's really really not obvious from the code. > > Linus ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [GIT PULL] locking/urgent for v5.12 2021-04-25 16:39 ` Linus Torvalds @ 2021-04-25 17:06 ` Waiman Long 2021-04-26 8:04 ` Peter Zijlstra 0 siblings, 1 reply; 6+ messages in thread From: Waiman Long @ 2021-04-25 17:06 UTC (permalink / raw) To: Linus Torvalds, Borislav Petkov, Peter Zijlstra (Intel), Ali Saidi, Steve Capper, Will Deacon Cc: x86-ml, lkml On 4/25/21 12:39 PM, Linus Torvalds wrote: > Oh, and replying to myself only because I spazzed out and pressed > "send" before I had filled out the full participants line. > > Sorry for the duplicate message quoted in full below. > > Linus > > On Sun, Apr 25, 2021 at 9:37 AM Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> [ Side note: this is cc'd to x86-ml, even though x86 is the _one_ >> architecture that was guaranteed to be not at all affected by the >> actual locking bug, since a locked op is always ordered on x86. ] >> >> On Sun, Apr 25, 2021 at 2:39 AM Borislav Petkov <bp@suse.de> wrote: >>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git tags/locking_urgent_for_v5.12 >>> >>> - Fix ordering in the queued writer lock's slowpath. >> So I'm looking at that change, because the code is confusing. >> >> Why did it add that "cnts" variable? We know it must have the value >> _QW_WAITING, since that's what the atomic_cond_read_relaxed() waits >> for. >> >> I'm assuming it's because of the switch to try_cmpxchg by PeterZ? Yes, try_cmpxchg() requires a variable to hold the new value as well as a place to return the actual value before the cmpxchg(). It is just the way try_cmpxchg() works. >> >> HOWEVER. >> >> That actually just makes the code even MORE unreadable. >> >> That code was odd and hard to read even before, but now it's >> positively confusing. >> >> New confusion: >> - Why is the truly non-critical cmpxchg using "try_cmpxhg()", when >> the _first_ cmpxchg - above the loop - is not? At least for x86, try_cmpxchg() seems to produce a slight better assembly code than the regular cmpxchg(). I guess that may be one of the reason Peter changed it to use try_cmpxchg(). Another reason that I can think of is to make the code fit in one line instead of splitting it up into two lines like the original version from Ali. >> >> Pre-existing confusion: >> - Why is the code using "atomic_add()" to set a bit? >> >> Yeah, yeah, neither of these are *bugs*, but Christ is that code hard >> to read. The "use add to set a bit" is valid because of the spinlock >> serialization (ie only one add can ever happen), and the >> cmpxchg-vs-try_cmpxchg confusion isn't buggy, it's just really really >> confusing that that same function is using two different - but >> equivalent - cmpxchg things on the same variable literally a couple of >> lines apart. As you have said, the spinlock serialization makes sure that only 1 writer is allowed to do that. I agree that using atomic_or() looks better in this case. Both of them are equivalent in this particular case. >> >> I've pulled this, but can we please >> >> - make *both* of the cmpxchg's use "try_cmpxchg()" (and thus that >> "cnts" variable)? Yes, we can certainly change the other cmpxchg() to try_cmpxchg(). >> >> - add a comment about _why_ it's doing "atomic_add()" instead of the >> much more logical "atomic_or()", and about how the spinlock serializes >> it >> >> I'm assuming the "atomic_add()" is simply because many more >> architectures have that as an actual intrinsic atomic. I understand. >> But it's really really not obvious from the code. >> I will post a patch to make the suggested change to qrwlock.c. Cheers, Longman ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [GIT PULL] locking/urgent for v5.12 2021-04-25 17:06 ` Waiman Long @ 2021-04-26 8:04 ` Peter Zijlstra 0 siblings, 0 replies; 6+ messages in thread From: Peter Zijlstra @ 2021-04-26 8:04 UTC (permalink / raw) To: Waiman Long Cc: Linus Torvalds, Borislav Petkov, Ali Saidi, Steve Capper, Will Deacon, x86-ml, lkml On Sun, Apr 25, 2021 at 01:06:52PM -0400, Waiman Long wrote: > On 4/25/21 12:39 PM, Linus Torvalds wrote: > > > I'm assuming it's because of the switch to try_cmpxchg by PeterZ? > > Yes, try_cmpxchg() requires a variable to hold the new value as well as a > place to return the actual value before the cmpxchg(). It is just the way > try_cmpxchg() works. Right; by virtue of it returning a boolean, the value return needs to be through a pointer argument. > > > New confusion: > > > - Why is the truly non-critical cmpxchg using "try_cmpxhg()", when > > > the _first_ cmpxchg - above the loop - is not? > At least for x86, try_cmpxchg() seems to produce a slight better assembly > code than the regular cmpxchg(). I guess that may be one of the reason Peter > changed it to use try_cmpxchg(). Another reason that I can think of is to > make the code fit in one line instead of splitting it up into two lines like > the original version from Ali. Right, x86 generates slightly better asm (and potentially so for any architecture that has CAS state in condition codes) while it's a wash for other architectures (specifically we checked at the time arm64 didn't generate worse code). > > > > > > Pre-existing confusion: > > > - Why is the code using "atomic_add()" to set a bit? > > > > > > Yeah, yeah, neither of these are *bugs*, but Christ is that code hard > > > to read. The "use add to set a bit" is valid because of the spinlock > > > serialization (ie only one add can ever happen), and the > > > cmpxchg-vs-try_cmpxchg confusion isn't buggy, it's just really really > > > confusing that that same function is using two different - but > > > equivalent - cmpxchg things on the same variable literally a couple of > > > lines apart. > As you have said, the spinlock serialization makes sure that only 1 writer > is allowed to do that. I agree that using atomic_or() looks better in this > case. Both of them are equivalent in this particular case. Agreed, I think the reason is that because of the read-side doing the BIAS add/sub, some of that snuck into the write side. AFAIK no arch lacks the atomic_or() intrinsic. The one that's often an issue is atomic_fetch_or() (x86 for one doesn't have it :/). > > > I've pulled this, but can we please > > > > > > - make *both* of the cmpxchg's use "try_cmpxchg()" (and thus that > > > "cnts" variable)? > Yes, we can certainly change the other cmpxchg() to try_cmpxchg(). > > > > > > - add a comment about _why_ it's doing "atomic_add()" instead of the > > > much more logical "atomic_or()", and about how the spinlock serializes > > > it > > > > > > I'm assuming the "atomic_add()" is simply because many more > > > architectures have that as an actual intrinsic atomic. I understand. > > > But it's really really not obvious from the code. > > > > I will post a patch to make the suggested change to qrwlock.c. Thanks. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [GIT PULL] locking/urgent for v5.12 2021-04-25 9:39 [GIT PULL] locking/urgent for v5.12 Borislav Petkov 2021-04-25 16:37 ` Linus Torvalds @ 2021-04-25 16:52 ` pr-tracker-bot 1 sibling, 0 replies; 6+ messages in thread From: pr-tracker-bot @ 2021-04-25 16:52 UTC (permalink / raw) To: Borislav Petkov; +Cc: Linus Torvalds, x86-ml, lkml The pull request you sent on Sun, 25 Apr 2021 11:39:31 +0200: > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git tags/locking_urgent_for_v5.12 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/0146da0d4cecad571f69f02fe35d75d6dba9723c Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-04-26 8:05 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-04-25 9:39 [GIT PULL] locking/urgent for v5.12 Borislav Petkov 2021-04-25 16:37 ` Linus Torvalds 2021-04-25 16:39 ` Linus Torvalds 2021-04-25 17:06 ` Waiman Long 2021-04-26 8:04 ` Peter Zijlstra 2021-04-25 16:52 ` pr-tracker-bot
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).