From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S964800AbVLPWjZ (ORCPT ); Fri, 16 Dec 2005 17:39:25 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S932543AbVLPWjZ (ORCPT ); Fri, 16 Dec 2005 17:39:25 -0500 Received: from smtp.osdl.org ([65.172.181.4]:7075 "EHLO smtp.osdl.org") by vger.kernel.org with ESMTP id S932554AbVLPWjY (ORCPT ); Fri, 16 Dec 2005 17:39:24 -0500 Date: Fri, 16 Dec 2005 14:38:47 -0800 (PST) From: Linus Torvalds To: "David S. Miller" cc: dhowells@redhat.com, nickpiggin@yahoo.com.au, arjan@infradead.org, akpm@osdl.org, alan@lxorguk.ukuu.org.uk, cfriesen@nortel.com, hch@infradead.org, matthew@wil.cx, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org Subject: Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation In-Reply-To: <20051216.142349.89717140.davem@davemloft.net> Message-ID: References: <15412.1134561432@warthog.cambridge.redhat.com> <12186.1134732601@warthog.cambridge.redhat.com> <20051216.142349.89717140.davem@davemloft.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org On Fri, 16 Dec 2005, David S. Miller wrote: > > Actually, this points out a problem with "compare and swap". The > typical loop is of the form: > > LOAD [MEM], REG1 > OP REG1, X, REG2 > CAS [MEM], REG1, REG2 > > That first LOAD instruction, if it misses in the L2, causes the cache > line to be requested for sharing. Then the CAS instruction will need > to issue another cache coherency transaction to get the cache line > into owned state. A number of architectures have a "prefetch for write ownership" instruction that you can use for this. Exactly because "ld+cas" should not get a shared line initially. I though sparc had an ASI to do the same? No? > (Are there any CPUs that peek forward and look for the CAS > instruction to decide to issue the more appropriate request > for the cache line in Owned state? That would be cool...) I don't think anybody does, although it wouldn't be impossible. Any OoO processor would _tend_ to have enough visibility that they could see any stores that are "close" (not just a cas) and might be clever enough to modify a memory read to be a read-with-intent-to-write op. Of course, if it was in memory (as opposed to somebody elses caches), I think most cache protocols will start it up in exclusive state anyway. Same may or may not happen when you have a dirty hit on another CPU that requires a write-back (ie the other CPU would always invalidate on writeback). It would seem to be the obvious thing to do for better lock performance, and I'd assume that locks are some of the most common cases of real cache interactions, so maybe the shared case only effectively happens if two CPU's are reading at the same time. Somebody who looks at cache protocol diagrams could check. I'm too lazy. Linus