Date: Mon, 1 Nov 2004 15:42:00 -0800 (PST)
From: Linus Torvalds
To: linux-os@analogic.com
cc: dean gaudet, Andreas Steinmetz, Kernel Mailing List,
    Richard Henderson, Andi Kleen, Andrew Morton, Jan Hubicka
Subject: Re: Semaphore assembly-code bug
References: <417550FB.8020404@drdos.com>
    <1098218286.8675.82.camel@mentorng.gurulabs.com>
    <41757478.4090402@drdos.com>
    <20041020034524.GD10638@michonline.com>
    <1098245904.23628.84.camel@krustophenia.net>
    <41826A7E.6020801@domdv.de>

On Mon, 1 Nov 2004, linux-os wrote:
>
> No. You've just shown that you like to argue. I recall that you
> recently, like within the past 24 hours, supplied a patch that
> got rid of the time-consuming stack operations in your semaphore
> code. Remember, you changed it to pass parameters in registers.

... because that fixed a _bug_.

> Why would you bother if stack operations are free?

I didn't say that instructions are free. I just tried (unsuccessfully) to
tell you that "lea" is not free either, and that "lea" has some serious
problems on several setups, ranging from old cpu's (AGI stalls) to new
CPU's (stack engine stalls). And that "pop" is often faster.

And you have been arguing against it despite the fact that I ended up
writing a small test-program to show that it's true.

It's a _stupid_ test-program, but the fact is, you only need a single
test-case to prove some theory wrong. Your theory that "lea" is somehow
always cheaper than "pop" is wrong.

> It's not a total focus. It's just necessary emphasis. Any work
> done by your computer, ultimately comes from and goes to memory.

Not so. A lot of work is done in cache. Any access that doesn't change the
state of the cache is a no-op as far as the memory bus is concerned. Ie a
store to a cacheline that is already dirty is just a cache access, as is a
load from a cacheline that is already loaded.

This is especially true on x86 CPU's, where the lack of registers means
that the core has been highly optimized for doing cached operations.

Remember: a CPU is not some kind of abstract entity - it's a very
practical piece of engineering that has been highly optimized for certain
usage patterns.

And the fact is, "lea" on %esp is not a common usage pattern. Which is
why, in practice, you will find CPU's that end up not optimizing for it.
While "pop"+"pop" is a _very_ common pattern, and why existing CPU's do
them efficiently.

		Linus
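
[Editor's illustration, not part of the original message.] The point above about cached accesses can be sketched with a toy model. The code below is a deliberately simplified, hypothetical write-back cache: the class name `ToyCache`, the 64-byte line size, the unlimited capacity, and the single-core view (no coherency traffic) are all assumptions made for illustration and do not describe any particular CPU. It only counts how often the memory bus would be touched.

```python
from dataclasses import dataclass, field

LINE_SIZE = 64  # bytes per cache line (assumed value, for illustration only)


@dataclass
class ToyCache:
    """Hypothetical write-back, write-allocate cache with no evictions."""
    lines: dict = field(default_factory=dict)  # line number -> "clean" or "dirty"
    bus_transactions: int = 0                  # fills from, and write-backs to, memory

    def _fill(self, line):
        # Only a miss reaches the memory bus; a hit is served by the cache.
        if line not in self.lines:
            self.bus_transactions += 1
            self.lines[line] = "clean"

    def load(self, addr):
        self._fill(addr // LINE_SIZE)

    def store(self, addr):
        line = addr // LINE_SIZE
        self._fill(line)
        self.lines[line] = "dirty"  # already-dirty lines stay dirty: no bus traffic

    def flush(self):
        # Write every dirty line back once, however many stores hit it.
        for line, state in self.lines.items():
            if state == "dirty":
                self.bus_transactions += 1
                self.lines[line] = "clean"


cache = ToyCache()
cache.store(0x1000)            # miss: the line is fetched and marked dirty
for _ in range(1000):
    cache.store(0x1008)        # same line, already dirty: cache access only
    cache.load(0x1010)         # same line, already loaded: cache access only
print(cache.bus_transactions)  # 1
cache.flush()
print(cache.bus_transactions)  # 2 (one fill plus one write-back)
```

In this model, two thousand loads and stores to a line that is already present cost one fill and one eventual write-back. Real hardware adds evictions, coherency and prefetching, none of which the sketch tries to capture.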