From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932072AbWBXQoZ (ORCPT ); Fri, 24 Feb 2006 11:44:25 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S932292AbWBXQoZ (ORCPT ); Fri, 24 Feb 2006 11:44:25 -0500 Received: from iolanthe.rowland.org ([192.131.102.54]:10965 "HELO iolanthe.rowland.org") by vger.kernel.org with SMTP id S932072AbWBXQoZ (ORCPT ); Fri, 24 Feb 2006 11:44:25 -0500 Date: Fri, 24 Feb 2006 11:44:23 -0500 (EST) From: Alan Stern X-X-Sender: stern@iolanthe.rowland.org To: Benjamin LaHaise cc: Andrew Morton , , Kernel development list Subject: Re: [PATCH] Avoid calling down_read and down_write during startup In-Reply-To: <20060224151510.GC7101@kvack.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org On Fri, 24 Feb 2006, Benjamin LaHaise wrote: > On Fri, Feb 24, 2006 at 10:04:23AM -0500, Alan Stern wrote: > > What do you think of the two suggestions in my previous message? > > Even if the read version of the lock only touches a cacheline local to > the cpu, you'd still have to use the lock prefix to allow for correctness > when a writer comes along. It is not cacheline bouncing that worries me, > it is serialising instructions and memory barriers as those hurt immensely > when the data is in the cache. I've been looking at a lot of profiles on > P4s of late, and every single locked instruction is painful as it means > all of the memory ordering rules come into play. Neither suggestion > addresses that overhead that has been introduced. In that case you should be worried not about acquiring and releasing the rwsem at the beginning and end of blocking_notifier_call_chain; you should be worried about all the RCU serialization in the core notifier_call_chain routine. In fact, that could be changed for blocking notifier chains. Since we own the readlock, we know that the list won't get changed while we're using it. So the blocking version of the call-chain routine could avoid using the RCU dereferencing mechanism. The atomic chains are a different matter. The ones that don't run in NMI context could use an rw-spinlock for protection, allowing them also to avoid memory barriers while going through the list. The notifier chains that do run in NMI don't have this luxury. Fortunately I don't think there are very many of them. Alan Stern