From: Al Viro
Subject: Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount
Date: Mon, 2 Sep 2013 00:30:05 +0100
Message-ID: <20130901233005.GX13318@ZenIV.linux.org.uk>
References: <20130901212355.GU13318@ZenIV.linux.org.uk>
To: Linus Torvalds
Cc: Sedat Dilek, Waiman Long, Ingo Molnar, Benjamin Herrenschmidt, Jeff Layton, Miklos Szeredi, Thomas Gleixner, linux-fsdevel, Linux Kernel Mailing List, Peter Zijlstra, Steven Rostedt, Andi Kleen, "Chandramouleeswaran, Aswin", "Norton, Scott J"

On Sun, Sep 01, 2013 at 03:48:01PM -0700, Linus Torvalds wrote:
> I made DEFINE_LGLOCK use DEFINE_PER_CPU_SHARED_ALIGNED for the
> spinlock, so that each local lock gets its own cacheline, and the
> total loops jumped to 62M (from 52-54M before). So when I looked at
> the numbers, I thought "oh, that helped".
>
> But then I looked closer, and realized that I just see a fair amount
> of boot-to-boot variation anyway (probably a lot to do with cache
> placement and how dentries got allocated etc). And it didn't actually
> help at all, the problem is still there, and lg_local_lock is still
> really really high on the profile, at 8% cpu time:
>
> - 8.00% lg_local_lock
>    - lg_local_lock
>       + 64.83% mntput_no_expire
>       + 33.81% path_init
>       + 0.78% mntput
>       + 0.58% path_lookupat
>
> which just looks insane. And no, no lg_global_lock visible anywhere..
>
> So it's not false sharing. But something is bouncing *that* particular
> lock around.

Hrm... It excludes sharing between the locks, all right.
AFAICS, that won't exclude sharing with plain per-cpu vars, will it? Could you tell what vfsmount_lock is sharing with on that build? The stuff between it and files_lock doesn't have any cross-CPU writers, but with that change it's the stuff after it that becomes interesting...