From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754091AbbFTA3w (ORCPT ); Fri, 19 Jun 2015 20:29:52 -0400 Received: from e32.co.us.ibm.com ([32.97.110.150]:53335 "EHLO e32.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753182AbbFTA3q (ORCPT ); Fri, 19 Jun 2015 20:29:46 -0400 X-Helo: d03dlp01.boulder.ibm.com X-MailFrom: paulmck@linux.vnet.ibm.com X-RcptTo: linux-kernel@vger.kernel.org Date: Fri, 19 Jun 2015 17:29:37 -0700 From: "Paul E. McKenney" To: Andi Kleen Cc: Dave Hansen , dave.hansen@linux.intel.com, akpm@linux-foundation.org, jack@suse.cz, viro@zeniv.linux.org.uk, eparis@redhat.com, john@johnmccutchan.com, rlove@rlove.org, tim.c.chen@linux.intel.com, linux-kernel@vger.kernel.org Subject: Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files Message-ID: <20150620002937.GE3913@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <20150619215025.4F689817@viggo.jf.intel.com> <20150619233306.GT25760@tassilo.jf.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150619233306.GT25760@tassilo.jf.intel.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 15062000-0005-0000-0000-000010259C6D Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Jun 19, 2015 at 04:33:06PM -0700, Andi Kleen wrote: > On Fri, Jun 19, 2015 at 02:50:25PM -0700, Dave Hansen wrote: > > > > From: Dave Hansen > > > > I have a _tiny_ microbenchmark that sits in a loop and writes > > single bytes to a file. Writing one byte to a tmpfs file is > > around 2x slower than reading one byte from a file, which is a > > _bit_ more than I expecte. This is a dumb benchmark, but I think > > it's hard to deny that write() is a hot path and we should avoid > > unnecessary overhead there. > > > > I did a 'perf record' of 30-second samples of read and write. > > The top item in a diffprofile is srcu_read_lock() from > > fsnotify(). There are active inotify fd's from systemd, but > > nothing is actually listening to the file or its part of > > the filesystem. > > > > I *think* we can avoid taking the srcu_read_lock() for the > > common case where there are no actual marks on the file > > being modified *or* the vfsmount. > > What is so expensive in it? Just the memory barrier in it? > > Perhaps the function can be tuned in general. The memory barrier we are pretty much stuck with unless we want synchronize_srcu() to be quite a bit more expensive (and for SRCU to be unusable from offline and idle) -- and that synchronize_srcu() expense drove rewrite from the earlier version to this one. It is possible to cut down from two to one instances of __this_cpu_inc(), however. It of course would be possible to have two types of SRCU, one for fast grace periods and the other for memory-barrier-free read-side critical sections, but obviously a very clear case would need to be made for this. At least judging from the reactions the last time I introduced a new flavor of RCU. ;-) So, echoing Andi, what exactly is expensive? Thanx, Paul > -Andi > > int __srcu_read_lock(struct srcu_struct *sp) > { > int idx; > > idx = ACCESS_ONCE(sp->completed) & 0x1; > preempt_disable(); > __this_cpu_inc(sp->per_cpu_ref->c[idx]); > smp_mb(); /* B */ /* Avoid leaking the critical section. */ > __this_cpu_inc(sp->per_cpu_ref->seq[idx]); > preempt_enable(); > return idx; > } > > > > > > The *_fsnotify_mask is an aggregate of each of the masks from > > each mark. If we have nothing set in the masks at all then there > > are no marks and no need to do anything with 'ignored masks' > > since none exist. This keeps us from having to do the costly > > srcu_read_lock() for a check which is very cheap. > > > > This patch gave a 10.8% speedup in writes/second on my test. > > > > Signed-off-by: Dave Hansen > > Cc: Andrew Morton > > Cc: Jan Kara > > Cc: Al Viro > > Cc: Eric Paris > > Cc: John McCutchan > > Cc: Robert Love > > Cc: Tim Chen > > Cc: Andi Kleen > > Cc: linux-kernel@vger.kernel.org > > --- > > > > b/fs/notify/fsnotify.c | 10 ++++++++++ > > 1 file changed, 10 insertions(+) > > > > diff -puN fs/notify/fsnotify.c~optimize-fsnotify fs/notify/fsnotify.c > > --- a/fs/notify/fsnotify.c~optimize-fsnotify 2015-06-19 13:29:53.117283581 -0700 > > +++ b/fs/notify/fsnotify.c 2015-06-19 13:29:53.123283853 -0700 > > @@ -213,6 +213,16 @@ int fsnotify(struct inode *to_tell, __u3 > > !(test_mask & to_tell->i_fsnotify_mask) && > > !(mnt && test_mask & mnt->mnt_fsnotify_mask)) > > return 0; > > + /* > > + * Optimization: The *_fsnotify_mask is an aggregate of each of the > > + * masks from each mark. If we have nothing set in the masks at > > + * all then there are no marks and no need to do anything with > > + * 'ignored masks' since none exist. This keeps us from having to > > + * do the costly srcu_read_lock() for a check which is very cheap. > > + */ > > + if (!to_tell->i_fsnotify_mask && > > + (!mnt || !mnt->mnt_fsnotify_mask)) > > + return 0; > > > > idx = srcu_read_lock(&fsnotify_mark_srcu); > > > > _ > > -- > ak@linux.intel.com -- Speaking for myself only > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in Please read the FAQ at http://www.tux.org/lkml/