From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from ipmail01.adl2.internode.on.net ([150.101.137.133]:26274 "EHLO ipmail01.adl2.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726215AbeHRBrQ (ORCPT ); Fri, 17 Aug 2018 21:47:16 -0400 Date: Sat, 18 Aug 2018 08:42:00 +1000 From: Dave Chinner To: Kent Overstreet Cc: Carlos Maiolino , linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk Subject: Re: [PATCH 0/2] generic_file_buffered_read improvements Message-ID: <20180817224159.GE31495@dastard> References: <20180815232632.32548-1-kent.overstreet@gmail.com> <20180816075740.2gazls4q46y5rmuq@odin.usersys.redhat.com> <20180816100531.GC24640@kmo-pixel> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20180816100531.GC24640@kmo-pixel> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Thu, Aug 16, 2018 at 06:05:31AM -0400, Kent Overstreet wrote: > On Thu, Aug 16, 2018 at 09:57:40AM +0200, Carlos Maiolino wrote: > > On Wed, Aug 15, 2018 at 07:26:30PM -0400, Kent Overstreet wrote: > > > Small patch series to > > > - firstly, refactor generic_file_buffered_read enough that it can be modified > > > in more interesting ways without going insane, and then > > > > > > - secondly, change it to use find_get_pages_contig() to batch up the page > > > operations, and then copy data to userspace in a separate loop that touches > > > no other shared cachelines. > > > > > > I've been seeing profiles where the radix tree lookups in the buffered read path > > > are a shockingly large portion of the profile (around 25%, if memory serves) - > > > that's what this patch series is addressing. I've benchmarked small block reads > > > as well, performance there is unaffected or slightly improved (it's within the > > > margin of error). > > > > > > > /me didn't review the patches, but... > > > > Could you share how you benchmarked it? Despite the fact I'm curious about it, > > it's going to be interesting the 'proof' of such improvement. > > I tried coming up with a microbenchmark and gave up because it was getting too > ridiculous - you need something _modifying_ the page cache radix tree for the > contention to show up. That's usually going to be page reclaim, which means you > can't just populate the page cache and run your benchmark, you have to be > reading stuff in and evicting stuff. Which means you need waaay higher end IO > devices than what I have at home for it to show up. POSIX_FADV_DONTNEED will remove clean pages from the radix tree on command. So essentially you could just hammer on a single file from lots of CPUs with buffered reads and POSIX_FADV_DONTNEED to stress the radix tree. Cheers, Dave. -- Dave Chinner david@fromorbit.com