From: Jeff Layton <jlayton@kernel.org>
To: Dave Chinner
Cc: "J. Bruce Fields", linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, hch@lst.de, neilb@suse.de,
 amir73il@gmail.com, jack@suse.de, viro@zeniv.linux.org.uk
Subject: Re: [PATCH 00/19] fs: rework and optimize i_version handling in filesystems
Date: Thu, 14 Dec 2017 09:14:47 -0500
Message-ID: <1513260887.3504.12.camel@kernel.org>
In-Reply-To: <1513209746.3498.59.camel@kernel.org>
References: <20171213142017.23653-1-jlayton@kernel.org>
 <20171213150533.GA9205@fieldses.org>
 <1513196068.3498.32.camel@kernel.org>
 <20171213230334.GC5858@dastard>
 <1513209746.3498.59.camel@kernel.org>

On Wed, 2017-12-13 at 19:02 -0500, Jeff Layton wrote:
> On Thu, 2017-12-14 at 10:03 +1100, Dave Chinner wrote:
> > On Wed, Dec 13, 2017 at 03:14:28PM -0500, Jeff Layton wrote:
> > > On Wed, 2017-12-13 at 10:05 -0500, J. Bruce Fields wrote:
> > > > This is great, thanks.
> > > >
> > > > On Wed, Dec 13, 2017 at 09:19:58AM -0500, Jeff Layton wrote:
> > > > > With this, we reduce inode metadata updates across all 3
> > > > > filesystems down to roughly the frequency of the timestamp
> > > > > granularity, particularly when it's not being queried (the
> > > > > vastly common case).
> > > > >
> > > > > The pessimal workload here is 1 byte writes, and it helps that
> > > > > significantly. Of course, that's not what we'd consider a
> > > > > real-world workload.
> > > > >
> > > > > A tiobench-example.fio workload also shows some modest
> > > > > performance gains, and I've gotten mails from the kernel test
> > > > > robot that show some significant performance gains on some
> > > > > microbenchmarks (case-msync-mt in the vm-scalability testsuite
> > > > > to be specific), with an earlier version of this set.
> > > > >
> > > > > With larger writes, the gains with this patchset mostly
> > > > > vaporize, but it does not seem to cause performance to regress
> > > > > anywhere, AFAICT.
> > > > >
> > > > > I'm happy to run other workloads if anyone can suggest them.
> > > > >
> > > > > At this point, the patchset works and does what it's expected
> > > > > to do in my own testing. It seems like it's at least a modest
> > > > > performance win across all 3 major disk-based filesystems. It
> > > > > may also encourage others to implement i_version as well since
> > > > > it reduces the cost.
> > > >
> > > > Do you have an idea what the remaining cost is?
> > > >
> > > > Especially in the ext4 case, are you still able to measure any
> > > > difference in performance between the cases where i_version is
> > > > turned on and off, after these patches?
> > >
> > > Attached is a fio jobfile + the output from 3 different runs using
> > > it with ext4. This one is using 4k writes.
> > > There was no querying of i_version during the runs. I've done
> > > several runs with each and these are pretty representative of the
> > > results:
> > >
> > > old = 4.15-rc3, i_version enabled
> > > ivers = 4.15-rc3 + these patches, i_version enabled
> > > noivers = 4.15-rc3 + these patches, i_version disabled
> > >
> > > To snip out the run status lines:
> > >
> > > old:
> > >   WRITE: bw=85.6MiB/s (89.8MB/s), 9994KiB/s-11.1MiB/s (10.2MB/s-11.7MB/s), io=50.2GiB (53.8GB), run=600001-600001msec
> > >
> > > ivers:
> > >   WRITE: bw=110MiB/s (115MB/s), 13.5MiB/s-14.2MiB/s (14.1MB/s-14.9MB/s), io=64.3GiB (69.0GB), run=600001-600001msec
> > >
> > > noivers:
> > >   WRITE: bw=117MiB/s (123MB/s), 14.2MiB/s-15.2MiB/s (14.9MB/s-15.9MB/s), io=68.7GiB (73.8GB), run=600001-600001msec
> > >
> > > So, I see some performance degradation with -o iversion compared
> > > to not having it enabled (maybe due to the extra atomic fetches?),
> > > but this set erases most of the difference.
> >
> > So what is the performance difference when something is actively
> > querying the i_version counter as fast as it can (e.g. file being
> > constantly stat()d via NFS whilst being modified)? How does the
> > performance compare to the old code in that case?
>
> I haven't benchmarked that with the latest set, but I did with the set
> that I posted around a year ago. Basically I just ran a similar test
> to this, and had another shell open doing statx(..., STATX_VERSION,
> ...) on the thing in a tight loop.
>
> I did see some performance hit vs. the case where no one is viewing
> it, but it was still significantly faster than the unpatched version
> that was incrementing the counter every time.
>
> That was on a different test rig, and the patchset has some small
> differences now. I'll see if I can get some hard numbers with such a
> testcase soon.

I reran the test on xfs, with another shell running test-statx to query
the i_version value in a tight loop:

  WRITE: bw=110MiB/s (115MB/s), 13.6MiB/s-14.0MiB/s (14.2MB/s-14.7MB/s), io=64.5GiB (69.3GB), run=600001-600001msec

...contrast that with the run where I was not doing any queries:

  WRITE: bw=129MiB/s (136MB/s), 15.8MiB/s-16.6MiB/s (16.6MB/s-17.4MB/s), io=75.9GiB (81.5GB), run=600001-600001msec

...vs the unpatched kernel:

  WRITE: bw=86.7MiB/s (90.0MB/s), 9689KiB/s-11.7MiB/s (9921kB/s-12.2MB/s), io=50.8GiB (54.6GB), run=600001-600002msec

There is some clear performance impact when you are running frequent
queries of the i_version. My gut feeling is that you could probably
make the new code perform worse than the old if you were to _really_
hammer the inode with queries for the i_version (probably from many
threads in parallel) while doing a lot of small writes to it. That'd be
a pretty unusual workload though.

(For anyone who wants the gist without digging into the series, I've
put a simplified sketch of the increment scheme and of the query loop
below my sig.)

--
Jeff Layton
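
First, the scheme being benchmarked above, as a simplified user-space
sketch. This is an illustration of the idea, not the kernel code
itself, and the helper names are only loosely modeled on the series:
the low bit of the counter means "the current value has been queried",
the counter proper lives in the bits above it, and writers only do a
real bump (and so only dirty the inode) when that flag is set.

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <inttypes.h>

#define I_VERSION_QUERIED	1ULL

/* writer side: called for every change to the file */
static int maybe_inc_iversion(_Atomic uint64_t *iv, int force)
{
	uint64_t cur = atomic_load(iv);

	for (;;) {
		/* nobody has seen the current value: skip the bump */
		if (!force && !(cur & I_VERSION_QUERIED))
			return 0;

		/* bump the counter in bits 63:1 and clear the flag */
		uint64_t next = (cur & ~I_VERSION_QUERIED) + 2;

		if (atomic_compare_exchange_weak(iv, &cur, next))
			return 1;	/* caller must dirty the inode */
	}
}

/* reader side: fetch the counter and mark the value as seen */
static uint64_t query_iversion(_Atomic uint64_t *iv)
{
	uint64_t cur = atomic_load(iv);

	while (!(cur & I_VERSION_QUERIED)) {
		if (atomic_compare_exchange_weak(iv, &cur,
						 cur | I_VERSION_QUERIED))
			break;
	}
	return cur >> 1;	/* counter lives above the flag bit */
}

int main(void)
{
	_Atomic uint64_t iv = 0;

	maybe_inc_iversion(&iv, 0);	/* no-op: value never queried */
	printf("after silent write:  %" PRIu64 "\n", query_iversion(&iv));
	maybe_inc_iversion(&iv, 0);	/* real bump this time */
	printf("after queried write: %" PRIu64 "\n", query_iversion(&iv));
	return 0;
}

This also shows why the numbers above look the way they do: the cost of
the cmpxchg loop only materializes when something has actually sampled
the counter, so the win mostly evaporates once a reader queries it at
high frequency.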
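
And the query side of the test boils down to something like the loop
below. Caveats: STATX_VERSION is from the statx extension proposed
alongside this series, so it is not in mainline headers, and there is
no glibc wrapper for statx() here, hence the raw syscall.

#include <stdio.h>
#include <fcntl.h>		/* AT_FDCWD */
#include <unistd.h>
#include <sys/syscall.h>	/* __NR_statx */
#include <linux/stat.h>		/* struct statx, STATX_* */

int main(int argc, char **argv)
{
	struct statx stx;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}

	/* hammer the inode with i_version queries as fast as we can;
	 * the returned values themselves don't matter for the test,
	 * each query just marks the current counter value as seen */
	for (;;) {
		if (syscall(__NR_statx, AT_FDCWD, argv[1], 0,
			    STATX_VERSION, &stx) < 0) {
			perror("statx");
			return 1;
		}
	}
}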