From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-oi0-f41.google.com ([209.85.218.41]:35210 "EHLO mail-oi0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750942AbcINEBJ (ORCPT ); Wed, 14 Sep 2016 00:01:09 -0400
Received: by mail-oi0-f41.google.com with SMTP id w11so3263551oia.2 for ; Tue, 13 Sep 2016 21:01:09 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <20160914133925.2fba4629@roar.ozlabs.ibm.com>
References: <20160908213835.GY30056@dastard> <20160908235521.GL2356@ZenIV.linux.org.uk> <20160909015324.GD30056@dastard> <20160909023452.GO2356@ZenIV.linux.org.uk> <20160909221945.GQ2356@ZenIV.linux.org.uk> <20160914031648.GB2356@ZenIV.linux.org.uk> <20160914133925.2fba4629@roar.ozlabs.ibm.com>
From: Linus Torvalds
Date: Tue, 13 Sep 2016 21:01:07 -0700
Message-ID:
Subject: Re: xfs_file_splice_read: possible circular locking dependency detected
Content-Type: text/plain; charset=UTF-8
Sender: linux-xfs-owner@vger.kernel.org
List-ID:
List-Id: xfs
To: Nicholas Piggin
Cc: Al Viro , Dave Chinner , CAI Qian , linux-xfs , xfs@oss.sgi.com, Jens Axboe

On Tue, Sep 13, 2016 at 8:39 PM, Nicholas Piggin wrote:
>
> But even for those, at 16 entries, the bulk of the cost *should* be hitting
> struct page cachelines and refcounting. The rest should mostly stay in cache.

Yes. And those costs will be exactly the same whether we do 16 entries
at a time or 4 loops of 4 entries.

There's something to be said for small temp buffers. They often have
better cache behavior thanks to re-use than having larger arrays.

But I still think that the biggest win could be from just trying to
cut down on code, if we can just say "we'll limit splice to N entries"
(where "N" is small enough that we really can do everything in a
simple stack allocation - I suspect 16 is already too big, and we
really should look at 4 or 8).

And if we actually get a report of a performance regression, we'd at
least hear who actually *uses* splice and notices.

I'm (sadly) still not at all convinced that "splice()" was ever a good
idea. I think it was a clever idea, and it is definitely much more
powerful conceptually than sendfile(), but I also suspect that it's
simply not used enough to be really worth the pain.

You can get great benchmark numbers with it. But whether it actually
matters in real life? I really don't know.

But if we screw it up, and make the buffers too small, and people
actually complain and tell us about what they are doing, that in
itself would be a good datapoint.

So I wouldn't be too worried about just trying things out. We
certainly don't want to *break* anything, but at the same time I
really don't think we should be too nervous about it either.

Which is why I'd be more than happy to say "Just try limiting things
to a pretty small buffer and see if anybody even notices!"

                Linus
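
[Illustration, not part of the original mail.] The "16 entries at a time or 4 loops of 4 entries" point above can be sketched in plain user-space C. This is only a sketch under stated assumptions: struct fake_page, BATCH_ENTRIES and grab_pages_batched() are hypothetical names invented for this example, not kernel code and not the splice implementation. It shows a small on-stack pointer array being reused across batches while the per-page work (here, a refcount bump) stays the same whatever the batch size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted page; NOT the kernel's struct page. */
struct fake_page {
    int refcount;
};

/* Assumed small batch size; the mail suggests looking at 4 or 8. */
#define BATCH_ENTRIES 4

/*
 * Take one reference on each of `total` pages, working through a
 * fixed-size on-stack array of BATCH_ENTRIES pointers at a time.
 * Returns the number of batches that were needed.
 */
static size_t grab_pages_batched(struct fake_page *pages, size_t total)
{
    struct fake_page *batch[BATCH_ENTRIES]; /* small temp buffer, reused */
    size_t done = 0;
    size_t batches = 0;

    while (done < total) {
        size_t n = total - done;

        if (n > BATCH_ENTRIES)
            n = BATCH_ENTRIES;

        /* Fill the temp buffer for this batch. */
        for (size_t i = 0; i < n; i++)
            batch[i] = &pages[done + i];

        /* Per-page cost: the same however the batches are sized. */
        for (size_t i = 0; i < n; i++)
            batch[i]->refcount++;

        done += n;
        batches++;
    }

    assert(done == total);
    return batches;
}
```

With BATCH_ENTRIES set to 4, calling grab_pages_batched() on 16 pages runs four batches and touches each page exactly once, which is the same per-page work as a single pass over 16 entries. Only the size of the stack array changes.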