From mboxrd@z Thu Jan 1 00:00:00 1970
From: NeilBrown
To: Trond Myklebust, "Anna.Schumaker@Netapp.com", Andrew Morton, Jan Kara
Date: Thu, 02 Apr 2020 10:52:21 +1100
Cc: linux-mm@kvack.org, linux-nfs@vger.kernel.org, LKML
Subject: Writeback fixes for NFS
In-Reply-To: <87tv2b7q72.fsf@notabene.neil.brown.name>
References: <87tv2b7q72.fsf@notabene.neil.brown.name>
Message-ID: <87v9miydai.fsf@notabene.neil.brown.name>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature"
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

--=-=-=
Content-Type: text/plain


Please ignore my previous patch (to which this is a reply); it was
flawed in various ways.
I now understand the code a bit better and have a somewhat simpler
patch which appears to address the same problem.

The problem is that writeback to NFS often produces lots of small
writes (tens of KB) rather than fewer large writes (1MB). This pattern
can often hurt throughput, but in certain circumstances it can hurt NFS
throughput more than expected.

Each nfs_writepages() call results in an NFS COMMIT being sent to the
server. If writeback triggers lots of smaller nfs_writepages() calls,
this means lots of COMMITs. If the server is slow to handle the COMMIT
(I've seen the Ganesha NFS server take over 200ms per COMMIT), these
COMMITs can overlap, queue up, choke the NFS server, and cause an
order-of-magnitude drop in throughput.

So we really want to call nfs_writepages() only when there is a largish
number of pages to be written, i.e. pages that are 'dirty'.

For historical reasons that I didn't thoroughly research but I'm
confident are no longer relevant, pages that have been written to the
NFS server but have not yet been the subject of a COMMIT (so-called
"unstable" pages) are effectively accounted the same as "dirty" pages
(sometimes called "reclaimable"). This can result in writeback thinking
there are lots of "dirty" pages to reclaim, while nfs_writepages() can
only find a few that it can write out.

The second of the following patches changes the accounting for these
"unstable" pages. They are now always accounted exactly the same as
writeback pages. Conceptually they can be thought of as still in
writeback, but the writeback is now happening on the server. A COMMIT
will always automatically follow the writes generated by
nfs_writepages(), so from the perspective of the VM there really is no
difference: it has scheduled the write and there is nothing else it can
do except wait.

Testing this patch showed that loop-back NFS is prone to deadlocks
again.
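
As a rough illustration of the accounting change and the COMMIT cost
described above, here is a toy model in Python. It is only a sketch, not
kernel code: the PageCounts class, the helper names, the 4K page size
and the example figures are all assumptions made up for illustration.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes (illustrative only)


@dataclass
class PageCounts:
    """Hypothetical per-device page counters for the toy model."""
    dirty: int      # pages not yet sent to the server
    writeback: int  # pages with a WRITE currently in flight
    unstable: int   # pages written but not yet covered by a COMMIT


def apparent_dirty_old(c: PageCounts) -> int:
    # Old accounting: unstable pages are lumped in with dirty pages.
    return c.dirty + c.unstable


def apparent_dirty_new(c: PageCounts) -> int:
    # New accounting: only pages that can still be written count as dirty.
    return c.dirty


def apparent_writeback_new(c: PageCounts) -> int:
    # New accounting: unstable pages are treated like writeback pages.
    return c.writeback + c.unstable


def commits_for(total_bytes: int, bytes_per_call: int) -> int:
    # One COMMIT per nfs_writepages() call, so COMMITs = ceil(total / call size).
    return (total_bytes + bytes_per_call - 1) // bytes_per_call


counts = PageCounts(dirty=16, writeback=0, unstable=240)
print("old 'dirty' view:", apparent_dirty_old(counts) * PAGE_SIZE, "bytes")
print("new 'dirty' view:", apparent_dirty_new(counts) * PAGE_SIZE, "bytes")
print("COMMITs for 1GB in 32KB calls:", commits_for(1 << 30, 32 * 1024))
print("COMMITs for 1GB in 1MB calls:", commits_for(1 << 30, 1 << 20))
```

With these made-up counters the old view reports 256 pages (1MB) as
"dirty" although only 16 pages (64KB) can actually be written, which is
the kind of mismatch that leads to small nfs_writepages() calls. The
COMMIT counts differ by a factor of 32 between the two call sizes.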
I cannot see exactly how the change to 'unstable' accounting affected
loop-back NFS, but I can see that the old +25% heuristic can no longer
be justified given the complexity of writeback calculations.

So the first of the following patches changes how writeback is handled
for NFS servers handling loop-back requests (and other similar
services) so that it is more obviously safe against excessive dirty
pages scheduled for other devices.

Thanks,
NeilBrown

--=-=-=
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEG8Yp69OQ2HB7X0l6Oeye3VZigbkFAl6FKTYACgkQOeye3VZi
gbkMhQ/9Eb/q3J6cLCyuDP/OqMQWyM38kLkbAhYYBgjcrgn7r/VmruQJfREPQGZc
f2i7X9VZiBFCIh0HXRdsR17d4qU6NCtAtf234EppdkceGrT+yA1RNdUV1nOFZJCY
1Qs/xyNzHOgzveedx2wuGJ5BA94Dd6MVeNE+DxFEgzqPexp14/8vqALhbLB0GLJu
eN8R7w+uCIUvTDeQp0TFuG6aUWDcQoIWi3aCxMfTICyYxjG35Nss2N/N7HinmZa/
zg8nMnE/iCFc7It89N/6i8IjjAE62SBcj5kfhzdqY3DguVX6nio3raef/ZMoH3bS
j7DEQacqwUVOsvoLutEzGBRRZ+GQEa2+Cal5AniuUpBOfOr+DyhOOpSVDPNLBY/b
7yTzK1BR1ttFI3NtJpKoFbNdKfpWkIpdebPhe6AcfOT+rhnbgXWkl15oexsiZWOU
q5k49bL9HPZ6NRsMng2pS2W7BYNQVqin70XuO2XnOTHLa+BOBh0cm6k0QTjc+XI+
/bvNonecFYqQMAcDtVDwo7G3bCwPxcSfcUDM5QD+TqbJ5tZLF2yHu1KWcZhTrtWf
q++Lrs0NGYXoe/iyqtNYkdBFOk/4YzWVZzzfNi8feV1oqYRACqKhSmicFhdr4OYc
vEfR84TCAEldq/TWUw0o51i2HvZvcsJ1hwl5PdwnCAGsnQC8rsw=
=nBP7
-----END PGP SIGNATURE-----
--=-=-=--