From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-0.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no
	version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id DBBF8C35242
	for ; Tue, 11 Feb 2020 19:06:04 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id BD32E2168B
	for ; Tue, 11 Feb 2020 19:06:04 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1730731AbgBKTGE (ORCPT ); Tue, 11 Feb 2020 14:06:04 -0500
Received: from shelob.surriel.com ([96.67.55.147]:58556 "EHLO
	shelob.surriel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1728503AbgBKTGD (ORCPT );
	Tue, 11 Feb 2020 14:06:03 -0500
Received: from imladris.surriel.com ([96.67.55.152])
	by shelob.surriel.com with esmtpsa (TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384:256)
	(Exim 4.92.3)
	(envelope-from )
	id 1j1ar4-0001St-CT; Tue, 11 Feb 2020 14:05:54 -0500
Message-ID: <29b6e848ff4ad69b55201751c9880921266ec7f4.camel@surriel.com>
Subject: Re: [PATCH] vfs: keep inodes with page cache off the inode
 shrinker LRU
From: Rik van Riel
To: Johannes Weiner, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Dave Chinner, Yafang Shao, Michal Hocko, Roman Gushchin,
	Andrew Morton, Linus Torvalds, Al Viro, kernel-team@fb.com
Date: Tue, 11 Feb 2020 14:05:38 -0500
In-Reply-To: <20200211175507.178100-1-hannes@cmpxchg.org>
References: <20200211175507.178100-1-hannes@cmpxchg.org>
Content-Type: multipart/signed; micalg="pgp-sha256";
	protocol="application/pgp-signature"; boundary="=-be8J/gfsysm0dy6l7psF"
User-Agent: Evolution 3.34.2 (3.34.2-1.fc31)
MIME-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

--=-be8J/gfsysm0dy6l7psF
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Tue, 2020-02-11 at 12:55 -0500, Johannes Weiner wrote:
> The VFS inode shrinker is currently allowed to reclaim inodes with
> populated page cache. As a result it can drop gigabytes of hot and
> active page cache on the floor without consulting the VM (recorded as
> "inodesteal" events in /proc/vmstat).
>=20
> This causes real problems in practice. Consider for example how the VM
> would cache a source tree, such as the Linux git tree. As large parts
> of the checked out files and the object database are accessed
> repeatedly, the page cache holding this data gets moved to the active
> list, where it's fully (and indefinitely) insulated from one-off cache
> moving through the inactive list.

> This behavior of invalidating page cache from the inode shrinker goes
> back to even before the git import of the kernel tree. It may have
> been less noticeable when the VM itself didn't have real workingset
> protection, and floods of one-off cache would push out any active
> cache over time anyway. But the VM has come a long way since then and
> the inode shrinker is now actively subverting its caching strategy.

Two things come to mind when looking at this:
- highmem
- NUMA

IIRC one of the reasons reclaim is done in this way is
because a page cache page in one area of memory (highmem,
or a NUMA node) can end up pinning inode slab memory in
another memory area (normal zone, other NUMA node).

I do not know how much of a concern that still is nowadays,
but it seemed something worth bringing up.

--=20
All Rights Reversed.

--=-be8J/gfsysm0dy6l7psF
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: This is a digitally signed message part
Content-Transfer-Encoding: 7bit

-----BEGIN PGP SIGNATURE-----

iQEzBAABCAAdFiEEKR73pCCtJ5Xj3yADznnekoTE3oMFAl5C+wIACgkQznnekoTE
3oO/IQgAl8ZKBW1n3o9BCqwLSqcu66jPS/q2dziIacDoXS3zW7ME3LAqluQa3Qen
cN2+lPymRfObV9cUMHBd5Q8lZSPu4ABn/Vgp5I37pyA9WOgfC3yLVWvgbWIXn40u
Rnl9TQn6TIsvZTY/3VD3MYrbry3Q87wrOrrUyRzeL7kZQ3s6njARKXrN44yN+ABf
DirTGAH3PeBMd+JZNVT3yAGcp3EW1Oe2Fda99orpAh/kD7dKK1Gat/s2k0AwHvZz
o3zhYqLbIi+4cNGj/g234KsMJpEfRwjZxVcsYaenm3qaWR4arNYV/5+M0lYsRNYK
8YRHaOQR5GDctvip88bDvdThWfDplw==
=XJG5
-----END PGP SIGNATURE-----

--=-be8J/gfsysm0dy6l7psF--
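
For anyone who wants to check whether the "inodesteal" events mentioned
in the quoted patch description show up on their own machine, here is a
rough sketch in Python. It assumes the counters appear in /proc/vmstat
as "pginodesteal" and "kswapd_inodesteal"; the function names are made
up for illustration and are not part of the patch or of any kernel tool.

```python
from pathlib import Path

# Counter names assumed to appear in /proc/vmstat on Linux.
INODESTEAL_KEYS = ("pginodesteal", "kswapd_inodesteal")


def parse_inodesteal(vmstat_text):
    """Return the inodesteal counters found in /proc/vmstat-style text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        # Each vmstat line is "<name> <value>"; skip anything else.
        if len(parts) == 2 and parts[0] in INODESTEAL_KEYS:
            counters[parts[0]] = int(parts[1])
    return counters


def read_inodesteal(path="/proc/vmstat"):
    """Read the counters from the running kernel (Linux only)."""
    return parse_inodesteal(Path(path).read_text())
```

Calling read_inodesteal() before and after a workload and comparing the
two results gives a rough idea of how many page cache pages went away
through inode reclaim in that interval. It says nothing about whether
those pages were hot.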