From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6C1FCC61DA4 for ; Tue, 14 Mar 2023 01:51:01 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 98CD46B0072; Mon, 13 Mar 2023 21:51:00 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 93C5A6B0074; Mon, 13 Mar 2023 21:51:00 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 82B136B0075; Mon, 13 Mar 2023 21:51:00 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 731B06B0072 for ; Mon, 13 Mar 2023 21:51:00 -0400 (EDT) Received: from smtpin06.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 433D440FDA for ; Tue, 14 Mar 2023 01:51:00 +0000 (UTC) X-FDA: 80565825480.06.8D2F77F Received: from dormouse.elm.relay.mailchannels.net (dormouse.elm.relay.mailchannels.net [23.83.212.50]) by imf30.hostedemail.com (Postfix) with ESMTP id 9ADD280012 for ; Tue, 14 Mar 2023 01:50:56 +0000 (UTC) Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=stgolabs.net header.s=dreamhost header.b=NWd4+fd9; arc=pass ("mailchannels.net:s=arc-2022:i=1"); spf=pass (imf30.hostedemail.com: domain of dave@stgolabs.net designates 23.83.212.50 as permitted sender) smtp.mailfrom=dave@stgolabs.net; dmarc=none ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1678758657; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=CeVq90zpoQPpRIIwvxHQSv+0qiRjCSPI+wnNP15M1+k=; b=p07t82rxVHN1chNutMqxBPRpsN+WNmi/xIivY0U4vY8KA0CzdBu5a5wbZe046DwU4E8cyJ BdjRC5RKH77DNrwGv03VrV9wvdiWO3StQP4khXQyORnZ4oSRsMDgvVAFKVG4wDW3rmluLh t2Gwmrs3pNd5pImrSE0bSnsmIKVUJBA= ARC-Authentication-Results: i=2; imf30.hostedemail.com; dkim=pass header.d=stgolabs.net header.s=dreamhost header.b=NWd4+fd9; arc=pass ("mailchannels.net:s=arc-2022:i=1"); spf=pass (imf30.hostedemail.com: domain of dave@stgolabs.net designates 23.83.212.50 as permitted sender) smtp.mailfrom=dave@stgolabs.net; dmarc=none ARC-Seal: i=2; s=arc-20220608; d=hostedemail.com; t=1678758657; a=rsa-sha256; cv=pass; b=mQWogiyinWqN8nDJWCc0FIAL2d2Ac0K8tJ4+qTTPXxUcUbmwcbPUzftqr/9zOJ7Dh+kzcu NZ7Aar1ItmPm2oEuSuyUXByRUPn60HiaTwaInNnMglSUqIvETsj72YqKr82TkhDYr9wFYs kblnpBGDuVvRx5bZzio/GACxyESLHwg= X-Sender-Id: dreamhost|x-authsender|dave@stgolabs.net Received: from relay.mailchannels.net (localhost [127.0.0.1]) by relay.mailchannels.net (Postfix) with ESMTP id DF9FE9210B7; Tue, 14 Mar 2023 01:50:54 +0000 (UTC) Received: from pdx1-sub0-mail-a209.dreamhost.com (unknown [127.0.0.6]) (Authenticated sender: dreamhost) by relay.mailchannels.net (Postfix) with ESMTPA id 212A8920D95; Tue, 14 Mar 2023 01:50:54 +0000 (UTC) ARC-Seal: i=1; s=arc-2022; d=mailchannels.net; t=1678758654; a=rsa-sha256; cv=none; b=8Umcoy/eo5+arqVc3uP0nZwdkwBgS1hIQ+GJT4m6C7WWSLKB3fTB3YmcGiFN2Ee8hD1KP8 4ESnavRUjghjafWHc5IgXG+mfqy5DMSyNW/v1ozpIl/XKL+NbRRosBkWYj5ehDdRMmHNm4 KQ5Bwy7hmFa3UaHVY7eUdskCPelh8MiZLcRPl0Y0xXRbdlKw1synoql3lG2x8d87d/tBhi qBRBNcGACoASZXKWXR/y3w9rWJOc047jMj13zKdpPzBg2VBlphagPrsy6TsXOxiLXIa03+ 9n8Q13j7e2v1+bDd7X/au5YDYMWNyBzAl+7i1SZp21vLCJkuS/SX6NwaJPo6xA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=mailchannels.net; s=arc-2022; t=1678758654; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references:dkim-signature; bh=CeVq90zpoQPpRIIwvxHQSv+0qiRjCSPI+wnNP15M1+k=; b=QgNw5DBchnlvfkc8ZlzadIfG7seuBBAnGKrIzLBXAmOHE62a9TTHtwUmFhdANHLCIf8RqZ 0gXeY1IbDN66klVrHsoLbz2BSmGhJR3E345PPVWre4wAlKXE6xioz9a1Gb7cCRN1zs0Eeo x7QO3u8FGBd0tfquvJF3jtfFxCyyoTi4NnOk7r7XcMaWlYiMobMwq1tEDT4mML0SA7Gd0a XjhaHd5GywqYcRnWdxizJfMMKtAkgB8vg6Jw3aLvHiKvDLUeAL4V6PsJ7E1uCtTWY2aF6D WO0TWMMV4n0woOFQ4SpZLpPRVYBzu+3JNAgQhDxLGR0ZYOoQBLZI6p/r+iA3sA== ARC-Authentication-Results: i=1; rspamd-7b575b84b5-scwl7; auth=pass smtp.auth=dreamhost smtp.mailfrom=dave@stgolabs.net X-Sender-Id: dreamhost|x-authsender|dave@stgolabs.net X-MC-Relay: Neutral X-MailChannels-SenderId: dreamhost|x-authsender|dave@stgolabs.net X-MailChannels-Auth-Id: dreamhost X-Absorbed-Print: 27b296e03cb36e20_1678758654506_3032418653 X-MC-Loop-Signature: 1678758654506:3492920659 X-MC-Ingress-Time: 1678758654506 Received: from pdx1-sub0-mail-a209.dreamhost.com (pop.dreamhost.com [64.90.62.162]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384) by 100.102.66.52 (trex/6.7.2); Tue, 14 Mar 2023 01:50:54 +0000 Received: from offworld (unknown [104.36.30.209]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: dave@stgolabs.net) by pdx1-sub0-mail-a209.dreamhost.com (Postfix) with ESMTPSA id 4PbGhF11ynz1w; Mon, 13 Mar 2023 18:50:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=stgolabs.net; s=dreamhost; t=1678758653; bh=CeVq90zpoQPpRIIwvxHQSv+0qiRjCSPI+wnNP15M1+k=; h=Date:From:To:Cc:Subject:Content-Type; b=NWd4+fd9Ah4hDRUxmKoySKU/RhNnaTRT4RDNBDtsRBOJlBPZlx1eDwBcchDuiuNd/ sKO7OsMzYTw79czSIBRko+AzXXBmSElDyuql7E/AVm0QkVIbDSIdttvjQvBdqRdR1o QvGIqQES+PVfwGcXT6MJuUBnk2sc7QiDCc4ZYv6OXeTttfcS8VvLBIb6JCg6JI0AzS Wpa8MkTVxY/Lo9TK7c3iBQ/r+FuTAjL/RyQjhXSQ8MQMKa+l+YPFC8NvTy0IwwAnGE 6yF82jryLFnUstgZgimMAMEuOsETvxxx5b9AkTpk5l0044CC/UkHDfgH6gx1179Do5 3+/K010UKZYTg== Date: Mon, 13 Mar 2023 18:21:30 -0700 From: Davidlohr Bueso To: Luis Chamberlain Cc: hughd@google.com, akpm@linux-foundation.org, willy@infradead.org, brauner@kernel.org, linux-mm@kvack.org, p.raghav@samsung.com, da.gomez@samsung.com, a.manzanares@samsung.com, yosryahmed@google.com, keescook@chromium.org, patches@lists.linux.dev, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2 0/6] tmpfs: add the option to disable swap Message-ID: <20230314012130.3peeuifjjemynklk@offworld> Mail-Followup-To: Luis Chamberlain , hughd@google.com, akpm@linux-foundation.org, willy@infradead.org, brauner@kernel.org, linux-mm@kvack.org, p.raghav@samsung.com, da.gomez@samsung.com, a.manzanares@samsung.com, yosryahmed@google.com, keescook@chromium.org, patches@lists.linux.dev, linux-kernel@vger.kernel.org References: <20230309230545.2930737-1-mcgrof@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: <20230309230545.2930737-1-mcgrof@kernel.org> User-Agent: NeoMutt/20220429 X-Rspam-User: X-Rspamd-Server: rspam03 X-Stat-Signature: u1u4sbchkfw519jkxwx9cgss5pkj6njg X-Rspamd-Queue-Id: 9ADD280012 X-HE-Tag: 1678758656-137636 X-HE-Meta: U2FsdGVkX1+ip+mBzxuY+nMQIVm9MHuc0Hip4Oetp5y6PGgWftQBzdRdp4Ok6mu3k0wkhdHjIzQiIU52YltH6qP8miqr7zZRSBOii2ltwlq19sz5KFoIpBMP0JtLY07rplPSK3Y7DP4FO2gH4npbFKudjzj1AT2WC3dRnAlKCNYB9bTJ2xuuFSB7B9mAjLIi9VbuCPi5DEbt2/lArahun/ddLDof/cNSBxFEnGx8PijcvrfoVO5iBzUnW6ONgi6PJDqIlOy/KfheWNotImZPuyjxgdEqFDCHL9VjC/VNwgYHp8yjXB1Mv9q/DWCgeUa9SG0j0RLIbLxDq5RAaCYOfNO01DDDyCLCSo3CnMhLk+IHzcRR07ANyQoAI1PtlnssUHIYmNB1stq+HFLsRL39NBI/X3vSg/htQDkJ+iWRFnGKiMCX6WYo+AvvTT9cGzo+6LPwFhbYiF167KfhD4O0wXDKQgBdLCrdqdcgbsT0MBlzq3ImyFLtZkdXYeOr2R5N0yW3lk0lThilHJvEshD4V3GJzG46s/hAcG5I9FzwfOD6trxS1IkG3weAgXB2JnA3Vw2lIhcYtXrHDycj4l9h1wk9QMfG4dgcG3Fin9+bnMmZ4/B2t2PxDM16qcOJewhxnxS9XuoOq4XT5fFb0aLjzohhnTk8TH3ZDJJEeMidAV1fzMnvgerez5J1l6X7rv4VC7BryivPzJnM9ueoWdZiDI8iNJgUVHPWjoTL71vKya3tid3JeThMxN9lPOw+kifhHfjx2LGyprXdaj1BBhBS5Pbb0zn4o/+2a8w8HdCnuY/wQ05lDI+cTB9pMcKHelZQJr4Ts/mtOjtUZnDyZePt8k43HjDr+bjjb9uiks+/eowJ5KE/Kjdz/jJhLkgmFX7vW4ywedi3ORhvw+uczJgKzNMZRP/tM+3Ggbdqs6f6qt/fZTK+5WN5CM4CkyBKCB/1g+MwOaRY87taFQFGr9g 4yJZMNN0 qqUk0FxoRuJrnJGIeBgGCYPiIC9wPaEDkH4qMze6Zd9+NbeggyKNLancaEzezuCqYrMYjCPYt490E36sa5VSvTExNnZrqdqj9PuH6zkflKp+EsVw9riIHtQVxA3bP3TaZIk4pfn3AXODbIDmgaWLRPCvIMafrAe46AcTnEy4do/qROWQraRfwV4Thu0QpiEBrwoKAlyzEl46U/WjYQz5GQNRexILEoranlSYuOz4MhkHTiOr+EiesbrDLQQuJfyqs/LfcXeLlAY20s/qDO+s14QqaKzJXuNHtZxqfp7RM60YUWCFaMnb7gnTh25dOYa8WPhruGKUbDq3js3A4UdQN5zr9ldvbIc48uyKVfNqlqHO/zZGE387rb25lYg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Thu, 09 Mar 2023, Luis Chamberlain wrote: >Changes on this v2 PATCH series: > > o Added all respective tags for Reviewed-by, Acked-by's > o David Hildenbrand suggested on the update-docs patch to mention THP. > It turns out tmpfs.rst makes absolutely no mention to THP at all > so I added all the relevant options to the docs including the > system wide sysfs file. All that should hopefully demistify that > and make it clearer. > o Yosry Ahmed spell checked my patch "shmem: add support to ignore swap" > >Changes since RFCv2 to the first real PATCH series: > > o Added Christian Brauner'd Acked-by for the noswap patch (the only > change in that patch is just the new shmem_show_options() change I > describe below). > o Embraced Yosry Ahmed's recommendation to use mapping_set_unevictable() > to at ensure the folios at least appear in the unevictable LRU. > Since that is the goal, this accomplishes what we want and the VM > takes care of things for us. The shem writepage() still uses a stop-gap > to ensure we don't get called for swap when its shmem uses > mapping_set_unevictable(). > o I had evaluated using shmem_lock() instead of calling mapping_set_unevictable() > but upon my review this doesn't make much sense, as shmem_lock() was > designed to make use of the RLIMIT_MEMLOCK and this was designed for > files / IPC / unprivileged perf limits. If we were to use > shmem_lock() we'd bump the count on each new inode. Using > shmem_lock() would also complicate inode allocation on shmem as > we'd to unwind on failure from the user_shm_lock(). It would also > beg the question of when to capture a ucount for an inode, should we > just share one for the superblock at shmem_fill_super() or do we > really need to capture it at every single inode creation? In theory > we could end up with different limits. The simple solution is to > juse use mapping_set_unevictable() upon inode creation and be done > with it, as it cannot fail. > o Update the documentation for tmpfs before / after my patch to > reflect use cases a bit more clearly between ramfs, tmpfs and brd > ramdisks. > o I updated the shmem_show_options() to also reveal the noswap option > when its used. > o Address checkpatch style complaint with spaces before tabs on > shmem_fs.h. > >Chances since first RFC: > > o Matthew suggested BUG_ON(!folio_test_locked(folio)) is not needed > on writepage() callback for shmem so just remove that. > o Based on Matthew's feedback the inode is set up early as it is not > reset in case we split the folio. So now we move all the variables > we can set up really early. > o shmem writepage() should only be issued on reclaim, so just move > the WARN_ON_ONCE(!wbc->for_reclaim) early so that the code and > expectations are easier to read. This also avoid the folio splitting > in case of that odd case. > o There are a few cases where the shmem writepage() could possibly > hit, but in the total_swap_pages we just bail out. We shouldn't be > splitting the folio then. Likewise for VM_LOCKED case. But for > a writepage() on a VM_LOCKED case is not expected so we want to > learn about it so add a WARN_ON_ONCE() on that condition. > o Based on Yosry Ahmed's feedback the patch which allows tmpfs to > disable swap now just uses mapping_set_unevictable() on inode > creation. In that case writepage() should not be called so we > augment the WARN_ON_ONCE() for writepage() for that case to ensure > that never happens. > >To test I've used kdevops [0] 8 vpcu 4 GiB libvirt guest on linux-next. > >I'm doing this work as part of future experimentation with tmpfs and the >page cache, but given a common complaint found about tmpfs is the >innability to work without the page cache I figured this might be useful >to others. It turns out it is -- at least Christian Brauner indicates >systemd uses ramfs for a few use-cases because they don't want to use >swap and so having this option would let them move over to using tmpfs >for those small use cases, see systemd-creds(1). > >To see if you hit swap: > >mkswap /dev/nvme2n1 >swapon /dev/nvme2n1 >free -h > >With swap - what we see today >============================= >mount -t tmpfs -o size=5G tmpfs /data-tmpfs/ >dd if=/dev/urandom of=/data-tmpfs/5g-rand2 bs=1G count=5 >free -h > total used free shared buff/cache available >Mem: 3.7Gi 2.6Gi 1.2Gi 2.2Gi 2.2Gi 1.2Gi >Swap: 99Gi 2.8Gi 97Gi > > >Without swap >============= > >free -h > total used free shared buff/cache available >Mem: 3.7Gi 387Mi 3.4Gi 2.1Mi 57Mi 3.3Gi >Swap: 99Gi 0B 99Gi >mount -t tmpfs -o size=5G -o noswap tmpfs /data-tmpfs/ >dd if=/dev/urandom of=/data-tmpfs/5g-rand2 bs=1G count=5 >free -h > total used free shared buff/cache available >Mem: 3.7Gi 2.6Gi 1.2Gi 2.3Gi 2.3Gi 1.1Gi >Swap: 99Gi 21Mi 99Gi > >The mix and match remount testing >================================= > ># Cannot disable swap after it was first enabled: >mount -t tmpfs -o size=5G tmpfs /data-tmpfs/ >mount -t tmpfs -o remount -o size=5G -o noswap tmpfs /data-tmpfs/ >mount: /data-tmpfs: mount point not mounted or bad option. > dmesg(1) may have more information after failed mount system call. >dmesg -c >tmpfs: Cannot disable swap on remount > ># Remount with the same noswap option is OK: >mount -t tmpfs -o size=5G -o noswap tmpfs /data-tmpfs/ >mount -t tmpfs -o remount -o size=5G -o noswap tmpfs /data-tmpfs/ >dmesg -c > ># Trying to enable swap with a remount after it first disabled: >mount -t tmpfs -o size=5G -o noswap tmpfs /data-tmpfs/ >mount -t tmpfs -o remount -o size=5G tmpfs /data-tmpfs/ >mount: /data-tmpfs: mount point not mounted or bad option. > dmesg(1) may have more information after failed mount system call. >dmesg -c >tmpfs: Cannot enable swap on remount if it was disabled on first mount Nice! For the whole series: Reviewed-by: Davidlohr Bueso