Date: Tue, 28 Feb 2023 15:22:20 -0800
From: Chris Li
To: Matthew Wilcox
Cc: Yosry Ahmed, lsf-pc@lists.linux-foundation.org, Johannes Weiner,
 Linux-MM, Michal Hocko, Shakeel Butt, David Rientjes, Hugh Dickins,
 Seth Jennings, Dan Streetman, Vitaly Wool, Yang Shi, Peter Xu,
 Minchan Kim, Andrew Morton, Huang Ying, NeilBrown
Subject: Re: [LSF/MM/BPF TOPIC] Swap Abstraction / Native Zswap

Hi Matthew,

On Sun, Feb 19, 2023 at 04:31:33AM +0000, Matthew Wilcox wrote:
>
> I think an overhaul of the swap code is long overdue.
> I appreciate you're very much focused on zswap, but there are many other
> problems. For example, swap does not work on zoned devices. Swap readahead
> is generally physical (ie optimised for spinning discs) rather than logical
> (more appropriate for SSDs). Swap's management of free space is crude
> compared to real filesystems. The way that swap bypasses the filesystem
> when writing to swap files is awful. I haven't even started to look at

Can you expand a bit on that? I assume you want the swap file to behave
more like a normal file system and to reuse more of the readpage() and
writepage() path.

> what changes need to be made to swap in order to swap out arbitrary-order
> folios (instead of PMD-sized + PTE-sized).

When the page fault happens, does the whole folio get swapped in, or is it
broken into smaller pages?

> I'm probably not a great person to participate in the design of a
> replacement system. I don't know nearly enough about anonymous memory.
> I'd be sitting in the back shouting unhelpful things like, "Can't you
> see an anon_vma is the exact same thing as an inode?" and "Why don't
> we steal the block allocation functions from XFS?" and "Why do tmpfs

I notice the swap_map has one byte per swap entry even when the swap is
not used.

> pages have to move to the swap cache; can't we just leave them in the
> page cache and pass them to the swap code directly?"

All great suggestions and I am very interested in that.

Chris