Date: Sun, 19 Mar 2023 23:25:54 -0700
From: Chris Li
To: "Huang, Ying"
Cc: Yosry Ahmed, lsf-pc@lists.linux-foundation.org, Johannes Weiner,
 Linux-MM, Michal Hocko, Shakeel Butt, David Rientjes, Hugh Dickins,
 Seth Jennings, Dan Streetman, Vitaly Wool, Yang Shi, Peter Xu,
 Minchan Kim, Andrew Morton, Aneesh Kumar K V, Wei Xu
Subject: Re: [LSF/MM/BPF TOPIC] Swap Abstraction / Native Zswap
References: <87356e850j.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87y1o571aa.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87o7ox762m.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87bkkt5e4o.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87y1ns3zeg.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <87y1ns3zeg.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Mon, Mar 20, 2023 at 10:55:03AM +0800, Huang, Ying wrote:
> >
> > How so? With the indirection enabled, the page tables & page cache
> > have the swap id (or swap_desc index), which can point to a swap entry
> > or a zswap entry -- which can change when the page is moved between
> > zswap & swapfiles. How is xarray (a) indexed by the swap entry in this
> > case? Shouldn't be indexed by the abstract swap id so that the
> > writeback from zswap is transparent?
>
> In my mind,
>
> - swap core will define a abstract interface to swap implementations
>   (zswap, swap device/file, maybe more in the future), like VFS.

I like your idea very much.

>
> - zswap will be a special swap implementation (compressing instead of
>   writing to disk).

Agree.

>
> - swap core will manage the indirection layer and swap cache.

Agree, those are very good points.

>
> - swap core can move swap pages between swap implementations (e.g., from
>   zswap to a swap device, or from one swap device to another swap
>   device) with the help of the indirection layer.

We need to design the swap cache carefully so that, when a page moves
between swap implementations, there is one shared swap cache. The
current swap cache belongs to swap devices, so two devices would end up
with the same page in two swap caches.

> In this design, the writeback from zswap becomes moving swapped pages
> from zswap to a swap device.

Ack.

>
> If my understanding were correct, your suggestion is kind of moving
> zswap logic to the swap core? And zswap will be always at a higher
> layer on top of swap device/file?

It seems that way to me. I will let Yosry confirm that.

> > I am not sure how this works with zswap. Currently swap_map[]
> > implementation is specific for swapfiles, it does not work for zswap
> > unless we implement separate swap counting logic for zswap &
> > swapfiles.
> > Same for the swapcache, it currently supports being indexed
> > by a swap entry, it would need to support being indexed by a swap id,
> > or have a separate swap cache for zswap. Having separate
> > implementation would add complexity, and we would need to perform
> > handoffs of the swap count/cache when a page is moved from zswap to a
> > swapfile.
>
> We can allocate a swap entry for each swapped page in zswap.

One thing to consider when moving a page from zswap to a swap file: is
the zswap swap entry the same entry as the swap file entry?

> > I think for this proposal, there are only 2 hardcoded tiers. Zswap is
> > fast, swapfile is slow. In the future, we can support more dynamic
> > tiering if the need arises.
>
> We can start from a simple implementation. And I think that it's better
> to consider the general design too. Try not to make it impossible now.

In my mind there are a few use cases:
1) Using only a swap file.
2) Using only zswap, no swap file.
3) Using zswap + swap file (SSD).

The swap core should handle all three cases well, with minimal memory
waste.

Chris
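
P.S. To make the indirection point a bit more concrete, here is a rough
userspace sketch, not kernel code and not taken from any posted patch.
It assumes the page tables hold an abstract swap id, and that the swap
core keeps one descriptor per id recording which backend currently holds
the data. Every name in it (swap_desc fields, swap_id_alloc,
swap_id_writeback, the BACKEND_* values, MAX_SWAP_IDS) is hypothetical
and only there for illustration. It shows that writeback from zswap can
change the descriptor while the id and the swap count stay put, so no
count/cache handoff is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend kinds; illustrative only, not kernel API. */
enum swap_backend { BACKEND_NONE = 0, BACKEND_ZSWAP, BACKEND_SWAPFILE };

/* One descriptor per swapped-out page, indexed by an abstract swap id. */
struct swap_desc {
	enum swap_backend backend; /* where the data currently lives */
	unsigned long slot;        /* backend-private location */
	unsigned int count;        /* swap count kept by the core, not the backend */
};

#define MAX_SWAP_IDS 16 /* arbitrary size for the sketch */
static struct swap_desc swap_table[MAX_SWAP_IDS];

/* Allocate a swap id for a page stored in 'backend'; returns -1 if full. */
static int swap_id_alloc(enum swap_backend backend, unsigned long slot)
{
	for (int id = 0; id < MAX_SWAP_IDS; id++) {
		if (swap_table[id].backend == BACKEND_NONE) {
			swap_table[id].backend = backend;
			swap_table[id].slot = slot;
			swap_table[id].count = 1;
			return id;
		}
	}
	return -1;
}

/*
 * Writeback modeled as a move from zswap to a swap file: only the
 * descriptor changes, so holders of the swap id never notice.
 * Returns 0 on success, -1 if the page is not currently in zswap.
 */
static int swap_id_writeback(int id, unsigned long swapfile_slot)
{
	assert(id >= 0 && id < MAX_SWAP_IDS);

	struct swap_desc *desc = &swap_table[id];

	if (desc->backend != BACKEND_ZSWAP)
		return -1;
	desc->backend = BACKEND_SWAPFILE;
	desc->slot = swapfile_slot;
	return 0;
}
```

In this model the three use cases differ only in which backends are
ever passed to swap_id_alloc(); whether a real design can afford a
descriptor per page in the swap-file-only case is exactly the memory
waste question above.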