Date: Fri, 24 Mar 2023 10:23:24 -0700
From: Chris Li
To: Yosry Ahmed
Cc: "Huang, Ying", lsf-pc@lists.linux-foundation.org, Johannes Weiner,
 Linux-MM, Michal Hocko, Shakeel Butt, David Rientjes, Hugh Dickins,
 Seth Jennings, Dan Streetman, Vitaly Wool, Yang Shi, Peter Xu,
 Minchan Kim, Andrew Morton, Aneesh Kumar K V, Michal Hocko, Wei Xu
Subject: Re: [LSF/MM/BPF TOPIC] Swap Abstraction / Native Zswap
References: <87y1ns3zeg.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <874jqcteyq.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87v8isrwck.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87bkkkrps8.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87sfduri1j.fsf@yhuang6-desk2.ccr.corp.intel.com>

On
Fri, Mar 24, 2023 at 12:28:31AM -0700, Yosry Ahmed wrote:
> > In fact, I just suggest to use the minimal design on top of the current
> > implementation as the first step. Then, you can improve it step by
> > step.
> >
> > The first step could be the minimal effort to implement indirection
> > layer and moving swapped pages between swap implementations. Based on
> > that, you can build other optimizations, such as pulling swap counting
> > to the swap core. For each step, we can evaluate the gain and cost with
> > data.
>
> Right, I understand that, but to implement the indirection layer on
> top of the current implementation, then we will need to support using
> zswap without a backing swap device. In order to do this without

I agree with Ying on the minimal approach here as well.

There are two ways to approach this.

1) Forget zswap and make a minimal implementation that moves pages
between two swapfile devices. They can be swapfiles backed by two
loopback files. Any indirection layer you design will need to cover
this use case anyway.

2) Make zswap work without a swapfile. You can implement zswap on top
of a fake "ghost" swapfile.

If you keep zswap as a frontswap backend, just make zswap work without
a real swapfile. Make that your first minimal step. Then it does not
need any swap count changes.

I view that step as independent of moving pages between swap devices.

That patch exists, and I consider it valuable to some users.

> > Anyway, I don't think you can just implement all your final solution in
> > one step. And, I think the minimal design suggested could be a starting
> > point.
>
> I agree that's a great point, I am just afraid that we will avoid
> implementing that full final solution and instead do a lot of work
> inside zswap to make up for the difference (e.g. swap entry
> management, swap counting). Also, that work in zswap may end up being
> unacceptable due to the maintenance burden and/or complexity.

If you do either 1) or 2), you can keep these two paths separate, even
if you want to move pages between zswap and a swapfile.

Idea 3) You don't have to change the swap count code. You can make a
minimal change that moves the page between zswap and another block
device. That way you get two different swap entries with the existing
code.

Chris
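
To make options 1) and 3) a bit more concrete, here is a toy userspace
model in Python. It is not kernel code and not the design proposed in
this thread. Every name in it (SwapEntry, SwapDevice, move_page, the
"table" dict) is hypothetical and exists only for illustration. It
assumes a swap entry is just a (device, offset) pair and that moving a
page means "store on the destination, then free the source slot", so
the page ends up with a different swap entry on the second device.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class SwapEntry:
    """Hypothetical (device, offset) pair identifying one swapped page."""
    device: str
    offset: int


@dataclass
class SwapDevice:
    """Toy swap backend with a fixed number of page slots."""
    name: str
    nr_slots: int
    slots: Dict[int, bytes] = field(default_factory=dict)

    def store(self, data: bytes) -> SwapEntry:
        # Take the first free slot; a real allocator is far more involved.
        for offset in range(self.nr_slots):
            if offset not in self.slots:
                self.slots[offset] = data
                return SwapEntry(self.name, offset)
        raise MemoryError(f"{self.name} is full")

    def load(self, entry: SwapEntry) -> bytes:
        return self.slots[entry.offset]

    def free(self, entry: SwapEntry) -> None:
        del self.slots[entry.offset]


def move_page(entry: SwapEntry, src: SwapDevice, dst: SwapDevice) -> SwapEntry:
    """Copy the page to dst, free it on src, and return the new entry."""
    new_entry = dst.store(src.load(entry))
    src.free(entry)
    return new_entry


if __name__ == "__main__":
    first = SwapDevice("first", nr_slots=2)    # e.g. zswap or swapfile A
    second = SwapDevice("second", nr_slots=4)  # e.g. swapfile B

    # Stand-in for an indirection layer: stable id -> current swap entry.
    table: Dict[int, SwapEntry] = {}
    table[1] = first.store(b"page-A")
    table[1] = move_page(table[1], first, second)
    print(table[1])
```

The only thing the sketch tries to show is that the move itself needs
no shared swap count state: the two devices hand out entries
independently, and only whatever maps the page to its current entry
(here the "table" dict) has to be updated after the move.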