From: "Huang, Ying"
To: Yosry Ahmed
Cc: Chris Li, lsf-pc@lists.linux-foundation.org, Johannes Weiner, Linux-MM, Michal Hocko, Shakeel Butt, David Rientjes, Hugh Dickins, Seth Jennings, Dan Streetman, Vitaly Wool, Yang Shi, Peter Xu, Minchan Kim, Andrew Morton, Aneesh Kumar K V, Michal Hocko, Wei Xu
Subject: Re: [LSF/MM/BPF TOPIC] Swap Abstraction / Native Zswap
References: <87y1ns3zeg.fsf@yhuang6-desk2.ccr.corp.intel.com> <874jqcteyq.fsf@yhuang6-desk2.ccr.corp.intel.com> <87v8isrwck.fsf@yhuang6-desk2.ccr.corp.intel.com> <87bkkkrps8.fsf@yhuang6-desk2.ccr.corp.intel.com> <87sfduri1j.fsf@yhuang6-desk2.ccr.corp.intel.com> <87edpbq96g.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Tue, 28 Mar 2023 14:20:48 +0800
In-Reply-To: (Yosry Ahmed's message of "Mon, 27 Mar 2023 22:54:05 -0700")
Message-ID: <87jzz1pfb3.fsf@yhuang6-desk2.ccr.corp.intel.com>

Yosry Ahmed writes:

> On Sun, Mar 26, 2023 at 6:24 PM Huang, Ying wrote:
>>
>> Chris Li writes:
>>
>> > On Fri, Mar 24, 2023 at 12:28:31AM -0700, Yosry Ahmed wrote:
>> >> > In fact, I just suggest to use the minimal design on top of the current
>> >> > implementation as the first step.  Then, you can improve it step by
>> >> > step.
>> >> >
>> >> > The first step could be the minimal effort to implement indirection
>> >> > layer and moving swapped pages between swap implementations.  Based on
>> >> > that, you can build other optimizations, such as pulling swap counting
>> >> > to the swap core.  For each step, we can evaluate the gain and cost with
>> >> > data.
>> >>
>> >> Right, I understand that, but to implement the indirection layer on
>> >> top of the current implementation, then we will need to support using
>> >> zswap without a backing swap device. In order to do this without
>> >
>> > Agree with Ying on the minimal approach here as well.
>> >
>> > There are two ways to approach this.
>> >
>> > 1) Forget zswap, make a minimal implementation to move the page between
>> > two swapfile device. It can be swapfile back to two loop back files.
>> >
>> > Any indirect layer you design will need to convert this usage case
>> > any way.
>> >
>> > 2) Make zswap work without a swapfile.
>> > You can implement the zswap on a fake ghosts swap file.
>> >
>> > If you keep the zswap as frontswap, just make zswap can work without
>> > a real swapfile.
>> >
>> > Make that as your first minimal step. Then it does not need to touch
>> > the swap count changes.
>> >
>> > I view make that step is independent of moving pages between swap device.
>> >
>> > That patch exists and I consider it has value to some users.
>>
>> This sounds like an even smaller approach as the first step.  Further
>> improvement can be built on top of it.
>
> I am not sure how this would be a step towards the abstraction goal we
> have been discussing.
>
> We have been discussing starting out with a minimal indirection layer,
> in the shape of an xarray that maps a swap ID to a swap entry, and
> that can be disabled with a config option.
>
> For such a design to work, we have to implement swap entry management
> & swap counting in zswap, right? Am I missing something?
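
For readers following the thread, the indirection layer described in the quoted text above (a map from a swap ID to a swap entry) could be sketched roughly as follows. This is a userspace Python illustration only, not kernel code and not anyone's proposed patch. A plain dict stands in for the xarray, and the names `SwapEntry`, `SwapIndirection`, `alloc`, `lookup`, `move` and `free` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class SwapEntry:
    """Where the data currently lives: a backend name plus a slot in it."""
    backend: str  # hypothetical labels, e.g. "zswap" or "swapfile"
    offset: int


class SwapIndirection:
    """Maps a stable swap ID to the SwapEntry that currently backs it.

    A plain dict stands in for the xarray mentioned in the thread.
    """

    def __init__(self) -> None:
        self._map: Dict[int, SwapEntry] = {}
        self._next_id = 0

    def alloc(self, entry: SwapEntry) -> int:
        """Hand out a new swap ID that resolves to `entry`."""
        swap_id = self._next_id
        self._next_id += 1
        self._map[swap_id] = entry
        return swap_id

    def lookup(self, swap_id: int) -> Optional[SwapEntry]:
        """Resolve a swap ID, or return None if it is not mapped."""
        return self._map.get(swap_id)

    def move(self, swap_id: int, new_entry: SwapEntry) -> None:
        """Retarget an ID to another backend; holders of the ID see no change."""
        if swap_id not in self._map:
            raise KeyError(swap_id)
        self._map[swap_id] = new_entry

    def free(self, swap_id: int) -> None:
        """Drop the mapping for a swap ID."""
        self._map.pop(swap_id, None)


# A page first lands in the compressed backend and is later moved to a
# swapfile; the swap ID that users hold stays the same throughout.
layer = SwapIndirection()
sid = layer.alloc(SwapEntry("zswap", 7))
layer.move(sid, SwapEntry("swapfile", 1234))
```

The point of the sketch is that `move` changes only the mapping, so moving swapped pages between swap implementations would not require updating whoever holds the ID. Swap counting is deliberately left out here, since where it should live is exactly what the thread is debating.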

Chris suggested avoiding the implementation of swap entry management and
swap counting in zswap by using a "fake ghost swap file".  I have copied
his suggestion below,

"
>> > 2) Make zswap work without a swapfile.
>> > You can implement the zswap on a fake ghosts swap file.
>> >
>> > If you keep the zswap as frontswap, just make zswap can work without
>> > a real swapfile.
>> >
>> > Make that as your first minimal step. Then it does not need to touch
>> > the swap count changes.
"

Best Regards,
Huang, Ying

>> >> > Anyway, I don't think you can just implement all your final solution in
>> >> > one step.  And, I think the minimal design suggested could be a starting
>> >> > point.
>> >>
>> >> I agree that's a great point, I am just afraid that we will avoid
>> >> implementing that full final solution and instead do a lot of work
>> >> inside zswap to make up for the difference (e.g. swap entry
>> >> management, swap counting). Also, that work in zswap may end up being
>> >> unacceptable due to the maintenance burden and/or complexity.
>> >
>> > If you do either 1) or 2), you can keep these two paths separate.
>> >
>> > Even if you want to move the page between zswap and swapfile.
>> >
>> > Idea 3)
>> > You don't have to change the swap count code, you can do a
>> > minimal change moves the page between zswap and another block
>> > device. That way you can get two different swap entry with
>> > existing code.
>> >
>> > Chris
>>
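
As a rough illustration of the "ghost swap file" idea quoted above, the sketch below models a swap device that still allocates slots and keeps per-slot counts but has no storage behind it, with a compressed cache in the frontswap role holding the data. It is a userspace Python model built on assumptions, not the patch Chris refers to. `GhostSwapDevice`, `CompressedCache` and `swap_out` are hypothetical names, and `zlib` merely stands in for whatever compressor zswap would use.

```python
import zlib
from typing import Dict, List, Optional


class GhostSwapDevice:
    """A swap device with slots and counts but no storage behind it."""

    def __init__(self, nr_slots: int) -> None:
        # swap_map[i] is the reference count of slot i; 0 means free.
        self.swap_map: List[int] = [0] * nr_slots

    def alloc_slot(self) -> Optional[int]:
        """Return the first free slot with its count set to 1, or None."""
        for slot, count in enumerate(self.swap_map):
            if count == 0:
                self.swap_map[slot] = 1
                return slot
        return None

    def free_slot(self, slot: int) -> None:
        """Drop one reference to a slot."""
        if self.swap_map[slot] > 0:
            self.swap_map[slot] -= 1

    def write(self, slot: int, data: bytes) -> bool:
        """There is nothing to write to, so device I/O always fails."""
        return False


class CompressedCache:
    """Stand-in for zswap in its frontswap role, keyed by swap slot."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.pages: Dict[int, bytes] = {}

    def store(self, slot: int, data: bytes) -> bool:
        """Keep a compressed copy; reject new slots once the pool is full."""
        if slot not in self.pages and len(self.pages) >= self.capacity:
            return False
        self.pages[slot] = zlib.compress(data)
        return True

    def load(self, slot: int) -> Optional[bytes]:
        """Return the decompressed page for a slot, or None if absent."""
        blob = self.pages.get(slot)
        return None if blob is None else zlib.decompress(blob)


def swap_out(dev: GhostSwapDevice, cache: CompressedCache,
             data: bytes) -> Optional[int]:
    """Allocate a slot as usual, then try the cache before the device."""
    slot = dev.alloc_slot()
    if slot is None:
        return None
    if cache.store(slot, data) or dev.write(slot, data):
        return slot
    dev.free_slot(slot)  # nowhere to put the data, so undo the allocation
    return None


dev = GhostSwapDevice(nr_slots=4)
cache = CompressedCache(capacity=1)
first = swap_out(dev, cache, b"a" * 4096)   # accepted by the cache
second = swap_out(dev, cache, b"b" * 4096)  # cache full, ghost device fails
```

In this model slot allocation and counting stay entirely in the device's `swap_map`, which mirrors the claim in the quoted text that this step "does not need to touch the swap count changes". The cost is that a rejected store has no fallback, so the swap-out simply fails.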