From: Yosry Ahmed
Date: Mon, 27 Mar 2023 22:54:05 -0700
Subject: Re: [LSF/MM/BPF TOPIC] Swap Abstraction / Native Zswap
To: "Huang, Ying"
Cc: Chris Li, lsf-pc@lists.linux-foundation.org, Johannes Weiner, Linux-MM, Michal Hocko, Shakeel Butt, David Rientjes, Hugh Dickins, Seth Jennings, Dan Streetman, Vitaly Wool, Yang Shi, Peter Xu, Minchan Kim, Andrew Morton, Aneesh Kumar K V, Michal Hocko, Wei Xu
In-Reply-To: <87edpbq96g.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Sun, Mar 26, 2023 at 6:24 PM Huang, Ying wrote:
>
> Chris Li writes:
>
> > On Fri, Mar 24, 2023 at 12:28:31AM -0700, Yosry Ahmed wrote:
> >> > In fact, I just suggest to use the minimal design on top of the current
> >> > implementation as the first step. Then, you can improve it step by
> >> > step.
> >> >
> >> > The first step could be the minimal effort to implement indirection
> >> > layer and moving swapped pages between swap implementations. Based on
> >> > that, you can build other optimizations, such as pulling swap counting
> >> > to the swap core. For each step, we can evaluate the gain and cost with
> >> > data.
> >>
> >> Right, I understand that, but to implement the indirection layer on
> >> top of the current implementation, then we will need to support using
> >> zswap without a backing swap device. In order to do this without
> >
> > Agree with Ying on the minimal approach here as well.
> >
> > There are two ways to approach this.
> >
> > 1) Forget zswap, make a minimal implementation to move the page between
> > two swapfile device. It can be swapfile back to two loop back files.
> >
> > Any indirect layer you design will need to convert this usage case
> > any way.
> >
> > 2) Make zswap work without a swapfile.
> > You can implement the zswap on a fake ghosts swap file.
> >
> > If you keep the zswap as frontswap, just make zswap can work without
> > a real swapfile.
> >
> > Make that as your first minimal step. Then it does not need to touch
> > the swap count changes.
> >
> > I view make that step is independent of moving pages between swap device.
> >
> > That patch exists and I consider it has value to some users.
>
> This sounds like an even smaller approach as the first step. Further
> improvement can be built on top of it.

I am not sure how this would be a step towards the abstraction goal we
have been discussing.

We have been discussing starting out with a minimal indirection layer,
in the shape of an xarray that maps a swap ID to a swap entry, and
that can be disabled with a config option.

For such a design to work, we have to implement swap entry management
& swap counting in zswap, right? Am I missing something?

>
> Best Regards,
> Huang, Ying
>
> >> > Anyway, I don't think you can just implement all your final solution in
> >> > one step. And, I think the minimal design suggested could be a starting
> >> > point.
> >>
> >> I agree that's a great point, I am just afraid that we will avoid
> >> implementing that full final solution and instead do a lot of work
> >> inside zswap to make up for the difference (e.g. swap entry
> >> management, swap counting). Also, that work in zswap may end up being
> >> unacceptable due to the maintenance burden and/or complexity.
> >
> > If you do either 1) or 2), you can keep these two paths separate.
> >
> > Even if you want to move the page between zswap and swapfile.
> >
> > Idea 3)
> > You don't have to change the swap count code, you can do a
> > minimal change moves the page between zswap and another block
> > device. That way you can get two differenet swap entry with
> > existing code.
> >
> > Chris
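
To make the "swap ID to swap entry" indirection from my reply above a
bit more concrete, here is a rough user-space sketch in C. It is only an
illustration under stated assumptions, not kernel code: a flat array
stands in for the xarray, and every name in it (swap_table,
swap_id_alloc, swap_id_lookup, swap_id_move, swap_id_free, the backend
enum) is hypothetical. Swap counting and the config option are not
modeled. The only point it shows is that a page can move between swap
implementations while the ID held by its users stays the same.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; none of this is existing kernel API. */
enum backend { BACKEND_NONE = 0, BACKEND_ZSWAP, BACKEND_SWAPFILE };

/* What a swap ID resolves to: a slot in one swap implementation. */
struct swap_entry {
    enum backend type;   /* which implementation currently holds the page */
    uint64_t offset;     /* slot within that implementation */
};

#define MAX_SWAP_IDS 1024

/* Stand-in for the xarray: a flat table indexed by swap ID. */
static struct swap_entry swap_table[MAX_SWAP_IDS];

static int swap_id_valid(long id)
{
    return id >= 0 && id < MAX_SWAP_IDS &&
           swap_table[id].type != BACKEND_NONE;
}

/* Bind the lowest free swap ID to a backend slot; -1 if the table is full. */
static long swap_id_alloc(enum backend type, uint64_t offset)
{
    for (long id = 0; id < MAX_SWAP_IDS; id++) {
        if (swap_table[id].type == BACKEND_NONE) {
            swap_table[id].type = type;
            swap_table[id].offset = offset;
            return id;
        }
    }
    return -1;
}

/* Resolve a swap ID to its current entry; NULL if the ID is not in use. */
static const struct swap_entry *swap_id_lookup(long id)
{
    return swap_id_valid(id) ? &swap_table[id] : NULL;
}

/*
 * Rebind an in-use swap ID to a slot in another implementation.
 * The ID itself does not change, so its holders need no update.
 */
static int swap_id_move(long id, enum backend type, uint64_t offset)
{
    if (!swap_id_valid(id) || type == BACKEND_NONE)
        return -1;
    swap_table[id].type = type;
    swap_table[id].offset = offset;
    return 0;
}

/* Release a swap ID so it can be reused. */
static void swap_id_free(long id)
{
    if (swap_id_valid(id))
        swap_table[id].type = BACKEND_NONE;
}
```

A caller would allocate an ID when a page is first stored in zswap,
then call swap_id_move() on writeback to a swapfile. In this sketch the
table owns the ID lifetime, which is why I think entry management and
counting end up living either here or in zswap.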