Date: Tue, 28 Feb 2023 15:29:00 -0800
From: Minchan Kim
To: Yosry Ahmed
Cc: Sergey Senozhatsky, lsf-pc@lists.linux-foundation.org, Johannes Weiner,
 Linux-MM, Michal Hocko, Shakeel Butt, David Rientjes, Hugh Dickins,
 Seth Jennings, Dan Streetman, Vitaly Wool, Yang Shi, Peter Xu, Andrew Morton
Subject: Re: [LSF/MM/BPF TOPIC] Swap Abstraction / Native Zswap

Hi Yosry,

On Tue, Feb 28, 2023 at 12:12:05AM -0800, Yosry Ahmed wrote:
> On Mon, Feb 27, 2023 at 8:54 PM Sergey Senozhatsky wrote:
> >
> > On (23/02/18 14:38), Yosry Ahmed wrote:
> > [..]
> > > ==================== Idea ====================
> > > Introduce a data structure, which I currently call a swap_desc, as an
> > > abstraction layer between swapping implementation and the rest of MM
> > > code. Page tables & page caches would store a swap id (encoded as a
> > > swp_entry_t) instead of directly storing the swap entry associated
> > > with the swapfile. This swap id maps to a struct swap_desc, which acts
> > > as our abstraction layer. All MM code not concerned with swapping
> > > details would operate in terms of swap descs. The swap_desc can point
> > > to either a normal swap entry (associated with a swapfile) or a zswap
> > > entry. It can also include all non-backend specific operations, such
> > > as the swapcache (which would be a simple pointer in swap_desc), swap
> > > counting, etc. It creates a clear, nice abstraction layer between MM
> > > code and the actual swapping implementation.
> > >
> > > ==================== Benefits ====================
> > > This work enables using zswap without a backing swapfile and increases
> > > the swap capacity when zswap is used with a swapfile. It also creates
> > > a separation that allows us to skip code paths that don't make sense
> > > in the zswap path (e.g. readahead). We get to drop zswap's rbtree
> > > which might result in better performance (less lookups, less lock
> > > contention).
> > >
> > > The abstraction layer also opens the door for multiple cleanups (e.g.
> > > removing swapper address spaces, removing swap count continuation
> > > code, etc). Another nice cleanup that this work enables would be
> > > separating the overloaded swp_entry_t into two distinct types: one for
> > > things that are stored in page tables / caches, and for actual swap
> > > entries. In the future, we can potentially further optimize how we use
> > > the bits in the page tables instead of sticking everything into the
> > > current type/offset format.
> > >
> > > Another potential win here can be swapoff, which can be more practical
> > > by directly scanning all swap_desc's instead of going through page
> > > tables and shmem page caches.
> > >
> > > Overall zswap becomes more accessible and available to a wider range
> > > of use cases.
> >
> > I assume this also brings us closer to a proper writeback LRU handling?
>
> I assume by proper LRU handling you mean:
> - Swap writeback LRU that lives outside of the zpool backends (i.e in
> zswap itself or even outside zswap).

Even outside zswap, so that it can support any combination of
heterogeneous swap devices in a multi-device configuration. The
indirection layer would be essential to support that, but it would also
be great if we didn't waste any memory for users who don't want the
feature.

Just FYI, there was a similar discussion about the indirection layer a
long time ago:
https://lore.kernel.org/linux-mm/4DA25039.3020700@redhat.com/
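
To make the indirection a bit more concrete, here is a minimal userspace
sketch of how a swap id could map to a descriptor whose backend changes
underneath it. Every name here (swap_backend, the swap_desc fields, the
fixed-size swap_descs table, swap_desc_lookup(), swap_desc_alloc_zswap(),
swap_desc_writeback()) is made up for illustration and is not taken from
any posted patch or existing kernel API. A real implementation would
presumably use a dynamically sized structure and proper locking.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; nothing here is an existing kernel API. */

enum swap_backend {
	SWAP_BACKEND_FREE = 0,	/* descriptor not in use */
	SWAP_BACKEND_SWAPFILE,	/* data lives in a slot of a swap device */
	SWAP_BACKEND_ZSWAP,	/* data lives in a compressed in-memory entry */
};

struct swap_desc {
	enum swap_backend backend;
	union {
		struct {
			unsigned int type;	/* which swap device */
			unsigned long offset;	/* slot within that device */
		} slot;
		void *zswap_entry;		/* opaque handle to the compressed copy */
	} u;
	void *swapcache;	/* swapcache page, or NULL */
	unsigned int count;	/* swap count kept as a plain counter */
};

#define NR_SWAP_DESCS 16
static struct swap_desc swap_descs[NR_SWAP_DESCS];

/* Swap id 0 is reserved as "no entry"; id N maps to swap_descs[N - 1]. */
static struct swap_desc *swap_desc_lookup(unsigned long id)
{
	if (id == 0 || id > NR_SWAP_DESCS)
		return NULL;
	if (swap_descs[id - 1].backend == SWAP_BACKEND_FREE)
		return NULL;
	return &swap_descs[id - 1];
}

/* Allocate a descriptor backed by zswap; returns its swap id, or 0 if full. */
static unsigned long swap_desc_alloc_zswap(void *zswap_entry)
{
	for (unsigned long i = 0; i < NR_SWAP_DESCS; i++) {
		struct swap_desc *d = &swap_descs[i];

		if (d->backend != SWAP_BACKEND_FREE)
			continue;
		d->backend = SWAP_BACKEND_ZSWAP;
		d->u.zswap_entry = zswap_entry;
		d->swapcache = NULL;
		d->count = 1;
		return i + 1;
	}
	return 0;
}

/* Writeback: repoint the descriptor at a swap device slot; the id is unchanged. */
static void swap_desc_writeback(struct swap_desc *d, unsigned int type,
				unsigned long offset)
{
	assert(d && d->backend == SWAP_BACKEND_ZSWAP);
	d->backend = SWAP_BACKEND_SWAPFILE;
	d->u.slot.type = type;
	d->u.slot.offset = offset;
}
```

The point of the sketch is that whatever is stored in the page tables
(the id) never has to change when the data moves between backends or
between devices; only the descriptor is updated. It also shows where the
memory cost comes from: one descriptor per swapped-out page, which is
the overhead that users who don't want the feature would ideally not pay.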