Date: Fri, 10 Mar 2023 15:14:23 -0800
From: Chris Li
To: "Huang, Ying"
Cc: Yosry Ahmed, lsf-pc@lists.linux-foundation.org, Johannes Weiner,
 Linux-MM, Michal Hocko, Shakeel Butt, David Rientjes, Hugh Dickins,
 Seth Jennings, Dan Streetman, Vitaly Wool, Yang Shi, Peter Xu,
 Minchan Kim, Andrew Morton, Aneesh Kumar K V, Michal Hocko, Wei Xu
Subject: Re: [LSF/MM/BPF TOPIC] Swap Abstraction / Native Zswap
References: <87356e850j.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87y1o571aa.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <87y1o571aa.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Fri, Mar 10, 2023 at 11:06:37AM
+0800, Huang, Ying wrote:
> > Unfortunately it's a little bit more. 24 is the extra overhead.
> >
> > Today we have an xarray entry for each swapped out page, that either
> > has the swapcache pointer or the shadow entry.
> >
> > With this implementation, we have an xarray entry for each swapped out
> > page, that has a pointer to the swap_desc.
> >
> > Ignoring the overhead of the xarray itself, we have (8 + 24) / (8 + 1) = 3.5556.
>
> OK. I see. We can only hold 8 bytes for each xarray entry. To save
> memory usage, we can allocate multiple swap_desc (e.g., 16) for each
> xarray entry. Then the memory usage of xarray becomes 1/N.

The xarray lookup key is the swap offset from the swap entry.
If you put more than one swap_desc under one xarray entry,
all of those swap_descs will share a swap offset.

> >> > Yeah, I initially thought we would only need the swp_entry_t ->
> >> > swap_desc reverse mapping for readahead, and that we can only store
> >> > that for spinning disks, but I was wrong. We need for other things as
> >> > well today: swapoff, when trying to find an empty swap slot and we
> >> > start trying to free swap slots used only by the swapcache. However, I
> >> > think both of these cases can be fixed (I can share more details if
> >> > you want). If everything goes well we should only need to maintain the
> >> > reverse mapping (extra overhead above 24 bytes) for swap files on
> >> > spinning disks for readahead.
> >> >
> >> >>
> >> >> Looking forward to your discussion.
>
> Per my understanding, the indirection is to make it easy to move
> (swapped) pages among swap devices based on hot/cold. This is similar
> as the target of memory tiering. It appears that we can extend the
> memory tiering (mm/memory-tiers.c) framework to cover swap devices too?
> Is it possible for zswap to be faster than some slow memory media?

I just took a look at mm/memory-tiers.c. It does not look like it
can cover block device swap without a major overhaul.

Chris
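
For reference, the per-page overhead arithmetic quoted above can be checked with a few lines of C. This is only a sketch built from the byte counts in the thread (8 for an xarray entry, 24 for the swap_desc, 1 for the extra byte in the quoted denominator). The helper names are hypothetical and are not kernel APIs, and the grouped case only models the memory cost of the "N swap_descs per xarray entry" suggestion, not how lookups by swap offset would work.

```c
/* Byte counts taken from the thread; nothing here is measured. */
#define XA_ENTRY_BYTES   8.0   /* one xarray entry per swapped-out page */
#define SWAP_DESC_BYTES 24.0   /* extra overhead of the proposed swap_desc */
#define BASELINE_EXTRA   1.0   /* the "+ 1" in the quoted denominator */

/* Per-page cost today: one xarray entry plus the one extra byte. */
static double baseline_bytes(void)
{
	return XA_ENTRY_BYTES + BASELINE_EXTRA;
}

/*
 * Hypothetical per-page cost if n swap_descs share one xarray entry:
 * the xarray share shrinks to 1/n, the swap_desc cost does not.
 */
static double grouped_bytes(unsigned int n)
{
	return XA_ENTRY_BYTES / n + SWAP_DESC_BYTES;
}

/* Ratio of the proposed per-page cost to today's per-page cost. */
static double overhead_ratio(unsigned int n)
{
	return grouped_bytes(n) / baseline_bytes();
}
```

With n = 1 this reproduces the quoted (8 + 24) / (8 + 1) = 3.5556. With n = 16 it gives (0.5 + 24) / 9, roughly 2.72, so under these numbers the 24-byte swap_desc dominates the cost even after the xarray share is divided by N.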