From: Yosry Ahmed
Date: Thu, 2 Mar 2023 14:55:09 -0800
Subject: Re: [LSF/MM/BPF TOPIC] Swap Abstraction / Native Zswap
To: Rik van Riel
Cc: Chris Li, Minchan Kim, Johannes Weiner, Sergey Senozhatsky,
    lsf-pc@lists.linux-foundation.org, Linux-MM, Michal Hocko,
    Shakeel Butt, David Rientjes, Hugh Dickins, Seth Jennings,
    Dan Streetman, Vitaly Wool, Yang Shi, Peter Xu, Andrew Morton

On Thu, Mar 2, 2023 at 2:36 PM Rik van Riel wrote:
>
> On Thu, 2023-03-02 at 13:42 -0800, Chris Li wrote:
> > On Thu, Mar 02, 2023 at 01:23:14PM -0500, Rik van Riel wrote:
> > >
> > > One possible implementation might be to have swap page table
> > > entries point to a swap address in this indirection layer, and
> > > the indirection layer can be an xarray containing the actual
> > > swap entries specifying at which position in which swap device
> > > the data can be found.
> >
> > The questions, do we have this indirection layer apply to all swap
> > entries?
>
> I believe we should have a system that tracks every swap entry
> the same, data structure wise. Otherwise we will have two sets
> of code in the kernel, and it will be too easy to get corner
> cases wrong.
>
> > My small tweak is to limit the indirection layer only to non leaf
> > swap devices. Then it is actually very close to what I am proposing.
> > Just your "indirection layer" is my "special swap device".
> >
> > Again, "special swap device" is a very bad name, let's name it
> > something more useful.
> >
> > > That might be a net reduction in the code over what we have today,
> > > because it gets rid of some ugly corner cases.
> >
> > Great.
>
> ... but that won't happen if the indirection layer only applies
> to some swap devices, because we will still need to keep around
> the crazy code to deal with the swap devices that don't have it.

I agree with Rik here. We can certainly special-case the indirection
layer and apply it only to some swap backends (e.g. zswap), but this
makes things more complicated. For example, if each swap backend
maintains swap_count in its own way, we have to hand over the swap
count whenever we move a swapped page between backends.

With a common data structure like the proposed swap_desc, everything
becomes easier to reason about. The core swapping logic that is
agnostic to the backend, like the swapcache and swap counting, lives
in one common place. Swap backends like swapfiles or zswap can then
implement a common interface for backend-specific operations, like
allocating entries, reading/writing pages, etc.

This, of course, isn't free. There is an associated overhead. It's a
tradeoff, like most things are. We want to work towards an outcome of
that tradeoff that makes sense: we don't want to incur too much
overhead, but we also don't want a very complicated and error-prone
implementation.

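To make the swap_desc argument above a bit more concrete, here is a
minimal userspace sketch. It is not code from this thread or from the
kernel: the struct layouts, the ops table, and every function name
(swap_backend_ops, swap_desc_store, swap_desc_move, the toy_* helpers)
are assumptions made up for illustration, and a fixed array stands in
for real storage. The only point it tries to show is that when
swap_count lives in the common descriptor, moving data between
backends does not need any hand-over of the count.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SZ 16   /* toy page size, for illustration only */
#define NSLOTS  4    /* slots per toy backend */

struct swap_backend;

/* Hypothetical per-backend operations: only storage-specific work lives here. */
struct swap_backend_ops {
    int  (*alloc)(struct swap_backend *b);              /* slot index, or -1 if full */
    void (*release)(struct swap_backend *b, int slot);
    void (*write)(struct swap_backend *b, int slot, const char *page);
    void (*read)(struct swap_backend *b, int slot, char *page);
};

struct swap_backend {
    const struct swap_backend_ops *ops;
    char data[NSLOTS][PAGE_SZ];
    unsigned char used[NSLOTS];
};

/* Hypothetical common descriptor: backend-agnostic state such as the
 * swap count is kept here, once, regardless of where the data lives. */
struct swap_desc {
    struct swap_backend *backend;
    int slot;
    unsigned int swap_count;
};

static int toy_alloc(struct swap_backend *b)
{
    for (int i = 0; i < NSLOTS; i++) {
        if (!b->used[i]) {
            b->used[i] = 1;
            return i;
        }
    }
    return -1;
}

static void toy_release(struct swap_backend *b, int slot)
{
    assert(slot >= 0 && slot < NSLOTS && b->used[slot]);
    b->used[slot] = 0;
}

static void toy_write(struct swap_backend *b, int slot, const char *page)
{
    memcpy(b->data[slot], page, PAGE_SZ);
}

static void toy_read(struct swap_backend *b, int slot, char *page)
{
    memcpy(page, b->data[slot], PAGE_SZ);
}

/* One toy implementation, reused for every backend in this sketch. */
const struct swap_backend_ops toy_ops = {
    .alloc = toy_alloc, .release = toy_release,
    .write = toy_write, .read = toy_read,
};

/* Swap a page out to backend b; returns 0 on success, -1 if b is full. */
int swap_desc_store(struct swap_desc *d, struct swap_backend *b,
                    const char *page)
{
    int slot = b->ops->alloc(b);

    if (slot < 0)
        return -1;
    b->ops->write(b, slot, page);
    d->backend = b;
    d->slot = slot;
    d->swap_count = 1;
    return 0;
}

/* Move the data to another backend (e.g. from a compressed cache to a
 * swapfile). swap_count is not touched: there is nothing to hand over. */
int swap_desc_move(struct swap_desc *d, struct swap_backend *dst)
{
    char page[PAGE_SZ];
    int slot = dst->ops->alloc(dst);

    if (slot < 0)
        return -1;
    d->backend->ops->read(d->backend, d->slot, page);
    dst->ops->write(dst, slot, page);
    d->backend->ops->release(d->backend, d->slot);
    d->backend = dst;
    d->slot = slot;
    return 0;
}
```

In this sketch a caller would store a page into one backend, bump
swap_count as more references appear, and later call swap_desc_move()
to migrate the data; the descriptor keeps the same count throughout.
In a real design the page table entry would presumably hold an index
that is looked up (for example in an xarray, as Rik suggested) to find
the descriptor, which is where the extra overhead discussed above
comes from.
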
Rik, what are your thoughts on this proposal, and how do you think it
can be improved?

>
> --
> All Rights Reversed.