From: Yosry Ahmed <yosryahmed@google.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huang, Ying" <ying.huang@intel.com>,
	Chris Li <chrisl@kernel.org>,
	 lsf-pc@lists.linux-foundation.org, Linux-MM <linux-mm@kvack.org>,
	 Michal Hocko <mhocko@kernel.org>,
	Shakeel Butt <shakeelb@google.com>,
	 David Rientjes <rientjes@google.com>,
	Hugh Dickins <hughd@google.com>,
	 Seth Jennings <sjenning@redhat.com>,
	Dan Streetman <ddstreet@ieee.org>,
	 Vitaly Wool <vitaly.wool@konsulko.com>,
	Yang Shi <shy828301@gmail.com>,  Peter Xu <peterx@redhat.com>,
	Minchan Kim <minchan@kernel.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>,
	 Michal Hocko <mhocko@suse.com>, Wei Xu <weixugc@google.com>
Subject: Re: [LSF/MM/BPF TOPIC] Swap Abstraction / Native Zswap
Date: Tue, 28 Mar 2023 12:59:55 -0700
Message-ID: <CAJD7tkbudmPTEumgKJZ5pXy6O79ySbGiCnAZXnUUuEmfZ6KCtQ@mail.gmail.com>
In-Reply-To: <ZCL2VujaXo3GrncW@cmpxchg.org>

On Tue, Mar 28, 2023 at 7:14 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Tue, Mar 28, 2023 at 12:59:31AM -0700, Yosry Ahmed wrote:
> > On Tue, Mar 28, 2023 at 12:01 AM Huang, Ying <ying.huang@intel.com> wrote:
> > > Yosry Ahmed <yosryahmed@google.com> writes:
> > > > We also have to unnecessarily limit the size of zswap with the size of
> > > > this fake swapfile.
> > >
> > > I guess you need to limit the size of zswap anyway, because you need to
> > > decide when to start writeback or move pages to the lower tiers.
> >
> > zswap has a knob to limit its size, but based on the actual memory
> > usage of zswap (i.e. the size of compressed pages). There is ongoing
> > work as well to autotune this if I remember correctly. Having to deal
> > with both the limit on compressed memory and the limit on the
> > uncompressed size of swapped pages is cumbersome. Again, we already
> > have this behavior today, but the initial swap_desc proposal aimed to
> > avoid it.
>
> Right.
>
> The optimal size of the zswap pool on top of a swapfile depends on the
> size and compressibility of the warm set of the workload: data that's
> too cold for regular memory yet too hot for swap. This is obviously
> highly dynamic, and even varies over time within individual jobs.
>
> With this proposal, we'd have to provision a static swap map for the
> highest expected offloading rate and compression ratio on every host
> of a shared pool. On 256G machines that would put the fixed overhead
> at a couple of hundred MB if I counted right.
>
> Not the end of the world I guess. And I agree it would make for
> simpler initial patches. OTOH, it would add more quirks to the swap
> code instead of cleaning it up. And given how common compressed memory
> setups are nowadays, it still feels like it's trading off too far in
> favor of regular swap setups at the expense of compression.

Right, I don't like adding more quirks to the swap code. I guess for
Android and ChromeOS, even though they are using compressed memory, it
is zram not zswap, so any extra overhead from swap_descs for normal swap
setups would also affect Android -- so that's something to think
about.
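
As a rough illustration of the fixed overhead of a static swap map
mentioned in the quoted text above, here is a minimal sketch of the
arithmetic. The 4 KiB page size, the amount of provisioned swap (equal
to RAM), and the bytes-per-slot figures are all assumptions chosen for
illustration; they are not necessarily the numbers behind the "couple
of hundred MB" estimate, and the helper names are hypothetical.

```python
# Back-of-the-envelope estimate of per-slot metadata for a statically
# provisioned swap map. All inputs are illustrative assumptions.

PAGE_SIZE = 4096  # assumption: 4 KiB base pages
GiB = 1 << 30
MiB = 1 << 20


def swap_slots(provisioned_bytes, page_size=PAGE_SIZE):
    """Slots needed to cover provisioned_bytes of uncompressed data."""
    return provisioned_bytes // page_size


def static_overhead_bytes(provisioned_bytes, bytes_per_slot,
                          page_size=PAGE_SIZE):
    """Fixed metadata cost: one record of bytes_per_slot per swap slot."""
    return swap_slots(provisioned_bytes, page_size) * bytes_per_slot


if __name__ == "__main__":
    # Assumption: the swap map is provisioned for 256 GiB of uncompressed
    # pages, with 1 or 3 bytes of per-slot metadata.
    for per_slot in (1, 3):
        overhead = static_overhead_bytes(256 * GiB, per_slot)
        print(f"{per_slot} byte(s)/slot -> {overhead / MiB:.0f} MiB")
```

Because the cost scales with the provisioned uncompressed size rather
than with what is actually swapped out, it is paid on every host even
when the map is mostly empty.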

>
> So it wouldn't be my first preference. But it sounds workable.

If we settle on this as a first step, perhaps to avoid any ABI changes
we can have the kernel create a virtual swap device for zswap if it is
enabled, without userspace interfering or having to do swapon on a
sparse swapfile like we do today with ghost swapfiles at Google. We
can then implement indirection logic that only supports moving pages
between swap devices -- and perhaps initially restrict it to support
only the virtual zswap swap device as a top tier.
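
To make the indirection idea above concrete, here is a toy userspace
model, not kernel code. It assumes a table that maps a stable virtual
slot id to whichever device currently holds the page, and it only
allows moves out of the top tier, as suggested above. Every class and
method name here (SwapDevice, Indirection, move, ...) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SwapDevice:
    """Toy swap device: a mapping from offset to page contents."""
    name: str
    slots: dict = field(default_factory=dict)
    next_offset: int = 0

    def store(self, data):
        off = self.next_offset
        self.next_offset += 1
        self.slots[off] = data
        return off

    def load(self, off):
        return self.slots[off]

    def free(self, off):
        del self.slots[off]


class Indirection:
    """Maps a stable virtual slot id to (device, offset)."""

    def __init__(self, top_tier):
        self.top_tier = top_tier  # e.g. the virtual zswap device
        self.table = {}
        self.next_id = 0

    def swap_out(self, data):
        # New pages always land in the top tier first.
        vid = self.next_id
        self.next_id += 1
        self.table[vid] = (self.top_tier, self.top_tier.store(data))
        return vid

    def move(self, vid, dst):
        # Writeback: relocate the page; the virtual id stays the same.
        src, off = self.table[vid]
        if src is not self.top_tier:
            raise ValueError("only moves out of the top tier are supported")
        data = src.load(off)
        self.table[vid] = (dst, dst.store(data))
        src.free(off)

    def swap_in(self, vid):
        dev, off = self.table.pop(vid)
        data = dev.load(off)
        dev.free(off)
        return data
```

The point of the sketch is that whatever references the page only ever
sees the virtual id, so moving the page to a lower tier does not
require updating those references.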

The only user visible effect would be that if the user has zswap
enabled and did not configure a swapfile, zswap would start
compressing pages regardless, but that's what we're hoping for anyway
-- I wouldn't think this is a breaking change.

This also wouldn't be my first preference, but it seems like a smaller
step from what we have today. As long as we don't have ABI
dependencies we can always come back and change it later I suppose.

