From: Johannes Weiner <hannes@cmpxchg.org>
To: Alexey Romanov <avromanov@sberdevices.ru>
Cc: minchan@kernel.org, senozhatsky@chromium.org, ngupta@vflare.org,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, kernel@sberdevices.ru,
	ddrokosov@sberdevices.ru
Subject: Re: [RFC PATCH v1 0/4] Introduce merge identical pages mechanism
Date: Mon, 21 Nov 2022 15:44:51 -0500
Message-ID: <Y3vjQ7VJYUEWl2uc@cmpxchg.org>
In-Reply-To: <20221121190020.66548-1-avromanov@sberdevices.ru>

On Mon, Nov 21, 2022 at 10:00:16PM +0300, Alexey Romanov wrote:
> Hello!
> 
> This RFC series adds a feature that allows merging identical
> compressed pages into a single one. The main idea is that
> zram only stores references to zsmalloc objects, which hold the
> compressed content of the pages. Thus, the contents of the zsmalloc
> objects don't change in any way.
> 
> For simplicity, let's imagine that 3 pages with the same content
> got into zram:
> 
> +----------------+   +----------------+   +----------------+
> |zram_table_entry|   |zram_table_entry|   |zram_table_entry|
> +-------+--------+   +-------+--------+   +--------+-------+
>         |                    |                     |
>         | handle (1)         | handle (2)          | handle (3)
> +-------v--------+   +-------v---------+  +--------v-------+
> |zsmalloc  object|   |zsmalloc  object |  |zsmalloc  object|
> ++--------------++   +-+-------------+-+  ++--------------++
>  +--------------+      +-------------+     +--------------+
>  | buffer: "abc"|      |buffer: "abc"|     | buffer: "abc"|
>  +--------------+      +-------------+     +--------------+
> 
> As you can see, the data is duplicated. After scanning the
> objects, the merge mechanism keeps only one zsmalloc object.
> Here's what happens after the scan and merge:
> 
> +----------------+   +----------------+   +----------------+
> |zram_table_entry|   |zram_table_entry|   |zram_table_entry|
> +-------+--------+   +-------+--------+   +--------+-------+
>         |                    |                     |
>         | handle (1)         | handle (1)          | handle (1)
>         |           +--------v---------+           |
>         +-----------> zsmalloc  object <-----------+
>                     +--+-------------+-+
>                        +-------------+
>                        |buffer: "abc"|
>                        +-------------+
> 
> Thus, we reduced the amount of memory occupied by a factor of 3.
> 
> This mechanism doesn't affect the performance of zram itself in
> any way (except perhaps slightly in the zram_free_page function).
> Describing each such identical object costs a constant
> sizeof(zram_rbtree_node) bytes. So, for example, if the system
> has 20 identical buffers of 1024 bytes each, the memory gain will be
> (20 * 1024) - (1 * 1024 + sizeof(zram_rbtree_node)) = 19456 -
> sizeof(zram_rbtree_node) bytes. Note that these figures do not
> account for the overhead of zsmalloc's own data structures.
> 
> Testing on my system (8 GB RAM + 1 GB zram swap) showed that under
> high load, invoking the merge mechanism saves, on average,
> up to 15-20% of the memory usage.

This looks pretty great.

However, I'm curious why it's specific to zram, and not part of
zsmalloc? That way zswap would benefit as well, without having to
duplicate the implementation. This happened for example with
page_same_filled() and zswap_is_page_same_filled().

It's zsmalloc's job to store content efficiently, so couldn't this
feature (just like the page_same_filled one) be an optimization that
zsmalloc does transparently for all its users?
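
To make the idea concrete, here is a minimal userspace sketch of
allocator-level deduplication. It is only a model under stated
assumptions, not zsmalloc code: dedup_pool, dedup_obj, pool_store() and
pool_free() are hypothetical names, the hash table stands in for
whatever lookup structure a real implementation would use (the series
uses an rbtree), and locking is ignored entirely. The point it shows is
that callers only ever see a handle, so sharing can happen behind it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace model; none of these names exist in zsmalloc. */
#define NBUCKETS 64

struct dedup_obj {
	struct dedup_obj *next;	/* hash-bucket chain */
	size_t len;		/* payload length in bytes */
	unsigned int refcount;	/* number of handles sharing this object */
	unsigned char data[];	/* the stored (already compressed) bytes */
};

struct dedup_pool {
	struct dedup_obj *buckets[NBUCKETS];
	size_t stored_bytes;	/* payload bytes actually kept */
	size_t logical_bytes;	/* payload bytes the callers asked to store */
};

/* FNV-1a, used only to pick a bucket; equality is decided by memcmp(). */
static uint32_t hash_bytes(const unsigned char *p, size_t len)
{
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/* Store a buffer; identical content yields the same object ("handle"). */
static struct dedup_obj *pool_store(struct dedup_pool *pool,
				    const void *buf, size_t len)
{
	uint32_t b = hash_bytes(buf, len) % NBUCKETS;
	struct dedup_obj *obj;

	for (obj = pool->buckets[b]; obj; obj = obj->next) {
		if (obj->len == len && memcmp(obj->data, buf, len) == 0) {
			obj->refcount++;	/* share the existing copy */
			pool->logical_bytes += len;
			return obj;
		}
	}

	obj = malloc(sizeof(*obj) + len);
	if (!obj)
		return NULL;
	obj->len = len;
	obj->refcount = 1;
	memcpy(obj->data, buf, len);
	obj->next = pool->buckets[b];
	pool->buckets[b] = obj;
	pool->stored_bytes += len;
	pool->logical_bytes += len;
	return obj;
}

/* Drop one reference; release the object when the last user is gone. */
static void pool_free(struct dedup_pool *pool, struct dedup_obj *obj)
{
	struct dedup_obj **pp;

	assert(obj->refcount > 0);
	pool->logical_bytes -= obj->len;
	if (--obj->refcount > 0)
		return;

	/* Unlink from its bucket chain before freeing. */
	pp = &pool->buckets[hash_bytes(obj->data, obj->len) % NBUCKETS];
	while (*pp != obj)
		pp = &(*pp)->next;
	*pp = obj->next;
	pool->stored_bytes -= obj->len;
	free(obj);
}
```

Storing the 20 identical 1024-byte buffers from the example above in
this model leaves stored_bytes at 1024 while logical_bytes is 20480,
i.e. the same 19456 bytes of payload saved, before subtracting whatever
per-object bookkeeping a real implementation needs.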

> This patch series adds a new sysfs node (to trigger merging) and a new
> field in mm_stat (how many pages are currently merged in zram):
> 
>   $ cat /sys/block/zram/mm_stat
>     431452160 332984392 339894272 0 339894272 282 0 51374 51374 0
> 
>   $ echo 1 > /sys/block/zram/merge
> 
>   $ cat /sys/block/zram/mm_stat
>     431452160 270376848 287301504 0 339894272 282 0 51374 51374 6593

The optimal frequency for calling this is probably tied to prevalent
memory pressure, which is somewhat tricky to do from userspace.

Would it make sense to hook this up to a shrinker?
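
For reference, the effect visible in the two quoted mm_stat samples can
be quantified with a small userspace helper like the one below. This is
a sketch under assumptions: the first nine field names follow my reading
of the zram documentation for mm_stat, pages_merged is the tenth column
proposed by this series (not an upstream field), and struct mm_stat,
mm_stat_parse() and pct_saved() are hypothetical names.

```c
#include <stdio.h>

/*
 * One mm_stat line. The first nine names are assumed to match the zram
 * documentation; pages_merged is the column added by this RFC.
 */
struct mm_stat {
	unsigned long long orig_data_size;
	unsigned long long compr_data_size;
	unsigned long long mem_used_total;
	unsigned long long mem_limit;
	unsigned long long mem_used_max;
	unsigned long long same_pages;
	unsigned long long pages_compacted;
	unsigned long long huge_pages;
	unsigned long long huge_pages_since;
	unsigned long long pages_merged;
};

/* Parse one line of mm_stat; returns 0 on success, -1 on a short line. */
static int mm_stat_parse(const char *line, struct mm_stat *s)
{
	int n = sscanf(line,
		       "%llu %llu %llu %llu %llu %llu %llu %llu %llu %llu",
		       &s->orig_data_size, &s->compr_data_size,
		       &s->mem_used_total, &s->mem_limit, &s->mem_used_max,
		       &s->same_pages, &s->pages_compacted, &s->huge_pages,
		       &s->huge_pages_since, &s->pages_merged);

	return n == 10 ? 0 : -1;
}

/* Percentage by which 'after' is smaller than 'before'. */
static double pct_saved(unsigned long long before, unsigned long long after)
{
	if (before == 0)
		return 0.0;
	return 100.0 * ((double)before - (double)after) / (double)before;
}
```

Applied to the samples above, compr_data_size drops from 332984392 to
270376848 bytes (about 18.8%) and mem_used_total from 339894272 to
287301504 bytes (about 15.5%), with 6593 pages reported as merged.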
