All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexey Romanov <avromanov@sberdevices.ru>
To: <minchan@kernel.org>, <ngupta@vflare.org>,
	<senozhatsky@chromium.org>, <linux-block@vger.kernel.org>,
	<axboe@chromium.org>
Cc: <kernel@sberdevices.ru>, <linux-kernel@vger.kernel.org>,
	<mnitenko@gmail.com>, Alexey Romanov <avromanov@sberdevices.ru>
Subject: [RFC PATCH v1] zram: experimental patch with entropy calculation
Date: Fri, 20 May 2022 20:13:09 +0300	[thread overview]
Message-ID: <20220520171309.26768-1-avromanov@sberdevices.ru> (raw)

This patch adds the option CONFIG_ZRAM_ENTROPY which calculates
the information entropy for each page. If the entropy is higher
than the CONFIG_ZRAM_ENTROPY_THRESHOLD parameter, then
the page is stored uncompressed.

This is some attempt to strike a balance between page processing
time and compression. This might be relevant for systems that can
lose a bit in compression and gain a lot in performance effort.

I did some experiments (writing different files directly to /dev/zram0)
and got this:

+---------+-----------+------------+--------+--------------+------------+
| entropy | orig_size | compr_size |  time  | instructions | branches   |
+---------+-----------+------------+--------+--------------+------------+
|    Y    | 407003136 | 406052541  | 0.572s | 4641658347   | 784918702  |
+---------+-----------+------------+--------+--------------+------------+
|    N    | 407003136 | 405371829  | 2.098s | 17187405320  | 2027130210 |
+---------+-----------+------------+--------+--------------+------------+

apk file (~430MB):

+---------+-----------+------------+--------+--------------+------------+
| entropy | orig_size | compr_size |  time  | instructions | branches   |
+---------+-----------+------------+--------+--------------+------------+
|    Y    | 431452160 | 335864766  | 1.632s | 13055231312  | 1747341252 |
+---------+-----------+------------+--------+--------------+------------+
|    N    | 431452160 | 289107700  | 3.196s | 24345720313  | 2846272508 |
+---------+-----------+------------+--------+--------------+------------+

folder with various files as .tar (but not compressed) (~5GB):

+---------+-----------+------------+--------+--------------+------------+
| entropy | orig_size | compr_size |  time  | instructions | branches   |
+---------+-----------+------------+--------+--------------+------------+
|    Y    | 5277380608| 4230903173 | 7.066s | 35039839769  | 5636707902 |
+---------+-----------+------------+--------+--------------+------------+
|    N    | 5277380608| 4216479361 | 11.955s| 76763123464  | 9497319788 |
+---------+-----------+------------+--------+--------------+------------+

Page processing time is noticeably accelerated, while compression loss
is not very critical.

I guess the CONFIG_ZRAM_ENTROPY_THRESHOLD setting should be
different for each of the compression algorithms. At the moment,
I have configured it only for the zstd algorithm, but in the future
I plan to configure and test it for all remaining compression algorithms.

Comments are welcome.

Signed-off-by: Alexey Romanov <avromanov@sberdevices.ru>
---
 drivers/block/zram/Kconfig    | 18 ++++++++++++++++
 drivers/block/zram/zram_drv.c | 39 +++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/drivers/block/zram/Kconfig b/drivers/block/zram/Kconfig
index 668c6bf25..d873d631a 100644
--- a/drivers/block/zram/Kconfig
+++ b/drivers/block/zram/Kconfig
@@ -77,3 +77,21 @@ config ZRAM_MEMORY_TRACKING
 	  /sys/kernel/debug/zram/zramX/block_state.
 
 	  See Documentation/admin-guide/blockdev/zram.rst for more information.
+
+config ZRAM_ENTROPY
+	bool "Use entropy optimization for zram"
+	depends on ZRAM && ZRAM_DEF_COMP_ZSTD
+	help
+	  With this feature, entropy will be calculated for each page.
+	  Pages above ZRAM_ENTROPY_THRESHOLD entropy will be
+	  stored uncompressed. Use this feature if you need a performance
+	  boost and a small loss in compression.
+
+config ZRAM_ENTROPY_THRESHOLD
+	int
+	depends on ZRAM && ZRAM_ENTROPY
+	default 100000 if ZRAM_DEF_COMP_ZSTD
+	help
+	  Pages with entropy above ZRAM_ENTROPY_THRESHOLD will be stored
+	  uncompressed. The default value was chosen as a result a lot of
+	  experiments. You can try set your own value.
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index cb253d80d..483b219a4 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1350,6 +1350,35 @@ static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
 	return ret;
 }
 
+
+#ifdef CONFIG_ZRAM_ENTROPY
+static inline u32 ilog2_w(u64 n)
+{
+	return ilog2(n * n * n * n);
+}
+
+static inline s32 shannon_entropy(const u8 *src)
+{
+	s32 entropy_sum = 0;
+	u32 sz_base, i;
+	u16 entropy_count[256] = { 0 };
+
+	for (i = 0; i < PAGE_SIZE; ++i)
+		entropy_count[src[i]]++;
+
+	sz_base = ilog2_w(PAGE_SIZE);
+	for (i = 0; i < ARRAY_SIZE(entropy_count); ++i) {
+		if (entropy_count[i] > 0) {
+			s32 p = entropy_count[i];
+
+			entropy_sum += p * (sz_base - ilog2_w((u64)p));
+		}
+	}
+
+	return entropy_sum;
+}
+#endif
+
 static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
 				u32 index, struct bio *bio)
 {
@@ -1376,7 +1405,17 @@ static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
 compress_again:
 	zstrm = zcomp_stream_get(zram->comp);
 	src = kmap_atomic(page);
+
+#ifdef CONFIG_ZRAM_ENTROPY
+	/* Just save this page uncompressible */
+	if (shannon_entropy((const u8 *)src) > CONFIG_ZRAM_ENTROPY_THRESHOLD)
+		comp_len = PAGE_SIZE;
+	else
+		ret = zcomp_compress(zstrm, src, &comp_len);
+#else
 	ret = zcomp_compress(zstrm, src, &comp_len);
+#endif
+
 	kunmap_atomic(src);
 
 	if (unlikely(ret)) {
-- 
2.30.2


             reply	other threads:[~2022-05-20 17:14 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-20 17:13 Alexey Romanov [this message]
2022-05-28  9:30 ` [RFC PATCH v1] zram: experimental patch with entropy calculation Aleksey Romanov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220520171309.26768-1-avromanov@sberdevices.ru \
    --to=avromanov@sberdevices.ru \
    --cc=axboe@chromium.org \
    --cc=kernel@sberdevices.ru \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=minchan@kernel.org \
    --cc=mnitenko@gmail.com \
    --cc=ngupta@vflare.org \
    --cc=senozhatsky@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.