linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] zsmalloc/zram: configurable zspage size
@ 2022-10-24 16:12 Sergey Senozhatsky
  2022-10-24 16:12 ` [PATCH 1/6] zsmalloc: turn zspage order into runtime variable Sergey Senozhatsky
                   ` (7 more replies)
  0 siblings, 8 replies; 12+ messages in thread
From: Sergey Senozhatsky @ 2022-10-24 16:12 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: Nitin Gupta, linux-kernel, linux-mm, Sergey Senozhatsky

	Hello,

	Some use-cases and/or data patterns may benefit from
larger zspages. Currently the limit on the number of physical
pages that are linked into a zspage is hardcoded to 4. Higher
limit changes key characteristics of a number of the size
clases, improving compactness of the pool and redusing the
amount of memory zsmalloc pool uses.

For instance, the huge size class watermark is currently set
to 3264 bytes. With order 3 zspages we have more normal classe
and huge size watermark becomes 3632. With order 4 zspages
huge size watermark becomes 3840.

Commit #1 has more numbers and some analysis.

Sergey Senozhatsky (6):
  zsmalloc: turn zspage order into runtime variable
  zsmalloc/zram: pass zspage order to zs_create_pool()
  zram: add pool_page_order device attribute
  Documentation: document zram pool_page_order attribute
  zsmalloc: break out of loop when found perfect zspage order
  zsmalloc: make sure we select best zspage size

 Documentation/admin-guide/blockdev/zram.rst | 31 +++++--
 drivers/block/zram/zram_drv.c               | 44 ++++++++-
 drivers/block/zram/zram_drv.h               |  2 +
 include/linux/zsmalloc.h                    | 15 +++-
 mm/zsmalloc.c                               | 98 +++++++++++++--------
 5 files changed, 145 insertions(+), 45 deletions(-)

-- 
2.38.0.135.g90850a2211-goog


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/6] zsmalloc: turn zspage order into runtime variable
  2022-10-24 16:12 [PATCH 0/6] zsmalloc/zram: configurable zspage size Sergey Senozhatsky
@ 2022-10-24 16:12 ` Sergey Senozhatsky
  2022-10-24 16:12 ` [PATCH 2/6] zsmalloc/zram: pass zspage order to zs_create_pool() Sergey Senozhatsky
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Sergey Senozhatsky @ 2022-10-24 16:12 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: Nitin Gupta, linux-kernel, linux-mm, Sergey Senozhatsky

zsmalloc has 255 size classes. Size classes contain a number of zspages,
which store objects of the same size. zspage can consist of up to four
physical pages. The exact (most optimal) zspage size is calculated for
each size class during zsmalloc pool creation.

As a reasonable optimization, zsmalloc merges size classes that have
similar characteristics: number of pages per zspage and number of
objects zspage can store.

For example, let's look at the following size classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
..
   94  1536           0            0             0          0          0                3        0
  100  1632           0            0             0          0          0                2        0
..

Size classes #95-99 are merged with size class #100. That is, each time
we store an object of size, say, 1568 bytes instead of using class #96
we end up storing it in size class #100. Class #100 is for objects of
1632 bytes in size, hence every 1568 bytes object wastes 1632-1568 bytes.
Class #100 zspages consist of 2 physical pages and can hold 5 objects.
When we need to store, say, 13 objects of size 1568 we end up allocating
three zspages; in other words, 6 physical pages.

However, if we'll look closer at size class #96 (which should hold objects
of size 1568 bytes) and trace get_pages_per_zspage():

    pages per zspage      wasted bytes     used%
           1                  960           76
           2                  352           95
           3                 1312           89
           4                  704           95
           5                   96           99

We'd notice that the most optimal zspage configuration for this class is
when it consists of 5 physical pages, but currently we never let zspages
to consists of more than 4 pages. A 5 page class #96 configuration would
store 13 objects of size 1568 in a single zspage, allocating 5 physical
pages, as opposed to 6 physical pages that class #100 will allocate.

A higher order zspage for class #96 also changes its key characteristics:
pages per-zspage and objects per-zspage. As a result classes #96 and #100
are not merged anymore, which gives us more compact zsmalloc.

Let's take a closer look at the bottom of /sys/kernel/debug/zsmalloc/zram0/classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  254  4096           0            0             0          0          0                1        0
...

For exactly same reason - maximum 4 pages per zspage - the last non-huge
size class is #202, which stores objects of size 3264 bytes. Any object
larger than 3264 bytes, hence, is considered to be huge and lands in size
class #254, which uses a whole physical page to store every object. To put
it slightly differently - objects in huge classes don't share physical pages.

3264 bytes is too low of a watermark and we have too many huge classes:
classes from #203 to #254. Similarly to class size #96 above, higher order
zspages change key characteristics for some of those huge size classes and
thus those classes become normal classes, where stored objects share physical
pages.

We move huge class watermark with higher order zspages.

For order 3, huge class watermark becomes 3632 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  211  3408           0            0             0          0          0                5        0
  217  3504           0            0             0          0          0                6        0
  222  3584           0            0             0          0          0                7        0
  225  3632           0            0             0          0          0                8        0
  254  4096           0            0             0          0          0                1        0
...

For order 4, huge class watermark becomes 3840 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  206  3328           0            0             0          0          0               13        0
  207  3344           0            0             0          0          0                9        0
  208  3360           0            0             0          0          0               14        0
  211  3408           0            0             0          0          0                5        0
  212  3424           0            0             0          0          0               16        0
  214  3456           0            0             0          0          0               11        0
  217  3504           0            0             0          0          0                6        0
  219  3536           0            0             0          0          0               13        0
  222  3584           0            0             0          0          0                7        0
  223  3600           0            0             0          0          0               15        0
  225  3632           0            0             0          0          0                8        0
  228  3680           0            0             0          0          0                9        0
  230  3712           0            0             0          0          0               10        0
  232  3744           0            0             0          0          0               11        0
  234  3776           0            0             0          0          0               12        0
  235  3792           0            0             0          0          0               13        0
  236  3808           0            0             0          0          0               14        0
  238  3840           0            0             0          0          0               15        0
  254  4096           0            0             0          0          0                1        0
...

TESTS
=====

1) ChromeOS memory pressure test
-----------------------------------------------------------------------------

Our standard memory pressure test, that is designed with the reproducibility
in mind.

zram is configured as a swap device, lzo-rle compression algorithm.
We captured /sys/block/zram0/mm_stat after every test and rebooted
device.

Columns per (Documentation/admin-guide/blockdev/zram.rst)

orig_data_size        mem_used_total      mem_used_max         pages_compacted
          compr_data_size         mem_limit           same_pages          huge_pages

ORDER 2 (BASE)

10353639424 2981711944 3166896128        0 3543158784   579494   825135   123707
10168573952 2932288347 3106541568        0 3499085824   565187   853137   126153
9950461952 2815911234 3035693056        0 3441090560   586696   748054   122103
9892335616 2779566152 2943459328        0 3514736640   591541   650696   119621
9993949184 2814279212 3021357056        0 3336421376   582488   711744   121273
9953226752 2856382009 3025649664        0 3512893440   564559   787861   123034
9838448640 2785481728 2997575680        0 3367219200   573282   777099   122739

ORDER 3

9509138432 2706941227 2823393280        0 3389587456   535856  1011472    90223
10105245696 2882368370 3013095424        0 3296165888   563896  1059033    94808
9531236352 2666125512 2867650560        0 3396173824   567117  1126396    88807
9561812992 2714536764 2956652544        0 3310505984   548223   827322    90992
9807470592 2790315707 2908053504        0 3378315264   563670  1020933    93725
10178371584 2948838782 3071209472        0 3329548288   548533   954546    90730
9925165056 2849839413 2958274560        0 3336978432   551464  1058302    89381

ORDER 4

9444515840 2613362645 2668232704        0 3396759552   573735  1162207    83475
10129108992 2925888488 3038351360        0 3499597824   555634  1231542    84525
9876594688 2786692282 2897006592        0 3469463552   584835  1290535    84133
10012909568 2649711847 2801512448        0 3171323904   675405   750728    80424
10120966144 2866742402 2978639872        0 3257815040   587435  1093981    83587
9578790912 2671245225 2802270208        0 3376353280   545548  1047930    80895
10108588032 2888433523 2983960576        0 3316641792   571445  1290640    81402

First, we establish that order 3 and 4 don't cause any statistically
significant change in `orig_data_size` (number of bytes we store during
the test), in other words larger zspages don't cause regressions.

T-test for order 3:

x order-2-stored
+ order-3-stored
+-----------------------------------------------------------------------------+
|+ +  +                     +  x   x  +  x   x         +    x+               x|
| |________________________AM__|_________M_____A____|__________|              |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 9.8384486e+09 1.0353639e+10 9.9532268e+09 1.0021519e+10 1.7916718e+08
+   7 9.5091384e+09 1.0178372e+10 9.8074706e+09 9.8026344e+09 2.7856206e+08
No difference proven at 95.0% confidence

T-test for order 4:

x order-2-stored
+ order-4-stored
+-----------------------------------------------------------------------------+
|                                                         +                   |
|+          +                     x  +x    xx  x +       ++   x              x|
|              |__________________|____A____M____M____________|_|             |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 9.8384486e+09 1.0353639e+10 9.9532268e+09 1.0021519e+10 1.7916718e+08
+   7 9.4445158e+09 1.0129109e+10  1.001291e+10 9.8959249e+09 2.7947784e+08
No difference proven at 95.0% confidence

Next we establish that there is a statistically significant improvement
in `mem_used_total` metrics.

T-test for order 3:

x order-2-usedmem
+ order-3-usedmem
+-----------------------------------------------------------------------------+
|+         +        +       x ++        x  + xx x       +       x            x|
|        |_________________A__M__|____________|__A________________|           |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 2.9434593e+09 3.1668961e+09 3.0256497e+09 3.0424532e+09      73235062
+   7 2.8233933e+09 3.0712095e+09 2.9566525e+09 2.9426185e+09      84630851
Difference at 95.0% confidence
	-9.98347e+07 +/- 9.21744e+07
	-3.28139% +/- 3.02961%
	(Student's t, pooled s = 7.91383e+07)

T-test for order 4:

x order-2-usedmem
+ order-4-usedmem
+-----------------------------------------------------------------------------+
|                    +                                 x                      |
|+                   +              +      x    ++ x   x *          x        x|
|             |__________________A__M__________|_____|_M__A__________|        |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 2.9434593e+09 3.1668961e+09 3.0256497e+09 3.0424532e+09      73235062
+   7 2.6682327e+09 3.0383514e+09 2.8970066e+09 2.8814248e+09 1.3098053e+08
Difference at 95.0% confidence
	-1.61028e+08 +/- 1.23591e+08
	-5.29272% +/- 4.0622%
	(Student's t, pooled s = 1.06111e+08)

Order 3 zspages also show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem
+ order-3-maxmem
+-----------------------------------------------------------------------------+
|+   +     + x+        x  +   + +             x                x    x        x|
|    |________M__A_________|_|_____________________A___________M____________| |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 3.3364214e+09 3.5431588e+09 3.4990858e+09 3.4592294e+09      80073158
+   7 3.2961659e+09 3.3961738e+09 3.3369784e+09 3.3481822e+09      39840377
Difference at 95.0% confidence
	-1.11047e+08 +/- 7.36589e+07
	-3.21017% +/- 2.12934%
	(Student's t, pooled s = 6.32415e+07)

Order 4 zspages, on the other hand, do not show any statistically significant
improvement in `mem_used_max` metrics.

T-test for order 4:

x order-2-maxmem
+ order-4-maxmem
+-----------------------------------------------------------------------------+
|+                 +           +   x     x +   +        x     +     *  x     x|
|              |_______________________A___M________________A_|_____M_______| |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 3.3364214e+09 3.5431588e+09 3.4990858e+09 3.4592294e+09      80073158
+   7 3.1713239e+09 3.4995978e+09 3.3763533e+09 3.3554221e+09 1.1609062e+08
No difference proven at 95.0% confidence

Overall, with sufficient level of confidence order 3 zspages appear to be
beneficial for these particular use-case and data patterns.

Rather expectedly we also observed lower numbers of huge-pages when zsmalloc
is configured with order 3 and order 4 zspages, for the reason already
explained.

2) Synthetic test
-----------------------------------------------------------------------------

Test untars linux-6.0.tar.xz and compiles the kernel.

zram is configured as a block device with ext4 file system, lzo-rle
compression algorithm. We captured /sys/block/zram0/mm_stat after
every test and rebooted VM.

orig_data_size        mem_used_total      mem_used_max         pages_compacted
          compr_data_size         mem_limit           same_pages          huge_pages

ORDER 2 (BASE)

1691807744 628091753 655187968        0 655187968       59        0    34042    34043
1691803648 628089105 655159296        0 655159296       60        0    34043    34043
1691795456 628087429 655151104        0 655151104       59        0    34046    34046
1691799552 628093723 655216640        0 655216640       60        0    34044    34044

ORDER 3

1691787264 627781464 641740800        0 641740800       59        0    33591    33591
1691795456 627794239 641789952        0 641789952       59        0    33591    33591
1691811840 627788466 641691648        0 641691648       60        0    33591    33591
1691791360 627790682 641781760        0 641781760       59        0    33591    33591

ORDER 4

1691807744 627729506 639627264        0 639627264       59        0    33432    33432
1691820032 627731485 639606784        0 639606784       59        0    33432    33432
1691799552 627725753 639623168        0 639623168       59        0    33432    33433
1691820032 627734080 639746048        0 639746048       61        0    33432    33432

Order 3 and order 4 show statistically significant improvement in
`mem_used_total` metrics.

T-test for order 3:

x order-2-usedmem-comp
+ order-3-usedmem-comp
+-----------------------------------------------------------------------------+
|++                                                                          x|
|++                                                                          x|
|AM                                                                          A|
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4  6.551511e+08 6.5521664e+08 6.5518797e+08 6.5517875e+08     29795.878
+   4 6.4169165e+08 6.4178995e+08 6.4178176e+08 6.4175104e+08         45056
Difference at 95.0% confidence
	-1.34277e+07 +/- 66089.8
	-2.04947% +/- 0.0100873%
	(Student's t, pooled s = 38195.8)

T-test for order 4:

x order-2-usedmem-comp
+ order-4-usedmem-comp
+-----------------------------------------------------------------------------+
|+                                                                           x|
|+                                                                           x|
|++                                                                          x|
|A|                                                                          A|
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4  6.551511e+08 6.5521664e+08 6.5518797e+08 6.5517875e+08     29795.878
+   4 6.3960678e+08 6.3974605e+08 6.3962726e+08 6.3965082e+08     64101.637
Difference at 95.0% confidence
	-1.55279e+07 +/- 86486.9
	-2.37003% +/- 0.0132005%
	(Student's t, pooled s = 49984.1)

Order 3 and order 4 show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem-comp
+ order-3-maxmem-comp
+-----------------------------------------------------------------------------+
|++                                                                          x|
|++                                                                          x|
|AM                                                                          A|
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4  6.551511e+08 6.5521664e+08 6.5518797e+08 6.5517875e+08     29795.878
+   4 6.4169165e+08 6.4178995e+08 6.4178176e+08 6.4175104e+08         45056
Difference at 95.0% confidence
	-1.34277e+07 +/- 66089.8
	-2.04947% +/- 0.0100873%
	(Student's t, pooled s = 38195.8)

T-test for order 4:

x order-2-maxmem-comp
+ order-4-maxmem-comp
+-----------------------------------------------------------------------------+
|+                                                                           x|
|+                                                                           x|
|++                                                                          x|
|A|                                                                          A|
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4  6.551511e+08 6.5521664e+08 6.5518797e+08 6.5517875e+08     29795.878
+   4 6.3960678e+08 6.3974605e+08 6.3962726e+08 6.3965082e+08     64101.637
Difference at 95.0% confidence
	-1.55279e+07 +/- 86486.9
	-2.37003% +/- 0.0132005%
	(Student's t, pooled s = 49984.1)

This test tends to benefit more from order 4 zspages, due to test's data
patterns.

Data patterns that generate a considerable number of badly compressible
objects benefit from higher `huge_class_size` watermark, which is achieved
with order 4 zspages.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
 include/linux/zsmalloc.h | 13 ++++++++
 mm/zsmalloc.c            | 72 +++++++++++++++++++++++-----------------
 2 files changed, 55 insertions(+), 30 deletions(-)

diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index 2a430e713ce5..2110b140e0fa 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -33,6 +33,19 @@ enum zs_mapmode {
 	 */
 };
 
+#define ZS_PAGE_ORDER_2		2
+#define ZS_PAGE_ORDER_3		3
+#define ZS_PAGE_ORDER_4		4
+
+/*
+ * A single 'zspage' is composed of up to 2^N discontiguous 0-order (single)
+ * pages. ZS_MAX_PAGE_ORDER defines upper limit on N, ZS_MIN_PAGE_ORDER
+ * defines lower limit on N. ZS_DEFAULT_PAGE_ORDER is recommended value.
+ */
+#define ZS_MIN_PAGE_ORDER	ZS_PAGE_ORDER_2
+#define ZS_MAX_PAGE_ORDER	ZS_PAGE_ORDER_4
+#define ZS_DEFAULT_PAGE_ORDER	ZS_PAGE_ORDER_2
+
 struct zs_pool_stats {
 	/* How many pages were migrated (freed) */
 	atomic_long_t pages_compacted;
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 6645506b0b14..6ffa32b8b6c8 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -74,12 +74,7 @@
  */
 #define ZS_ALIGN		8
 
-/*
- * A single 'zspage' is composed of up to 2^N discontiguous 0-order (single)
- * pages. ZS_MAX_ZSPAGE_ORDER defines upper limit on N.
- */
-#define ZS_MAX_ZSPAGE_ORDER 2
-#define ZS_MAX_PAGES_PER_ZSPAGE (_AC(1, UL) << ZS_MAX_ZSPAGE_ORDER)
+#define ZS_MAX_PAGES_PER_ZSPAGE	(_AC(1, UL) << ZS_MAX_PAGE_ORDER)
 
 #define ZS_HANDLE_SIZE (sizeof(unsigned long))
 
@@ -124,10 +119,8 @@
 #define ISOLATED_BITS	3
 #define MAGIC_VAL_BITS	8
 
-#define MAX(a, b) ((a) >= (b) ? (a) : (b))
-/* ZS_MIN_ALLOC_SIZE must be multiple of ZS_ALIGN */
-#define ZS_MIN_ALLOC_SIZE \
-	MAX(32, (ZS_MAX_PAGES_PER_ZSPAGE << PAGE_SHIFT >> OBJ_INDEX_BITS))
+#define ZS_MIN_ALLOC_SIZE	32U
+
 /* each chunk includes extra space to keep handle */
 #define ZS_MAX_ALLOC_SIZE	PAGE_SIZE
 
@@ -141,12 +134,10 @@
  *    determined). NOTE: all those class sizes must be set as multiple of
  *    ZS_ALIGN to make sure link_free itself never has to span 2 pages.
  *
- *  ZS_MIN_ALLOC_SIZE and ZS_SIZE_CLASS_DELTA must be multiple of ZS_ALIGN
- *  (reason above)
+ *  pool->min_alloc_size (ZS_MIN_ALLOC_SIZE) and ZS_SIZE_CLASS_DELTA must
+ *  be multiple of ZS_ALIGN (reason above)
  */
 #define ZS_SIZE_CLASS_DELTA	(PAGE_SIZE >> CLASS_BITS)
-#define ZS_SIZE_CLASSES	(DIV_ROUND_UP(ZS_MAX_ALLOC_SIZE - ZS_MIN_ALLOC_SIZE, \
-				      ZS_SIZE_CLASS_DELTA) + 1)
 
 enum fullness_group {
 	ZS_EMPTY,
@@ -230,12 +221,16 @@ struct link_free {
 struct zs_pool {
 	const char *name;
 
-	struct size_class *size_class[ZS_SIZE_CLASSES];
+	struct size_class **size_class;
 	struct kmem_cache *handle_cachep;
 	struct kmem_cache *zspage_cachep;
 
 	atomic_long_t pages_allocated;
 
+	u32 num_size_classes;
+	u32 min_alloc_size;
+	u32 max_pages_per_zspage;
+
 	struct zs_pool_stats stats;
 
 	/* Compact classes */
@@ -523,15 +518,15 @@ static void set_zspage_mapping(struct zspage *zspage,
  * classes depending on its size. This function returns index of the
  * size class which has chunk size big enough to hold the given size.
  */
-static int get_size_class_index(int size)
+static int get_size_class_index(struct zs_pool *pool, int size)
 {
 	int idx = 0;
 
-	if (likely(size > ZS_MIN_ALLOC_SIZE))
-		idx = DIV_ROUND_UP(size - ZS_MIN_ALLOC_SIZE,
+	if (likely(size > pool->min_alloc_size))
+		idx = DIV_ROUND_UP(size - pool->min_alloc_size,
 				ZS_SIZE_CLASS_DELTA);
 
-	return min_t(int, ZS_SIZE_CLASSES - 1, idx);
+	return min_t(int, pool->num_size_classes - 1, idx);
 }
 
 /* type can be of enum type class_stat_type or fullness_group */
@@ -591,7 +586,7 @@ static int zs_stats_size_show(struct seq_file *s, void *v)
 			"obj_allocated", "obj_used", "pages_used",
 			"pages_per_zspage", "freeable", "objs_per_zspage");
 
-	for (i = 0; i < ZS_SIZE_CLASSES; i++) {
+	for (i = 0; i < pool->num_size_classes; i++) {
 		class = pool->size_class[i];
 
 		if (class->index != i)
@@ -777,13 +772,13 @@ static enum fullness_group fix_fullness_group(struct size_class *class,
  * link together 3 PAGE_SIZE sized pages to form a zspage
  * since then we can perfectly fit in 8 such objects.
  */
-static int get_pages_per_zspage(int class_size)
+static int get_pages_per_zspage(struct zs_pool *pool, int class_size)
 {
 	int i, max_usedpc = 0;
 	/* zspage order which gives maximum used size per KB */
 	int max_usedpc_order = 1;
 
-	for (i = 1; i <= ZS_MAX_PAGES_PER_ZSPAGE; i++) {
+	for (i = 1; i <= pool->max_pages_per_zspage; i++) {
 		int zspage_size;
 		int waste, usedpc;
 
@@ -1410,7 +1405,7 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 
 	/* extra space in chunk to keep the handle */
 	size += ZS_HANDLE_SIZE;
-	class = pool->size_class[get_size_class_index(size)];
+	class = pool->size_class[get_size_class_index(pool, size)];
 
 	/* class->lock effectively protects the zpage migration */
 	spin_lock(&class->lock);
@@ -1959,7 +1954,7 @@ static void async_free_zspage(struct work_struct *work)
 	struct zs_pool *pool = container_of(work, struct zs_pool,
 					free_work);
 
-	for (i = 0; i < ZS_SIZE_CLASSES; i++) {
+	for (i = 0; i < pool->num_size_classes; i++) {
 		class = pool->size_class[i];
 		if (class->index != i)
 			continue;
@@ -2108,7 +2103,7 @@ unsigned long zs_compact(struct zs_pool *pool)
 	struct size_class *class;
 	unsigned long pages_freed = 0;
 
-	for (i = ZS_SIZE_CLASSES - 1; i >= 0; i--) {
+	for (i = pool->num_size_classes - 1; i >= 0; i--) {
 		class = pool->size_class[i];
 		if (class->index != i)
 			continue;
@@ -2152,7 +2147,7 @@ static unsigned long zs_shrinker_count(struct shrinker *shrinker,
 	struct zs_pool *pool = container_of(shrinker, struct zs_pool,
 			shrinker);
 
-	for (i = ZS_SIZE_CLASSES - 1; i >= 0; i--) {
+	for (i = pool->num_size_classes - 1; i >= 0; i--) {
 		class = pool->size_class[i];
 		if (class->index != i)
 			continue;
@@ -2199,6 +2194,22 @@ struct zs_pool *zs_create_pool(const char *name)
 	if (!pool)
 		return NULL;
 
+	pool->max_pages_per_zspage = 1U << ZS_MIN_PAGE_ORDER;
+	/* min_alloc_size must be multiple of ZS_ALIGN */
+	pool->min_alloc_size = (pool->max_pages_per_zspage << PAGE_SHIFT) >>
+		OBJ_INDEX_BITS;
+	pool->min_alloc_size = max(pool->min_alloc_size, ZS_MIN_ALLOC_SIZE);
+
+	pool->num_size_classes =
+		DIV_ROUND_UP(ZS_MAX_ALLOC_SIZE - pool->min_alloc_size,
+			     ZS_SIZE_CLASS_DELTA) + 1;
+
+	pool->size_class = kmalloc_array(pool->num_size_classes,
+					 sizeof(struct size_class *),
+					 GFP_KERNEL | __GFP_ZERO);
+	if (!pool->size_class)
+		goto err;
+
 	init_deferred_free(pool);
 	rwlock_init(&pool->migrate_lock);
 
@@ -2213,17 +2224,17 @@ struct zs_pool *zs_create_pool(const char *name)
 	 * Iterate reversely, because, size of size_class that we want to use
 	 * for merging should be larger or equal to current size.
 	 */
-	for (i = ZS_SIZE_CLASSES - 1; i >= 0; i--) {
+	for (i = pool->num_size_classes - 1; i >= 0; i--) {
 		int size;
 		int pages_per_zspage;
 		int objs_per_zspage;
 		struct size_class *class;
 		int fullness = 0;
 
-		size = ZS_MIN_ALLOC_SIZE + i * ZS_SIZE_CLASS_DELTA;
+		size = pool->min_alloc_size + i * ZS_SIZE_CLASS_DELTA;
 		if (size > ZS_MAX_ALLOC_SIZE)
 			size = ZS_MAX_ALLOC_SIZE;
-		pages_per_zspage = get_pages_per_zspage(size);
+		pages_per_zspage = get_pages_per_zspage(pool, size);
 		objs_per_zspage = pages_per_zspage * PAGE_SIZE / size;
 
 		/*
@@ -2307,7 +2318,7 @@ void zs_destroy_pool(struct zs_pool *pool)
 	zs_flush_migration(pool);
 	zs_pool_stat_destroy(pool);
 
-	for (i = 0; i < ZS_SIZE_CLASSES; i++) {
+	for (i = 0; i < pool->num_size_classes; i++) {
 		int fg;
 		struct size_class *class = pool->size_class[i];
 
@@ -2327,6 +2338,7 @@ void zs_destroy_pool(struct zs_pool *pool)
 	}
 
 	destroy_cache(pool);
+	kfree(pool->size_class);
 	kfree(pool->name);
 	kfree(pool);
 }
-- 
2.38.0.135.g90850a2211-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/6] zsmalloc/zram: pass zspage order to zs_create_pool()
  2022-10-24 16:12 [PATCH 0/6] zsmalloc/zram: configurable zspage size Sergey Senozhatsky
  2022-10-24 16:12 ` [PATCH 1/6] zsmalloc: turn zspage order into runtime variable Sergey Senozhatsky
@ 2022-10-24 16:12 ` Sergey Senozhatsky
  2022-10-24 16:12 ` [PATCH 3/6] zram: add pool_page_order device attribute Sergey Senozhatsky
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Sergey Senozhatsky @ 2022-10-24 16:12 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: Nitin Gupta, linux-kernel, linux-mm, Sergey Senozhatsky

Allow zsmalloc pool owner to specify max zspage (during
pool creation), so that different pools can have different
characteristics.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
 drivers/block/zram/zram_drv.c |  3 ++-
 include/linux/zsmalloc.h      |  2 +-
 mm/zsmalloc.c                 | 11 ++++++++---
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 364323713393..e3ef542f9618 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1253,7 +1253,8 @@ static bool zram_meta_alloc(struct zram *zram, u64 disksize)
 	if (!zram->table)
 		return false;
 
-	zram->mem_pool = zs_create_pool(zram->disk->disk_name);
+	zram->mem_pool = zs_create_pool(zram->disk->disk_name,
+					ZS_DEFAULT_PAGE_ORDER);
 	if (!zram->mem_pool) {
 		vfree(zram->table);
 		return false;
diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index 2110b140e0fa..4a92c5e186ad 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -53,7 +53,7 @@ struct zs_pool_stats {
 
 struct zs_pool;
 
-struct zs_pool *zs_create_pool(const char *name);
+struct zs_pool *zs_create_pool(const char *name, u32 zspage_order);
 void zs_destroy_pool(struct zs_pool *pool);
 
 unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t flags);
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 6ffa32b8b6c8..fa55e0c66f8d 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -369,7 +369,7 @@ static void *zs_zpool_create(const char *name, gfp_t gfp,
 	 * different contexts and its caller must provide a valid
 	 * gfp mask.
 	 */
-	return zs_create_pool(name);
+	return zs_create_pool(name, ZS_DEFAULT_PAGE_ORDER);
 }
 
 static void zs_zpool_destroy(void *pool)
@@ -2177,6 +2177,7 @@ static int zs_register_shrinker(struct zs_pool *pool)
 /**
  * zs_create_pool - Creates an allocation pool to work from.
  * @name: pool name to be created
+ * @zspage_order: maximum order of zspage
  *
  * This function must be called before anything when using
  * the zsmalloc allocator.
@@ -2184,17 +2185,21 @@ static int zs_register_shrinker(struct zs_pool *pool)
  * On success, a pointer to the newly created pool is returned,
  * otherwise NULL.
  */
-struct zs_pool *zs_create_pool(const char *name)
+struct zs_pool *zs_create_pool(const char *name, u32 zspage_order)
 {
 	int i;
 	struct zs_pool *pool;
 	struct size_class *prev_class = NULL;
 
+	if (WARN_ON(zspage_order < ZS_MIN_PAGE_ORDER ||
+		    zspage_order > ZS_MAX_PAGE_ORDER))
+		return NULL;
+
 	pool = kzalloc(sizeof(*pool), GFP_KERNEL);
 	if (!pool)
 		return NULL;
 
-	pool->max_pages_per_zspage = 1U << ZS_MIN_PAGE_ORDER;
+	pool->max_pages_per_zspage = 1U << zspage_order;
 	/* min_alloc_size must be multiple of ZS_ALIGN */
 	pool->min_alloc_size = (pool->max_pages_per_zspage << PAGE_SHIFT) >>
 		OBJ_INDEX_BITS;
-- 
2.38.0.135.g90850a2211-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/6] zram: add pool_page_order device attribute
  2022-10-24 16:12 [PATCH 0/6] zsmalloc/zram: configurable zspage size Sergey Senozhatsky
  2022-10-24 16:12 ` [PATCH 1/6] zsmalloc: turn zspage order into runtime variable Sergey Senozhatsky
  2022-10-24 16:12 ` [PATCH 2/6] zsmalloc/zram: pass zspage order to zs_create_pool() Sergey Senozhatsky
@ 2022-10-24 16:12 ` Sergey Senozhatsky
  2022-10-24 16:12 ` [PATCH 4/6] Documentation: document zram pool_page_order attribute Sergey Senozhatsky
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Sergey Senozhatsky @ 2022-10-24 16:12 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: Nitin Gupta, linux-kernel, linux-mm, Sergey Senozhatsky

Add a new sysfs knob that allows user-space to set
zsmalloc page order value on per-device basis.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
 drivers/block/zram/zram_drv.c | 43 ++++++++++++++++++++++++++++++++++-
 drivers/block/zram/zram_drv.h |  2 ++
 2 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index e3ef542f9618..517dae4ff21c 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1186,6 +1186,44 @@ static ssize_t mm_stat_show(struct device *dev,
 	return ret;
 }
 
+static ssize_t pool_page_order_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+	u32 val;
+	struct zram *zram = dev_to_zram(dev);
+
+	down_read(&zram->init_lock);
+	val = zram->pool_page_order;
+	up_read(&zram->init_lock);
+
+	return scnprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+
+static ssize_t pool_page_order_store(struct device *dev,
+				     struct device_attribute *attr,
+				     const char *buf, size_t len)
+{
+	struct zram *zram = dev_to_zram(dev);
+	u32 val;
+
+	if (kstrtou32(buf, 10, &val))
+		return -EINVAL;
+
+	if (val < ZS_MIN_PAGE_ORDER || val > ZS_MAX_PAGE_ORDER)
+		return -EINVAL;
+
+	down_read(&zram->init_lock);
+	if (init_done(zram)) {
+		up_read(&zram->init_lock);
+		return -EINVAL;
+	}
+
+	zram->pool_page_order = val;
+	up_read(&zram->init_lock);
+
+	return len;
+}
+
 #ifdef CONFIG_ZRAM_WRITEBACK
 #define FOUR_K(x) ((x) * (1 << (PAGE_SHIFT - 12)))
 static ssize_t bd_stat_show(struct device *dev,
@@ -1254,7 +1292,7 @@ static bool zram_meta_alloc(struct zram *zram, u64 disksize)
 		return false;
 
 	zram->mem_pool = zs_create_pool(zram->disk->disk_name,
-					ZS_DEFAULT_PAGE_ORDER);
+					zram->pool_page_order);
 	if (!zram->mem_pool) {
 		vfree(zram->table);
 		return false;
@@ -2176,6 +2214,7 @@ static DEVICE_ATTR_RW(writeback_limit_enable);
 static DEVICE_ATTR_RW(recomp_algorithm);
 static DEVICE_ATTR_WO(recompress);
 #endif
+static DEVICE_ATTR_RW(pool_page_order);
 
 static struct attribute *zram_disk_attrs[] = {
 	&dev_attr_disksize.attr,
@@ -2203,6 +2242,7 @@ static struct attribute *zram_disk_attrs[] = {
 	&dev_attr_recomp_algorithm.attr,
 	&dev_attr_recompress.attr,
 #endif
+	&dev_attr_pool_page_order.attr,
 	NULL,
 };
 
@@ -2240,6 +2280,7 @@ static int zram_add(void)
 		goto out_free_idr;
 	}
 
+	zram->pool_page_order = ZS_DEFAULT_PAGE_ORDER;
 	zram->disk->major = zram_major;
 	zram->disk->first_minor = device_id;
 	zram->disk->minors = 1;
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 09b9ceb5dfa3..076d5b17a954 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -120,6 +120,8 @@ struct zram {
 	 */
 	u64 disksize;	/* bytes */
 	const char *comp_algs[ZRAM_MAX_ZCOMPS];
+
+	u32 pool_page_order;
 	/*
 	 * zram is claimed so open request will be failed
 	 */
-- 
2.38.0.135.g90850a2211-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/6] Documentation: document zram pool_page_order attribute
  2022-10-24 16:12 [PATCH 0/6] zsmalloc/zram: configurable zspage size Sergey Senozhatsky
                   ` (2 preceding siblings ...)
  2022-10-24 16:12 ` [PATCH 3/6] zram: add pool_page_order device attribute Sergey Senozhatsky
@ 2022-10-24 16:12 ` Sergey Senozhatsky
  2022-10-24 16:12 ` [PATCH 5/6] zsmalloc: break out of loop when found perfect zspage order Sergey Senozhatsky
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Sergey Senozhatsky @ 2022-10-24 16:12 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: Nitin Gupta, linux-kernel, linux-mm, Sergey Senozhatsky

Provide a simple documentation for zram pool_page_order
device attribute.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
 Documentation/admin-guide/blockdev/zram.rst | 31 ++++++++++++++++-----
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/Documentation/admin-guide/blockdev/zram.rst b/Documentation/admin-guide/blockdev/zram.rst
index 010fb05a5999..cd12a5982ae0 100644
--- a/Documentation/admin-guide/blockdev/zram.rst
+++ b/Documentation/admin-guide/blockdev/zram.rst
@@ -112,7 +112,24 @@ to list all of them using, for instance, /proc/crypto or any other
 method. This, however, has an advantage of permitting the usage of
 custom crypto compression modules (implementing S/W or H/W compression).
 
-4) Set Disksize
+4) Set maximum pool page order
+==============================
+
+zsmalloc pages can consist of up to 2^N physical pages. The exact size
+is calculated per each zsmalloc size class during zsmalloc pool creation.
+ZRAM provides pool_page_order device attribute to see or change N.
+
+Examples::
+
+	#show current maximum zsmalloc page order
+	cat /sys/block/zramX/pool_page_order
+	2
+
+	#set maximum zsmalloc page order
+	echo 3 > /sys/block/zramX/pool_page_order
+
+
+5) Set Disksize
 ===============
 
 Set disk size by writing the value to sysfs node 'disksize'.
@@ -132,7 +149,7 @@ There is little point creating a zram of greater than twice the size of memory
 since we expect a 2:1 compression ratio. Note that zram uses about 0.1% of the
 size of the disk when not in use so a huge zram is wasteful.
 
-5) Set memory limit: Optional
+6) Set memory limit: Optional
 =============================
 
 Set memory limit by writing the value to sysfs node 'mem_limit'.
@@ -151,7 +168,7 @@ Examples::
 	# To disable memory limit
 	echo 0 > /sys/block/zram0/mem_limit
 
-6) Activate
+7) Activate
 ===========
 
 ::
@@ -162,7 +179,7 @@ Examples::
 	mkfs.ext4 /dev/zram1
 	mount /dev/zram1 /tmp
 
-7) Add/remove zram devices
+8) Add/remove zram devices
 ==========================
 
 zram provides a control interface, which enables dynamic (on-demand) device
@@ -182,7 +199,7 @@ execute::
 
 	echo X > /sys/class/zram-control/hot_remove
 
-8) Stats
+9) Stats
 ========
 
 Per-device statistics are exported as various nodes under /sys/block/zram<id>/
@@ -283,7 +300,7 @@ a single line of text and contains the following stats separated by whitespace:
 		Unit: 4K bytes
  ============== =============================================================
 
-9) Deactivate
+10) Deactivate
 =============
 
 ::
@@ -291,7 +308,7 @@ a single line of text and contains the following stats separated by whitespace:
 	swapoff /dev/zram0
 	umount /dev/zram1
 
-10) Reset
+11) Reset
 =========
 
 	Write any positive value to 'reset' sysfs node::
-- 
2.38.0.135.g90850a2211-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/6] zsmalloc: break out of loop when found perfect zspage order
  2022-10-24 16:12 [PATCH 0/6] zsmalloc/zram: configurable zspage size Sergey Senozhatsky
                   ` (3 preceding siblings ...)
  2022-10-24 16:12 ` [PATCH 4/6] Documentation: document zram pool_page_order attribute Sergey Senozhatsky
@ 2022-10-24 16:12 ` Sergey Senozhatsky
  2022-10-24 16:12 ` [PATCH 6/6] zsmalloc: make sure we select best zspage size Sergey Senozhatsky
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Sergey Senozhatsky @ 2022-10-24 16:12 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: Nitin Gupta, linux-kernel, linux-mm, Sergey Senozhatsky

If we found zspage configuration that gives us perfect
100% used percentage (zero wasted space) then there is
no point it trying any other configuration

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
 mm/zsmalloc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index fa55e0c66f8d..40a09b1f63b5 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -790,6 +790,9 @@ static int get_pages_per_zspage(struct zs_pool *pool, int class_size)
 			max_usedpc = usedpc;
 			max_usedpc_order = i;
 		}
+
+		if (usedpc == 100)
+			break;
 	}
 
 	return max_usedpc_order;
-- 
2.38.0.135.g90850a2211-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 6/6] zsmalloc: make sure we select best zspage size
  2022-10-24 16:12 [PATCH 0/6] zsmalloc/zram: configurable zspage size Sergey Senozhatsky
                   ` (4 preceding siblings ...)
  2022-10-24 16:12 ` [PATCH 5/6] zsmalloc: break out of loop when found perfect zspage order Sergey Senozhatsky
@ 2022-10-24 16:12 ` Sergey Senozhatsky
  2022-10-25  3:26 ` [PATCH 0/6] zsmalloc/zram: configurable " Bagas Sanjaya
  2022-10-25  4:30 ` Sergey Senozhatsky
  7 siblings, 0 replies; 12+ messages in thread
From: Sergey Senozhatsky @ 2022-10-24 16:12 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim
  Cc: Nitin Gupta, linux-kernel, linux-mm, Sergey Senozhatsky

We currently decide the best zspage size by looking at
used percentage value. This is not exactly enough as
zspage usage percentage calculation is not accurate.

For example, let's look at size class 208

pages per zspage       wasted bytes         used%
       1                   144               96
       2                    80               99
       3                    16               99
       4                   160               99

We will select 2 page per zspage configuration, as it
is the first one to reach 99%. However, 3 pages per
zspage wastes less memory. Hence we need to also consider
wasted space metrics when device zspage size.

Additionally, rename max_usedpc_order because it does
not hold zspage order, it holds maximum pages per-zspage
value.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
 mm/zsmalloc.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 40a09b1f63b5..5de56f4cd16a 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -775,8 +775,9 @@ static enum fullness_group fix_fullness_group(struct size_class *class,
 static int get_pages_per_zspage(struct zs_pool *pool, int class_size)
 {
 	int i, max_usedpc = 0;
-	/* zspage order which gives maximum used size per KB */
-	int max_usedpc_order = 1;
+	/* zspage size which gives maximum used size per KB */
+	int pages_per_zspage = 1;
+	int min_waste = INT_MAX;
 
 	for (i = 1; i <= pool->max_pages_per_zspage; i++) {
 		int zspage_size;
@@ -788,14 +789,19 @@ static int get_pages_per_zspage(struct zs_pool *pool, int class_size)
 
 		if (usedpc > max_usedpc) {
 			max_usedpc = usedpc;
-			max_usedpc_order = i;
+			pages_per_zspage = i;
 		}
 
 		if (usedpc == 100)
 			break;
+
+		if (waste < min_waste) {
+			min_waste = waste;
+			pages_per_zspage = i;
+		}
 	}
 
-	return max_usedpc_order;
+	return pages_per_zspage;
 }
 
 static struct zspage *get_zspage(struct page *page)
-- 
2.38.0.135.g90850a2211-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/6] zsmalloc/zram: configurable zspage size
  2022-10-24 16:12 [PATCH 0/6] zsmalloc/zram: configurable zspage size Sergey Senozhatsky
                   ` (5 preceding siblings ...)
  2022-10-24 16:12 ` [PATCH 6/6] zsmalloc: make sure we select best zspage size Sergey Senozhatsky
@ 2022-10-25  3:26 ` Bagas Sanjaya
  2022-10-25  3:42   ` Sergey Senozhatsky
  2022-10-25  4:30 ` Sergey Senozhatsky
  7 siblings, 1 reply; 12+ messages in thread
From: Bagas Sanjaya @ 2022-10-25  3:26 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, Minchan Kim, Nitin Gupta, linux-kernel, linux-mm

[-- Attachment #1: Type: text/plain, Size: 1656 bytes --]

On Tue, Oct 25, 2022 at 01:12:07AM +0900, Sergey Senozhatsky wrote:
> 	Hello,
> 
> 	Some use-cases and/or data patterns may benefit from
> larger zspages. Currently the limit on the number of physical
> pages that are linked into a zspage is hardcoded to 4. Higher
> limit changes key characteristics of a number of the size
> clases, improving compactness of the pool and redusing the
> amount of memory zsmalloc pool uses.
> 
> For instance, the huge size class watermark is currently set
> to 3264 bytes. With order 3 zspages we have more normal classe
> and huge size watermark becomes 3632. With order 4 zspages
> huge size watermark becomes 3840.
> 
> Commit #1 has more numbers and some analysis.
> 
> Sergey Senozhatsky (6):
>   zsmalloc: turn zspage order into runtime variable
>   zsmalloc/zram: pass zspage order to zs_create_pool()
>   zram: add pool_page_order device attribute
>   Documentation: document zram pool_page_order attribute
>   zsmalloc: break out of loop when found perfect zspage order
>   zsmalloc: make sure we select best zspage size
> 
>  Documentation/admin-guide/blockdev/zram.rst | 31 +++++--
>  drivers/block/zram/zram_drv.c               | 44 ++++++++-
>  drivers/block/zram/zram_drv.h               |  2 +
>  include/linux/zsmalloc.h                    | 15 +++-
>  mm/zsmalloc.c                               | 98 +++++++++++++--------
>  5 files changed, 145 insertions(+), 45 deletions(-)
> 

Sorry, I can't cleanly apply this patch series due to conflicts in
patch [1/6]. On what tree and commit the series is based?

-- 
An old man doll... just what I always wanted! - Clara

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/6] zsmalloc/zram: configurable zspage size
  2022-10-25  3:26 ` [PATCH 0/6] zsmalloc/zram: configurable " Bagas Sanjaya
@ 2022-10-25  3:42   ` Sergey Senozhatsky
  2022-10-25  8:40     ` Bagas Sanjaya
  0 siblings, 1 reply; 12+ messages in thread
From: Sergey Senozhatsky @ 2022-10-25  3:42 UTC (permalink / raw)
  To: Bagas Sanjaya
  Cc: Sergey Senozhatsky, Andrew Morton, Minchan Kim, Nitin Gupta,
	linux-kernel, linux-mm

On (22/10/25 10:26), Bagas Sanjaya wrote:
> 
> Sorry, I can't cleanly apply this patch series due to conflicts in
> patch [1/6]. On what tree and commit the series is based?

next-20221024

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/6] zsmalloc/zram: configurable zspage size
  2022-10-24 16:12 [PATCH 0/6] zsmalloc/zram: configurable zspage size Sergey Senozhatsky
                   ` (6 preceding siblings ...)
  2022-10-25  3:26 ` [PATCH 0/6] zsmalloc/zram: configurable " Bagas Sanjaya
@ 2022-10-25  4:30 ` Sergey Senozhatsky
  2022-10-25  7:57   ` Sergey Senozhatsky
  7 siblings, 1 reply; 12+ messages in thread
From: Sergey Senozhatsky @ 2022-10-25  4:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Minchan Kim, Nitin Gupta, linux-kernel, linux-mm, Sergey Senozhatsky

On (22/10/25 01:12), Sergey Senozhatsky wrote:
> Sergey Senozhatsky (6):
>   zsmalloc: turn zspage order into runtime variable
>   zsmalloc/zram: pass zspage order to zs_create_pool()
>   zram: add pool_page_order device attribute
>   Documentation: document zram pool_page_order attribute
>   zsmalloc: break out of loop when found perfect zspage order
>   zsmalloc: make sure we select best zspage size

Andrew, I want to replace the last 2 patches in the series: I think
we can drop `usedpc` calculations and instead optimize only for `waste`
value. Would you prefer me to resend the entire instead?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/6] zsmalloc/zram: configurable zspage size
  2022-10-25  4:30 ` Sergey Senozhatsky
@ 2022-10-25  7:57   ` Sergey Senozhatsky
  0 siblings, 0 replies; 12+ messages in thread
From: Sergey Senozhatsky @ 2022-10-25  7:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Sergey Senozhatsky, Minchan Kim, Nitin Gupta, linux-kernel, linux-mm

On (22/10/25 13:30), Sergey Senozhatsky wrote:
> On (22/10/25 01:12), Sergey Senozhatsky wrote:
> > Sergey Senozhatsky (6):
> >   zsmalloc: turn zspage order into runtime variable
> >   zsmalloc/zram: pass zspage order to zs_create_pool()
> >   zram: add pool_page_order device attribute
> >   Documentation: document zram pool_page_order attribute
> >   zsmalloc: break out of loop when found perfect zspage order
> >   zsmalloc: make sure we select best zspage size
> 
> Andrew, I want to replace the last 2 patches in the series: I think
> we can drop `usedpc` calculations and instead optimize only for `waste`
> value. Would you prefer me to resend the entire instead?

Andrew, let's do it another way - let's drop the last patch from the
series. But only the last one. The past was a last minute addition to
the series and I have not fully studied it's impact yet. From a
preliminary research I can say that it improves zsmalloc memory usage
only for order 4 zspages and has no statistically significant impact
on order 2 nor order 3 zspages.

Synthetic test, base get_pages_per_zspage() vs 'waste' optimized
get_pages_per_zspage() for order 4 zspages:

x zram-order-4-memused-base
+ zram-order-4-memused-patched
+----------------------------------------------------------------------------+
|+               +        +  +                               x xx           x|
|     |___________A_______M____|                           |____M_A______|   |
+----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4 6.3960678e+08 6.3974605e+08 6.3962726e+08 6.3965082e+08     64101.637
+   4 6.3902925e+08 6.3929958e+08 6.3926682e+08 6.3919514e+08     120652.52
Difference at 95.0% confidence
	-455680 +/- 167159
	-0.0712389% +/- 0.0261329%
	(Student's t, pooled s = 96607.6)


If I will have enough confidence in that patch I will submit it
separately, with a proper commit message and clear justification.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/6] zsmalloc/zram: configurable zspage size
  2022-10-25  3:42   ` Sergey Senozhatsky
@ 2022-10-25  8:40     ` Bagas Sanjaya
  0 siblings, 0 replies; 12+ messages in thread
From: Bagas Sanjaya @ 2022-10-25  8:40 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, Minchan Kim, Nitin Gupta, linux-kernel, linux-mm

On 10/25/22 10:42, Sergey Senozhatsky wrote:
> On (22/10/25 10:26), Bagas Sanjaya wrote:
>>
>> Sorry, I can't cleanly apply this patch series due to conflicts in
>> patch [1/6]. On what tree and commit the series is based?
> 
> next-20221024

Hmm, still can't be applied (again patch [1/6] is the culprit).
Please rebase on top of mm-everything. Don't forget to pass
--base to git-format-patch(1) so that I know the base commit
of this series.

Thanks.

-- 
An old man doll... just what I always wanted! - Clara


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-10-25  8:40 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-24 16:12 [PATCH 0/6] zsmalloc/zram: configurable zspage size Sergey Senozhatsky
2022-10-24 16:12 ` [PATCH 1/6] zsmalloc: turn zspage order into runtime variable Sergey Senozhatsky
2022-10-24 16:12 ` [PATCH 2/6] zsmalloc/zram: pass zspage order to zs_create_pool() Sergey Senozhatsky
2022-10-24 16:12 ` [PATCH 3/6] zram: add pool_page_order device attribute Sergey Senozhatsky
2022-10-24 16:12 ` [PATCH 4/6] Documentation: document zram pool_page_order attribute Sergey Senozhatsky
2022-10-24 16:12 ` [PATCH 5/6] zsmalloc: break out of loop when found perfect zspage order Sergey Senozhatsky
2022-10-24 16:12 ` [PATCH 6/6] zsmalloc: make sure we select best zspage size Sergey Senozhatsky
2022-10-25  3:26 ` [PATCH 0/6] zsmalloc/zram: configurable " Bagas Sanjaya
2022-10-25  3:42   ` Sergey Senozhatsky
2022-10-25  8:40     ` Bagas Sanjaya
2022-10-25  4:30 ` Sergey Senozhatsky
2022-10-25  7:57   ` Sergey Senozhatsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).