* [PATCHv9 0/7] add compressing abstraction and multi stream support
@ 2014-02-28 17:52 Sergey Senozhatsky
  2014-02-28 17:52 ` [PATCHv9 1/7] zram: introduce compressing backend abstraction Sergey Senozhatsky
                   ` (7 more replies)
  0 siblings, 8 replies; 13+ messages in thread
From: Sergey Senozhatsky @ 2014-02-28 17:52 UTC
  To: Minchan Kim
  Cc: Andrew Morton, Jerome Marchand, Nitin Gupta, linux-kernel,
	Sergey Senozhatsky

This patchset introduces the zcomp compression backend abstraction, which
adds the ability to support compression algorithms other than LZO; adds
support for multiple compression streams, making parallel compression
possible; and adds support for the LZ4 compression algorithm.

v8->v9 (reviewed by Andrew Morton):
-- add LZ4 backend (+iozone test vs LZO)
-- merge patches 'zram: document max_comp_streams' and 'zram: add multi
   stream functionality'
-- do not extern backend struct from source file
-- use find()/release() naming instead of get()/put()
-- minor code, commit messages and code comments `nitpicks'
-- removed Acked-by Minchan Kim from first two patches, because I've
   changed them a bit.

v7->v8 (reviewed by Minchan Kim):
-- merge patches 'add multi stream functionality' and 'enable multi
   stream compression support in zram'
-- return status code from set_max_streams knob and print message on
   error
-- do not use atomic type for ->avail_strm
-- return back: allocate by default only one stream for multi stream backend
-- wake sleeping write in zcomp_strm_multi_put() only if we put stream
   to idle list
-- minor code `nitpicks'

v6->v7 (reviewed by Minchan Kim):
-- enable multi and single stream support out of the box (drop
   ZRAM_MULTI_STREAM config option)
-- add set_max_stream knob, so we can adjust max number of compression
   streams in runtime (for multi stream backend at the moment)
-- minor code `nitpicks'

v5->v6 (reviewed by Minchan Kim):
-- handle single compression stream case separately, using mutex locking,
   to address performance regression
-- handle multi compression stream using spin lock and wait_event()/wake_up()
-- make multi compression stream support configurable (ZRAM_MULTI_STREAM
   config option)

v4->v5 (reviewed by Minchan Kim):
-- renamed zcomp buffer_lock; removed src len and dst len from
   compress() and decompress(); not using term `buffer' and
   `workmem' in code and documentation; define compress() and
   decompress() functions for LZO backend; not using goto's;
   do not put idle zcomp_strm to idle list tail.

v3->v4 (reviewed by Minchan Kim):
-- renamed compression backend and working memory structs as requested
   by Minchan Kim; fixed several issues noted by Minchan Kim.

Sergey Senozhatsky (7):
  zram: introduce compressing backend abstraction
  zram: use zcomp compressing backends
  zram: factor out single stream compression
  zram: add multi stream functionality
  zram: add set_max_streams knob
  zram: make compression algorithm selection possible
  zram: add lz4 algorithm backend

 Documentation/ABI/testing/sysfs-block-zram |  17 +-
 Documentation/blockdev/zram.txt            |  45 +++-
 drivers/block/zram/Kconfig                 |  10 +
 drivers/block/zram/Makefile                |   4 +-
 drivers/block/zram/zcomp.c                 | 349 +++++++++++++++++++++++++++++
 drivers/block/zram/zcomp.h                 |  68 ++++++
 drivers/block/zram/zcomp_lz4.c             |  47 ++++
 drivers/block/zram/zcomp_lz4.h             |  17 ++
 drivers/block/zram/zcomp_lzo.c             |  47 ++++
 drivers/block/zram/zcomp_lzo.h             |  17 ++
 drivers/block/zram/zram_drv.c              | 131 ++++++++---
 drivers/block/zram/zram_drv.h              |  11 +-
 12 files changed, 715 insertions(+), 48 deletions(-)
 create mode 100644 drivers/block/zram/zcomp.c
 create mode 100644 drivers/block/zram/zcomp.h
 create mode 100644 drivers/block/zram/zcomp_lz4.c
 create mode 100644 drivers/block/zram/zcomp_lz4.h
 create mode 100644 drivers/block/zram/zcomp_lzo.c
 create mode 100644 drivers/block/zram/zcomp_lzo.h

-- 
1.9.0.359.g5e34a15



* [PATCHv9 1/7] zram: introduce compressing backend abstraction
  2014-02-28 17:52 [PATCHv9 0/7] add compressing abstraction and multi stream support Sergey Senozhatsky
@ 2014-02-28 17:52 ` Sergey Senozhatsky
  2014-02-28 17:52 ` [PATCHv9 2/7] zram: use zcomp compressing backends Sergey Senozhatsky
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Sergey Senozhatsky @ 2014-02-28 17:52 UTC
  To: Minchan Kim
  Cc: Andrew Morton, Jerome Marchand, Nitin Gupta, linux-kernel,
	Sergey Senozhatsky

ZRAM performs direct LZO compression algorithm calls, making it the one and
only option. While LZO generally performs well, the LZ4 algorithm tends to
have faster decompression (see http://code.google.com/p/lz4/ for the full
report):

	Name            Ratio  C.speed D.speed
	                        MB/s    MB/s
	LZ4 (r101)      2.084    422    1820
	LZO 2.06        2.106    414     600

Thus, users who have mostly read (decompress) usage scenarios or mixed
workloads (writes with a relatively high number of read ops) will benefit from
using the LZ4 compression backend.

Introduce compressing backend abstraction zcomp in order to support multiple
compression algorithms with the following set of operations:
        .create
        .destroy
        .compress
        .decompress
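
A minimal userspace sketch of such an operations table follows. It assumes a
hypothetical "store" backend that merely copies the page; the four operation
names come from the list above, while every toy_*/store_* identifier and
TOY_PAGE_SIZE are invented for illustration and are not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096 /* assumption: stands in for PAGE_SIZE */

/* Operation set named in the text: create/destroy/compress/decompress. */
struct toy_backend {
	void *(*create)(void);          /* allocate algorithm-private memory */
	void (*destroy)(void *private); /* free algorithm-private memory */
	int (*compress)(const unsigned char *src, unsigned char *dst,
			size_t *dst_len, void *private);
	int (*decompress)(const unsigned char *src, size_t src_len,
			unsigned char *dst);
	const char *name;
};

/* Hypothetical "store" backend: copies the page unchanged. */
static void *store_create(void)
{
	return calloc(1, 16);
}

static void store_destroy(void *private)
{
	free(private);
}

static int store_compress(const unsigned char *src, unsigned char *dst,
		size_t *dst_len, void *private)
{
	(void)private; /* a real algorithm would use its working memory */
	memcpy(dst, src, TOY_PAGE_SIZE);
	*dst_len = TOY_PAGE_SIZE;
	return 0;
}

static int store_decompress(const unsigned char *src, size_t src_len,
		unsigned char *dst)
{
	if (src_len > TOY_PAGE_SIZE)
		return -1;
	memcpy(dst, src, src_len);
	return 0;
}

static struct toy_backend toy_store = {
	.create = store_create,
	.destroy = store_destroy,
	.compress = store_compress,
	.decompress = store_decompress,
	.name = "store",
};

/* Look a backend up by name; NULL when the algorithm is unknown. */
static struct toy_backend *toy_find_backend(const char *name)
{
	if (strcmp(name, toy_store.name) == 0)
		return &toy_store;
	return NULL;
}

/* Round-trip one page through the named backend; 0 on success. */
static int toy_roundtrip(const char *name)
{
	unsigned char src[TOY_PAGE_SIZE], dst[TOY_PAGE_SIZE], out[TOY_PAGE_SIZE];
	struct toy_backend *b = toy_find_backend(name);
	size_t len = 0;
	void *private;
	int ret;

	if (!b)
		return -1;
	private = b->create();
	if (!private)
		return -1;

	memset(src, 0xab, sizeof(src));
	ret = b->compress(src, dst, &len, private);
	assert(ret || len <= TOY_PAGE_SIZE);
	if (!ret)
		ret = b->decompress(dst, len, out);
	if (!ret && memcmp(src, out, sizeof(src)) != 0)
		ret = -1;

	b->destroy(private);
	return ret;
}
```

The caller only ever touches the function pointers, so adding another
algorithm would mean adding another table, not changing the caller.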

Schematically zram write() usually contains the following steps:
0) preparation (decompression of partial IO, etc.)
1) lock buffer_lock mutex (protects meta compress buffers)
2) compress (using meta compress buffers)
3) alloc and map zs_pool object
4) copy compressed data (from meta compress buffers) to object allocated by 3)
5) free previous pool page, assign a new one
6) unlock buffer_lock mutex

As we can see, the compression buffers must remain untouched from 1) to 4),
because otherwise a concurrent write() can overwrite the data. At the same time,
zram_meta must be aware of a) the memory requirements of the specific compression
algorithm and b) the locking necessary to protect the compression buffers. To
remove requirement a), a new struct zcomp_strm is introduced, which contains a
compress/decompress `buffer' and the compression algorithm `private' part.
Struct zcomp, in turn, implements zcomp_strm stream handling and locking, which
removes requirement b) from zram meta. zcomp ->create() and ->destroy() allocate
and deallocate, respectively, the algorithm-specific zcomp_strm `private' part.

Every zcomp has a zcomp stream and a mutex to protect its compression stream.
Stream usage semantics remain the same -- only one write can hold the stream
lock and use its buffers. zcomp_strm_find() turns the caller into the exclusive
user of a stream (holding the stream mutex until zram releases the stream), and
zcomp_strm_release() makes the zcomp stream available (unlocks the stream
mutex). Hence no concurrent write (compression) operations are possible at the
moment.
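
The find/release semantics described above can be modelled in userspace
roughly as follows. This is only a sketch under stated assumptions: a pthread
mutex stands in for the kernel mutex, memcpy() stands in for the real
compression call, and all sketch_* names and the in_use field are hypothetical
and exist purely so the state can be observed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096 /* assumption: stands in for PAGE_SIZE */

struct sketch_strm {
	/* two pages, as in the patch: output may exceed the input size */
	unsigned char buffer[2 * SKETCH_PAGE_SIZE];
};

struct sketch_comp {
	pthread_mutex_t strm_lock; /* protects the one and only stream */
	struct sketch_strm zstrm;
	int in_use;                /* hypothetical, for observation only */
};

static struct sketch_comp sketch_comp = {
	.strm_lock = PTHREAD_MUTEX_INITIALIZER,
};

/* The caller becomes the exclusive user of the stream until release. */
static struct sketch_strm *sketch_strm_find(struct sketch_comp *comp)
{
	pthread_mutex_lock(&comp->strm_lock);
	assert(!comp->in_use);
	comp->in_use = 1;
	return &comp->zstrm;
}

/* Make the stream available again. */
static void sketch_strm_release(struct sketch_comp *comp,
		struct sketch_strm *zstrm)
{
	(void)zstrm;
	comp->in_use = 0;
	pthread_mutex_unlock(&comp->strm_lock);
}

/*
 * Model of write steps 1), 2), 4) and 6). `object' plays the role of the
 * pool object from step 3); steps 0) and 5) are left out.
 */
static size_t sketch_write(struct sketch_comp *comp,
		const unsigned char *page, unsigned char *object)
{
	struct sketch_strm *zstrm = sketch_strm_find(comp); /* step 1 */
	size_t clen = SKETCH_PAGE_SIZE;

	memcpy(zstrm->buffer, page, clen);   /* step 2: placeholder "compress" */
	memcpy(object, zstrm->buffer, clen); /* step 4: copy out of the stream */
	sketch_strm_release(comp, zstrm);    /* step 6 */
	return clen;
}
```

The stream buffer is only read or written between find and release, which is
the property the single mutex guarantees.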

iozone -t 3 -R -r 16K -s 60M -I +Z

       test            base           patched
--------------------------------------------------
  Initial write      597992.91       591660.58
        Rewrite      609674.34       616054.97
           Read     2404771.75      2452909.12
        Re-read     2459216.81      2470074.44
   Reverse Read     1652769.66      1589128.66
    Stride read     2202441.81      2202173.31
    Random read     2236311.47      2276565.31
 Mixed workload     1423760.41      1709760.06
   Random write      579584.08       615933.86
         Pwrite      597550.02       594933.70
          Pread     1703672.53      1718126.72
         Fwrite     1330497.06      1461054.00
          Fread     3922851.00      3957242.62

Usage examples:

	comp = zcomp_create(NAME) /* NAME e.g. "lzo" */

which initialises the compressing backend if the requested algorithm is supported.

Compress:
	zstrm = zcomp_strm_find(comp)
	zcomp_compress(comp, zstrm, src, &dst_len)
	[..] /* copy compressed data */
	zcomp_strm_release(comp, zstrm)

Decompress:
	zcomp_decompress(comp, src, src_len, dst);

Free compressing backend and its zcomp stream:
	zcomp_destroy(comp)

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 drivers/block/zram/zcomp.c     | 115 +++++++++++++++++++++++++++++++++++++++++
 drivers/block/zram/zcomp.h     |  58 +++++++++++++++++++++
 drivers/block/zram/zcomp_lzo.c |  47 +++++++++++++++++
 drivers/block/zram/zcomp_lzo.h |  17 ++++++
 4 files changed, 237 insertions(+)
 create mode 100644 drivers/block/zram/zcomp.c
 create mode 100644 drivers/block/zram/zcomp.h
 create mode 100644 drivers/block/zram/zcomp_lzo.c
 create mode 100644 drivers/block/zram/zcomp_lzo.h

diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
new file mode 100644
index 0000000..22f4ae2
--- /dev/null
+++ b/drivers/block/zram/zcomp.c
@@ -0,0 +1,115 @@
+/*
+ * Copyright (C) 2014 Sergey Senozhatsky.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/slab.h>
+#include <linux/wait.h>
+#include <linux/sched.h>
+
+#include "zcomp.h"
+#include "zcomp_lzo.h"
+
+static struct zcomp_backend *find_backend(const char *compress)
+{
+	if (strncmp(compress, "lzo", 3) == 0)
+		return &zcomp_lzo;
+	return NULL;
+}
+
+static void zcomp_strm_free(struct zcomp *comp, struct zcomp_strm *zstrm)
+{
+	if (zstrm->private)
+		comp->backend->destroy(zstrm->private);
+	free_pages((unsigned long)zstrm->buffer, 1);
+	kfree(zstrm);
+}
+
+/*
+ * allocate new zcomp_strm structure with ->private initialized by
+ * backend, return NULL on error
+ */
+static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
+{
+	struct zcomp_strm *zstrm = kmalloc(sizeof(*zstrm), GFP_KERNEL);
+	if (!zstrm)
+		return NULL;
+
+	zstrm->private = comp->backend->create();
+	/*
+	 * allocate 2 pages. 1 for compressed data, plus 1 extra for the
+	 * case when compressed size is larger than the original one
+	 */
+	zstrm->buffer = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
+	if (!zstrm->private || !zstrm->buffer) {
+		zcomp_strm_free(comp, zstrm);
+		zstrm = NULL;
+	}
+	return zstrm;
+}
+
+struct zcomp_strm *zcomp_strm_find(struct zcomp *comp)
+{
+	mutex_lock(&comp->strm_lock);
+	return comp->zstrm;
+}
+
+void zcomp_strm_release(struct zcomp *comp, struct zcomp_strm *zstrm)
+{
+	mutex_unlock(&comp->strm_lock);
+}
+
+int zcomp_compress(struct zcomp *comp, struct zcomp_strm *zstrm,
+		const unsigned char *src, size_t *dst_len)
+{
+	return comp->backend->compress(src, zstrm->buffer, dst_len,
+			zstrm->private);
+}
+
+int zcomp_decompress(struct zcomp *comp, const unsigned char *src,
+		size_t src_len, unsigned char *dst)
+{
+	return comp->backend->decompress(src, src_len, dst);
+}
+
+void zcomp_destroy(struct zcomp *comp)
+{
+	zcomp_strm_free(comp, comp->zstrm);
+	kfree(comp);
+}
+
+/*
+ * search available compressors for requested algorithm.
+ * allocate new zcomp and initialize it. return NULL
+ * if requested algorithm is not supported or in case
+ * of init error
+ */
+struct zcomp *zcomp_create(const char *compress)
+{
+	struct zcomp *comp;
+	struct zcomp_backend *backend;
+
+	backend = find_backend(compress);
+	if (!backend)
+		return NULL;
+
+	comp = kzalloc(sizeof(struct zcomp), GFP_KERNEL);
+	if (!comp)
+		return NULL;
+
+	comp->backend = backend;
+	mutex_init(&comp->strm_lock);
+
+	comp->zstrm = zcomp_strm_alloc(comp);
+	if (!comp->zstrm) {
+		kfree(comp);
+		return NULL;
+	}
+	return comp;
+}
diff --git a/drivers/block/zram/zcomp.h b/drivers/block/zram/zcomp.h
new file mode 100644
index 0000000..c9a98e1
--- /dev/null
+++ b/drivers/block/zram/zcomp.h
@@ -0,0 +1,58 @@
+/*
+ * Copyright (C) 2014 Sergey Senozhatsky.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _ZCOMP_H_
+#define _ZCOMP_H_
+
+#include <linux/mutex.h>
+
+struct zcomp_strm {
+	/* compression/decompression buffer */
+	void *buffer;
+	/*
+	 * The private data of the compression stream, only compression
+	 * stream backend can touch this (e.g. compression algorithm
+	 * working memory)
+	 */
+	void *private;
+};
+
+/* static compression backend */
+struct zcomp_backend {
+	int (*compress)(const unsigned char *src, unsigned char *dst,
+			size_t *dst_len, void *private);
+
+	int (*decompress)(const unsigned char *src, size_t src_len,
+			unsigned char *dst);
+
+	void *(*create)(void);
+	void (*destroy)(void *private);
+
+	const char *name;
+};
+
+/* dynamic per-device compression frontend */
+struct zcomp {
+	struct mutex strm_lock;
+	struct zcomp_strm *zstrm;
+	struct zcomp_backend *backend;
+};
+
+struct zcomp *zcomp_create(const char *comp);
+void zcomp_destroy(struct zcomp *comp);
+
+struct zcomp_strm *zcomp_strm_find(struct zcomp *comp);
+void zcomp_strm_release(struct zcomp *comp, struct zcomp_strm *zstrm);
+
+int zcomp_compress(struct zcomp *comp, struct zcomp_strm *zstrm,
+		const unsigned char *src, size_t *dst_len);
+
+int zcomp_decompress(struct zcomp *comp, const unsigned char *src,
+		size_t src_len, unsigned char *dst);
+#endif /* _ZCOMP_H_ */
diff --git a/drivers/block/zram/zcomp_lzo.c b/drivers/block/zram/zcomp_lzo.c
new file mode 100644
index 0000000..da1bc47
--- /dev/null
+++ b/drivers/block/zram/zcomp_lzo.c
@@ -0,0 +1,47 @@
+/*
+ * Copyright (C) 2014 Sergey Senozhatsky.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/lzo.h>
+
+#include "zcomp_lzo.h"
+
+static void *lzo_create(void)
+{
+	return kzalloc(LZO1X_MEM_COMPRESS, GFP_KERNEL);
+}
+
+static void lzo_destroy(void *private)
+{
+	kfree(private);
+}
+
+static int lzo_compress(const unsigned char *src, unsigned char *dst,
+		size_t *dst_len, void *private)
+{
+	int ret = lzo1x_1_compress(src, PAGE_SIZE, dst, dst_len, private);
+	return ret == LZO_E_OK ? 0 : ret;
+}
+
+static int lzo_decompress(const unsigned char *src, size_t src_len,
+		unsigned char *dst)
+{
+	size_t dst_len = PAGE_SIZE;
+	int ret = lzo1x_decompress_safe(src, src_len, dst, &dst_len);
+	return ret == LZO_E_OK ? 0 : ret;
+}
+
+struct zcomp_backend zcomp_lzo = {
+	.compress = lzo_compress,
+	.decompress = lzo_decompress,
+	.create = lzo_create,
+	.destroy = lzo_destroy,
+	.name = "lzo",
+};
diff --git a/drivers/block/zram/zcomp_lzo.h b/drivers/block/zram/zcomp_lzo.h
new file mode 100644
index 0000000..128c580
--- /dev/null
+++ b/drivers/block/zram/zcomp_lzo.h
@@ -0,0 +1,17 @@
+/*
+ * Copyright (C) 2014 Sergey Senozhatsky.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _ZCOMP_LZO_H_
+#define _ZCOMP_LZO_H_
+
+#include "zcomp.h"
+
+extern struct zcomp_backend zcomp_lzo;
+
+#endif /* _ZCOMP_LZO_H_ */
-- 
1.9.0.359.g5e34a15



* [PATCHv9 2/7] zram: use zcomp compressing backends
  2014-02-28 17:52 [PATCHv9 0/7] add compressing abstraction and multi stream support Sergey Senozhatsky
  2014-02-28 17:52 ` [PATCHv9 1/7] zram: introduce compressing backend abstraction Sergey Senozhatsky
@ 2014-02-28 17:52 ` Sergey Senozhatsky
  2014-02-28 17:52 ` [PATCHv9 3/7] zram: factor out single stream compression Sergey Senozhatsky
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Sergey Senozhatsky @ 2014-02-28 17:52 UTC
  To: Minchan Kim
  Cc: Andrew Morton, Jerome Marchand, Nitin Gupta, linux-kernel,
	Sergey Senozhatsky

Do not perform direct LZO compress/decompress calls, initialise
and use zcomp LZO backend (single compression stream) instead.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 drivers/block/zram/Makefile   |  2 +-
 drivers/block/zram/zram_drv.c | 59 ++++++++++++++++++-------------------------
 drivers/block/zram/zram_drv.h |  8 +++---
 3 files changed, 29 insertions(+), 40 deletions(-)

diff --git a/drivers/block/zram/Makefile b/drivers/block/zram/Makefile
index cb0f9ce..757c6a5 100644
--- a/drivers/block/zram/Makefile
+++ b/drivers/block/zram/Makefile
@@ -1,3 +1,3 @@
-zram-y	:=	zram_drv.o
+zram-y	:=	zcomp_lzo.o zcomp.o zram_drv.o
 
 obj-$(CONFIG_ZRAM)	+=	zram.o
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 9baac5b..fc12a69 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -29,7 +29,6 @@
 #include <linux/genhd.h>
 #include <linux/highmem.h>
 #include <linux/slab.h>
-#include <linux/lzo.h>
 #include <linux/string.h>
 #include <linux/vmalloc.h>
 
@@ -38,6 +37,7 @@
 /* Globals */
 static int zram_major;
 static struct zram *zram_devices;
+static const char *default_compressor = "lzo";
 
 /* Module params (documentation at end) */
 static unsigned int num_devices = 1;
@@ -160,8 +160,6 @@ static inline int valid_io_request(struct zram *zram, struct bio *bio)
 static void zram_meta_free(struct zram_meta *meta)
 {
 	zs_destroy_pool(meta->mem_pool);
-	kfree(meta->compress_workmem);
-	free_pages((unsigned long)meta->compress_buffer, 1);
 	vfree(meta->table);
 	kfree(meta);
 }
@@ -173,22 +171,11 @@ static struct zram_meta *zram_meta_alloc(u64 disksize)
 	if (!meta)
 		goto out;
 
-	meta->compress_workmem = kzalloc(LZO1X_MEM_COMPRESS, GFP_KERNEL);
-	if (!meta->compress_workmem)
-		goto free_meta;
-
-	meta->compress_buffer =
-		(void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
-	if (!meta->compress_buffer) {
-		pr_err("Error allocating compressor buffer space\n");
-		goto free_workmem;
-	}
-
 	num_pages = disksize >> PAGE_SHIFT;
 	meta->table = vzalloc(num_pages * sizeof(*meta->table));
 	if (!meta->table) {
 		pr_err("Error allocating zram address table\n");
-		goto free_buffer;
+		goto free_meta;
 	}
 
 	meta->mem_pool = zs_create_pool(GFP_NOIO | __GFP_HIGHMEM);
@@ -198,15 +185,10 @@ static struct zram_meta *zram_meta_alloc(u64 disksize)
 	}
 
 	rwlock_init(&meta->tb_lock);
-	mutex_init(&meta->buffer_lock);
 	return meta;
 
 free_table:
 	vfree(meta->table);
-free_buffer:
-	free_pages((unsigned long)meta->compress_buffer, 1);
-free_workmem:
-	kfree(meta->compress_workmem);
 free_meta:
 	kfree(meta);
 	meta = NULL;
@@ -280,8 +262,7 @@ static void zram_free_page(struct zram *zram, size_t index)
 
 static int zram_decompress_page(struct zram *zram, char *mem, u32 index)
 {
-	int ret = LZO_E_OK;
-	size_t clen = PAGE_SIZE;
+	int ret = 0;
 	unsigned char *cmem;
 	struct zram_meta *meta = zram->meta;
 	unsigned long handle;
@@ -301,12 +282,12 @@ static int zram_decompress_page(struct zram *zram, char *mem, u32 index)
 	if (size == PAGE_SIZE)
 		copy_page(mem, cmem);
 	else
-		ret = lzo1x_decompress_safe(cmem, size,	mem, &clen);
+		ret = zcomp_decompress(zram->comp, cmem, size, mem);
 	zs_unmap_object(meta->mem_pool, handle);
 	read_unlock(&meta->tb_lock);
 
 	/* Should NEVER happen. Return bio error if it does. */
-	if (unlikely(ret != LZO_E_OK)) {
+	if (unlikely(ret)) {
 		pr_err("Decompression failed! err=%d, page=%u\n", ret, index);
 		atomic64_inc(&zram->stats.failed_reads);
 		return ret;
@@ -349,7 +330,7 @@ static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
 
 	ret = zram_decompress_page(zram, uncmem, index);
 	/* Should NEVER happen. Return bio error if it does. */
-	if (unlikely(ret != LZO_E_OK))
+	if (unlikely(ret))
 		goto out_cleanup;
 
 	if (is_partial_io(bvec))
@@ -374,11 +355,10 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
 	struct page *page;
 	unsigned char *user_mem, *cmem, *src, *uncmem = NULL;
 	struct zram_meta *meta = zram->meta;
+	struct zcomp_strm *zstrm;
 	bool locked = false;
 
 	page = bvec->bv_page;
-	src = meta->compress_buffer;
-
 	if (is_partial_io(bvec)) {
 		/*
 		 * This is a partial IO. We need to read the full page
@@ -394,7 +374,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
 			goto out;
 	}
 
-	mutex_lock(&meta->buffer_lock);
+	zstrm = zcomp_strm_find(zram->comp);
 	locked = true;
 	user_mem = kmap_atomic(page);
 
@@ -420,22 +400,20 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
 		goto out;
 	}
 
-	ret = lzo1x_1_compress(uncmem, PAGE_SIZE, src, &clen,
-			       meta->compress_workmem);
+	ret = zcomp_compress(zram->comp, zstrm, uncmem, &clen);
 	if (!is_partial_io(bvec)) {
 		kunmap_atomic(user_mem);
 		user_mem = NULL;
 		uncmem = NULL;
 	}
 
-	if (unlikely(ret != LZO_E_OK)) {
+	if (unlikely(ret)) {
 		pr_err("Compression failed! err=%d\n", ret);
 		goto out;
 	}
-
+	src = zstrm->buffer;
 	if (unlikely(clen > max_zpage_size)) {
 		clen = PAGE_SIZE;
-		src = NULL;
 		if (is_partial_io(bvec))
 			src = uncmem;
 	}
@@ -457,6 +435,8 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
 		memcpy(cmem, src, clen);
 	}
 
+	zcomp_strm_release(zram->comp, zstrm);
+	locked = false;
 	zs_unmap_object(meta->mem_pool, handle);
 
 	/*
@@ -475,10 +455,9 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
 	atomic64_inc(&zram->stats.pages_stored);
 out:
 	if (locked)
-		mutex_unlock(&meta->buffer_lock);
+		zcomp_strm_release(zram->comp, zstrm);
 	if (is_partial_io(bvec))
 		kfree(uncmem);
-
 	if (ret)
 		atomic64_inc(&zram->stats.failed_writes);
 	return ret;
@@ -522,6 +501,7 @@ static void zram_reset_device(struct zram *zram, bool reset_capacity)
 		zs_free(meta->mem_pool, handle);
 	}
 
+	zcomp_destroy(zram->comp);
 	zram_meta_free(zram->meta);
 	zram->meta = NULL;
 	/* Reset stats */
@@ -550,9 +530,18 @@ static ssize_t disksize_store(struct device *dev,
 		return -EBUSY;
 	}
 
+	zram->comp = zcomp_create(default_compressor);
+	if (!zram->comp) {
+		up_write(&zram->init_lock);
+		pr_info("Cannot initialise %s compressing backend\n",
+				default_compressor);
+		return -EINVAL;
+	}
+
 	disksize = PAGE_ALIGN(disksize);
 	zram->meta = zram_meta_alloc(disksize);
 	if (!zram->meta) {
+		zcomp_destroy(zram->comp);
 		up_write(&zram->init_lock);
 		return -ENOMEM;
 	}
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 1d5b1f5..45e04f7 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -16,9 +16,10 @@
 #define _ZRAM_DRV_H_
 
 #include <linux/spinlock.h>
-#include <linux/mutex.h>
 #include <linux/zsmalloc.h>
 
+#include "zcomp.h"
+
 /*
  * Some arbitrary value. This is just to catch
  * invalid value for num_devices module parameter.
@@ -81,17 +82,16 @@ struct zram_stats {
 
 struct zram_meta {
 	rwlock_t tb_lock;	/* protect table */
-	void *compress_workmem;
-	void *compress_buffer;
 	struct table *table;
 	struct zs_pool *mem_pool;
-	struct mutex buffer_lock; /* protect compress buffers */
 };
 
 struct zram {
 	struct zram_meta *meta;
 	struct request_queue *queue;
 	struct gendisk *disk;
+	struct zcomp *comp;
+
 	/* Prevent concurrent execution of device init, reset and R/W request */
 	struct rw_semaphore init_lock;
 	/*
-- 
1.9.0.359.g5e34a15



* [PATCHv9 3/7] zram: factor out single stream compression
  2014-02-28 17:52 [PATCHv9 0/7] add compressing abstraction and multi stream support Sergey Senozhatsky
  2014-02-28 17:52 ` [PATCHv9 1/7] zram: introduce compressing backend abstraction Sergey Senozhatsky
  2014-02-28 17:52 ` [PATCHv9 2/7] zram: use zcomp compressing backends Sergey Senozhatsky
@ 2014-02-28 17:52 ` Sergey Senozhatsky
  2014-02-28 17:52 ` [PATCHv9 4/7] zram: add multi stream functionality Sergey Senozhatsky
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Sergey Senozhatsky @ 2014-02-28 17:52 UTC
  To: Minchan Kim
  Cc: Andrew Morton, Jerome Marchand, Nitin Gupta, linux-kernel,
	Sergey Senozhatsky

This is a preparation patch for adding multi stream support to zcomp.

Introduce struct zcomp_strm_single and a set of functions to manage zcomp_strm
stream access. zcomp_strm_single implements a single compression stream, the
same way as the current zcomp implementation. This moves zcomp_strm stream
control and locking out of zcomp, so the compressing backend zcomp is not aware
of the required locking.

Single and multi streams require different locking schemes. Minchan Kim
reported that the spinlock-based locking scheme (which is used in the multi
stream implementation) has demonstrated a severe performance regression for
the single compression stream case, compared to the mutex-based one
(see https://lkml.org/lkml/2014/2/18/16).

The following set of functions is added:
- zcomp_strm_single_find()/zcomp_strm_single_release()
  find and release a compression stream, implement required locking
- zcomp_strm_single_create()/zcomp_strm_single_destroy()
  create and destroy zcomp_strm_single

New ->strm_find() and ->strm_release() callbacks are added to zcomp; they are
set to zcomp_strm_single_find() and zcomp_strm_single_release() during
initialisation. Instead of direct locking and zcomp_strm access from
zcomp_strm_find() and zcomp_strm_release(), zcomp now calls ->strm_find() and
->strm_release() respectively.
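
The callback indirection can be sketched in userspace as below. This is an
illustrative model only: all demo_* names are hypothetical, the single-stream
policy tracks ownership with a plain flag instead of a mutex, and calloc()/free()
stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_strm {
	int id;
};

/* Frontend: knows nothing about how streams are locked or stored. */
struct demo_comp {
	void *stream; /* policy-private data, opaque to the frontend */
	struct demo_strm *(*strm_find)(struct demo_comp *comp);
	void (*strm_release)(struct demo_comp *comp, struct demo_strm *zstrm);
	void (*destroy)(struct demo_comp *comp);
};

/* Single-stream policy; the real one would hold a mutex here. */
struct demo_single {
	struct demo_strm zstrm;
	int held; /* hypothetical stand-in for the mutex state */
};

static struct demo_strm *demo_single_find(struct demo_comp *comp)
{
	struct demo_single *zs = comp->stream;

	assert(!zs->held); /* a mutex would block instead */
	zs->held = 1;
	return &zs->zstrm;
}

static void demo_single_release(struct demo_comp *comp,
		struct demo_strm *zstrm)
{
	struct demo_single *zs = comp->stream;

	(void)zstrm;
	zs->held = 0;
}

static void demo_single_destroy(struct demo_comp *comp)
{
	free(comp->stream);
	comp->stream = NULL;
}

/* Install the single-stream policy into the frontend. */
static int demo_single_create(struct demo_comp *comp)
{
	struct demo_single *zs = calloc(1, sizeof(*zs));

	if (!zs)
		return -1;
	zs->zstrm.id = 1;
	comp->stream = zs;
	comp->strm_find = demo_single_find;
	comp->strm_release = demo_single_release;
	comp->destroy = demo_single_destroy;
	return 0;
}

/* Frontend entry points simply dispatch through the callbacks. */
static struct demo_strm *demo_strm_find(struct demo_comp *comp)
{
	return comp->strm_find(comp);
}

static void demo_strm_release(struct demo_comp *comp, struct demo_strm *zstrm)
{
	comp->strm_release(comp, zstrm);
}

static void demo_destroy(struct demo_comp *comp)
{
	comp->destroy(comp);
}
```

Because the frontend only dispatches, a different stream policy can be swapped
in by installing a different set of callbacks.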

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 drivers/block/zram/zcomp.c | 62 ++++++++++++++++++++++++++++++++++++++++------
 drivers/block/zram/zcomp.h |  7 ++++--
 2 files changed, 59 insertions(+), 10 deletions(-)

diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index 22f4ae2..72e8071 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -16,6 +16,14 @@
 #include "zcomp.h"
 #include "zcomp_lzo.h"
 
+/*
+ * single zcomp_strm backend
+ */
+struct zcomp_strm_single {
+	struct mutex strm_lock;
+	struct zcomp_strm *zstrm;
+};
+
 static struct zcomp_backend *find_backend(const char *compress)
 {
 	if (strncmp(compress, "lzo", 3) == 0)
@@ -54,15 +62,56 @@ static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
 	return zstrm;
 }
 
+static struct zcomp_strm *zcomp_strm_single_find(struct zcomp *comp)
+{
+	struct zcomp_strm_single *zs = comp->stream;
+	mutex_lock(&zs->strm_lock);
+	return zs->zstrm;
+}
+
+static void zcomp_strm_single_release(struct zcomp *comp,
+		struct zcomp_strm *zstrm)
+{
+	struct zcomp_strm_single *zs = comp->stream;
+	mutex_unlock(&zs->strm_lock);
+}
+
+static void zcomp_strm_single_destroy(struct zcomp *comp)
+{
+	struct zcomp_strm_single *zs = comp->stream;
+	zcomp_strm_free(comp, zs->zstrm);
+	kfree(zs);
+}
+
+static int zcomp_strm_single_create(struct zcomp *comp)
+{
+	struct zcomp_strm_single *zs;
+
+	comp->destroy = zcomp_strm_single_destroy;
+	comp->strm_find = zcomp_strm_single_find;
+	comp->strm_release = zcomp_strm_single_release;
+	zs = kmalloc(sizeof(struct zcomp_strm_single), GFP_KERNEL);
+	if (!zs)
+		return -ENOMEM;
+
+	comp->stream = zs;
+	mutex_init(&zs->strm_lock);
+	zs->zstrm = zcomp_strm_alloc(comp);
+	if (!zs->zstrm) {
+		kfree(zs);
+		return -ENOMEM;
+	}
+	return 0;
+}
+
 struct zcomp_strm *zcomp_strm_find(struct zcomp *comp)
 {
-	mutex_lock(&comp->strm_lock);
-	return comp->zstrm;
+	return comp->strm_find(comp);
 }
 
 void zcomp_strm_release(struct zcomp *comp, struct zcomp_strm *zstrm)
 {
-	mutex_unlock(&comp->strm_lock);
+	comp->strm_release(comp, zstrm);
 }
 
 int zcomp_compress(struct zcomp *comp, struct zcomp_strm *zstrm,
@@ -80,7 +129,7 @@ int zcomp_decompress(struct zcomp *comp, const unsigned char *src,
 
 void zcomp_destroy(struct zcomp *comp)
 {
-	zcomp_strm_free(comp, comp->zstrm);
+	comp->destroy(comp);
 	kfree(comp);
 }
 
@@ -104,10 +153,7 @@ struct zcomp *zcomp_create(const char *compress)
 		return NULL;
 
 	comp->backend = backend;
-	mutex_init(&comp->strm_lock);
-
-	comp->zstrm = zcomp_strm_alloc(comp);
-	if (!comp->zstrm) {
+	if (zcomp_strm_single_create(comp) != 0) {
 		kfree(comp);
 		return NULL;
 	}
diff --git a/drivers/block/zram/zcomp.h b/drivers/block/zram/zcomp.h
index c9a98e1..dc3500d 100644
--- a/drivers/block/zram/zcomp.h
+++ b/drivers/block/zram/zcomp.h
@@ -39,9 +39,12 @@ struct zcomp_backend {
 
 /* dynamic per-device compression frontend */
 struct zcomp {
-	struct mutex strm_lock;
-	struct zcomp_strm *zstrm;
+	void *stream;
 	struct zcomp_backend *backend;
+
+	struct zcomp_strm *(*strm_find)(struct zcomp *comp);
+	void (*strm_release)(struct zcomp *comp, struct zcomp_strm *zstrm);
+	void (*destroy)(struct zcomp *comp);
 };
 
 struct zcomp *zcomp_create(const char *comp);
-- 
1.9.0.359.g5e34a15



* [PATCHv9 4/7] zram: add multi stream functionality
  2014-02-28 17:52 [PATCHv9 0/7] add compressing abstraction and multi stream support Sergey Senozhatsky
                   ` (2 preceding siblings ...)
  2014-02-28 17:52 ` [PATCHv9 3/7] zram: factor out single stream compression Sergey Senozhatsky
@ 2014-02-28 17:52 ` Sergey Senozhatsky
  2014-02-28 17:52 ` [PATCHv9 5/7] zram: add set_max_streams knob Sergey Senozhatsky
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Sergey Senozhatsky @ 2014-02-28 17:52 UTC
  To: Minchan Kim
  Cc: Andrew Morton, Jerome Marchand, Nitin Gupta, linux-kernel,
	Sergey Senozhatsky

Existing zram (zcomp) implementation has only one compression stream (buffer
and algorithm private part), so in order to prevent data corruption only one
write (compress operation) can use this compression stream, forcing all
concurrent write operations to wait for stream lock to be released. This patch
changes zcomp to keep a compression streams list of user-defined size (via
sysfs device attr). Each write operation still exclusively holds compression
stream, the difference is that we can have N write operations (depending on
size of streams list) executing in parallel. See TEST section later in commit
message for performance data.

Introduce struct zcomp_strm_multi and a set of functions to manage
zcomp_strm stream access. zcomp_strm_multi has a list of idle zcomp_strm
structs, spinlock to protect idle list and wait queue, making it possible
to perform parallel compressions.

The following set of functions is added:
- zcomp_strm_multi_find()/zcomp_strm_multi_release()
  find and release a compression stream, implement required locking
- zcomp_strm_multi_create()/zcomp_strm_multi_destroy()
  create and destroy zcomp_strm_multi

zcomp ->strm_find() and ->strm_release() callbacks are set during initialisation
to zcomp_strm_multi_find()/zcomp_strm_multi_release() correspondingly.

Each time zcomp issues a zcomp_strm_multi_find() call, the following set of
operations is performed:
- spin lock strm_lock
- if idle list is not empty, remove zcomp_strm from idle list, spin
  unlock and return zcomp stream pointer to caller
- if idle list is empty, current adds itself to the wait queue. It will be
  awakened by the zcomp_strm_multi_release() caller.

zcomp_strm_multi_release():
- spin lock strm_lock
- add zcomp stream to idle list
- spin unlock, wake up sleeper
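
The find/release steps above can be approximated in userspace as follows. This
is a sketch under explicit assumptions: a pthread mutex replaces the spinlock,
a condition variable replaces the wait queue, the pool size is fixed at
compile time, and every pool_* name is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define POOL_MAX_STRM 3 /* assumption: fixed pool size for the sketch */

struct pool_strm {
	struct pool_strm *next; /* link in the idle list */
	int id;
};

struct pool_multi {
	pthread_mutex_t strm_lock; /* stands in for the kernel spinlock */
	pthread_cond_t strm_wait;  /* stands in for the wait queue */
	struct pool_strm *idle;    /* head of the idle list */
	struct pool_strm strms[POOL_MAX_STRM];
};

static void pool_multi_init(struct pool_multi *zs)
{
	int i;

	pthread_mutex_init(&zs->strm_lock, NULL);
	pthread_cond_init(&zs->strm_wait, NULL);
	zs->idle = NULL;
	for (i = 0; i < POOL_MAX_STRM; i++) {
		zs->strms[i].id = i;
		zs->strms[i].next = zs->idle;
		zs->idle = &zs->strms[i];
	}
}

/* Take an idle stream; sleep while the idle list is empty. */
static struct pool_strm *pool_multi_find(struct pool_multi *zs)
{
	struct pool_strm *zstrm;

	pthread_mutex_lock(&zs->strm_lock);
	while (!zs->idle)
		pthread_cond_wait(&zs->strm_wait, &zs->strm_lock);
	zstrm = zs->idle;
	zs->idle = zstrm->next;
	pthread_mutex_unlock(&zs->strm_lock);
	return zstrm;
}

/* Put the stream back at the head of the idle list, wake one sleeper. */
static void pool_multi_release(struct pool_multi *zs, struct pool_strm *zstrm)
{
	assert(zstrm != NULL);
	pthread_mutex_lock(&zs->strm_lock);
	zstrm->next = zs->idle;
	zs->idle = zstrm;
	pthread_mutex_unlock(&zs->strm_lock);
	pthread_cond_signal(&zs->strm_wait);
}

/* Count idle streams (for inspection only). */
static int pool_multi_idle_count(struct pool_multi *zs)
{
	struct pool_strm *p;
	int n = 0;

	pthread_mutex_lock(&zs->strm_lock);
	for (p = zs->idle; p; p = p->next)
		n++;
	pthread_mutex_unlock(&zs->strm_lock);
	return n;
}
```

With N streams in the pool, up to N callers can hold a stream at once; the
(N+1)-th caller sleeps in pool_multi_find() until someone releases.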

Minchan Kim reported that the spinlock-based locking scheme has demonstrated a
severe performance regression for the single compression stream case, compared
to the mutex-based one (see https://lkml.org/lkml/2014/2/18/16)

base                      spinlock                    mutex

==Initial write           ==Initial write             ==Initial  write
records:  5               records:  5                 records:   5
avg:      1642424.35      avg:      699610.40         avg:       1655583.71
std:      39890.95(2.43%) std:      232014.19(33.16%) std:       52293.96
max:      1690170.94      max:      1163473.45        max:       1697164.75
min:      1568669.52      min:      573429.88         min:       1553410.23
==Rewrite                 ==Rewrite                   ==Rewrite
records:  5               records:  5                 records:   5
avg:      1611775.39      avg:      501406.64         avg:       1684419.11
std:      17144.58(1.06%) std:      15354.41(3.06%)   std:       18367.42
max:      1641800.95      max:      531356.78         max:       1706445.84
min:      1593515.27      min:      488817.78         min:       1655335.73

When only one compression stream is available, a mutex with spin on owner tends
to perform much better than frequent wait_event()/wake_up(). This is why the
single stream case is implemented as a special case with mutex locking.
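
For comparison, the single stream special case reduces to taking and
dropping one lock around the only stream. A minimal userspace sketch,
assuming a pthread mutex (which does not necessarily spin on owner the way
the kernel mutex does); struct single_strm and the single_*() helpers are
hypothetical names:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for struct zcomp_strm_single. */
struct single_strm {
	pthread_mutex_t lock;	/* held for the whole compress operation */
	char buffer[4096];	/* the one and only compression buffer */
};

/* "find" is just taking the lock; there is no idle list to search. */
static char *single_find(struct single_strm *s)
{
	pthread_mutex_lock(&s->lock);
	return s->buffer;
}

/* "release" is just dropping the lock; a waiter can then take it. */
static void single_release(struct single_strm *s)
{
	pthread_mutex_unlock(&s->lock);
}
```

With one stream there is nothing to queue or hand over, so the lock itself is
the whole find/release protocol.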

Introduce and document zram device attribute max_comp_streams. This attr
shows and stores current zcomp's max number of zcomp streams (max_strm).
Extend zcomp's zcomp_create() with `max_strm' parameter. `max_strm' limits
the number of zcomp_strm structs in compression backend's idle list
(max_comp_streams).

max_comp_streams is used during initialisation as follows:
-- passing max_strm equal to 1 to zcomp_create() will initialise zcomp
using single compression stream zcomp_strm_single (mutex-based locking).
-- passing max_strm greater than 1 to zcomp_create() will initialise zcomp
using multi compression stream zcomp_strm_multi (spinlock-based locking).

The default max_comp_streams value is 1, meaning that zram with a single
stream will be initialised.

A later patch will introduce a configuration knob to change max_comp_streams
on an already initialised and used zcomp.

TEST
iozone -t 3 -R -r 16K -s 60M -I +Z

       test           base       1 strm (mutex)     3 strm (spinlock)
-----------------------------------------------------------------------
 Initial write      589286.78       583518.39          718011.05
       Rewrite      604837.97       596776.38         1515125.72
  Random write      584120.11       595714.58         1388850.25
        Pwrite      535731.17       541117.38          739295.27
        Fwrite     1418083.88      1478612.72         1484927.06

Usage example:
set max_comp_streams to 4
        echo 4 > /sys/block/zram0/max_comp_streams

show current max_comp_streams (default value is 1).
        cat /sys/block/zram0/max_comp_streams

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 Documentation/ABI/testing/sysfs-block-zram |   9 ++-
 Documentation/blockdev/zram.txt            |  31 ++++++--
 drivers/block/zram/zcomp.c                 | 124 ++++++++++++++++++++++++++++-
 drivers/block/zram/zcomp.h                 |   4 +-
 drivers/block/zram/zram_drv.c              |  42 +++++++++-
 drivers/block/zram/zram_drv.h              |   2 +-
 6 files changed, 201 insertions(+), 11 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram
index 8aa0468..0da9ed6 100644
--- a/Documentation/ABI/testing/sysfs-block-zram
+++ b/Documentation/ABI/testing/sysfs-block-zram
@@ -50,7 +50,6 @@ Description:
 		The failed_reads file is read-only and specifies the number of
 		failed reads happened on this device.
 
-
 What:		/sys/block/zram<id>/failed_writes
 Date:		February 2014
 Contact:	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
@@ -58,6 +57,14 @@ Description:
 		The failed_writes file is read-only and specifies the number of
 		failed writes happened on this device.
 
+What:		/sys/block/zram<id>/max_comp_streams
+Date:		February 2014
+Contact:	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
+Description:
+		The max_comp_streams file is read-write and specifies the
+		number of backend's zcomp_strm compression streams (number of
+		concurrent compress operations).
+
 What:		/sys/block/zram<id>/notify_free
 Date:		August 2010
 Contact:	Nitin Gupta <ngupta@vflare.org>
diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
index b31ac5e..aadfe60 100644
--- a/Documentation/blockdev/zram.txt
+++ b/Documentation/blockdev/zram.txt
@@ -21,7 +21,28 @@ Following shows a typical sequence of steps for using zram.
 	This creates 4 devices: /dev/zram{0,1,2,3}
 	(num_devices parameter is optional. Default: 1)
 
-2) Set Disksize
+2) Set max number of compression streams
+	Compression backend may use up to max_comp_streams compression streams,
+	thus allowing up to max_comp_streams concurrent compression operations.
+	By default, compression backend uses single compression stream.
+
+	Examples:
+	#show max compression streams number
+	cat /sys/block/zram0/max_comp_streams
+
+	#set max compression streams number to 3
+	echo 3 > /sys/block/zram0/max_comp_streams
+
+Note:
+In order to enable compression backend's multi stream support max_comp_streams
+must be initially set to desired concurrency level before ZRAM device
+initialisation. Once the device initialised as a single stream compression
+backend (max_comp_streams equals 1) changing the value of max_comp_streams
+will not take any effect, because single stream compression backend implemented
+as a special case and does not support dynamic max_comp_streams. Only multi
+stream backend supports dynamic max_comp_streams adjustment.
+
+3) Set Disksize
         Set disk size by writing the value to sysfs node 'disksize'.
         The value can be either in bytes or you can use mem suffixes.
         Examples:
@@ -38,14 +59,14 @@ There is little point creating a zram of greater than twice the size of memory
 since we expect a 2:1 compression ratio. Note that zram uses about 0.1% of the
 size of the disk when not in use so a huge zram is wasteful.
 
-3) Activate:
+4) Activate:
 	mkswap /dev/zram0
 	swapon /dev/zram0
 
 	mkfs.ext4 /dev/zram1
 	mount /dev/zram1 /tmp
 
-4) Stats:
+5) Stats:
 	Per-device statistics are exported as various nodes under
 	/sys/block/zram<id>/
 		disksize
@@ -60,11 +81,11 @@ size of the disk when not in use so a huge zram is wasteful.
 		compr_data_size
 		mem_used_total
 
-5) Deactivate:
+6) Deactivate:
 	swapoff /dev/zram0
 	umount /dev/zram1
 
-6) Reset:
+7) Reset:
 	Write any positive value to 'reset' sysfs node
 	echo 1 > /sys/block/zram0/reset
 	echo 1 > /sys/block/zram1/reset
diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index 72e8071..c06f75f 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -24,6 +24,21 @@ struct zcomp_strm_single {
 	struct zcomp_strm *zstrm;
 };
 
+/*
+ * multi zcomp_strm backend
+ */
+struct zcomp_strm_multi {
+	/* protect strm list */
+	spinlock_t strm_lock;
+	/* max possible number of zstrm streams */
+	int max_strm;
+	/* number of available zstrm streams */
+	int avail_strm;
+	/* list of available strms */
+	struct list_head idle_strm;
+	wait_queue_head_t strm_wait;
+};
+
 static struct zcomp_backend *find_backend(const char *compress)
 {
 	if (strncmp(compress, "lzo", 3) == 0)
@@ -62,6 +77,107 @@ static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
 	return zstrm;
 }
 
+/*
+ * get idle zcomp_strm or wait until other process release
+ * (zcomp_strm_release()) one for us
+ */
+static struct zcomp_strm *zcomp_strm_multi_find(struct zcomp *comp)
+{
+	struct zcomp_strm_multi *zs = comp->stream;
+	struct zcomp_strm *zstrm;
+
+	while (1) {
+		spin_lock(&zs->strm_lock);
+		if (!list_empty(&zs->idle_strm)) {
+			zstrm = list_entry(zs->idle_strm.next,
+					struct zcomp_strm, list);
+			list_del(&zstrm->list);
+			spin_unlock(&zs->strm_lock);
+			return zstrm;
+		}
+		/* zstrm streams limit reached, wait for idle stream */
+		if (zs->avail_strm >= zs->max_strm) {
+			spin_unlock(&zs->strm_lock);
+			wait_event(zs->strm_wait, !list_empty(&zs->idle_strm));
+			continue;
+		}
+		/* allocate new zstrm stream */
+		zs->avail_strm++;
+		spin_unlock(&zs->strm_lock);
+
+		zstrm = zcomp_strm_alloc(comp);
+		if (!zstrm) {
+			spin_lock(&zs->strm_lock);
+			zs->avail_strm--;
+			spin_unlock(&zs->strm_lock);
+			wait_event(zs->strm_wait, !list_empty(&zs->idle_strm));
+			continue;
+		}
+		break;
+	}
+	return zstrm;
+}
+
+/* add stream back to idle list and wake up waiter or free the stream */
+static void zcomp_strm_multi_release(struct zcomp *comp, struct zcomp_strm *zstrm)
+{
+	struct zcomp_strm_multi *zs = comp->stream;
+
+	spin_lock(&zs->strm_lock);
+	if (zs->avail_strm <= zs->max_strm) {
+		list_add(&zstrm->list, &zs->idle_strm);
+		spin_unlock(&zs->strm_lock);
+		wake_up(&zs->strm_wait);
+		return;
+	}
+
+	zs->avail_strm--;
+	spin_unlock(&zs->strm_lock);
+	zcomp_strm_free(comp, zstrm);
+}
+
+static void zcomp_strm_multi_destroy(struct zcomp *comp)
+{
+	struct zcomp_strm_multi *zs = comp->stream;
+	struct zcomp_strm *zstrm;
+
+	while (!list_empty(&zs->idle_strm)) {
+		zstrm = list_entry(zs->idle_strm.next,
+				struct zcomp_strm, list);
+		list_del(&zstrm->list);
+		zcomp_strm_free(comp, zstrm);
+	}
+	kfree(zs);
+}
+
+static int zcomp_strm_multi_create(struct zcomp *comp, int max_strm)
+{
+	struct zcomp_strm *zstrm;
+	struct zcomp_strm_multi *zs;
+
+	comp->destroy = zcomp_strm_multi_destroy;
+	comp->strm_find = zcomp_strm_multi_find;
+	comp->strm_release = zcomp_strm_multi_release;
+	zs = kmalloc(sizeof(struct zcomp_strm_multi), GFP_KERNEL);
+	if (!zs)
+		return -ENOMEM;
+
+	comp->stream = zs;
+	spin_lock_init(&zs->strm_lock);
+	INIT_LIST_HEAD(&zs->idle_strm);
+	init_waitqueue_head(&zs->strm_wait);
+	zs->max_strm = max_strm;
+	zs->avail_strm = 1;
+
+	zstrm = zcomp_strm_alloc(comp);
+	if (!zstrm) {
+		kfree(zs);
+		return -ENOMEM;
+	}
+	list_add(&zstrm->list, &zs->idle_strm);
+	return 0;
+}
+
 static struct zcomp_strm *zcomp_strm_single_find(struct zcomp *comp)
 {
 	struct zcomp_strm_single *zs = comp->stream;
@@ -139,7 +255,7 @@ void zcomp_destroy(struct zcomp *comp)
  * if requested algorithm is not supported or in case
  * of init error
  */
-struct zcomp *zcomp_create(const char *compress)
+struct zcomp *zcomp_create(const char *compress, int max_strm)
 {
 	struct zcomp *comp;
 	struct zcomp_backend *backend;
@@ -153,7 +269,11 @@ struct zcomp *zcomp_create(const char *compress)
 		return NULL;
 
 	comp->backend = backend;
-	if (zcomp_strm_single_create(comp) != 0) {
+	if (max_strm > 1)
+		zcomp_strm_multi_create(comp, max_strm);
+	else
+		zcomp_strm_single_create(comp);
+	if (!comp->stream) {
 		kfree(comp);
 		return NULL;
 	}
diff --git a/drivers/block/zram/zcomp.h b/drivers/block/zram/zcomp.h
index dc3500d..2a36844 100644
--- a/drivers/block/zram/zcomp.h
+++ b/drivers/block/zram/zcomp.h
@@ -21,6 +21,8 @@ struct zcomp_strm {
 	 * working memory)
 	 */
 	void *private;
+	/* used in multi stream backend, protected by backend strm_lock */
+	struct list_head list;
 };
 
 /* static compression backend */
@@ -47,7 +49,7 @@ struct zcomp {
 	void (*destroy)(struct zcomp *comp);
 };
 
-struct zcomp *zcomp_create(const char *comp);
+struct zcomp *zcomp_create(const char *comp, int max_strm);
 void zcomp_destroy(struct zcomp *comp);
 
 struct zcomp_strm *zcomp_strm_find(struct zcomp *comp);
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index fc12a69..75bbc37 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -108,6 +108,40 @@ static ssize_t mem_used_total_show(struct device *dev,
 	return sprintf(buf, "%llu\n", val);
 }
 
+static ssize_t max_comp_streams_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	int val;
+	struct zram *zram = dev_to_zram(dev);
+
+	down_read(&zram->init_lock);
+	val = zram->max_comp_streams;
+	up_read(&zram->init_lock);
+
+	return sprintf(buf, "%d\n", val);
+}
+
+static ssize_t max_comp_streams_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t len)
+{
+	int num;
+	struct zram *zram = dev_to_zram(dev);
+
+	if (kstrtoint(buf, 0, &num))
+		return -EINVAL;
+	if (num < 1)
+		return -EINVAL;
+	down_write(&zram->init_lock);
+	if (init_done(zram)) {
+		up_write(&zram->init_lock);
+		pr_info("Can't set max_comp_streams for initialized device\n");
+		return -EBUSY;
+	}
+	zram->max_comp_streams = num;
+	up_write(&zram->init_lock);
+	return len;
+}
+
 /* flag operations needs meta->tb_lock */
 static int zram_test_flag(struct zram_meta *meta, u32 index,
 			enum zram_pageflags flag)
@@ -502,6 +536,8 @@ static void zram_reset_device(struct zram *zram, bool reset_capacity)
 	}
 
 	zcomp_destroy(zram->comp);
+	zram->max_comp_streams = 1;
+
 	zram_meta_free(zram->meta);
 	zram->meta = NULL;
 	/* Reset stats */
@@ -530,7 +566,7 @@ static ssize_t disksize_store(struct device *dev,
 		return -EBUSY;
 	}
 
-	zram->comp = zcomp_create(default_compressor);
+	zram->comp = zcomp_create(default_compressor, zram->max_comp_streams);
 	if (!zram->comp) {
 		up_write(&zram->init_lock);
 		pr_info("Cannot initialise %s compressing backend\n",
@@ -693,6 +729,8 @@ static DEVICE_ATTR(initstate, S_IRUGO, initstate_show, NULL);
 static DEVICE_ATTR(reset, S_IWUSR, NULL, reset_store);
 static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
 static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
+static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
+		max_comp_streams_show, max_comp_streams_store);
 
 ZRAM_ATTR_RO(num_reads);
 ZRAM_ATTR_RO(num_writes);
@@ -717,6 +755,7 @@ static struct attribute *zram_disk_attrs[] = {
 	&dev_attr_orig_data_size.attr,
 	&dev_attr_compr_data_size.attr,
 	&dev_attr_mem_used_total.attr,
+	&dev_attr_max_comp_streams.attr,
 	NULL,
 };
 
@@ -779,6 +818,7 @@ static int create_device(struct zram *zram, int device_id)
 	}
 
 	zram->meta = NULL;
+	zram->max_comp_streams = 1;
 	return 0;
 
 out_free_disk:
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 45e04f7..ccf36d1 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -99,7 +99,7 @@ struct zram {
 	 * we can store in a disk.
 	 */
 	u64 disksize;	/* bytes */
-
+	int max_comp_streams;
 	struct zram_stats stats;
 };
 #endif
-- 
1.9.0.359.g5e34a15



* [PATCHv9 5/7] zram: add set_max_streams knob
  2014-02-28 17:52 [PATCHv9 0/7] add compressing abstraction and multi stream support Sergey Senozhatsky
                   ` (3 preceding siblings ...)
  2014-02-28 17:52 ` [PATCHv9 4/7] zram: add multi stream functionality Sergey Senozhatsky
@ 2014-02-28 17:52 ` Sergey Senozhatsky
  2014-02-28 17:52 ` [PATCHv9 6/7] zram: make compression algorithm selection possible Sergey Senozhatsky
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Sergey Senozhatsky @ 2014-02-28 17:52 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Jerome Marchand, Nitin Gupta, linux-kernel,
	Sergey Senozhatsky

This patch allows changing max_comp_streams on an initialised zcomp.

Introduce zcomp set_max_streams() knob, zcomp_strm_multi_set_max_streams()
and zcomp_strm_single_set_max_streams() callbacks to change streams limit
for zcomp_strm_multi and zcomp_strm_single, respectively. set_max_streams
for single stream zcomp changes nothing and returns -ENOTSUPP.

If the user has lowered the limit, then zcomp_strm_multi_set_max_streams()
attempts to immediately free extra streams (as many as it can, depending
on idle streams availability).
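
The trimming behaviour can be sketched in userspace C as follows. This is a
single-threaded illustration, so the strm_lock locking is omitted; struct
pool, pool_add_idle() and pool_set_max() are hypothetical names, and
malloc()/free() stand in for zcomp_strm_alloc()/zcomp_strm_free().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct zcomp_strm. */
struct strm {
	struct strm *next;
};

/* Hypothetical stand-in for struct zcomp_strm_multi, without locking. */
struct pool {
	int max_strm;		/* upper limit on allocated streams */
	int avail_strm;		/* streams allocated so far (idle + busy) */
	struct strm *idle;	/* idle streams only */
};

/* Allocate one stream and put it on the idle list. */
static int pool_add_idle(struct pool *p)
{
	struct strm *s = malloc(sizeof(*s));

	if (!s)
		return -1;
	s->next = p->idle;
	p->idle = s;
	p->avail_strm++;
	return 0;
}

/*
 * Store the new limit, then free idle streams until the limit is met or
 * the idle list runs dry. Busy streams are not touched here.
 */
static int pool_set_max(struct pool *p, int num_strm)
{
	p->max_strm = num_strm;
	while (p->avail_strm > num_strm && p->idle) {
		struct strm *s = p->idle;

		p->idle = s->next;
		free(s);
		p->avail_strm--;
	}
	return 0;
}
```

Streams that are busy when the limit is lowered stay allocated at this point,
which is why the commit message says "as many as it can".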

Note, this patch does not allow changing the stream 'policy' from single to
multi stream (or vice versa) on an already initialised compression backend.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 drivers/block/zram/zcomp.c    | 36 ++++++++++++++++++++++++++++++++++++
 drivers/block/zram/zcomp.h    |  3 +++
 drivers/block/zram/zram_drv.c |  5 ++---
 3 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index c06f75f..ac276f7 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -136,6 +136,29 @@ static void zcomp_strm_multi_release(struct zcomp *comp, struct zcomp_strm *zstr
 	zcomp_strm_free(comp, zstrm);
 }
 
+/* change max_strm limit */
+static int zcomp_strm_multi_set_max_streams(struct zcomp *comp, int num_strm)
+{
+	struct zcomp_strm_multi *zs = comp->stream;
+	struct zcomp_strm *zstrm;
+
+	spin_lock(&zs->strm_lock);
+	zs->max_strm = num_strm;
+	/*
+	 * if user has lowered the limit and there are idle streams,
+	 * immediately free as much streams (and memory) as we can.
+	 */
+	while (zs->avail_strm > num_strm && !list_empty(&zs->idle_strm)) {
+		zstrm = list_entry(zs->idle_strm.next,
+				struct zcomp_strm, list);
+		list_del(&zstrm->list);
+		zcomp_strm_free(comp, zstrm);
+		zs->avail_strm--;
+	}
+	spin_unlock(&zs->strm_lock);
+	return 0;
+}
+
 static void zcomp_strm_multi_destroy(struct zcomp *comp)
 {
 	struct zcomp_strm_multi *zs = comp->stream;
@@ -158,6 +181,7 @@ static int zcomp_strm_multi_create(struct zcomp *comp, int max_strm)
 	comp->destroy = zcomp_strm_multi_destroy;
 	comp->strm_find = zcomp_strm_multi_find;
 	comp->strm_release = zcomp_strm_multi_release;
+	comp->set_max_streams = zcomp_strm_multi_set_max_streams;
 	zs = kmalloc(sizeof(struct zcomp_strm_multi), GFP_KERNEL);
 	if (!zs)
 		return -ENOMEM;
@@ -192,6 +216,12 @@ static void zcomp_strm_single_release(struct zcomp *comp,
 	mutex_unlock(&zs->strm_lock);
 }
 
+static int zcomp_strm_single_set_max_streams(struct zcomp *comp, int num_strm)
+{
+	/* zcomp_strm_single support only max_comp_streams == 1 */
+	return -ENOTSUPP;
+}
+
 static void zcomp_strm_single_destroy(struct zcomp *comp)
 {
 	struct zcomp_strm_single *zs = comp->stream;
@@ -206,6 +236,7 @@ static int zcomp_strm_single_create(struct zcomp *comp)
 	comp->destroy = zcomp_strm_single_destroy;
 	comp->strm_find = zcomp_strm_single_find;
 	comp->strm_release = zcomp_strm_single_release;
+	comp->set_max_streams = zcomp_strm_single_set_max_streams;
 	zs = kmalloc(sizeof(struct zcomp_strm_single), GFP_KERNEL);
 	if (!zs)
 		return -ENOMEM;
@@ -220,6 +251,11 @@ static int zcomp_strm_single_create(struct zcomp *comp)
 	return 0;
 }
 
+int zcomp_set_max_streams(struct zcomp *comp, int num_strm)
+{
+	return comp->set_max_streams(comp, num_strm);
+}
+
 struct zcomp_strm *zcomp_strm_find(struct zcomp *comp)
 {
 	return comp->strm_find(comp);
diff --git a/drivers/block/zram/zcomp.h b/drivers/block/zram/zcomp.h
index 2a36844..bd11d59 100644
--- a/drivers/block/zram/zcomp.h
+++ b/drivers/block/zram/zcomp.h
@@ -46,6 +46,7 @@ struct zcomp {
 
 	struct zcomp_strm *(*strm_find)(struct zcomp *comp);
 	void (*strm_release)(struct zcomp *comp, struct zcomp_strm *zstrm);
+	int (*set_max_streams)(struct zcomp *comp, int num_strm);
 	void (*destroy)(struct zcomp *comp);
 };
 
@@ -60,4 +61,6 @@ int zcomp_compress(struct zcomp *comp, struct zcomp_strm *zstrm,
 
 int zcomp_decompress(struct zcomp *comp, const unsigned char *src,
 		size_t src_len, unsigned char *dst);
+
+int zcomp_set_max_streams(struct zcomp *comp, int num_strm);
 #endif /* _ZCOMP_H_ */
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 75bbc37..2de17f6 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -133,9 +133,8 @@ static ssize_t max_comp_streams_store(struct device *dev,
 		return -EINVAL;
 	down_write(&zram->init_lock);
 	if (init_done(zram)) {
-		up_write(&zram->init_lock);
-		pr_info("Can't set max_comp_streams for initialized device\n");
-		return -EBUSY;
+		if (zcomp_set_max_streams(zram->comp, num))
+			pr_info("Cannot change max compression streams\n");
 	}
 	zram->max_comp_streams = num;
 	up_write(&zram->init_lock);
-- 
1.9.0.359.g5e34a15



* [PATCHv9 6/7] zram: make compression algorithm selection possible
  2014-02-28 17:52 [PATCHv9 0/7] add compressing abstraction and multi stream support Sergey Senozhatsky
                   ` (4 preceding siblings ...)
  2014-02-28 17:52 ` [PATCHv9 5/7] zram: add set_max_streams knob Sergey Senozhatsky
@ 2014-02-28 17:52 ` Sergey Senozhatsky
  2014-02-28 17:52 ` [PATCHv9 7/7] zram: add lz4 algorithm backend Sergey Senozhatsky
  2014-03-06  8:11 ` [PATCHv9 0/7] add compressing abstraction and multi stream support Minchan Kim
  7 siblings, 0 replies; 13+ messages in thread
From: Sergey Senozhatsky @ 2014-02-28 17:52 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Jerome Marchand, Nitin Gupta, linux-kernel,
	Sergey Senozhatsky

Add and document `comp_algorithm' device attribute. This attribute
shows the supported compression algorithms, with the currently selected
one in square brackets:
	cat /sys/block/zram0/comp_algorithm
	[lzo] lz4

and changes the selected compression algorithm:
	echo lzo > /sys/block/zram0/comp_algorithm
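
The bracketed output format can be reproduced with a small userspace sketch.
It assumes a fixed, NULL-terminated name table and plain strcmp() matching
(the patch itself uses sysfs_streq(), which also tolerates a trailing
newline); available_show() and its size argument are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical NULL-terminated table of backend names. */
static const char *const backends[] = { "lzo", "lz4", NULL };

/*
 * Write all backend names into buf, separated by spaces, with the
 * selected one in square brackets, followed by a newline.
 * Returns the number of bytes written, or -1 if buf is too small.
 */
static int available_show(const char *selected, char *buf, size_t size)
{
	size_t sz = 0;
	int i, n;

	for (i = 0; backends[i]; i++) {
		if (strcmp(selected, backends[i]) == 0)
			n = snprintf(buf + sz, size - sz, "[%s] ", backends[i]);
		else
			n = snprintf(buf + sz, size - sz, "%s ", backends[i]);
		if (n < 0 || (size_t)n >= size - sz)
			return -1;
		sz += (size_t)n;
	}
	n = snprintf(buf + sz, size - sz, "\n");
	if (n < 0 || (size_t)n >= size - sz)
		return -1;
	return (int)(sz + (size_t)n);
}
```

As in the patch, every name is followed by a space, so the line ends with a
space before the newline.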

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 Documentation/ABI/testing/sysfs-block-zram |  8 +++++++
 Documentation/blockdev/zram.txt            | 24 +++++++++++++++----
 drivers/block/zram/zcomp.c                 | 32 +++++++++++++++++++++++---
 drivers/block/zram/zcomp.h                 |  2 ++
 drivers/block/zram/zram_drv.c              | 37 +++++++++++++++++++++++++++---
 drivers/block/zram/zram_drv.h              |  1 +
 6 files changed, 93 insertions(+), 11 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram
index 0da9ed6..70ec992 100644
--- a/Documentation/ABI/testing/sysfs-block-zram
+++ b/Documentation/ABI/testing/sysfs-block-zram
@@ -65,6 +65,14 @@ Description:
 		number of backend's zcomp_strm compression streams (number of
 		concurrent compress operations).
 
+What:		/sys/block/zram<id>/comp_algorithm
+Date:		February 2014
+Contact:	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
+Description:
+		The comp_algorithm file is read-write. It shows the
+		available and selected compression algorithms, and changes
+		the compression algorithm selection.
+
 What:		/sys/block/zram<id>/notify_free
 Date:		August 2010
 Contact:	Nitin Gupta <ngupta@vflare.org>
diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
index aadfe60..2604ffe 100644
--- a/Documentation/blockdev/zram.txt
+++ b/Documentation/blockdev/zram.txt
@@ -42,7 +42,21 @@ will not take any effect, because single stream compression backend implemented
 as a special case and does not support dynamic max_comp_streams. Only multi
 stream backend supports dynamic max_comp_streams adjustment.
 
-3) Set Disksize
+3) Select compression algorithm
+	Using comp_algorithm device attribute one can see available and
+	currently selected (shown in square brackets) compression algorithms,
+	change selected compression algorithm (once the device is initialised
+	there is no way to change compression algorithm).
+
+	Examples:
+	#show supported compression algorithms
+	cat /sys/block/zram0/comp_algorithm
+	lzo [lz4]
+
+	#select lzo compression algorithm
+	echo lzo > /sys/block/zram0/comp_algorithm
+
+4) Set Disksize
         Set disk size by writing the value to sysfs node 'disksize'.
         The value can be either in bytes or you can use mem suffixes.
         Examples:
@@ -59,14 +73,14 @@ There is little point creating a zram of greater than twice the size of memory
 since we expect a 2:1 compression ratio. Note that zram uses about 0.1% of the
 size of the disk when not in use so a huge zram is wasteful.
 
-4) Activate:
+5) Activate:
 	mkswap /dev/zram0
 	swapon /dev/zram0
 
 	mkfs.ext4 /dev/zram1
 	mount /dev/zram1 /tmp
 
-5) Stats:
+6) Stats:
 	Per-device statistics are exported as various nodes under
 	/sys/block/zram<id>/
 		disksize
@@ -81,11 +95,11 @@ size of the disk when not in use so a huge zram is wasteful.
 		compr_data_size
 		mem_used_total
 
-6) Deactivate:
+7) Deactivate:
 	swapoff /dev/zram0
 	umount /dev/zram1
 
-7) Reset:
+8) Reset:
 	Write any positive value to 'reset' sysfs node
 	echo 1 > /sys/block/zram0/reset
 	echo 1 > /sys/block/zram1/reset
diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index ac276f7..aad533a 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -39,11 +39,20 @@ struct zcomp_strm_multi {
 	wait_queue_head_t strm_wait;
 };
 
+static struct zcomp_backend *backends[] = {
+	&zcomp_lzo,
+	NULL
+};
+
 static struct zcomp_backend *find_backend(const char *compress)
 {
-	if (strncmp(compress, "lzo", 3) == 0)
-		return &zcomp_lzo;
-	return NULL;
+	int i = 0;
+	while (backends[i]) {
+		if (sysfs_streq(compress, backends[i]->name))
+			break;
+		i++;
+	}
+	return backends[i];
 }
 
 static void zcomp_strm_free(struct zcomp *comp, struct zcomp_strm *zstrm)
@@ -251,6 +260,23 @@ static int zcomp_strm_single_create(struct zcomp *comp)
 	return 0;
 }
 
+/* show available compressors */
+ssize_t zcomp_available_show(const char *comp, char *buf)
+{
+	ssize_t sz = 0;
+	int i = 0;
+
+	while (backends[i]) {
+		if (sysfs_streq(comp, backends[i]->name))
+			sz += sprintf(buf + sz, "[%s] ", backends[i]->name);
+		else
+			sz += sprintf(buf + sz, "%s ", backends[i]->name);
+		i++;
+	}
+	sz += sprintf(buf + sz, "\n");
+	return sz;
+}
+
 int zcomp_set_max_streams(struct zcomp *comp, int num_strm)
 {
 	return comp->set_max_streams(comp, num_strm);
diff --git a/drivers/block/zram/zcomp.h b/drivers/block/zram/zcomp.h
index bd11d59..8b8997f 100644
--- a/drivers/block/zram/zcomp.h
+++ b/drivers/block/zram/zcomp.h
@@ -50,6 +50,8 @@ struct zcomp {
 	void (*destroy)(struct zcomp *comp);
 };
 
+ssize_t zcomp_available_show(const char *comp, char *buf);
+
 struct zcomp *zcomp_create(const char *comp, int max_strm);
 void zcomp_destroy(struct zcomp *comp);
 
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 2de17f6..dfe52ed 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -141,6 +141,34 @@ static ssize_t max_comp_streams_store(struct device *dev,
 	return len;
 }
 
+static ssize_t comp_algorithm_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	size_t sz;
+	struct zram *zram = dev_to_zram(dev);
+
+	down_read(&zram->init_lock);
+	sz = zcomp_available_show(zram->compressor, buf);
+	up_read(&zram->init_lock);
+
+	return sz;
+}
+
+static ssize_t comp_algorithm_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t len)
+{
+	struct zram *zram = dev_to_zram(dev);
+	down_write(&zram->init_lock);
+	if (init_done(zram)) {
+		up_write(&zram->init_lock);
+		pr_info("Can't change algorithm for initialized device\n");
+		return -EBUSY;
+	}
+	strlcpy(zram->compressor, buf, sizeof(zram->compressor));
+	up_write(&zram->init_lock);
+	return len;
+}
+
 /* flag operations needs meta->tb_lock */
 static int zram_test_flag(struct zram_meta *meta, u32 index,
 			enum zram_pageflags flag)
@@ -565,11 +593,11 @@ static ssize_t disksize_store(struct device *dev,
 		return -EBUSY;
 	}
 
-	zram->comp = zcomp_create(default_compressor, zram->max_comp_streams);
+	zram->comp = zcomp_create(zram->compressor, zram->max_comp_streams);
 	if (!zram->comp) {
 		up_write(&zram->init_lock);
 		pr_info("Cannot initialise %s compressing backend\n",
-				default_compressor);
+				zram->compressor);
 		return -EINVAL;
 	}
 
@@ -730,6 +758,8 @@ static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
 static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
 static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
 		max_comp_streams_show, max_comp_streams_store);
+static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
+		comp_algorithm_show, comp_algorithm_store);
 
 ZRAM_ATTR_RO(num_reads);
 ZRAM_ATTR_RO(num_writes);
@@ -755,6 +785,7 @@ static struct attribute *zram_disk_attrs[] = {
 	&dev_attr_compr_data_size.attr,
 	&dev_attr_mem_used_total.attr,
 	&dev_attr_max_comp_streams.attr,
+	&dev_attr_comp_algorithm.attr,
 	NULL,
 };
 
@@ -815,7 +846,7 @@ static int create_device(struct zram *zram, int device_id)
 		pr_warn("Error creating sysfs group");
 		goto out_free_disk;
 	}
-
+	strlcpy(zram->compressor, default_compressor, sizeof(zram->compressor));
 	zram->meta = NULL;
 	zram->max_comp_streams = 1;
 	return 0;
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index ccf36d1..7f21c14 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -101,5 +101,6 @@ struct zram {
 	u64 disksize;	/* bytes */
 	int max_comp_streams;
 	struct zram_stats stats;
+	char compressor[10];
 };
 #endif
-- 
1.9.0.359.g5e34a15



* [PATCHv9 7/7] zram: add lz4 algorithm backend
  2014-02-28 17:52 [PATCHv9 0/7] add compressing abstraction and multi stream support Sergey Senozhatsky
                   ` (5 preceding siblings ...)
  2014-02-28 17:52 ` [PATCHv9 6/7] zram: make compression algorithm selection possible Sergey Senozhatsky
@ 2014-02-28 17:52 ` Sergey Senozhatsky
  2014-03-06  8:11 ` [PATCHv9 0/7] add compressing abstraction and multi stream support Minchan Kim
  7 siblings, 0 replies; 13+ messages in thread
From: Sergey Senozhatsky @ 2014-02-28 17:52 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Jerome Marchand, Nitin Gupta, linux-kernel,
	Sergey Senozhatsky

Introduce LZ4 compression backend and make it available for selection.
LZ4 support is optional and requires the user to set the ZRAM_LZ4_COMPRESS
config option. The default compression backend is LZO.

TEST

(x86_64, core i5, 2 cores + 2 hyperthreading, zram disk size 1G,
ext4 file system, 3 compression streams)

iozone -t 3 -R -r 16K -s 60M -I +Z

       Test           LZO           LZ4
----------------------------------------------
  Initial write   1642744.62    1317005.09
        Rewrite   2498980.88    1800645.16
           Read   3957026.38    5877043.75
        Re-read   3950997.38    5861847.00
   Reverse Read   2937114.56    5047384.00
    Stride read   2948163.19    4929587.38
    Random read   3292692.69    4880793.62
 Mixed workload   1545602.62    3502940.38
   Random write   2448039.75    1758786.25
         Pwrite   1670051.03    1338329.69
          Pread   2530682.00    5097177.62
         Fwrite   3232085.62    3275942.56
          Fread   6306880.25    6645271.12

So on my system LZ4 is slower in write-only tests, while it performs
better in read-only and mixed (reads + writes) tests.

Official LZ4 benchmarks are available at http://code.google.com/p/lz4/
(the Linux kernel uses revision r90).

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 drivers/block/zram/Kconfig     | 10 +++++++++
 drivers/block/zram/Makefile    |  2 ++
 drivers/block/zram/zcomp.c     |  6 ++++++
 drivers/block/zram/zcomp_lz4.c | 47 ++++++++++++++++++++++++++++++++++++++++++
 drivers/block/zram/zcomp_lz4.h | 17 +++++++++++++++
 5 files changed, 82 insertions(+)
 create mode 100644 drivers/block/zram/zcomp_lz4.c
 create mode 100644 drivers/block/zram/zcomp_lz4.h

diff --git a/drivers/block/zram/Kconfig b/drivers/block/zram/Kconfig
index 3450be8..6489c0f 100644
--- a/drivers/block/zram/Kconfig
+++ b/drivers/block/zram/Kconfig
@@ -15,6 +15,16 @@ config ZRAM
 
 	  See zram.txt for more information.
 
+config ZRAM_LZ4_COMPRESS
+	bool "Enable LZ4 algorithm support"
+	depends on ZRAM
+	select LZ4_COMPRESS
+	select LZ4_DECOMPRESS
+	default n
+	help
+	  This option enables LZ4 compression algorithm support. Compression
+	  algorithm can be changed using `comp_algorithm' device attribute.
+
 config ZRAM_DEBUG
 	bool "Compressed RAM block device debug support"
 	depends on ZRAM
diff --git a/drivers/block/zram/Makefile b/drivers/block/zram/Makefile
index 757c6a5..be0763f 100644
--- a/drivers/block/zram/Makefile
+++ b/drivers/block/zram/Makefile
@@ -1,3 +1,5 @@
 zram-y	:=	zcomp_lzo.o zcomp.o zram_drv.o
 
+zram-$(CONFIG_ZRAM_LZ4_COMPRESS) += zcomp_lz4.o
+
 obj-$(CONFIG_ZRAM)	+=	zram.o
diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index aad533a..d591903 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -15,6 +15,9 @@
 
 #include "zcomp.h"
 #include "zcomp_lzo.h"
+#ifdef CONFIG_ZRAM_LZ4_COMPRESS
+#include "zcomp_lz4.h"
+#endif
 
 /*
  * single zcomp_strm backend
@@ -41,6 +44,9 @@ struct zcomp_strm_multi {
 
 static struct zcomp_backend *backends[] = {
 	&zcomp_lzo,
+#ifdef CONFIG_ZRAM_LZ4_COMPRESS
+	&zcomp_lz4,
+#endif
 	NULL
 };
 
diff --git a/drivers/block/zram/zcomp_lz4.c b/drivers/block/zram/zcomp_lz4.c
new file mode 100644
index 0000000..f2afb7e
--- /dev/null
+++ b/drivers/block/zram/zcomp_lz4.c
@@ -0,0 +1,47 @@
+/*
+ * Copyright (C) 2014 Sergey Senozhatsky.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/lz4.h>
+
+#include "zcomp_lz4.h"
+
+static void *zcomp_lz4_create(void)
+{
+	return kzalloc(LZ4_MEM_COMPRESS, GFP_KERNEL);
+}
+
+static void zcomp_lz4_destroy(void *private)
+{
+	kfree(private);
+}
+
+static int zcomp_lz4_compress(const unsigned char *src, unsigned char *dst,
+		size_t *dst_len, void *private)
+{
+	/* return  : Success if return 0 */
+	return lz4_compress(src, PAGE_SIZE, dst, dst_len, private);
+}
+
+static int zcomp_lz4_decompress(const unsigned char *src, size_t src_len,
+		unsigned char *dst)
+{
+	size_t dst_len = PAGE_SIZE;
+	/* return  : Success if return 0 */
+	return lz4_decompress_unknownoutputsize(src, src_len, dst, &dst_len);
+}
+
+struct zcomp_backend zcomp_lz4 = {
+	.compress = zcomp_lz4_compress,
+	.decompress = zcomp_lz4_decompress,
+	.create = zcomp_lz4_create,
+	.destroy = zcomp_lz4_destroy,
+	.name = "lz4",
+};
diff --git a/drivers/block/zram/zcomp_lz4.h b/drivers/block/zram/zcomp_lz4.h
new file mode 100644
index 0000000..60613fb
--- /dev/null
+++ b/drivers/block/zram/zcomp_lz4.h
@@ -0,0 +1,17 @@
+/*
+ * Copyright (C) 2014 Sergey Senozhatsky.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _ZCOMP_LZ4_H_
+#define _ZCOMP_LZ4_H_
+
+#include "zcomp.h"
+
+extern struct zcomp_backend zcomp_lz4;
+
+#endif /* _ZCOMP_LZ4_H_ */
-- 
1.9.0.359.g5e34a15
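
A note on the zcomp.c hunk above: backends are registered in a
NULL-terminated array of pointers, and each backend carries a .name
string ("lz4" here). A minimal userspace sketch of looking a backend up
by name in such a table follows. The names struct backend, backend_lzo,
backend_lz4 and find_backend() are assumptions made for illustration and
are not taken from the kernel's zcomp code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a compression backend descriptor. */
struct backend {
	const char *name;
};

struct backend backend_lzo = { .name = "lzo" };
struct backend backend_lz4 = { .name = "lz4" };

/* NULL-terminated table, same shape as backends[] in the hunk above. */
struct backend *backends[] = {
	&backend_lzo,
	&backend_lz4,
	NULL
};

/* Return the backend whose name matches, or NULL if there is none. */
struct backend *find_backend(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; backends[i] != NULL; i++) {
		if (strcmp(backends[i]->name, name) == 0)
			return backends[i];
	}
	return NULL;
}
```

With this layout, adding an algorithm only means adding one more entry
before the terminating NULL; the lookup loop itself does not change.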



* Re: [PATCHv9 0/7] add compressing abstraction and multi stream support
  2014-02-28 17:52 [PATCHv9 0/7] add compressing abstraction and multi stream support Sergey Senozhatsky
                   ` (6 preceding siblings ...)
  2014-02-28 17:52 ` [PATCHv9 7/7] zram: add lz4 algorithm backend Sergey Senozhatsky
@ 2014-03-06  8:11 ` Minchan Kim
  2014-03-06 11:27   ` Sergey Senozhatsky
  2014-04-16 13:53   ` Bartlomiej Zolnierkiewicz
  7 siblings, 2 replies; 13+ messages in thread
From: Minchan Kim @ 2014-03-06  8:11 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, Jerome Marchand, Nitin Gupta, linux-kernel

Hello Sergey,

Sorry for the late reply.

Today I tested this patchset and can confirm that it's really good.
I'm sending the results for the record.

On x86 (4 cores with 2x hyper-threading, i7, 2.8GHz), I ran 4 parallel dd
tests with a 200m file, like below:

dd if=./test200m.file of=mnt/file1 bs=512k count=1024 oflag=direct &
dd if=./test200m.file of=mnt/file2 bs=512k count=1024 oflag=direct &
dd if=./test200m.file of=mnt/file3 bs=512k count=1024 oflag=direct &
dd if=./test200m.file of=mnt/file4 bs=512k count=1024 oflag=direct &
wait	# for all background jobs to finish

The results are:

With max_comp_streams = 1, elapsed time is 36.26s
With max_comp_streams = 8, elapsed time is 4.09s

On ARM (4 cores, ARMv7, 1.5GHz), I ran an iozone test.

            1 max_comp_streams      8 max_comp_streams
== Initial write
records:    10                      10
avg:        141964.05               239632.54
std:        3532.11 (2.49%)         43863.53 (18.30%)
max:        147900.45               319046.62
min:        135363.73               178929.40
== Rewrite
records:    10                      10
avg:        144757.46               247410.80
std:        4019.19 (2.78%)         37378.42 (15.11%)
max:        150757.72               293284.84
min:        135127.55               188984.27
== Read
records:    10                      10
avg:        208325.22               202894.24
std:        57072.62 (27.40%)       41099.56 (20.26%)
max:        293428.96               289581.12
min:        79445.37                157478.27
== Re-read
records:    10                      10
avg:        204750.36               237406.96
std:        36959.99 (18.05%)       41518.36 (17.49%)
max:        268399.89               286898.13
min:        154831.28               160326.88
== Reverse Read
records:    10                      10
avg:        215043.10               208946.35
std:        31239.60 (14.53%)       38859.74 (18.60%)
max:        251564.57               284481.31
min:        154719.20               155024.33
== Stride read
records:    10                      10
avg:        227246.54               198925.10
std:        31105.89 (13.69%)       30721.86 (15.44%)
max:        290020.34               227178.70
min:        157399.46               153592.91
== Random read
records:    10                      10
avg:        238239.81               216298.41
std:        37276.91 (15.65%)       38194.73 (17.66%)
max:        291416.20               286345.37
min:        152734.23               151871.52
== Mixed workload
records:    10                      10
avg:        208434.11               234355.66
std:        31385.40 (15.06%)       22064.02 (9.41%)
max:        253990.11               270412.58
min:        162041.47               186052.12
== Random write
records:    10                      10
avg:        142172.54               290231.28
std:        6233.67 (4.38%)         46462.35 (16.01%)
max:        150652.40               338096.54
min:        130584.14               183253.25
== Pwrite
records:    10                      10
avg:        141247.91               267085.70
std:        6756.08 (4.78%)         40019.39 (14.98%)
max:        150239.13               335512.33
min:        130456.98               180832.45
== Pread
records:    10                      10
avg:        214990.26               208730.94
std:        40701.79 (18.93%)       50797.78 (24.34%)
max:        287060.54               300675.25
min:        157642.17               156763.98

So all write tests, on both x86 and ARM, are a really huge win,
and I couldn't find any regression!
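
To put numbers on the win, the ratios can be worked out directly from
the figures reported in this mail. The sketch below is only an
illustration of that arithmetic; the helper names time_speedup() and
throughput_ratio() are made up for this example.

```c
#include <assert.h>

/* How many times faster a run of t_new seconds is than one of t_old. */
double time_speedup(double t_old, double t_new)
{
	assert(t_old > 0.0 && t_new > 0.0);
	return t_old / t_new;
}

/* Ratio of two average throughput figures (after relative to before). */
double throughput_ratio(double before, double after)
{
	assert(before > 0.0);
	return after / before;
}
```

For the x86 dd test, time_speedup(36.26, 4.09) is roughly 8.87. For the
ARM iozone averages, throughput_ratio(141964.05, 239632.54) is roughly
1.69 for Initial write, and throughput_ratio(142172.54, 290231.28) is
roughly 2.04 for Random write.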

Thanks for the nice work.

For the whole patchset,

Acked-by: Minchan Kim <minchan@kernel.org>

On Fri, Feb 28, 2014 at 08:52:00PM +0300, Sergey Senozhatsky wrote:
> This patchset introduces zcomp compression backend abstraction
> adding ability to support compression algorithms other than LZO;
> support for multi compression streams, making parallel compressions
> possible; adds support for LZ4 compression algorithm.
> 
> v8->v9 (reviewed by Andrew Morton):
> -- add LZ4 backend (+iozone test vs LZO)
> -- merge patches 'zram: document max_comp_streams' and 'zram: add multi
>    stream functionality'
> -- do not extern backend struct from source file
> -- use find()/release() naming instead of get()/put()
> -- minor code, commit messages and code comments `nitpicks'
> -- removed Acked-by Minchan Kim from first two patches, because I've
>    changed them a bit.
> 
> v7->v8 (reviewed by Minchan Kim):
> -- merge patches 'add multi stream functionality' and 'enable multi
>    stream compression support in zram'
> -- return status code from set_max_streams knob and print message on
>    error
> -- do not use atomic type for ->avail_strm
> -- return back: allocate by default only one stream for multi stream backend
> -- wake sleeping write in zcomp_strm_multi_put() only if we put stream
>    to idle list
> -- minor code `nitpicks'
> 
> v6->v7 (reviewed by Minchan Kim):
> -- enable multi and single stream support out of the box (drop
>    ZRAM_MULTI_STREAM config option)
> -- add set_max_stream knob, so we can adjust max number of compression
>    streams in runtime (for multi stream backend at the moment)
> -- minor code `nitpicks'
> 
> v5->v6 (reviewed by Minchan Kim):
> -- handle single compression stream case separately, using mutex locking,
>    to address perfomance regression
> -- handle multi compression stream using spin lock and wait_event()/wake_up()
> -- make multi compression stream support configurable (ZRAM_MULTI_STREAM
>    config option)
> 
> v4->v5 (reviewed by Minchan Kim):
> -- renamed zcomp buffer_lock; removed src len and dst len from
>    compress() and decompress(); not using term `buffer' and
>    `workmem' in code and documentation; define compress() and
>    decompress() functions for LZO backend; not using goto's;
>    do not put idle zcomp_strm to idle list tail.
> 
> v3->v4 (reviewed by Minchan Kim):
> -- renamed compression backend and working memory structs as requested
>    by Minchan Kim; fixed several issues noted by Minchan Kim.
> 
> Sergey Senozhatsky (7):
>   zram: introduce compressing backend abstraction
>   zram: use zcomp compressing backends
>   zram: factor out single stream compression
>   zram: add multi stream functionality
>   zram: add set_max_streams knob
>   zram: make compression algorithm selection possible
>   zram: add lz4 algorithm backend
> 
>  Documentation/ABI/testing/sysfs-block-zram |  17 +-
>  Documentation/blockdev/zram.txt            |  45 +++-
>  drivers/block/zram/Kconfig                 |  10 +
>  drivers/block/zram/Makefile                |   4 +-
>  drivers/block/zram/zcomp.c                 | 349 +++++++++++++++++++++++++++++
>  drivers/block/zram/zcomp.h                 |  68 ++++++
>  drivers/block/zram/zcomp_lz4.c             |  47 ++++
>  drivers/block/zram/zcomp_lz4.h             |  17 ++
>  drivers/block/zram/zcomp_lzo.c             |  47 ++++
>  drivers/block/zram/zcomp_lzo.h             |  17 ++
>  drivers/block/zram/zram_drv.c              | 131 ++++++++---
>  drivers/block/zram/zram_drv.h              |  11 +-
>  12 files changed, 715 insertions(+), 48 deletions(-)
>  create mode 100644 drivers/block/zram/zcomp.c
>  create mode 100644 drivers/block/zram/zcomp.h
>  create mode 100644 drivers/block/zram/zcomp_lz4.c
>  create mode 100644 drivers/block/zram/zcomp_lz4.h
>  create mode 100644 drivers/block/zram/zcomp_lzo.c
>  create mode 100644 drivers/block/zram/zcomp_lzo.h
> 
> -- 
> 1.9.0.359.g5e34a15
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
Kind regards,
Minchan Kim


* Re: [PATCHv9 0/7] add compressing abstraction and multi stream support
  2014-03-06  8:11 ` [PATCHv9 0/7] add compressing abstraction and multi stream support Minchan Kim
@ 2014-03-06 11:27   ` Sergey Senozhatsky
  2014-04-16 13:53   ` Bartlomiej Zolnierkiewicz
  1 sibling, 0 replies; 13+ messages in thread
From: Sergey Senozhatsky @ 2014-03-06 11:27 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Andrew Morton, Jerome Marchand, Nitin Gupta,
	linux-kernel

Hello Minchan,

On (03/06/14 17:11), Minchan Kim wrote:
> Hello Sergey,
> 
> Sorry for the late.
> 
> Today, I tested this patch and confirm that it's really good.
> I send result for the record.
> 
> In x86(4core and x2 hyper threading, i7, 2.8GHz), I did parallel 4 dd
> test with 200m file like below
> 
> dd if=./test200m.file of=mnt/file1 bs=512k count=1024 oflag=direct &
> dd if=./test200m.file of=mnt/file2 bs=512k count=1024 oflag=direct &
> dd if=./test200m.file of=mnt/file3 bs=512k count=1024 oflag=direct &
> dd if=./test200m.file of=mnt/file4 bs=512k count=1024 oflag=direct &
> wait all backgroud job
> 
> The result is 
> 
> When 1 max_comp_streams, elapsed time is 36.26s
> When 8 max_comp_streams, elapsed time is 4.09s
> 
> In ARM(4core, ARMv7, 1.5GHz), I did iozone test.
> 

many thanks, Minchan! I really appreciate all your help and time that you
spent helping me out with this patchset. good job!



p.s. I noticed that for the last two days (and still) fetchmail has been
working extremely badly with gmail. sorry if I lost someone's email or
replied late.

	-ss

>       1 max_comp_streams,      8 max_comp_streams
> ==Initial  write      ==Initial  write
> records:   10         records:   10
> avg:       141964.05  avg:       239632.54
> std:       3532.11    (2.49%)    std:       43863.53  (18.30%)
> max:       147900.45  max:       319046.62
> min:       135363.73  min:       178929.40
> ==Rewrite  ==Rewrite
> records:   10         records:   10
> avg:       144757.46  avg:       247410.80
> std:       4019.19    (2.78%)    std:       37378.42  (15.11%)
> max:       150757.72  max:       293284.84
> min:       135127.55  min:       188984.27
> ==Read     ==Read
> records:   10         records:   10
> avg:       208325.22  avg:       202894.24
> std:       57072.62   (27.40%)   std:       41099.56  (20.26%)
> max:       293428.96  max:       289581.12
> min:       79445.37   min:       157478.27
> ==Re-read  ==Re-read
> records:   10         records:   10
> avg:       204750.36  avg:       237406.96
> std:       36959.99   (18.05%)   std:       41518.36  (17.49%)
> max:       268399.89  max:       286898.13
> min:       154831.28  min:       160326.88
> ==Reverse  Read       ==Reverse  Read
> records:   10         records:   10
> avg:       215043.10  avg:       208946.35
> std:       31239.60   (14.53%)   std:       38859.74  (18.60%)
> max:       251564.57  max:       284481.31
> min:       154719.20  min:       155024.33
> ==Stride   read       ==Stride   read
> records:   10         records:   10
> avg:       227246.54  avg:       198925.10
> std:       31105.89   (13.69%)   std:       30721.86  (15.44%)
> max:       290020.34  max:       227178.70
> min:       157399.46  min:       153592.91
> ==Random   read       ==Random   read
> records:   10         records:   10
> avg:       238239.81  avg:       216298.41
> std:       37276.91   (15.65%)   std:       38194.73  (17.66%)
> max:       291416.20  max:       286345.37
> min:       152734.23  min:       151871.52
> ==Mixed    workload   ==Mixed    workload
> records:   10         records:   10
> avg:       208434.11  avg:       234355.66
> std:       31385.40   (15.06%)   std:       22064.02  (9.41%)
> max:       253990.11  max:       270412.58
> min:       162041.47  min:       186052.12
> ==Random   write      ==Random   write
> records:   10         records:   10
> avg:       142172.54  avg:       290231.28
> std:       6233.67    (4.38%)    std:       46462.35  (16.01%)
> max:       150652.40  max:       338096.54
> min:       130584.14  min:       183253.25
> ==Pwrite   ==Pwrite
> records:   10         records:   10
> avg:       141247.91  avg:       267085.70
> std:       6756.08    (4.78%)    std:       40019.39  (14.98%)
> max:       150239.13  max:       335512.33
> min:       130456.98  min:       180832.45
> ==Pread    ==Pread
> records:   10         records:   10
> avg:       214990.26  avg:       208730.94
> std:       40701.79   (18.93%)   std:       50797.78  (24.34%)
> max:       287060.54  max:       300675.25
> min:       157642.17  min:       156763.98
> 
> So, all write test both x86 and ARM is really huge win
> and I couldn't find any regression!
> 
> Thanks for nice work.
> 
> For all patchset,
> 
> Acked-by: Minchan Kim <minchan@kernel.org>
> 
> On Fri, Feb 28, 2014 at 08:52:00PM +0300, Sergey Senozhatsky wrote:
> > This patchset introduces zcomp compression backend abstraction
> > adding ability to support compression algorithms other than LZO;
> > support for multi compression streams, making parallel compressions
> > possible; adds support for LZ4 compression algorithm.
> > 
> > v8->v9 (reviewed by Andrew Morton):
> > -- add LZ4 backend (+iozone test vs LZO)
> > -- merge patches 'zram: document max_comp_streams' and 'zram: add multi
> >    stream functionality'
> > -- do not extern backend struct from source file
> > -- use find()/release() naming instead of get()/put()
> > -- minor code, commit messages and code comments `nitpicks'
> > -- removed Acked-by Minchan Kim from first two patches, because I've
> >    changed them a bit.
> > 
> > v7->v8 (reviewed by Minchan Kim):
> > -- merge patches 'add multi stream functionality' and 'enable multi
> >    stream compression support in zram'
> > -- return status code from set_max_streams knob and print message on
> >    error
> > -- do not use atomic type for ->avail_strm
> > -- return back: allocate by default only one stream for multi stream backend
> > -- wake sleeping write in zcomp_strm_multi_put() only if we put stream
> >    to idle list
> > -- minor code `nitpicks'
> > 
> > v6->v7 (reviewed by Minchan Kim):
> > -- enable multi and single stream support out of the box (drop
> >    ZRAM_MULTI_STREAM config option)
> > -- add set_max_stream knob, so we can adjust max number of compression
> >    streams in runtime (for multi stream backend at the moment)
> > -- minor code `nitpicks'
> > 
> > v5->v6 (reviewed by Minchan Kim):
> > -- handle single compression stream case separately, using mutex locking,
> >    to address perfomance regression
> > -- handle multi compression stream using spin lock and wait_event()/wake_up()
> > -- make multi compression stream support configurable (ZRAM_MULTI_STREAM
> >    config option)
> > 
> > v4->v5 (reviewed by Minchan Kim):
> > -- renamed zcomp buffer_lock; removed src len and dst len from
> >    compress() and decompress(); not using term `buffer' and
> >    `workmem' in code and documentation; define compress() and
> >    decompress() functions for LZO backend; not using goto's;
> >    do not put idle zcomp_strm to idle list tail.
> > 
> > v3->v4 (reviewed by Minchan Kim):
> > -- renamed compression backend and working memory structs as requested
> >    by Minchan Kim; fixed several issues noted by Minchan Kim.
> > 
> > Sergey Senozhatsky (7):
> >   zram: introduce compressing backend abstraction
> >   zram: use zcomp compressing backends
> >   zram: factor out single stream compression
> >   zram: add multi stream functionality
> >   zram: add set_max_streams knob
> >   zram: make compression algorithm selection possible
> >   zram: add lz4 algorithm backend
> > 
> >  Documentation/ABI/testing/sysfs-block-zram |  17 +-
> >  Documentation/blockdev/zram.txt            |  45 +++-
> >  drivers/block/zram/Kconfig                 |  10 +
> >  drivers/block/zram/Makefile                |   4 +-
> >  drivers/block/zram/zcomp.c                 | 349 +++++++++++++++++++++++++++++
> >  drivers/block/zram/zcomp.h                 |  68 ++++++
> >  drivers/block/zram/zcomp_lz4.c             |  47 ++++
> >  drivers/block/zram/zcomp_lz4.h             |  17 ++
> >  drivers/block/zram/zcomp_lzo.c             |  47 ++++
> >  drivers/block/zram/zcomp_lzo.h             |  17 ++
> >  drivers/block/zram/zram_drv.c              | 131 ++++++++---
> >  drivers/block/zram/zram_drv.h              |  11 +-
> >  12 files changed, 715 insertions(+), 48 deletions(-)
> >  create mode 100644 drivers/block/zram/zcomp.c
> >  create mode 100644 drivers/block/zram/zcomp.h
> >  create mode 100644 drivers/block/zram/zcomp_lz4.c
> >  create mode 100644 drivers/block/zram/zcomp_lz4.h
> >  create mode 100644 drivers/block/zram/zcomp_lzo.c
> >  create mode 100644 drivers/block/zram/zcomp_lzo.h
> > 
> > -- 
> > 1.9.0.359.g5e34a15
> > 
> 
> -- 
> Kind regards,
> Minchan Kim
> 


* Re: [PATCHv9 0/7] add compressing abstraction and multi stream support
  2014-03-06  8:11 ` [PATCHv9 0/7] add compressing abstraction and multi stream support Minchan Kim
  2014-03-06 11:27   ` Sergey Senozhatsky
@ 2014-04-16 13:53   ` Bartlomiej Zolnierkiewicz
  2014-04-16 14:53     ` Sergey Senozhatsky
  1 sibling, 1 reply; 13+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2014-04-16 13:53 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Andrew Morton, Jerome Marchand, Nitin Gupta,
	linux-kernel


Hi,

I'm a bit late on this patch series (sorry for that) but why are we not
using Crypto API for compression algorithm selection and multi stream
support?  Compared to the earlier patches for zram like the ones we did
in July 2013 [1] this patch series requires us to:

- implement compression algorithm support for each algorithm separately
  (when using Crypto API all compression algorithms supported by Crypto
  API are supported automatically)

- manually set the number of maximum active compression streams (earlier
  patches using Crypto API needed a lot less code and automatically scaled
  number of compression streams to number of online CPUs)

From what I see the pros of the current patch series are:

- dynamic selection of the compression algorithm

- ability to limit number of active streams below the number of online CPUs

However, I believe that both of the above features can also be implemented
on top of the code using Crypto API.  What am I missing?

[1] https://lkml.org/lkml/2013/7/30/365

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics

On Thursday, March 06, 2014 05:11:37 PM Minchan Kim wrote:
> Hello Sergey,
> 
> Sorry for the late.
> 
> Today, I tested this patch and confirm that it's really good.
> I send result for the record.
> 
> In x86(4core and x2 hyper threading, i7, 2.8GHz), I did parallel 4 dd
> test with 200m file like below
> 
> dd if=./test200m.file of=mnt/file1 bs=512k count=1024 oflag=direct &
> dd if=./test200m.file of=mnt/file2 bs=512k count=1024 oflag=direct &
> dd if=./test200m.file of=mnt/file3 bs=512k count=1024 oflag=direct &
> dd if=./test200m.file of=mnt/file4 bs=512k count=1024 oflag=direct &
> wait all backgroud job
> 
> The result is 
> 
> When 1 max_comp_streams, elapsed time is 36.26s
> When 8 max_comp_streams, elapsed time is 4.09s
> 
> In ARM(4core, ARMv7, 1.5GHz), I did iozone test.
> 
>       1 max_comp_streams,      8 max_comp_streams
> ==Initial  write      ==Initial  write
> records:   10         records:   10
> avg:       141964.05  avg:       239632.54
> std:       3532.11    (2.49%)    std:       43863.53  (18.30%)
> max:       147900.45  max:       319046.62
> min:       135363.73  min:       178929.40
> ==Rewrite  ==Rewrite
> records:   10         records:   10
> avg:       144757.46  avg:       247410.80
> std:       4019.19    (2.78%)    std:       37378.42  (15.11%)
> max:       150757.72  max:       293284.84
> min:       135127.55  min:       188984.27
> ==Read     ==Read
> records:   10         records:   10
> avg:       208325.22  avg:       202894.24
> std:       57072.62   (27.40%)   std:       41099.56  (20.26%)
> max:       293428.96  max:       289581.12
> min:       79445.37   min:       157478.27
> ==Re-read  ==Re-read
> records:   10         records:   10
> avg:       204750.36  avg:       237406.96
> std:       36959.99   (18.05%)   std:       41518.36  (17.49%)
> max:       268399.89  max:       286898.13
> min:       154831.28  min:       160326.88
> ==Reverse  Read       ==Reverse  Read
> records:   10         records:   10
> avg:       215043.10  avg:       208946.35
> std:       31239.60   (14.53%)   std:       38859.74  (18.60%)
> max:       251564.57  max:       284481.31
> min:       154719.20  min:       155024.33
> ==Stride   read       ==Stride   read
> records:   10         records:   10
> avg:       227246.54  avg:       198925.10
> std:       31105.89   (13.69%)   std:       30721.86  (15.44%)
> max:       290020.34  max:       227178.70
> min:       157399.46  min:       153592.91
> ==Random   read       ==Random   read
> records:   10         records:   10
> avg:       238239.81  avg:       216298.41
> std:       37276.91   (15.65%)   std:       38194.73  (17.66%)
> max:       291416.20  max:       286345.37
> min:       152734.23  min:       151871.52
> ==Mixed    workload   ==Mixed    workload
> records:   10         records:   10
> avg:       208434.11  avg:       234355.66
> std:       31385.40   (15.06%)   std:       22064.02  (9.41%)
> max:       253990.11  max:       270412.58
> min:       162041.47  min:       186052.12
> ==Random   write      ==Random   write
> records:   10         records:   10
> avg:       142172.54  avg:       290231.28
> std:       6233.67    (4.38%)    std:       46462.35  (16.01%)
> max:       150652.40  max:       338096.54
> min:       130584.14  min:       183253.25
> ==Pwrite   ==Pwrite
> records:   10         records:   10
> avg:       141247.91  avg:       267085.70
> std:       6756.08    (4.78%)    std:       40019.39  (14.98%)
> max:       150239.13  max:       335512.33
> min:       130456.98  min:       180832.45
> ==Pread    ==Pread
> records:   10         records:   10
> avg:       214990.26  avg:       208730.94
> std:       40701.79   (18.93%)   std:       50797.78  (24.34%)
> max:       287060.54  max:       300675.25
> min:       157642.17  min:       156763.98
> 
> So, all write test both x86 and ARM is really huge win
> and I couldn't find any regression!
> 
> Thanks for nice work.
> 
> For all patchset,
> 
> Acked-by: Minchan Kim <minchan@kernel.org>
> 
> On Fri, Feb 28, 2014 at 08:52:00PM +0300, Sergey Senozhatsky wrote:
> > This patchset introduces zcomp compression backend abstraction
> > adding ability to support compression algorithms other than LZO;
> > support for multi compression streams, making parallel compressions
> > possible; adds support for LZ4 compression algorithm.
> > 
> > v8->v9 (reviewed by Andrew Morton):
> > -- add LZ4 backend (+iozone test vs LZO)
> > -- merge patches 'zram: document max_comp_streams' and 'zram: add multi
> >    stream functionality'
> > -- do not extern backend struct from source file
> > -- use find()/release() naming instead of get()/put()
> > -- minor code, commit messages and code comments `nitpicks'
> > -- removed Acked-by Minchan Kim from first two patches, because I've
> >    changed them a bit.
> > 
> > v7->v8 (reviewed by Minchan Kim):
> > -- merge patches 'add multi stream functionality' and 'enable multi
> >    stream compression support in zram'
> > -- return status code from set_max_streams knob and print message on
> >    error
> > -- do not use atomic type for ->avail_strm
> > -- return back: allocate by default only one stream for multi stream backend
> > -- wake sleeping write in zcomp_strm_multi_put() only if we put stream
> >    to idle list
> > -- minor code `nitpicks'
> > 
> > v6->v7 (reviewed by Minchan Kim):
> > -- enable multi and single stream support out of the box (drop
> >    ZRAM_MULTI_STREAM config option)
> > -- add set_max_stream knob, so we can adjust max number of compression
> >    streams in runtime (for multi stream backend at the moment)
> > -- minor code `nitpicks'
> > 
> > v5->v6 (reviewed by Minchan Kim):
> > -- handle single compression stream case separately, using mutex locking,
> >    to address perfomance regression
> > -- handle multi compression stream using spin lock and wait_event()/wake_up()
> > -- make multi compression stream support configurable (ZRAM_MULTI_STREAM
> >    config option)
> > 
> > v4->v5 (reviewed by Minchan Kim):
> > -- renamed zcomp buffer_lock; removed src len and dst len from
> >    compress() and decompress(); not using term `buffer' and
> >    `workmem' in code and documentation; define compress() and
> >    decompress() functions for LZO backend; not using goto's;
> >    do not put idle zcomp_strm to idle list tail.
> > 
> > v3->v4 (reviewed by Minchan Kim):
> > -- renamed compression backend and working memory structs as requested
> >    by Minchan Kim; fixed several issues noted by Minchan Kim.
> > 
> > Sergey Senozhatsky (7):
> >   zram: introduce compressing backend abstraction
> >   zram: use zcomp compressing backends
> >   zram: factor out single stream compression
> >   zram: add multi stream functionality
> >   zram: add set_max_streams knob
> >   zram: make compression algorithm selection possible
> >   zram: add lz4 algorithm backend
> > 
> >  Documentation/ABI/testing/sysfs-block-zram |  17 +-
> >  Documentation/blockdev/zram.txt            |  45 +++-
> >  drivers/block/zram/Kconfig                 |  10 +
> >  drivers/block/zram/Makefile                |   4 +-
> >  drivers/block/zram/zcomp.c                 | 349 +++++++++++++++++++++++++++++
> >  drivers/block/zram/zcomp.h                 |  68 ++++++
> >  drivers/block/zram/zcomp_lz4.c             |  47 ++++
> >  drivers/block/zram/zcomp_lz4.h             |  17 ++
> >  drivers/block/zram/zcomp_lzo.c             |  47 ++++
> >  drivers/block/zram/zcomp_lzo.h             |  17 ++
> >  drivers/block/zram/zram_drv.c              | 131 ++++++++---
> >  drivers/block/zram/zram_drv.h              |  11 +-
> >  12 files changed, 715 insertions(+), 48 deletions(-)
> >  create mode 100644 drivers/block/zram/zcomp.c
> >  create mode 100644 drivers/block/zram/zcomp.h
> >  create mode 100644 drivers/block/zram/zcomp_lz4.c
> >  create mode 100644 drivers/block/zram/zcomp_lz4.h
> >  create mode 100644 drivers/block/zram/zcomp_lzo.c
> >  create mode 100644 drivers/block/zram/zcomp_lzo.h



* Re: [PATCHv9 0/7] add compressing abstraction and multi stream support
  2014-04-16 13:53   ` Bartlomiej Zolnierkiewicz
@ 2014-04-16 14:53     ` Sergey Senozhatsky
  2014-04-16 19:20       ` Sergey Senozhatsky
  0 siblings, 1 reply; 13+ messages in thread
From: Sergey Senozhatsky @ 2014-04-16 14:53 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Minchan Kim, Sergey Senozhatsky, Andrew Morton, Jerome Marchand,
	Nitin Gupta, linux-kernel

On (04/16/14 15:53), Bartlomiej Zolnierkiewicz wrote:
> Hi,
> 
> I'm a bit late on this patch series (sorry for that) but why are we not
> using Crypto API for compression algorithm selection and multi stream
> support?  Compared to the earlier patches for zram like the ones we did
> in July 2013 [1] this patch series requires us to:
> 
> - implement compression algorithm support for each algorithm separately
>   (when using Crypto API all compression algorithms supported by Crypto
>   API are supported automatically)
> 
> - manually set the number of maximum active compression streams (earlier
>   patches using Crypto API needed a lot less code and automatically scaled
>   number of compression streams to number of online CPUs)

Hello Bartlomiej,

there was a short discussion of `custom solution' vs `crypto API'.
personally, I was not impressed by Crypto API. basically it requires
	a) locking of transformation context (if we have only one tfm
	for everyone)
or
	b) having some sort of a list of tfms
or
	c) storing tfm in a per_cpu area.

c) requires the ZRAM block device to become CPU hotplug aware and to start
reacting to onlining/offlining of a CPU by allocating/deallocating a
transformation context. allocating a new tfm could potentially be a problem
if the system is under memory pressure and ZRAM is used as a swap device.
besides, c) also requires additional locking via get_cpu()/put_cpu()
around every comp op.

overall, that was something that I wanted to avoid. the current solution
does not depend on a CPU notifier (which is a good thing for a block device
driver, imho) and, thus, looks simpler in some ways. yet we still have
the ability to scale.
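
To illustrate the approach described above (a bounded set of streams kept on an idle list, with no per-CPU state and no CPU notifier), here is a minimal userspace sketch. It is an analogy built on pthreads, not the zcomp code from this series: `struct strm`, `struct strm_pool` and the `pool_*()` helpers are hypothetical names, a pthread mutex plus condition variable stands in for the spin lock and wait_event()/wake_up() pair, and `max` only plays the role of max_comp_streams.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a compression stream. */
struct strm {
	struct strm *next;	/* link in the idle list */
	char workmem[64];	/* placeholder for per-stream working memory */
};

struct strm_pool {
	pthread_mutex_t lock;	/* protects idle and avail */
	pthread_cond_t wait;	/* callers sleep here when no stream is free */
	struct strm *idle;	/* LIFO list of idle streams */
	int avail;		/* streams allocated so far */
	int max;		/* upper bound on allocated streams */
};

/* Start with a single stream; more are allocated on demand, up to max. */
static int pool_init(struct strm_pool *p, int max)
{
	struct strm *s = calloc(1, sizeof(*s));

	if (!s || max < 1) {
		free(s);
		return -1;
	}
	pthread_mutex_init(&p->lock, NULL);
	pthread_cond_init(&p->wait, NULL);
	p->idle = s;
	p->avail = 1;
	p->max = max;
	return 0;
}

/* Take an idle stream, grow the pool if allowed, otherwise sleep. */
static struct strm *pool_find(struct strm_pool *p)
{
	struct strm *s;

	pthread_mutex_lock(&p->lock);
	for (;;) {
		if (p->idle) {
			s = p->idle;
			p->idle = s->next;
			break;
		}
		if (p->avail < p->max) {
			s = calloc(1, sizeof(*s));
			if (s) {
				p->avail++;
				break;
			}
			/* allocation failed: wait for a stream to be released */
		}
		pthread_cond_wait(&p->wait, &p->lock);
	}
	pthread_mutex_unlock(&p->lock);
	s->next = NULL;
	return s;
}

/* Put the stream at the head of the idle list and wake one sleeper. */
static void pool_release(struct strm_pool *p, struct strm *s)
{
	assert(s != NULL);
	pthread_mutex_lock(&p->lock);
	s->next = p->idle;
	p->idle = s;
	pthread_mutex_unlock(&p->lock);
	pthread_cond_signal(&p->wait);
}

/* Free all idle streams; callers must have released theirs first. */
static void pool_destroy(struct strm_pool *p)
{
	while (p->idle) {
		struct strm *s = p->idle;

		p->idle = s->next;
		free(s);
	}
	pthread_cond_destroy(&p->wait);
	pthread_mutex_destroy(&p->lock);
}
```

A caller would bracket every compression with pool_find()/pool_release(). Because at least one stream always exists, a failed allocation under memory pressure only reduces parallelism instead of failing the I/O, which is the property argued for above.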

there was a patch that let ZRAM automatically scale the number of
compression streams to the number of online CPUs, but Minchan wanted an
explicit one-stream-only option for the swap device use-case.

the other thing to mention is that the CPU offlining API was being
reworked at that time by Srivatsa S. Bhat (cpu_notifier_register_begin(),
cpu_notifier_register_done()) due to discovered races, which resulted in
a patchset of 50+ patches (touching every existing CPU notifier user,
including zsmalloc), and that was an additional factor.


	-ss

> From what I see the pros of the current patch series are:
> 
> - dynamic selection of the compression algorithm
> 
> - ability to limit number of active streams below the number of online CPUs
> 
> However I believe that both above features can be also implemented on top of
> the code using Crypto API.  What am I missing?
> 
> [1] https://lkml.org/lkml/2013/7/30/365
> 
> Best regards,
> --
> Bartlomiej Zolnierkiewicz
> Samsung R&D Institute Poland
> Samsung Electronics
> 
> On Thursday, March 06, 2014 05:11:37 PM Minchan Kim wrote:
> > Hello Sergey,
> > 
> > Sorry for the late reply.
> > 
> > Today, I tested this patch and confirm that it's really good.
> > I send result for the record.
> > 
> > In x86(4core and x2 hyper threading, i7, 2.8GHz), I did parallel 4 dd
> > test with 200m file like below
> > 
> > dd if=./test200m.file of=mnt/file1 bs=512k count=1024 oflag=direct &
> > dd if=./test200m.file of=mnt/file2 bs=512k count=1024 oflag=direct &
> > dd if=./test200m.file of=mnt/file3 bs=512k count=1024 oflag=direct &
> > dd if=./test200m.file of=mnt/file4 bs=512k count=1024 oflag=direct &
> > wait    # wait for all background jobs
> > 
> > The result is 
> > 
> > When 1 max_comp_streams, elapsed time is 36.26s
> > When 8 max_comp_streams, elapsed time is 4.09s
> > 
> > In ARM(4core, ARMv7, 1.5GHz), I did iozone test.
> > 
> >            1 max_comp_streams    8 max_comp_streams
> > ==Initial write
> > records:   10                    10
> > avg:       141964.05             239632.54
> > std:       3532.11 (2.49%)       43863.53 (18.30%)
> > max:       147900.45             319046.62
> > min:       135363.73             178929.40
> > ==Rewrite
> > records:   10                    10
> > avg:       144757.46             247410.80
> > std:       4019.19 (2.78%)       37378.42 (15.11%)
> > max:       150757.72             293284.84
> > min:       135127.55             188984.27
> > ==Read
> > records:   10                    10
> > avg:       208325.22             202894.24
> > std:       57072.62 (27.40%)     41099.56 (20.26%)
> > max:       293428.96             289581.12
> > min:       79445.37              157478.27
> > ==Re-read
> > records:   10                    10
> > avg:       204750.36             237406.96
> > std:       36959.99 (18.05%)     41518.36 (17.49%)
> > max:       268399.89             286898.13
> > min:       154831.28             160326.88
> > ==Reverse Read
> > records:   10                    10
> > avg:       215043.10             208946.35
> > std:       31239.60 (14.53%)     38859.74 (18.60%)
> > max:       251564.57             284481.31
> > min:       154719.20             155024.33
> > ==Stride read
> > records:   10                    10
> > avg:       227246.54             198925.10
> > std:       31105.89 (13.69%)     30721.86 (15.44%)
> > max:       290020.34             227178.70
> > min:       157399.46             153592.91
> > ==Random read
> > records:   10                    10
> > avg:       238239.81             216298.41
> > std:       37276.91 (15.65%)     38194.73 (17.66%)
> > max:       291416.20             286345.37
> > min:       152734.23             151871.52
> > ==Mixed workload
> > records:   10                    10
> > avg:       208434.11             234355.66
> > std:       31385.40 (15.06%)     22064.02 (9.41%)
> > max:       253990.11             270412.58
> > min:       162041.47             186052.12
> > ==Random write
> > records:   10                    10
> > avg:       142172.54             290231.28
> > std:       6233.67 (4.38%)       46462.35 (16.01%)
> > max:       150652.40             338096.54
> > min:       130584.14             183253.25
> > ==Pwrite
> > records:   10                    10
> > avg:       141247.91             267085.70
> > std:       6756.08 (4.78%)       40019.39 (14.98%)
> > max:       150239.13             335512.33
> > min:       130456.98             180832.45
> > ==Pread
> > records:   10                    10
> > avg:       214990.26             208730.94
> > std:       40701.79 (18.93%)     50797.78 (24.34%)
> > max:       287060.54             300675.25
> > min:       157642.17             156763.98
> > 
> > So, all write tests on both x86 and ARM show a really huge win
> > and I couldn't find any regression!
> > 
> > Thanks for nice work.
> > 
> > For all patchset,
> > 
> > Acked-by: Minchan Kim <minchan@kernel.org>
> > 
> > On Fri, Feb 28, 2014 at 08:52:00PM +0300, Sergey Senozhatsky wrote:
> > > This patchset introduces zcomp compression backend abstraction
> > > adding ability to support compression algorithms other than LZO;
> > > support for multi compression streams, making parallel compressions
> > > possible; adds support for LZ4 compression algorithm.
> > > 
> > > v8->v9 (reviewed by Andrew Morton):
> > > -- add LZ4 backend (+iozone test vs LZO)
> > > -- merge patches 'zram: document max_comp_streams' and 'zram: add multi
> > >    stream functionality'
> > > -- do not extern backend struct from source file
> > > -- use find()/release() naming instead of get()/put()
> > > -- minor code, commit messages and code comments `nitpicks'
> > > -- removed Acked-by Minchan Kim from first two patches, because I've
> > >    changed them a bit.
> > > 
> > > v7->v8 (reviewed by Minchan Kim):
> > > -- merge patches 'add multi stream functionality' and 'enable multi
> > >    stream compression support in zram'
> > > -- return status code from set_max_streams knob and print message on
> > >    error
> > > -- do not use atomic type for ->avail_strm
> > > -- return back: allocate by default only one stream for multi stream backend
> > > -- wake sleeping write in zcomp_strm_multi_put() only if we put stream
> > >    to idle list
> > > -- minor code `nitpicks'
> > > 
> > > v6->v7 (reviewed by Minchan Kim):
> > > -- enable multi and single stream support out of the box (drop
> > >    ZRAM_MULTI_STREAM config option)
> > > -- add set_max_stream knob, so we can adjust max number of compression
> > >    streams in runtime (for multi stream backend at the moment)
> > > -- minor code `nitpicks'
> > > 
> > > v5->v6 (reviewed by Minchan Kim):
> > > -- handle single compression stream case separately, using mutex locking,
> > >    to address perfomance regression
> > > -- handle multi compression stream using spin lock and wait_event()/wake_up()
> > > -- make multi compression stream support configurable (ZRAM_MULTI_STREAM
> > >    config option)
> > > 
> > > v4->v5 (reviewed by Minchan Kim):
> > > -- renamed zcomp buffer_lock; removed src len and dst len from
> > >    compress() and decompress(); not using term `buffer' and
> > >    `workmem' in code and documentation; define compress() and
> > >    decompress() functions for LZO backend; not using goto's;
> > >    do not put idle zcomp_strm to idle list tail.
> > > 
> > > v3->v4 (reviewed by Minchan Kim):
> > > -- renamed compression backend and working memory structs as requested
> > >    by Minchan Kim; fixed several issues noted by Minchan Kim.
> > > 
> > > Sergey Senozhatsky (7):
> > >   zram: introduce compressing backend abstraction
> > >   zram: use zcomp compressing backends
> > >   zram: factor out single stream compression
> > >   zram: add multi stream functionality
> > >   zram: add set_max_streams knob
> > >   zram: make compression algorithm selection possible
> > >   zram: add lz4 algorithm backend
> > > 
> > >  Documentation/ABI/testing/sysfs-block-zram |  17 +-
> > >  Documentation/blockdev/zram.txt            |  45 +++-
> > >  drivers/block/zram/Kconfig                 |  10 +
> > >  drivers/block/zram/Makefile                |   4 +-
> > >  drivers/block/zram/zcomp.c                 | 349 +++++++++++++++++++++++++++++
> > >  drivers/block/zram/zcomp.h                 |  68 ++++++
> > >  drivers/block/zram/zcomp_lz4.c             |  47 ++++
> > >  drivers/block/zram/zcomp_lz4.h             |  17 ++
> > >  drivers/block/zram/zcomp_lzo.c             |  47 ++++
> > >  drivers/block/zram/zcomp_lzo.h             |  17 ++
> > >  drivers/block/zram/zram_drv.c              | 131 ++++++++---
> > >  drivers/block/zram/zram_drv.h              |  11 +-
> > >  12 files changed, 715 insertions(+), 48 deletions(-)
> > >  create mode 100644 drivers/block/zram/zcomp.c
> > >  create mode 100644 drivers/block/zram/zcomp.h
> > >  create mode 100644 drivers/block/zram/zcomp_lz4.c
> > >  create mode 100644 drivers/block/zram/zcomp_lz4.h
> > >  create mode 100644 drivers/block/zram/zcomp_lzo.c
> > >  create mode 100644 drivers/block/zram/zcomp_lzo.h
> 


* Re: [PATCHv9 0/7] add compressing abstraction and multi stream support
  2014-04-16 14:53     ` Sergey Senozhatsky
@ 2014-04-16 19:20       ` Sergey Senozhatsky
  0 siblings, 0 replies; 13+ messages in thread
From: Sergey Senozhatsky @ 2014-04-16 19:20 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Sergey Senozhatsky, Minchan Kim, Andrew Morton, Jerome Marchand,
	Nitin Gupta, linux-kernel

On (04/16/14 17:53), Sergey Senozhatsky wrote:
> On (04/16/14 15:53), Bartlomiej Zolnierkiewicz wrote:
> > Hi,
> > 
> > I'm a bit late on this patch series (sorry for that) but why are we not
> > using Crypto API for compression algorithm selection and multi stream
> > support?  Compared to the earlier patches for zram like the ones we did
> > in July 2013 [1] this patch series requires us to:
> > 
> > - implement compression algorithm support for each algorithm separately
> >   (when using Crypto API all compression algorithms supported by Crypto
> >   API are supported automatically)
> > 
> > - manually set the number of maximum active compression streams (earlier
> >   patches using Crypto API needed a lot less code and automatically scaled
> >   number of compression streams to number of online CPUs)
> 
> Hello Bartlomiej,
> 
> there was a short discussion of `custom solution' vs `crypto API'.
> personally, I was not impressed by Crypto API. basically it requires
> 	a) locking of transformation context (if we have only one tfm
> 	for everyone)
> or
> 	b) having some sort of a list of tfms
> or
> 	c) storing tfm in a per_cpu area.
> 
> c) requires the ZRAM block device to become CPU hotplug aware and to
> react to onlining/offlining of a CPU by allocating/deallocating a
> transformation context. allocating a new tfm could potentially be a
> problem if the system is under memory pressure and ZRAM is used as a
> swap device. besides, c) also requires additional locking via
> get_cpu()/put_cpu() around every comp op.
> 
> overall, that was something that I wanted to avoid. the current solution
> does not depend on a CPU notifier (which is a good thing for a block device
> driver, imho) and, thus, looks simpler in some ways. yet we still have
> the ability to scale.
> 
> there was a patch that let ZRAM automatically scale the number of
> compression streams to the number of online CPUs, but Minchan wanted an
> explicit one-stream-only option for the swap device use-case.
> 

to be correct, a special case for one-stream-only is reasonable. a single
stream, protected with a mutex, gives us really nice performance numbers
compared to spinlock and wait_event(), because of the mutex's adaptive
spin-on-owner behaviour. we can probably reduce the lines of code with
adaptive locking, which would behave as a spinlock for multistream and as
an adaptive mutex for single stream.
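
The one-stream-only case can be sketched in userspace as follows. This is only an illustration of the structure (find takes the lock and hands out the single stream, release drops the lock); `struct single_strm` and the `single_*()` helpers are hypothetical names, and a default pthread mutex does not reproduce the spin-on-owner behaviour of the kernel mutex, so the sketch says nothing about the performance numbers.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical single-stream backend: one stream, one lock. */
struct single_strm {
	pthread_mutex_t lock;	/* held for as long as the stream is in use */
	char workmem[64];	/* placeholder for the stream's working memory */
};

static void single_init(struct single_strm *zs)
{
	pthread_mutex_init(&zs->lock, NULL);
}

/* Returns with the lock held; the caller owns the stream until release. */
static char *single_find(struct single_strm *zs)
{
	pthread_mutex_lock(&zs->lock);
	return zs->workmem;
}

/* Drops the lock taken in single_find(). */
static void single_release(struct single_strm *zs)
{
	pthread_mutex_unlock(&zs->lock);
}

static void single_destroy(struct single_strm *zs)
{
	pthread_mutex_destroy(&zs->lock);
}
```

There is no idle list, no stream counter and no wait queue here: contention is handled entirely by the mutex, which is why this path can be noticeably simpler than the multi-stream one.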

	-ss

> the other thing to mention is that the CPU offlining API was being
> reworked at that time by Srivatsa S. Bhat (cpu_notifier_register_begin(),
> cpu_notifier_register_done()) due to discovered races, which resulted in
> a patchset of 50+ patches (touching every existing CPU notifier user,
> including zsmalloc), and that was an additional factor.
> 
> 
> 	-ss
> 
> > From what I see the pros of the current patch series are:
> > 
> > - dynamic selection of the compression algorithm
> > 
> > - ability to limit number of active streams below the number of online CPUs
> > 
> > However I believe that both above features can be also implemented on top of
> > the code using Crypto API.  What am I missing?
> > 
> > [1] https://lkml.org/lkml/2013/7/30/365
> > 
> > Best regards,
> > --
> > Bartlomiej Zolnierkiewicz
> > Samsung R&D Institute Poland
> > Samsung Electronics
> > 
> > 
> 


end of thread, other threads:[~2014-04-16 19:20 UTC | newest]

Thread overview: 13+ messages
2014-02-28 17:52 [PATCHv9 0/7] add compressing abstraction and multi stream support Sergey Senozhatsky
2014-02-28 17:52 ` [PATCHv9 1/7] zram: introduce compressing backend abstraction Sergey Senozhatsky
2014-02-28 17:52 ` [PATCHv9 2/7] zram: use zcomp compressing backends Sergey Senozhatsky
2014-02-28 17:52 ` [PATCHv9 3/7] zram: factor out single stream compression Sergey Senozhatsky
2014-02-28 17:52 ` [PATCHv9 4/7] zram: add multi stream functionality Sergey Senozhatsky
2014-02-28 17:52 ` [PATCHv9 5/7] zram: add set_max_streams knob Sergey Senozhatsky
2014-02-28 17:52 ` [PATCHv9 6/7] zram: make compression algorithm selection possible Sergey Senozhatsky
2014-02-28 17:52 ` [PATCHv9 7/7] zram: add lz4 algorithm backend Sergey Senozhatsky
2014-03-06  8:11 ` [PATCHv9 0/7] add compressing abstraction and multi stream support Minchan Kim
2014-03-06 11:27   ` Sergey Senozhatsky
2014-04-16 13:53   ` Bartlomiej Zolnierkiewicz
2014-04-16 14:53     ` Sergey Senozhatsky
2014-04-16 19:20       ` Sergey Senozhatsky
