All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] zram clean up
@ 2017-04-03  5:17 Minchan Kim
  2017-04-03  5:17 ` [PATCH 1/5] zram: handle multiple pages attached bio's bvec Minchan Kim
                   ` (5 more replies)
  0 siblings, 6 replies; 25+ messages in thread
From: Minchan Kim @ 2017-04-03  5:17 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Sergey Senozhatsky, kernel-team, Minchan Kim

This patchset aims zram clean-up.

[1] clean up multiple pages's bvec handling.
[2] clean up partial IO handling
[3-5] clean up zram via using accessor and removing pointless structure.

With [2-5] applied, we can get a few hundred bytes as well as huge
readibility enhance.

This patchset is based on v4.11-rc4-mmotm-2017-03-30-16-31 and
*drop* zram-factor-out-partial-io-routine.patch.

x86: 708 byte save

    add/remove: 1/1 grow/shrink: 0/11 up/down: 478/-1186 (-708)
    function                                     old     new   delta
    zram_special_page_read                         -     478    +478
    zram_reset_device                            317     314      -3
    mem_used_max_store                           131     128      -3
    compact_store                                 96      93      -3
    mm_stat_show                                 203     197      -6
    zram_add                                     719     712      -7
    zram_slot_free_notify                        229     214     -15
    zram_make_request                            819     803     -16
    zram_meta_free                               128     111     -17
    zram_free_page                               180     151     -29
    disksize_store                               432     361     -71
    zram_decompress_page.isra                    504       -    -504
    zram_bvec_rw                                2592    2080    -512
    Total: Before=25350773, After=25350065, chg -0.00%
    
ppc64: 231 byte save
    
    add/remove: 2/0 grow/shrink: 1/9 up/down: 681/-912 (-231)
    function                                     old     new   delta
    zram_special_page_read                         -     480    +480
    zram_slot_lock                                 -     200    +200
    vermagic                                      39      40      +1
    mm_stat_show                                 256     248      -8
    zram_meta_free                               200     184     -16
    zram_add                                     944     912     -32
    zram_free_page                               348     308     -40
    disksize_store                               572     492     -80
    zram_decompress_page                         664     564    -100
    zram_slot_free_notify                        292     160    -132
    zram_make_request                           1132    1000    -132
    zram_bvec_rw                                2768    2396    -372
    Total: Before=17565825, After=17565594, chg -0.00%

Minchan Kim (5):
  [1] zram: handle multiple pages attached bio's bvec
  [2] zram: partial IO refactoring
  [3] zram: use zram_slot_lock instead of raw bit_spin_lock op
  [4] zram: remove zram_meta structure
  [5] zram: introduce zram data accessor

 drivers/block/zram/zram_drv.c | 532 +++++++++++++++++++++---------------------
 drivers/block/zram/zram_drv.h |   6 +-
 2 files changed, 270 insertions(+), 268 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/5] zram: handle multiple pages attached bio's bvec
  2017-04-03  5:17 [PATCH 0/5] zram clean up Minchan Kim
@ 2017-04-03  5:17 ` Minchan Kim
  2017-04-03 22:45   ` Andrew Morton
  2017-04-04  4:55   ` Sergey Senozhatsky
  2017-04-03  5:17 ` [PATCH 2/5] zram: partial IO refactoring Minchan Kim
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 25+ messages in thread
From: Minchan Kim @ 2017-04-03  5:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, Sergey Senozhatsky, kernel-team, Minchan Kim,
	Jens Axboe, Hannes Reinecke, Johannes Thumshirn

Johannes Thumshirn reported system goes the panic when using NVMe over
Fabrics loopback target with zram.

The reason is zram expects each bvec in bio contains a single page
but nvme can attach a huge bulk of pages attached to the bio's bvec
so that zram's index arithmetic could be wrong so that out-of-bound
access makes panic.

It can be solved by limiting max_sectors with SECTORS_PER_PAGE like
[1] but it makes zram slow because bio should split with each pages
so this patch makes zram aware of multiple pages in a bvec so it
could solve without any regression.

[1] 0bc315381fe9, zram: set physical queue limits to avoid array out of
    bounds accesses

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Hannes Reinecke <hare@suse.com>
Reported-by: Johannes Thumshirn <jthumshirn@suse.de>
Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/block/zram/zram_drv.c | 39 ++++++++++-----------------------------
 1 file changed, 10 insertions(+), 29 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 01944419b1f3..28c2836f8c96 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -137,8 +137,7 @@ static inline bool valid_io_request(struct zram *zram,
 
 static void update_position(u32 *index, int *offset, struct bio_vec *bvec)
 {
-	if (*offset + bvec->bv_len >= PAGE_SIZE)
-		(*index)++;
+	*index  += (*offset + bvec->bv_len) / PAGE_SIZE;
 	*offset = (*offset + bvec->bv_len) % PAGE_SIZE;
 }
 
@@ -838,34 +837,20 @@ static void __zram_make_request(struct zram *zram, struct bio *bio)
 	}
 
 	bio_for_each_segment(bvec, bio, iter) {
-		int max_transfer_size = PAGE_SIZE - offset;
-
-		if (bvec.bv_len > max_transfer_size) {
-			/*
-			 * zram_bvec_rw() can only make operation on a single
-			 * zram page. Split the bio vector.
-			 */
-			struct bio_vec bv;
-
-			bv.bv_page = bvec.bv_page;
-			bv.bv_len = max_transfer_size;
-			bv.bv_offset = bvec.bv_offset;
+		struct bio_vec bv = bvec;
+		unsigned int remained = bvec.bv_len;
 
+		do {
+			bv.bv_len = min_t(unsigned int, PAGE_SIZE, remained);
 			if (zram_bvec_rw(zram, &bv, index, offset,
-					 op_is_write(bio_op(bio))) < 0)
+					op_is_write(bio_op(bio))) < 0)
 				goto out;
 
-			bv.bv_len = bvec.bv_len - max_transfer_size;
-			bv.bv_offset += max_transfer_size;
-			if (zram_bvec_rw(zram, &bv, index + 1, 0,
-					 op_is_write(bio_op(bio))) < 0)
-				goto out;
-		} else
-			if (zram_bvec_rw(zram, &bvec, index, offset,
-					 op_is_write(bio_op(bio))) < 0)
-				goto out;
+			bv.bv_offset += bv.bv_len;
+			remained -= bv.bv_len;
 
-		update_position(&index, &offset, &bvec);
+			update_position(&index, &offset, &bv);
+		} while (remained);
 	}
 
 	bio_endio(bio);
@@ -882,8 +867,6 @@ static blk_qc_t zram_make_request(struct request_queue *queue, struct bio *bio)
 {
 	struct zram *zram = queue->queuedata;
 
-	blk_queue_split(queue, &bio, queue->bio_split);
-
 	if (!valid_io_request(zram, bio->bi_iter.bi_sector,
 					bio->bi_iter.bi_size)) {
 		atomic64_inc(&zram->stats.invalid_io);
@@ -1191,8 +1174,6 @@ static int zram_add(void)
 	blk_queue_io_min(zram->disk->queue, PAGE_SIZE);
 	blk_queue_io_opt(zram->disk->queue, PAGE_SIZE);
 	zram->disk->queue->limits.discard_granularity = PAGE_SIZE;
-	zram->disk->queue->limits.max_sectors = SECTORS_PER_PAGE;
-	zram->disk->queue->limits.chunk_sectors = 0;
 	blk_queue_max_discard_sectors(zram->disk->queue, UINT_MAX);
 	/*
 	 * zram_bio_discard() will clear all logical blocks if logical block
-- 
2.7.4

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 2/5] zram: partial IO refactoring
  2017-04-03  5:17 [PATCH 0/5] zram clean up Minchan Kim
  2017-04-03  5:17 ` [PATCH 1/5] zram: handle multiple pages attached bio's bvec Minchan Kim
@ 2017-04-03  5:17 ` Minchan Kim
  2017-04-03  5:52   ` Mika Penttilä
  2017-04-04  2:17   ` Sergey Senozhatsky
  2017-04-03  5:17 ` [PATCH 3/5] zram: use zram_slot_lock instead of raw bit_spin_lock op Minchan Kim
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 25+ messages in thread
From: Minchan Kim @ 2017-04-03  5:17 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Sergey Senozhatsky, kernel-team, Minchan Kim

For architecture(PAGE_SIZE > 4K), zram have supported partial IO.
However, the mixed code for handling normal/partial IO is too mess,
error-prone to modify IO handler functions with upcoming feature
so this patch aims for cleaning up zram's IO handling functions.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/block/zram/zram_drv.c | 333 +++++++++++++++++++++++-------------------
 1 file changed, 184 insertions(+), 149 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 28c2836f8c96..7938f4b98b01 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -45,6 +45,8 @@ static const char *default_compressor = "lzo";
 /* Module params (documentation at end) */
 static unsigned int num_devices = 1;
 
+static void zram_free_page(struct zram *zram, size_t index);
+
 static inline bool init_done(struct zram *zram)
 {
 	return zram->disksize;
@@ -98,10 +100,17 @@ static void zram_set_obj_size(struct zram_meta *meta,
 	meta->table[index].value = (flags << ZRAM_FLAG_SHIFT) | size;
 }
 
+#if PAGE_SIZE != 4096
 static inline bool is_partial_io(struct bio_vec *bvec)
 {
 	return bvec->bv_len != PAGE_SIZE;
 }
+#else
+static inline bool is_partial_io(struct bio_vec *bvec)
+{
+	return false;
+}
+#endif
 
 static void zram_revalidate_disk(struct zram *zram)
 {
@@ -191,18 +200,6 @@ static bool page_same_filled(void *ptr, unsigned long *element)
 	return true;
 }
 
-static void handle_same_page(struct bio_vec *bvec, unsigned long element)
-{
-	struct page *page = bvec->bv_page;
-	void *user_mem;
-
-	user_mem = kmap_atomic(page);
-	zram_fill_page(user_mem + bvec->bv_offset, bvec->bv_len, element);
-	kunmap_atomic(user_mem);
-
-	flush_dcache_page(page);
-}
-
 static ssize_t initstate_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
 {
@@ -418,6 +415,53 @@ static DEVICE_ATTR_RO(io_stat);
 static DEVICE_ATTR_RO(mm_stat);
 static DEVICE_ATTR_RO(debug_stat);
 
+static bool zram_special_page_read(struct zram *zram, u32 index,
+				struct page *page,
+				unsigned int offset, unsigned int len)
+{
+	struct zram_meta *meta = zram->meta;
+
+	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
+	if (unlikely(!meta->table[index].handle) ||
+			zram_test_flag(meta, index, ZRAM_SAME)) {
+		void *mem;
+
+		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
+		mem = kmap_atomic(page);
+		zram_fill_page(mem + offset, len, meta->table[index].element);
+		kunmap_atomic(mem);
+		return true;
+	}
+	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
+
+	return false;
+}
+
+static bool zram_special_page_write(struct zram *zram, u32 index,
+					struct page *page)
+{
+	unsigned long element;
+	void *mem = kmap_atomic(page);
+
+	if (page_same_filled(mem, &element)) {
+		struct zram_meta *meta = zram->meta;
+
+		kunmap_atomic(mem);
+		/* Free memory associated with this sector now. */
+		bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
+		zram_free_page(zram, index);
+		zram_set_flag(meta, index, ZRAM_SAME);
+		zram_set_element(meta, index, element);
+		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
+
+		atomic64_inc(&zram->stats.same_pages);
+		return true;
+	}
+	kunmap_atomic(mem);
+
+	return false;
+}
+
 static void zram_meta_free(struct zram_meta *meta, u64 disksize)
 {
 	size_t num_pages = disksize >> PAGE_SHIFT;
@@ -504,169 +548,104 @@ static void zram_free_page(struct zram *zram, size_t index)
 	zram_set_obj_size(meta, index, 0);
 }
 
-static int zram_decompress_page(struct zram *zram, char *mem, u32 index)
+static int zram_decompress_page(struct zram *zram, struct page *page, u32 index)
 {
-	int ret = 0;
-	unsigned char *cmem;
-	struct zram_meta *meta = zram->meta;
+	int ret;
 	unsigned long handle;
 	unsigned int size;
+	void *src, *dst;
+	struct zram_meta *meta = zram->meta;
+
+	if (zram_special_page_read(zram, index, page, 0, PAGE_SIZE))
+		return 0;
 
 	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
 	handle = meta->table[index].handle;
 	size = zram_get_obj_size(meta, index);
 
-	if (!handle || zram_test_flag(meta, index, ZRAM_SAME)) {
-		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
-		zram_fill_page(mem, PAGE_SIZE, meta->table[index].element);
-		return 0;
-	}
-
-	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
+	src = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
 	if (size == PAGE_SIZE) {
-		copy_page(mem, cmem);
+		dst = kmap_atomic(page);
+		copy_page(dst, src);
+		kunmap_atomic(dst);
+		ret = 0;
 	} else {
 		struct zcomp_strm *zstrm = zcomp_stream_get(zram->comp);
 
-		ret = zcomp_decompress(zstrm, cmem, size, mem);
+		dst = kmap_atomic(page);
+		ret = zcomp_decompress(zstrm, src, size, dst);
+		kunmap_atomic(dst);
 		zcomp_stream_put(zram->comp);
 	}
 	zs_unmap_object(meta->mem_pool, handle);
 	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
 
 	/* Should NEVER happen. Return bio error if it does. */
-	if (unlikely(ret)) {
+	if (unlikely(ret))
 		pr_err("Decompression failed! err=%d, page=%u\n", ret, index);
-		return ret;
-	}
 
-	return 0;
+	return ret;
 }
 
 static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
-			  u32 index, int offset)
+				u32 index, int offset)
 {
 	int ret;
 	struct page *page;
-	unsigned char *user_mem, *uncmem = NULL;
-	struct zram_meta *meta = zram->meta;
-	page = bvec->bv_page;
 
-	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
-	if (unlikely(!meta->table[index].handle) ||
-			zram_test_flag(meta, index, ZRAM_SAME)) {
-		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
-		handle_same_page(bvec, meta->table[index].element);
+	page = bvec->bv_page;
+	if (zram_special_page_read(zram, index, page, bvec->bv_offset,
+				bvec->bv_len))
 		return 0;
-	}
-	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
-
-	if (is_partial_io(bvec))
-		/* Use  a temporary buffer to decompress the page */
-		uncmem = kmalloc(PAGE_SIZE, GFP_NOIO);
-
-	user_mem = kmap_atomic(page);
-	if (!is_partial_io(bvec))
-		uncmem = user_mem;
 
-	if (!uncmem) {
-		pr_err("Unable to allocate temp memory\n");
-		ret = -ENOMEM;
-		goto out_cleanup;
+	if (is_partial_io(bvec)) {
+		/* Use a temporary buffer to decompress the page */
+		page = alloc_page(GFP_NOIO|__GFP_HIGHMEM);
+		if (!page)
+			return -ENOMEM;
 	}
 
-	ret = zram_decompress_page(zram, uncmem, index);
-	/* Should NEVER happen. Return bio error if it does. */
+	ret = zram_decompress_page(zram, page, index);
 	if (unlikely(ret))
-		goto out_cleanup;
+		goto out;
 
-	if (is_partial_io(bvec))
-		memcpy(user_mem + bvec->bv_offset, uncmem + offset,
-				bvec->bv_len);
+	if (is_partial_io(bvec)) {
+		void *dst = kmap_atomic(bvec->bv_page);
+		void *src = kmap_atomic(page);
 
-	flush_dcache_page(page);
-	ret = 0;
-out_cleanup:
-	kunmap_atomic(user_mem);
+		memcpy(dst + bvec->bv_offset, src + offset, bvec->bv_len);
+		kunmap_atomic(src);
+		kunmap_atomic(dst);
+	}
+out:
 	if (is_partial_io(bvec))
-		kfree(uncmem);
+		__free_page(page);
+
 	return ret;
 }
 
-static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
-			   int offset)
+static int zram_compress(struct zram *zram, struct zcomp_strm **zstrm,
+			struct page *page,
+			unsigned long *out_handle, unsigned int *out_comp_len)
 {
-	int ret = 0;
-	unsigned int clen;
+	int ret;
+	unsigned int comp_len;
+	void *src;
 	unsigned long handle = 0;
-	struct page *page;
-	unsigned char *user_mem, *cmem, *src, *uncmem = NULL;
 	struct zram_meta *meta = zram->meta;
-	struct zcomp_strm *zstrm = NULL;
-	unsigned long alloced_pages;
-	unsigned long element;
-
-	page = bvec->bv_page;
-	if (is_partial_io(bvec)) {
-		/*
-		 * This is a partial IO. We need to read the full page
-		 * before to write the changes.
-		 */
-		uncmem = kmalloc(PAGE_SIZE, GFP_NOIO);
-		if (!uncmem) {
-			ret = -ENOMEM;
-			goto out;
-		}
-		ret = zram_decompress_page(zram, uncmem, index);
-		if (ret)
-			goto out;
-	}
 
 compress_again:
-	user_mem = kmap_atomic(page);
-	if (is_partial_io(bvec)) {
-		memcpy(uncmem + offset, user_mem + bvec->bv_offset,
-		       bvec->bv_len);
-		kunmap_atomic(user_mem);
-		user_mem = NULL;
-	} else {
-		uncmem = user_mem;
-	}
-
-	if (page_same_filled(uncmem, &element)) {
-		if (user_mem)
-			kunmap_atomic(user_mem);
-		/* Free memory associated with this sector now. */
-		bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
-		zram_free_page(zram, index);
-		zram_set_flag(meta, index, ZRAM_SAME);
-		zram_set_element(meta, index, element);
-		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
-
-		atomic64_inc(&zram->stats.same_pages);
-		ret = 0;
-		goto out;
-	}
-
-	zstrm = zcomp_stream_get(zram->comp);
-	ret = zcomp_compress(zstrm, uncmem, &clen);
-	if (!is_partial_io(bvec)) {
-		kunmap_atomic(user_mem);
-		user_mem = NULL;
-		uncmem = NULL;
-	}
+	src = kmap_atomic(page);
+	ret = zcomp_compress(*zstrm, src, &comp_len);
+	kunmap_atomic(src);
 
 	if (unlikely(ret)) {
 		pr_err("Compression failed! err=%d\n", ret);
-		goto out;
+		return ret;
 	}
 
-	src = zstrm->buffer;
-	if (unlikely(clen > max_zpage_size)) {
-		clen = PAGE_SIZE;
-		if (is_partial_io(bvec))
-			src = uncmem;
-	}
+	if (unlikely(comp_len > max_zpage_size))
+		comp_len = PAGE_SIZE;
 
 	/*
 	 * handle allocation has 2 paths:
@@ -682,50 +661,70 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
 	 * from the slow path and handle has already been allocated.
 	 */
 	if (!handle)
-		handle = zs_malloc(meta->mem_pool, clen,
+		handle = zs_malloc(meta->mem_pool, comp_len,
 				__GFP_KSWAPD_RECLAIM |
 				__GFP_NOWARN |
 				__GFP_HIGHMEM |
 				__GFP_MOVABLE);
 	if (!handle) {
 		zcomp_stream_put(zram->comp);
-		zstrm = NULL;
-
 		atomic64_inc(&zram->stats.writestall);
-
-		handle = zs_malloc(meta->mem_pool, clen,
+		handle = zs_malloc(meta->mem_pool, comp_len,
 				GFP_NOIO | __GFP_HIGHMEM |
 				__GFP_MOVABLE);
+		*zstrm = zcomp_stream_get(zram->comp);
 		if (handle)
 			goto compress_again;
+		return -ENOMEM;
+	}
 
-		pr_err("Error allocating memory for compressed page: %u, size=%u\n",
-			index, clen);
-		ret = -ENOMEM;
-		goto out;
+	*out_handle = handle;
+	*out_comp_len = comp_len;
+	return 0;
+}
+
+static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
+				u32 index, int offset)
+{
+	int ret;
+	unsigned long handle;
+	unsigned int comp_len;
+	void *src, *dst;
+	struct zcomp_strm *zstrm;
+	unsigned long alloced_pages;
+	struct zram_meta *meta = zram->meta;
+	struct page *page = bvec->bv_page;
+
+	if (zram_special_page_write(zram, index, page))
+		return 0;
+
+	zstrm = zcomp_stream_get(zram->comp);
+	ret = zram_compress(zram, &zstrm, page, &handle, &comp_len);
+	if (ret) {
+		zcomp_stream_put(zram->comp);
+		return ret;
 	}
 
 	alloced_pages = zs_get_total_pages(meta->mem_pool);
 	update_used_max(zram, alloced_pages);
 
 	if (zram->limit_pages && alloced_pages > zram->limit_pages) {
+		zcomp_stream_put(zram->comp);
 		zs_free(meta->mem_pool, handle);
-		ret = -ENOMEM;
-		goto out;
+		return -ENOMEM;
 	}
 
-	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
+	dst = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
 
-	if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) {
+	if (comp_len == PAGE_SIZE) {
 		src = kmap_atomic(page);
-		copy_page(cmem, src);
+		copy_page(dst, src);
 		kunmap_atomic(src);
 	} else {
-		memcpy(cmem, src, clen);
+		memcpy(dst, zstrm->buffer, comp_len);
 	}
 
 	zcomp_stream_put(zram->comp);
-	zstrm = NULL;
 	zs_unmap_object(meta->mem_pool, handle);
 
 	/*
@@ -734,19 +733,54 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
 	 */
 	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
 	zram_free_page(zram, index);
-
 	meta->table[index].handle = handle;
-	zram_set_obj_size(meta, index, clen);
+	zram_set_obj_size(meta, index, comp_len);
 	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
 
 	/* Update stats */
-	atomic64_add(clen, &zram->stats.compr_data_size);
+	atomic64_add(comp_len, &zram->stats.compr_data_size);
 	atomic64_inc(&zram->stats.pages_stored);
+	return 0;
+}
+
+static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
+				u32 index, int offset)
+{
+	int ret;
+	struct page *page = NULL;
+	void *src;
+	struct bio_vec vec;
+
+	vec = *bvec;
+	if (is_partial_io(bvec)) {
+		void *dst;
+		/*
+		 * This is a partial IO. We need to read the full page
+		 * before to write the changes.
+		 */
+		page = alloc_page(GFP_NOIO|__GFP_HIGHMEM);
+		if (!page)
+			return -ENOMEM;
+
+		ret = zram_decompress_page(zram, page, index);
+		if (ret)
+			goto out;
+
+		src = kmap_atomic(bvec->bv_page);
+		dst = kmap_atomic(page);
+		memcpy(dst + offset, src + bvec->bv_offset, bvec->bv_len);
+		kunmap_atomic(dst);
+		kunmap_atomic(src);
+
+		vec.bv_page = page;
+		vec.bv_len = PAGE_SIZE;
+		vec.bv_offset = 0;
+	}
+
+	ret = __zram_bvec_write(zram, &vec, index, offset);
 out:
-	if (zstrm)
-		zcomp_stream_put(zram->comp);
 	if (is_partial_io(bvec))
-		kfree(uncmem);
+		__free_page(page);
 	return ret;
 }
 
@@ -802,6 +836,7 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
 	if (!is_write) {
 		atomic64_inc(&zram->stats.num_reads);
 		ret = zram_bvec_read(zram, bvec, index, offset);
+		flush_dcache_page(bvec->bv_page);
 	} else {
 		atomic64_inc(&zram->stats.num_writes);
 		ret = zram_bvec_write(zram, bvec, index, offset);
-- 
2.7.4

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 3/5] zram: use zram_slot_lock instead of raw bit_spin_lock op
  2017-04-03  5:17 [PATCH 0/5] zram clean up Minchan Kim
  2017-04-03  5:17 ` [PATCH 1/5] zram: handle multiple pages attached bio's bvec Minchan Kim
  2017-04-03  5:17 ` [PATCH 2/5] zram: partial IO refactoring Minchan Kim
@ 2017-04-03  5:17 ` Minchan Kim
  2017-04-03  6:08   ` Sergey Senozhatsky
  2017-04-04  2:18   ` Sergey Senozhatsky
  2017-04-03  5:17 ` [PATCH 4/5] zram: remove zram_meta structure Minchan Kim
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 25+ messages in thread
From: Minchan Kim @ 2017-04-03  5:17 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Sergey Senozhatsky, kernel-team, Minchan Kim

With this clean-up phase, I want to use zram's wrapper function
to lock table access which is more consistent with other zram's
functions.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/block/zram/zram_drv.c | 42 ++++++++++++++++++++++++++++--------------
 1 file changed, 28 insertions(+), 14 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 7938f4b98b01..71b0a584bc85 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -415,24 +415,39 @@ static DEVICE_ATTR_RO(io_stat);
 static DEVICE_ATTR_RO(mm_stat);
 static DEVICE_ATTR_RO(debug_stat);
 
+
+static void zram_slot_lock(struct zram *zram, u32 index)
+{
+	struct zram_meta *meta = zram->meta;
+
+	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
+}
+
+static void zram_slot_unlock(struct zram *zram, u32 index)
+{
+	struct zram_meta *meta = zram->meta;
+
+	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
+}
+
 static bool zram_special_page_read(struct zram *zram, u32 index,
 				struct page *page,
 				unsigned int offset, unsigned int len)
 {
 	struct zram_meta *meta = zram->meta;
 
-	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
+	zram_slot_lock(zram, index);
 	if (unlikely(!meta->table[index].handle) ||
 			zram_test_flag(meta, index, ZRAM_SAME)) {
 		void *mem;
 
-		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
+		zram_slot_unlock(zram, index);
 		mem = kmap_atomic(page);
 		zram_fill_page(mem + offset, len, meta->table[index].element);
 		kunmap_atomic(mem);
 		return true;
 	}
-	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
+	zram_slot_unlock(zram, index);
 
 	return false;
 }
@@ -448,11 +463,11 @@ static bool zram_special_page_write(struct zram *zram, u32 index,
 
 		kunmap_atomic(mem);
 		/* Free memory associated with this sector now. */
-		bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
+		zram_slot_lock(zram, index);
 		zram_free_page(zram, index);
 		zram_set_flag(meta, index, ZRAM_SAME);
 		zram_set_element(meta, index, element);
-		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
+		zram_slot_unlock(zram, index);
 
 		atomic64_inc(&zram->stats.same_pages);
 		return true;
@@ -559,7 +574,7 @@ static int zram_decompress_page(struct zram *zram, struct page *page, u32 index)
 	if (zram_special_page_read(zram, index, page, 0, PAGE_SIZE))
 		return 0;
 
-	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
+	zram_slot_lock(zram, index);
 	handle = meta->table[index].handle;
 	size = zram_get_obj_size(meta, index);
 
@@ -578,7 +593,7 @@ static int zram_decompress_page(struct zram *zram, struct page *page, u32 index)
 		zcomp_stream_put(zram->comp);
 	}
 	zs_unmap_object(meta->mem_pool, handle);
-	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
+	zram_slot_unlock(zram, index);
 
 	/* Should NEVER happen. Return bio error if it does. */
 	if (unlikely(ret))
@@ -731,11 +746,11 @@ static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
 	 * Free memory associated with this sector
 	 * before overwriting unused sectors.
 	 */
-	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
+	zram_slot_lock(zram, index);
 	zram_free_page(zram, index);
 	meta->table[index].handle = handle;
 	zram_set_obj_size(meta, index, comp_len);
-	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
+	zram_slot_unlock(zram, index);
 
 	/* Update stats */
 	atomic64_add(comp_len, &zram->stats.compr_data_size);
@@ -793,7 +808,6 @@ static void zram_bio_discard(struct zram *zram, u32 index,
 			     int offset, struct bio *bio)
 {
 	size_t n = bio->bi_iter.bi_size;
-	struct zram_meta *meta = zram->meta;
 
 	/*
 	 * zram manages data in physical block size units. Because logical block
@@ -814,9 +828,9 @@ static void zram_bio_discard(struct zram *zram, u32 index,
 	}
 
 	while (n >= PAGE_SIZE) {
-		bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
+		zram_slot_lock(zram, index);
 		zram_free_page(zram, index);
-		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
+		zram_slot_unlock(zram, index);
 		atomic64_inc(&zram->stats.notify_free);
 		index++;
 		n -= PAGE_SIZE;
@@ -925,9 +939,9 @@ static void zram_slot_free_notify(struct block_device *bdev,
 	zram = bdev->bd_disk->private_data;
 	meta = zram->meta;
 
-	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
+	zram_slot_lock(zram, index);
 	zram_free_page(zram, index);
-	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
+	zram_slot_unlock(zram, index);
 	atomic64_inc(&zram->stats.notify_free);
 }
 
-- 
2.7.4

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 4/5] zram: remove zram_meta structure
  2017-04-03  5:17 [PATCH 0/5] zram clean up Minchan Kim
                   ` (2 preceding siblings ...)
  2017-04-03  5:17 ` [PATCH 3/5] zram: use zram_slot_lock instead of raw bit_spin_lock op Minchan Kim
@ 2017-04-03  5:17 ` Minchan Kim
  2017-04-04  2:31   ` Sergey Senozhatsky
  2017-04-03  5:17 ` [PATCH 5/5] zram: introduce zram data accessor Minchan Kim
  2017-04-11  5:38 ` [PATCH 0/5] zram clean up Minchan Kim
  5 siblings, 1 reply; 25+ messages in thread
From: Minchan Kim @ 2017-04-03  5:17 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Sergey Senozhatsky, kernel-team, Minchan Kim

It's redundant now. Instead, remove it and use zram structure
directly.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/block/zram/zram_drv.c | 163 +++++++++++++++++-------------------------
 drivers/block/zram/zram_drv.h |   6 +-
 2 files changed, 65 insertions(+), 104 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 71b0a584bc85..fdb73222841d 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -58,46 +58,46 @@ static inline struct zram *dev_to_zram(struct device *dev)
 }
 
 /* flag operations require table entry bit_spin_lock() being held */
-static int zram_test_flag(struct zram_meta *meta, u32 index,
+static int zram_test_flag(struct zram *zram, u32 index,
 			enum zram_pageflags flag)
 {
-	return meta->table[index].value & BIT(flag);
+	return zram->table[index].value & BIT(flag);
 }
 
-static void zram_set_flag(struct zram_meta *meta, u32 index,
+static void zram_set_flag(struct zram *zram, u32 index,
 			enum zram_pageflags flag)
 {
-	meta->table[index].value |= BIT(flag);
+	zram->table[index].value |= BIT(flag);
 }
 
-static void zram_clear_flag(struct zram_meta *meta, u32 index,
+static void zram_clear_flag(struct zram *zram, u32 index,
 			enum zram_pageflags flag)
 {
-	meta->table[index].value &= ~BIT(flag);
+	zram->table[index].value &= ~BIT(flag);
 }
 
-static inline void zram_set_element(struct zram_meta *meta, u32 index,
+static inline void zram_set_element(struct zram *zram, u32 index,
 			unsigned long element)
 {
-	meta->table[index].element = element;
+	zram->table[index].element = element;
 }
 
-static inline void zram_clear_element(struct zram_meta *meta, u32 index)
+static inline void zram_clear_element(struct zram *zram, u32 index)
 {
-	meta->table[index].element = 0;
+	zram->table[index].element = 0;
 }
 
-static size_t zram_get_obj_size(struct zram_meta *meta, u32 index)
+static size_t zram_get_obj_size(struct zram *zram, u32 index)
 {
-	return meta->table[index].value & (BIT(ZRAM_FLAG_SHIFT) - 1);
+	return zram->table[index].value & (BIT(ZRAM_FLAG_SHIFT) - 1);
 }
 
-static void zram_set_obj_size(struct zram_meta *meta,
+static void zram_set_obj_size(struct zram *zram,
 					u32 index, size_t size)
 {
-	unsigned long flags = meta->table[index].value >> ZRAM_FLAG_SHIFT;
+	unsigned long flags = zram->table[index].value >> ZRAM_FLAG_SHIFT;
 
-	meta->table[index].value = (flags << ZRAM_FLAG_SHIFT) | size;
+	zram->table[index].value = (flags << ZRAM_FLAG_SHIFT) | size;
 }
 
 #if PAGE_SIZE != 4096
@@ -252,9 +252,8 @@ static ssize_t mem_used_max_store(struct device *dev,
 
 	down_read(&zram->init_lock);
 	if (init_done(zram)) {
-		struct zram_meta *meta = zram->meta;
 		atomic_long_set(&zram->stats.max_used_pages,
-				zs_get_total_pages(meta->mem_pool));
+				zs_get_total_pages(zram->mem_pool));
 	}
 	up_read(&zram->init_lock);
 
@@ -327,7 +326,6 @@ static ssize_t compact_store(struct device *dev,
 		struct device_attribute *attr, const char *buf, size_t len)
 {
 	struct zram *zram = dev_to_zram(dev);
-	struct zram_meta *meta;
 
 	down_read(&zram->init_lock);
 	if (!init_done(zram)) {
@@ -335,8 +333,7 @@ static ssize_t compact_store(struct device *dev,
 		return -EINVAL;
 	}
 
-	meta = zram->meta;
-	zs_compact(meta->mem_pool);
+	zs_compact(zram->mem_pool);
 	up_read(&zram->init_lock);
 
 	return len;
@@ -373,8 +370,8 @@ static ssize_t mm_stat_show(struct device *dev,
 
 	down_read(&zram->init_lock);
 	if (init_done(zram)) {
-		mem_used = zs_get_total_pages(zram->meta->mem_pool);
-		zs_pool_stats(zram->meta->mem_pool, &pool_stats);
+		mem_used = zs_get_total_pages(zram->mem_pool);
+		zs_pool_stats(zram->mem_pool, &pool_stats);
 	}
 
 	orig_size = atomic64_read(&zram->stats.pages_stored);
@@ -418,32 +415,26 @@ static DEVICE_ATTR_RO(debug_stat);
 
 static void zram_slot_lock(struct zram *zram, u32 index)
 {
-	struct zram_meta *meta = zram->meta;
-
-	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
+	bit_spin_lock(ZRAM_ACCESS, &zram->table[index].value);
 }
 
 static void zram_slot_unlock(struct zram *zram, u32 index)
 {
-	struct zram_meta *meta = zram->meta;
-
-	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
+	bit_spin_unlock(ZRAM_ACCESS, &zram->table[index].value);
 }
 
 static bool zram_special_page_read(struct zram *zram, u32 index,
 				struct page *page,
 				unsigned int offset, unsigned int len)
 {
-	struct zram_meta *meta = zram->meta;
-
 	zram_slot_lock(zram, index);
-	if (unlikely(!meta->table[index].handle) ||
-			zram_test_flag(meta, index, ZRAM_SAME)) {
+	if (unlikely(!zram->table[index].handle) ||
+			zram_test_flag(zram, index, ZRAM_SAME)) {
 		void *mem;
 
 		zram_slot_unlock(zram, index);
 		mem = kmap_atomic(page);
-		zram_fill_page(mem + offset, len, meta->table[index].element);
+		zram_fill_page(mem + offset, len, zram->table[index].element);
 		kunmap_atomic(mem);
 		return true;
 	}
@@ -459,14 +450,12 @@ static bool zram_special_page_write(struct zram *zram, u32 index,
 	void *mem = kmap_atomic(page);
 
 	if (page_same_filled(mem, &element)) {
-		struct zram_meta *meta = zram->meta;
-
 		kunmap_atomic(mem);
 		/* Free memory associated with this sector now. */
 		zram_slot_lock(zram, index);
 		zram_free_page(zram, index);
-		zram_set_flag(meta, index, ZRAM_SAME);
-		zram_set_element(meta, index, element);
+		zram_set_flag(zram, index, ZRAM_SAME);
+		zram_set_element(zram, index, element);
 		zram_slot_unlock(zram, index);
 
 		atomic64_inc(&zram->stats.same_pages);
@@ -477,56 +466,44 @@ static bool zram_special_page_write(struct zram *zram, u32 index,
 	return false;
 }
 
-static void zram_meta_free(struct zram_meta *meta, u64 disksize)
+static void zram_meta_free(struct zram *zram, u64 disksize)
 {
 	size_t num_pages = disksize >> PAGE_SHIFT;
 	size_t index;
 
 	/* Free all pages that are still in this zram device */
 	for (index = 0; index < num_pages; index++) {
-		unsigned long handle = meta->table[index].handle;
+		unsigned long handle = zram->table[index].handle;
 		/*
 		 * No memory is allocated for same element filled pages.
 		 * Simply clear same page flag.
 		 */
-		if (!handle || zram_test_flag(meta, index, ZRAM_SAME))
+		if (!handle || zram_test_flag(zram, index, ZRAM_SAME))
 			continue;
 
-		zs_free(meta->mem_pool, handle);
+		zs_free(zram->mem_pool, handle);
 	}
 
-	zs_destroy_pool(meta->mem_pool);
-	vfree(meta->table);
-	kfree(meta);
+	zs_destroy_pool(zram->mem_pool);
+	vfree(zram->table);
 }
 
-static struct zram_meta *zram_meta_alloc(char *pool_name, u64 disksize)
+static bool zram_meta_alloc(struct zram *zram, u64 disksize)
 {
 	size_t num_pages;
-	struct zram_meta *meta = kmalloc(sizeof(*meta), GFP_KERNEL);
-
-	if (!meta)
-		return NULL;
 
 	num_pages = disksize >> PAGE_SHIFT;
-	meta->table = vzalloc(num_pages * sizeof(*meta->table));
-	if (!meta->table) {
-		pr_err("Error allocating zram address table\n");
-		goto out_error;
-	}
+	zram->table = vzalloc(num_pages * sizeof(*zram->table));
+	if (!zram->table)
+		return false;
 
-	meta->mem_pool = zs_create_pool(pool_name);
-	if (!meta->mem_pool) {
-		pr_err("Error creating memory pool\n");
-		goto out_error;
+	zram->mem_pool = zs_create_pool(zram->disk->disk_name);
+	if (!zram->mem_pool) {
+		vfree(zram->table);
+		return false;
 	}
 
-	return meta;
-
-out_error:
-	vfree(meta->table);
-	kfree(meta);
-	return NULL;
+	return true;
 }
 
 /*
@@ -536,16 +513,15 @@ static struct zram_meta *zram_meta_alloc(char *pool_name, u64 disksize)
  */
 static void zram_free_page(struct zram *zram, size_t index)
 {
-	struct zram_meta *meta = zram->meta;
-	unsigned long handle = meta->table[index].handle;
+	unsigned long handle = zram->table[index].handle;
 
 	/*
 	 * No memory is allocated for same element filled pages.
 	 * Simply clear same page flag.
 	 */
-	if (zram_test_flag(meta, index, ZRAM_SAME)) {
-		zram_clear_flag(meta, index, ZRAM_SAME);
-		zram_clear_element(meta, index);
+	if (zram_test_flag(zram, index, ZRAM_SAME)) {
+		zram_clear_flag(zram, index, ZRAM_SAME);
+		zram_clear_element(zram, index);
 		atomic64_dec(&zram->stats.same_pages);
 		return;
 	}
@@ -553,14 +529,14 @@ static void zram_free_page(struct zram *zram, size_t index)
 	if (!handle)
 		return;
 
-	zs_free(meta->mem_pool, handle);
+	zs_free(zram->mem_pool, handle);
 
-	atomic64_sub(zram_get_obj_size(meta, index),
+	atomic64_sub(zram_get_obj_size(zram, index),
 			&zram->stats.compr_data_size);
 	atomic64_dec(&zram->stats.pages_stored);
 
-	meta->table[index].handle = 0;
-	zram_set_obj_size(meta, index, 0);
+	zram->table[index].handle = 0;
+	zram_set_obj_size(zram, index, 0);
 }
 
 static int zram_decompress_page(struct zram *zram, struct page *page, u32 index)
@@ -569,16 +545,15 @@ static int zram_decompress_page(struct zram *zram, struct page *page, u32 index)
 	unsigned long handle;
 	unsigned int size;
 	void *src, *dst;
-	struct zram_meta *meta = zram->meta;
 
 	if (zram_special_page_read(zram, index, page, 0, PAGE_SIZE))
 		return 0;
 
 	zram_slot_lock(zram, index);
-	handle = meta->table[index].handle;
-	size = zram_get_obj_size(meta, index);
+	handle = zram->table[index].handle;
+	size = zram_get_obj_size(zram, index);
 
-	src = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
+	src = zs_map_object(zram->mem_pool, handle, ZS_MM_RO);
 	if (size == PAGE_SIZE) {
 		dst = kmap_atomic(page);
 		copy_page(dst, src);
@@ -592,7 +567,7 @@ static int zram_decompress_page(struct zram *zram, struct page *page, u32 index)
 		kunmap_atomic(dst);
 		zcomp_stream_put(zram->comp);
 	}
-	zs_unmap_object(meta->mem_pool, handle);
+	zs_unmap_object(zram->mem_pool, handle);
 	zram_slot_unlock(zram, index);
 
 	/* Should NEVER happen. Return bio error if it does. */
@@ -647,7 +622,6 @@ static int zram_compress(struct zram *zram, struct zcomp_strm **zstrm,
 	unsigned int comp_len;
 	void *src;
 	unsigned long handle = 0;
-	struct zram_meta *meta = zram->meta;
 
 compress_again:
 	src = kmap_atomic(page);
@@ -676,7 +650,7 @@ static int zram_compress(struct zram *zram, struct zcomp_strm **zstrm,
 	 * from the slow path and handle has already been allocated.
 	 */
 	if (!handle)
-		handle = zs_malloc(meta->mem_pool, comp_len,
+		handle = zs_malloc(zram->mem_pool, comp_len,
 				__GFP_KSWAPD_RECLAIM |
 				__GFP_NOWARN |
 				__GFP_HIGHMEM |
@@ -684,7 +658,7 @@ static int zram_compress(struct zram *zram, struct zcomp_strm **zstrm,
 	if (!handle) {
 		zcomp_stream_put(zram->comp);
 		atomic64_inc(&zram->stats.writestall);
-		handle = zs_malloc(meta->mem_pool, comp_len,
+		handle = zs_malloc(zram->mem_pool, comp_len,
 				GFP_NOIO | __GFP_HIGHMEM |
 				__GFP_MOVABLE);
 		*zstrm = zcomp_stream_get(zram->comp);
@@ -707,7 +681,6 @@ static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
 	void *src, *dst;
 	struct zcomp_strm *zstrm;
 	unsigned long alloced_pages;
-	struct zram_meta *meta = zram->meta;
 	struct page *page = bvec->bv_page;
 
 	if (zram_special_page_write(zram, index, page))
@@ -720,16 +693,16 @@ static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
 		return ret;
 	}
 
-	alloced_pages = zs_get_total_pages(meta->mem_pool);
+	alloced_pages = zs_get_total_pages(zram->mem_pool);
 	update_used_max(zram, alloced_pages);
 
 	if (zram->limit_pages && alloced_pages > zram->limit_pages) {
 		zcomp_stream_put(zram->comp);
-		zs_free(meta->mem_pool, handle);
+		zs_free(zram->mem_pool, handle);
 		return -ENOMEM;
 	}
 
-	dst = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
+	dst = zs_map_object(zram->mem_pool, handle, ZS_MM_WO);
 
 	if (comp_len == PAGE_SIZE) {
 		src = kmap_atomic(page);
@@ -740,7 +713,7 @@ static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
 	}
 
 	zcomp_stream_put(zram->comp);
-	zs_unmap_object(meta->mem_pool, handle);
+	zs_unmap_object(zram->mem_pool, handle);
 
 	/*
 	 * Free memory associated with this sector
@@ -748,8 +721,8 @@ static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
 	 */
 	zram_slot_lock(zram, index);
 	zram_free_page(zram, index);
-	meta->table[index].handle = handle;
-	zram_set_obj_size(meta, index, comp_len);
+	zram->table[index].handle = handle;
+	zram_set_obj_size(zram, index, comp_len);
 	zram_slot_unlock(zram, index);
 
 	/* Update stats */
@@ -934,10 +907,8 @@ static void zram_slot_free_notify(struct block_device *bdev,
 				unsigned long index)
 {
 	struct zram *zram;
-	struct zram_meta *meta;
 
 	zram = bdev->bd_disk->private_data;
-	meta = zram->meta;
 
 	zram_slot_lock(zram, index);
 	zram_free_page(zram, index);
@@ -985,7 +956,6 @@ static int zram_rw_page(struct block_device *bdev, sector_t sector,
 
 static void zram_reset_device(struct zram *zram)
 {
-	struct zram_meta *meta;
 	struct zcomp *comp;
 	u64 disksize;
 
@@ -998,7 +968,6 @@ static void zram_reset_device(struct zram *zram)
 		return;
 	}
 
-	meta = zram->meta;
 	comp = zram->comp;
 	disksize = zram->disksize;
 
@@ -1011,7 +980,7 @@ static void zram_reset_device(struct zram *zram)
 
 	up_write(&zram->init_lock);
 	/* I/O operation under all of CPU are done so let's free */
-	zram_meta_free(meta, disksize);
+	zram_meta_free(zram, disksize);
 	zcomp_destroy(comp);
 }
 
@@ -1020,7 +989,6 @@ static ssize_t disksize_store(struct device *dev,
 {
 	u64 disksize;
 	struct zcomp *comp;
-	struct zram_meta *meta;
 	struct zram *zram = dev_to_zram(dev);
 	int err;
 
@@ -1029,8 +997,7 @@ static ssize_t disksize_store(struct device *dev,
 		return -EINVAL;
 
 	disksize = PAGE_ALIGN(disksize);
-	meta = zram_meta_alloc(zram->disk->disk_name, disksize);
-	if (!meta)
+	if (!zram_meta_alloc(zram, disksize))
 		return -ENOMEM;
 
 	comp = zcomp_create(zram->compressor);
@@ -1048,7 +1015,6 @@ static ssize_t disksize_store(struct device *dev,
 		goto out_destroy_comp;
 	}
 
-	zram->meta = meta;
 	zram->comp = comp;
 	zram->disksize = disksize;
 	set_capacity(zram->disk, zram->disksize >> SECTOR_SHIFT);
@@ -1061,7 +1027,7 @@ static ssize_t disksize_store(struct device *dev,
 	up_write(&zram->init_lock);
 	zcomp_destroy(comp);
 out_free_meta:
-	zram_meta_free(meta, disksize);
+	zram_meta_free(zram, disksize);
 	return err;
 }
 
@@ -1248,7 +1214,6 @@ static int zram_add(void)
 		goto out_free_disk;
 	}
 	strlcpy(zram->compressor, default_compressor, sizeof(zram->compressor));
-	zram->meta = NULL;
 
 	pr_info("Added device: %s\n", zram->disk->disk_name);
 	return device_id;
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index caeff51f1571..e34e44d02e3e 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -92,13 +92,9 @@ struct zram_stats {
 	atomic64_t writestall;		/* no. of write slow paths */
 };
 
-struct zram_meta {
+struct zram {
 	struct zram_table_entry *table;
 	struct zs_pool *mem_pool;
-};
-
-struct zram {
-	struct zram_meta *meta;
 	struct zcomp *comp;
 	struct gendisk *disk;
 	/* Prevent concurrent execution of device init */
-- 
2.7.4

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 5/5] zram: introduce zram data accessor
  2017-04-03  5:17 [PATCH 0/5] zram clean up Minchan Kim
                   ` (3 preceding siblings ...)
  2017-04-03  5:17 ` [PATCH 4/5] zram: remove zram_meta structure Minchan Kim
@ 2017-04-03  5:17 ` Minchan Kim
  2017-04-04  4:40   ` Sergey Senozhatsky
  2017-04-11  5:38 ` [PATCH 0/5] zram clean up Minchan Kim
  5 siblings, 1 reply; 25+ messages in thread
From: Minchan Kim @ 2017-04-03  5:17 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Sergey Senozhatsky, kernel-team, Minchan Kim

With element, sometime I got confused handle and element access.
It might be my bad but I think it's time to introduce accessor
to prevent future idiot like me.
This patch is just clean-up patch so it shouldn't change any
behavior.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/block/zram/zram_drv.c | 33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index fdb73222841d..c3171e5aa582 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -57,6 +57,16 @@ static inline struct zram *dev_to_zram(struct device *dev)
 	return (struct zram *)dev_to_disk(dev)->private_data;
 }
 
+static unsigned long zram_get_handle(struct zram *zram, u32 index)
+{
+	return zram->table[index].handle;
+}
+
+static void zram_set_handle(struct zram *zram, u32 index, unsigned long handle)
+{
+	zram->table[index].handle = handle;
+}
+
 /* flag operations require table entry bit_spin_lock() being held */
 static int zram_test_flag(struct zram *zram, u32 index,
 			enum zram_pageflags flag)
@@ -82,9 +92,9 @@ static inline void zram_set_element(struct zram *zram, u32 index,
 	zram->table[index].element = element;
 }
 
-static inline void zram_clear_element(struct zram *zram, u32 index)
+static unsigned long zram_get_element(struct zram *zram, u32 index)
 {
-	zram->table[index].element = 0;
+	return zram->table[index].element;
 }
 
 static size_t zram_get_obj_size(struct zram *zram, u32 index)
@@ -428,13 +438,14 @@ static bool zram_special_page_read(struct zram *zram, u32 index,
 				unsigned int offset, unsigned int len)
 {
 	zram_slot_lock(zram, index);
-	if (unlikely(!zram->table[index].handle) ||
-			zram_test_flag(zram, index, ZRAM_SAME)) {
+	if (unlikely(!zram_get_handle(zram, index) ||
+			zram_test_flag(zram, index, ZRAM_SAME))) {
 		void *mem;
 
 		zram_slot_unlock(zram, index);
 		mem = kmap_atomic(page);
-		zram_fill_page(mem + offset, len, zram->table[index].element);
+		zram_fill_page(mem + offset, len,
+					zram_get_element(zram, index));
 		kunmap_atomic(mem);
 		return true;
 	}
@@ -473,7 +484,7 @@ static void zram_meta_free(struct zram *zram, u64 disksize)
 
 	/* Free all pages that are still in this zram device */
 	for (index = 0; index < num_pages; index++) {
-		unsigned long handle = zram->table[index].handle;
+		unsigned long handle = zram_get_handle(zram, index);
 		/*
 		 * No memory is allocated for same element filled pages.
 		 * Simply clear same page flag.
@@ -513,7 +524,7 @@ static bool zram_meta_alloc(struct zram *zram, u64 disksize)
  */
 static void zram_free_page(struct zram *zram, size_t index)
 {
-	unsigned long handle = zram->table[index].handle;
+	unsigned long handle = zram_get_handle(zram, index);
 
 	/*
 	 * No memory is allocated for same element filled pages.
@@ -521,7 +532,7 @@ static void zram_free_page(struct zram *zram, size_t index)
 	 */
 	if (zram_test_flag(zram, index, ZRAM_SAME)) {
 		zram_clear_flag(zram, index, ZRAM_SAME);
-		zram_clear_element(zram, index);
+		zram_set_element(zram, index, 0);
 		atomic64_dec(&zram->stats.same_pages);
 		return;
 	}
@@ -535,7 +546,7 @@ static void zram_free_page(struct zram *zram, size_t index)
 			&zram->stats.compr_data_size);
 	atomic64_dec(&zram->stats.pages_stored);
 
-	zram->table[index].handle = 0;
+	zram_set_handle(zram, index, 0);
 	zram_set_obj_size(zram, index, 0);
 }
 
@@ -550,7 +561,7 @@ static int zram_decompress_page(struct zram *zram, struct page *page, u32 index)
 		return 0;
 
 	zram_slot_lock(zram, index);
-	handle = zram->table[index].handle;
+	handle = zram_get_handle(zram, index);
 	size = zram_get_obj_size(zram, index);
 
 	src = zs_map_object(zram->mem_pool, handle, ZS_MM_RO);
@@ -721,7 +732,7 @@ static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
 	 */
 	zram_slot_lock(zram, index);
 	zram_free_page(zram, index);
-	zram->table[index].handle = handle;
+	zram_set_handle(zram, index, handle);
 	zram_set_obj_size(zram, index, comp_len);
 	zram_slot_unlock(zram, index);
 
-- 
2.7.4

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/5] zram: partial IO refactoring
  2017-04-03  5:17 ` [PATCH 2/5] zram: partial IO refactoring Minchan Kim
@ 2017-04-03  5:52   ` Mika Penttilä
  2017-04-03  6:12     ` Minchan Kim
  2017-04-04  2:17   ` Sergey Senozhatsky
  1 sibling, 1 reply; 25+ messages in thread
From: Mika Penttilä @ 2017-04-03  5:52 UTC (permalink / raw)
  To: Minchan Kim, Andrew Morton; +Cc: linux-kernel, Sergey Senozhatsky, kernel-team


Hi!

On 04/03/2017 08:17 AM, Minchan Kim wrote:
> For architecture(PAGE_SIZE > 4K), zram have supported partial IO.
> However, the mixed code for handling normal/partial IO is too mess,
> error-prone to modify IO handler functions with upcoming feature
> so this patch aims for cleaning up zram's IO handling functions.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  drivers/block/zram/zram_drv.c | 333 +++++++++++++++++++++++-------------------
>  1 file changed, 184 insertions(+), 149 deletions(-)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 28c2836f8c96..7938f4b98b01 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -45,6 +45,8 @@ static const char *default_compressor = "lzo";
>  /* Module params (documentation at end) */
>  static unsigned int num_devices = 1;
>  
> +static void zram_free_page(struct zram *zram, size_t index);
> +
>  static inline bool init_done(struct zram *zram)
>  {
>  	return zram->disksize;
> @@ -98,10 +100,17 @@ static void zram_set_obj_size(struct zram_meta *meta,
>  	meta->table[index].value = (flags << ZRAM_FLAG_SHIFT) | size;
>  }
>  
> +#if PAGE_SIZE != 4096
>  static inline bool is_partial_io(struct bio_vec *bvec)
>  {
>  	return bvec->bv_len != PAGE_SIZE;
>  }
> +#else

For page size of 4096 bv_len can still be < 4096 and partial pages should be supported 
(uncompress before write etc). ? 

> +static inline bool is_partial_io(struct bio_vec *bvec)
> +{
> +	return false;
> +}
> +#endif
>  
>  static void zram_revalidate_disk(struct zram *zram)
>  {
> @@ -191,18 +200,6 @@ static bool page_same_filled(void *ptr, unsigned long *element)
>  	return true;
>  }
>  
> -static void handle_same_page(struct bio_vec *bvec, unsigned long element)
> -{
> -	struct page *page = bvec->bv_page;
> -	void *user_mem;
> -
> -	user_mem = kmap_atomic(page);
> -	zram_fill_page(user_mem + bvec->bv_offset, bvec->bv_len, element);
> -	kunmap_atomic(user_mem);
> -
> -	flush_dcache_page(page);
> -}
> -
>  static ssize_t initstate_show(struct device *dev,
>  		struct device_attribute *attr, char *buf)
>  {
> @@ -418,6 +415,53 @@ static DEVICE_ATTR_RO(io_stat);
>  static DEVICE_ATTR_RO(mm_stat);
>  static DEVICE_ATTR_RO(debug_stat);
>  
> +static bool zram_special_page_read(struct zram *zram, u32 index,
> +				struct page *page,
> +				unsigned int offset, unsigned int len)
> +{
> +	struct zram_meta *meta = zram->meta;
> +
> +	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
> +	if (unlikely(!meta->table[index].handle) ||
> +			zram_test_flag(meta, index, ZRAM_SAME)) {
> +		void *mem;
> +
> +		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> +		mem = kmap_atomic(page);
> +		zram_fill_page(mem + offset, len, meta->table[index].element);
> +		kunmap_atomic(mem);
> +		return true;
> +	}
> +	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> +
> +	return false;
> +}
> +
> +static bool zram_special_page_write(struct zram *zram, u32 index,
> +					struct page *page)
> +{
> +	unsigned long element;
> +	void *mem = kmap_atomic(page);
> +
> +	if (page_same_filled(mem, &element)) {
> +		struct zram_meta *meta = zram->meta;
> +
> +		kunmap_atomic(mem);
> +		/* Free memory associated with this sector now. */
> +		bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
> +		zram_free_page(zram, index);
> +		zram_set_flag(meta, index, ZRAM_SAME);
> +		zram_set_element(meta, index, element);
> +		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> +
> +		atomic64_inc(&zram->stats.same_pages);
> +		return true;
> +	}
> +	kunmap_atomic(mem);
> +
> +	return false;
> +}
> +
>  static void zram_meta_free(struct zram_meta *meta, u64 disksize)
>  {
>  	size_t num_pages = disksize >> PAGE_SHIFT;
> @@ -504,169 +548,104 @@ static void zram_free_page(struct zram *zram, size_t index)
>  	zram_set_obj_size(meta, index, 0);
>  }
>  
> -static int zram_decompress_page(struct zram *zram, char *mem, u32 index)
> +static int zram_decompress_page(struct zram *zram, struct page *page, u32 index)
>  {
> -	int ret = 0;
> -	unsigned char *cmem;
> -	struct zram_meta *meta = zram->meta;
> +	int ret;
>  	unsigned long handle;
>  	unsigned int size;
> +	void *src, *dst;
> +	struct zram_meta *meta = zram->meta;
> +
> +	if (zram_special_page_read(zram, index, page, 0, PAGE_SIZE))
> +		return 0;
>  
>  	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
>  	handle = meta->table[index].handle;
>  	size = zram_get_obj_size(meta, index);
>  
> -	if (!handle || zram_test_flag(meta, index, ZRAM_SAME)) {
> -		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> -		zram_fill_page(mem, PAGE_SIZE, meta->table[index].element);
> -		return 0;
> -	}
> -
> -	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
> +	src = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
>  	if (size == PAGE_SIZE) {
> -		copy_page(mem, cmem);
> +		dst = kmap_atomic(page);
> +		copy_page(dst, src);
> +		kunmap_atomic(dst);
> +		ret = 0;
>  	} else {
>  		struct zcomp_strm *zstrm = zcomp_stream_get(zram->comp);
>  
> -		ret = zcomp_decompress(zstrm, cmem, size, mem);
> +		dst = kmap_atomic(page);
> +		ret = zcomp_decompress(zstrm, src, size, dst);
> +		kunmap_atomic(dst);
>  		zcomp_stream_put(zram->comp);
>  	}
>  	zs_unmap_object(meta->mem_pool, handle);
>  	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
>  
>  	/* Should NEVER happen. Return bio error if it does. */
> -	if (unlikely(ret)) {
> +	if (unlikely(ret))
>  		pr_err("Decompression failed! err=%d, page=%u\n", ret, index);
> -		return ret;
> -	}
>  
> -	return 0;
> +	return ret;
>  }
>  
>  static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
> -			  u32 index, int offset)
> +				u32 index, int offset)
>  {
>  	int ret;
>  	struct page *page;
> -	unsigned char *user_mem, *uncmem = NULL;
> -	struct zram_meta *meta = zram->meta;
> -	page = bvec->bv_page;
>  
> -	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
> -	if (unlikely(!meta->table[index].handle) ||
> -			zram_test_flag(meta, index, ZRAM_SAME)) {
> -		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> -		handle_same_page(bvec, meta->table[index].element);
> +	page = bvec->bv_page;
> +	if (zram_special_page_read(zram, index, page, bvec->bv_offset,
> +				bvec->bv_len))
>  		return 0;
> -	}
> -	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> -
> -	if (is_partial_io(bvec))
> -		/* Use  a temporary buffer to decompress the page */
> -		uncmem = kmalloc(PAGE_SIZE, GFP_NOIO);
> -
> -	user_mem = kmap_atomic(page);
> -	if (!is_partial_io(bvec))
> -		uncmem = user_mem;
>  
> -	if (!uncmem) {
> -		pr_err("Unable to allocate temp memory\n");
> -		ret = -ENOMEM;
> -		goto out_cleanup;
> +	if (is_partial_io(bvec)) {
> +		/* Use a temporary buffer to decompress the page */
> +		page = alloc_page(GFP_NOIO|__GFP_HIGHMEM);
> +		if (!page)
> +			return -ENOMEM;
>  	}
>  
> -	ret = zram_decompress_page(zram, uncmem, index);
> -	/* Should NEVER happen. Return bio error if it does. */
> +	ret = zram_decompress_page(zram, page, index);
>  	if (unlikely(ret))
> -		goto out_cleanup;
> +		goto out;
>  
> -	if (is_partial_io(bvec))
> -		memcpy(user_mem + bvec->bv_offset, uncmem + offset,
> -				bvec->bv_len);
> +	if (is_partial_io(bvec)) {
> +		void *dst = kmap_atomic(bvec->bv_page);
> +		void *src = kmap_atomic(page);
>  
> -	flush_dcache_page(page);
> -	ret = 0;
> -out_cleanup:
> -	kunmap_atomic(user_mem);
> +		memcpy(dst + bvec->bv_offset, src + offset, bvec->bv_len);
> +		kunmap_atomic(src);
> +		kunmap_atomic(dst);
> +	}
> +out:
>  	if (is_partial_io(bvec))
> -		kfree(uncmem);
> +		__free_page(page);
> +
>  	return ret;
>  }
>  
> -static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
> -			   int offset)
> +static int zram_compress(struct zram *zram, struct zcomp_strm **zstrm,
> +			struct page *page,
> +			unsigned long *out_handle, unsigned int *out_comp_len)
>  {
> -	int ret = 0;
> -	unsigned int clen;
> +	int ret;
> +	unsigned int comp_len;
> +	void *src;
>  	unsigned long handle = 0;
> -	struct page *page;
> -	unsigned char *user_mem, *cmem, *src, *uncmem = NULL;
>  	struct zram_meta *meta = zram->meta;
> -	struct zcomp_strm *zstrm = NULL;
> -	unsigned long alloced_pages;
> -	unsigned long element;
> -
> -	page = bvec->bv_page;
> -	if (is_partial_io(bvec)) {
> -		/*
> -		 * This is a partial IO. We need to read the full page
> -		 * before to write the changes.
> -		 */
> -		uncmem = kmalloc(PAGE_SIZE, GFP_NOIO);
> -		if (!uncmem) {
> -			ret = -ENOMEM;
> -			goto out;
> -		}
> -		ret = zram_decompress_page(zram, uncmem, index);
> -		if (ret)
> -			goto out;
> -	}
>  
>  compress_again:
> -	user_mem = kmap_atomic(page);
> -	if (is_partial_io(bvec)) {
> -		memcpy(uncmem + offset, user_mem + bvec->bv_offset,
> -		       bvec->bv_len);
> -		kunmap_atomic(user_mem);
> -		user_mem = NULL;
> -	} else {
> -		uncmem = user_mem;
> -	}
> -
> -	if (page_same_filled(uncmem, &element)) {
> -		if (user_mem)
> -			kunmap_atomic(user_mem);
> -		/* Free memory associated with this sector now. */
> -		bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
> -		zram_free_page(zram, index);
> -		zram_set_flag(meta, index, ZRAM_SAME);
> -		zram_set_element(meta, index, element);
> -		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> -
> -		atomic64_inc(&zram->stats.same_pages);
> -		ret = 0;
> -		goto out;
> -	}
> -
> -	zstrm = zcomp_stream_get(zram->comp);
> -	ret = zcomp_compress(zstrm, uncmem, &clen);
> -	if (!is_partial_io(bvec)) {
> -		kunmap_atomic(user_mem);
> -		user_mem = NULL;
> -		uncmem = NULL;
> -	}
> +	src = kmap_atomic(page);
> +	ret = zcomp_compress(*zstrm, src, &comp_len);
> +	kunmap_atomic(src);
>  
>  	if (unlikely(ret)) {
>  		pr_err("Compression failed! err=%d\n", ret);
> -		goto out;
> +		return ret;
>  	}
>  
> -	src = zstrm->buffer;
> -	if (unlikely(clen > max_zpage_size)) {
> -		clen = PAGE_SIZE;
> -		if (is_partial_io(bvec))
> -			src = uncmem;
> -	}
> +	if (unlikely(comp_len > max_zpage_size))
> +		comp_len = PAGE_SIZE;
>  
>  	/*
>  	 * handle allocation has 2 paths:
> @@ -682,50 +661,70 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>  	 * from the slow path and handle has already been allocated.
>  	 */
>  	if (!handle)
> -		handle = zs_malloc(meta->mem_pool, clen,
> +		handle = zs_malloc(meta->mem_pool, comp_len,
>  				__GFP_KSWAPD_RECLAIM |
>  				__GFP_NOWARN |
>  				__GFP_HIGHMEM |
>  				__GFP_MOVABLE);
>  	if (!handle) {
>  		zcomp_stream_put(zram->comp);
> -		zstrm = NULL;
> -
>  		atomic64_inc(&zram->stats.writestall);
> -
> -		handle = zs_malloc(meta->mem_pool, clen,
> +		handle = zs_malloc(meta->mem_pool, comp_len,
>  				GFP_NOIO | __GFP_HIGHMEM |
>  				__GFP_MOVABLE);
> +		*zstrm = zcomp_stream_get(zram->comp);
>  		if (handle)
>  			goto compress_again;
> +		return -ENOMEM;
> +	}
>  
> -		pr_err("Error allocating memory for compressed page: %u, size=%u\n",
> -			index, clen);
> -		ret = -ENOMEM;
> -		goto out;
> +	*out_handle = handle;
> +	*out_comp_len = comp_len;
> +	return 0;
> +}
> +
> +static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
> +				u32 index, int offset)
> +{
> +	int ret;
> +	unsigned long handle;
> +	unsigned int comp_len;
> +	void *src, *dst;
> +	struct zcomp_strm *zstrm;
> +	unsigned long alloced_pages;
> +	struct zram_meta *meta = zram->meta;
> +	struct page *page = bvec->bv_page;
> +
> +	if (zram_special_page_write(zram, index, page))
> +		return 0;
> +
> +	zstrm = zcomp_stream_get(zram->comp);
> +	ret = zram_compress(zram, &zstrm, page, &handle, &comp_len);
> +	if (ret) {
> +		zcomp_stream_put(zram->comp);
> +		return ret;
>  	}
>  
>  	alloced_pages = zs_get_total_pages(meta->mem_pool);
>  	update_used_max(zram, alloced_pages);
>  
>  	if (zram->limit_pages && alloced_pages > zram->limit_pages) {
> +		zcomp_stream_put(zram->comp);
>  		zs_free(meta->mem_pool, handle);
> -		ret = -ENOMEM;
> -		goto out;
> +		return -ENOMEM;
>  	}
>  
> -	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
> +	dst = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
>  
> -	if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) {
> +	if (comp_len == PAGE_SIZE) {
>  		src = kmap_atomic(page);
> -		copy_page(cmem, src);
> +		copy_page(dst, src);
>  		kunmap_atomic(src);
>  	} else {
> -		memcpy(cmem, src, clen);
> +		memcpy(dst, zstrm->buffer, comp_len);
>  	}
>  
>  	zcomp_stream_put(zram->comp);
> -	zstrm = NULL;
>  	zs_unmap_object(meta->mem_pool, handle);
>  
>  	/*
> @@ -734,19 +733,54 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>  	 */
>  	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
>  	zram_free_page(zram, index);
> -
>  	meta->table[index].handle = handle;
> -	zram_set_obj_size(meta, index, clen);
> +	zram_set_obj_size(meta, index, comp_len);
>  	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
>  
>  	/* Update stats */
> -	atomic64_add(clen, &zram->stats.compr_data_size);
> +	atomic64_add(comp_len, &zram->stats.compr_data_size);
>  	atomic64_inc(&zram->stats.pages_stored);
> +	return 0;
> +}
> +
> +static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
> +				u32 index, int offset)
> +{
> +	int ret;
> +	struct page *page = NULL;
> +	void *src;
> +	struct bio_vec vec;
> +
> +	vec = *bvec;
> +	if (is_partial_io(bvec)) {
> +		void *dst;
> +		/*
> +		 * This is a partial IO. We need to read the full page
> +		 * before to write the changes.
> +		 */
> +		page = alloc_page(GFP_NOIO|__GFP_HIGHMEM);
> +		if (!page)
> +			return -ENOMEM;
> +
> +		ret = zram_decompress_page(zram, page, index);
> +		if (ret)
> +			goto out;
> +
> +		src = kmap_atomic(bvec->bv_page);
> +		dst = kmap_atomic(page);
> +		memcpy(dst + offset, src + bvec->bv_offset, bvec->bv_len);
> +		kunmap_atomic(dst);
> +		kunmap_atomic(src);
> +
> +		vec.bv_page = page;
> +		vec.bv_len = PAGE_SIZE;
> +		vec.bv_offset = 0;
> +	}
> +
> +	ret = __zram_bvec_write(zram, &vec, index, offset);
>  out:
> -	if (zstrm)
> -		zcomp_stream_put(zram->comp);
>  	if (is_partial_io(bvec))
> -		kfree(uncmem);
> +		__free_page(page);
>  	return ret;
>  }
>  
> @@ -802,6 +836,7 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
>  	if (!is_write) {
>  		atomic64_inc(&zram->stats.num_reads);
>  		ret = zram_bvec_read(zram, bvec, index, offset);
> +		flush_dcache_page(bvec->bv_page);
>  	} else {
>  		atomic64_inc(&zram->stats.num_writes);
>  		ret = zram_bvec_write(zram, bvec, index, offset);
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] zram: use zram_slot_lock instead of raw bit_spin_lock op
  2017-04-03  5:17 ` [PATCH 3/5] zram: use zram_slot_lock instead of raw bit_spin_lock op Minchan Kim
@ 2017-04-03  6:08   ` Sergey Senozhatsky
  2017-04-03  6:34     ` Minchan Kim
  2017-04-04  2:18   ` Sergey Senozhatsky
  1 sibling, 1 reply; 25+ messages in thread
From: Sergey Senozhatsky @ 2017-04-03  6:08 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-kernel, Sergey Senozhatsky, kernel-team

Hello Minchan,

On (04/03/17 14:17), Minchan Kim wrote:
> With this clean-up phase, I want to use zram's wrapper function
> to lock table access which is more consistent with other zram's
> functions.

which reminds me of...

there was a discussion a long time ago, -rt people absolutely
hate bit spin_locks and they suggested us to replace it with
normal spin_locks (and I promised to take a look at it, but
got interrupted and never really returned back to it).

for !lockdep builds the impact is somewhat small; for lockdep
builds we increase the memory usage, but

a) lockdep builds are debug builds by definition, no one runs lockdep
   enabled kernels in production

b) we have lockdep in zram now, which is nice

c) spin_locks probably have better fairness guarantees


what do you think? can we, in this patch set, also replce bit
spin_locks with a normal spin_lock?

	-ss

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/5] zram: partial IO refactoring
  2017-04-03  5:52   ` Mika Penttilä
@ 2017-04-03  6:12     ` Minchan Kim
  2017-04-03  6:57       ` Mika Penttilä
  0 siblings, 1 reply; 25+ messages in thread
From: Minchan Kim @ 2017-04-03  6:12 UTC (permalink / raw)
  To: Mika Penttilä
  Cc: Andrew Morton, linux-kernel, Sergey Senozhatsky, kernel-team

Hi Mika,

On Mon, Apr 03, 2017 at 08:52:33AM +0300, Mika Penttilä wrote:
> 
> Hi!
> 
> On 04/03/2017 08:17 AM, Minchan Kim wrote:
> > For architecture(PAGE_SIZE > 4K), zram have supported partial IO.
> > However, the mixed code for handling normal/partial IO is too mess,
> > error-prone to modify IO handler functions with upcoming feature
> > so this patch aims for cleaning up zram's IO handling functions.
> > 
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  drivers/block/zram/zram_drv.c | 333 +++++++++++++++++++++++-------------------
> >  1 file changed, 184 insertions(+), 149 deletions(-)
> > 
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index 28c2836f8c96..7938f4b98b01 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -45,6 +45,8 @@ static const char *default_compressor = "lzo";
> >  /* Module params (documentation at end) */
> >  static unsigned int num_devices = 1;
> >  
> > +static void zram_free_page(struct zram *zram, size_t index);
> > +
> >  static inline bool init_done(struct zram *zram)
> >  {
> >  	return zram->disksize;
> > @@ -98,10 +100,17 @@ static void zram_set_obj_size(struct zram_meta *meta,
> >  	meta->table[index].value = (flags << ZRAM_FLAG_SHIFT) | size;
> >  }
> >  
> > +#if PAGE_SIZE != 4096
> >  static inline bool is_partial_io(struct bio_vec *bvec)
> >  {
> >  	return bvec->bv_len != PAGE_SIZE;
> >  }
> > +#else
> 
> For page size of 4096 bv_len can still be < 4096 and partial pages should be supported 
> (uncompress before write etc). ? 

zram declares this.

        #define ZRAM_LOGICAL_BLOCK_SIZE (1<<12)

	blk_queue_physical_block_size(zram->disk->queue, PAGE_SIZE);
	blk_queue_logical_block_size(zram->disk->queue,
					ZRAM_LOGICAL_BLOCK_SIZE);

So, I thought there is no such partial IO in 4096 page architecture.
Am I missing something? Could you tell the scenario if it happens?

Thanks!

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] zram: use zram_slot_lock instead of raw bit_spin_lock op
  2017-04-03  6:08   ` Sergey Senozhatsky
@ 2017-04-03  6:34     ` Minchan Kim
  2017-04-03  8:06       ` Sergey Senozhatsky
  0 siblings, 1 reply; 25+ messages in thread
From: Minchan Kim @ 2017-04-03  6:34 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, linux-kernel, Sergey Senozhatsky, kernel-team

Hi Sergey,

On Mon, Apr 03, 2017 at 03:08:58PM +0900, Sergey Senozhatsky wrote:
> Hello Minchan,
> 
> On (04/03/17 14:17), Minchan Kim wrote:
> > With this clean-up phase, I want to use zram's wrapper function
> > to lock table access which is more consistent with other zram's
> > functions.
> 
> which reminds me of...
> 
> there was a discussion a long time ago, -rt people absolutely
> hate bit spin_locks and they suggested us to replace it with
> normal spin_locks (and I promised to take a look at it, but
> got interrupted and never really returned back to it).
> 
> for !lockdep builds the impact is somewhat small; for lockdep
> builds we increase the memory usage, but
> 
> a) lockdep builds are debug builds by definition, no one runs lockdep
>    enabled kernels in production
> 
> b) we have lockdep in zram now, which is nice

It's really one I want to have.

> 
> c) spin_locks probably have better fairness guarantees

In fact, it wouldn't be an imporant because zram's slot lock contention
is not heavy.
> 
> 
> what do you think? can we, in this patch set, also replce bit
> spin_locks with a normal spin_lock?

With changing only zram side from bit_spin_lock to spin_lock,
it would be crippled. I mean zsmalloc should be changed, too
and it's really hard. :(

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/5] zram: partial IO refactoring
  2017-04-03  6:12     ` Minchan Kim
@ 2017-04-03  6:57       ` Mika Penttilä
  0 siblings, 0 replies; 25+ messages in thread
From: Mika Penttilä @ 2017-04-03  6:57 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-kernel, Sergey Senozhatsky, kernel-team

On 04/03/2017 09:12 AM, Minchan Kim wrote:
> Hi Mika,
> 
> On Mon, Apr 03, 2017 at 08:52:33AM +0300, Mika Penttilä wrote:
>>
>> Hi!
>>
>> On 04/03/2017 08:17 AM, Minchan Kim wrote:
>>> For architecture(PAGE_SIZE > 4K), zram have supported partial IO.
>>> However, the mixed code for handling normal/partial IO is too mess,
>>> error-prone to modify IO handler functions with upcoming feature
>>> so this patch aims for cleaning up zram's IO handling functions.
>>>
>>> Signed-off-by: Minchan Kim <minchan@kernel.org>
>>> ---
>>>  drivers/block/zram/zram_drv.c | 333 +++++++++++++++++++++++-------------------
>>>  1 file changed, 184 insertions(+), 149 deletions(-)
>>>
>>> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>>> index 28c2836f8c96..7938f4b98b01 100644
>>> --- a/drivers/block/zram/zram_drv.c
>>> +++ b/drivers/block/zram/zram_drv.c
>>> @@ -45,6 +45,8 @@ static const char *default_compressor = "lzo";
>>>  /* Module params (documentation at end) */
>>>  static unsigned int num_devices = 1;
>>>  
>>> +static void zram_free_page(struct zram *zram, size_t index);
>>> +
>>>  static inline bool init_done(struct zram *zram)
>>>  {
>>>  	return zram->disksize;
>>> @@ -98,10 +100,17 @@ static void zram_set_obj_size(struct zram_meta *meta,
>>>  	meta->table[index].value = (flags << ZRAM_FLAG_SHIFT) | size;
>>>  }
>>>  
>>> +#if PAGE_SIZE != 4096
>>>  static inline bool is_partial_io(struct bio_vec *bvec)
>>>  {
>>>  	return bvec->bv_len != PAGE_SIZE;
>>>  }
>>> +#else
>>
>> For page size of 4096 bv_len can still be < 4096 and partial pages should be supported 
>> (uncompress before write etc). ? 
> 
> zram declares this.
> 
>         #define ZRAM_LOGICAL_BLOCK_SIZE (1<<12)
> 
> 	blk_queue_physical_block_size(zram->disk->queue, PAGE_SIZE);
> 	blk_queue_logical_block_size(zram->disk->queue,
> 					ZRAM_LOGICAL_BLOCK_SIZE);
> 
> So, I thought there is no such partial IO in 4096 page architecture.
> Am I missing something? Could you tell the scenario if it happens?

I think you're right. At least swap operates with min 4096 sizes.

> 
> Thanks!
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] zram: use zram_slot_lock instead of raw bit_spin_lock op
  2017-04-03  6:34     ` Minchan Kim
@ 2017-04-03  8:06       ` Sergey Senozhatsky
  0 siblings, 0 replies; 25+ messages in thread
From: Sergey Senozhatsky @ 2017-04-03  8:06 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Andrew Morton, linux-kernel,
	Sergey Senozhatsky, kernel-team

On (04/03/17 15:34), Minchan Kim wrote:
[..]
> > c) spin_locks probably have better fairness guarantees
> 
> In fact, it wouldn't be an imporant because zram's slot lock contention
> is not heavy.

mostly agree. I think (and I may be mistaken) direct IO
causes contention; but direct IO is probably not a usual
zram workload.

> > what do you think? can we, in this patch set, also replce bit
> > spin_locks with a normal spin_lock?
> 
> With changing only zram side from bit_spin_lock to spin_lock,
> it would be crippled. I mean zsmalloc should be changed, too
> and it's really hard. :(

hm, good point.

	-ss

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/5] zram: handle multiple pages attached bio's bvec
  2017-04-03  5:17 ` [PATCH 1/5] zram: handle multiple pages attached bio's bvec Minchan Kim
@ 2017-04-03 22:45   ` Andrew Morton
  2017-04-03 23:13     ` Minchan Kim
  2017-04-04  4:55   ` Sergey Senozhatsky
  1 sibling, 1 reply; 25+ messages in thread
From: Andrew Morton @ 2017-04-03 22:45 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-kernel, Sergey Senozhatsky, kernel-team, Jens Axboe,
	Hannes Reinecke, Johannes Thumshirn

On Mon, 3 Apr 2017 14:17:29 +0900 Minchan Kim <minchan@kernel.org> wrote:

> Johannes Thumshirn reported system goes the panic when using NVMe over
> Fabrics loopback target with zram.
> 
> The reason is zram expects each bvec in bio contains a single page
> but nvme can attach a huge bulk of pages attached to the bio's bvec
> so that zram's index arithmetic could be wrong so that out-of-bound
> access makes panic.
> 
> It can be solved by limiting max_sectors with SECTORS_PER_PAGE like
> [1] but it makes zram slow because bio should split with each pages
> so this patch makes zram aware of multiple pages in a bvec so it
> could solve without any regression.
> 
> [1] 0bc315381fe9, zram: set physical queue limits to avoid array out of
>     bounds accesses

This isn't a cleanup - it fixes a panic (or is it a BUG or is it an
oops, or...)

How serious is this bug?  Should the fix be backported into -stable
kernels?  etc.

A better description of the bug's behaviour would be appropriate.

> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Hannes Reinecke <hare@suse.com>
> Reported-by: Johannes Thumshirn <jthumshirn@suse.de>
> Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> Signed-off-by: Minchan Kim <minchan@kernel.org>

This signoff trail is confusing.  It somewhat implies that Johannes
authored the patch which I don't think is the case?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/5] zram: handle multiple pages attached bio's bvec
  2017-04-03 22:45   ` Andrew Morton
@ 2017-04-03 23:13     ` Minchan Kim
  0 siblings, 0 replies; 25+ messages in thread
From: Minchan Kim @ 2017-04-03 23:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Minchan Kim, linux-kernel, Sergey Senozhatsky, kernel-team,
	Jens Axboe, Hannes Reinecke, Johannes Thumshirn

Hi Andrew,

On Mon, Apr 03, 2017 at 03:45:28PM -0700, Andrew Morton wrote:
> On Mon, 3 Apr 2017 14:17:29 +0900 Minchan Kim <minchan@kernel.org> wrote:
> 
> > Johannes Thumshirn reported system goes the panic when using NVMe over
> > Fabrics loopback target with zram.
> > 
> > The reason is zram expects each bvec in bio contains a single page
> > but nvme can attach a huge bulk of pages attached to the bio's bvec
> > so that zram's index arithmetic could be wrong so that out-of-bound
> > access makes panic.
> > 
> > It can be solved by limiting max_sectors with SECTORS_PER_PAGE like
> > [1] but it makes zram slow because bio should split with each pages
> > so this patch makes zram aware of multiple pages in a bvec so it
> > could solve without any regression.
> > 
> > [1] 0bc315381fe9, zram: set physical queue limits to avoid array out of
> >     bounds accesses
> 
> This isn't a cleanup - it fixes a panic (or is it a BUG or is it an
> oops, or...)

I should have written more carefully.
Johannes reported the problem with fix[1] and Jens already sent it to the
mainline to fix it. However, during the discussion, we can solve the problem
nice way so this is revert of [1] plus solving the problem with other way
which no need to split bio.

Thanks.

> 
> How serious is this bug?  Should the fix be backported into -stable
> kernels?  etc.
> 
> A better description of the bug's behaviour would be appropriate.
> 
> > Cc: Jens Axboe <axboe@kernel.dk>
> > Cc: Hannes Reinecke <hare@suse.com>
> > Reported-by: Johannes Thumshirn <jthumshirn@suse.de>
> > Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
> > Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
> > Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> 
> This signoff trail is confusing.  It somewhat implies that Johannes
> authored the patch which I don't think is the case?
> 
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/5] zram: partial IO refactoring
  2017-04-03  5:17 ` [PATCH 2/5] zram: partial IO refactoring Minchan Kim
  2017-04-03  5:52   ` Mika Penttilä
@ 2017-04-04  2:17   ` Sergey Senozhatsky
  2017-04-04  4:50     ` Minchan Kim
  1 sibling, 1 reply; 25+ messages in thread
From: Sergey Senozhatsky @ 2017-04-04  2:17 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-kernel, Sergey Senozhatsky, kernel-team

Hello,

On (04/03/17 14:17), Minchan Kim wrote:
> +static bool zram_special_page_read(struct zram *zram, u32 index,
> +				struct page *page,
> +				unsigned int offset, unsigned int len)
> +{
> +	struct zram_meta *meta = zram->meta;
> +
> +	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
> +	if (unlikely(!meta->table[index].handle) ||
> +			zram_test_flag(meta, index, ZRAM_SAME)) {
> +		void *mem;
> +
> +		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> +		mem = kmap_atomic(page);
> +		zram_fill_page(mem + offset, len, meta->table[index].element);
> +		kunmap_atomic(mem);
> +		return true;
> +	}
> +	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> +
> +	return false;
> +}
> +
> +static bool zram_special_page_write(struct zram *zram, u32 index,
> +					struct page *page)
> +{
> +	unsigned long element;
> +	void *mem = kmap_atomic(page);
> +
> +	if (page_same_filled(mem, &element)) {
> +		struct zram_meta *meta = zram->meta;
> +
> +		kunmap_atomic(mem);
> +		/* Free memory associated with this sector now. */
> +		bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
> +		zram_free_page(zram, index);
> +		zram_set_flag(meta, index, ZRAM_SAME);
> +		zram_set_element(meta, index, element);
> +		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> +
> +		atomic64_inc(&zram->stats.same_pages);
> +		return true;
> +	}
> +	kunmap_atomic(mem);
> +
> +	return false;
> +}

zram_special_page_read() and zram_special_page_write() have a slightly
different locking semantics.

zram_special_page_read() copy-out ZRAM_SAME page having slot unlocked
(can the slot got overwritten in the meantime?), while
zram_special_page_write() keeps the slot locked through out the entire
operation.

>  static void zram_meta_free(struct zram_meta *meta, u64 disksize)
>  {
>  	size_t num_pages = disksize >> PAGE_SHIFT;
> @@ -504,169 +548,104 @@ static void zram_free_page(struct zram *zram, size_t index)
>  	zram_set_obj_size(meta, index, 0);
>  }
>  
> -static int zram_decompress_page(struct zram *zram, char *mem, u32 index)
> +static int zram_decompress_page(struct zram *zram, struct page *page, u32 index)
>  {
> -	int ret = 0;
> -	unsigned char *cmem;
> -	struct zram_meta *meta = zram->meta;
> +	int ret;
>  	unsigned long handle;
>  	unsigned int size;
> +	void *src, *dst;
> +	struct zram_meta *meta = zram->meta;
> +
> +	if (zram_special_page_read(zram, index, page, 0, PAGE_SIZE))
> +		return 0;
>  
>  	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
>  	handle = meta->table[index].handle;
>  	size = zram_get_obj_size(meta, index);
>  
> -	if (!handle || zram_test_flag(meta, index, ZRAM_SAME)) {
> -		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> -		zram_fill_page(mem, PAGE_SIZE, meta->table[index].element);
> -		return 0;
> -	}
> -
> -	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
> +	src = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
>  	if (size == PAGE_SIZE) {
> -		copy_page(mem, cmem);
> +		dst = kmap_atomic(page);
> +		copy_page(dst, src);
> +		kunmap_atomic(dst);
> +		ret = 0;
>  	} else {
>  		struct zcomp_strm *zstrm = zcomp_stream_get(zram->comp);
>  
> -		ret = zcomp_decompress(zstrm, cmem, size, mem);
> +		dst = kmap_atomic(page);
> +		ret = zcomp_decompress(zstrm, src, size, dst);
> +		kunmap_atomic(dst);
>  		zcomp_stream_put(zram->comp);
>  	}
>  	zs_unmap_object(meta->mem_pool, handle);
>  	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
>  
>  	/* Should NEVER happen. Return bio error if it does. */
> -	if (unlikely(ret)) {
> +	if (unlikely(ret))
>  		pr_err("Decompression failed! err=%d, page=%u\n", ret, index);
> -		return ret;
> -	}
>  
> -	return 0;
> +	return ret;
>  }
>  
>  static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
> -			  u32 index, int offset)
> +				u32 index, int offset)
>  {
>  	int ret;
>  	struct page *page;
> -	unsigned char *user_mem, *uncmem = NULL;
> -	struct zram_meta *meta = zram->meta;
> -	page = bvec->bv_page;
>  
> -	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
> -	if (unlikely(!meta->table[index].handle) ||
> -			zram_test_flag(meta, index, ZRAM_SAME)) {
> -		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> -		handle_same_page(bvec, meta->table[index].element);
> +	page = bvec->bv_page;
> +	if (zram_special_page_read(zram, index, page, bvec->bv_offset,
> +				bvec->bv_len))

so, I think zram_bvec_read() path calls zram_special_page_read() twice:

  a) direct zram_special_page_read() call

  b) zram_decompress_page()->zram_special_page_read()

is it supposed to be so?

>  		return 0;
> -	}
> -	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> -
> -	if (is_partial_io(bvec))
> -		/* Use  a temporary buffer to decompress the page */
> -		uncmem = kmalloc(PAGE_SIZE, GFP_NOIO);
> -
> -	user_mem = kmap_atomic(page);
> -	if (!is_partial_io(bvec))
> -		uncmem = user_mem;
>  
> -	if (!uncmem) {
> -		pr_err("Unable to allocate temp memory\n");
> -		ret = -ENOMEM;
> -		goto out_cleanup;
> +	if (is_partial_io(bvec)) {
> +		/* Use a temporary buffer to decompress the page */
> +		page = alloc_page(GFP_NOIO|__GFP_HIGHMEM);
> +		if (!page)
> +			return -ENOMEM;
>  	}
>  
> -	ret = zram_decompress_page(zram, uncmem, index);
> -	/* Should NEVER happen. Return bio error if it does. */
> +	ret = zram_decompress_page(zram, page, index);
>  	if (unlikely(ret))
> -		goto out_cleanup;
> +		goto out;
>  
> -	if (is_partial_io(bvec))
> -		memcpy(user_mem + bvec->bv_offset, uncmem + offset,
> -				bvec->bv_len);
> +	if (is_partial_io(bvec)) {
> +		void *dst = kmap_atomic(bvec->bv_page);
> +		void *src = kmap_atomic(page);
>  
> -	flush_dcache_page(page);
> -	ret = 0;
> -out_cleanup:
> -	kunmap_atomic(user_mem);
> +		memcpy(dst + bvec->bv_offset, src + offset, bvec->bv_len);
> +		kunmap_atomic(src);
> +		kunmap_atomic(dst);
> +	}
> +out:
>  	if (is_partial_io(bvec))
> -		kfree(uncmem);
> +		__free_page(page);
> +
>  	return ret;
>  }
>  
> -static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
> -			   int offset)
> +static int zram_compress(struct zram *zram, struct zcomp_strm **zstrm,
> +			struct page *page,
> +			unsigned long *out_handle, unsigned int *out_comp_len)

ok, I see why... not super happy with the `zstrm' magic, but OK. can do.

	-ss

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] zram: use zram_slot_lock instead of raw bit_spin_lock op
  2017-04-03  5:17 ` [PATCH 3/5] zram: use zram_slot_lock instead of raw bit_spin_lock op Minchan Kim
  2017-04-03  6:08   ` Sergey Senozhatsky
@ 2017-04-04  2:18   ` Sergey Senozhatsky
  2017-04-04  4:50     ` Minchan Kim
  1 sibling, 1 reply; 25+ messages in thread
From: Sergey Senozhatsky @ 2017-04-04  2:18 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-kernel, Sergey Senozhatsky, kernel-team

On (04/03/17 14:17), Minchan Kim wrote:
> With this clean-up phase, I want to use zram's wrapper function
> to lock table access which is more consistent with other zram's
> functions.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>

Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

	-ss

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] zram: remove zram_meta structure
  2017-04-03  5:17 ` [PATCH 4/5] zram: remove zram_meta structure Minchan Kim
@ 2017-04-04  2:31   ` Sergey Senozhatsky
  2017-04-04  4:52     ` Minchan Kim
  2017-04-04  5:40     ` Minchan Kim
  0 siblings, 2 replies; 25+ messages in thread
From: Sergey Senozhatsky @ 2017-04-04  2:31 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-kernel, Sergey Senozhatsky, kernel-team

On (04/03/17 14:17), Minchan Kim wrote:
[..]
> -static struct zram_meta *zram_meta_alloc(char *pool_name, u64 disksize)
> +static bool zram_meta_alloc(struct zram *zram, u64 disksize)
>  {
>  	size_t num_pages;
> -	struct zram_meta *meta = kmalloc(sizeof(*meta), GFP_KERNEL);
> -
> -	if (!meta)
> -		return NULL;
>  
>  	num_pages = disksize >> PAGE_SHIFT;
> -	meta->table = vzalloc(num_pages * sizeof(*meta->table));
> -	if (!meta->table) {
> -		pr_err("Error allocating zram address table\n");
> -		goto out_error;
> -	}
> +	zram->table = vzalloc(num_pages * sizeof(*zram->table));
> +	if (!zram->table)
> +		return false;
>  
> -	meta->mem_pool = zs_create_pool(pool_name);
> -	if (!meta->mem_pool) {
> -		pr_err("Error creating memory pool\n");
> -		goto out_error;
> +	zram->mem_pool = zs_create_pool(zram->disk->disk_name);
> +	if (!zram->mem_pool) {
> +		vfree(zram->table);
> +		return false;
>  	}
>  
> -	return meta;
> -
> -out_error:
> -	vfree(meta->table);
> -	kfree(meta);
> -	return NULL;
> +	return true;
>  }

[..]

> @@ -1020,7 +989,6 @@ static ssize_t disksize_store(struct device *dev,
>  {
>  	u64 disksize;
>  	struct zcomp *comp;
> -	struct zram_meta *meta;
>  	struct zram *zram = dev_to_zram(dev);
>  	int err;
>  
> @@ -1029,8 +997,7 @@ static ssize_t disksize_store(struct device *dev,
>  		return -EINVAL;
>  
>  	disksize = PAGE_ALIGN(disksize);
> -	meta = zram_meta_alloc(zram->disk->disk_name, disksize);
> -	if (!meta)
> +	if (!zram_meta_alloc(zram, disksize))
>  		return -ENOMEM;
>  
>  	comp = zcomp_create(zram->compressor);
> @@ -1048,7 +1015,6 @@ static ssize_t disksize_store(struct device *dev,
>  		goto out_destroy_comp;
>  	}
>  
> -	zram->meta = meta;
>  	zram->comp = comp;
>  	zram->disksize = disksize;
>  	set_capacity(zram->disk, zram->disksize >> SECTOR_SHIFT);
> @@ -1061,7 +1027,7 @@ static ssize_t disksize_store(struct device *dev,
>  	up_write(&zram->init_lock);
>  	zcomp_destroy(comp);
>  out_free_meta:
> -	zram_meta_free(meta, disksize);
> +	zram_meta_free(zram, disksize);
>  	return err;
>  }

OK, I don't think it's the same.

we used to have

	struct zram_meta *zram_meta_alloc()
	{
		meta->table = vzalloc()
		meta->mem_pool = zs_create_pool();
		return meta;
	}


	disksize_store()
	{
		meta = zram_meta_alloc();
		if (init_done(zram)) {
			pr_info("Cannot change disksize for initialized device\n");
			goto out_destroy_comp;
		}

		zram->meta = meta;
		^^^^^^^^^^^^^^^^^^
	}

now we have

	struct zram_meta *zram_meta_alloc()
	{
		zram->table = vzalloc()
		zram->mem_pool = zs_create_pool();
		return true;
	}


	disksize_store()
	{
		zram_meta_alloc();
		^^^^^^^^^^^^^^^^^
		if (init_done(zram)) {
			pr_info("Cannot change disksize for initialized device\n");
			goto out_destroy_comp;
		}
	}


by the time we call init_done(zram) on already init device zram->table
and zram->mem_pool are overwritten and lost. right?


[..]
> -struct zram_meta {
> +struct zram {
>  	struct zram_table_entry *table;
>  	struct zs_pool *mem_pool;
> -};
> -
> -struct zram {
> -	struct zram_meta *meta;
>  	struct zcomp *comp;
>  	struct gendisk *disk;
>  	/* Prevent concurrent execution of device init */


we still have several zram_meta_FOO() left overs in zram_drv.c

	-ss

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/5] zram: introduce zram data accessor
  2017-04-03  5:17 ` [PATCH 5/5] zram: introduce zram data accessor Minchan Kim
@ 2017-04-04  4:40   ` Sergey Senozhatsky
  0 siblings, 0 replies; 25+ messages in thread
From: Sergey Senozhatsky @ 2017-04-04  4:40 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-kernel, Sergey Senozhatsky, kernel-team

On (04/03/17 14:17), Minchan Kim wrote:
> With element, sometime I got confused handle and element access.
> It might be my bad but I think it's time to introduce accessor
> to prevent future idiot like me.
> This patch is just clean-up patch so it shouldn't change any
> behavior.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>

Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

	-ss

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/5] zram: partial IO refactoring
  2017-04-04  2:17   ` Sergey Senozhatsky
@ 2017-04-04  4:50     ` Minchan Kim
  0 siblings, 0 replies; 25+ messages in thread
From: Minchan Kim @ 2017-04-04  4:50 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, linux-kernel, Sergey Senozhatsky, kernel-team

Hi Sergey,

On Tue, Apr 04, 2017 at 11:17:06AM +0900, Sergey Senozhatsky wrote:
> Hello,
> 
> On (04/03/17 14:17), Minchan Kim wrote:
> > +static bool zram_special_page_read(struct zram *zram, u32 index,
> > +				struct page *page,
> > +				unsigned int offset, unsigned int len)
> > +{
> > +	struct zram_meta *meta = zram->meta;
> > +
> > +	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
> > +	if (unlikely(!meta->table[index].handle) ||
> > +			zram_test_flag(meta, index, ZRAM_SAME)) {
> > +		void *mem;
> > +
> > +		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> > +		mem = kmap_atomic(page);
> > +		zram_fill_page(mem + offset, len, meta->table[index].element);
> > +		kunmap_atomic(mem);
> > +		return true;
> > +	}
> > +	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> > +
> > +	return false;
> > +}
> > +
> > +static bool zram_special_page_write(struct zram *zram, u32 index,
> > +					struct page *page)
> > +{
> > +	unsigned long element;
> > +	void *mem = kmap_atomic(page);
> > +
> > +	if (page_same_filled(mem, &element)) {
> > +		struct zram_meta *meta = zram->meta;
> > +
> > +		kunmap_atomic(mem);
> > +		/* Free memory associated with this sector now. */
> > +		bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
> > +		zram_free_page(zram, index);
> > +		zram_set_flag(meta, index, ZRAM_SAME);
> > +		zram_set_element(meta, index, element);
> > +		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> > +
> > +		atomic64_inc(&zram->stats.same_pages);
> > +		return true;
> > +	}
> > +	kunmap_atomic(mem);
> > +
> > +	return false;
> > +}
> 
> zram_special_page_read() and zram_special_page_write() have a slightly
> different locking semantics.
> 
> zram_special_page_read() copy-out ZRAM_SAME page having slot unlocked
> (can the slot got overwritten in the meantime?), while

IMHO, yes, it can be overwritten but it doesn't make corruption of kernel.
I mean if such race happens, it's user fault who should protect the race.
zRAM is dumb block device so it can read/write block user request but
one thing we should keep the promise is it shouldn't corrupt the kernel.
Such pov, zram_special_page_read wouldn't be a problem to return
stale data, I think.

> zram_special_page_write() keeps the slot locked through out the entire
> operation.

zram_special_page_write is something different because it updates
zram_table's slot via zram_set_[flag|element] so it should be protected
by zram.

> 
> >  static void zram_meta_free(struct zram_meta *meta, u64 disksize)
> >  {
> >  	size_t num_pages = disksize >> PAGE_SHIFT;
> > @@ -504,169 +548,104 @@ static void zram_free_page(struct zram *zram, size_t index)
> >  	zram_set_obj_size(meta, index, 0);
> >  }
> >  
> > -static int zram_decompress_page(struct zram *zram, char *mem, u32 index)
> > +static int zram_decompress_page(struct zram *zram, struct page *page, u32 index)
> >  {
> > -	int ret = 0;
> > -	unsigned char *cmem;
> > -	struct zram_meta *meta = zram->meta;
> > +	int ret;
> >  	unsigned long handle;
> >  	unsigned int size;
> > +	void *src, *dst;
> > +	struct zram_meta *meta = zram->meta;
> > +
> > +	if (zram_special_page_read(zram, index, page, 0, PAGE_SIZE))
> > +		return 0;
> >  
> >  	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
> >  	handle = meta->table[index].handle;
> >  	size = zram_get_obj_size(meta, index);
> >  
> > -	if (!handle || zram_test_flag(meta, index, ZRAM_SAME)) {
> > -		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> > -		zram_fill_page(mem, PAGE_SIZE, meta->table[index].element);
> > -		return 0;
> > -	}
> > -
> > -	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
> > +	src = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
> >  	if (size == PAGE_SIZE) {
> > -		copy_page(mem, cmem);
> > +		dst = kmap_atomic(page);
> > +		copy_page(dst, src);
> > +		kunmap_atomic(dst);
> > +		ret = 0;
> >  	} else {
> >  		struct zcomp_strm *zstrm = zcomp_stream_get(zram->comp);
> >  
> > -		ret = zcomp_decompress(zstrm, cmem, size, mem);
> > +		dst = kmap_atomic(page);
> > +		ret = zcomp_decompress(zstrm, src, size, dst);
> > +		kunmap_atomic(dst);
> >  		zcomp_stream_put(zram->comp);
> >  	}
> >  	zs_unmap_object(meta->mem_pool, handle);
> >  	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> >  
> >  	/* Should NEVER happen. Return bio error if it does. */
> > -	if (unlikely(ret)) {
> > +	if (unlikely(ret))
> >  		pr_err("Decompression failed! err=%d, page=%u\n", ret, index);
> > -		return ret;
> > -	}
> >  
> > -	return 0;
> > +	return ret;
> >  }
> >  
> >  static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
> > -			  u32 index, int offset)
> > +				u32 index, int offset)
> >  {
> >  	int ret;
> >  	struct page *page;
> > -	unsigned char *user_mem, *uncmem = NULL;
> > -	struct zram_meta *meta = zram->meta;
> > -	page = bvec->bv_page;
> >  
> > -	bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
> > -	if (unlikely(!meta->table[index].handle) ||
> > -			zram_test_flag(meta, index, ZRAM_SAME)) {
> > -		bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
> > -		handle_same_page(bvec, meta->table[index].element);
> > +	page = bvec->bv_page;
> > +	if (zram_special_page_read(zram, index, page, bvec->bv_offset,
> > +				bvec->bv_len))
> 
> so, I think zram_bvec_read() path calls zram_special_page_read() twice:
> 
>   a) direct zram_special_page_read() call
> 
>   b) zram_decompress_page()->zram_special_page_read()
> 
> is it supposed to be so?

Yes, Because zram_decompress_page is called by zram_bvec_write
in case of partial IO. Maybe, we makes it simple with removing
zram_special_page_read in zram_bvec_read. I will look.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] zram: use zram_slot_lock instead of raw bit_spin_lock op
  2017-04-04  2:18   ` Sergey Senozhatsky
@ 2017-04-04  4:50     ` Minchan Kim
  0 siblings, 0 replies; 25+ messages in thread
From: Minchan Kim @ 2017-04-04  4:50 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, linux-kernel, Sergey Senozhatsky, kernel-team

On Tue, Apr 04, 2017 at 11:18:51AM +0900, Sergey Senozhatsky wrote:
> On (04/03/17 14:17), Minchan Kim wrote:
> > With this clean-up phase, I want to use zram's wrapper function
> > to lock table access which is more consistent with other zram's
> > functions.
> > 
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> 
> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

Thanks!

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] zram: remove zram_meta structure
  2017-04-04  2:31   ` Sergey Senozhatsky
@ 2017-04-04  4:52     ` Minchan Kim
  2017-04-04  5:40     ` Minchan Kim
  1 sibling, 0 replies; 25+ messages in thread
From: Minchan Kim @ 2017-04-04  4:52 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, linux-kernel, Sergey Senozhatsky, kernel-team

On Tue, Apr 04, 2017 at 11:31:15AM +0900, Sergey Senozhatsky wrote:
> On (04/03/17 14:17), Minchan Kim wrote:
> [..]
> > -static struct zram_meta *zram_meta_alloc(char *pool_name, u64 disksize)
> > +static bool zram_meta_alloc(struct zram *zram, u64 disksize)
> >  {
> >  	size_t num_pages;
> > -	struct zram_meta *meta = kmalloc(sizeof(*meta), GFP_KERNEL);
> > -
> > -	if (!meta)
> > -		return NULL;
> >  
> >  	num_pages = disksize >> PAGE_SHIFT;
> > -	meta->table = vzalloc(num_pages * sizeof(*meta->table));
> > -	if (!meta->table) {
> > -		pr_err("Error allocating zram address table\n");
> > -		goto out_error;
> > -	}
> > +	zram->table = vzalloc(num_pages * sizeof(*zram->table));
> > +	if (!zram->table)
> > +		return false;
> >  
> > -	meta->mem_pool = zs_create_pool(pool_name);
> > -	if (!meta->mem_pool) {
> > -		pr_err("Error creating memory pool\n");
> > -		goto out_error;
> > +	zram->mem_pool = zs_create_pool(zram->disk->disk_name);
> > +	if (!zram->mem_pool) {
> > +		vfree(zram->table);
> > +		return false;
> >  	}
> >  
> > -	return meta;
> > -
> > -out_error:
> > -	vfree(meta->table);
> > -	kfree(meta);
> > -	return NULL;
> > +	return true;
> >  }
> 
> [..]
> 
> > @@ -1020,7 +989,6 @@ static ssize_t disksize_store(struct device *dev,
> >  {
> >  	u64 disksize;
> >  	struct zcomp *comp;
> > -	struct zram_meta *meta;
> >  	struct zram *zram = dev_to_zram(dev);
> >  	int err;
> >  
> > @@ -1029,8 +997,7 @@ static ssize_t disksize_store(struct device *dev,
> >  		return -EINVAL;
> >  
> >  	disksize = PAGE_ALIGN(disksize);
> > -	meta = zram_meta_alloc(zram->disk->disk_name, disksize);
> > -	if (!meta)
> > +	if (!zram_meta_alloc(zram, disksize))
> >  		return -ENOMEM;
> >  
> >  	comp = zcomp_create(zram->compressor);
> > @@ -1048,7 +1015,6 @@ static ssize_t disksize_store(struct device *dev,
> >  		goto out_destroy_comp;
> >  	}
> >  
> > -	zram->meta = meta;
> >  	zram->comp = comp;
> >  	zram->disksize = disksize;
> >  	set_capacity(zram->disk, zram->disksize >> SECTOR_SHIFT);
> > @@ -1061,7 +1027,7 @@ static ssize_t disksize_store(struct device *dev,
> >  	up_write(&zram->init_lock);
> >  	zcomp_destroy(comp);
> >  out_free_meta:
> > -	zram_meta_free(meta, disksize);
> > +	zram_meta_free(zram, disksize);
> >  	return err;
> >  }
> 
> OK, I don't think it's the same.
> 
> we used to have
> 
> 	struct zram_meta *zram_meta_alloc()
> 	{
> 		meta->table = vzalloc()
> 		meta->mem_pool = zs_create_pool();
> 		return meta;
> 	}
> 
> 
> 	disksize_store()
> 	{
> 		meta = zram_meta_alloc();
> 		if (init_done(zram)) {
> 			pr_info("Cannot change disksize for initialized device\n");
> 			goto out_destroy_comp;
> 		}
> 
> 		zram->meta = meta;
> 		^^^^^^^^^^^^^^^^^^
> 	}
> 
> now we have
> 
> 	struct zram_meta *zram_meta_alloc()
> 	{
> 		zram->table = vzalloc()
> 		zram->mem_pool = zs_create_pool();
> 		return true;
> 	}
> 
> 
> 	disksize_store()
> 	{
> 		zram_meta_alloc();
> 		^^^^^^^^^^^^^^^^^
> 		if (init_done(zram)) {
> 			pr_info("Cannot change disksize for initialized device\n");
> 			goto out_destroy_comp;
> 		}
> 	}
> 
> 
> by the time we call init_done(zram) on already init device zram->table
> and zram->mem_pool are overwritten and lost. right?

Right you are.
I will fix it!

> 
> 
> [..]
> > -struct zram_meta {
> > +struct zram {
> >  	struct zram_table_entry *table;
> >  	struct zs_pool *mem_pool;
> > -};
> > -
> > -struct zram {
> > -	struct zram_meta *meta;
> >  	struct zcomp *comp;
> >  	struct gendisk *disk;
> >  	/* Prevent concurrent execution of device init */
> 
> 
> we still have several zram_meta_FOO() left overs in zram_drv.c
> 
> 	-ss

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/5] zram: handle multiple pages attached bio's bvec
  2017-04-03  5:17 ` [PATCH 1/5] zram: handle multiple pages attached bio's bvec Minchan Kim
  2017-04-03 22:45   ` Andrew Morton
@ 2017-04-04  4:55   ` Sergey Senozhatsky
  1 sibling, 0 replies; 25+ messages in thread
From: Sergey Senozhatsky @ 2017-04-04  4:55 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-kernel, Sergey Senozhatsky, kernel-team,
	Jens Axboe, Hannes Reinecke, Johannes Thumshirn

On (04/03/17 14:17), Minchan Kim wrote:
> Johannes Thumshirn reported system goes the panic when using NVMe over
> Fabrics loopback target with zram.
> 
> The reason is zram expects each bvec in bio contains a single page
> but nvme can attach a huge bulk of pages attached to the bio's bvec
> so that zram's index arithmetic could be wrong so that out-of-bound
> access makes panic.
> 
> It can be solved by limiting max_sectors with SECTORS_PER_PAGE like
> [1] but it makes zram slow because bio should split with each pages
> so this patch makes zram aware of multiple pages in a bvec so it
> could solve without any regression.
> 
> [1] 0bc315381fe9, zram: set physical queue limits to avoid array out of
>     bounds accesses
> 
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Hannes Reinecke <hare@suse.com>
> Reported-by: Johannes Thumshirn <jthumshirn@suse.de>
> Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> Signed-off-by: Minchan Kim <minchan@kernel.org>

Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>


> +		unsigned int remained = bvec.bv_len;
...
> +		} while (remained);

a tiny nitpick,  "-ed" in variable name looks a bit unusual.

	-ss

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] zram: remove zram_meta structure
  2017-04-04  2:31   ` Sergey Senozhatsky
  2017-04-04  4:52     ` Minchan Kim
@ 2017-04-04  5:40     ` Minchan Kim
  2017-04-04  5:54       ` Sergey Senozhatsky
  1 sibling, 1 reply; 25+ messages in thread
From: Minchan Kim @ 2017-04-04  5:40 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, linux-kernel, Sergey Senozhatsky, kernel-team

On Tue, Apr 04, 2017 at 11:31:15AM +0900, Sergey Senozhatsky wrote:
< snip >

> 
> [..]
> > -struct zram_meta {
> > +struct zram {
> >  	struct zram_table_entry *table;
> >  	struct zs_pool *mem_pool;
> > -};
> > -
> > -struct zram {
> > -	struct zram_meta *meta;
> >  	struct zcomp *comp;
> >  	struct gendisk *disk;
> >  	/* Prevent concurrent execution of device init */
> 
> 
> we still have several zram_meta_FOO() left overs in zram_drv.c

I intentionally used that term because I don't think better name
to wrap logis which allocates table and mempool.
Feel free to suggest if you have better.

Thanks.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] zram: remove zram_meta structure
  2017-04-04  5:40     ` Minchan Kim
@ 2017-04-04  5:54       ` Sergey Senozhatsky
  0 siblings, 0 replies; 25+ messages in thread
From: Sergey Senozhatsky @ 2017-04-04  5:54 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Andrew Morton, linux-kernel,
	Sergey Senozhatsky, kernel-team

On (04/04/17 14:40), Minchan Kim wrote:
> > [..]
> > > -struct zram_meta {
> > > +struct zram {
> > >  	struct zram_table_entry *table;
> > >  	struct zs_pool *mem_pool;
> > > -};
> > > -
> > > -struct zram {
> > > -	struct zram_meta *meta;
> > >  	struct zcomp *comp;
> > >  	struct gendisk *disk;
> > >  	/* Prevent concurrent execution of device init */
> > 
> > 
> > we still have several zram_meta_FOO() left overs in zram_drv.c
> 
> I intentionally used that term because I don't think better name
> to wrap logis which allocates table and mempool.

ah, it was intentional. um, OK, let's keep zram_meta_foo().

	-ss

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/5] zram clean up
  2017-04-03  5:17 [PATCH 0/5] zram clean up Minchan Kim
                   ` (4 preceding siblings ...)
  2017-04-03  5:17 ` [PATCH 5/5] zram: introduce zram data accessor Minchan Kim
@ 2017-04-11  5:38 ` Minchan Kim
  5 siblings, 0 replies; 25+ messages in thread
From: Minchan Kim @ 2017-04-11  5:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Sergey Senozhatsky, kernel-team

Hi Andrew,

Could you drop this patchset?

I will send updated version based on Sergey with some bug fixes.

zram-handle-multiple-pages-attached-bios-bvec.patch
zram-partial-io-refactoring.patch
zram-use-zram_slot_lock-instead-of-raw-bit_spin_lock-op.patch
zram-remove-zram_meta-structure.patch
zram-introduce-zram-data-accessor.patch

Thanks!

On Mon, Apr 03, 2017 at 02:17:28PM +0900, Minchan Kim wrote:
> This patchset aims zram clean-up.
> 
> [1] clean up multiple pages's bvec handling.
> [2] clean up partial IO handling
> [3-5] clean up zram via using accessor and removing pointless structure.
> 
> With [2-5] applied, we can get a few hundred bytes as well as huge
> readibility enhance.
> 
> This patchset is based on v4.11-rc4-mmotm-2017-03-30-16-31 and
> *drop* zram-factor-out-partial-io-routine.patch.
> 
> x86: 708 byte save
> 
>     add/remove: 1/1 grow/shrink: 0/11 up/down: 478/-1186 (-708)
>     function                                     old     new   delta
>     zram_special_page_read                         -     478    +478
>     zram_reset_device                            317     314      -3
>     mem_used_max_store                           131     128      -3
>     compact_store                                 96      93      -3
>     mm_stat_show                                 203     197      -6
>     zram_add                                     719     712      -7
>     zram_slot_free_notify                        229     214     -15
>     zram_make_request                            819     803     -16
>     zram_meta_free                               128     111     -17
>     zram_free_page                               180     151     -29
>     disksize_store                               432     361     -71
>     zram_decompress_page.isra                    504       -    -504
>     zram_bvec_rw                                2592    2080    -512
>     Total: Before=25350773, After=25350065, chg -0.00%
>     
> ppc64: 231 byte save
>     
>     add/remove: 2/0 grow/shrink: 1/9 up/down: 681/-912 (-231)
>     function                                     old     new   delta
>     zram_special_page_read                         -     480    +480
>     zram_slot_lock                                 -     200    +200
>     vermagic                                      39      40      +1
>     mm_stat_show                                 256     248      -8
>     zram_meta_free                               200     184     -16
>     zram_add                                     944     912     -32
>     zram_free_page                               348     308     -40
>     disksize_store                               572     492     -80
>     zram_decompress_page                         664     564    -100
>     zram_slot_free_notify                        292     160    -132
>     zram_make_request                           1132    1000    -132
>     zram_bvec_rw                                2768    2396    -372
>     Total: Before=17565825, After=17565594, chg -0.00%
> 
> Minchan Kim (5):
>   [1] zram: handle multiple pages attached bio's bvec
>   [2] zram: partial IO refactoring
>   [3] zram: use zram_slot_lock instead of raw bit_spin_lock op
>   [4] zram: remove zram_meta structure
>   [5] zram: introduce zram data accessor
> 
>  drivers/block/zram/zram_drv.c | 532 +++++++++++++++++++++---------------------
>  drivers/block/zram/zram_drv.h |   6 +-
>  2 files changed, 270 insertions(+), 268 deletions(-)
> 
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2017-04-11  5:38 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-03  5:17 [PATCH 0/5] zram clean up Minchan Kim
2017-04-03  5:17 ` [PATCH 1/5] zram: handle multiple pages attached bio's bvec Minchan Kim
2017-04-03 22:45   ` Andrew Morton
2017-04-03 23:13     ` Minchan Kim
2017-04-04  4:55   ` Sergey Senozhatsky
2017-04-03  5:17 ` [PATCH 2/5] zram: partial IO refactoring Minchan Kim
2017-04-03  5:52   ` Mika Penttilä
2017-04-03  6:12     ` Minchan Kim
2017-04-03  6:57       ` Mika Penttilä
2017-04-04  2:17   ` Sergey Senozhatsky
2017-04-04  4:50     ` Minchan Kim
2017-04-03  5:17 ` [PATCH 3/5] zram: use zram_slot_lock instead of raw bit_spin_lock op Minchan Kim
2017-04-03  6:08   ` Sergey Senozhatsky
2017-04-03  6:34     ` Minchan Kim
2017-04-03  8:06       ` Sergey Senozhatsky
2017-04-04  2:18   ` Sergey Senozhatsky
2017-04-04  4:50     ` Minchan Kim
2017-04-03  5:17 ` [PATCH 4/5] zram: remove zram_meta structure Minchan Kim
2017-04-04  2:31   ` Sergey Senozhatsky
2017-04-04  4:52     ` Minchan Kim
2017-04-04  5:40     ` Minchan Kim
2017-04-04  5:54       ` Sergey Senozhatsky
2017-04-03  5:17 ` [PATCH 5/5] zram: introduce zram data accessor Minchan Kim
2017-04-04  4:40   ` Sergey Senozhatsky
2017-04-11  5:38 ` [PATCH 0/5] zram clean up Minchan Kim

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.