Date: Thu, 23 Mar 2017 12:04:23 +0900
From: Joonsoo Kim
To: Minchan Kim
CC: Andrew Morton, Sergey Senozhatsky
Subject: Re: [PATCH 3/4] zram: implement deduplication in zram
Message-ID: <20170323030423.GB17486@js1304-P5Q-DELUXE>
References: <1489632398-31501-1-git-send-email-iamjoonsoo.kim@lge.com> <1489632398-31501-4-git-send-email-iamjoonsoo.kim@lge.com> <20170321234121.GA30149@bbox>
In-Reply-To: <20170321234121.GA30149@bbox>

On Wed, Mar 22, 2017 at 08:41:21AM +0900, Minchan Kim wrote:
> Hi Joonsoo,
>
> On Thu, Mar 16, 2017 at 11:46:37AM +0900, js1304@gmail.com wrote:
> > From: Joonsoo Kim
> >
> > This patch implements a deduplication feature in zram. The purpose
> > of this work is, naturally, to reduce the amount of memory used by zram.
> >
> > Android is one of the biggest users of zram as swap, and reducing
> > the amount of memory used there is really important.
There is a paper
> > reporting that the duplication ratio of Android's memory content is
> > rather high [1]. And there is similar work on zswap reporting that
> > experiments have shown that around 10-15% of the pages stored in
> > zswap are duplicates, and that deduplicating them provides some
> > benefit [2].
> >
> > Also, there is a different kind of workload that uses zram as a blockdev
> > and stores build outputs on it to reduce the wear-out problem of a real
> > blockdev. In this workload, the deduplication hit ratio is very high,
> > although I don't know the exact reason for it.
>
> Hmm, isn't it because the binary is composed of object files, so that
> much of the content is shared between the binary and the objects?
>
> I'd like to clear this up, because dedup was not useful for the swap
> workload in our Android testing, unlike the papers you mentioned.
> Of course, it depends on the workload, so someone might find it
> hugely useful for his swap workload. However, I want to focus on a clear
> winning-case scenario rather than "might be better".
>
> That's why I want to know the clear reason for the saving. If my assumption
> is right (i.e., object file vs. binary file), it would be enough
> justification to merge this feature, because users can turn on the feature
> with a reasonable expectation.

Okay. I checked the reason for the benefit on the kernel build now.
There are some cases where deduplication happens.

1) *.cmd
The build command is usually similar within one directory, so the contents
of these files are very similar. Please check fs/ext4/.namei.o.cmd and
fs/ext4/.inode.o.cmd. In my system, more than 789 lines are identical
out of the 944 and 939 lines of those files, respectively.

2) object files
built-in.o and temporary object files have similar content. More than 50%
of fs/ext4/ext4.o is the same as fs/ext4/built-in.o.

I think that we can optimize this case. ext4.o is a temporary object file
and isn't strictly needed. We could change fs/ext4/Makefile as follows.
orig>
obj-$(CONFIG_EXT4_FS) += ext4.o
ext4-y := XXX YYY ZZZ

optimized>
obj-$(CONFIG_EXT4_FS) += XXX YYY ZZZ

3) vmlinux
.tmp_vmlinux1, .tmp_vmlinux2 and arch/x86/boot/compressed/vmlinux.bin
have similar content.

I did a similar check for Android and it also has similar cases. Some
object files (.class and .so) are similar to other object files.
(./host/linux-x86/lib/libartd.so and ./host/linux-x86/lib/libartd-compiler.so)

> >
> > Anyway, if we can detect duplicated content and avoid storing the
> > duplicated content at a different memory location, we can save memory.
> > This patch tries to do that.
> >
> > The implementation is mostly simple and intuitive, but I should note
> > one implementation detail.
> >
> > To check for duplication, this patch uses a checksum of the page, and
> > collisions of this checksum are possible. There would be
> > many ways to handle this situation, but this patch chooses
> > to allow entries with a duplicated checksum to be added to the hash,
> > but not to compare all entries with a duplicated checksum
> > when checking for duplication. I guess that checksum collision is quite

> Hmm, if there are many entries with the same checksum, which specific
> one is compared among them?

I implemented it to check just the first one.

> > rare event and we don't need to pay any attention to such a case.

> If it's a rare event, why can't we iterate over all of the entries?

It's possible. If you prefer it, I can do it.

> > Therefore, I decided on the simplest way to implement the feature.
> > If there is a different opinion, I can accept it and go that way.
> >
> > Following is the result of this patch.
> >
> > Test result #1 (Swap):
> > Android Marshmallow, emulator, x86_64, backported to kernel v3.18
> >
> > orig_data_size: 145297408
> > compr_data_size: 32408125
> > mem_used_total: 32276480
> > dup_data_size: 3188134
> > meta_data_size: 1444272
> >
> > The last two metrics added to mm_stat are related to this work.
> > The first one, dup_data_size, is the amount of memory saved by avoiding
> > storing duplicated pages. The latter, meta_data_size, is the amount of
> > memory used by the data structures that support deduplication. If
> > dup > meta, we can judge that the patch improves memory usage.
> >
> > On Android, we can save 5% of memory usage with this work.
> >
> > Test result #2 (Blockdev):
> > build the kernel and store the output to an ext4 FS on zram
> >
> > (without dedup)
> > Elapsed time: 249 s
> > mm_stat: 430845952 191014886 196898816 0 196898816 28320 0 0 0
> >
> > (with dedup)
> > Elapsed time: 250 s
> > mm_stat: 430505984 190971334 148365312 0 148365312 28404 0 47287038 3945792
> >
> > There is no performance degradation, and it saves 23% of memory.
> >
> > Test result #3 (Blockdev):
> > copy the Android build output dir (out/host) to an ext4 FS on zram
> >
> > (without dedup)
> > Elapsed time: out/host: 88 s
> > mm_stat: 8834420736 3658184579 3834208256 0 3834208256 32889 0 0 0
> >
> > (with dedup)
> > Elapsed time: out/host: 100 s
> > mm_stat: 8832929792 3657329322 2832015360 0 2832015360 32609 0 952568877 80880336
> >
> > It shows roughly 13% performance degradation while saving 24% of memory.
> > Maybe this is due to the overhead of checksum calculation and comparison.
> >
> > Test result #4 (Blockdev):
> > copy the Android build output dir (out/target/common) to an ext4 FS on zram
> >
> > (without dedup)
> > Elapsed time: out/host: 203 s
> > mm_stat: 4041678848 2310355010 2346577920 0 2346582016 500 4 0 0
> >
> > (with dedup)
> > Elapsed time: out/host: 201 s
> > mm_stat: 4041666560 2310488276 1338150912 0 1338150912 476 0 989088794 24564336
> >
> > Memory is saved by 42% and performance is the same. Even though there is
> > overhead from checksum calculation and comparison, the high hit ratio
> > compensates for it, since a hit avoids a compression attempt.

> Cool! It would help a lot for users who have used zram for build output
> directories.

> >
> > Anyway, the benefit seems to be largely dependent on the workload, so a
> > following patch will make this feature optional.
However, this feature
> > can help some usecases, so it deserves to be merged.
> >
> > [1]: MemScope: Analyzing Memory Duplication on Android Systems,
> > dl.acm.org/citation.cfm?id=2797023
> > [2]: zswap: Optimize compressed pool memory utilization,
> > lkml.kernel.org/r/1341407574.7551.1471584870761.JavaMail.weblogic@epwas3p2

> Overall, it looks good.

Thanks!

> However, I want to change the function naming/structuring to my preferred
> style for maintainability, because it's a non-trivial feature. So, please
> look below.

Okay. I understand everything you commented on. I will fix them and I won't
reply to each one. Thanks.

> >
> > Signed-off-by: Joonsoo Kim
> > ---
> >  drivers/block/zram/zram_drv.c | 212 ++++++++++++++++++++++++++++++++++++++----
> >  drivers/block/zram/zram_drv.h |  18 ++++
> >  2 files changed, 214 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index a3d9cbca..012425f 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -32,6 +32,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >
> >  #include "zram_drv.h"
> >
> > @@ -385,14 +386,16 @@ static ssize_t mm_stat_show(struct device *dev,
> >  	max_used = atomic_long_read(&zram->stats.max_used_pages);
> >
> >  	ret = scnprintf(buf, PAGE_SIZE,
> > -			"%8llu %8llu %8llu %8lu %8ld %8llu %8lu\n",
> > +			"%8llu %8llu %8llu %8lu %8ld %8llu %8lu %8llu %8llu\n",
> >  			orig_size << PAGE_SHIFT,
> >  			(u64)atomic64_read(&zram->stats.compr_data_size),
> >  			mem_used << PAGE_SHIFT,
> >  			zram->limit_pages << PAGE_SHIFT,
> >  			max_used << PAGE_SHIFT,
> >  			(u64)atomic64_read(&zram->stats.same_pages),
> > -			pool_stats.pages_compacted);
> > +			pool_stats.pages_compacted,
> > +			(u64)atomic64_read(&zram->stats.dup_data_size),

> zram_dedup_stat_read?
>
> The reason to use a wrapper function is that I really want to remove any
> dedup-related code unless the user turns on the feature in Kconfig.
> Wrapper function would be more clean rather than ifdef. > > > + (u64)atomic64_read(&zram->stats.meta_data_size)); > > > > up_read(&zram->init_lock); > > > > return ret; > > @@ -419,29 +422,165 @@ static ssize_t debug_stat_show(struct device *dev, > > static DEVICE_ATTR_RO(mm_stat); > > static DEVICE_ATTR_RO(debug_stat); > > > > -static struct zram_entry *zram_entry_alloc(struct zram_meta *meta, > > +static u32 zram_calc_checksum(unsigned char *mem) > > zram_dedup_checksum. > > I want to use 'dedup' term all of dedup related to functions. > > > > +{ > > + return jhash(mem, PAGE_SIZE, 0); > > +} > > + > > +static struct zram_entry *zram_entry_alloc(struct zram *zram, > > unsigned int len, gfp_t flags) > > { > > + struct zram_meta *meta = zram->meta; > > struct zram_entry *entry; > > + unsigned long handle; > > > > - entry = kzalloc(sizeof(*entry), flags); > > - if (!entry) > > + handle = zs_malloc(meta->mem_pool, len, flags); > > + if (!handle) > > return NULL; > > Separate allocate zram_entry and dedeup. > IOW, > > struct zram_entry *entry = zram_entry_alloc(zram, xxx); > if (!zram_dedup_init(zram, entry, xxx)) > zram_entry_free(entry); > > > > > - entry->handle = zs_malloc(meta->mem_pool, len, flags); > > - if (!entry->handle) { > > - kfree(entry); > > + entry = kzalloc(sizeof(*entry), flags); > > + if (!entry) { > > + zs_free(meta->mem_pool, handle); > > return NULL; > > } > > > > + entry->handle = handle; > > + RB_CLEAR_NODE(&entry->rb_node); > > + entry->refcount = 1; > > + entry->len = len; > > + atomic64_add(sizeof(*entry), &zram->stats.meta_data_size); > > + > > return entry; > > } > > > > -static inline void zram_entry_free(struct zram_meta *meta, > > - struct zram_entry *entry) > > +static void zram_entry_insert(struct zram *zram, struct zram_entry *new, > > zram_dedup_insert or add? 
> > > + u32 checksum) > > +{ > > + struct zram_meta *meta = zram->meta; > > + struct zram_hash *hash; > > + struct rb_root *rb_root; > > + struct rb_node **rb_node, *parent = NULL; > > + struct zram_entry *entry; > > + > > + new->checksum = checksum; > > + hash = &meta->hash[checksum % meta->hash_size]; > > + rb_root = &hash->rb_root; > > + > > + spin_lock(&hash->lock); > > + rb_node = &rb_root->rb_node; > > + while (*rb_node) { > > + parent = *rb_node; > > + entry = rb_entry(parent, struct zram_entry, rb_node); > > + if (checksum < entry->checksum) > > + rb_node = &parent->rb_left; > > + else if (checksum > entry->checksum) > > + rb_node = &parent->rb_right; > > + else > > + rb_node = &parent->rb_left; > > + } > > + > > + rb_link_node(&new->rb_node, parent, rb_node); > > + rb_insert_color(&new->rb_node, rb_root); > > + spin_unlock(&hash->lock); > > +} > > + > > +static bool zram_entry_match(struct zram *zram, struct zram_entry *entry, > > Ditto. > > zram_dedup_match? > > > + unsigned char *mem) > > +{ > > + bool match = false; > > + unsigned char *cmem; > > + struct zram_meta *meta = zram->meta; > > + struct zcomp_strm *zstrm; > > + > > + cmem = zs_map_object(meta->mem_pool, entry->handle, ZS_MM_RO); > > + if (entry->len == PAGE_SIZE) { > > + match = !memcmp(mem, cmem, PAGE_SIZE); > > + } else { > > + zstrm = zcomp_stream_get(zram->comp); > > + if (!zcomp_decompress(zstrm, cmem, entry->len, zstrm->buffer)) > > + match = !memcmp(mem, zstrm->buffer, PAGE_SIZE); > > + zcomp_stream_put(zram->comp); > > + } > > + zs_unmap_object(meta->mem_pool, entry->handle); > > + > > + return match; > > +} > > + > > +static inline void zram_entry_free(struct zram *zram, struct zram_meta *meta, > > + struct zram_entry *entry) > > { > > zs_free(meta->mem_pool, entry->handle); > > kfree(entry); > > + if (zram) > > + atomic64_sub(sizeof(*entry), &zram->stats.meta_data_size); > > +} > > + > > +static bool zram_entry_put(struct zram *zram, struct zram_meta *meta, > > + struct zram_entry 
*entry, bool populated) > > Please, separate entry and dedup part like zram_entry_alloc. > Then, Can't we remove 'populated'? > > > +{ > > + struct zram_hash *hash; > > + u32 checksum; > > + > > + if (!populated) > > + goto free; > > + > > + checksum = entry->checksum; > > + hash = &meta->hash[checksum % meta->hash_size]; > > + > > + spin_lock(&hash->lock); > > + entry->refcount--; > > + if (!entry->refcount) { > > + populated = false; > > + rb_erase(&entry->rb_node, &hash->rb_root); > > + RB_CLEAR_NODE(&entry->rb_node); > > + } > > + spin_unlock(&hash->lock); > > + > > +free: > > + if (!populated) > > + zram_entry_free(zram, meta, entry); > > + > > + return populated; > > +} > > + > > +static struct zram_entry *zram_entry_get(struct zram *zram, > > We can rename it to zram_entry_match. > > > + unsigned char *mem, u32 checksum) > > +{ > > + struct zram_meta *meta = zram->meta; > > + struct zram_hash *hash; > > + struct zram_entry *entry; > > + struct rb_node *rb_node; > > + > > + hash = &meta->hash[checksum % meta->hash_size]; > > + > > + spin_lock(&hash->lock); > > + rb_node = hash->rb_root.rb_node; > > + while (rb_node) { > > + entry = rb_entry(rb_node, struct zram_entry, rb_node); > > + if (checksum == entry->checksum) { > > + entry->refcount++; > > + atomic64_add(entry->len, &zram->stats.dup_data_size); > > + spin_unlock(&hash->lock); > > + > > + if (zram_entry_match(zram, entry, mem)) > > __zram_entry_match? Or just open-code. 
> > > + return entry; > > + > > + if (zram_entry_put(zram, meta, entry, true)) { > > + atomic64_sub(entry->len, > > + &zram->stats.dup_data_size); > > + } > > + > > + return NULL; > > + } > > + > > + if (checksum < entry->checksum) > > + rb_node = rb_node->rb_left; > > + else > > + rb_node = rb_node->rb_right; > > + } > > + spin_unlock(&hash->lock); > > + > > + return NULL; > > } > > > > static void zram_meta_free(struct zram_meta *meta, u64 disksize) > > @@ -459,18 +598,31 @@ static void zram_meta_free(struct zram_meta *meta, u64 disksize) > > if (!entry || zram_test_flag(meta, index, ZRAM_SAME)) > > continue; > > > > - zram_entry_free(meta, entry); > > + zram_entry_put(NULL, meta, entry, true); > > } > > > > zs_destroy_pool(meta->mem_pool); > > + vfree(meta->hash); > > vfree(meta->table); > > kfree(meta); > > } > > > > +static void zram_meta_init(struct zram_meta *meta) > > +{ > > + int i; > > + struct zram_hash *hash; > > + > > + for (i = 0; i < meta->hash_size; i++) { > > + hash = &meta->hash[i]; > > + spin_lock_init(&hash->lock); > > + hash->rb_root = RB_ROOT; > > + } > > +} > > + > > static struct zram_meta *zram_meta_alloc(char *pool_name, u64 disksize) > > { > > size_t num_pages; > > - struct zram_meta *meta = kmalloc(sizeof(*meta), GFP_KERNEL); > > + struct zram_meta *meta = kzalloc(sizeof(*meta), GFP_KERNEL); > > > > if (!meta) > > return NULL; > > @@ -482,15 +634,27 @@ static struct zram_meta *zram_meta_alloc(char *pool_name, u64 disksize) > > goto out_error; > > } > > > > + meta->hash_size = num_pages >> ZRAM_HASH_SHIFT; > > + meta->hash_size = min_t(size_t, ZRAM_HASH_SIZE_MAX, meta->hash_size); > > + meta->hash_size = max_t(size_t, ZRAM_HASH_SIZE_MIN, meta->hash_size); > > + meta->hash = vzalloc(meta->hash_size * sizeof(struct zram_hash)); > > + if (!meta->hash) { > > + pr_err("Error allocating zram entry hash\n"); > > + goto out_error; > > + } > > + > > meta->mem_pool = zs_create_pool(pool_name); > > if (!meta->mem_pool) { > > pr_err("Error creating 
memory pool\n"); > > goto out_error; > > } > > > > + zram_meta_init(meta); > > zram_dedup_init? > > Can't we put above hash initialization routines to zram_meta_init? > > > + > > return meta; > > > > out_error: > > + vfree(meta->hash); > > vfree(meta->table); > > kfree(meta); > > return NULL; > > @@ -520,7 +684,8 @@ static void zram_free_page(struct zram *zram, size_t index) > > if (!entry) > > return; > > > > - zram_entry_free(meta, entry); > > + if (zram_entry_put(zram, meta, entry, true)) > > + atomic64_sub(entry->len, &zram->stats.dup_data_size); > > Hmm, > > I want to put dedup stat logic in dedup functions without exporting > higher level functions like zram_free_page. > > So, > > zram_dedup_put: > ... > ... > atomic64_sub(entry->len, &zram->stats.dup_data_size); > > zram_free_page: > ... > if (zram_dedup_put()) > zram_entry_free > > > > > atomic64_sub(zram_get_obj_size(meta, index), > > &zram->stats.compr_data_size); > > @@ -631,6 +796,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, > > struct zcomp_strm *zstrm = NULL; > > unsigned long alloced_pages; > > unsigned long element; > > + u32 checksum; > > > > page = bvec->bv_page; > > if (is_partial_io(bvec)) { > > @@ -674,6 +840,18 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, > > goto out; > > } > > > > + checksum = zram_calc_checksum(uncmem); > > + if (!entry) { > > + entry = zram_entry_get(zram, uncmem, checksum); > > entry = zram_dedup_get > > > + if (entry) { > > + if (!is_partial_io(bvec)) > > + kunmap_atomic(user_mem); > > + > > + clen = entry->len; > > + goto found_duplication; > > + } > > + } > > + > > zstrm = zcomp_stream_get(zram->comp); > > ret = zcomp_compress(zstrm, uncmem, &clen); > > if (!is_partial_io(bvec)) { > > @@ -708,7 +886,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, > > * from the slow path and entry has already been allocated. 
> > */ > > if (!entry) { > > - entry = zram_entry_alloc(meta, clen, > > + entry = zram_entry_alloc(zram, clen, > > __GFP_KSWAPD_RECLAIM | __GFP_NOWARN); > > } > > > > @@ -718,7 +896,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, > > > > atomic64_inc(&zram->stats.writestall); > > > > - entry = zram_entry_alloc(meta, clen, GFP_NOIO); > > + entry = zram_entry_alloc(zram, clen, GFP_NOIO); > > if (entry) > > goto compress_again; > > > > @@ -732,7 +910,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, > > update_used_max(zram, alloced_pages); > > > > if (zram->limit_pages && alloced_pages > zram->limit_pages) { > > - zram_entry_free(meta, entry); > > + zram_entry_put(zram, meta, entry, false); > > ret = -ENOMEM; > > goto out; > > } > > @@ -750,7 +928,9 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index, > > zcomp_stream_put(zram->comp); > > zstrm = NULL; > > zs_unmap_object(meta->mem_pool, entry->handle); > > + zram_entry_insert(zram, entry, checksum); > > > > +found_duplication: > > Nit: > > I prefer found_dup. > > > /* > > * Free memory associated with this sector > > * before overwriting unused sectors. 
> > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> > index a7ae46c..07d1f8d 100644
> > --- a/drivers/block/zram/zram_drv.h
> > +++ b/drivers/block/zram/zram_drv.h
> > @@ -18,6 +18,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >
> >  #include "zcomp.h"
> >
> > @@ -45,6 +46,10 @@
> >  #define ZRAM_SECTOR_PER_LOGICAL_BLOCK	\
> >  	(1 << (ZRAM_LOGICAL_BLOCK_SHIFT - SECTOR_SHIFT))
> >
> > +/* One slot will contain 128 pages theoretically */
> > +#define ZRAM_HASH_SHIFT		7
> > +#define ZRAM_HASH_SIZE_MIN	(1 << 10)
> > +#define ZRAM_HASH_SIZE_MAX	(1 << 31)
> >
> >  /*
> >   * The lower ZRAM_FLAG_SHIFT bits of table.value is for
> > @@ -70,6 +75,10 @@ enum zram_pageflags {
> >  /*-- Data structures */
> >
> >  struct zram_entry {
> > +	struct rb_node rb_node;
> > +	u32 len;
> > +	u32 checksum;
> > +	unsigned long refcount;

> It should be compiled out if the user doesn't turn on the feature in Kconfig.
> I hope you make the Kconfig option in [4/4].