* [RFC 0/3] zram memory control enhance
From: Minchan Kim @ 2014-08-05  8:02 UTC
  To: linux-mm
  Cc: linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi,
	seungho1.park, Luigi Semenzato, Nitin Gupta, Minchan Kim

Notice! This is an RFC. I haven't tested it at all, but I wanted to hear
opinions during the merge window, which is a really busy time for Andrew,
so we can use the slack time to discuss without bothering him. ;-)

Patch 1 moves pages_allocated in zsmalloc from size_class to zs_pool,
so zs_get_total_size_bytes becomes faster than before. The following
patches call zs_get_total_size_bytes frequently.

Patch 2 adds a new feature that exports how many bytes zsmalloc
consumed during a test workload. Normally, before fixing zram's
disksize, we test various workloads and want to know how many bytes
zram consumed.
For that, we could poll zram's mem_used_total from userspace, but when
memory pressure is severe, a sudden burst of heavy swap-out followed by
heavy swap-in (or process exit) can fall entirely within a polling
interval of a few seconds, so the peak memory size zram consumed is
easily missed.
Lacking that information, the user can set a wrong zram disksize, and
the result is OOM. So this patch adds mem_used_max for zram, with
zsmalloc support. A sketch of the polling problem is below.
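
To illustrate the race, here is a minimal userspace polling sketch
(untested, just for illustration; it assumes the default zram0 device):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
	unsigned long long cur, max_seen = 0;

	for (;;) {
		FILE *f = fopen("/sys/block/zram0/mem_used_total", "r");

		if (!f || fscanf(f, "%llu", &cur) != 1)
			return 1;
		fclose(f);
		if (cur > max_seen)
			max_seen = cur;
		printf("current=%llu max_seen=%llu\n", cur, max_seen);
		sleep(1);	/* any peak shorter than this window is missed */
	}
}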

Patch 3 limits zram's memory consumption. Right now zram has no bound
on its memory usage, so it can consume all of system memory, which makes
platform memory control hard; I have heard requests for this feature
several times. A usage sketch is below.
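
For illustration, userspace could then cap a device like this (untested
sketch; it assumes the default zram0 device and the mem_limit knob from
patch 3, which parses a plain byte count via kstrtoull):

#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/sys/block/zram0/mem_limit", "w");

	if (!f)
		return 1;
	fprintf(f, "%llu\n", 100ULL << 20);	/* cap the pool at 100MB */
	return fclose(f) ? 1 : 0;
}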

Feedback is welcome!

Minchan Kim (3):
  zsmalloc: move pages_allocated to zs_pool
  zsmalloc/zram: add zs_get_max_size_bytes and use it in zram
  zram: limit memory size for zram

 Documentation/blockdev/zram.txt |  2 ++
 drivers/block/zram/zram_drv.c   | 58 +++++++++++++++++++++++++++++++++++++++++
 drivers/block/zram/zram_drv.h   |  1 +
 include/linux/zsmalloc.h        |  1 +
 mm/zsmalloc.c                   | 50 +++++++++++++++++++++++++----------
 5 files changed, 98 insertions(+), 14 deletions(-)

-- 
2.0.0


* [RFC 1/3] zsmalloc: move pages_allocated to zs_pool
From: Minchan Kim @ 2014-08-05  8:02 UTC
  To: linux-mm
  Cc: linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi,
	seungho1.park, Luigi Semenzato, Nitin Gupta, Minchan Kim

pages_allocated is currently counted per size_class, and when the user
wants to see total_size_bytes, the values from every size_class are
gathered to report the sum.

That is fine if the value is read rarely, but reading it frequently is
a bad deal from a performance point of view.

This patch moves the variable from size_class to zs_pool, which reduces
the memory footprint (from [255 * 8 bytes] to [sizeof(unsigned long)]).
It does add new locking overhead, but that should not be severe because
the lock is not taken on the hot path of zs_malloc (i.e., only when a
new zspage is created, not for every object). A sketch of a lock-free
alternative is below.
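
For comparison, a lock-free alternative would be an atomic counter,
which avoids stat_lock entirely (just a sketch, not what this patch
does):

	/* in struct zs_pool, instead of stat_lock + unsigned long */
	atomic_long_t pages_allocated;

	/* grow side in zs_malloc() / shrink side in zs_free() */
	atomic_long_add(class->pages_per_zspage, &pool->pages_allocated);
	atomic_long_sub(class->pages_per_zspage, &pool->pages_allocated);

	/* read side in zs_get_total_size_bytes() */
	return (u64)atomic_long_read(&pool->pages_allocated) << PAGE_SHIFT;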

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/zsmalloc.c | 30 ++++++++++++++++--------------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index fe78189624cf..a6089bd26621 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -198,9 +198,6 @@ struct size_class {
 
 	spinlock_t lock;
 
-	/* stats */
-	u64 pages_allocated;
-
 	struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
 };
 
@@ -216,9 +213,12 @@ struct link_free {
 };
 
 struct zs_pool {
+	spinlock_t stat_lock;
+
 	struct size_class size_class[ZS_SIZE_CLASSES];
 
 	gfp_t flags;	/* allocation flags used when growing pool */
+	unsigned long pages_allocated;
 };
 
 /*
@@ -882,6 +882,7 @@ struct zs_pool *zs_create_pool(gfp_t flags)
 
 	}
 
+	spin_lock_init(&pool->stat_lock);
 	pool->flags = flags;
 
 	return pool;
@@ -943,8 +944,10 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
 			return 0;
 
 		set_zspage_mapping(first_page, class->index, ZS_EMPTY);
+		spin_lock(&pool->stat_lock);
+		pool->pages_allocated += class->pages_per_zspage;
+		spin_unlock(&pool->stat_lock);
 		spin_lock(&class->lock);
-		class->pages_allocated += class->pages_per_zspage;
 	}
 
 	obj = (unsigned long)first_page->freelist;
@@ -997,14 +1000,14 @@ void zs_free(struct zs_pool *pool, unsigned long obj)
 
 	first_page->inuse--;
 	fullness = fix_fullness_group(pool, first_page);
-
-	if (fullness == ZS_EMPTY)
-		class->pages_allocated -= class->pages_per_zspage;
-
 	spin_unlock(&class->lock);
 
-	if (fullness == ZS_EMPTY)
+	if (fullness == ZS_EMPTY) {
+		spin_lock(&pool->stat_lock);
+		pool->pages_allocated -= class->pages_per_zspage;
+		spin_unlock(&pool->stat_lock);
 		free_zspage(first_page);
+	}
 }
 EXPORT_SYMBOL_GPL(zs_free);
 
@@ -1100,12 +1103,11 @@ EXPORT_SYMBOL_GPL(zs_unmap_object);
 
 u64 zs_get_total_size_bytes(struct zs_pool *pool)
 {
-	int i;
-	u64 npages = 0;
-
-	for (i = 0; i < ZS_SIZE_CLASSES; i++)
-		npages += pool->size_class[i].pages_allocated;
+	u64 npages;
 
+	spin_lock(&pool->stat_lock);
+	npages = pool->pages_allocated;
+	spin_unlock(&pool->stat_lock);
 	return npages << PAGE_SHIFT;
 }
 EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
-- 
2.0.0


* [RFC 2/3] zsmalloc/zram: add zs_get_max_size_bytes and use it in zram
From: Minchan Kim @ 2014-08-05  8:02 UTC
  To: linux-mm
  Cc: linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi,
	seungho1.park, Luigi Semenzato, Nitin Gupta, Minchan Kim

Normally, a zram user can get the maximum memory zsmalloc consumed by
polling mem_used_total via sysfs from userspace.

But this has a critical problem: the user can miss the peak memory
usage between polling intervals, so the gap between the polled value
and the real peak can be huge when memory pressure is really heavy.

This patch adds a new API, zs_get_max_size_bytes, to zsmalloc so the
user (e.g., zram) doesn't need to poll at short intervals to get an
exact value.

The user can simply read the maximum memory usage once the test
workload is done. It's handy and accurate. A usage sketch is below.
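
For illustration, a minimal userspace sketch (untested; it assumes the
default zram0 device) of reading the new counter after a workload:

#include <stdio.h>

int main(void)
{
	unsigned long long peak;
	FILE *f = fopen("/sys/block/zram0/mem_used_max", "r");

	if (!f || fscanf(f, "%llu", &peak) != 1)
		return 1;
	fclose(f);
	/* run once after the workload; no polling loop is needed */
	printf("zram peak usage: %llu bytes\n", peak);
	return 0;
}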

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 Documentation/blockdev/zram.txt |  1 +
 drivers/block/zram/zram_drv.c   | 17 +++++++++++++++++
 include/linux/zsmalloc.h        |  1 +
 mm/zsmalloc.c                   | 20 ++++++++++++++++++++
 4 files changed, 39 insertions(+)

diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
index 0595c3f56ccf..d24534bee763 100644
--- a/Documentation/blockdev/zram.txt
+++ b/Documentation/blockdev/zram.txt
@@ -95,6 +95,7 @@ size of the disk when not in use so a huge zram is wasteful.
 		orig_data_size
 		compr_data_size
 		mem_used_total
+		mem_used_max
 
 7) Deactivate:
 	swapoff /dev/zram0
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 36e54be402df..a4d637b4db7d 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -109,6 +109,21 @@ static ssize_t mem_used_total_show(struct device *dev,
 	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
 }
 
+static ssize_t mem_used_max_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	u64 val = 0;
+	struct zram *zram = dev_to_zram(dev);
+	struct zram_meta *meta = zram->meta;
+
+	down_read(&zram->init_lock);
+	if (init_done(zram))
+		val = zs_get_max_size_bytes(meta->mem_pool);
+	up_read(&zram->init_lock);
+
+	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+}
+
 static ssize_t max_comp_streams_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
 {
@@ -838,6 +853,7 @@ static DEVICE_ATTR(initstate, S_IRUGO, initstate_show, NULL);
 static DEVICE_ATTR(reset, S_IWUSR, NULL, reset_store);
 static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
 static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
+static DEVICE_ATTR(mem_used_max, S_IRUGO, mem_used_max_show, NULL);
 static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
 		max_comp_streams_show, max_comp_streams_store);
 static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
@@ -866,6 +882,7 @@ static struct attribute *zram_disk_attrs[] = {
 	&dev_attr_orig_data_size.attr,
 	&dev_attr_compr_data_size.attr,
 	&dev_attr_mem_used_total.attr,
+	&dev_attr_mem_used_max.attr,
 	&dev_attr_max_comp_streams.attr,
 	&dev_attr_comp_algorithm.attr,
 	NULL,
diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index e44d634e7fb7..fb087ca06a88 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -47,5 +47,6 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
 void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
 
 u64 zs_get_total_size_bytes(struct zs_pool *pool);
+u64 zs_get_max_size_bytes(struct zs_pool *pool);
 
 #endif
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index a6089bd26621..3b5be076268a 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -219,6 +219,7 @@ struct zs_pool {
 
 	gfp_t flags;	/* allocation flags used when growing pool */
 	unsigned long pages_allocated;
+	unsigned long max_pages_allocated;
 };
 
 /*
@@ -946,6 +947,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
 		set_zspage_mapping(first_page, class->index, ZS_EMPTY);
 		spin_lock(&pool->stat_lock);
 		pool->pages_allocated += class->pages_per_zspage;
+		if (pool->max_pages_allocated < pool->pages_allocated)
+			pool->max_pages_allocated = pool->pages_allocated;
 		spin_unlock(&pool->stat_lock);
 		spin_lock(&class->lock);
 	}
@@ -1101,6 +1104,9 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
 }
 EXPORT_SYMBOL_GPL(zs_unmap_object);
 
+/*
+ * Reports the current memory usage consumed by zsmalloc
+ */
 u64 zs_get_total_size_bytes(struct zs_pool *pool)
 {
 	u64 npages;
@@ -1112,6 +1118,20 @@ u64 zs_get_total_size_bytes(struct zs_pool *pool)
 }
 EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
 
+/*
+ * Reports the maximum memory usage zsmalloc has consumed
+ */
+u64 zs_get_max_size_bytes(struct zs_pool *pool)
+{
+	u64 npages;
+
+	spin_lock(&pool->stat_lock);
+	npages = pool->max_pages_allocated;
+	spin_unlock(&pool->stat_lock);
+	return npages << PAGE_SHIFT;
+}
+EXPORT_SYMBOL_GPL(zs_get_max_size_bytes);
+
 module_init(zs_init);
 module_exit(zs_exit);
 
-- 
2.0.0


* [RFC 3/3] zram: limit memory size for zram
From: Minchan Kim @ 2014-08-05  8:02 UTC
  To: linux-mm
  Cc: linux-kernel, Sergey Senozhatsky, Jerome Marchand, juno.choi,
	seungho1.park, Luigi Semenzato, Nitin Gupta, Minchan Kim

I have received this request several times from zram users: they want
to limit zram's memory size because, without a limit, zram can consume
a lot of system memory, which makes memory management control hard.

This patch adds a new knob to limit zram's memory usage.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 Documentation/blockdev/zram.txt |  1 +
 drivers/block/zram/zram_drv.c   | 41 +++++++++++++++++++++++++++++++++++++++++
 drivers/block/zram/zram_drv.h   |  1 +
 3 files changed, 43 insertions(+)

diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
index d24534bee763..fcb0561dfe2e 100644
--- a/Documentation/blockdev/zram.txt
+++ b/Documentation/blockdev/zram.txt
@@ -96,6 +96,7 @@ size of the disk when not in use so a huge zram is wasteful.
 		compr_data_size
 		mem_used_total
 		mem_used_max
+		mem_limit
 
 7) Deactivate:
 	swapoff /dev/zram0
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index a4d637b4db7d..47f68bbb2c44 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -137,6 +137,37 @@ static ssize_t max_comp_streams_show(struct device *dev,
 	return scnprintf(buf, PAGE_SIZE, "%d\n", val);
 }
 
+static ssize_t mem_limit_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	u64 val;
+	struct zram *zram = dev_to_zram(dev);
+
+	down_read(&zram->init_lock);
+	val = zram->limit_bytes;
+	up_read(&zram->init_lock);
+
+	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+}
+
+static ssize_t mem_limit_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t len)
+{
+	u64 limit;
+	struct zram *zram = dev_to_zram(dev);
+	int ret;
+
+	ret = kstrtoull(buf, 0, &limit);
+	if (ret < 0)
+		return ret;
+
+	down_write(&zram->init_lock);
+	zram->limit_bytes = limit;
+	ret = len;
+	up_write(&zram->init_lock);
+	return ret;
+}
+
 static ssize_t max_comp_streams_store(struct device *dev,
 		struct device_attribute *attr, const char *buf, size_t len)
 {
@@ -511,6 +542,14 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
 		ret = -ENOMEM;
 		goto out;
 	}
+
+	if (zram->limit_bytes &&
+		zs_get_total_size_bytes(meta->mem_pool) >= zram->limit_bytes) {
+		zs_free(meta->mem_pool, handle);
+		ret = -ENOMEM;
+		goto out;
+	}
+
 	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
 
 	if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) {
@@ -854,6 +893,7 @@ static DEVICE_ATTR(reset, S_IWUSR, NULL, reset_store);
 static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
 static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
 static DEVICE_ATTR(mem_used_max, S_IRUGO, mem_used_max_show, NULL);
+static DEVICE_ATTR(mem_limit, S_IRUGO, mem_limit_show, mem_limit_store);
 static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
 		max_comp_streams_show, max_comp_streams_store);
 static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
@@ -883,6 +923,7 @@ static struct attribute *zram_disk_attrs[] = {
 	&dev_attr_compr_data_size.attr,
 	&dev_attr_mem_used_total.attr,
 	&dev_attr_mem_used_max.attr,
+	&dev_attr_mem_limit.attr,
 	&dev_attr_max_comp_streams.attr,
 	&dev_attr_comp_algorithm.attr,
 	NULL,
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 7f21c145e317..c0d497ff6efc 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -99,6 +99,7 @@ struct zram {
 	 * we can store in a disk.
 	 */
 	u64 disksize;	/* bytes */
+	u64 limit_bytes;
 	int max_comp_streams;
 	struct zram_stats stats;
 	char compressor[10];
-- 
2.0.0


* Re: [RFC 3/3] zram: limit memory size for zram
From: Minchan Kim @ 2014-08-05  9:48 UTC
  To: linux-mm
  Cc: Jerome Marchand, linux-kernel, juno.choi, Sergey Senozhatsky,
	seungho1.park, Luigi Semenzato, Nitin Gupta

Another idea: we could define void zs_limit_mem(unsigned long nr_pages)
in zsmalloc and have zram put the limit into zs_pool via this new API,
so that zs_malloc fails as soon as the limit would be exceeded.

That way, zram doesn't need to call zs_get_total_size_bytes on every
write. It's cleaner and puts the check in the right layer, IMHO.
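
A rough sketch of what that could look like (untested; the pool
argument and the pages_limited field are illustrative only):

void zs_limit_mem(struct zs_pool *pool, unsigned long nr_pages)
{
	spin_lock(&pool->stat_lock);
	pool->pages_limited = nr_pages;	/* 0 means "no limit" */
	spin_unlock(&pool->stat_lock);
}

zs_malloc() could then refuse to grow the pool up front:

	if (pool->pages_limited &&
	    pool->pages_allocated + class->pages_per_zspage >
			pool->pages_limited)
		return 0;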

On Tue, Aug 05, 2014 at 05:02:03PM +0900, Minchan Kim wrote:
> I have received this request several times from zram users: they want
> to limit zram's memory size because, without a limit, zram can consume
> a lot of system memory, which makes memory management control hard.
> 
> This patch adds a new knob to limit zram's memory usage.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  Documentation/blockdev/zram.txt |  1 +
>  drivers/block/zram/zram_drv.c   | 41 +++++++++++++++++++++++++++++++++++++++++
>  drivers/block/zram/zram_drv.h   |  1 +
>  3 files changed, 43 insertions(+)
> 
> diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
> index d24534bee763..fcb0561dfe2e 100644
> --- a/Documentation/blockdev/zram.txt
> +++ b/Documentation/blockdev/zram.txt
> @@ -96,6 +96,7 @@ size of the disk when not in use so a huge zram is wasteful.
>  		compr_data_size
>  		mem_used_total
>  		mem_used_max
> +		mem_limit
>  
>  7) Deactivate:
>  	swapoff /dev/zram0
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index a4d637b4db7d..47f68bbb2c44 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -137,6 +137,37 @@ static ssize_t max_comp_streams_show(struct device *dev,
>  	return scnprintf(buf, PAGE_SIZE, "%d\n", val);
>  }
>  
> +static ssize_t mem_limit_show(struct device *dev,
> +		struct device_attribute *attr, char *buf)
> +{
> +	u64 val;
> +	struct zram *zram = dev_to_zram(dev);
> +
> +	down_read(&zram->init_lock);
> +	val = zram->limit_bytes;
> +	up_read(&zram->init_lock);
> +
> +	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
> +}
> +
> +static ssize_t mem_limit_store(struct device *dev,
> +		struct device_attribute *attr, const char *buf, size_t len)
> +{
> +	u64 limit;
> +	struct zram *zram = dev_to_zram(dev);
> +	int ret;
> +
> +	ret = kstrtoull(buf, 0, &limit);
> +	if (ret < 0)
> +		return ret;
> +
> +	down_write(&zram->init_lock);
> +	zram->limit_bytes = limit;
> +	ret = len;
> +	up_write(&zram->init_lock);
> +	return ret;
> +}
> +
>  static ssize_t max_comp_streams_store(struct device *dev,
>  		struct device_attribute *attr, const char *buf, size_t len)
>  {
> @@ -511,6 +542,14 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>  		ret = -ENOMEM;
>  		goto out;
>  	}
> +
> +	if (zram->limit_bytes &&
> +		zs_get_total_size_bytes(meta->mem_pool) >= zram->limit_bytes) {
> +		zs_free(meta->mem_pool, handle);
> +		ret = -ENOMEM;
> +		goto out;
> +	}
> +
>  	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
>  
>  	if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) {
> @@ -854,6 +893,7 @@ static DEVICE_ATTR(reset, S_IWUSR, NULL, reset_store);
>  static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
>  static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
>  static DEVICE_ATTR(mem_used_max, S_IRUGO, mem_used_max_show, NULL);
> +static DEVICE_ATTR(mem_limit, S_IRUGO, mem_limit_show, mem_limit_store);
>  static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
>  		max_comp_streams_show, max_comp_streams_store);
>  static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
> @@ -883,6 +923,7 @@ static struct attribute *zram_disk_attrs[] = {
>  	&dev_attr_compr_data_size.attr,
>  	&dev_attr_mem_used_total.attr,
>  	&dev_attr_mem_used_max.attr,
> +	&dev_attr_mem_limit.attr,
>  	&dev_attr_max_comp_streams.attr,
>  	&dev_attr_comp_algorithm.attr,
>  	NULL,
> diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> index 7f21c145e317..c0d497ff6efc 100644
> --- a/drivers/block/zram/zram_drv.h
> +++ b/drivers/block/zram/zram_drv.h
> @@ -99,6 +99,7 @@ struct zram {
>  	 * we can store in a disk.
>  	 */
>  	u64 disksize;	/* bytes */
> +	u64 limit_bytes;
>  	int max_comp_streams;
>  	struct zram_stats stats;
>  	char compressor[10];
> -- 
> 2.0.0

-- 
Kind regards,
Minchan Kim

* Re: [RFC 3/3] zram: limit memory size for zram
From: Sergey Senozhatsky @ 2014-08-05 13:16 UTC
  To: Minchan Kim
  Cc: linux-mm, Jerome Marchand, linux-kernel, juno.choi,
	Sergey Senozhatsky, seungho1.park, Luigi Semenzato, Nitin Gupta

Hello,

On (08/05/14 18:48), Minchan Kim wrote:
> Another idea: we could define void zs_limit_mem(unsigned long nr_pages)
> in zsmalloc and have zram put the limit into zs_pool via this new API,
> so that zs_malloc fails as soon as the limit would be exceeded.
> 
> That way, zram doesn't need to call zs_get_total_size_bytes on every
> write. It's cleaner and puts the check in the right layer, IMHO.

yes, I think this one is better.

	-ss

> On Tue, Aug 05, 2014 at 05:02:03PM +0900, Minchan Kim wrote:
> > I have received this request several times from zram users: they want
> > to limit zram's memory size because, without a limit, zram can consume
> > a lot of system memory, which makes memory management control hard.
> > 
> > This patch adds a new knob to limit zram's memory usage.
> > 
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  Documentation/blockdev/zram.txt |  1 +
> >  drivers/block/zram/zram_drv.c   | 41 +++++++++++++++++++++++++++++++++++++++++
> >  drivers/block/zram/zram_drv.h   |  1 +
> >  3 files changed, 43 insertions(+)
> > 
> > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
> > index d24534bee763..fcb0561dfe2e 100644
> > --- a/Documentation/blockdev/zram.txt
> > +++ b/Documentation/blockdev/zram.txt
> > @@ -96,6 +96,7 @@ size of the disk when not in use so a huge zram is wasteful.
> >  		compr_data_size
> >  		mem_used_total
> >  		mem_used_max
> > +		mem_limit
> >  
> >  7) Deactivate:
> >  	swapoff /dev/zram0
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index a4d637b4db7d..47f68bbb2c44 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -137,6 +137,37 @@ static ssize_t max_comp_streams_show(struct device *dev,
> >  	return scnprintf(buf, PAGE_SIZE, "%d\n", val);
> >  }
> >  
> > +static ssize_t mem_limit_show(struct device *dev,
> > +		struct device_attribute *attr, char *buf)
> > +{
> > +	u64 val;
> > +	struct zram *zram = dev_to_zram(dev);
> > +
> > +	down_read(&zram->init_lock);
> > +	val = zram->limit_bytes;
> > +	up_read(&zram->init_lock);
> > +
> > +	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
> > +}
> > +
> > +static ssize_t mem_limit_store(struct device *dev,
> > +		struct device_attribute *attr, const char *buf, size_t len)
> > +{
> > +	u64 limit;
> > +	struct zram *zram = dev_to_zram(dev);
> > +	int ret;
> > +
> > +	ret = kstrtoull(buf, 0, &limit);
> > +	if (ret < 0)
> > +		return ret;
> > +
> > +	down_write(&zram->init_lock);
> > +	zram->limit_bytes = limit;
> > +	ret = len;
> > +	up_write(&zram->init_lock);
> > +	return ret;
> > +}
> > +
> >  static ssize_t max_comp_streams_store(struct device *dev,
> >  		struct device_attribute *attr, const char *buf, size_t len)
> >  {
> > @@ -511,6 +542,14 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
> >  		ret = -ENOMEM;
> >  		goto out;
> >  	}
> > +
> > +	if (zram->limit_bytes &&
> > +		zs_get_total_size_bytes(meta->mem_pool) >= zram->limit_bytes) {
> > +		zs_free(meta->mem_pool, handle);
> > +		ret = -ENOMEM;
> > +		goto out;
> > +	}
> > +
> >  	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
> >  
> >  	if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) {
> > @@ -854,6 +893,7 @@ static DEVICE_ATTR(reset, S_IWUSR, NULL, reset_store);
> >  static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
> >  static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
> >  static DEVICE_ATTR(mem_used_max, S_IRUGO, mem_used_max_show, NULL);
> > +static DEVICE_ATTR(mem_limit, S_IRUGO, mem_limit_show, mem_limit_store);
> >  static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
> >  		max_comp_streams_show, max_comp_streams_store);
> >  static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
> > @@ -883,6 +923,7 @@ static struct attribute *zram_disk_attrs[] = {
> >  	&dev_attr_compr_data_size.attr,
> >  	&dev_attr_mem_used_total.attr,
> >  	&dev_attr_mem_used_max.attr,
> > +	&dev_attr_mem_limit.attr,
> >  	&dev_attr_max_comp_streams.attr,
> >  	&dev_attr_comp_algorithm.attr,
> >  	NULL,
> > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> > index 7f21c145e317..c0d497ff6efc 100644
> > --- a/drivers/block/zram/zram_drv.h
> > +++ b/drivers/block/zram/zram_drv.h
> > @@ -99,6 +99,7 @@ struct zram {
> >  	 * we can store in a disk.
> >  	 */
> >  	u64 disksize;	/* bytes */
> > +	u64 limit_bytes;
> >  	int max_comp_streams;
> >  	struct zram_stats stats;
> >  	char compressor[10];
> > -- 
> > 2.0.0
> 
> -- 
> Kind regards,
> Minchan Kim
> 

* Re: [RFC 3/3] zram: limit memory size for zram
From: Minchan Kim @ 2014-08-06  6:52 UTC
  To: Sergey Senozhatsky
  Cc: linux-mm, Jerome Marchand, linux-kernel, juno.choi,
	seungho1.park, Luigi Semenzato, Nitin Gupta

On Tue, Aug 05, 2014 at 10:16:15PM +0900, Sergey Senozhatsky wrote:
> Hello,
> 
> On (08/05/14 18:48), Minchan Kim wrote:
> > Another idea: we could define void zs_limit_mem(unsigned long nr_pages)
> > in zsmalloc and have zram put the limit into zs_pool via this new API,
> > so that zs_malloc fails as soon as the limit would be exceeded.
> > 
> > That way, zram doesn't need to call zs_get_total_size_bytes on every
> > write. It's cleaner and puts the check in the right layer, IMHO.
> 
> yes, I think this one is better.
> 
> 	-ss

From 279c406b5a8eabd03edca55490ec92b539b39c76 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@kernel.org>
Date: Tue, 5 Aug 2014 16:24:57 +0900
Subject: [PATCH] zram: limit memory size for zram

I have received this request several times from zram users: they want
to limit zram's memory size because, without a limit, zram can consume
a lot of system memory, which makes memory management control hard.

This patch adds a new knob to limit zram's memory usage.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 Documentation/blockdev/zram.txt |  1 +
 drivers/block/zram/zram_drv.c   | 39 +++++++++++++++++++++++++++++++++++++--
 include/linux/zsmalloc.h        |  2 ++
 mm/zsmalloc.c                   | 24 ++++++++++++++++++++++++
 4 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
index d24534bee763..fcb0561dfe2e 100644
--- a/Documentation/blockdev/zram.txt
+++ b/Documentation/blockdev/zram.txt
@@ -96,6 +96,7 @@ size of the disk when not in use so a huge zram is wasteful.
 		compr_data_size
 		mem_used_total
 		mem_used_max
+		mem_limit
 
 7) Deactivate:
 	swapoff /dev/zram0
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index a4d637b4db7d..069e81ef0c17 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -137,6 +137,41 @@ static ssize_t max_comp_streams_show(struct device *dev,
 	return scnprintf(buf, PAGE_SIZE, "%d\n", val);
 }
 
+static ssize_t mem_limit_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	u64 val = 0;
+	struct zram *zram = dev_to_zram(dev);
+	struct zram_meta *meta = zram->meta;
+
+	down_read(&zram->init_lock);
+	if (init_done(zram))
+		val = zs_get_limit_size_bytes(meta->mem_pool);
+	up_read(&zram->init_lock);
+
+	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+}
+
+static ssize_t mem_limit_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t len)
+{
+	int ret;
+	u64 limit;
+	struct zram *zram = dev_to_zram(dev);
+	struct zram_meta *meta = zram->meta;
+
+	ret = kstrtoull(buf, 0, &limit);
+	if (ret < 0)
+		return ret;
+
+	down_write(&zram->init_lock);
+	if (init_done(zram))
+		zs_set_limit_size_bytes(meta->mem_pool, limit);
+	up_write(&zram->init_lock);
+	ret = len;
+	return ret;
+}
+
 static ssize_t max_comp_streams_store(struct device *dev,
 		struct device_attribute *attr, const char *buf, size_t len)
 {
@@ -506,8 +541,6 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
 
 	handle = zs_malloc(meta->mem_pool, clen);
 	if (!handle) {
-		pr_info("Error allocating memory for compressed page: %u, size=%zu\n",
-			index, clen);
 		ret = -ENOMEM;
 		goto out;
 	}
@@ -854,6 +887,7 @@ static DEVICE_ATTR(reset, S_IWUSR, NULL, reset_store);
 static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
 static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
 static DEVICE_ATTR(mem_used_max, S_IRUGO, mem_used_max_show, NULL);
+static DEVICE_ATTR(mem_limit, S_IRUGO, mem_limit_show, mem_limit_store);
 static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
 		max_comp_streams_show, max_comp_streams_store);
 static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
@@ -883,6 +917,7 @@ static struct attribute *zram_disk_attrs[] = {
 	&dev_attr_compr_data_size.attr,
 	&dev_attr_mem_used_total.attr,
 	&dev_attr_mem_used_max.attr,
+	&dev_attr_mem_limit.attr,
 	&dev_attr_max_comp_streams.attr,
 	&dev_attr_comp_algorithm.attr,
 	NULL,
diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index fb087ca06a88..41122251a2d0 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -49,4 +49,6 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
 u64 zs_get_total_size_bytes(struct zs_pool *pool);
 u64 zs_get_max_size_bytes(struct zs_pool *pool);
 
+u64 zs_get_limit_size_bytes(struct zs_pool *pool);
+void zs_set_limit_size_bytes(struct zs_pool *pool, u64 limit);
 #endif
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 3b5be076268a..8ca51118cf2b 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -220,6 +220,7 @@ struct zs_pool {
 	gfp_t flags;	/* allocation flags used when growing pool */
 	unsigned long pages_allocated;
 	unsigned long max_pages_allocated;
+	unsigned long pages_limited;
 };
 
 /*
@@ -940,6 +941,11 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
 
 	if (!first_page) {
 		spin_unlock(&class->lock);
+
+		if (pool->pages_limited && (pool->pages_limited <
+			pool->pages_allocated + class->pages_per_zspage))
+			return 0;
+
 		first_page = alloc_zspage(class, pool->flags);
 		if (unlikely(!first_page))
 			return 0;
@@ -1132,6 +1138,24 @@ u64 zs_get_max_size_bytes(struct zs_pool *pool)
 }
 EXPORT_SYMBOL_GPL(zs_get_max_size_bytes);
 
+void zs_set_limit_size_bytes(struct zs_pool *pool, u64 limit)
+{
+	pool->pages_limited = round_down(limit, PAGE_SIZE) >> PAGE_SHIFT;
+}
+EXPORT_SYMBOL_GPL(zs_set_limit_size_bytes);
+
+u64 zs_get_limit_size_bytes(struct zs_pool *pool)
+{
+	u64 npages;
+
+	spin_lock(&pool->stat_lock);
+	npages = pool->pages_limited;
+	spin_unlock(&pool->stat_lock);
+	return npages << PAGE_SHIFT;
+
+}
+EXPORT_SYMBOL_GPL(zs_get_limit_size_bytes);
+
 module_init(zs_init);
 module_exit(zs_exit);
 
-- 
2.0.0

-- 
Kind regards,
Minchan Kim

* Re: [RFC 0/3] zram memory control enhance
  2014-08-05  8:02 ` Minchan Kim
                   ` (3 preceding siblings ...)
  (?)
@ 2014-08-06 12:54 ` Jerome Marchand
  -1 siblings, 0 replies; 54+ messages in thread
From: Jerome Marchand @ 2014-08-06 12:54 UTC (permalink / raw)
  To: Minchan Kim, linux-mm
  Cc: linux-kernel, Sergey Senozhatsky, juno.choi, seungho1.park,
	Luigi Semenzato, Nitin Gupta


On 08/05/2014 10:02 AM, Minchan Kim wrote:
> Notice! It's RFC. I didn't test at all but wanted to hear opinion
> during merge window when it's really busy time for Andrew so we could
> use the slack time to discuss without hurting him. ;-)
> 
> Patch 1 is to move pages_allocated in zsmalloc from size_class to zs_pool
> so zs_get_total_size_bytes of zsmalloc would be faster than old.
> zs_get_total_size_bytes could be used next patches frequently.
> 
> Patch 2 adds new feature which exports how many of bytes zsmalloc consumes
> during testing workload. Normally, before fixing the zram's disksize
> we have tested various workload and wanted to how many of bytes zram
> consumed.
> For it, we could poll mem_used_total of zram in userspace but the problem is
> when memory pressure is severe and heavy swap out happens suddenly then
> heavy swapin or exist while polling interval of user space is a few second,
> it could miss max memory size zram had consumed easily.
> With lack of information, user can set wrong disksize of zram so the result
> is OOM. So this patch adds max_mem_used for zram and zsmalloc supports it
> 
> Patch 3 is to limit zram memory consumption. Now, zram has no bound for
> memory usage so it could consume up all of system memory. It makes system
> memory control for platform hard so I have heard the feature several time.
> 
> Feedback is welcome!

Hi,

I haven't really reviewed the code yet, but I like the general idea. The
third patch in particular provides a very useful feature. I'm actually
surprised no one provided it earlier.

Jerome


> 
> Minchan Kim (3):
>   zsmalloc: move pages_allocated to zs_pool
>   zsmalloc/zram: add zs_get_max_size_bytes and use it in zram
>   zram: limit memory size for zram
> 
>  Documentation/blockdev/zram.txt |  2 ++
>  drivers/block/zram/zram_drv.c   | 58 +++++++++++++++++++++++++++++++++++++++++
>  drivers/block/zram/zram_drv.h   |  1 +
>  include/linux/zsmalloc.h        |  1 +
>  mm/zsmalloc.c                   | 50 +++++++++++++++++++++++++----------
>  5 files changed, 98 insertions(+), 14 deletions(-)
> 




^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 1/3] zsmalloc: move pages_allocated to zs_pool
  2014-08-05  8:02   ` Minchan Kim
@ 2014-08-13 13:59     ` Dan Streetman
  -1 siblings, 0 replies; 54+ messages in thread
From: Dan Streetman @ 2014-08-13 13:59 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Linux-MM, linux-kernel, Sergey Senozhatsky, Jerome Marchand,
	juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta

On Tue, Aug 5, 2014 at 4:02 AM, Minchan Kim <minchan@kernel.org> wrote:
> pages_allocated is currently counted per size_class, so when the user
> wants to see total_size_bytes, the values from every size_class have
> to be gathered to report the sum.
>
> That's fine if the value is read rarely, but once the user starts
> reading it frequently, it is not a good deal from a performance POV.
>
> This patch moves the variable from size_class to zs_pool, which
> reduces the memory footprint (from [255 * 8byte] to
> [sizeof(unsigned long)]) at the cost of new locking overhead. That
> shouldn't be severe because it's not a hot path in zs_malloc (i.e.,
> it is taken only when a new zspage is created, not per object).

Would using an atomic64_t without locking be simpler?

>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  mm/zsmalloc.c | 30 ++++++++++++++++--------------
>  1 file changed, 16 insertions(+), 14 deletions(-)
>
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index fe78189624cf..a6089bd26621 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -198,9 +198,6 @@ struct size_class {
>
>         spinlock_t lock;
>
> -       /* stats */
> -       u64 pages_allocated;
> -
>         struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
>  };
>
> @@ -216,9 +213,12 @@ struct link_free {
>  };
>
>  struct zs_pool {
> +       spinlock_t stat_lock;
> +
>         struct size_class size_class[ZS_SIZE_CLASSES];
>
>         gfp_t flags;    /* allocation flags used when growing pool */
> +       unsigned long pages_allocated;
>  };
>
>  /*
> @@ -882,6 +882,7 @@ struct zs_pool *zs_create_pool(gfp_t flags)
>
>         }
>
> +       spin_lock_init(&pool->stat_lock);
>         pool->flags = flags;
>
>         return pool;
> @@ -943,8 +944,10 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
>                         return 0;
>
>                 set_zspage_mapping(first_page, class->index, ZS_EMPTY);
> +               spin_lock(&pool->stat_lock);
> +               pool->pages_allocated += class->pages_per_zspage;
> +               spin_unlock(&pool->stat_lock);
>                 spin_lock(&class->lock);
> -               class->pages_allocated += class->pages_per_zspage;
>         }
>
>         obj = (unsigned long)first_page->freelist;
> @@ -997,14 +1000,14 @@ void zs_free(struct zs_pool *pool, unsigned long obj)
>
>         first_page->inuse--;
>         fullness = fix_fullness_group(pool, first_page);
> -
> -       if (fullness == ZS_EMPTY)
> -               class->pages_allocated -= class->pages_per_zspage;
> -
>         spin_unlock(&class->lock);
>
> -       if (fullness == ZS_EMPTY)
> +       if (fullness == ZS_EMPTY) {
> +               spin_lock(&pool->stat_lock);
> +               pool->pages_allocated -= class->pages_per_zspage;
> +               spin_unlock(&pool->stat_lock);
>                 free_zspage(first_page);
> +       }
>  }
>  EXPORT_SYMBOL_GPL(zs_free);
>
> @@ -1100,12 +1103,11 @@ EXPORT_SYMBOL_GPL(zs_unmap_object);
>
>  u64 zs_get_total_size_bytes(struct zs_pool *pool)
>  {
> -       int i;
> -       u64 npages = 0;
> -
> -       for (i = 0; i < ZS_SIZE_CLASSES; i++)
> -               npages += pool->size_class[i].pages_allocated;
> +       u64 npages;
>
> +       spin_lock(&pool->stat_lock);
> +       npages = pool->pages_allocated;
> +       spin_unlock(&pool->stat_lock);
>         return npages << PAGE_SHIFT;
>  }
>  EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
> --
> 2.0.0
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 1/3] zsmalloc: move pages_allocated to zs_pool
  2014-08-13 13:59     ` Dan Streetman
@ 2014-08-13 14:14       ` Sergey Senozhatsky
  -1 siblings, 0 replies; 54+ messages in thread
From: Sergey Senozhatsky @ 2014-08-13 14:14 UTC (permalink / raw)
  To: Dan Streetman
  Cc: Minchan Kim, Linux-MM, linux-kernel, Sergey Senozhatsky,
	Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
	Nitin Gupta

On (08/13/14 09:59), Dan Streetman wrote:
> On Tue, Aug 5, 2014 at 4:02 AM, Minchan Kim <minchan@kernel.org> wrote:
> > pages_allocated is currently counted per size_class, so when the user
> > wants to see total_size_bytes, the values from every size_class have
> > to be gathered to report the sum.
> >
> > That's fine if the value is read rarely, but once the user starts
> > reading it frequently, it is not a good deal from a performance POV.
> >
> > This patch moves the variable from size_class to zs_pool, which
> > reduces the memory footprint (from [255 * 8byte] to
> > [sizeof(unsigned long)]) at the cost of new locking overhead. That
> > shouldn't be severe because it's not a hot path in zs_malloc (i.e.,
> > it is taken only when a new zspage is created, not per object).
> 
> Would using an atomic64_t without locking be simpler?

it would be racy.

	-ss

> 
> >
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  mm/zsmalloc.c | 30 ++++++++++++++++--------------
> >  1 file changed, 16 insertions(+), 14 deletions(-)
> >
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index fe78189624cf..a6089bd26621 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -198,9 +198,6 @@ struct size_class {
> >
> >         spinlock_t lock;
> >
> > -       /* stats */
> > -       u64 pages_allocated;
> > -
> >         struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
> >  };
> >
> > @@ -216,9 +213,12 @@ struct link_free {
> >  };
> >
> >  struct zs_pool {
> > +       spinlock_t stat_lock;
> > +
> >         struct size_class size_class[ZS_SIZE_CLASSES];
> >
> >         gfp_t flags;    /* allocation flags used when growing pool */
> > +       unsigned long pages_allocated;
> >  };
> >
> >  /*
> > @@ -882,6 +882,7 @@ struct zs_pool *zs_create_pool(gfp_t flags)
> >
> >         }
> >
> > +       spin_lock_init(&pool->stat_lock);
> >         pool->flags = flags;
> >
> >         return pool;
> > @@ -943,8 +944,10 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
> >                         return 0;
> >
> >                 set_zspage_mapping(first_page, class->index, ZS_EMPTY);
> > +               spin_lock(&pool->stat_lock);
> > +               pool->pages_allocated += class->pages_per_zspage;
> > +               spin_unlock(&pool->stat_lock);
> >                 spin_lock(&class->lock);
> > -               class->pages_allocated += class->pages_per_zspage;
> >         }
> >
> >         obj = (unsigned long)first_page->freelist;
> > @@ -997,14 +1000,14 @@ void zs_free(struct zs_pool *pool, unsigned long obj)
> >
> >         first_page->inuse--;
> >         fullness = fix_fullness_group(pool, first_page);
> > -
> > -       if (fullness == ZS_EMPTY)
> > -               class->pages_allocated -= class->pages_per_zspage;
> > -
> >         spin_unlock(&class->lock);
> >
> > -       if (fullness == ZS_EMPTY)
> > +       if (fullness == ZS_EMPTY) {
> > +               spin_lock(&pool->stat_lock);
> > +               pool->pages_allocated -= class->pages_per_zspage;
> > +               spin_unlock(&pool->stat_lock);
> >                 free_zspage(first_page);
> > +       }
> >  }
> >  EXPORT_SYMBOL_GPL(zs_free);
> >
> > @@ -1100,12 +1103,11 @@ EXPORT_SYMBOL_GPL(zs_unmap_object);
> >
> >  u64 zs_get_total_size_bytes(struct zs_pool *pool)
> >  {
> > -       int i;
> > -       u64 npages = 0;
> > -
> > -       for (i = 0; i < ZS_SIZE_CLASSES; i++)
> > -               npages += pool->size_class[i].pages_allocated;
> > +       u64 npages;
> >
> > +       spin_lock(&pool->stat_lock);
> > +       npages = pool->pages_allocated;
> > +       spin_unlock(&pool->stat_lock);
> >         return npages << PAGE_SHIFT;
> >  }
> >  EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
> > --
> > 2.0.0
> >
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 1/3] zsmalloc: move pages_allocated to zs_pool
  2014-08-13 14:14       ` Sergey Senozhatsky
@ 2014-08-13 14:51         ` Dan Streetman
  -1 siblings, 0 replies; 54+ messages in thread
From: Dan Streetman @ 2014-08-13 14:51 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Minchan Kim, Linux-MM, linux-kernel, Jerome Marchand, juno.choi,
	seungho1.park, Luigi Semenzato, Nitin Gupta

On Wed, Aug 13, 2014 at 10:14 AM, Sergey Senozhatsky
<sergey.senozhatsky@gmail.com> wrote:
> On (08/13/14 09:59), Dan Streetman wrote:
>> On Tue, Aug 5, 2014 at 4:02 AM, Minchan Kim <minchan@kernel.org> wrote:
>> > pages_allocated is currently counted per size_class, so when the user
>> > wants to see total_size_bytes, the values from every size_class have
>> > to be gathered to report the sum.
>> >
>> > That's fine if the value is read rarely, but once the user starts
>> > reading it frequently, it is not a good deal from a performance POV.
>> >
>> > This patch moves the variable from size_class to zs_pool, which
>> > reduces the memory footprint (from [255 * 8byte] to
>> > [sizeof(unsigned long)]) at the cost of new locking overhead. That
>> > shouldn't be severe because it's not a hot path in zs_malloc (i.e.,
>> > it is taken only when a new zspage is created, not per object).
>>
>> Would using an atomic64_t without locking be simpler?
>
> it would be racy.

oh.  atomic operations aren't smp safe?  is that because other
processors might use a stale value, and barriers must be added?  I
guess I don't quite understand the value of atomic then. :-/

>
>         -ss
>
>>
>> >
>> > Signed-off-by: Minchan Kim <minchan@kernel.org>
>> > ---
>> >  mm/zsmalloc.c | 30 ++++++++++++++++--------------
>> >  1 file changed, 16 insertions(+), 14 deletions(-)
>> >
>> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
>> > index fe78189624cf..a6089bd26621 100644
>> > --- a/mm/zsmalloc.c
>> > +++ b/mm/zsmalloc.c
>> > @@ -198,9 +198,6 @@ struct size_class {
>> >
>> >         spinlock_t lock;
>> >
>> > -       /* stats */
>> > -       u64 pages_allocated;
>> > -
>> >         struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
>> >  };
>> >
>> > @@ -216,9 +213,12 @@ struct link_free {
>> >  };
>> >
>> >  struct zs_pool {
>> > +       spinlock_t stat_lock;
>> > +
>> >         struct size_class size_class[ZS_SIZE_CLASSES];
>> >
>> >         gfp_t flags;    /* allocation flags used when growing pool */
>> > +       unsigned long pages_allocated;
>> >  };
>> >
>> >  /*
>> > @@ -882,6 +882,7 @@ struct zs_pool *zs_create_pool(gfp_t flags)
>> >
>> >         }
>> >
>> > +       spin_lock_init(&pool->stat_lock);
>> >         pool->flags = flags;
>> >
>> >         return pool;
>> > @@ -943,8 +944,10 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
>> >                         return 0;
>> >
>> >                 set_zspage_mapping(first_page, class->index, ZS_EMPTY);
>> > +               spin_lock(&pool->stat_lock);
>> > +               pool->pages_allocated += class->pages_per_zspage;
>> > +               spin_unlock(&pool->stat_lock);
>> >                 spin_lock(&class->lock);
>> > -               class->pages_allocated += class->pages_per_zspage;
>> >         }
>> >
>> >         obj = (unsigned long)first_page->freelist;
>> > @@ -997,14 +1000,14 @@ void zs_free(struct zs_pool *pool, unsigned long obj)
>> >
>> >         first_page->inuse--;
>> >         fullness = fix_fullness_group(pool, first_page);
>> > -
>> > -       if (fullness == ZS_EMPTY)
>> > -               class->pages_allocated -= class->pages_per_zspage;
>> > -
>> >         spin_unlock(&class->lock);
>> >
>> > -       if (fullness == ZS_EMPTY)
>> > +       if (fullness == ZS_EMPTY) {
>> > +               spin_lock(&pool->stat_lock);
>> > +               pool->pages_allocated -= class->pages_per_zspage;
>> > +               spin_unlock(&pool->stat_lock);
>> >                 free_zspage(first_page);
>> > +       }
>> >  }
>> >  EXPORT_SYMBOL_GPL(zs_free);
>> >
>> > @@ -1100,12 +1103,11 @@ EXPORT_SYMBOL_GPL(zs_unmap_object);
>> >
>> >  u64 zs_get_total_size_bytes(struct zs_pool *pool)
>> >  {
>> > -       int i;
>> > -       u64 npages = 0;
>> > -
>> > -       for (i = 0; i < ZS_SIZE_CLASSES; i++)
>> > -               npages += pool->size_class[i].pages_allocated;
>> > +       u64 npages;
>> >
>> > +       spin_lock(&pool->stat_lock);
>> > +       npages = pool->pages_allocated;
>> > +       spin_unlock(&pool->stat_lock);
>> >         return npages << PAGE_SHIFT;
>> >  }
>> >  EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
>> > --
>> > 2.0.0
>> >
>>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 1/3] zsmalloc: move pages_allocated to zs_pool
  2014-08-13 14:51         ` Dan Streetman
@ 2014-08-13 15:13           ` Sergey Senozhatsky
  -1 siblings, 0 replies; 54+ messages in thread
From: Sergey Senozhatsky @ 2014-08-13 15:13 UTC (permalink / raw)
  To: Dan Streetman
  Cc: Sergey Senozhatsky, Minchan Kim, Linux-MM, linux-kernel,
	Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
	Nitin Gupta

On (08/13/14 10:51), Dan Streetman wrote:
> Date: Wed, 13 Aug 2014 10:51:40 -0400
> From: Dan Streetman <ddstreet@ieee.org>
> To: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Cc: Minchan Kim <minchan@kernel.org>, Linux-MM <linux-mm@kvack.org>,
>  linux-kernel <linux-kernel@vger.kernel.org>, Jerome Marchand
>  <jmarchan@redhat.com>, juno.choi@lge.com, seungho1.park@lge.com, Luigi
>  Semenzato <semenzato@google.com>, Nitin Gupta <ngupta@vflare.org>
> Subject: Re: [RFC 1/3] zsmalloc: move pages_allocated to zs_pool
> 
> On Wed, Aug 13, 2014 at 10:14 AM, Sergey Senozhatsky
> <sergey.senozhatsky@gmail.com> wrote:
> > On (08/13/14 09:59), Dan Streetman wrote:
> >> On Tue, Aug 5, 2014 at 4:02 AM, Minchan Kim <minchan@kernel.org> wrote:
> >> > pages_allocated is currently counted per size_class, so when the user
> >> > wants to see total_size_bytes, the values from every size_class have
> >> > to be gathered to report the sum.
> >> >
> >> > That's fine if the value is read rarely, but once the user starts
> >> > reading it frequently, it is not a good deal from a performance POV.
> >> >
> >> > This patch moves the variable from size_class to zs_pool, which
> >> > reduces the memory footprint (from [255 * 8byte] to
> >> > [sizeof(unsigned long)]) at the cost of new locking overhead. That
> >> > shouldn't be severe because it's not a hot path in zs_malloc (i.e.,
> >> > it is taken only when a new zspage is created, not per object).
> >>
> >> Would using an atomic64_t without locking be simpler?
> >
> > it would be racy.
> 
> oh.  atomic operations aren't smp safe?  is that because other
> processors might use a stale value, and barriers must be added?  I
> guess I don't quite understand the value of atomic then. :-/

the pool code not only sets the value, it also reads it and makes
decisions based on that value:

	pages_allocated += X
	if (pages_allocated >= max_pages_allocated)
		return 0;
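
To make the race concrete: even with an atomic64_t counter, the limit
check and the update are two separate steps, so two CPUs can both pass
the check and overshoot the limit together. Closing that window without
a lock takes a cmpxchg loop, roughly like this hypothetical sketch (not
from the posted patches; zs_charge_pages is an invented name and
<linux/atomic.h> is assumed):

	static int zs_charge_pages(atomic64_t *pages_allocated,
				   s64 nr, s64 limit)
	{
		s64 old, new;

		do {
			old = atomic64_read(pages_allocated);
			new = old + nr;
			/* would exceed the limit: fail without charging */
			if (limit && new > limit)
				return -ENOMEM;
		} while (atomic64_cmpxchg(pages_allocated, old, new) != old);

		return 0;
	}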

	-ss

> >>
> >> >
> >> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> >> > ---
> >> >  mm/zsmalloc.c | 30 ++++++++++++++++--------------
> >> >  1 file changed, 16 insertions(+), 14 deletions(-)
> >> >
> >> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> >> > index fe78189624cf..a6089bd26621 100644
> >> > --- a/mm/zsmalloc.c
> >> > +++ b/mm/zsmalloc.c
> >> > @@ -198,9 +198,6 @@ struct size_class {
> >> >
> >> >         spinlock_t lock;
> >> >
> >> > -       /* stats */
> >> > -       u64 pages_allocated;
> >> > -
> >> >         struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
> >> >  };
> >> >
> >> > @@ -216,9 +213,12 @@ struct link_free {
> >> >  };
> >> >
> >> >  struct zs_pool {
> >> > +       spinlock_t stat_lock;
> >> > +
> >> >         struct size_class size_class[ZS_SIZE_CLASSES];
> >> >
> >> >         gfp_t flags;    /* allocation flags used when growing pool */
> >> > +       unsigned long pages_allocated;
> >> >  };
> >> >
> >> >  /*
> >> > @@ -882,6 +882,7 @@ struct zs_pool *zs_create_pool(gfp_t flags)
> >> >
> >> >         }
> >> >
> >> > +       spin_lock_init(&pool->stat_lock);
> >> >         pool->flags = flags;
> >> >
> >> >         return pool;
> >> > @@ -943,8 +944,10 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
> >> >                         return 0;
> >> >
> >> >                 set_zspage_mapping(first_page, class->index, ZS_EMPTY);
> >> > +               spin_lock(&pool->stat_lock);
> >> > +               pool->pages_allocated += class->pages_per_zspage;
> >> > +               spin_unlock(&pool->stat_lock);
> >> >                 spin_lock(&class->lock);
> >> > -               class->pages_allocated += class->pages_per_zspage;
> >> >         }
> >> >
> >> >         obj = (unsigned long)first_page->freelist;
> >> > @@ -997,14 +1000,14 @@ void zs_free(struct zs_pool *pool, unsigned long obj)
> >> >
> >> >         first_page->inuse--;
> >> >         fullness = fix_fullness_group(pool, first_page);
> >> > -
> >> > -       if (fullness == ZS_EMPTY)
> >> > -               class->pages_allocated -= class->pages_per_zspage;
> >> > -
> >> >         spin_unlock(&class->lock);
> >> >
> >> > -       if (fullness == ZS_EMPTY)
> >> > +       if (fullness == ZS_EMPTY) {
> >> > +               spin_lock(&pool->stat_lock);
> >> > +               pool->pages_allocated -= class->pages_per_zspage;
> >> > +               spin_unlock(&pool->stat_lock);
> >> >                 free_zspage(first_page);
> >> > +       }
> >> >  }
> >> >  EXPORT_SYMBOL_GPL(zs_free);
> >> >
> >> > @@ -1100,12 +1103,11 @@ EXPORT_SYMBOL_GPL(zs_unmap_object);
> >> >
> >> >  u64 zs_get_total_size_bytes(struct zs_pool *pool)
> >> >  {
> >> > -       int i;
> >> > -       u64 npages = 0;
> >> > -
> >> > -       for (i = 0; i < ZS_SIZE_CLASSES; i++)
> >> > -               npages += pool->size_class[i].pages_allocated;
> >> > +       u64 npages;
> >> >
> >> > +       spin_lock(&pool->stat_lock);
> >> > +       npages = pool->pages_allocated;
> >> > +       spin_unlock(&pool->stat_lock);
> >> >         return npages << PAGE_SHIFT;
> >> >  }
> >> >  EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
> >> > --
> >> > 2.0.0
> >> >
> >>
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 1/3] zsmalloc: move pages_allocated to zs_pool
  2014-08-05  8:02   ` Minchan Kim
@ 2014-08-13 15:21     ` Seth Jennings
  -1 siblings, 0 replies; 54+ messages in thread
From: Seth Jennings @ 2014-08-13 15:21 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand,
	juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta

On Tue, Aug 05, 2014 at 05:02:01PM +0900, Minchan Kim wrote:
> pages_allocated is currently counted per size_class, so when the user
> wants to see total_size_bytes, the values from every size_class have
> to be gathered to report the sum.
>
> That's fine if the value is read rarely, but once the user starts
> reading it frequently, it is not a good deal from a performance POV.
>
> This patch moves the variable from size_class to zs_pool, which
> reduces the memory footprint (from [255 * 8byte] to
> [sizeof(unsigned long)]) at the cost of new locking overhead. That
> shouldn't be severe because it's not a hot path in zs_malloc (i.e.,
> it is taken only when a new zspage is created, not per object).
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  mm/zsmalloc.c | 30 ++++++++++++++++--------------
>  1 file changed, 16 insertions(+), 14 deletions(-)
> 
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index fe78189624cf..a6089bd26621 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -198,9 +198,6 @@ struct size_class {
>  
>  	spinlock_t lock;
>  
> -	/* stats */
> -	u64 pages_allocated;
> -
>  	struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
>  };
>  
> @@ -216,9 +213,12 @@ struct link_free {
>  };
>  
>  struct zs_pool {
> +	spinlock_t stat_lock;
> +
>  	struct size_class size_class[ZS_SIZE_CLASSES];
>  
>  	gfp_t flags;	/* allocation flags used when growing pool */
> +	unsigned long pages_allocated;

As Dan was saying, I think this can be atomic to avoid the locking.

Seth

>  };
>  
>  /*
> @@ -882,6 +882,7 @@ struct zs_pool *zs_create_pool(gfp_t flags)
>  
>  	}
>  
> +	spin_lock_init(&pool->stat_lock);
>  	pool->flags = flags;
>  
>  	return pool;
> @@ -943,8 +944,10 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
>  			return 0;
>  
>  		set_zspage_mapping(first_page, class->index, ZS_EMPTY);
> +		spin_lock(&pool->stat_lock);
> +		pool->pages_allocated += class->pages_per_zspage;
> +		spin_unlock(&pool->stat_lock);
>  		spin_lock(&class->lock);
> -		class->pages_allocated += class->pages_per_zspage;
>  	}
>  
>  	obj = (unsigned long)first_page->freelist;
> @@ -997,14 +1000,14 @@ void zs_free(struct zs_pool *pool, unsigned long obj)
>  
>  	first_page->inuse--;
>  	fullness = fix_fullness_group(pool, first_page);
> -
> -	if (fullness == ZS_EMPTY)
> -		class->pages_allocated -= class->pages_per_zspage;
> -
>  	spin_unlock(&class->lock);
>  
> -	if (fullness == ZS_EMPTY)
> +	if (fullness == ZS_EMPTY) {
> +		spin_lock(&pool->stat_lock);
> +		pool->pages_allocated -= class->pages_per_zspage;
> +		spin_unlock(&pool->stat_lock);
>  		free_zspage(first_page);
> +	}
>  }
>  EXPORT_SYMBOL_GPL(zs_free);
>  
> @@ -1100,12 +1103,11 @@ EXPORT_SYMBOL_GPL(zs_unmap_object);
>  
>  u64 zs_get_total_size_bytes(struct zs_pool *pool)
>  {
> -	int i;
> -	u64 npages = 0;
> -
> -	for (i = 0; i < ZS_SIZE_CLASSES; i++)
> -		npages += pool->size_class[i].pages_allocated;
> +	u64 npages;
>  
> +	spin_lock(&pool->stat_lock);
> +	npages = pool->pages_allocated;
> +	spin_unlock(&pool->stat_lock);
>  	return npages << PAGE_SHIFT;
>  }
>  EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
> -- 
> 2.0.0
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 2/3] zsmalloc/zram: add zs_get_max_size_bytes and use it in zram
  2014-08-05  8:02   ` Minchan Kim
@ 2014-08-13 15:25     ` Seth Jennings
  -1 siblings, 0 replies; 54+ messages in thread
From: Seth Jennings @ 2014-08-13 15:25 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand,
	juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta

On Tue, Aug 05, 2014 at 05:02:02PM +0900, Minchan Kim wrote:
> Normally, a zram user can estimate the maximum memory zsmalloc has
> consumed by polling mem_used_total via sysfs from userspace.
> 
> But that has a critical problem: the user can miss a peak in memory
> usage that falls between two polls, so the gap between the observed
> and the real maximum can be huge when memory pressure is really heavy.
> 
> This patch adds a new API, zs_get_max_size_bytes, to zsmalloc so the
> user (e.g. zram) doesn't need to poll at short intervals to get an
> exact value.
> 
> The user can simply read the maximum memory usage once the test
> workload is done. It's handy and accurate.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  Documentation/blockdev/zram.txt |  1 +
>  drivers/block/zram/zram_drv.c   | 17 +++++++++++++++++
>  include/linux/zsmalloc.h        |  1 +
>  mm/zsmalloc.c                   | 20 ++++++++++++++++++++
>  4 files changed, 39 insertions(+)
> 
> diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
> index 0595c3f56ccf..d24534bee763 100644
> --- a/Documentation/blockdev/zram.txt
> +++ b/Documentation/blockdev/zram.txt
> @@ -95,6 +95,7 @@ size of the disk when not in use so a huge zram is wasteful.
>  		orig_data_size
>  		compr_data_size
>  		mem_used_total
> +		mem_used_max
>  
>  7) Deactivate:
>  	swapoff /dev/zram0
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 36e54be402df..a4d637b4db7d 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -109,6 +109,21 @@ static ssize_t mem_used_total_show(struct device *dev,
>  	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
>  }
>  
> +static ssize_t mem_used_max_show(struct device *dev,
> +		struct device_attribute *attr, char *buf)
> +{
> +	u64 val = 0;
> +	struct zram *zram = dev_to_zram(dev);
> +	struct zram_meta *meta = zram->meta;
> +
> +	down_read(&zram->init_lock);
> +	if (init_done(zram))
> +		val = zs_get_max_size_bytes(meta->mem_pool);
> +	up_read(&zram->init_lock);
> +
> +	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
> +}
> +
>  static ssize_t max_comp_streams_show(struct device *dev,
>  		struct device_attribute *attr, char *buf)
>  {
> @@ -838,6 +853,7 @@ static DEVICE_ATTR(initstate, S_IRUGO, initstate_show, NULL);
>  static DEVICE_ATTR(reset, S_IWUSR, NULL, reset_store);
>  static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
>  static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
> +static DEVICE_ATTR(mem_used_max, S_IRUGO, mem_used_max_show, NULL);
>  static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
>  		max_comp_streams_show, max_comp_streams_store);
>  static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
> @@ -866,6 +882,7 @@ static struct attribute *zram_disk_attrs[] = {
>  	&dev_attr_orig_data_size.attr,
>  	&dev_attr_compr_data_size.attr,
>  	&dev_attr_mem_used_total.attr,
> +	&dev_attr_mem_used_max.attr,
>  	&dev_attr_max_comp_streams.attr,
>  	&dev_attr_comp_algorithm.attr,
>  	NULL,
> diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
> index e44d634e7fb7..fb087ca06a88 100644
> --- a/include/linux/zsmalloc.h
> +++ b/include/linux/zsmalloc.h
> @@ -47,5 +47,6 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
>  void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
>  
>  u64 zs_get_total_size_bytes(struct zs_pool *pool);
> +u64 zs_get_max_size_bytes(struct zs_pool *pool);
>  
>  #endif
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index a6089bd26621..3b5be076268a 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -219,6 +219,7 @@ struct zs_pool {
>  
>  	gfp_t flags;	/* allocation flags used when growing pool */
>  	unsigned long pages_allocated;
> +	unsigned long max_pages_allocated;

Same here with atomic.

Seth
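
A max, too, can be kept without a lock via the usual cmpxchg pattern.
A hypothetical sketch, not from the posted patches (zs_update_max is
an invented name and <linux/atomic.h> is assumed):

	static void zs_update_max(atomic_long_t *max, long cur)
	{
		long old = atomic_long_read(max);

		/* raise max to cur unless another CPU stored a larger value */
		while (cur > old) {
			long seen = atomic_long_cmpxchg(max, old, cur);
			if (seen == old)
				break;
			old = seen;
		}
	}

The caller would run this right after updating pages_allocated,
passing the freshly charged page count as cur.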

>  };
>  
>  /*
> @@ -946,6 +947,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
>  		set_zspage_mapping(first_page, class->index, ZS_EMPTY);
>  		spin_lock(&pool->stat_lock);
>  		pool->pages_allocated += class->pages_per_zspage;
> +		if (pool->max_pages_allocated < pool->pages_allocated)
> +			pool->max_pages_allocated = pool->pages_allocated;
>  		spin_unlock(&pool->stat_lock);
>  		spin_lock(&class->lock);
>  	}
> @@ -1101,6 +1104,9 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
>  }
>  EXPORT_SYMBOL_GPL(zs_unmap_object);
>  
> +/*
> + * Reports current memory usage consumed by zs_malloc
> + */
>  u64 zs_get_total_size_bytes(struct zs_pool *pool)
>  {
>  	u64 npages;
> @@ -1112,6 +1118,20 @@ u64 zs_get_total_size_bytes(struct zs_pool *pool)
>  }
>  EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
>  
> +/*
> + * Reports maximum memory usage zs_malloc have consumed
> + */
> +u64 zs_get_max_size_bytes(struct zs_pool *pool)
> +{
> +	u64 npages;
> +
> +	spin_lock(&pool->stat_lock);
> +	npages = pool->max_pages_allocated;
> +	spin_unlock(&pool->stat_lock);
> +	return npages << PAGE_SHIFT;
> +}
> +EXPORT_SYMBOL_GPL(zs_get_max_size_bytes);
> +
>  module_init(zs_init);
>  module_exit(zs_exit);
>  
> -- 
> 2.0.0
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 1/3] zsmalloc: move pages_allocated to zs_pool
  2014-08-13 15:13           ` Sergey Senozhatsky
@ 2014-08-13 15:25             ` Sergey Senozhatsky
  -1 siblings, 0 replies; 54+ messages in thread
From: Sergey Senozhatsky @ 2014-08-13 15:25 UTC (permalink / raw)
  To: Dan Streetman
  Cc: Minchan Kim, Linux-MM, Sergey Senozhatsky, linux-kernel,
	Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
	Nitin Gupta

On (08/14/14 00:13), Sergey Senozhatsky wrote:
> > On Wed, Aug 13, 2014 at 10:14 AM, Sergey Senozhatsky
> > <sergey.senozhatsky@gmail.com> wrote:
> > > On (08/13/14 09:59), Dan Streetman wrote:
> > >> On Tue, Aug 5, 2014 at 4:02 AM, Minchan Kim <minchan@kernel.org> wrote:
> > >> > pages_allocated is counted per size_class, and when the user
> > >> > wants to see total_size_bytes, zsmalloc walks every size_class
> > >> > to report the sum.
> > >> >
> > >> > That's not bad if the user reads the value rarely, but if the
> > >> > user starts reading it frequently, it is not a good deal from a
> > >> > performance POV.
> > >> >
> > >> > This patch moves the variable from size_class to zs_pool, which
> > >> > reduces the memory footprint (from [255 * 8 bytes] to
> > >> > [sizeof(atomic_t)]). It adds new locking overhead, but that
> > >> > shouldn't be severe because it's not on a hot path in zs_malloc
> > >> > (i.e., the lock is taken only when a new zspage is created, not
> > >> > per object).
> > >>
> > >> Would using an atomic64_t without locking be simpler?
> > >
> > > it would be racy.
> > 
> > oh.  atomic operations aren't smp safe?  is that because other
> > processors might use a stale value, and barriers must be added?  I
> > guess I don't quite understand the value of atomic then. :-/
> 
> the pool code not only sets the value, it also reads it and makes
> decisions based on that value:
> 
> 	pages_allocated += X
> 	if (pages_allocated >= max_pages_allocated)
> 		return 0;


I mean, suppose this happens on two CPUs.

max_pages_allocated is 10 and the current pages_allocated is 8. Now two
zs_malloc() calls happen, one on each CPU, and each does
`pages_allocated += 1'. The problem is that both can then see 10 at
`if (pages_allocated >= max_pages_allocated)', so we fail two
operations when we were only supposed to fail one.
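
To make that concrete, a minimal sketch of the lockless variant being
discussed (illustrative only; this is not code from the series, and the
field names simply follow the example above):

	/* both CPUs run this with pages_allocated == 8 and a limit of 10 */
	static int try_account_zspage(struct zs_pool *pool, int nr_pages)
	{
		/* each step is atomic, but the pair is not atomic together */
		atomic64_add(nr_pages, &pool->pages_allocated);

		if (atomic64_read(&pool->pages_allocated) >=
				pool->max_pages_allocated) {
			/* both racing CPUs can read 10 here and both back out */
			atomic64_sub(nr_pages, &pool->pages_allocated);
			return -ENOMEM;
		}
		return 0;
	}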

	-ss

> 
> > >>
> > >> >
> > >> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > >> > ---
> > >> >  mm/zsmalloc.c | 30 ++++++++++++++++--------------
> > >> >  1 file changed, 16 insertions(+), 14 deletions(-)
> > >> >
> > >> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > >> > index fe78189624cf..a6089bd26621 100644
> > >> > --- a/mm/zsmalloc.c
> > >> > +++ b/mm/zsmalloc.c
> > >> > @@ -198,9 +198,6 @@ struct size_class {
> > >> >
> > >> >         spinlock_t lock;
> > >> >
> > >> > -       /* stats */
> > >> > -       u64 pages_allocated;
> > >> > -
> > >> >         struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
> > >> >  };
> > >> >
> > >> > @@ -216,9 +213,12 @@ struct link_free {
> > >> >  };
> > >> >
> > >> >  struct zs_pool {
> > >> > +       spinlock_t stat_lock;
> > >> > +
> > >> >         struct size_class size_class[ZS_SIZE_CLASSES];
> > >> >
> > >> >         gfp_t flags;    /* allocation flags used when growing pool */
> > >> > +       unsigned long pages_allocated;
> > >> >  };
> > >> >
> > >> >  /*
> > >> > @@ -882,6 +882,7 @@ struct zs_pool *zs_create_pool(gfp_t flags)
> > >> >
> > >> >         }
> > >> >
> > >> > +       spin_lock_init(&pool->stat_lock);
> > >> >         pool->flags = flags;
> > >> >
> > >> >         return pool;
> > >> > @@ -943,8 +944,10 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
> > >> >                         return 0;
> > >> >
> > >> >                 set_zspage_mapping(first_page, class->index, ZS_EMPTY);
> > >> > +               spin_lock(&pool->stat_lock);
> > >> > +               pool->pages_allocated += class->pages_per_zspage;
> > >> > +               spin_unlock(&pool->stat_lock);
> > >> >                 spin_lock(&class->lock);
> > >> > -               class->pages_allocated += class->pages_per_zspage;
> > >> >         }
> > >> >
> > >> >         obj = (unsigned long)first_page->freelist;
> > >> > @@ -997,14 +1000,14 @@ void zs_free(struct zs_pool *pool, unsigned long obj)
> > >> >
> > >> >         first_page->inuse--;
> > >> >         fullness = fix_fullness_group(pool, first_page);
> > >> > -
> > >> > -       if (fullness == ZS_EMPTY)
> > >> > -               class->pages_allocated -= class->pages_per_zspage;
> > >> > -
> > >> >         spin_unlock(&class->lock);
> > >> >
> > >> > -       if (fullness == ZS_EMPTY)
> > >> > +       if (fullness == ZS_EMPTY) {
> > >> > +               spin_lock(&pool->stat_lock);
> > >> > +               pool->pages_allocated -= class->pages_per_zspage;
> > >> > +               spin_unlock(&pool->stat_lock);
> > >> >                 free_zspage(first_page);
> > >> > +       }
> > >> >  }
> > >> >  EXPORT_SYMBOL_GPL(zs_free);
> > >> >
> > >> > @@ -1100,12 +1103,11 @@ EXPORT_SYMBOL_GPL(zs_unmap_object);
> > >> >
> > >> >  u64 zs_get_total_size_bytes(struct zs_pool *pool)
> > >> >  {
> > >> > -       int i;
> > >> > -       u64 npages = 0;
> > >> > -
> > >> > -       for (i = 0; i < ZS_SIZE_CLASSES; i++)
> > >> > -               npages += pool->size_class[i].pages_allocated;
> > >> > +       u64 npages;
> > >> >
> > >> > +       spin_lock(&pool->stat_lock);
> > >> > +       npages = pool->pages_allocated;
> > >> > +       spin_unlock(&pool->stat_lock);
> > >> >         return npages << PAGE_SHIFT;
> > >> >  }
> > >> >  EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
> > >> > --
> > >> > 2.0.0
> > >> >
> > >>
> > 
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 3/3] zram: limit memory size for zram
  2014-08-06  6:52         ` Minchan Kim
@ 2014-08-13 15:30           ` Seth Jennings
  -1 siblings, 0 replies; 54+ messages in thread
From: Seth Jennings @ 2014-08-13 15:30 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, linux-mm, Jerome Marchand, linux-kernel,
	juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta

On Wed, Aug 06, 2014 at 03:52:53PM +0900, Minchan Kim wrote:
> On Tue, Aug 05, 2014 at 10:16:15PM +0900, Sergey Senozhatsky wrote:
> > Hello,
> > 
> > On (08/05/14 18:48), Minchan Kim wrote:
> > > Another idea: we could define void zs_limit_mem(unsigned long nr_pages)
> > > in zsmalloc and have zram put the limit in zs_pool via the new API, so
> > > that zs_malloc could fail as soon as it exceeds the limit.
> > > 
> > > In the end, zram doesn't need to call zs_get_total_size_bytes on every
> > > write. It's cleaner and the right layer, IMHO.
> > 
> > yes, I think this one is better.
> > 
> > 	-ss
> 
> From 279c406b5a8eabd03edca55490ec92b539b39c76 Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minchan@kernel.org>
> Date: Tue, 5 Aug 2014 16:24:57 +0900
> Subject: [PATCH] zram: limit memory size for zram
> 
> I have received this request several times from zram users.
> They want to limit the memory size of zram because, without a limit,
> zram can consume a lot of system memory, which makes memory
> management control hard.
> 
> This patch adds a new knob to limit the memory usage of zram.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  Documentation/blockdev/zram.txt |  1 +
>  drivers/block/zram/zram_drv.c   | 39 +++++++++++++++++++++++++++++++++++++--
>  include/linux/zsmalloc.h        |  2 ++
>  mm/zsmalloc.c                   | 24 ++++++++++++++++++++++++
>  4 files changed, 64 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
> index d24534bee763..fcb0561dfe2e 100644
> --- a/Documentation/blockdev/zram.txt
> +++ b/Documentation/blockdev/zram.txt
> @@ -96,6 +96,7 @@ size of the disk when not in use so a huge zram is wasteful.
>  		compr_data_size
>  		mem_used_total
>  		mem_used_max
> +		mem_limit
>  
>  7) Deactivate:
>  	swapoff /dev/zram0
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index a4d637b4db7d..069e81ef0c17 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -137,6 +137,41 @@ static ssize_t max_comp_streams_show(struct device *dev,
>  	return scnprintf(buf, PAGE_SIZE, "%d\n", val);
>  }
>  
> +static ssize_t mem_limit_show(struct device *dev,
> +		struct device_attribute *attr, char *buf)
> +{
> +	u64 val = 0;
> +	struct zram *zram = dev_to_zram(dev);
> +	struct zram_meta *meta = zram->meta;
> +
> +	down_read(&zram->init_lock);
> +	if (init_done(zram))
> +		val = zs_get_limit_size_bytes(meta->mem_pool);
> +	up_read(&zram->init_lock);
> +
> +	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
> +}
> +
> +static ssize_t mem_limit_store(struct device *dev,
> +		struct device_attribute *attr, const char *buf, size_t len)
> +{
> +	int ret;
> +	u64 limit;
> +	struct zram *zram = dev_to_zram(dev);
> +	struct zram_meta *meta = zram->meta;
> +
> +	ret = kstrtoull(buf, 0, &limit);
> +	if (ret < 0)
> +		return ret;
> +
> +	down_write(&zram->init_lock);
> +	if (init_done(zram))
> +		zs_set_limit_size_bytes(meta->mem_pool, limit);
> +	up_write(&zram->init_lock);
> +	ret = len;
> +	return ret;
> +}
> +
>  static ssize_t max_comp_streams_store(struct device *dev,
>  		struct device_attribute *attr, const char *buf, size_t len)
>  {
> @@ -506,8 +541,6 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>  
>  	handle = zs_malloc(meta->mem_pool, clen);
>  	if (!handle) {
> -		pr_info("Error allocating memory for compressed page: %u, size=%zu\n",
> -			index, clen);
>  		ret = -ENOMEM;
>  		goto out;
>  	}
> @@ -854,6 +887,7 @@ static DEVICE_ATTR(reset, S_IWUSR, NULL, reset_store);
>  static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
>  static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
>  static DEVICE_ATTR(mem_used_max, S_IRUGO, mem_used_max_show, NULL);
> +static DEVICE_ATTR(mem_limit, S_IRUGO | S_IWUSR, mem_limit_show,
> +		mem_limit_store);
>  static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
>  		max_comp_streams_show, max_comp_streams_store);
>  static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
> @@ -883,6 +917,7 @@ static struct attribute *zram_disk_attrs[] = {
>  	&dev_attr_compr_data_size.attr,
>  	&dev_attr_mem_used_total.attr,
>  	&dev_attr_mem_used_max.attr,
> +	&dev_attr_mem_limit.attr,
>  	&dev_attr_max_comp_streams.attr,
>  	&dev_attr_comp_algorithm.attr,
>  	NULL,
> diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
> index fb087ca06a88..41122251a2d0 100644
> --- a/include/linux/zsmalloc.h
> +++ b/include/linux/zsmalloc.h
> @@ -49,4 +49,6 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
>  u64 zs_get_total_size_bytes(struct zs_pool *pool);
>  u64 zs_get_max_size_bytes(struct zs_pool *pool);
>  
> +u64 zs_get_limit_size_bytes(struct zs_pool *pool);
> +void zs_set_limit_size_bytes(struct zs_pool *pool, u64 limit);

While having a function to change the limit is fine, the setting of the
initial limit should be a parameter to zs_create_pool() since, if the
user doesn't call zs_set_limit_size_bytes() after zs_create_pool(), the
default limit is 0.
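
For example (hypothetical signature, not part of this series; 0 could
mean "no limit" so existing callers stay cheap):

	struct zs_pool *zs_create_pool(gfp_t flags, u64 limit_bytes);

	/* zram would then pass its limit when it creates the pool,
	 * roughly as it sets the pool up today */
	meta->mem_pool = zs_create_pool(GFP_NOIO | __GFP_HIGHMEM,
					limit_bytes);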

This also breaks zswap, which does its pool-size limiting in the zswap
layer by using zs_get_total_size_bytes() to poll the pool size.

It also has implications for the new zpool abstraction layer, which
doesn't have a hook for setting the pool limit.

Could you do what zswap does already and enforce the pool limit in the
zram code?
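
That is, keep zsmalloc unchanged and gate the store in zram itself;
roughly (a sketch assuming a zram->limit_bytes field; a variant of this
shows up later in the thread):

	/* zram_bvec_write(): refuse the store once the pool hits the cap */
	if (zram->limit_bytes &&
	    zs_get_total_size_bytes(meta->mem_pool) >= zram->limit_bytes) {
		ret = -ENOMEM;
		goto out;
	}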

Seth

>  #endif
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 3b5be076268a..8ca51118cf2b 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -220,6 +220,7 @@ struct zs_pool {
>  	gfp_t flags;	/* allocation flags used when growing pool */
>  	unsigned long pages_allocated;
>  	unsigned long max_pages_allocated;
> +	unsigned long pages_limited;
>  };
>  
>  /*
> @@ -940,6 +941,11 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
>  
>  	if (!first_page) {
>  		spin_unlock(&class->lock);
> +
> +		if (pool->pages_limited && (pool->pages_limited <
> +			pool->pages_allocated + class->pages_per_zspage))
> +			return 0;
> +
>  		first_page = alloc_zspage(class, pool->flags);
>  		if (unlikely(!first_page))
>  			return 0;
> @@ -1132,6 +1138,24 @@ u64 zs_get_max_size_bytes(struct zs_pool *pool)
>  }
>  EXPORT_SYMBOL_GPL(zs_get_max_size_bytes);
>  
> +void zs_set_limit_size_bytes(struct zs_pool *pool, u64 limit)
> +{
> +	pool->pages_limited = round_down(limit, PAGE_SIZE) >> PAGE_SHIFT;
> +}
> +EXPORT_SYMBOL_GPL(zs_set_limit_size_bytes);
> +
> +u64 zs_get_limit_size_bytes(struct zs_pool *pool)
> +{
> +	u64 npages;
> +
> +	spin_lock(&pool->stat_lock);
> +	npages = pool->pages_limited;
> +	spin_unlock(&pool->stat_lock);
> +	return npages << PAGE_SHIFT;
> +
> +}
> +EXPORT_SYMBOL_GPL(zs_get_limit_size_bytes);
> +
>  module_init(zs_init);
>  module_exit(zs_exit);
>  
> -- 
> 2.0.0
> 
> -- 
> Kind regards,
> Minchan Kim
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 0/3] zram memory control enhance
  2014-08-05  8:02 ` Minchan Kim
@ 2014-08-13 15:34   ` Seth Jennings
  -1 siblings, 0 replies; 54+ messages in thread
From: Seth Jennings @ 2014-08-13 15:34 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand,
	juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta

On Tue, Aug 05, 2014 at 05:02:00PM +0900, Minchan Kim wrote:
> Notice! It's RFC. I didn't test at all but wanted to hear opinion
> during merge window when it's really busy time for Andrew so we could
> use the slack time to discuss without hurting him. ;-)
> 
> Patch 1 is to move pages_allocated in zsmalloc from size_class to zs_pool
> so zs_get_total_size_bytes of zsmalloc would be faster than old.
> zs_get_total_size_bytes could be used next patches frequently.
> 
> Patch 2 adds new feature which exports how many of bytes zsmalloc consumes
> during testing workload. Normally, before fixing the zram's disksize
> we have tested various workload and wanted to how many of bytes zram
> consumed.
> For it, we could poll mem_used_total of zram in userspace but the problem is
> when memory pressure is severe and heavy swap out happens suddenly then
> heavy swapin or exist while polling interval of user space is a few second,
> it could miss max memory size zram had consumed easily.
> With lack of information, user can set wrong disksize of zram so the result
> is OOM. So this patch adds max_mem_used for zram and zsmalloc supports it
> 
> Patch 3 is to limit zram memory consumption. Now, zram has no bound for
> memory usage so it could consume up all of system memory. It makes system
> memory control for platform hard so I have heard the feature several time.
> 
> Feedback is welcome!

One thing you might consider doing is moving zram to use the new zpool
API.  That way, when making changes that affect the zsmalloc API,
consideration for zpool, and by extension for zpool users like zswap, is
also taken into account.

Seth

> 
> Minchan Kim (3):
>   zsmalloc: move pages_allocated to zs_pool
>   zsmalloc/zram: add zs_get_max_size_bytes and use it in zram
>   zram: limit memory size for zram
> 
>  Documentation/blockdev/zram.txt |  2 ++
>  drivers/block/zram/zram_drv.c   | 58 +++++++++++++++++++++++++++++++++++++++++
>  drivers/block/zram/zram_drv.h   |  1 +
>  include/linux/zsmalloc.h        |  1 +
>  mm/zsmalloc.c                   | 50 +++++++++++++++++++++++++----------
>  5 files changed, 98 insertions(+), 14 deletions(-)
> 
> -- 
> 2.0.0
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 1/3] zsmalloc: move pages_allocated to zs_pool
  2014-08-13 15:25             ` Sergey Senozhatsky
@ 2014-08-13 16:11               ` Dan Streetman
  -1 siblings, 0 replies; 54+ messages in thread
From: Dan Streetman @ 2014-08-13 16:11 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Minchan Kim, Linux-MM, linux-kernel, Jerome Marchand, juno.choi,
	seungho1.park, Luigi Semenzato, Nitin Gupta

On Wed, Aug 13, 2014 at 11:25 AM, Sergey Senozhatsky
<sergey.senozhatsky@gmail.com> wrote:
> On (08/14/14 00:13), Sergey Senozhatsky wrote:
>> > On Wed, Aug 13, 2014 at 10:14 AM, Sergey Senozhatsky
>> > <sergey.senozhatsky@gmail.com> wrote:
>> > > On (08/13/14 09:59), Dan Streetman wrote:
>> > >> On Tue, Aug 5, 2014 at 4:02 AM, Minchan Kim <minchan@kernel.org> wrote:
>> > >> > pages_allocated is counted per size_class, and when the user
>> > >> > wants to see total_size_bytes, zsmalloc walks every size_class
>> > >> > to report the sum.
>> > >> >
>> > >> > That's not bad if the user reads the value rarely, but if the
>> > >> > user starts reading it frequently, it is not a good deal from a
>> > >> > performance POV.
>> > >> >
>> > >> > This patch moves the variable from size_class to zs_pool, which
>> > >> > reduces the memory footprint (from [255 * 8 bytes] to
>> > >> > [sizeof(atomic_t)]). It adds new locking overhead, but that
>> > >> > shouldn't be severe because it's not on a hot path in zs_malloc
>> > >> > (i.e., the lock is taken only when a new zspage is created, not
>> > >> > per object).
>> > >>
>> > >> Would using an atomic64_t without locking be simpler?
>> > >
>> > > it would be racy.
>> >
>> > oh.  atomic operations aren't smp safe?  is that because other
>> > processors might use a stale value, and barriers must be added?  I
>> > guess I don't quite understand the value of atomic then. :-/
>>
>> the pool code not only sets the value, it also reads it and makes
>> decisions based on that value:
>>
>>       pages_allocated += X
>>       if (pages_allocated >= max_pages_allocated)
>>               return 0;
>

I'm missing where that is; I don't see it in this patch.

>
> I mean, suppose this happens on two CPUs.
>
> max_pages_allocated is 10 and the current pages_allocated is 8. Now two
> zs_malloc() calls happen, one on each CPU, and each does
> `pages_allocated += 1'. The problem is that both can then see 10 at
> `if (pages_allocated >= max_pages_allocated)', so we fail two
> operations when we were only supposed to fail one.

Do you mean this from the 2/3 patch:
@@ -946,6 +947,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
                set_zspage_mapping(first_page, class->index, ZS_EMPTY);
                spin_lock(&pool->stat_lock);
                pool->pages_allocated += class->pages_per_zspage;
+               if (pool->max_pages_allocated < pool->pages_allocated)
+                       pool->max_pages_allocated = pool->pages_allocated;
                spin_unlock(&pool->stat_lock);
                spin_lock(&class->lock);
        }

I see, yeah the max > allocated check before setting is easiest done
with a spinlock.  I think pages_allocated could still be done as
atomic, just using atomic_add_return() to grab the current value to
check against, but keeping them the same type and both protected by
the same spinlock I guess simplifies things.  Although, if they were
both atomic, then the *only* place that would need a spinlock would be
this check - reading the (atomic) max_pages_allocated wouldn't need a
spinlock, nor would clearing it to 0.
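
A sketch of that all-atomic variant (illustrative only, not code from
the series):

	/* raise *max to cur unless another CPU already raced it higher */
	static void update_max(atomic64_t *max, s64 cur)
	{
		s64 old = atomic64_read(max);

		while (old < cur && atomic64_cmpxchg(max, old, cur) != old)
			old = atomic64_read(max);
	}

	/* allocation path: no spinlock on either counter */
	s64 cur = atomic64_add_return(class->pages_per_zspage,
				      &pool->pages_allocated);
	update_max(&pool->max_pages_allocated, cur);

Reading max_pages_allocated is then a plain atomic64_read(), and
resetting it is an atomic64_set().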

>
>         -ss
>
>>
>> > >>
>> > >> >
>> > >> > Signed-off-by: Minchan Kim <minchan@kernel.org>
>> > >> > ---
>> > >> >  mm/zsmalloc.c | 30 ++++++++++++++++--------------
>> > >> >  1 file changed, 16 insertions(+), 14 deletions(-)
>> > >> >
>> > >> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
>> > >> > index fe78189624cf..a6089bd26621 100644
>> > >> > --- a/mm/zsmalloc.c
>> > >> > +++ b/mm/zsmalloc.c
>> > >> > @@ -198,9 +198,6 @@ struct size_class {
>> > >> >
>> > >> >         spinlock_t lock;
>> > >> >
>> > >> > -       /* stats */
>> > >> > -       u64 pages_allocated;
>> > >> > -
>> > >> >         struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
>> > >> >  };
>> > >> >
>> > >> > @@ -216,9 +213,12 @@ struct link_free {
>> > >> >  };
>> > >> >
>> > >> >  struct zs_pool {
>> > >> > +       spinlock_t stat_lock;
>> > >> > +
>> > >> >         struct size_class size_class[ZS_SIZE_CLASSES];
>> > >> >
>> > >> >         gfp_t flags;    /* allocation flags used when growing pool */
>> > >> > +       unsigned long pages_allocated;
>> > >> >  };
>> > >> >
>> > >> >  /*
>> > >> > @@ -882,6 +882,7 @@ struct zs_pool *zs_create_pool(gfp_t flags)
>> > >> >
>> > >> >         }
>> > >> >
>> > >> > +       spin_lock_init(&pool->stat_lock);
>> > >> >         pool->flags = flags;
>> > >> >
>> > >> >         return pool;
>> > >> > @@ -943,8 +944,10 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
>> > >> >                         return 0;
>> > >> >
>> > >> >                 set_zspage_mapping(first_page, class->index, ZS_EMPTY);
>> > >> > +               spin_lock(&pool->stat_lock);
>> > >> > +               pool->pages_allocated += class->pages_per_zspage;
>> > >> > +               spin_unlock(&pool->stat_lock);
>> > >> >                 spin_lock(&class->lock);
>> > >> > -               class->pages_allocated += class->pages_per_zspage;
>> > >> >         }
>> > >> >
>> > >> >         obj = (unsigned long)first_page->freelist;
>> > >> > @@ -997,14 +1000,14 @@ void zs_free(struct zs_pool *pool, unsigned long obj)
>> > >> >
>> > >> >         first_page->inuse--;
>> > >> >         fullness = fix_fullness_group(pool, first_page);
>> > >> > -
>> > >> > -       if (fullness == ZS_EMPTY)
>> > >> > -               class->pages_allocated -= class->pages_per_zspage;
>> > >> > -
>> > >> >         spin_unlock(&class->lock);
>> > >> >
>> > >> > -       if (fullness == ZS_EMPTY)
>> > >> > +       if (fullness == ZS_EMPTY) {
>> > >> > +               spin_lock(&pool->stat_lock);
>> > >> > +               pool->pages_allocated -= class->pages_per_zspage;
>> > >> > +               spin_unlock(&pool->stat_lock);
>> > >> >                 free_zspage(first_page);
>> > >> > +       }
>> > >> >  }
>> > >> >  EXPORT_SYMBOL_GPL(zs_free);
>> > >> >
>> > >> > @@ -1100,12 +1103,11 @@ EXPORT_SYMBOL_GPL(zs_unmap_object);
>> > >> >
>> > >> >  u64 zs_get_total_size_bytes(struct zs_pool *pool)
>> > >> >  {
>> > >> > -       int i;
>> > >> > -       u64 npages = 0;
>> > >> > -
>> > >> > -       for (i = 0; i < ZS_SIZE_CLASSES; i++)
>> > >> > -               npages += pool->size_class[i].pages_allocated;
>> > >> > +       u64 npages;
>> > >> >
>> > >> > +       spin_lock(&pool->stat_lock);
>> > >> > +       npages = pool->pages_allocated;
>> > >> > +       spin_unlock(&pool->stat_lock);
>> > >> >         return npages << PAGE_SHIFT;
>> > >> >  }
>> > >> >  EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
>> > >> > --
>> > >> > 2.0.0
>> > >> >
>> > >>
>> >
>>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 3/3] zram: limit memory size for zram
  2014-08-05 13:16       ` Sergey Senozhatsky
@ 2014-08-13 23:27         ` Minchan Kim
  -1 siblings, 0 replies; 54+ messages in thread
From: Minchan Kim @ 2014-08-13 23:27 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: linux-mm, Jerome Marchand, linux-kernel, juno.choi,
	seungho1.park, Luigi Semenzato, Nitin Gupta

Hey Sergey,

On Tue, Aug 05, 2014 at 10:16:15PM +0900, Sergey Senozhatsky wrote:
> Hello,
> 
> On (08/05/14 18:48), Minchan Kim wrote:
> > Another idea: we could define void zs_limit_mem(unsigned long nr_pages)
> > in zsmalloc and have zram put the limit in zs_pool via the new API, so
> > that zs_malloc could fail as soon as it exceeds the limit.
> > 
> > In the end, zram doesn't need to call zs_get_total_size_bytes on every
> > write. It's cleaner and the right layer, IMHO.
> 
> yes, I think this one is better.

Although I suggested this new approach, a few days ago I changed my mind
and have been testing a new patchset.

If we add a new API to zsmalloc, it adds unnecessary overhead for users
who don't care about the limit. Although it's cheap, I'd like to avoid
that.

zsmalloc is just an allocator, so anybody can use it if they want.
But the limit is only a requirement of zram, which is just one of the
clients that may use zsmalloc, so the accounting should be in zram,
not zsmalloc.

If we get more users of zsmalloc in the future and they all want the
ability to limit zsmalloc memory usage, we can move the feature from
the clients into the zsmalloc core so everybody is happy for both
performance and readability; doing the opposite would be painful.

In summary, let's keep the accounting logic on the client side of
zsmalloc (i.e., zram) for the moment; we could move it into the
zsmalloc core in the future if needed.
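
In code, that keeps everything on the zram side; this sketch mirrors the
zram_bvec_write() hunk in the patch quoted below:

	handle = zs_malloc(meta->mem_pool, clen);
	if (!handle) {
		ret = -ENOMEM;
		goto out;
	}

	if (zram->limit_bytes &&
	    zs_get_total_size_bytes(meta->mem_pool) >= zram->limit_bytes) {
		/* back the allocation out and stay under the cap */
		zs_free(meta->mem_pool, handle);
		ret = -ENOMEM;
		goto out;
	}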

Any thoughts?

> 
> 	-ss
> 
> > On Tue, Aug 05, 2014 at 05:02:03PM +0900, Minchan Kim wrote:
> > > I have received this request several times from zram users.
> > > They want to limit the memory size of zram because, without a limit,
> > > zram can consume a lot of system memory, which makes memory
> > > management control hard.
> > > 
> > > This patch adds a new knob to limit the memory usage of zram.
> > > 
> > > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > > ---
> > >  Documentation/blockdev/zram.txt |  1 +
> > >  drivers/block/zram/zram_drv.c   | 41 +++++++++++++++++++++++++++++++++++++++++
> > >  drivers/block/zram/zram_drv.h   |  1 +
> > >  3 files changed, 43 insertions(+)
> > > 
> > > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
> > > index d24534bee763..fcb0561dfe2e 100644
> > > --- a/Documentation/blockdev/zram.txt
> > > +++ b/Documentation/blockdev/zram.txt
> > > @@ -96,6 +96,7 @@ size of the disk when not in use so a huge zram is wasteful.
> > >  		compr_data_size
> > >  		mem_used_total
> > >  		mem_used_max
> > > +		mem_limit
> > >  
> > >  7) Deactivate:
> > >  	swapoff /dev/zram0
> > > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > > index a4d637b4db7d..47f68bbb2c44 100644
> > > --- a/drivers/block/zram/zram_drv.c
> > > +++ b/drivers/block/zram/zram_drv.c
> > > @@ -137,6 +137,37 @@ static ssize_t max_comp_streams_show(struct device *dev,
> > >  	return scnprintf(buf, PAGE_SIZE, "%d\n", val);
> > >  }
> > >  
> > > +static ssize_t mem_limit_show(struct device *dev,
> > > +		struct device_attribute *attr, char *buf)
> > > +{
> > > +	u64 val;
> > > +	struct zram *zram = dev_to_zram(dev);
> > > +
> > > +	down_read(&zram->init_lock);
> > > +	val = zram->limit_bytes;
> > > +	up_read(&zram->init_lock);
> > > +
> > > +	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
> > > +}
> > > +
> > > +static ssize_t mem_limit_store(struct device *dev,
> > > +		struct device_attribute *attr, const char *buf, size_t len)
> > > +{
> > > +	u64 limit;
> > > +	struct zram *zram = dev_to_zram(dev);
> > > +	int ret;
> > > +
> > > +	ret = kstrtoull(buf, 0, &limit);
> > > +	if (ret < 0)
> > > +		return ret;
> > > +
> > > +	down_write(&zram->init_lock);
> > > +	zram->limit_bytes = limit;
> > > +	ret = len;
> > > +	up_write(&zram->init_lock);
> > > +	return ret;
> > > +}
> > > +
> > >  static ssize_t max_comp_streams_store(struct device *dev,
> > >  		struct device_attribute *attr, const char *buf, size_t len)
> > >  {
> > > @@ -511,6 +542,14 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
> > >  		ret = -ENOMEM;
> > >  		goto out;
> > >  	}
> > > +
> > > +	if (zram->limit_bytes &&
> > > +		zs_get_total_size_bytes(meta->mem_pool) >= zram->limit_bytes) {
> > > +		zs_free(meta->mem_pool, handle);
> > > +		ret = -ENOMEM;
> > > +		goto out;
> > > +	}
> > > +
> > >  	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
> > >  
> > >  	if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) {
> > > @@ -854,6 +893,7 @@ static DEVICE_ATTR(reset, S_IWUSR, NULL, reset_store);
> > >  static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
> > >  static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
> > >  static DEVICE_ATTR(mem_used_max, S_IRUGO, mem_used_max_show, NULL);
> > > +static DEVICE_ATTR(mem_limit, S_IRUGO, mem_limit_show, mem_limit_store);
> > >  static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
> > >  		max_comp_streams_show, max_comp_streams_store);
> > >  static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
> > > @@ -883,6 +923,7 @@ static struct attribute *zram_disk_attrs[] = {
> > >  	&dev_attr_compr_data_size.attr,
> > >  	&dev_attr_mem_used_total.attr,
> > >  	&dev_attr_mem_used_max.attr,
> > > +	&dev_attr_mem_limit.attr,
> > >  	&dev_attr_max_comp_streams.attr,
> > >  	&dev_attr_comp_algorithm.attr,
> > >  	NULL,
> > > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> > > index 7f21c145e317..c0d497ff6efc 100644
> > > --- a/drivers/block/zram/zram_drv.h
> > > +++ b/drivers/block/zram/zram_drv.h
> > > @@ -99,6 +99,7 @@ struct zram {
> > >  	 * we can store in a disk.
> > >  	 */
> > >  	u64 disksize;	/* bytes */
> > > +	u64 limit_bytes;
> > >  	int max_comp_streams;
> > >  	struct zram_stats stats;
> > >  	char compressor[10];
> > > -- 
> > > 2.0.0
> > 
> > -- 
> > Kind regards,
> > Minchan Kim
> > 

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 3/3] zram: limit memory size for zram
  2014-08-13 15:30           ` Seth Jennings
@ 2014-08-13 23:31             ` Minchan Kim
  -1 siblings, 0 replies; 54+ messages in thread
From: Minchan Kim @ 2014-08-13 23:31 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Sergey Senozhatsky, linux-mm, Jerome Marchand, linux-kernel,
	juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta

Hello Seth,

On Wed, Aug 13, 2014 at 10:30:20AM -0500, Seth Jennings wrote:
> On Wed, Aug 06, 2014 at 03:52:53PM +0900, Minchan Kim wrote:
> > On Tue, Aug 05, 2014 at 10:16:15PM +0900, Sergey Senozhatsky wrote:
> > > Hello,
> > > 
> > > On (08/05/14 18:48), Minchan Kim wrote:
> > > > Another idea: we could define void zs_limit_mem(unsigned long nr_pages)
> > > > in zsmalloc and put the limit in zs_pool via new API from zram so that
> > > > zs_malloc could be failed as soon as it exceeds the limit.
> > > > 
> > > > In the end, zram doesn't need to call zs_get_total_size_bytes on every
> > > > write. It's cleaner and the right layer, IMHO.
> > > 
> > > yes, I think this one is better.
> > > 
> > > 	-ss
> > 
> > From 279c406b5a8eabd03edca55490ec92b539b39c76 Mon Sep 17 00:00:00 2001
> > From: Minchan Kim <minchan@kernel.org>
> > Date: Tue, 5 Aug 2014 16:24:57 +0900
> > Subject: [PATCH] zram: limit memory size for zram
> > 
> > I have received a request several times from zram users.
> > They want to limit the memory size of zram because zram can consume
> > a lot of memory on the system without any bound, which makes memory
> > management control hard.
> > 
> > This patch adds a new knob to limit the memory usage of zram.
> > 
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  Documentation/blockdev/zram.txt |  1 +
> >  drivers/block/zram/zram_drv.c   | 39 +++++++++++++++++++++++++++++++++++++--
> >  include/linux/zsmalloc.h        |  2 ++
> >  mm/zsmalloc.c                   | 24 ++++++++++++++++++++++++
> >  4 files changed, 64 insertions(+), 2 deletions(-)
> > 
> > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
> > index d24534bee763..fcb0561dfe2e 100644
> > --- a/Documentation/blockdev/zram.txt
> > +++ b/Documentation/blockdev/zram.txt
> > @@ -96,6 +96,7 @@ size of the disk when not in use so a huge zram is wasteful.
> >  		compr_data_size
> >  		mem_used_total
> >  		mem_used_max
> > +		mem_limit
> >  
> >  7) Deactivate:
> >  	swapoff /dev/zram0
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index a4d637b4db7d..069e81ef0c17 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -137,6 +137,41 @@ static ssize_t max_comp_streams_show(struct device *dev,
> >  	return scnprintf(buf, PAGE_SIZE, "%d\n", val);
> >  }
> >  
> > +static ssize_t mem_limit_show(struct device *dev,
> > +		struct device_attribute *attr, char *buf)
> > +{
> > +	u64 val = 0;
> > +	struct zram *zram = dev_to_zram(dev);
> > +	struct zram_meta *meta = zram->meta;
> > +
> > +	down_read(&zram->init_lock);
> > +	if (init_done(zram))
> > +		val = zs_get_limit_size_bytes(meta->mem_pool);
> > +	up_read(&zram->init_lock);
> > +
> > +	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
> > +}
> > +
> > +static ssize_t mem_limit_store(struct device *dev,
> > +		struct device_attribute *attr, const char *buf, size_t len)
> > +{
> > +	int ret;
> > +	u64 limit;
> > +	struct zram *zram = dev_to_zram(dev);
> > +	struct zram_meta *meta = zram->meta;
> > +
> > +	ret = kstrtoull(buf, 0, &limit);
> > +	if (ret < 0)
> > +		return ret;
> > +
> > +	down_write(&zram->init_lock);
> > +	if (init_done(zram))
> > +		zs_set_limit_size_bytes(meta->mem_pool, limit);
> > +	up_write(&zram->init_lock);
> > +	ret = len;
> > +	return ret;
> > +}
> > +
> >  static ssize_t max_comp_streams_store(struct device *dev,
> >  		struct device_attribute *attr, const char *buf, size_t len)
> >  {
> > @@ -506,8 +541,6 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
> >  
> >  	handle = zs_malloc(meta->mem_pool, clen);
> >  	if (!handle) {
> > -		pr_info("Error allocating memory for compressed page: %u, size=%zu\n",
> > -			index, clen);
> >  		ret = -ENOMEM;
> >  		goto out;
> >  	}
> > @@ -854,6 +887,7 @@ static DEVICE_ATTR(reset, S_IWUSR, NULL, reset_store);
> >  static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
> >  static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
> >  static DEVICE_ATTR(mem_used_max, S_IRUGO, mem_used_max_show, NULL);
> > +static DEVICE_ATTR(mem_limit, S_IRUGO, mem_limit_show, mem_limit_store);
> >  static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
> >  		max_comp_streams_show, max_comp_streams_store);
> >  static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
> > @@ -883,6 +917,7 @@ static struct attribute *zram_disk_attrs[] = {
> >  	&dev_attr_compr_data_size.attr,
> >  	&dev_attr_mem_used_total.attr,
> >  	&dev_attr_mem_used_max.attr,
> > +	&dev_attr_mem_limit.attr,
> >  	&dev_attr_max_comp_streams.attr,
> >  	&dev_attr_comp_algorithm.attr,
> >  	NULL,
> > diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
> > index fb087ca06a88..41122251a2d0 100644
> > --- a/include/linux/zsmalloc.h
> > +++ b/include/linux/zsmalloc.h
> > @@ -49,4 +49,6 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
> >  u64 zs_get_total_size_bytes(struct zs_pool *pool);
> >  u64 zs_get_max_size_bytes(struct zs_pool *pool);
> >  
> > +u64 zs_get_limit_size_bytes(struct zs_pool *pool);
> > +void zs_set_limit_size_bytes(struct zs_pool *pool, u64 limit);
> 
> While having a function to change the limit is fine, the initial limit
> should be a parameter to zs_create_pool(), since if the user doesn't
> call zs_set_limit_size_bytes() after zs_create_pool(), the default
> limit is 0.
> 
> This also breaks zswap, which does its pool size limiting in the zswap
> layer by using zs_get_total_size_bytes() to poll the pool size.
> 
> It also has implications for the new zpool abstraction layer, which
> doesn't have a hook for setting a pool limit.

I just sent my opinion in a separate reply. I'd like to avoid putting
the limit logic into zsmalloc. Instead, I'd like to add it on the zram
side, like zswap does.
It doesn't break anything, so you should be happy. :)

> 
> Could you do what zswap does already and enforce the pool limit in the
> zram code?

Yep, that's more desirable.
Thanks for the review!
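
For reference, the zswap-style check you mention boils down to the
client polling the allocator's total size, roughly like this (a
simplified sketch, not code copied from zswap or from my patch):

	/* Refuse new stores once the pool crosses the client's own limit. */
	static bool pool_is_full(struct zs_pool *pool, u64 limit_bytes)
	{
		return limit_bytes &&
			zs_get_total_size_bytes(pool) >= limit_bytes;
	}

That keeps zsmalloc policy-free; each client decides what "full" means.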


> 
> Seth
> 
> >  #endif
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index 3b5be076268a..8ca51118cf2b 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -220,6 +220,7 @@ struct zs_pool {
> >  	gfp_t flags;	/* allocation flags used when growing pool */
> >  	unsigned long pages_allocated;
> >  	unsigned long max_pages_allocated;
> > +	unsigned long pages_limited;
> >  };
> >  
> >  /*
> > @@ -940,6 +941,11 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
> >  
> >  	if (!first_page) {
> >  		spin_unlock(&class->lock);
> > +
> > +		if (pool->pages_limited && (pool->pages_limited <
> > +			pool->pages_allocated + class->pages_per_zspage))
> > +			return 0;
> > +
> >  		first_page = alloc_zspage(class, pool->flags);
> >  		if (unlikely(!first_page))
> >  			return 0;
> > @@ -1132,6 +1138,24 @@ u64 zs_get_max_size_bytes(struct zs_pool *pool)
> >  }
> >  EXPORT_SYMBOL_GPL(zs_get_max_size_bytes);
> >  
> > +void zs_set_limit_size_bytes(struct zs_pool *pool, u64 limit)
> > +{
> > +	pool->pages_limited = round_down(limit, PAGE_SIZE) >> PAGE_SHIFT;
> > +}
> > +EXPORT_SYMBOL_GPL(zs_set_limit_size_bytes);
> > +
> > +u64 zs_get_limit_size_bytes(struct zs_pool *pool)
> > +{
> > +	u64 npages;
> > +
> > +	spin_lock(&pool->stat_lock);
> > +	npages = pool->pages_limited;
> > +	spin_unlock(&pool->stat_lock);
> > +	return npages << PAGE_SHIFT;
> > +
> > +}
> > +EXPORT_SYMBOL_GPL(zs_get_limit_size_bytes);
> > +
> >  module_init(zs_init);
> >  module_exit(zs_exit);
> >  
> > -- 
> > 2.0.0
> > 
> > -- 
> > Kind regards,
> > Minchan Kim
> > 

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 0/3] zram memory control enhance
  2014-08-13 15:34   ` Seth Jennings
@ 2014-08-13 23:32     ` Minchan Kim
  -1 siblings, 0 replies; 54+ messages in thread
From: Minchan Kim @ 2014-08-13 23:32 UTC (permalink / raw)
  To: Seth Jennings
  Cc: linux-mm, linux-kernel, Sergey Senozhatsky, Jerome Marchand,
	juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta

On Wed, Aug 13, 2014 at 10:34:22AM -0500, Seth Jennings wrote:
> On Tue, Aug 05, 2014 at 05:02:00PM +0900, Minchan Kim wrote:
> > Notice! It's RFC. I didn't test at all but wanted to hear opinion
> > during merge window when it's really busy time for Andrew so we could
> > use the slack time to discuss without hurting him. ;-)
> > 
> > Patch 1 is to move pages_allocated in zsmalloc from size_class to zs_pool
> > so zs_get_total_size_bytes of zsmalloc would be faster than old.
> > zs_get_total_size_bytes could be used next patches frequently.
> > 
> > Patch 2 adds new feature which exports how many of bytes zsmalloc consumes
> > during testing workload. Normally, before fixing the zram's disksize
> > we have tested various workload and wanted to how many of bytes zram
> > consumed.
> > For it, we could poll mem_used_total of zram in userspace but the problem is
> > when memory pressure is severe and heavy swap out happens suddenly then
> > heavy swapin or exist while polling interval of user space is a few second,
> > it could miss max memory size zram had consumed easily.
> > With lack of information, user can set wrong disksize of zram so the result
> > is OOM. So this patch adds max_mem_used for zram and zsmalloc supports it
> > 
> > Patch 3 is to limit zram memory consumption. Now, zram has no bound for
> > memory usage so it could consume up all of system memory. It makes system
> > memory control for platform hard so I have heard the feature several time.
> > 
> > Feedback is welcome!
> 
> One thing you might consider doing is moving zram to use the new zpool
> API.  That way, when making changes that effect the zsmalloc API,
> consideration for zpool, and by extension, zpool users like zswap are
> also taken into account.

For now, it's rather overkill for zram.
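
For context, what you suggest would mean replacing zram's direct zs_*
calls with the generic zpool handles, roughly like this (written from
memory of the zpool API, so treat the exact signatures as approximate;
the pool would come from zpool_create_pool("zsmalloc", GFP_NOIO, NULL)):

	/* Sketch: one write-path step via zpool instead of zs_* calls. */
	static int store_page(struct zpool *pool, void *src, size_t clen)
	{
		unsigned long handle;
		void *cmem;

		if (zpool_malloc(pool, clen, GFP_NOIO, &handle))
			return -ENOMEM;
		cmem = zpool_map_handle(pool, handle, ZPOOL_MM_WO);
		memcpy(cmem, src, clen);
		zpool_unmap_handle(pool, handle);
		return 0;
	}

It is not much code, but it adds an indirection on every read and write
for a driver that has exactly one backend today.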

> 
> Seth
> 
> > 
> > Minchan Kim (3):
> >   zsmalloc: move pages_allocated to zs_pool
> >   zsmalloc/zram: add zs_get_max_size_bytes and use it in zram
> >   zram: limit memory size for zram
> > 
> >  Documentation/blockdev/zram.txt |  2 ++
> >  drivers/block/zram/zram_drv.c   | 58 +++++++++++++++++++++++++++++++++++++++++
> >  drivers/block/zram/zram_drv.h   |  1 +
> >  include/linux/zsmalloc.h        |  1 +
> >  mm/zsmalloc.c                   | 50 +++++++++++++++++++++++++----------
> >  5 files changed, 98 insertions(+), 14 deletions(-)
> > 
> > -- 
> > 2.0.0
> > 

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 1/3] zsmalloc: move pages_allocated to zs_pool
  2014-08-13 15:25             ` Sergey Senozhatsky
@ 2014-08-14  0:09               ` Minchan Kim
  -1 siblings, 0 replies; 54+ messages in thread
From: Minchan Kim @ 2014-08-14  0:09 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dan Streetman, Linux-MM, linux-kernel, Jerome Marchand,
	juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta

On Thu, Aug 14, 2014 at 12:25:04AM +0900, Sergey Senozhatsky wrote:
> On (08/14/14 00:13), Sergey Senozhatsky wrote:
> > > On Wed, Aug 13, 2014 at 10:14 AM, Sergey Senozhatsky
> > > <sergey.senozhatsky@gmail.com> wrote:
> > > > On (08/13/14 09:59), Dan Streetman wrote:
> > > >> On Tue, Aug 5, 2014 at 4:02 AM, Minchan Kim <minchan@kernel.org> wrote:
> > > >> > pages_allocated is counted in the size_class structure, and when the
> > > >> > user wants to see total_size_bytes, it gathers the values from each
> > > >> > size_class to report the sum.
> > > >> >
> > > >> > It's not bad if the user doesn't read the value often, but if the
> > > >> > user starts to read it frequently, it is not a good deal from a
> > > >> > performance POV.
> > > >> >
> > > >> > This patch moves the variable from size_class to zs_pool, which
> > > >> > reduces the memory footprint (from [255 * 8byte] to [sizeof(atomic_t)])
> > > >> > at the cost of new locking overhead. That shouldn't be severe because
> > > >> > it's not on the hot path of zs_malloc (ie, the lock is taken only when
> > > >> > a new zspage is created, not per object).
> > > >>
> > > >> Would using an atomic64_t without locking be simpler?
> > > >
> > > > it would be racy.
> > > 
> > > oh.  atomic operations aren't smp safe?  is that because other
> > > processors might use a stale value, and barriers must be added?  I
> > > guess I don't quite understand the value of atomic then. :-/
> > 
> > the pool not only sets the value, it also reads it and makes decisions
> > based on that value:
> > 
> > 	pages_allocated += X
> > 	if (pages_allocated >= max_pages_allocated)
> > 		return 0;
> 
> 
> I mean, suppose this happens on two CPUs
> 
> max_pages_allocated is 10; current pages_allocated is 8. now you have 2 zs_malloc()
> happening on two CPUs. each of them will do `pages_allocated += 1'. the problem is
> that both will see 10 at `if (pages_allocated >= max_pages_allocated)', so we will
> fail 2 operations, while we only were supposed to fail one.

Strictly speaking, you're describing pages_limited, not max_pages_allocated.
But I admit the race could affect max_pages_allocated, too.

I think it would not be severe if we move the feature into zram, because
zram's requirement is not strict and the error is bounded by the number
of CPUs, so we could remove both the spinlock and the atomic.
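
To spell the race out with the numbers from your example (pseudocode
only, not lines from the patch):

	/* limit == 10, pages_allocated == 8; CPU0 and CPU1 each add one */
	pages_allocated += 1;		/* interleaved: 8 -> 9 -> 10 */
	if (pages_allocated >= limit)	/* both CPUs now read 10 */
		return 0;		/* both fail, though one should succeed */

A bare atomic64_t would fix torn updates but not this window between
the update and the check; closing it needs the two combined (a lock, or
a single atomic_add_return()), which is exactly the cost I'd rather not
put into zsmalloc for a bound that zram can tolerate loosely.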


> 
> 	-ss
> 
> > 
> > > >>
> > > >> >
> > > >> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > > >> > ---
> > > >> >  mm/zsmalloc.c | 30 ++++++++++++++++--------------
> > > >> >  1 file changed, 16 insertions(+), 14 deletions(-)
> > > >> >
> > > >> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > > >> > index fe78189624cf..a6089bd26621 100644
> > > >> > --- a/mm/zsmalloc.c
> > > >> > +++ b/mm/zsmalloc.c
> > > >> > @@ -198,9 +198,6 @@ struct size_class {
> > > >> >
> > > >> >         spinlock_t lock;
> > > >> >
> > > >> > -       /* stats */
> > > >> > -       u64 pages_allocated;
> > > >> > -
> > > >> >         struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
> > > >> >  };
> > > >> >
> > > >> > @@ -216,9 +213,12 @@ struct link_free {
> > > >> >  };
> > > >> >
> > > >> >  struct zs_pool {
> > > >> > +       spinlock_t stat_lock;
> > > >> > +
> > > >> >         struct size_class size_class[ZS_SIZE_CLASSES];
> > > >> >
> > > >> >         gfp_t flags;    /* allocation flags used when growing pool */
> > > >> > +       unsigned long pages_allocated;
> > > >> >  };
> > > >> >
> > > >> >  /*
> > > >> > @@ -882,6 +882,7 @@ struct zs_pool *zs_create_pool(gfp_t flags)
> > > >> >
> > > >> >         }
> > > >> >
> > > >> > +       spin_lock_init(&pool->stat_lock);
> > > >> >         pool->flags = flags;
> > > >> >
> > > >> >         return pool;
> > > >> > @@ -943,8 +944,10 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
> > > >> >                         return 0;
> > > >> >
> > > >> >                 set_zspage_mapping(first_page, class->index, ZS_EMPTY);
> > > >> > +               spin_lock(&pool->stat_lock);
> > > >> > +               pool->pages_allocated += class->pages_per_zspage;
> > > >> > +               spin_unlock(&pool->stat_lock);
> > > >> >                 spin_lock(&class->lock);
> > > >> > -               class->pages_allocated += class->pages_per_zspage;
> > > >> >         }
> > > >> >
> > > >> >         obj = (unsigned long)first_page->freelist;
> > > >> > @@ -997,14 +1000,14 @@ void zs_free(struct zs_pool *pool, unsigned long obj)
> > > >> >
> > > >> >         first_page->inuse--;
> > > >> >         fullness = fix_fullness_group(pool, first_page);
> > > >> > -
> > > >> > -       if (fullness == ZS_EMPTY)
> > > >> > -               class->pages_allocated -= class->pages_per_zspage;
> > > >> > -
> > > >> >         spin_unlock(&class->lock);
> > > >> >
> > > >> > -       if (fullness == ZS_EMPTY)
> > > >> > +       if (fullness == ZS_EMPTY) {
> > > >> > +               spin_lock(&pool->stat_lock);
> > > >> > +               pool->pages_allocated -= class->pages_per_zspage;
> > > >> > +               spin_unlock(&pool->stat_lock);
> > > >> >                 free_zspage(first_page);
> > > >> > +       }
> > > >> >  }
> > > >> >  EXPORT_SYMBOL_GPL(zs_free);
> > > >> >
> > > >> > @@ -1100,12 +1103,11 @@ EXPORT_SYMBOL_GPL(zs_unmap_object);
> > > >> >
> > > >> >  u64 zs_get_total_size_bytes(struct zs_pool *pool)
> > > >> >  {
> > > >> > -       int i;
> > > >> > -       u64 npages = 0;
> > > >> > -
> > > >> > -       for (i = 0; i < ZS_SIZE_CLASSES; i++)
> > > >> > -               npages += pool->size_class[i].pages_allocated;
> > > >> > +       u64 npages;
> > > >> >
> > > >> > +       spin_lock(&pool->stat_lock);
> > > >> > +       npages = pool->pages_allocated;
> > > >> > +       spin_unlock(&pool->stat_lock);
> > > >> >         return npages << PAGE_SHIFT;
> > > >> >  }
> > > >> >  EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
> > > >> > --
> > > >> > 2.0.0
> > > >> >
> > > >>
> > > 
> > 
> 

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 1/3] zsmalloc: move pages_allocated to zs_pool
  2014-08-13 16:11               ` Dan Streetman
@ 2014-08-14 13:03                 ` Sergey Senozhatsky
  -1 siblings, 0 replies; 54+ messages in thread
From: Sergey Senozhatsky @ 2014-08-14 13:03 UTC (permalink / raw)
  To: Dan Streetman
  Cc: Sergey Senozhatsky, Minchan Kim, Linux-MM, linux-kernel,
	Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
	Nitin Gupta

On (08/13/14 12:11), Dan Streetman wrote:
> >> > On Wed, Aug 13, 2014 at 10:14 AM, Sergey Senozhatsky
> >> > <sergey.senozhatsky@gmail.com> wrote:
> >> > > On (08/13/14 09:59), Dan Streetman wrote:
> >> > >> On Tue, Aug 5, 2014 at 4:02 AM, Minchan Kim <minchan@kernel.org> wrote:
> >> > >> > pages_allocated is counted in the size_class structure, and when the
> >> > >> > user wants to see total_size_bytes, it gathers the values from each
> >> > >> > size_class to report the sum.
> >> > >> >
> >> > >> > It's not bad if the user doesn't read the value often, but if the
> >> > >> > user starts to read it frequently, it is not a good deal from a
> >> > >> > performance POV.
> >> > >> >
> >> > >> > This patch moves the variable from size_class to zs_pool, which
> >> > >> > reduces the memory footprint (from [255 * 8byte] to [sizeof(atomic_t)])
> >> > >> > at the cost of new locking overhead. That shouldn't be severe because
> >> > >> > it's not on the hot path of zs_malloc (ie, the lock is taken only when
> >> > >> > a new zspage is created, not per object).
> >> > >>
> >> > >> Would using an atomic64_t without locking be simpler?
> >> > >
> >> > > it would be racy.
> >> >
> >> > oh.  atomic operations aren't smp safe?  is that because other
> >> > processors might use a stale value, and barriers must be added?  I
> >> > guess I don't quite understand the value of atomic then. :-/
> >>
> >> the pool not only sets the value, it also reads it and makes decisions
> >> based on that value:
> >>
> >>       pages_allocated += X
> >>       if (pages_allocated >= max_pages_allocated)
> >>               return 0;
> >
> 
> I'm missing where that is?  I don't see that in this patch?
> 
> >
> > I mean, suppose this happens on two CPUs
> >
> > max_pages_allocated is 10; current pages_allocated is 8. now you have 2 zs_malloc()
> > happening on two CPUs. each of them will do `pages_allocated += 1'. the problem is
> > that both will see 10 at `if (pages_allocated >= max_pages_allocated)', so we will
> > fail 2 operations, while we only were supposed to fail one.
> 
> Do you mean this from the 2/3 patch:

yeah. sorry for being unclear, I was really sleepy.

> @@ -946,6 +947,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
>                 set_zspage_mapping(first_page, class->index, ZS_EMPTY);
>                 spin_lock(&pool->stat_lock);
>                 pool->pages_allocated += class->pages_per_zspage;
> +               if (pool->max_pages_allocated < pool->pages_allocated)
> +                       pool->max_pages_allocated = pool->pages_allocated;
>                 spin_unlock(&pool->stat_lock);
>                 spin_lock(&class->lock);
>         }
> 
> I see, yeah the max > allocated check before setting is easiest done
> with a spinlock.  I think pages_allocated could still be done as
> atomic, just using atomic_add_return() to grab the current value to
> check against, but keeping them the same type and both protected by
> the same spinlock I guess simplifies things.  Although, if they were
> both atomic, then the *only* place that would need a spinlock would be
> this check - reading the (atomic) max_pages_allocated wouldn't need a
> spinlock, nor would clearing it to 0.

makes sense.
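
For illustration, the lockless version of the max-update you describe
could look like this (a sketch; update_max_pages is a made-up name, not
from the patchset):

	/*
	 * Track a running maximum without a lock: retry the cmpxchg until
	 * either our value is installed or another CPU beat us to an even
	 * larger one.
	 */
	static void update_max_pages(atomic_long_t *max, long cur)
	{
		long old = atomic_long_read(max);

		while (old < cur) {
			long prev = atomic_long_cmpxchg(max, old, cur);

			if (prev == old)
				break;		/* installed the new max */
			old = prev;		/* lost the race, re-check */
		}
	}

The caller feeds it the value just returned by atomic_long_add_return()
on pages_allocated, so the running total and the max stay in step with
no spinlock anywhere.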

	-ss

> >> > >>
> >> > >> >
> >> > >> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> >> > >> > ---
> >> > >> >  mm/zsmalloc.c | 30 ++++++++++++++++--------------
> >> > >> >  1 file changed, 16 insertions(+), 14 deletions(-)
> >> > >> >
> >> > >> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> >> > >> > index fe78189624cf..a6089bd26621 100644
> >> > >> > --- a/mm/zsmalloc.c
> >> > >> > +++ b/mm/zsmalloc.c
> >> > >> > @@ -198,9 +198,6 @@ struct size_class {
> >> > >> >
> >> > >> >         spinlock_t lock;
> >> > >> >
> >> > >> > -       /* stats */
> >> > >> > -       u64 pages_allocated;
> >> > >> > -
> >> > >> >         struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
> >> > >> >  };
> >> > >> >
> >> > >> > @@ -216,9 +213,12 @@ struct link_free {
> >> > >> >  };
> >> > >> >
> >> > >> >  struct zs_pool {
> >> > >> > +       spinlock_t stat_lock;
> >> > >> > +
> >> > >> >         struct size_class size_class[ZS_SIZE_CLASSES];
> >> > >> >
> >> > >> >         gfp_t flags;    /* allocation flags used when growing pool */
> >> > >> > +       unsigned long pages_allocated;
> >> > >> >  };
> >> > >> >
> >> > >> >  /*
> >> > >> > @@ -882,6 +882,7 @@ struct zs_pool *zs_create_pool(gfp_t flags)
> >> > >> >
> >> > >> >         }
> >> > >> >
> >> > >> > +       spin_lock_init(&pool->stat_lock);
> >> > >> >         pool->flags = flags;
> >> > >> >
> >> > >> >         return pool;
> >> > >> > @@ -943,8 +944,10 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
> >> > >> >                         return 0;
> >> > >> >
> >> > >> >                 set_zspage_mapping(first_page, class->index, ZS_EMPTY);
> >> > >> > +               spin_lock(&pool->stat_lock);
> >> > >> > +               pool->pages_allocated += class->pages_per_zspage;
> >> > >> > +               spin_unlock(&pool->stat_lock);
> >> > >> >                 spin_lock(&class->lock);
> >> > >> > -               class->pages_allocated += class->pages_per_zspage;
> >> > >> >         }
> >> > >> >
> >> > >> >         obj = (unsigned long)first_page->freelist;
> >> > >> > @@ -997,14 +1000,14 @@ void zs_free(struct zs_pool *pool, unsigned long obj)
> >> > >> >
> >> > >> >         first_page->inuse--;
> >> > >> >         fullness = fix_fullness_group(pool, first_page);
> >> > >> > -
> >> > >> > -       if (fullness == ZS_EMPTY)
> >> > >> > -               class->pages_allocated -= class->pages_per_zspage;
> >> > >> > -
> >> > >> >         spin_unlock(&class->lock);
> >> > >> >
> >> > >> > -       if (fullness == ZS_EMPTY)
> >> > >> > +       if (fullness == ZS_EMPTY) {
> >> > >> > +               spin_lock(&pool->stat_lock);
> >> > >> > +               pool->pages_allocated -= class->pages_per_zspage;
> >> > >> > +               spin_unlock(&pool->stat_lock);
> >> > >> >                 free_zspage(first_page);
> >> > >> > +       }
> >> > >> >  }
> >> > >> >  EXPORT_SYMBOL_GPL(zs_free);
> >> > >> >
> >> > >> > @@ -1100,12 +1103,11 @@ EXPORT_SYMBOL_GPL(zs_unmap_object);
> >> > >> >
> >> > >> >  u64 zs_get_total_size_bytes(struct zs_pool *pool)
> >> > >> >  {
> >> > >> > -       int i;
> >> > >> > -       u64 npages = 0;
> >> > >> > -
> >> > >> > -       for (i = 0; i < ZS_SIZE_CLASSES; i++)
> >> > >> > -               npages += pool->size_class[i].pages_allocated;
> >> > >> > +       u64 npages;
> >> > >> >
> >> > >> > +       spin_lock(&pool->stat_lock);
> >> > >> > +       npages = pool->pages_allocated;
> >> > >> > +       spin_unlock(&pool->stat_lock);
> >> > >> >         return npages << PAGE_SHIFT;
> >> > >> >  }
> >> > >> >  EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
> >> > >> > --
> >> > >> > 2.0.0
> >> > >> >
> >> > >>
> >> >
> >>
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 1/3] zsmalloc: move pages_allocated to zs_pool
@ 2014-08-14 13:03                 ` Sergey Senozhatsky
  0 siblings, 0 replies; 54+ messages in thread
From: Sergey Senozhatsky @ 2014-08-14 13:03 UTC (permalink / raw)
  To: Dan Streetman
  Cc: Sergey Senozhatsky, Minchan Kim, Linux-MM, linux-kernel,
	Jerome Marchand, juno.choi, seungho1.park, Luigi Semenzato,
	Nitin Gupta

On (08/13/14 12:11), Dan Streetman wrote:
> >> > On Wed, Aug 13, 2014 at 10:14 AM, Sergey Senozhatsky
> >> > <sergey.senozhatsky@gmail.com> wrote:
> >> > > On (08/13/14 09:59), Dan Streetman wrote:
> >> > >> On Tue, Aug 5, 2014 at 4:02 AM, Minchan Kim <minchan@kernel.org> wrote:
> >> > >> > Pages_allocated has counted in size_class structure and when user
> >> > >> > want to see total_size_bytes, it gathers all of value from each
> >> > >> > size_class to report the sum.
> >> > >> >
> >> > >> > It's not bad if user don't see the value often but if user start
> >> > >> > to see the value frequently, it would be not a good deal for
> >> > >> > performance POV.
> >> > >> >
> >> > >> > This patch moves the variable from size_class to zs_pool so it would
> >> > >> > reduce memory footprint (from [255 * 8byte] to [sizeof(atomic_t)])
> >> > >> > but it adds new locking overhead but it wouldn't be severe because
> >> > >> > it's not a hot path in zs_malloc(ie, it is called only when new
> >> > >> > zspage is created, not a object).
> >> > >>
> >> > >> Would using an atomic64_t without locking be simpler?
> >> > >
> >> > > it would be racy.
> >> >
> >> > oh.  atomic operations aren't smp safe?  is that because other
> >> > processors might use a stale value, and barriers must be added?  I
> >> > guess I don't quite understand the value of atomic then. :-/
> >>
> >> pool not only set the value, it also read it and make some decisions
> >> based on that value:
> >>
> >>       pages_allocated += X
> >>       if (pages_allocated >= max_pages_allocated)
> >>               return 0;
> >
> 
> I'm missing where that is?  I don't see that in this patch?
> 
> >
> > I mean, suppose this happens on two CPUs
> >
> > max_pages_allocated is 10; current pages_allocated is 8. Now you have 2 zs_malloc()
> > calls happening on two CPUs. Each of them will do `pages_allocated += 1'. The problem
> > is that both will see 10 at `if (pages_allocated >= max_pages_allocated)', so we will
> > fail 2 operations, while we were only supposed to fail one.
> 
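For illustration, a minimal C sketch of the race just described and of the
atomic_add_return() fix that comes up below; every name here is assumed for
the example, and none of this is code from the posted patches:

    /* Broken: the increment and the limit check are two separate steps. */
    static atomic_long_t pages_allocated;       /* hypothetical counter */
    static long max_pages_allocated = 10;       /* hypothetical limit */

    static int alloc_one_page(void)
    {
            atomic_long_inc(&pages_allocated);  /* CPU0 and CPU1: 8 -> 9 -> 10 */
            if (atomic_long_read(&pages_allocated) >= max_pages_allocated)
                    return -ENOMEM;             /* both CPUs can fail here */
            return 0;
    }

    /* Fixed: atomic_long_add_return() couples the increment with the value
     * each CPU observes, so exactly one caller crosses the limit. */
    static int alloc_one_page_fixed(void)
    {
            if (atomic_long_add_return(1, &pages_allocated) >= max_pages_allocated) {
                    atomic_long_dec(&pages_allocated);
                    return -ENOMEM;
            }
            return 0;
    }
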
> Do you mean this from the 2/3 patch:

yeah. sorry for being unclear, I was really sleepy.

> @@ -946,6 +947,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
>                 set_zspage_mapping(first_page, class->index, ZS_EMPTY);
>                 spin_lock(&pool->stat_lock);
>                 pool->pages_allocated += class->pages_per_zspage;
> +               if (pool->max_pages_allocated < pool->pages_allocated)
> +                       pool->max_pages_allocated = pool->pages_allocated;
>                 spin_unlock(&pool->stat_lock);
>                 spin_lock(&class->lock);
>         }
> 
> I see, yeah the max > allocated check before setting is easiest done
> with a spinlock.  I think pages_allocated could still be done as
> atomic, just using atomic_add_return() to grab the current value to
> check against, but keeping them the same type and both protected by
> the same spinlock I guess simplifies things.  Although, if they were
> both atomic, then the *only* place that would need a spinlock would be
> this check - reading the (atomic) max_pages_allocated wouldn't need a
> spinlock, nor would clearing it to 0.

makes sense.
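
Concretely, a minimal sketch of the scheme described above -- both counters
atomic, with the spinlock narrowed to just the max update -- using assumed
names (the patch above keeps a plain counter under stat_lock instead):

    static atomic_long_t pages_allocated;
    static atomic_long_t max_pages_allocated;
    static DEFINE_SPINLOCK(stat_lock);  /* guards only the max update */

    static void account_alloc(long nr_pages)
    {
            long cur = atomic_long_add_return(nr_pages, &pages_allocated);

            spin_lock(&stat_lock);
            if (cur > atomic_long_read(&max_pages_allocated))
                    atomic_long_set(&max_pages_allocated, cur);
            spin_unlock(&stat_lock);
    }

    /* Readers take no lock: atomic_long_read(&max_pages_allocated);
     * resetting the max is just atomic_long_set(&max_pages_allocated, 0). */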

	-ss

> >> > >>
> >> > >> >
> >> > >> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> >> > >> > ---
> >> > >> >  mm/zsmalloc.c | 30 ++++++++++++++++--------------
> >> > >> >  1 file changed, 16 insertions(+), 14 deletions(-)
> >> > >> >
> >> > >> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> >> > >> > index fe78189624cf..a6089bd26621 100644
> >> > >> > --- a/mm/zsmalloc.c
> >> > >> > +++ b/mm/zsmalloc.c
> >> > >> > @@ -198,9 +198,6 @@ struct size_class {
> >> > >> >
> >> > >> >         spinlock_t lock;
> >> > >> >
> >> > >> > -       /* stats */
> >> > >> > -       u64 pages_allocated;
> >> > >> > -
> >> > >> >         struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS];
> >> > >> >  };
> >> > >> >
> >> > >> > @@ -216,9 +213,12 @@ struct link_free {
> >> > >> >  };
> >> > >> >
> >> > >> >  struct zs_pool {
> >> > >> > +       spinlock_t stat_lock;
> >> > >> > +
> >> > >> >         struct size_class size_class[ZS_SIZE_CLASSES];
> >> > >> >
> >> > >> >         gfp_t flags;    /* allocation flags used when growing pool */
> >> > >> > +       unsigned long pages_allocated;
> >> > >> >  };
> >> > >> >
> >> > >> >  /*
> >> > >> > @@ -882,6 +882,7 @@ struct zs_pool *zs_create_pool(gfp_t flags)
> >> > >> >
> >> > >> >         }
> >> > >> >
> >> > >> > +       spin_lock_init(&pool->stat_lock);
> >> > >> >         pool->flags = flags;
> >> > >> >
> >> > >> >         return pool;
> >> > >> > @@ -943,8 +944,10 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)
> >> > >> >                         return 0;
> >> > >> >
> >> > >> >                 set_zspage_mapping(first_page, class->index, ZS_EMPTY);
> >> > >> > +               spin_lock(&pool->stat_lock);
> >> > >> > +               pool->pages_allocated += class->pages_per_zspage;
> >> > >> > +               spin_unlock(&pool->stat_lock);
> >> > >> >                 spin_lock(&class->lock);
> >> > >> > -               class->pages_allocated += class->pages_per_zspage;
> >> > >> >         }
> >> > >> >
> >> > >> >         obj = (unsigned long)first_page->freelist;
> >> > >> > @@ -997,14 +1000,14 @@ void zs_free(struct zs_pool *pool, unsigned long obj)
> >> > >> >
> >> > >> >         first_page->inuse--;
> >> > >> >         fullness = fix_fullness_group(pool, first_page);
> >> > >> > -
> >> > >> > -       if (fullness == ZS_EMPTY)
> >> > >> > -               class->pages_allocated -= class->pages_per_zspage;
> >> > >> > -
> >> > >> >         spin_unlock(&class->lock);
> >> > >> >
> >> > >> > -       if (fullness == ZS_EMPTY)
> >> > >> > +       if (fullness == ZS_EMPTY) {
> >> > >> > +               spin_lock(&pool->stat_lock);
> >> > >> > +               pool->pages_allocated -= class->pages_per_zspage;
> >> > >> > +               spin_unlock(&pool->stat_lock);
> >> > >> >                 free_zspage(first_page);
> >> > >> > +       }
> >> > >> >  }
> >> > >> >  EXPORT_SYMBOL_GPL(zs_free);
> >> > >> >
> >> > >> > @@ -1100,12 +1103,11 @@ EXPORT_SYMBOL_GPL(zs_unmap_object);
> >> > >> >
> >> > >> >  u64 zs_get_total_size_bytes(struct zs_pool *pool)
> >> > >> >  {
> >> > >> > -       int i;
> >> > >> > -       u64 npages = 0;
> >> > >> > -
> >> > >> > -       for (i = 0; i < ZS_SIZE_CLASSES; i++)
> >> > >> > -               npages += pool->size_class[i].pages_allocated;
> >> > >> > +       u64 npages;
> >> > >> >
> >> > >> > +       spin_lock(&pool->stat_lock);
> >> > >> > +       npages = pool->pages_allocated;
> >> > >> > +       spin_unlock(&pool->stat_lock);
> >> > >> >         return npages << PAGE_SHIFT;
> >> > >> >  }
> >> > >> >  EXPORT_SYMBOL_GPL(zs_get_total_size_bytes);
> >> > >> > --
> >> > >> > 2.0.0
> >> > >> >

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 3/3] zram: limit memory size for zram
  2014-08-13 23:27         ` Minchan Kim
@ 2014-08-14 13:29           ` Sergey Senozhatsky
  0 siblings, 0 replies; 54+ messages in thread
From: Sergey Senozhatsky @ 2014-08-14 13:29 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, linux-mm, Jerome Marchand, linux-kernel,
	juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta

Hello Minchan,

On (08/14/14 08:27), Minchan Kim wrote:
> Date: Thu, 14 Aug 2014 08:27:19 +0900
> From: Minchan Kim <minchan@kernel.org>
> To: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Cc: linux-mm@kvack.org, Jerome Marchand <jmarchan@redhat.com>,
>  linux-kernel@vger.kernel.org, juno.choi@lge.com, seungho1.park@lge.com,
>  Luigi Semenzato <semenzato@google.com>, Nitin Gupta <ngupta@vflare.org>
> Subject: Re: [RFC 3/3] zram: limit memory size for zram
> User-Agent: Mutt/1.5.21 (2010-09-15)
> 
> Hey Sergey,
> 
> On Tue, Aug 05, 2014 at 10:16:15PM +0900, Sergey Senozhatsky wrote:
> > Hello,
> > 
> > On (08/05/14 18:48), Minchan Kim wrote:
> > > Another idea: we could define void zs_limit_mem(unsigned long nr_pages)
> > > in zsmalloc and put the limit in zs_pool via a new API from zram, so that
> > > zs_malloc could fail as soon as it exceeds the limit.
> > > 
> > > In the end, zram doesn't need to call zs_get_total_size_bytes on every
> > > write. It's cleaner and the right layer, IMHO.
> > 
> > yes, I think this one is better.
> 
> Although I suggested this new one, a few days ago I changed my decision
> and was testing a new patchset.
> 
> If we add a new API to zsmalloc, it adds unnecessary overhead for users who
> don't care about the limit. Although it's cheap, I'd like to avoid that.
> 
> zsmalloc is just an allocator, so anybody can use it if they want.
> But the limit is only a requirement of zram, which is just one of the
> clients that can potentially use zsmalloc, so the accounting should be
> in zram, not zsmalloc.
> 

my motivation was that zram does not use that much memory itself;
the zs_pool does. zram is just a clueless client from that point of
view: it receives some requests, does some things with the supplied
data, and asks the zs_pool whether the latter can find some place to
keep that data (and zram doesn't really care how that memory will be
allocated, or whether it will be at all).

I'm OK with having the memory limit in zram, though conceptually,
IMHO, such logic belongs to the allocation layer. Yet I admit the
potential overhead issue.
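
For reference, a minimal sketch of that allocator-side alternative (the
zs_limit_mem idea quoted above); the pool argument and the field names are
assumed, and this was never part of the posted patches:

    /* let the client cap the pool; 0 means no limit */
    void zs_limit_mem(struct zs_pool *pool, unsigned long nr_pages)
    {
            pool->max_pages = nr_pages;     /* hypothetical field */
    }

    /* and in zs_malloc(), before growing the pool with a new zspage: */
    if (pool->max_pages &&
        pool->pages_allocated + class->pages_per_zspage > pool->max_pages)
            return 0;       /* zs_malloc() returns 0 on failure */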

> If we get more users of zsmalloc in the future and they all want this
> feature of limiting zsmalloc memory usage, we can move the feature
> from the clients into the zsmalloc core, so everybody would be happy for
> performance and readability; the opposite move would be painful.
> 
> In summary, let's keep the accounting logic on the client side of zsmalloc
> (ie, zram) for the moment, but we could possibly move it into the zsmalloc
> core in the future.
> 
> Any thoughts?

agreed.

	-ss

> > 
> > 	-ss
> > 
> > > On Tue, Aug 05, 2014 at 05:02:03PM +0900, Minchan Kim wrote:
> > > > I have received a request several time from zram users.
> > > > They want to limit memory size for zram because zram can consume
> > > > lot of memory on system without limit so it makes memory management
> > > > control hard.
> > > > 
> > > > This patch adds new knob to limit memory of zram.
> > > > 
> > > > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > > > ---
> > > >  Documentation/blockdev/zram.txt |  1 +
> > > >  drivers/block/zram/zram_drv.c   | 41 +++++++++++++++++++++++++++++++++++++++++
> > > >  drivers/block/zram/zram_drv.h   |  1 +
> > > >  3 files changed, 43 insertions(+)
> > > > 
> > > > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
> > > > index d24534bee763..fcb0561dfe2e 100644
> > > > --- a/Documentation/blockdev/zram.txt
> > > > +++ b/Documentation/blockdev/zram.txt
> > > > @@ -96,6 +96,7 @@ size of the disk when not in use so a huge zram is wasteful.
> > > >  		compr_data_size
> > > >  		mem_used_total
> > > >  		mem_used_max
> > > > +		mem_limit
> > > >  
> > > >  7) Deactivate:
> > > >  	swapoff /dev/zram0
> > > > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > > > index a4d637b4db7d..47f68bbb2c44 100644
> > > > --- a/drivers/block/zram/zram_drv.c
> > > > +++ b/drivers/block/zram/zram_drv.c
> > > > @@ -137,6 +137,37 @@ static ssize_t max_comp_streams_show(struct device *dev,
> > > >  	return scnprintf(buf, PAGE_SIZE, "%d\n", val);
> > > >  }
> > > >  
> > > > +static ssize_t mem_limit_show(struct device *dev,
> > > > +		struct device_attribute *attr, char *buf)
> > > > +{
> > > > +	u64 val;
> > > > +	struct zram *zram = dev_to_zram(dev);
> > > > +
> > > > +	down_read(&zram->init_lock);
> > > > +	val = zram->limit_bytes;
> > > > +	up_read(&zram->init_lock);
> > > > +
> > > > +	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
> > > > +}
> > > > +
> > > > +static ssize_t mem_limit_store(struct device *dev,
> > > > +		struct device_attribute *attr, const char *buf, size_t len)
> > > > +{
> > > > +	u64 limit;
> > > > +	struct zram *zram = dev_to_zram(dev);
> > > > +	int ret;
> > > > +
> > > > +	ret = kstrtoull(buf, 0, &limit);
> > > > +	if (ret < 0)
> > > > +		return ret;
> > > > +
> > > > +	down_write(&zram->init_lock);
> > > > +	zram->limit_bytes = limit;
> > > > +	ret = len;
> > > > +	up_write(&zram->init_lock);
> > > > +	return ret;
> > > > +}
> > > > +
> > > >  static ssize_t max_comp_streams_store(struct device *dev,
> > > >  		struct device_attribute *attr, const char *buf, size_t len)
> > > >  {
> > > > @@ -511,6 +542,14 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
> > > >  		ret = -ENOMEM;
> > > >  		goto out;
> > > >  	}
> > > > +
> > > > +	if (zram->limit_bytes &&
> > > > +		zs_get_total_size_bytes(meta->mem_pool) >= zram->limit_bytes) {
> > > > +		zs_free(meta->mem_pool, handle);
> > > > +		ret = -ENOMEM;
> > > > +		goto out;
> > > > +	}
> > > > +
> > > >  	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
> > > >  
> > > >  	if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) {
> > > > @@ -854,6 +893,7 @@ static DEVICE_ATTR(reset, S_IWUSR, NULL, reset_store);
> > > >  static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
> > > >  static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
> > > >  static DEVICE_ATTR(mem_used_max, S_IRUGO, mem_used_max_show, NULL);
> > > > +static DEVICE_ATTR(mem_limit, S_IRUGO, mem_limit_show, mem_limit_store);
> > > >  static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
> > > >  		max_comp_streams_show, max_comp_streams_store);
> > > >  static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
> > > > @@ -883,6 +923,7 @@ static struct attribute *zram_disk_attrs[] = {
> > > >  	&dev_attr_compr_data_size.attr,
> > > >  	&dev_attr_mem_used_total.attr,
> > > >  	&dev_attr_mem_used_max.attr,
> > > > +	&dev_attr_mem_limit.attr,
> > > >  	&dev_attr_max_comp_streams.attr,
> > > >  	&dev_attr_comp_algorithm.attr,
> > > >  	NULL,
> > > > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> > > > index 7f21c145e317..c0d497ff6efc 100644
> > > > --- a/drivers/block/zram/zram_drv.h
> > > > +++ b/drivers/block/zram/zram_drv.h
> > > > @@ -99,6 +99,7 @@ struct zram {
> > > >  	 * we can store in a disk.
> > > >  	 */
> > > >  	u64 disksize;	/* bytes */
> > > > +	u64 limit_bytes;
> > > >  	int max_comp_streams;
> > > >  	struct zram_stats stats;
> > > >  	char compressor[10];
> > > > -- 
> > > > 2.0.0
> > > 
> > > -- 
> > > Kind regards,
> > > Minchan Kim
> > > 
> 
> -- 
> Kind regards,
> Minchan Kim
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 3/3] zram: limit memory size for zram
  2014-08-13 23:27         ` Minchan Kim
@ 2014-08-14 14:45           ` Dan Streetman
  0 siblings, 0 replies; 54+ messages in thread
From: Dan Streetman @ 2014-08-14 14:45 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Linux-MM, Jerome Marchand, linux-kernel,
	juno.choi, seungho1.park, Luigi Semenzato, Nitin Gupta

On Wed, Aug 13, 2014 at 7:27 PM, Minchan Kim <minchan@kernel.org> wrote:
> Hey Sergey,
>
> On Tue, Aug 05, 2014 at 10:16:15PM +0900, Sergey Senozhatsky wrote:
>> Hello,
>>
>> On (08/05/14 18:48), Minchan Kim wrote:
>> > Another idea: we could define void zs_limit_mem(unsigned long nr_pages)
>> > in zsmalloc and put the limit in zs_pool via a new API from zram, so that
>> > zs_malloc could fail as soon as it exceeds the limit.
>> >
>> > In the end, zram doesn't need to call zs_get_total_size_bytes on every
>> > write. It's cleaner and the right layer, IMHO.
>>
>> yes, I think this one is better.
>
> Although I suggested this new one, a few days ago I changed my decision
> and was testing a new patchset.
>
> If we add a new API to zsmalloc, it adds unnecessary overhead for users who
> don't care about the limit. Although it's cheap, I'd like to avoid that.
>
> zsmalloc is just an allocator, so anybody can use it if they want.
> But the limit is only a requirement of zram, which is just one of the
> clients that can potentially use zsmalloc, so the accounting should be
> in zram, not zsmalloc.
>
> If we get more users of zsmalloc in the future and they all want this
> feature of limiting zsmalloc memory usage, we can move the feature
> from the clients into the zsmalloc core, so everybody would be happy for
> performance and readability; the opposite move would be painful.
>
> In summary, let's keep the accounting logic on the client side of zsmalloc
> (ie, zram) for the moment, but we could possibly move it into the zsmalloc
> core in the future.
>
> Any thoughts?

I agree - the limit is useful, and right now it is better to put it in zram.

Moving it into zsmalloc (and zbud, and zpool) should be possible in
the future, if it makes sense, although it may be more complicated
since different users might want different ways of controlling it -
e.g. zram may want a hard limit of X MB, while zswap currently wants a
limit expressed as a % of total memory.
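
For context, a simplified sketch of those two limit styles (names assumed;
compare zswap's zswap_max_pool_percent module parameter):

    /* zram-style hard cap, in bytes: */
    if (limit_bytes &&
        zs_get_total_size_bytes(pool) >= limit_bytes)
            return -ENOMEM;

    /* zswap-style cap, as a percentage of total RAM, in pages: */
    if (pool_pages > totalram_pages * max_pool_percent / 100)
            return -ENOMEM;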

>
>>
>>       -ss
>>
>> > On Tue, Aug 05, 2014 at 05:02:03PM +0900, Minchan Kim wrote:
>> > > I have received a request several time from zram users.
>> > > They want to limit memory size for zram because zram can consume
>> > > lot of memory on system without limit so it makes memory management
>> > > control hard.
>> > >
>> > > This patch adds new knob to limit memory of zram.
>> > >
>> > > Signed-off-by: Minchan Kim <minchan@kernel.org>
>> > > ---
>> > >  Documentation/blockdev/zram.txt |  1 +
>> > >  drivers/block/zram/zram_drv.c   | 41 +++++++++++++++++++++++++++++++++++++++++
>> > >  drivers/block/zram/zram_drv.h   |  1 +
>> > >  3 files changed, 43 insertions(+)
>> > >
>> > > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
>> > > index d24534bee763..fcb0561dfe2e 100644
>> > > --- a/Documentation/blockdev/zram.txt
>> > > +++ b/Documentation/blockdev/zram.txt
>> > > @@ -96,6 +96,7 @@ size of the disk when not in use so a huge zram is wasteful.
>> > >           compr_data_size
>> > >           mem_used_total
>> > >           mem_used_max
>> > > +         mem_limit
>> > >
>> > >  7) Deactivate:
>> > >   swapoff /dev/zram0
>> > > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>> > > index a4d637b4db7d..47f68bbb2c44 100644
>> > > --- a/drivers/block/zram/zram_drv.c
>> > > +++ b/drivers/block/zram/zram_drv.c
>> > > @@ -137,6 +137,37 @@ static ssize_t max_comp_streams_show(struct device *dev,
>> > >   return scnprintf(buf, PAGE_SIZE, "%d\n", val);
>> > >  }
>> > >
>> > > +static ssize_t mem_limit_show(struct device *dev,
>> > > +         struct device_attribute *attr, char *buf)
>> > > +{
>> > > + u64 val;
>> > > + struct zram *zram = dev_to_zram(dev);
>> > > +
>> > > + down_read(&zram->init_lock);
>> > > + val = zram->limit_bytes;
>> > > + up_read(&zram->init_lock);
>> > > +
>> > > + return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
>> > > +}
>> > > +
>> > > +static ssize_t mem_limit_store(struct device *dev,
>> > > +         struct device_attribute *attr, const char *buf, size_t len)
>> > > +{
>> > > + u64 limit;
>> > > + struct zram *zram = dev_to_zram(dev);
>> > > + int ret;
>> > > +
>> > > + ret = kstrtoull(buf, 0, &limit);
>> > > + if (ret < 0)
>> > > +         return ret;
>> > > +
>> > > + down_write(&zram->init_lock);
>> > > + zram->limit_bytes = limit;
>> > > + ret = len;
>> > > + up_write(&zram->init_lock);
>> > > + return ret;
>> > > +}
>> > > +
>> > >  static ssize_t max_comp_streams_store(struct device *dev,
>> > >           struct device_attribute *attr, const char *buf, size_t len)
>> > >  {
>> > > @@ -511,6 +542,14 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>> > >           ret = -ENOMEM;
>> > >           goto out;
>> > >   }
>> > > +
>> > > + if (zram->limit_bytes &&
>> > > +         zs_get_total_size_bytes(meta->mem_pool) >= zram->limit_bytes) {
>> > > +         zs_free(meta->mem_pool, handle);
>> > > +         ret = -ENOMEM;
>> > > +         goto out;
>> > > + }
>> > > +
>> > >   cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
>> > >
>> > >   if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) {
>> > > @@ -854,6 +893,7 @@ static DEVICE_ATTR(reset, S_IWUSR, NULL, reset_store);
>> > >  static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
>> > >  static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
>> > >  static DEVICE_ATTR(mem_used_max, S_IRUGO, mem_used_max_show, NULL);
>> > > +static DEVICE_ATTR(mem_limit, S_IRUGO, mem_limit_show, mem_limit_store);
>> > >  static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
>> > >           max_comp_streams_show, max_comp_streams_store);
>> > >  static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
>> > > @@ -883,6 +923,7 @@ static struct attribute *zram_disk_attrs[] = {
>> > >   &dev_attr_compr_data_size.attr,
>> > >   &dev_attr_mem_used_total.attr,
>> > >   &dev_attr_mem_used_max.attr,
>> > > + &dev_attr_mem_limit.attr,
>> > >   &dev_attr_max_comp_streams.attr,
>> > >   &dev_attr_comp_algorithm.attr,
>> > >   NULL,
>> > > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
>> > > index 7f21c145e317..c0d497ff6efc 100644
>> > > --- a/drivers/block/zram/zram_drv.h
>> > > +++ b/drivers/block/zram/zram_drv.h
>> > > @@ -99,6 +99,7 @@ struct zram {
>> > >    * we can store in a disk.
>> > >    */
>> > >   u64 disksize;   /* bytes */
>> > > + u64 limit_bytes;
>> > >   int max_comp_streams;
>> > >   struct zram_stats stats;
>> > >   char compressor[10];
>> > > --
>> > > 2.0.0
>> >
>> > --
>> > Kind regards,
>> > Minchan Kim
>> >
>
> --
> Kind regards,
> Minchan Kim
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 3/3] zram: limit memory size for zram
  2014-08-14 13:29           ` Sergey Senozhatsky
@ 2014-08-17 23:32             ` Minchan Kim
  0 siblings, 0 replies; 54+ messages in thread
From: Minchan Kim @ 2014-08-17 23:32 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: linux-mm, Jerome Marchand, linux-kernel, juno.choi,
	seungho1.park, Luigi Semenzato, Nitin Gupta

Hello Sergey,

On Thu, Aug 14, 2014 at 10:29:53PM +0900, Sergey Senozhatsky wrote:
> Hello Minchan,
> 
> On (08/14/14 08:27), Minchan Kim wrote:
> > Date: Thu, 14 Aug 2014 08:27:19 +0900
> > From: Minchan Kim <minchan@kernel.org>
> > To: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> > Cc: linux-mm@kvack.org, Jerome Marchand <jmarchan@redhat.com>,
> >  linux-kernel@vger.kernel.org, juno.choi@lge.com, seungho1.park@lge.com,
> >  Luigi Semenzato <semenzato@google.com>, Nitin Gupta <ngupta@vflare.org>
> > Subject: Re: [RFC 3/3] zram: limit memory size for zram
> > User-Agent: Mutt/1.5.21 (2010-09-15)
> > 
> > Hey Sergey,
> > 
> > On Tue, Aug 05, 2014 at 10:16:15PM +0900, Sergey Senozhatsky wrote:
> > > Hello,
> > > 
> > > On (08/05/14 18:48), Minchan Kim wrote:
> > > > Another idea: we could define void zs_limit_mem(unsigned long nr_pages)
> > > > in zsmalloc and put the limit in zs_pool via a new API from zram, so that
> > > > zs_malloc could fail as soon as it exceeds the limit.
> > > > 
> > > > In the end, zram doesn't need to call zs_get_total_size_bytes on every
> > > > write. It's cleaner and the right layer, IMHO.
> > > 
> > > yes, I think this one is better.
> > 
> > Although I suggested this new one, a few days ago I changed my decision
> > and was testing a new patchset.
> > 
> > If we add a new API to zsmalloc, it adds unnecessary overhead for users who
> > don't care about the limit. Although it's cheap, I'd like to avoid that.
> > 
> > zsmalloc is just an allocator, so anybody can use it if they want.
> > But the limit is only a requirement of zram, which is just one of the
> > clients that can potentially use zsmalloc, so the accounting should be
> > in zram, not zsmalloc.
> > 
> 
> my motivation was that zram does not use that much memory itself;
> the zs_pool does. zram is just a clueless client from that point of
> view: it receives some requests, does some things with the supplied
> data, and asks the zs_pool whether the latter can find some place to
> keep that data (and zram doesn't really care how that memory will be
> allocated, or whether it will be at all).

Normally, malloc(3) doesn't give any API to limit memory size for the
process. It just exposes APIs that report state (ex, mallopt) to the
user, so it's the user's role to manage the memory. I think it's the
same with zsmalloc. zsmalloc already exposes zs_get_total_size_bytes,
so a client can enforce a limit if it wants to; the frequent API calls
(ex, zs_get_total_size_bytes) are then that client's overhead, while
others who don't need a limit pay no overhead.
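
Condensed, the client-side pattern being argued for here is what the patch
3/3 hunk quoted below does (error handling simplified for illustration):

    handle = zs_malloc(meta->mem_pool, clen);
    if (!handle)
            return -ENOMEM;
    /* zram, not zsmalloc, enforces its own cap after the fact */
    if (zram->limit_bytes &&
        zs_get_total_size_bytes(meta->mem_pool) >= zram->limit_bytes) {
            zs_free(meta->mem_pool, handle);
            return -ENOMEM;
    }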

> 
> I'm OK with having the memory limit in zram, though conceptually,
> IMHO, such logic belongs to the allocation layer. Yet I admit the
> potential overhead issue.
> 
> > If we get more users of zsmalloc in the future and they all want this
> > feature of limiting zsmalloc memory usage, we can move the feature
> > from the clients into the zsmalloc core, so everybody would be happy for
> > performance and readability; the opposite move would be painful.
> > 
> > In summary, let's keep the accounting logic on the client side of zsmalloc
> > (ie, zram) for the moment, but we could possibly move it into the zsmalloc
> > core in the future.
> > 
> > Any thoughts?
> 
> agreed.

Thanks for the comment, Sergey!

> 
> 	-ss
> 
> > > 
> > > 	-ss
> > > 
> > > > On Tue, Aug 05, 2014 at 05:02:03PM +0900, Minchan Kim wrote:
> > > > > I have received a request several time from zram users.
> > > > > They want to limit memory size for zram because zram can consume
> > > > > lot of memory on system without limit so it makes memory management
> > > > > control hard.
> > > > > 
> > > > > This patch adds new knob to limit memory of zram.
> > > > > 
> > > > > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > > > > ---
> > > > >  Documentation/blockdev/zram.txt |  1 +
> > > > >  drivers/block/zram/zram_drv.c   | 41 +++++++++++++++++++++++++++++++++++++++++
> > > > >  drivers/block/zram/zram_drv.h   |  1 +
> > > > >  3 files changed, 43 insertions(+)
> > > > > 
> > > > > diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
> > > > > index d24534bee763..fcb0561dfe2e 100644
> > > > > --- a/Documentation/blockdev/zram.txt
> > > > > +++ b/Documentation/blockdev/zram.txt
> > > > > @@ -96,6 +96,7 @@ size of the disk when not in use so a huge zram is wasteful.
> > > > >  		compr_data_size
> > > > >  		mem_used_total
> > > > >  		mem_used_max
> > > > > +		mem_limit
> > > > >  
> > > > >  7) Deactivate:
> > > > >  	swapoff /dev/zram0
> > > > > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > > > > index a4d637b4db7d..47f68bbb2c44 100644
> > > > > --- a/drivers/block/zram/zram_drv.c
> > > > > +++ b/drivers/block/zram/zram_drv.c
> > > > > @@ -137,6 +137,37 @@ static ssize_t max_comp_streams_show(struct device *dev,
> > > > >  	return scnprintf(buf, PAGE_SIZE, "%d\n", val);
> > > > >  }
> > > > >  
> > > > > +static ssize_t mem_limit_show(struct device *dev,
> > > > > +		struct device_attribute *attr, char *buf)
> > > > > +{
> > > > > +	u64 val;
> > > > > +	struct zram *zram = dev_to_zram(dev);
> > > > > +
> > > > > +	down_read(&zram->init_lock);
> > > > > +	val = zram->limit_bytes;
> > > > > +	up_read(&zram->init_lock);
> > > > > +
> > > > > +	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
> > > > > +}
> > > > > +
> > > > > +static ssize_t mem_limit_store(struct device *dev,
> > > > > +		struct device_attribute *attr, const char *buf, size_t len)
> > > > > +{
> > > > > +	u64 limit;
> > > > > +	struct zram *zram = dev_to_zram(dev);
> > > > > +	int ret;
> > > > > +
> > > > > +	ret = kstrtoull(buf, 0, &limit);
> > > > > +	if (ret < 0)
> > > > > +		return ret;
> > > > > +
> > > > > +	down_write(&zram->init_lock);
> > > > > +	zram->limit_bytes = limit;
> > > > > +	ret = len;
> > > > > +	up_write(&zram->init_lock);
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > >  static ssize_t max_comp_streams_store(struct device *dev,
> > > > >  		struct device_attribute *attr, const char *buf, size_t len)
> > > > >  {
> > > > > @@ -511,6 +542,14 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
> > > > >  		ret = -ENOMEM;
> > > > >  		goto out;
> > > > >  	}
> > > > > +
> > > > > +	if (zram->limit_bytes &&
> > > > > +		zs_get_total_size_bytes(meta->mem_pool) >= zram->limit_bytes) {
> > > > > +		zs_free(meta->mem_pool, handle);
> > > > > +		ret = -ENOMEM;
> > > > > +		goto out;
> > > > > +	}
> > > > > +
> > > > >  	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
> > > > >  
> > > > >  	if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) {
> > > > > @@ -854,6 +893,7 @@ static DEVICE_ATTR(reset, S_IWUSR, NULL, reset_store);
> > > > >  static DEVICE_ATTR(orig_data_size, S_IRUGO, orig_data_size_show, NULL);
> > > > >  static DEVICE_ATTR(mem_used_total, S_IRUGO, mem_used_total_show, NULL);
> > > > >  static DEVICE_ATTR(mem_used_max, S_IRUGO, mem_used_max_show, NULL);
> > > > > +static DEVICE_ATTR(mem_limit, S_IRUGO, mem_limit_show, mem_limit_store);
> > > > >  static DEVICE_ATTR(max_comp_streams, S_IRUGO | S_IWUSR,
> > > > >  		max_comp_streams_show, max_comp_streams_store);
> > > > >  static DEVICE_ATTR(comp_algorithm, S_IRUGO | S_IWUSR,
> > > > > @@ -883,6 +923,7 @@ static struct attribute *zram_disk_attrs[] = {
> > > > >  	&dev_attr_compr_data_size.attr,
> > > > >  	&dev_attr_mem_used_total.attr,
> > > > >  	&dev_attr_mem_used_max.attr,
> > > > > +	&dev_attr_mem_limit.attr,
> > > > >  	&dev_attr_max_comp_streams.attr,
> > > > >  	&dev_attr_comp_algorithm.attr,
> > > > >  	NULL,
> > > > > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> > > > > index 7f21c145e317..c0d497ff6efc 100644
> > > > > --- a/drivers/block/zram/zram_drv.h
> > > > > +++ b/drivers/block/zram/zram_drv.h
> > > > > @@ -99,6 +99,7 @@ struct zram {
> > > > >  	 * we can store in a disk.
> > > > >  	 */
> > > > >  	u64 disksize;	/* bytes */
> > > > > +	u64 limit_bytes;
> > > > >  	int max_comp_streams;
> > > > >  	struct zram_stats stats;
> > > > >  	char compressor[10];
> > > > > -- 
> > > > > 2.0.0
> > > > 
> > > > -- 
> > > > Kind regards,
> > > > Minchan Kim
> > > > 
> > 
> > -- 
> > Kind regards,
> > Minchan Kim
> > 
> 

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 2/3] zsmalloc/zram: add zs_get_max_size_bytes and use it in zram
  2014-08-08  2:56 David Horner
@ 2014-08-12  7:18 ` Minchan Kim
  0 siblings, 0 replies; 54+ messages in thread
From: Minchan Kim @ 2014-08-12  7:18 UTC (permalink / raw)
  To: David Horner; +Cc: linux-mm

Hello,

Sorry for the late response. I was on vacation and then was busy.

On Fri, Aug 08, 2014 at 02:56:24AM +0000, David Horner wrote:
> 
>  [2/3]
> 
> 
>  But why isn't mem_used_max writable? (to save tearing down and rebuilding
>  the device just to reset the max)

I don't know what you mean, but I will make it writable so the user can
reset it to zero when they want.

> 
>  static DEVICE_ATTR(mem_used_max, S_IRUGO, mem_used_max_show, NULL);
> 
>  static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show, mem_used_max_store);
> 
>    with a check in the store() that the new value is positive and less
> than current max?
> 
> 
>  I'm also a little puzzled why there is a new API zs_get_max_size_bytes if
>  the data is accessible through sysfs?
>  Especially if the max limit will be (as you propose for [3/3]) handled
>  through zsmalloc, and hence zram needn't access it.

I don't understand what you meant.
Anyway, I will resend a revised version and Cc you.
Please comment on that. :)

> 
> 
> 
>   [3/3]
>  I concur that the zram limit is best implemented in zsmalloc.
>  I am looking forward to that revised code.

Thanks!

> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC 2/3] zsmalloc/zram: add zs_get_max_size_bytes and use it in zram
@ 2014-08-08  2:56 David Horner
  2014-08-12  7:18 ` Minchan Kim
  0 siblings, 1 reply; 54+ messages in thread
From: David Horner @ 2014-08-08  2:56 UTC (permalink / raw)
  To: linux-mm


 [2/3]


 But why isn't mem_used_max writable? (to save tearing down and rebuilding
 the device just to reset the max)

 static DEVICE_ATTR(mem_used_max, S_IRUGO, mem_used_max_show, NULL);

 static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show, mem_used_max_store);

   with a check in the store() that the new value is positive and less
than current max?
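
 A sketch of what that writable knob could look like, with the check
 suggested above; the store function and the max_used_bytes field are
 hypothetical names, not from the posted patches:

    static ssize_t mem_used_max_store(struct device *dev,
                    struct device_attribute *attr, const char *buf, size_t len)
    {
            u64 val;
            struct zram *zram = dev_to_zram(dev);
            int ret;

            ret = kstrtoull(buf, 0, &val);
            if (ret < 0)
                    return ret;

            down_write(&zram->init_lock);
            /* only allow lowering the recorded max, typically to 0 */
            if (val > zram->max_used_bytes) {       /* hypothetical field */
                    up_write(&zram->init_lock);
                    return -EINVAL;
            }
            zram->max_used_bytes = val;
            up_write(&zram->init_lock);

            return len;
    }

    static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR,
                    mem_used_max_show, mem_used_max_store);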


 I'm also a little puzzled why there is a new API zs_get_max_size_bytes if
 the data is accessible through sysfs?
 Especially if the max limit will be (as you propose for [3/3]) handled
 through zsmalloc, and hence zram needn't access it.



  [3/3]
 I concur that the zram limit is best implemented in zsmalloc.
 I am looking forward to that revised code.





^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 2/3] zsmalloc/zram: add zs_get_max_size_bytes and use it in zram
@ 2014-08-08  2:47 David Horner
  0 siblings, 0 replies; 54+ messages in thread
From: David Horner @ 2014-08-08  2:47 UTC (permalink / raw)
  To: minchan; +Cc: linux-kernel-mm, linux-kernel

 [2/3]


 But why isn't mem_used_max writable? (to save tearing down and rebuilding
 the device just to reset the max)

 static DEVICE_ATTR(mem_used_max, S_IRUGO, mem_used_max_show, NULL);

 static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show,
 		    mem_used_max_store);

   with a check in the store() that the new value is positive and less
than the current max?


 I'm also a little puzzled: why is there a new API zs_get_max_size_bytes
 if the data is accessible through sysfs?
 Especially since the max limit will (as you propose for [3/3]) be accessed
 through zsmalloc, and hence zram needn't access it.



  [3/3]
 I concur that the zram limit is best implemented in zsmalloc.
 I am looking forward to that revised code.


> From: Minchan Kim <minchan <at> kernel.org>
> Subject: [RFC 2/3] zsmalloc/zram: add zs_get_max_size_bytes and use it in
> zram
> Newsgroups: gmane.linux.kernel.mm, gmane.linux.kernel
> Date: 2014-08-05 08:02:02 GMT (5 hours and 4 minutes ago)
>
> Normally, a zram user can get the maximum memory zsmalloc consumed by
> polling mem_used_total via sysfs from userspace.
>
> But this has a critical problem: the user can miss the peak memory
> usage during the polling interval, and the gap could be huge when
> memory pressure is really heavy.
>
> This patch adds a new API, zs_get_max_size_bytes, to zsmalloc so the
> user (e.g., zram) doesn't need to poll at short intervals to get an
> exact value.
>
> The user can just read the max memory usage once the test workload is
> done. It's handy and accurate.
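
The mechanism, as I read it, would be roughly the following -- the
field names are my guesses (pages_allocated per patch 1; max_size_bytes
is illustrative), not the posted diff:

/*
 * Called wherever pages_allocated grows (e.g. from zs_malloc());
 * the unsynchronized peak update is a simplification -- the real
 * code would need a cmpxchg loop or a lock.
 */
static void record_max_usage(struct zs_pool *pool)
{
	u64 cur = (u64)atomic_long_read(&pool->pages_allocated)
			<< PAGE_SHIFT;

	if (cur > pool->max_size_bytes)
		pool->max_size_bytes = cur;
}

u64 zs_get_max_size_bytes(struct zs_pool *pool)
{
	return pool->max_size_bytes;
}
EXPORT_SYMBOL_GPL(zs_get_max_size_bytes);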

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2014-08-17 23:32 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-05  8:02 [RFC 0/3] zram memory control enhance Minchan Kim
2014-08-05  8:02 ` Minchan Kim
2014-08-05  8:02 ` [RFC 1/3] zsmalloc: move pages_allocated to zs_pool Minchan Kim
2014-08-05  8:02   ` Minchan Kim
2014-08-13 13:59   ` Dan Streetman
2014-08-13 13:59     ` Dan Streetman
2014-08-13 14:14     ` Sergey Senozhatsky
2014-08-13 14:14       ` Sergey Senozhatsky
2014-08-13 14:51       ` Dan Streetman
2014-08-13 14:51         ` Dan Streetman
2014-08-13 15:13         ` Sergey Senozhatsky
2014-08-13 15:13           ` Sergey Senozhatsky
2014-08-13 15:25           ` Sergey Senozhatsky
2014-08-13 15:25             ` Sergey Senozhatsky
2014-08-13 16:11             ` Dan Streetman
2014-08-13 16:11               ` Dan Streetman
2014-08-14 13:03               ` Sergey Senozhatsky
2014-08-14 13:03                 ` Sergey Senozhatsky
2014-08-14  0:09             ` Minchan Kim
2014-08-14  0:09               ` Minchan Kim
2014-08-13 15:21   ` Seth Jennings
2014-08-13 15:21     ` Seth Jennings
2014-08-05  8:02 ` [RFC 2/3] zsmalloc/zram: add zs_get_max_size_bytes and use it in zram Minchan Kim
2014-08-05  8:02   ` Minchan Kim
2014-08-13 15:25   ` Seth Jennings
2014-08-13 15:25     ` Seth Jennings
2014-08-05  8:02 ` [RFC 3/3] zram: limit memory size for zram Minchan Kim
2014-08-05  8:02   ` Minchan Kim
2014-08-05  9:48   ` Minchan Kim
2014-08-05  9:48     ` Minchan Kim
2014-08-05 13:16     ` Sergey Senozhatsky
2014-08-05 13:16       ` Sergey Senozhatsky
2014-08-06  6:52       ` Minchan Kim
2014-08-06  6:52         ` Minchan Kim
2014-08-13 15:30         ` Seth Jennings
2014-08-13 15:30           ` Seth Jennings
2014-08-13 23:31           ` Minchan Kim
2014-08-13 23:31             ` Minchan Kim
2014-08-13 23:27       ` Minchan Kim
2014-08-13 23:27         ` Minchan Kim
2014-08-14 13:29         ` Sergey Senozhatsky
2014-08-14 13:29           ` Sergey Senozhatsky
2014-08-17 23:32           ` Minchan Kim
2014-08-17 23:32             ` Minchan Kim
2014-08-14 14:45         ` Dan Streetman
2014-08-14 14:45           ` Dan Streetman
2014-08-06 12:54 ` [RFC 0/3] zram memory control enhance Jerome Marchand
2014-08-13 15:34 ` Seth Jennings
2014-08-13 15:34   ` Seth Jennings
2014-08-13 23:32   ` Minchan Kim
2014-08-13 23:32     ` Minchan Kim
2014-08-08  2:47 [RFC 2/3] zsmalloc/zram: add zs_get_max_size_bytes and use it in zram David Horner
2014-08-08  2:56 David Horner
2014-08-12  7:18 ` Minchan Kim
