* [PATCH v5] z3fold: add shrinker
@ 2016-10-15 11:56 Vitaly Wool
  2016-10-15 11:58 ` [PATCH v5 1/3] z3fold: make counters atomic Vitaly Wool
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Vitaly Wool @ 2016-10-15 11:56 UTC (permalink / raw)
  To: Linux-MM, linux-kernel; +Cc: Dan Streetman, Andrew Morton, Dave Chinner


This patch set implements a shrinker for z3fold, preceded by some
code optimizations and preparations that I thought were reasonable
to have as separate patches.

This patch set has been verified on x86 and on QEMU emulating
Versatile Express. The fio results for the resulting code are:
Run status group 0 (all jobs):
  WRITE: io=3200.0MB, aggrb=13095KB/s, minb=3273KB/s, maxb=3284KB/s, mint=249447msec, maxt=250214msec

Run status group 1 (all jobs):
   READ: io=3200.0MB, aggrb=28992KB/s, minb=7248KB/s, maxb=7273KB/s, mint=112623msec, maxt=113021msec

Run status group 2 (all jobs):
   READ: io=1595.2MB, aggrb=8825KB/s, minb=2194KB/s, maxb=2224KB/s, mint=184517msec, maxt=185077msec
  WRITE: io=1604.9MB, aggrb=8879KB/s, minb=2207KB/s, maxb=2245KB/s, mint=184519msec, maxt=185079msec

Run status group 3 (all jobs):
   READ: io=1600.6MB, aggrb=8413KB/s, minb=2084KB/s, maxb=2132KB/s, mint=193286msec, maxt=194803msec
  WRITE: io=1599.5MB, aggrb=8406KB/s, minb=2099KB/s, maxb=2120KB/s, mint=193290msec, maxt=194825msec

Disk stats (read/write):
  zram0: ios=1636792/1638952, merge=0/0, ticks=169250/462410, in_queue=633700, util=85.33%

Just for comparison, zsmalloc gives slightly worse results:
Run status group 0 (all jobs):
  WRITE: io=3200.0MB, aggrb=12827KB/s, minb=3206KB/s, maxb=3230KB/s, mint=253603msec, maxt=255450msec

Run status group 1 (all jobs):
   READ: io=3200.0MB, aggrb=26184KB/s, minb=6546KB/s, maxb=6556KB/s, mint=124940msec, maxt=125144msec

Run status group 2 (all jobs):
   READ: io=1595.2MB, aggrb=8549KB/s, minb=2123KB/s, maxb=2162KB/s, mint=190151msec, maxt=191049msec
  WRITE: io=1604.9MB, aggrb=8601KB/s, minb=2145KB/s, maxb=2172KB/s, mint=190153msec, maxt=191051msec

Run status group 3 (all jobs):
   READ: io=1600.6MB, aggrb=8147KB/s, minb=2026KB/s, maxb=2049KB/s, mint=200339msec, maxt=201154msec
  WRITE: io=1599.5MB, aggrb=8142KB/s, minb=2023KB/s, maxb=2062KB/s, mint=200343msec, maxt=201158msec

Disk stats (read/write):
  zram0: ios=1637032/1639304, merge=0/0, ticks=175840/458740, in_queue=637140, util=82.48%

Signed-off-by: Vitaly Wool <vitalywool@gmail.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v5 1/3] z3fold: make counters atomic
  2016-10-15 11:56 [PATCH v5] z3fold: add shrinker Vitaly Wool
@ 2016-10-15 11:58 ` Vitaly Wool
  2016-10-17 20:37   ` Dan Streetman
  2016-10-15 11:59 ` [PATCH v5 2/3] z3fold: remove redundant locking Vitaly Wool
  2016-10-15 12:05 ` [PATCH v5 3/3] z3fold: add shrinker Vitaly Wool
  2 siblings, 1 reply; 14+ messages in thread
From: Vitaly Wool @ 2016-10-15 11:58 UTC (permalink / raw)
  To: Linux-MM, linux-kernel; +Cc: Dan Streetman, Andrew Morton, Dave Chinner

This patch converts the pages_nr per-pool counter to atomic64_t.
It also introduces a new counter, unbuddied_nr, also atomic64_t,
to track the number of unbuddied (shrinkable) pages, as a step to
prepare for the z3fold shrinker implementation.

Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
---
 mm/z3fold.c | 33 +++++++++++++++++++++++++--------
 1 file changed, 25 insertions(+), 8 deletions(-)

diff --git a/mm/z3fold.c b/mm/z3fold.c
index 8f9e89c..5197d7b 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -69,6 +69,7 @@ struct z3fold_ops {
  * @lru:	list tracking the z3fold pages in LRU order by most recently
  *		added buddy.
  * @pages_nr:	number of z3fold pages in the pool.
+ * @unbuddied_nr:	number of unbuddied z3fold pages in the pool.
  * @ops:	pointer to a structure of user defined operations specified at
  *		pool creation time.
  *
@@ -80,7 +81,8 @@ struct z3fold_pool {
 	struct list_head unbuddied[NCHUNKS];
 	struct list_head buddied;
 	struct list_head lru;
-	u64 pages_nr;
+	atomic64_t pages_nr;
+	atomic64_t unbuddied_nr;
 	const struct z3fold_ops *ops;
 	struct zpool *zpool;
 	const struct zpool_ops *zpool_ops;
@@ -234,7 +236,8 @@ static struct z3fold_pool *z3fold_create_pool(gfp_t gfp,
 		INIT_LIST_HEAD(&pool->unbuddied[i]);
 	INIT_LIST_HEAD(&pool->buddied);
 	INIT_LIST_HEAD(&pool->lru);
-	pool->pages_nr = 0;
+	atomic64_set(&pool->pages_nr, 0);
+	atomic64_set(&pool->unbuddied_nr, 0);
 	pool->ops = ops;
 	return pool;
 }
@@ -334,6 +337,7 @@ static int z3fold_alloc(struct z3fold_pool *pool, size_t size, gfp_t gfp,
 					continue;
 				}
 				list_del(&zhdr->buddy);
+				atomic64_dec(&pool->unbuddied_nr);
 				goto found;
 			}
 		}
@@ -346,7 +350,7 @@ static int z3fold_alloc(struct z3fold_pool *pool, size_t size, gfp_t gfp,
 	if (!page)
 		return -ENOMEM;
 	spin_lock(&pool->lock);
-	pool->pages_nr++;
+	atomic64_inc(&pool->pages_nr);
 	zhdr = init_z3fold_page(page);
 
 	if (bud == HEADLESS) {
@@ -369,6 +373,7 @@ found:
 		/* Add to unbuddied list */
 		freechunks = num_free_chunks(zhdr);
 		list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
+		atomic64_inc(&pool->unbuddied_nr);
 	} else {
 		/* Add to buddied list */
 		list_add(&zhdr->buddy, &pool->buddied);
@@ -412,6 +417,11 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
 		/* HEADLESS page stored */
 		bud = HEADLESS;
 	} else {
+		if (zhdr->first_chunks == 0 ||
+		    zhdr->middle_chunks == 0 ||
+		    zhdr->last_chunks == 0)
+			atomic64_dec(&pool->unbuddied_nr);
+
 		bud = handle_to_buddy(handle);
 
 		switch (bud) {
@@ -429,6 +439,7 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
 			pr_err("%s: unknown bud %d\n", __func__, bud);
 			WARN_ON(1);
 			spin_unlock(&pool->lock);
+			atomic64_inc(&pool->unbuddied_nr);
 			return;
 		}
 	}
@@ -451,12 +462,13 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
 		list_del(&page->lru);
 		clear_bit(PAGE_HEADLESS, &page->private);
 		free_z3fold_page(zhdr);
-		pool->pages_nr--;
+		atomic64_dec(&pool->pages_nr);
 	} else {
 		z3fold_compact_page(zhdr);
 		/* Add to the unbuddied list */
 		freechunks = num_free_chunks(zhdr);
 		list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
+		atomic64_inc(&pool->unbuddied_nr);
 	}
 
 	spin_unlock(&pool->lock);
@@ -520,6 +532,11 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
 		zhdr = page_address(page);
 		if (!test_bit(PAGE_HEADLESS, &page->private)) {
 			list_del(&zhdr->buddy);
+			if (zhdr->first_chunks == 0 ||
+			    zhdr->middle_chunks == 0 ||
+			    zhdr->last_chunks == 0)
+				atomic64_dec(&pool->unbuddied_nr);
+
 			/*
 			 * We need encode the handles before unlocking, since
 			 * we can race with free that will set
@@ -569,7 +586,7 @@ next:
 			 */
 			clear_bit(PAGE_HEADLESS, &page->private);
 			free_z3fold_page(zhdr);
-			pool->pages_nr--;
+			atomic64_dec(&pool->pages_nr);
 			spin_unlock(&pool->lock);
 			return 0;
 		}  else if (!test_bit(PAGE_HEADLESS, &page->private)) {
@@ -584,6 +601,7 @@ next:
 				freechunks = num_free_chunks(zhdr);
 				list_add(&zhdr->buddy,
 					 &pool->unbuddied[freechunks]);
+				atomic64_inc(&pool->unbuddied_nr);
 			}
 		}
 
@@ -672,12 +690,11 @@ static void z3fold_unmap(struct z3fold_pool *pool, unsigned long handle)
  * z3fold_get_pool_size() - gets the z3fold pool size in pages
  * @pool:	pool whose size is being queried
  *
- * Returns: size in pages of the given pool.  The pool lock need not be
- * taken to access pages_nr.
+ * Returns: size in pages of the given pool.
  */
 static u64 z3fold_get_pool_size(struct z3fold_pool *pool)
 {
-	return pool->pages_nr;
+	return atomic64_read(&pool->pages_nr);
 }
 
 /*****************
-- 
2.4.2

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 2/3] z3fold: remove redundant locking
  2016-10-15 11:56 [PATCH v5] z3fold: add shrinker Vitaly Wool
  2016-10-15 11:58 ` [PATCH v5 1/3] z3fold: make counters atomic Vitaly Wool
@ 2016-10-15 11:59 ` Vitaly Wool
  2016-10-17 20:48   ` Dan Streetman
  2016-10-15 12:05 ` [PATCH v5 3/3] z3fold: add shrinker Vitaly Wool
  2 siblings, 1 reply; 14+ messages in thread
From: Vitaly Wool @ 2016-10-15 11:59 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: Linux-MM, linux-kernel, Dan Streetman, Andrew Morton, Dave Chinner

The per-pool z3fold spinlock should generally be taken only when
a non-atomic pool variable is modified. There's no need to take it
to map/unmap an object.

Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
---
 mm/z3fold.c | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/mm/z3fold.c b/mm/z3fold.c
index 5197d7b..10513b5 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -580,6 +580,7 @@ next:
 		if ((test_bit(PAGE_HEADLESS, &page->private) && ret == 0) ||
 		    (zhdr->first_chunks == 0 && zhdr->last_chunks == 0 &&
 		     zhdr->middle_chunks == 0)) {
+			spin_unlock(&pool->lock);
 			/*
 			 * All buddies are now free, free the z3fold page and
 			 * return success.
@@ -587,7 +588,6 @@ next:
 			clear_bit(PAGE_HEADLESS, &page->private);
 			free_z3fold_page(zhdr);
 			atomic64_dec(&pool->pages_nr);
-			spin_unlock(&pool->lock);
 			return 0;
 		}  else if (!test_bit(PAGE_HEADLESS, &page->private)) {
 			if (zhdr->first_chunks != 0 &&
@@ -629,7 +629,6 @@ static void *z3fold_map(struct z3fold_pool *pool, unsigned long handle)
 	void *addr;
 	enum buddy buddy;
 
-	spin_lock(&pool->lock);
 	zhdr = handle_to_z3fold_header(handle);
 	addr = zhdr;
 	page = virt_to_page(zhdr);
@@ -656,7 +655,6 @@ static void *z3fold_map(struct z3fold_pool *pool, unsigned long handle)
 		break;
 	}
 out:
-	spin_unlock(&pool->lock);
 	return addr;
 }
 
@@ -671,19 +669,14 @@ static void z3fold_unmap(struct z3fold_pool *pool, unsigned long handle)
 	struct page *page;
 	enum buddy buddy;
 
-	spin_lock(&pool->lock);
 	zhdr = handle_to_z3fold_header(handle);
 	page = virt_to_page(zhdr);
 
-	if (test_bit(PAGE_HEADLESS, &page->private)) {
-		spin_unlock(&pool->lock);
-		return;
+	if (!test_bit(PAGE_HEADLESS, &page->private)) {
+		buddy = handle_to_buddy(handle);
+		if (buddy == MIDDLE)
+			clear_bit(MIDDLE_CHUNK_MAPPED, &page->private);
 	}
-
-	buddy = handle_to_buddy(handle);
-	if (buddy == MIDDLE)
-		clear_bit(MIDDLE_CHUNK_MAPPED, &page->private);
-	spin_unlock(&pool->lock);
 }
 
 /**
-- 
2.4.2

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 3/3] z3fold: add shrinker
  2016-10-15 11:56 [PATCH v5] z3fold: add shrinker Vitaly Wool
  2016-10-15 11:58 ` [PATCH v5 1/3] z3fold: make counters atomic Vitaly Wool
  2016-10-15 11:59 ` [PATCH v5 2/3] z3fold: remove redundant locking Vitaly Wool
@ 2016-10-15 12:05 ` Vitaly Wool
  2016-10-18  2:06   ` Dan Streetman
  2 siblings, 1 reply; 14+ messages in thread
From: Vitaly Wool @ 2016-10-15 12:05 UTC (permalink / raw)
  To: Linux-MM, linux-kernel; +Cc: Dan Streetman, Andrew Morton, Dave Chinner

This patch implements a shrinker for z3fold. The shrinker does not
free up any pages directly, but it allows for denser placement of
compressed objects, which results in fewer actual pages consumed
and therefore a higher compression ratio.

This update removes z3fold page compaction from the freeing path,
since we can rely on the shrinker to do the job. Also, a new flag,
UNDER_COMPACTION, is introduced to protect against two threads
trying to compact the same page.

This patch has been checked against the latest Linus tree.

Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
---
 mm/z3fold.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 119 insertions(+), 25 deletions(-)

diff --git a/mm/z3fold.c b/mm/z3fold.c
index 10513b5..8f84d3c 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -27,6 +27,7 @@
 #include <linux/mm.h>
 #include <linux/module.h>
 #include <linux/preempt.h>
+#include <linux/shrinker.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
 #include <linux/zpool.h>
@@ -72,6 +73,7 @@ struct z3fold_ops {
  * @unbuddied_nr:	number of unbuddied z3fold pages in the pool.
  * @ops:	pointer to a structure of user defined operations specified at
  *		pool creation time.
+ * @shrinker:	shrinker structure to optimize page layout in background
  *
  * This structure is allocated at pool creation time and maintains metadata
  * pertaining to a particular z3fold pool.
@@ -86,6 +88,7 @@ struct z3fold_pool {
 	const struct z3fold_ops *ops;
 	struct zpool *zpool;
 	const struct zpool_ops *zpool_ops;
+	struct shrinker shrinker;
 };
 
 enum buddy {
@@ -121,6 +124,7 @@ enum z3fold_page_flags {
 	UNDER_RECLAIM = 0,
 	PAGE_HEADLESS,
 	MIDDLE_CHUNK_MAPPED,
+	UNDER_COMPACTION,
 };
 
 /*****************
@@ -136,6 +140,9 @@ static int size_to_chunks(size_t size)
 #define for_each_unbuddied_list(_iter, _begin) \
 	for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++)
 
+#define for_each_unbuddied_list_down(_iter, _end) \
+	for ((_iter) = (_end); (_iter) > 0; (_iter)--)
+
 /* Initializes the z3fold header of a newly allocated z3fold page */
 static struct z3fold_header *init_z3fold_page(struct page *page)
 {
@@ -145,6 +152,7 @@ static struct z3fold_header *init_z3fold_page(struct page *page)
 	clear_bit(UNDER_RECLAIM, &page->private);
 	clear_bit(PAGE_HEADLESS, &page->private);
 	clear_bit(MIDDLE_CHUNK_MAPPED, &page->private);
+	clear_bit(UNDER_COMPACTION, &page->private);
 
 	zhdr->first_chunks = 0;
 	zhdr->middle_chunks = 0;
@@ -211,6 +219,103 @@ static int num_free_chunks(struct z3fold_header *zhdr)
 	return nfree;
 }
 
+/* Has to be called with lock held */
+static int z3fold_compact_page(struct z3fold_header *zhdr, bool sync)
+{
+	struct page *page = virt_to_page(zhdr);
+	void *beg = zhdr;
+
+
+	if (!test_bit(MIDDLE_CHUNK_MAPPED, &page->private) &&
+	    !test_bit(UNDER_RECLAIM, &page->private) &&
+	    !test_bit(UNDER_COMPACTION, &page->private)) {
+		set_bit(UNDER_COMPACTION, &page->private);
+		if (zhdr->middle_chunks != 0 &&
+		    zhdr->first_chunks == 0 &&
+		    zhdr->last_chunks == 0) {
+			memmove(beg + ZHDR_SIZE_ALIGNED,
+				beg + (zhdr->start_middle << CHUNK_SHIFT),
+				zhdr->middle_chunks << CHUNK_SHIFT);
+			zhdr->first_chunks = zhdr->middle_chunks;
+			zhdr->middle_chunks = 0;
+			zhdr->start_middle = 0;
+			zhdr->first_num++;
+			clear_bit(UNDER_COMPACTION, &page->private);
+			return 1;
+		}
+		if (sync)
+			goto out;
+
+		/* moving data is expensive, so let's only do that if
+		 * there's substantial gain (2+ chunks)
+		 */
+		if (zhdr->middle_chunks != 0 && zhdr->first_chunks != 0 &&
+		    zhdr->last_chunks == 0 &&
+		    zhdr->start_middle > zhdr->first_chunks + 2) {
+			unsigned short new_start = zhdr->first_chunks + 1;
+			memmove(beg + (new_start << CHUNK_SHIFT),
+				beg + (zhdr->start_middle << CHUNK_SHIFT),
+				zhdr->middle_chunks << CHUNK_SHIFT);
+			zhdr->start_middle = new_start;
+			clear_bit(UNDER_COMPACTION, &page->private);
+			return 1;
+		}
+		if (zhdr->middle_chunks != 0 && zhdr->last_chunks != 0 &&
+		    zhdr->first_chunks == 0 &&
+		    zhdr->middle_chunks + zhdr->last_chunks <=
+		    NCHUNKS - zhdr->start_middle - 2) {
+			unsigned short new_start = NCHUNKS - zhdr->last_chunks -
+				zhdr->middle_chunks;
+			memmove(beg + (new_start << CHUNK_SHIFT),
+				beg + (zhdr->start_middle << CHUNK_SHIFT),
+				zhdr->middle_chunks << CHUNK_SHIFT);
+			zhdr->start_middle = new_start;
+			clear_bit(UNDER_COMPACTION, &page->private);
+			return 1;
+		}
+	}
+out:
+	clear_bit(UNDER_COMPACTION, &page->private);
+	return 0;
+}
+
+static unsigned long z3fold_shrink_count(struct shrinker *shrink,
+				struct shrink_control *sc)
+{
+	struct z3fold_pool *pool = container_of(shrink, struct z3fold_pool,
+						shrinker);
+
+	return atomic64_read(&pool->unbuddied_nr);
+}
+
+static unsigned long z3fold_shrink_scan(struct shrinker *shrink,
+				struct shrink_control *sc)
+{
+	struct z3fold_pool *pool = container_of(shrink, struct z3fold_pool,
+						shrinker);
+	struct z3fold_header *zhdr;
+	int i, nr_to_scan = sc->nr_to_scan, nr_shrunk = 0;
+
+	spin_lock(&pool->lock);
+	for_each_unbuddied_list_down(i, NCHUNKS - 3) {
+		if (!list_empty(&pool->unbuddied[i])) {
+			zhdr = list_first_entry(&pool->unbuddied[i],
+						struct z3fold_header, buddy);
+			list_del(&zhdr->buddy);
+			spin_unlock(&pool->lock);
+			nr_shrunk += z3fold_compact_page(zhdr, false);
+			spin_lock(&pool->lock);
+			list_add(&zhdr->buddy,
+				&pool->unbuddied[num_free_chunks(zhdr)]);
+			if (!--nr_to_scan)
+				break;
+		}
+	}
+	spin_unlock(&pool->lock);
+	return nr_shrunk;
+}
+
+
 /*****************
  * API Functions
 *****************/
@@ -230,7 +335,7 @@ static struct z3fold_pool *z3fold_create_pool(gfp_t gfp,
 
 	pool = kzalloc(sizeof(struct z3fold_pool), gfp);
 	if (!pool)
-		return NULL;
+		goto out;
 	spin_lock_init(&pool->lock);
 	for_each_unbuddied_list(i, 0)
 		INIT_LIST_HEAD(&pool->unbuddied[i]);
@@ -238,8 +343,19 @@ static struct z3fold_pool *z3fold_create_pool(gfp_t gfp,
 	INIT_LIST_HEAD(&pool->lru);
 	atomic64_set(&pool->pages_nr, 0);
 	atomic64_set(&pool->unbuddied_nr, 0);
+	pool->shrinker.count_objects = z3fold_shrink_count;
+	pool->shrinker.scan_objects = z3fold_shrink_scan;
+	pool->shrinker.seeks = DEFAULT_SEEKS;
+	pool->shrinker.batch = NCHUNKS - 4;
+	if (register_shrinker(&pool->shrinker))
+		goto out_free;
 	pool->ops = ops;
 	return pool;
+
+out_free:
+	kfree(pool);
+out:
+	return NULL;
 }
 
 /**
@@ -250,31 +366,10 @@ static struct z3fold_pool *z3fold_create_pool(gfp_t gfp,
  */
 static void z3fold_destroy_pool(struct z3fold_pool *pool)
 {
+	unregister_shrinker(&pool->shrinker);
 	kfree(pool);
 }
 
-/* Has to be called with lock held */
-static int z3fold_compact_page(struct z3fold_header *zhdr)
-{
-	struct page *page = virt_to_page(zhdr);
-	void *beg = zhdr;
-
-
-	if (!test_bit(MIDDLE_CHUNK_MAPPED, &page->private) &&
-	    zhdr->middle_chunks != 0 &&
-	    zhdr->first_chunks == 0 && zhdr->last_chunks == 0) {
-		memmove(beg + ZHDR_SIZE_ALIGNED,
-			beg + (zhdr->start_middle << CHUNK_SHIFT),
-			zhdr->middle_chunks << CHUNK_SHIFT);
-		zhdr->first_chunks = zhdr->middle_chunks;
-		zhdr->middle_chunks = 0;
-		zhdr->start_middle = 0;
-		zhdr->first_num++;
-		return 1;
-	}
-	return 0;
-}
-
 /**
  * z3fold_alloc() - allocates a region of a given size
  * @pool:	z3fold pool from which to allocate
@@ -464,7 +559,6 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
 		free_z3fold_page(zhdr);
 		atomic64_dec(&pool->pages_nr);
 	} else {
-		z3fold_compact_page(zhdr);
 		/* Add to the unbuddied list */
 		freechunks = num_free_chunks(zhdr);
 		list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
@@ -596,7 +690,7 @@ next:
 				/* Full, add to buddied list */
 				list_add(&zhdr->buddy, &pool->buddied);
 			} else {
-				z3fold_compact_page(zhdr);
+				z3fold_compact_page(zhdr, true);
 				/* add to unbuddied list */
 				freechunks = num_free_chunks(zhdr);
 				list_add(&zhdr->buddy,
-- 
2.4.2

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 1/3] z3fold: make counters atomic
  2016-10-15 11:58 ` [PATCH v5 1/3] z3fold: make counters atomic Vitaly Wool
@ 2016-10-17 20:37   ` Dan Streetman
  0 siblings, 0 replies; 14+ messages in thread
From: Dan Streetman @ 2016-10-17 20:37 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: Linux-MM, linux-kernel, Andrew Morton, Dave Chinner

On Sat, Oct 15, 2016 at 7:58 AM, Vitaly Wool <vitalywool@gmail.com> wrote:
> This patch converts pages_nr per-pool counter to atomic64_t.
> It also introduces a new counter, unbuddied_nr, which is also
> atomic64_t, to track the number of unbuddied (shrinkable) pages,
> as a step to prepare for z3fold shrinker implementation.
>
> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
> ---
>  mm/z3fold.c | 33 +++++++++++++++++++++++++--------
>  1 file changed, 25 insertions(+), 8 deletions(-)
>
> diff --git a/mm/z3fold.c b/mm/z3fold.c
> index 8f9e89c..5197d7b 100644
> --- a/mm/z3fold.c
> +++ b/mm/z3fold.c
> @@ -69,6 +69,7 @@ struct z3fold_ops {
>   * @lru:       list tracking the z3fold pages in LRU order by most recently
>   *             added buddy.
>   * @pages_nr:  number of z3fold pages in the pool.
> + * @unbuddied_nr:      number of unbuddied z3fold pages in the pool.
>   * @ops:       pointer to a structure of user defined operations specified at
>   *             pool creation time.
>   *
> @@ -80,7 +81,8 @@ struct z3fold_pool {
>         struct list_head unbuddied[NCHUNKS];
>         struct list_head buddied;
>         struct list_head lru;
> -       u64 pages_nr;
> +       atomic64_t pages_nr;
> +       atomic64_t unbuddied_nr;
>         const struct z3fold_ops *ops;
>         struct zpool *zpool;
>         const struct zpool_ops *zpool_ops;
> @@ -234,7 +236,8 @@ static struct z3fold_pool *z3fold_create_pool(gfp_t gfp,
>                 INIT_LIST_HEAD(&pool->unbuddied[i]);
>         INIT_LIST_HEAD(&pool->buddied);
>         INIT_LIST_HEAD(&pool->lru);
> -       pool->pages_nr = 0;
> +       atomic64_set(&pool->pages_nr, 0);
> +       atomic64_set(&pool->unbuddied_nr, 0);
>         pool->ops = ops;
>         return pool;
>  }
> @@ -334,6 +337,7 @@ static int z3fold_alloc(struct z3fold_pool *pool, size_t size, gfp_t gfp,
>                                         continue;
>                                 }
>                                 list_del(&zhdr->buddy);
> +                               atomic64_dec(&pool->unbuddied_nr);
>                                 goto found;
>                         }
>                 }
> @@ -346,7 +350,7 @@ static int z3fold_alloc(struct z3fold_pool *pool, size_t size, gfp_t gfp,
>         if (!page)
>                 return -ENOMEM;
>         spin_lock(&pool->lock);
> -       pool->pages_nr++;
> +       atomic64_inc(&pool->pages_nr);
>         zhdr = init_z3fold_page(page);
>
>         if (bud == HEADLESS) {
> @@ -369,6 +373,7 @@ found:
>                 /* Add to unbuddied list */
>                 freechunks = num_free_chunks(zhdr);
>                 list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
> +               atomic64_inc(&pool->unbuddied_nr);
>         } else {
>                 /* Add to buddied list */
>                 list_add(&zhdr->buddy, &pool->buddied);
> @@ -412,6 +417,11 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
>                 /* HEADLESS page stored */
>                 bud = HEADLESS;
>         } else {
> +               if (zhdr->first_chunks == 0 ||
> +                   zhdr->middle_chunks == 0 ||
> +                   zhdr->last_chunks == 0)
> +                       atomic64_dec(&pool->unbuddied_nr);
> +
>                 bud = handle_to_buddy(handle);
>
>                 switch (bud) {
> @@ -429,6 +439,7 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
>                         pr_err("%s: unknown bud %d\n", __func__, bud);
>                         WARN_ON(1);
>                         spin_unlock(&pool->lock);
> +                       atomic64_inc(&pool->unbuddied_nr);

this is wrong if the page was fully buddied; instead it may be better
to set a flag above, e.g. is_unbuddied, and later at the list_del call
do the dec if is_unbuddied.  That avoids needing to do this inc.
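
A rough sketch of that approach (is_unbuddied and its placement here are
illustrative only, not part of the patch):

        /* in z3fold_free(), sketch only */
        bool is_unbuddied = false;

        if (!test_bit(PAGE_HEADLESS, &page->private))
                is_unbuddied = zhdr->first_chunks == 0 ||
                               zhdr->middle_chunks == 0 ||
                               zhdr->last_chunks == 0;

        /* bud handling as before, with no counter updates here */

        /* only where the header actually leaves the unbuddied list */
        list_del(&zhdr->buddy);
        if (is_unbuddied)
                atomic64_dec(&pool->unbuddied_nr);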

otherwise,

Acked-by: Dan Streetman <ddstreet@ieee.org>

>                         return;
>                 }
>         }
> @@ -451,12 +462,13 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
>                 list_del(&page->lru);
>                 clear_bit(PAGE_HEADLESS, &page->private);
>                 free_z3fold_page(zhdr);
> -               pool->pages_nr--;
> +               atomic64_dec(&pool->pages_nr);
>         } else {
>                 z3fold_compact_page(zhdr);
>                 /* Add to the unbuddied list */
>                 freechunks = num_free_chunks(zhdr);
>                 list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
> +               atomic64_inc(&pool->unbuddied_nr);
>         }
>
>         spin_unlock(&pool->lock);
> @@ -520,6 +532,11 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
>                 zhdr = page_address(page);
>                 if (!test_bit(PAGE_HEADLESS, &page->private)) {
>                         list_del(&zhdr->buddy);
> +                       if (zhdr->first_chunks == 0 ||
> +                           zhdr->middle_chunks == 0 ||
> +                           zhdr->last_chunks == 0)
> +                               atomic64_dec(&pool->unbuddied_nr);
> +
>                         /*
>                          * We need encode the handles before unlocking, since
>                          * we can race with free that will set
> @@ -569,7 +586,7 @@ next:
>                          */
>                         clear_bit(PAGE_HEADLESS, &page->private);
>                         free_z3fold_page(zhdr);
> -                       pool->pages_nr--;
> +                       atomic64_dec(&pool->pages_nr);
>                         spin_unlock(&pool->lock);
>                         return 0;
>                 }  else if (!test_bit(PAGE_HEADLESS, &page->private)) {
> @@ -584,6 +601,7 @@ next:
>                                 freechunks = num_free_chunks(zhdr);
>                                 list_add(&zhdr->buddy,
>                                          &pool->unbuddied[freechunks]);
> +                               atomic64_inc(&pool->unbuddied_nr);
>                         }
>                 }
>
> @@ -672,12 +690,11 @@ static void z3fold_unmap(struct z3fold_pool *pool, unsigned long handle)
>   * z3fold_get_pool_size() - gets the z3fold pool size in pages
>   * @pool:      pool whose size is being queried
>   *
> - * Returns: size in pages of the given pool.  The pool lock need not be
> - * taken to access pages_nr.
> + * Returns: size in pages of the given pool.
>   */
>  static u64 z3fold_get_pool_size(struct z3fold_pool *pool)
>  {
> -       return pool->pages_nr;
> +       return atomic64_read(&pool->pages_nr);
>  }
>
>  /*****************
> --
> 2.4.2

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 2/3] z3fold: remove redundant locking
  2016-10-15 11:59 ` [PATCH v5 2/3] z3fold: remove redundant locking Vitaly Wool
@ 2016-10-17 20:48   ` Dan Streetman
  2016-10-18  2:55     ` Vitaly Wool
  0 siblings, 1 reply; 14+ messages in thread
From: Dan Streetman @ 2016-10-17 20:48 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: Linux-MM, linux-kernel, Andrew Morton, Dave Chinner

On Sat, Oct 15, 2016 at 7:59 AM, Vitaly Wool <vitalywool@gmail.com> wrote:
> The per-pool z3fold spinlock should generally be taken only when
> a non-atomic pool variable is modified. There's no need to take it
> to map/unmap an object.

no.  it absolutely needs to be taken, because z3fold_compact_page
could move the middle bud's contents to the first bud, and if the
middle bud gets mapped while it's being moved really bad things will
happen.

you can change that to a per-page lock in the z3fold_header, but some
locking needs to happen between mapping and middle bud moving (and
handle encoding/decoding and first_num access).
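
Roughly, the per-page variant could look like this (the page_lock field and
the call sites are illustrative, not part of this series):

        struct z3fold_header {
                struct list_head buddy;
                spinlock_t page_lock;   /* protects chunk counts, start_middle, first_num */
                unsigned short first_chunks;
                unsigned short middle_chunks;
                unsigned short last_chunks;
                unsigned short start_middle;
                unsigned short first_num:2;
        };

        static void *z3fold_map(struct z3fold_pool *pool, unsigned long handle)
        {
                /* spin_lock_init(&zhdr->page_lock) would go in init_z3fold_page() */
                struct z3fold_header *zhdr = handle_to_z3fold_header(handle);
                void *addr;

                spin_lock(&zhdr->page_lock);    /* keeps compaction from moving the middle bud */
                /* existing buddy-to-address computation elided; first bud case shown */
                addr = zhdr;
                spin_unlock(&zhdr->page_lock);
                return addr;
        }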


>
> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
> ---
>  mm/z3fold.c | 17 +++++------------
>  1 file changed, 5 insertions(+), 12 deletions(-)
>
> diff --git a/mm/z3fold.c b/mm/z3fold.c
> index 5197d7b..10513b5 100644
> --- a/mm/z3fold.c
> +++ b/mm/z3fold.c
> @@ -580,6 +580,7 @@ next:
>                 if ((test_bit(PAGE_HEADLESS, &page->private) && ret == 0) ||
>                     (zhdr->first_chunks == 0 && zhdr->last_chunks == 0 &&
>                      zhdr->middle_chunks == 0)) {
> +                       spin_unlock(&pool->lock);
>                         /*
>                          * All buddies are now free, free the z3fold page and
>                          * return success.
> @@ -587,7 +588,6 @@ next:
>                         clear_bit(PAGE_HEADLESS, &page->private);
>                         free_z3fold_page(zhdr);
>                         atomic64_dec(&pool->pages_nr);
> -                       spin_unlock(&pool->lock);
>                         return 0;
>                 }  else if (!test_bit(PAGE_HEADLESS, &page->private)) {
>                         if (zhdr->first_chunks != 0 &&
> @@ -629,7 +629,6 @@ static void *z3fold_map(struct z3fold_pool *pool, unsigned long handle)
>         void *addr;
>         enum buddy buddy;
>
> -       spin_lock(&pool->lock);
>         zhdr = handle_to_z3fold_header(handle);
>         addr = zhdr;
>         page = virt_to_page(zhdr);
> @@ -656,7 +655,6 @@ static void *z3fold_map(struct z3fold_pool *pool, unsigned long handle)
>                 break;
>         }
>  out:
> -       spin_unlock(&pool->lock);
>         return addr;
>  }
>
> @@ -671,19 +669,14 @@ static void z3fold_unmap(struct z3fold_pool *pool, unsigned long handle)
>         struct page *page;
>         enum buddy buddy;
>
> -       spin_lock(&pool->lock);
>         zhdr = handle_to_z3fold_header(handle);
>         page = virt_to_page(zhdr);
>
> -       if (test_bit(PAGE_HEADLESS, &page->private)) {
> -               spin_unlock(&pool->lock);
> -               return;
> +       if (!test_bit(PAGE_HEADLESS, &page->private)) {
> +               buddy = handle_to_buddy(handle);
> +               if (buddy == MIDDLE)
> +                       clear_bit(MIDDLE_CHUNK_MAPPED, &page->private);
>         }
> -
> -       buddy = handle_to_buddy(handle);
> -       if (buddy == MIDDLE)
> -               clear_bit(MIDDLE_CHUNK_MAPPED, &page->private);
> -       spin_unlock(&pool->lock);
>  }
>
>  /**
> --
> 2.4.2

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 3/3] z3fold: add shrinker
  2016-10-15 12:05 ` [PATCH v5 3/3] z3fold: add shrinker Vitaly Wool
@ 2016-10-18  2:06   ` Dan Streetman
  2016-10-18  2:45     ` Vitaly Wool
  0 siblings, 1 reply; 14+ messages in thread
From: Dan Streetman @ 2016-10-18  2:06 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: Linux-MM, linux-kernel, Andrew Morton, Dave Chinner

On Sat, Oct 15, 2016 at 8:05 AM, Vitaly Wool <vitalywool@gmail.com> wrote:
> This patch implements shrinker for z3fold. This shrinker
> implementation does not free up any pages directly but it allows
> for a denser placement of compressed objects which results in
> less actual pages consumed and higher compression ratio therefore.
>
> This update removes z3fold page compaction from the freeing path
> since we can rely on shrinker to do the job. Also, a new flag
> UNDER_COMPACTION is introduced to protect against two threads
> trying to compact the same page.

i'm completely unconvinced that this should be a shrinker.  The
alloc/free paths are much, much better suited to compacting a page
than a shrinker that must scan through all the unbuddied pages.  Why
not just improve compaction for the alloc/free paths?


>
> This patch has been checked with the latest Linus's tree.
>
> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
> ---
>  mm/z3fold.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 119 insertions(+), 25 deletions(-)
>
> diff --git a/mm/z3fold.c b/mm/z3fold.c
> index 10513b5..8f84d3c 100644
> --- a/mm/z3fold.c
> +++ b/mm/z3fold.c
> @@ -27,6 +27,7 @@
>  #include <linux/mm.h>
>  #include <linux/module.h>
>  #include <linux/preempt.h>
> +#include <linux/shrinker.h>
>  #include <linux/slab.h>
>  #include <linux/spinlock.h>
>  #include <linux/zpool.h>
> @@ -72,6 +73,7 @@ struct z3fold_ops {
>   * @unbuddied_nr:      number of unbuddied z3fold pages in the pool.
>   * @ops:       pointer to a structure of user defined operations specified at
>   *             pool creation time.
> + * @shrinker:  shrinker structure to optimize page layout in background
>   *
>   * This structure is allocated at pool creation time and maintains metadata
>   * pertaining to a particular z3fold pool.
> @@ -86,6 +88,7 @@ struct z3fold_pool {
>         const struct z3fold_ops *ops;
>         struct zpool *zpool;
>         const struct zpool_ops *zpool_ops;
> +       struct shrinker shrinker;
>  };
>
>  enum buddy {
> @@ -121,6 +124,7 @@ enum z3fold_page_flags {
>         UNDER_RECLAIM = 0,
>         PAGE_HEADLESS,
>         MIDDLE_CHUNK_MAPPED,
> +       UNDER_COMPACTION,
>  };
>
>  /*****************
> @@ -136,6 +140,9 @@ static int size_to_chunks(size_t size)
>  #define for_each_unbuddied_list(_iter, _begin) \
>         for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++)
>
> +#define for_each_unbuddied_list_down(_iter, _end) \
> +       for ((_iter) = (_end); (_iter) > 0; (_iter)--)
> +

bikeshed: the conventional suffix is _reverse, not _down, i.e.
for_each_unbuddied_list_reverse()

>  /* Initializes the z3fold header of a newly allocated z3fold page */
>  static struct z3fold_header *init_z3fold_page(struct page *page)
>  {
> @@ -145,6 +152,7 @@ static struct z3fold_header *init_z3fold_page(struct page *page)
>         clear_bit(UNDER_RECLAIM, &page->private);
>         clear_bit(PAGE_HEADLESS, &page->private);
>         clear_bit(MIDDLE_CHUNK_MAPPED, &page->private);
> +       clear_bit(UNDER_COMPACTION, &page->private);
>
>         zhdr->first_chunks = 0;
>         zhdr->middle_chunks = 0;
> @@ -211,6 +219,103 @@ static int num_free_chunks(struct z3fold_header *zhdr)
>         return nfree;
>  }
>
> +/* Has to be called with lock held */
> +static int z3fold_compact_page(struct z3fold_header *zhdr, bool sync)
> +{
> +       struct page *page = virt_to_page(zhdr);
> +       void *beg = zhdr;
> +
> +
> +       if (!test_bit(MIDDLE_CHUNK_MAPPED, &page->private) &&
> +           !test_bit(UNDER_RECLAIM, &page->private) &&
> +           !test_bit(UNDER_COMPACTION, &page->private)) {
> +               set_bit(UNDER_COMPACTION, &page->private);

i assume the reclaim check, and new compaction bit check, is due to
the removal of the spinlock, which was incorrect.  changing to a
per-page spinlock may be the best way to handle this, but flags are
absolutely not appropriate - they don't provide the needed locking.
Even if the compaction bit was the only locking needed (which it
isn't), it still isn't correct here - while extremely unlikely, it's
still possible for multiple threads to race between checking the
compaction bit, and setting it.  That's what test_and_set_bit() is
for.
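
For reference, the test-and-set form would be roughly (sketch only, and as
noted above it still would not replace proper locking):

        /* atomically claim the page for compaction; back off if another thread won */
        if (test_and_set_bit(UNDER_COMPACTION, &page->private))
                return 0;

        /* ... do the compaction ... */

        clear_bit(UNDER_COMPACTION, &page->private);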

> +               if (zhdr->middle_chunks != 0 &&

only need to check if the middle bud is in use once; if it isn't,
there's no compaction to do.

> +                   zhdr->first_chunks == 0 &&
> +                   zhdr->last_chunks == 0) {
> +                       memmove(beg + ZHDR_SIZE_ALIGNED,
> +                               beg + (zhdr->start_middle << CHUNK_SHIFT),
> +                               zhdr->middle_chunks << CHUNK_SHIFT);
> +                       zhdr->first_chunks = zhdr->middle_chunks;
> +                       zhdr->middle_chunks = 0;
> +                       zhdr->start_middle = 0;
> +                       zhdr->first_num++;
> +                       clear_bit(UNDER_COMPACTION, &page->private);
> +                       return 1;
> +               }
> +               if (sync)
> +                       goto out;

i don't get it, why is that compaction synchronous while the others
below aren't?

> +
> +               /* moving data is expensive, so let's only do that if
> +                * there's substantial gain (2+ chunks)

"at least 2 chunks" feels arbitrary...it should be a #define instead
of a magic number...
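
e.g. something along these lines (the name and placement are only
illustrative):

        #define BIG_CHUNK_GAP   2       /* minimum gain, in chunks, to justify a memmove */

        if (zhdr->middle_chunks != 0 && zhdr->first_chunks != 0 &&
            zhdr->last_chunks == 0 &&
            zhdr->start_middle > zhdr->first_chunks + BIG_CHUNK_GAP) {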

> +                */
> +               if (zhdr->middle_chunks != 0 && zhdr->first_chunks != 0 &&
> +                   zhdr->last_chunks == 0 &&
> +                   zhdr->start_middle > zhdr->first_chunks + 2) {
> +                       unsigned short new_start = zhdr->first_chunks + 1;
> +                       memmove(beg + (new_start << CHUNK_SHIFT),
> +                               beg + (zhdr->start_middle << CHUNK_SHIFT),
> +                               zhdr->middle_chunks << CHUNK_SHIFT);
> +                       zhdr->start_middle = new_start;
> +                       clear_bit(UNDER_COMPACTION, &page->private);
> +                       return 1;
> +               }
> +               if (zhdr->middle_chunks != 0 && zhdr->last_chunks != 0 &&
> +                   zhdr->first_chunks == 0 &&
> +                   zhdr->middle_chunks + zhdr->last_chunks <=
> +                   NCHUNKS - zhdr->start_middle - 2) {
> +                       unsigned short new_start = NCHUNKS - zhdr->last_chunks -
> +                               zhdr->middle_chunks;
> +                       memmove(beg + (new_start << CHUNK_SHIFT),
> +                               beg + (zhdr->start_middle << CHUNK_SHIFT),
> +                               zhdr->middle_chunks << CHUNK_SHIFT);

these if clauses, and the memmoves, aren't very readable, could it be
made any better with a separate function?
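
e.g. a small helper along these lines (the name is made up here, not taken
from the patch), so each branch only computes the destination chunk and
updates its own bookkeeping:

        /* move the middle bud so that it starts at dst_chunk (counted in chunks) */
        static void mchunk_memmove(struct z3fold_header *zhdr,
                                   unsigned short dst_chunk)
        {
                void *beg = zhdr;

                memmove(beg + (dst_chunk << CHUNK_SHIFT),
                        beg + (zhdr->start_middle << CHUNK_SHIFT),
                        zhdr->middle_chunks << CHUNK_SHIFT);
        }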

> +                       zhdr->start_middle = new_start;
> +                       clear_bit(UNDER_COMPACTION, &page->private);
> +                       return 1;
> +               }
> +       }
> +out:
> +       clear_bit(UNDER_COMPACTION, &page->private);
> +       return 0;
> +}
> +
> +static unsigned long z3fold_shrink_count(struct shrinker *shrink,
> +                               struct shrink_control *sc)
> +{
> +       struct z3fold_pool *pool = container_of(shrink, struct z3fold_pool,
> +                                               shrinker);
> +
> +       return atomic64_read(&pool->unbuddied_nr);
> +}
> +
> +static unsigned long z3fold_shrink_scan(struct shrinker *shrink,
> +                               struct shrink_control *sc)
> +{
> +       struct z3fold_pool *pool = container_of(shrink, struct z3fold_pool,
> +                                               shrinker);
> +       struct z3fold_header *zhdr;
> +       int i, nr_to_scan = sc->nr_to_scan, nr_shrunk = 0;
> +
> +       spin_lock(&pool->lock);
> +       for_each_unbuddied_list_down(i, NCHUNKS - 3) {
> +               if (!list_empty(&pool->unbuddied[i])) {
> +                       zhdr = list_first_entry(&pool->unbuddied[i],
> +                                               struct z3fold_header, buddy);
> +                       list_del(&zhdr->buddy);
> +                       spin_unlock(&pool->lock);
> +                       nr_shrunk += z3fold_compact_page(zhdr, false);
> +                       spin_lock(&pool->lock);
> +                       list_add(&zhdr->buddy,
> +                               &pool->unbuddied[num_free_chunks(zhdr)]);

use list_add_tail(), we just compacted it and putting it at the head
of the new unbuddied list will cause it to be unnecessarily scanned
first later.

> +                       if (!--nr_to_scan)
> +                               break;
> +               }
> +       }
> +       spin_unlock(&pool->lock);
> +       return nr_shrunk;
> +}
> +
> +
>  /*****************
>   * API Functions
>  *****************/
> @@ -230,7 +335,7 @@ static struct z3fold_pool *z3fold_create_pool(gfp_t gfp,
>
>         pool = kzalloc(sizeof(struct z3fold_pool), gfp);
>         if (!pool)
> -               return NULL;
> +               goto out;
>         spin_lock_init(&pool->lock);
>         for_each_unbuddied_list(i, 0)
>                 INIT_LIST_HEAD(&pool->unbuddied[i]);
> @@ -238,8 +343,19 @@ static struct z3fold_pool *z3fold_create_pool(gfp_t gfp,
>         INIT_LIST_HEAD(&pool->lru);
>         atomic64_set(&pool->pages_nr, 0);
>         atomic64_set(&pool->unbuddied_nr, 0);
> +       pool->shrinker.count_objects = z3fold_shrink_count;
> +       pool->shrinker.scan_objects = z3fold_shrink_scan;
> +       pool->shrinker.seeks = DEFAULT_SEEKS;
> +       pool->shrinker.batch = NCHUNKS - 4;
> +       if (register_shrinker(&pool->shrinker))
> +               goto out_free;
>         pool->ops = ops;
>         return pool;
> +
> +out_free:
> +       kfree(pool);
> +out:
> +       return NULL;
>  }
>
>  /**
> @@ -250,31 +366,10 @@ static struct z3fold_pool *z3fold_create_pool(gfp_t gfp,
>   */
>  static void z3fold_destroy_pool(struct z3fold_pool *pool)
>  {
> +       unregister_shrinker(&pool->shrinker);
>         kfree(pool);
>  }
>
> -/* Has to be called with lock held */
> -static int z3fold_compact_page(struct z3fold_header *zhdr)
> -{
> -       struct page *page = virt_to_page(zhdr);
> -       void *beg = zhdr;
> -
> -
> -       if (!test_bit(MIDDLE_CHUNK_MAPPED, &page->private) &&
> -           zhdr->middle_chunks != 0 &&
> -           zhdr->first_chunks == 0 && zhdr->last_chunks == 0) {
> -               memmove(beg + ZHDR_SIZE_ALIGNED,
> -                       beg + (zhdr->start_middle << CHUNK_SHIFT),
> -                       zhdr->middle_chunks << CHUNK_SHIFT);
> -               zhdr->first_chunks = zhdr->middle_chunks;
> -               zhdr->middle_chunks = 0;
> -               zhdr->start_middle = 0;
> -               zhdr->first_num++;
> -               return 1;
> -       }
> -       return 0;
> -}
> -
>  /**
>   * z3fold_alloc() - allocates a region of a given size
>   * @pool:      z3fold pool from which to allocate
> @@ -464,7 +559,6 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
>                 free_z3fold_page(zhdr);
>                 atomic64_dec(&pool->pages_nr);
>         } else {
> -               z3fold_compact_page(zhdr);

why remove this?

>                 /* Add to the unbuddied list */
>                 freechunks = num_free_chunks(zhdr);
>                 list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
> @@ -596,7 +690,7 @@ next:
>                                 /* Full, add to buddied list */
>                                 list_add(&zhdr->buddy, &pool->buddied);
>                         } else {
> -                               z3fold_compact_page(zhdr);
> +                               z3fold_compact_page(zhdr, true);
>                                 /* add to unbuddied list */
>                                 freechunks = num_free_chunks(zhdr);
>                                 list_add(&zhdr->buddy,
> --
> 2.4.2

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 3/3] z3fold: add shrinker
  2016-10-18  2:06   ` Dan Streetman
@ 2016-10-18  2:45     ` Vitaly Wool
  2016-10-18 14:27       ` Dan Streetman
  0 siblings, 1 reply; 14+ messages in thread
From: Vitaly Wool @ 2016-10-18  2:45 UTC (permalink / raw)
  To: Dan Streetman; +Cc: Linux-MM, linux-kernel, Andrew Morton, Dave Chinner

Hi Dan,

On Tue, Oct 18, 2016 at 4:06 AM, Dan Streetman <ddstreet@ieee.org> wrote:
> On Sat, Oct 15, 2016 at 8:05 AM, Vitaly Wool <vitalywool@gmail.com> wrote:
>> This patch implements shrinker for z3fold. This shrinker
>> implementation does not free up any pages directly but it allows
>> for a denser placement of compressed objects which results in
>> less actual pages consumed and higher compression ratio therefore.
>>
>> This update removes z3fold page compaction from the freeing path
>> since we can rely on shrinker to do the job. Also, a new flag
>> UNDER_COMPACTION is introduced to protect against two threads
>> trying to compact the same page.
>
> i'm completely unconvinced that this should be a shrinker.  The
> alloc/free paths are much, much better suited to compacting a page
> than a shrinker that must scan through all the unbuddied pages.  Why
> not just improve compaction for the alloc/free paths?

Basically the main reason is performance: I want to avoid compaction on hot
paths as much as possible. This patchset brings both performance and
compression ratio gains; I'm not sure how to achieve that by improving
compaction on the alloc/free paths.

>
>>
>> This patch has been checked with the latest Linus's tree.
>>
>> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
>> ---
>>  mm/z3fold.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++++-----------
>>  1 file changed, 119 insertions(+), 25 deletions(-)
>>
>> diff --git a/mm/z3fold.c b/mm/z3fold.c
>> index 10513b5..8f84d3c 100644
>> --- a/mm/z3fold.c
>> +++ b/mm/z3fold.c
>> @@ -27,6 +27,7 @@
>>  #include <linux/mm.h>
>>  #include <linux/module.h>
>>  #include <linux/preempt.h>
>> +#include <linux/shrinker.h>
>>  #include <linux/slab.h>
>>  #include <linux/spinlock.h>
>>  #include <linux/zpool.h>
>> @@ -72,6 +73,7 @@ struct z3fold_ops {
>>   * @unbuddied_nr:      number of unbuddied z3fold pages in the pool.
>>   * @ops:       pointer to a structure of user defined operations specified at
>>   *             pool creation time.
>> + * @shrinker:  shrinker structure to optimize page layout in background
>>   *
>>   * This structure is allocated at pool creation time and maintains metadata
>>   * pertaining to a particular z3fold pool.
>> @@ -86,6 +88,7 @@ struct z3fold_pool {
>>         const struct z3fold_ops *ops;
>>         struct zpool *zpool;
>>         const struct zpool_ops *zpool_ops;
>> +       struct shrinker shrinker;
>>  };
>>
>>  enum buddy {
>> @@ -121,6 +124,7 @@ enum z3fold_page_flags {
>>         UNDER_RECLAIM = 0,
>>         PAGE_HEADLESS,
>>         MIDDLE_CHUNK_MAPPED,
>> +       UNDER_COMPACTION,
>>  };
>>
>>  /*****************
>> @@ -136,6 +140,9 @@ static int size_to_chunks(size_t size)
>>  #define for_each_unbuddied_list(_iter, _begin) \
>>         for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++)
>>
>> +#define for_each_unbuddied_list_down(_iter, _end) \
>> +       for ((_iter) = (_end); (_iter) > 0; (_iter)--)
>> +
>
> bikeshed: the conventional suffix is _reverse, not _down, i.e.
> for_each_unbuddied_list_reverse()

Ok :)

>>  /* Initializes the z3fold header of a newly allocated z3fold page */
>>  static struct z3fold_header *init_z3fold_page(struct page *page)
>>  {
>> @@ -145,6 +152,7 @@ static struct z3fold_header *init_z3fold_page(struct page *page)
>>         clear_bit(UNDER_RECLAIM, &page->private);
>>         clear_bit(PAGE_HEADLESS, &page->private);
>>         clear_bit(MIDDLE_CHUNK_MAPPED, &page->private);
>> +       clear_bit(UNDER_COMPACTION, &page->private);
>>
>>         zhdr->first_chunks = 0;
>>         zhdr->middle_chunks = 0;
>> @@ -211,6 +219,103 @@ static int num_free_chunks(struct z3fold_header *zhdr)
>>         return nfree;
>>  }
>>
>> +/* Has to be called with lock held */
>> +static int z3fold_compact_page(struct z3fold_header *zhdr, bool sync)
>> +{
>> +       struct page *page = virt_to_page(zhdr);
>> +       void *beg = zhdr;
>> +
>> +
>> +       if (!test_bit(MIDDLE_CHUNK_MAPPED, &page->private) &&
>> +           !test_bit(UNDER_RECLAIM, &page->private) &&
>> +           !test_bit(UNDER_COMPACTION, &page->private)) {
>> +               set_bit(UNDER_COMPACTION, &page->private);
>
> i assume the reclaim check, and new compaction bit check, is due to
> the removal of the spinlock, which was incorrect.  changing to a
> per-page spinlock may be the best way to handle this, but flags are
> absolutely not appropriate - they don't provide the needed locking.
> Even if the compaction bit was the only locking needed (which it
> isn't), it still isn't correct here - while extremely unlikely, it's
> still possible for multiple threads to race between checking the
> compaction bit, and setting it.  That's what test_and_set_bit() is
> for.

Yep, thanks, will fix that.

>
>> +               if (zhdr->middle_chunks != 0 &&
>
> only need to check if the middle bud is in use once; if it isn't,
> there's no compaction to do.
>
>> +                   zhdr->first_chunks == 0 &&
>> +                   zhdr->last_chunks == 0) {
>> +                       memmove(beg + ZHDR_SIZE_ALIGNED,
>> +                               beg + (zhdr->start_middle << CHUNK_SHIFT),
>> +                               zhdr->middle_chunks << CHUNK_SHIFT);
>> +                       zhdr->first_chunks = zhdr->middle_chunks;
>> +                       zhdr->middle_chunks = 0;
>> +                       zhdr->start_middle = 0;
>> +                       zhdr->first_num++;
>> +                       clear_bit(UNDER_COMPACTION, &page->private);
>> +                       return 1;
>> +               }
>> +               if (sync)
>> +                       goto out;
>
> i don't get it, why is that compaction synchronous while the others
> below aren't?

That is the "important" compaction which changes first_num and brings the
biggest benefit. Moving the middle object closer to another existing one is
not as important ratio-wise, so the idea was to leave that for the shrinker.

>
>> +
>> +               /* moving data is expensive, so let's only do that if
>> +                * there's substantial gain (2+ chunks)
>
> "at least 2 chunks" feels arbitrary...it should be a #define instead
> of a magic number...

...or a module parameter, but that could be added later.

>> +                */
>> +               if (zhdr->middle_chunks != 0 && zhdr->first_chunks != 0 &&
>> +                   zhdr->last_chunks == 0 &&
>> +                   zhdr->start_middle > zhdr->first_chunks + 2) {
>> +                       unsigned short new_start = zhdr->first_chunks + 1;
>> +                       memmove(beg + (new_start << CHUNK_SHIFT),
>> +                               beg + (zhdr->start_middle << CHUNK_SHIFT),
>> +                               zhdr->middle_chunks << CHUNK_SHIFT);
>> +                       zhdr->start_middle = new_start;
>> +                       clear_bit(UNDER_COMPACTION, &page->private);
>> +                       return 1;
>> +               }
>> +               if (zhdr->middle_chunks != 0 && zhdr->last_chunks != 0 &&
>> +                   zhdr->first_chunks == 0 &&
>> +                   zhdr->middle_chunks + zhdr->last_chunks <=
>> +                   NCHUNKS - zhdr->start_middle - 2) {
>> +                       unsigned short new_start = NCHUNKS - zhdr->last_chunks -
>> +                               zhdr->middle_chunks;
>> +                       memmove(beg + (new_start << CHUNK_SHIFT),
>> +                               beg + (zhdr->start_middle << CHUNK_SHIFT),
>> +                               zhdr->middle_chunks << CHUNK_SHIFT);
>
> these if clauses, and the memmoves, aren't very readable, could it be
> made any better with a separate function?

Yep, they look somewhat scary; I'll think about what I can do about it.

>> +                       zhdr->start_middle = new_start;
>> +                       clear_bit(UNDER_COMPACTION, &page->private);
>> +                       return 1;
>> +               }
>> +       }
>> +out:
>> +       clear_bit(UNDER_COMPACTION, &page->private);
>> +       return 0;
>> +}
>> +
>> +static unsigned long z3fold_shrink_count(struct shrinker *shrink,
>> +                               struct shrink_control *sc)
>> +{
>> +       struct z3fold_pool *pool = container_of(shrink, struct z3fold_pool,
>> +                                               shrinker);
>> +
>> +       return atomic64_read(&pool->unbuddied_nr);
>> +}
>> +
>> +static unsigned long z3fold_shrink_scan(struct shrinker *shrink,
>> +                               struct shrink_control *sc)
>> +{
>> +       struct z3fold_pool *pool = container_of(shrink, struct z3fold_pool,
>> +                                               shrinker);
>> +       struct z3fold_header *zhdr;
>> +       int i, nr_to_scan = sc->nr_to_scan, nr_shrunk = 0;
>> +
>> +       spin_lock(&pool->lock);
>> +       for_each_unbuddied_list_down(i, NCHUNKS - 3) {
>> +               if (!list_empty(&pool->unbuddied[i])) {
>> +                       zhdr = list_first_entry(&pool->unbuddied[i],
>> +                                               struct z3fold_header, buddy);
>> +                       list_del(&zhdr->buddy);
>> +                       spin_unlock(&pool->lock);
>> +                       nr_shrunk += z3fold_compact_page(zhdr, false);
>> +                       spin_lock(&pool->lock);
>> +                       list_add(&zhdr->buddy,
>> +                               &pool->unbuddied[num_free_chunks(zhdr)]);
>
> use list_add_tail(), we just compacted it and putting it at the head
> of the new unbuddied list will cause it to be unnecessarily scanned
> first later.
>
>> +                       if (!--nr_to_scan)
>> +                               break;
>> +               }
>> +       }
>> +       spin_unlock(&pool->lock);
>> +       return nr_shrunk;
>> +}
>> +
>> +
>>  /*****************
>>   * API Functions
>>  *****************/
>> @@ -230,7 +335,7 @@ static struct z3fold_pool *z3fold_create_pool(gfp_t gfp,
>>
>>         pool = kzalloc(sizeof(struct z3fold_pool), gfp);
>>         if (!pool)
>> -               return NULL;
>> +               goto out;
>>         spin_lock_init(&pool->lock);
>>         for_each_unbuddied_list(i, 0)
>>                 INIT_LIST_HEAD(&pool->unbuddied[i]);
>> @@ -238,8 +343,19 @@ static struct z3fold_pool *z3fold_create_pool(gfp_t gfp,
>>         INIT_LIST_HEAD(&pool->lru);
>>         atomic64_set(&pool->pages_nr, 0);
>>         atomic64_set(&pool->unbuddied_nr, 0);
>> +       pool->shrinker.count_objects = z3fold_shrink_count;
>> +       pool->shrinker.scan_objects = z3fold_shrink_scan;
>> +       pool->shrinker.seeks = DEFAULT_SEEKS;
>> +       pool->shrinker.batch = NCHUNKS - 4;
>> +       if (register_shrinker(&pool->shrinker))
>> +               goto out_free;
>>         pool->ops = ops;
>>         return pool;
>> +
>> +out_free:
>> +       kfree(pool);
>> +out:
>> +       return NULL;
>>  }
>>
>>  /**
>> @@ -250,31 +366,10 @@ static struct z3fold_pool *z3fold_create_pool(gfp_t gfp,
>>   */
>>  static void z3fold_destroy_pool(struct z3fold_pool *pool)
>>  {
>> +       unregister_shrinker(&pool->shrinker);
>>         kfree(pool);
>>  }
>>
>> -/* Has to be called with lock held */
>> -static int z3fold_compact_page(struct z3fold_header *zhdr)
>> -{
>> -       struct page *page = virt_to_page(zhdr);
>> -       void *beg = zhdr;
>> -
>> -
>> -       if (!test_bit(MIDDLE_CHUNK_MAPPED, &page->private) &&
>> -           zhdr->middle_chunks != 0 &&
>> -           zhdr->first_chunks == 0 && zhdr->last_chunks == 0) {
>> -               memmove(beg + ZHDR_SIZE_ALIGNED,
>> -                       beg + (zhdr->start_middle << CHUNK_SHIFT),
>> -                       zhdr->middle_chunks << CHUNK_SHIFT);
>> -               zhdr->first_chunks = zhdr->middle_chunks;
>> -               zhdr->middle_chunks = 0;
>> -               zhdr->start_middle = 0;
>> -               zhdr->first_num++;
>> -               return 1;
>> -       }
>> -       return 0;
>> -}
>> -
>>  /**
>>   * z3fold_alloc() - allocates a region of a given size
>>   * @pool:      z3fold pool from which to allocate
>> @@ -464,7 +559,6 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
>>                 free_z3fold_page(zhdr);
>>                 atomic64_dec(&pool->pages_nr);
>>         } else {
>> -               z3fold_compact_page(zhdr);
>
> why remove this?

As I've said above, that gives some performance improvement.

~vitaly

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 2/3] z3fold: remove redundant locking
  2016-10-17 20:48   ` Dan Streetman
@ 2016-10-18  2:55     ` Vitaly Wool
  0 siblings, 0 replies; 14+ messages in thread
From: Vitaly Wool @ 2016-10-18  2:55 UTC (permalink / raw)
  To: Dan Streetman; +Cc: Linux-MM, linux-kernel, Andrew Morton, Dave Chinner

On Mon, Oct 17, 2016 at 10:48 PM, Dan Streetman <ddstreet@ieee.org> wrote:
> On Sat, Oct 15, 2016 at 7:59 AM, Vitaly Wool <vitalywool@gmail.com> wrote:
>> The per-pool z3fold spinlock should generally be taken only when
>> a non-atomic pool variable is modified. There's no need to take it
>> to map/unmap an object.
>
> no.  it absolutely needs to be taken, because z3fold_compact_page
> could move the middle bud's contents to the first bud, and if the
> middle bud gets mapped while it's being moved really bad things will
> happen.
>
> you can change that to a per-page lock in the z3fold_header, but some
> locking needs to happen between mapping and middle bud moving (and
> handle encoding/decoding and first_num access).

Yep, a per-page lock is probably the way to go. I was thinking of making
first_num atomic, but that obviously creates more problems than it solves.

~vitaly

>
>
>>
>> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
>> ---
>>  mm/z3fold.c | 17 +++++------------
>>  1 file changed, 5 insertions(+), 12 deletions(-)
>>
>> diff --git a/mm/z3fold.c b/mm/z3fold.c
>> index 5197d7b..10513b5 100644
>> --- a/mm/z3fold.c
>> +++ b/mm/z3fold.c
>> @@ -580,6 +580,7 @@ next:
>>                 if ((test_bit(PAGE_HEADLESS, &page->private) && ret == 0) ||
>>                     (zhdr->first_chunks == 0 && zhdr->last_chunks == 0 &&
>>                      zhdr->middle_chunks == 0)) {
>> +                       spin_unlock(&pool->lock);
>>                         /*
>>                          * All buddies are now free, free the z3fold page and
>>                          * return success.
>> @@ -587,7 +588,6 @@ next:
>>                         clear_bit(PAGE_HEADLESS, &page->private);
>>                         free_z3fold_page(zhdr);
>>                         atomic64_dec(&pool->pages_nr);
>> -                       spin_unlock(&pool->lock);
>>                         return 0;
>>                 }  else if (!test_bit(PAGE_HEADLESS, &page->private)) {
>>                         if (zhdr->first_chunks != 0 &&
>> @@ -629,7 +629,6 @@ static void *z3fold_map(struct z3fold_pool *pool, unsigned long handle)
>>         void *addr;
>>         enum buddy buddy;
>>
>> -       spin_lock(&pool->lock);
>>         zhdr = handle_to_z3fold_header(handle);
>>         addr = zhdr;
>>         page = virt_to_page(zhdr);
>> @@ -656,7 +655,6 @@ static void *z3fold_map(struct z3fold_pool *pool, unsigned long handle)
>>                 break;
>>         }
>>  out:
>> -       spin_unlock(&pool->lock);
>>         return addr;
>>  }
>>
>> @@ -671,19 +669,14 @@ static void z3fold_unmap(struct z3fold_pool *pool, unsigned long handle)
>>         struct page *page;
>>         enum buddy buddy;
>>
>> -       spin_lock(&pool->lock);
>>         zhdr = handle_to_z3fold_header(handle);
>>         page = virt_to_page(zhdr);
>>
>> -       if (test_bit(PAGE_HEADLESS, &page->private)) {
>> -               spin_unlock(&pool->lock);
>> -               return;
>> +       if (!test_bit(PAGE_HEADLESS, &page->private)) {
>> +               buddy = handle_to_buddy(handle);
>> +               if (buddy == MIDDLE)
>> +                       clear_bit(MIDDLE_CHUNK_MAPPED, &page->private);
>>         }
>> -
>> -       buddy = handle_to_buddy(handle);
>> -       if (buddy == MIDDLE)
>> -               clear_bit(MIDDLE_CHUNK_MAPPED, &page->private);
>> -       spin_unlock(&pool->lock);
>>  }
>>
>>  /**
>> --
>> 2.4.2

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 3/3] z3fold: add shrinker
  2016-10-18  2:45     ` Vitaly Wool
@ 2016-10-18 14:27       ` Dan Streetman
  2016-10-18 14:51         ` Vitaly Wool
  0 siblings, 1 reply; 14+ messages in thread
From: Dan Streetman @ 2016-10-18 14:27 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: Linux-MM, linux-kernel, Andrew Morton, Dave Chinner

On Mon, Oct 17, 2016 at 10:45 PM, Vitaly Wool <vitalywool@gmail.com> wrote:
> Hi Dan,
>
> On Tue, Oct 18, 2016 at 4:06 AM, Dan Streetman <ddstreet@ieee.org> wrote:
>> On Sat, Oct 15, 2016 at 8:05 AM, Vitaly Wool <vitalywool@gmail.com> wrote:
>>> This patch implements shrinker for z3fold. This shrinker
>>> implementation does not free up any pages directly but it allows
>>> for a denser placement of compressed objects which results in
>>> less actual pages consumed and higher compression ratio therefore.
>>>
>>> This update removes z3fold page compaction from the freeing path
>>> since we can rely on shrinker to do the job. Also, a new flag
>>> UNDER_COMPACTION is introduced to protect against two threads
>>> trying to compact the same page.
>>
>> i'm completely unconvinced that this should be a shrinker.  The
>> alloc/free paths are much, much better suited to compacting a page
>> than a shrinker that must scan through all the unbuddied pages.  Why
>> not just improve compaction for the alloc/free paths?
>
> Basically the main reason is performance, I want to avoid compaction on hot
> paths as much as possible. This patchset brings both performance and
> compression ratio gain, I'm not sure how to achieve that with improving
> compaction on alloc/free paths.

It seems like a tradeoff of slight improvement in hot paths, for
significant decrease in performance by adding a shrinker, which will
do a lot of unnecessary scanning.  The alloc/free/unmap functions are
working directly with the page at exactly the point where compaction
is needed - when adding or removing a bud from the page.

Sorry if I missed it in earlier emails, but have you done any
performance measurements comparing with/without the shrinker?  The
compression ratio gains may be possible with only the
z3fold_compact_page() improvements, and performance may be stable (or
better) with only a per-z3fold-page lock, instead of adding the
shrinker...?

If a shrinker really is needed, it seems like it would be better
suited to coalescing separate z3fold pages via migration, like
zsmalloc does (although that's a significant amount of work).

>
>>
>>>
>>> This patch has been checked with the latest Linus's tree.
>>>
>>> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
>>> ---
>>>  mm/z3fold.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++++-----------
>>>  1 file changed, 119 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/mm/z3fold.c b/mm/z3fold.c
>>> index 10513b5..8f84d3c 100644
>>> --- a/mm/z3fold.c
>>> +++ b/mm/z3fold.c
>>> @@ -27,6 +27,7 @@
>>>  #include <linux/mm.h>
>>>  #include <linux/module.h>
>>>  #include <linux/preempt.h>
>>> +#include <linux/shrinker.h>
>>>  #include <linux/slab.h>
>>>  #include <linux/spinlock.h>
>>>  #include <linux/zpool.h>
>>> @@ -72,6 +73,7 @@ struct z3fold_ops {
>>>   * @unbuddied_nr:      number of unbuddied z3fold pages in the pool.
>>>   * @ops:       pointer to a structure of user defined operations specified at
>>>   *             pool creation time.
>>> + * @shrinker:  shrinker structure to optimize page layout in background
>>>   *
>>>   * This structure is allocated at pool creation time and maintains metadata
>>>   * pertaining to a particular z3fold pool.
>>> @@ -86,6 +88,7 @@ struct z3fold_pool {
>>>         const struct z3fold_ops *ops;
>>>         struct zpool *zpool;
>>>         const struct zpool_ops *zpool_ops;
>>> +       struct shrinker shrinker;
>>>  };
>>>
>>>  enum buddy {
>>> @@ -121,6 +124,7 @@ enum z3fold_page_flags {
>>>         UNDER_RECLAIM = 0,
>>>         PAGE_HEADLESS,
>>>         MIDDLE_CHUNK_MAPPED,
>>> +       UNDER_COMPACTION,
>>>  };
>>>
>>>  /*****************
>>> @@ -136,6 +140,9 @@ static int size_to_chunks(size_t size)
>>>  #define for_each_unbuddied_list(_iter, _begin) \
>>>         for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++)
>>>
>>> +#define for_each_unbuddied_list_down(_iter, _end) \
>>> +       for ((_iter) = (_end); (_iter) > 0; (_iter)--)
>>> +
>>
>> bikeshed: the conventional suffix is _reverse, not _down, i.e.
>> for_each_unbuddied_list_reverse()
>
> Ok :)
>
>>>  /* Initializes the z3fold header of a newly allocated z3fold page */
>>>  static struct z3fold_header *init_z3fold_page(struct page *page)
>>>  {
>>> @@ -145,6 +152,7 @@ static struct z3fold_header *init_z3fold_page(struct page *page)
>>>         clear_bit(UNDER_RECLAIM, &page->private);
>>>         clear_bit(PAGE_HEADLESS, &page->private);
>>>         clear_bit(MIDDLE_CHUNK_MAPPED, &page->private);
>>> +       clear_bit(UNDER_COMPACTION, &page->private);
>>>
>>>         zhdr->first_chunks = 0;
>>>         zhdr->middle_chunks = 0;
>>> @@ -211,6 +219,103 @@ static int num_free_chunks(struct z3fold_header *zhdr)
>>>         return nfree;
>>>  }
>>>
>>> +/* Has to be called with lock held */
>>> +static int z3fold_compact_page(struct z3fold_header *zhdr, bool sync)
>>> +{
>>> +       struct page *page = virt_to_page(zhdr);
>>> +       void *beg = zhdr;
>>> +
>>> +
>>> +       if (!test_bit(MIDDLE_CHUNK_MAPPED, &page->private) &&
>>> +           !test_bit(UNDER_RECLAIM, &page->private) &&
>>> +           !test_bit(UNDER_COMPACTION, &page->private)) {
>>> +               set_bit(UNDER_COMPACTION, &page->private);
>>
>> i assume the reclaim check, and new compaction bit check, is due to
>> the removal of the spinlock, which was incorrect.  changing to a
>> per-page spinlock may be the best way to handle this, but flags are
>> absolutely not appropriate - they don't provide the needed locking.
>> Even if the compaction bit was the only locking needed (which it
>> isn't), it still isn't correct here - while extremely unlikely, it's
>> still possible for multiple threads to race between checking the
>> compaction bit, and setting it.  That's what test_and_set_bit() is
>> for.
>
> Yep, thanks, will fix that.
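
i.e. something like this (sketch only; as noted, the flag by itself still
isn't real locking):

	/* atomically claim the page for compaction; bail out if another
	 * thread already did, or if the page can't be touched right now */
	if (test_and_set_bit(UNDER_COMPACTION, &page->private))
		return 0;
	if (test_bit(MIDDLE_CHUNK_MAPPED, &page->private) ||
	    test_bit(UNDER_RECLAIM, &page->private)) {
		clear_bit(UNDER_COMPACTION, &page->private);
		return 0;
	}
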
>
>>
>>> +               if (zhdr->middle_chunks != 0 &&
>>
>> only need to check if the middle bud is in use once; if it isn't,
>> there's no compaction to do.
>>
>>> +                   zhdr->first_chunks == 0 &&
>>> +                   zhdr->last_chunks == 0) {
>>> +                       memmove(beg + ZHDR_SIZE_ALIGNED,
>>> +                               beg + (zhdr->start_middle << CHUNK_SHIFT),
>>> +                               zhdr->middle_chunks << CHUNK_SHIFT);
>>> +                       zhdr->first_chunks = zhdr->middle_chunks;
>>> +                       zhdr->middle_chunks = 0;
>>> +                       zhdr->start_middle = 0;
>>> +                       zhdr->first_num++;
>>> +                       clear_bit(UNDER_COMPACTION, &page->private);
>>> +                       return 1;
>>> +               }
>>> +               if (sync)
>>> +                       goto out;
>>
>> i don't get it, why is that compaction synchronous while the others
>> below aren't?
>
> That is the "important" compaction which changes first_num and brings the
> biggest benefit. Moving the middle object closer to another existing one is
> not as important ratio-wise, so the idea was to leave that for the shrinker.
>
>>
>>> +
>>> +               /* moving data is expensive, so let's only do that if
>>> +                * there's substantial gain (2+ chunks)
>>
>> "at least 2 chunks" feels arbitrary...it should be a #define instead
>> of a magic number...
>
> ...or the module parameter but that could be added later.
>
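e.g. (sketch; the name is just a suggestion):

	/* minimum gain, in chunks, to justify moving the middle bud */
	#define BIG_CHUNK_GAP	2

so the check above becomes "zhdr->start_middle > zhdr->first_chunks +
BIG_CHUNK_GAP".
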
>>> +                */
>>> +               if (zhdr->middle_chunks != 0 && zhdr->first_chunks != 0 &&
>>> +                   zhdr->last_chunks == 0 &&
>>> +                   zhdr->start_middle > zhdr->first_chunks + 2) {
>>> +                       unsigned short new_start = zhdr->first_chunks + 1;
>>> +                       memmove(beg + (new_start << CHUNK_SHIFT),
>>> +                               beg + (zhdr->start_middle << CHUNK_SHIFT),
>>> +                               zhdr->middle_chunks << CHUNK_SHIFT);
>>> +                       zhdr->start_middle = new_start;
>>> +                       clear_bit(UNDER_COMPACTION, &page->private);
>>> +                       return 1;
>>> +               }
>>> +               if (zhdr->middle_chunks != 0 && zhdr->last_chunks != 0 &&
>>> +                   zhdr->first_chunks == 0 &&
>>> +                   zhdr->middle_chunks + zhdr->last_chunks <=
>>> +                   NCHUNKS - zhdr->start_middle - 2) {
>>> +                       unsigned short new_start = NCHUNKS - zhdr->last_chunks -
>>> +                               zhdr->middle_chunks;
>>> +                       memmove(beg + (new_start << CHUNK_SHIFT),
>>> +                               beg + (zhdr->start_middle << CHUNK_SHIFT),
>>> +                               zhdr->middle_chunks << CHUNK_SHIFT);
>>
>> these if clauses, and the memmoves, aren't very readable, could it be
>> made any better with a separate function?
>
> Yep, look somewhat scary, I'll think what I can do about it.
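
maybe a small helper would do it, something like (sketch; the name is
illustrative):

	/* move the middle bud so that it starts new_start chunks into
	 * the page */
	static inline void mchunk_memmove(struct z3fold_header *zhdr,
					  unsigned short new_start)
	{
		void *beg = zhdr;

		memmove(beg + (new_start << CHUNK_SHIFT),
			beg + (zhdr->start_middle << CHUNK_SHIFT),
			zhdr->middle_chunks << CHUNK_SHIFT);
		zhdr->start_middle = new_start;
	}

so the two relocation branches only have to pick new_start and call it.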
>
>>> +                       zhdr->start_middle = new_start;
>>> +                       clear_bit(UNDER_COMPACTION, &page->private);
>>> +                       return 1;
>>> +               }
>>> +       }
>>> +out:
>>> +       clear_bit(UNDER_COMPACTION, &page->private);
>>> +       return 0;
>>> +}
>>> +
>>> +static unsigned long z3fold_shrink_count(struct shrinker *shrink,
>>> +                               struct shrink_control *sc)
>>> +{
>>> +       struct z3fold_pool *pool = container_of(shrink, struct z3fold_pool,
>>> +                                               shrinker);
>>> +
>>> +       return atomic64_read(&pool->unbuddied_nr);
>>> +}
>>> +
>>> +static unsigned long z3fold_shrink_scan(struct shrinker *shrink,
>>> +                               struct shrink_control *sc)
>>> +{
>>> +       struct z3fold_pool *pool = container_of(shrink, struct z3fold_pool,
>>> +                                               shrinker);
>>> +       struct z3fold_header *zhdr;
>>> +       int i, nr_to_scan = sc->nr_to_scan, nr_shrunk = 0;
>>> +
>>> +       spin_lock(&pool->lock);
>>> +       for_each_unbuddied_list_down(i, NCHUNKS - 3) {
>>> +               if (!list_empty(&pool->unbuddied[i])) {
>>> +                       zhdr = list_first_entry(&pool->unbuddied[i],
>>> +                                               struct z3fold_header, buddy);
>>> +                       list_del(&zhdr->buddy);
>>> +                       spin_unlock(&pool->lock);
>>> +                       nr_shrunk += z3fold_compact_page(zhdr, false);
>>> +                       spin_lock(&pool->lock);
>>> +                       list_add(&zhdr->buddy,
>>> +                               &pool->unbuddied[num_free_chunks(zhdr)]);
>>
>> use list_add_tail(), we just compacted it and putting it at the head
>> of the new unbuddied list will cause it to be unnecessarily scanned
>> first later.
>>
>>> +                       if (!--nr_to_scan)
>>> +                               break;
>>> +               }
>>> +       }
>>> +       spin_unlock(&pool->lock);
>>> +       return nr_shrunk;
>>> +}
>>> +
>>> +
>>>  /*****************
>>>   * API Functions
>>>  *****************/
>>> @@ -230,7 +335,7 @@ static struct z3fold_pool *z3fold_create_pool(gfp_t gfp,
>>>
>>>         pool = kzalloc(sizeof(struct z3fold_pool), gfp);
>>>         if (!pool)
>>> -               return NULL;
>>> +               goto out;
>>>         spin_lock_init(&pool->lock);
>>>         for_each_unbuddied_list(i, 0)
>>>                 INIT_LIST_HEAD(&pool->unbuddied[i]);
>>> @@ -238,8 +343,19 @@ static struct z3fold_pool *z3fold_create_pool(gfp_t gfp,
>>>         INIT_LIST_HEAD(&pool->lru);
>>>         atomic64_set(&pool->pages_nr, 0);
>>>         atomic64_set(&pool->unbuddied_nr, 0);
>>> +       pool->shrinker.count_objects = z3fold_shrink_count;
>>> +       pool->shrinker.scan_objects = z3fold_shrink_scan;
>>> +       pool->shrinker.seeks = DEFAULT_SEEKS;
>>> +       pool->shrinker.batch = NCHUNKS - 4;
>>> +       if (register_shrinker(&pool->shrinker))
>>> +               goto out_free;
>>>         pool->ops = ops;
>>>         return pool;
>>> +
>>> +out_free:
>>> +       kfree(pool);
>>> +out:
>>> +       return NULL;
>>>  }
>>>
>>>  /**
>>> @@ -250,31 +366,10 @@ static struct z3fold_pool *z3fold_create_pool(gfp_t gfp,
>>>   */
>>>  static void z3fold_destroy_pool(struct z3fold_pool *pool)
>>>  {
>>> +       unregister_shrinker(&pool->shrinker);
>>>         kfree(pool);
>>>  }
>>>
>>> -/* Has to be called with lock held */
>>> -static int z3fold_compact_page(struct z3fold_header *zhdr)
>>> -{
>>> -       struct page *page = virt_to_page(zhdr);
>>> -       void *beg = zhdr;
>>> -
>>> -
>>> -       if (!test_bit(MIDDLE_CHUNK_MAPPED, &page->private) &&
>>> -           zhdr->middle_chunks != 0 &&
>>> -           zhdr->first_chunks == 0 && zhdr->last_chunks == 0) {
>>> -               memmove(beg + ZHDR_SIZE_ALIGNED,
>>> -                       beg + (zhdr->start_middle << CHUNK_SHIFT),
>>> -                       zhdr->middle_chunks << CHUNK_SHIFT);
>>> -               zhdr->first_chunks = zhdr->middle_chunks;
>>> -               zhdr->middle_chunks = 0;
>>> -               zhdr->start_middle = 0;
>>> -               zhdr->first_num++;
>>> -               return 1;
>>> -       }
>>> -       return 0;
>>> -}
>>> -
>>>  /**
>>>   * z3fold_alloc() - allocates a region of a given size
>>>   * @pool:      z3fold pool from which to allocate
>>> @@ -464,7 +559,6 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
>>>                 free_z3fold_page(zhdr);
>>>                 atomic64_dec(&pool->pages_nr);
>>>         } else {
>>> -               z3fold_compact_page(zhdr);
>>
>> why remove this?
>
> As I've said above, that gives some performance improvement.
>
> ~vitaly

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 3/3] z3fold: add shrinker
  2016-10-18 14:27       ` Dan Streetman
@ 2016-10-18 14:51         ` Vitaly Wool
  2016-10-18 15:29           ` Dan Streetman
  0 siblings, 1 reply; 14+ messages in thread
From: Vitaly Wool @ 2016-10-18 14:51 UTC (permalink / raw)
  To: Dan Streetman; +Cc: Linux-MM, linux-kernel, Andrew Morton, Dave Chinner

On Tue, Oct 18, 2016 at 4:27 PM, Dan Streetman <ddstreet@ieee.org> wrote:
> On Mon, Oct 17, 2016 at 10:45 PM, Vitaly Wool <vitalywool@gmail.com> wrote:
>> Hi Dan,
>>
>> On Tue, Oct 18, 2016 at 4:06 AM, Dan Streetman <ddstreet@ieee.org> wrote:
>>> On Sat, Oct 15, 2016 at 8:05 AM, Vitaly Wool <vitalywool@gmail.com> wrote:
>>>> This patch implements shrinker for z3fold. This shrinker
>>>> implementation does not free up any pages directly but it allows
>>>> for a denser placement of compressed objects which results in
>>>> less actual pages consumed and higher compression ratio therefore.
>>>>
>>>> This update removes z3fold page compaction from the freeing path
>>>> since we can rely on shrinker to do the job. Also, a new flag
>>>> UNDER_COMPACTION is introduced to protect against two threads
>>>> trying to compact the same page.
>>>
>>> i'm completely unconvinced that this should be a shrinker.  The
>>> alloc/free paths are much, much better suited to compacting a page
>>> than a shrinker that must scan through all the unbuddied pages.  Why
>>> not just improve compaction for the alloc/free paths?
>>
>> Basically the main reason is performance, I want to avoid compaction on hot
>> paths as much as possible. This patchset brings both performance and
>> compression ratio gain, I'm not sure how to achieve that with improving
>> compaction on alloc/free paths.
>
> It seems like a tradeoff of slight improvement in hot paths, for
> significant decrease in performance by adding a shrinker, which will
> do a lot of unnecessary scanning.  The alloc/free/unmap functions are
> working directly with the page at exactly the point where compaction
> is needed - when adding or removing a bud from the page.

I can see that sometimes there are substantial numbers of pages that
cannot be compacted synchronously because the MIDDLE_CHUNK_MAPPED bit
is set. Picking those up seems like a good job for a shrinker, and since
they end up at the beginning of their respective unbuddied lists, the
shrinker is well placed to find them. I can optimize that slightly by
introducing a COMPACT_DEFERRED flag or something like that to make the
shrinker find those pages faster - would that make sense to you?
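
Roughly (sketch; the flag name is tentative):

	/* in z3fold_unmap(), when the middle bud becomes movable again */
	if (buddy == MIDDLE) {
		clear_bit(MIDDLE_CHUNK_MAPPED, &page->private);
		set_bit(COMPACT_DEFERRED, &page->private);
	}

	/* ... and in z3fold_shrink_scan(), skip pages that aren't marked */
	if (!test_and_clear_bit(COMPACT_DEFERRED, &page->private))
		continue;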

> Sorry if I missed it in earlier emails, but have you done any
> performance measurements comparing with/without the shrinker?  The
> compression ratio gains may be possible with only the
> z3fold_compact_page() improvements, and performance may be stable (or
> better) with only a per-z3fold-page lock, instead of adding the
> shrinker...?

I'm running some tests with per-page locks now, but according to the
previous measurements the shrinker version always wins on multi-core
platforms.

> If a shrinker really is needed, it seems like it would be better
> suited to coalescing separate z3fold pages via migration, like
> zsmalloc does (although that's a significant amount of work).

I really don't want to go that way to keep z3fold applicable to an MMU-less
system.

~vitaly

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 3/3] z3fold: add shrinker
  2016-10-18 14:51         ` Vitaly Wool
@ 2016-10-18 15:29           ` Dan Streetman
       [not found]             ` <CAMJBoFNRFPYkcX05jZWjO21V9xzCipbBBtsq9CQbTU1EOK2hyg@mail.gmail.com>
  0 siblings, 1 reply; 14+ messages in thread
From: Dan Streetman @ 2016-10-18 15:29 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: Linux-MM, linux-kernel, Andrew Morton, Dave Chinner

On Tue, Oct 18, 2016 at 10:51 AM, Vitaly Wool <vitalywool@gmail.com> wrote:
> On Tue, Oct 18, 2016 at 4:27 PM, Dan Streetman <ddstreet@ieee.org> wrote:
>> On Mon, Oct 17, 2016 at 10:45 PM, Vitaly Wool <vitalywool@gmail.com> wrote:
>>> Hi Dan,
>>>
>>> On Tue, Oct 18, 2016 at 4:06 AM, Dan Streetman <ddstreet@ieee.org> wrote:
>>>> On Sat, Oct 15, 2016 at 8:05 AM, Vitaly Wool <vitalywool@gmail.com> wrote:
>>>>> This patch implements shrinker for z3fold. This shrinker
>>>>> implementation does not free up any pages directly but it allows
>>>>> for a denser placement of compressed objects which results in
>>>>> less actual pages consumed and higher compression ratio therefore.
>>>>>
>>>>> This update removes z3fold page compaction from the freeing path
>>>>> since we can rely on shrinker to do the job. Also, a new flag
>>>>> UNDER_COMPACTION is introduced to protect against two threads
>>>>> trying to compact the same page.
>>>>
>>>> i'm completely unconvinced that this should be a shrinker.  The
>>>> alloc/free paths are much, much better suited to compacting a page
>>>> than a shrinker that must scan through all the unbuddied pages.  Why
>>>> not just improve compaction for the alloc/free paths?
>>>
>>> Basically the main reason is performance, I want to avoid compaction on hot
>>> paths as much as possible. This patchset brings both performance and
>>> compression ratio gain, I'm not sure how to achieve that with improving
>>> compaction on alloc/free paths.
>>
>> It seems like a tradeoff of slight improvement in hot paths, for
>> significant decrease in performance by adding a shrinker, which will
>> do a lot of unnecessary scanning.  The alloc/free/unmap functions are
>> working directly with the page at exactly the point where compaction
>> is needed - when adding or removing a bud from the page.
>
> I can see that sometimes there are substantial numbers of pages that
> cannot be compacted synchronously because the MIDDLE_CHUNK_MAPPED bit
> is set. Picking those up seems like a good job for a shrinker, and since
> they end up at the beginning of their respective unbuddied lists, the
> shrinker is well placed to find them. I can optimize that slightly by
> introducing a COMPACT_DEFERRED flag or something like that to make the
> shrinker find those pages faster - would that make sense to you?

Why not just compact the page in z3fold_unmap()?

>
>> Sorry if I missed it in earlier emails, but have you done any
>> performance measurements comparing with/without the shrinker?  The
>> compression ratio gains may be possible with only the
>> z3fold_compact_page() improvements, and performance may be stable (or
>> better) with only a per-z3fold-page lock, instead of adding the
>> shrinker...?
>
> I'm running some tests with per-page locks now, but according to the
> previous measurements the shrinker version always wins on multi-core
> platforms.

But that comparison is without taking the spinlock in map/unmap, right?

>
>> If a shrinker really is needed, it seems like it would be better
>> suited to coalescing separate z3fold pages via migration, like
>> zsmalloc does (although that's a significant amount of work).
>
> I really don't want to go that way to keep z3fold applicable to an MMU-less
> system.
>
> ~vitaly

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5] z3fold: add shrinker
       [not found]             ` <CAMJBoFNRFPYkcX05jZWjO21V9xzCipbBBtsq9CQbTU1EOK2hyg@mail.gmail.com>
@ 2016-10-18 17:35               ` Dan Streetman
  2016-10-18 18:36                 ` Vitaly Wool
  0 siblings, 1 reply; 14+ messages in thread
From: Dan Streetman @ 2016-10-18 17:35 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: Dave Chinner, LKML, Andrew Morton, Linux-MM

On Tue, Oct 18, 2016 at 12:26 PM, Vitaly Wool <vitalywool@gmail.com> wrote:
> On 18 Oct 2016 at 18:29, "Dan Streetman" <ddstreet@ieee.org>
> wrote:
>
>
>>
>> On Tue, Oct 18, 2016 at 10:51 AM, Vitaly Wool <vitalywool@gmail.com>
>> wrote:
>> > On Tue, Oct 18, 2016 at 4:27 PM, Dan Streetman <ddstreet@ieee.org>
>> > wrote:
>> >> On Mon, Oct 17, 2016 at 10:45 PM, Vitaly Wool <vitalywool@gmail.com>
>> >> wrote:
>> >>> Hi Dan,
>> >>>
>> >>> On Tue, Oct 18, 2016 at 4:06 AM, Dan Streetman <ddstreet@ieee.org>
>> >>> wrote:
>> >>>> On Sat, Oct 15, 2016 at 8:05 AM, Vitaly Wool <vitalywool@gmail.com>
>> >>>> wrote:
>> >>>>> This patch implements shrinker for z3fold. This shrinker
>> >>>>> implementation does not free up any pages directly but it allows
>> >>>>> for a denser placement of compressed objects which results in
>> >>>>> less actual pages consumed and higher compression ratio therefore.
>> >>>>>
>> >>>>> This update removes z3fold page compaction from the freeing path
>> >>>>> since we can rely on shrinker to do the job. Also, a new flag
>> >>>>> UNDER_COMPACTION is introduced to protect against two threads
>> >>>>> trying to compact the same page.
>> >>>>
>> >>>> i'm completely unconvinced that this should be a shrinker.  The
>> >>>> alloc/free paths are much, much better suited to compacting a page
>> >>>> than a shrinker that must scan through all the unbuddied pages.  Why
>> >>>> not just improve compaction for the alloc/free paths?
>> >>>
>> >>> Basically the main reason is performance, I want to avoid compaction
>> >>> on hot
>> >>> paths as much as possible. This patchset brings both performance and
>> >>> compression ratio gain, I'm not sure how to achieve that with
>> >>> improving
>> >>> compaction on alloc/free paths.
>> >>
>> >> It seems like a tradeoff of slight improvement in hot paths, for
>> >> significant decrease in performance by adding a shrinker, which will
>> >> do a lot of unnecessary scanning.  The alloc/free/unmap functions are
>> >> working directly with the page at exactly the point where compaction
>> >> is needed - when adding or removing a bud from the page.
>> >
>> > I can see that sometimes there are substantial numbers of pages that
>> > cannot be compacted synchronously because the MIDDLE_CHUNK_MAPPED
>> > bit is set. Picking those up seems like a good job for a shrinker,
>> > and since they end up at the beginning of their respective unbuddied
>> > lists, the shrinker is well placed to find them. I can optimize that
>> > slightly by introducing a COMPACT_DEFERRED flag or something like
>> > that to make the shrinker find those pages faster - would that make
>> > sense to you?
>>
>> Why not just compact the page in z3fold_unmap()?
>
> That would give a huge performance penalty (checked).

my core concern with the shrinker is that it has no context about which
pages need compacting and which don't, while the alloc/free/unmap
functions do.  If the alloc/free/unmap fast paths are impacted too
much by compacting directly, then yeah setting a flag for deferred
action would be better than the shrinker just scanning all pages.
However, in that case, a shrinker still seems unnecessary -
all the pages that need compacting are pre-marked, there's no need to
scan any more.  Isn't a simple workqueue to deferred-compact pages
better?

>
>> >> Sorry if I missed it in earlier emails, but have you done any
>> >> performance measurements comparing with/without the shrinker?  The
>> >> compression ratio gains may be possible with only the
>> >> z3fold_compact_page() improvements, and performance may be stable (or
>> >> better) with only a per-z3fold-page lock, instead of adding the
>> >> shrinker...?
>> >
>> > I'm running some tests with per-page locks now, but according to the
>> > previous measurements the shrinker version always wins on multi-core
>> > platforms.
>>
>> But that comparison is without taking the spinlock in map/unmap, right?
>
> Right, but from the recent measurements it looks like per-page locks don't
> slow things down that much.
>
> ~vitaly

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5] z3fold: add shrinker
  2016-10-18 17:35               ` [PATCH v5] " Dan Streetman
@ 2016-10-18 18:36                 ` Vitaly Wool
  0 siblings, 0 replies; 14+ messages in thread
From: Vitaly Wool @ 2016-10-18 18:36 UTC (permalink / raw)
  To: Dan Streetman; +Cc: Dave Chinner, LKML, Andrew Morton, Linux-MM

On Tue, Oct 18, 2016 at 7:35 PM, Dan Streetman <ddstreet@ieee.org> wrote:
> On Tue, Oct 18, 2016 at 12:26 PM, Vitaly Wool <vitalywool@gmail.com> wrote:
>> On 18 Oct 2016 at 18:29, "Dan Streetman" <ddstreet@ieee.org>
>> wrote:
>>
>>
>>>
>>> On Tue, Oct 18, 2016 at 10:51 AM, Vitaly Wool <vitalywool@gmail.com>
>>> wrote:
>>> > On Tue, Oct 18, 2016 at 4:27 PM, Dan Streetman <ddstreet@ieee.org>
>>> > wrote:
>>> >> On Mon, Oct 17, 2016 at 10:45 PM, Vitaly Wool <vitalywool@gmail.com>
>>> >> wrote:
>>> >>> Hi Dan,
>>> >>>
>>> >>> On Tue, Oct 18, 2016 at 4:06 AM, Dan Streetman <ddstreet@ieee.org>
>>> >>> wrote:
>>> >>>> On Sat, Oct 15, 2016 at 8:05 AM, Vitaly Wool <vitalywool@gmail.com>
>>> >>>> wrote:
>>> >>>>> This patch implements shrinker for z3fold. This shrinker
>>> >>>>> implementation does not free up any pages directly but it allows
>>> >>>>> for a denser placement of compressed objects which results in
>>> >>>>> less actual pages consumed and higher compression ratio therefore.
>>> >>>>>
>>> >>>>> This update removes z3fold page compaction from the freeing path
>>> >>>>> since we can rely on shrinker to do the job. Also, a new flag
>>> >>>>> UNDER_COMPACTION is introduced to protect against two threads
>>> >>>>> trying to compact the same page.
>>> >>>>
>>> >>>> i'm completely unconvinced that this should be a shrinker.  The
>>> >>>> alloc/free paths are much, much better suited to compacting a page
>>> >>>> than a shrinker that must scan through all the unbuddied pages.  Why
>>> >>>> not just improve compaction for the alloc/free paths?
>>> >>>
>>> >>> Basically the main reason is performance, I want to avoid compaction
>>> >>> on hot
>>> >>> paths as much as possible. This patchset brings both performance and
>>> >>> compression ratio gain, I'm not sure how to achieve that with
>>> >>> improving
>>> >>> compaction on alloc/free paths.
>>> >>
>>> >> It seems like a tradeoff of slight improvement in hot paths, for
>>> >> significant decrease in performance by adding a shrinker, which will
>>> >> do a lot of unnecessary scanning.  The alloc/free/unmap functions are
>>> >> working directly with the page at exactly the point where compaction
>>> >> is needed - when adding or removing a bud from the page.
>>> >
>>> > I can see that sometimes there are substantial amounts of pages that
>>> > are non-compactable synchronously due to the MIDDLE_CHUNK_MAPPED
>>> > bit set. Picking up those seems to be a good job for a shrinker, and
>>> > those
>>> > end up in the beginning of respective unbuddied lists, so the shrinker
>>> > is set
>>> > to find them. I can slightly optimize that by introducing a
>>> > COMPACT_DEFERRED flag or something like that to make shrinker find
>>> > those pages faster, would that make sense to you?
>>>
>>> Why not just compact the page in z3fold_unmap()?
>>
>> That would give a huge performance penalty (checked).
>
> my core concern with the shrinker is that it has no context about which
> pages need compacting and which don't, while the alloc/free/unmap
> functions do.  If the alloc/free/unmap fast paths are impacted too
> much by compacting directly, then yeah setting a flag for deferred
> action would be better than the shrinker just scanning all pages.
> However, in that case, a shrinker still seems unnecessary -
> all the pages that need compacting are pre-marked, there's no need to
> scan any more.  Isn't a simple workqueue to deferred-compact pages
> better?

Well yep, the more I think about it, the more it seems a better fit. I'll test
the workqueue approach performance-wise and get back with a new patch.
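
Roughly along these lines - just a sketch, the new fields are made up and
would need to be initialized in z3fold_create_pool() / init_z3fold_page(),
and page lifetime vs. the queue is not handled here:

	/* per-pool deferred compaction: pages get queued for compaction
	 * instead of being compacted inline or scanned by a shrinker */
	struct z3fold_pool {
		/* ... existing fields ... */
		spinlock_t compact_lock;
		struct list_head compact_list;
		struct work_struct compact_work;
	};

	static void z3fold_compact_work(struct work_struct *w)
	{
		struct z3fold_pool *pool =
			container_of(w, struct z3fold_pool, compact_work);
		struct z3fold_header *zhdr;

		spin_lock(&pool->compact_lock);
		while (!list_empty(&pool->compact_list)) {
			/* compact_node would be a new list_head in
			 * struct z3fold_header */
			zhdr = list_first_entry(&pool->compact_list,
						struct z3fold_header,
						compact_node);
			list_del_init(&zhdr->compact_node);
			spin_unlock(&pool->compact_lock);
			z3fold_compact_page(zhdr, false);
			spin_lock(&pool->compact_lock);
		}
		spin_unlock(&pool->compact_lock);
	}

	/* in z3fold_free()/z3fold_unmap(), instead of compacting inline: */
	spin_lock(&pool->compact_lock);
	if (list_empty(&zhdr->compact_node))
		list_add_tail(&zhdr->compact_node, &pool->compact_list);
	spin_unlock(&pool->compact_lock);
	queue_work(system_unbound_wq, &pool->compact_work);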

~vitaly

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread

Thread overview: 14+ messages
2016-10-15 11:56 [PATCH v5] z3fold: add shrinker Vitaly Wool
2016-10-15 11:58 ` [PATCH v5 1/3] z3fold: make counters atomic Vitaly Wool
2016-10-17 20:37   ` Dan Streetman
2016-10-15 11:59 ` [PATCH v5 2/3] z3fold: remove redundant locking Vitaly Wool
2016-10-17 20:48   ` Dan Streetman
2016-10-18  2:55     ` Vitaly Wool
2016-10-15 12:05 ` [PATCH v5 3/3] z3fold: add shrinker Vitaly Wool
2016-10-18  2:06   ` Dan Streetman
2016-10-18  2:45     ` Vitaly Wool
2016-10-18 14:27       ` Dan Streetman
2016-10-18 14:51         ` Vitaly Wool
2016-10-18 15:29           ` Dan Streetman
     [not found]             ` <CAMJBoFNRFPYkcX05jZWjO21V9xzCipbBBtsq9CQbTU1EOK2hyg@mail.gmail.com>
2016-10-18 17:35               ` [PATCH v5] " Dan Streetman
2016-10-18 18:36                 ` Vitaly Wool
