All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] mm/zswap: optimize for dynamic zswap_pools
@ 2024-02-14  8:54 Chengming Zhou
  2024-02-14  8:54 ` [PATCH v2 1/2] mm/zswap: global lru and shrinker shared by all zswap_pools Chengming Zhou
  2024-02-14  8:54 ` [PATCH v2 2/2] mm/zswap: change zswap_pool kref to percpu_ref Chengming Zhou
  0 siblings, 2 replies; 7+ messages in thread
From: Chengming Zhou @ 2024-02-14  8:54 UTC (permalink / raw)
  To: Yosry Ahmed, Johannes Weiner, Andrew Morton, Nhat Pham
  Cc: linux-mm, Chengming Zhou, linux-kernel



---
Changes in v2:
- fix build error when !CONFIG_MEMCG_KMEM.
- make zswap struct static and fix some error paths, per Yosry.
- add another shrink_lock to protect zswap.next_shrink, per Yosry.
- keep "WARN_ON(percpu_ref_tryget(&pool->ref))" in pool release path
  for debug, per Nhat.
- improve the commit messages.
- Link to v1: https://lore.kernel.org/r/20240210-zswap-global-lru-v1-0-853473d7b0da@bytedance.com

Dynamic pool creation has been supported for a long time, which maybe
not used so much in practice. But with the per-memcg lru merged, the
current structure of zswap_pool's lru and shrinker become less optimal.

In the current structure, each zswap_pool has its own lru, shrinker and
shrink_work, but only the latest zswap_pool will be the current used.

1. When memory has pressure, all shrinkers of zswap_pools will try to
   shrink its lru list, there is no order between them.

2. When zswap limit hit, only the last zswap_pool's shrink_work will
   try to shrink its own lru, which is inefficient.

A more natural way is to have a global zswap lru shared between all
zswap_pools, and so is the shrinker. The code becomes much simpler too.

Another optimization is changing zswap_pool kref to percpu_ref, which
will be taken reference by every zswap entry. So the scalability is
better.

Testing kernel build (32 threads) in tmpfs with memory.max=2GB.
(zswap shrinker and writeback enabled with one 50GB swapfile,
on a 128 CPUs x86-64 machine, below is the average of 5 runs)

        mm-unstable  zswap-global-lru
real    63.20        63.12
user    1061.75      1062.95
sys     268.74       264.44

To: Andrew Morton <akpm@linux-foundation.org>
To: Johannes Weiner <hannes@cmpxchg.org>
To: Yosry Ahmed <yosryahmed@google.com>
To: Nhat Pham <nphamcs@gmail.com>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>

---
Chengming Zhou (2):
      mm/zswap: global lru and shrinker shared by all zswap_pools
      mm/zswap: change zswap_pool kref to percpu_ref

 mm/zswap.c | 201 ++++++++++++++++++++++++++-----------------------------------
 1 file changed, 87 insertions(+), 114 deletions(-)
---
base-commit: 191d97734e41a5c9f90a2f6636fdd335ae1d435d
change-id: 20240210-zswap-global-lru-94d49316178b

Best regards,
-- 
Chengming Zhou <zhouchengming@bytedance.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 1/2] mm/zswap: global lru and shrinker shared by all zswap_pools
  2024-02-14  8:54 [PATCH v2 0/2] mm/zswap: optimize for dynamic zswap_pools Chengming Zhou
@ 2024-02-14  8:54 ` Chengming Zhou
  2024-02-14 19:20   ` Yosry Ahmed
  2024-02-14  8:54 ` [PATCH v2 2/2] mm/zswap: change zswap_pool kref to percpu_ref Chengming Zhou
  1 sibling, 1 reply; 7+ messages in thread
From: Chengming Zhou @ 2024-02-14  8:54 UTC (permalink / raw)
  To: Yosry Ahmed, Johannes Weiner, Andrew Morton, Nhat Pham
  Cc: linux-mm, Chengming Zhou, linux-kernel

Dynamic zswap_pool creation may create/reuse to have multiple
zswap_pools in a list, only the first will be current used.

Each zswap_pool has its own lru and shrinker, which is not
necessary and has its problem:

1. When memory has pressure, all shrinker of zswap_pools will
   try to shrink its own lru, there is no order between them.

2. When zswap limit hit, only the last zswap_pool's shrink_work
   will try to shrink its lru, which is inefficient.

Anyway, having a global lru and shrinker shared by all zswap_pools
is better and efficient.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 mm/zswap.c | 170 +++++++++++++++++++++++--------------------------------------
 1 file changed, 65 insertions(+), 105 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 62fe307521c9..dbff67d7e1c7 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -176,14 +176,18 @@ struct zswap_pool {
 	struct kref kref;
 	struct list_head list;
 	struct work_struct release_work;
-	struct work_struct shrink_work;
 	struct hlist_node node;
 	char tfm_name[CRYPTO_MAX_ALG_NAME];
+};
+
+static struct {
 	struct list_lru list_lru;
-	struct mem_cgroup *next_shrink;
-	struct shrinker *shrinker;
 	atomic_t nr_stored;
-};
+	struct shrinker *shrinker;
+	struct work_struct shrink_work;
+	struct mem_cgroup *next_shrink;
+	spinlock_t shrink_lock;
+} zswap;
 
 /*
  * struct zswap_entry
@@ -301,9 +305,6 @@ static void zswap_update_total_size(void)
 * pool functions
 **********************************/
 
-static void zswap_alloc_shrinker(struct zswap_pool *pool);
-static void shrink_worker(struct work_struct *w);
-
 static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
 {
 	int i;
@@ -353,30 +354,16 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
 	if (ret)
 		goto error;
 
-	zswap_alloc_shrinker(pool);
-	if (!pool->shrinker)
-		goto error;
-
-	pr_debug("using %s compressor\n", pool->tfm_name);
-
 	/* being the current pool takes 1 ref; this func expects the
 	 * caller to always add the new pool as the current pool
 	 */
 	kref_init(&pool->kref);
 	INIT_LIST_HEAD(&pool->list);
-	if (list_lru_init_memcg(&pool->list_lru, pool->shrinker))
-		goto lru_fail;
-	shrinker_register(pool->shrinker);
-	INIT_WORK(&pool->shrink_work, shrink_worker);
-	atomic_set(&pool->nr_stored, 0);
 
 	zswap_pool_debug("created", pool);
 
 	return pool;
 
-lru_fail:
-	list_lru_destroy(&pool->list_lru);
-	shrinker_free(pool->shrinker);
 error:
 	if (pool->acomp_ctx)
 		free_percpu(pool->acomp_ctx);
@@ -434,15 +421,8 @@ static void zswap_pool_destroy(struct zswap_pool *pool)
 
 	zswap_pool_debug("destroying", pool);
 
-	shrinker_free(pool->shrinker);
 	cpuhp_state_remove_instance(CPUHP_MM_ZSWP_POOL_PREPARE, &pool->node);
 	free_percpu(pool->acomp_ctx);
-	list_lru_destroy(&pool->list_lru);
-
-	spin_lock(&zswap_pools_lock);
-	mem_cgroup_iter_break(NULL, pool->next_shrink);
-	pool->next_shrink = NULL;
-	spin_unlock(&zswap_pools_lock);
 
 	for (i = 0; i < ZSWAP_NR_ZPOOLS; i++)
 		zpool_destroy_pool(pool->zpools[i]);
@@ -529,24 +509,6 @@ static struct zswap_pool *zswap_pool_current_get(void)
 	return pool;
 }
 
-static struct zswap_pool *zswap_pool_last_get(void)
-{
-	struct zswap_pool *pool, *last = NULL;
-
-	rcu_read_lock();
-
-	list_for_each_entry_rcu(pool, &zswap_pools, list)
-		last = pool;
-	WARN_ONCE(!last && zswap_has_pool,
-		  "%s: no page storage pool!\n", __func__);
-	if (!zswap_pool_get(last))
-		last = NULL;
-
-	rcu_read_unlock();
-
-	return last;
-}
-
 /* type and compressor must be null-terminated */
 static struct zswap_pool *zswap_pool_find_get(char *type, char *compressor)
 {
@@ -816,15 +778,11 @@ void zswap_folio_swapin(struct folio *folio)
 
 void zswap_memcg_offline_cleanup(struct mem_cgroup *memcg)
 {
-	struct zswap_pool *pool;
-
-	/* lock out zswap pools list modification */
-	spin_lock(&zswap_pools_lock);
-	list_for_each_entry(pool, &zswap_pools, list) {
-		if (pool->next_shrink == memcg)
-			pool->next_shrink = mem_cgroup_iter(NULL, pool->next_shrink, NULL);
-	}
-	spin_unlock(&zswap_pools_lock);
+	/* lock out zswap shrinker walking memcg tree */
+	spin_lock(&zswap.shrink_lock);
+	if (zswap.next_shrink == memcg)
+		zswap.next_shrink = mem_cgroup_iter(NULL, zswap.next_shrink, NULL);
+	spin_unlock(&zswap.shrink_lock);
 }
 
 /*********************************
@@ -923,9 +881,9 @@ static void zswap_entry_free(struct zswap_entry *entry)
 	if (!entry->length)
 		atomic_dec(&zswap_same_filled_pages);
 	else {
-		zswap_lru_del(&entry->pool->list_lru, entry);
+		zswap_lru_del(&zswap.list_lru, entry);
 		zpool_free(zswap_find_zpool(entry), entry->handle);
-		atomic_dec(&entry->pool->nr_stored);
+		atomic_dec(&zswap.nr_stored);
 		zswap_pool_put(entry->pool);
 	}
 	if (entry->objcg) {
@@ -1288,7 +1246,6 @@ static unsigned long zswap_shrinker_scan(struct shrinker *shrinker,
 {
 	struct lruvec *lruvec = mem_cgroup_lruvec(sc->memcg, NODE_DATA(sc->nid));
 	unsigned long shrink_ret, nr_protected, lru_size;
-	struct zswap_pool *pool = shrinker->private_data;
 	bool encountered_page_in_swapcache = false;
 
 	if (!zswap_shrinker_enabled ||
@@ -1299,7 +1256,7 @@ static unsigned long zswap_shrinker_scan(struct shrinker *shrinker,
 
 	nr_protected =
 		atomic_long_read(&lruvec->zswap_lruvec_state.nr_zswap_protected);
-	lru_size = list_lru_shrink_count(&pool->list_lru, sc);
+	lru_size = list_lru_shrink_count(&zswap.list_lru, sc);
 
 	/*
 	 * Abort if we are shrinking into the protected region.
@@ -1316,7 +1273,7 @@ static unsigned long zswap_shrinker_scan(struct shrinker *shrinker,
 		return SHRINK_STOP;
 	}
 
-	shrink_ret = list_lru_shrink_walk(&pool->list_lru, sc, &shrink_memcg_cb,
+	shrink_ret = list_lru_shrink_walk(&zswap.list_lru, sc, &shrink_memcg_cb,
 		&encountered_page_in_swapcache);
 
 	if (encountered_page_in_swapcache)
@@ -1328,7 +1285,6 @@ static unsigned long zswap_shrinker_scan(struct shrinker *shrinker,
 static unsigned long zswap_shrinker_count(struct shrinker *shrinker,
 		struct shrink_control *sc)
 {
-	struct zswap_pool *pool = shrinker->private_data;
 	struct mem_cgroup *memcg = sc->memcg;
 	struct lruvec *lruvec = mem_cgroup_lruvec(memcg, NODE_DATA(sc->nid));
 	unsigned long nr_backing, nr_stored, nr_freeable, nr_protected;
@@ -1342,8 +1298,8 @@ static unsigned long zswap_shrinker_count(struct shrinker *shrinker,
 	nr_stored = memcg_page_state(memcg, MEMCG_ZSWAPPED);
 #else
 	/* use pool stats instead of memcg stats */
-	nr_backing = get_zswap_pool_size(pool) >> PAGE_SHIFT;
-	nr_stored = atomic_read(&pool->nr_stored);
+	nr_backing = zswap_pool_total_size >> PAGE_SHIFT;
+	nr_stored = atomic_read(&zswap.nr_stored);
 #endif
 
 	if (!nr_stored)
@@ -1351,7 +1307,7 @@ static unsigned long zswap_shrinker_count(struct shrinker *shrinker,
 
 	nr_protected =
 		atomic_long_read(&lruvec->zswap_lruvec_state.nr_zswap_protected);
-	nr_freeable = list_lru_shrink_count(&pool->list_lru, sc);
+	nr_freeable = list_lru_shrink_count(&zswap.list_lru, sc);
 	/*
 	 * Subtract the lru size by an estimate of the number of pages
 	 * that should be protected.
@@ -1367,23 +1323,24 @@ static unsigned long zswap_shrinker_count(struct shrinker *shrinker,
 	return mult_frac(nr_freeable, nr_backing, nr_stored);
 }
 
-static void zswap_alloc_shrinker(struct zswap_pool *pool)
+static struct shrinker *zswap_alloc_shrinker(void)
 {
-	pool->shrinker =
+	struct shrinker *shrinker;
+
+	shrinker =
 		shrinker_alloc(SHRINKER_NUMA_AWARE | SHRINKER_MEMCG_AWARE, "mm-zswap");
-	if (!pool->shrinker)
-		return;
+	if (!shrinker)
+		return NULL;
 
-	pool->shrinker->private_data = pool;
-	pool->shrinker->scan_objects = zswap_shrinker_scan;
-	pool->shrinker->count_objects = zswap_shrinker_count;
-	pool->shrinker->batch = 0;
-	pool->shrinker->seeks = DEFAULT_SEEKS;
+	shrinker->scan_objects = zswap_shrinker_scan;
+	shrinker->count_objects = zswap_shrinker_count;
+	shrinker->batch = 0;
+	shrinker->seeks = DEFAULT_SEEKS;
+	return shrinker;
 }
 
 static int shrink_memcg(struct mem_cgroup *memcg)
 {
-	struct zswap_pool *pool;
 	int nid, shrunk = 0;
 
 	if (!mem_cgroup_zswap_writeback_enabled(memcg))
@@ -1396,32 +1353,25 @@ static int shrink_memcg(struct mem_cgroup *memcg)
 	if (memcg && !mem_cgroup_online(memcg))
 		return -ENOENT;
 
-	pool = zswap_pool_current_get();
-	if (!pool)
-		return -EINVAL;
-
 	for_each_node_state(nid, N_NORMAL_MEMORY) {
 		unsigned long nr_to_walk = 1;
 
-		shrunk += list_lru_walk_one(&pool->list_lru, nid, memcg,
+		shrunk += list_lru_walk_one(&zswap.list_lru, nid, memcg,
 					    &shrink_memcg_cb, NULL, &nr_to_walk);
 	}
-	zswap_pool_put(pool);
 	return shrunk ? 0 : -EAGAIN;
 }
 
 static void shrink_worker(struct work_struct *w)
 {
-	struct zswap_pool *pool = container_of(w, typeof(*pool),
-						shrink_work);
 	struct mem_cgroup *memcg;
 	int ret, failures = 0;
 
 	/* global reclaim will select cgroup in a round-robin fashion. */
 	do {
-		spin_lock(&zswap_pools_lock);
-		pool->next_shrink = mem_cgroup_iter(NULL, pool->next_shrink, NULL);
-		memcg = pool->next_shrink;
+		spin_lock(&zswap.shrink_lock);
+		zswap.next_shrink = mem_cgroup_iter(NULL, zswap.next_shrink, NULL);
+		memcg = zswap.next_shrink;
 
 		/*
 		 * We need to retry if we have gone through a full round trip, or if we
@@ -1435,7 +1385,7 @@ static void shrink_worker(struct work_struct *w)
 		 * memcg is not killed when we are reclaiming.
 		 */
 		if (!memcg) {
-			spin_unlock(&zswap_pools_lock);
+			spin_unlock(&zswap.shrink_lock);
 			if (++failures == MAX_RECLAIM_RETRIES)
 				break;
 
@@ -1445,15 +1395,15 @@ static void shrink_worker(struct work_struct *w)
 		if (!mem_cgroup_tryget_online(memcg)) {
 			/* drop the reference from mem_cgroup_iter() */
 			mem_cgroup_iter_break(NULL, memcg);
-			pool->next_shrink = NULL;
-			spin_unlock(&zswap_pools_lock);
+			zswap.next_shrink = NULL;
+			spin_unlock(&zswap.shrink_lock);
 
 			if (++failures == MAX_RECLAIM_RETRIES)
 				break;
 
 			goto resched;
 		}
-		spin_unlock(&zswap_pools_lock);
+		spin_unlock(&zswap.shrink_lock);
 
 		ret = shrink_memcg(memcg);
 		/* drop the extra reference */
@@ -1467,7 +1417,6 @@ static void shrink_worker(struct work_struct *w)
 resched:
 		cond_resched();
 	} while (!zswap_can_accept());
-	zswap_pool_put(pool);
 }
 
 static int zswap_is_page_same_filled(void *ptr, unsigned long *value)
@@ -1508,7 +1457,6 @@ bool zswap_store(struct folio *folio)
 	struct zswap_entry *entry, *dupentry;
 	struct obj_cgroup *objcg = NULL;
 	struct mem_cgroup *memcg = NULL;
-	struct zswap_pool *shrink_pool;
 
 	VM_WARN_ON_ONCE(!folio_test_locked(folio));
 	VM_WARN_ON_ONCE(!folio_test_swapcache(folio));
@@ -1576,7 +1524,7 @@ bool zswap_store(struct folio *folio)
 
 	if (objcg) {
 		memcg = get_mem_cgroup_from_objcg(objcg);
-		if (memcg_list_lru_alloc(memcg, &entry->pool->list_lru, GFP_KERNEL)) {
+		if (memcg_list_lru_alloc(memcg, &zswap.list_lru, GFP_KERNEL)) {
 			mem_cgroup_put(memcg);
 			goto put_pool;
 		}
@@ -1607,8 +1555,8 @@ bool zswap_store(struct folio *folio)
 	}
 	if (entry->length) {
 		INIT_LIST_HEAD(&entry->lru);
-		zswap_lru_add(&entry->pool->list_lru, entry);
-		atomic_inc(&entry->pool->nr_stored);
+		zswap_lru_add(&zswap.list_lru, entry);
+		atomic_inc(&zswap.nr_stored);
 	}
 	spin_unlock(&tree->lock);
 
@@ -1640,9 +1588,7 @@ bool zswap_store(struct folio *folio)
 	return false;
 
 shrink:
-	shrink_pool = zswap_pool_last_get();
-	if (shrink_pool && !queue_work(shrink_wq, &shrink_pool->shrink_work))
-		zswap_pool_put(shrink_pool);
+	queue_work(shrink_wq, &zswap.shrink_work);
 	goto reject;
 }
 
@@ -1804,6 +1750,22 @@ static int zswap_setup(void)
 	if (ret)
 		goto hp_fail;
 
+	shrink_wq = alloc_workqueue("zswap-shrink",
+			WQ_UNBOUND|WQ_MEM_RECLAIM, 1);
+	if (!shrink_wq)
+		goto shrink_wq_fail;
+
+	zswap.shrinker = zswap_alloc_shrinker();
+	if (!zswap.shrinker)
+		goto shrinker_fail;
+	if (list_lru_init_memcg(&zswap.list_lru, zswap.shrinker))
+		goto lru_fail;
+	shrinker_register(zswap.shrinker);
+
+	INIT_WORK(&zswap.shrink_work, shrink_worker);
+	atomic_set(&zswap.nr_stored, 0);
+	spin_lock_init(&zswap.shrink_lock);
+
 	pool = __zswap_pool_create_fallback();
 	if (pool) {
 		pr_info("loaded using pool %s/%s\n", pool->tfm_name,
@@ -1815,19 +1777,17 @@ static int zswap_setup(void)
 		zswap_enabled = false;
 	}
 
-	shrink_wq = alloc_workqueue("zswap-shrink",
-			WQ_UNBOUND|WQ_MEM_RECLAIM, 1);
-	if (!shrink_wq)
-		goto fallback_fail;
-
 	if (zswap_debugfs_init())
 		pr_warn("debugfs initialization failed\n");
 	zswap_init_state = ZSWAP_INIT_SUCCEED;
 	return 0;
 
-fallback_fail:
-	if (pool)
-		zswap_pool_destroy(pool);
+lru_fail:
+	shrinker_free(zswap.shrinker);
+shrinker_fail:
+	destroy_workqueue(shrink_wq);
+shrink_wq_fail:
+	cpuhp_remove_multi_state(CPUHP_MM_ZSWP_POOL_PREPARE);
 hp_fail:
 	kmem_cache_destroy(zswap_entry_cache);
 cache_fail:

-- 
b4 0.10.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 2/2] mm/zswap: change zswap_pool kref to percpu_ref
  2024-02-14  8:54 [PATCH v2 0/2] mm/zswap: optimize for dynamic zswap_pools Chengming Zhou
  2024-02-14  8:54 ` [PATCH v2 1/2] mm/zswap: global lru and shrinker shared by all zswap_pools Chengming Zhou
@ 2024-02-14  8:54 ` Chengming Zhou
  2024-02-14 20:10   ` Yosry Ahmed
  1 sibling, 1 reply; 7+ messages in thread
From: Chengming Zhou @ 2024-02-14  8:54 UTC (permalink / raw)
  To: Yosry Ahmed, Johannes Weiner, Andrew Morton, Nhat Pham
  Cc: linux-mm, Chengming Zhou, linux-kernel

All zswap entries will take a reference of zswap_pool when
zswap_store(), and drop it when free. Change it to use the
percpu_ref is better for scalability performance.

Although percpu_ref use a bit more memory which should be ok
for our use case, since we almost have only one zswap_pool to
be using. The performance gain is for zswap_store/load hotpath.

Testing kernel build (32 threads) in tmpfs with memory.max=2GB.
(zswap shrinker and writeback enabled with one 50GB swapfile,
on a 128 CPUs x86-64 machine, below is the average of 5 runs)

        mm-unstable  zswap-global-lru
real    63.20        63.12
user    1061.75      1062.95
sys     268.74       264.44

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 mm/zswap.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index dbff67d7e1c7..f6470d30d337 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -173,7 +173,7 @@ struct crypto_acomp_ctx {
 struct zswap_pool {
 	struct zpool *zpools[ZSWAP_NR_ZPOOLS];
 	struct crypto_acomp_ctx __percpu *acomp_ctx;
-	struct kref kref;
+	struct percpu_ref ref;
 	struct list_head list;
 	struct work_struct release_work;
 	struct hlist_node node;
@@ -304,6 +304,7 @@ static void zswap_update_total_size(void)
 /*********************************
 * pool functions
 **********************************/
+static void __zswap_pool_empty(struct percpu_ref *ref);
 
 static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
 {
@@ -357,13 +358,18 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
 	/* being the current pool takes 1 ref; this func expects the
 	 * caller to always add the new pool as the current pool
 	 */
-	kref_init(&pool->kref);
+	ret = percpu_ref_init(&pool->ref, __zswap_pool_empty,
+			      PERCPU_REF_ALLOW_REINIT, GFP_KERNEL);
+	if (ret)
+		goto ref_fail;
 	INIT_LIST_HEAD(&pool->list);
 
 	zswap_pool_debug("created", pool);
 
 	return pool;
 
+ref_fail:
+	cpuhp_state_remove_instance(CPUHP_MM_ZSWP_POOL_PREPARE, &pool->node);
 error:
 	if (pool->acomp_ctx)
 		free_percpu(pool->acomp_ctx);
@@ -436,8 +442,9 @@ static void __zswap_pool_release(struct work_struct *work)
 
 	synchronize_rcu();
 
-	/* nobody should have been able to get a kref... */
-	WARN_ON(kref_get_unless_zero(&pool->kref));
+	/* nobody should have been able to get a ref... */
+	WARN_ON(percpu_ref_tryget(&pool->ref));
+	percpu_ref_exit(&pool->ref);
 
 	/* pool is now off zswap_pools list and has no references. */
 	zswap_pool_destroy(pool);
@@ -445,11 +452,11 @@ static void __zswap_pool_release(struct work_struct *work)
 
 static struct zswap_pool *zswap_pool_current(void);
 
-static void __zswap_pool_empty(struct kref *kref)
+static void __zswap_pool_empty(struct percpu_ref *ref)
 {
 	struct zswap_pool *pool;
 
-	pool = container_of(kref, typeof(*pool), kref);
+	pool = container_of(ref, typeof(*pool), ref);
 
 	spin_lock(&zswap_pools_lock);
 
@@ -468,12 +475,12 @@ static int __must_check zswap_pool_get(struct zswap_pool *pool)
 	if (!pool)
 		return 0;
 
-	return kref_get_unless_zero(&pool->kref);
+	return percpu_ref_tryget(&pool->ref);
 }
 
 static void zswap_pool_put(struct zswap_pool *pool)
 {
-	kref_put(&pool->kref, __zswap_pool_empty);
+	percpu_ref_put(&pool->ref);
 }
 
 static struct zswap_pool *__zswap_pool_current(void)
@@ -603,6 +610,12 @@ static int __zswap_param_set(const char *val, const struct kernel_param *kp,
 
 	if (!pool)
 		pool = zswap_pool_create(type, compressor);
+	else {
+		/* Resurrect percpu_ref to percpu mode. */
+		percpu_ref_resurrect(&pool->ref);
+		/* Drop the ref from zswap_pool_find_get(). */
+		zswap_pool_put(pool);
+	}
 
 	if (pool)
 		ret = param_set_charp(s, kp);
@@ -641,7 +654,7 @@ static int __zswap_param_set(const char *val, const struct kernel_param *kp,
 	 * or the new pool we failed to add
 	 */
 	if (put_pool)
-		zswap_pool_put(put_pool);
+		percpu_ref_kill(&put_pool->ref);
 
 	return ret;
 }

-- 
b4 0.10.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] mm/zswap: global lru and shrinker shared by all zswap_pools
  2024-02-14  8:54 ` [PATCH v2 1/2] mm/zswap: global lru and shrinker shared by all zswap_pools Chengming Zhou
@ 2024-02-14 19:20   ` Yosry Ahmed
  2024-02-16  8:47     ` Chengming Zhou
  0 siblings, 1 reply; 7+ messages in thread
From: Yosry Ahmed @ 2024-02-14 19:20 UTC (permalink / raw)
  To: Chengming Zhou
  Cc: Johannes Weiner, Andrew Morton, Nhat Pham, linux-mm, linux-kernel

On Wed, Feb 14, 2024 at 08:54:37AM +0000, Chengming Zhou wrote:
> Dynamic zswap_pool creation may create/reuse to have multiple
> zswap_pools in a list, only the first will be current used.
> 
> Each zswap_pool has its own lru and shrinker, which is not
> necessary and has its problem:
> 
> 1. When memory has pressure, all shrinker of zswap_pools will
>    try to shrink its own lru, there is no order between them.
> 
> 2. When zswap limit hit, only the last zswap_pool's shrink_work
>    will try to shrink its lru, which is inefficient.

I think the rationale here was to try and empty the old pool first so
that we can completely drop it. However, since we only support exclusive
loads now, the LRU ordering should be entirely decided by the order of
stores, so I think the oldest entries on the LRU will naturally be the
from the oldest pool, right?

Probably worth stating this.

> 
> Anyway, having a global lru and shrinker shared by all zswap_pools
> is better and efficient.
> 
> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>

LGTM with a few comments, with those:
Acked-by: Yosry Ahmed <yosryahmed@google.com>

> ---
>  mm/zswap.c | 170 +++++++++++++++++++++++--------------------------------------
>  1 file changed, 65 insertions(+), 105 deletions(-)
> 
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 62fe307521c9..dbff67d7e1c7 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -176,14 +176,18 @@ struct zswap_pool {
>  	struct kref kref;
>  	struct list_head list;
>  	struct work_struct release_work;
> -	struct work_struct shrink_work;
>  	struct hlist_node node;
>  	char tfm_name[CRYPTO_MAX_ALG_NAME];
> +};
> +
> +static struct {
>  	struct list_lru list_lru;
> -	struct mem_cgroup *next_shrink;
> -	struct shrinker *shrinker;
>  	atomic_t nr_stored;
> -};
> +	struct shrinker *shrinker;
> +	struct work_struct shrink_work;
> +	struct mem_cgroup *next_shrink;
> +	spinlock_t shrink_lock;

The lock is exclusively protecting next_shrink, right? Perhaps we should
rename it to next_shrink_lock or at least document this.

> +} zswap;
>  
>  /*
>   * struct zswap_entry
> @@ -301,9 +305,6 @@ static void zswap_update_total_size(void)
>  * pool functions
>  **********************************/
>  
> -static void zswap_alloc_shrinker(struct zswap_pool *pool);
> -static void shrink_worker(struct work_struct *w);
> -
>  static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
>  {
>  	int i;
> @@ -353,30 +354,16 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
>  	if (ret)
>  		goto error;
>  
> -	zswap_alloc_shrinker(pool);
> -	if (!pool->shrinker)
> -		goto error;
> -
> -	pr_debug("using %s compressor\n", pool->tfm_name);

nit: the next patch introduces a new failure case between this debug
print and zswap_pool_debug() below, so it will become possible again
that we get one and not the other. Not a big deal though.

> -
>  	/* being the current pool takes 1 ref; this func expects the
>  	 * caller to always add the new pool as the current pool
>  	 */
>  	kref_init(&pool->kref);
>  	INIT_LIST_HEAD(&pool->list);
> -	if (list_lru_init_memcg(&pool->list_lru, pool->shrinker))
> -		goto lru_fail;
> -	shrinker_register(pool->shrinker);
> -	INIT_WORK(&pool->shrink_work, shrink_worker);
> -	atomic_set(&pool->nr_stored, 0);
>  
>  	zswap_pool_debug("created", pool);
>  
>  	return pool;
>  
> -lru_fail:
> -	list_lru_destroy(&pool->list_lru);
> -	shrinker_free(pool->shrinker);
>  error:
>  	if (pool->acomp_ctx)
>  		free_percpu(pool->acomp_ctx);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/2] mm/zswap: change zswap_pool kref to percpu_ref
  2024-02-14  8:54 ` [PATCH v2 2/2] mm/zswap: change zswap_pool kref to percpu_ref Chengming Zhou
@ 2024-02-14 20:10   ` Yosry Ahmed
  2024-02-16  8:49     ` Chengming Zhou
  0 siblings, 1 reply; 7+ messages in thread
From: Yosry Ahmed @ 2024-02-14 20:10 UTC (permalink / raw)
  To: Chengming Zhou
  Cc: Johannes Weiner, Andrew Morton, Nhat Pham, linux-mm, linux-kernel

On Wed, Feb 14, 2024 at 08:54:38AM +0000, Chengming Zhou wrote:
> All zswap entries will take a reference of zswap_pool when
> zswap_store(), and drop it when free. Change it to use the
> percpu_ref is better for scalability performance.
> 
> Although percpu_ref use a bit more memory which should be ok
> for our use case, since we almost have only one zswap_pool to
> be using. The performance gain is for zswap_store/load hotpath.
> 
> Testing kernel build (32 threads) in tmpfs with memory.max=2GB.
> (zswap shrinker and writeback enabled with one 50GB swapfile,
> on a 128 CPUs x86-64 machine, below is the average of 5 runs)
> 
>         mm-unstable  zswap-global-lru
> real    63.20        63.12
> user    1061.75      1062.95
> sys     268.74       264.44
> 
> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
> ---
>  mm/zswap.c | 31 ++++++++++++++++++++++---------
>  1 file changed, 22 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/zswap.c b/mm/zswap.c
> index dbff67d7e1c7..f6470d30d337 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -173,7 +173,7 @@ struct crypto_acomp_ctx {
>  struct zswap_pool {
>  	struct zpool *zpools[ZSWAP_NR_ZPOOLS];
>  	struct crypto_acomp_ctx __percpu *acomp_ctx;
> -	struct kref kref;
> +	struct percpu_ref ref;
>  	struct list_head list;
>  	struct work_struct release_work;
>  	struct hlist_node node;
> @@ -304,6 +304,7 @@ static void zswap_update_total_size(void)
>  /*********************************
>  * pool functions
>  **********************************/
> +static void __zswap_pool_empty(struct percpu_ref *ref);
>  
>  static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
>  {
> @@ -357,13 +358,18 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
>  	/* being the current pool takes 1 ref; this func expects the
>  	 * caller to always add the new pool as the current pool
>  	 */
> -	kref_init(&pool->kref);
> +	ret = percpu_ref_init(&pool->ref, __zswap_pool_empty,
> +			      PERCPU_REF_ALLOW_REINIT, GFP_KERNEL);
> +	if (ret)
> +		goto ref_fail;
>  	INIT_LIST_HEAD(&pool->list);
>  
>  	zswap_pool_debug("created", pool);
>  
>  	return pool;
>  
> +ref_fail:
> +	cpuhp_state_remove_instance(CPUHP_MM_ZSWP_POOL_PREPARE, &pool->node);
>  error:
>  	if (pool->acomp_ctx)
>  		free_percpu(pool->acomp_ctx);
> @@ -436,8 +442,9 @@ static void __zswap_pool_release(struct work_struct *work)
>  
>  	synchronize_rcu();
>  
> -	/* nobody should have been able to get a kref... */
> -	WARN_ON(kref_get_unless_zero(&pool->kref));
> +	/* nobody should have been able to get a ref... */
> +	WARN_ON(percpu_ref_tryget(&pool->ref));

Just curious, was there any value from using kref_get_unless_zero() over
kref_read() here? If not, I think percpu_ref_is_zero() is more
intuitive. This also seems like it fits more as a debug check.

> +	percpu_ref_exit(&pool->ref);
>  
>  	/* pool is now off zswap_pools list and has no references. */
>  	zswap_pool_destroy(pool);
> @@ -445,11 +452,11 @@ static void __zswap_pool_release(struct work_struct *work)
>  
>  static struct zswap_pool *zswap_pool_current(void);
>  
> -static void __zswap_pool_empty(struct kref *kref)
> +static void __zswap_pool_empty(struct percpu_ref *ref)
>  {
>  	struct zswap_pool *pool;
>  
> -	pool = container_of(kref, typeof(*pool), kref);
> +	pool = container_of(ref, typeof(*pool), ref);
>  
>  	spin_lock(&zswap_pools_lock);
>  
> @@ -468,12 +475,12 @@ static int __must_check zswap_pool_get(struct zswap_pool *pool)
>  	if (!pool)
>  		return 0;
>  
> -	return kref_get_unless_zero(&pool->kref);
> +	return percpu_ref_tryget(&pool->ref);
>  }
>  
>  static void zswap_pool_put(struct zswap_pool *pool)
>  {
> -	kref_put(&pool->kref, __zswap_pool_empty);
> +	percpu_ref_put(&pool->ref);
>  }
>  
>  static struct zswap_pool *__zswap_pool_current(void)
> @@ -603,6 +610,12 @@ static int __zswap_param_set(const char *val, const struct kernel_param *kp,
>  
>  	if (!pool)
>  		pool = zswap_pool_create(type, compressor);
> +	else {
> +		/* Resurrect percpu_ref to percpu mode. */
> +		percpu_ref_resurrect(&pool->ref);

I think this is not very clear. The previous code relied on the ref from
zswap_pool_find_get() to replace the initial ref that we had dropped
before. This is not needed with percpu_ref_resurrect() because it
already restores the initial ref dropped by percpu_ref_kill().

Perhaps something like:
		/*
		 * Restore the initial ref dropped by percpu_ref_kill()
		 * when the pool was decommissioned and switch it again
		 * to percpu mode.
		 /

, or am I overthinking this?

> +		/* Drop the ref from zswap_pool_find_get(). */
> +		zswap_pool_put(pool);
> +	}
>  
>  	if (pool)
>  		ret = param_set_charp(s, kp);
> @@ -641,7 +654,7 @@ static int __zswap_param_set(const char *val, const struct kernel_param *kp,
>  	 * or the new pool we failed to add
>  	 */
>  	if (put_pool)
> -		zswap_pool_put(put_pool);
> +		percpu_ref_kill(&put_pool->ref);
>  
>  	return ret;
>  }
> 
> -- 
> b4 0.10.1

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] mm/zswap: global lru and shrinker shared by all zswap_pools
  2024-02-14 19:20   ` Yosry Ahmed
@ 2024-02-16  8:47     ` Chengming Zhou
  0 siblings, 0 replies; 7+ messages in thread
From: Chengming Zhou @ 2024-02-16  8:47 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Johannes Weiner, Andrew Morton, Nhat Pham, linux-mm, linux-kernel

On 2024/2/15 03:20, Yosry Ahmed wrote:
> On Wed, Feb 14, 2024 at 08:54:37AM +0000, Chengming Zhou wrote:
>> Dynamic zswap_pool creation may create/reuse to have multiple
>> zswap_pools in a list, only the first will be current used.
>>
>> Each zswap_pool has its own lru and shrinker, which is not
>> necessary and has its problem:
>>
>> 1. When memory has pressure, all shrinker of zswap_pools will
>>    try to shrink its own lru, there is no order between them.
>>
>> 2. When zswap limit hit, only the last zswap_pool's shrink_work
>>    will try to shrink its lru, which is inefficient.
> 
> I think the rationale here was to try and empty the old pool first so
> that we can completely drop it. However, since we only support exclusive
> loads now, the LRU ordering should be entirely decided by the order of
> stores, so I think the oldest entries on the LRU will naturally be the
> from the oldest pool, right?
> 
> Probably worth stating this.

Right, will add your this part in the commit message.

> 
>>
>> Anyway, having a global lru and shrinker shared by all zswap_pools
>> is better and efficient.
>>
>> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
> 
> LGTM with a few comments, with those:
> Acked-by: Yosry Ahmed <yosryahmed@google.com>
> 
>> ---
>>  mm/zswap.c | 170 +++++++++++++++++++++++--------------------------------------
>>  1 file changed, 65 insertions(+), 105 deletions(-)
>>
>> diff --git a/mm/zswap.c b/mm/zswap.c
>> index 62fe307521c9..dbff67d7e1c7 100644
>> --- a/mm/zswap.c
>> +++ b/mm/zswap.c
>> @@ -176,14 +176,18 @@ struct zswap_pool {
>>  	struct kref kref;
>>  	struct list_head list;
>>  	struct work_struct release_work;
>> -	struct work_struct shrink_work;
>>  	struct hlist_node node;
>>  	char tfm_name[CRYPTO_MAX_ALG_NAME];
>> +};
>> +
>> +static struct {
>>  	struct list_lru list_lru;
>> -	struct mem_cgroup *next_shrink;
>> -	struct shrinker *shrinker;
>>  	atomic_t nr_stored;
>> -};
>> +	struct shrinker *shrinker;
>> +	struct work_struct shrink_work;
>> +	struct mem_cgroup *next_shrink;
>> +	spinlock_t shrink_lock;
> 
> The lock is exclusively protecting next_shrink, right? Perhaps we should
> rename it to next_shrink_lock or at least document this.

Ok, I will add a comment to it.

> 
>> +} zswap;
>>  
>>  /*
>>   * struct zswap_entry
>> @@ -301,9 +305,6 @@ static void zswap_update_total_size(void)
>>  * pool functions
>>  **********************************/
>>  
>> -static void zswap_alloc_shrinker(struct zswap_pool *pool);
>> -static void shrink_worker(struct work_struct *w);
>> -
>>  static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
>>  {
>>  	int i;
>> @@ -353,30 +354,16 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
>>  	if (ret)
>>  		goto error;
>>  
>> -	zswap_alloc_shrinker(pool);
>> -	if (!pool->shrinker)
>> -		goto error;
>> -
>> -	pr_debug("using %s compressor\n", pool->tfm_name);
> 
> nit: the next patch introduces a new failure case between this debug
> print and zswap_pool_debug() below, so it will become possible again
> that we get one and not the other. Not a big deal though.

Yes, not a big deal.

Thanks!

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/2] mm/zswap: change zswap_pool kref to percpu_ref
  2024-02-14 20:10   ` Yosry Ahmed
@ 2024-02-16  8:49     ` Chengming Zhou
  0 siblings, 0 replies; 7+ messages in thread
From: Chengming Zhou @ 2024-02-16  8:49 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Johannes Weiner, Andrew Morton, Nhat Pham, linux-mm, linux-kernel

On 2024/2/15 04:10, Yosry Ahmed wrote:
> On Wed, Feb 14, 2024 at 08:54:38AM +0000, Chengming Zhou wrote:
>> All zswap entries will take a reference of zswap_pool when
>> zswap_store(), and drop it when free. Change it to use the
>> percpu_ref is better for scalability performance.
>>
>> Although percpu_ref use a bit more memory which should be ok
>> for our use case, since we almost have only one zswap_pool to
>> be using. The performance gain is for zswap_store/load hotpath.
>>
>> Testing kernel build (32 threads) in tmpfs with memory.max=2GB.
>> (zswap shrinker and writeback enabled with one 50GB swapfile,
>> on a 128 CPUs x86-64 machine, below is the average of 5 runs)
>>
>>         mm-unstable  zswap-global-lru
>> real    63.20        63.12
>> user    1061.75      1062.95
>> sys     268.74       264.44
>>
>> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
>> ---
>>  mm/zswap.c | 31 ++++++++++++++++++++++---------
>>  1 file changed, 22 insertions(+), 9 deletions(-)
>>
>> diff --git a/mm/zswap.c b/mm/zswap.c
>> index dbff67d7e1c7..f6470d30d337 100644
>> --- a/mm/zswap.c
>> +++ b/mm/zswap.c
>> @@ -173,7 +173,7 @@ struct crypto_acomp_ctx {
>>  struct zswap_pool {
>>  	struct zpool *zpools[ZSWAP_NR_ZPOOLS];
>>  	struct crypto_acomp_ctx __percpu *acomp_ctx;
>> -	struct kref kref;
>> +	struct percpu_ref ref;
>>  	struct list_head list;
>>  	struct work_struct release_work;
>>  	struct hlist_node node;
>> @@ -304,6 +304,7 @@ static void zswap_update_total_size(void)
>>  /*********************************
>>  * pool functions
>>  **********************************/
>> +static void __zswap_pool_empty(struct percpu_ref *ref);
>>  
>>  static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
>>  {
>> @@ -357,13 +358,18 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
>>  	/* being the current pool takes 1 ref; this func expects the
>>  	 * caller to always add the new pool as the current pool
>>  	 */
>> -	kref_init(&pool->kref);
>> +	ret = percpu_ref_init(&pool->ref, __zswap_pool_empty,
>> +			      PERCPU_REF_ALLOW_REINIT, GFP_KERNEL);
>> +	if (ret)
>> +		goto ref_fail;
>>  	INIT_LIST_HEAD(&pool->list);
>>  
>>  	zswap_pool_debug("created", pool);
>>  
>>  	return pool;
>>  
>> +ref_fail:
>> +	cpuhp_state_remove_instance(CPUHP_MM_ZSWP_POOL_PREPARE, &pool->node);
>>  error:
>>  	if (pool->acomp_ctx)
>>  		free_percpu(pool->acomp_ctx);
>> @@ -436,8 +442,9 @@ static void __zswap_pool_release(struct work_struct *work)
>>  
>>  	synchronize_rcu();
>>  
>> -	/* nobody should have been able to get a kref... */
>> -	WARN_ON(kref_get_unless_zero(&pool->kref));
>> +	/* nobody should have been able to get a ref... */
>> +	WARN_ON(percpu_ref_tryget(&pool->ref));
> 
> Just curious, was there any value from using kref_get_unless_zero() over
> kref_read() here? If not, I think percpu_ref_is_zero() is more
> intuitive. This also seems like it fits more as a debug check.

Agree, percpu_ref_is_zero() is better for debug.

> 
>> +	percpu_ref_exit(&pool->ref);
>>  
>>  	/* pool is now off zswap_pools list and has no references. */
>>  	zswap_pool_destroy(pool);
>> @@ -445,11 +452,11 @@ static void __zswap_pool_release(struct work_struct *work)
>>  
>>  static struct zswap_pool *zswap_pool_current(void);
>>  
>> -static void __zswap_pool_empty(struct kref *kref)
>> +static void __zswap_pool_empty(struct percpu_ref *ref)
>>  {
>>  	struct zswap_pool *pool;
>>  
>> -	pool = container_of(kref, typeof(*pool), kref);
>> +	pool = container_of(ref, typeof(*pool), ref);
>>  
>>  	spin_lock(&zswap_pools_lock);
>>  
>> @@ -468,12 +475,12 @@ static int __must_check zswap_pool_get(struct zswap_pool *pool)
>>  	if (!pool)
>>  		return 0;
>>  
>> -	return kref_get_unless_zero(&pool->kref);
>> +	return percpu_ref_tryget(&pool->ref);
>>  }
>>  
>>  static void zswap_pool_put(struct zswap_pool *pool)
>>  {
>> -	kref_put(&pool->kref, __zswap_pool_empty);
>> +	percpu_ref_put(&pool->ref);
>>  }
>>  
>>  static struct zswap_pool *__zswap_pool_current(void)
>> @@ -603,6 +610,12 @@ static int __zswap_param_set(const char *val, const struct kernel_param *kp,
>>  
>>  	if (!pool)
>>  		pool = zswap_pool_create(type, compressor);
>> +	else {
>> +		/* Resurrect percpu_ref to percpu mode. */
>> +		percpu_ref_resurrect(&pool->ref);
> 
> I think this is not very clear. The previous code relied on the ref from
> zswap_pool_find_get() to replace the initial ref that we had dropped
> before. This is not needed with percpu_ref_resurrect() because it
> already restores the initial ref dropped by percpu_ref_kill().
> 
> Perhaps something like:
> 		/*
> 		 * Restore the initial ref dropped by percpu_ref_kill()
> 		 * when the pool was decommissioned and switch it again
> 		 * to percpu mode.
> 		 /
> 

Ok, will add this comment, it's clearer.

Thanks!

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-02-16  8:49 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-14  8:54 [PATCH v2 0/2] mm/zswap: optimize for dynamic zswap_pools Chengming Zhou
2024-02-14  8:54 ` [PATCH v2 1/2] mm/zswap: global lru and shrinker shared by all zswap_pools Chengming Zhou
2024-02-14 19:20   ` Yosry Ahmed
2024-02-16  8:47     ` Chengming Zhou
2024-02-14  8:54 ` [PATCH v2 2/2] mm/zswap: change zswap_pool kref to percpu_ref Chengming Zhou
2024-02-14 20:10   ` Yosry Ahmed
2024-02-16  8:49     ` Chengming Zhou

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.