From: Chulmin Kim <cmlaika.kim@samsung.com>
To: Minchan Kim <minchan@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Subject: Re: [PATCH v7 11/12] zsmalloc: page migration support
Date: Wed, 18 Jan 2017 19:13:21 -0500	[thread overview]
Message-ID: <afd38699-f1c4-f63f-7362-29c514e9ffb4@samsung.com> (raw)
In-Reply-To: <1464736881-24886-12-git-send-email-minchan@kernel.org>

Hello Minchan, and all zsmalloc folks.

I have a quick question: does zsmalloc handle memory barriers correctly?

AFAIK, on ARM64, zsmalloc relies only on the dmb in bit_spin_unlock.
(It seems that dmb operations in the spinlock functions are being
prepared, but let's set that aside since that work is not merged yet.)

If I am correct, migrating a page in a zspage filled with free objects
may cause corruption, because bit_spin_unlock will not be executed at all
in that case.

I am not sure this provides enough memory barriers for zsmalloc operations.
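
To make the concern concrete, here is a minimal userspace sketch (my own
illustration, not zsmalloc code; all names in it are made up) of the
ordering pattern I have in mind. Thread A copies data and then publishes
a pointer to it with a relaxed store -- the case where no bit_spin_unlock
(and hence no dmb) runs -- while thread B consumes it. On a weakly
ordered CPU such as ARM64, B may observe the pointer before the copied
contents; an unlock-style release store on the publishing side is what
would forbid that.

/* Illustrative userspace analogy only -- not zsmalloc code. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

static char old_page[64] = "old contents";
static char new_page[64];
static _Atomic(char *) published;

static void *migrate(void *arg)
{
	/* "copy the page", then publish the new location */
	memcpy(new_page, old_page, sizeof(new_page));
	/* relaxed store: nothing orders the memcpy before the publish */
	atomic_store_explicit(&published, new_page, memory_order_relaxed);
	return NULL;
}

static void *reader(void *arg)
{
	char *p;

	while (!(p = atomic_load_explicit(&published, memory_order_acquire)))
		;	/* spin until the new page is published */
	/*
	 * Without a release on the store side, these contents may be stale
	 * on a weakly ordered machine; memory_order_release above would
	 * pair with this acquire and rule that out.
	 */
	printf("saw: %s\n", p);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&b, NULL, reader, NULL);
	pthread_create(&a, NULL, migrate, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}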

Can you enlighten me?


Thanks!
Chulmin Kim



On 05/31/2016 07:21 PM, Minchan Kim wrote:
> This patch introduces a run-time migration feature for zspages.
>
> For migration, the VM uses the page.lru field, so it is better not to
> use the page.next field, which is unified with page.lru, for our own
> purpose. To that end, we first obtain the first object offset of a page
> via runtime calculation instead of storing it in page.index, so that
> page.index can be used as the link for page chaining instead of
> page.next.
>
> In the case of a huge object, the handle is stored in page.index
> instead of the next link of the page chain, because a huge object does
> not need a next link. So get_next_page needs to identify a huge object
> and return NULL for it; for that, this patch uses the PG_owner_priv_1
> page flag.
>
> For migration, it supports three functions:
>
> * zs_page_isolate
>
> It isolates a zspage which includes a subpage the VM wants to migrate
> from its size class, so that no one can allocate a new object from the
> zspage.
>
> A zspage can be isolated several times, once per subpage, so a
> subsequent isolation attempt for another subpage of the zspage should
> not fail. For that, we introduce a zspage.isolated count. With it,
> zs_page_isolate can tell whether the zspage is already isolated for
> migration; if it is, the subsequent isolation attempt succeeds without
> isolating it again.
>
> * zs_page_migrate
>
> First of all, it holds the write side of zspage->lock to prevent
> migration of other subpages in the zspage. Then it locks all objects in
> the page the VM wants to migrate. The reason we should lock all objects
> in the page is a race between zs_map_object and zs_page_migrate:
>
> zs_map_object				zs_page_migrate
>
> pin_tag(handle)
> obj = handle_to_obj(handle)
> obj_to_location(obj, &page, &obj_idx);
>
> 					write_lock(&zspage->lock)
> 					if (!trypin_tag(handle))
> 						goto unpin_object
>
> zspage = get_zspage(page);
> read_lock(&zspage->lock);
>
> If zs_page_migrate didn't do trypin_tag, zs_map_object's page could
> become stale due to migration, leading to a crash.
>
> If it locks all of the objects successfully, it copies the content from
> the old page to the new one and, finally, creates a new zspage chain
> with the new page. And if it is the last isolated subpage in the
> zspage, it puts the zspage back to its class.
>
> * zs_page_putback
>
> It returns an isolated zspage to the right fullness_group list if it
> fails to migrate a page. If it finds the zspage is ZS_EMPTY, it queues
> the zspage freeing to a workqueue. See below about async zspage
> freeing.
>
> This patch introduces asynchronous zspage freeing. We need it because
> clearing PG_movable requires the page lock but, unfortunately, the
> zs_free path must be atomic; so the approach is to try to grab the page
> lock. If it gets the page lock of all pages successfully, it can free
> the zspage immediately. Otherwise, it queues a free request and frees
> the zspage via a workqueue in process context.
>
> If zs_free finds the zspage is isolated when it tries to free it, it
> delays the freeing until zs_page_putback finds it, at which point the
> zspage is finally freed.
>
> In this patch, we expand the fullness_list range from ZS_EMPTY to
> ZS_FULL. The ZS_EMPTY list is used for delayed freeing, and with the
> ZS_FULL list added, we can identify whether a zspage is isolated or not
> via a list_empty(&zspage->list) test.
>
> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  include/uapi/linux/magic.h |   1 +
>  mm/zsmalloc.c              | 793 ++++++++++++++++++++++++++++++++++++++-------
>  2 files changed, 672 insertions(+), 122 deletions(-)
>
> diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
> index d829ce63529d..e398beac67b8 100644
> --- a/include/uapi/linux/magic.h
> +++ b/include/uapi/linux/magic.h
> @@ -81,5 +81,6 @@
>  /* Since UDF 2.01 is ISO 13346 based... */
>  #define UDF_SUPER_MAGIC		0x15013346
>  #define BALLOON_KVM_MAGIC	0x13661366
> +#define ZSMALLOC_MAGIC		0x58295829
>
>  #endif /* __LINUX_MAGIC_H__ */
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index c6fb543cfb98..a80100db16d6 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -17,14 +17,14 @@
>   *
>   * Usage of struct page fields:
>   *	page->private: points to zspage
> - *	page->index: offset of the first object starting in this page.
> - *		For the first page, this is always 0, so we use this field
> - *		to store handle for huge object.
> - *	page->next: links together all component pages of a zspage
> + *	page->freelist(index): links together all component pages of a zspage
> + *		For the huge page, this is always 0, so we use this field
> + *		to store handle.
>   *
>   * Usage of struct page flags:
>   *	PG_private: identifies the first component page
>   *	PG_private2: identifies the last component page
> + *	PG_owner_priv_1: indentifies the huge component page
>   *
>   */
>
> @@ -49,6 +49,11 @@
>  #include <linux/debugfs.h>
>  #include <linux/zsmalloc.h>
>  #include <linux/zpool.h>
> +#include <linux/mount.h>
> +#include <linux/compaction.h>
> +#include <linux/pagemap.h>
> +
> +#define ZSPAGE_MAGIC	0x58
>
>  /*
>   * This must be power of 2 and greater than of equal to sizeof(link_free).
> @@ -136,25 +141,23 @@
>   * We do not maintain any list for completely empty or full pages
>   */
>  enum fullness_group {
> -	ZS_ALMOST_FULL,
> -	ZS_ALMOST_EMPTY,
>  	ZS_EMPTY,
> -	ZS_FULL
> +	ZS_ALMOST_EMPTY,
> +	ZS_ALMOST_FULL,
> +	ZS_FULL,
> +	NR_ZS_FULLNESS,
>  };
>
>  enum zs_stat_type {
> +	CLASS_EMPTY,
> +	CLASS_ALMOST_EMPTY,
> +	CLASS_ALMOST_FULL,
> +	CLASS_FULL,
>  	OBJ_ALLOCATED,
>  	OBJ_USED,
> -	CLASS_ALMOST_FULL,
> -	CLASS_ALMOST_EMPTY,
> +	NR_ZS_STAT_TYPE,
>  };
>
> -#ifdef CONFIG_ZSMALLOC_STAT
> -#define NR_ZS_STAT_TYPE	(CLASS_ALMOST_EMPTY + 1)
> -#else
> -#define NR_ZS_STAT_TYPE	(OBJ_USED + 1)
> -#endif
> -
>  struct zs_size_stat {
>  	unsigned long objs[NR_ZS_STAT_TYPE];
>  };
> @@ -163,6 +166,10 @@ struct zs_size_stat {
>  static struct dentry *zs_stat_root;
>  #endif
>
> +#ifdef CONFIG_COMPACTION
> +static struct vfsmount *zsmalloc_mnt;
> +#endif
> +
>  /*
>   * number of size_classes
>   */
> @@ -186,23 +193,36 @@ static const int fullness_threshold_frac = 4;
>
>  struct size_class {
>  	spinlock_t lock;
> -	struct list_head fullness_list[2];
> +	struct list_head fullness_list[NR_ZS_FULLNESS];
>  	/*
>  	 * Size of objects stored in this class. Must be multiple
>  	 * of ZS_ALIGN.
>  	 */
>  	int size;
>  	int objs_per_zspage;
> -	unsigned int index;
> -
> -	struct zs_size_stat stats;
> -
>  	/* Number of PAGE_SIZE sized pages to combine to form a 'zspage' */
>  	int pages_per_zspage;
> -	/* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */
> -	bool huge;
> +
> +	unsigned int index;
> +	struct zs_size_stat stats;
>  };
>
> +/* huge object: pages_per_zspage == 1 && maxobj_per_zspage == 1 */
> +static void SetPageHugeObject(struct page *page)
> +{
> +	SetPageOwnerPriv1(page);
> +}
> +
> +static void ClearPageHugeObject(struct page *page)
> +{
> +	ClearPageOwnerPriv1(page);
> +}
> +
> +static int PageHugeObject(struct page *page)
> +{
> +	return PageOwnerPriv1(page);
> +}
> +
>  /*
>   * Placed within free objects to form a singly linked list.
>   * For every zspage, zspage->freeobj gives head of this list.
> @@ -244,6 +264,10 @@ struct zs_pool {
>  #ifdef CONFIG_ZSMALLOC_STAT
>  	struct dentry *stat_dentry;
>  #endif
> +#ifdef CONFIG_COMPACTION
> +	struct inode *inode;
> +	struct work_struct free_work;
> +#endif
>  };
>
>  /*
> @@ -252,16 +276,23 @@ struct zs_pool {
>   */
>  #define FULLNESS_BITS	2
>  #define CLASS_BITS	8
> +#define ISOLATED_BITS	3
> +#define MAGIC_VAL_BITS	8
>
>  struct zspage {
>  	struct {
>  		unsigned int fullness:FULLNESS_BITS;
>  		unsigned int class:CLASS_BITS;
> +		unsigned int isolated:ISOLATED_BITS;
> +		unsigned int magic:MAGIC_VAL_BITS;
>  	};
>  	unsigned int inuse;
>  	unsigned int freeobj;
>  	struct page *first_page;
>  	struct list_head list; /* fullness list */
> +#ifdef CONFIG_COMPACTION
> +	rwlock_t lock;
> +#endif
>  };
>
>  struct mapping_area {
> @@ -274,6 +305,28 @@ struct mapping_area {
>  	enum zs_mapmode vm_mm; /* mapping mode */
>  };
>
> +#ifdef CONFIG_COMPACTION
> +static int zs_register_migration(struct zs_pool *pool);
> +static void zs_unregister_migration(struct zs_pool *pool);
> +static void migrate_lock_init(struct zspage *zspage);
> +static void migrate_read_lock(struct zspage *zspage);
> +static void migrate_read_unlock(struct zspage *zspage);
> +static void kick_deferred_free(struct zs_pool *pool);
> +static void init_deferred_free(struct zs_pool *pool);
> +static void SetZsPageMovable(struct zs_pool *pool, struct zspage *zspage);
> +#else
> +static int zsmalloc_mount(void) { return 0; }
> +static void zsmalloc_unmount(void) {}
> +static int zs_register_migration(struct zs_pool *pool) { return 0; }
> +static void zs_unregister_migration(struct zs_pool *pool) {}
> +static void migrate_lock_init(struct zspage *zspage) {}
> +static void migrate_read_lock(struct zspage *zspage) {}
> +static void migrate_read_unlock(struct zspage *zspage) {}
> +static void kick_deferred_free(struct zs_pool *pool) {}
> +static void init_deferred_free(struct zs_pool *pool) {}
> +static void SetZsPageMovable(struct zs_pool *pool, struct zspage *zspage) {}
> +#endif
> +
>  static int create_cache(struct zs_pool *pool)
>  {
>  	pool->handle_cachep = kmem_cache_create("zs_handle", ZS_HANDLE_SIZE,
> @@ -301,7 +354,7 @@ static void destroy_cache(struct zs_pool *pool)
>  static unsigned long cache_alloc_handle(struct zs_pool *pool, gfp_t gfp)
>  {
>  	return (unsigned long)kmem_cache_alloc(pool->handle_cachep,
> -			gfp & ~__GFP_HIGHMEM);
> +			gfp & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
>  }
>
>  static void cache_free_handle(struct zs_pool *pool, unsigned long handle)
> @@ -311,7 +364,8 @@ static void cache_free_handle(struct zs_pool *pool, unsigned long handle)
>
>  static struct zspage *cache_alloc_zspage(struct zs_pool *pool, gfp_t flags)
>  {
> -	return kmem_cache_alloc(pool->zspage_cachep, flags & ~__GFP_HIGHMEM);
> +	return kmem_cache_alloc(pool->zspage_cachep,
> +			flags & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
>  };
>
>  static void cache_free_zspage(struct zs_pool *pool, struct zspage *zspage)
> @@ -421,11 +475,17 @@ static unsigned int get_maxobj_per_zspage(int size, int pages_per_zspage)
>  /* per-cpu VM mapping areas for zspage accesses that cross page boundaries */
>  static DEFINE_PER_CPU(struct mapping_area, zs_map_area);
>
> +static bool is_zspage_isolated(struct zspage *zspage)
> +{
> +	return zspage->isolated;
> +}
> +
>  static int is_first_page(struct page *page)
>  {
>  	return PagePrivate(page);
>  }
>
> +/* Protected by class->lock */
>  static inline int get_zspage_inuse(struct zspage *zspage)
>  {
>  	return zspage->inuse;
> @@ -441,20 +501,12 @@ static inline void mod_zspage_inuse(struct zspage *zspage, int val)
>  	zspage->inuse += val;
>  }
>
> -static inline int get_first_obj_offset(struct page *page)
> +static inline struct page *get_first_page(struct zspage *zspage)
>  {
> -	if (is_first_page(page))
> -		return 0;
> +	struct page *first_page = zspage->first_page;
>
> -	return page->index;
> -}
> -
> -static inline void set_first_obj_offset(struct page *page, int offset)
> -{
> -	if (is_first_page(page))
> -		return;
> -
> -	page->index = offset;
> +	VM_BUG_ON_PAGE(!is_first_page(first_page), first_page);
> +	return first_page;
>  }
>
>  static inline unsigned int get_freeobj(struct zspage *zspage)
> @@ -471,6 +523,8 @@ static void get_zspage_mapping(struct zspage *zspage,
>  				unsigned int *class_idx,
>  				enum fullness_group *fullness)
>  {
> +	VM_BUG_ON(zspage->magic != ZSPAGE_MAGIC);
> +
>  	*fullness = zspage->fullness;
>  	*class_idx = zspage->class;
>  }
> @@ -504,23 +558,19 @@ static int get_size_class_index(int size)
>  static inline void zs_stat_inc(struct size_class *class,
>  				enum zs_stat_type type, unsigned long cnt)
>  {
> -	if (type < NR_ZS_STAT_TYPE)
> -		class->stats.objs[type] += cnt;
> +	class->stats.objs[type] += cnt;
>  }
>
>  static inline void zs_stat_dec(struct size_class *class,
>  				enum zs_stat_type type, unsigned long cnt)
>  {
> -	if (type < NR_ZS_STAT_TYPE)
> -		class->stats.objs[type] -= cnt;
> +	class->stats.objs[type] -= cnt;
>  }
>
>  static inline unsigned long zs_stat_get(struct size_class *class,
>  				enum zs_stat_type type)
>  {
> -	if (type < NR_ZS_STAT_TYPE)
> -		return class->stats.objs[type];
> -	return 0;
> +	return class->stats.objs[type];
>  }
>
>  #ifdef CONFIG_ZSMALLOC_STAT
> @@ -664,6 +714,7 @@ static inline void zs_pool_stat_destroy(struct zs_pool *pool)
>  }
>  #endif
>
> +
>  /*
>   * For each size class, zspages are divided into different groups
>   * depending on how "full" they are. This was done so that we could
> @@ -704,15 +755,9 @@ static void insert_zspage(struct size_class *class,
>  {
>  	struct zspage *head;
>
> -	if (fullness >= ZS_EMPTY)
> -		return;
> -
> +	zs_stat_inc(class, fullness, 1);
>  	head = list_first_entry_or_null(&class->fullness_list[fullness],
>  					struct zspage, list);
> -
> -	zs_stat_inc(class, fullness == ZS_ALMOST_EMPTY ?
> -			CLASS_ALMOST_EMPTY : CLASS_ALMOST_FULL, 1);
> -
>  	/*
>  	 * We want to see more ZS_FULL pages and less almost empty/full.
>  	 * Put pages with higher ->inuse first.
> @@ -734,14 +779,11 @@ static void remove_zspage(struct size_class *class,
>  				struct zspage *zspage,
>  				enum fullness_group fullness)
>  {
> -	if (fullness >= ZS_EMPTY)
> -		return;
> -
>  	VM_BUG_ON(list_empty(&class->fullness_list[fullness]));
> +	VM_BUG_ON(is_zspage_isolated(zspage));
>
>  	list_del_init(&zspage->list);
> -	zs_stat_dec(class, fullness == ZS_ALMOST_EMPTY ?
> -			CLASS_ALMOST_EMPTY : CLASS_ALMOST_FULL, 1);
> +	zs_stat_dec(class, fullness, 1);
>  }
>
>  /*
> @@ -764,8 +806,11 @@ static enum fullness_group fix_fullness_group(struct size_class *class,
>  	if (newfg == currfg)
>  		goto out;
>
> -	remove_zspage(class, zspage, currfg);
> -	insert_zspage(class, zspage, newfg);
> +	if (!is_zspage_isolated(zspage)) {
> +		remove_zspage(class, zspage, currfg);
> +		insert_zspage(class, zspage, newfg);
> +	}
> +
>  	set_zspage_mapping(zspage, class_idx, newfg);
>
>  out:
> @@ -808,19 +853,45 @@ static int get_pages_per_zspage(int class_size)
>  	return max_usedpc_order;
>  }
>
> -static struct page *get_first_page(struct zspage *zspage)
> +static struct zspage *get_zspage(struct page *page)
>  {
> -	return zspage->first_page;
> +	struct zspage *zspage = (struct zspage *)page->private;
> +
> +	VM_BUG_ON(zspage->magic != ZSPAGE_MAGIC);
> +	return zspage;
>  }
>
> -static struct zspage *get_zspage(struct page *page)
> +static struct page *get_next_page(struct page *page)
>  {
> -	return (struct zspage *)page->private;
> +	if (unlikely(PageHugeObject(page)))
> +		return NULL;
> +
> +	return page->freelist;
>  }
>
> -static struct page *get_next_page(struct page *page)
> +/* Get byte offset of first object in the @page */
> +static int get_first_obj_offset(struct size_class *class,
> +				struct page *first_page, struct page *page)
>  {
> -	return page->next;
> +	int pos;
> +	int page_idx = 0;
> +	int ofs = 0;
> +	struct page *cursor = first_page;
> +
> +	if (first_page == page)
> +		goto out;
> +
> +	while (page != cursor) {
> +		page_idx++;
> +		cursor = get_next_page(cursor);
> +	}
> +
> +	pos = class->objs_per_zspage * class->size *
> +		page_idx / class->pages_per_zspage;
> +
> +	ofs = (pos + class->size) % PAGE_SIZE;
> +out:
> +	return ofs;
>  }
>
>  /**
> @@ -857,16 +928,20 @@ static unsigned long handle_to_obj(unsigned long handle)
>  	return *(unsigned long *)handle;
>  }
>
> -static unsigned long obj_to_head(struct size_class *class, struct page *page,
> -			void *obj)
> +static unsigned long obj_to_head(struct page *page, void *obj)
>  {
> -	if (class->huge) {
> +	if (unlikely(PageHugeObject(page))) {
>  		VM_BUG_ON_PAGE(!is_first_page(page), page);
>  		return page->index;
>  	} else
>  		return *(unsigned long *)obj;
>  }
>
> +static inline int testpin_tag(unsigned long handle)
> +{
> +	return bit_spin_is_locked(HANDLE_PIN_BIT, (unsigned long *)handle);
> +}
> +
>  static inline int trypin_tag(unsigned long handle)
>  {
>  	return bit_spin_trylock(HANDLE_PIN_BIT, (unsigned long *)handle);
> @@ -884,27 +959,93 @@ static void unpin_tag(unsigned long handle)
>
>  static void reset_page(struct page *page)
>  {
> +	__ClearPageMovable(page);
>  	clear_bit(PG_private, &page->flags);
>  	clear_bit(PG_private_2, &page->flags);
>  	set_page_private(page, 0);
> -	page->index = 0;
> +	ClearPageHugeObject(page);
> +	page->freelist = NULL;
>  }
>
> -static void free_zspage(struct zs_pool *pool, struct zspage *zspage)
> +/*
> + * To prevent zspage destroy during migration, zspage freeing should
> + * hold locks of all pages in the zspage.
> + */
> +void lock_zspage(struct zspage *zspage)
> +{
> +	struct page *page = get_first_page(zspage);
> +
> +	do {
> +		lock_page(page);
> +	} while ((page = get_next_page(page)) != NULL);
> +}
> +
> +int trylock_zspage(struct zspage *zspage)
> +{
> +	struct page *cursor, *fail;
> +
> +	for (cursor = get_first_page(zspage); cursor != NULL; cursor =
> +					get_next_page(cursor)) {
> +		if (!trylock_page(cursor)) {
> +			fail = cursor;
> +			goto unlock;
> +		}
> +	}
> +
> +	return 1;
> +unlock:
> +	for (cursor = get_first_page(zspage); cursor != fail; cursor =
> +					get_next_page(cursor))
> +		unlock_page(cursor);
> +
> +	return 0;
> +}
> +
> +static void __free_zspage(struct zs_pool *pool, struct size_class *class,
> +				struct zspage *zspage)
>  {
>  	struct page *page, *next;
> +	enum fullness_group fg;
> +	unsigned int class_idx;
> +
> +	get_zspage_mapping(zspage, &class_idx, &fg);
> +
> +	assert_spin_locked(&class->lock);
>
>  	VM_BUG_ON(get_zspage_inuse(zspage));
> +	VM_BUG_ON(fg != ZS_EMPTY);
>
> -	next = page = zspage->first_page;
> +	next = page = get_first_page(zspage);
>  	do {
> -		next = page->next;
> +		VM_BUG_ON_PAGE(!PageLocked(page), page);
> +		next = get_next_page(page);
>  		reset_page(page);
> +		unlock_page(page);
>  		put_page(page);
>  		page = next;
>  	} while (page != NULL);
>
>  	cache_free_zspage(pool, zspage);
> +
> +	zs_stat_dec(class, OBJ_ALLOCATED, get_maxobj_per_zspage(
> +			class->size, class->pages_per_zspage));
> +	atomic_long_sub(class->pages_per_zspage,
> +					&pool->pages_allocated);
> +}
> +
> +static void free_zspage(struct zs_pool *pool, struct size_class *class,
> +				struct zspage *zspage)
> +{
> +	VM_BUG_ON(get_zspage_inuse(zspage));
> +	VM_BUG_ON(list_empty(&zspage->list));
> +
> +	if (!trylock_zspage(zspage)) {
> +		kick_deferred_free(pool);
> +		return;
> +	}
> +
> +	remove_zspage(class, zspage, ZS_EMPTY);
> +	__free_zspage(pool, class, zspage);
>  }
>
>  /* Initialize a newly allocated zspage */
> @@ -912,15 +1053,13 @@ static void init_zspage(struct size_class *class, struct zspage *zspage)
>  {
>  	unsigned int freeobj = 1;
>  	unsigned long off = 0;
> -	struct page *page = zspage->first_page;
> +	struct page *page = get_first_page(zspage);
>
>  	while (page) {
>  		struct page *next_page;
>  		struct link_free *link;
>  		void *vaddr;
>
> -		set_first_obj_offset(page, off);
> -
>  		vaddr = kmap_atomic(page);
>  		link = (struct link_free *)vaddr + off / sizeof(*link);
>
> @@ -952,16 +1091,17 @@ static void init_zspage(struct size_class *class, struct zspage *zspage)
>  	set_freeobj(zspage, 0);
>  }
>
> -static void create_page_chain(struct zspage *zspage, struct page *pages[],
> -				int nr_pages)
> +static void create_page_chain(struct size_class *class, struct zspage *zspage,
> +				struct page *pages[])
>  {
>  	int i;
>  	struct page *page;
>  	struct page *prev_page = NULL;
> +	int nr_pages = class->pages_per_zspage;
>
>  	/*
>  	 * Allocate individual pages and link them together as:
> -	 * 1. all pages are linked together using page->next
> +	 * 1. all pages are linked together using page->freelist
>  	 * 2. each sub-page point to zspage using page->private
>  	 *
>  	 * we set PG_private to identify the first page (i.e. no other sub-page
> @@ -970,16 +1110,18 @@ static void create_page_chain(struct zspage *zspage, struct page *pages[],
>  	for (i = 0; i < nr_pages; i++) {
>  		page = pages[i];
>  		set_page_private(page, (unsigned long)zspage);
> +		page->freelist = NULL;
>  		if (i == 0) {
>  			zspage->first_page = page;
>  			SetPagePrivate(page);
> +			if (unlikely(class->objs_per_zspage == 1 &&
> +					class->pages_per_zspage == 1))
> +				SetPageHugeObject(page);
>  		} else {
> -			prev_page->next = page;
> +			prev_page->freelist = page;
>  		}
> -		if (i == nr_pages - 1) {
> +		if (i == nr_pages - 1)
>  			SetPagePrivate2(page);
> -			page->next = NULL;
> -		}
>  		prev_page = page;
>  	}
>  }
> @@ -999,6 +1141,8 @@ static struct zspage *alloc_zspage(struct zs_pool *pool,
>  		return NULL;
>
>  	memset(zspage, 0, sizeof(struct zspage));
> +	zspage->magic = ZSPAGE_MAGIC;
> +	migrate_lock_init(zspage);
>
>  	for (i = 0; i < class->pages_per_zspage; i++) {
>  		struct page *page;
> @@ -1013,7 +1157,7 @@ static struct zspage *alloc_zspage(struct zs_pool *pool,
>  		pages[i] = page;
>  	}
>
> -	create_page_chain(zspage, pages, class->pages_per_zspage);
> +	create_page_chain(class, zspage, pages);
>  	init_zspage(class, zspage);
>
>  	return zspage;
> @@ -1024,7 +1168,7 @@ static struct zspage *find_get_zspage(struct size_class *class)
>  	int i;
>  	struct zspage *zspage;
>
> -	for (i = ZS_ALMOST_FULL; i <= ZS_ALMOST_EMPTY; i++) {
> +	for (i = ZS_ALMOST_FULL; i >= ZS_EMPTY; i--) {
>  		zspage = list_first_entry_or_null(&class->fullness_list[i],
>  				struct zspage, list);
>  		if (zspage)
> @@ -1289,6 +1433,10 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
>  	obj = handle_to_obj(handle);
>  	obj_to_location(obj, &page, &obj_idx);
>  	zspage = get_zspage(page);
> +
> +	/* migration cannot move any subpage in this zspage */
> +	migrate_read_lock(zspage);
> +
>  	get_zspage_mapping(zspage, &class_idx, &fg);
>  	class = pool->size_class[class_idx];
>  	off = (class->size * obj_idx) & ~PAGE_MASK;
> @@ -1309,7 +1457,7 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
>
>  	ret = __zs_map_object(area, pages, off, class->size);
>  out:
> -	if (!class->huge)
> +	if (likely(!PageHugeObject(page)))
>  		ret += ZS_HANDLE_SIZE;
>
>  	return ret;
> @@ -1348,6 +1496,8 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
>  		__zs_unmap_object(area, pages, off, class->size);
>  	}
>  	put_cpu_var(zs_map_area);
> +
> +	migrate_read_unlock(zspage);
>  	unpin_tag(handle);
>  }
>  EXPORT_SYMBOL_GPL(zs_unmap_object);
> @@ -1377,7 +1527,7 @@ static unsigned long obj_malloc(struct size_class *class,
>  	vaddr = kmap_atomic(m_page);
>  	link = (struct link_free *)vaddr + m_offset / sizeof(*link);
>  	set_freeobj(zspage, link->next >> OBJ_ALLOCATED_TAG);
> -	if (!class->huge)
> +	if (likely(!PageHugeObject(m_page)))
>  		/* record handle in the header of allocated chunk */
>  		link->handle = handle;
>  	else
> @@ -1407,6 +1557,7 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
>  {
>  	unsigned long handle, obj;
>  	struct size_class *class;
> +	enum fullness_group newfg;
>  	struct zspage *zspage;
>
>  	if (unlikely(!size || size > ZS_MAX_ALLOC_SIZE))
> @@ -1422,28 +1573,37 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
>
>  	spin_lock(&class->lock);
>  	zspage = find_get_zspage(class);
> -
> -	if (!zspage) {
> +	if (likely(zspage)) {
> +		obj = obj_malloc(class, zspage, handle);
> +		/* Now move the zspage to another fullness group, if required */
> +		fix_fullness_group(class, zspage);
> +		record_obj(handle, obj);
>  		spin_unlock(&class->lock);
> -		zspage = alloc_zspage(pool, class, gfp);
> -		if (unlikely(!zspage)) {
> -			cache_free_handle(pool, handle);
> -			return 0;
> -		}
>
> -		set_zspage_mapping(zspage, class->index, ZS_EMPTY);
> -		atomic_long_add(class->pages_per_zspage,
> -					&pool->pages_allocated);
> +		return handle;
> +	}
>
> -		spin_lock(&class->lock);
> -		zs_stat_inc(class, OBJ_ALLOCATED, get_maxobj_per_zspage(
> -				class->size, class->pages_per_zspage));
> +	spin_unlock(&class->lock);
> +
> +	zspage = alloc_zspage(pool, class, gfp);
> +	if (!zspage) {
> +		cache_free_handle(pool, handle);
> +		return 0;
>  	}
>
> +	spin_lock(&class->lock);
>  	obj = obj_malloc(class, zspage, handle);
> -	/* Now move the zspage to another fullness group, if required */
> -	fix_fullness_group(class, zspage);
> +	newfg = get_fullness_group(class, zspage);
> +	insert_zspage(class, zspage, newfg);
> +	set_zspage_mapping(zspage, class->index, newfg);
>  	record_obj(handle, obj);
> +	atomic_long_add(class->pages_per_zspage,
> +				&pool->pages_allocated);
> +	zs_stat_inc(class, OBJ_ALLOCATED, get_maxobj_per_zspage(
> +			class->size, class->pages_per_zspage));
> +
> +	/* We completely set up zspage so mark them as movable */
> +	SetZsPageMovable(pool, zspage);
>  	spin_unlock(&class->lock);
>
>  	return handle;
> @@ -1484,6 +1644,7 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
>  	int class_idx;
>  	struct size_class *class;
>  	enum fullness_group fullness;
> +	bool isolated;
>
>  	if (unlikely(!handle))
>  		return;
> @@ -1493,22 +1654,28 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
>  	obj_to_location(obj, &f_page, &f_objidx);
>  	zspage = get_zspage(f_page);
>
> +	migrate_read_lock(zspage);
> +
>  	get_zspage_mapping(zspage, &class_idx, &fullness);
>  	class = pool->size_class[class_idx];
>
>  	spin_lock(&class->lock);
>  	obj_free(class, obj);
>  	fullness = fix_fullness_group(class, zspage);
> -	if (fullness == ZS_EMPTY) {
> -		zs_stat_dec(class, OBJ_ALLOCATED, get_maxobj_per_zspage(
> -				class->size, class->pages_per_zspage));
> -		atomic_long_sub(class->pages_per_zspage,
> -				&pool->pages_allocated);
> -		free_zspage(pool, zspage);
> +	if (fullness != ZS_EMPTY) {
> +		migrate_read_unlock(zspage);
> +		goto out;
>  	}
> +
> +	isolated = is_zspage_isolated(zspage);
> +	migrate_read_unlock(zspage);
> +	/* If zspage is isolated, zs_page_putback will free the zspage */
> +	if (likely(!isolated))
> +		free_zspage(pool, class, zspage);
> +out:
> +
>  	spin_unlock(&class->lock);
>  	unpin_tag(handle);
> -
>  	cache_free_handle(pool, handle);
>  }
>  EXPORT_SYMBOL_GPL(zs_free);
> @@ -1587,12 +1754,13 @@ static unsigned long find_alloced_obj(struct size_class *class,
>  	int offset = 0;
>  	unsigned long handle = 0;
>  	void *addr = kmap_atomic(page);
> +	struct zspage *zspage = get_zspage(page);
>
> -	offset = get_first_obj_offset(page);
> +	offset = get_first_obj_offset(class, get_first_page(zspage), page);
>  	offset += class->size * index;
>
>  	while (offset < PAGE_SIZE) {
> -		head = obj_to_head(class, page, addr + offset);
> +		head = obj_to_head(page, addr + offset);
>  		if (head & OBJ_ALLOCATED_TAG) {
>  			handle = head & ~OBJ_ALLOCATED_TAG;
>  			if (trypin_tag(handle))
> @@ -1684,6 +1852,7 @@ static struct zspage *isolate_zspage(struct size_class *class, bool source)
>  		zspage = list_first_entry_or_null(&class->fullness_list[fg[i]],
>  							struct zspage, list);
>  		if (zspage) {
> +			VM_BUG_ON(is_zspage_isolated(zspage));
>  			remove_zspage(class, zspage, fg[i]);
>  			return zspage;
>  		}
> @@ -1704,6 +1873,8 @@ static enum fullness_group putback_zspage(struct size_class *class,
>  {
>  	enum fullness_group fullness;
>
> +	VM_BUG_ON(is_zspage_isolated(zspage));
> +
>  	fullness = get_fullness_group(class, zspage);
>  	insert_zspage(class, zspage, fullness);
>  	set_zspage_mapping(zspage, class->index, fullness);
> @@ -1711,6 +1882,377 @@ static enum fullness_group putback_zspage(struct size_class *class,
>  	return fullness;
>  }
>
> +#ifdef CONFIG_COMPACTION
> +static struct dentry *zs_mount(struct file_system_type *fs_type,
> +				int flags, const char *dev_name, void *data)
> +{
> +	static const struct dentry_operations ops = {
> +		.d_dname = simple_dname,
> +	};
> +
> +	return mount_pseudo(fs_type, "zsmalloc:", NULL, &ops, ZSMALLOC_MAGIC);
> +}
> +
> +static struct file_system_type zsmalloc_fs = {
> +	.name		= "zsmalloc",
> +	.mount		= zs_mount,
> +	.kill_sb	= kill_anon_super,
> +};
> +
> +static int zsmalloc_mount(void)
> +{
> +	int ret = 0;
> +
> +	zsmalloc_mnt = kern_mount(&zsmalloc_fs);
> +	if (IS_ERR(zsmalloc_mnt))
> +		ret = PTR_ERR(zsmalloc_mnt);
> +
> +	return ret;
> +}
> +
> +static void zsmalloc_unmount(void)
> +{
> +	kern_unmount(zsmalloc_mnt);
> +}
> +
> +static void migrate_lock_init(struct zspage *zspage)
> +{
> +	rwlock_init(&zspage->lock);
> +}
> +
> +static void migrate_read_lock(struct zspage *zspage)
> +{
> +	read_lock(&zspage->lock);
> +}
> +
> +static void migrate_read_unlock(struct zspage *zspage)
> +{
> +	read_unlock(&zspage->lock);
> +}
> +
> +static void migrate_write_lock(struct zspage *zspage)
> +{
> +	write_lock(&zspage->lock);
> +}
> +
> +static void migrate_write_unlock(struct zspage *zspage)
> +{
> +	write_unlock(&zspage->lock);
> +}
> +
> +/* Number of isolated subpage for *page migration* in this zspage */
> +static void inc_zspage_isolation(struct zspage *zspage)
> +{
> +	zspage->isolated++;
> +}
> +
> +static void dec_zspage_isolation(struct zspage *zspage)
> +{
> +	zspage->isolated--;
> +}
> +
> +static void replace_sub_page(struct size_class *class, struct zspage *zspage,
> +				struct page *newpage, struct page *oldpage)
> +{
> +	struct page *page;
> +	struct page *pages[ZS_MAX_PAGES_PER_ZSPAGE] = {NULL, };
> +	int idx = 0;
> +
> +	page = get_first_page(zspage);
> +	do {
> +		if (page == oldpage)
> +			pages[idx] = newpage;
> +		else
> +			pages[idx] = page;
> +		idx++;
> +	} while ((page = get_next_page(page)) != NULL);
> +
> +	create_page_chain(class, zspage, pages);
> +	if (unlikely(PageHugeObject(oldpage)))
> +		newpage->index = oldpage->index;
> +	__SetPageMovable(newpage, page_mapping(oldpage));
> +}
> +
> +bool zs_page_isolate(struct page *page, isolate_mode_t mode)
> +{
> +	struct zs_pool *pool;
> +	struct size_class *class;
> +	int class_idx;
> +	enum fullness_group fullness;
> +	struct zspage *zspage;
> +	struct address_space *mapping;
> +
> +	/*
> +	 * Page is locked so zspage couldn't be destroyed. For detail, look at
> +	 * lock_zspage in free_zspage.
> +	 */
> +	VM_BUG_ON_PAGE(!PageMovable(page), page);
> +	VM_BUG_ON_PAGE(PageIsolated(page), page);
> +
> +	zspage = get_zspage(page);
> +
> +	/*
> +	 * Without class lock, fullness could be stale while class_idx is okay
> +	 * because class_idx is constant unless page is freed so we should get
> +	 * fullness again under class lock.
> +	 */
> +	get_zspage_mapping(zspage, &class_idx, &fullness);
> +	mapping = page_mapping(page);
> +	pool = mapping->private_data;
> +	class = pool->size_class[class_idx];
> +
> +	spin_lock(&class->lock);
> +	if (get_zspage_inuse(zspage) == 0) {
> +		spin_unlock(&class->lock);
> +		return false;
> +	}
> +
> +	/* zspage is isolated for object migration */
> +	if (list_empty(&zspage->list) && !is_zspage_isolated(zspage)) {
> +		spin_unlock(&class->lock);
> +		return false;
> +	}
> +
> +	/*
> +	 * If this is first time isolation for the zspage, isolate zspage from
> +	 * size_class to prevent further object allocation from the zspage.
> +	 */
> +	if (!list_empty(&zspage->list) && !is_zspage_isolated(zspage)) {
> +		get_zspage_mapping(zspage, &class_idx, &fullness);
> +		remove_zspage(class, zspage, fullness);
> +	}
> +
> +	inc_zspage_isolation(zspage);
> +	spin_unlock(&class->lock);
> +
> +	return true;
> +}
> +
> +int zs_page_migrate(struct address_space *mapping, struct page *newpage,
> +		struct page *page, enum migrate_mode mode)
> +{
> +	struct zs_pool *pool;
> +	struct size_class *class;
> +	int class_idx;
> +	enum fullness_group fullness;
> +	struct zspage *zspage;
> +	struct page *dummy;
> +	void *s_addr, *d_addr, *addr;
> +	int offset, pos;
> +	unsigned long handle, head;
> +	unsigned long old_obj, new_obj;
> +	unsigned int obj_idx;
> +	int ret = -EAGAIN;
> +
> +	VM_BUG_ON_PAGE(!PageMovable(page), page);
> +	VM_BUG_ON_PAGE(!PageIsolated(page), page);
> +
> +	zspage = get_zspage(page);
> +
> +	/* Concurrent compactor cannot migrate any subpage in zspage */
> +	migrate_write_lock(zspage);
> +	get_zspage_mapping(zspage, &class_idx, &fullness);
> +	pool = mapping->private_data;
> +	class = pool->size_class[class_idx];
> +	offset = get_first_obj_offset(class, get_first_page(zspage), page);
> +
> +	spin_lock(&class->lock);
> +	if (!get_zspage_inuse(zspage)) {
> +		ret = -EBUSY;
> +		goto unlock_class;
> +	}
> +
> +	pos = offset;
> +	s_addr = kmap_atomic(page);
> +	while (pos < PAGE_SIZE) {
> +		head = obj_to_head(page, s_addr + pos);
> +		if (head & OBJ_ALLOCATED_TAG) {
> +			handle = head & ~OBJ_ALLOCATED_TAG;
> +			if (!trypin_tag(handle))
> +				goto unpin_objects;
> +		}
> +		pos += class->size;
> +	}
> +
> +	/*
> +	 * Here, any user cannot access all objects in the zspage so let's move.
> +	 */
> +	d_addr = kmap_atomic(newpage);
> +	memcpy(d_addr, s_addr, PAGE_SIZE);
> +	kunmap_atomic(d_addr);
> +
> +	for (addr = s_addr + offset; addr < s_addr + pos;
> +					addr += class->size) {
> +		head = obj_to_head(page, addr);
> +		if (head & OBJ_ALLOCATED_TAG) {
> +			handle = head & ~OBJ_ALLOCATED_TAG;
> +			if (!testpin_tag(handle))
> +				BUG();
> +
> +			old_obj = handle_to_obj(handle);
> +			obj_to_location(old_obj, &dummy, &obj_idx);
> +			new_obj = (unsigned long)location_to_obj(newpage,
> +								obj_idx);
> +			new_obj |= BIT(HANDLE_PIN_BIT);
> +			record_obj(handle, new_obj);
> +		}
> +	}
> +
> +	replace_sub_page(class, zspage, newpage, page);
> +	get_page(newpage);
> +
> +	dec_zspage_isolation(zspage);
> +
> +	/*
> +	 * Page migration is done so let's putback isolated zspage to
> +	 * the list if @page is final isolated subpage in the zspage.
> +	 */
> +	if (!is_zspage_isolated(zspage))
> +		putback_zspage(class, zspage);
> +
> +	reset_page(page);
> +	put_page(page);
> +	page = newpage;
> +
> +	ret = 0;
> +unpin_objects:
> +	for (addr = s_addr + offset; addr < s_addr + pos;
> +						addr += class->size) {
> +		head = obj_to_head(page, addr);
> +		if (head & OBJ_ALLOCATED_TAG) {
> +			handle = head & ~OBJ_ALLOCATED_TAG;
> +			if (!testpin_tag(handle))
> +				BUG();
> +			unpin_tag(handle);
> +		}
> +	}
> +	kunmap_atomic(s_addr);
> +unlock_class:
> +	spin_unlock(&class->lock);
> +	migrate_write_unlock(zspage);
> +
> +	return ret;
> +}
> +
> +void zs_page_putback(struct page *page)
> +{
> +	struct zs_pool *pool;
> +	struct size_class *class;
> +	int class_idx;
> +	enum fullness_group fg;
> +	struct address_space *mapping;
> +	struct zspage *zspage;
> +
> +	VM_BUG_ON_PAGE(!PageMovable(page), page);
> +	VM_BUG_ON_PAGE(!PageIsolated(page), page);
> +
> +	zspage = get_zspage(page);
> +	get_zspage_mapping(zspage, &class_idx, &fg);
> +	mapping = page_mapping(page);
> +	pool = mapping->private_data;
> +	class = pool->size_class[class_idx];
> +
> +	spin_lock(&class->lock);
> +	dec_zspage_isolation(zspage);
> +	if (!is_zspage_isolated(zspage)) {
> +		fg = putback_zspage(class, zspage);
> +		/*
> +		 * Due to page_lock, we cannot free zspage immediately
> +		 * so let's defer.
> +		 */
> +		if (fg == ZS_EMPTY)
> +			schedule_work(&pool->free_work);
> +	}
> +	spin_unlock(&class->lock);
> +}
> +
> +const struct address_space_operations zsmalloc_aops = {
> +	.isolate_page = zs_page_isolate,
> +	.migratepage = zs_page_migrate,
> +	.putback_page = zs_page_putback,
> +};
> +
> +static int zs_register_migration(struct zs_pool *pool)
> +{
> +	pool->inode = alloc_anon_inode(zsmalloc_mnt->mnt_sb);
> +	if (IS_ERR(pool->inode)) {
> +		pool->inode = NULL;
> +		return 1;
> +	}
> +
> +	pool->inode->i_mapping->private_data = pool;
> +	pool->inode->i_mapping->a_ops = &zsmalloc_aops;
> +	return 0;
> +}
> +
> +static void zs_unregister_migration(struct zs_pool *pool)
> +{
> +	flush_work(&pool->free_work);
> +	if (pool->inode)
> +		iput(pool->inode);
> +}
> +
> +/*
> + * Caller should hold page_lock of all pages in the zspage
> + * In here, we cannot use zspage meta data.
> + */
> +static void async_free_zspage(struct work_struct *work)
> +{
> +	int i;
> +	struct size_class *class;
> +	unsigned int class_idx;
> +	enum fullness_group fullness;
> +	struct zspage *zspage, *tmp;
> +	LIST_HEAD(free_pages);
> +	struct zs_pool *pool = container_of(work, struct zs_pool,
> +					free_work);
> +
> +	for (i = 0; i < zs_size_classes; i++) {
> +		class = pool->size_class[i];
> +		if (class->index != i)
> +			continue;
> +
> +		spin_lock(&class->lock);
> +		list_splice_init(&class->fullness_list[ZS_EMPTY], &free_pages);
> +		spin_unlock(&class->lock);
> +	}
> +
> +
> +	list_for_each_entry_safe(zspage, tmp, &free_pages, list) {
> +		list_del(&zspage->list);
> +		lock_zspage(zspage);
> +
> +		get_zspage_mapping(zspage, &class_idx, &fullness);
> +		VM_BUG_ON(fullness != ZS_EMPTY);
> +		class = pool->size_class[class_idx];
> +		spin_lock(&class->lock);
> +		__free_zspage(pool, pool->size_class[class_idx], zspage);
> +		spin_unlock(&class->lock);
> +	}
> +};
> +
> +static void kick_deferred_free(struct zs_pool *pool)
> +{
> +	schedule_work(&pool->free_work);
> +}
> +
> +static void init_deferred_free(struct zs_pool *pool)
> +{
> +	INIT_WORK(&pool->free_work, async_free_zspage);
> +}
> +
> +static void SetZsPageMovable(struct zs_pool *pool, struct zspage *zspage)
> +{
> +	struct page *page = get_first_page(zspage);
> +
> +	do {
> +		WARN_ON(!trylock_page(page));
> +		__SetPageMovable(page, pool->inode->i_mapping);
> +		unlock_page(page);
> +	} while ((page = get_next_page(page)) != NULL);
> +}
> +#endif
> +
>  /*
>   *
>   * Based on the number of unused allocated objects calculate
> @@ -1745,10 +2287,10 @@ static void __zs_compact(struct zs_pool *pool, struct size_class *class)
>  			break;
>
>  		cc.index = 0;
> -		cc.s_page = src_zspage->first_page;
> +		cc.s_page = get_first_page(src_zspage);
>
>  		while ((dst_zspage = isolate_zspage(class, false))) {
> -			cc.d_page = dst_zspage->first_page;
> +			cc.d_page = get_first_page(dst_zspage);
>  			/*
>  			 * If there is no more space in dst_page, resched
>  			 * and see if anyone had allocated another zspage.
> @@ -1765,11 +2307,7 @@ static void __zs_compact(struct zs_pool *pool, struct size_class *class)
>
>  		putback_zspage(class, dst_zspage);
>  		if (putback_zspage(class, src_zspage) == ZS_EMPTY) {
> -			zs_stat_dec(class, OBJ_ALLOCATED, get_maxobj_per_zspage(
> -					class->size, class->pages_per_zspage));
> -			atomic_long_sub(class->pages_per_zspage,
> -					&pool->pages_allocated);
> -			free_zspage(pool, src_zspage);
> +			free_zspage(pool, class, src_zspage);
>  			pool->stats.pages_compacted += class->pages_per_zspage;
>  		}
>  		spin_unlock(&class->lock);
> @@ -1885,6 +2423,7 @@ struct zs_pool *zs_create_pool(const char *name)
>  	if (!pool)
>  		return NULL;
>
> +	init_deferred_free(pool);
>  	pool->size_class = kcalloc(zs_size_classes, sizeof(struct size_class *),
>  			GFP_KERNEL);
>  	if (!pool->size_class) {
> @@ -1939,12 +2478,10 @@ struct zs_pool *zs_create_pool(const char *name)
>  		class->pages_per_zspage = pages_per_zspage;
>  		class->objs_per_zspage = class->pages_per_zspage *
>  						PAGE_SIZE / class->size;
> -		if (pages_per_zspage == 1 && class->objs_per_zspage == 1)
> -			class->huge = true;
>  		spin_lock_init(&class->lock);
>  		pool->size_class[i] = class;
> -		for (fullness = ZS_ALMOST_FULL; fullness <= ZS_ALMOST_EMPTY;
> -								fullness++)
> +		for (fullness = ZS_EMPTY; fullness < NR_ZS_FULLNESS;
> +							fullness++)
>  			INIT_LIST_HEAD(&class->fullness_list[fullness]);
>
>  		prev_class = class;
> @@ -1953,6 +2490,9 @@ struct zs_pool *zs_create_pool(const char *name)
>  	/* debug only, don't abort if it fails */
>  	zs_pool_stat_create(pool, name);
>
> +	if (zs_register_migration(pool))
> +		goto err;
> +
>  	/*
>  	 * Not critical, we still can use the pool
>  	 * and user can trigger compaction manually.
> @@ -1972,6 +2512,7 @@ void zs_destroy_pool(struct zs_pool *pool)
>  	int i;
>
>  	zs_unregister_shrinker(pool);
> +	zs_unregister_migration(pool);
>  	zs_pool_stat_destroy(pool);
>
>  	for (i = 0; i < zs_size_classes; i++) {
> @@ -1984,7 +2525,7 @@ void zs_destroy_pool(struct zs_pool *pool)
>  		if (class->index != i)
>  			continue;
>
> -		for (fg = ZS_ALMOST_FULL; fg <= ZS_ALMOST_EMPTY; fg++) {
> +		for (fg = ZS_EMPTY; fg < NR_ZS_FULLNESS; fg++) {
>  			if (!list_empty(&class->fullness_list[fg])) {
>  				pr_info("Freeing non-empty class with size %db, fullness group %d\n",
>  					class->size, fg);
> @@ -2002,7 +2543,13 @@ EXPORT_SYMBOL_GPL(zs_destroy_pool);
>
>  static int __init zs_init(void)
>  {
> -	int ret = zs_register_cpu_notifier();
> +	int ret;
> +
> +	ret = zsmalloc_mount();
> +	if (ret)
> +		goto out;
> +
> +	ret = zs_register_cpu_notifier();
>
>  	if (ret)
>  		goto notifier_fail;
> @@ -2019,7 +2566,8 @@ static int __init zs_init(void)
>
>  notifier_fail:
>  	zs_unregister_cpu_notifier();
> -
> +	zsmalloc_unmount();
> +out:
>  	return ret;
>  }
>
> @@ -2028,6 +2576,7 @@ static void __exit zs_exit(void)
>  #ifdef CONFIG_ZPOOL
>  	zpool_unregister_driver(&zs_zpool_driver);
>  #endif
> +	zsmalloc_unmount();
>  	zs_unregister_cpu_notifier();
>
>  	zs_stat_exit();
>

