All of lore.kernel.org
 help / color / mirror / Atom feed
From: Minchan Kim <minchan@kernel.org>
To: Chulmin Kim <cmlaika.kim@samsung.com>
Cc: Minchan Kim <minchan@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v3 02/16] mm/compaction: support non-lru movable page migration
Date: Tue, 12 Apr 2016 23:25:22 +0900	[thread overview]
Message-ID: <20160412142522.GA3265@blaptop> (raw)
In-Reply-To: <570CAB12.2090408@samsung.com>

Hello Chulmin,

On Tue, Apr 12, 2016 at 05:00:18PM +0900, Chulmin Kim wrote:
> On 2016년 03월 30일 16:12, Minchan Kim wrote:
> >We have allowed migration for only LRU pages until now and it was
> >enough to make high-order pages. But recently, embedded system(e.g.,
> >webOS, android) uses lots of non-movable pages(e.g., zram, GPU memory)
> >so we have seen several reports about troubles of small high-order
> >allocation. For fixing the problem, there were several efforts
> >(e,g,. enhance compaction algorithm, SLUB fallback to 0-order page,
> >reserved memory, vmalloc and so on) but if there are lots of
> >non-movable pages in system, their solutions are void in the long run.
> >
> >So, this patch is to support facility to change non-movable pages
> >with movable. For the feature, this patch introduces functions related
> >to migration to address_space_operations as well as some page flags.
> >
> >Basically, this patch supports two page-flags and two functions related
> >to page migration. The flag and page->mapping stability are protected
> >by PG_lock.
> >
> >	PG_movable
> >	PG_isolated
> >
> >	bool (*isolate_page) (struct page *, isolate_mode_t);
> >	void (*putback_page) (struct page *);
> >
> >Duty of subsystem want to make their pages as migratable are
> >as follows:
> >
> >1. It should register address_space to page->mapping then mark
> >the page as PG_movable via __SetPageMovable.
> >
> >2. It should mark the page as PG_isolated via SetPageIsolated
> >if isolation is sucessful and return true.
> >
> >3. If migration is successful, it should clear PG_isolated and
> >PG_movable of the page for free preparation then release the
> >reference of the page to free.
> >
> >4. If migration fails, putback function of subsystem should
> >clear PG_isolated via ClearPageIsolated.
> >
> >5. If a subsystem want to release isolated page, it should
> >clear PG_isolated but not PG_movable. Instead, VM will do it.
> >
> >Cc: Vlastimil Babka <vbabka@suse.cz>
> >Cc: Mel Gorman <mgorman@suse.de>
> >Cc: Hugh Dickins <hughd@google.com>
> >Cc: dri-devel@lists.freedesktop.org
> >Cc: virtualization@lists.linux-foundation.org
> >Signed-off-by: Gioh Kim <gurugio@hanmail.net>
> >Signed-off-by: Minchan Kim <minchan@kernel.org>
> >---
> >  Documentation/filesystems/Locking      |   4 +
> >  Documentation/filesystems/vfs.txt      |   5 +
> >  fs/proc/page.c                         |   3 +
> >  include/linux/fs.h                     |   2 +
> >  include/linux/migrate.h                |   2 +
> >  include/linux/page-flags.h             |  31 ++++++
> >  include/uapi/linux/kernel-page-flags.h |   1 +
> >  mm/compaction.c                        |  14 ++-
> >  mm/migrate.c                           | 174 +++++++++++++++++++++++++++++----
> >  9 files changed, 217 insertions(+), 19 deletions(-)
> >
> >diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
> >index 619af9bfdcb3..0bb79560abb3 100644
> >--- a/Documentation/filesystems/Locking
> >+++ b/Documentation/filesystems/Locking
> >@@ -195,7 +195,9 @@ unlocks and drops the reference.
> >  	int (*releasepage) (struct page *, int);
> >  	void (*freepage)(struct page *);
> >  	int (*direct_IO)(struct kiocb *, struct iov_iter *iter, loff_t offset);
> >+	bool (*isolate_page) (struct page *, isolate_mode_t);
> >  	int (*migratepage)(struct address_space *, struct page *, struct page *);
> >+	void (*putback_page) (struct page *);
> >  	int (*launder_page)(struct page *);
> >  	int (*is_partially_uptodate)(struct page *, unsigned long, unsigned long);
> >  	int (*error_remove_page)(struct address_space *, struct page *);
> >@@ -219,7 +221,9 @@ invalidatepage:		yes
> >  releasepage:		yes
> >  freepage:		yes
> >  direct_IO:
> >+isolate_page:		yes
> >  migratepage:		yes (both)
> >+putback_page:		yes
> >  launder_page:		yes
> >  is_partially_uptodate:	yes
> >  error_remove_page:	yes
> >diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
> >index b02a7d598258..4c1b6c3b4bc8 100644
> >--- a/Documentation/filesystems/vfs.txt
> >+++ b/Documentation/filesystems/vfs.txt
> >@@ -592,9 +592,14 @@ struct address_space_operations {
> >  	int (*releasepage) (struct page *, int);
> >  	void (*freepage)(struct page *);
> >  	ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter, loff_t offset);
> >+	/* isolate a page for migration */
> >+	bool (*isolate_page) (struct page *, isolate_mode_t);
> >  	/* migrate the contents of a page to the specified target */
> >  	int (*migratepage) (struct page *, struct page *);
> >+	/* put the page back to right list */
> >+	void (*putback_page) (struct page *);
> >  	int (*launder_page) (struct page *);
> >+
> >  	int (*is_partially_uptodate) (struct page *, unsigned long,
> >  					unsigned long);
> >  	void (*is_dirty_writeback) (struct page *, bool *, bool *);
> >diff --git a/fs/proc/page.c b/fs/proc/page.c
> >index 3ecd445e830d..ce3d08a4ad8d 100644
> >--- a/fs/proc/page.c
> >+++ b/fs/proc/page.c
> >@@ -157,6 +157,9 @@ u64 stable_page_flags(struct page *page)
> >  	if (page_is_idle(page))
> >  		u |= 1 << KPF_IDLE;
> >
> >+	if (PageMovable(page))
> >+		u |= 1 << KPF_MOVABLE;
> >+
> >  	u |= kpf_copy_bit(k, KPF_LOCKED,	PG_locked);
> >
> >  	u |= kpf_copy_bit(k, KPF_SLAB,		PG_slab);
> >diff --git a/include/linux/fs.h b/include/linux/fs.h
> >index da9e67d937e5..36f2d610e7a8 100644
> >--- a/include/linux/fs.h
> >+++ b/include/linux/fs.h
> >@@ -401,6 +401,8 @@ struct address_space_operations {
> >  	 */
> >  	int (*migratepage) (struct address_space *,
> >  			struct page *, struct page *, enum migrate_mode);
> >+	bool (*isolate_page)(struct page *, isolate_mode_t);
> >+	void (*putback_page)(struct page *);
> >  	int (*launder_page) (struct page *);
> >  	int (*is_partially_uptodate) (struct page *, unsigned long,
> >  					unsigned long);
> >diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> >index 9b50325e4ddf..404fbfefeb33 100644
> >--- a/include/linux/migrate.h
> >+++ b/include/linux/migrate.h
> >@@ -37,6 +37,8 @@ extern int migrate_page(struct address_space *,
> >  			struct page *, struct page *, enum migrate_mode);
> >  extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t free,
> >  		unsigned long private, enum migrate_mode mode, int reason);
> >+extern bool isolate_movable_page(struct page *page, isolate_mode_t mode);
> >+extern void putback_movable_page(struct page *page);
> >
> >  extern int migrate_prep(void);
> >  extern int migrate_prep_local(void);
> >diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> >index f4ed4f1b0c77..77ebf8fdbc6e 100644
> >--- a/include/linux/page-flags.h
> >+++ b/include/linux/page-flags.h
> >@@ -129,6 +129,10 @@ enum pageflags {
> >
> >  	/* Compound pages. Stored in first tail page's flags */
> >  	PG_double_map = PG_private_2,
> >+
> >+	/* non-lru movable pages */
> >+	PG_movable = PG_reclaim,
> >+	PG_isolated = PG_owner_priv_1,
> >  };
> >
> >  #ifndef __GENERATING_BOUNDS_H
> >@@ -614,6 +618,33 @@ static inline void __ClearPageBalloon(struct page *page)
> >  	atomic_set(&page->_mapcount, -1);
> >  }
> >
> >+#define PAGE_MOVABLE_MAPCOUNT_VALUE (-255)
> >+
> >+static inline int PageMovable(struct page *page)
> >+{
> >+	return ((test_bit(PG_movable, &(page)->flags) &&
> >+		atomic_read(&page->_mapcount) == PAGE_MOVABLE_MAPCOUNT_VALUE)
> >+		|| PageBalloon(page));
> >+}
> >+
> >+/* Caller should hold a PG_lock */
> >+static inline void __SetPageMovable(struct page *page,
> >+				struct address_space *mapping)
> >+{
> >+	page->mapping = mapping;
> >+	__set_bit(PG_movable, &page->flags);
> >+	atomic_set(&page->_mapcount, PAGE_MOVABLE_MAPCOUNT_VALUE);
> >+}
> >+
> >+static inline void __ClearPageMovable(struct page *page)
> >+{
> >+	atomic_set(&page->_mapcount, -1);
> >+	__clear_bit(PG_movable, &(page)->flags);
> >+	page->mapping = NULL;
> >+}
> >+
> >+PAGEFLAG(Isolated, isolated, PF_ANY);
> >+
> >  /*
> >   * If network-based swap is enabled, sl*b must keep track of whether pages
> >   * were allocated from pfmemalloc reserves.
> >diff --git a/include/uapi/linux/kernel-page-flags.h b/include/uapi/linux/kernel-page-flags.h
> >index 5da5f8751ce7..a184fd2434fa 100644
> >--- a/include/uapi/linux/kernel-page-flags.h
> >+++ b/include/uapi/linux/kernel-page-flags.h
> >@@ -34,6 +34,7 @@
> >  #define KPF_BALLOON		23
> >  #define KPF_ZERO_PAGE		24
> >  #define KPF_IDLE		25
> >+#define KPF_MOVABLE		26
> >
> >
> >  #endif /* _UAPILINUX_KERNEL_PAGE_FLAGS_H */
> >diff --git a/mm/compaction.c b/mm/compaction.c
> >index ccf97b02b85f..7557aedddaee 100644
> >--- a/mm/compaction.c
> >+++ b/mm/compaction.c
> >@@ -703,7 +703,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> >
> >  		/*
> >  		 * Check may be lockless but that's ok as we recheck later.
> >-		 * It's possible to migrate LRU pages and balloon pages
> >+		 * It's possible to migrate LRU and movable kernel pages.
> >  		 * Skip any other type of page
> >  		 */
> >  		is_lru = PageLRU(page);
> >@@ -714,6 +714,18 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> >  					goto isolate_success;
> >  				}
> >  			}
> >+
> >+			if (unlikely(PageMovable(page)) &&
> >+					!PageIsolated(page)) {
> >+				if (locked) {
> >+					spin_unlock_irqrestore(&zone->lru_lock,
> >+									flags);
> >+					locked = false;
> >+				}
> >+
> >+				if (isolate_movable_page(page, isolate_mode))
> >+					goto isolate_success;
> >+			}
> >  		}
> >
> >  		/*
> >diff --git a/mm/migrate.c b/mm/migrate.c
> >index 53529c805752..b56bf2b3fe8c 100644
> >--- a/mm/migrate.c
> >+++ b/mm/migrate.c
> >@@ -73,6 +73,85 @@ int migrate_prep_local(void)
> >  	return 0;
> >  }
> >
> >+bool isolate_movable_page(struct page *page, isolate_mode_t mode)
> >+{
> >+	bool ret = false;
> >+
> >+	/*
> >+	 * Avoid burning cycles with pages that are yet under __free_pages(),
> >+	 * or just got freed under us.
> >+	 *
> >+	 * In case we 'win' a race for a movable page being freed under us and
> >+	 * raise its refcount preventing __free_pages() from doing its job
> >+	 * the put_page() at the end of this block will take care of
> >+	 * release this page, thus avoiding a nasty leakage.
> >+	 */
> >+	if (unlikely(!get_page_unless_zero(page)))
> >+		goto out;
> >+
> >+	/*
> >+	 * Check PG_movable before holding a PG_lock because page's owner
> >+	 * assumes anybody doesn't touch PG_lock of newly allocated page.
> >+	 */
> >+	if (unlikely(!PageMovable(page)))
> >+		goto out_putpage;
> >+	/*
> >+	 * As movable pages are not isolated from LRU lists, concurrent
> >+	 * compaction threads can race against page migration functions
> >+	 * as well as race against the releasing a page.
> >+	 *
> >+	 * In order to avoid having an already isolated movable page
> >+	 * being (wrongly) re-isolated while it is under migration,
> >+	 * or to avoid attempting to isolate pages being released,
> >+	 * lets be sure we have the page lock
> >+	 * before proceeding with the movable page isolation steps.
> >+	 */
> >+	if (unlikely(!trylock_page(page)))
> >+		goto out_putpage;
> >+
> >+	if (!PageMovable(page) || PageIsolated(page))
> >+		goto out_no_isolated;
> 
> 
> Hello Minchan.
> 
> We captured a problem case.
> We suspect that
> a zs subpage(T) in the below condition can be isolated twice,
> which seems wrong.
> 
>  migrate_ctx_A         migrate_ctx_B            C (proc being killed)
> -------------       ----------------        -----------------
>  lock_page(T)
>  isolate(T)
>  unlock_page(T)
> 
>                                               zs_free() (making zspage
> ZS_EMPTY)
>                                                   free_zspage()
>                                                      lock_page(T)
>                                                      reset_page(T)
>                                                         (Keeps T
> "PageMovable" and Clears "PageIsolated")
>                                                      unlock_page(T)
> 
> 
>                        lock_page(T)
>                        isolate(T)
> 
> 
> In our case,
> during the second isolation (migrateB),
> there was null pointer dereference
> ((Without DEBUG_VM) T's first page was set to NULL)
> 
> 
> Not sure this is the case you and Vlastimil discussed.
> (I think it is a bit different)

That's exactly a bug we discussed.
I will fix it in next revision.

Thanks.

WARNING: multiple messages have this Message-ID (diff)
From: Minchan Kim <minchan@kernel.org>
To: Chulmin Kim <cmlaika.kim@samsung.com>
Cc: Minchan Kim <minchan@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v3 02/16] mm/compaction: support non-lru movable page migration
Date: Tue, 12 Apr 2016 23:25:22 +0900	[thread overview]
Message-ID: <20160412142522.GA3265@blaptop> (raw)
In-Reply-To: <570CAB12.2090408@samsung.com>

Hello Chulmin,

On Tue, Apr 12, 2016 at 05:00:18PM +0900, Chulmin Kim wrote:
> On 2016e?? 03i?? 30i? 1/4  16:12, Minchan Kim wrote:
> >We have allowed migration for only LRU pages until now and it was
> >enough to make high-order pages. But recently, embedded system(e.g.,
> >webOS, android) uses lots of non-movable pages(e.g., zram, GPU memory)
> >so we have seen several reports about troubles of small high-order
> >allocation. For fixing the problem, there were several efforts
> >(e,g,. enhance compaction algorithm, SLUB fallback to 0-order page,
> >reserved memory, vmalloc and so on) but if there are lots of
> >non-movable pages in system, their solutions are void in the long run.
> >
> >So, this patch is to support facility to change non-movable pages
> >with movable. For the feature, this patch introduces functions related
> >to migration to address_space_operations as well as some page flags.
> >
> >Basically, this patch supports two page-flags and two functions related
> >to page migration. The flag and page->mapping stability are protected
> >by PG_lock.
> >
> >	PG_movable
> >	PG_isolated
> >
> >	bool (*isolate_page) (struct page *, isolate_mode_t);
> >	void (*putback_page) (struct page *);
> >
> >Duty of subsystem want to make their pages as migratable are
> >as follows:
> >
> >1. It should register address_space to page->mapping then mark
> >the page as PG_movable via __SetPageMovable.
> >
> >2. It should mark the page as PG_isolated via SetPageIsolated
> >if isolation is sucessful and return true.
> >
> >3. If migration is successful, it should clear PG_isolated and
> >PG_movable of the page for free preparation then release the
> >reference of the page to free.
> >
> >4. If migration fails, putback function of subsystem should
> >clear PG_isolated via ClearPageIsolated.
> >
> >5. If a subsystem want to release isolated page, it should
> >clear PG_isolated but not PG_movable. Instead, VM will do it.
> >
> >Cc: Vlastimil Babka <vbabka@suse.cz>
> >Cc: Mel Gorman <mgorman@suse.de>
> >Cc: Hugh Dickins <hughd@google.com>
> >Cc: dri-devel@lists.freedesktop.org
> >Cc: virtualization@lists.linux-foundation.org
> >Signed-off-by: Gioh Kim <gurugio@hanmail.net>
> >Signed-off-by: Minchan Kim <minchan@kernel.org>
> >---
> >  Documentation/filesystems/Locking      |   4 +
> >  Documentation/filesystems/vfs.txt      |   5 +
> >  fs/proc/page.c                         |   3 +
> >  include/linux/fs.h                     |   2 +
> >  include/linux/migrate.h                |   2 +
> >  include/linux/page-flags.h             |  31 ++++++
> >  include/uapi/linux/kernel-page-flags.h |   1 +
> >  mm/compaction.c                        |  14 ++-
> >  mm/migrate.c                           | 174 +++++++++++++++++++++++++++++----
> >  9 files changed, 217 insertions(+), 19 deletions(-)
> >
> >diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
> >index 619af9bfdcb3..0bb79560abb3 100644
> >--- a/Documentation/filesystems/Locking
> >+++ b/Documentation/filesystems/Locking
> >@@ -195,7 +195,9 @@ unlocks and drops the reference.
> >  	int (*releasepage) (struct page *, int);
> >  	void (*freepage)(struct page *);
> >  	int (*direct_IO)(struct kiocb *, struct iov_iter *iter, loff_t offset);
> >+	bool (*isolate_page) (struct page *, isolate_mode_t);
> >  	int (*migratepage)(struct address_space *, struct page *, struct page *);
> >+	void (*putback_page) (struct page *);
> >  	int (*launder_page)(struct page *);
> >  	int (*is_partially_uptodate)(struct page *, unsigned long, unsigned long);
> >  	int (*error_remove_page)(struct address_space *, struct page *);
> >@@ -219,7 +221,9 @@ invalidatepage:		yes
> >  releasepage:		yes
> >  freepage:		yes
> >  direct_IO:
> >+isolate_page:		yes
> >  migratepage:		yes (both)
> >+putback_page:		yes
> >  launder_page:		yes
> >  is_partially_uptodate:	yes
> >  error_remove_page:	yes
> >diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
> >index b02a7d598258..4c1b6c3b4bc8 100644
> >--- a/Documentation/filesystems/vfs.txt
> >+++ b/Documentation/filesystems/vfs.txt
> >@@ -592,9 +592,14 @@ struct address_space_operations {
> >  	int (*releasepage) (struct page *, int);
> >  	void (*freepage)(struct page *);
> >  	ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter, loff_t offset);
> >+	/* isolate a page for migration */
> >+	bool (*isolate_page) (struct page *, isolate_mode_t);
> >  	/* migrate the contents of a page to the specified target */
> >  	int (*migratepage) (struct page *, struct page *);
> >+	/* put the page back to right list */
> >+	void (*putback_page) (struct page *);
> >  	int (*launder_page) (struct page *);
> >+
> >  	int (*is_partially_uptodate) (struct page *, unsigned long,
> >  					unsigned long);
> >  	void (*is_dirty_writeback) (struct page *, bool *, bool *);
> >diff --git a/fs/proc/page.c b/fs/proc/page.c
> >index 3ecd445e830d..ce3d08a4ad8d 100644
> >--- a/fs/proc/page.c
> >+++ b/fs/proc/page.c
> >@@ -157,6 +157,9 @@ u64 stable_page_flags(struct page *page)
> >  	if (page_is_idle(page))
> >  		u |= 1 << KPF_IDLE;
> >
> >+	if (PageMovable(page))
> >+		u |= 1 << KPF_MOVABLE;
> >+
> >  	u |= kpf_copy_bit(k, KPF_LOCKED,	PG_locked);
> >
> >  	u |= kpf_copy_bit(k, KPF_SLAB,		PG_slab);
> >diff --git a/include/linux/fs.h b/include/linux/fs.h
> >index da9e67d937e5..36f2d610e7a8 100644
> >--- a/include/linux/fs.h
> >+++ b/include/linux/fs.h
> >@@ -401,6 +401,8 @@ struct address_space_operations {
> >  	 */
> >  	int (*migratepage) (struct address_space *,
> >  			struct page *, struct page *, enum migrate_mode);
> >+	bool (*isolate_page)(struct page *, isolate_mode_t);
> >+	void (*putback_page)(struct page *);
> >  	int (*launder_page) (struct page *);
> >  	int (*is_partially_uptodate) (struct page *, unsigned long,
> >  					unsigned long);
> >diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> >index 9b50325e4ddf..404fbfefeb33 100644
> >--- a/include/linux/migrate.h
> >+++ b/include/linux/migrate.h
> >@@ -37,6 +37,8 @@ extern int migrate_page(struct address_space *,
> >  			struct page *, struct page *, enum migrate_mode);
> >  extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t free,
> >  		unsigned long private, enum migrate_mode mode, int reason);
> >+extern bool isolate_movable_page(struct page *page, isolate_mode_t mode);
> >+extern void putback_movable_page(struct page *page);
> >
> >  extern int migrate_prep(void);
> >  extern int migrate_prep_local(void);
> >diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> >index f4ed4f1b0c77..77ebf8fdbc6e 100644
> >--- a/include/linux/page-flags.h
> >+++ b/include/linux/page-flags.h
> >@@ -129,6 +129,10 @@ enum pageflags {
> >
> >  	/* Compound pages. Stored in first tail page's flags */
> >  	PG_double_map = PG_private_2,
> >+
> >+	/* non-lru movable pages */
> >+	PG_movable = PG_reclaim,
> >+	PG_isolated = PG_owner_priv_1,
> >  };
> >
> >  #ifndef __GENERATING_BOUNDS_H
> >@@ -614,6 +618,33 @@ static inline void __ClearPageBalloon(struct page *page)
> >  	atomic_set(&page->_mapcount, -1);
> >  }
> >
> >+#define PAGE_MOVABLE_MAPCOUNT_VALUE (-255)
> >+
> >+static inline int PageMovable(struct page *page)
> >+{
> >+	return ((test_bit(PG_movable, &(page)->flags) &&
> >+		atomic_read(&page->_mapcount) == PAGE_MOVABLE_MAPCOUNT_VALUE)
> >+		|| PageBalloon(page));
> >+}
> >+
> >+/* Caller should hold a PG_lock */
> >+static inline void __SetPageMovable(struct page *page,
> >+				struct address_space *mapping)
> >+{
> >+	page->mapping = mapping;
> >+	__set_bit(PG_movable, &page->flags);
> >+	atomic_set(&page->_mapcount, PAGE_MOVABLE_MAPCOUNT_VALUE);
> >+}
> >+
> >+static inline void __ClearPageMovable(struct page *page)
> >+{
> >+	atomic_set(&page->_mapcount, -1);
> >+	__clear_bit(PG_movable, &(page)->flags);
> >+	page->mapping = NULL;
> >+}
> >+
> >+PAGEFLAG(Isolated, isolated, PF_ANY);
> >+
> >  /*
> >   * If network-based swap is enabled, sl*b must keep track of whether pages
> >   * were allocated from pfmemalloc reserves.
> >diff --git a/include/uapi/linux/kernel-page-flags.h b/include/uapi/linux/kernel-page-flags.h
> >index 5da5f8751ce7..a184fd2434fa 100644
> >--- a/include/uapi/linux/kernel-page-flags.h
> >+++ b/include/uapi/linux/kernel-page-flags.h
> >@@ -34,6 +34,7 @@
> >  #define KPF_BALLOON		23
> >  #define KPF_ZERO_PAGE		24
> >  #define KPF_IDLE		25
> >+#define KPF_MOVABLE		26
> >
> >
> >  #endif /* _UAPILINUX_KERNEL_PAGE_FLAGS_H */
> >diff --git a/mm/compaction.c b/mm/compaction.c
> >index ccf97b02b85f..7557aedddaee 100644
> >--- a/mm/compaction.c
> >+++ b/mm/compaction.c
> >@@ -703,7 +703,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> >
> >  		/*
> >  		 * Check may be lockless but that's ok as we recheck later.
> >-		 * It's possible to migrate LRU pages and balloon pages
> >+		 * It's possible to migrate LRU and movable kernel pages.
> >  		 * Skip any other type of page
> >  		 */
> >  		is_lru = PageLRU(page);
> >@@ -714,6 +714,18 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> >  					goto isolate_success;
> >  				}
> >  			}
> >+
> >+			if (unlikely(PageMovable(page)) &&
> >+					!PageIsolated(page)) {
> >+				if (locked) {
> >+					spin_unlock_irqrestore(&zone->lru_lock,
> >+									flags);
> >+					locked = false;
> >+				}
> >+
> >+				if (isolate_movable_page(page, isolate_mode))
> >+					goto isolate_success;
> >+			}
> >  		}
> >
> >  		/*
> >diff --git a/mm/migrate.c b/mm/migrate.c
> >index 53529c805752..b56bf2b3fe8c 100644
> >--- a/mm/migrate.c
> >+++ b/mm/migrate.c
> >@@ -73,6 +73,85 @@ int migrate_prep_local(void)
> >  	return 0;
> >  }
> >
> >+bool isolate_movable_page(struct page *page, isolate_mode_t mode)
> >+{
> >+	bool ret = false;
> >+
> >+	/*
> >+	 * Avoid burning cycles with pages that are yet under __free_pages(),
> >+	 * or just got freed under us.
> >+	 *
> >+	 * In case we 'win' a race for a movable page being freed under us and
> >+	 * raise its refcount preventing __free_pages() from doing its job
> >+	 * the put_page() at the end of this block will take care of
> >+	 * release this page, thus avoiding a nasty leakage.
> >+	 */
> >+	if (unlikely(!get_page_unless_zero(page)))
> >+		goto out;
> >+
> >+	/*
> >+	 * Check PG_movable before holding a PG_lock because page's owner
> >+	 * assumes anybody doesn't touch PG_lock of newly allocated page.
> >+	 */
> >+	if (unlikely(!PageMovable(page)))
> >+		goto out_putpage;
> >+	/*
> >+	 * As movable pages are not isolated from LRU lists, concurrent
> >+	 * compaction threads can race against page migration functions
> >+	 * as well as race against the releasing a page.
> >+	 *
> >+	 * In order to avoid having an already isolated movable page
> >+	 * being (wrongly) re-isolated while it is under migration,
> >+	 * or to avoid attempting to isolate pages being released,
> >+	 * lets be sure we have the page lock
> >+	 * before proceeding with the movable page isolation steps.
> >+	 */
> >+	if (unlikely(!trylock_page(page)))
> >+		goto out_putpage;
> >+
> >+	if (!PageMovable(page) || PageIsolated(page))
> >+		goto out_no_isolated;
> 
> 
> Hello Minchan.
> 
> We captured a problem case.
> We suspect that
> a zs subpage(T) in the below condition can be isolated twice,
> which seems wrong.
> 
>  migrate_ctx_A         migrate_ctx_B            C (proc being killed)
> -------------       ----------------        -----------------
>  lock_page(T)
>  isolate(T)
>  unlock_page(T)
> 
>                                               zs_free() (making zspage
> ZS_EMPTY)
>                                                   free_zspage()
>                                                      lock_page(T)
>                                                      reset_page(T)
>                                                         (Keeps T
> "PageMovable" and Clears "PageIsolated")
>                                                      unlock_page(T)
> 
> 
>                        lock_page(T)
>                        isolate(T)
> 
> 
> In our case,
> during the second isolation (migrateB),
> there was null pointer dereference
> ((Without DEBUG_VM) T's first page was set to NULL)
> 
> 
> Not sure this is the case you and Vlastimil discussed.
> (I think it is a bit different)

That's exactly a bug we discussed.
I will fix it in next revision.

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2016-04-12 14:25 UTC|newest]

Thread overview: 185+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-30  7:11 [PATCH v3 00/16] Support non-lru page migration Minchan Kim
2016-03-30  7:11 ` Minchan Kim
2016-03-30  7:12 ` [PATCH v3 01/16] mm: use put_page to free page instead of putback_lru_page Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-04-01 12:58   ` Vlastimil Babka
2016-04-01 12:58     ` Vlastimil Babka
2016-04-01 12:58     ` Vlastimil Babka
2016-04-04  1:39     ` Minchan Kim
2016-04-04  1:39       ` Minchan Kim
2016-04-04  1:39       ` Minchan Kim
2016-04-04  4:45       ` Naoya Horiguchi
2016-04-04  4:45         ` Naoya Horiguchi
2016-04-04 14:46         ` Vlastimil Babka
2016-04-04 14:46           ` Vlastimil Babka
2016-04-04 14:46           ` Vlastimil Babka
2016-04-05  1:54           ` Naoya Horiguchi
2016-04-05  1:54             ` Naoya Horiguchi
2016-04-05  1:54             ` Naoya Horiguchi
2016-04-05  8:20             ` Vlastimil Babka
2016-04-05  8:20               ` Vlastimil Babka
2016-04-05  8:20               ` Vlastimil Babka
2016-04-06  0:54               ` Naoya Horiguchi
2016-04-06  0:54                 ` Naoya Horiguchi
2016-04-06  0:54                 ` Naoya Horiguchi
2016-04-06  7:57                 ` Vlastimil Babka
2016-04-06  7:57                 ` Vlastimil Babka
2016-04-06  7:57                   ` Vlastimil Babka
2016-04-04  4:45       ` Naoya Horiguchi
2016-04-04  5:53   ` Balbir Singh
2016-04-04  5:53   ` Balbir Singh
2016-04-04  5:53     ` Balbir Singh
2016-04-04  6:01     ` Minchan Kim
2016-04-04  6:01       ` Minchan Kim
2016-04-04  6:01       ` Minchan Kim
2016-04-05  3:10       ` Balbir Singh
2016-04-05  3:10       ` Balbir Singh
2016-04-05  3:10         ` Balbir Singh
2016-03-30  7:12 ` [PATCH v3 02/16] mm/compaction: support non-lru movable page migration Minchan Kim
2016-03-30  7:12 ` Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-04-01 21:29   ` Vlastimil Babka
2016-04-01 21:29     ` Vlastimil Babka
2016-04-04  5:12     ` Minchan Kim
2016-04-04  5:12       ` Minchan Kim
2016-04-04 13:24       ` Vlastimil Babka
2016-04-04 13:24         ` Vlastimil Babka
2016-04-04 13:24         ` Vlastimil Babka
2016-04-07  2:35         ` Minchan Kim
2016-04-07  2:35           ` Minchan Kim
2016-04-07  2:35         ` Minchan Kim
2016-04-04 13:24       ` Vlastimil Babka
2016-04-04  5:12     ` Minchan Kim
2016-04-01 21:29   ` Vlastimil Babka
2016-04-12  8:00   ` Chulmin Kim
2016-04-12  8:00     ` Chulmin Kim
2016-04-12 14:25     ` Minchan Kim [this message]
2016-04-12 14:25       ` Minchan Kim
2016-03-30  7:12 ` [PATCH v3 03/16] mm: add non-lru movable page support document Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-04-01 14:38   ` Vlastimil Babka
2016-04-01 14:38   ` Vlastimil Babka
2016-04-01 14:38     ` Vlastimil Babka
2016-04-04  2:25     ` Minchan Kim
2016-04-04  2:25       ` Minchan Kim
2016-04-04  2:25       ` Minchan Kim
2016-04-04 13:09       ` Vlastimil Babka
2016-04-04 13:09         ` Vlastimil Babka
2016-04-04 13:09         ` Vlastimil Babka
2016-04-07  2:27         ` Minchan Kim
2016-04-07  2:27         ` Minchan Kim
2016-04-07  2:27           ` Minchan Kim
2016-03-30  7:12 ` Minchan Kim
2016-03-30  7:12 ` [PATCH v3 04/16] mm/balloon: use general movable page feature into balloon Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-04-05 12:03   ` Vlastimil Babka
2016-04-05 12:03     ` Vlastimil Babka
2016-04-05 12:03     ` Vlastimil Babka
2016-04-11  4:29     ` Minchan Kim
2016-04-11  4:29     ` Minchan Kim
2016-04-11  4:29       ` Minchan Kim
2016-03-30  7:12 ` Minchan Kim
2016-03-30  7:12 ` [PATCH v3 05/16] zsmalloc: keep max_object in size_class Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-04-17 15:08   ` Sergey Senozhatsky
2016-04-17 15:08   ` Sergey Senozhatsky
2016-04-17 15:08     ` Sergey Senozhatsky
2016-03-30  7:12 ` Minchan Kim
2016-03-30  7:12 ` [PATCH v3 06/16] zsmalloc: squeeze inuse into page->mapping Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-04-17 15:08   ` Sergey Senozhatsky
2016-04-17 15:08   ` Sergey Senozhatsky
2016-04-17 15:08     ` Sergey Senozhatsky
2016-04-19  7:40     ` Minchan Kim
2016-04-19  7:40       ` Minchan Kim
2016-04-19  7:40     ` Minchan Kim
2016-03-30  7:12 ` Minchan Kim
2016-03-30  7:12 ` [PATCH v3 07/16] zsmalloc: remove page_mapcount_reset Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-04-17 15:11   ` Sergey Senozhatsky
2016-04-17 15:11   ` Sergey Senozhatsky
2016-04-17 15:11     ` Sergey Senozhatsky
2016-03-30  7:12 ` Minchan Kim
2016-03-30  7:12 ` [PATCH v3 08/16] zsmalloc: squeeze freelist into page->mapping Minchan Kim
2016-03-30  7:12 ` Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-04-17 15:56   ` Sergey Senozhatsky
2016-04-17 15:56   ` Sergey Senozhatsky
2016-04-17 15:56     ` Sergey Senozhatsky
2016-04-19  7:42     ` Minchan Kim
2016-04-19  7:42       ` Minchan Kim
2016-04-19  7:42       ` Minchan Kim
2016-03-30  7:12 ` [PATCH v3 09/16] zsmalloc: move struct zs_meta from mapping to freelist Minchan Kim
2016-03-30  7:12 ` Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-04-17 15:22   ` Sergey Senozhatsky
2016-04-17 15:22   ` Sergey Senozhatsky
2016-04-17 15:22     ` Sergey Senozhatsky
2016-03-30  7:12 ` [PATCH v3 10/16] zsmalloc: factor page chain functionality out Minchan Kim
2016-03-30  7:12 ` Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-04-18  0:33   ` Sergey Senozhatsky
2016-04-18  0:33     ` Sergey Senozhatsky
2016-04-19  7:46     ` Minchan Kim
2016-04-19  7:46       ` Minchan Kim
2016-04-19  7:46     ` Minchan Kim
2016-04-18  0:33   ` Sergey Senozhatsky
2016-03-30  7:12 ` [PATCH v3 11/16] zsmalloc: separate free_zspage from putback_zspage Minchan Kim
2016-03-30  7:12 ` Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-04-18  1:04   ` Sergey Senozhatsky
2016-04-18  1:04     ` Sergey Senozhatsky
2016-04-19  7:51     ` Minchan Kim
2016-04-19  7:51       ` Minchan Kim
2016-04-19  7:53       ` Sergey Senozhatsky
2016-04-19  7:53         ` Sergey Senozhatsky
2016-04-19  7:53         ` Sergey Senozhatsky
2016-04-19  7:51     ` Minchan Kim
2016-04-18  1:04   ` Sergey Senozhatsky
2016-03-30  7:12 ` [PATCH v3 12/16] zsmalloc: zs_compact refactoring Minchan Kim
2016-03-30  7:12 ` Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-04-04  8:04   ` Chulmin Kim
2016-04-04  8:04     ` Chulmin Kim
2016-04-04  9:01     ` Minchan Kim
2016-04-04  9:01       ` Minchan Kim
2016-03-30  7:12 ` [PATCH v3 13/16] zsmalloc: migrate head page of zspage Minchan Kim
2016-03-30  7:12 ` Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-04-06 13:01   ` Chulmin Kim
2016-04-06 13:01     ` Chulmin Kim
2016-04-07  0:34     ` Chulmin Kim
2016-04-07  0:34       ` Chulmin Kim
2016-04-07  0:43     ` Minchan Kim
2016-04-07  0:43       ` Minchan Kim
2016-04-19  6:08   ` Chulmin Kim
2016-04-19  6:08     ` Chulmin Kim
2016-04-19  6:15     ` Minchan Kim
2016-04-19  6:15       ` Minchan Kim
2016-03-30  7:12 ` [PATCH v3 14/16] zsmalloc: use single linked list for page chain Minchan Kim
2016-03-30  7:12 ` Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-03-30  7:12 ` [PATCH v3 15/16] zsmalloc: migrate tail pages in zspage Minchan Kim
2016-03-30  7:12 ` Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-03-30  7:12 ` [PATCH v3 16/16] zram: use __GFP_MOVABLE for memory allocation Minchan Kim
2016-03-30  7:12 ` Minchan Kim
2016-03-30  7:12   ` Minchan Kim
2016-03-30 23:11 ` [PATCH v3 00/16] Support non-lru page migration Andrew Morton
2016-03-30 23:11   ` Andrew Morton
2016-03-30 23:11   ` Andrew Morton
2016-03-31  0:29   ` Sergey Senozhatsky
2016-03-31  0:29   ` Sergey Senozhatsky
2016-03-31  0:29     ` Sergey Senozhatsky
2016-03-31  0:57     ` Minchan Kim
2016-03-31  0:57     ` Minchan Kim
2016-03-31  0:57       ` Minchan Kim
2016-03-31  0:57   ` Minchan Kim
2016-03-31  0:57   ` Minchan Kim
2016-03-31  0:57     ` Minchan Kim
2016-04-04 13:17 ` John Einar Reitan
2016-04-04 13:17 ` John Einar Reitan
2016-04-11  4:35   ` Minchan Kim
2016-04-11  4:35     ` Minchan Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160412142522.GA3265@blaptop \
    --to=minchan@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=cmlaika.kim@samsung.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.