* [PATCH v3 0/9] reduce memory usage by page_owner
@ 2016-06-17  7:57 ` js1304
  0 siblings, 0 replies; 36+ messages in thread
From: js1304 @ 2016-06-17  7:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Hello,

There was a bug reported by Sasha, and a few minor fixes were needed,
so I'm sending v3.

o fix a bug reported by Sasha (mm/compaction: split freepages
without holding the zone lock)
o add a code comment for the todo list (mm/page_owner: use stackdepot
to store stacktrace) per Michal
o add the 'inline' keyword (mm/page_alloc: introduce post allocation
processing on page allocator) per Vlastimil
o add a patch that cleans up code, per Vlastimil

Joonsoo Kim (8):
  mm/compaction: split freepages without holding the zone lock
  mm/page_owner: initialize page owner without holding the zone lock
  mm/page_owner: copy last_migrate_reason in copy_page_owner()
  mm/page_owner: introduce split_page_owner and replace manual handling
  tools/vm/page_owner: increase temporary buffer size
  mm/page_owner: use stackdepot to store stacktrace
  mm/page_alloc: introduce post allocation processing on page allocator
  mm/page_isolation: clean up confused code

Sudip Mukherjee (1):
  mm/page_owner: avoid null pointer dereference

 include/linux/mm.h         |   1 -
 include/linux/page_ext.h   |   4 +-
 include/linux/page_owner.h |  12 ++--
 lib/Kconfig.debug          |   1 +
 mm/compaction.c            |  44 ++++++++----
 mm/internal.h              |   2 +
 mm/page_alloc.c            |  60 +++++------------
 mm/page_isolation.c        |  13 ++--
 mm/page_owner.c            | 163 +++++++++++++++++++++++++++++++++++++--------
 tools/vm/page_owner_sort.c |   9 ++-
 10 files changed, 205 insertions(+), 104 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v3 1/9] mm/compaction: split freepages without holding the zone lock
  2016-06-17  7:57 ` js1304
@ 2016-06-17  7:57   ` js1304
  -1 siblings, 0 replies; 36+ messages in thread
From: js1304 @ 2016-06-17  7:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

We don't need to split freepages while holding the zone lock.  Doing so
causes more contention on the zone lock, which is not desirable.

v3: fix un-isolated case
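
A condensed sketch of the resulting flow (paraphrased from the diff
below, not additional code): only the isolation step now runs under
zone->lock, while splitting and mapping happen later in map_pages():

	/* in isolate_freepages_block(), with zone->lock held */
	order = page_order(page);
	isolated = __isolate_free_page(page, order);
	if (isolated) {
		set_page_private(page, order);	/* remember order for later */
		list_add_tail(&page->lru, freelist);
	}

	/* later, in map_pages(), after the lock has been dropped */
	order = page_private(page);
	set_page_private(page, 0);
	set_page_refcounted(page);
	if (order)
		split_page(page, order);	/* break into order-0 pages */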

Link: http://lkml.kernel.org/r/1464230275-25791-1-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
---
 include/linux/mm.h |  1 -
 mm/compaction.c    | 47 +++++++++++++++++++++++++++++++++--------------
 mm/page_alloc.c    | 27 ---------------------------
 3 files changed, 33 insertions(+), 42 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 558949e..15b3bc9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -562,7 +562,6 @@ void __put_page(struct page *page);
 void put_pages_list(struct list_head *pages);
 
 void split_page(struct page *page, unsigned int order);
-int split_free_page(struct page *page);
 
 /*
  * Compound pages have a destructor function.  Provide a
diff --git a/mm/compaction.c b/mm/compaction.c
index d1d2063..5557421 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -64,13 +64,31 @@ static unsigned long release_freepages(struct list_head *freelist)
 
 static void map_pages(struct list_head *list)
 {
-	struct page *page;
+	unsigned int i, order, nr_pages;
+	struct page *page, *next;
+	LIST_HEAD(tmp_list);
+
+	list_for_each_entry_safe(page, next, list, lru) {
+		list_del(&page->lru);
 
-	list_for_each_entry(page, list, lru) {
-		arch_alloc_page(page, 0);
-		kernel_map_pages(page, 1, 1);
-		kasan_alloc_pages(page, 0);
+		order = page_private(page);
+		nr_pages = 1 << order;
+		set_page_private(page, 0);
+		set_page_refcounted(page);
+
+		arch_alloc_page(page, order);
+		kernel_map_pages(page, nr_pages, 1);
+		kasan_alloc_pages(page, order);
+		if (order)
+			split_page(page, order);
+
+		for (i = 0; i < nr_pages; i++) {
+			list_add(&page->lru, &tmp_list);
+			page++;
+		}
 	}
+
+	list_splice(&tmp_list, list);
 }
 
 static inline bool migrate_async_suitable(int migratetype)
@@ -405,12 +423,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
 	unsigned long flags = 0;
 	bool locked = false;
 	unsigned long blockpfn = *start_pfn;
+	unsigned int order;
 
 	cursor = pfn_to_page(blockpfn);
 
 	/* Isolate free pages. */
 	for (; blockpfn < end_pfn; blockpfn++, cursor++) {
-		int isolated, i;
+		int isolated;
 		struct page *page = cursor;
 
 		/*
@@ -476,16 +495,16 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
 				goto isolate_fail;
 		}
 
-		/* Found a free page, break it into order-0 pages */
-		isolated = split_free_page(page);
-		total_isolated += isolated;
-		for (i = 0; i < isolated; i++) {
-			list_add(&page->lru, freelist);
-			page++;
-		}
+		/* Found a free page, will break it into order-0 pages */
+		order = page_order(page);
+		isolated = __isolate_free_page(page, order);
 
-		/* If a page was split, advance to the end of it */
+		/* If a page was isolated, advance to the end of it */
 		if (isolated) {
+			set_page_private(page, order);
+			total_isolated += isolated;
+			list_add_tail(&page->lru, freelist);
+
 			cc->nr_freepages += isolated;
 			if (!strict &&
 				cc->nr_migratepages <= cc->nr_freepages) {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e08186a..e07f424 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2532,33 +2532,6 @@ int __isolate_free_page(struct page *page, unsigned int order)
 }
 
 /*
- * Similar to split_page except the page is already free. As this is only
- * being used for migration, the migratetype of the block also changes.
- * As this is called with interrupts disabled, the caller is responsible
- * for calling arch_alloc_page() and kernel_map_page() after interrupts
- * are enabled.
- *
- * Note: this is probably too low level an operation for use in drivers.
- * Please consult with lkml before using this in your driver.
- */
-int split_free_page(struct page *page)
-{
-	unsigned int order;
-	int nr_pages;
-
-	order = page_order(page);
-
-	nr_pages = __isolate_free_page(page, order);
-	if (!nr_pages)
-		return 0;
-
-	/* Split into individual pages */
-	set_page_refcounted(page);
-	split_page(page, order);
-	return nr_pages;
-}
-
-/*
  * Update NUMA hit/miss statistics
  *
  * Must be called with interrupts disabled.
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 2/9] mm/page_owner: initialize page owner without holding the zone lock
  2016-06-17  7:57 ` js1304
@ 2016-06-17  7:57   ` js1304
  -1 siblings, 0 replies; 36+ messages in thread
From: js1304 @ 2016-06-17  7:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

It's not necessary to initialize page_owner while holding the zone lock.
Doing so adds contention on the zone lock; although that is not a big
problem since this is just a debug feature, it is still better to avoid
it.  This is also a preparation step for using stackdepot in the page
owner feature: stackdepot allocates new pages when it has no reserved
space left, and holding the zone lock in that case would cause a
deadlock.
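
The hazard being avoided, sketched as a comment-heavy fragment
(illustration only, not code from this series; it mirrors the
mm/page_isolation.c hunk below):

	spin_lock_irqsave(&zone->lock, flags);
	__isolate_free_page(page, order);
	/*
	 * If set_page_owner() ran here and stackdepot had to allocate a new
	 * block of storage, that allocation could try to take zone->lock
	 * again and deadlock.  So set_page_owner() is called only after the
	 * lock has been dropped:
	 */
	spin_unlock_irqrestore(&zone->lock, flags);
	set_page_owner(page, order, __GFP_MOVABLE);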

Link: http://lkml.kernel.org/r/1464230275-25791-2-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
---
 mm/compaction.c     | 3 +++
 mm/page_alloc.c     | 2 --
 mm/page_isolation.c | 9 ++++++---
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 5557421..942d6cd 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -19,6 +19,7 @@
 #include <linux/kasan.h>
 #include <linux/kthread.h>
 #include <linux/freezer.h>
+#include <linux/page_owner.h>
 #include "internal.h"
 
 #ifdef CONFIG_COMPACTION
@@ -79,6 +80,8 @@ static void map_pages(struct list_head *list)
 		arch_alloc_page(page, order);
 		kernel_map_pages(page, nr_pages, 1);
 		kasan_alloc_pages(page, order);
+
+		set_page_owner(page, order, __GFP_MOVABLE);
 		if (order)
 			split_page(page, order);
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e07f424..127128a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2514,8 +2514,6 @@ int __isolate_free_page(struct page *page, unsigned int order)
 	zone->free_area[order].nr_free--;
 	rmv_page_order(page);
 
-	set_page_owner(page, order, __GFP_MOVABLE);
-
 	/* Set the pageblock if the isolated page is at least a pageblock */
 	if (order >= pageblock_order - 1) {
 		struct page *endpage = page + (1 << order) - 1;
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 612122b..927f5ee 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -7,6 +7,7 @@
 #include <linux/pageblock-flags.h>
 #include <linux/memory.h>
 #include <linux/hugetlb.h>
+#include <linux/page_owner.h>
 #include "internal.h"
 
 #define CREATE_TRACE_POINTS
@@ -108,8 +109,6 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 			if (pfn_valid_within(page_to_pfn(buddy)) &&
 			    !is_migrate_isolate_page(buddy)) {
 				__isolate_free_page(page, order);
-				kernel_map_pages(page, (1 << order), 1);
-				set_page_refcounted(page);
 				isolated_page = page;
 			}
 		}
@@ -128,8 +127,12 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 	zone->nr_isolate_pageblock--;
 out:
 	spin_unlock_irqrestore(&zone->lock, flags);
-	if (isolated_page)
+	if (isolated_page) {
+		kernel_map_pages(page, (1 << order), 1);
+		set_page_refcounted(page);
+		set_page_owner(page, order, __GFP_MOVABLE);
 		__free_pages(isolated_page, order);
+	}
 }
 
 static inline struct page *
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 3/9] mm/page_owner: copy last_migrate_reason in copy_page_owner()
  2016-06-17  7:57 ` js1304
@ 2016-06-17  7:57   ` js1304
  -1 siblings, 0 replies; 36+ messages in thread
From: js1304 @ 2016-06-17  7:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Currently, copy_page_owner() doesn't copy all of the owner information.
It skips last_migrate_reason because copy_page_owner() is used for
migration and the field will be set properly soon afterwards.  But a
following patch will use copy_page_owner() in a context where this skip
leaves the allocated page with an uninitialized last_migrate_reason.  To
prevent that, this patch also copies last_migrate_reason in
copy_page_owner().

Link: http://lkml.kernel.org/r/1464230275-25791-3-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
---
 mm/page_owner.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/page_owner.c b/mm/page_owner.c
index c6cda3e..73e202f 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -118,6 +118,7 @@ void __copy_page_owner(struct page *oldpage, struct page *newpage)
 
 	new_ext->order = old_ext->order;
 	new_ext->gfp_mask = old_ext->gfp_mask;
+	new_ext->last_migrate_reason = old_ext->last_migrate_reason;
 	new_ext->nr_entries = old_ext->nr_entries;
 
 	for (i = 0; i < ARRAY_SIZE(new_ext->trace_entries); i++)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 4/9] mm/page_owner: introduce split_page_owner and replace manual handling
  2016-06-17  7:57 ` js1304
@ 2016-06-17  7:57   ` js1304
  -1 siblings, 0 replies; 36+ messages in thread
From: js1304 @ 2016-06-17  7:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

split_page() calls set_page_owner() to set up page_owner for each page.
But this has the drawback that the head page and the other pages end up
with different stacktraces, because the callsite of set_page_owner() is
slightly different for each.  To avoid this problem, this patch copies
the head page's page_owner to the other pages.  It requires introducing
a new function, split_page_owner(), but it also removes another
function, get_page_owner_gfp(), so it looks like a good trade.
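
An illustration of the user-visible effect (not part of the patch),
assuming a caller that allocates and splits an order-2 page:

	page = alloc_pages(GFP_KERNEL, 2);
	split_page(page, 2);
	/*
	 * With split_page_owner(), all four resulting order-0 pages report
	 * the caller's original allocation stacktrace in
	 * /sys/kernel/debug/page_owner, instead of the pages showing
	 * slightly different traces captured inside split_page().
	 */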

Link: http://lkml.kernel.org/r/1464230275-25791-4-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
---
 include/linux/page_owner.h | 12 +++++-------
 mm/page_alloc.c            |  8 ++------
 mm/page_owner.c            | 14 +++++++-------
 3 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/include/linux/page_owner.h b/include/linux/page_owner.h
index 46f1b93..30583ab 100644
--- a/include/linux/page_owner.h
+++ b/include/linux/page_owner.h
@@ -10,7 +10,7 @@ extern struct page_ext_operations page_owner_ops;
 extern void __reset_page_owner(struct page *page, unsigned int order);
 extern void __set_page_owner(struct page *page,
 			unsigned int order, gfp_t gfp_mask);
-extern gfp_t __get_page_owner_gfp(struct page *page);
+extern void __split_page_owner(struct page *page, unsigned int order);
 extern void __copy_page_owner(struct page *oldpage, struct page *newpage);
 extern void __set_page_owner_migrate_reason(struct page *page, int reason);
 extern void __dump_page_owner(struct page *page);
@@ -28,12 +28,10 @@ static inline void set_page_owner(struct page *page,
 		__set_page_owner(page, order, gfp_mask);
 }
 
-static inline gfp_t get_page_owner_gfp(struct page *page)
+static inline void split_page_owner(struct page *page, unsigned int order)
 {
 	if (static_branch_unlikely(&page_owner_inited))
-		return __get_page_owner_gfp(page);
-	else
-		return 0;
+		__split_page_owner(page, order);
 }
 static inline void copy_page_owner(struct page *oldpage, struct page *newpage)
 {
@@ -58,9 +56,9 @@ static inline void set_page_owner(struct page *page,
 			unsigned int order, gfp_t gfp_mask)
 {
 }
-static inline gfp_t get_page_owner_gfp(struct page *page)
+static inline void split_page_owner(struct page *page,
+			unsigned int order)
 {
-	return 0;
 }
 static inline void copy_page_owner(struct page *oldpage, struct page *newpage)
 {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 127128a..e3085eb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2466,7 +2466,6 @@ void free_hot_cold_page_list(struct list_head *list, bool cold)
 void split_page(struct page *page, unsigned int order)
 {
 	int i;
-	gfp_t gfp_mask;
 
 	VM_BUG_ON_PAGE(PageCompound(page), page);
 	VM_BUG_ON_PAGE(!page_count(page), page);
@@ -2480,12 +2479,9 @@ void split_page(struct page *page, unsigned int order)
 		split_page(virt_to_page(page[0].shadow), order);
 #endif
 
-	gfp_mask = get_page_owner_gfp(page);
-	set_page_owner(page, 0, gfp_mask);
-	for (i = 1; i < (1 << order); i++) {
+	for (i = 1; i < (1 << order); i++)
 		set_page_refcounted(page + i);
-		set_page_owner(page + i, 0, gfp_mask);
-	}
+	split_page_owner(page, order);
 }
 EXPORT_SYMBOL_GPL(split_page);
 
diff --git a/mm/page_owner.c b/mm/page_owner.c
index 73e202f..499ad26 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -94,17 +94,17 @@ void __set_page_owner_migrate_reason(struct page *page, int reason)
 	page_ext->last_migrate_reason = reason;
 }
 
-gfp_t __get_page_owner_gfp(struct page *page)
+void __split_page_owner(struct page *page, unsigned int order)
 {
+	int i;
 	struct page_ext *page_ext = lookup_page_ext(page);
+
 	if (unlikely(!page_ext))
-		/*
-		 * The caller just returns 0 if no valid gfp
-		 * So return 0 here too.
-		 */
-		return 0;
+		return;
 
-	return page_ext->gfp_mask;
+	page_ext->order = 0;
+	for (i = 1; i < (1 << order); i++)
+		__copy_page_owner(page, page + i);
 }
 
 void __copy_page_owner(struct page *oldpage, struct page *newpage)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 5/9] tools/vm/page_owner: increase temporary buffer size
  2016-06-17  7:57 ` js1304
@ 2016-06-17  7:57   ` js1304
  -1 siblings, 0 replies; 36+ messages in thread
From: js1304 @ 2016-06-17  7:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Page owner will be changed to store a deeper stacktrace, so the current
temporary buffer size isn't enough.  Increase it.

Link: http://lkml.kernel.org/r/1464230275-25791-5-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
---
 tools/vm/page_owner_sort.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/vm/page_owner_sort.c b/tools/vm/page_owner_sort.c
index 77147b4..f1c055f 100644
--- a/tools/vm/page_owner_sort.c
+++ b/tools/vm/page_owner_sort.c
@@ -79,12 +79,12 @@ static void add_list(char *buf, int len)
 	}
 }
 
-#define BUF_SIZE	1024
+#define BUF_SIZE	(128 * 1024)
 
 int main(int argc, char **argv)
 {
 	FILE *fin, *fout;
-	char buf[BUF_SIZE];
+	char *buf;
 	int ret, i, count;
 	struct block_list *list2;
 	struct stat st;
@@ -107,6 +107,11 @@ int main(int argc, char **argv)
 	max_size = st.st_size / 100; /* hack ... */
 
 	list = malloc(max_size * sizeof(*list));
+	buf = malloc(BUF_SIZE);
+	if (!list || !buf) {
+		printf("Out of memory\n");
+		exit(1);
+	}
 
 	for ( ; ; ) {
 		ret = read_block(buf, BUF_SIZE, fin);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 6/9] mm/page_owner: use stackdepot to store stacktrace
  2016-06-17  7:57 ` js1304
@ 2016-06-17  7:57   ` js1304
  -1 siblings, 0 replies; 36+ messages in thread
From: js1304 @ 2016-06-17  7:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Currently, we store each page's allocation stacktrace in the
corresponding page_ext structure, and that requires a lot of memory.
As a result, a memory-tight system doesn't work well when page_owner is
enabled.  Moreover, even with this large memory consumption, we cannot
get a full stacktrace, because the storage is allocated at boot time and
only 8 stacktrace slots are kept per page to balance memory consumption.
We could increase that number, but it would make the system unusable or
change its behaviour.

To solve the problem, this patch uses stackdepot to store the
stacktraces.  This obviously saves memory, but it has the drawback that
stackdepot can fail.

stackdepot allocates memory at runtime, so it can fail if the system
does not have enough memory.  But most allocation stacks are generated
very early, when there is still plenty of memory, so such a failure
would not happen easily.  And a single failure only means that we miss
one page's allocation stacktrace, which is not a big problem.  In this
patch, when a memory allocation failure happens, we store a special
stacktrace handle for the page whose stacktrace could not be saved.
With it, the user can still account for memory usage properly even when
a failure happens.

The memory saving looks as follows (4GB memory system with page_owner
enabled; before the patch -> after the patch):

static allocation:
92274688 bytes -> 25165824 bytes

dynamic allocation after boot + kernel build:
0 bytes -> 327680 bytes

total:
92274688 bytes -> 25493504 bytes

72% reduction in total.
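
As a sanity check (not from the original changelog, and assuming a
64-bit build), the figures are consistent: 4GB / 4KB gives 1048576
pages, and the static numbers correspond to struct page_ext shrinking
from 88 bytes to 24 bytes per page (1048576 * 88 = 92274688,
1048576 * 24 = 25165824); overall, (92274688 - 25493504) / 92274688 is
roughly 72%.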

Note that the implementation looks more complex than one might imagine
because of a recursion issue.  stackdepot uses the page allocator, and
page_owner is called at page allocation.  Using stackdepot in page_owner
can therefore re-enter the page allocator and then page_owner again.
That is a recursion.  To detect and avoid it, whenever we obtain a
stacktrace, recursion is checked, and if it is found, page_owner is set
to dummy information.  The dummy information means that the page was
allocated for the page_owner feature itself (such as by stackdepot),
which is understandable behaviour for the user.
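
A sketch of the recursion that check_recursive_alloc() in the diff
below guards against (illustration only):

	/*
	 *   alloc_pages()
	 *     set_page_owner()
	 *       save_stack()
	 *         depot_save_stack()   <- needs a new storage page
	 *           alloc_pages()      <- re-enters the page allocator
	 *             set_page_owner()
	 *               save_stack()   <- the call site of save_stack() now
	 *                                 appears twice in the captured
	 *                                 trace, so the dummy handle is used
	 *                                 and the loop is broken
	 */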

Link: http://lkml.kernel.org/r/1464230275-25791-6-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hugh Dickins <hughd@google.com>
---
 include/linux/page_ext.h |   4 +-
 lib/Kconfig.debug        |   1 +
 mm/page_owner.c          | 142 ++++++++++++++++++++++++++++++++++++++++-------
 3 files changed, 126 insertions(+), 21 deletions(-)

diff --git a/include/linux/page_ext.h b/include/linux/page_ext.h
index e1fe7cf..03f2a3e 100644
--- a/include/linux/page_ext.h
+++ b/include/linux/page_ext.h
@@ -3,6 +3,7 @@
 
 #include <linux/types.h>
 #include <linux/stacktrace.h>
+#include <linux/stackdepot.h>
 
 struct pglist_data;
 struct page_ext_operations {
@@ -44,9 +45,8 @@ struct page_ext {
 #ifdef CONFIG_PAGE_OWNER
 	unsigned int order;
 	gfp_t gfp_mask;
-	unsigned int nr_entries;
 	int last_migrate_reason;
-	unsigned long trace_entries[8];
+	depot_stack_handle_t handle;
 #endif
 };
 
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 5a7f978..d40833b 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -248,6 +248,7 @@ config PAGE_OWNER
 	depends on DEBUG_KERNEL && STACKTRACE_SUPPORT
 	select DEBUG_FS
 	select STACKTRACE
+	select STACKDEPOT
 	select PAGE_EXTENSION
 	help
 	  This keeps track of what call chain is the owner of a page, may
diff --git a/mm/page_owner.c b/mm/page_owner.c
index 499ad26..dc92241 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -7,11 +7,22 @@
 #include <linux/page_owner.h>
 #include <linux/jump_label.h>
 #include <linux/migrate.h>
+#include <linux/stackdepot.h>
+
 #include "internal.h"
 
+/*
+ * TODO: teach PAGE_OWNER_STACK_DEPTH (__dump_page_owner and save_stack)
+ * to use off stack temporal storage
+ */
+#define PAGE_OWNER_STACK_DEPTH (16)
+
 static bool page_owner_disabled = true;
 DEFINE_STATIC_KEY_FALSE(page_owner_inited);
 
+static depot_stack_handle_t dummy_handle;
+static depot_stack_handle_t failure_handle;
+
 static void init_early_allocated_pages(void);
 
 static int early_page_owner_param(char *buf)
@@ -34,11 +45,41 @@ static bool need_page_owner(void)
 	return true;
 }
 
+static noinline void register_dummy_stack(void)
+{
+	unsigned long entries[4];
+	struct stack_trace dummy;
+
+	dummy.nr_entries = 0;
+	dummy.max_entries = ARRAY_SIZE(entries);
+	dummy.entries = &entries[0];
+	dummy.skip = 0;
+
+	save_stack_trace(&dummy);
+	dummy_handle = depot_save_stack(&dummy, GFP_KERNEL);
+}
+
+static noinline void register_failure_stack(void)
+{
+	unsigned long entries[4];
+	struct stack_trace failure;
+
+	failure.nr_entries = 0;
+	failure.max_entries = ARRAY_SIZE(entries);
+	failure.entries = &entries[0];
+	failure.skip = 0;
+
+	save_stack_trace(&failure);
+	failure_handle = depot_save_stack(&failure, GFP_KERNEL);
+}
+
 static void init_page_owner(void)
 {
 	if (page_owner_disabled)
 		return;
 
+	register_dummy_stack();
+	register_failure_stack();
 	static_branch_enable(&page_owner_inited);
 	init_early_allocated_pages();
 }
@@ -61,25 +102,66 @@ void __reset_page_owner(struct page *page, unsigned int order)
 	}
 }
 
-void __set_page_owner(struct page *page, unsigned int order, gfp_t gfp_mask)
+static inline bool check_recursive_alloc(struct stack_trace *trace,
+					unsigned long ip)
 {
-	struct page_ext *page_ext = lookup_page_ext(page);
+	int i, count;
+
+	if (!trace->nr_entries)
+		return false;
+
+	for (i = 0, count = 0; i < trace->nr_entries; i++) {
+		if (trace->entries[i] == ip && ++count == 2)
+			return true;
+	}
 
+	return false;
+}
+
+static noinline depot_stack_handle_t save_stack(gfp_t flags)
+{
+	unsigned long entries[PAGE_OWNER_STACK_DEPTH];
 	struct stack_trace trace = {
 		.nr_entries = 0,
-		.max_entries = ARRAY_SIZE(page_ext->trace_entries),
-		.entries = &page_ext->trace_entries[0],
-		.skip = 3,
+		.entries = entries,
+		.max_entries = PAGE_OWNER_STACK_DEPTH,
+		.skip = 0
 	};
+	depot_stack_handle_t handle;
+
+	save_stack_trace(&trace);
+	if (trace.nr_entries != 0 &&
+	    trace.entries[trace.nr_entries-1] == ULONG_MAX)
+		trace.nr_entries--;
+
+	/*
+	 * We need to check recursion here because our request to stackdepot
+	 * could trigger memory allocation to save new entry. New memory
+	 * allocation would reach here and call depot_save_stack() again
+	 * if we don't catch it. There is still not enough memory in stackdepot
+	 * so it would try to allocate memory again and loop forever.
+	 */
+	if (check_recursive_alloc(&trace, _RET_IP_))
+		return dummy_handle;
+
+	handle = depot_save_stack(&trace, flags);
+	if (!handle)
+		handle = failure_handle;
+
+	return handle;
+}
+
+noinline void __set_page_owner(struct page *page, unsigned int order,
+					gfp_t gfp_mask)
+{
+	struct page_ext *page_ext = lookup_page_ext(page);
 
 	if (unlikely(!page_ext))
 		return;
 
-	save_stack_trace(&trace);
-
+	page_ext->handle = save_stack(gfp_mask);
 	page_ext->order = order;
 	page_ext->gfp_mask = gfp_mask;
-	page_ext->nr_entries = trace.nr_entries;
 	page_ext->last_migrate_reason = -1;
 
 	__set_bit(PAGE_EXT_OWNER, &page_ext->flags);
@@ -111,7 +193,6 @@ void __copy_page_owner(struct page *oldpage, struct page *newpage)
 {
 	struct page_ext *old_ext = lookup_page_ext(oldpage);
 	struct page_ext *new_ext = lookup_page_ext(newpage);
-	int i;
 
 	if (unlikely(!old_ext || !new_ext))
 		return;
@@ -119,10 +200,7 @@ void __copy_page_owner(struct page *oldpage, struct page *newpage)
 	new_ext->order = old_ext->order;
 	new_ext->gfp_mask = old_ext->gfp_mask;
 	new_ext->last_migrate_reason = old_ext->last_migrate_reason;
-	new_ext->nr_entries = old_ext->nr_entries;
-
-	for (i = 0; i < ARRAY_SIZE(new_ext->trace_entries); i++)
-		new_ext->trace_entries[i] = old_ext->trace_entries[i];
+	new_ext->handle = old_ext->handle;
 
 	/*
 	 * We don't clear the bit on the oldpage as it's going to be freed
@@ -138,14 +216,18 @@ void __copy_page_owner(struct page *oldpage, struct page *newpage)
 
 static ssize_t
 print_page_owner(char __user *buf, size_t count, unsigned long pfn,
-		struct page *page, struct page_ext *page_ext)
+		struct page *page, struct page_ext *page_ext,
+		depot_stack_handle_t handle)
 {
 	int ret;
 	int pageblock_mt, page_mt;
 	char *kbuf;
+	unsigned long entries[PAGE_OWNER_STACK_DEPTH];
 	struct stack_trace trace = {
-		.nr_entries = page_ext->nr_entries,
-		.entries = &page_ext->trace_entries[0],
+		.nr_entries = 0,
+		.entries = entries,
+		.max_entries = PAGE_OWNER_STACK_DEPTH,
+		.skip = 0
 	};
 
 	kbuf = kmalloc(count, GFP_KERNEL);
@@ -174,6 +256,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 	if (ret >= count)
 		goto err;
 
+	depot_fetch_stack(handle, &trace);
 	ret += snprint_stack_trace(kbuf + ret, count - ret, &trace, 0);
 	if (ret >= count)
 		goto err;
@@ -204,10 +287,14 @@ err:
 void __dump_page_owner(struct page *page)
 {
 	struct page_ext *page_ext = lookup_page_ext(page);
+	unsigned long entries[PAGE_OWNER_STACK_DEPTH];
 	struct stack_trace trace = {
-		.nr_entries = page_ext->nr_entries,
-		.entries = &page_ext->trace_entries[0],
+		.nr_entries = 0,
+		.entries = entries,
+		.max_entries = PAGE_OWNER_STACK_DEPTH,
+		.skip = 0
 	};
+	depot_stack_handle_t handle;
 	gfp_t gfp_mask = page_ext->gfp_mask;
 	int mt = gfpflags_to_migratetype(gfp_mask);
 
@@ -221,6 +308,13 @@ void __dump_page_owner(struct page *page)
 		return;
 	}
 
+	handle = READ_ONCE(page_ext->handle);
+	if (!handle) {
+		pr_alert("page_owner info is not active (free page?)\n");
+		return;
+	}
+
+	depot_fetch_stack(handle, &trace);
 	pr_alert("page allocated via order %u, migratetype %s, gfp_mask %#x(%pGg)\n",
 		 page_ext->order, migratetype_names[mt], gfp_mask, &gfp_mask);
 	print_stack_trace(&trace, 0);
@@ -236,6 +330,7 @@ read_page_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 	unsigned long pfn;
 	struct page *page;
 	struct page_ext *page_ext;
+	depot_stack_handle_t handle;
 
 	if (!static_branch_unlikely(&page_owner_inited))
 		return -EINVAL;
@@ -284,10 +379,19 @@ read_page_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 		if (!test_bit(PAGE_EXT_OWNER, &page_ext->flags))
 			continue;
 
+		/*
+		 * Access to page_ext->handle isn't synchronous so we should
+		 * be careful to access it.
+		 */
+		handle = READ_ONCE(page_ext->handle);
+		if (!handle)
+			continue;
+
 		/* Record the next PFN to read in the file offset */
 		*ppos = (pfn - min_low_pfn) + 1;
 
-		return print_page_owner(buf, count, pfn, page, page_ext);
+		return print_page_owner(buf, count, pfn, page,
+				page_ext, handle);
 	}
 
 	return 0;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

 
@@ -221,6 +308,13 @@ void __dump_page_owner(struct page *page)
 		return;
 	}
 
+	handle = READ_ONCE(page_ext->handle);
+	if (!handle) {
+		pr_alert("page_owner info is not active (free page?)\n");
+		return;
+	}
+
+	depot_fetch_stack(handle, &trace);
 	pr_alert("page allocated via order %u, migratetype %s, gfp_mask %#x(%pGg)\n",
 		 page_ext->order, migratetype_names[mt], gfp_mask, &gfp_mask);
 	print_stack_trace(&trace, 0);
@@ -236,6 +330,7 @@ read_page_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 	unsigned long pfn;
 	struct page *page;
 	struct page_ext *page_ext;
+	depot_stack_handle_t handle;
 
 	if (!static_branch_unlikely(&page_owner_inited))
 		return -EINVAL;
@@ -284,10 +379,19 @@ read_page_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 		if (!test_bit(PAGE_EXT_OWNER, &page_ext->flags))
 			continue;
 
+		/*
+		 * Access to page_ext->handle isn't synchronous so we should
+		 * be careful to access it.
+		 */
+		handle = READ_ONCE(page_ext->handle);
+		if (!handle)
+			continue;
+
 		/* Record the next PFN to read in the file offset */
 		*ppos = (pfn - min_low_pfn) + 1;
 
-		return print_page_owner(buf, count, pfn, page, page_ext);
+		return print_page_owner(buf, count, pfn, page,
+				page_ext, handle);
 	}
 
 	return 0;
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 7/9] mm/page_owner: avoid null pointer dereference
  2016-06-17  7:57 ` js1304
@ 2016-06-17  7:57   ` js1304
  -1 siblings, 0 replies; 36+ messages in thread
From: js1304 @ 2016-06-17  7:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Sudip Mukherjee, Sudip Mukherjee, Joonsoo Kim

From: Sudip Mukherjee <sudipm.mukherjee@gmail.com>

We have dereferenced page_ext before checking it. Let's check it first
and then use it.

Link: http://lkml.kernel.org/r/1465249059-7883-1-git-send-email-sudipm.mukherjee@gmail.com
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/page_owner.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/page_owner.c b/mm/page_owner.c
index dc92241..ec6dc18 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -295,13 +295,15 @@ void __dump_page_owner(struct page *page)
 		.skip = 0
 	};
 	depot_stack_handle_t handle;
-	gfp_t gfp_mask = page_ext->gfp_mask;
-	int mt = gfpflags_to_migratetype(gfp_mask);
+	gfp_t gfp_mask;
+	int mt;
 
 	if (unlikely(!page_ext)) {
 		pr_alert("There is not page extension available.\n");
 		return;
 	}
+	gfp_mask = page_ext->gfp_mask;
+	mt = gfpflags_to_migratetype(gfp_mask);
 
 	if (!test_bit(PAGE_EXT_OWNER, &page_ext->flags)) {
 		pr_alert("page_owner info is not active (free page?)\n");
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 7/9] mm/page_owner: avoid null pointer dereference
@ 2016-06-17  7:57   ` js1304
  0 siblings, 0 replies; 36+ messages in thread
From: js1304 @ 2016-06-17  7:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Sudip Mukherjee, Sudip Mukherjee, Joonsoo Kim

From: Sudip Mukherjee <sudipm.mukherjee@gmail.com>

We have dereferenced page_ext before checking it. Let's check it first
and then use it.

Link: http://lkml.kernel.org/r/1465249059-7883-1-git-send-email-sudipm.mukherjee@gmail.com
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/page_owner.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/page_owner.c b/mm/page_owner.c
index dc92241..ec6dc18 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -295,13 +295,15 @@ void __dump_page_owner(struct page *page)
 		.skip = 0
 	};
 	depot_stack_handle_t handle;
-	gfp_t gfp_mask = page_ext->gfp_mask;
-	int mt = gfpflags_to_migratetype(gfp_mask);
+	gfp_t gfp_mask;
+	int mt;
 
 	if (unlikely(!page_ext)) {
 		pr_alert("There is not page extension available.\n");
 		return;
 	}
+	gfp_mask = page_ext->gfp_mask;
+	mt = gfpflags_to_migratetype(gfp_mask);
 
 	if (!test_bit(PAGE_EXT_OWNER, &page_ext->flags)) {
 		pr_alert("page_owner info is not active (free page?)\n");
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 8/9] mm/page_alloc: introduce post allocation processing on page allocator
  2016-06-17  7:57 ` js1304
@ 2016-06-17  7:57   ` js1304
  -1 siblings, 0 replies; 36+ messages in thread
From: js1304 @ 2016-06-17  7:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

This patch is motivated by Hugh's and Vlastimil's concern [1].

There are two ways to get a freepage from the allocator.  One is using
the normal memory allocation API and the other is __isolate_free_page(),
which is used internally for compaction and pageblock isolation.  The
latter is rather tricky since it doesn't do the whole post allocation
processing that the normal API does.

One problem I already know of is that a poisoned page would not be
checked if it is allocated by __isolate_free_page().  Perhaps there are
more.

We could add more debug logic for allocated pages in the future and this
separation would cause more problems.  I'd like to fix this situation
now.  The solution is simple: this patch factors the common logic for
newly allocated pages into post_alloc_hook() and uses it at all sites.
This solves the problem.

[1] http://marc.info/?i=alpine.LSU.2.11.1604270029350.7066%40eggly.anvils%3E

Link: http://lkml.kernel.org/r/1464230275-25791-7-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
---
 mm/compaction.c     |  8 +-------
 mm/internal.h       |  2 ++
 mm/page_alloc.c     | 23 ++++++++++++++---------
 mm/page_isolation.c |  4 +---
 4 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 942d6cd..199b486 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -74,14 +74,8 @@ static void map_pages(struct list_head *list)
 
 		order = page_private(page);
 		nr_pages = 1 << order;
-		set_page_private(page, 0);
-		set_page_refcounted(page);
 
-		arch_alloc_page(page, order);
-		kernel_map_pages(page, nr_pages, 1);
-		kasan_alloc_pages(page, order);
-
-		set_page_owner(page, order, __GFP_MOVABLE);
+		post_alloc_hook(page, order, __GFP_MOVABLE);
 		if (order)
 			split_page(page, order);
 
diff --git a/mm/internal.h b/mm/internal.h
index 11e8ea2..aa04f67 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -151,6 +151,8 @@ extern int __isolate_free_page(struct page *page, unsigned int order);
 extern void __free_pages_bootmem(struct page *page, unsigned long pfn,
 					unsigned int order);
 extern void prep_compound_page(struct page *page, unsigned int order);
+extern void post_alloc_hook(struct page *page, unsigned int order,
+					gfp_t gfp_flags);
 extern int user_min_free_kbytes;
 
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e3085eb..eeb3516 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1729,6 +1729,19 @@ static bool check_new_pages(struct page *page, unsigned int order)
 	return false;
 }
 
+inline void post_alloc_hook(struct page *page, unsigned int order,
+				gfp_t gfp_flags)
+{
+	set_page_private(page, 0);
+	set_page_refcounted(page);
+
+	arch_alloc_page(page, order);
+	kernel_map_pages(page, 1 << order, 1);
+	kernel_poison_pages(page, 1 << order, 1);
+	kasan_alloc_pages(page, order);
+	set_page_owner(page, order, gfp_flags);
+}
+
 static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags,
 							unsigned int alloc_flags)
 {
@@ -1741,13 +1754,7 @@ static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags
 			poisoned &= page_is_poisoned(p);
 	}
 
-	set_page_private(page, 0);
-	set_page_refcounted(page);
-
-	arch_alloc_page(page, order);
-	kernel_map_pages(page, 1 << order, 1);
-	kernel_poison_pages(page, 1 << order, 1);
-	kasan_alloc_pages(page, order);
+	post_alloc_hook(page, order, gfp_flags);
 
 	if (!free_pages_prezeroed(poisoned) && (gfp_flags & __GFP_ZERO))
 		for (i = 0; i < (1 << order); i++)
@@ -1756,8 +1763,6 @@ static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags
 	if (order && (gfp_flags & __GFP_COMP))
 		prep_compound_page(page, order);
 
-	set_page_owner(page, order, gfp_flags);
-
 	/*
 	 * page is set pfmemalloc when ALLOC_NO_WATERMARKS was necessary to
 	 * allocate the page. The expectation is that the caller is taking
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 927f5ee..4639163 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -128,9 +128,7 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 out:
 	spin_unlock_irqrestore(&zone->lock, flags);
 	if (isolated_page) {
-		kernel_map_pages(page, (1 << order), 1);
-		set_page_refcounted(page);
-		set_page_owner(page, order, __GFP_MOVABLE);
+		post_alloc_hook(page, order, __GFP_MOVABLE);
 		__free_pages(isolated_page, order);
 	}
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 8/9] mm/page_alloc: introduce post allocation processing on page allocator
@ 2016-06-17  7:57   ` js1304
  0 siblings, 0 replies; 36+ messages in thread
From: js1304 @ 2016-06-17  7:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

This patch is motivated by Hugh's and Vlastimil's concern [1].

There are two ways to get a freepage from the allocator.  One is using
the normal memory allocation API and the other is __isolate_free_page(),
which is used internally for compaction and pageblock isolation.  The
latter is rather tricky since it doesn't do the whole post allocation
processing that the normal API does.

One problem I already know of is that a poisoned page would not be
checked if it is allocated by __isolate_free_page().  Perhaps there are
more.

We could add more debug logic for allocated pages in the future and this
separation would cause more problems.  I'd like to fix this situation
now.  The solution is simple: this patch factors the common logic for
newly allocated pages into post_alloc_hook() and uses it at all sites.
This solves the problem.

[1] http://marc.info/?i=alpine.LSU.2.11.1604270029350.7066%40eggly.anvils%3E

Link: http://lkml.kernel.org/r/1464230275-25791-7-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
---
 mm/compaction.c     |  8 +-------
 mm/internal.h       |  2 ++
 mm/page_alloc.c     | 23 ++++++++++++++---------
 mm/page_isolation.c |  4 +---
 4 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 942d6cd..199b486 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -74,14 +74,8 @@ static void map_pages(struct list_head *list)
 
 		order = page_private(page);
 		nr_pages = 1 << order;
-		set_page_private(page, 0);
-		set_page_refcounted(page);
 
-		arch_alloc_page(page, order);
-		kernel_map_pages(page, nr_pages, 1);
-		kasan_alloc_pages(page, order);
-
-		set_page_owner(page, order, __GFP_MOVABLE);
+		post_alloc_hook(page, order, __GFP_MOVABLE);
 		if (order)
 			split_page(page, order);
 
diff --git a/mm/internal.h b/mm/internal.h
index 11e8ea2..aa04f67 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -151,6 +151,8 @@ extern int __isolate_free_page(struct page *page, unsigned int order);
 extern void __free_pages_bootmem(struct page *page, unsigned long pfn,
 					unsigned int order);
 extern void prep_compound_page(struct page *page, unsigned int order);
+extern void post_alloc_hook(struct page *page, unsigned int order,
+					gfp_t gfp_flags);
 extern int user_min_free_kbytes;
 
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e3085eb..eeb3516 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1729,6 +1729,19 @@ static bool check_new_pages(struct page *page, unsigned int order)
 	return false;
 }
 
+inline void post_alloc_hook(struct page *page, unsigned int order,
+				gfp_t gfp_flags)
+{
+	set_page_private(page, 0);
+	set_page_refcounted(page);
+
+	arch_alloc_page(page, order);
+	kernel_map_pages(page, 1 << order, 1);
+	kernel_poison_pages(page, 1 << order, 1);
+	kasan_alloc_pages(page, order);
+	set_page_owner(page, order, gfp_flags);
+}
+
 static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags,
 							unsigned int alloc_flags)
 {
@@ -1741,13 +1754,7 @@ static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags
 			poisoned &= page_is_poisoned(p);
 	}
 
-	set_page_private(page, 0);
-	set_page_refcounted(page);
-
-	arch_alloc_page(page, order);
-	kernel_map_pages(page, 1 << order, 1);
-	kernel_poison_pages(page, 1 << order, 1);
-	kasan_alloc_pages(page, order);
+	post_alloc_hook(page, order, gfp_flags);
 
 	if (!free_pages_prezeroed(poisoned) && (gfp_flags & __GFP_ZERO))
 		for (i = 0; i < (1 << order); i++)
@@ -1756,8 +1763,6 @@ static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags
 	if (order && (gfp_flags & __GFP_COMP))
 		prep_compound_page(page, order);
 
-	set_page_owner(page, order, gfp_flags);
-
 	/*
 	 * page is set pfmemalloc when ALLOC_NO_WATERMARKS was necessary to
 	 * allocate the page. The expectation is that the caller is taking
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 927f5ee..4639163 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -128,9 +128,7 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 out:
 	spin_unlock_irqrestore(&zone->lock, flags);
 	if (isolated_page) {
-		kernel_map_pages(page, (1 << order), 1);
-		set_page_refcounted(page);
-		set_page_owner(page, order, __GFP_MOVABLE);
+		post_alloc_hook(page, order, __GFP_MOVABLE);
 		__free_pages(isolated_page, order);
 	}
 }
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 9/9] mm/page_isolation: clean up confused code
  2016-06-17  7:57 ` js1304
@ 2016-06-17  7:57   ` js1304
  -1 siblings, 0 replies; 36+ messages in thread
From: js1304 @ 2016-06-17  7:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

When there is an isolated_page, post_alloc_hook() is called with
page but __free_pages() is called with isolated_page. Since they are
the same, there is no problem, but it's very confusing. To reduce the
confusion, this patch changes isolated_page to a boolean and uses the
page variable consistently.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/page_isolation.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 4639163..064b7fb 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -81,7 +81,7 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 {
 	struct zone *zone;
 	unsigned long flags, nr_pages;
-	struct page *isolated_page = NULL;
+	bool isolated_page = false;
 	unsigned int order;
 	unsigned long page_idx, buddy_idx;
 	struct page *buddy;
@@ -109,7 +109,7 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 			if (pfn_valid_within(page_to_pfn(buddy)) &&
 			    !is_migrate_isolate_page(buddy)) {
 				__isolate_free_page(page, order);
-				isolated_page = page;
+				isolated_page = true;
 			}
 		}
 	}
@@ -129,7 +129,7 @@ out:
 	spin_unlock_irqrestore(&zone->lock, flags);
 	if (isolated_page) {
 		post_alloc_hook(page, order, __GFP_MOVABLE);
-		__free_pages(isolated_page, order);
+		__free_pages(page, order);
 	}
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v3 9/9] mm/page_isolation: clean up confused code
@ 2016-06-17  7:57   ` js1304
  0 siblings, 0 replies; 36+ messages in thread
From: js1304 @ 2016-06-17  7:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

When there is an isolated_page, post_alloc_hook() is called with
page but __free_pages() is called with isolated_page. Since they are
the same, there is no problem, but it's very confusing. To reduce the
confusion, this patch changes isolated_page to a boolean and uses the
page variable consistently.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/page_isolation.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 4639163..064b7fb 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -81,7 +81,7 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 {
 	struct zone *zone;
 	unsigned long flags, nr_pages;
-	struct page *isolated_page = NULL;
+	bool isolated_page = false;
 	unsigned int order;
 	unsigned long page_idx, buddy_idx;
 	struct page *buddy;
@@ -109,7 +109,7 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 			if (pfn_valid_within(page_to_pfn(buddy)) &&
 			    !is_migrate_isolate_page(buddy)) {
 				__isolate_free_page(page, order);
-				isolated_page = page;
+				isolated_page = true;
 			}
 		}
 	}
@@ -129,7 +129,7 @@ out:
 	spin_unlock_irqrestore(&zone->lock, flags);
 	if (isolated_page) {
 		post_alloc_hook(page, order, __GFP_MOVABLE);
-		__free_pages(isolated_page, order);
+		__free_pages(page, order);
 	}
 }
 
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 5/9] tools/vm/page_owner: increase temporary buffer size
  2016-06-17  7:57   ` js1304
@ 2016-06-17 12:56     ` Vlastimil Babka
  -1 siblings, 0 replies; 36+ messages in thread
From: Vlastimil Babka @ 2016-06-17 12:56 UTC (permalink / raw)
  To: js1304, Andrew Morton
  Cc: mgorman, Minchan Kim, Alexander Potapenko, Hugh Dickins,
	Michal Hocko, linux-kernel, linux-mm, Sasha Levin, Joonsoo Kim

On 06/17/2016 09:57 AM, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>
> Page owner will be changed to store a deeper stacktrace so the current
> temporary buffer size isn't enough.  Increase it.
>
> Link: http://lkml.kernel.org/r/1464230275-25791-5-git-send-email-iamjoonsoo.kim@lge.com
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Alexander Potapenko <glider@google.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> ---
>  tools/vm/page_owner_sort.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/tools/vm/page_owner_sort.c b/tools/vm/page_owner_sort.c
> index 77147b4..f1c055f 100644
> --- a/tools/vm/page_owner_sort.c
> +++ b/tools/vm/page_owner_sort.c
> @@ -79,12 +79,12 @@ static void add_list(char *buf, int len)
>  	}
>  }
>
> -#define BUF_SIZE	1024
> +#define BUF_SIZE	(128 * 1024)
>
>  int main(int argc, char **argv)
>  {
>  	FILE *fin, *fout;
> -	char buf[BUF_SIZE];
> +	char *buf;
>  	int ret, i, count;
>  	struct block_list *list2;
>  	struct stat st;
> @@ -107,6 +107,11 @@ int main(int argc, char **argv)
>  	max_size = st.st_size / 100; /* hack ... */
>
>  	list = malloc(max_size * sizeof(*list));
> +	buf = malloc(BUF_SIZE);
> +	if (!list || !buf) {
> +		printf("Out of memory\n");
> +		exit(1);
> +	}
>
>  	for ( ; ; ) {
>  		ret = read_block(buf, BUF_SIZE, fin);
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 5/9] tools/vm/page_owner: increase temporary buffer size
@ 2016-06-17 12:56     ` Vlastimil Babka
  0 siblings, 0 replies; 36+ messages in thread
From: Vlastimil Babka @ 2016-06-17 12:56 UTC (permalink / raw)
  To: js1304, Andrew Morton
  Cc: mgorman, Minchan Kim, Alexander Potapenko, Hugh Dickins,
	Michal Hocko, linux-kernel, linux-mm, Sasha Levin, Joonsoo Kim

On 06/17/2016 09:57 AM, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>
> Page owner will be changed to store a deeper stacktrace so the current
> temporary buffer size isn't enough.  Increase it.
>
> Link: http://lkml.kernel.org/r/1464230275-25791-5-git-send-email-iamjoonsoo.kim@lge.com
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Alexander Potapenko <glider@google.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> ---
>  tools/vm/page_owner_sort.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/tools/vm/page_owner_sort.c b/tools/vm/page_owner_sort.c
> index 77147b4..f1c055f 100644
> --- a/tools/vm/page_owner_sort.c
> +++ b/tools/vm/page_owner_sort.c
> @@ -79,12 +79,12 @@ static void add_list(char *buf, int len)
>  	}
>  }
>
> -#define BUF_SIZE	1024
> +#define BUF_SIZE	(128 * 1024)
>
>  int main(int argc, char **argv)
>  {
>  	FILE *fin, *fout;
> -	char buf[BUF_SIZE];
> +	char *buf;
>  	int ret, i, count;
>  	struct block_list *list2;
>  	struct stat st;
> @@ -107,6 +107,11 @@ int main(int argc, char **argv)
>  	max_size = st.st_size / 100; /* hack ... */
>
>  	list = malloc(max_size * sizeof(*list));
> +	buf = malloc(BUF_SIZE);
> +	if (!list || !buf) {
> +		printf("Out of memory\n");
> +		exit(1);
> +	}
>
>  	for ( ; ; ) {
>  		ret = read_block(buf, BUF_SIZE, fin);
>


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 7/9] mm/page_owner: avoid null pointer dereference
  2016-06-17  7:57   ` js1304
@ 2016-06-17 13:32     ` Vlastimil Babka
  -1 siblings, 0 replies; 36+ messages in thread
From: Vlastimil Babka @ 2016-06-17 13:32 UTC (permalink / raw)
  To: js1304, Andrew Morton
  Cc: mgorman, Minchan Kim, Alexander Potapenko, Hugh Dickins,
	Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Sudip Mukherjee, Sudip Mukherjee, Joonsoo Kim

On 06/17/2016 09:57 AM, js1304@gmail.com wrote:
> From: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
>
> We have dereferenced page_ext before checking it. Let's check it first
> and then use it.
>
> Link: http://lkml.kernel.org/r/1465249059-7883-1-git-send-email-sudipm.mukherjee@gmail.com
> Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Hmm, this is already in mmotm as 
http://www.ozlabs.org/~akpm/mmotm/broken-out/mm-page_owner-use-stackdepot-to-store-stacktrace-fix.patch

But imho it's fixing a problem not related to your patch, but something that
commit f86e4271978b missed. So it should go to 4.7 separately, ASAP.

Acked-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: f86e4271978b ("mm: check the return value of lookup_page_ext for all call 
sites")


> ---
>  mm/page_owner.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/mm/page_owner.c b/mm/page_owner.c
> index dc92241..ec6dc18 100644
> --- a/mm/page_owner.c
> +++ b/mm/page_owner.c
> @@ -295,13 +295,15 @@ void __dump_page_owner(struct page *page)
>  		.skip = 0
>  	};
>  	depot_stack_handle_t handle;
> -	gfp_t gfp_mask = page_ext->gfp_mask;
> -	int mt = gfpflags_to_migratetype(gfp_mask);
> +	gfp_t gfp_mask;
> +	int mt;
>
>  	if (unlikely(!page_ext)) {
>  		pr_alert("There is not page extension available.\n");
>  		return;
>  	}
> +	gfp_mask = page_ext->gfp_mask;
> +	mt = gfpflags_to_migratetype(gfp_mask);
>
>  	if (!test_bit(PAGE_EXT_OWNER, &page_ext->flags)) {
>  		pr_alert("page_owner info is not active (free page?)\n");
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 7/9] mm/page_owner: avoid null pointer dereference
@ 2016-06-17 13:32     ` Vlastimil Babka
  0 siblings, 0 replies; 36+ messages in thread
From: Vlastimil Babka @ 2016-06-17 13:32 UTC (permalink / raw)
  To: js1304, Andrew Morton
  Cc: mgorman, Minchan Kim, Alexander Potapenko, Hugh Dickins,
	Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Sudip Mukherjee, Sudip Mukherjee, Joonsoo Kim

On 06/17/2016 09:57 AM, js1304@gmail.com wrote:
> From: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
>
> We have dereferenced page_ext before checking it. Let's check it first
> and then use it.
>
> Link: http://lkml.kernel.org/r/1465249059-7883-1-git-send-email-sudipm.mukherjee@gmail.com
> Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Hmm, this is already in mmotm as 
http://www.ozlabs.org/~akpm/mmotm/broken-out/mm-page_owner-use-stackdepot-to-store-stacktrace-fix.patch

But imho it's fixing a problem not related to your patch, but something that
commit f86e4271978b missed. So it should go to 4.7 separately, ASAP.

Acked-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: f86e4271978b ("mm: check the return value of lookup_page_ext for all call 
sites")


> ---
>  mm/page_owner.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/mm/page_owner.c b/mm/page_owner.c
> index dc92241..ec6dc18 100644
> --- a/mm/page_owner.c
> +++ b/mm/page_owner.c
> @@ -295,13 +295,15 @@ void __dump_page_owner(struct page *page)
>  		.skip = 0
>  	};
>  	depot_stack_handle_t handle;
> -	gfp_t gfp_mask = page_ext->gfp_mask;
> -	int mt = gfpflags_to_migratetype(gfp_mask);
> +	gfp_t gfp_mask;
> +	int mt;
>
>  	if (unlikely(!page_ext)) {
>  		pr_alert("There is not page extension available.\n");
>  		return;
>  	}
> +	gfp_mask = page_ext->gfp_mask;
> +	mt = gfpflags_to_migratetype(gfp_mask);
>
>  	if (!test_bit(PAGE_EXT_OWNER, &page_ext->flags)) {
>  		pr_alert("page_owner info is not active (free page?)\n");
>


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 9/9] mm/page_isolation: clean up confused code
  2016-06-17  7:57   ` js1304
@ 2016-06-17 13:34     ` Vlastimil Babka
  -1 siblings, 0 replies; 36+ messages in thread
From: Vlastimil Babka @ 2016-06-17 13:34 UTC (permalink / raw)
  To: js1304, Andrew Morton
  Cc: mgorman, Minchan Kim, Alexander Potapenko, Hugh Dickins,
	Michal Hocko, linux-kernel, linux-mm, Sasha Levin, Joonsoo Kim

On 06/17/2016 09:57 AM, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>
> When there is an isolated_page, post_alloc_hook() is called with
> page but __free_pages() is called with isolated_page. Since they are
> the same, there is no problem, but it's very confusing. To reduce the
> confusion, this patch changes isolated_page to a boolean and uses the
> page variable consistently.
>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

This could also just be folded into "mm/page_owner: initialize page owner
without holding the zone lock".

> ---
>  mm/page_isolation.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 4639163..064b7fb 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -81,7 +81,7 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
>  {
>  	struct zone *zone;
>  	unsigned long flags, nr_pages;
> -	struct page *isolated_page = NULL;
> +	bool isolated_page = false;
>  	unsigned int order;
>  	unsigned long page_idx, buddy_idx;
>  	struct page *buddy;
> @@ -109,7 +109,7 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
>  			if (pfn_valid_within(page_to_pfn(buddy)) &&
>  			    !is_migrate_isolate_page(buddy)) {
>  				__isolate_free_page(page, order);
> -				isolated_page = page;
> +				isolated_page = true;
>  			}
>  		}
>  	}
> @@ -129,7 +129,7 @@ out:
>  	spin_unlock_irqrestore(&zone->lock, flags);
>  	if (isolated_page) {
>  		post_alloc_hook(page, order, __GFP_MOVABLE);
> -		__free_pages(isolated_page, order);
> +		__free_pages(page, order);
>  	}
>  }
>
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 9/9] mm/page_isolation: clean up confused code
@ 2016-06-17 13:34     ` Vlastimil Babka
  0 siblings, 0 replies; 36+ messages in thread
From: Vlastimil Babka @ 2016-06-17 13:34 UTC (permalink / raw)
  To: js1304, Andrew Morton
  Cc: mgorman, Minchan Kim, Alexander Potapenko, Hugh Dickins,
	Michal Hocko, linux-kernel, linux-mm, Sasha Levin, Joonsoo Kim

On 06/17/2016 09:57 AM, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>
> When there is an isolated_page, post_alloc_hook() is called with
> page but __free_pages() is called with isolated_page. Since they are
> the same, there is no problem, but it's very confusing. To reduce the
> confusion, this patch changes isolated_page to a boolean and uses the
> page variable consistently.
>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

This could also just be folded into "mm/page_owner: initialize page owner
without holding the zone lock".

> ---
>  mm/page_isolation.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 4639163..064b7fb 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -81,7 +81,7 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
>  {
>  	struct zone *zone;
>  	unsigned long flags, nr_pages;
> -	struct page *isolated_page = NULL;
> +	bool isolated_page = false;
>  	unsigned int order;
>  	unsigned long page_idx, buddy_idx;
>  	struct page *buddy;
> @@ -109,7 +109,7 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
>  			if (pfn_valid_within(page_to_pfn(buddy)) &&
>  			    !is_migrate_isolate_page(buddy)) {
>  				__isolate_free_page(page, order);
> -				isolated_page = page;
> +				isolated_page = true;
>  			}
>  		}
>  	}
> @@ -129,7 +129,7 @@ out:
>  	spin_unlock_irqrestore(&zone->lock, flags);
>  	if (isolated_page) {
>  		post_alloc_hook(page, order, __GFP_MOVABLE);
> -		__free_pages(isolated_page, order);
> +		__free_pages(page, order);
>  	}
>  }
>
>


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 7/9] mm/page_owner: avoid null pointer dereference
  2016-06-17 13:32     ` Vlastimil Babka
@ 2016-06-24 20:19       ` Andrew Morton
  -1 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2016-06-24 20:19 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: js1304, mgorman, Minchan Kim, Alexander Potapenko, Hugh Dickins,
	Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Sudip Mukherjee, Sudip Mukherjee, Joonsoo Kim

On Fri, 17 Jun 2016 15:32:20 +0200 Vlastimil Babka <vbabka@suse.cz> wrote:

> On 06/17/2016 09:57 AM, js1304@gmail.com wrote:
> > From: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
> >
> > We have dereferenced page_ext before checking it. Let's check it first
> > and then use it.
> >
> > Link: http://lkml.kernel.org/r/1465249059-7883-1-git-send-email-sudipm.mukherjee@gmail.com
> > Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Hmm, this is already in mmotm as 
> http://www.ozlabs.org/~akpm/mmotm/broken-out/mm-page_owner-use-stackdepot-to-store-stacktrace-fix.patch
> 
> But imho it's fixing a problem not related to your patch, but something that the 
> commit f86e4271978b missed. So it should separately go to 4.7 ASAP.
> 
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Fixes: f86e4271978b ("mm: check the return value of lookup_page_ext for all call 
> sites")

Thanks, I reordered Sudip's patch.



From: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Subject: mm/page_owner: avoid null pointer dereference

We have dereferenced page_ext before checking it. Let's check it first
and then use it.

Fixes: f86e4271978b ("mm: check the return value of lookup_page_ext for all call sites")
Link: http://lkml.kernel.org/r/1465249059-7883-1-git-send-email-sudipm.mukherjee@gmail.com
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_owner.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff -puN mm/page_owner.c~mm-page_owner-use-stackdepot-to-store-stacktrace-fix mm/page_owner.c
--- a/mm/page_owner.c~mm-page_owner-use-stackdepot-to-store-stacktrace-fix
+++ a/mm/page_owner.c
@@ -207,13 +207,15 @@ void __dump_page_owner(struct page *page
 		.nr_entries = page_ext->nr_entries,
 		.entries = &page_ext->trace_entries[0],
 	};
-	gfp_t gfp_mask = page_ext->gfp_mask;
-	int mt = gfpflags_to_migratetype(gfp_mask);
+	gfp_t gfp_mask;
+	int mt;
 
 	if (unlikely(!page_ext)) {
 		pr_alert("There is not page extension available.\n");
 		return;
 	}
+	gfp_mask = page_ext->gfp_mask;
+	mt = gfpflags_to_migratetype(gfp_mask);
 
 	if (!test_bit(PAGE_EXT_OWNER, &page_ext->flags)) {
 		pr_alert("page_owner info is not active (free page?)\n");
_

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 7/9] mm/page_owner: avoid null pointer dereference
@ 2016-06-24 20:19       ` Andrew Morton
  0 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2016-06-24 20:19 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: js1304, mgorman, Minchan Kim, Alexander Potapenko, Hugh Dickins,
	Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Sudip Mukherjee, Sudip Mukherjee, Joonsoo Kim

On Fri, 17 Jun 2016 15:32:20 +0200 Vlastimil Babka <vbabka@suse.cz> wrote:

> On 06/17/2016 09:57 AM, js1304@gmail.com wrote:
> > From: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
> >
> > We have dereferenced page_ext before checking it. Let's check it first
> > and then use it.
> >
> > Link: http://lkml.kernel.org/r/1465249059-7883-1-git-send-email-sudipm.mukherjee@gmail.com
> > Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Hmm, this is already in mmotm as 
> http://www.ozlabs.org/~akpm/mmotm/broken-out/mm-page_owner-use-stackdepot-to-store-stacktrace-fix.patch
> 
> But imho it's fixing a problem not related to your patch, but something that the 
> commit f86e4271978b missed. So it should separately go to 4.7 ASAP.
> 
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Fixes: f86e4271978b ("mm: check the return value of lookup_page_ext for all call 
> sites")

Thanks, I reordered Sudip's patch.



From: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Subject: mm/page_owner: avoid null pointer dereference

We have dereferenced page_ext before checking it. Let's check it first
and then use it.

Fixes: f86e4271978b ("mm: check the return value of lookup_page_ext for all call sites")
Link: http://lkml.kernel.org/r/1465249059-7883-1-git-send-email-sudipm.mukherjee@gmail.com
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_owner.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff -puN mm/page_owner.c~mm-page_owner-use-stackdepot-to-store-stacktrace-fix mm/page_owner.c
--- a/mm/page_owner.c~mm-page_owner-use-stackdepot-to-store-stacktrace-fix
+++ a/mm/page_owner.c
@@ -207,13 +207,15 @@ void __dump_page_owner(struct page *page
 		.nr_entries = page_ext->nr_entries,
 		.entries = &page_ext->trace_entries[0],
 	};
-	gfp_t gfp_mask = page_ext->gfp_mask;
-	int mt = gfpflags_to_migratetype(gfp_mask);
+	gfp_t gfp_mask;
+	int mt;
 
 	if (unlikely(!page_ext)) {
 		pr_alert("There is not page extension available.\n");
 		return;
 	}
+	gfp_mask = page_ext->gfp_mask;
+	mt = gfpflags_to_migratetype(gfp_mask);
 
 	if (!test_bit(PAGE_EXT_OWNER, &page_ext->flags)) {
 		pr_alert("page_owner info is not active (free page?)\n");
_


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 0/9] reduce memory usage by page_owner
  2016-06-17  7:57 ` js1304
@ 2016-06-24 23:19   ` Andrew Morton
  -1 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2016-06-24 23:19 UTC (permalink / raw)
  To: js1304
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Joonsoo Kim

On Fri, 17 Jun 2016 16:57:30 +0900 js1304@gmail.com wrote:

> There was a bug reported by Sasha and minor fixes is needed
> so I send v3.
> 
> o fix a bg reported by Sasha (mm/compaction: split freepages
> without holding the zone lock)
> o add code comment for todo list (mm/page_owner: use stackdepot
> to store stacktrace) per Michal
> o add 'inline' keyword (mm/page_alloc: introduce post allocation
> processing on page allocator) per Vlastimil
> o add a patch that clean-up code per Vlastimil

I've gone through v3 patches 2-9 and have plucked out the deltas to
take what-i-had and turn that into what-you-sent.  Patch 1/9 has seen a
lot of competing churn in isolate_freepages_block(), so please review
the current version of that, below.  Between the "===" markers:


static unsigned long isolate_freepages_block(struct compact_control *cc,
				unsigned long *start_pfn,
				unsigned long end_pfn,
				struct list_head *freelist,
				bool strict)
{
	int nr_scanned = 0, total_isolated = 0;
	struct page *cursor, *valid_page = NULL;
	unsigned long flags = 0;
	bool locked = false;
	unsigned long blockpfn = *start_pfn;
	unsigned int order;

	cursor = pfn_to_page(blockpfn);

	/* Isolate free pages. */
	for (; blockpfn < end_pfn; blockpfn++, cursor++) {
		int isolated;
		struct page *page = cursor;

		/*
		 * Periodically drop the lock (if held) regardless of its
		 * contention, to give chance to IRQs. Abort if fatal signal
		 * pending or async compaction detects need_resched()
		 */
		if (!(blockpfn % SWAP_CLUSTER_MAX)
		    && compact_unlock_should_abort(&cc->zone->lock, flags,
								&locked, cc))
			break;

		nr_scanned++;
		if (!pfn_valid_within(blockpfn))
			goto isolate_fail;

		if (!valid_page)
			valid_page = page;

		/*
		 * For compound pages such as THP and hugetlbfs, we can save
		 * potentially a lot of iterations if we skip them at once.
		 * The check is racy, but we can consider only valid values
		 * and the only danger is skipping too much.
		 */
		if (PageCompound(page)) {
			unsigned int comp_order = compound_order(page);

			if (likely(comp_order < MAX_ORDER)) {
				blockpfn += (1UL << comp_order) - 1;
				cursor += (1UL << comp_order) - 1;
			}

			goto isolate_fail;
		}

		if (!PageBuddy(page))
			goto isolate_fail;

====================
		/*
		 * If we already hold the lock, we can skip some rechecking.
		 * Note that if we hold the lock now, checked_pageblock was
		 * already set in some previous iteration (or strict is true),
		 * so it is correct to skip the suitable migration target
		 * recheck as well.
		 */
		if (!locked) {
			/*
			 * The zone lock must be held to isolate freepages.
			 * Unfortunately this is a very coarse lock and can be
			 * heavily contended if there are parallel allocations
			 * or parallel compactions. For async compaction do not
			 * spin on the lock and we acquire the lock as late as
			 * possible.
			 */
			locked = compact_trylock_irqsave(&cc->zone->lock,
								&flags, cc);
			if (!locked)
				break;

			/* Recheck this is a buddy page under lock */
			if (!PageBuddy(page))
				goto isolate_fail;
		}

		/* Found a free page, will break it into order-0 pages */
		order = page_order(page);
		isolated = __isolate_free_page(page, order);
		if (!isolated)
			break;
		set_page_private(page, order);

		total_isolated += isolated;
		cc->nr_freepages += isolated;
		list_add_tail(&page->lru, freelist);

		if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
			blockpfn += isolated;
			break;
		}
		/* Advance to the end of split page */
		blockpfn += isolated - 1;
		cursor += isolated - 1;
		continue;

isolate_fail:
=====================
		if (strict)
			break;
		else
			continue;

	}

	if (locked)
		spin_unlock_irqrestore(&cc->zone->lock, flags);

	/*
	 * There is a tiny chance that we have read bogus compound_order(),
	 * so be careful to not go outside of the pageblock.
	 */
	if (unlikely(blockpfn > end_pfn))
		blockpfn = end_pfn;

	trace_mm_compaction_isolate_freepages(*start_pfn, blockpfn,
					nr_scanned, total_isolated);

	/* Record how far we have got within the block */
	*start_pfn = blockpfn;

	/*
	 * If strict isolation is requested by CMA then check that all the
	 * pages requested were isolated. If there were any failures, 0 is
	 * returned and CMA will fail.
	 */
	if (strict && blockpfn < end_pfn)
		total_isolated = 0;

	/* Update the pageblock-skip if the whole pageblock was scanned */
	if (blockpfn == end_pfn)
		update_pageblock_skip(cc, valid_page, total_isolated, false);

	count_compact_events(COMPACTFREE_SCANNED, nr_scanned);
	if (total_isolated)
		count_compact_events(COMPACTISOLATED, total_isolated);
	return total_isolated;
}

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 0/9] reduce memory usage by page_owner
@ 2016-06-24 23:19   ` Andrew Morton
  0 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2016-06-24 23:19 UTC (permalink / raw)
  To: js1304
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Joonsoo Kim

On Fri, 17 Jun 2016 16:57:30 +0900 js1304@gmail.com wrote:

> There was a bug reported by Sasha and minor fixes is needed
> so I send v3.
> 
> o fix a bg reported by Sasha (mm/compaction: split freepages
> without holding the zone lock)
> o add code comment for todo list (mm/page_owner: use stackdepot
> to store stacktrace) per Michal
> o add 'inline' keyword (mm/page_alloc: introduce post allocation
> processing on page allocator) per Vlastimil
> o add a patch that clean-up code per Vlastimil

I've gone through v3 patches 2-9 and have plucked out the deltas to
take what-i-had and turn that into what-you-sent.  Patch 1/9 has seen a
lot of competing churn in isolate_freepages_block(), so please review
the current version of that, below.  Between the "===" markers:


static unsigned long isolate_freepages_block(struct compact_control *cc,
				unsigned long *start_pfn,
				unsigned long end_pfn,
				struct list_head *freelist,
				bool strict)
{
	int nr_scanned = 0, total_isolated = 0;
	struct page *cursor, *valid_page = NULL;
	unsigned long flags = 0;
	bool locked = false;
	unsigned long blockpfn = *start_pfn;
	unsigned int order;

	cursor = pfn_to_page(blockpfn);

	/* Isolate free pages. */
	for (; blockpfn < end_pfn; blockpfn++, cursor++) {
		int isolated;
		struct page *page = cursor;

		/*
		 * Periodically drop the lock (if held) regardless of its
		 * contention, to give chance to IRQs. Abort if fatal signal
		 * pending or async compaction detects need_resched()
		 */
		if (!(blockpfn % SWAP_CLUSTER_MAX)
		    && compact_unlock_should_abort(&cc->zone->lock, flags,
								&locked, cc))
			break;

		nr_scanned++;
		if (!pfn_valid_within(blockpfn))
			goto isolate_fail;

		if (!valid_page)
			valid_page = page;

		/*
		 * For compound pages such as THP and hugetlbfs, we can save
		 * potentially a lot of iterations if we skip them at once.
		 * The check is racy, but we can consider only valid values
		 * and the only danger is skipping too much.
		 */
		if (PageCompound(page)) {
			unsigned int comp_order = compound_order(page);

			if (likely(comp_order < MAX_ORDER)) {
				blockpfn += (1UL << comp_order) - 1;
				cursor += (1UL << comp_order) - 1;
			}

			goto isolate_fail;
		}

		if (!PageBuddy(page))
			goto isolate_fail;

====================
		/*
		 * If we already hold the lock, we can skip some rechecking.
		 * Note that if we hold the lock now, checked_pageblock was
		 * already set in some previous iteration (or strict is true),
		 * so it is correct to skip the suitable migration target
		 * recheck as well.
		 */
		if (!locked) {
			/*
			 * The zone lock must be held to isolate freepages.
			 * Unfortunately this is a very coarse lock and can be
			 * heavily contended if there are parallel allocations
			 * or parallel compactions. For async compaction do not
			 * spin on the lock and we acquire the lock as late as
			 * possible.
			 */
			locked = compact_trylock_irqsave(&cc->zone->lock,
								&flags, cc);
			if (!locked)
				break;

			/* Recheck this is a buddy page under lock */
			if (!PageBuddy(page))
				goto isolate_fail;
		}

		/* Found a free page, will break it into order-0 pages */
		order = page_order(page);
		isolated = __isolate_free_page(page, order);
		if (!isolated)
			break;
		set_page_private(page, order);

		total_isolated += isolated;
		cc->nr_freepages += isolated;
		list_add_tail(&page->lru, freelist);

		if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
			blockpfn += isolated;
			break;
		}
		/* Advance to the end of split page */
		blockpfn += isolated - 1;
		cursor += isolated - 1;
		continue;

isolate_fail:
=====================
		if (strict)
			break;
		else
			continue;

	}

	if (locked)
		spin_unlock_irqrestore(&cc->zone->lock, flags);

	/*
	 * There is a tiny chance that we have read bogus compound_order(),
	 * so be careful to not go outside of the pageblock.
	 */
	if (unlikely(blockpfn > end_pfn))
		blockpfn = end_pfn;

	trace_mm_compaction_isolate_freepages(*start_pfn, blockpfn,
					nr_scanned, total_isolated);

	/* Record how far we have got within the block */
	*start_pfn = blockpfn;

	/*
	 * If strict isolation is requested by CMA then check that all the
	 * pages requested were isolated. If there were any failures, 0 is
	 * returned and CMA will fail.
	 */
	if (strict && blockpfn < end_pfn)
		total_isolated = 0;

	/* Update the pageblock-skip if the whole pageblock was scanned */
	if (blockpfn == end_pfn)
		update_pageblock_skip(cc, valid_page, total_isolated, false);

	count_compact_events(COMPACTFREE_SCANNED, nr_scanned);
	if (total_isolated)
		count_compact_events(COMPACTISOLATED, total_isolated);
	return total_isolated;
}



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 0/9] reduce memory usage by page_owner
  2016-06-24 23:19   ` Andrew Morton
@ 2016-06-27  5:45     ` Joonsoo Kim
  -1 siblings, 0 replies; 36+ messages in thread
From: Joonsoo Kim @ 2016-06-27  5:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin

On Fri, Jun 24, 2016 at 04:19:35PM -0700, Andrew Morton wrote:
> On Fri, 17 Jun 2016 16:57:30 +0900 js1304@gmail.com wrote:
> 
> > There was a bug reported by Sasha and minor fixes is needed
> > so I send v3.
> > 
> > o fix a bg reported by Sasha (mm/compaction: split freepages
> > without holding the zone lock)
> > o add code comment for todo list (mm/page_owner: use stackdepot
> > to store stacktrace) per Michal
> > o add 'inline' keyword (mm/page_alloc: introduce post allocation
> > processing on page allocator) per Vlastimil
> > o add a patch that clean-up code per Vlastimil
> 
> I've gone through v3 patches 2-9 and have plucked out the deltas to
> take what-i-had and turn that into what-you-sent.  Patch 1/9 has seen a
> lot of competing churn in isolate_freepages_block(), so please review
> the current version of that, below.  Between the "===" markers:

Hello, Andrew.

Below looks okay to me.

Thanks.

> 
> 
> static unsigned long isolate_freepages_block(struct compact_control *cc,
> 				unsigned long *start_pfn,
> 				unsigned long end_pfn,
> 				struct list_head *freelist,
> 				bool strict)
> {
> 	int nr_scanned = 0, total_isolated = 0;
> 	struct page *cursor, *valid_page = NULL;
> 	unsigned long flags = 0;
> 	bool locked = false;
> 	unsigned long blockpfn = *start_pfn;
> 	unsigned int order;
> 
> 	cursor = pfn_to_page(blockpfn);
> 
> 	/* Isolate free pages. */
> 	for (; blockpfn < end_pfn; blockpfn++, cursor++) {
> 		int isolated;
> 		struct page *page = cursor;
> 
> 		/*
> 		 * Periodically drop the lock (if held) regardless of its
> 		 * contention, to give chance to IRQs. Abort if fatal signal
> 		 * pending or async compaction detects need_resched()
> 		 */
> 		if (!(blockpfn % SWAP_CLUSTER_MAX)
> 		    && compact_unlock_should_abort(&cc->zone->lock, flags,
> 								&locked, cc))
> 			break;
> 
> 		nr_scanned++;
> 		if (!pfn_valid_within(blockpfn))
> 			goto isolate_fail;
> 
> 		if (!valid_page)
> 			valid_page = page;
> 
> 		/*
> 		 * For compound pages such as THP and hugetlbfs, we can save
> 		 * potentially a lot of iterations if we skip them at once.
> 		 * The check is racy, but we can consider only valid values
> 		 * and the only danger is skipping too much.
> 		 */
> 		if (PageCompound(page)) {
> 			unsigned int comp_order = compound_order(page);
> 
> 			if (likely(comp_order < MAX_ORDER)) {
> 				blockpfn += (1UL << comp_order) - 1;
> 				cursor += (1UL << comp_order) - 1;
> 			}
> 
> 			goto isolate_fail;
> 		}
> 
> 		if (!PageBuddy(page))
> 			goto isolate_fail;
> 
> ====================
> 		/*
> 		 * If we already hold the lock, we can skip some rechecking.
> 		 * Note that if we hold the lock now, checked_pageblock was
> 		 * already set in some previous iteration (or strict is true),
> 		 * so it is correct to skip the suitable migration target
> 		 * recheck as well.
> 		 */
> 		if (!locked) {
> 			/*
> 			 * The zone lock must be held to isolate freepages.
> 			 * Unfortunately this is a very coarse lock and can be
> 			 * heavily contended if there are parallel allocations
> 			 * or parallel compactions. For async compaction do not
> 			 * spin on the lock and we acquire the lock as late as
> 			 * possible.
> 			 */
> 			locked = compact_trylock_irqsave(&cc->zone->lock,
> 								&flags, cc);
> 			if (!locked)
> 				break;
> 
> 			/* Recheck this is a buddy page under lock */
> 			if (!PageBuddy(page))
> 				goto isolate_fail;
> 		}
> 
> 		/* Found a free page, will break it into order-0 pages */
> 		order = page_order(page);
> 		isolated = __isolate_free_page(page, order);
> 		if (!isolated)
> 			break;
> 		set_page_private(page, order);
> 
> 		total_isolated += isolated;
> 		cc->nr_freepages += isolated;
> 		list_add_tail(&page->lru, freelist);
> 
> 		if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
> 			blockpfn += isolated;
> 			break;
> 		}
> 		/* Advance to the end of split page */
> 		blockpfn += isolated - 1;
> 		cursor += isolated - 1;
> 		continue;
> 
> isolate_fail:
> =====================
> 		if (strict)
> 			break;
> 		else
> 			continue;
> 
> 	}
> 
> 	if (locked)
> 		spin_unlock_irqrestore(&cc->zone->lock, flags);
> 
> 	/*
> 	 * There is a tiny chance that we have read bogus compound_order(),
> 	 * so be careful to not go outside of the pageblock.
> 	 */
> 	if (unlikely(blockpfn > end_pfn))
> 		blockpfn = end_pfn;
> 
> 	trace_mm_compaction_isolate_freepages(*start_pfn, blockpfn,
> 					nr_scanned, total_isolated);
> 
> 	/* Record how far we have got within the block */
> 	*start_pfn = blockpfn;
> 
> 	/*
> 	 * If strict isolation is requested by CMA then check that all the
> 	 * pages requested were isolated. If there were any failures, 0 is
> 	 * returned and CMA will fail.
> 	 */
> 	if (strict && blockpfn < end_pfn)
> 		total_isolated = 0;
> 
> 	/* Update the pageblock-skip if the whole pageblock was scanned */
> 	if (blockpfn == end_pfn)
> 		update_pageblock_skip(cc, valid_page, total_isolated, false);
> 
> 	count_compact_events(COMPACTFREE_SCANNED, nr_scanned);
> 	if (total_isolated)
> 		count_compact_events(COMPACTISOLATED, total_isolated);
> 	return total_isolated;
> }
> 
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [v3,6/9] mm/page_owner: use stackdepot to store stacktrace
  2016-06-17  7:57   ` js1304
  (?)
@ 2016-10-26 13:06   ` Sascha Silbe
  2016-10-27  0:17       ` Joonsoo Kim
  -1 siblings, 1 reply; 36+ messages in thread
From: Sascha Silbe @ 2016-10-26 13:06 UTC (permalink / raw)
  To: Joonsoo Kim, Andrew Morton
  Cc: Vlastimil Babka, mgorman, Minchan Kim, Alexander Potapenko,
	Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, Sasha Levin,
	Joonsoo Kim

Dear Joonsoo,

Joonsoo Kim <js1304@gmail.com> writes:

> Currently, we store each page's allocation stacktrace in the corresponding
> page_ext structure, and this requires a lot of memory.  As a result, a
> memory-tight system doesn't work well when page_owner is enabled.
> Moreover, even with this large memory consumption, we cannot get a full
> stacktrace, because we allocate the memory at boot time and keep only
> 8 stacktrace slots to balance memory consumption.  We could increase that
> number, but it would make the system unusable or change system behaviour.
[...]

This patch causes my Wandboard Quad [1] not to boot anymore. I don't get
any kernel output, even with earlycon enabled
(earlycon=ec_imx6q,0x02020000). git bisect pointed towards your patch;
reverting the patch causes the system to boot fine again. Config is
available at [2]; none of the defconfigs I tried (defconfig =
multi_v7_defconfig, imx_v6_v7_defconfig) works for me.

Haven't looked into this any further so far; hooking up a JTAG adapter
requires some hardware changes as the JTAG header is unpopulated.

Sascha

PS: Please CC me on replies; I'm not subscribed to any of the lists.

[1] http://www.wandboard.org/index.php/details/wandboard
[2] https://sascha.silbe.org/tmp/config-4.8.4-wandboard-28-00003-g9e9b5d6
-- 
Softwareentwicklung Sascha Silbe, Niederhofenstraße 5/1, 71229 Leonberg
https://se-silbe.de/
USt-IdNr.: DE281696641

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [v3,6/9] mm/page_owner: use stackdepot to store stacktrace
  2016-10-26 13:06   ` [v3,6/9] " Sascha Silbe
@ 2016-10-27  0:17       ` Joonsoo Kim
  0 siblings, 0 replies; 36+ messages in thread
From: Joonsoo Kim @ 2016-10-27  0:17 UTC (permalink / raw)
  To: Sascha Silbe
  Cc: Andrew Morton, Vlastimil Babka, mgorman, Minchan Kim,
	Alexander Potapenko, Hugh Dickins, Michal Hocko, linux-kernel,
	linux-mm, Sasha Levin

Hello,

Thanks for the report.

On Wed, Oct 26, 2016 at 03:06:05PM +0200, Sascha Silbe wrote:
> Dear Joonsoo,
> 
> Joonsoo Kim <js1304@gmail.com> writes:
> 
> > Currently, we store each page's allocation stacktrace in the corresponding
> > page_ext structure, and this requires a lot of memory.  As a result, a
> > memory-tight system doesn't work well when page_owner is enabled.
> > Moreover, even with this large memory consumption, we cannot get a full
> > stacktrace, because we allocate the memory at boot time and keep only
> > 8 stacktrace slots to balance memory consumption.  We could increase that
> > number, but it would make the system unusable or change system behaviour.
> [...]
> 
> This patch causes my Wandboard Quad [1] not to boot anymore. I don't get
> any kernel output, even with earlycon enabled
> (earlycon=ec_imx6q,0x02020000). git bisect pointed towards your patch;
> reverting the patch causes the system to boot fine again. Config is
> available at [2]; none of the defconfigs I tried (defconfig =
> multi_v7_defconfig, imx_v6_v7_defconfig) works for me.
> 
> Haven't looked into this any further so far; hooking up a JTAG adapter
> requires some hardware changes as the JTAG header is unpopulated.
> 
> Sascha
> 
> PS: Please CC me on replies; I'm not subscribed to any of the lists.
> 
> [1] http://www.wandboard.org/index.php/details/wandboard
> [2] https://sascha.silbe.org/tmp/config-4.8.4-wandboard-28-00003-g9e9b5d6
> -- 
> Softwareentwicklung Sascha Silbe, Niederhofenstraße 5/1, 71229 Leonberg
> https://se-silbe.de/
> USt-IdNr.: DE281696641


I cannot see your config. Link [2] shows "No such file or directory"

Anyway, I found that there is an issue in the early boot phase with
!CONFIG_SPARSEMEM. Could you try the following one?
(It's a completely untested patch; I haven't even tried to compile it.)

Thanks.

---------->8---------------
diff --git a/include/linux/page_ext.h b/include/linux/page_ext.h
index 9298c39..db31f58 100644
--- a/include/linux/page_ext.h
+++ b/include/linux/page_ext.h
@@ -47,6 +47,7 @@ struct page_ext {
 };
 
 extern void pgdat_page_ext_init(struct pglist_data *pgdat);
+extern void __init invoke_init_callbacks(void);
 
 #ifdef CONFIG_SPARSEMEM
 static inline void page_ext_init_flatmem(void)
@@ -57,6 +58,7 @@ static inline void page_ext_init_flatmem(void)
 extern void page_ext_init_flatmem(void);
 static inline void page_ext_init(void)
 {
+       invoke_init_callbacks();
 }
 #endif
 
diff --git a/mm/page_ext.c b/mm/page_ext.c
index 121dcff..a405869 100644
--- a/mm/page_ext.c
+++ b/mm/page_ext.c
@@ -91,7 +91,7 @@ static bool __init invoke_need_callbacks(void)
        return need;
 }
 
-static void __init invoke_init_callbacks(void)
+void __init invoke_init_callbacks(void)
 {
        int i;
        int entries = ARRAY_SIZE(page_ext_ops);
@@ -190,7 +190,6 @@ void __init page_ext_init_flatmem(void)
                        goto fail;
        }
        pr_info("allocated %ld bytes of page_ext\n", total_usage);
-       invoke_init_callbacks();
        return;
 
 fail:
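In plain terms, the diff keeps the early flat-mem path limited to setting
up the page_ext storage and moves the invocation of the registered init
callbacks to the later page_ext_init() hook, presumably because whatever
those callbacks depend on is not ready at the earlier point.  A minimal
stand-alone sketch of that deferral pattern follows; the names are made up
and do not correspond to mm/page_ext.c.

/* Defer init callbacks until a later init stage (illustrative only). */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

static bool storage_ready;

static void owner_init(void)
{
	printf("init callback runs, storage_ready=%d\n", storage_ready);
}

static void (*init_callbacks[])(void) = { owner_init };

/* Early path (flat-mem analogue): set up storage only. */
static void ext_init_early(void)
{
	storage_ready = true;
	/* callbacks are intentionally no longer invoked here */
}

/* Later path (page_ext_init() analogue): now run the callbacks. */
static void ext_init_late(void)
{
	size_t i;

	for (i = 0; i < sizeof(init_callbacks) / sizeof(init_callbacks[0]); i++)
		init_callbacks[i]();
}

int main(void)
{
	ext_init_early();
	ext_init_late();	/* callbacks run after storage is ready */
	return 0;
}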

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [v3,6/9] mm/page_owner: use stackdepot to store stacktrace
  2016-10-27  0:17       ` Joonsoo Kim
  (?)
@ 2016-12-22 23:37       ` Sascha Silbe
  -1 siblings, 0 replies; 36+ messages in thread
From: Sascha Silbe @ 2016-12-22 23:37 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Vlastimil Babka, mgorman, Minchan Kim,
	Alexander Potapenko, Hugh Dickins, Michal Hocko, linux-kernel,
	linux-mm, Sasha Levin

Dear Joonsoo,

Joonsoo Kim <iamjoonsoo.kim@lge.com> writes:

> Anyway, I find that there is an issue in early boot phase in
> !CONFIG_SPARSEMEM. Could you try following one?
> (It's an completely untested patch, even I don't try to compile it.)

Finally got JTAG to work just enough to dive into this. Turned out to be
another case of the FDT clashing with the BSS area. Your patch caused
BSS to grow by > 4 MiB so that it now collides with the (old) default
FDT location on this board. There's nothing wrong with your patch, just
the usual black magic of placing initramfs and FDT in RAM on ARM.
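For a sense of scale, a few MiB of extra BSS is plausible from stackdepot's
statically sized hash table of bucket pointers alone.  The bucket count
used below (1 << 20) is an assumption for illustration, not a figure taken
from this thread; the pointer size is 4 bytes on 32-bit ARM.

/* Rough, assumption-based estimate of static stackdepot BSS usage. */
#include <stdio.h>

int main(void)
{
	unsigned long buckets = 1UL << 20;	/* assumed hash-table size */
	unsigned long ptr_sz  = 4;		/* 32-bit ARM pointer */

	printf("hash table BSS: %lu MiB\n", (buckets * ptr_sz) >> 20);
	return 0;
}

Under those assumptions the table alone accounts for 4 MiB, in the same
ballpark as the growth reported above.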

Sascha
-- 
Softwareentwicklung Sascha Silbe, Niederhofenstraße 5/1, 71229 Leonberg
https://se-silbe.de/
USt-IdNr.: DE281696641

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2016-12-22 23:43 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-17  7:57 [PATCH v3 0/9] reduce memory usage by page_owner js1304
2016-06-17  7:57 ` js1304
2016-06-17  7:57 ` [PATCH v3 1/9] mm/compaction: split freepages without holding the zone lock js1304
2016-06-17  7:57   ` js1304
2016-06-17  7:57 ` [PATCH v3 2/9] mm/page_owner: initialize page owner " js1304
2016-06-17  7:57   ` js1304
2016-06-17  7:57 ` [PATCH v3 3/9] mm/page_owner: copy last_migrate_reason in copy_page_owner() js1304
2016-06-17  7:57   ` js1304
2016-06-17  7:57 ` [PATCH v3 4/9] mm/page_owner: introduce split_page_owner and replace manual handling js1304
2016-06-17  7:57   ` js1304
2016-06-17  7:57 ` [PATCH v3 5/9] tools/vm/page_owner: increase temporary buffer size js1304
2016-06-17  7:57   ` js1304
2016-06-17 12:56   ` Vlastimil Babka
2016-06-17 12:56     ` Vlastimil Babka
2016-06-17  7:57 ` [PATCH v3 6/9] mm/page_owner: use stackdepot to store stacktrace js1304
2016-06-17  7:57   ` js1304
2016-10-26 13:06   ` [v3,6/9] " Sascha Silbe
2016-10-27  0:17     ` Joonsoo Kim
2016-10-27  0:17       ` Joonsoo Kim
2016-12-22 23:37       ` Sascha Silbe
2016-06-17  7:57 ` [PATCH v3 7/9] mm/page_owner: avoid null pointer dereference js1304
2016-06-17  7:57   ` js1304
2016-06-17 13:32   ` Vlastimil Babka
2016-06-17 13:32     ` Vlastimil Babka
2016-06-24 20:19     ` Andrew Morton
2016-06-24 20:19       ` Andrew Morton
2016-06-17  7:57 ` [PATCH v3 8/9] mm/page_alloc: introduce post allocation processing on page allocator js1304
2016-06-17  7:57   ` js1304
2016-06-17  7:57 ` [PATCH v3 9/9] mm/page_isolation: clean up confused code js1304
2016-06-17  7:57   ` js1304
2016-06-17 13:34   ` Vlastimil Babka
2016-06-17 13:34     ` Vlastimil Babka
2016-06-24 23:19 ` [PATCH v3 0/9] reduce memory usage by page_owner Andrew Morton
2016-06-24 23:19   ` Andrew Morton
2016-06-27  5:45   ` Joonsoo Kim
2016-06-27  5:45     ` Joonsoo Kim
