All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7 v2] vm, vmscan: enahance vmscan tracepoints
@ 2017-01-04 10:19 ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML

Hi,
this is the second version of the patchset [1]. I hope I've addressed all
the review feedback.

While debugging [2] I've realized that there is some room for
improvements in the tracepoints set we offer currently. I had hard times
to make any conclusion from the existing ones. The resulting problem
turned out to be active list aging [3] and we are missing at least two
tracepoints to debug such a problem.

Some existing tracepoints could export more information to see _why_ the
reclaim progress cannot be made not only _how much_ we could reclaim.
The later could be seen quite reasonably from the vmstat counters
already. It can be argued that we are showing too many implementation
details in those tracepoints but I consider them way too lowlevel
already to be usable by any kernel independent userspace. I would be
_really_ surprised if anything but debugging tools have used them.

Any feedback is highly appreciated.

[1] http://lkml.kernel.org/r/20161228153032.10821-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/20161215225702.GA27944@boerne.fritz.box
[3] http://lkml.kernel.org/r/20161223105157.GB23109@dhcp22.suse.cz

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 0/7 v2] vm, vmscan: enahance vmscan tracepoints
@ 2017-01-04 10:19 ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML

Hi,
this is the second version of the patchset [1]. I hope I've addressed all
the review feedback.

While debugging [2] I've realized that there is some room for
improvements in the tracepoints set we offer currently. I had hard times
to make any conclusion from the existing ones. The resulting problem
turned out to be active list aging [3] and we are missing at least two
tracepoints to debug such a problem.

Some existing tracepoints could export more information to see _why_ the
reclaim progress cannot be made not only _how much_ we could reclaim.
The later could be seen quite reasonably from the vmstat counters
already. It can be argued that we are showing too many implementation
details in those tracepoints but I consider them way too lowlevel
already to be usable by any kernel independent userspace. I would be
_really_ surprised if anything but debugging tools have used them.

Any feedback is highly appreciated.

[1] http://lkml.kernel.org/r/20161228153032.10821-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/20161215225702.GA27944@boerne.fritz.box
[3] http://lkml.kernel.org/r/20161223105157.GB23109@dhcp22.suse.cz


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 1/7] mm, vmscan: remove unused mm_vmscan_memcg_isolate
  2017-01-04 10:19 ` Michal Hocko
@ 2017-01-04 10:19   ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

the trace point is not used since 925b7673cce3 ("mm: make per-memcg LRU
lists exclusive") so it can be removed.

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/trace/events/vmscan.h | 31 +------------------------------
 1 file changed, 1 insertion(+), 30 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index c88fd0934e7e..39bad8921ca1 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -269,8 +269,7 @@ TRACE_EVENT(mm_shrink_slab_end,
 		__entry->retval)
 );
 
-DECLARE_EVENT_CLASS(mm_vmscan_lru_isolate_template,
-
+TRACE_EVENT(mm_vmscan_lru_isolate,
 	TP_PROTO(int classzone_idx,
 		int order,
 		unsigned long nr_requested,
@@ -311,34 +310,6 @@ DECLARE_EVENT_CLASS(mm_vmscan_lru_isolate_template,
 		__entry->file)
 );
 
-DEFINE_EVENT(mm_vmscan_lru_isolate_template, mm_vmscan_lru_isolate,
-
-	TP_PROTO(int classzone_idx,
-		int order,
-		unsigned long nr_requested,
-		unsigned long nr_scanned,
-		unsigned long nr_taken,
-		isolate_mode_t isolate_mode,
-		int file),
-
-	TP_ARGS(classzone_idx, order, nr_requested, nr_scanned, nr_taken, isolate_mode, file)
-
-);
-
-DEFINE_EVENT(mm_vmscan_lru_isolate_template, mm_vmscan_memcg_isolate,
-
-	TP_PROTO(int classzone_idx,
-		int order,
-		unsigned long nr_requested,
-		unsigned long nr_scanned,
-		unsigned long nr_taken,
-		isolate_mode_t isolate_mode,
-		int file),
-
-	TP_ARGS(classzone_idx, order, nr_requested, nr_scanned, nr_taken, isolate_mode, file)
-
-);
-
 TRACE_EVENT(mm_vmscan_writepage,
 
 	TP_PROTO(struct page *page),
-- 
2.11.0

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 1/7] mm, vmscan: remove unused mm_vmscan_memcg_isolate
@ 2017-01-04 10:19   ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

the trace point is not used since 925b7673cce3 ("mm: make per-memcg LRU
lists exclusive") so it can be removed.

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/trace/events/vmscan.h | 31 +------------------------------
 1 file changed, 1 insertion(+), 30 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index c88fd0934e7e..39bad8921ca1 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -269,8 +269,7 @@ TRACE_EVENT(mm_shrink_slab_end,
 		__entry->retval)
 );
 
-DECLARE_EVENT_CLASS(mm_vmscan_lru_isolate_template,
-
+TRACE_EVENT(mm_vmscan_lru_isolate,
 	TP_PROTO(int classzone_idx,
 		int order,
 		unsigned long nr_requested,
@@ -311,34 +310,6 @@ DECLARE_EVENT_CLASS(mm_vmscan_lru_isolate_template,
 		__entry->file)
 );
 
-DEFINE_EVENT(mm_vmscan_lru_isolate_template, mm_vmscan_lru_isolate,
-
-	TP_PROTO(int classzone_idx,
-		int order,
-		unsigned long nr_requested,
-		unsigned long nr_scanned,
-		unsigned long nr_taken,
-		isolate_mode_t isolate_mode,
-		int file),
-
-	TP_ARGS(classzone_idx, order, nr_requested, nr_scanned, nr_taken, isolate_mode, file)
-
-);
-
-DEFINE_EVENT(mm_vmscan_lru_isolate_template, mm_vmscan_memcg_isolate,
-
-	TP_PROTO(int classzone_idx,
-		int order,
-		unsigned long nr_requested,
-		unsigned long nr_scanned,
-		unsigned long nr_taken,
-		isolate_mode_t isolate_mode,
-		int file),
-
-	TP_ARGS(classzone_idx, order, nr_requested, nr_scanned, nr_taken, isolate_mode, file)
-
-);
-
 TRACE_EVENT(mm_vmscan_writepage,
 
 	TP_PROTO(struct page *page),
-- 
2.11.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2017-01-04 10:19 ` Michal Hocko
@ 2017-01-04 10:19   ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

Our reclaim process has several tracepoints to tell us more about how
things are progressing. We are, however, missing a tracepoint to track
active list aging. Introduce mm_vmscan_lru_shrink_active which reports
the number of
	- nr_scanned, nr_taken pages to tell us the LRU isolation
	  effectiveness.
	- nr_referenced pages which tells us that we are hitting referenced
	  pages which are deactivated. If this is a large part of the
	  reported nr_deactivated pages then we might be hitting into
	  the active list too early because they might be still part of
	  the working set. This might help to debug performance issues.
	- nr_activated pages which tells us how many pages are kept on the
	  active list - mostly exec file backed pages. A high number can
	  indicate that we might be trashing on executables.

Changes since v1
- report nr_taken pages as per Minchan
- report nr_activated as per Minchan
- do not report nr_freed pages because that would add a tiny overhead to
  free_hot_cold_page_list which is a hot path
- do not report nr_unevictable because we can report this number via a
  different and more generic tracepoint in putback_lru_page
- fix move_active_pages_to_lru to report proper page count when we hit
  into large pages
- drop nr_scanned because this can be obtained from
  trace_mm_vmscan_lru_isolate as per Minchan

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/trace/events/vmscan.h | 36 ++++++++++++++++++++++++++++++++++++
 mm/vmscan.c                   | 18 ++++++++++++++----
 2 files changed, 50 insertions(+), 4 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 39bad8921ca1..087c0b625ba7 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -363,6 +363,42 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 		show_reclaim_flags(__entry->reclaim_flags))
 );
 
+TRACE_EVENT(mm_vmscan_lru_shrink_active,
+
+	TP_PROTO(int nid, unsigned long nr_taken,
+		unsigned long nr_activate, unsigned long nr_deactivated,
+		unsigned long nr_referenced, int priority, int file),
+
+	TP_ARGS(nid, nr_taken, nr_activate, nr_deactivated, nr_referenced, priority, file),
+
+	TP_STRUCT__entry(
+		__field(int, nid)
+		__field(unsigned long, nr_taken)
+		__field(unsigned long, nr_activate)
+		__field(unsigned long, nr_deactivated)
+		__field(unsigned long, nr_referenced)
+		__field(int, priority)
+		__field(int, reclaim_flags)
+	),
+
+	TP_fast_assign(
+		__entry->nid = nid;
+		__entry->nr_taken = nr_taken;
+		__entry->nr_activate = nr_activate;
+		__entry->nr_deactivated = nr_deactivated;
+		__entry->nr_referenced = nr_referenced;
+		__entry->priority = priority;
+		__entry->reclaim_flags = trace_shrink_flags(file);
+	),
+
+	TP_printk("nid=%d nr_taken=%ld nr_activated=%ld nr_deactivated=%ld nr_referenced=%ld priority=%d flags=%s",
+		__entry->nid,
+		__entry->nr_taken,
+		__entry->nr_activate, __entry->nr_deactivated, __entry->nr_referenced,
+		__entry->priority,
+		show_reclaim_flags(__entry->reclaim_flags))
+);
+
 #endif /* _TRACE_VMSCAN_H */
 
 /* This part must be outside protection */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c4abf08861d2..70d1c55463c0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1846,9 +1846,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
  *
  * The downside is that we have to touch page->_refcount against each page.
  * But we had to alter page->flags anyway.
+ *
+ * Returns the number of pages moved to the given lru.
  */
 
-static void move_active_pages_to_lru(struct lruvec *lruvec,
+static unsigned move_active_pages_to_lru(struct lruvec *lruvec,
 				     struct list_head *list,
 				     struct list_head *pages_to_free,
 				     enum lru_list lru)
@@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
 	unsigned long pgmoved = 0;
 	struct page *page;
 	int nr_pages;
+	int nr_moved = 0;
 
 	while (!list_empty(list)) {
 		page = lru_to_page(list);
@@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
 				spin_lock_irq(&pgdat->lru_lock);
 			} else
 				list_add(&page->lru, pages_to_free);
+		} else {
+			nr_moved += nr_pages;
 		}
 	}
 
 	if (!is_active_lru(lru))
 		__count_vm_events(PGDEACTIVATE, pgmoved);
+
+	return nr_moved;
 }
 
 static void shrink_active_list(unsigned long nr_to_scan,
@@ -1902,7 +1909,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	LIST_HEAD(l_inactive);
 	struct page *page;
 	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
-	unsigned long nr_rotated = 0;
+	unsigned nr_deactivate, nr_activate;
+	unsigned nr_rotated = 0;
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
@@ -1980,13 +1988,15 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	 */
 	reclaim_stat->recent_rotated[file] += nr_rotated;
 
-	move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
-	move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
+	nr_activate = move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
+	nr_deactivate = move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
 	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
 	spin_unlock_irq(&pgdat->lru_lock);
 
 	mem_cgroup_uncharge_list(&l_hold);
 	free_hot_cold_page_list(&l_hold, true);
+	trace_mm_vmscan_lru_shrink_active(pgdat->node_id, nr_taken, nr_activate,
+			nr_deactivate, nr_rotated, sc->priority, file);
 }
 
 /*
-- 
2.11.0

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2017-01-04 10:19   ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

Our reclaim process has several tracepoints to tell us more about how
things are progressing. We are, however, missing a tracepoint to track
active list aging. Introduce mm_vmscan_lru_shrink_active which reports
the number of
	- nr_scanned, nr_taken pages to tell us the LRU isolation
	  effectiveness.
	- nr_referenced pages which tells us that we are hitting referenced
	  pages which are deactivated. If this is a large part of the
	  reported nr_deactivated pages then we might be hitting into
	  the active list too early because they might be still part of
	  the working set. This might help to debug performance issues.
	- nr_activated pages which tells us how many pages are kept on the
	  active list - mostly exec file backed pages. A high number can
	  indicate that we might be trashing on executables.

Changes since v1
- report nr_taken pages as per Minchan
- report nr_activated as per Minchan
- do not report nr_freed pages because that would add a tiny overhead to
  free_hot_cold_page_list which is a hot path
- do not report nr_unevictable because we can report this number via a
  different and more generic tracepoint in putback_lru_page
- fix move_active_pages_to_lru to report proper page count when we hit
  into large pages
- drop nr_scanned because this can be obtained from
  trace_mm_vmscan_lru_isolate as per Minchan

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/trace/events/vmscan.h | 36 ++++++++++++++++++++++++++++++++++++
 mm/vmscan.c                   | 18 ++++++++++++++----
 2 files changed, 50 insertions(+), 4 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 39bad8921ca1..087c0b625ba7 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -363,6 +363,42 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 		show_reclaim_flags(__entry->reclaim_flags))
 );
 
+TRACE_EVENT(mm_vmscan_lru_shrink_active,
+
+	TP_PROTO(int nid, unsigned long nr_taken,
+		unsigned long nr_activate, unsigned long nr_deactivated,
+		unsigned long nr_referenced, int priority, int file),
+
+	TP_ARGS(nid, nr_taken, nr_activate, nr_deactivated, nr_referenced, priority, file),
+
+	TP_STRUCT__entry(
+		__field(int, nid)
+		__field(unsigned long, nr_taken)
+		__field(unsigned long, nr_activate)
+		__field(unsigned long, nr_deactivated)
+		__field(unsigned long, nr_referenced)
+		__field(int, priority)
+		__field(int, reclaim_flags)
+	),
+
+	TP_fast_assign(
+		__entry->nid = nid;
+		__entry->nr_taken = nr_taken;
+		__entry->nr_activate = nr_activate;
+		__entry->nr_deactivated = nr_deactivated;
+		__entry->nr_referenced = nr_referenced;
+		__entry->priority = priority;
+		__entry->reclaim_flags = trace_shrink_flags(file);
+	),
+
+	TP_printk("nid=%d nr_taken=%ld nr_activated=%ld nr_deactivated=%ld nr_referenced=%ld priority=%d flags=%s",
+		__entry->nid,
+		__entry->nr_taken,
+		__entry->nr_activate, __entry->nr_deactivated, __entry->nr_referenced,
+		__entry->priority,
+		show_reclaim_flags(__entry->reclaim_flags))
+);
+
 #endif /* _TRACE_VMSCAN_H */
 
 /* This part must be outside protection */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c4abf08861d2..70d1c55463c0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1846,9 +1846,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
  *
  * The downside is that we have to touch page->_refcount against each page.
  * But we had to alter page->flags anyway.
+ *
+ * Returns the number of pages moved to the given lru.
  */
 
-static void move_active_pages_to_lru(struct lruvec *lruvec,
+static unsigned move_active_pages_to_lru(struct lruvec *lruvec,
 				     struct list_head *list,
 				     struct list_head *pages_to_free,
 				     enum lru_list lru)
@@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
 	unsigned long pgmoved = 0;
 	struct page *page;
 	int nr_pages;
+	int nr_moved = 0;
 
 	while (!list_empty(list)) {
 		page = lru_to_page(list);
@@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
 				spin_lock_irq(&pgdat->lru_lock);
 			} else
 				list_add(&page->lru, pages_to_free);
+		} else {
+			nr_moved += nr_pages;
 		}
 	}
 
 	if (!is_active_lru(lru))
 		__count_vm_events(PGDEACTIVATE, pgmoved);
+
+	return nr_moved;
 }
 
 static void shrink_active_list(unsigned long nr_to_scan,
@@ -1902,7 +1909,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	LIST_HEAD(l_inactive);
 	struct page *page;
 	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
-	unsigned long nr_rotated = 0;
+	unsigned nr_deactivate, nr_activate;
+	unsigned nr_rotated = 0;
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
@@ -1980,13 +1988,15 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	 */
 	reclaim_stat->recent_rotated[file] += nr_rotated;
 
-	move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
-	move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
+	nr_activate = move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
+	nr_deactivate = move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
 	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
 	spin_unlock_irq(&pgdat->lru_lock);
 
 	mem_cgroup_uncharge_list(&l_hold);
 	free_hot_cold_page_list(&l_hold, true);
+	trace_mm_vmscan_lru_shrink_active(pgdat->node_id, nr_taken, nr_activate,
+			nr_deactivate, nr_rotated, sc->priority, file);
 }
 
 /*
-- 
2.11.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 3/7] mm, vmscan: show the number of skipped pages in mm_vmscan_lru_isolate
  2017-01-04 10:19 ` Michal Hocko
@ 2017-01-04 10:19   ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

mm_vmscan_lru_isolate shows the number of requested, scanned and taken
pages. This is mostly OK but on 32b systems the number of scanned pages
is quite misleading because it includes both the scanned and skipped
pages.  Moreover the skipped part is scaled based on the number of taken
pages. Let's report the exact numbers without any additional logic and
add the number of skipped pages. This should make the reported data much
more easier to interpret.

Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/trace/events/vmscan.h |  8 ++++++--
 mm/vmscan.c                   | 10 +++++-----
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 087c0b625ba7..36c999f806bf 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -274,17 +274,19 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
 		int order,
 		unsigned long nr_requested,
 		unsigned long nr_scanned,
+		unsigned long nr_skipped,
 		unsigned long nr_taken,
 		isolate_mode_t isolate_mode,
 		int file),
 
-	TP_ARGS(classzone_idx, order, nr_requested, nr_scanned, nr_taken, isolate_mode, file),
+	TP_ARGS(classzone_idx, order, nr_requested, nr_scanned, nr_skipped, nr_taken, isolate_mode, file),
 
 	TP_STRUCT__entry(
 		__field(int, classzone_idx)
 		__field(int, order)
 		__field(unsigned long, nr_requested)
 		__field(unsigned long, nr_scanned)
+		__field(unsigned long, nr_skipped)
 		__field(unsigned long, nr_taken)
 		__field(isolate_mode_t, isolate_mode)
 		__field(int, file)
@@ -295,17 +297,19 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
 		__entry->order = order;
 		__entry->nr_requested = nr_requested;
 		__entry->nr_scanned = nr_scanned;
+		__entry->nr_skipped = nr_skipped;
 		__entry->nr_taken = nr_taken;
 		__entry->isolate_mode = isolate_mode;
 		__entry->file = file;
 	),
 
-	TP_printk("isolate_mode=%d classzone=%d order=%d nr_requested=%lu nr_scanned=%lu nr_taken=%lu file=%d",
+	TP_printk("isolate_mode=%d classzone=%d order=%d nr_requested=%lu nr_scanned=%lu nr_skipped=%lu nr_taken=%lu file=%d",
 		__entry->isolate_mode,
 		__entry->classzone_idx,
 		__entry->order,
 		__entry->nr_requested,
 		__entry->nr_scanned,
+		__entry->nr_skipped,
 		__entry->nr_taken,
 		__entry->file)
 );
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 70d1c55463c0..31c623d5acb4 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1428,6 +1428,7 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
 	unsigned long nr_taken = 0;
 	unsigned long nr_zone_taken[MAX_NR_ZONES] = { 0 };
 	unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
+	unsigned long skipped = 0, total_skipped = 0;
 	unsigned long scan, nr_pages;
 	LIST_HEAD(pages_skipped);
 
@@ -1479,14 +1480,13 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
 	 */
 	if (!list_empty(&pages_skipped)) {
 		int zid;
-		unsigned long total_skipped = 0;
 
 		for (zid = 0; zid < MAX_NR_ZONES; zid++) {
 			if (!nr_skipped[zid])
 				continue;
 
 			__count_zid_vm_events(PGSCAN_SKIP, zid, nr_skipped[zid]);
-			total_skipped += nr_skipped[zid];
+			skipped += nr_skipped[zid];
 		}
 
 		/*
@@ -1494,13 +1494,13 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
 		 * close to unreclaimable. If the LRU list is empty, account
 		 * skipped pages as a full scan.
 		 */
-		scan += list_empty(src) ? total_skipped : total_skipped >> 2;
+		total_skipped = list_empty(src) ? skipped : skipped >> 2;
 
 		list_splice(&pages_skipped, src);
 	}
-	*nr_scanned = scan;
+	*nr_scanned = scan + total_skipped;
 	trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, nr_to_scan, scan,
-				    nr_taken, mode, is_file_lru(lru));
+				    skipped, nr_taken, mode, is_file_lru(lru));
 	update_lru_sizes(lruvec, lru, nr_zone_taken, nr_taken);
 	return nr_taken;
 }
-- 
2.11.0

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 3/7] mm, vmscan: show the number of skipped pages in mm_vmscan_lru_isolate
@ 2017-01-04 10:19   ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

mm_vmscan_lru_isolate shows the number of requested, scanned and taken
pages. This is mostly OK but on 32b systems the number of scanned pages
is quite misleading because it includes both the scanned and skipped
pages.  Moreover the skipped part is scaled based on the number of taken
pages. Let's report the exact numbers without any additional logic and
add the number of skipped pages. This should make the reported data much
more easier to interpret.

Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/trace/events/vmscan.h |  8 ++++++--
 mm/vmscan.c                   | 10 +++++-----
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 087c0b625ba7..36c999f806bf 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -274,17 +274,19 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
 		int order,
 		unsigned long nr_requested,
 		unsigned long nr_scanned,
+		unsigned long nr_skipped,
 		unsigned long nr_taken,
 		isolate_mode_t isolate_mode,
 		int file),
 
-	TP_ARGS(classzone_idx, order, nr_requested, nr_scanned, nr_taken, isolate_mode, file),
+	TP_ARGS(classzone_idx, order, nr_requested, nr_scanned, nr_skipped, nr_taken, isolate_mode, file),
 
 	TP_STRUCT__entry(
 		__field(int, classzone_idx)
 		__field(int, order)
 		__field(unsigned long, nr_requested)
 		__field(unsigned long, nr_scanned)
+		__field(unsigned long, nr_skipped)
 		__field(unsigned long, nr_taken)
 		__field(isolate_mode_t, isolate_mode)
 		__field(int, file)
@@ -295,17 +297,19 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
 		__entry->order = order;
 		__entry->nr_requested = nr_requested;
 		__entry->nr_scanned = nr_scanned;
+		__entry->nr_skipped = nr_skipped;
 		__entry->nr_taken = nr_taken;
 		__entry->isolate_mode = isolate_mode;
 		__entry->file = file;
 	),
 
-	TP_printk("isolate_mode=%d classzone=%d order=%d nr_requested=%lu nr_scanned=%lu nr_taken=%lu file=%d",
+	TP_printk("isolate_mode=%d classzone=%d order=%d nr_requested=%lu nr_scanned=%lu nr_skipped=%lu nr_taken=%lu file=%d",
 		__entry->isolate_mode,
 		__entry->classzone_idx,
 		__entry->order,
 		__entry->nr_requested,
 		__entry->nr_scanned,
+		__entry->nr_skipped,
 		__entry->nr_taken,
 		__entry->file)
 );
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 70d1c55463c0..31c623d5acb4 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1428,6 +1428,7 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
 	unsigned long nr_taken = 0;
 	unsigned long nr_zone_taken[MAX_NR_ZONES] = { 0 };
 	unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
+	unsigned long skipped = 0, total_skipped = 0;
 	unsigned long scan, nr_pages;
 	LIST_HEAD(pages_skipped);
 
@@ -1479,14 +1480,13 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
 	 */
 	if (!list_empty(&pages_skipped)) {
 		int zid;
-		unsigned long total_skipped = 0;
 
 		for (zid = 0; zid < MAX_NR_ZONES; zid++) {
 			if (!nr_skipped[zid])
 				continue;
 
 			__count_zid_vm_events(PGSCAN_SKIP, zid, nr_skipped[zid]);
-			total_skipped += nr_skipped[zid];
+			skipped += nr_skipped[zid];
 		}
 
 		/*
@@ -1494,13 +1494,13 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
 		 * close to unreclaimable. If the LRU list is empty, account
 		 * skipped pages as a full scan.
 		 */
-		scan += list_empty(src) ? total_skipped : total_skipped >> 2;
+		total_skipped = list_empty(src) ? skipped : skipped >> 2;
 
 		list_splice(&pages_skipped, src);
 	}
-	*nr_scanned = scan;
+	*nr_scanned = scan + total_skipped;
 	trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, nr_to_scan, scan,
-				    nr_taken, mode, is_file_lru(lru));
+				    skipped, nr_taken, mode, is_file_lru(lru));
 	update_lru_sizes(lruvec, lru, nr_zone_taken, nr_taken);
 	return nr_taken;
 }
-- 
2.11.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 4/7] mm, vmscan: show LRU name in mm_vmscan_lru_isolate tracepoint
  2017-01-04 10:19 ` Michal Hocko
@ 2017-01-04 10:19   ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

mm_vmscan_lru_isolate currently prints only whether the LRU we isolate
from is file or anonymous but we do not know which LRU this is.

It is useful to know whether the list is active or inactive, since we
are using the same function to isolate pages from both of them and it's
hard to distinguish otherwise.

Chaneges since v1
- drop LRU_ prefix from names and use lowercase as per Vlastimil
- move and convert show_lru_name to mmflags.h EM magic as per Vlastimil

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/trace/events/mmflags.h |  8 ++++++++
 include/trace/events/vmscan.h  | 12 ++++++------
 mm/vmscan.c                    |  2 +-
 3 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index aa4caa6914a9..6172afa2fd82 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -240,6 +240,13 @@ IF_HAVE_VM_SOFTDIRTY(VM_SOFTDIRTY,	"softdirty"	)		\
 	IFDEF_ZONE_HIGHMEM(	EM (ZONE_HIGHMEM,"HighMem"))	\
 				EMe(ZONE_MOVABLE,"Movable")
 
+#define LRU_NAMES		\
+		EM (LRU_INACTIVE_ANON, "inactive_anon") \
+		EM (LRU_ACTIVE_ANON, "active_anon") \
+		EM (LRU_INACTIVE_FILE, "inactive_file") \
+		EM (LRU_ACTIVE_FILE, "active_file") \
+		EMe(LRU_UNEVICTABLE, "unevictable")
+
 /*
  * First define the enums in the above macros to be exported to userspace
  * via TRACE_DEFINE_ENUM().
@@ -253,6 +260,7 @@ COMPACTION_STATUS
 COMPACTION_PRIORITY
 COMPACTION_FEEDBACK
 ZONE_TYPE
+LRU_NAMES
 
 /*
  * Now redefine the EM() and EMe() macros to map the enums to the strings
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 36c999f806bf..7ec59e0432c4 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -277,9 +277,9 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
 		unsigned long nr_skipped,
 		unsigned long nr_taken,
 		isolate_mode_t isolate_mode,
-		int file),
+		int lru),
 
-	TP_ARGS(classzone_idx, order, nr_requested, nr_scanned, nr_skipped, nr_taken, isolate_mode, file),
+	TP_ARGS(classzone_idx, order, nr_requested, nr_scanned, nr_skipped, nr_taken, isolate_mode, lru),
 
 	TP_STRUCT__entry(
 		__field(int, classzone_idx)
@@ -289,7 +289,7 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
 		__field(unsigned long, nr_skipped)
 		__field(unsigned long, nr_taken)
 		__field(isolate_mode_t, isolate_mode)
-		__field(int, file)
+		__field(int, lru)
 	),
 
 	TP_fast_assign(
@@ -300,10 +300,10 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
 		__entry->nr_skipped = nr_skipped;
 		__entry->nr_taken = nr_taken;
 		__entry->isolate_mode = isolate_mode;
-		__entry->file = file;
+		__entry->lru = lru;
 	),
 
-	TP_printk("isolate_mode=%d classzone=%d order=%d nr_requested=%lu nr_scanned=%lu nr_skipped=%lu nr_taken=%lu file=%d",
+	TP_printk("isolate_mode=%d classzone=%d order=%d nr_requested=%lu nr_scanned=%lu nr_skipped=%lu nr_taken=%lu lru=%s",
 		__entry->isolate_mode,
 		__entry->classzone_idx,
 		__entry->order,
@@ -311,7 +311,7 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
 		__entry->nr_scanned,
 		__entry->nr_skipped,
 		__entry->nr_taken,
-		__entry->file)
+		__print_symbolic(__entry->lru, LRU_NAMES))
 );
 
 TRACE_EVENT(mm_vmscan_writepage,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 31c623d5acb4..13758aaed78b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1500,7 +1500,7 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
 	}
 	*nr_scanned = scan + total_skipped;
 	trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, nr_to_scan, scan,
-				    skipped, nr_taken, mode, is_file_lru(lru));
+				    skipped, nr_taken, mode, lru);
 	update_lru_sizes(lruvec, lru, nr_zone_taken, nr_taken);
 	return nr_taken;
 }
-- 
2.11.0

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 4/7] mm, vmscan: show LRU name in mm_vmscan_lru_isolate tracepoint
@ 2017-01-04 10:19   ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

mm_vmscan_lru_isolate currently prints only whether the LRU we isolate
from is file or anonymous but we do not know which LRU this is.

It is useful to know whether the list is active or inactive, since we
are using the same function to isolate pages from both of them and it's
hard to distinguish otherwise.

Chaneges since v1
- drop LRU_ prefix from names and use lowercase as per Vlastimil
- move and convert show_lru_name to mmflags.h EM magic as per Vlastimil

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/trace/events/mmflags.h |  8 ++++++++
 include/trace/events/vmscan.h  | 12 ++++++------
 mm/vmscan.c                    |  2 +-
 3 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index aa4caa6914a9..6172afa2fd82 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -240,6 +240,13 @@ IF_HAVE_VM_SOFTDIRTY(VM_SOFTDIRTY,	"softdirty"	)		\
 	IFDEF_ZONE_HIGHMEM(	EM (ZONE_HIGHMEM,"HighMem"))	\
 				EMe(ZONE_MOVABLE,"Movable")
 
+#define LRU_NAMES		\
+		EM (LRU_INACTIVE_ANON, "inactive_anon") \
+		EM (LRU_ACTIVE_ANON, "active_anon") \
+		EM (LRU_INACTIVE_FILE, "inactive_file") \
+		EM (LRU_ACTIVE_FILE, "active_file") \
+		EMe(LRU_UNEVICTABLE, "unevictable")
+
 /*
  * First define the enums in the above macros to be exported to userspace
  * via TRACE_DEFINE_ENUM().
@@ -253,6 +260,7 @@ COMPACTION_STATUS
 COMPACTION_PRIORITY
 COMPACTION_FEEDBACK
 ZONE_TYPE
+LRU_NAMES
 
 /*
  * Now redefine the EM() and EMe() macros to map the enums to the strings
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 36c999f806bf..7ec59e0432c4 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -277,9 +277,9 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
 		unsigned long nr_skipped,
 		unsigned long nr_taken,
 		isolate_mode_t isolate_mode,
-		int file),
+		int lru),
 
-	TP_ARGS(classzone_idx, order, nr_requested, nr_scanned, nr_skipped, nr_taken, isolate_mode, file),
+	TP_ARGS(classzone_idx, order, nr_requested, nr_scanned, nr_skipped, nr_taken, isolate_mode, lru),
 
 	TP_STRUCT__entry(
 		__field(int, classzone_idx)
@@ -289,7 +289,7 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
 		__field(unsigned long, nr_skipped)
 		__field(unsigned long, nr_taken)
 		__field(isolate_mode_t, isolate_mode)
-		__field(int, file)
+		__field(int, lru)
 	),
 
 	TP_fast_assign(
@@ -300,10 +300,10 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
 		__entry->nr_skipped = nr_skipped;
 		__entry->nr_taken = nr_taken;
 		__entry->isolate_mode = isolate_mode;
-		__entry->file = file;
+		__entry->lru = lru;
 	),
 
-	TP_printk("isolate_mode=%d classzone=%d order=%d nr_requested=%lu nr_scanned=%lu nr_skipped=%lu nr_taken=%lu file=%d",
+	TP_printk("isolate_mode=%d classzone=%d order=%d nr_requested=%lu nr_scanned=%lu nr_skipped=%lu nr_taken=%lu lru=%s",
 		__entry->isolate_mode,
 		__entry->classzone_idx,
 		__entry->order,
@@ -311,7 +311,7 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
 		__entry->nr_scanned,
 		__entry->nr_skipped,
 		__entry->nr_taken,
-		__entry->file)
+		__print_symbolic(__entry->lru, LRU_NAMES))
 );
 
 TRACE_EVENT(mm_vmscan_writepage,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 31c623d5acb4..13758aaed78b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1500,7 +1500,7 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
 	}
 	*nr_scanned = scan + total_skipped;
 	trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, nr_to_scan, scan,
-				    skipped, nr_taken, mode, is_file_lru(lru));
+				    skipped, nr_taken, mode, lru);
 	update_lru_sizes(lruvec, lru, nr_zone_taken, nr_taken);
 	return nr_taken;
 }
-- 
2.11.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 5/7] mm, vmscan: extract shrink_page_list reclaim counters into a struct
  2017-01-04 10:19 ` Michal Hocko
@ 2017-01-04 10:19   ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

shrink_page_list returns quite some counters back to its caller. Extract
the existing 5 into struct reclaim_stat because this makes the code
easier to follow and also allows further counters to be returned.

While we are at it, make all of them unsigned rather than unsigned long
as we do not really need full 64b for them (we never scan more than
SWAP_CLUSTER_MAX pages at once). This should reduce some stack space.

This patch shouldn't introduce any functional change.

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/vmscan.c | 61 ++++++++++++++++++++++++++++++-------------------------------
 1 file changed, 30 insertions(+), 31 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 13758aaed78b..920e47a905c3 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -902,6 +902,14 @@ static void page_check_dirty_writeback(struct page *page,
 		mapping->a_ops->is_dirty_writeback(page, dirty, writeback);
 }
 
+struct reclaim_stat {
+	unsigned nr_dirty;
+	unsigned nr_unqueued_dirty;
+	unsigned nr_congested;
+	unsigned nr_writeback;
+	unsigned nr_immediate;
+};
+
 /*
  * shrink_page_list() returns the number of reclaimed pages
  */
@@ -909,22 +917,18 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 				      struct pglist_data *pgdat,
 				      struct scan_control *sc,
 				      enum ttu_flags ttu_flags,
-				      unsigned long *ret_nr_dirty,
-				      unsigned long *ret_nr_unqueued_dirty,
-				      unsigned long *ret_nr_congested,
-				      unsigned long *ret_nr_writeback,
-				      unsigned long *ret_nr_immediate,
+				      struct reclaim_stat *stat,
 				      bool force_reclaim)
 {
 	LIST_HEAD(ret_pages);
 	LIST_HEAD(free_pages);
 	int pgactivate = 0;
-	unsigned long nr_unqueued_dirty = 0;
-	unsigned long nr_dirty = 0;
-	unsigned long nr_congested = 0;
-	unsigned long nr_reclaimed = 0;
-	unsigned long nr_writeback = 0;
-	unsigned long nr_immediate = 0;
+	unsigned nr_unqueued_dirty = 0;
+	unsigned nr_dirty = 0;
+	unsigned nr_congested = 0;
+	unsigned nr_reclaimed = 0;
+	unsigned nr_writeback = 0;
+	unsigned nr_immediate = 0;
 
 	cond_resched();
 
@@ -1266,11 +1270,13 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 	list_splice(&ret_pages, page_list);
 	count_vm_events(PGACTIVATE, pgactivate);
 
-	*ret_nr_dirty += nr_dirty;
-	*ret_nr_congested += nr_congested;
-	*ret_nr_unqueued_dirty += nr_unqueued_dirty;
-	*ret_nr_writeback += nr_writeback;
-	*ret_nr_immediate += nr_immediate;
+	if (stat) {
+		stat->nr_dirty = nr_dirty;
+		stat->nr_congested = nr_congested;
+		stat->nr_unqueued_dirty = nr_unqueued_dirty;
+		stat->nr_writeback = nr_writeback;
+		stat->nr_immediate = nr_immediate;
+	}
 	return nr_reclaimed;
 }
 
@@ -1282,7 +1288,7 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 		.priority = DEF_PRIORITY,
 		.may_unmap = 1,
 	};
-	unsigned long ret, dummy1, dummy2, dummy3, dummy4, dummy5;
+	unsigned long ret;
 	struct page *page, *next;
 	LIST_HEAD(clean_pages);
 
@@ -1295,8 +1301,7 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 	}
 
 	ret = shrink_page_list(&clean_pages, zone->zone_pgdat, &sc,
-			TTU_UNMAP|TTU_IGNORE_ACCESS,
-			&dummy1, &dummy2, &dummy3, &dummy4, &dummy5, true);
+			TTU_UNMAP|TTU_IGNORE_ACCESS, NULL, true);
 	list_splice(&clean_pages, page_list);
 	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -ret);
 	return ret;
@@ -1696,11 +1701,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	unsigned long nr_scanned;
 	unsigned long nr_reclaimed = 0;
 	unsigned long nr_taken;
-	unsigned long nr_dirty = 0;
-	unsigned long nr_congested = 0;
-	unsigned long nr_unqueued_dirty = 0;
-	unsigned long nr_writeback = 0;
-	unsigned long nr_immediate = 0;
+	struct reclaim_stat stat = {};
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
@@ -1745,9 +1746,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		return 0;
 
 	nr_reclaimed = shrink_page_list(&page_list, pgdat, sc, TTU_UNMAP,
-				&nr_dirty, &nr_unqueued_dirty, &nr_congested,
-				&nr_writeback, &nr_immediate,
-				false);
+				&stat, false);
 
 	spin_lock_irq(&pgdat->lru_lock);
 
@@ -1781,7 +1780,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	 * of pages under pages flagged for immediate reclaim and stall if any
 	 * are encountered in the nr_immediate check below.
 	 */
-	if (nr_writeback && nr_writeback == nr_taken)
+	if (stat.nr_writeback && stat.nr_writeback == nr_taken)
 		set_bit(PGDAT_WRITEBACK, &pgdat->flags);
 
 	/*
@@ -1793,7 +1792,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		 * Tag a zone as congested if all the dirty pages scanned were
 		 * backed by a congested BDI and wait_iff_congested will stall.
 		 */
-		if (nr_dirty && nr_dirty == nr_congested)
+		if (stat.nr_dirty && stat.nr_dirty == stat.nr_congested)
 			set_bit(PGDAT_CONGESTED, &pgdat->flags);
 
 		/*
@@ -1802,7 +1801,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		 * the pgdat PGDAT_DIRTY and kswapd will start writing pages from
 		 * reclaim context.
 		 */
-		if (nr_unqueued_dirty == nr_taken)
+		if (stat.nr_unqueued_dirty == nr_taken)
 			set_bit(PGDAT_DIRTY, &pgdat->flags);
 
 		/*
@@ -1811,7 +1810,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		 * that pages are cycling through the LRU faster than
 		 * they are written so also forcibly stall.
 		 */
-		if (nr_immediate && current_may_throttle())
+		if (stat.nr_immediate && current_may_throttle())
 			congestion_wait(BLK_RW_ASYNC, HZ/10);
 	}
 
-- 
2.11.0

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 5/7] mm, vmscan: extract shrink_page_list reclaim counters into a struct
@ 2017-01-04 10:19   ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

shrink_page_list returns quite some counters back to its caller. Extract
the existing 5 into struct reclaim_stat because this makes the code
easier to follow and also allows further counters to be returned.

While we are at it, make all of them unsigned rather than unsigned long
as we do not really need full 64b for them (we never scan more than
SWAP_CLUSTER_MAX pages at once). This should reduce some stack space.

This patch shouldn't introduce any functional change.

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/vmscan.c | 61 ++++++++++++++++++++++++++++++-------------------------------
 1 file changed, 30 insertions(+), 31 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 13758aaed78b..920e47a905c3 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -902,6 +902,14 @@ static void page_check_dirty_writeback(struct page *page,
 		mapping->a_ops->is_dirty_writeback(page, dirty, writeback);
 }
 
+struct reclaim_stat {
+	unsigned nr_dirty;
+	unsigned nr_unqueued_dirty;
+	unsigned nr_congested;
+	unsigned nr_writeback;
+	unsigned nr_immediate;
+};
+
 /*
  * shrink_page_list() returns the number of reclaimed pages
  */
@@ -909,22 +917,18 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 				      struct pglist_data *pgdat,
 				      struct scan_control *sc,
 				      enum ttu_flags ttu_flags,
-				      unsigned long *ret_nr_dirty,
-				      unsigned long *ret_nr_unqueued_dirty,
-				      unsigned long *ret_nr_congested,
-				      unsigned long *ret_nr_writeback,
-				      unsigned long *ret_nr_immediate,
+				      struct reclaim_stat *stat,
 				      bool force_reclaim)
 {
 	LIST_HEAD(ret_pages);
 	LIST_HEAD(free_pages);
 	int pgactivate = 0;
-	unsigned long nr_unqueued_dirty = 0;
-	unsigned long nr_dirty = 0;
-	unsigned long nr_congested = 0;
-	unsigned long nr_reclaimed = 0;
-	unsigned long nr_writeback = 0;
-	unsigned long nr_immediate = 0;
+	unsigned nr_unqueued_dirty = 0;
+	unsigned nr_dirty = 0;
+	unsigned nr_congested = 0;
+	unsigned nr_reclaimed = 0;
+	unsigned nr_writeback = 0;
+	unsigned nr_immediate = 0;
 
 	cond_resched();
 
@@ -1266,11 +1270,13 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 	list_splice(&ret_pages, page_list);
 	count_vm_events(PGACTIVATE, pgactivate);
 
-	*ret_nr_dirty += nr_dirty;
-	*ret_nr_congested += nr_congested;
-	*ret_nr_unqueued_dirty += nr_unqueued_dirty;
-	*ret_nr_writeback += nr_writeback;
-	*ret_nr_immediate += nr_immediate;
+	if (stat) {
+		stat->nr_dirty = nr_dirty;
+		stat->nr_congested = nr_congested;
+		stat->nr_unqueued_dirty = nr_unqueued_dirty;
+		stat->nr_writeback = nr_writeback;
+		stat->nr_immediate = nr_immediate;
+	}
 	return nr_reclaimed;
 }
 
@@ -1282,7 +1288,7 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 		.priority = DEF_PRIORITY,
 		.may_unmap = 1,
 	};
-	unsigned long ret, dummy1, dummy2, dummy3, dummy4, dummy5;
+	unsigned long ret;
 	struct page *page, *next;
 	LIST_HEAD(clean_pages);
 
@@ -1295,8 +1301,7 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 	}
 
 	ret = shrink_page_list(&clean_pages, zone->zone_pgdat, &sc,
-			TTU_UNMAP|TTU_IGNORE_ACCESS,
-			&dummy1, &dummy2, &dummy3, &dummy4, &dummy5, true);
+			TTU_UNMAP|TTU_IGNORE_ACCESS, NULL, true);
 	list_splice(&clean_pages, page_list);
 	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -ret);
 	return ret;
@@ -1696,11 +1701,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	unsigned long nr_scanned;
 	unsigned long nr_reclaimed = 0;
 	unsigned long nr_taken;
-	unsigned long nr_dirty = 0;
-	unsigned long nr_congested = 0;
-	unsigned long nr_unqueued_dirty = 0;
-	unsigned long nr_writeback = 0;
-	unsigned long nr_immediate = 0;
+	struct reclaim_stat stat = {};
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
@@ -1745,9 +1746,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		return 0;
 
 	nr_reclaimed = shrink_page_list(&page_list, pgdat, sc, TTU_UNMAP,
-				&nr_dirty, &nr_unqueued_dirty, &nr_congested,
-				&nr_writeback, &nr_immediate,
-				false);
+				&stat, false);
 
 	spin_lock_irq(&pgdat->lru_lock);
 
@@ -1781,7 +1780,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	 * of pages under pages flagged for immediate reclaim and stall if any
 	 * are encountered in the nr_immediate check below.
 	 */
-	if (nr_writeback && nr_writeback == nr_taken)
+	if (stat.nr_writeback && stat.nr_writeback == nr_taken)
 		set_bit(PGDAT_WRITEBACK, &pgdat->flags);
 
 	/*
@@ -1793,7 +1792,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		 * Tag a zone as congested if all the dirty pages scanned were
 		 * backed by a congested BDI and wait_iff_congested will stall.
 		 */
-		if (nr_dirty && nr_dirty == nr_congested)
+		if (stat.nr_dirty && stat.nr_dirty == stat.nr_congested)
 			set_bit(PGDAT_CONGESTED, &pgdat->flags);
 
 		/*
@@ -1802,7 +1801,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		 * the pgdat PGDAT_DIRTY and kswapd will start writing pages from
 		 * reclaim context.
 		 */
-		if (nr_unqueued_dirty == nr_taken)
+		if (stat.nr_unqueued_dirty == nr_taken)
 			set_bit(PGDAT_DIRTY, &pgdat->flags);
 
 		/*
@@ -1811,7 +1810,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		 * that pages are cycling through the LRU faster than
 		 * they are written so also forcibly stall.
 		 */
-		if (nr_immediate && current_may_throttle())
+		if (stat.nr_immediate && current_may_throttle())
 			congestion_wait(BLK_RW_ASYNC, HZ/10);
 	}
 
-- 
2.11.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 6/7] mm, vmscan: enhance mm_vmscan_lru_shrink_inactive tracepoint
  2017-01-04 10:19 ` Michal Hocko
@ 2017-01-04 10:19   ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

mm_vmscan_lru_shrink_inactive will currently report the number of
scanned and reclaimed pages. This doesn't give us an idea how the
reclaim went except for the overall effectiveness though. Export
and show other counters which will tell us why we couldn't reclaim
some pages.
	- nr_dirty, nr_writeback, nr_congested and nr_immediate tells
	  us how many pages are blocked due to IO
	- nr_activate tells us how many pages were moved to the active
	  list
	- nr_ref_keep reports how many pages are kept on the LRU due
	  to references (mostly for the file pages which are about to
	  go for another round through the inactive list)
	- nr_unmap_fail - how many pages failed to unmap

All these are rather low level so they might change in future but the
tracepoint is already implementation specific so no tools should be
depending on its stability.

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/trace/events/vmscan.h | 29 ++++++++++++++++++++++++++---
 mm/vmscan.c                   | 14 ++++++++++++++
 2 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 7ec59e0432c4..9037c1734294 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -340,14 +340,27 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 
 	TP_PROTO(int nid,
 		unsigned long nr_scanned, unsigned long nr_reclaimed,
+		unsigned long nr_dirty, unsigned long nr_writeback,
+		unsigned long nr_congested, unsigned long nr_immediate,
+		unsigned long nr_activate, unsigned long nr_ref_keep,
+		unsigned long nr_unmap_fail,
 		int priority, int file),
 
-	TP_ARGS(nid, nr_scanned, nr_reclaimed, priority, file),
+	TP_ARGS(nid, nr_scanned, nr_reclaimed, nr_dirty, nr_writeback,
+		nr_congested, nr_immediate, nr_activate, nr_ref_keep,
+		nr_unmap_fail, priority, file),
 
 	TP_STRUCT__entry(
 		__field(int, nid)
 		__field(unsigned long, nr_scanned)
 		__field(unsigned long, nr_reclaimed)
+		__field(unsigned long, nr_dirty)
+		__field(unsigned long, nr_writeback)
+		__field(unsigned long, nr_congested)
+		__field(unsigned long, nr_immediate)
+		__field(unsigned long, nr_activate)
+		__field(unsigned long, nr_ref_keep)
+		__field(unsigned long, nr_unmap_fail)
 		__field(int, priority)
 		__field(int, reclaim_flags)
 	),
@@ -356,14 +369,24 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 		__entry->nid = nid;
 		__entry->nr_scanned = nr_scanned;
 		__entry->nr_reclaimed = nr_reclaimed;
+		__entry->nr_dirty = nr_dirty;
+		__entry->nr_writeback = nr_writeback;
+		__entry->nr_congested = nr_congested;
+		__entry->nr_immediate = nr_immediate;
+		__entry->nr_activate = nr_activate;
+		__entry->nr_ref_keep = nr_ref_keep;
+		__entry->nr_unmap_fail = nr_unmap_fail;
 		__entry->priority = priority;
 		__entry->reclaim_flags = trace_shrink_flags(file);
 	),
 
-	TP_printk("nid=%d nr_scanned=%ld nr_reclaimed=%ld priority=%d flags=%s",
+	TP_printk("nid=%d nr_scanned=%ld nr_reclaimed=%ld nr_dirty=%ld nr_writeback=%ld nr_congested=%ld nr_immediate=%ld nr_activate=%ld nr_ref_keep=%ld nr_unmap_fail=%ld priority=%d flags=%s",
 		__entry->nid,
 		__entry->nr_scanned, __entry->nr_reclaimed,
-		__entry->priority,
+		__entry->nr_dirty, __entry->nr_writeback,
+		__entry->nr_congested, __entry->nr_immediate,
+		__entry->nr_activate, __entry->nr_ref_keep,
+		__entry->nr_unmap_fail, __entry->priority,
 		show_reclaim_flags(__entry->reclaim_flags))
 );
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 920e47a905c3..d05e42bee511 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -908,6 +908,9 @@ struct reclaim_stat {
 	unsigned nr_congested;
 	unsigned nr_writeback;
 	unsigned nr_immediate;
+	unsigned nr_activate;
+	unsigned nr_ref_keep;
+	unsigned nr_unmap_fail;
 };
 
 /*
@@ -929,6 +932,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 	unsigned nr_reclaimed = 0;
 	unsigned nr_writeback = 0;
 	unsigned nr_immediate = 0;
+	unsigned nr_ref_keep = 0;
+	unsigned nr_unmap_fail = 0;
 
 	cond_resched();
 
@@ -1067,6 +1072,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		case PAGEREF_ACTIVATE:
 			goto activate_locked;
 		case PAGEREF_KEEP:
+			nr_ref_keep++;
 			goto keep_locked;
 		case PAGEREF_RECLAIM:
 		case PAGEREF_RECLAIM_CLEAN:
@@ -1104,6 +1110,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 				(ttu_flags | TTU_BATCH_FLUSH | TTU_LZFREE) :
 				(ttu_flags | TTU_BATCH_FLUSH))) {
 			case SWAP_FAIL:
+				nr_unmap_fail++;
 				goto activate_locked;
 			case SWAP_AGAIN:
 				goto keep_locked;
@@ -1276,6 +1283,9 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		stat->nr_unqueued_dirty = nr_unqueued_dirty;
 		stat->nr_writeback = nr_writeback;
 		stat->nr_immediate = nr_immediate;
+		stat->nr_activate = pgactivate;
+		stat->nr_ref_keep = nr_ref_keep;
+		stat->nr_unmap_fail = nr_unmap_fail;
 	}
 	return nr_reclaimed;
 }
@@ -1825,6 +1835,10 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 
 	trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id,
 			nr_scanned, nr_reclaimed,
+			stat.nr_dirty,  stat.nr_writeback,
+			stat.nr_congested, stat.nr_immediate,
+			stat.nr_activate, stat.nr_ref_keep,
+			stat.nr_unmap_fail,
 			sc->priority, file);
 	return nr_reclaimed;
 }
-- 
2.11.0

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 6/7] mm, vmscan: enhance mm_vmscan_lru_shrink_inactive tracepoint
@ 2017-01-04 10:19   ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

mm_vmscan_lru_shrink_inactive will currently report the number of
scanned and reclaimed pages. This doesn't give us an idea how the
reclaim went except for the overall effectiveness though. Export
and show other counters which will tell us why we couldn't reclaim
some pages.
	- nr_dirty, nr_writeback, nr_congested and nr_immediate tells
	  us how many pages are blocked due to IO
	- nr_activate tells us how many pages were moved to the active
	  list
	- nr_ref_keep reports how many pages are kept on the LRU due
	  to references (mostly for the file pages which are about to
	  go for another round through the inactive list)
	- nr_unmap_fail - how many pages failed to unmap

All these are rather low level so they might change in future but the
tracepoint is already implementation specific so no tools should be
depending on its stability.

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/trace/events/vmscan.h | 29 ++++++++++++++++++++++++++---
 mm/vmscan.c                   | 14 ++++++++++++++
 2 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 7ec59e0432c4..9037c1734294 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -340,14 +340,27 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 
 	TP_PROTO(int nid,
 		unsigned long nr_scanned, unsigned long nr_reclaimed,
+		unsigned long nr_dirty, unsigned long nr_writeback,
+		unsigned long nr_congested, unsigned long nr_immediate,
+		unsigned long nr_activate, unsigned long nr_ref_keep,
+		unsigned long nr_unmap_fail,
 		int priority, int file),
 
-	TP_ARGS(nid, nr_scanned, nr_reclaimed, priority, file),
+	TP_ARGS(nid, nr_scanned, nr_reclaimed, nr_dirty, nr_writeback,
+		nr_congested, nr_immediate, nr_activate, nr_ref_keep,
+		nr_unmap_fail, priority, file),
 
 	TP_STRUCT__entry(
 		__field(int, nid)
 		__field(unsigned long, nr_scanned)
 		__field(unsigned long, nr_reclaimed)
+		__field(unsigned long, nr_dirty)
+		__field(unsigned long, nr_writeback)
+		__field(unsigned long, nr_congested)
+		__field(unsigned long, nr_immediate)
+		__field(unsigned long, nr_activate)
+		__field(unsigned long, nr_ref_keep)
+		__field(unsigned long, nr_unmap_fail)
 		__field(int, priority)
 		__field(int, reclaim_flags)
 	),
@@ -356,14 +369,24 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 		__entry->nid = nid;
 		__entry->nr_scanned = nr_scanned;
 		__entry->nr_reclaimed = nr_reclaimed;
+		__entry->nr_dirty = nr_dirty;
+		__entry->nr_writeback = nr_writeback;
+		__entry->nr_congested = nr_congested;
+		__entry->nr_immediate = nr_immediate;
+		__entry->nr_activate = nr_activate;
+		__entry->nr_ref_keep = nr_ref_keep;
+		__entry->nr_unmap_fail = nr_unmap_fail;
 		__entry->priority = priority;
 		__entry->reclaim_flags = trace_shrink_flags(file);
 	),
 
-	TP_printk("nid=%d nr_scanned=%ld nr_reclaimed=%ld priority=%d flags=%s",
+	TP_printk("nid=%d nr_scanned=%ld nr_reclaimed=%ld nr_dirty=%ld nr_writeback=%ld nr_congested=%ld nr_immediate=%ld nr_activate=%ld nr_ref_keep=%ld nr_unmap_fail=%ld priority=%d flags=%s",
 		__entry->nid,
 		__entry->nr_scanned, __entry->nr_reclaimed,
-		__entry->priority,
+		__entry->nr_dirty, __entry->nr_writeback,
+		__entry->nr_congested, __entry->nr_immediate,
+		__entry->nr_activate, __entry->nr_ref_keep,
+		__entry->nr_unmap_fail, __entry->priority,
 		show_reclaim_flags(__entry->reclaim_flags))
 );
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 920e47a905c3..d05e42bee511 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -908,6 +908,9 @@ struct reclaim_stat {
 	unsigned nr_congested;
 	unsigned nr_writeback;
 	unsigned nr_immediate;
+	unsigned nr_activate;
+	unsigned nr_ref_keep;
+	unsigned nr_unmap_fail;
 };
 
 /*
@@ -929,6 +932,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 	unsigned nr_reclaimed = 0;
 	unsigned nr_writeback = 0;
 	unsigned nr_immediate = 0;
+	unsigned nr_ref_keep = 0;
+	unsigned nr_unmap_fail = 0;
 
 	cond_resched();
 
@@ -1067,6 +1072,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		case PAGEREF_ACTIVATE:
 			goto activate_locked;
 		case PAGEREF_KEEP:
+			nr_ref_keep++;
 			goto keep_locked;
 		case PAGEREF_RECLAIM:
 		case PAGEREF_RECLAIM_CLEAN:
@@ -1104,6 +1110,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 				(ttu_flags | TTU_BATCH_FLUSH | TTU_LZFREE) :
 				(ttu_flags | TTU_BATCH_FLUSH))) {
 			case SWAP_FAIL:
+				nr_unmap_fail++;
 				goto activate_locked;
 			case SWAP_AGAIN:
 				goto keep_locked;
@@ -1276,6 +1283,9 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		stat->nr_unqueued_dirty = nr_unqueued_dirty;
 		stat->nr_writeback = nr_writeback;
 		stat->nr_immediate = nr_immediate;
+		stat->nr_activate = pgactivate;
+		stat->nr_ref_keep = nr_ref_keep;
+		stat->nr_unmap_fail = nr_unmap_fail;
 	}
 	return nr_reclaimed;
 }
@@ -1825,6 +1835,10 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 
 	trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id,
 			nr_scanned, nr_reclaimed,
+			stat.nr_dirty,  stat.nr_writeback,
+			stat.nr_congested, stat.nr_immediate,
+			stat.nr_activate, stat.nr_ref_keep,
+			stat.nr_unmap_fail,
 			sc->priority, file);
 	return nr_reclaimed;
 }
-- 
2.11.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 7/7] mm, vmscan: add mm_vmscan_inactive_list_is_low tracepoint
  2017-01-04 10:19 ` Michal Hocko
@ 2017-01-04 10:19   ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

Currently we have tracepoints for both active and inactive LRU lists
reclaim but we do not have any which would tell us why we we decided to
age the active list. Without that it is quite hard to diagnose
active/inactive lists balancing. Add mm_vmscan_inactive_list_is_low
tracepoint to tell us this information.

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/trace/events/vmscan.h | 40 ++++++++++++++++++++++++++++++++++++++++
 mm/vmscan.c                   | 23 ++++++++++++++---------
 2 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 9037c1734294..625a4b7967d0 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -15,6 +15,7 @@
 #define RECLAIM_WB_MIXED	0x0010u
 #define RECLAIM_WB_SYNC		0x0004u /* Unused, all reclaim async */
 #define RECLAIM_WB_ASYNC	0x0008u
+#define RECLAIM_WB_LRU		(RECLAIM_WB_ANON|RECLAIM_WB_FILE)
 
 #define show_reclaim_flags(flags)				\
 	(flags) ? __print_flags(flags, "|",			\
@@ -426,6 +427,45 @@ TRACE_EVENT(mm_vmscan_lru_shrink_active,
 		show_reclaim_flags(__entry->reclaim_flags))
 );
 
+TRACE_EVENT(mm_vmscan_inactive_list_is_low,
+
+	TP_PROTO(int nid, int reclaim_idx,
+		unsigned long total_inactive, unsigned long inactive,
+		unsigned long total_active, unsigned long active,
+		unsigned long ratio, int file),
+
+	TP_ARGS(nid, reclaim_idx, total_inactive, inactive, total_active, active, ratio, file),
+
+	TP_STRUCT__entry(
+		__field(int, nid)
+		__field(int, reclaim_idx)
+		__field(unsigned long, total_inactive)
+		__field(unsigned long, inactive)
+		__field(unsigned long, total_active)
+		__field(unsigned long, active)
+		__field(unsigned long, ratio)
+		__field(int, reclaim_flags)
+	),
+
+	TP_fast_assign(
+		__entry->nid = nid;
+		__entry->reclaim_idx = reclaim_idx;
+		__entry->total_inactive = total_inactive;
+		__entry->inactive = inactive;
+		__entry->total_active = total_active;
+		__entry->active = active;
+		__entry->ratio = ratio;
+		__entry->reclaim_flags = trace_shrink_flags(file) & RECLAIM_WB_LRU;
+	),
+
+	TP_printk("nid=%d reclaim_idx=%d total_inactive=%ld inactive=%ld total_active=%ld active=%ld ratio=%ld flags=%s",
+		__entry->nid,
+		__entry->reclaim_idx,
+		__entry->total_inactive, __entry->inactive,
+		__entry->total_active, __entry->active,
+		__entry->ratio,
+		show_reclaim_flags(__entry->reclaim_flags))
+);
 #endif /* _TRACE_VMSCAN_H */
 
 /* This part must be outside protection */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index d05e42bee511..2a5c6c3fed2d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2039,11 +2039,11 @@ static void shrink_active_list(unsigned long nr_to_scan,
  *   10TB     320        32GB
  */
 static bool inactive_list_is_low(struct lruvec *lruvec, bool file,
-						struct scan_control *sc)
+						struct scan_control *sc, bool trace)
 {
 	unsigned long inactive_ratio;
-	unsigned long inactive;
-	unsigned long active;
+	unsigned long total_inactive, inactive;
+	unsigned long total_active, active;
 	unsigned long gb;
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
 	int zid;
@@ -2055,8 +2055,8 @@ static bool inactive_list_is_low(struct lruvec *lruvec, bool file,
 	if (!file && !total_swap_pages)
 		return false;
 
-	inactive = lruvec_lru_size(lruvec, file * LRU_FILE);
-	active = lruvec_lru_size(lruvec, file * LRU_FILE + LRU_ACTIVE);
+	total_inactive = inactive = lruvec_lru_size(lruvec, file * LRU_FILE);
+	total_active = active = lruvec_lru_size(lruvec, file * LRU_FILE + LRU_ACTIVE);
 
 	/*
 	 * For zone-constrained allocations, it is necessary to check if
@@ -2085,6 +2085,11 @@ static bool inactive_list_is_low(struct lruvec *lruvec, bool file,
 	else
 		inactive_ratio = 1;
 
+	if (trace)
+		trace_mm_vmscan_inactive_list_is_low(pgdat->node_id,
+				sc->reclaim_idx,
+				total_inactive, inactive,
+				total_active, active, inactive_ratio, file);
 	return inactive * inactive_ratio < active;
 }
 
@@ -2092,7 +2097,7 @@ static unsigned long shrink_list(enum lru_list lru, unsigned long nr_to_scan,
 				 struct lruvec *lruvec, struct scan_control *sc)
 {
 	if (is_active_lru(lru)) {
-		if (inactive_list_is_low(lruvec, is_file_lru(lru), sc))
+		if (inactive_list_is_low(lruvec, is_file_lru(lru), sc, true))
 			shrink_active_list(nr_to_scan, lruvec, sc, lru);
 		return 0;
 	}
@@ -2223,7 +2228,7 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 	 * lruvec even if it has plenty of old anonymous pages unless the
 	 * system is under heavy pressure.
 	 */
-	if (!inactive_list_is_low(lruvec, true, sc) &&
+	if (!inactive_list_is_low(lruvec, true, sc, false) &&
 	    lruvec_lru_size(lruvec, LRU_INACTIVE_FILE) >> sc->priority) {
 		scan_balance = SCAN_FILE;
 		goto out;
@@ -2448,7 +2453,7 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 	 * Even if we did not try to evict anon pages at all, we want to
 	 * rebalance the anon lru active/inactive ratio.
 	 */
-	if (inactive_list_is_low(lruvec, false, sc))
+	if (inactive_list_is_low(lruvec, false, sc, true))
 		shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
 				   sc, LRU_ACTIVE_ANON);
 }
@@ -3098,7 +3103,7 @@ static void age_active_anon(struct pglist_data *pgdat,
 	do {
 		struct lruvec *lruvec = mem_cgroup_lruvec(pgdat, memcg);
 
-		if (inactive_list_is_low(lruvec, false, sc))
+		if (inactive_list_is_low(lruvec, false, sc, true))
 			shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
 					   sc, LRU_ACTIVE_ANON);
 
-- 
2.11.0

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 7/7] mm, vmscan: add mm_vmscan_inactive_list_is_low tracepoint
@ 2017-01-04 10:19   ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

Currently we have tracepoints for both active and inactive LRU lists
reclaim but we do not have any which would tell us why we we decided to
age the active list. Without that it is quite hard to diagnose
active/inactive lists balancing. Add mm_vmscan_inactive_list_is_low
tracepoint to tell us this information.

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/trace/events/vmscan.h | 40 ++++++++++++++++++++++++++++++++++++++++
 mm/vmscan.c                   | 23 ++++++++++++++---------
 2 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 9037c1734294..625a4b7967d0 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -15,6 +15,7 @@
 #define RECLAIM_WB_MIXED	0x0010u
 #define RECLAIM_WB_SYNC		0x0004u /* Unused, all reclaim async */
 #define RECLAIM_WB_ASYNC	0x0008u
+#define RECLAIM_WB_LRU		(RECLAIM_WB_ANON|RECLAIM_WB_FILE)
 
 #define show_reclaim_flags(flags)				\
 	(flags) ? __print_flags(flags, "|",			\
@@ -426,6 +427,45 @@ TRACE_EVENT(mm_vmscan_lru_shrink_active,
 		show_reclaim_flags(__entry->reclaim_flags))
 );
 
+TRACE_EVENT(mm_vmscan_inactive_list_is_low,
+
+	TP_PROTO(int nid, int reclaim_idx,
+		unsigned long total_inactive, unsigned long inactive,
+		unsigned long total_active, unsigned long active,
+		unsigned long ratio, int file),
+
+	TP_ARGS(nid, reclaim_idx, total_inactive, inactive, total_active, active, ratio, file),
+
+	TP_STRUCT__entry(
+		__field(int, nid)
+		__field(int, reclaim_idx)
+		__field(unsigned long, total_inactive)
+		__field(unsigned long, inactive)
+		__field(unsigned long, total_active)
+		__field(unsigned long, active)
+		__field(unsigned long, ratio)
+		__field(int, reclaim_flags)
+	),
+
+	TP_fast_assign(
+		__entry->nid = nid;
+		__entry->reclaim_idx = reclaim_idx;
+		__entry->total_inactive = total_inactive;
+		__entry->inactive = inactive;
+		__entry->total_active = total_active;
+		__entry->active = active;
+		__entry->ratio = ratio;
+		__entry->reclaim_flags = trace_shrink_flags(file) & RECLAIM_WB_LRU;
+	),
+
+	TP_printk("nid=%d reclaim_idx=%d total_inactive=%ld inactive=%ld total_active=%ld active=%ld ratio=%ld flags=%s",
+		__entry->nid,
+		__entry->reclaim_idx,
+		__entry->total_inactive, __entry->inactive,
+		__entry->total_active, __entry->active,
+		__entry->ratio,
+		show_reclaim_flags(__entry->reclaim_flags))
+);
 #endif /* _TRACE_VMSCAN_H */
 
 /* This part must be outside protection */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index d05e42bee511..2a5c6c3fed2d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2039,11 +2039,11 @@ static void shrink_active_list(unsigned long nr_to_scan,
  *   10TB     320        32GB
  */
 static bool inactive_list_is_low(struct lruvec *lruvec, bool file,
-						struct scan_control *sc)
+						struct scan_control *sc, bool trace)
 {
 	unsigned long inactive_ratio;
-	unsigned long inactive;
-	unsigned long active;
+	unsigned long total_inactive, inactive;
+	unsigned long total_active, active;
 	unsigned long gb;
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
 	int zid;
@@ -2055,8 +2055,8 @@ static bool inactive_list_is_low(struct lruvec *lruvec, bool file,
 	if (!file && !total_swap_pages)
 		return false;
 
-	inactive = lruvec_lru_size(lruvec, file * LRU_FILE);
-	active = lruvec_lru_size(lruvec, file * LRU_FILE + LRU_ACTIVE);
+	total_inactive = inactive = lruvec_lru_size(lruvec, file * LRU_FILE);
+	total_active = active = lruvec_lru_size(lruvec, file * LRU_FILE + LRU_ACTIVE);
 
 	/*
 	 * For zone-constrained allocations, it is necessary to check if
@@ -2085,6 +2085,11 @@ static bool inactive_list_is_low(struct lruvec *lruvec, bool file,
 	else
 		inactive_ratio = 1;
 
+	if (trace)
+		trace_mm_vmscan_inactive_list_is_low(pgdat->node_id,
+				sc->reclaim_idx,
+				total_inactive, inactive,
+				total_active, active, inactive_ratio, file);
 	return inactive * inactive_ratio < active;
 }
 
@@ -2092,7 +2097,7 @@ static unsigned long shrink_list(enum lru_list lru, unsigned long nr_to_scan,
 				 struct lruvec *lruvec, struct scan_control *sc)
 {
 	if (is_active_lru(lru)) {
-		if (inactive_list_is_low(lruvec, is_file_lru(lru), sc))
+		if (inactive_list_is_low(lruvec, is_file_lru(lru), sc, true))
 			shrink_active_list(nr_to_scan, lruvec, sc, lru);
 		return 0;
 	}
@@ -2223,7 +2228,7 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 	 * lruvec even if it has plenty of old anonymous pages unless the
 	 * system is under heavy pressure.
 	 */
-	if (!inactive_list_is_low(lruvec, true, sc) &&
+	if (!inactive_list_is_low(lruvec, true, sc, false) &&
 	    lruvec_lru_size(lruvec, LRU_INACTIVE_FILE) >> sc->priority) {
 		scan_balance = SCAN_FILE;
 		goto out;
@@ -2448,7 +2453,7 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 	 * Even if we did not try to evict anon pages at all, we want to
 	 * rebalance the anon lru active/inactive ratio.
 	 */
-	if (inactive_list_is_low(lruvec, false, sc))
+	if (inactive_list_is_low(lruvec, false, sc, true))
 		shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
 				   sc, LRU_ACTIVE_ANON);
 }
@@ -3098,7 +3103,7 @@ static void age_active_anon(struct pglist_data *pgdat,
 	do {
 		struct lruvec *lruvec = mem_cgroup_lruvec(pgdat, memcg);
 
-		if (inactive_list_is_low(lruvec, false, sc))
+		if (inactive_list_is_low(lruvec, false, sc, true))
 			shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
 					   sc, LRU_ACTIVE_ANON);
 
-- 
2.11.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 0/7 v2] vm, vmscan: enahance vmscan tracepoints
  2017-01-04 10:19 ` Michal Hocko
@ 2017-01-04 10:30   ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML

On Wed 04-01-17 11:19:35, Michal Hocko wrote:
> Hi,
> this is the second version of the patchset [1]. I hope I've addressed all
> the review feedback.

I forgot to mention that this is based on the latest mmotm +
http://lkml.kernel.org/r/20161220130135.15719-1-mhocko@kernel.org which
are sitting in the mm tree but haven't been released as a mmmotm yet.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 0/7 v2] vm, vmscan: enahance vmscan tracepoints
@ 2017-01-04 10:30   ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 10:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML

On Wed 04-01-17 11:19:35, Michal Hocko wrote:
> Hi,
> this is the second version of the patchset [1]. I hope I've addressed all
> the review feedback.

I forgot to mention that this is based on the latest mmotm +
http://lkml.kernel.org/r/20161220130135.15719-1-mhocko@kernel.org which
are sitting in the mm tree but haven't been released as a mmmotm yet.

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2017-01-04 10:19   ` Michal Hocko
@ 2017-01-04 12:52     ` Vlastimil Babka
  -1 siblings, 0 replies; 74+ messages in thread
From: Vlastimil Babka @ 2017-01-04 12:52 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Minchan Kim, Hillf Danton, linux-mm,
	LKML, Michal Hocko

On 01/04/2017 11:19 AM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> Our reclaim process has several tracepoints to tell us more about how
> things are progressing. We are, however, missing a tracepoint to track
> active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> the number of
> 	- nr_scanned, nr_taken pages to tell us the LRU isolation
> 	  effectiveness.

Well, this point is no longer true, is it...

> 	- nr_referenced pages which tells us that we are hitting referenced
> 	  pages which are deactivated. If this is a large part of the
> 	  reported nr_deactivated pages then we might be hitting into
> 	  the active list too early because they might be still part of
> 	  the working set. This might help to debug performance issues.
> 	- nr_activated pages which tells us how many pages are kept on the

"nr_activated" is slightly misleading? They remain active, they are not
being activated (that's why the pgactivate vmstat is also not increased
on them, right?). I guess rename to "nr_active" ? Or something like
"nr_remain_active" although that's longer.

[...]

> @@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
>  	unsigned long pgmoved = 0;
>  	struct page *page;
>  	int nr_pages;
> +	int nr_moved = 0;
>  
>  	while (!list_empty(list)) {
>  		page = lru_to_page(list);
> @@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
>  				spin_lock_irq(&pgdat->lru_lock);
>  			} else
>  				list_add(&page->lru, pages_to_free);
> +		} else {
> +			nr_moved += nr_pages;
>  		}
>  	}
>  
>  	if (!is_active_lru(lru))
>  		__count_vm_events(PGDEACTIVATE, pgmoved);

So we now have pgmoved and nr_moved. One is used for vmstat, other for
tracepoint, and the only difference is that vmstat includes pages where
we raced with page being unmapped from all pte's (IIUC?) and thus
removed from lru, which should be rather rare? I guess those are being
counted into vmstat only due to how the code evolved from using pagevec.
If we don't consider them in the tracepoint, then I'd suggest we don't
count them into vmstat either, and simplify this.

> +
> +	return nr_moved;
>  }
>  
>  static void shrink_active_list(unsigned long nr_to_scan,
> @@ -1902,7 +1909,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  	LIST_HEAD(l_inactive);
>  	struct page *page;
>  	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
> -	unsigned long nr_rotated = 0;
> +	unsigned nr_deactivate, nr_activate;
> +	unsigned nr_rotated = 0;
>  	isolate_mode_t isolate_mode = 0;
>  	int file = is_file_lru(lru);
>  	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
> @@ -1980,13 +1988,15 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  	 */
>  	reclaim_stat->recent_rotated[file] += nr_rotated;
>  
> -	move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
> -	move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
> +	nr_activate = move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
> +	nr_deactivate = move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
>  	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
>  	spin_unlock_irq(&pgdat->lru_lock);
>  
>  	mem_cgroup_uncharge_list(&l_hold);
>  	free_hot_cold_page_list(&l_hold, true);
> +	trace_mm_vmscan_lru_shrink_active(pgdat->node_id, nr_taken, nr_activate,
> +			nr_deactivate, nr_rotated, sc->priority, file);
>  }
>  
>  /*
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2017-01-04 12:52     ` Vlastimil Babka
  0 siblings, 0 replies; 74+ messages in thread
From: Vlastimil Babka @ 2017-01-04 12:52 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Minchan Kim, Hillf Danton, linux-mm,
	LKML, Michal Hocko

On 01/04/2017 11:19 AM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> Our reclaim process has several tracepoints to tell us more about how
> things are progressing. We are, however, missing a tracepoint to track
> active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> the number of
> 	- nr_scanned, nr_taken pages to tell us the LRU isolation
> 	  effectiveness.

Well, this point is no longer true, is it...

> 	- nr_referenced pages which tells us that we are hitting referenced
> 	  pages which are deactivated. If this is a large part of the
> 	  reported nr_deactivated pages then we might be hitting into
> 	  the active list too early because they might be still part of
> 	  the working set. This might help to debug performance issues.
> 	- nr_activated pages which tells us how many pages are kept on the

"nr_activated" is slightly misleading? They remain active, they are not
being activated (that's why the pgactivate vmstat is also not increased
on them, right?). I guess rename to "nr_active" ? Or something like
"nr_remain_active" although that's longer.

[...]

> @@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
>  	unsigned long pgmoved = 0;
>  	struct page *page;
>  	int nr_pages;
> +	int nr_moved = 0;
>  
>  	while (!list_empty(list)) {
>  		page = lru_to_page(list);
> @@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
>  				spin_lock_irq(&pgdat->lru_lock);
>  			} else
>  				list_add(&page->lru, pages_to_free);
> +		} else {
> +			nr_moved += nr_pages;
>  		}
>  	}
>  
>  	if (!is_active_lru(lru))
>  		__count_vm_events(PGDEACTIVATE, pgmoved);

So we now have pgmoved and nr_moved. One is used for vmstat, other for
tracepoint, and the only difference is that vmstat includes pages where
we raced with page being unmapped from all pte's (IIUC?) and thus
removed from lru, which should be rather rare? I guess those are being
counted into vmstat only due to how the code evolved from using pagevec.
If we don't consider them in the tracepoint, then I'd suggest we don't
count them into vmstat either, and simplify this.

> +
> +	return nr_moved;
>  }
>  
>  static void shrink_active_list(unsigned long nr_to_scan,
> @@ -1902,7 +1909,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  	LIST_HEAD(l_inactive);
>  	struct page *page;
>  	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
> -	unsigned long nr_rotated = 0;
> +	unsigned nr_deactivate, nr_activate;
> +	unsigned nr_rotated = 0;
>  	isolate_mode_t isolate_mode = 0;
>  	int file = is_file_lru(lru);
>  	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
> @@ -1980,13 +1988,15 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  	 */
>  	reclaim_stat->recent_rotated[file] += nr_rotated;
>  
> -	move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
> -	move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
> +	nr_activate = move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
> +	nr_deactivate = move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
>  	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
>  	spin_unlock_irq(&pgdat->lru_lock);
>  
>  	mem_cgroup_uncharge_list(&l_hold);
>  	free_hot_cold_page_list(&l_hold, true);
> +	trace_mm_vmscan_lru_shrink_active(pgdat->node_id, nr_taken, nr_activate,
> +			nr_deactivate, nr_rotated, sc->priority, file);
>  }
>  
>  /*
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2017-01-04 12:52     ` Vlastimil Babka
@ 2017-01-04 13:16       ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 13:16 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Mel Gorman, Johannes Weiner, Minchan Kim,
	Hillf Danton, linux-mm, LKML

On Wed 04-01-17 13:52:24, Vlastimil Babka wrote:
> On 01/04/2017 11:19 AM, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > Our reclaim process has several tracepoints to tell us more about how
> > things are progressing. We are, however, missing a tracepoint to track
> > active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> > the number of
> > 	- nr_scanned, nr_taken pages to tell us the LRU isolation
> > 	  effectiveness.
> 
> Well, this point is no longer true, is it...

ups, leftover
	- nr_take - the number of isolated pages

> > 	- nr_referenced pages which tells us that we are hitting referenced
> > 	  pages which are deactivated. If this is a large part of the
> > 	  reported nr_deactivated pages then we might be hitting into
> > 	  the active list too early because they might be still part of
> > 	  the working set. This might help to debug performance issues.
> > 	- nr_activated pages which tells us how many pages are kept on the
> 
> "nr_activated" is slightly misleading? They remain active, they are not
> being activated (that's why the pgactivate vmstat is also not increased
> on them, right?). I guess rename to "nr_active" ? Or something like
> "nr_remain_active" although that's longer.

will go with nr_active

> 
> [...]
> 
> > @@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
> >  	unsigned long pgmoved = 0;
> >  	struct page *page;
> >  	int nr_pages;
> > +	int nr_moved = 0;
> >  
> >  	while (!list_empty(list)) {
> >  		page = lru_to_page(list);
> > @@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
> >  				spin_lock_irq(&pgdat->lru_lock);
> >  			} else
> >  				list_add(&page->lru, pages_to_free);
> > +		} else {
> > +			nr_moved += nr_pages;
> >  		}
> >  	}
> >  
> >  	if (!is_active_lru(lru))
> >  		__count_vm_events(PGDEACTIVATE, pgmoved);
> 
> So we now have pgmoved and nr_moved. One is used for vmstat, other for
> tracepoint, and the only difference is that vmstat includes pages where
> we raced with page being unmapped from all pte's (IIUC?) and thus
> removed from lru, which should be rather rare? I guess those are being
> counted into vmstat only due to how the code evolved from using pagevec.
> If we don't consider them in the tracepoint, then I'd suggest we don't
> count them into vmstat either, and simplify this.

OK, but I would prefer to have this in a separate patch, OK?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2017-01-04 13:16       ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 13:16 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Mel Gorman, Johannes Weiner, Minchan Kim,
	Hillf Danton, linux-mm, LKML

On Wed 04-01-17 13:52:24, Vlastimil Babka wrote:
> On 01/04/2017 11:19 AM, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > Our reclaim process has several tracepoints to tell us more about how
> > things are progressing. We are, however, missing a tracepoint to track
> > active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> > the number of
> > 	- nr_scanned, nr_taken pages to tell us the LRU isolation
> > 	  effectiveness.
> 
> Well, this point is no longer true, is it...

ups, leftover
	- nr_take - the number of isolated pages

> > 	- nr_referenced pages which tells us that we are hitting referenced
> > 	  pages which are deactivated. If this is a large part of the
> > 	  reported nr_deactivated pages then we might be hitting into
> > 	  the active list too early because they might be still part of
> > 	  the working set. This might help to debug performance issues.
> > 	- nr_activated pages which tells us how many pages are kept on the
> 
> "nr_activated" is slightly misleading? They remain active, they are not
> being activated (that's why the pgactivate vmstat is also not increased
> on them, right?). I guess rename to "nr_active" ? Or something like
> "nr_remain_active" although that's longer.

will go with nr_active

> 
> [...]
> 
> > @@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
> >  	unsigned long pgmoved = 0;
> >  	struct page *page;
> >  	int nr_pages;
> > +	int nr_moved = 0;
> >  
> >  	while (!list_empty(list)) {
> >  		page = lru_to_page(list);
> > @@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
> >  				spin_lock_irq(&pgdat->lru_lock);
> >  			} else
> >  				list_add(&page->lru, pages_to_free);
> > +		} else {
> > +			nr_moved += nr_pages;
> >  		}
> >  	}
> >  
> >  	if (!is_active_lru(lru))
> >  		__count_vm_events(PGDEACTIVATE, pgmoved);
> 
> So we now have pgmoved and nr_moved. One is used for vmstat, other for
> tracepoint, and the only difference is that vmstat includes pages where
> we raced with page being unmapped from all pte's (IIUC?) and thus
> removed from lru, which should be rather rare? I guess those are being
> counted into vmstat only due to how the code evolved from using pagevec.
> If we don't consider them in the tracepoint, then I'd suggest we don't
> count them into vmstat either, and simplify this.

OK, but I would prefer to have this in a separate patch, OK?
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2017-01-04 13:16       ` Michal Hocko
@ 2017-01-04 13:34         ` Vlastimil Babka
  -1 siblings, 0 replies; 74+ messages in thread
From: Vlastimil Babka @ 2017-01-04 13:34 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Mel Gorman, Johannes Weiner, Minchan Kim,
	Hillf Danton, linux-mm, LKML

On 01/04/2017 02:16 PM, Michal Hocko wrote:
> On Wed 04-01-17 13:52:24, Vlastimil Babka wrote:
>> On 01/04/2017 11:19 AM, Michal Hocko wrote:
>>> From: Michal Hocko <mhocko@suse.com>
>>>
>>> Our reclaim process has several tracepoints to tell us more about how
>>> things are progressing. We are, however, missing a tracepoint to track
>>> active list aging. Introduce mm_vmscan_lru_shrink_active which reports
>>> the number of
>>> 	- nr_scanned, nr_taken pages to tell us the LRU isolation
>>> 	  effectiveness.
>>
>> Well, this point is no longer true, is it...
> 
> ups, leftover
> 	- nr_take - the number of isolated pages

nr_taken

> 
>>> 	- nr_referenced pages which tells us that we are hitting referenced
>>> 	  pages which are deactivated. If this is a large part of the
>>> 	  reported nr_deactivated pages then we might be hitting into
>>> 	  the active list too early because they might be still part of
>>> 	  the working set. This might help to debug performance issues.
>>> 	- nr_activated pages which tells us how many pages are kept on the
>>
>> "nr_activated" is slightly misleading? They remain active, they are not
>> being activated (that's why the pgactivate vmstat is also not increased
>> on them, right?). I guess rename to "nr_active" ? Or something like
>> "nr_remain_active" although that's longer.
> 
> will go with nr_active

OK.

> 
>>
>> [...]
>>
>>> @@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
>>>  	unsigned long pgmoved = 0;
>>>  	struct page *page;
>>>  	int nr_pages;
>>> +	int nr_moved = 0;
>>>  
>>>  	while (!list_empty(list)) {
>>>  		page = lru_to_page(list);
>>> @@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
>>>  				spin_lock_irq(&pgdat->lru_lock);
>>>  			} else
>>>  				list_add(&page->lru, pages_to_free);
>>> +		} else {
>>> +			nr_moved += nr_pages;
>>>  		}
>>>  	}
>>>  
>>>  	if (!is_active_lru(lru))
>>>  		__count_vm_events(PGDEACTIVATE, pgmoved);
>>
>> So we now have pgmoved and nr_moved. One is used for vmstat, other for
>> tracepoint, and the only difference is that vmstat includes pages where
>> we raced with page being unmapped from all pte's (IIUC?) and thus
>> removed from lru, which should be rather rare? I guess those are being
>> counted into vmstat only due to how the code evolved from using pagevec.
>> If we don't consider them in the tracepoint, then I'd suggest we don't
>> count them into vmstat either, and simplify this.
> 
> OK, but I would prefer to have this in a separate patch, OK?

Sure, thanks!

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2017-01-04 13:34         ` Vlastimil Babka
  0 siblings, 0 replies; 74+ messages in thread
From: Vlastimil Babka @ 2017-01-04 13:34 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Mel Gorman, Johannes Weiner, Minchan Kim,
	Hillf Danton, linux-mm, LKML

On 01/04/2017 02:16 PM, Michal Hocko wrote:
> On Wed 04-01-17 13:52:24, Vlastimil Babka wrote:
>> On 01/04/2017 11:19 AM, Michal Hocko wrote:
>>> From: Michal Hocko <mhocko@suse.com>
>>>
>>> Our reclaim process has several tracepoints to tell us more about how
>>> things are progressing. We are, however, missing a tracepoint to track
>>> active list aging. Introduce mm_vmscan_lru_shrink_active which reports
>>> the number of
>>> 	- nr_scanned, nr_taken pages to tell us the LRU isolation
>>> 	  effectiveness.
>>
>> Well, this point is no longer true, is it...
> 
> ups, leftover
> 	- nr_take - the number of isolated pages

nr_taken

> 
>>> 	- nr_referenced pages which tells us that we are hitting referenced
>>> 	  pages which are deactivated. If this is a large part of the
>>> 	  reported nr_deactivated pages then we might be hitting into
>>> 	  the active list too early because they might be still part of
>>> 	  the working set. This might help to debug performance issues.
>>> 	- nr_activated pages which tells us how many pages are kept on the
>>
>> "nr_activated" is slightly misleading? They remain active, they are not
>> being activated (that's why the pgactivate vmstat is also not increased
>> on them, right?). I guess rename to "nr_active" ? Or something like
>> "nr_remain_active" although that's longer.
> 
> will go with nr_active

OK.

> 
>>
>> [...]
>>
>>> @@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
>>>  	unsigned long pgmoved = 0;
>>>  	struct page *page;
>>>  	int nr_pages;
>>> +	int nr_moved = 0;
>>>  
>>>  	while (!list_empty(list)) {
>>>  		page = lru_to_page(list);
>>> @@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
>>>  				spin_lock_irq(&pgdat->lru_lock);
>>>  			} else
>>>  				list_add(&page->lru, pages_to_free);
>>> +		} else {
>>> +			nr_moved += nr_pages;
>>>  		}
>>>  	}
>>>  
>>>  	if (!is_active_lru(lru))
>>>  		__count_vm_events(PGDEACTIVATE, pgmoved);
>>
>> So we now have pgmoved and nr_moved. One is used for vmstat, other for
>> tracepoint, and the only difference is that vmstat includes pages where
>> we raced with page being unmapped from all pte's (IIUC?) and thus
>> removed from lru, which should be rather rare? I guess those are being
>> counted into vmstat only due to how the code evolved from using pagevec.
>> If we don't consider them in the tracepoint, then I'd suggest we don't
>> count them into vmstat either, and simplify this.
> 
> OK, but I would prefer to have this in a separate patch, OK?

Sure, thanks!


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2017-01-04 10:19   ` Michal Hocko
@ 2017-01-04 13:52     ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 13:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML

With fixed triggered by Vlastimil it should be like this.
---
>From b3a1480b54bf10924a9cd09c6d8b274fc81ca4ad Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Tue, 27 Dec 2016 13:18:20 +0100
Subject: [PATCH] mm, vmscan: add active list aging tracepoint

Our reclaim process has several tracepoints to tell us more about how
things are progressing. We are, however, missing a tracepoint to track
active list aging. Introduce mm_vmscan_lru_shrink_active which reports
the number of
	- nr_taken is number of isolated pages from the active list
	- nr_referenced pages which tells us that we are hitting referenced
	  pages which are deactivated. If this is a large part of the
	  reported nr_deactivated pages then we might be hitting into
	  the active list too early because they might be still part of
	  the working set. This might help to debug performance issues.
	- nr_active pages which tells us how many pages are kept on the
	  active list - mostly exec file backed pages. A high number can
	  indicate that we might be trashing on executables.

Changes since v1
- report nr_taken pages as per Minchan
- report nr_activated as per Minchan
- do not report nr_freed pages because that would add a tiny overhead to
  free_hot_cold_page_list which is a hot path
- do not report nr_unevictable because we can report this number via a
  different and more generic tracepoint in putback_lru_page
- fix move_active_pages_to_lru to report proper page count when we hit
  into large pages
- drop nr_scanned because this can be obtained from
  trace_mm_vmscan_lru_isolate as per Minchan

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/trace/events/vmscan.h | 36 ++++++++++++++++++++++++++++++++++++
 mm/vmscan.c                   | 18 ++++++++++++++----
 2 files changed, 50 insertions(+), 4 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 39bad8921ca1..c295d8f1b67a 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -363,6 +363,42 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 		show_reclaim_flags(__entry->reclaim_flags))
 );
 
+TRACE_EVENT(mm_vmscan_lru_shrink_active,
+
+	TP_PROTO(int nid, unsigned long nr_taken,
+		unsigned long nr_active, unsigned long nr_deactivated,
+		unsigned long nr_referenced, int priority, int file),
+
+	TP_ARGS(nid, nr_taken, nr_active, nr_deactivated, nr_referenced, priority, file),
+
+	TP_STRUCT__entry(
+		__field(int, nid)
+		__field(unsigned long, nr_taken)
+		__field(unsigned long, nr_active)
+		__field(unsigned long, nr_deactivated)
+		__field(unsigned long, nr_referenced)
+		__field(int, priority)
+		__field(int, reclaim_flags)
+	),
+
+	TP_fast_assign(
+		__entry->nid = nid;
+		__entry->nr_taken = nr_taken;
+		__entry->nr_active = nr_active;
+		__entry->nr_deactivated = nr_deactivated;
+		__entry->nr_referenced = nr_referenced;
+		__entry->priority = priority;
+		__entry->reclaim_flags = trace_shrink_flags(file);
+	),
+
+	TP_printk("nid=%d nr_taken=%ld nr_active=%ld nr_deactivated=%ld nr_referenced=%ld priority=%d flags=%s",
+		__entry->nid,
+		__entry->nr_taken,
+		__entry->nr_active, __entry->nr_deactivated, __entry->nr_referenced,
+		__entry->priority,
+		show_reclaim_flags(__entry->reclaim_flags))
+);
+
 #endif /* _TRACE_VMSCAN_H */
 
 /* This part must be outside protection */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c4abf08861d2..70d1c55463c0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1846,9 +1846,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
  *
  * The downside is that we have to touch page->_refcount against each page.
  * But we had to alter page->flags anyway.
+ *
+ * Returns the number of pages moved to the given lru.
  */
 
-static void move_active_pages_to_lru(struct lruvec *lruvec,
+static unsigned move_active_pages_to_lru(struct lruvec *lruvec,
 				     struct list_head *list,
 				     struct list_head *pages_to_free,
 				     enum lru_list lru)
@@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
 	unsigned long pgmoved = 0;
 	struct page *page;
 	int nr_pages;
+	int nr_moved = 0;
 
 	while (!list_empty(list)) {
 		page = lru_to_page(list);
@@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
 				spin_lock_irq(&pgdat->lru_lock);
 			} else
 				list_add(&page->lru, pages_to_free);
+		} else {
+			nr_moved += nr_pages;
 		}
 	}
 
 	if (!is_active_lru(lru))
 		__count_vm_events(PGDEACTIVATE, pgmoved);
+
+	return nr_moved;
 }
 
 static void shrink_active_list(unsigned long nr_to_scan,
@@ -1902,7 +1909,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	LIST_HEAD(l_inactive);
 	struct page *page;
 	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
-	unsigned long nr_rotated = 0;
+	unsigned nr_deactivate, nr_activate;
+	unsigned nr_rotated = 0;
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
@@ -1980,13 +1988,15 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	 */
 	reclaim_stat->recent_rotated[file] += nr_rotated;
 
-	move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
-	move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
+	nr_activate = move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
+	nr_deactivate = move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
 	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
 	spin_unlock_irq(&pgdat->lru_lock);
 
 	mem_cgroup_uncharge_list(&l_hold);
 	free_hot_cold_page_list(&l_hold, true);
+	trace_mm_vmscan_lru_shrink_active(pgdat->node_id, nr_taken, nr_activate,
+			nr_deactivate, nr_rotated, sc->priority, file);
 }
 
 /*
-- 
2.11.0

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2017-01-04 13:52     ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 13:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML

With fixed triggered by Vlastimil it should be like this.
---

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] mm, vmscan: extract shrink_page_list reclaim counters into a struct
  2017-01-04 10:19   ` Michal Hocko
@ 2017-01-04 14:51     ` Vlastimil Babka
  -1 siblings, 0 replies; 74+ messages in thread
From: Vlastimil Babka @ 2017-01-04 14:51 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Minchan Kim, Hillf Danton, linux-mm,
	LKML, Michal Hocko

On 01/04/2017 11:19 AM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> shrink_page_list returns quite some counters back to its caller. Extract
> the existing 5 into struct reclaim_stat because this makes the code
> easier to follow and also allows further counters to be returned.
> 
> While we are at it, make all of them unsigned rather than unsigned long
> as we do not really need full 64b for them (we never scan more than
> SWAP_CLUSTER_MAX pages at once). This should reduce some stack space.
> 
> This patch shouldn't introduce any functional change.

[...]

> @@ -1266,11 +1270,13 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>  	list_splice(&ret_pages, page_list);
>  	count_vm_events(PGACTIVATE, pgactivate);
>  
> -	*ret_nr_dirty += nr_dirty;
> -	*ret_nr_congested += nr_congested;
> -	*ret_nr_unqueued_dirty += nr_unqueued_dirty;
> -	*ret_nr_writeback += nr_writeback;
> -	*ret_nr_immediate += nr_immediate;
> +	if (stat) {
> +		stat->nr_dirty = nr_dirty;
> +		stat->nr_congested = nr_congested;
> +		stat->nr_unqueued_dirty = nr_unqueued_dirty;
> +		stat->nr_writeback = nr_writeback;
> +		stat->nr_immediate = nr_immediate;
> +	}

This change of '+=' to '=' raised my eybrows, but it seems both callers
don't care so this is indeed no functional change and potentially faster.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] mm, vmscan: extract shrink_page_list reclaim counters into a struct
@ 2017-01-04 14:51     ` Vlastimil Babka
  0 siblings, 0 replies; 74+ messages in thread
From: Vlastimil Babka @ 2017-01-04 14:51 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Minchan Kim, Hillf Danton, linux-mm,
	LKML, Michal Hocko

On 01/04/2017 11:19 AM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> shrink_page_list returns quite some counters back to its caller. Extract
> the existing 5 into struct reclaim_stat because this makes the code
> easier to follow and also allows further counters to be returned.
> 
> While we are at it, make all of them unsigned rather than unsigned long
> as we do not really need full 64b for them (we never scan more than
> SWAP_CLUSTER_MAX pages at once). This should reduce some stack space.
> 
> This patch shouldn't introduce any functional change.

[...]

> @@ -1266,11 +1270,13 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>  	list_splice(&ret_pages, page_list);
>  	count_vm_events(PGACTIVATE, pgactivate);
>  
> -	*ret_nr_dirty += nr_dirty;
> -	*ret_nr_congested += nr_congested;
> -	*ret_nr_unqueued_dirty += nr_unqueued_dirty;
> -	*ret_nr_writeback += nr_writeback;
> -	*ret_nr_immediate += nr_immediate;
> +	if (stat) {
> +		stat->nr_dirty = nr_dirty;
> +		stat->nr_congested = nr_congested;
> +		stat->nr_unqueued_dirty = nr_unqueued_dirty;
> +		stat->nr_writeback = nr_writeback;
> +		stat->nr_immediate = nr_immediate;
> +	}

This change of '+=' to '=' raised my eybrows, but it seems both callers
don't care so this is indeed no functional change and potentially faster.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] mm, vmscan: extract shrink_page_list reclaim counters into a struct
  2017-01-04 14:51     ` Vlastimil Babka
@ 2017-01-04 15:09       ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 15:09 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Mel Gorman, Johannes Weiner, Minchan Kim,
	Hillf Danton, linux-mm, LKML

On Wed 04-01-17 15:51:43, Vlastimil Babka wrote:
[...]
> > @@ -1266,11 +1270,13 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  	list_splice(&ret_pages, page_list);
> >  	count_vm_events(PGACTIVATE, pgactivate);
> >  
> > -	*ret_nr_dirty += nr_dirty;
> > -	*ret_nr_congested += nr_congested;
> > -	*ret_nr_unqueued_dirty += nr_unqueued_dirty;
> > -	*ret_nr_writeback += nr_writeback;
> > -	*ret_nr_immediate += nr_immediate;
> > +	if (stat) {
> > +		stat->nr_dirty = nr_dirty;
> > +		stat->nr_congested = nr_congested;
> > +		stat->nr_unqueued_dirty = nr_unqueued_dirty;
> > +		stat->nr_writeback = nr_writeback;
> > +		stat->nr_immediate = nr_immediate;
> > +	}
> 
> This change of '+=' to '=' raised my eybrows, but it seems both callers
> don't care so this is indeed no functional change and potentially faster.

Yes, I was quite surprised as well, maybe we had a code which relied on
the aggregated numbers in the past but I didn't bother to go over git
logs to check. There is no such user anymore...
 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] mm, vmscan: extract shrink_page_list reclaim counters into a struct
@ 2017-01-04 15:09       ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04 15:09 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Mel Gorman, Johannes Weiner, Minchan Kim,
	Hillf Danton, linux-mm, LKML

On Wed 04-01-17 15:51:43, Vlastimil Babka wrote:
[...]
> > @@ -1266,11 +1270,13 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  	list_splice(&ret_pages, page_list);
> >  	count_vm_events(PGACTIVATE, pgactivate);
> >  
> > -	*ret_nr_dirty += nr_dirty;
> > -	*ret_nr_congested += nr_congested;
> > -	*ret_nr_unqueued_dirty += nr_unqueued_dirty;
> > -	*ret_nr_writeback += nr_writeback;
> > -	*ret_nr_immediate += nr_immediate;
> > +	if (stat) {
> > +		stat->nr_dirty = nr_dirty;
> > +		stat->nr_congested = nr_congested;
> > +		stat->nr_unqueued_dirty = nr_unqueued_dirty;
> > +		stat->nr_writeback = nr_writeback;
> > +		stat->nr_immediate = nr_immediate;
> > +	}
> 
> This change of '+=' to '=' raised my eybrows, but it seems both callers
> don't care so this is indeed no functional change and potentially faster.

Yes, I was quite surprised as well, maybe we had a code which relied on
the aggregated numbers in the past but I didn't bother to go over git
logs to check. There is no such user anymore...
 

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2017-01-04 13:52     ` Michal Hocko
@ 2017-01-05  5:41       ` Minchan Kim
  -1 siblings, 0 replies; 74+ messages in thread
From: Minchan Kim @ 2017-01-05  5:41 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Mel Gorman, Johannes Weiner, Vlastimil Babka,
	Hillf Danton, linux-mm, LKML

On Wed, Jan 04, 2017 at 02:52:47PM +0100, Michal Hocko wrote:
> With fixed triggered by Vlastimil it should be like this.
> ---
> From b3a1480b54bf10924a9cd09c6d8b274fc81ca4ad Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Tue, 27 Dec 2016 13:18:20 +0100
> Subject: [PATCH] mm, vmscan: add active list aging tracepoint
> 
> Our reclaim process has several tracepoints to tell us more about how
> things are progressing. We are, however, missing a tracepoint to track
> active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> the number of
> 	- nr_taken is number of isolated pages from the active list
> 	- nr_referenced pages which tells us that we are hitting referenced
> 	  pages which are deactivated. If this is a large part of the
> 	  reported nr_deactivated pages then we might be hitting into
> 	  the active list too early because they might be still part of
> 	  the working set. This might help to debug performance issues.
> 	- nr_active pages which tells us how many pages are kept on the
> 	  active list - mostly exec file backed pages. A high number can
> 	  indicate that we might be trashing on executables.
> 
> Changes since v1
> - report nr_taken pages as per Minchan
> - report nr_activated as per Minchan
> - do not report nr_freed pages because that would add a tiny overhead to
>   free_hot_cold_page_list which is a hot path
> - do not report nr_unevictable because we can report this number via a
>   different and more generic tracepoint in putback_lru_page
> - fix move_active_pages_to_lru to report proper page count when we hit
>   into large pages
> - drop nr_scanned because this can be obtained from
>   trace_mm_vmscan_lru_isolate as per Minchan
> 
> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
> Acked-by: Mel Gorman <mgorman@suse.de>
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Minchan Kim <minchan@kernel.org>

Thanks.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2017-01-05  5:41       ` Minchan Kim
  0 siblings, 0 replies; 74+ messages in thread
From: Minchan Kim @ 2017-01-05  5:41 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Mel Gorman, Johannes Weiner, Vlastimil Babka,
	Hillf Danton, linux-mm, LKML

On Wed, Jan 04, 2017 at 02:52:47PM +0100, Michal Hocko wrote:
> With fixed triggered by Vlastimil it should be like this.
> ---
> From b3a1480b54bf10924a9cd09c6d8b274fc81ca4ad Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Tue, 27 Dec 2016 13:18:20 +0100
> Subject: [PATCH] mm, vmscan: add active list aging tracepoint
> 
> Our reclaim process has several tracepoints to tell us more about how
> things are progressing. We are, however, missing a tracepoint to track
> active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> the number of
> 	- nr_taken is number of isolated pages from the active list
> 	- nr_referenced pages which tells us that we are hitting referenced
> 	  pages which are deactivated. If this is a large part of the
> 	  reported nr_deactivated pages then we might be hitting into
> 	  the active list too early because they might be still part of
> 	  the working set. This might help to debug performance issues.
> 	- nr_active pages which tells us how many pages are kept on the
> 	  active list - mostly exec file backed pages. A high number can
> 	  indicate that we might be trashing on executables.
> 
> Changes since v1
> - report nr_taken pages as per Minchan
> - report nr_activated as per Minchan
> - do not report nr_freed pages because that would add a tiny overhead to
>   free_hot_cold_page_list which is a hot path
> - do not report nr_unevictable because we can report this number via a
>   different and more generic tracepoint in putback_lru_page
> - fix move_active_pages_to_lru to report proper page count when we hit
>   into large pages
> - drop nr_scanned because this can be obtained from
>   trace_mm_vmscan_lru_isolate as per Minchan
> 
> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
> Acked-by: Mel Gorman <mgorman@suse.de>
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Minchan Kim <minchan@kernel.org>

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 4/7] mm, vmscan: show LRU name in mm_vmscan_lru_isolate tracepoint
  2017-01-04 10:19   ` Michal Hocko
@ 2017-01-05  6:04     ` Minchan Kim
  -1 siblings, 0 replies; 74+ messages in thread
From: Minchan Kim @ 2017-01-05  6:04 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Mel Gorman, Johannes Weiner, Vlastimil Babka,
	Hillf Danton, linux-mm, LKML, Michal Hocko

On Wed, Jan 04, 2017 at 11:19:39AM +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> mm_vmscan_lru_isolate currently prints only whether the LRU we isolate
> from is file or anonymous but we do not know which LRU this is.
> 
> It is useful to know whether the list is active or inactive, since we
> are using the same function to isolate pages from both of them and it's
> hard to distinguish otherwise.
> 
> Chaneges since v1
> - drop LRU_ prefix from names and use lowercase as per Vlastimil
> - move and convert show_lru_name to mmflags.h EM magic as per Vlastimil
> 
> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
> Acked-by: Mel Gorman <mgorman@suse.de>
> Signed-off-by: Michal Hocko <mhocko@suse.com>

> ---
>  include/trace/events/mmflags.h |  8 ++++++++
>  include/trace/events/vmscan.h  | 12 ++++++------
>  mm/vmscan.c                    |  2 +-
>  3 files changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
> index aa4caa6914a9..6172afa2fd82 100644
> --- a/include/trace/events/mmflags.h
> +++ b/include/trace/events/mmflags.h
> @@ -240,6 +240,13 @@ IF_HAVE_VM_SOFTDIRTY(VM_SOFTDIRTY,	"softdirty"	)		\
>  	IFDEF_ZONE_HIGHMEM(	EM (ZONE_HIGHMEM,"HighMem"))	\
>  				EMe(ZONE_MOVABLE,"Movable")
>  
> +#define LRU_NAMES		\
> +		EM (LRU_INACTIVE_ANON, "inactive_anon") \
> +		EM (LRU_ACTIVE_ANON, "active_anon") \
> +		EM (LRU_INACTIVE_FILE, "inactive_file") \
> +		EM (LRU_ACTIVE_FILE, "active_file") \
> +		EMe(LRU_UNEVICTABLE, "unevictable")
> +
>  /*
>   * First define the enums in the above macros to be exported to userspace
>   * via TRACE_DEFINE_ENUM().
> @@ -253,6 +260,7 @@ COMPACTION_STATUS
>  COMPACTION_PRIORITY
>  COMPACTION_FEEDBACK
>  ZONE_TYPE
> +LRU_NAMES
>  
>  /*
>   * Now redefine the EM() and EMe() macros to map the enums to the strings
> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> index 36c999f806bf..7ec59e0432c4 100644
> --- a/include/trace/events/vmscan.h
> +++ b/include/trace/events/vmscan.h
> @@ -277,9 +277,9 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
>  		unsigned long nr_skipped,
>  		unsigned long nr_taken,
>  		isolate_mode_t isolate_mode,
> -		int file),
> +		int lru),

It may break trace-vmscan-postprocess.pl. Other than that,

Acked-by: Minchan Kim <minchan@kernel.org>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 4/7] mm, vmscan: show LRU name in mm_vmscan_lru_isolate tracepoint
@ 2017-01-05  6:04     ` Minchan Kim
  0 siblings, 0 replies; 74+ messages in thread
From: Minchan Kim @ 2017-01-05  6:04 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Mel Gorman, Johannes Weiner, Vlastimil Babka,
	Hillf Danton, linux-mm, LKML, Michal Hocko

On Wed, Jan 04, 2017 at 11:19:39AM +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> mm_vmscan_lru_isolate currently prints only whether the LRU we isolate
> from is file or anonymous but we do not know which LRU this is.
> 
> It is useful to know whether the list is active or inactive, since we
> are using the same function to isolate pages from both of them and it's
> hard to distinguish otherwise.
> 
> Chaneges since v1
> - drop LRU_ prefix from names and use lowercase as per Vlastimil
> - move and convert show_lru_name to mmflags.h EM magic as per Vlastimil
> 
> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
> Acked-by: Mel Gorman <mgorman@suse.de>
> Signed-off-by: Michal Hocko <mhocko@suse.com>

> ---
>  include/trace/events/mmflags.h |  8 ++++++++
>  include/trace/events/vmscan.h  | 12 ++++++------
>  mm/vmscan.c                    |  2 +-
>  3 files changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
> index aa4caa6914a9..6172afa2fd82 100644
> --- a/include/trace/events/mmflags.h
> +++ b/include/trace/events/mmflags.h
> @@ -240,6 +240,13 @@ IF_HAVE_VM_SOFTDIRTY(VM_SOFTDIRTY,	"softdirty"	)		\
>  	IFDEF_ZONE_HIGHMEM(	EM (ZONE_HIGHMEM,"HighMem"))	\
>  				EMe(ZONE_MOVABLE,"Movable")
>  
> +#define LRU_NAMES		\
> +		EM (LRU_INACTIVE_ANON, "inactive_anon") \
> +		EM (LRU_ACTIVE_ANON, "active_anon") \
> +		EM (LRU_INACTIVE_FILE, "inactive_file") \
> +		EM (LRU_ACTIVE_FILE, "active_file") \
> +		EMe(LRU_UNEVICTABLE, "unevictable")
> +
>  /*
>   * First define the enums in the above macros to be exported to userspace
>   * via TRACE_DEFINE_ENUM().
> @@ -253,6 +260,7 @@ COMPACTION_STATUS
>  COMPACTION_PRIORITY
>  COMPACTION_FEEDBACK
>  ZONE_TYPE
> +LRU_NAMES
>  
>  /*
>   * Now redefine the EM() and EMe() macros to map the enums to the strings
> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> index 36c999f806bf..7ec59e0432c4 100644
> --- a/include/trace/events/vmscan.h
> +++ b/include/trace/events/vmscan.h
> @@ -277,9 +277,9 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
>  		unsigned long nr_skipped,
>  		unsigned long nr_taken,
>  		isolate_mode_t isolate_mode,
> -		int file),
> +		int lru),

It may break trace-vmscan-postprocess.pl. Other than that,

Acked-by: Minchan Kim <minchan@kernel.org>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 0/7 v2] vm, vmscan: enahance vmscan tracepoints
  2017-01-04 10:19 ` Michal Hocko
@ 2017-01-05  8:25   ` Vlastimil Babka
  -1 siblings, 0 replies; 74+ messages in thread
From: Vlastimil Babka @ 2017-01-05  8:25 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Minchan Kim, Hillf Danton, linux-mm, LKML

On 01/04/2017 11:19 AM, Michal Hocko wrote:
> Hi,
> this is the second version of the patchset [1]. I hope I've addressed all
> the review feedback.
> 
> While debugging [2] I've realized that there is some room for
> improvements in the tracepoints set we offer currently. I had hard times
> to make any conclusion from the existing ones. The resulting problem
> turned out to be active list aging [3] and we are missing at least two
> tracepoints to debug such a problem.
> 
> Some existing tracepoints could export more information to see _why_ the
> reclaim progress cannot be made not only _how much_ we could reclaim.
> The later could be seen quite reasonably from the vmstat counters
> already. It can be argued that we are showing too many implementation
> details in those tracepoints but I consider them way too lowlevel
> already to be usable by any kernel independent userspace. I would be
> _really_ surprised if anything but debugging tools have used them.
> 
> Any feedback is highly appreciated.

When patch-specific feedback is addressed, then for the whole series:

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> [1] http://lkml.kernel.org/r/20161228153032.10821-1-mhocko@kernel.org
> [2] http://lkml.kernel.org/r/20161215225702.GA27944@boerne.fritz.box
> [3] http://lkml.kernel.org/r/20161223105157.GB23109@dhcp22.suse.cz
> 
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 0/7 v2] vm, vmscan: enahance vmscan tracepoints
@ 2017-01-05  8:25   ` Vlastimil Babka
  0 siblings, 0 replies; 74+ messages in thread
From: Vlastimil Babka @ 2017-01-05  8:25 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Minchan Kim, Hillf Danton, linux-mm, LKML

On 01/04/2017 11:19 AM, Michal Hocko wrote:
> Hi,
> this is the second version of the patchset [1]. I hope I've addressed all
> the review feedback.
> 
> While debugging [2] I've realized that there is some room for
> improvements in the tracepoints set we offer currently. I had hard times
> to make any conclusion from the existing ones. The resulting problem
> turned out to be active list aging [3] and we are missing at least two
> tracepoints to debug such a problem.
> 
> Some existing tracepoints could export more information to see _why_ the
> reclaim progress cannot be made not only _how much_ we could reclaim.
> The later could be seen quite reasonably from the vmstat counters
> already. It can be argued that we are showing too many implementation
> details in those tracepoints but I consider them way too lowlevel
> already to be usable by any kernel independent userspace. I would be
> _really_ surprised if anything but debugging tools have used them.
> 
> Any feedback is highly appreciated.

When patch-specific feedback is addressed, then for the whole series:

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> [1] http://lkml.kernel.org/r/20161228153032.10821-1-mhocko@kernel.org
> [2] http://lkml.kernel.org/r/20161215225702.GA27944@boerne.fritz.box
> [3] http://lkml.kernel.org/r/20161223105157.GB23109@dhcp22.suse.cz
> 
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 4/7] mm, vmscan: show LRU name in mm_vmscan_lru_isolate tracepoint
  2017-01-05  6:04     ` Minchan Kim
@ 2017-01-05 10:16       ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-05 10:16 UTC (permalink / raw)
  To: Minchan Kim, Mel Gorman
  Cc: Andrew Morton, Johannes Weiner, Vlastimil Babka, Hillf Danton,
	linux-mm, LKML

On Thu 05-01-17 15:04:58, Minchan Kim wrote:
> On Wed, Jan 04, 2017 at 11:19:39AM +0100, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > mm_vmscan_lru_isolate currently prints only whether the LRU we isolate
> > from is file or anonymous but we do not know which LRU this is.
> > 
> > It is useful to know whether the list is active or inactive, since we
> > are using the same function to isolate pages from both of them and it's
> > hard to distinguish otherwise.
> > 
> > Chaneges since v1
> > - drop LRU_ prefix from names and use lowercase as per Vlastimil
> > - move and convert show_lru_name to mmflags.h EM magic as per Vlastimil
> > 
> > Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
> > Acked-by: Mel Gorman <mgorman@suse.de>
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> 
> > ---
> >  include/trace/events/mmflags.h |  8 ++++++++
> >  include/trace/events/vmscan.h  | 12 ++++++------
> >  mm/vmscan.c                    |  2 +-
> >  3 files changed, 15 insertions(+), 7 deletions(-)
> > 
> > diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
> > index aa4caa6914a9..6172afa2fd82 100644
> > --- a/include/trace/events/mmflags.h
> > +++ b/include/trace/events/mmflags.h
> > @@ -240,6 +240,13 @@ IF_HAVE_VM_SOFTDIRTY(VM_SOFTDIRTY,	"softdirty"	)		\
> >  	IFDEF_ZONE_HIGHMEM(	EM (ZONE_HIGHMEM,"HighMem"))	\
> >  				EMe(ZONE_MOVABLE,"Movable")
> >  
> > +#define LRU_NAMES		\
> > +		EM (LRU_INACTIVE_ANON, "inactive_anon") \
> > +		EM (LRU_ACTIVE_ANON, "active_anon") \
> > +		EM (LRU_INACTIVE_FILE, "inactive_file") \
> > +		EM (LRU_ACTIVE_FILE, "active_file") \
> > +		EMe(LRU_UNEVICTABLE, "unevictable")
> > +
> >  /*
> >   * First define the enums in the above macros to be exported to userspace
> >   * via TRACE_DEFINE_ENUM().
> > @@ -253,6 +260,7 @@ COMPACTION_STATUS
> >  COMPACTION_PRIORITY
> >  COMPACTION_FEEDBACK
> >  ZONE_TYPE
> > +LRU_NAMES
> >  
> >  /*
> >   * Now redefine the EM() and EMe() macros to map the enums to the strings
> > diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> > index 36c999f806bf..7ec59e0432c4 100644
> > --- a/include/trace/events/vmscan.h
> > +++ b/include/trace/events/vmscan.h
> > @@ -277,9 +277,9 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
> >  		unsigned long nr_skipped,
> >  		unsigned long nr_taken,
> >  		isolate_mode_t isolate_mode,
> > -		int file),
> > +		int lru),
> 
> It may break trace-vmscan-postprocess.pl. Other than that,

I wasn't aware of the script. And you are right it will break it. The
following should fix it. Btw. shrink_inactive_list tracepoint changes
will to be synced as well. I do not speak perl much but the following
should just work (untested yet).
---
diff --git a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
index 8f961ef2b457..ba976805853a 100644
--- a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
+++ b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
@@ -112,8 +112,8 @@ my $regex_direct_end_default = 'nr_reclaimed=([0-9]*)';
 my $regex_kswapd_wake_default = 'nid=([0-9]*) order=([0-9]*)';
 my $regex_kswapd_sleep_default = 'nid=([0-9]*)';
 my $regex_wakeup_kswapd_default = 'nid=([0-9]*) zid=([0-9]*) order=([0-9]*)';
-my $regex_lru_isolate_default = 'isolate_mode=([0-9]*) order=([0-9]*) nr_requested=([0-9]*) nr_scanned=([0-9]*) nr_taken=([0-9]*) file=([0-9]*)';
-my $regex_lru_shrink_inactive_default = 'nid=([0-9]*) zid=([0-9]*) nr_scanned=([0-9]*) nr_reclaimed=([0-9]*) priority=([0-9]*) flags=([A-Z_|]*)';
+my $regex_lru_isolate_default = 'isolate_mode=([0-9]*) classzone_idx=([0-9]*) order=([0-9]*) nr_requested=([0-9]*) nr_scanned=([0-9]*) nr_skipped=([0-9]*) nr_taken=([0-9]*) lru=([a-z_]*)';
+my $regex_lru_shrink_inactive_default = 'nid=([0-9]*) nr_scanned=([0-9]*) nr_reclaimed=([0-9]*) nr_dirty=([0-9]*) nr_writeback=([0-9]*) nr_congested=([0-9]*) nr_immediate=([0-9]*) nr_activate=([0-9]*) nr_ref_keep=([0-9]*) nr_unmap_fail=([0-9]*) priority=([0-9]*) flags=([A-Z_|]*)';
 my $regex_lru_shrink_active_default = 'lru=([A-Z_]*) nr_scanned=([0-9]*) nr_rotated=([0-9]*) priority=([0-9]*)';
 my $regex_writepage_default = 'page=([0-9a-f]*) pfn=([0-9]*) flags=([A-Z_|]*)';
 
@@ -205,15 +205,15 @@ $regex_wakeup_kswapd = generate_traceevent_regex(
 $regex_lru_isolate = generate_traceevent_regex(
 			"vmscan/mm_vmscan_lru_isolate",
 			$regex_lru_isolate_default,
-			"isolate_mode", "order",
-			"nr_requested", "nr_scanned", "nr_taken",
-			"file");
+			"isolate_mode", "classzone_idx", "order",
+			"nr_requested", "nr_scanned", "nr_skipped", "nr_taken",
+			"lru");
 $regex_lru_shrink_inactive = generate_traceevent_regex(
 			"vmscan/mm_vmscan_lru_shrink_inactive",
 			$regex_lru_shrink_inactive_default,
-			"nid", "zid",
-			"nr_scanned", "nr_reclaimed", "priority",
-			"flags");
+			"nid", "nr_scanned", "nr_reclaimed", "nr_dirty", "nr_writeback",
+			"nr_congested", "nr_immediate", "nr_activate", "nr_ref_keep",
+			"nr_unmap_fail", "priority", "flags");
 $regex_lru_shrink_active = generate_traceevent_regex(
 			"vmscan/mm_vmscan_lru_shrink_active",
 			$regex_lru_shrink_active_default,
@@ -381,8 +381,8 @@ sub process_events {
 				next;
 			}
 			my $isolate_mode = $1;
-			my $nr_scanned = $4;
-			my $file = $6;
+			my $nr_scanned = $5;
+			my $file = $8;
 
 			# To closer match vmstat scanning statistics, only count isolate_both
 			# and isolate_inactive as scanning. isolate_active is rotation
@@ -391,7 +391,7 @@ sub process_events {
 			# isolate_both     == 3
 			if ($isolate_mode != 2) {
 				$perprocesspid{$process_pid}->{HIGH_NR_SCANNED} += $nr_scanned;
-				if ($file == 1) {
+				if ($file =~ /_file/) {
 					$perprocesspid{$process_pid}->{HIGH_NR_FILE_SCANNED} += $nr_scanned;
 				} else {
 					$perprocesspid{$process_pid}->{HIGH_NR_ANON_SCANNED} += $nr_scanned;
@@ -406,8 +406,8 @@ sub process_events {
 				next;
 			}
 
-			my $nr_reclaimed = $4;
-			my $flags = $6;
+			my $nr_reclaimed = $3;
+			my $flags = $12;
 			my $file = 0;
 			if ($flags =~ /RECLAIM_WB_FILE/) {
 				$file = 1;
 
> Acked-by: Minchan Kim <minchan@kernel.org>

Thanks
 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 4/7] mm, vmscan: show LRU name in mm_vmscan_lru_isolate tracepoint
@ 2017-01-05 10:16       ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-05 10:16 UTC (permalink / raw)
  To: Minchan Kim, Mel Gorman
  Cc: Andrew Morton, Johannes Weiner, Vlastimil Babka, Hillf Danton,
	linux-mm, LKML

On Thu 05-01-17 15:04:58, Minchan Kim wrote:
> On Wed, Jan 04, 2017 at 11:19:39AM +0100, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > mm_vmscan_lru_isolate currently prints only whether the LRU we isolate
> > from is file or anonymous but we do not know which LRU this is.
> > 
> > It is useful to know whether the list is active or inactive, since we
> > are using the same function to isolate pages from both of them and it's
> > hard to distinguish otherwise.
> > 
> > Chaneges since v1
> > - drop LRU_ prefix from names and use lowercase as per Vlastimil
> > - move and convert show_lru_name to mmflags.h EM magic as per Vlastimil
> > 
> > Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
> > Acked-by: Mel Gorman <mgorman@suse.de>
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> 
> > ---
> >  include/trace/events/mmflags.h |  8 ++++++++
> >  include/trace/events/vmscan.h  | 12 ++++++------
> >  mm/vmscan.c                    |  2 +-
> >  3 files changed, 15 insertions(+), 7 deletions(-)
> > 
> > diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
> > index aa4caa6914a9..6172afa2fd82 100644
> > --- a/include/trace/events/mmflags.h
> > +++ b/include/trace/events/mmflags.h
> > @@ -240,6 +240,13 @@ IF_HAVE_VM_SOFTDIRTY(VM_SOFTDIRTY,	"softdirty"	)		\
> >  	IFDEF_ZONE_HIGHMEM(	EM (ZONE_HIGHMEM,"HighMem"))	\
> >  				EMe(ZONE_MOVABLE,"Movable")
> >  
> > +#define LRU_NAMES		\
> > +		EM (LRU_INACTIVE_ANON, "inactive_anon") \
> > +		EM (LRU_ACTIVE_ANON, "active_anon") \
> > +		EM (LRU_INACTIVE_FILE, "inactive_file") \
> > +		EM (LRU_ACTIVE_FILE, "active_file") \
> > +		EMe(LRU_UNEVICTABLE, "unevictable")
> > +
> >  /*
> >   * First define the enums in the above macros to be exported to userspace
> >   * via TRACE_DEFINE_ENUM().
> > @@ -253,6 +260,7 @@ COMPACTION_STATUS
> >  COMPACTION_PRIORITY
> >  COMPACTION_FEEDBACK
> >  ZONE_TYPE
> > +LRU_NAMES
> >  
> >  /*
> >   * Now redefine the EM() and EMe() macros to map the enums to the strings
> > diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> > index 36c999f806bf..7ec59e0432c4 100644
> > --- a/include/trace/events/vmscan.h
> > +++ b/include/trace/events/vmscan.h
> > @@ -277,9 +277,9 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
> >  		unsigned long nr_skipped,
> >  		unsigned long nr_taken,
> >  		isolate_mode_t isolate_mode,
> > -		int file),
> > +		int lru),
> 
> It may break trace-vmscan-postprocess.pl. Other than that,

I wasn't aware of the script. And you are right it will break it. The
following should fix it. Btw. shrink_inactive_list tracepoint changes
will to be synced as well. I do not speak perl much but the following
should just work (untested yet).
---
diff --git a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
index 8f961ef2b457..ba976805853a 100644
--- a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
+++ b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
@@ -112,8 +112,8 @@ my $regex_direct_end_default = 'nr_reclaimed=([0-9]*)';
 my $regex_kswapd_wake_default = 'nid=([0-9]*) order=([0-9]*)';
 my $regex_kswapd_sleep_default = 'nid=([0-9]*)';
 my $regex_wakeup_kswapd_default = 'nid=([0-9]*) zid=([0-9]*) order=([0-9]*)';
-my $regex_lru_isolate_default = 'isolate_mode=([0-9]*) order=([0-9]*) nr_requested=([0-9]*) nr_scanned=([0-9]*) nr_taken=([0-9]*) file=([0-9]*)';
-my $regex_lru_shrink_inactive_default = 'nid=([0-9]*) zid=([0-9]*) nr_scanned=([0-9]*) nr_reclaimed=([0-9]*) priority=([0-9]*) flags=([A-Z_|]*)';
+my $regex_lru_isolate_default = 'isolate_mode=([0-9]*) classzone_idx=([0-9]*) order=([0-9]*) nr_requested=([0-9]*) nr_scanned=([0-9]*) nr_skipped=([0-9]*) nr_taken=([0-9]*) lru=([a-z_]*)';
+my $regex_lru_shrink_inactive_default = 'nid=([0-9]*) nr_scanned=([0-9]*) nr_reclaimed=([0-9]*) nr_dirty=([0-9]*) nr_writeback=([0-9]*) nr_congested=([0-9]*) nr_immediate=([0-9]*) nr_activate=([0-9]*) nr_ref_keep=([0-9]*) nr_unmap_fail=([0-9]*) priority=([0-9]*) flags=([A-Z_|]*)';
 my $regex_lru_shrink_active_default = 'lru=([A-Z_]*) nr_scanned=([0-9]*) nr_rotated=([0-9]*) priority=([0-9]*)';
 my $regex_writepage_default = 'page=([0-9a-f]*) pfn=([0-9]*) flags=([A-Z_|]*)';
 
@@ -205,15 +205,15 @@ $regex_wakeup_kswapd = generate_traceevent_regex(
 $regex_lru_isolate = generate_traceevent_regex(
 			"vmscan/mm_vmscan_lru_isolate",
 			$regex_lru_isolate_default,
-			"isolate_mode", "order",
-			"nr_requested", "nr_scanned", "nr_taken",
-			"file");
+			"isolate_mode", "classzone_idx", "order",
+			"nr_requested", "nr_scanned", "nr_skipped", "nr_taken",
+			"lru");
 $regex_lru_shrink_inactive = generate_traceevent_regex(
 			"vmscan/mm_vmscan_lru_shrink_inactive",
 			$regex_lru_shrink_inactive_default,
-			"nid", "zid",
-			"nr_scanned", "nr_reclaimed", "priority",
-			"flags");
+			"nid", "nr_scanned", "nr_reclaimed", "nr_dirty", "nr_writeback",
+			"nr_congested", "nr_immediate", "nr_activate", "nr_ref_keep",
+			"nr_unmap_fail", "priority", "flags");
 $regex_lru_shrink_active = generate_traceevent_regex(
 			"vmscan/mm_vmscan_lru_shrink_active",
 			$regex_lru_shrink_active_default,
@@ -381,8 +381,8 @@ sub process_events {
 				next;
 			}
 			my $isolate_mode = $1;
-			my $nr_scanned = $4;
-			my $file = $6;
+			my $nr_scanned = $5;
+			my $file = $8;
 
 			# To closer match vmstat scanning statistics, only count isolate_both
 			# and isolate_inactive as scanning. isolate_active is rotation
@@ -391,7 +391,7 @@ sub process_events {
 			# isolate_both     == 3
 			if ($isolate_mode != 2) {
 				$perprocesspid{$process_pid}->{HIGH_NR_SCANNED} += $nr_scanned;
-				if ($file == 1) {
+				if ($file =~ /_file/) {
 					$perprocesspid{$process_pid}->{HIGH_NR_FILE_SCANNED} += $nr_scanned;
 				} else {
 					$perprocesspid{$process_pid}->{HIGH_NR_ANON_SCANNED} += $nr_scanned;
@@ -406,8 +406,8 @@ sub process_events {
 				next;
 			}
 
-			my $nr_reclaimed = $4;
-			my $flags = $6;
+			my $nr_reclaimed = $3;
+			my $flags = $12;
 			my $file = 0;
 			if ($flags =~ /RECLAIM_WB_FILE/) {
 				$file = 1;
 
> Acked-by: Minchan Kim <minchan@kernel.org>

Thanks
 

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 0/7 v2] vm, vmscan: enahance vmscan tracepoints
  2017-01-04 10:19 ` Michal Hocko
@ 2017-01-05 10:39   ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-05 10:39 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML

Andrew,
it seems that all the patches have been acked. One of the patches has
been refreshed and send as a reply-to original one. One script in the
Documentation directory needs to be updated but I guess this is low
priority.

Should I resubmit what I have with or you are going to pick it up from
here?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 0/7 v2] vm, vmscan: enahance vmscan tracepoints
@ 2017-01-05 10:39   ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-05 10:39 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Minchan Kim,
	Hillf Danton, linux-mm, LKML

Andrew,
it seems that all the patches have been acked. One of the patches has
been refreshed and send as a reply-to original one. One script in the
Documentation directory needs to be updated but I guess this is low
priority.

Should I resubmit what I have with or you are going to pick it up from
here?
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 4/7] mm, vmscan: show LRU name in mm_vmscan_lru_isolate tracepoint
  2017-01-05 10:16       ` Michal Hocko
@ 2017-01-05 14:56         ` Mel Gorman
  -1 siblings, 0 replies; 74+ messages in thread
From: Mel Gorman @ 2017-01-05 14:56 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Minchan Kim, Andrew Morton, Johannes Weiner, Vlastimil Babka,
	Hillf Danton, linux-mm, LKML

On Thu, Jan 05, 2017 at 11:16:13AM +0100, Michal Hocko wrote:
> > > diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> > > index 36c999f806bf..7ec59e0432c4 100644
> > > --- a/include/trace/events/vmscan.h
> > > +++ b/include/trace/events/vmscan.h
> > > @@ -277,9 +277,9 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
> > >  		unsigned long nr_skipped,
> > >  		unsigned long nr_taken,
> > >  		isolate_mode_t isolate_mode,
> > > -		int file),
> > > +		int lru),
> > 
> > It may break trace-vmscan-postprocess.pl. Other than that,
> 
> I wasn't aware of the script. And you are right it will break it. The
> following should fix it. Btw. shrink_inactive_list tracepoint changes
> will to be synced as well. I do not speak perl much but the following
> should just work (untested yet).

It's also optional to remove them. When those were first merged, it was
done to illustrate how multiple tracepoints can be used to aggregate
tracepoint information. They are better ways of gathering the same class
of information. They are of historical interest but not as fully supported
scripts that can never break.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 4/7] mm, vmscan: show LRU name in mm_vmscan_lru_isolate tracepoint
@ 2017-01-05 14:56         ` Mel Gorman
  0 siblings, 0 replies; 74+ messages in thread
From: Mel Gorman @ 2017-01-05 14:56 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Minchan Kim, Andrew Morton, Johannes Weiner, Vlastimil Babka,
	Hillf Danton, linux-mm, LKML

On Thu, Jan 05, 2017 at 11:16:13AM +0100, Michal Hocko wrote:
> > > diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> > > index 36c999f806bf..7ec59e0432c4 100644
> > > --- a/include/trace/events/vmscan.h
> > > +++ b/include/trace/events/vmscan.h
> > > @@ -277,9 +277,9 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
> > >  		unsigned long nr_skipped,
> > >  		unsigned long nr_taken,
> > >  		isolate_mode_t isolate_mode,
> > > -		int file),
> > > +		int lru),
> > 
> > It may break trace-vmscan-postprocess.pl. Other than that,
> 
> I wasn't aware of the script. And you are right it will break it. The
> following should fix it. Btw. shrink_inactive_list tracepoint changes
> will to be synced as well. I do not speak perl much but the following
> should just work (untested yet).

It's also optional to remove them. When those were first merged, it was
done to illustrate how multiple tracepoints can be used to aggregate
tracepoint information. They are better ways of gathering the same class
of information. They are of historical interest but not as fully supported
scripts that can never break.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 4/7] mm, vmscan: show LRU name in mm_vmscan_lru_isolate tracepoint
  2017-01-05 14:56         ` Mel Gorman
@ 2017-01-05 15:17           ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-05 15:17 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Minchan Kim, Andrew Morton, Johannes Weiner, Vlastimil Babka,
	Hillf Danton, linux-mm, LKML

On Thu 05-01-17 14:56:23, Mel Gorman wrote:
> On Thu, Jan 05, 2017 at 11:16:13AM +0100, Michal Hocko wrote:
> > > > diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> > > > index 36c999f806bf..7ec59e0432c4 100644
> > > > --- a/include/trace/events/vmscan.h
> > > > +++ b/include/trace/events/vmscan.h
> > > > @@ -277,9 +277,9 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
> > > >  		unsigned long nr_skipped,
> > > >  		unsigned long nr_taken,
> > > >  		isolate_mode_t isolate_mode,
> > > > -		int file),
> > > > +		int lru),
> > > 
> > > It may break trace-vmscan-postprocess.pl. Other than that,
> > 
> > I wasn't aware of the script. And you are right it will break it. The
> > following should fix it. Btw. shrink_inactive_list tracepoint changes
> > will to be synced as well. I do not speak perl much but the following
> > should just work (untested yet).
> 
> It's also optional to remove them. When those were first merged, it was
> done to illustrate how multiple tracepoints can be used to aggregate
> tracepoint information. They are better ways of gathering the same class
> of information. They are of historical interest but not as fully supported
> scripts that can never break.

Yeah, that was my understanding and why I didn't consider it a priority.
But it seemed like an easy thing to fix even with my anti-perl mindset.
Here is the full patch (untested)
---
>From 1b843a180a1436873aab1fe3819dfc7dbf393870 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Thu, 5 Jan 2017 11:34:03 +0100
Subject: [PATCH] trace-vmscan-postprocess: sync with tracepoints updates

Both mm_vmscan_lru_shrink_active and mm_vmscan_lru_isolate have changed
so the script needs to be update to reflect those changes

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 .../trace/postprocess/trace-vmscan-postprocess.pl  | 26 +++++++++++-----------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
index 8f961ef2b457..ba976805853a 100644
--- a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
+++ b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
@@ -112,8 +112,8 @@ my $regex_direct_end_default = 'nr_reclaimed=([0-9]*)';
 my $regex_kswapd_wake_default = 'nid=([0-9]*) order=([0-9]*)';
 my $regex_kswapd_sleep_default = 'nid=([0-9]*)';
 my $regex_wakeup_kswapd_default = 'nid=([0-9]*) zid=([0-9]*) order=([0-9]*)';
-my $regex_lru_isolate_default = 'isolate_mode=([0-9]*) order=([0-9]*) nr_requested=([0-9]*) nr_scanned=([0-9]*) nr_taken=([0-9]*) file=([0-9]*)';
-my $regex_lru_shrink_inactive_default = 'nid=([0-9]*) zid=([0-9]*) nr_scanned=([0-9]*) nr_reclaimed=([0-9]*) priority=([0-9]*) flags=([A-Z_|]*)';
+my $regex_lru_isolate_default = 'isolate_mode=([0-9]*) classzone_idx=([0-9]*) order=([0-9]*) nr_requested=([0-9]*) nr_scanned=([0-9]*) nr_skipped=([0-9]*) nr_taken=([0-9]*) lru=([a-z_]*)';
+my $regex_lru_shrink_inactive_default = 'nid=([0-9]*) nr_scanned=([0-9]*) nr_reclaimed=([0-9]*) nr_dirty=([0-9]*) nr_writeback=([0-9]*) nr_congested=([0-9]*) nr_immediate=([0-9]*) nr_activate=([0-9]*) nr_ref_keep=([0-9]*) nr_unmap_fail=([0-9]*) priority=([0-9]*) flags=([A-Z_|]*)';
 my $regex_lru_shrink_active_default = 'lru=([A-Z_]*) nr_scanned=([0-9]*) nr_rotated=([0-9]*) priority=([0-9]*)';
 my $regex_writepage_default = 'page=([0-9a-f]*) pfn=([0-9]*) flags=([A-Z_|]*)';
 
@@ -205,15 +205,15 @@ $regex_wakeup_kswapd = generate_traceevent_regex(
 $regex_lru_isolate = generate_traceevent_regex(
 			"vmscan/mm_vmscan_lru_isolate",
 			$regex_lru_isolate_default,
-			"isolate_mode", "order",
-			"nr_requested", "nr_scanned", "nr_taken",
-			"file");
+			"isolate_mode", "classzone_idx", "order",
+			"nr_requested", "nr_scanned", "nr_skipped", "nr_taken",
+			"lru");
 $regex_lru_shrink_inactive = generate_traceevent_regex(
 			"vmscan/mm_vmscan_lru_shrink_inactive",
 			$regex_lru_shrink_inactive_default,
-			"nid", "zid",
-			"nr_scanned", "nr_reclaimed", "priority",
-			"flags");
+			"nid", "nr_scanned", "nr_reclaimed", "nr_dirty", "nr_writeback",
+			"nr_congested", "nr_immediate", "nr_activate", "nr_ref_keep",
+			"nr_unmap_fail", "priority", "flags");
 $regex_lru_shrink_active = generate_traceevent_regex(
 			"vmscan/mm_vmscan_lru_shrink_active",
 			$regex_lru_shrink_active_default,
@@ -381,8 +381,8 @@ sub process_events {
 				next;
 			}
 			my $isolate_mode = $1;
-			my $nr_scanned = $4;
-			my $file = $6;
+			my $nr_scanned = $5;
+			my $file = $8;
 
 			# To closer match vmstat scanning statistics, only count isolate_both
 			# and isolate_inactive as scanning. isolate_active is rotation
@@ -391,7 +391,7 @@ sub process_events {
 			# isolate_both     == 3
 			if ($isolate_mode != 2) {
 				$perprocesspid{$process_pid}->{HIGH_NR_SCANNED} += $nr_scanned;
-				if ($file == 1) {
+				if ($file =~ /_file/) {
 					$perprocesspid{$process_pid}->{HIGH_NR_FILE_SCANNED} += $nr_scanned;
 				} else {
 					$perprocesspid{$process_pid}->{HIGH_NR_ANON_SCANNED} += $nr_scanned;
@@ -406,8 +406,8 @@ sub process_events {
 				next;
 			}
 
-			my $nr_reclaimed = $4;
-			my $flags = $6;
+			my $nr_reclaimed = $3;
+			my $flags = $12;
 			my $file = 0;
 			if ($flags =~ /RECLAIM_WB_FILE/) {
 				$file = 1;
-- 
2.11.0

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 4/7] mm, vmscan: show LRU name in mm_vmscan_lru_isolate tracepoint
@ 2017-01-05 15:17           ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-05 15:17 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Minchan Kim, Andrew Morton, Johannes Weiner, Vlastimil Babka,
	Hillf Danton, linux-mm, LKML

On Thu 05-01-17 14:56:23, Mel Gorman wrote:
> On Thu, Jan 05, 2017 at 11:16:13AM +0100, Michal Hocko wrote:
> > > > diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> > > > index 36c999f806bf..7ec59e0432c4 100644
> > > > --- a/include/trace/events/vmscan.h
> > > > +++ b/include/trace/events/vmscan.h
> > > > @@ -277,9 +277,9 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
> > > >  		unsigned long nr_skipped,
> > > >  		unsigned long nr_taken,
> > > >  		isolate_mode_t isolate_mode,
> > > > -		int file),
> > > > +		int lru),
> > > 
> > > It may break trace-vmscan-postprocess.pl. Other than that,
> > 
> > I wasn't aware of the script. And you are right it will break it. The
> > following should fix it. Btw. shrink_inactive_list tracepoint changes
> > will to be synced as well. I do not speak perl much but the following
> > should just work (untested yet).
> 
> It's also optional to remove them. When those were first merged, it was
> done to illustrate how multiple tracepoints can be used to aggregate
> tracepoint information. They are better ways of gathering the same class
> of information. They are of historical interest but not as fully supported
> scripts that can never break.

Yeah, that was my understanding and why I didn't consider it a priority.
But it seemed like an easy thing to fix even with my anti-perl mindset.
Here is the full patch (untested)
---

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2017-01-04  5:07                     ` Minchan Kim
@ 2017-01-04  7:50                       ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04  7:50 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Hillf Danton, linux-mm, Andrew Morton, Mel Gorman,
	Johannes Weiner, Vlastimil Babka, Rik van Riel, LKML

On Wed 04-01-17 14:07:22, Minchan Kim wrote:
> On Tue, Jan 03, 2017 at 09:21:22AM +0100, Michal Hocko wrote:
[...]
> > with other tracepoints but that can be helpful because you do not have
> > all the tracepoints enabled all the time. So unless you see this
> > particular thing as a road block I would rather keep it.
> 
> I didn't know how long this thread becomes lenghy. To me, it was no worth
> to discuss. I did best effot to explain my stand with valid points, I think
> and don't want to go infinite loop. If you don't agree still, separate
> the patch. One includes only necessary things with removing nr_scanned, which
> I am happy to ack it. Based upon it, add one more patch you want adding
> nr_scanned with your claim. I will reply that thread with my claim and
> let's keep an eye on it that whether maintainer will take it or not.

To be honest this is just not worth the effort and rather than
discussing further I will just drop the nr_scanned slthough I disagree
that your concerns regarding this _particular counter_ are really valid.

> If maintainer will take it, it's good indication which will represent
> we can add more extra tracepoint easily with "might be helpful with someone
> although it's redunant" so do not prevent others who want to do
> in the future.

no we do not work in a precedence system like that.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2017-01-04  7:50                       ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-04  7:50 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Hillf Danton, linux-mm, Andrew Morton, Mel Gorman,
	Johannes Weiner, Vlastimil Babka, Rik van Riel, LKML

On Wed 04-01-17 14:07:22, Minchan Kim wrote:
> On Tue, Jan 03, 2017 at 09:21:22AM +0100, Michal Hocko wrote:
[...]
> > with other tracepoints but that can be helpful because you do not have
> > all the tracepoints enabled all the time. So unless you see this
> > particular thing as a road block I would rather keep it.
> 
> I didn't know how long this thread becomes lenghy. To me, it was no worth
> to discuss. I did best effot to explain my stand with valid points, I think
> and don't want to go infinite loop. If you don't agree still, separate
> the patch. One includes only necessary things with removing nr_scanned, which
> I am happy to ack it. Based upon it, add one more patch you want adding
> nr_scanned with your claim. I will reply that thread with my claim and
> let's keep an eye on it that whether maintainer will take it or not.

To be honest this is just not worth the effort and rather than
discussing further I will just drop the nr_scanned slthough I disagree
that your concerns regarding this _particular counter_ are really valid.

> If maintainer will take it, it's good indication which will represent
> we can add more extra tracepoint easily with "might be helpful with someone
> although it's redunant" so do not prevent others who want to do
> in the future.

no we do not work in a precedence system like that.

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2017-01-04  5:07                     ` Minchan Kim
@ 2017-01-04  7:28                       ` Vlastimil Babka
  -1 siblings, 0 replies; 74+ messages in thread
From: Vlastimil Babka @ 2017-01-04  7:28 UTC (permalink / raw)
  To: Minchan Kim, Michal Hocko
  Cc: Hillf Danton, linux-mm, Andrew Morton, Mel Gorman,
	Johannes Weiner, Rik van Riel, LKML

On 01/04/2017 06:07 AM, Minchan Kim wrote:
> With this,
> ./scripts/bloat-o-meter vmlinux.old vmlinux.new.new
> add/remove: 1/1 grow/shrink: 0/9 up/down: 1394/-1636 (-242)
> function                                     old     new   delta
> isolate_lru_pages                              -    1394   +1394
> print_fmt_mm_vmscan_lru_shrink_inactive      359     355      -4
> vermagic                                      64      58      -6
> perf_trace_mm_vmscan_lru_shrink_active       264     256      -8
> trace_raw_output_mm_vmscan_lru_shrink_active     203     193     -10
> trace_event_raw_event_mm_vmscan_lru_shrink_active     241     225     -16
> print_fmt_mm_vmscan_lru_shrink_active        458     426     -32
> trace_event_define_fields_mm_vmscan_lru_shrink_active     384     336     -48
> shrink_inactive_list                        1430    1271    -159
> shrink_active_list                          1265    1082    -183
> isolate_lru_pages.isra                      1170       -   -1170
> Total: Before=26268743, After=26268501, chg -0.00%
> 
> We can save 242 bytes.
> 
> If we consider binary size, 424 bytes save.
> 
> #> ls -l vmlinux.old vmlinux.new.new
> 194092840  vmlinux.old
> 194092416  vmlinux.new.new

Which is roughly 0.0002%. Not that I'm against fighting bloat, but let's
not forget that it's not the only factor. For example the following part
from above:

> isolate_lru_pages                              -    1394   +1394
> isolate_lru_pages.isra                      1170       -   -1170

shows that your change has prevented a -fipa-src gcc optimisation, which
is "interprocedural scalar replacement of aggregates, removal of unused
parameters and replacement of parameters passed by reference by
parameters passed by value." Well, I'm no gcc expert :) but it might be
that the change is not a simple win-win.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2017-01-04  7:28                       ` Vlastimil Babka
  0 siblings, 0 replies; 74+ messages in thread
From: Vlastimil Babka @ 2017-01-04  7:28 UTC (permalink / raw)
  To: Minchan Kim, Michal Hocko
  Cc: Hillf Danton, linux-mm, Andrew Morton, Mel Gorman,
	Johannes Weiner, Rik van Riel, LKML

On 01/04/2017 06:07 AM, Minchan Kim wrote:
> With this,
> ./scripts/bloat-o-meter vmlinux.old vmlinux.new.new
> add/remove: 1/1 grow/shrink: 0/9 up/down: 1394/-1636 (-242)
> function                                     old     new   delta
> isolate_lru_pages                              -    1394   +1394
> print_fmt_mm_vmscan_lru_shrink_inactive      359     355      -4
> vermagic                                      64      58      -6
> perf_trace_mm_vmscan_lru_shrink_active       264     256      -8
> trace_raw_output_mm_vmscan_lru_shrink_active     203     193     -10
> trace_event_raw_event_mm_vmscan_lru_shrink_active     241     225     -16
> print_fmt_mm_vmscan_lru_shrink_active        458     426     -32
> trace_event_define_fields_mm_vmscan_lru_shrink_active     384     336     -48
> shrink_inactive_list                        1430    1271    -159
> shrink_active_list                          1265    1082    -183
> isolate_lru_pages.isra                      1170       -   -1170
> Total: Before=26268743, After=26268501, chg -0.00%
> 
> We can save 242 bytes.
> 
> If we consider binary size, 424 bytes save.
> 
> #> ls -l vmlinux.old vmlinux.new.new
> 194092840  vmlinux.old
> 194092416  vmlinux.new.new

Which is roughly 0.0002%. Not that I'm against fighting bloat, but let's
not forget that it's not the only factor. For example the following part
from above:

> isolate_lru_pages                              -    1394   +1394
> isolate_lru_pages.isra                      1170       -   -1170

shows that your change has prevented a -fipa-src gcc optimisation, which
is "interprocedural scalar replacement of aggregates, removal of unused
parameters and replacement of parameters passed by reference by
parameters passed by value." Well, I'm no gcc expert :) but it might be
that the change is not a simple win-win.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2017-01-03  8:21                   ` Michal Hocko
@ 2017-01-04  5:07                     ` Minchan Kim
  -1 siblings, 0 replies; 74+ messages in thread
From: Minchan Kim @ 2017-01-04  5:07 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hillf Danton, linux-mm, Andrew Morton, Mel Gorman,
	Johannes Weiner, Vlastimil Babka, Rik van Riel, LKML

On Tue, Jan 03, 2017 at 09:21:22AM +0100, Michal Hocko wrote:
> On Tue 03-01-17 14:03:28, Minchan Kim wrote:
> > Hi Michal,
> > 
> > On Fri, Dec 30, 2016 at 05:37:42PM +0100, Michal Hocko wrote:
> > > On Sat 31-12-16 01:04:56, Minchan Kim wrote:
> > > [...]
> > > > > From 5f1bc22ad1e54050b4da3228d68945e70342ebb6 Mon Sep 17 00:00:00 2001
> > > > > From: Michal Hocko <mhocko@suse.com>
> > > > > Date: Tue, 27 Dec 2016 13:18:20 +0100
> > > > > Subject: [PATCH] mm, vmscan: add active list aging tracepoint
> > > > > 
> > > > > Our reclaim process has several tracepoints to tell us more about how
> > > > > things are progressing. We are, however, missing a tracepoint to track
> > > > > active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> > > > 
> > > > I agree this part.
> > > > 
> > > > > the number of
> > > > > 	- nr_scanned, nr_taken pages to tell us the LRU isolation
> > > > > 	  effectiveness.
> > > > 
> > > > I agree nr_taken for knowing shrinking effectiveness but don't
> > > > agree nr_scanned. If we want to know LRU isolation effectiveness
> > > > with nr_scanned and nr_taken, isolate_lru_pages will do.
> > > 
> > > Yes it will. On the other hand the number is there and there is no
> > > additional overhead, maintenance or otherwise, to provide that number.
> > 
> > You are adding some instructions, how can you imagine it's no overhead?
> 
> There should be close to zero overhead when the tracepoint is disabled
> (we pay only one more argument when the function is called). Is this
> really worth discussing in this cold path? We are talking about the
> reclaim here.

I am talking about that why we should add pointless code in there.
No matter it's overhead. We are looping infinite. Blindly, it adds
overhead although you might think so trivial.

> 
> > Let's say whether it's measurable. Although it's not big in particular case,
> > it would be measurable if everyone start to say like that "it's trivial so
> > what's the problem adding a few instructions although it was duplicated?"
> > 
> > You already said "LRU isolate effectiveness". It should be done in there,
> > isolate_lru_pages and we have been. You need another reasons if you want to
> > add the duplicated work, strongly.
> 
> isolate_lru_pages is certainly there but you have to enable a trace
> point for that. Sometimes it is quite useful to get a reasonably good
> picture even without all the vmscan tracepoints enabled because they
> can generate quite a lot of output. So if the counter is available I

If someone want to see "isolate effectivenss", he should enable
mm_vmscan_lru_isolate which was born in that and has more helpful
information.

Think it in an opposit way. If some users want to see just active
list aging problem and no interested in "LRU isolate effectivness",
you are adding meaningless output for him and he has no choice to
turn it off with your patch.

> see no reason to exclude it, especially when it can provide a useful
> information. One of the most frustrating debugging experience is when

I said several times. Please think over if everyone begins adding extra
parameters in every tracepoints which we could already get it via other
tracepoint with "just, it might be useful in a specific context".
Could you be happy with that, really?

> you are missing some part of the information and have to guess which
> part is that and patch, rebuild the kernel and hope to reproduce it
> again in the same/similar way.

No need to rebuild. Just enable mm_vmscan_lru_isolate.

> 
> There are two things about this and other tracepoint patches in general
> I believe. 1) Is the tracepoint useful? and 2) Do we have to go over
> extra hops to show tracepoint data?
> 
> I guess we are in an agreement that the answer for 1 is yes. And

yeb.

> regarding 2, all the data we are showing are there or trivially
> retrieved without touching _any_ hot path. Som of it might be duplicated


Currently, you rely on just unfortunate modulization to just add
unncessary information to the tracepoint.

I just removed nr_scanned in your patch and look below.

./scripts/bloat-o-meter vmlinux.old vmlinux.new
add/remove: 0/0 grow/shrink: 0/6 up/down: 0/-147 (-147)
function                                     old     new   delta
perf_trace_mm_vmscan_lru_shrink_active       264     256      -8
trace_raw_output_mm_vmscan_lru_shrink_active     203     193     -10
trace_event_raw_event_mm_vmscan_lru_shrink_active     241     225     -16
print_fmt_mm_vmscan_lru_shrink_active        458     426     -32
shrink_active_list                          1265    1232     -33
trace_event_define_fields_mm_vmscan_lru_shrink_active     384     336     -48
Total: Before=26268743, After=26268596, chg -0.00%

Let's furhter it more.

We can factor out logics to account isolation of LRU from shrink_[in]active_list
which is more clean, I think.

>From 1053968d526427ecad96b682aa586701c4ecfc84 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@kernel.org>
Date: Wed, 4 Jan 2017 10:04:36 +0900
Subject: [PATCH] factor out LRU isolation accounting.

Not-yet-signed-off-by: Minchan Kim <minchan@kernel.org>
---
 include/trace/events/vmscan.h | 14 +++++----
 mm/vmscan.c                   | 68 ++++++++++++++++++-------------------------
 2 files changed, 37 insertions(+), 45 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 79b3cd9c7048..5fc3a94a14cd 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -364,14 +364,15 @@ TRACE_EVENT(mm_vmscan_writepage,
 TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 
 	TP_PROTO(int nid,
-		unsigned long nr_scanned, unsigned long nr_reclaimed,
+		unsigned long nr_taken,
+		unsigned long nr_reclaimed,
 		int priority, int file),
 
-	TP_ARGS(nid, nr_scanned, nr_reclaimed, priority, file),
+	TP_ARGS(nid, nr_taken, nr_reclaimed, priority, file),
 
 	TP_STRUCT__entry(
 		__field(int, nid)
-		__field(unsigned long, nr_scanned)
+		__field(unsigned long, nr_taken)
 		__field(unsigned long, nr_reclaimed)
 		__field(int, priority)
 		__field(int, reclaim_flags)
@@ -379,15 +380,16 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 
 	TP_fast_assign(
 		__entry->nid = nid;
-		__entry->nr_scanned = nr_scanned;
+		__entry->nr_taken = nr_taken;
 		__entry->nr_reclaimed = nr_reclaimed;
 		__entry->priority = priority;
 		__entry->reclaim_flags = trace_shrink_flags(file);
 	),
 
-	TP_printk("nid=%d nr_scanned=%ld nr_reclaimed=%ld priority=%d flags=%s",
+	TP_printk("nid=%d nr_taken=%ld nr_reclaimed=%ld priority=%d flags=%s",
 		__entry->nid,
-		__entry->nr_scanned, __entry->nr_reclaimed,
+		__entry->nr_taken,
+		__entry->nr_reclaimed,
 		__entry->priority,
 		show_reclaim_flags(__entry->reclaim_flags))
 );
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 37ccd4e0b349..74f55f39f963 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1454,16 +1454,16 @@ static __always_inline void update_lru_sizes(struct lruvec *lruvec,
  * @nr_to_scan:	The number of pages to look through on the list.
  * @lruvec:	The LRU vector to pull pages from.
  * @dst:	The temp list to put pages on to.
- * @nr_scanned:	The number of pages that were scanned.
  * @sc:		The scan_control struct for this reclaim session
  * @mode:	One of the LRU isolation modes
  * @lru:	LRU list id for isolating
  *
  * returns how many pages were moved onto *@dst.
  */
-static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
+static unsigned long isolate_lru_pages(struct pglist_data *pgdat,
+		unsigned long nr_to_scan,
 		struct lruvec *lruvec, struct list_head *dst,
-		unsigned long *nr_scanned, struct scan_control *sc,
+		struct scan_control *sc,
 		isolate_mode_t mode, enum lru_list lru)
 {
 	struct list_head *src = &lruvec->lists[lru];
@@ -1471,8 +1471,11 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
 	unsigned long nr_zone_taken[MAX_NR_ZONES] = { 0 };
 	unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
 	unsigned long scan, nr_pages;
+	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
 	LIST_HEAD(pages_skipped);
+	int file = is_file_lru(lru);
 
+	spin_lock_irq(&pgdat->lru_lock);
 	for (scan = 0; scan < nr_to_scan && nr_taken < nr_to_scan &&
 					!list_empty(src);) {
 		struct page *page;
@@ -1540,10 +1543,25 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
 
 		list_splice(&pages_skipped, src);
 	}
-	*nr_scanned = scan;
 	trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, nr_to_scan, scan,
 				    nr_taken, mode, is_file_lru(lru));
 	update_lru_sizes(lruvec, lru, nr_zone_taken, nr_taken);
+
+	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, nr_taken);
+	reclaim_stat->recent_scanned[file] += nr_taken;
+
+	if (global_reclaim(sc))
+		__mod_node_page_state(pgdat, NR_PAGES_SCANNED, scan);
+	if (is_active_lru(lru)) {
+		__count_vm_events(PGREFILL, scan);
+	} else {
+		if (current_is_kswapd())
+			__count_vm_events(PGSCAN_KSWAPD, scan);
+		else
+			__count_vm_events(PGSCAN_DIRECT, scan);
+	}
+	spin_unlock_irq(&pgdat->lru_lock);
+
 	return nr_taken;
 }
 
@@ -1735,7 +1753,6 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		     struct scan_control *sc, enum lru_list lru)
 {
 	LIST_HEAD(page_list);
-	unsigned long nr_scanned;
 	unsigned long nr_reclaimed = 0;
 	unsigned long nr_taken;
 	unsigned long nr_dirty = 0;
@@ -1746,7 +1763,6 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
-	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
 
 	if (!inactive_reclaimable_pages(lruvec, sc, lru))
 		return 0;
@@ -1766,23 +1782,8 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	if (!sc->may_writepage)
 		isolate_mode |= ISOLATE_CLEAN;
 
-	spin_lock_irq(&pgdat->lru_lock);
-
-	nr_taken = isolate_lru_pages(nr_to_scan, lruvec, &page_list,
-				     &nr_scanned, sc, isolate_mode, lru);
-
-	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, nr_taken);
-	reclaim_stat->recent_scanned[file] += nr_taken;
-
-	if (global_reclaim(sc)) {
-		__mod_node_page_state(pgdat, NR_PAGES_SCANNED, nr_scanned);
-		if (current_is_kswapd())
-			__count_vm_events(PGSCAN_KSWAPD, nr_scanned);
-		else
-			__count_vm_events(PGSCAN_DIRECT, nr_scanned);
-	}
-	spin_unlock_irq(&pgdat->lru_lock);
-
+	nr_taken = isolate_lru_pages(pgdat, nr_to_scan, lruvec, &page_list,
+					sc, isolate_mode, lru);
 	if (nr_taken == 0)
 		return 0;
 
@@ -1866,7 +1867,8 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		wait_iff_congested(pgdat, BLK_RW_ASYNC, HZ/10);
 
 	trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id,
-			nr_scanned, nr_reclaimed,
+			nr_taken,
+			nr_reclaimed,
 			sc->priority, file);
 	return nr_reclaimed;
 }
@@ -1943,18 +1945,17 @@ static void shrink_active_list(unsigned long nr_to_scan,
 			       enum lru_list lru)
 {
 	unsigned long nr_taken;
-	unsigned long nr_scanned;
 	unsigned long vm_flags;
 	LIST_HEAD(l_hold);	/* The pages which were snipped off */
 	LIST_HEAD(l_active);
 	LIST_HEAD(l_inactive);
 	struct page *page;
-	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
 	unsigned nr_deactivate, nr_activate;
 	unsigned nr_rotated = 0;
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
+	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
 
 	lru_add_drain();
 
@@ -1963,19 +1964,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	if (!sc->may_writepage)
 		isolate_mode |= ISOLATE_CLEAN;
 
-	spin_lock_irq(&pgdat->lru_lock);
-
-	nr_taken = isolate_lru_pages(nr_to_scan, lruvec, &l_hold,
-				     &nr_scanned, sc, isolate_mode, lru);
-
-	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, nr_taken);
-	reclaim_stat->recent_scanned[file] += nr_taken;
-
-	if (global_reclaim(sc))
-		__mod_node_page_state(pgdat, NR_PAGES_SCANNED, nr_scanned);
-	__count_vm_events(PGREFILL, nr_scanned);
-
-	spin_unlock_irq(&pgdat->lru_lock);
+	nr_taken = isolate_lru_pages(pgdat, nr_to_scan, lruvec, &l_hold,
+				     sc, isolate_mode, lru);
 
 	while (!list_empty(&l_hold)) {
 		cond_resched();
-- 
2.7.4

With this,
./scripts/bloat-o-meter vmlinux.old vmlinux.new.new
add/remove: 1/1 grow/shrink: 0/9 up/down: 1394/-1636 (-242)
function                                     old     new   delta
isolate_lru_pages                              -    1394   +1394
print_fmt_mm_vmscan_lru_shrink_inactive      359     355      -4
vermagic                                      64      58      -6
perf_trace_mm_vmscan_lru_shrink_active       264     256      -8
trace_raw_output_mm_vmscan_lru_shrink_active     203     193     -10
trace_event_raw_event_mm_vmscan_lru_shrink_active     241     225     -16
print_fmt_mm_vmscan_lru_shrink_active        458     426     -32
trace_event_define_fields_mm_vmscan_lru_shrink_active     384     336     -48
shrink_inactive_list                        1430    1271    -159
shrink_active_list                          1265    1082    -183
isolate_lru_pages.isra                      1170       -   -1170
Total: Before=26268743, After=26268501, chg -0.00%

We can save 242 bytes.

If we consider binary size, 424 bytes save.

#> ls -l vmlinux.old vmlinux.new.new
194092840  vmlinux.old
194092416  vmlinux.new.new

> with other tracepoints but that can be helpful because you do not have
> all the tracepoints enabled all the time. So unless you see this
> particular thing as a road block I would rather keep it.

I didn't know how long this thread becomes lenghy. To me, it was no worth
to discuss. I did best effot to explain my stand with valid points, I think
and don't want to go infinite loop. If you don't agree still, separate
the patch. One includes only necessary things with removing nr_scanned, which
I am happy to ack it. Based upon it, add one more patch you want adding
nr_scanned with your claim. I will reply that thread with my claim and
let's keep an eye on it that whether maintainer will take it or not.
If maintainer will take it, it's good indication which will represent
we can add more extra tracepoint easily with "might be helpful with someone
although it's redunant" so do not prevent others who want to do
in the future.

>  
> > > The inactive counterpart does that for quite some time already. So why
> > 
> > It couldn't be a reason. If it was duplicated in there, it would be
> > better to fix it rather than adding more duplciated work to match both
> > sides.
> 
> I really do not see this as a bad thing.
> 
> > > exactly does that matter? Don't take me wrong but isn't this more on a
> > > nit picking side than necessary? Or do I just misunderstand your
> > > concenrs? It is not like we are providing a stable user API as the
> > 
> > My concern is that I don't see what we can get benefit from those
> > duplicated work. If it doesn't give benefit to us, I don't want to add.
> > I hope you think another reasonable reasons.
> > 
> > > tracepoint is clearly implementation specific and not something to be
> > > used for anything other than debugging.
> > 
> > My point is we already had things "LRU isolation effectivness". Namely,
> > isolate_lru_pages.
> > 
> > > 
> > > > > 	- nr_rotated pages which tells us that we are hitting referenced
> > > > > 	  pages which are deactivated. If this is a large part of the
> > > > > 	  reported nr_deactivated pages then the active list is too small
> > > > 
> > > > It might be but not exactly. If your goal is to know LRU size, it can be
> > > > done in get_scan_count. I tend to agree LRU size is helpful for
> > > > performance analysis because decreased LRU size signals memory shortage
> > > > then performance drop.
> > > 
> > > No, I am not really interested in the exact size but rather to allow to
> > > find whether we are aging the active list too early...
> > 
> > Could you elaborate it more that how we can get active list early aging
> > with nr_rotated?
> 
> If you see too many referenced pages on the active list then they have
> been used since promoted and that is an indication that they might be
> reclaimed too early. If you are debugging a performance issue and see
> this happening then it might be a good indication to look at.

This is better than "active list is too small". I hope you change
description with this.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2017-01-04  5:07                     ` Minchan Kim
  0 siblings, 0 replies; 74+ messages in thread
From: Minchan Kim @ 2017-01-04  5:07 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hillf Danton, linux-mm, Andrew Morton, Mel Gorman,
	Johannes Weiner, Vlastimil Babka, Rik van Riel, LKML

On Tue, Jan 03, 2017 at 09:21:22AM +0100, Michal Hocko wrote:
> On Tue 03-01-17 14:03:28, Minchan Kim wrote:
> > Hi Michal,
> > 
> > On Fri, Dec 30, 2016 at 05:37:42PM +0100, Michal Hocko wrote:
> > > On Sat 31-12-16 01:04:56, Minchan Kim wrote:
> > > [...]
> > > > > From 5f1bc22ad1e54050b4da3228d68945e70342ebb6 Mon Sep 17 00:00:00 2001
> > > > > From: Michal Hocko <mhocko@suse.com>
> > > > > Date: Tue, 27 Dec 2016 13:18:20 +0100
> > > > > Subject: [PATCH] mm, vmscan: add active list aging tracepoint
> > > > > 
> > > > > Our reclaim process has several tracepoints to tell us more about how
> > > > > things are progressing. We are, however, missing a tracepoint to track
> > > > > active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> > > > 
> > > > I agree this part.
> > > > 
> > > > > the number of
> > > > > 	- nr_scanned, nr_taken pages to tell us the LRU isolation
> > > > > 	  effectiveness.
> > > > 
> > > > I agree nr_taken for knowing shrinking effectiveness but don't
> > > > agree nr_scanned. If we want to know LRU isolation effectiveness
> > > > with nr_scanned and nr_taken, isolate_lru_pages will do.
> > > 
> > > Yes it will. On the other hand the number is there and there is no
> > > additional overhead, maintenance or otherwise, to provide that number.
> > 
> > You are adding some instructions, how can you imagine it's no overhead?
> 
> There should be close to zero overhead when the tracepoint is disabled
> (we pay only one more argument when the function is called). Is this
> really worth discussing in this cold path? We are talking about the
> reclaim here.

I am talking about that why we should add pointless code in there.
No matter it's overhead. We are looping infinite. Blindly, it adds
overhead although you might think so trivial.

> 
> > Let's say whether it's measurable. Although it's not big in particular case,
> > it would be measurable if everyone start to say like that "it's trivial so
> > what's the problem adding a few instructions although it was duplicated?"
> > 
> > You already said "LRU isolate effectiveness". It should be done in there,
> > isolate_lru_pages and we have been. You need another reasons if you want to
> > add the duplicated work, strongly.
> 
> isolate_lru_pages is certainly there but you have to enable a trace
> point for that. Sometimes it is quite useful to get a reasonably good
> picture even without all the vmscan tracepoints enabled because they
> can generate quite a lot of output. So if the counter is available I

If someone want to see "isolate effectivenss", he should enable
mm_vmscan_lru_isolate which was born in that and has more helpful
information.

Think it in an opposit way. If some users want to see just active
list aging problem and no interested in "LRU isolate effectivness",
you are adding meaningless output for him and he has no choice to
turn it off with your patch.

> see no reason to exclude it, especially when it can provide a useful
> information. One of the most frustrating debugging experience is when

I said several times. Please think over if everyone begins adding extra
parameters in every tracepoints which we could already get it via other
tracepoint with "just, it might be useful in a specific context".
Could you be happy with that, really?

> you are missing some part of the information and have to guess which
> part is that and patch, rebuild the kernel and hope to reproduce it
> again in the same/similar way.

No need to rebuild. Just enable mm_vmscan_lru_isolate.

> 
> There are two things about this and other tracepoint patches in general
> I believe. 1) Is the tracepoint useful? and 2) Do we have to go over
> extra hops to show tracepoint data?
> 
> I guess we are in an agreement that the answer for 1 is yes. And

yeb.

> regarding 2, all the data we are showing are there or trivially
> retrieved without touching _any_ hot path. Som of it might be duplicated


Currently, you rely on just unfortunate modulization to just add
unncessary information to the tracepoint.

I just removed nr_scanned in your patch and look below.

./scripts/bloat-o-meter vmlinux.old vmlinux.new
add/remove: 0/0 grow/shrink: 0/6 up/down: 0/-147 (-147)
function                                     old     new   delta
perf_trace_mm_vmscan_lru_shrink_active       264     256      -8
trace_raw_output_mm_vmscan_lru_shrink_active     203     193     -10
trace_event_raw_event_mm_vmscan_lru_shrink_active     241     225     -16
print_fmt_mm_vmscan_lru_shrink_active        458     426     -32
shrink_active_list                          1265    1232     -33
trace_event_define_fields_mm_vmscan_lru_shrink_active     384     336     -48
Total: Before=26268743, After=26268596, chg -0.00%

Let's furhter it more.

We can factor out logics to account isolation of LRU from shrink_[in]active_list
which is more clean, I think.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2017-01-03  5:03                 ` Minchan Kim
@ 2017-01-03  8:21                   ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-03  8:21 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Hillf Danton, linux-mm, Andrew Morton, Mel Gorman,
	Johannes Weiner, Vlastimil Babka, Rik van Riel, LKML

On Tue 03-01-17 14:03:28, Minchan Kim wrote:
> Hi Michal,
> 
> On Fri, Dec 30, 2016 at 05:37:42PM +0100, Michal Hocko wrote:
> > On Sat 31-12-16 01:04:56, Minchan Kim wrote:
> > [...]
> > > > From 5f1bc22ad1e54050b4da3228d68945e70342ebb6 Mon Sep 17 00:00:00 2001
> > > > From: Michal Hocko <mhocko@suse.com>
> > > > Date: Tue, 27 Dec 2016 13:18:20 +0100
> > > > Subject: [PATCH] mm, vmscan: add active list aging tracepoint
> > > > 
> > > > Our reclaim process has several tracepoints to tell us more about how
> > > > things are progressing. We are, however, missing a tracepoint to track
> > > > active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> > > 
> > > I agree this part.
> > > 
> > > > the number of
> > > > 	- nr_scanned, nr_taken pages to tell us the LRU isolation
> > > > 	  effectiveness.
> > > 
> > > I agree nr_taken for knowing shrinking effectiveness but don't
> > > agree nr_scanned. If we want to know LRU isolation effectiveness
> > > with nr_scanned and nr_taken, isolate_lru_pages will do.
> > 
> > Yes it will. On the other hand the number is there and there is no
> > additional overhead, maintenance or otherwise, to provide that number.
> 
> You are adding some instructions, how can you imagine it's no overhead?

There should be close to zero overhead when the tracepoint is disabled
(we pay only one more argument when the function is called). Is this
really worth discussing in this cold path? We are talking about the
reclaim here.

> Let's say whether it's measurable. Although it's not big in particular case,
> it would be measurable if everyone start to say like that "it's trivial so
> what's the problem adding a few instructions although it was duplicated?"
> 
> You already said "LRU isolate effectiveness". It should be done in there,
> isolate_lru_pages and we have been. You need another reasons if you want to
> add the duplicated work, strongly.

isolate_lru_pages is certainly there but you have to enable a trace
point for that. Sometimes it is quite useful to get a reasonably good
picture even without all the vmscan tracepoints enabled because they
can generate quite a lot of output. So if the counter is available I
see no reason to exclude it, especially when it can provide a useful
information. One of the most frustrating debugging experience is when
you are missing some part of the information and have to guess which
part is that and patch, rebuild the kernel and hope to reproduce it
again in the same/similar way.

There are two things about this and other tracepoint patches in general
I believe. 1) Is the tracepoint useful? and 2) Do we have to go over
extra hops to show tracepoint data?

I guess we are in an agreement that the answer for 1 is yes. And
regarding 2, all the data we are showing are there or trivially
retrieved without touching _any_ hot path. Som of it might be duplicated
with other tracepoints but that can be helpful because you do not have
all the tracepoints enabled all the time. So unless you see this
particular thing as a road block I would rather keep it.
 
> > The inactive counterpart does that for quite some time already. So why
> 
> It couldn't be a reason. If it was duplicated in there, it would be
> better to fix it rather than adding more duplciated work to match both
> sides.

I really do not see this as a bad thing.

> > exactly does that matter? Don't take me wrong but isn't this more on a
> > nit picking side than necessary? Or do I just misunderstand your
> > concenrs? It is not like we are providing a stable user API as the
> 
> My concern is that I don't see what we can get benefit from those
> duplicated work. If it doesn't give benefit to us, I don't want to add.
> I hope you think another reasonable reasons.
> 
> > tracepoint is clearly implementation specific and not something to be
> > used for anything other than debugging.
> 
> My point is we already had things "LRU isolation effectivness". Namely,
> isolate_lru_pages.
> 
> > 
> > > > 	- nr_rotated pages which tells us that we are hitting referenced
> > > > 	  pages which are deactivated. If this is a large part of the
> > > > 	  reported nr_deactivated pages then the active list is too small
> > > 
> > > It might be but not exactly. If your goal is to know LRU size, it can be
> > > done in get_scan_count. I tend to agree LRU size is helpful for
> > > performance analysis because decreased LRU size signals memory shortage
> > > then performance drop.
> > 
> > No, I am not really interested in the exact size but rather to allow to
> > find whether we are aging the active list too early...
> 
> Could you elaborate it more that how we can get active list early aging
> with nr_rotated?

If you see too many referenced pages on the active list then they have
been used since promoted and that is an indication that they might be
reclaimed too early. If you are debugging a performance issue and see
this happening then it might be a good indication to look at.

Thanks
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2017-01-03  8:21                   ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2017-01-03  8:21 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Hillf Danton, linux-mm, Andrew Morton, Mel Gorman,
	Johannes Weiner, Vlastimil Babka, Rik van Riel, LKML

On Tue 03-01-17 14:03:28, Minchan Kim wrote:
> Hi Michal,
> 
> On Fri, Dec 30, 2016 at 05:37:42PM +0100, Michal Hocko wrote:
> > On Sat 31-12-16 01:04:56, Minchan Kim wrote:
> > [...]
> > > > From 5f1bc22ad1e54050b4da3228d68945e70342ebb6 Mon Sep 17 00:00:00 2001
> > > > From: Michal Hocko <mhocko@suse.com>
> > > > Date: Tue, 27 Dec 2016 13:18:20 +0100
> > > > Subject: [PATCH] mm, vmscan: add active list aging tracepoint
> > > > 
> > > > Our reclaim process has several tracepoints to tell us more about how
> > > > things are progressing. We are, however, missing a tracepoint to track
> > > > active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> > > 
> > > I agree this part.
> > > 
> > > > the number of
> > > > 	- nr_scanned, nr_taken pages to tell us the LRU isolation
> > > > 	  effectiveness.
> > > 
> > > I agree nr_taken for knowing shrinking effectiveness but don't
> > > agree nr_scanned. If we want to know LRU isolation effectiveness
> > > with nr_scanned and nr_taken, isolate_lru_pages will do.
> > 
> > Yes it will. On the other hand the number is there and there is no
> > additional overhead, maintenance or otherwise, to provide that number.
> 
> You are adding some instructions, how can you imagine it's no overhead?

There should be close to zero overhead when the tracepoint is disabled
(we pay only one more argument when the function is called). Is this
really worth discussing in this cold path? We are talking about the
reclaim here.

> Let's say whether it's measurable. Although it's not big in particular case,
> it would be measurable if everyone start to say like that "it's trivial so
> what's the problem adding a few instructions although it was duplicated?"
> 
> You already said "LRU isolate effectiveness". It should be done in there,
> isolate_lru_pages and we have been. You need another reasons if you want to
> add the duplicated work, strongly.

isolate_lru_pages is certainly there but you have to enable a trace
point for that. Sometimes it is quite useful to get a reasonably good
picture even without all the vmscan tracepoints enabled because they
can generate quite a lot of output. So if the counter is available I
see no reason to exclude it, especially when it can provide a useful
information. One of the most frustrating debugging experience is when
you are missing some part of the information and have to guess which
part is that and patch, rebuild the kernel and hope to reproduce it
again in the same/similar way.

There are two things about this and other tracepoint patches in general
I believe. 1) Is the tracepoint useful? and 2) Do we have to go over
extra hops to show tracepoint data?

I guess we are in an agreement that the answer for 1 is yes. And
regarding 2, all the data we are showing are there or trivially
retrieved without touching _any_ hot path. Som of it might be duplicated
with other tracepoints but that can be helpful because you do not have
all the tracepoints enabled all the time. So unless you see this
particular thing as a road block I would rather keep it.
 
> > The inactive counterpart does that for quite some time already. So why
> 
> It couldn't be a reason. If it was duplicated in there, it would be
> better to fix it rather than adding more duplciated work to match both
> sides.

I really do not see this as a bad thing.

> > exactly does that matter? Don't take me wrong but isn't this more on a
> > nit picking side than necessary? Or do I just misunderstand your
> > concenrs? It is not like we are providing a stable user API as the
> 
> My concern is that I don't see what we can get benefit from those
> duplicated work. If it doesn't give benefit to us, I don't want to add.
> I hope you think another reasonable reasons.
> 
> > tracepoint is clearly implementation specific and not something to be
> > used for anything other than debugging.
> 
> My point is we already had things "LRU isolation effectivness". Namely,
> isolate_lru_pages.
> 
> > 
> > > > 	- nr_rotated pages which tells us that we are hitting referenced
> > > > 	  pages which are deactivated. If this is a large part of the
> > > > 	  reported nr_deactivated pages then the active list is too small
> > > 
> > > It might be but not exactly. If your goal is to know LRU size, it can be
> > > done in get_scan_count. I tend to agree LRU size is helpful for
> > > performance analysis because decreased LRU size signals memory shortage
> > > then performance drop.
> > 
> > No, I am not really interested in the exact size but rather to allow to
> > find whether we are aging the active list too early...
> 
> Could you elaborate it more that how we can get active list early aging
> with nr_rotated?

If you see too many referenced pages on the active list then they have
been used since promoted and that is an indication that they might be
reclaimed too early. If you are debugging a performance issue and see
this happening then it might be a good indication to look at.

Thanks
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2016-12-30 16:37               ` Michal Hocko
@ 2017-01-03  5:03                 ` Minchan Kim
  -1 siblings, 0 replies; 74+ messages in thread
From: Minchan Kim @ 2017-01-03  5:03 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hillf Danton, linux-mm, Andrew Morton, Mel Gorman,
	Johannes Weiner, Vlastimil Babka, Rik van Riel, LKML

Hi Michal,

On Fri, Dec 30, 2016 at 05:37:42PM +0100, Michal Hocko wrote:
> On Sat 31-12-16 01:04:56, Minchan Kim wrote:
> [...]
> > > From 5f1bc22ad1e54050b4da3228d68945e70342ebb6 Mon Sep 17 00:00:00 2001
> > > From: Michal Hocko <mhocko@suse.com>
> > > Date: Tue, 27 Dec 2016 13:18:20 +0100
> > > Subject: [PATCH] mm, vmscan: add active list aging tracepoint
> > > 
> > > Our reclaim process has several tracepoints to tell us more about how
> > > things are progressing. We are, however, missing a tracepoint to track
> > > active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> > 
> > I agree this part.
> > 
> > > the number of
> > > 	- nr_scanned, nr_taken pages to tell us the LRU isolation
> > > 	  effectiveness.
> > 
> > I agree nr_taken for knowing shrinking effectiveness but don't
> > agree nr_scanned. If we want to know LRU isolation effectiveness
> > with nr_scanned and nr_taken, isolate_lru_pages will do.
> 
> Yes it will. On the other hand the number is there and there is no
> additional overhead, maintenance or otherwise, to provide that number.

You are adding some instructions, how can you imagine it's no overhead?
Let's say whether it's measurable. Although it's not big in particular case,
it would be measurable if everyone start to say like that "it's trivial so
what's the problem adding a few instructions although it was duplicated?"

You already said "LRU isolate effectiveness". It should be done in there,
isolate_lru_pages and we have been. You need another reasons if you want to
add the duplicated work, strongly.

> The inactive counterpart does that for quite some time already. So why

It couldn't be a reason. If it was duplicated in there, it would be
better to fix it rather than adding more duplciated work to match both
sides.

> exactly does that matter? Don't take me wrong but isn't this more on a
> nit picking side than necessary? Or do I just misunderstand your
> concenrs? It is not like we are providing a stable user API as the

My concern is that I don't see what we can get benefit from those
duplicated work. If it doesn't give benefit to us, I don't want to add.
I hope you think another reasonable reasons.

> tracepoint is clearly implementation specific and not something to be
> used for anything other than debugging.

My point is we already had things "LRU isolation effectivness". Namely,
isolate_lru_pages.

> 
> > > 	- nr_rotated pages which tells us that we are hitting referenced
> > > 	  pages which are deactivated. If this is a large part of the
> > > 	  reported nr_deactivated pages then the active list is too small
> > 
> > It might be but not exactly. If your goal is to know LRU size, it can be
> > done in get_scan_count. I tend to agree LRU size is helpful for
> > performance analysis because decreased LRU size signals memory shortage
> > then performance drop.
> 
> No, I am not really interested in the exact size but rather to allow to
> find whether we are aging the active list too early...

Could you elaborate it more that how we can get active list early aging
with nr_rotated?

Thanks.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2017-01-03  5:03                 ` Minchan Kim
  0 siblings, 0 replies; 74+ messages in thread
From: Minchan Kim @ 2017-01-03  5:03 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hillf Danton, linux-mm, Andrew Morton, Mel Gorman,
	Johannes Weiner, Vlastimil Babka, Rik van Riel, LKML

Hi Michal,

On Fri, Dec 30, 2016 at 05:37:42PM +0100, Michal Hocko wrote:
> On Sat 31-12-16 01:04:56, Minchan Kim wrote:
> [...]
> > > From 5f1bc22ad1e54050b4da3228d68945e70342ebb6 Mon Sep 17 00:00:00 2001
> > > From: Michal Hocko <mhocko@suse.com>
> > > Date: Tue, 27 Dec 2016 13:18:20 +0100
> > > Subject: [PATCH] mm, vmscan: add active list aging tracepoint
> > > 
> > > Our reclaim process has several tracepoints to tell us more about how
> > > things are progressing. We are, however, missing a tracepoint to track
> > > active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> > 
> > I agree this part.
> > 
> > > the number of
> > > 	- nr_scanned, nr_taken pages to tell us the LRU isolation
> > > 	  effectiveness.
> > 
> > I agree nr_taken for knowing shrinking effectiveness but don't
> > agree nr_scanned. If we want to know LRU isolation effectiveness
> > with nr_scanned and nr_taken, isolate_lru_pages will do.
> 
> Yes it will. On the other hand the number is there and there is no
> additional overhead, maintenance or otherwise, to provide that number.

You are adding some instructions, how can you imagine it's no overhead?
Let's say whether it's measurable. Although it's not big in particular case,
it would be measurable if everyone start to say like that "it's trivial so
what's the problem adding a few instructions although it was duplicated?"

You already said "LRU isolate effectiveness". It should be done in there,
isolate_lru_pages and we have been. You need another reasons if you want to
add the duplicated work, strongly.

> The inactive counterpart does that for quite some time already. So why

It couldn't be a reason. If it was duplicated in there, it would be
better to fix it rather than adding more duplciated work to match both
sides.

> exactly does that matter? Don't take me wrong but isn't this more on a
> nit picking side than necessary? Or do I just misunderstand your
> concenrs? It is not like we are providing a stable user API as the

My concern is that I don't see what we can get benefit from those
duplicated work. If it doesn't give benefit to us, I don't want to add.
I hope you think another reasonable reasons.

> tracepoint is clearly implementation specific and not something to be
> used for anything other than debugging.

My point is we already had things "LRU isolation effectivness". Namely,
isolate_lru_pages.

> 
> > > 	- nr_rotated pages which tells us that we are hitting referenced
> > > 	  pages which are deactivated. If this is a large part of the
> > > 	  reported nr_deactivated pages then the active list is too small
> > 
> > It might be but not exactly. If your goal is to know LRU size, it can be
> > done in get_scan_count. I tend to agree LRU size is helpful for
> > performance analysis because decreased LRU size signals memory shortage
> > then performance drop.
> 
> No, I am not really interested in the exact size but rather to allow to
> find whether we are aging the active list too early...

Could you elaborate it more that how we can get active list early aging
with nr_rotated?

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2016-12-30 16:37               ` Michal Hocko
@ 2016-12-30 17:30                 ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2016-12-30 17:30 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Hillf Danton, linux-mm, Andrew Morton, Mel Gorman,
	Johannes Weiner, Vlastimil Babka, Rik van Riel, LKML

On Fri 30-12-16 17:37:42, Michal Hocko wrote:
> On Sat 31-12-16 01:04:56, Minchan Kim wrote:
[...]
> > > 	- nr_rotated pages which tells us that we are hitting referenced
> > > 	  pages which are deactivated. If this is a large part of the
> > > 	  reported nr_deactivated pages then the active list is too small
> > 
> > It might be but not exactly. If your goal is to know LRU size, it can be
> > done in get_scan_count. I tend to agree LRU size is helpful for
> > performance analysis because decreased LRU size signals memory shortage
> > then performance drop.
> 
> No, I am not really interested in the exact size but rather to allow to
> find whether we are aging the active list too early...

But thinking about that some more, maybe sticking with the nr_rotated
terminology is rather confusing and displaying the value as nr_referenced
would be more clear.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2016-12-30 17:30                 ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2016-12-30 17:30 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Hillf Danton, linux-mm, Andrew Morton, Mel Gorman,
	Johannes Weiner, Vlastimil Babka, Rik van Riel, LKML

On Fri 30-12-16 17:37:42, Michal Hocko wrote:
> On Sat 31-12-16 01:04:56, Minchan Kim wrote:
[...]
> > > 	- nr_rotated pages which tells us that we are hitting referenced
> > > 	  pages which are deactivated. If this is a large part of the
> > > 	  reported nr_deactivated pages then the active list is too small
> > 
> > It might be but not exactly. If your goal is to know LRU size, it can be
> > done in get_scan_count. I tend to agree LRU size is helpful for
> > performance analysis because decreased LRU size signals memory shortage
> > then performance drop.
> 
> No, I am not really interested in the exact size but rather to allow to
> find whether we are aging the active list too early...

But thinking about that some more, maybe sticking with the nr_rotated
terminology is rather confusing and displaying the value as nr_referenced
would be more clear.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2016-12-30 16:04             ` Minchan Kim
@ 2016-12-30 16:37               ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2016-12-30 16:37 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Hillf Danton, linux-mm, Andrew Morton, Mel Gorman,
	Johannes Weiner, Vlastimil Babka, Rik van Riel, LKML

On Sat 31-12-16 01:04:56, Minchan Kim wrote:
[...]
> > From 5f1bc22ad1e54050b4da3228d68945e70342ebb6 Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <mhocko@suse.com>
> > Date: Tue, 27 Dec 2016 13:18:20 +0100
> > Subject: [PATCH] mm, vmscan: add active list aging tracepoint
> > 
> > Our reclaim process has several tracepoints to tell us more about how
> > things are progressing. We are, however, missing a tracepoint to track
> > active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> 
> I agree this part.
> 
> > the number of
> > 	- nr_scanned, nr_taken pages to tell us the LRU isolation
> > 	  effectiveness.
> 
> I agree nr_taken for knowing shrinking effectiveness but don't
> agree nr_scanned. If we want to know LRU isolation effectiveness
> with nr_scanned and nr_taken, isolate_lru_pages will do.

Yes it will. On the other hand the number is there and there is no
additional overhead, maintenance or otherwise, to provide that number.
The inactive counterpart does that for quite some time already. So why
exactly does that matter? Don't take me wrong but isn't this more on a
nit picking side than necessary? Or do I just misunderstand your
concenrs? It is not like we are providing a stable user API as the
tracepoint is clearly implementation specific and not something to be
used for anything other than debugging.

> > 	- nr_rotated pages which tells us that we are hitting referenced
> > 	  pages which are deactivated. If this is a large part of the
> > 	  reported nr_deactivated pages then the active list is too small
> 
> It might be but not exactly. If your goal is to know LRU size, it can be
> done in get_scan_count. I tend to agree LRU size is helpful for
> performance analysis because decreased LRU size signals memory shortage
> then performance drop.

No, I am not really interested in the exact size but rather to allow to
find whether we are aging the active list too early...

> 
> > 	- nr_activated pages which tells us how many pages are keept on the
>                                                                kept

fixed

> 
> > 	  active list - mostly exec pages. A high number can indicate
> 
>                                file-based exec pages

OK, fixed

> 
> > 	  that we might be trashing on executables.
> 
> And welcome to drop nr_unevictable, nr_freed.
> 
> I will be off until next week monday so please understand if my response
> is slow.

There is no reason to hurry...
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2016-12-30 16:37               ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2016-12-30 16:37 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Hillf Danton, linux-mm, Andrew Morton, Mel Gorman,
	Johannes Weiner, Vlastimil Babka, Rik van Riel, LKML

On Sat 31-12-16 01:04:56, Minchan Kim wrote:
[...]
> > From 5f1bc22ad1e54050b4da3228d68945e70342ebb6 Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <mhocko@suse.com>
> > Date: Tue, 27 Dec 2016 13:18:20 +0100
> > Subject: [PATCH] mm, vmscan: add active list aging tracepoint
> > 
> > Our reclaim process has several tracepoints to tell us more about how
> > things are progressing. We are, however, missing a tracepoint to track
> > active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> 
> I agree this part.
> 
> > the number of
> > 	- nr_scanned, nr_taken pages to tell us the LRU isolation
> > 	  effectiveness.
> 
> I agree nr_taken for knowing shrinking effectiveness but don't
> agree nr_scanned. If we want to know LRU isolation effectiveness
> with nr_scanned and nr_taken, isolate_lru_pages will do.

Yes it will. On the other hand the number is there and there is no
additional overhead, maintenance or otherwise, to provide that number.
The inactive counterpart does that for quite some time already. So why
exactly does that matter? Don't take me wrong but isn't this more on a
nit picking side than necessary? Or do I just misunderstand your
concenrs? It is not like we are providing a stable user API as the
tracepoint is clearly implementation specific and not something to be
used for anything other than debugging.

> > 	- nr_rotated pages which tells us that we are hitting referenced
> > 	  pages which are deactivated. If this is a large part of the
> > 	  reported nr_deactivated pages then the active list is too small
> 
> It might be but not exactly. If your goal is to know LRU size, it can be
> done in get_scan_count. I tend to agree LRU size is helpful for
> performance analysis because decreased LRU size signals memory shortage
> then performance drop.

No, I am not really interested in the exact size but rather to allow to
find whether we are aging the active list too early...

> 
> > 	- nr_activated pages which tells us how many pages are keept on the
>                                                                kept

fixed

> 
> > 	  active list - mostly exec pages. A high number can indicate
> 
>                                file-based exec pages

OK, fixed

> 
> > 	  that we might be trashing on executables.
> 
> And welcome to drop nr_unevictable, nr_freed.
> 
> I will be off until next week monday so please understand if my response
> is slow.

There is no reason to hurry...
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2016-12-30  9:26           ` Michal Hocko
@ 2016-12-30 16:04             ` Minchan Kim
  -1 siblings, 0 replies; 74+ messages in thread
From: Minchan Kim @ 2016-12-30 16:04 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hillf Danton, linux-mm, Andrew Morton, Mel Gorman,
	Johannes Weiner, Vlastimil Babka, Rik van Riel, LKML

On Fri, Dec 30, 2016 at 10:26:37AM +0100, Michal Hocko wrote:
> On Fri 30-12-16 10:48:53, Minchan Kim wrote:
> > On Thu, Dec 29, 2016 at 08:52:46AM +0100, Michal Hocko wrote:
> > > On Thu 29-12-16 14:33:59, Minchan Kim wrote:
> > > > On Wed, Dec 28, 2016 at 04:30:27PM +0100, Michal Hocko wrote:
> [...]
> > > > > +TRACE_EVENT(mm_vmscan_lru_shrink_active,
> > > > > +
> > > > > +	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_freed,
> > > > > +		unsigned long nr_unevictable, unsigned long nr_deactivated,
> > > > > +		unsigned long nr_rotated, int priority, int file),
> > > > > +
> > > > > +	TP_ARGS(nid, nr_scanned, nr_freed, nr_unevictable, nr_deactivated, nr_rotated, priority, file),
> > > > 
> > > > I agree it is helpful. And it was when I investigated aging problem of 32bit
> > > > when node-lru was introduced. However, the question is we really need all those
> > > > kinds of information? just enough with nr_taken, nr_deactivated, priority, file?
> > > 
> > > Dunno. Is it harmful to add this information? I like it more when the
> > > numbers just add up and you have a clear picture. You never know what
> > > might be useful when debugging a weird behavior. 
> > 
> > Michal, I'm not huge fan of "might be useful" although it's a small piece of code.
> 
> But these are tracepoints. One of their primary reasons to exist is
> to help debug things.  And it is really hard to predict what might be
> useful in advance. It is not like the patch would export numbers which
> would be irrelevant to the reclaim.

What's different?

Please think over if everyone says like that they want to add something
with the reason "it's tracepoint which helps dubug and we cannot assume
what might be useful in the future."

> 
> > It adds just all of kinds overheads (memory footprint, runtime performance,
> > maintainance) without any proved benefit.
> 
> Does it really add any measurable overhead or the maintenance burden? I

Don't limit your thought in this particular case and expand the idea to
others who want to see random value via tracepoint with just "might-be-
good". We will lose the reason to prevent that trend if we merge any
tracepoint expansion patch with just "might-be-useful" reason.
Finally, that would bite us.

> think the only place we could argue about is free_hot_cold_page_list
> which is used in hot paths.

The point of view about shrinking active list, what we want to know
is just (nr_taken|nr_deactivated|priority|file) and it's enough,
I think. So, if you want to add nr_freed, nr_unevictable, nr_rotated
please, describe "what problem we can solve with those each numbers".

> 
> I think we can sacrifice it. The same for culled unevictable
> pages. We wouldn't know what is the missing part
> nr_taken-(nr_activate+nr_deactivate) because it could be either freed or
> moved to the unevictable list but that could be handled in a separate
> tracepoint in putback_lru_page which sounds like a useful thing I guess.
>  
> > If we allow such things, people would start adding more things with just "why not,
> > it might be useful. you never know the future" and it ends up making linux fiction
> > novel mess.
> 
> I agree with this concern in general, but is this the case in this
> particular case?

I believe it's not different.

> 
> > If it's necessary, someday, someone will catch up and will send or ask patch with
> > detailed description "why the stat is important and how it is good for us to solve
> > some problem".
> 
> I can certainly enhance the changelog. See below.
> 
> > From that, we can learn workload, way to solve the problem and git
> > history has the valuable description so new comers can keep the community up easily.
> > So, finally, overheads are justified and get merged.
> > 
> > Please add must-have for your goal described.
> 
> My primary point is that tracepoints which do not give us a good picture
> are quite useless and force us to add trace_printk or other means to
> give us further information. Then I wonder why to have an incomplete
> tracepoint at all.
> 
> Anyway, what do you think about this updated patch? I have kept Hillf's
> A-b so please let me know if it is no longer valid.
> --- 
> From 5f1bc22ad1e54050b4da3228d68945e70342ebb6 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Tue, 27 Dec 2016 13:18:20 +0100
> Subject: [PATCH] mm, vmscan: add active list aging tracepoint
> 
> Our reclaim process has several tracepoints to tell us more about how
> things are progressing. We are, however, missing a tracepoint to track
> active list aging. Introduce mm_vmscan_lru_shrink_active which reports

I agree this part.

> the number of
> 	- nr_scanned, nr_taken pages to tell us the LRU isolation
> 	  effectiveness.

I agree nr_taken for knowing shrinking effectiveness but don't
agree nr_scanned. If we want to know LRU isolation effectiveness
with nr_scanned and nr_taken, isolate_lru_pages will do.

> 	- nr_rotated pages which tells us that we are hitting referenced
> 	  pages which are deactivated. If this is a large part of the
> 	  reported nr_deactivated pages then the active list is too small

It might be but not exactly. If your goal is to know LRU size, it can be
done in get_scan_count. I tend to agree LRU size is helpful for
performance analysis because decreased LRU size signals memory shortage
then performance drop.

> 	- nr_activated pages which tells us how many pages are keept on the
                                                               kept

> 	  active list - mostly exec pages. A high number can indicate

                               file-based exec pages

> 	  that we might be trashing on executables.

And welcome to drop nr_unevictable, nr_freed.

I will be off until next week monday so please understand if my response
is slow.

Thanks.

> 
> Changes since v1
> - report nr_taken pages as per Minchan
> - report nr_activated as per Minchan
> - do not report nr_freed pages because that would add a tiny overhead to
>   free_hot_cold_page_list which is a hot path
> - do not report nr_unevictable because we can report this number via a
>   different and more generic tracepoint in putback_lru_page
> - fix move_active_pages_to_lru to report proper page count when we hit
>   into large pages
> 
> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  include/trace/events/vmscan.h | 38 ++++++++++++++++++++++++++++++++++++++
>  mm/vmscan.c                   | 18 ++++++++++++++----
>  2 files changed, 52 insertions(+), 4 deletions(-)
> 
> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> index 39bad8921ca1..f9ef242ece1b 100644
> --- a/include/trace/events/vmscan.h
> +++ b/include/trace/events/vmscan.h
> @@ -363,6 +363,44 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
>  		show_reclaim_flags(__entry->reclaim_flags))
>  );
>  
> +TRACE_EVENT(mm_vmscan_lru_shrink_active,
> +
> +	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_taken,
> +		unsigned long nr_activate, unsigned long nr_deactivated,
> +		unsigned long nr_rotated, int priority, int file),
> +
> +	TP_ARGS(nid, nr_scanned, nr_taken, nr_activate, nr_deactivated, nr_rotated, priority, file),
> +
> +	TP_STRUCT__entry(
> +		__field(int, nid)
> +		__field(unsigned long, nr_scanned)
> +		__field(unsigned long, nr_taken)
> +		__field(unsigned long, nr_activate)
> +		__field(unsigned long, nr_deactivated)
> +		__field(unsigned long, nr_rotated)
> +		__field(int, priority)
> +		__field(int, reclaim_flags)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->nid = nid;
> +		__entry->nr_scanned = nr_scanned;
> +		__entry->nr_taken = nr_taken;
> +		__entry->nr_activate = nr_activate;
> +		__entry->nr_deactivated = nr_deactivated;
> +		__entry->nr_rotated = nr_rotated;
> +		__entry->priority = priority;
> +		__entry->reclaim_flags = trace_shrink_flags(file);
> +	),
> +
> +	TP_printk("nid=%d nr_scanned=%ld nr_taken=%ld nr_activated=%ld nr_deactivated=%ld nr_rotated=%ld priority=%d flags=%s",
> +		__entry->nid,
> +		__entry->nr_scanned, __entry->nr_taken,
> +		__entry->nr_activate, __entry->nr_deactivated, __entry->nr_rotated,
> +		__entry->priority,
> +		show_reclaim_flags(__entry->reclaim_flags))
> +);
> +
>  #endif /* _TRACE_VMSCAN_H */
>  
>  /* This part must be outside protection */
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c4abf08861d2..4da4d8d0496c 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1846,9 +1846,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
>   *
>   * The downside is that we have to touch page->_refcount against each page.
>   * But we had to alter page->flags anyway.
> + *
> + * Returns the number of pages moved to the given lru.
>   */
>  
> -static void move_active_pages_to_lru(struct lruvec *lruvec,
> +static unsigned move_active_pages_to_lru(struct lruvec *lruvec,
>  				     struct list_head *list,
>  				     struct list_head *pages_to_free,
>  				     enum lru_list lru)
> @@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
>  	unsigned long pgmoved = 0;
>  	struct page *page;
>  	int nr_pages;
> +	int nr_moved = 0;
>  
>  	while (!list_empty(list)) {
>  		page = lru_to_page(list);
> @@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
>  				spin_lock_irq(&pgdat->lru_lock);
>  			} else
>  				list_add(&page->lru, pages_to_free);
> +		} else {
> +			nr_moved += nr_pages;
>  		}
>  	}
>  
>  	if (!is_active_lru(lru))
>  		__count_vm_events(PGDEACTIVATE, pgmoved);
> +
> +	return nr_moved;
>  }
>  
>  static void shrink_active_list(unsigned long nr_to_scan,
> @@ -1902,7 +1909,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  	LIST_HEAD(l_inactive);
>  	struct page *page;
>  	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
> -	unsigned long nr_rotated = 0;
> +	unsigned nr_deactivate, nr_activate;
> +	unsigned nr_rotated = 0;
>  	isolate_mode_t isolate_mode = 0;
>  	int file = is_file_lru(lru);
>  	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
> @@ -1980,13 +1988,15 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  	 */
>  	reclaim_stat->recent_rotated[file] += nr_rotated;
>  
> -	move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
> -	move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
> +	nr_activate = move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
> +	nr_deactivate = move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
>  	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
>  	spin_unlock_irq(&pgdat->lru_lock);
>  
>  	mem_cgroup_uncharge_list(&l_hold);
>  	free_hot_cold_page_list(&l_hold, true);
> +	trace_mm_vmscan_lru_shrink_active(pgdat->node_id, nr_scanned, nr_taken,
> +			nr_activate, nr_deactivate, nr_rotated, sc->priority, file);
>  }
>  
>  /*
> -- 
> 2.10.2
> 
> 
> -- 
> Michal Hocko
> SUSE Labs
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2016-12-30 16:04             ` Minchan Kim
  0 siblings, 0 replies; 74+ messages in thread
From: Minchan Kim @ 2016-12-30 16:04 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hillf Danton, linux-mm, Andrew Morton, Mel Gorman,
	Johannes Weiner, Vlastimil Babka, Rik van Riel, LKML

On Fri, Dec 30, 2016 at 10:26:37AM +0100, Michal Hocko wrote:
> On Fri 30-12-16 10:48:53, Minchan Kim wrote:
> > On Thu, Dec 29, 2016 at 08:52:46AM +0100, Michal Hocko wrote:
> > > On Thu 29-12-16 14:33:59, Minchan Kim wrote:
> > > > On Wed, Dec 28, 2016 at 04:30:27PM +0100, Michal Hocko wrote:
> [...]
> > > > > +TRACE_EVENT(mm_vmscan_lru_shrink_active,
> > > > > +
> > > > > +	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_freed,
> > > > > +		unsigned long nr_unevictable, unsigned long nr_deactivated,
> > > > > +		unsigned long nr_rotated, int priority, int file),
> > > > > +
> > > > > +	TP_ARGS(nid, nr_scanned, nr_freed, nr_unevictable, nr_deactivated, nr_rotated, priority, file),
> > > > 
> > > > I agree it is helpful. And it was when I investigated aging problem of 32bit
> > > > when node-lru was introduced. However, the question is we really need all those
> > > > kinds of information? just enough with nr_taken, nr_deactivated, priority, file?
> > > 
> > > Dunno. Is it harmful to add this information? I like it more when the
> > > numbers just add up and you have a clear picture. You never know what
> > > might be useful when debugging a weird behavior. 
> > 
> > Michal, I'm not huge fan of "might be useful" although it's a small piece of code.
> 
> But these are tracepoints. One of their primary reasons to exist is
> to help debug things.  And it is really hard to predict what might be
> useful in advance. It is not like the patch would export numbers which
> would be irrelevant to the reclaim.

What's different?

Please think over if everyone says like that they want to add something
with the reason "it's tracepoint which helps dubug and we cannot assume
what might be useful in the future."

> 
> > It adds just all of kinds overheads (memory footprint, runtime performance,
> > maintainance) without any proved benefit.
> 
> Does it really add any measurable overhead or the maintenance burden? I

Don't limit your thought in this particular case and expand the idea to
others who want to see random value via tracepoint with just "might-be-
good". We will lose the reason to prevent that trend if we merge any
tracepoint expansion patch with just "might-be-useful" reason.
Finally, that would bite us.

> think the only place we could argue about is free_hot_cold_page_list
> which is used in hot paths.

The point of view about shrinking active list, what we want to know
is just (nr_taken|nr_deactivated|priority|file) and it's enough,
I think. So, if you want to add nr_freed, nr_unevictable, nr_rotated
please, describe "what problem we can solve with those each numbers".

> 
> I think we can sacrifice it. The same for culled unevictable
> pages. We wouldn't know what is the missing part
> nr_taken-(nr_activate+nr_deactivate) because it could be either freed or
> moved to the unevictable list but that could be handled in a separate
> tracepoint in putback_lru_page which sounds like a useful thing I guess.
>  
> > If we allow such things, people would start adding more things with just "why not,
> > it might be useful. you never know the future" and it ends up making linux fiction
> > novel mess.
> 
> I agree with this concern in general, but is this the case in this
> particular case?

I believe it's not different.

> 
> > If it's necessary, someday, someone will catch up and will send or ask patch with
> > detailed description "why the stat is important and how it is good for us to solve
> > some problem".
> 
> I can certainly enhance the changelog. See below.
> 
> > From that, we can learn workload, way to solve the problem and git
> > history has the valuable description so new comers can keep the community up easily.
> > So, finally, overheads are justified and get merged.
> > 
> > Please add must-have for your goal described.
> 
> My primary point is that tracepoints which do not give us a good picture
> are quite useless and force us to add trace_printk or other means to
> give us further information. Then I wonder why to have an incomplete
> tracepoint at all.
> 
> Anyway, what do you think about this updated patch? I have kept Hillf's
> A-b so please let me know if it is no longer valid.
> --- 
> From 5f1bc22ad1e54050b4da3228d68945e70342ebb6 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Tue, 27 Dec 2016 13:18:20 +0100
> Subject: [PATCH] mm, vmscan: add active list aging tracepoint
> 
> Our reclaim process has several tracepoints to tell us more about how
> things are progressing. We are, however, missing a tracepoint to track
> active list aging. Introduce mm_vmscan_lru_shrink_active which reports

I agree this part.

> the number of
> 	- nr_scanned, nr_taken pages to tell us the LRU isolation
> 	  effectiveness.

I agree nr_taken for knowing shrinking effectiveness but don't
agree nr_scanned. If we want to know LRU isolation effectiveness
with nr_scanned and nr_taken, isolate_lru_pages will do.

> 	- nr_rotated pages which tells us that we are hitting referenced
> 	  pages which are deactivated. If this is a large part of the
> 	  reported nr_deactivated pages then the active list is too small

It might be but not exactly. If your goal is to know LRU size, it can be
done in get_scan_count. I tend to agree LRU size is helpful for
performance analysis because decreased LRU size signals memory shortage
then performance drop.

> 	- nr_activated pages which tells us how many pages are keept on the
                                                               kept

> 	  active list - mostly exec pages. A high number can indicate

                               file-based exec pages

> 	  that we might be trashing on executables.

And welcome to drop nr_unevictable, nr_freed.

I will be off until next week monday so please understand if my response
is slow.

Thanks.

> 
> Changes since v1
> - report nr_taken pages as per Minchan
> - report nr_activated as per Minchan
> - do not report nr_freed pages because that would add a tiny overhead to
>   free_hot_cold_page_list which is a hot path
> - do not report nr_unevictable because we can report this number via a
>   different and more generic tracepoint in putback_lru_page
> - fix move_active_pages_to_lru to report proper page count when we hit
>   into large pages
> 
> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  include/trace/events/vmscan.h | 38 ++++++++++++++++++++++++++++++++++++++
>  mm/vmscan.c                   | 18 ++++++++++++++----
>  2 files changed, 52 insertions(+), 4 deletions(-)
> 
> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> index 39bad8921ca1..f9ef242ece1b 100644
> --- a/include/trace/events/vmscan.h
> +++ b/include/trace/events/vmscan.h
> @@ -363,6 +363,44 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
>  		show_reclaim_flags(__entry->reclaim_flags))
>  );
>  
> +TRACE_EVENT(mm_vmscan_lru_shrink_active,
> +
> +	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_taken,
> +		unsigned long nr_activate, unsigned long nr_deactivated,
> +		unsigned long nr_rotated, int priority, int file),
> +
> +	TP_ARGS(nid, nr_scanned, nr_taken, nr_activate, nr_deactivated, nr_rotated, priority, file),
> +
> +	TP_STRUCT__entry(
> +		__field(int, nid)
> +		__field(unsigned long, nr_scanned)
> +		__field(unsigned long, nr_taken)
> +		__field(unsigned long, nr_activate)
> +		__field(unsigned long, nr_deactivated)
> +		__field(unsigned long, nr_rotated)
> +		__field(int, priority)
> +		__field(int, reclaim_flags)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->nid = nid;
> +		__entry->nr_scanned = nr_scanned;
> +		__entry->nr_taken = nr_taken;
> +		__entry->nr_activate = nr_activate;
> +		__entry->nr_deactivated = nr_deactivated;
> +		__entry->nr_rotated = nr_rotated;
> +		__entry->priority = priority;
> +		__entry->reclaim_flags = trace_shrink_flags(file);
> +	),
> +
> +	TP_printk("nid=%d nr_scanned=%ld nr_taken=%ld nr_activated=%ld nr_deactivated=%ld nr_rotated=%ld priority=%d flags=%s",
> +		__entry->nid,
> +		__entry->nr_scanned, __entry->nr_taken,
> +		__entry->nr_activate, __entry->nr_deactivated, __entry->nr_rotated,
> +		__entry->priority,
> +		show_reclaim_flags(__entry->reclaim_flags))
> +);
> +
>  #endif /* _TRACE_VMSCAN_H */
>  
>  /* This part must be outside protection */
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c4abf08861d2..4da4d8d0496c 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1846,9 +1846,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
>   *
>   * The downside is that we have to touch page->_refcount against each page.
>   * But we had to alter page->flags anyway.
> + *
> + * Returns the number of pages moved to the given lru.
>   */
>  
> -static void move_active_pages_to_lru(struct lruvec *lruvec,
> +static unsigned move_active_pages_to_lru(struct lruvec *lruvec,
>  				     struct list_head *list,
>  				     struct list_head *pages_to_free,
>  				     enum lru_list lru)
> @@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
>  	unsigned long pgmoved = 0;
>  	struct page *page;
>  	int nr_pages;
> +	int nr_moved = 0;
>  
>  	while (!list_empty(list)) {
>  		page = lru_to_page(list);
> @@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
>  				spin_lock_irq(&pgdat->lru_lock);
>  			} else
>  				list_add(&page->lru, pages_to_free);
> +		} else {
> +			nr_moved += nr_pages;
>  		}
>  	}
>  
>  	if (!is_active_lru(lru))
>  		__count_vm_events(PGDEACTIVATE, pgmoved);
> +
> +	return nr_moved;
>  }
>  
>  static void shrink_active_list(unsigned long nr_to_scan,
> @@ -1902,7 +1909,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  	LIST_HEAD(l_inactive);
>  	struct page *page;
>  	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
> -	unsigned long nr_rotated = 0;
> +	unsigned nr_deactivate, nr_activate;
> +	unsigned nr_rotated = 0;
>  	isolate_mode_t isolate_mode = 0;
>  	int file = is_file_lru(lru);
>  	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
> @@ -1980,13 +1988,15 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  	 */
>  	reclaim_stat->recent_rotated[file] += nr_rotated;
>  
> -	move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
> -	move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
> +	nr_activate = move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
> +	nr_deactivate = move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
>  	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
>  	spin_unlock_irq(&pgdat->lru_lock);
>  
>  	mem_cgroup_uncharge_list(&l_hold);
>  	free_hot_cold_page_list(&l_hold, true);
> +	trace_mm_vmscan_lru_shrink_active(pgdat->node_id, nr_scanned, nr_taken,
> +			nr_activate, nr_deactivate, nr_rotated, sc->priority, file);
>  }
>  
>  /*
> -- 
> 2.10.2
> 
> 
> -- 
> Michal Hocko
> SUSE Labs
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2016-12-30  9:26           ` Michal Hocko
@ 2016-12-30  9:38             ` Hillf Danton
  -1 siblings, 0 replies; 74+ messages in thread
From: Hillf Danton @ 2016-12-30  9:38 UTC (permalink / raw)
  To: 'Michal Hocko', 'Minchan Kim'
  Cc: linux-mm, 'Andrew Morton', 'Mel Gorman',
	'Johannes Weiner', 'Vlastimil Babka',
	'Rik van Riel', 'LKML'


On Friday, December 30, 2016 5:27 PM Michal Hocko wrote: 
> Anyway, what do you think about this updated patch? I have kept Hillf's
> A-b so please let me know if it is no longer valid.
> 
My mind is not changed:)

Happy new year folks!

Hillf

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2016-12-30  9:38             ` Hillf Danton
  0 siblings, 0 replies; 74+ messages in thread
From: Hillf Danton @ 2016-12-30  9:38 UTC (permalink / raw)
  To: 'Michal Hocko', 'Minchan Kim'
  Cc: linux-mm, 'Andrew Morton', 'Mel Gorman',
	'Johannes Weiner', 'Vlastimil Babka',
	'Rik van Riel', 'LKML'


On Friday, December 30, 2016 5:27 PM Michal Hocko wrote: 
> Anyway, what do you think about this updated patch? I have kept Hillf's
> A-b so please let me know if it is no longer valid.
> 
My mind is not changed:)

Happy new year folks!

Hillf

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2016-12-30  1:48         ` Minchan Kim
@ 2016-12-30  9:26           ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2016-12-30  9:26 UTC (permalink / raw)
  To: Minchan Kim, Hillf Danton
  Cc: linux-mm, Andrew Morton, Mel Gorman, Johannes Weiner,
	Vlastimil Babka, Rik van Riel, LKML

On Fri 30-12-16 10:48:53, Minchan Kim wrote:
> On Thu, Dec 29, 2016 at 08:52:46AM +0100, Michal Hocko wrote:
> > On Thu 29-12-16 14:33:59, Minchan Kim wrote:
> > > On Wed, Dec 28, 2016 at 04:30:27PM +0100, Michal Hocko wrote:
[...]
> > > > +TRACE_EVENT(mm_vmscan_lru_shrink_active,
> > > > +
> > > > +	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_freed,
> > > > +		unsigned long nr_unevictable, unsigned long nr_deactivated,
> > > > +		unsigned long nr_rotated, int priority, int file),
> > > > +
> > > > +	TP_ARGS(nid, nr_scanned, nr_freed, nr_unevictable, nr_deactivated, nr_rotated, priority, file),
> > > 
> > > I agree it is helpful. And it was when I investigated aging problem of 32bit
> > > when node-lru was introduced. However, the question is we really need all those
> > > kinds of information? just enough with nr_taken, nr_deactivated, priority, file?
> > 
> > Dunno. Is it harmful to add this information? I like it more when the
> > numbers just add up and you have a clear picture. You never know what
> > might be useful when debugging a weird behavior. 
> 
> Michal, I'm not huge fan of "might be useful" although it's a small piece of code.

But these are tracepoints. One of their primary reasons to exist is
to help debug things.  And it is really hard to predict what might be
useful in advance. It is not like the patch would export numbers which
would be irrelevant to the reclaim.

> It adds just all of kinds overheads (memory footprint, runtime performance,
> maintainance) without any proved benefit.

Does it really add any measurable overhead or the maintenance burden? I
think the only place we could argue about is free_hot_cold_page_list
which is used in hot paths.

I think we can sacrifice it. The same for culled unevictable
pages. We wouldn't know what is the missing part
nr_taken-(nr_activate+nr_deactivate) because it could be either freed or
moved to the unevictable list but that could be handled in a separate
tracepoint in putback_lru_page which sounds like a useful thing I guess.
 
> If we allow such things, people would start adding more things with just "why not,
> it might be useful. you never know the future" and it ends up making linux fiction
> novel mess.

I agree with this concern in general, but is this the case in this
particular case?

> If it's necessary, someday, someone will catch up and will send or ask patch with
> detailed description "why the stat is important and how it is good for us to solve
> some problem".

I can certainly enhance the changelog. See below.

> From that, we can learn workload, way to solve the problem and git
> history has the valuable description so new comers can keep the community up easily.
> So, finally, overheads are justified and get merged.
> 
> Please add must-have for your goal described.

My primary point is that tracepoints which do not give us a good picture
are quite useless and force us to add trace_printk or other means to
give us further information. Then I wonder why to have an incomplete
tracepoint at all.

Anyway, what do you think about this updated patch? I have kept Hillf's
A-b so please let me know if it is no longer valid.
--- 
>From 5f1bc22ad1e54050b4da3228d68945e70342ebb6 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Tue, 27 Dec 2016 13:18:20 +0100
Subject: [PATCH] mm, vmscan: add active list aging tracepoint

Our reclaim process has several tracepoints to tell us more about how
things are progressing. We are, however, missing a tracepoint to track
active list aging. Introduce mm_vmscan_lru_shrink_active which reports
the number of
	- nr_scanned, nr_taken pages to tell us the LRU isolation
	  effectiveness.
	- nr_rotated pages which tells us that we are hitting referenced
	  pages which are deactivated. If this is a large part of the
	  reported nr_deactivated pages then the active list is too small
	- nr_activated pages which tells us how many pages are keept on the
	  active list - mostly exec pages. A high number can indicate
	  that we might be trashing on executables.

Changes since v1
- report nr_taken pages as per Minchan
- report nr_activated as per Minchan
- do not report nr_freed pages because that would add a tiny overhead to
  free_hot_cold_page_list which is a hot path
- do not report nr_unevictable because we can report this number via a
  different and more generic tracepoint in putback_lru_page
- fix move_active_pages_to_lru to report proper page count when we hit
  into large pages

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/trace/events/vmscan.h | 38 ++++++++++++++++++++++++++++++++++++++
 mm/vmscan.c                   | 18 ++++++++++++++----
 2 files changed, 52 insertions(+), 4 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 39bad8921ca1..f9ef242ece1b 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -363,6 +363,44 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 		show_reclaim_flags(__entry->reclaim_flags))
 );
 
+TRACE_EVENT(mm_vmscan_lru_shrink_active,
+
+	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_taken,
+		unsigned long nr_activate, unsigned long nr_deactivated,
+		unsigned long nr_rotated, int priority, int file),
+
+	TP_ARGS(nid, nr_scanned, nr_taken, nr_activate, nr_deactivated, nr_rotated, priority, file),
+
+	TP_STRUCT__entry(
+		__field(int, nid)
+		__field(unsigned long, nr_scanned)
+		__field(unsigned long, nr_taken)
+		__field(unsigned long, nr_activate)
+		__field(unsigned long, nr_deactivated)
+		__field(unsigned long, nr_rotated)
+		__field(int, priority)
+		__field(int, reclaim_flags)
+	),
+
+	TP_fast_assign(
+		__entry->nid = nid;
+		__entry->nr_scanned = nr_scanned;
+		__entry->nr_taken = nr_taken;
+		__entry->nr_activate = nr_activate;
+		__entry->nr_deactivated = nr_deactivated;
+		__entry->nr_rotated = nr_rotated;
+		__entry->priority = priority;
+		__entry->reclaim_flags = trace_shrink_flags(file);
+	),
+
+	TP_printk("nid=%d nr_scanned=%ld nr_taken=%ld nr_activated=%ld nr_deactivated=%ld nr_rotated=%ld priority=%d flags=%s",
+		__entry->nid,
+		__entry->nr_scanned, __entry->nr_taken,
+		__entry->nr_activate, __entry->nr_deactivated, __entry->nr_rotated,
+		__entry->priority,
+		show_reclaim_flags(__entry->reclaim_flags))
+);
+
 #endif /* _TRACE_VMSCAN_H */
 
 /* This part must be outside protection */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c4abf08861d2..4da4d8d0496c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1846,9 +1846,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
  *
  * The downside is that we have to touch page->_refcount against each page.
  * But we had to alter page->flags anyway.
+ *
+ * Returns the number of pages moved to the given lru.
  */
 
-static void move_active_pages_to_lru(struct lruvec *lruvec,
+static unsigned move_active_pages_to_lru(struct lruvec *lruvec,
 				     struct list_head *list,
 				     struct list_head *pages_to_free,
 				     enum lru_list lru)
@@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
 	unsigned long pgmoved = 0;
 	struct page *page;
 	int nr_pages;
+	int nr_moved = 0;
 
 	while (!list_empty(list)) {
 		page = lru_to_page(list);
@@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
 				spin_lock_irq(&pgdat->lru_lock);
 			} else
 				list_add(&page->lru, pages_to_free);
+		} else {
+			nr_moved += nr_pages;
 		}
 	}
 
 	if (!is_active_lru(lru))
 		__count_vm_events(PGDEACTIVATE, pgmoved);
+
+	return nr_moved;
 }
 
 static void shrink_active_list(unsigned long nr_to_scan,
@@ -1902,7 +1909,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	LIST_HEAD(l_inactive);
 	struct page *page;
 	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
-	unsigned long nr_rotated = 0;
+	unsigned nr_deactivate, nr_activate;
+	unsigned nr_rotated = 0;
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
@@ -1980,13 +1988,15 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	 */
 	reclaim_stat->recent_rotated[file] += nr_rotated;
 
-	move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
-	move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
+	nr_activate = move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
+	nr_deactivate = move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
 	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
 	spin_unlock_irq(&pgdat->lru_lock);
 
 	mem_cgroup_uncharge_list(&l_hold);
 	free_hot_cold_page_list(&l_hold, true);
+	trace_mm_vmscan_lru_shrink_active(pgdat->node_id, nr_scanned, nr_taken,
+			nr_activate, nr_deactivate, nr_rotated, sc->priority, file);
 }
 
 /*
-- 
2.10.2


-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2016-12-30  9:26           ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2016-12-30  9:26 UTC (permalink / raw)
  To: Minchan Kim, Hillf Danton
  Cc: linux-mm, Andrew Morton, Mel Gorman, Johannes Weiner,
	Vlastimil Babka, Rik van Riel, LKML

On Fri 30-12-16 10:48:53, Minchan Kim wrote:
> On Thu, Dec 29, 2016 at 08:52:46AM +0100, Michal Hocko wrote:
> > On Thu 29-12-16 14:33:59, Minchan Kim wrote:
> > > On Wed, Dec 28, 2016 at 04:30:27PM +0100, Michal Hocko wrote:
[...]
> > > > +TRACE_EVENT(mm_vmscan_lru_shrink_active,
> > > > +
> > > > +	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_freed,
> > > > +		unsigned long nr_unevictable, unsigned long nr_deactivated,
> > > > +		unsigned long nr_rotated, int priority, int file),
> > > > +
> > > > +	TP_ARGS(nid, nr_scanned, nr_freed, nr_unevictable, nr_deactivated, nr_rotated, priority, file),
> > > 
> > > I agree it is helpful. And it was when I investigated aging problem of 32bit
> > > when node-lru was introduced. However, the question is we really need all those
> > > kinds of information? just enough with nr_taken, nr_deactivated, priority, file?
> > 
> > Dunno. Is it harmful to add this information? I like it more when the
> > numbers just add up and you have a clear picture. You never know what
> > might be useful when debugging a weird behavior. 
> 
> Michal, I'm not huge fan of "might be useful" although it's a small piece of code.

But these are tracepoints. One of their primary reasons to exist is
to help debug things.  And it is really hard to predict what might be
useful in advance. It is not like the patch would export numbers which
would be irrelevant to the reclaim.

> It adds just all of kinds overheads (memory footprint, runtime performance,
> maintainance) without any proved benefit.

Does it really add any measurable overhead or the maintenance burden? I
think the only place we could argue about is free_hot_cold_page_list
which is used in hot paths.

I think we can sacrifice it. The same for culled unevictable
pages. We wouldn't know what is the missing part
nr_taken-(nr_activate+nr_deactivate) because it could be either freed or
moved to the unevictable list but that could be handled in a separate
tracepoint in putback_lru_page which sounds like a useful thing I guess.
 
> If we allow such things, people would start adding more things with just "why not,
> it might be useful. you never know the future" and it ends up making linux fiction
> novel mess.

I agree with this concern in general, but is this the case in this
particular case?

> If it's necessary, someday, someone will catch up and will send or ask patch with
> detailed description "why the stat is important and how it is good for us to solve
> some problem".

I can certainly enhance the changelog. See below.

> From that, we can learn workload, way to solve the problem and git
> history has the valuable description so new comers can keep the community up easily.
> So, finally, overheads are justified and get merged.
> 
> Please add must-have for your goal described.

My primary point is that tracepoints which do not give us a good picture
are quite useless and force us to add trace_printk or other means to
give us further information. Then I wonder why to have an incomplete
tracepoint at all.

Anyway, what do you think about this updated patch? I have kept Hillf's
A-b so please let me know if it is no longer valid.
--- 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2016-12-29  7:52       ` Michal Hocko
@ 2016-12-30  1:48         ` Minchan Kim
  -1 siblings, 0 replies; 74+ messages in thread
From: Minchan Kim @ 2016-12-30  1:48 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Mel Gorman, Johannes Weiner,
	Vlastimil Babka, Rik van Riel, LKML

On Thu, Dec 29, 2016 at 08:52:46AM +0100, Michal Hocko wrote:
> On Thu 29-12-16 14:33:59, Minchan Kim wrote:
> > On Wed, Dec 28, 2016 at 04:30:27PM +0100, Michal Hocko wrote:
> > > From: Michal Hocko <mhocko@suse.com>
> > > 
> > > Our reclaim process has several tracepoints to tell us more about how
> > > things are progressing. We are, however, missing a tracepoint to track
> > > active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> > > the number of scanned, rotated, deactivated and freed pages from the
> > > particular node's active list.
> > > 
> > > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > > ---
> > >  include/linux/gfp.h           |  2 +-
> > >  include/trace/events/vmscan.h | 38 ++++++++++++++++++++++++++++++++++++++
> > >  mm/page_alloc.c               |  6 +++++-
> > >  mm/vmscan.c                   | 22 +++++++++++++++++-----
> > >  4 files changed, 61 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> > > index 4175dca4ac39..61aa9b49e86d 100644
> > > --- a/include/linux/gfp.h
> > > +++ b/include/linux/gfp.h
> > > @@ -503,7 +503,7 @@ void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask);
> > >  extern void __free_pages(struct page *page, unsigned int order);
> > >  extern void free_pages(unsigned long addr, unsigned int order);
> > >  extern void free_hot_cold_page(struct page *page, bool cold);
> > > -extern void free_hot_cold_page_list(struct list_head *list, bool cold);
> > > +extern int free_hot_cold_page_list(struct list_head *list, bool cold);
> > >  
> > >  struct page_frag_cache;
> > >  extern void __page_frag_drain(struct page *page, unsigned int order,
> > > diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> > > index 39bad8921ca1..d34cc0ced2be 100644
> > > --- a/include/trace/events/vmscan.h
> > > +++ b/include/trace/events/vmscan.h
> > > @@ -363,6 +363,44 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
> > >  		show_reclaim_flags(__entry->reclaim_flags))
> > >  );
> > >  
> > > +TRACE_EVENT(mm_vmscan_lru_shrink_active,
> > > +
> > > +	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_freed,
> > > +		unsigned long nr_unevictable, unsigned long nr_deactivated,
> > > +		unsigned long nr_rotated, int priority, int file),
> > > +
> > > +	TP_ARGS(nid, nr_scanned, nr_freed, nr_unevictable, nr_deactivated, nr_rotated, priority, file),
> > 
> > I agree it is helpful. And it was when I investigated aging problem of 32bit
> > when node-lru was introduced. However, the question is we really need all those
> > kinds of information? just enough with nr_taken, nr_deactivated, priority, file?
> 
> Dunno. Is it harmful to add this information? I like it more when the
> numbers just add up and you have a clear picture. You never know what
> might be useful when debugging a weird behavior. 

Michal, I'm not huge fan of "might be useful" although it's a small piece of code.
It adds just all of kinds overheads (memory footprint, runtime performance,
maintainance) without any proved benefit.

If we allow such things, people would start adding more things with just "why not,
it might be useful. you never know the future" and it ends up making linux fiction
novel mess.

If it's necessary, someday, someone will catch up and will send or ask patch with
detailed description "why the stat is important and how it is good for us to solve
some problem". From that, we can learn workload, way to solve the problem and git
history has the valuable description so new comers can keep the community up easily.
So, finally, overheads are justified and get merged.

Please add must-have for your goal described.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2016-12-30  1:48         ` Minchan Kim
  0 siblings, 0 replies; 74+ messages in thread
From: Minchan Kim @ 2016-12-30  1:48 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Mel Gorman, Johannes Weiner,
	Vlastimil Babka, Rik van Riel, LKML

On Thu, Dec 29, 2016 at 08:52:46AM +0100, Michal Hocko wrote:
> On Thu 29-12-16 14:33:59, Minchan Kim wrote:
> > On Wed, Dec 28, 2016 at 04:30:27PM +0100, Michal Hocko wrote:
> > > From: Michal Hocko <mhocko@suse.com>
> > > 
> > > Our reclaim process has several tracepoints to tell us more about how
> > > things are progressing. We are, however, missing a tracepoint to track
> > > active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> > > the number of scanned, rotated, deactivated and freed pages from the
> > > particular node's active list.
> > > 
> > > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > > ---
> > >  include/linux/gfp.h           |  2 +-
> > >  include/trace/events/vmscan.h | 38 ++++++++++++++++++++++++++++++++++++++
> > >  mm/page_alloc.c               |  6 +++++-
> > >  mm/vmscan.c                   | 22 +++++++++++++++++-----
> > >  4 files changed, 61 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> > > index 4175dca4ac39..61aa9b49e86d 100644
> > > --- a/include/linux/gfp.h
> > > +++ b/include/linux/gfp.h
> > > @@ -503,7 +503,7 @@ void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask);
> > >  extern void __free_pages(struct page *page, unsigned int order);
> > >  extern void free_pages(unsigned long addr, unsigned int order);
> > >  extern void free_hot_cold_page(struct page *page, bool cold);
> > > -extern void free_hot_cold_page_list(struct list_head *list, bool cold);
> > > +extern int free_hot_cold_page_list(struct list_head *list, bool cold);
> > >  
> > >  struct page_frag_cache;
> > >  extern void __page_frag_drain(struct page *page, unsigned int order,
> > > diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> > > index 39bad8921ca1..d34cc0ced2be 100644
> > > --- a/include/trace/events/vmscan.h
> > > +++ b/include/trace/events/vmscan.h
> > > @@ -363,6 +363,44 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
> > >  		show_reclaim_flags(__entry->reclaim_flags))
> > >  );
> > >  
> > > +TRACE_EVENT(mm_vmscan_lru_shrink_active,
> > > +
> > > +	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_freed,
> > > +		unsigned long nr_unevictable, unsigned long nr_deactivated,
> > > +		unsigned long nr_rotated, int priority, int file),
> > > +
> > > +	TP_ARGS(nid, nr_scanned, nr_freed, nr_unevictable, nr_deactivated, nr_rotated, priority, file),
> > 
> > I agree it is helpful. And it was when I investigated aging problem of 32bit
> > when node-lru was introduced. However, the question is we really need all those
> > kinds of information? just enough with nr_taken, nr_deactivated, priority, file?
> 
> Dunno. Is it harmful to add this information? I like it more when the
> numbers just add up and you have a clear picture. You never know what
> might be useful when debugging a weird behavior. 

Michal, I'm not huge fan of "might be useful" although it's a small piece of code.
It adds just all of kinds overheads (memory footprint, runtime performance,
maintainance) without any proved benefit.

If we allow such things, people would start adding more things with just "why not,
it might be useful. you never know the future" and it ends up making linux fiction
novel mess.

If it's necessary, someday, someone will catch up and will send or ask patch with
detailed description "why the stat is important and how it is good for us to solve
some problem". From that, we can learn workload, way to solve the problem and git
history has the valuable description so new comers can keep the community up easily.
So, finally, overheads are justified and get merged.

Please add must-have for your goal described.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2016-12-29  5:33     ` Minchan Kim
@ 2016-12-29  7:52       ` Michal Hocko
  -1 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2016-12-29  7:52 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-mm, Andrew Morton, Mel Gorman, Johannes Weiner,
	Vlastimil Babka, Rik van Riel, LKML

On Thu 29-12-16 14:33:59, Minchan Kim wrote:
> On Wed, Dec 28, 2016 at 04:30:27PM +0100, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > Our reclaim process has several tracepoints to tell us more about how
> > things are progressing. We are, however, missing a tracepoint to track
> > active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> > the number of scanned, rotated, deactivated and freed pages from the
> > particular node's active list.
> > 
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > ---
> >  include/linux/gfp.h           |  2 +-
> >  include/trace/events/vmscan.h | 38 ++++++++++++++++++++++++++++++++++++++
> >  mm/page_alloc.c               |  6 +++++-
> >  mm/vmscan.c                   | 22 +++++++++++++++++-----
> >  4 files changed, 61 insertions(+), 7 deletions(-)
> > 
> > diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> > index 4175dca4ac39..61aa9b49e86d 100644
> > --- a/include/linux/gfp.h
> > +++ b/include/linux/gfp.h
> > @@ -503,7 +503,7 @@ void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask);
> >  extern void __free_pages(struct page *page, unsigned int order);
> >  extern void free_pages(unsigned long addr, unsigned int order);
> >  extern void free_hot_cold_page(struct page *page, bool cold);
> > -extern void free_hot_cold_page_list(struct list_head *list, bool cold);
> > +extern int free_hot_cold_page_list(struct list_head *list, bool cold);
> >  
> >  struct page_frag_cache;
> >  extern void __page_frag_drain(struct page *page, unsigned int order,
> > diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> > index 39bad8921ca1..d34cc0ced2be 100644
> > --- a/include/trace/events/vmscan.h
> > +++ b/include/trace/events/vmscan.h
> > @@ -363,6 +363,44 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
> >  		show_reclaim_flags(__entry->reclaim_flags))
> >  );
> >  
> > +TRACE_EVENT(mm_vmscan_lru_shrink_active,
> > +
> > +	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_freed,
> > +		unsigned long nr_unevictable, unsigned long nr_deactivated,
> > +		unsigned long nr_rotated, int priority, int file),
> > +
> > +	TP_ARGS(nid, nr_scanned, nr_freed, nr_unevictable, nr_deactivated, nr_rotated, priority, file),
> 
> I agree it is helpful. And it was when I investigated aging problem of 32bit
> when node-lru was introduced. However, the question is we really need all those
> kinds of information? just enough with nr_taken, nr_deactivated, priority, file?

Dunno. Is it harmful to add this information? I like it more when the
numbers just add up and you have a clear picture. You never know what
might be useful when debugging a weird behavior. 

[...]
> > -	move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
> > -	move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
> > +	nr_activate = move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
> 
> Who use nr_active in here?

this is an omission. I just forgot to add it... Thanks for noticing.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2016-12-29  7:52       ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2016-12-29  7:52 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-mm, Andrew Morton, Mel Gorman, Johannes Weiner,
	Vlastimil Babka, Rik van Riel, LKML

On Thu 29-12-16 14:33:59, Minchan Kim wrote:
> On Wed, Dec 28, 2016 at 04:30:27PM +0100, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > Our reclaim process has several tracepoints to tell us more about how
> > things are progressing. We are, however, missing a tracepoint to track
> > active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> > the number of scanned, rotated, deactivated and freed pages from the
> > particular node's active list.
> > 
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > ---
> >  include/linux/gfp.h           |  2 +-
> >  include/trace/events/vmscan.h | 38 ++++++++++++++++++++++++++++++++++++++
> >  mm/page_alloc.c               |  6 +++++-
> >  mm/vmscan.c                   | 22 +++++++++++++++++-----
> >  4 files changed, 61 insertions(+), 7 deletions(-)
> > 
> > diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> > index 4175dca4ac39..61aa9b49e86d 100644
> > --- a/include/linux/gfp.h
> > +++ b/include/linux/gfp.h
> > @@ -503,7 +503,7 @@ void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask);
> >  extern void __free_pages(struct page *page, unsigned int order);
> >  extern void free_pages(unsigned long addr, unsigned int order);
> >  extern void free_hot_cold_page(struct page *page, bool cold);
> > -extern void free_hot_cold_page_list(struct list_head *list, bool cold);
> > +extern int free_hot_cold_page_list(struct list_head *list, bool cold);
> >  
> >  struct page_frag_cache;
> >  extern void __page_frag_drain(struct page *page, unsigned int order,
> > diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> > index 39bad8921ca1..d34cc0ced2be 100644
> > --- a/include/trace/events/vmscan.h
> > +++ b/include/trace/events/vmscan.h
> > @@ -363,6 +363,44 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
> >  		show_reclaim_flags(__entry->reclaim_flags))
> >  );
> >  
> > +TRACE_EVENT(mm_vmscan_lru_shrink_active,
> > +
> > +	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_freed,
> > +		unsigned long nr_unevictable, unsigned long nr_deactivated,
> > +		unsigned long nr_rotated, int priority, int file),
> > +
> > +	TP_ARGS(nid, nr_scanned, nr_freed, nr_unevictable, nr_deactivated, nr_rotated, priority, file),
> 
> I agree it is helpful. And it was when I investigated aging problem of 32bit
> when node-lru was introduced. However, the question is we really need all those
> kinds of information? just enough with nr_taken, nr_deactivated, priority, file?

Dunno. Is it harmful to add this information? I like it more when the
numbers just add up and you have a clear picture. You never know what
might be useful when debugging a weird behavior. 

[...]
> > -	move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
> > -	move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
> > +	nr_activate = move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
> 
> Who use nr_active in here?

this is an omission. I just forgot to add it... Thanks for noticing.

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2016-12-28 15:30   ` Michal Hocko
@ 2016-12-29  7:44     ` Hillf Danton
  -1 siblings, 0 replies; 74+ messages in thread
From: Hillf Danton @ 2016-12-29  7:44 UTC (permalink / raw)
  To: 'Michal Hocko', linux-mm
  Cc: 'Andrew Morton', 'Mel Gorman',
	'Johannes Weiner', 'Vlastimil Babka',
	'Rik van Riel', 'LKML', 'Michal Hocko'


On Wednesday, December 28, 2016 11:30 PM Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> Our reclaim process has several tracepoints to tell us more about how
> things are progressing. We are, however, missing a tracepoint to track
> active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> the number of scanned, rotated, deactivated and freed pages from the
> particular node's active list.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2016-12-29  7:44     ` Hillf Danton
  0 siblings, 0 replies; 74+ messages in thread
From: Hillf Danton @ 2016-12-29  7:44 UTC (permalink / raw)
  To: 'Michal Hocko', linux-mm
  Cc: 'Andrew Morton', 'Mel Gorman',
	'Johannes Weiner', 'Vlastimil Babka',
	'Rik van Riel', 'LKML', 'Michal Hocko'


On Wednesday, December 28, 2016 11:30 PM Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> Our reclaim process has several tracepoints to tell us more about how
> things are progressing. We are, however, missing a tracepoint to track
> active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> the number of scanned, rotated, deactivated and freed pages from the
> particular node's active list.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> 


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2016-12-28 15:30   ` Michal Hocko
@ 2016-12-29  5:33     ` Minchan Kim
  -1 siblings, 0 replies; 74+ messages in thread
From: Minchan Kim @ 2016-12-29  5:33 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Mel Gorman, Johannes Weiner,
	Vlastimil Babka, Rik van Riel, LKML, Michal Hocko

On Wed, Dec 28, 2016 at 04:30:27PM +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> Our reclaim process has several tracepoints to tell us more about how
> things are progressing. We are, however, missing a tracepoint to track
> active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> the number of scanned, rotated, deactivated and freed pages from the
> particular node's active list.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  include/linux/gfp.h           |  2 +-
>  include/trace/events/vmscan.h | 38 ++++++++++++++++++++++++++++++++++++++
>  mm/page_alloc.c               |  6 +++++-
>  mm/vmscan.c                   | 22 +++++++++++++++++-----
>  4 files changed, 61 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 4175dca4ac39..61aa9b49e86d 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -503,7 +503,7 @@ void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask);
>  extern void __free_pages(struct page *page, unsigned int order);
>  extern void free_pages(unsigned long addr, unsigned int order);
>  extern void free_hot_cold_page(struct page *page, bool cold);
> -extern void free_hot_cold_page_list(struct list_head *list, bool cold);
> +extern int free_hot_cold_page_list(struct list_head *list, bool cold);
>  
>  struct page_frag_cache;
>  extern void __page_frag_drain(struct page *page, unsigned int order,
> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> index 39bad8921ca1..d34cc0ced2be 100644
> --- a/include/trace/events/vmscan.h
> +++ b/include/trace/events/vmscan.h
> @@ -363,6 +363,44 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
>  		show_reclaim_flags(__entry->reclaim_flags))
>  );
>  
> +TRACE_EVENT(mm_vmscan_lru_shrink_active,
> +
> +	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_freed,
> +		unsigned long nr_unevictable, unsigned long nr_deactivated,
> +		unsigned long nr_rotated, int priority, int file),
> +
> +	TP_ARGS(nid, nr_scanned, nr_freed, nr_unevictable, nr_deactivated, nr_rotated, priority, file),

I agree it is helpful. And it was when I investigated aging problem of 32bit
when node-lru was introduced. However, the question is we really need all those
kinds of information? just enough with nr_taken, nr_deactivated, priority, file?

Also, look at minor thing below.

Thanks.

> +
> +	TP_STRUCT__entry(
> +		__field(int, nid)
> +		__field(unsigned long, nr_scanned)
> +		__field(unsigned long, nr_freed)
> +		__field(unsigned long, nr_unevictable)
> +		__field(unsigned long, nr_deactivated)
> +		__field(unsigned long, nr_rotated)
> +		__field(int, priority)
> +		__field(int, reclaim_flags)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->nid = nid;
> +		__entry->nr_scanned = nr_scanned;
> +		__entry->nr_freed = nr_freed;
> +		__entry->nr_unevictable = nr_unevictable;
> +		__entry->nr_deactivated = nr_deactivated;
> +		__entry->nr_rotated = nr_rotated;
> +		__entry->priority = priority;
> +		__entry->reclaim_flags = trace_shrink_flags(file);
> +	),
> +
> +	TP_printk("nid=%d nr_scanned=%ld nr_freed=%ld nr_unevictable=%ld nr_deactivated=%ld nr_rotated=%ld priority=%d flags=%s",
> +		__entry->nid,
> +		__entry->nr_scanned, __entry->nr_freed, __entry->nr_unevictable,
> +		__entry->nr_deactivated, __entry->nr_rotated,
> +		__entry->priority,
> +		show_reclaim_flags(__entry->reclaim_flags))
> +);
> +
>  #endif /* _TRACE_VMSCAN_H */
>  
>  /* This part must be outside protection */
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 1c24112308d6..77d204660857 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2487,14 +2487,18 @@ void free_hot_cold_page(struct page *page, bool cold)
>  /*
>   * Free a list of 0-order pages
>   */
> -void free_hot_cold_page_list(struct list_head *list, bool cold)
> +int free_hot_cold_page_list(struct list_head *list, bool cold)
>  {
>  	struct page *page, *next;
> +	int ret = 0;
>  
>  	list_for_each_entry_safe(page, next, list, lru) {
>  		trace_mm_page_free_batched(page, cold);
>  		free_hot_cold_page(page, cold);
> +		ret++;
>  	}
> +
> +	return ret;
>  }
>  
>  /*
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c4abf08861d2..2302a1a58c6e 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1846,9 +1846,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
>   *
>   * The downside is that we have to touch page->_refcount against each page.
>   * But we had to alter page->flags anyway.
> + *
> + * Returns the number of pages moved to the given lru.
>   */
>  
> -static void move_active_pages_to_lru(struct lruvec *lruvec,
> +static int move_active_pages_to_lru(struct lruvec *lruvec,
>  				     struct list_head *list,
>  				     struct list_head *pages_to_free,
>  				     enum lru_list lru)
> @@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
>  	unsigned long pgmoved = 0;
>  	struct page *page;
>  	int nr_pages;
> +	int nr_moved = 0;
>  
>  	while (!list_empty(list)) {
>  		page = lru_to_page(list);
> @@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
>  				spin_lock_irq(&pgdat->lru_lock);
>  			} else
>  				list_add(&page->lru, pages_to_free);
> +		} else {
> +			nr_moved++;
>  		}
>  	}
>  
>  	if (!is_active_lru(lru))
>  		__count_vm_events(PGDEACTIVATE, pgmoved);
> +
> +	return nr_moved;
>  }
>  
>  static void shrink_active_list(unsigned long nr_to_scan,
> @@ -1902,7 +1909,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  	LIST_HEAD(l_inactive);
>  	struct page *page;
>  	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
> -	unsigned long nr_rotated = 0;
> +	unsigned long nr_rotated = 0, nr_unevictable = 0;
> +	unsigned long nr_freed, nr_deactivate, nr_activate;
>  	isolate_mode_t isolate_mode = 0;
>  	int file = is_file_lru(lru);
>  	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
> @@ -1935,6 +1943,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  
>  		if (unlikely(!page_evictable(page))) {
>  			putback_lru_page(page);
> +			nr_unevictable++;
>  			continue;
>  		}
>  
> @@ -1980,13 +1989,16 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  	 */
>  	reclaim_stat->recent_rotated[file] += nr_rotated;
>  
> -	move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
> -	move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
> +	nr_activate = move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);

Who use nr_active in here?

> +	nr_deactivate = move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
>  	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
>  	spin_unlock_irq(&pgdat->lru_lock);
>  
>  	mem_cgroup_uncharge_list(&l_hold);
> -	free_hot_cold_page_list(&l_hold, true);
> +	nr_freed = free_hot_cold_page_list(&l_hold, true);
> +	trace_mm_vmscan_lru_shrink_active(pgdat->node_id, nr_scanned, nr_freed,
> +			nr_unevictable, nr_deactivate, nr_rotated,
> +			sc->priority, file);
>  }
>  
>  /*
> -- 
> 2.10.2
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2016-12-29  5:33     ` Minchan Kim
  0 siblings, 0 replies; 74+ messages in thread
From: Minchan Kim @ 2016-12-29  5:33 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Mel Gorman, Johannes Weiner,
	Vlastimil Babka, Rik van Riel, LKML, Michal Hocko

On Wed, Dec 28, 2016 at 04:30:27PM +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> Our reclaim process has several tracepoints to tell us more about how
> things are progressing. We are, however, missing a tracepoint to track
> active list aging. Introduce mm_vmscan_lru_shrink_active which reports
> the number of scanned, rotated, deactivated and freed pages from the
> particular node's active list.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  include/linux/gfp.h           |  2 +-
>  include/trace/events/vmscan.h | 38 ++++++++++++++++++++++++++++++++++++++
>  mm/page_alloc.c               |  6 +++++-
>  mm/vmscan.c                   | 22 +++++++++++++++++-----
>  4 files changed, 61 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 4175dca4ac39..61aa9b49e86d 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -503,7 +503,7 @@ void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask);
>  extern void __free_pages(struct page *page, unsigned int order);
>  extern void free_pages(unsigned long addr, unsigned int order);
>  extern void free_hot_cold_page(struct page *page, bool cold);
> -extern void free_hot_cold_page_list(struct list_head *list, bool cold);
> +extern int free_hot_cold_page_list(struct list_head *list, bool cold);
>  
>  struct page_frag_cache;
>  extern void __page_frag_drain(struct page *page, unsigned int order,
> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> index 39bad8921ca1..d34cc0ced2be 100644
> --- a/include/trace/events/vmscan.h
> +++ b/include/trace/events/vmscan.h
> @@ -363,6 +363,44 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
>  		show_reclaim_flags(__entry->reclaim_flags))
>  );
>  
> +TRACE_EVENT(mm_vmscan_lru_shrink_active,
> +
> +	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_freed,
> +		unsigned long nr_unevictable, unsigned long nr_deactivated,
> +		unsigned long nr_rotated, int priority, int file),
> +
> +	TP_ARGS(nid, nr_scanned, nr_freed, nr_unevictable, nr_deactivated, nr_rotated, priority, file),

I agree it is helpful. And it was when I investigated aging problem of 32bit
when node-lru was introduced. However, the question is we really need all those
kinds of information? just enough with nr_taken, nr_deactivated, priority, file?

Also, look at minor thing below.

Thanks.

> +
> +	TP_STRUCT__entry(
> +		__field(int, nid)
> +		__field(unsigned long, nr_scanned)
> +		__field(unsigned long, nr_freed)
> +		__field(unsigned long, nr_unevictable)
> +		__field(unsigned long, nr_deactivated)
> +		__field(unsigned long, nr_rotated)
> +		__field(int, priority)
> +		__field(int, reclaim_flags)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->nid = nid;
> +		__entry->nr_scanned = nr_scanned;
> +		__entry->nr_freed = nr_freed;
> +		__entry->nr_unevictable = nr_unevictable;
> +		__entry->nr_deactivated = nr_deactivated;
> +		__entry->nr_rotated = nr_rotated;
> +		__entry->priority = priority;
> +		__entry->reclaim_flags = trace_shrink_flags(file);
> +	),
> +
> +	TP_printk("nid=%d nr_scanned=%ld nr_freed=%ld nr_unevictable=%ld nr_deactivated=%ld nr_rotated=%ld priority=%d flags=%s",
> +		__entry->nid,
> +		__entry->nr_scanned, __entry->nr_freed, __entry->nr_unevictable,
> +		__entry->nr_deactivated, __entry->nr_rotated,
> +		__entry->priority,
> +		show_reclaim_flags(__entry->reclaim_flags))
> +);
> +
>  #endif /* _TRACE_VMSCAN_H */
>  
>  /* This part must be outside protection */
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 1c24112308d6..77d204660857 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2487,14 +2487,18 @@ void free_hot_cold_page(struct page *page, bool cold)
>  /*
>   * Free a list of 0-order pages
>   */
> -void free_hot_cold_page_list(struct list_head *list, bool cold)
> +int free_hot_cold_page_list(struct list_head *list, bool cold)
>  {
>  	struct page *page, *next;
> +	int ret = 0;
>  
>  	list_for_each_entry_safe(page, next, list, lru) {
>  		trace_mm_page_free_batched(page, cold);
>  		free_hot_cold_page(page, cold);
> +		ret++;
>  	}
> +
> +	return ret;
>  }
>  
>  /*
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c4abf08861d2..2302a1a58c6e 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1846,9 +1846,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
>   *
>   * The downside is that we have to touch page->_refcount against each page.
>   * But we had to alter page->flags anyway.
> + *
> + * Returns the number of pages moved to the given lru.
>   */
>  
> -static void move_active_pages_to_lru(struct lruvec *lruvec,
> +static int move_active_pages_to_lru(struct lruvec *lruvec,
>  				     struct list_head *list,
>  				     struct list_head *pages_to_free,
>  				     enum lru_list lru)
> @@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
>  	unsigned long pgmoved = 0;
>  	struct page *page;
>  	int nr_pages;
> +	int nr_moved = 0;
>  
>  	while (!list_empty(list)) {
>  		page = lru_to_page(list);
> @@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
>  				spin_lock_irq(&pgdat->lru_lock);
>  			} else
>  				list_add(&page->lru, pages_to_free);
> +		} else {
> +			nr_moved++;
>  		}
>  	}
>  
>  	if (!is_active_lru(lru))
>  		__count_vm_events(PGDEACTIVATE, pgmoved);
> +
> +	return nr_moved;
>  }
>  
>  static void shrink_active_list(unsigned long nr_to_scan,
> @@ -1902,7 +1909,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  	LIST_HEAD(l_inactive);
>  	struct page *page;
>  	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
> -	unsigned long nr_rotated = 0;
> +	unsigned long nr_rotated = 0, nr_unevictable = 0;
> +	unsigned long nr_freed, nr_deactivate, nr_activate;
>  	isolate_mode_t isolate_mode = 0;
>  	int file = is_file_lru(lru);
>  	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
> @@ -1935,6 +1943,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  
>  		if (unlikely(!page_evictable(page))) {
>  			putback_lru_page(page);
> +			nr_unevictable++;
>  			continue;
>  		}
>  
> @@ -1980,13 +1989,16 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  	 */
>  	reclaim_stat->recent_rotated[file] += nr_rotated;
>  
> -	move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
> -	move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
> +	nr_activate = move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);

Who use nr_active in here?

> +	nr_deactivate = move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
>  	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
>  	spin_unlock_irq(&pgdat->lru_lock);
>  
>  	mem_cgroup_uncharge_list(&l_hold);
> -	free_hot_cold_page_list(&l_hold, true);
> +	nr_freed = free_hot_cold_page_list(&l_hold, true);
> +	trace_mm_vmscan_lru_shrink_active(pgdat->node_id, nr_scanned, nr_freed,
> +			nr_unevictable, nr_deactivate, nr_rotated,
> +			sc->priority, file);
>  }
>  
>  /*
> -- 
> 2.10.2
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 2/7] mm, vmscan: add active list aging tracepoint
  2016-12-28 15:30 [PATCH 0/7] " Michal Hocko
@ 2016-12-28 15:30   ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2016-12-28 15:30 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Mel Gorman, Johannes Weiner, Vlastimil Babka,
	Rik van Riel, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

Our reclaim process has several tracepoints to tell us more about how
things are progressing. We are, however, missing a tracepoint to track
active list aging. Introduce mm_vmscan_lru_shrink_active which reports
the number of scanned, rotated, deactivated and freed pages from the
particular node's active list.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/linux/gfp.h           |  2 +-
 include/trace/events/vmscan.h | 38 ++++++++++++++++++++++++++++++++++++++
 mm/page_alloc.c               |  6 +++++-
 mm/vmscan.c                   | 22 +++++++++++++++++-----
 4 files changed, 61 insertions(+), 7 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 4175dca4ac39..61aa9b49e86d 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -503,7 +503,7 @@ void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask);
 extern void __free_pages(struct page *page, unsigned int order);
 extern void free_pages(unsigned long addr, unsigned int order);
 extern void free_hot_cold_page(struct page *page, bool cold);
-extern void free_hot_cold_page_list(struct list_head *list, bool cold);
+extern int free_hot_cold_page_list(struct list_head *list, bool cold);
 
 struct page_frag_cache;
 extern void __page_frag_drain(struct page *page, unsigned int order,
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 39bad8921ca1..d34cc0ced2be 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -363,6 +363,44 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 		show_reclaim_flags(__entry->reclaim_flags))
 );
 
+TRACE_EVENT(mm_vmscan_lru_shrink_active,
+
+	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_freed,
+		unsigned long nr_unevictable, unsigned long nr_deactivated,
+		unsigned long nr_rotated, int priority, int file),
+
+	TP_ARGS(nid, nr_scanned, nr_freed, nr_unevictable, nr_deactivated, nr_rotated, priority, file),
+
+	TP_STRUCT__entry(
+		__field(int, nid)
+		__field(unsigned long, nr_scanned)
+		__field(unsigned long, nr_freed)
+		__field(unsigned long, nr_unevictable)
+		__field(unsigned long, nr_deactivated)
+		__field(unsigned long, nr_rotated)
+		__field(int, priority)
+		__field(int, reclaim_flags)
+	),
+
+	TP_fast_assign(
+		__entry->nid = nid;
+		__entry->nr_scanned = nr_scanned;
+		__entry->nr_freed = nr_freed;
+		__entry->nr_unevictable = nr_unevictable;
+		__entry->nr_deactivated = nr_deactivated;
+		__entry->nr_rotated = nr_rotated;
+		__entry->priority = priority;
+		__entry->reclaim_flags = trace_shrink_flags(file);
+	),
+
+	TP_printk("nid=%d nr_scanned=%ld nr_freed=%ld nr_unevictable=%ld nr_deactivated=%ld nr_rotated=%ld priority=%d flags=%s",
+		__entry->nid,
+		__entry->nr_scanned, __entry->nr_freed, __entry->nr_unevictable,
+		__entry->nr_deactivated, __entry->nr_rotated,
+		__entry->priority,
+		show_reclaim_flags(__entry->reclaim_flags))
+);
+
 #endif /* _TRACE_VMSCAN_H */
 
 /* This part must be outside protection */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1c24112308d6..77d204660857 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2487,14 +2487,18 @@ void free_hot_cold_page(struct page *page, bool cold)
 /*
  * Free a list of 0-order pages
  */
-void free_hot_cold_page_list(struct list_head *list, bool cold)
+int free_hot_cold_page_list(struct list_head *list, bool cold)
 {
 	struct page *page, *next;
+	int ret = 0;
 
 	list_for_each_entry_safe(page, next, list, lru) {
 		trace_mm_page_free_batched(page, cold);
 		free_hot_cold_page(page, cold);
+		ret++;
 	}
+
+	return ret;
 }
 
 /*
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c4abf08861d2..2302a1a58c6e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1846,9 +1846,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
  *
  * The downside is that we have to touch page->_refcount against each page.
  * But we had to alter page->flags anyway.
+ *
+ * Returns the number of pages moved to the given lru.
  */
 
-static void move_active_pages_to_lru(struct lruvec *lruvec,
+static int move_active_pages_to_lru(struct lruvec *lruvec,
 				     struct list_head *list,
 				     struct list_head *pages_to_free,
 				     enum lru_list lru)
@@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
 	unsigned long pgmoved = 0;
 	struct page *page;
 	int nr_pages;
+	int nr_moved = 0;
 
 	while (!list_empty(list)) {
 		page = lru_to_page(list);
@@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
 				spin_lock_irq(&pgdat->lru_lock);
 			} else
 				list_add(&page->lru, pages_to_free);
+		} else {
+			nr_moved++;
 		}
 	}
 
 	if (!is_active_lru(lru))
 		__count_vm_events(PGDEACTIVATE, pgmoved);
+
+	return nr_moved;
 }
 
 static void shrink_active_list(unsigned long nr_to_scan,
@@ -1902,7 +1909,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	LIST_HEAD(l_inactive);
 	struct page *page;
 	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
-	unsigned long nr_rotated = 0;
+	unsigned long nr_rotated = 0, nr_unevictable = 0;
+	unsigned long nr_freed, nr_deactivate, nr_activate;
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
@@ -1935,6 +1943,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
 
 		if (unlikely(!page_evictable(page))) {
 			putback_lru_page(page);
+			nr_unevictable++;
 			continue;
 		}
 
@@ -1980,13 +1989,16 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	 */
 	reclaim_stat->recent_rotated[file] += nr_rotated;
 
-	move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
-	move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
+	nr_activate = move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
+	nr_deactivate = move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
 	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
 	spin_unlock_irq(&pgdat->lru_lock);
 
 	mem_cgroup_uncharge_list(&l_hold);
-	free_hot_cold_page_list(&l_hold, true);
+	nr_freed = free_hot_cold_page_list(&l_hold, true);
+	trace_mm_vmscan_lru_shrink_active(pgdat->node_id, nr_scanned, nr_freed,
+			nr_unevictable, nr_deactivate, nr_rotated,
+			sc->priority, file);
 }
 
 /*
-- 
2.10.2

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 2/7] mm, vmscan: add active list aging tracepoint
@ 2016-12-28 15:30   ` Michal Hocko
  0 siblings, 0 replies; 74+ messages in thread
From: Michal Hocko @ 2016-12-28 15:30 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Mel Gorman, Johannes Weiner, Vlastimil Babka,
	Rik van Riel, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

Our reclaim process has several tracepoints to tell us more about how
things are progressing. We are, however, missing a tracepoint to track
active list aging. Introduce mm_vmscan_lru_shrink_active which reports
the number of scanned, rotated, deactivated and freed pages from the
particular node's active list.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/linux/gfp.h           |  2 +-
 include/trace/events/vmscan.h | 38 ++++++++++++++++++++++++++++++++++++++
 mm/page_alloc.c               |  6 +++++-
 mm/vmscan.c                   | 22 +++++++++++++++++-----
 4 files changed, 61 insertions(+), 7 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 4175dca4ac39..61aa9b49e86d 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -503,7 +503,7 @@ void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask);
 extern void __free_pages(struct page *page, unsigned int order);
 extern void free_pages(unsigned long addr, unsigned int order);
 extern void free_hot_cold_page(struct page *page, bool cold);
-extern void free_hot_cold_page_list(struct list_head *list, bool cold);
+extern int free_hot_cold_page_list(struct list_head *list, bool cold);
 
 struct page_frag_cache;
 extern void __page_frag_drain(struct page *page, unsigned int order,
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 39bad8921ca1..d34cc0ced2be 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -363,6 +363,44 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 		show_reclaim_flags(__entry->reclaim_flags))
 );
 
+TRACE_EVENT(mm_vmscan_lru_shrink_active,
+
+	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_freed,
+		unsigned long nr_unevictable, unsigned long nr_deactivated,
+		unsigned long nr_rotated, int priority, int file),
+
+	TP_ARGS(nid, nr_scanned, nr_freed, nr_unevictable, nr_deactivated, nr_rotated, priority, file),
+
+	TP_STRUCT__entry(
+		__field(int, nid)
+		__field(unsigned long, nr_scanned)
+		__field(unsigned long, nr_freed)
+		__field(unsigned long, nr_unevictable)
+		__field(unsigned long, nr_deactivated)
+		__field(unsigned long, nr_rotated)
+		__field(int, priority)
+		__field(int, reclaim_flags)
+	),
+
+	TP_fast_assign(
+		__entry->nid = nid;
+		__entry->nr_scanned = nr_scanned;
+		__entry->nr_freed = nr_freed;
+		__entry->nr_unevictable = nr_unevictable;
+		__entry->nr_deactivated = nr_deactivated;
+		__entry->nr_rotated = nr_rotated;
+		__entry->priority = priority;
+		__entry->reclaim_flags = trace_shrink_flags(file);
+	),
+
+	TP_printk("nid=%d nr_scanned=%ld nr_freed=%ld nr_unevictable=%ld nr_deactivated=%ld nr_rotated=%ld priority=%d flags=%s",
+		__entry->nid,
+		__entry->nr_scanned, __entry->nr_freed, __entry->nr_unevictable,
+		__entry->nr_deactivated, __entry->nr_rotated,
+		__entry->priority,
+		show_reclaim_flags(__entry->reclaim_flags))
+);
+
 #endif /* _TRACE_VMSCAN_H */
 
 /* This part must be outside protection */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1c24112308d6..77d204660857 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2487,14 +2487,18 @@ void free_hot_cold_page(struct page *page, bool cold)
 /*
  * Free a list of 0-order pages
  */
-void free_hot_cold_page_list(struct list_head *list, bool cold)
+int free_hot_cold_page_list(struct list_head *list, bool cold)
 {
 	struct page *page, *next;
+	int ret = 0;
 
 	list_for_each_entry_safe(page, next, list, lru) {
 		trace_mm_page_free_batched(page, cold);
 		free_hot_cold_page(page, cold);
+		ret++;
 	}
+
+	return ret;
 }
 
 /*
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c4abf08861d2..2302a1a58c6e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1846,9 +1846,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
  *
  * The downside is that we have to touch page->_refcount against each page.
  * But we had to alter page->flags anyway.
+ *
+ * Returns the number of pages moved to the given lru.
  */
 
-static void move_active_pages_to_lru(struct lruvec *lruvec,
+static int move_active_pages_to_lru(struct lruvec *lruvec,
 				     struct list_head *list,
 				     struct list_head *pages_to_free,
 				     enum lru_list lru)
@@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
 	unsigned long pgmoved = 0;
 	struct page *page;
 	int nr_pages;
+	int nr_moved = 0;
 
 	while (!list_empty(list)) {
 		page = lru_to_page(list);
@@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
 				spin_lock_irq(&pgdat->lru_lock);
 			} else
 				list_add(&page->lru, pages_to_free);
+		} else {
+			nr_moved++;
 		}
 	}
 
 	if (!is_active_lru(lru))
 		__count_vm_events(PGDEACTIVATE, pgmoved);
+
+	return nr_moved;
 }
 
 static void shrink_active_list(unsigned long nr_to_scan,
@@ -1902,7 +1909,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	LIST_HEAD(l_inactive);
 	struct page *page;
 	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
-	unsigned long nr_rotated = 0;
+	unsigned long nr_rotated = 0, nr_unevictable = 0;
+	unsigned long nr_freed, nr_deactivate, nr_activate;
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
@@ -1935,6 +1943,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
 
 		if (unlikely(!page_evictable(page))) {
 			putback_lru_page(page);
+			nr_unevictable++;
 			continue;
 		}
 
@@ -1980,13 +1989,16 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	 */
 	reclaim_stat->recent_rotated[file] += nr_rotated;
 
-	move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
-	move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
+	nr_activate = move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
+	nr_deactivate = move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
 	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
 	spin_unlock_irq(&pgdat->lru_lock);
 
 	mem_cgroup_uncharge_list(&l_hold);
-	free_hot_cold_page_list(&l_hold, true);
+	nr_freed = free_hot_cold_page_list(&l_hold, true);
+	trace_mm_vmscan_lru_shrink_active(pgdat->node_id, nr_scanned, nr_freed,
+			nr_unevictable, nr_deactivate, nr_rotated,
+			sc->priority, file);
 }
 
 /*
-- 
2.10.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 74+ messages in thread

end of thread, other threads:[~2017-01-05 15:18 UTC | newest]

Thread overview: 74+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-04 10:19 [PATCH 0/7 v2] vm, vmscan: enahance vmscan tracepoints Michal Hocko
2017-01-04 10:19 ` Michal Hocko
2017-01-04 10:19 ` [PATCH 1/7] mm, vmscan: remove unused mm_vmscan_memcg_isolate Michal Hocko
2017-01-04 10:19   ` Michal Hocko
2017-01-04 10:19 ` [PATCH 2/7] mm, vmscan: add active list aging tracepoint Michal Hocko
2017-01-04 10:19   ` Michal Hocko
2017-01-04 12:52   ` Vlastimil Babka
2017-01-04 12:52     ` Vlastimil Babka
2017-01-04 13:16     ` Michal Hocko
2017-01-04 13:16       ` Michal Hocko
2017-01-04 13:34       ` Vlastimil Babka
2017-01-04 13:34         ` Vlastimil Babka
2017-01-04 13:52   ` Michal Hocko
2017-01-04 13:52     ` Michal Hocko
2017-01-05  5:41     ` Minchan Kim
2017-01-05  5:41       ` Minchan Kim
2017-01-04 10:19 ` [PATCH 3/7] mm, vmscan: show the number of skipped pages in mm_vmscan_lru_isolate Michal Hocko
2017-01-04 10:19   ` Michal Hocko
2017-01-04 10:19 ` [PATCH 4/7] mm, vmscan: show LRU name in mm_vmscan_lru_isolate tracepoint Michal Hocko
2017-01-04 10:19   ` Michal Hocko
2017-01-05  6:04   ` Minchan Kim
2017-01-05  6:04     ` Minchan Kim
2017-01-05 10:16     ` Michal Hocko
2017-01-05 10:16       ` Michal Hocko
2017-01-05 14:56       ` Mel Gorman
2017-01-05 14:56         ` Mel Gorman
2017-01-05 15:17         ` Michal Hocko
2017-01-05 15:17           ` Michal Hocko
2017-01-04 10:19 ` [PATCH 5/7] mm, vmscan: extract shrink_page_list reclaim counters into a struct Michal Hocko
2017-01-04 10:19   ` Michal Hocko
2017-01-04 14:51   ` Vlastimil Babka
2017-01-04 14:51     ` Vlastimil Babka
2017-01-04 15:09     ` Michal Hocko
2017-01-04 15:09       ` Michal Hocko
2017-01-04 10:19 ` [PATCH 6/7] mm, vmscan: enhance mm_vmscan_lru_shrink_inactive tracepoint Michal Hocko
2017-01-04 10:19   ` Michal Hocko
2017-01-04 10:19 ` [PATCH 7/7] mm, vmscan: add mm_vmscan_inactive_list_is_low tracepoint Michal Hocko
2017-01-04 10:19   ` Michal Hocko
2017-01-04 10:30 ` [PATCH 0/7 v2] vm, vmscan: enahance vmscan tracepoints Michal Hocko
2017-01-04 10:30   ` Michal Hocko
2017-01-05  8:25 ` Vlastimil Babka
2017-01-05  8:25   ` Vlastimil Babka
2017-01-05 10:39 ` Michal Hocko
2017-01-05 10:39   ` Michal Hocko
  -- strict thread matches above, loose matches on Subject: below --
2016-12-28 15:30 [PATCH 0/7] " Michal Hocko
2016-12-28 15:30 ` [PATCH 2/7] mm, vmscan: add active list aging tracepoint Michal Hocko
2016-12-28 15:30   ` Michal Hocko
2016-12-29  5:33   ` Minchan Kim
2016-12-29  5:33     ` Minchan Kim
2016-12-29  7:52     ` Michal Hocko
2016-12-29  7:52       ` Michal Hocko
2016-12-30  1:48       ` Minchan Kim
2016-12-30  1:48         ` Minchan Kim
2016-12-30  9:26         ` Michal Hocko
2016-12-30  9:26           ` Michal Hocko
2016-12-30  9:38           ` Hillf Danton
2016-12-30  9:38             ` Hillf Danton
2016-12-30 16:04           ` Minchan Kim
2016-12-30 16:04             ` Minchan Kim
2016-12-30 16:37             ` Michal Hocko
2016-12-30 16:37               ` Michal Hocko
2016-12-30 17:30               ` Michal Hocko
2016-12-30 17:30                 ` Michal Hocko
2017-01-03  5:03               ` Minchan Kim
2017-01-03  5:03                 ` Minchan Kim
2017-01-03  8:21                 ` Michal Hocko
2017-01-03  8:21                   ` Michal Hocko
2017-01-04  5:07                   ` Minchan Kim
2017-01-04  5:07                     ` Minchan Kim
2017-01-04  7:28                     ` Vlastimil Babka
2017-01-04  7:28                       ` Vlastimil Babka
2017-01-04  7:50                     ` Michal Hocko
2017-01-04  7:50                       ` Michal Hocko
2016-12-29  7:44   ` Hillf Danton
2016-12-29  7:44     ` Hillf Danton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.