linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Minchan Kim <minchan@kernel.org>,
	Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Vlastimil Babka <vbabka@suse.cz>, Rik van Riel <riel@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint
Date: Fri, 30 Dec 2016 10:26:37 +0100	[thread overview]
Message-ID: <20161230092636.GA13301@dhcp22.suse.cz> (raw)
In-Reply-To: <20161230014853.GA4184@bbox>

On Fri 30-12-16 10:48:53, Minchan Kim wrote:
> On Thu, Dec 29, 2016 at 08:52:46AM +0100, Michal Hocko wrote:
> > On Thu 29-12-16 14:33:59, Minchan Kim wrote:
> > > On Wed, Dec 28, 2016 at 04:30:27PM +0100, Michal Hocko wrote:
[...]
> > > > +TRACE_EVENT(mm_vmscan_lru_shrink_active,
> > > > +
> > > > +	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_freed,
> > > > +		unsigned long nr_unevictable, unsigned long nr_deactivated,
> > > > +		unsigned long nr_rotated, int priority, int file),
> > > > +
> > > > +	TP_ARGS(nid, nr_scanned, nr_freed, nr_unevictable, nr_deactivated, nr_rotated, priority, file),
> > > 
> > > I agree it is helpful. And it was when I investigated aging problem of 32bit
> > > when node-lru was introduced. However, the question is we really need all those
> > > kinds of information? just enough with nr_taken, nr_deactivated, priority, file?
> > 
> > Dunno. Is it harmful to add this information? I like it more when the
> > numbers just add up and you have a clear picture. You never know what
> > might be useful when debugging a weird behavior. 
> 
> Michal, I'm not huge fan of "might be useful" although it's a small piece of code.

But these are tracepoints. One of their primary reasons to exist is
to help debug things.  And it is really hard to predict what might be
useful in advance. It is not like the patch would export numbers which
would be irrelevant to the reclaim.

> It adds just all of kinds overheads (memory footprint, runtime performance,
> maintainance) without any proved benefit.

Does it really add any measurable overhead or the maintenance burden? I
think the only place we could argue about is free_hot_cold_page_list
which is used in hot paths.

I think we can sacrifice it. The same for culled unevictable
pages. We wouldn't know what is the missing part
nr_taken-(nr_activate+nr_deactivate) because it could be either freed or
moved to the unevictable list but that could be handled in a separate
tracepoint in putback_lru_page which sounds like a useful thing I guess.
 
> If we allow such things, people would start adding more things with just "why not,
> it might be useful. you never know the future" and it ends up making linux fiction
> novel mess.

I agree with this concern in general, but is this the case in this
particular case?

> If it's necessary, someday, someone will catch up and will send or ask patch with
> detailed description "why the stat is important and how it is good for us to solve
> some problem".

I can certainly enhance the changelog. See below.

> From that, we can learn workload, way to solve the problem and git
> history has the valuable description so new comers can keep the community up easily.
> So, finally, overheads are justified and get merged.
> 
> Please add must-have for your goal described.

My primary point is that tracepoints which do not give us a good picture
are quite useless and force us to add trace_printk or other means to
give us further information. Then I wonder why to have an incomplete
tracepoint at all.

Anyway, what do you think about this updated patch? I have kept Hillf's
A-b so please let me know if it is no longer valid.
--- 
>From 5f1bc22ad1e54050b4da3228d68945e70342ebb6 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Tue, 27 Dec 2016 13:18:20 +0100
Subject: [PATCH] mm, vmscan: add active list aging tracepoint

Our reclaim process has several tracepoints to tell us more about how
things are progressing. We are, however, missing a tracepoint to track
active list aging. Introduce mm_vmscan_lru_shrink_active which reports
the number of
	- nr_scanned, nr_taken pages to tell us the LRU isolation
	  effectiveness.
	- nr_rotated pages which tells us that we are hitting referenced
	  pages which are deactivated. If this is a large part of the
	  reported nr_deactivated pages then the active list is too small
	- nr_activated pages which tells us how many pages are keept on the
	  active list - mostly exec pages. A high number can indicate
	  that we might be trashing on executables.

Changes since v1
- report nr_taken pages as per Minchan
- report nr_activated as per Minchan
- do not report nr_freed pages because that would add a tiny overhead to
  free_hot_cold_page_list which is a hot path
- do not report nr_unevictable because we can report this number via a
  different and more generic tracepoint in putback_lru_page
- fix move_active_pages_to_lru to report proper page count when we hit
  into large pages

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/trace/events/vmscan.h | 38 ++++++++++++++++++++++++++++++++++++++
 mm/vmscan.c                   | 18 ++++++++++++++----
 2 files changed, 52 insertions(+), 4 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 39bad8921ca1..f9ef242ece1b 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -363,6 +363,44 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 		show_reclaim_flags(__entry->reclaim_flags))
 );
 
+TRACE_EVENT(mm_vmscan_lru_shrink_active,
+
+	TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_taken,
+		unsigned long nr_activate, unsigned long nr_deactivated,
+		unsigned long nr_rotated, int priority, int file),
+
+	TP_ARGS(nid, nr_scanned, nr_taken, nr_activate, nr_deactivated, nr_rotated, priority, file),
+
+	TP_STRUCT__entry(
+		__field(int, nid)
+		__field(unsigned long, nr_scanned)
+		__field(unsigned long, nr_taken)
+		__field(unsigned long, nr_activate)
+		__field(unsigned long, nr_deactivated)
+		__field(unsigned long, nr_rotated)
+		__field(int, priority)
+		__field(int, reclaim_flags)
+	),
+
+	TP_fast_assign(
+		__entry->nid = nid;
+		__entry->nr_scanned = nr_scanned;
+		__entry->nr_taken = nr_taken;
+		__entry->nr_activate = nr_activate;
+		__entry->nr_deactivated = nr_deactivated;
+		__entry->nr_rotated = nr_rotated;
+		__entry->priority = priority;
+		__entry->reclaim_flags = trace_shrink_flags(file);
+	),
+
+	TP_printk("nid=%d nr_scanned=%ld nr_taken=%ld nr_activated=%ld nr_deactivated=%ld nr_rotated=%ld priority=%d flags=%s",
+		__entry->nid,
+		__entry->nr_scanned, __entry->nr_taken,
+		__entry->nr_activate, __entry->nr_deactivated, __entry->nr_rotated,
+		__entry->priority,
+		show_reclaim_flags(__entry->reclaim_flags))
+);
+
 #endif /* _TRACE_VMSCAN_H */
 
 /* This part must be outside protection */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c4abf08861d2..4da4d8d0496c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1846,9 +1846,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
  *
  * The downside is that we have to touch page->_refcount against each page.
  * But we had to alter page->flags anyway.
+ *
+ * Returns the number of pages moved to the given lru.
  */
 
-static void move_active_pages_to_lru(struct lruvec *lruvec,
+static unsigned move_active_pages_to_lru(struct lruvec *lruvec,
 				     struct list_head *list,
 				     struct list_head *pages_to_free,
 				     enum lru_list lru)
@@ -1857,6 +1859,7 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
 	unsigned long pgmoved = 0;
 	struct page *page;
 	int nr_pages;
+	int nr_moved = 0;
 
 	while (!list_empty(list)) {
 		page = lru_to_page(list);
@@ -1882,11 +1885,15 @@ static void move_active_pages_to_lru(struct lruvec *lruvec,
 				spin_lock_irq(&pgdat->lru_lock);
 			} else
 				list_add(&page->lru, pages_to_free);
+		} else {
+			nr_moved += nr_pages;
 		}
 	}
 
 	if (!is_active_lru(lru))
 		__count_vm_events(PGDEACTIVATE, pgmoved);
+
+	return nr_moved;
 }
 
 static void shrink_active_list(unsigned long nr_to_scan,
@@ -1902,7 +1909,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	LIST_HEAD(l_inactive);
 	struct page *page;
 	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
-	unsigned long nr_rotated = 0;
+	unsigned nr_deactivate, nr_activate;
+	unsigned nr_rotated = 0;
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
@@ -1980,13 +1988,15 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	 */
 	reclaim_stat->recent_rotated[file] += nr_rotated;
 
-	move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
-	move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
+	nr_activate = move_active_pages_to_lru(lruvec, &l_active, &l_hold, lru);
+	nr_deactivate = move_active_pages_to_lru(lruvec, &l_inactive, &l_hold, lru - LRU_ACTIVE);
 	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
 	spin_unlock_irq(&pgdat->lru_lock);
 
 	mem_cgroup_uncharge_list(&l_hold);
 	free_hot_cold_page_list(&l_hold, true);
+	trace_mm_vmscan_lru_shrink_active(pgdat->node_id, nr_scanned, nr_taken,
+			nr_activate, nr_deactivate, nr_rotated, sc->priority, file);
 }
 
 /*
-- 
2.10.2


-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2016-12-30  9:26 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-28 15:30 [PATCH 0/7] vm, vmscan: enahance vmscan tracepoints Michal Hocko
2016-12-28 15:30 ` [PATCH 1/7] mm, vmscan: remove unused mm_vmscan_memcg_isolate Michal Hocko
2016-12-29  7:33   ` Hillf Danton
2016-12-28 15:30 ` [PATCH 2/7] mm, vmscan: add active list aging tracepoint Michal Hocko
2016-12-29  5:33   ` Minchan Kim
2016-12-29  7:52     ` Michal Hocko
2016-12-30  1:48       ` Minchan Kim
2016-12-30  9:26         ` Michal Hocko [this message]
2016-12-30  9:38           ` Hillf Danton
2016-12-30 16:04           ` Minchan Kim
2016-12-30 16:37             ` Michal Hocko
2016-12-30 17:30               ` Michal Hocko
2017-01-03  5:03               ` Minchan Kim
2017-01-03  8:21                 ` Michal Hocko
2017-01-04  5:07                   ` Minchan Kim
2017-01-04  7:28                     ` Vlastimil Babka
2017-01-04  7:50                     ` Michal Hocko
2016-12-29  7:44   ` Hillf Danton
2016-12-28 15:30 ` [PATCH 3/7] mm, vmscan: show the number of skipped pages in mm_vmscan_lru_isolate Michal Hocko
2016-12-29  5:53   ` Minchan Kim
2016-12-29  7:49   ` Hillf Danton
2017-01-03 17:21   ` Vlastimil Babka
2017-01-03 20:43     ` Michal Hocko
2016-12-28 15:30 ` [PATCH 4/7] mm, vmscan: show LRU name in mm_vmscan_lru_isolate tracepoint Michal Hocko
2016-12-28 15:50   ` Nikolay Borisov
2016-12-28 16:00     ` Michal Hocko
2016-12-28 16:40       ` Nikolay Borisov
2016-12-28 16:49         ` Michal Hocko
2016-12-29  6:02   ` Minchan Kim
2016-12-29  7:56     ` Michal Hocko
2016-12-30  1:56       ` Minchan Kim
2016-12-30  9:33         ` Michal Hocko
2016-12-29  7:53   ` Hillf Danton
2017-01-03 17:08   ` Vlastimil Babka
2017-01-03 20:47     ` Michal Hocko
2017-01-03 20:52       ` Michal Hocko
2017-01-03 21:24         ` Michal Hocko
2017-01-03 21:40           ` Vlastimil Babka
2017-01-03 21:48             ` Michal Hocko
2016-12-28 15:30 ` [PATCH 5/7] mm, vmscan: extract shrink_page_list reclaim counters into a struct Michal Hocko
2016-12-29  8:00   ` Hillf Danton
2016-12-28 15:30 ` [PATCH 6/7] mm, vmscan: enhance mm_vmscan_lru_shrink_inactive tracepoint Michal Hocko
2016-12-29  8:05   ` Hillf Danton
2016-12-28 15:30 ` [PATCH 7/7] mm, vmscan: add mm_vmscan_inactive_list_is_low tracepoint Michal Hocko
2016-12-29  8:19   ` Hillf Danton
2016-12-30  9:11 ` [PATCH 0/7] vm, vmscan: enahance vmscan tracepoints Mel Gorman
2016-12-30  9:36   ` Michal Hocko
2016-12-30 10:20     ` Mel Gorman
2017-01-04 10:19 [PATCH 0/7 v2] " Michal Hocko
2017-01-04 10:19 ` [PATCH 2/7] mm, vmscan: add active list aging tracepoint Michal Hocko
2017-01-04 12:52   ` Vlastimil Babka
2017-01-04 13:16     ` Michal Hocko
2017-01-04 13:34       ` Vlastimil Babka
2017-01-04 13:52   ` Michal Hocko
2017-01-05  5:41     ` Minchan Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161230092636.GA13301@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=hillf.zj@alibaba-inc.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=minchan@kernel.org \
    --cc=riel@redhat.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).