From: Mel Gorman <mel@csn.ul.ie>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Larry Woodman <lwoodman@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	riel@redhat.com, Ingo Molnar <mingo@elte.hu>,
	Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org
Subject: Re: [PATCH 3/4] tracing, page-allocator: Add trace event for page traffic related to the buddy lists
Date: Wed, 5 Aug 2009 10:43:46 +0100	[thread overview]
Message-ID: <20090805094346.GC21950@csn.ul.ie> (raw)
In-Reply-To: <20090805182034.5BCD.A69D9226@jp.fujitsu.com>

On Wed, Aug 05, 2009 at 06:24:40PM +0900, KOSAKI Motohiro wrote:
> > The page allocation trace event reports that a page was successfully allocated
> > but it does not specify where it came from. When analysing performance,
> > it can be important to distinguish between pages coming from the per-cpu
> > allocator and pages coming from the buddy lists, as the latter requires the
> > zone lock to be taken and more data structures to be examined.
> > 
> > This patch adds a trace event for __rmqueue reporting when a page is being
> > allocated from the buddy lists. It distinguishes between being called
> > to refill the per-cpu lists and being called for a high-order allocation.
> > Similarly, this patch adds an event to catch when the PCP lists are being
> > drained a little and pages are going back to the buddy lists.
> > 
> > These events are trickier to draw conclusions from, but high activity on
> > them could explain why there were a large number of cache misses on a
> > page-allocator-intensive workload. The coalescing and splitting of buddies
> > involves a lot of writing of page metadata and cache line bounces, not to
> > mention the acquisition of an interrupt-safe lock necessary to enter this
> > path.
> > 
> > Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> > Acked-by: Rik van Riel <riel@redhat.com>
> > ---
> >  include/trace/events/kmem.h |   54 +++++++++++++++++++++++++++++++++++++++++++
> >  mm/page_alloc.c             |    2 +
> >  2 files changed, 56 insertions(+), 0 deletions(-)
> > 
> > diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
> > index 0b4002e..3be3df3 100644
> > --- a/include/trace/events/kmem.h
> > +++ b/include/trace/events/kmem.h
> > @@ -311,6 +311,60 @@ TRACE_EVENT(mm_page_alloc,
> >  		show_gfp_flags(__entry->gfp_flags))
> >  );
> >  
> > +TRACE_EVENT(mm_page_alloc_zone_locked,
> > +
> > +	TP_PROTO(const void *page, unsigned int order,
> > +				int migratetype, int percpu_refill),
> > +
> > +	TP_ARGS(page, order, migratetype, percpu_refill),
> > +
> > +	TP_STRUCT__entry(
> > +		__field(	const void *,	page		)
> > +		__field(	unsigned int,	order		)
> > +		__field(	int,		migratetype	)
> > +		__field(	int,		percpu_refill	)
> > +	),
> > +
> > +	TP_fast_assign(
> > +		__entry->page		= page;
> > +		__entry->order		= order;
> > +		__entry->migratetype	= migratetype;
> > +		__entry->percpu_refill	= percpu_refill;
> > +	),
> > +
> > +	TP_printk("page=%p pfn=%lu order=%u migratetype=%d percpu_refill=%d",
> > +		__entry->page,
> > +		page_to_pfn((struct page *)__entry->page),
> > +		__entry->order,
> > +		__entry->migratetype,
> > +		__entry->percpu_refill)
> > +);
> > +
> > +TRACE_EVENT(mm_page_pcpu_drain,
> > +
> > +	TP_PROTO(const void *page, int order, int migratetype),
> > +
> > +	TP_ARGS(page, order, migratetype),
> > +
> > +	TP_STRUCT__entry(
> > +		__field(	const void *,	page		)
> > +		__field(	int,		order		)
> > +		__field(	int,		migratetype	)
> > +	),
> > +
> > +	TP_fast_assign(
> > +		__entry->page		= page;
> > +		__entry->order		= order;
> > +		__entry->migratetype	= migratetype;
> > +	),
> > +
> > +	TP_printk("page=%p pfn=%lu order=%d migratetype=%d",
> > +		__entry->page,
> > +		page_to_pfn((struct page *)__entry->page),
> > +		__entry->order,
> > +		__entry->migratetype)
> > +);
> > +
> >  TRACE_EVENT(mm_page_alloc_extfrag,
> >  
> >  	TP_PROTO(const void *page,
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index c2c90cd..35b92a9 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -535,6 +535,7 @@ static void free_pages_bulk(struct zone *zone, int count,
> >  		page = list_entry(list->prev, struct page, lru);
> >  		/* have to delete it as __free_one_page list manipulates */
> >  		list_del(&page->lru);
> > +		trace_mm_page_pcpu_drain(page, order, page_private(page));
> 
> The pcp refill tracepoint (trace_mm_page_alloc_zone_locked) logs the
> migratetype, but this tracepoint doesn't. Why?
> 

It does log the migratetype, because in this context the migratetype is
stored in page_private(page).
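
For reference, a rough sketch of how the migratetype ends up in
page_private() on this path (simplified from the free_hot_cold_page() and
free_pages_bulk() code of this era, not the literal source):

	/*
	 * On free to the per-cpu lists, the migratetype is cached in
	 * page_private() so that the later bulk drain does not have to
	 * look it up again under the zone lock.
	 */
	set_page_private(page, get_pageblock_migratetype(page));

	/*
	 * Later, in free_pages_bulk(), page_private(page) is the
	 * migratetype reported by the tracepoint and used by
	 * __free_one_page().
	 */
	trace_mm_page_pcpu_drain(page, order, page_private(page));
	__free_one_page(page, zone, order, page_private(page));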

> 
> >  		__free_one_page(page, zone, order, page_private(page));
> >  	}
> >  	spin_unlock(&zone->lock);
> > @@ -878,6 +879,7 @@ retry_reserve:
> >  		}
> >  	}
> >  
> > +	trace_mm_page_alloc_zone_locked(page, order, migratetype, order == 0);
> >  	return page;
> >  }
> 
> Umm, can we assume order-0 always means a pcp refill?
> 

Right now, that assumption is accurate. Which call path ends up here with
order == 0 where it is not a PCP refill?
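
To spell out why, a simplified sketch of the only two paths into __rmqueue()
as the code stands (paraphrased from buffered_rmqueue(), not the literal
source):

	if (likely(order == 0)) {
		/*
		 * Order-0 allocations are served from the per-cpu lists,
		 * which are refilled via rmqueue_bulk() -> __rmqueue()
		 * with order == 0.
		 */
		if (list_empty(&pcp->list))
			pcp->count += rmqueue_bulk(zone, 0, pcp->batch,
						&pcp->list, migratetype);
	} else {
		/*
		 * High-order allocations take the zone lock and call
		 * __rmqueue() directly.
		 */
		spin_lock_irqsave(&zone->lock, flags);
		page = __rmqueue(zone, order, migratetype);
	}

So rmqueue_bulk() is currently the only order-0 caller of __rmqueue(), which
is why order == 0 at the tracepoint is treated as a per-cpu refill.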

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
