All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Dave Hansen <dave.hansen@intel.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Mel Gorman <mgorman@suse.de>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm: warn about allocations which stall for too long
Date: Mon, 26 Sep 2016 10:12:01 +0200	[thread overview]
Message-ID: <20160926081200.GB27030@dhcp22.suse.cz> (raw)
In-Reply-To: <57E56789.1070205@intel.com>

On Fri 23-09-16 10:34:01, Dave Hansen wrote:
> On 09/23/2016 01:15 AM, Michal Hocko wrote:
> > +	/* Make sure we know about allocations which stall for too long */
> > +	if (!(gfp_mask & __GFP_NOWARN) && time_after(jiffies, alloc_start + stall_timeout)) {
> > +		pr_warn("%s: page alloction stalls for %ums: order:%u mode:%#x(%pGg)\n",
> > +				current->comm, jiffies_to_msecs(jiffies-alloc_start),
> > +				order, gfp_mask, &gfp_mask);
> > +		stall_timeout += 10 * HZ;
> > +		dump_stack();
> > +	}
> 
> This would make an awesome tracepoint.  There's probably still plenty of
> value to having it in dmesg, but the configurability of tracepoints is
> hard to beat.

Currently we only have trace_mm_page_alloc in __alloc_pages_nodemask. I
think we want to add another one to mark the beginning of the allocation
so that we can track allocation latencies per allocation context and
ideally drop them down into sources - congestion waits, reclaim path,
slab reclaim etc. Janani Ravichandran is working on a script to do that
http://lkml.kernel.org/r/20160911222411.GA2854@janani-Inspiron-3521

But this sounds a bit orthogonal to my proposal here because I would
really like to warn unconditionally when an allocation stalls for
unreasonably long. Tracepoints are not an ideal tool for that because
you have to start collecting tracing output before this situations
happen. Moreover in my experience I often had to replace my local
debugging trace_printks by regular printks because the prior ones just
got lost under a heavy memory pressure.
-- 
Michal Hocko
SUSE Labs

WARNING: multiple messages have this Message-ID
From: Michal Hocko <mhocko@kernel.org>
To: Dave Hansen <dave.hansen@intel.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Mel Gorman <mgorman@suse.de>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm: warn about allocations which stall for too long
Date: Mon, 26 Sep 2016 10:12:01 +0200	[thread overview]
Message-ID: <20160926081200.GB27030@dhcp22.suse.cz> (raw)
In-Reply-To: <57E56789.1070205@intel.com>

On Fri 23-09-16 10:34:01, Dave Hansen wrote:
> On 09/23/2016 01:15 AM, Michal Hocko wrote:
> > +	/* Make sure we know about allocations which stall for too long */
> > +	if (!(gfp_mask & __GFP_NOWARN) && time_after(jiffies, alloc_start + stall_timeout)) {
> > +		pr_warn("%s: page alloction stalls for %ums: order:%u mode:%#x(%pGg)\n",
> > +				current->comm, jiffies_to_msecs(jiffies-alloc_start),
> > +				order, gfp_mask, &gfp_mask);
> > +		stall_timeout += 10 * HZ;
> > +		dump_stack();
> > +	}
> 
> This would make an awesome tracepoint.  There's probably still plenty of
> value to having it in dmesg, but the configurability of tracepoints is
> hard to beat.

Currently we only have trace_mm_page_alloc in __alloc_pages_nodemask. I
think we want to add another one to mark the beginning of the allocation
so that we can track allocation latencies per allocation context and
ideally drop them down into sources - congestion waits, reclaim path,
slab reclaim etc. Janani Ravichandran is working on a script to do that
http://lkml.kernel.org/r/20160911222411.GA2854@janani-Inspiron-3521

But this sounds a bit orthogonal to my proposal here because I would
really like to warn unconditionally when an allocation stalls for
unreasonably long. Tracepoints are not an ideal tool for that because
you have to start collecting tracing output before this situations
happen. Moreover in my experience I often had to replace my local
debugging trace_printks by regular printks because the prior ones just
got lost under a heavy memory pressure.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2016-09-26  8:13 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-23  8:15 Michal Hocko
2016-09-23  8:15 ` Michal Hocko
2016-09-23  8:29 ` Hillf Danton
2016-09-23  8:29   ` Hillf Danton
2016-09-23  8:32   ` Michal Hocko
2016-09-23  8:32     ` Michal Hocko
2016-09-23  8:44     ` Hillf Danton
2016-09-23  8:44       ` Hillf Danton
2016-09-23  9:15       ` Michal Hocko
2016-09-23  9:15         ` Michal Hocko
2016-09-23 14:36 ` Tetsuo Handa
2016-09-23 14:36   ` Tetsuo Handa
2016-09-23 15:02   ` Michal Hocko
2016-09-23 15:02     ` Michal Hocko
2016-09-24  3:00     ` Tetsuo Handa
2016-09-24  3:00       ` Tetsuo Handa
2016-09-26  8:17       ` Michal Hocko
2016-09-26  8:17         ` Michal Hocko
2016-09-27 12:57         ` Tetsuo Handa
2016-09-27 12:57           ` Tetsuo Handa
2016-09-29  8:48           ` Michal Hocko
2016-09-29  8:48             ` Michal Hocko
2016-09-23 17:34 ` Dave Hansen
2016-09-23 17:34   ` Dave Hansen
2016-09-24 13:19   ` Balbir Singh
2016-09-24 13:19     ` Balbir Singh
2016-09-26  8:13     ` Michal Hocko
2016-09-26  8:13       ` Michal Hocko
2016-09-26  8:12   ` Michal Hocko [this message]
2016-09-26  8:12     ` Michal Hocko
2016-09-29  8:44 ` [PATCH 0/2] " Michal Hocko
2016-09-29  8:44   ` Michal Hocko
2016-09-29  8:44   ` [PATCH 1/2] mm: consolidate warn_alloc_failed users Michal Hocko
2016-09-29  8:44     ` Michal Hocko
2016-09-29  9:23     ` Vlastimil Babka
2016-09-29  9:23       ` Vlastimil Babka
2016-09-29  8:44   ` [PATCH 2/2] mm: warn about allocations which stall for too long Michal Hocko
2016-09-29  8:44     ` Michal Hocko
2016-09-29  9:02     ` Tetsuo Handa
2016-09-29  9:02       ` Tetsuo Handa
2016-09-29  9:10       ` Michal Hocko
2016-09-29  9:10         ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160926081200.GB27030@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=dave.hansen@intel.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --subject='Re: [PATCH] mm: warn about allocations which stall for too long' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.