From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751999AbdAYSMD (ORCPT <rfc822;w@1wt.eu>);
        Wed, 25 Jan 2017 13:12:03 -0500
Received: from gum.cmpxchg.org ([85.214.110.215]:39792 "EHLO gum.cmpxchg.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751536AbdAYSMC (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
        Wed, 25 Jan 2017 13:12:02 -0500
Date: Wed, 25 Jan 2017 13:11:50 -0500
From: Johannes Weiner <hannes@cmpxchg.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: mhocko@suse.cz, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6] mm: Add memory allocation watchdog kernel thread.
Message-ID: <20170125181150.GA16398@cmpxchg.org>
References: <1478416501-10104-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1478416501-10104-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp>
User-Agent: Mutt/1.7.2 (2016-11-26)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Nov 06, 2016 at 04:15:01PM +0900, Tetsuo Handa wrote:
> +- Why need to use it?
> +
> +Currently, when something went wrong inside memory allocation request,
> +the system might stall without any kernel messages.
> +
> +Although there is khungtaskd kernel thread as an asynchronous monitoring
> +approach, khungtaskd kernel thread is not always helpful because memory
> +allocating tasks unlikely sleep in uninterruptible state for
> +/proc/sys/kernel/hung_task_timeout_secs seconds.
> +
> +Although there is warn_alloc() as a synchronous monitoring approach
> +which emits
> +
> +  "%s: page allocation stalls for %ums, order:%u, mode:%#x(%pGg)\n"
> +
> +line, warn_alloc() is not bullet proof because allocating tasks can get
> +stuck before calling warn_alloc() and/or allocating tasks are using
> +__GFP_NOWARN flag and/or such lines are suppressed by ratelimiting and/or
> +such lines are corrupted due to collisions.

I'm not fully convinced by this explanation. Do you have a real-life
example where the warn_alloc() stall info is not enough? If so, it
should be included here and in the changelog. If not, the extra code,
the task_struct overhead, etc. don't seem justified.

__GFP_NOWARN shouldn't suppress stall warnings, IMO. The flag says that
the caller expects allocation failure and is prepared to handle it; an
allocation stalling for 10s is an issue regardless of the callsite.
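
To illustrate the distinction, here is a minimal userspace sketch. It is
not kernel code: the DEMO_* flag values and the demo_* helpers are
made up for illustration, and demo_should_warn() only stands in for the
__GFP_NOWARN check that gates the warning. The point is that the
failure path honors the caller's opt-out, while the stall path clears
that bit before asking whether to warn.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values, for illustration only; not the kernel's. */
typedef unsigned int gfp_t;
#define DEMO_GFP_NOWARN  0x200u
#define DEMO_GFP_NORETRY 0x1000u

/* Stand-in for the opt-out check: stay quiet if the caller set NOWARN. */
static bool demo_should_warn(gfp_t gfp_mask)
{
	return !(gfp_mask & DEMO_GFP_NOWARN);
}

/* Allocation failure: the caller may have a fallback, so honor NOWARN. */
static bool demo_warn_on_failure(gfp_t gfp_mask)
{
	return demo_should_warn(gfp_mask);
}

/* Allocation stall: clear NOWARN first, so the warning is never muted. */
static bool demo_warn_on_stall(gfp_t gfp_mask)
{
	return demo_should_warn(gfp_mask & ~DEMO_GFP_NOWARN);
}

/* Small self-check showing the intended behavior of the two paths. */
void demo_self_check(void)
{
	gfp_t quiet = DEMO_GFP_NORETRY | DEMO_GFP_NOWARN;

	assert(!demo_warn_on_failure(quiet)); /* failure warning suppressed */
	assert(demo_warn_on_stall(quiet));    /* stall warning still emitted */
}
```

Masking at the call site leaves the caller's gfp_mask untouched, so the
rest of the allocation still sees the flags it asked for.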

---

>>From 6420cae52cac8167bd5fb19f45feed2d540bc11d Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@cmpxchg.org>
Date: Wed, 25 Jan 2017 12:57:20 -0500
Subject: [PATCH] mm: page_alloc: __GFP_NOWARN shouldn't suppress stall
 warnings

__GFP_NOWARN, which is usually added to avoid warnings from callsites
that expect to fail and have fallbacks, currently also suppresses
allocation stall warnings. These trigger when an allocation is stuck
inside the allocator for 10 seconds or longer.

But there is no class of allocations that can get legitimately stuck
in the allocator for this long. This always indicates a problem.

Always emit stall warnings. Restrict __GFP_NOWARN to alloc failures.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f3e0c69a97b7..7ce051d1d575 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3704,7 +3704,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 
 	/* Make sure we know about allocations which stall for too long */
 	if (time_after(jiffies, alloc_start + stall_timeout)) {
-		warn_alloc(gfp_mask,
+		warn_alloc(gfp_mask & ~__GFP_NOWARN,
 			"page allocation stalls for %ums, order:%u",
 			jiffies_to_msecs(jiffies-alloc_start), order);
 		stall_timeout += 10 * HZ;
-- 
2.11.0