From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752884AbbLCWEL (ORCPT <rfc822;w@1wt.eu>);
	Thu, 3 Dec 2015 17:04:11 -0500
Received: from mail-yk0-f179.google.com ([209.85.160.179]:33256 "EHLO
	mail-yk0-f179.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750849AbbLCWEJ (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Thu, 3 Dec 2015 17:04:09 -0500
Date: Thu, 3 Dec 2015 17:04:06 -0500
From: Tejun Heo <tj@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ulrich Obergfell <uobergfe@redhat.com>, Ingo Molnar <mingo@redhat.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] workqueue: warn if memory reclaim tries to flush
 !WQ_MEM_RECLAIM workqueue
Message-ID: <20151203220406.GA2630@mtj.duckdns.org>
References: <20151203002810.GJ19878@mtj.duckdns.org>
 <20151203093350.GP17308@twins.programming.kicks-ass.net>
 <20151203100018.GO11639@twins.programming.kicks-ass.net>
 <20151203144811.GA27463@mtj.duckdns.org>
 <20151203150442.GR17308@twins.programming.kicks-ass.net>
 <20151203150604.GC27463@mtj.duckdns.org>
 <20151203192616.GJ27463@mtj.duckdns.org>
 <20151203204313.GX17308@twins.programming.kicks-ass.net>
 <20151203205632.GM27463@mtj.duckdns.org>
 <20151203210911.GZ17308@twins.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20151203210911.GZ17308@twins.programming.kicks-ass.net>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello, Peter.

On Thu, Dec 03, 2015 at 10:09:11PM +0100, Peter Zijlstra wrote:
> On Thu, Dec 03, 2015 at 03:56:32PM -0500, Tejun Heo wrote:
> > So, if I'm not mistaken, those are all marking tasks which can be
> > depended upon during memory reclaim and we do want to catch them all.
> 
> Up to a point yes, these are things that want to be reliable during
> reclaim, but lacking memory reserves and usage bounds (which we
> discussed last at lsf/mm) these are just wanna-be.

Hmmm... even if they're buggy in that they can't guarantee forward
progress even with access to the emergency pool, I think it makes
sense to warn when they create an extra dependency which doesn't have
access to the emergency pool.

> > PF_MEMALLOC shouldn't depend on something which require memory to be
> > reclaimed to guarantee forward progress.
> 
> PF_MEMALLOC basically avoids reclaim for any memory allocation while its
> set.

So, the assumption is that they're already on the reclaim path and
thus shouldn't recurse into it again.

> The thing is, even if your workqueue has WQ_MEM_RECLAIM set, it will not
> hit the mayday button until you're completely full flat out of memory.

It's more trigger-happy than that.  It's timer based.  If a new worker
can't be created for a certain amount of time, for whatever reason,
it'll summon the rescuer.
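
To make "timer based" concrete, here's a rough user-space sketch.  It
is not the workqueue code; struct pool_model, the function names and
the tick values are all made up for illustration.  The only point is
that the mayday decision looks at elapsed time and never at how much
memory is free.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a worker pool's mayday timer (illustrative only). */
struct pool_model {
	unsigned long initial_timeout;	/* ticks before the first mayday */
	unsigned long interval;		/* ticks between repeated maydays */
	unsigned long next_mayday;	/* tick at which the timer fires next */
	bool timer_armed;		/* true while worker creation is pending */
	unsigned int maydays_sent;	/* how often the rescuer was summoned */
};

void pool_init(struct pool_model *p, unsigned long initial_timeout,
	       unsigned long interval)
{
	assert(initial_timeout > 0 && interval > 0);
	p->initial_timeout = initial_timeout;
	p->interval = interval;
	p->next_mayday = 0;
	p->timer_armed = false;
	p->maydays_sent = 0;
}

/* Worker creation starts: arm the timer relative to the current tick. */
void creation_begin(struct pool_model *p, unsigned long now)
{
	p->timer_armed = true;
	p->next_mayday = now + p->initial_timeout;
}

/* A worker was created after all: disarm the timer. */
void creation_done(struct pool_model *p)
{
	p->timer_armed = false;
}

/*
 * Advance the clock to @now.  Every expiry while creation is still
 * pending summons the rescuer once more, whatever the reason for the
 * delay; no memory state is consulted.
 */
void pool_tick(struct pool_model *p, unsigned long now)
{
	while (p->timer_armed && now >= p->next_mayday) {
		p->maydays_sent++;
		p->next_mayday += p->interval;
	}
}
```

With an initial timeout of 2 ticks and an interval of 10, creation
starting at tick 100 leads to a first mayday at tick 102 and another
at 112, and to none at all once creation_done() has been called.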

> At which point you're probably boned anyway, because, as per the above,
> all that code assumes there's _some_ memory to be had.

Not really.  PF_MEMALLOC tasks have access to the emergency pool;
creating new workers doesn't, so this really is creating a dependency
which is qualitatively different.

> One solution is to always fail maybe_create_worker() when PF_MEMALLOC is
> set, thus always hitting the mayday button.

I'm not following.  When PF_MEMALLOC is set where?

Thanks.

-- 
tejun