Date: Thu, 3 Dec 2015 17:04:06 -0500
From: Tejun Heo
To: Peter Zijlstra
Cc: Ulrich Obergfell, Ingo Molnar, Andrew Morton,
	linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue
Message-ID: <20151203220406.GA2630@mtj.duckdns.org>
In-Reply-To: <20151203210911.GZ17308@twins.programming.kicks-ass.net>

Hello, Peter.

On Thu, Dec 03, 2015 at 10:09:11PM +0100, Peter Zijlstra wrote:
> On Thu, Dec 03, 2015 at 03:56:32PM -0500, Tejun Heo wrote:
> > So, if I'm not mistaken, those are all marking tasks which can be
> > depended upon during memory reclaim and we do want to catch them all.
>
> Up to a point yes, these are things that want to be reliable during
> reclaim, but lacking memory reserves and usage bounds (which we
> discussed last at lsf/mm) these are just wanna-be.

Hmmm...
even if buggy in that they can't guarantee forward-progress
even with access to the emergency pool, I think it makes sense to warn
them about creating an extra dependency which doesn't have access to
the emergency pool.

> > PF_MEMALLOC shouldn't depend on something which require memory to be
> > reclaimed to guarantee forward progress.
>
> PF_MEMALLOC basically avoids reclaim for any memory allocation while its
> set.

So, the assumption is that they're already on the reclaim path and
thus shouldn't recurse into it again.

> The thing is, even if your workqueue has WQ_MEM_RECLAIM set, it will not
> hit the mayday button until you're completely full flat out of memory.

It's more trigger-happy than that.  It's timer based.  If a new worker
can't be created for a certain amount of time for whatever reason,
it'll summon the rescuer.

> At which point you're probably boned anyway, because, as per the above,
> all that code assumes there's _some_ memory to be had.

Not really.  PF_MEMALLOC tasks have access to the emergency pool,
creating new workers doesn't, so this really is creating a dependency
which is qualitatively different.

> One solution is to always fail maybe_create_worker() when PF_MEMALLOC is
> set, thus always hitting the mayday button.

I'm not following.  When PF_MEMALLOC is set where?

Thanks.

--
tejun
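
To illustrate the "timer based" point above, here is a rough userspace
toy model, not the kernel code.  Every name in it (toy_pool,
toy_pool_tick, TOY_MAYDAY_TIMEOUT) and the three-tick threshold are made
up for illustration; the real logic lives in kernel/workqueue.c and
differs in detail.  It only shows that the rescuer is summoned once
worker creation has kept failing for a while, regardless of why it
failed.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Toy model only.  The names and the threshold below are hypothetical
 * and do not correspond to real kernel symbols or values.
 */
#define TOY_MAYDAY_TIMEOUT 3	/* failed ticks in a row before mayday */

struct toy_pool {
	int stalled_ticks;	/* consecutive ticks without a new worker */
	bool rescuer_summoned;	/* set once the mayday threshold is hit */
};

/* One timer tick: a successful worker creation resets the stall count. */
static void toy_pool_tick(struct toy_pool *pool, bool create_ok)
{
	if (create_ok) {
		pool->stalled_ticks = 0;
		return;
	}
	if (++pool->stalled_ticks >= TOY_MAYDAY_TIMEOUT)
		pool->rescuer_summoned = true;
}

/*
 * Feed a sequence of creation results to a fresh pool.  Returns the
 * 1-based tick at which the rescuer is summoned, or -1 if it never is.
 */
static int toy_ticks_until_mayday(const bool *create_ok, int nticks)
{
	struct toy_pool pool = { 0, false };

	for (int i = 0; i < nticks; i++) {
		toy_pool_tick(&pool, create_ok[i]);
		if (pool.rescuer_summoned)
			return i + 1;
	}
	return -1;
}
```

Note that the model never looks at how much memory is free.  The only
input is whether a worker could be created on each tick, which is the
sense in which the trigger is a timeout rather than an out-of-memory
condition.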