From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753139AbcBBGy1 (ORCPT ); Tue, 2 Feb 2016 01:54:27 -0500
Received: from smtp.codeaurora.org ([198.145.29.96]:50876 "EHLO
        smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752325AbcBBGyY (ORCPT );
        Tue, 2 Feb 2016 01:54:24 -0500
Subject: Re: [PATCH wq/for-4.5-fixes] workqueue: skip flush dependency
 checks for legacy workqueues
To: Tejun Heo, Thierry Reding
References: <20151203002810.GJ19878@mtj.duckdns.org>
 <20151203093350.GP17308@twins.programming.kicks-ass.net>
 <20151203100018.GO11639@twins.programming.kicks-ass.net>
 <20151203144811.GA27463@mtj.duckdns.org>
 <20151203150442.GR17308@twins.programming.kicks-ass.net>
 <20151203150604.GC27463@mtj.duckdns.org>
 <20151203192616.GJ27463@mtj.duckdns.org>
 <20160126173843.GA11115@ulmo.nvidia.com>
 <20160129105946.GJ32380@htj.duckdns.org>
Cc: Peter Zijlstra, Ulrich Obergfell, Ingo Molnar, Andrew Morton,
 linux-kernel@vger.kernel.org, kernel-team@fb.com, Jon Hunter,
 linux-tegra@vger.kernel.org
From: Archit Taneja
Message-ID: <56B05298.2060301@codeaurora.org>
Date: Tue, 2 Feb 2016 12:24:16 +0530
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.2.0
MIME-Version: 1.0
In-Reply-To: <20160129105946.GJ32380@htj.duckdns.org>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/29/2016 04:29 PM, Tejun Heo wrote:
> fca839c00a12 ("workqueue: warn if memory reclaim tries to flush
> !WQ_MEM_RECLAIM workqueue") implemented flush dependency warning which
> triggers if a PF_MEMALLOC task or WQ_MEM_RECLAIM workqueue tries to
> flush a !WQ_MEM_RECLAIM workqueue.
>
> This assumes that workqueues marked with WQ_MEM_RECLAIM sit in memory
> reclaim path and making it depend on something which may need more
> memory to make forward progress can lead to deadlocks.  Unfortunately,
> workqueues created with the legacy create*_workqueue() interface
> always have WQ_MEM_RECLAIM regardless of whether they are depended
> upon by memory reclaim or not.  These spurious WQ_MEM_RECLAIM markings
> cause spurious triggering of the flush dependency checks.
>
>   WARNING: CPU: 0 PID: 6 at kernel/workqueue.c:2361 check_flush_dependency+0x138/0x144()
>   workqueue: WQ_MEM_RECLAIM deferwq:deferred_probe_work_func is flushing !WQ_MEM_RECLAIM events:lru_add_drain_per_cpu
>   ...
>   Workqueue: deferwq deferred_probe_work_func
>   [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
>   [] (show_stack) from [] (dump_stack+0x94/0xd4)
>   [] (dump_stack) from [] (warn_slowpath_common+0x80/0xb0)
>   [] (warn_slowpath_common) from [] (warn_slowpath_fmt+0x30/0x40)
>   [] (warn_slowpath_fmt) from [] (check_flush_dependency+0x138/0x144)
>   [] (check_flush_dependency) from [] (flush_work+0x50/0x15c)
>   [] (flush_work) from [] (lru_add_drain_all+0x130/0x180)
>   [] (lru_add_drain_all) from [] (migrate_prep+0x8/0x10)
>   [] (migrate_prep) from [] (alloc_contig_range+0xd8/0x338)
>   [] (alloc_contig_range) from [] (cma_alloc+0xe0/0x1ac)
>   [] (cma_alloc) from [] (__alloc_from_contiguous+0x38/0xd8)
>   [] (__alloc_from_contiguous) from [] (__dma_alloc+0x240/0x278)
>   [] (__dma_alloc) from [] (arm_dma_alloc+0x54/0x5c)
>   [] (arm_dma_alloc) from [] (dmam_alloc_coherent+0xc0/0xec)
>   [] (dmam_alloc_coherent) from [] (ahci_port_start+0x150/0x1dc)
>   [] (ahci_port_start) from [] (ata_host_start.part.3+0xc8/0x1c8)
>   [] (ata_host_start.part.3) from [] (ata_host_activate+0x50/0x148)
>   [] (ata_host_activate) from [] (ahci_host_activate+0x44/0x114)
>   [] (ahci_host_activate) from [] (ahci_platform_init_host+0x1d8/0x3c8)
>   [] (ahci_platform_init_host) from [] (tegra_ahci_probe+0x448/0x4e8)
>   [] (tegra_ahci_probe) from [] (platform_drv_probe+0x50/0xac)
>   [] (platform_drv_probe) from [] (driver_probe_device+0x214/0x2c0)
>   [] (driver_probe_device) from [] (bus_for_each_drv+0x60/0x94)
>   [] (bus_for_each_drv) from [] (__device_attach+0xb0/0x114)
>   [] (__device_attach) from [] (bus_probe_device+0x84/0x8c)
>   [] (bus_probe_device) from [] (deferred_probe_work_func+0x68/0x98)
>   [] (deferred_probe_work_func) from [] (process_one_work+0x120/0x3f8)
>   [] (process_one_work) from [] (worker_thread+0x38/0x55c)
>   [] (worker_thread) from [] (kthread+0xdc/0xf4)
>   [] (kthread) from [] (ret_from_fork+0x14/0x3c)
>
> Fix it by marking workqueues created via create*_workqueue() with
> __WQ_LEGACY and disabling flush dependency checks on them.
>
> Signed-off-by: Tejun Heo
> Reported-by: Thierry Reding
> Link: http://lkml.kernel.org/g/20160126173843.GA11115@ulmo.nvidia.com
> ---
> Hello, Thierry.
>
> Can you please verify the fix?

This fixes a similar backtrace observed when the drm/msm driver tries to
allocate a vram buffer via cma.

Thanks,
Archit

>
> Thanks.
>
>  include/linux/workqueue.h |    9 +++++----
>  kernel/workqueue.c        |    3 ++-
>  2 files changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
> index 0e32bc7..ca73c50 100644
> --- a/include/linux/workqueue.h
> +++ b/include/linux/workqueue.h
> @@ -311,6 +311,7 @@ enum {
>
>          __WQ_DRAINING           = 1 << 16, /* internal: workqueue is draining */
>          __WQ_ORDERED            = 1 << 17, /* internal: workqueue is ordered */
> +        __WQ_LEGACY             = 1 << 18, /* internal: create*_workqueue() */
>
>          WQ_MAX_ACTIVE           = 512,    /* I like 512, better ideas? */
>          WQ_MAX_UNBOUND_PER_CPU  = 4,      /* 4 * #cpus for unbound wq */
> @@ -411,12 +412,12 @@ __alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
>          alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED | (flags), 1, ##args)
>
>  #define create_workqueue(name)                                          \
> -        alloc_workqueue("%s", WQ_MEM_RECLAIM, 1, (name))
> +        alloc_workqueue("%s", __WQ_LEGACY | WQ_MEM_RECLAIM, 1, (name))
>  #define create_freezable_workqueue(name)                                \
> -        alloc_workqueue("%s", WQ_FREEZABLE | WQ_UNBOUND | WQ_MEM_RECLAIM, \
> -                        1, (name))
> +        alloc_workqueue("%s", __WQ_LEGACY | WQ_FREEZABLE | WQ_UNBOUND | \
> +                        WQ_MEM_RECLAIM, 1, (name))
>  #define create_singlethread_workqueue(name)                             \
> -        alloc_ordered_workqueue("%s", WQ_MEM_RECLAIM, name)
> +        alloc_ordered_workqueue("%s", __WQ_LEGACY | WQ_MEM_RECLAIM, name)
>
>  extern void destroy_workqueue(struct workqueue_struct *wq);
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 61a0264..dc7faad 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2355,7 +2355,8 @@ static void check_flush_dependency(struct workqueue_struct *target_wq,
>          WARN_ONCE(current->flags & PF_MEMALLOC,
>                    "workqueue: PF_MEMALLOC task %d(%s) is flushing !WQ_MEM_RECLAIM %s:%pf",
>                    current->pid, current->comm, target_wq->name, target_func);
> -        WARN_ONCE(worker && (worker->current_pwq->wq->flags & WQ_MEM_RECLAIM),
> +        WARN_ONCE(worker && ((worker->current_pwq->wq->flags &
> +                              (WQ_MEM_RECLAIM | __WQ_LEGACY)) == WQ_MEM_RECLAIM),
>                    "workqueue: WQ_MEM_RECLAIM %s:%pf is flushing !WQ_MEM_RECLAIM %s:%pf",
>                    worker->current_pwq->wq->name, worker->current_func,
>                    target_wq->name, target_func);
>

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
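
For anyone reading the kernel/workqueue.c hunk quoted above: as far as I
can tell, the new condition fires only when the flushing workqueue has
WQ_MEM_RECLAIM set and __WQ_LEGACY clear. Below is a minimal userspace
sketch of just that mask test, not the kernel code itself. The DEMO_*
names, both helper functions, and the bit position used for the
WQ_MEM_RECLAIM stand-in are assumptions made for illustration; only the
1 << 18 value for the legacy bit is taken from the patch. The
"worker &&" part of the real condition is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel flags; these names are hypothetical. */
enum {
	DEMO_WQ_MEM_RECLAIM = 1 << 3,	/* assumed bit position, illustration only */
	DEMO_WQ_LEGACY      = 1 << 18,	/* same bit the patch assigns to __WQ_LEGACY */
};

/*
 * Hypothetical helper mirroring the patched mask test: mask out
 * everything except the two bits of interest, then require that only
 * the MEM_RECLAIM bit survives. A legacy workqueue carries both bits,
 * so the comparison fails and no warning would be raised for it.
 */
bool demo_flusher_should_warn(unsigned int flusher_flags)
{
	return (flusher_flags & (DEMO_WQ_MEM_RECLAIM | DEMO_WQ_LEGACY)) ==
	       DEMO_WQ_MEM_RECLAIM;
}

/* Walk through the four combinations of the two bits. */
void demo_self_check(void)
{
	/* Explicit WQ_MEM_RECLAIM, not legacy: still warns. */
	assert(demo_flusher_should_warn(DEMO_WQ_MEM_RECLAIM));
	/* create*_workqueue() style flags: warning suppressed. */
	assert(!demo_flusher_should_warn(DEMO_WQ_MEM_RECLAIM | DEMO_WQ_LEGACY));
	/* No WQ_MEM_RECLAIM at all: never warned, before or after. */
	assert(!demo_flusher_should_warn(0));
	assert(!demo_flusher_should_warn(DEMO_WQ_LEGACY));
}
```

Comparing against the single bit, rather than testing each flag
separately, keeps the check to one masked comparison, which appears to
be why the patch writes it that way.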