From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S941210AbcJYS2A (ORCPT ); Tue, 25 Oct 2016 14:28:00 -0400
Received: from mail-wm0-f50.google.com ([74.125.82.50]:38779 "EHLO
	mail-wm0-f50.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S932543AbcJYS17 (ORCPT );
	Tue, 25 Oct 2016 14:27:59 -0400
From: Roman Pen
Cc: Roman Pen , Andy Lutomirski , Oleg Nesterov , Peter Zijlstra ,
	Thomas Gleixner , Ingo Molnar , Tejun Heo ,
	linux-kernel@vger.kernel.org
Subject: [PATCH v4 1/1] workqueue: ignore dead tasks in a workqueue sleep hook
Date: Tue, 25 Oct 2016 20:27:42 +0200
Message-Id: <20161025182742.10486-1-roman.penyaev@profitbricks.com>
X-Mailer: git-send-email 2.9.3
To: unlisted-recipients:; (no To-header on input)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

If panic_on_oops is not set and an oops happens inside a workqueue
kthread, the kernel kills this kthread.  This patch fixes a recursive
GPF which happens when wq_worker_sleeping() unconditionally accesses
the NULL kthread->vfork_done pointer through kthread_data() ->
to_kthread().

The stack is the following:

 [] dump_stack+0x68/0x93
 [] ? do_exit+0x7ab/0xc10
 [] __schedule_bug+0x83/0xe0
 [] __schedule+0x7ea/0xba0
 [] ? vprintk_default+0x1f/0x30
 [] ? printk+0x48/0x50
 [] schedule+0x40/0x90
 [] do_exit+0x9ca/0xc10
 [] ? kmsg_dump+0x11d/0x190
 [] ? kmsg_dump+0x17/0x190
 [] oops_end+0x99/0xd0
 [] no_context+0x185/0x3e0
 [] __bad_area_nosemaphore+0x83/0x1c0
 [] ? vprintk_emit+0x25e/0x530
 [] bad_area_nosemaphore+0x14/0x20
 [] __do_page_fault+0xac/0x570
 [] ? console_trylock+0x1e/0xe0
 [] ? trace_hardirqs_off_thunk+0x1a/0x1c
 [] do_page_fault+0xc/0x10
 [] page_fault+0x22/0x30
 [] ? kthread_data+0x33/0x40
 [] ? wq_worker_sleeping+0xe/0x80
 [] __schedule+0x47b/0xba0
 [] schedule+0x40/0x90
 [] do_exit+0x7dd/0xc10
 [] oops_end+0x99/0xd0

kthread->vfork_done is zeroed out on the following path:

    do_exit()
      exit_mm()
        mm_release()
          complete_vfork_done()

To fix the bug, dead tasks must be ignored.

Signed-off-by: Roman Pen
Cc: Andy Lutomirski
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Tejun Heo
Cc: linux-kernel@vger.kernel.org
---
v4:
  o instead of the TASK_DEAD state, use the more generic PF_EXITING flag.
  o the same dead task check should also be done for
    wq_worker_waking_up().  With this we try to avoid the case where we
    are scheduled back to a task which was just in do_exit() and has
    set the PF_EXITING flag.

v3:
  o minor comment and coding style fixes.

v2:
  o put a task->state check directly into wq_worker_sleeping() instead
    of changing __schedule().

 kernel/workqueue.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 9dc7ac5101e0..23f2d764cebf 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -851,7 +851,17 @@ static void wake_up_worker(struct worker_pool *pool)
  */
 void wq_worker_waking_up(struct task_struct *task, int cpu)
 {
-	struct worker *worker = kthread_data(task);
+	struct worker *worker;
+
+	if (task->flags & PF_EXITING) {
+		/*
+		 * Careful here, t->vfork_done is zeroed out for
+		 * almost dead tasks, do not touch kthread_data().
+		 */
+		return;
+	}
+
+	worker = kthread_data(task);
 
 	if (!(worker->flags & WORKER_NOT_RUNNING)) {
 		WARN_ON_ONCE(worker->pool->cpu != cpu);
@@ -875,9 +885,19 @@ void wq_worker_waking_up(struct task_struct *task, int cpu)
  */
 struct task_struct *wq_worker_sleeping(struct task_struct *task)
 {
-	struct worker *worker = kthread_data(task), *to_wakeup = NULL;
+	struct worker *worker, *to_wakeup = NULL;
 	struct worker_pool *pool;
 
+	if (task->flags & PF_EXITING) {
+		/*
+		 * Careful here, t->vfork_done is zeroed out for
+		 * almost dead tasks, do not touch kthread_data().
+		 */
+		return NULL;
+	}
+
+	worker = kthread_data(task);
+
 	/*
 	 * Rescuers, which may not have all the fields set up like normal
 	 * workers, also reach here, let's not access anything before
-- 
2.9.3
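
For readers who want to see the guard in isolation, here is a minimal
user-space sketch of the pattern both hunks apply: test the exiting flag
first, and only then follow the pointer that the exit path may already
have cleared.  Every name prefixed with model_ or MODEL_ is hypothetical
and exists only for this illustration; the structures are simplified
stand-ins, not the kernel's definitions, and model_kthread_data() only
approximates how kthread_data() reaches its data through vfork_done.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag bit; a stand-in for the kernel's PF_EXITING. */
#define MODEL_PF_EXITING 0x4u

/* Hypothetical stand-in for struct worker. */
struct model_worker {
	int flags;
};

/* Hypothetical stand-in for the per-kthread bookkeeping structure. */
struct model_kthread {
	void *data;
};

/* Hypothetical stand-in for struct task_struct. */
struct model_task {
	unsigned int flags;
	struct model_kthread *vfork_done; /* cleared on the exit path */
};

/*
 * Models kthread_data(): it dereferences through vfork_done without
 * any check, so calling it on an exiting task would fault.
 */
void *model_kthread_data(const struct model_task *task)
{
	return task->vfork_done->data;
}

/*
 * Models the guarded lookup from the patch: bail out for an exiting
 * task before touching model_kthread_data().
 */
struct model_worker *model_worker_of(const struct model_task *task)
{
	if (task->flags & MODEL_PF_EXITING)
		return NULL;

	return model_kthread_data(task);
}

/* Small self-check: returns 0 when both cases behave as described. */
int model_selfcheck(void)
{
	struct model_worker w = { 0 };
	struct model_kthread k = { &w };
	struct model_task live = { 0, &k };
	struct model_task dying = { MODEL_PF_EXITING, NULL };

	assert(model_worker_of(&live) == &w);
	assert(model_worker_of(&dying) == NULL);
	return 0;
}
```

Calling model_selfcheck() from a test harness exercises both the live
and the exiting case.  The ordering is the point of the sketch: without
the flag test, the lookup for the exiting task would dereference a NULL
pointer, which in this simplified model plays the role of the fault in
kthread_data() seen in the trace above.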