From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752760AbcKIIry (ORCPT ); Wed, 9 Nov 2016 03:47:54 -0500 Received: from Galois.linutronix.de ([146.0.238.70]:57060 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752570AbcKIIrx (ORCPT ); Wed, 9 Nov 2016 03:47:53 -0500 Date: Wed, 9 Nov 2016 09:45:08 +0100 (CET) From: Thomas Gleixner To: Oleg Nesterov cc: Andy Lutomirski , Roman Pen , Andy Lutomirski , Peter Zijlstra , Ingo Molnar , Tejun Heo , "linux-kernel@vger.kernel.org" , Chunming Zhou , Alex Deucher Subject: Re: [PATCH 2/2] kthread: don't use to_live_kthread() in kthread_park() and kthread_unpark() In-Reply-To: <20161031200823.GC19430@redhat.com> Message-ID: References: <20161025110508.9052-1-roman.penyaev@profitbricks.com> <20161025140333.GB4326@redhat.com> <20161025154301.GA12015@redhat.com> <20161026141359.GA6893@redhat.com> <20161026155155.GA28832@redhat.com> <20161028161106.GA8933@redhat.com> <20161031200729.GA19430@redhat.com> <20161031200823.GC19430@redhat.com> User-Agent: Alpine 2.20 (DEB 67 2015-01-07) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, 31 Oct 2016, Oleg Nesterov wrote: > I think we need to unexport kthread_park/unpark, and either make it return > "void" or actually fix the race with kthred_stop/exit. This patch just adds > WARN_ON(PF_EXITING) for now. I'll have a look. > The usage of kthread_park() in cpuhp code (cpu.c, smpboot.c, stop_machine.c) > is fine. It can never see an exiting/exited kthread, smpboot_destroy_threads() > clears *ht->store, smpboot_park_thread() checks it is not NULL under the same > smpboot_threads_lock. cpuhp_threads and cpu_stop_threads never exit, so other > callers are fine too. > > But it has two more users: > > - watchdog_park_threads() and it does not look nice. The code is actually > correct, get_online_cpus() ensures that kthread_park() can't race with > itself (note that kthread_park() can't handle this race correctly), but > imo it should not use kthread_park() directly. Should we provide an interface through the smpboot thread infrastructure for this? > - drivers/gpu/drm/amd/scheduler/gpu_scheduler.c and I think it should not > use kthread_park() too. > > But this patch should not break this code. kthread_park() must not be > called after amd_sched_fini() which does kthread_stop(), otherwise even > to_live_kthread() is not safe because task_struct can be already freed > and sched->thread can point to nowhere. Right. That's why the smpboot thread code holds a task ref which is only released after the thread has been stopped. I can see why that gpu driver wants to use the park mechanism and I guess there are other legitimate use cases as well. I prefer to implement a park/unpark variant which is safe to use on arbitrary kthreads over forcing driver writers to come up with even more broken open coded implementations of that. > Signed-off-by: Oleg Nesterov Reviewed-by: Thomas Gleixner