From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1759415AbcCVLlc (ORCPT ); Tue, 22 Mar 2016 07:41:32 -0400 Received: from e06smtp06.uk.ibm.com ([195.75.94.102]:59724 "EHLO e06smtp06.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758872AbcCVLlS (ORCPT ); Tue, 22 Mar 2016 07:41:18 -0400 X-IBM-Helo: d06dlp03.portsmouth.uk.ibm.com X-IBM-MailFrom: heiko.carstens@de.ibm.com X-IBM-RcptTo: linux-kernel@vger.kernel.org Date: Tue, 22 Mar 2016 12:32:21 +0100 From: Heiko Carstens To: Peter Zijlstra Cc: Davidlohr Bueso , tglx@linutronix.de, mingo@kernel.org, bigeasy@linutronix.de, umgwanakikbuti@gmail.com, paulmck@linux.vnet.ibm.com, linux-kernel@vger.kernel.org, kmo@daterainc.com Subject: Re: [PATCH 4/3] rtmutex: Avoid barrier in rt_mutex_handle_deadlock Message-ID: <20160322113221.GA3921@osiris> References: <1457461223-4301-1-git-send-email-dave@stgolabs.net> <20160308220539.GB4404@linux-uzut.site> <20160314134038.GZ6356@twins.programming.kicks-ass.net> <20160321181622.GB32012@linux-uzut.site> <20160322102153.GL6344@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160322102153.GL6344@twins.programming.kicks-ass.net> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16032211-0025-0000-0000-000008E9093A Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Mar 22, 2016 at 11:21:53AM +0100, Peter Zijlstra wrote: > On Mon, Mar 21, 2016 at 11:16:22AM -0700, Davidlohr Bueso wrote: > > > +/* > > + * Helpers for modifying the state of either the current task, or a foreign > > + * task. Each of these calls come in both full barrier and weak flavors: > > + * > > + * Weak > > + * set_task_state() __set_task_state() > > + * set_current_state() __set_current_state() > > + * > > + * Where set_current_state() and set_task_state() includes a full smp barrier > > + * -after- the write of ->state is correctly serialized with the later test > > + * of whether to actually sleep: > > + * > > + * for (;;) { > > + * set_current_state(TASK_UNINTERRUPTIBLE); > > + * if (event_indicated) > > + * break; > > + * schedule(); > > + * } > > + * > > + * This is commonly necessary for processes sleeping and waking through flag > > + * based events. If the caller does not need such serialization, then use > > + * weaker counterparts, which simply writes the state. > > + * > > + * Refer to Documentation/memory-barriers.txt > > + */ > > I would prefer to pretend set_task_state() does not exist, using it on > anything other than task==current is very very tricky. > > With the below patch; we're only left with: > > arch/s390/mm/fault.c: __set_task_state(tsk, TASK_UNINTERRUPTIBLE); > arch/s390/mm/fault.c: __set_task_state(tsk, TASK_UNINTERRUPTIBLE); > drivers/md/bcache/btree.c: set_task_state(c->gc_thread, TASK_INTERRUPTIBLE); > kernel/exit.c: set_task_state(tsk, TASK_UNINTERRUPTIBLE); > kernel/exit.c: __set_task_state(tsk, TASK_RUNNING); > > exit most probably also has tsk==current, but I didn't check. > > bacache seems to rely on the fact that the task is not running after > kthread_create() to change the state. But I've no idea why; the only > think I can come up with is because load accounting, a new thread blocks > in UNINTERRUPTIBLE which adds to load. But by setting it to > INTERRUPTIBLE before waking up it can actually mess that up. This really > should be fixed. > > And s390 does something entirely vile, no idea what. For the two s390 usages tsk equals current. So it could be easily replaced with set_current_state().