From: ebiederm@xmission.com (Eric W. Biederman)
To: "Paul E. McKenney"
Cc: Sebastian Andrzej Siewior, linux-kernel@vger.kernel.org, tglx@linutronix.de, Anna-Maria Gleixner
Subject: Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore
Date: Fri, 04 May 2018 23:38:37 -0500
Message-ID: <87k1siaf8y.fsf@xmission.com>
In-Reply-To: <20180504203713.GY26088@linux.vnet.ibm.com> (Paul E. McKenney's message of "Fri, 4 May 2018 13:37:13 -0700")
References: <20180504144014.5378-1-bigeasy@linutronix.de> <87fu37e4rn.fsf@xmission.com> <20180504170459.624e7vkro5fn2bgw@linutronix.de> <876043e3xb.fsf@xmission.com> <20180504175241.GT26088@linux.vnet.ibm.com> <87sh77b5w7.fsf@xmission.com> <20180504194510.GW26088@linux.vnet.ibm.com> <87vac39oaf.fsf@xmission.com> <20180504203713.GY26088@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

"Paul E. McKenney" writes:

> On Fri, May 04, 2018 at 03:08:40PM -0500, Eric W. Biederman wrote:
>> "Paul E. McKenney" writes:
>>
>> > On Fri, May 04, 2018 at 02:03:04PM -0500, Eric W. Biederman wrote:
>> >> "Paul E. McKenney" writes:
>> >>
>> >> > On Fri, May 04, 2018 at 12:17:20PM -0500, Eric W. Biederman wrote:
>> >> >> Sebastian Andrzej Siewior writes:
>> >> >>
>> >> >> > On 2018-05-04 11:59:08 [-0500], Eric W. Biederman wrote:
>> >> >> >> Sebastian Andrzej Siewior writes:
>> >> >> >> > From: Anna-Maria Gleixner
>> >> >> > …
>> >> >> >> > This long-term fix has been made in commit 4abf91047cf ("rtmutex: Make
>> >> >> >> > wait_lock irq safe") for a different reason.
>> >> >> >>
>> >> >> >> Which tree has this change been made in?  I am not finding the commit
>> >> >> >> you mention above in Linus's tree.
>> >> >> >
>> >> >> > I'm sorry, it should have been commit b4abf91047cf ("rtmutex: Make
>> >> >> > wait_lock irq safe").
>> >> >>
>> >> >> Can you fix that in your patch description and can you also fix up the
>> >> >> description of rcu_read_unlock?
>> >> >>
>> >> >> If we don't need to jump through hoops it looks very reasonable to
>> >> >> remove this unnecessary logic.  But we should fix the description
>> >> >> in rcu_read_unlock that still says we need these hoops.
>> >> >
>> >> > The hoops are still required for rcu_read_lock(), otherwise you
>> >> > get deadlocks between the scheduler and RCU in PREEMPT=y kernels.
>> >> > What happens with this patch (if I understand it correctly) is that the
>> >> > signal code now uses a different way of jumping through the hoops.
>> >> > But the hoops are still jumped through.
>> >>
>> >> The patch changes:
>> >>
>> >>         local_irq_disable();
>> >>         rcu_read_lock();
>> >>         spin_lock();
>> >>         rcu_read_unlock();
>> >>
>> >> to:
>> >>
>> >>         rcu_read_lock();
>> >>         spin_lock_irq();
>> >>         rcu_read_unlock();
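>> >>
>> >> For concreteness, with that pattern __lock_task_sighand would look
>> >> roughly like the sketch below (my paraphrase of the pattern, not the
>> >> patch itself; the comment about SLAB_TYPESAFE_BY_RCU is taken from the
>> >> existing code):
>> >>
>> >>         struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
>> >>                                                    unsigned long *flags)
>> >>         {
>> >>                 struct sighand_struct *sighand;
>> >>
>> >>                 /* Sketch only: approximates the post-patch locking. */
>> >>                 rcu_read_lock();
>> >>                 for (;;) {
>> >>                         sighand = rcu_dereference(tsk->sighand);
>> >>                         if (unlikely(!sighand))
>> >>                                 break;
>> >>                         /*
>> >>                          * ->siglock stays usable even if sighand was
>> >>                          * freed and reused, courtesy of
>> >>                          * SLAB_TYPESAFE_BY_RCU and sighand_ctor().
>> >>                          */
>> >>                         spin_lock_irqsave(&sighand->siglock, *flags);
>> >>                         if (likely(sighand == tsk->sighand))
>> >>                                 break;
>> >>                         /* Raced with de_thread()/__exit_signal(); retry. */
>> >>                         spin_unlock_irqrestore(&sighand->siglock, *flags);
>> >>                 }
>> >>                 rcu_read_unlock();
>> >>
>> >>                 return sighand;
>> >>         }
>> >>
>> >> Note that on the success path rcu_read_unlock() now runs with ->siglock
>> >> held and interrupts disabled, which is exactly the ordering in question.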
>> >>
>> >> Now that I have a chance to reflect on it, the fact that the pattern
>> >> that is being restored does not work is scary, as the failure has
>> >> nothing to do with lock ordering and people won't realize what is
>> >> going on.  Especially since the common rcu modes won't care.
>> >>
>> >> So is it true that taking spin_lock_irq before calling rcu_read_unlock
>> >> is a problem because of rt_mutex_unlock()?  Or has b4abf91047cf
>> >> ("rtmutex: Make wait_lock irq safe") actually fixed that so we can
>> >> correct the documentation of rcu_read_unlock()?  And fix
>> >> __lock_task_sighand?
>> >
>> > The problem is that the thing taking the lock might be the scheduler,
>> > or one of the locks taken while the scheduler's pi and rq locks are
>> > held.  This occurs only with RCU-preempt.
>> >
>> > Here is what can happen:
>> >
>> > o   A task does rcu_read_lock().
>> >
>> > o   That task is preempted.
>> >
>> > o   That task stays preempted for a long time, and is therefore
>> >     priority boosted.  This boosting involves a high-priority RCU
>> >     kthread creating an rt_mutex, pretending that the preempted task
>> >     already holds it, and then acquiring it.
>> >
>> > o   The task awakens, acquires the scheduler's rq lock, and
>> >     then does rcu_read_unlock().
>> >
>> > o   Because the task has been priority boosted, __rcu_read_unlock()
>> >     invokes the rcu_read_unlock_special() slowpath, which does
>> >     (as you say) rt_mutex_unlock() to deboost.  The deboosting
>> >     can cause the scheduler to acquire the rq and pi locks, which
>> >     results in deadlock.
>> >
>> > In contrast, holding these scheduler locks across the entirety of the
>> > RCU-preempt read-side critical section is harmless because then the
>> > critical section cannot be preempted, which means that priority boosting
>> > cannot happen, which means that there will be no need to deboost at
>> > rcu_read_unlock() time.
>> >
>> > This restriction has not changed, and as far as I can see is inherent
>> > in the fact that RCU uses the scheduler and the scheduler uses RCU.
>> > There is going to be an odd corner case in there somewhere!
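>>
>> If I restate your sequence as code, the ordering that can deadlock is
>> roughly the following (illustration only; rq->lock here stands for the
>> scheduler's runqueue lock, or any lock acquired while the rq/pi locks
>> are held):
>>
>>         rcu_read_lock();
>>         /* Preempted for a long time here, so rcu_boost() boosts us. */
>>         raw_spin_lock(&rq->lock);   /* e.g. from within the scheduler */
>>         rcu_read_unlock();          /* boosted: rcu_read_unlock_special()  */
>>                                     /* calls rt_mutex_unlock() to deboost, */
>>                                     /* which wants the rq/pi locks: deadlock */
>>         raw_spin_unlock(&rq->lock);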
>>
>> However if I read things correctly b4abf91047cf ("rtmutex: Make
>> wait_lock irq safe") did change this.
>>
>> In particular it changed things so that it is only the scheduler locks
>> that matter, not any old lock that disabled interrupts.  This was done
>> by disabling interrupts when taking the wait_lock.
>>
>> The rcu_read_unlock documentation states:
>>
>>  * In most situations, rcu_read_unlock() is immune from deadlock.
>>  * However, in kernels built with CONFIG_RCU_BOOST, rcu_read_unlock()
>>  * is responsible for deboosting, which it does via rt_mutex_unlock().
>>  * Unfortunately, this function acquires the scheduler's runqueue and
>>  * priority-inheritance spinlocks.  This means that deadlock could result
>>  * if the caller of rcu_read_unlock() already holds one of these locks or
>>  * any lock that is ever acquired while holding them; or any lock which
>>  * can be taken from interrupt context because rcu_boost()->rt_mutex_lock()
>>  * does not disable irqs while taking ->wait_lock.
>>
>> So we can now remove the clause:
>>
>>  * ; or any lock which
>>  * can be taken from interrupt context because rcu_boost()->rt_mutex_lock()
>>  * does not disable irqs while taking ->wait_lock.
>>
>> Without the "any lock that disabled interrupts" restriction it is now
>> safe to not worry about the issues with the scheduler locks and the
>> rt_mutex, which makes it safe to not worry about these crazy
>> complexities in lock_task_sighand.
>>
>> Paul do you agree or is the patch unsafe?
>
> Ah, I thought you were trying to get rid of all but the first line of
> that paragraph, not just the final clause.  Apologies for my confusion!
>
> It looks plausible, but the patch should be stress-tested on a preemptible
> kernel with priority boosting enabled.  Has that been done?
>
> (Me, I would run rcutorture scenario TREE03 for an extended time period
> on b4abf91047cf with your patch applied.  But what testing have you
> done already?)

Not my patch.  I was just the reviewer asking for some obvious cleanups,
so people won't get confused.

Sebastian?  Can you tell Paul what testing you have done?

Eric