From: "Paul E. McKenney"
To: "Eric W. Biederman"
Cc: Sebastian Andrzej Siewior, linux-kernel@vger.kernel.org, tglx@linutronix.de,
    Anna-Maria Gleixner
Subject: Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore
Date: Fri, 4 May 2018 12:45:10 -0700
Reply-To: paulmck@linux.vnet.ibm.com
Message-Id: <20180504194510.GW26088@linux.vnet.ibm.com>
In-Reply-To: <87sh77b5w7.fsf@xmission.com>
References: <20180504144014.5378-1-bigeasy@linutronix.de>
 <87fu37e4rn.fsf@xmission.com>
 <20180504170459.624e7vkro5fn2bgw@linutronix.de>
 <876043e3xb.fsf@xmission.com>
 <20180504175241.GT26088@linux.vnet.ibm.com>
 <87sh77b5w7.fsf@xmission.com>

On Fri, May 04, 2018 at 02:03:04PM -0500, Eric W. Biederman wrote:
> "Paul E. McKenney" writes:
> 
> > On Fri, May 04, 2018 at 12:17:20PM -0500, Eric W. Biederman wrote:
> >> Sebastian Andrzej Siewior writes:
> >>
> >> > On 2018-05-04 11:59:08 [-0500], Eric W. Biederman wrote:
> >> >> Sebastian Andrzej Siewior writes:
> >> >> > From: Anna-Maria Gleixner
> >> > …
> >> >> > This long-term fix has been made in commit 4abf91047cf ("rtmutex: Make
> >> >> > wait_lock irq safe") for different reason.
> >> >>
> >> >> Which tree has this change been made in?  I am not finding the commit
> >> >> you mention above in Linus's tree.
> >> >
> >> > I'm sorry, it should have been commit b4abf91047cf ("rtmutex: Make
> >> > wait_lock irq safe").
> >>
> >> Can you fix that in your patch description, and can you also fix up the
> >> description of rcu_read_unlock?
> >>
> >> If we don't need to jump through hoops it looks very reasonable to
> >> remove this unnecessary logic.  But we should fix the description
> >> in rcu_read_unlock that still says we need these hoops.
> >
> > The hoops are still required for rcu_read_lock(), otherwise you
> > get deadlocks between the scheduler and RCU in PREEMPT=y kernels.
> > What happens with this patch (if I understand it correctly) is that the
> > signal code now uses a different way of jumping through the hoops.
> > But the hoops are still jumped through.
> 
> The patch changes:
> 
> local_irq_disable();
> rcu_read_lock();
> spin_lock();
> rcu_read_unlock();
> 
> to:
> 
> rcu_read_lock();
> spin_lock_irq();
> rcu_read_unlock();
> 
> Now that I have a chance to reflect on it, the fact that the pattern
> that is being restored does not work is scary.  As the failure has
> nothing to do with lock ordering, people won't realize what is going
> on.  Especially since the common rcu modes won't care.
> 
> So is it true that taking spin_lock_irq before calling rcu_read_unlock
> is a problem because of rt_mutex_unlock()?  Or has b4abf91047cf
> ("rtmutex: Make wait_lock irq safe") actually fixed that, so that we
> can correct the documentation of rcu_read_unlock()?  And fix
> __lock_task_sighand?

The problem is that the lock being acquired might be one of the
scheduler's locks, or a lock that is taken while the scheduler's pi and
rq locks are held.  This occurs only with RCU-preempt.

Here is what can happen:

o	A task does rcu_read_lock().

o	That task is preempted.

o	That task stays preempted for a long time, and is therefore
	priority boosted.  This boosting involves a high-priority RCU
	kthread creating an rt_mutex, pretending that the preempted task
	already holds it, and then acquiring it.

o	The task awakens, acquires the scheduler's rq lock, and then
	does rcu_read_unlock().

o	Because the task has been priority boosted, __rcu_read_unlock()
	invokes the rcu_read_unlock_special() slowpath, which does (as
	you say) rt_mutex_unlock() to deboost.  The deboosting can cause
	the scheduler to acquire the rq and pi locks, which results in
	deadlock.

In contrast, holding these scheduler locks across the entirety of the
RCU-preempt read-side critical section is harmless, because then the
critical section cannot be preempted, which means that priority boosting
cannot happen, which in turn means that there is no need to deboost at
rcu_read_unlock() time.

This restriction has not changed, and as far as I can see it is inherent
in the fact that RCU uses the scheduler and the scheduler uses RCU.
There is going to be an odd corner case in there somewhere!

							Thanx, Paul
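
To make the ordering concrete, here is a condensed sketch of the failing
sequence above and of the safe variant.  In the sketch, rq->lock stands
in for any scheduler lock (or for any lock acquired while the scheduler's
pi and rq locks are held), the ellipses mark arbitrary read-side code,
and the comments simply restate the steps listed above.

	rcu_read_lock();            /* Task is preempted here, stays preempted
				     * long enough to be priority boosted via
				     * the rt_mutex set up by the RCU boost
				     * kthread. */
	...
	raw_spin_lock(&rq->lock);   /* Task resumes and acquires a scheduler
				     * lock. */
	rcu_read_unlock();          /* Task was boosted, so the
				     * rcu_read_unlock_special() slowpath calls
				     * rt_mutex_unlock() to deboost, which can
				     * try to acquire the rq and pi locks that
				     * are already held here: deadlock. */
	raw_spin_unlock(&rq->lock);

	/* Safe variant: hold the scheduler lock across the entire read-side
	 * critical section.  The section then cannot be preempted, so it is
	 * never boosted and rcu_read_unlock() never needs to deboost. */
	raw_spin_lock(&rq->lock);
	rcu_read_lock();
	...
	rcu_read_unlock();
	raw_spin_unlock(&rq->lock);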
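
And to tie this back to the patch that started the thread: the
before/after pattern Eric quotes corresponds roughly to
__lock_task_sighand() in kernel/signal.c.  A much-simplified sketch, with
the NULL ->sighand check and the retry loop of the real function omitted,
might look like this:

	/* Before: interrupts are disabled before entering the RCU read-side
	 * critical section, so the section can be neither preempted nor
	 * boosted, and rcu_read_unlock() never hits the deboost slowpath. */
	local_irq_save(*flags);
	rcu_read_lock();
	sighand = rcu_dereference(tsk->sighand);
	spin_lock(&sighand->siglock);
	rcu_read_unlock();

	/* After (roughly what the patch amounts to): the critical section
	 * starts out preemptible and interrupts are disabled only when
	 * siglock is taken, so a task that was preempted and boosted can
	 * reach rcu_read_unlock() with siglock held and interrupts off. */
	rcu_read_lock();
	sighand = rcu_dereference(tsk->sighand);
	spin_lock_irqsave(&sighand->siglock, *flags);
	rcu_read_unlock();

Note that siglock is not one of the scheduler's pi/rq locks; the sketch
is only meant to show where rcu_read_unlock() ends up running with
interrupts disabled, which is the ordering Eric's question about
b4abf91047cf is concerned with.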