From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756147AbbJVFfV (ORCPT ); Thu, 22 Oct 2015 01:35:21 -0400 Received: from blu004-omc1s28.hotmail.com ([65.55.116.39]:53574 "EHLO BLU004-OMC1S28.hotmail.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751181AbbJVFfT (ORCPT ); Thu, 22 Oct 2015 01:35:19 -0400 X-TMN: [n97lzAQgzp4N7lkXL5NA/oC+N2eV/6hk] X-Originating-Email: [wanpeng.li@hotmail.com] Message-ID: Subject: Re: lockdep-related warning in kernel/sched/deadline.c::find_lock_later_rq() To: Luca Abeni , linux-kernel@vger.kernel.org References: <56279107.3010602@unitn.it> CC: Peter Zijlstra , Ingo Molnar , Juri Lelli From: Wanpeng Li Date: Thu, 22 Oct 2015 13:35:11 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 In-Reply-To: <56279107.3010602@unitn.it> Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 22 Oct 2015 05:35:17.0352 (UTC) FILETIME=[6F0C4E80:01D10C8B] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 10/21/15 9:20 PM, Luca Abeni wrote: > Hi all, > > after fixing task migrations for SCHED_DEADLINE, I started to see some > lockdep-related warnings that look like this: > [ 794.428081] WARNING: CPU: 1 PID: 0 at > /home/luca/Src/GRUB/linux-reclaiming/kernel/locking/lockdep.c:3407 > lock_release+0x3f4/0x440() > [ 794.428439] releasing a pinned lock > [ 794.428439] Modules linked in: > [ 794.428439] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.3.0-rc5+ #32 > [ 794.428439] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), > BIOS Bochs 01/01/2011 > [ 794.428439] ffffffff81c946d0 ffff880007b03cc8 ffffffff8135abd2 > ffff880007b03d10 > [ 794.428439] ffff880007b03d00 ffffffff810546f1 ffff880007b159d8 > ffffffff81096663 > [ 794.428439] ffff88000754c558 0000000000000001 ffff88000754bd80 > ffff880007b03d60 > [ 794.428439] Call Trace: > [ 794.428439] [] dump_stack+0x4b/0x69 > [ 794.428439] [] warn_slowpath_common+0x81/0xc0 > [ 794.428439] [] ? find_lock_later_rq+0x103/0x180 > [ 794.428439] [] warn_slowpath_fmt+0x47/0x50 > [ 794.428439] [] lock_release+0x3f4/0x440 > [ 794.428439] [] _raw_spin_unlock+0x1a/0x30 > [ 794.428439] [] find_lock_later_rq+0x103/0x180 > [ 794.428439] [] push_dl_task.part.33+0x61/0x190 > [ 794.428439] [] dl_task_timer+0x194/0x2e0 > [ 794.428439] [] ? task_woken_dl+0x60/0x60 > [ 794.428439] [] __hrtimer_run_queues+0x107/0x470 > [ 794.428439] [] ? hrtimer_interrupt+0x82/0x1b0 > [ 794.428439] [] hrtimer_interrupt+0xa6/0x1b0 > [ 794.428439] [] local_apic_timer_interrupt+0x30/0x60 > [ 794.428439] [] smp_apic_timer_interrupt+0x38/0x50 > [ 794.428439] [] apic_timer_interrupt+0x84/0x90 > [ 794.428439] [] ? default_idle+0x1b/0x140 > [ 794.428439] [] ? default_idle+0x19/0x140 > [ 794.428439] [] arch_cpu_idle+0xa/0x10 > [ 794.428439] [] default_idle_call+0x25/0x40 > [ 794.428439] [] cpu_startup_entry+0x310/0x390 > [ 794.428439] [] start_secondary+0xf3/0x100 > [ 794.428439] ---[ end trace 1d35bf076299f814 ]--- > (this happended on KVM, but I can trigger similar warnings on real > hardware too). > > This was not happening before my last patch, because push_dl_task() > never migrated > any task, and always returned before calling find_lock_later_rq()... > > Now, if I understand correctly the issue is that dl_task_timer() does: > rq = task_rq_lock(p, &flags); > [...] > if (has_pushable_dl_tasks(rq)) > push_dl_task(rq); > with task_rq_lock() that pins rq->lock and push_tl_task() that invokes > find_lock_later_rq() that unlocks rq->lock() while it is pinned. > > I am not sure about how to fix this issue: as a first try, I did > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c > index 142df26..5b1ba95 100644 > --- a/kernel/sched/deadline.c > +++ b/kernel/sched/deadline.c > @@ -668,8 +668,11 @@ static enum hrtimer_restart dl_task_timer(struct > hrtimer *timer) > * Queueing this task back might have overloaded rq, check if we > need > * to kick someone away. > */ > - if (has_pushable_dl_tasks(rq)) > + if (has_pushable_dl_tasks(rq)) { > + lockdep_unpin_lock(&rq->lock); > push_dl_task(rq); > + lockdep_pin_lock(&rq->lock); > + } > #endif > > unlock: > > This removes the warning, but I am not sure if it is the correct fix > (is it > valid to unpin rq->lock, here?). > > If someone can confirm that this is the correct approach, I'll test > the patch a > little bit more and then I'll send a properly signed-off patch; > otherwise, if > someone can suggest the correct approach I'll try it. wake_up_new_task() -> __task_rq_lock() -> task_woken() -> push_dl/rt_tasks() -> push_dl/rt_task() I think you also should consider the lockdep pin_lock in this path. Regards, Wanpeng Li