From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754708Ab2DSD0P (ORCPT ); Wed, 18 Apr 2012 23:26:15 -0400 Received: from wolverine01.qualcomm.com ([199.106.114.254]:26284 "EHLO wolverine01.qualcomm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751180Ab2DSD0L (ORCPT ); Wed, 18 Apr 2012 23:26:11 -0400 X-IronPort-AV: E=McAfee;i="5400,1158,6685"; a="183001721" From: Stephen Boyd To: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org, Tejun Heo , Ben Dooks Subject: [PATCH 2/2] ks8851: Fix mutex deadlock in ks8851_net_stop() Date: Wed, 18 Apr 2012 20:25:58 -0700 Message-Id: <1334805958-29119-2-git-send-email-sboyd@codeaurora.org> X-Mailer: git-send-email 1.7.10.207.g0bb2e In-Reply-To: <1334805958-29119-1-git-send-email-sboyd@codeaurora.org> References: <1334805958-29119-1-git-send-email-sboyd@codeaurora.org> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org There is a potential deadlock scenario when the ks8851 driver is removed. The interrupt handler schedules a workqueue which acquires a mutex that ks8851_net_stop() also acquires before flushing the workqueue. Previously lockdep wouldn't be able to find this problem but now that it has the support we can trigger this lockdep warning by rmmoding the driver after an ifconfig up. Fix the possible deadlock by disabling the interrupts in the chip and then release the lock across the workqueue flushing. The mutex is only there to proect the registers anyway so this should be ok. ======================================================= [ INFO: possible circular locking dependency detected ] 3.0.21-00021-g8b33780-dirty #2911 ------------------------------------------------------- rmmod/125 is trying to acquire lock: ((&ks->irq_work)){+.+...}, at: [] flush_work+0x0/0xac but task is already holding lock: (&ks->lock){+.+...}, at: [] ks8851_net_stop+0x64/0x138 [ks8851] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&ks->lock){+.+...}: [] __lock_acquire+0x940/0x9f8 [] lock_acquire+0x10c/0x130 [] mutex_lock_nested+0x68/0x3dc [] ks8851_irq_work+0x24/0x46c [ks8851] [] process_one_work+0x2d8/0x518 [] worker_thread+0x220/0x3a0 [] kthread+0x88/0x94 [] kernel_thread_exit+0x0/0x8 -> #0 ((&ks->irq_work)){+.+...}: [] validate_chain+0x914/0x1018 [] __lock_acquire+0x940/0x9f8 [] lock_acquire+0x10c/0x130 [] flush_work+0x4c/0xac [] ks8851_net_stop+0x6c/0x138 [ks8851] [] __dev_close_many+0x98/0xcc [] dev_close_many+0x68/0xd0 [] rollback_registered_many+0xcc/0x2b8 [] rollback_registered+0x28/0x34 [] unregister_netdevice_queue+0x58/0x7c [] unregister_netdev+0x18/0x20 [] ks8851_remove+0x64/0xb4 [ks8851] [] spi_drv_remove+0x18/0x1c [] __device_release_driver+0x7c/0xbc [] driver_detach+0x8c/0xb4 [] bus_remove_driver+0xb8/0xe8 [] sys_delete_module+0x1e8/0x27c [] ret_fast_syscall+0x0/0x3c other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ks->lock); lock((&ks->irq_work)); lock(&ks->lock); lock((&ks->irq_work)); *** DEADLOCK *** 4 locks held by rmmod/125: #0: (&__lockdep_no_validate__){+.+.+.}, at: [] driver_detach+0x6c/0xb4 #1: (&__lockdep_no_validate__){+.+.+.}, at: [] driver_detach+0x78/0xb4 #2: (rtnl_mutex){+.+.+.}, at: [] unregister_netdev+0xc/0x20 #3: (&ks->lock){+.+...}, at: [] ks8851_net_stop+0x64/0x138 [ks8851] Cc: Ben Dooks Signed-off-by: Stephen Boyd --- drivers/net/ethernet/micrel/ks8851.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/micrel/ks8851.c b/drivers/net/ethernet/micrel/ks8851.c index c722aa60..20f50ec 100644 --- a/drivers/net/ethernet/micrel/ks8851.c +++ b/drivers/net/ethernet/micrel/ks8851.c @@ -889,16 +889,17 @@ static int ks8851_net_stop(struct net_device *dev) netif_stop_queue(dev); mutex_lock(&ks->lock); + /* turn off the IRQs and ack any outstanding */ + ks8851_wrreg16(ks, KS_IER, 0x0000); + ks8851_wrreg16(ks, KS_ISR, 0xffff); + mutex_unlock(&ks->lock); /* stop any outstanding work */ flush_work(&ks->irq_work); flush_work(&ks->tx_work); flush_work(&ks->rxctrl_work); - /* turn off the IRQs and ack any outstanding */ - ks8851_wrreg16(ks, KS_IER, 0x0000); - ks8851_wrreg16(ks, KS_ISR, 0xffff); - + mutex_lock(&ks->lock); /* shutdown RX process */ ks8851_wrreg16(ks, KS_RXCR1, 0x0000); @@ -907,6 +908,7 @@ static int ks8851_net_stop(struct net_device *dev) /* set powermode to soft power down to save power */ ks8851_set_powermode(ks, PMECR_PM_SOFTDOWN); + mutex_unlock(&ks->lock); /* ensure any queued tx buffers are dumped */ while (!skb_queue_empty(&ks->txq)) { @@ -918,7 +920,6 @@ static int ks8851_net_stop(struct net_device *dev) dev_kfree_skb(txb); } - mutex_unlock(&ks->lock); return 0; } -- Sent by an employee of the Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.