Date: Tue, 6 Mar 2007 19:20:11 +0000 (GMT)
From: Hugh Dickins
To: Oliver Neukum
cc: Maneesh Soni, Greg Kroah-Hartman, Adrian Bunk, Linus Torvalds,
    linux-kernel@vger.kernel.org
Subject: 2.6.21-rc suspend regression: sysfs deadlock

Resume from RAM on a ThinkPad T43p is now happy with Thomas' periodic
tick fix - the most unusable aspect of that for me had been how slow
repeat keys were to start repeating, but that's all fine now.

But suspend to RAM is still hanging, unless I "chmod a-x /usr/sbin/docker"
on SuSE 10.2: docker undock tries to unregister /sys/block/sr0 and hangs:

60x60         D B0415080     0 10778  10771                     (NOTLB)
       e8227e04 00000086 e80c60b0 b0415080 ef3f5454 b041dc20 ef3f5430 00000001
       e80c60b0 72af360e 00000085 00001941 e80c61bc e8227e00 b01606bf ef47d3c0
       ed07c1dc ed07c1e4 00000246 e8227e30 b02f6ef0 e80c60b0 00000001 e80c60b0
Call Trace:
 [] __down+0xaa/0xb8
 [] __down_failed+0xa/0x10
 [] sysfs_drop_dentry+0xa2/0xda
 [] __sysfs_remove_dir+0x6d/0xf8
 [] sysfs_remove_dir+0x15/0x20
 [] kobject_del+0x16/0x22
 [] device_del+0x1c9/0x1e2
 [] __scsi_remove_device+0x43/0x7a
 [] scsi_remove_device+0x1f/0x2b
 [] sdev_store_delete+0x16/0x1b
 [] dev_attr_store+0x32/0x34
 [] flush_write_buffer+0x37/0x3d
 [] sysfs_write_file+0x5e/0x82
 [] vfs_write+0xa7/0x150
 [] sys_write+0x47/0x6b
 [] sysenter_past_esp+0x5f/0x85

The hung process and its callers, innermost first:

/usr/lib/dockutils/hooks/thinkpad/60x60 undock
/usr/lib/dockutils/dockhandler undock
/usr/sbin/docker undock
/etc/pm/hooks/23dock suspend

This comes from Oliver's commit 94bebf4d1b8e7719f0f3944c037a21cfd99a4af7
    Driver core: fix race in sysfs between sysfs_remove_file() and read()/write()
in 2.6.21-rc1.  It looks to me like sysfs_write_file downs buffer->sem
while calling flush_write_buffer, and flushing that particular write
buffer entails downing the same buffer->sem again in orphan_all_buffers.

Suspend no longer deadlocks with the following silly patch, but I expect
this either pokes a small hole in your scheme, or renders it pointless.
Maybe that commit needs to be reverted, or maybe you can see how to fix
it up for -rc3.

Thanks,
Hugh

--- 2.6.21-rc2-git5/fs/sysfs/inode.c	2007-02-28 08:30:26.000000000 +0000
+++ linux/fs/sysfs/inode.c	2007-03-06 18:03:13.000000000 +0000
@@ -227,11 +227,8 @@ static inline void orphan_all_buffers(st
 
 	mutex_lock_nested(&node->i_mutex, I_MUTEX_CHILD);
 	if (node->i_private) {
-		list_for_each_entry(buf, &set->associates, associates) {
-			down(&buf->sem);
+		list_for_each_entry(buf, &set->associates, associates)
 			buf->orphaned = 1;
-			up(&buf->sem);
-		}
 	}
 	mutex_unlock(&node->i_mutex);
 }
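
The self-deadlock described in this message, and what the patch changes, can be modelled in userspace. What follows is only a rough single-threaded sketch, not kernel code: struct model_sem, struct model_buffer and every function below are hypothetical stand-ins invented for illustration. The semaphore is a plain counter, and a trylock (returning 0 on success and nonzero on failure, as the kernel's down_trylock() does) replaces down() so that the model reports the point where the real code would sleep forever instead of hanging.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a binary semaphore: count 1 = free, 0 = held. */
struct model_sem {
	int count;
};

/* Hypothetical stand-in for the sysfs buffer; only the two fields that
 * matter for the deadlock are modelled. */
struct model_buffer {
	struct model_sem sem;	/* plays the role of buffer->sem */
	int orphaned;		/* plays the role of buffer->orphaned */
};

/* Returns 0 if the semaphore was taken, 1 if a real down() would block. */
static int model_down_trylock(struct model_sem *sem)
{
	if (sem->count == 0)
		return 1;
	sem->count--;
	return 0;
}

static void model_up(struct model_sem *sem)
{
	sem->count++;
}

static void model_buffer_init(struct model_buffer *buf)
{
	buf->sem.count = 1;	/* initially free */
	buf->orphaned = 0;
}

/* Pre-patch loop body: take the semaphore, mark the buffer, release.
 * Returns false where the real down() would never return. */
static bool orphan_locked(struct model_buffer *buf)
{
	if (model_down_trylock(&buf->sem) != 0)
		return false;
	buf->orphaned = 1;
	model_up(&buf->sem);
	return true;
}

/* Patched loop body: mark the buffer without touching the semaphore. */
static bool orphan_unlocked(struct model_buffer *buf)
{
	buf->orphaned = 1;
	return true;
}

/* Write path: the semaphore is held across the store callback, which
 * (via device removal) ends up orphaning the very same buffer. */
static bool write_then_remove(struct model_buffer *buf,
			      bool (*orphan)(struct model_buffer *))
{
	bool completed;
	int busy = model_down_trylock(&buf->sem);	/* writer takes buffer->sem */

	assert(busy == 0);
	completed = orphan(buf);	/* store callback reaches the orphaning loop */
	model_up(&buf->sem);
	return completed;
}
```

In this model, write_then_remove(&buf, orphan_locked) returns false and leaves buf.orphaned at 0, which corresponds to the hang in __down in the trace above. With orphan_unlocked it returns true, but the flag is then set without any serialization against a concurrent reader or writer, which is the hole in the scheme that the message warns about.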