From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 11 Feb 2009 00:37:12 +0200
From: "Michael S. Tsirkin"
To: Dave Airlie, dri-devel@lists.sourceforge.net
Cc: Linux Kernel Mailing List, Kernel Testers List, "Rafael J. Wysocki"
Subject: Re: [Bug #12574] possible circular locking dependency detected
Message-ID: <20090210223711.GA6809@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Dave, dri guys,

Could you take a look at the circular locking dependency below? I observe
it when suspending a laptop with the radeon DRM module loaded and lockdep
enabled.

The root of the problem seems to be that various VM ops (such as
drm_vm_open and drm_mmap) are called with the mm semaphore held and then
take dev->struct_mutex, while drm_rmmap_locked is called with
dev->struct_mutex held and calls mtrr_del, which indirectly depends on
the mm semaphore. What do you think?
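To make the inversion concrete, here is a minimal userspace sketch of the
two paths (my own illustration, not kernel code: pthread mutexes named
after mmap_sem and dev->struct_mutex stand in for the real locks). One
thread mimics the fork() path, the other the close() path; taking the same
two locks in opposite orders is exactly the cycle lockdep reports below:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t mmap_sem     = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t struct_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Models dup_mm(): mmap_sem is already held when drm_vm_open()
 * takes dev->struct_mutex. */
static void *fork_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&mmap_sem);
	pthread_mutex_lock(&struct_mutex);
	pthread_mutex_unlock(&struct_mutex);
	pthread_mutex_unlock(&mmap_sem);
	return NULL;
}

/* Models drm_rmmap_locked(): dev->struct_mutex is already held when
 * mtrr_del() ends up depending on mmap_sem (indirectly, in the real
 * kernel, via the cpu hotplug lock chain in the trace below). */
static void *close_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&struct_mutex);
	pthread_mutex_lock(&mmap_sem);
	pthread_mutex_unlock(&mmap_sem);
	pthread_mutex_unlock(&struct_mutex);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, fork_path, NULL);
	pthread_create(&b, NULL, close_path, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	printf("no deadlock this run, but the lock order is still unsafe\n");
	return 0;
}

Most runs exit cleanly, which is why this only shows up under lockdep or
with unlucky timing.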
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=12574
Subject   : possible circular locking dependency detected
Submitter : Michael S. Tsirkin
Date      : 2009-01-29 11:35 (11 days old)

/var/log/messages dump below.

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.29-rc4-mst-debug #95
-------------------------------------------------------
sleep.sh/6730 is trying to acquire lock:
 (&per_cpu(cpu_policy_rwsem, cpu)){----}, at: [] lock_policy_rwsem_write+0x31/0x70

but task is already holding lock:
 (&cpu_hotplug.lock){--..}, at: [] cpu_hotplug_begin+0x1a/0x50

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #6 (&cpu_hotplug.lock){--..}:
       [] validate_chain+0xb51/0x1150
       [] __lock_acquire+0x246/0xa50
       [] lock_acquire+0x60/0x80
       [] mutex_lock_nested+0x9d/0x2e0
       [] get_online_cpus+0x2c/0x40
       [] mtrr_del_page+0x2f/0x160
       [] mtrr_del+0x3a/0x50
       [] drm_rmmap_locked+0xc2/0x180 [drm]
       [] drm_master_destroy+0x151/0x160 [drm]
       [] kref_put+0x2c/0x80
       [] drm_master_put+0x12/0x20 [drm]
       [] drm_release+0x25b/0x4a0 [drm]
       [] __fput+0xbd/0x1d0
       [] fput+0x19/0x20
       [] filp_close+0x47/0x70
       [] sys_close+0x6a/0xc0
       [] sysenter_do_call+0x12/0x35
       [] 0xffffffff

-> #5 (&dev->struct_mutex){--..}:
       [] validate_chain+0xb51/0x1150
       [] __lock_acquire+0x246/0xa50
       [] lock_acquire+0x60/0x80
       [] mutex_lock_nested+0x9d/0x2e0
       [] drm_vm_open+0x2d/0x50 [drm]
       [] dup_mm+0x227/0x310
       [] copy_process+0xd7f/0x1020
       [] do_fork+0x78/0x320
       [] sys_clone+0x2f/0x40
       [] sysenter_do_call+0x12/0x35
       [] 0xffffffff

-> #4 (&mm->mmap_sem/1){--..}:
       [] validate_chain+0xb51/0x1150
       [] __lock_acquire+0x246/0xa50
       [] lock_acquire+0x60/0x80
       [] down_write_nested+0x48/0x70
       [] dup_mm+0xc8/0x310
       [] copy_process+0xd7f/0x1020
       [] do_fork+0x78/0x320
       [] sys_clone+0x2f/0x40
       [] syscall_call+0x7/0xb
       [] 0xffffffff

-> #3 (&mm->mmap_sem){----}:
       [] validate_chain+0xb51/0x1150
       [] __lock_acquire+0x246/0xa50
       [] lock_acquire+0x60/0x80
       [] might_fault+0x73/0x90
       [] copy_to_user+0x33/0x60
       [] filldir64+0xb5/0xe0
       [] sysfs_readdir+0x11f/0x1f0
       [] vfs_readdir+0x8d/0xb0
       [] sys_getdents64+0x69/0xc0
       [] syscall_call+0x7/0xb
       [] 0xffffffff

-> #2 (sysfs_mutex){--..}:
       [] validate_chain+0xb51/0x1150
       [] __lock_acquire+0x246/0xa50
       [] lock_acquire+0x60/0x80
       [] mutex_lock_nested+0x9d/0x2e0
       [] sysfs_addrm_start+0x2c/0xb0
       [] create_dir+0x40/0x90
       [] sysfs_create_subdir+0x16/0x20
       [] internal_create_group+0x50/0x1a0
       [] sysfs_create_group+0xc/0x10
       [] cpufreq_stat_notifier_policy+0x9c/0x230 [cpufreq_stats]
       [] notifier_call_chain+0x37/0x80
       [] __blocking_notifier_call_chain+0x44/0x60
       [] blocking_notifier_call_chain+0x1a/0x20
       [] __cpufreq_set_policy+0xd6/0x230
       [] cpufreq_add_dev+0x4e8/0x6b0
       [] sysdev_driver_register+0x75/0x130
       [] cpufreq_register_driver+0xb5/0x1c0
       [] uinput_destroy_device+0x4d/0x60 [uinput]
       [] do_one_initcall+0x2a/0x160
       [] sys_init_module+0x85/0x1b0
       [] sysenter_do_call+0x12/0x35
       [] 0xffffffff

-> #1 ((cpufreq_policy_notifier_list).rwsem){----}:
       [] validate_chain+0xb51/0x1150
       [] __lock_acquire+0x246/0xa50
       [] lock_acquire+0x60/0x80
       [] down_read+0x41/0x60
       [] __blocking_notifier_call_chain+0x2a/0x60
       [] blocking_notifier_call_chain+0x1a/0x20
       [] cpufreq_add_dev+0x1a5/0x6b0
       [] sysdev_driver_register+0x75/0x130
       [] cpufreq_register_driver+0xb5/0x1c0
       [] uinput_destroy_device+0x4d/0x60 [uinput]
       [] do_one_initcall+0x2a/0x160
       [] sys_init_module+0x85/0x1b0
       [] sysenter_do_call+0x12/0x35
       [] 0xffffffff

-> #0 (&per_cpu(cpu_policy_rwsem, cpu)){----}:
       [] validate_chain+0x5eb/0x1150
       [] __lock_acquire+0x246/0xa50
       [] lock_acquire+0x60/0x80
       [] down_write+0x41/0x60
       [] lock_policy_rwsem_write+0x31/0x70
       [] cpufreq_cpu_callback+0x45/0x80
       [] notifier_call_chain+0x37/0x80
       [] __raw_notifier_call_chain+0x19/0x20
       [] _cpu_down+0x79/0x280
       [] disable_nonboot_cpus+0x7c/0x100
       [] suspend_devices_and_enter+0xd5/0x170
       [] enter_state+0x1b0/0x1c0
       [] state_store+0x8f/0xd0
       [] kobj_attr_store+0x24/0x30
       [] sysfs_write_file+0xa2/0x100
       [] vfs_write+0x99/0x130
       [] sys_write+0x3d/0x70
       [] sysenter_do_call+0x12/0x35
       [] 0xffffffff

other info that might help us debug this:

4 locks held by sleep.sh/6730:
 #0:  (&buffer->mutex){--..}, at: [] sysfs_write_file+0x2b/0x100
 #1:  (pm_mutex){--..}, at: [] enter_state+0x4b/0x1c0
 #2:  (cpu_add_remove_lock){--..}, at: [] cpu_maps_update_begin+0xf/0x20
 #3:  (&cpu_hotplug.lock){--..}, at: [] cpu_hotplug_begin+0x1a/0x50

stack backtrace:
Pid: 6730, comm: sleep.sh Not tainted 2.6.29-rc4-mst-debug #95
Call Trace:
 [] print_circular_bug_tail+0x7c/0xe0
 [] validate_chain+0x5eb/0x1150
 [] __lock_acquire+0x246/0xa50
 [] ? __cancel_work_timer+0x2e/0x190
 [] lock_acquire+0x60/0x80
 [] ? lock_policy_rwsem_write+0x31/0x70
 [] down_write+0x41/0x60
 [] ? lock_policy_rwsem_write+0x31/0x70
 [] lock_policy_rwsem_write+0x31/0x70
 [] cpufreq_cpu_callback+0x45/0x80
 [] notifier_call_chain+0x37/0x80
 [] __raw_notifier_call_chain+0x19/0x20
 [] _cpu_down+0x79/0x280
 [] ? cpu_maps_update_begin+0xf/0x20
 [] disable_nonboot_cpus+0x7c/0x100
 [] ? acpi_disable_all_gpes+0x25/0x2a
 [] suspend_devices_and_enter+0xd5/0x170
 [] enter_state+0x1b0/0x1c0
 [] state_store+0x8f/0xd0
 [] ? state_store+0x0/0xd0
 [] kobj_attr_store+0x24/0x30
 [] sysfs_write_file+0xa2/0x100
 [] vfs_write+0x99/0x130
 [] ? sysenter_exit+0xf/0x18
 [] ? sysfs_write_file+0x0/0x100
 [] sys_write+0x3d/0x70
 [] sysenter_do_call+0x12/0x35

--
MST