From: Felix Kuehling <felix.kuehling@amd.com>
To: "Joshi, Mukul" <Mukul.Joshi@amd.com>,
	"Quan, Evan" <Evan.Quan@amd.com>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>
Subject: Circular lock dependency chain between pm.mutex and topology_lock
Date: Wed, 20 Apr 2022 17:06:47 -0400
Message-ID: <36459762-ca2c-1fbf-35bc-54bc7cb71dda@amd.com>

Hi Evan and Mukul,

You both recently made changes involving the pm.mutex and the 
topology_lock, respectively. I'm now seeing a circular lock dependency 
between those locks (see below) that could lead to a deadlock. The 
cycle also involves the mmap_lock and some file-system related code 
reached through SMU firmware loading. In summary, the dump below shows 
that:

  * Thread A can take mmap_lock while holding i_mutex_dir_key
  * Thread B can take i_mutex_dir_key while holding the pm.mutex
  * Thread C can take pm.mutex while holding topology_lock
  * Thread D can take topology_lock while holding mmap_lock
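
To illustrate the cycle, here is a minimal userspace sketch (not our 
driver code): four pthread mutexes stand in for the kernel locks, which 
are really a mix of rw-semaphores and a mutex, and each thread takes 
the same pair of locks as the matching thread above. If all four grab 
their first lock before any grabs its second, none can make progress:

/* Minimal sketch of the four-lock cycle. Build: gcc -pthread cycle.c */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t mmap_lock     = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t i_mutex_dir   = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t pm_mutex      = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t topology_lock = PTHREAD_MUTEX_INITIALIZER;

static void *take_pair(void *arg)
{
        pthread_mutex_t **p = arg;

        pthread_mutex_lock(p[0]);  /* the lock already held */
        sleep(1);                  /* let every thread get its first lock */
        pthread_mutex_lock(p[1]);  /* all four block here: deadlock */
        pthread_mutex_unlock(p[1]);
        pthread_mutex_unlock(p[0]);
        return NULL;
}

int main(void)
{
        pthread_mutex_t *pairs[4][2] = {
                { &i_mutex_dir,   &mmap_lock     },  /* thread A */
                { &pm_mutex,      &i_mutex_dir   },  /* thread B */
                { &topology_lock, &pm_mutex      },  /* thread C */
                { &mmap_lock,     &topology_lock },  /* thread D */
        };
        pthread_t t[4];
        int i;

        for (i = 0; i < 4; i++)
                pthread_create(&t[i], NULL, take_pair, pairs[i]);
        for (i = 0; i < 4; i++)
                pthread_join(t[i], NULL);  /* never returns */
        printf("not reached\n");
        return 0;
}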

The backtraces below illustrate each of these scenarios. If four such 
threads run concurrently and enter their critical sections at around 
the same time, they can deadlock against each other. I see two 
potential ways to break this cycle within our driver:

 1. Avoid holding the pm.mutex while loading SMU firmware
 2. Avoid holding the topology lock while getting mem-info in the KFD
    topology code (see the sketch after this list)
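
For option 2, the general shape of the fix would be to snapshot the 
mem-info before taking the topology lock, so pm.mutex is never acquired 
underneath it. A hypothetical userspace sketch of that pattern (all 
names invented for illustration, not the actual KFD code):

/* Sketch of option 2: thread C currently does lock(topology) ->
 * lock(pm); instead, read the pm-protected value first, drop
 * pm.mutex, and only then take the topology lock.
 * Build: gcc -pthread fix.c */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

static pthread_mutex_t pm_mutex      = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t topology_lock = PTHREAD_MUTEX_INITIALIZER;

static uint32_t pm_mem_clock = 1000;  /* stands in for the DPM mclk */
static uint32_t topo_mem_clock;       /* stands in for the topology/CRAT copy */

static void add_device(void)
{
        uint32_t snapshot;

        /* Query the pm-protected data with no other lock held... */
        pthread_mutex_lock(&pm_mutex);
        snapshot = pm_mem_clock;
        pthread_mutex_unlock(&pm_mutex);

        /* ...then build the topology entry from the snapshot only. */
        pthread_mutex_lock(&topology_lock);
        topo_mem_clock = snapshot;
        pthread_mutex_unlock(&topology_lock);
}

int main(void)
{
        add_device();
        printf("mclk in topology: %u\n", topo_mem_clock);
        return 0;
}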

Can you determine which of those is the better/easier solution and 
implement a fix?

Thank you,
   Felix


[  168.544078] ======================================================
[  168.550309] WARNING: possible circular locking dependency detected
[  168.556523] 5.16.0-kfd-fkuehlin #148 Tainted: G            E
[  168.562558] ------------------------------------------------------
[  168.568764] kfdtest/3479 is trying to acquire lock:
[  168.573672] ffffffffc0927a70 (&topology_lock){++++}-{3:3}, at: kfd_topology_device_by_id+0x16/0x60 [amdgpu]
[  168.583663]
                but task is already holding lock:
[  168.589529] ffff97d303dee668 (&mm->mmap_lock#2){++++}-{3:3}, at: vm_mmap_pgoff+0xa9/0x180
[  168.597755]
                which lock already depends on the new lock.

[  168.605970]
                the existing dependency chain (in reverse order) is:
[  168.613487]
                -> #3 (&mm->mmap_lock#2){++++}-{3:3}:
[  168.619700]        lock_acquire+0xca/0x2e0
[  168.623814]        down_read+0x3e/0x140
[  168.627676]        do_user_addr_fault+0x40d/0x690
[  168.632399]        exc_page_fault+0x6f/0x270
[  168.636692]        asm_exc_page_fault+0x1e/0x30
[  168.641249]        filldir64+0xc8/0x1e0
[  168.645115]        call_filldir+0x7c/0x110
[  168.649238]        ext4_readdir+0x58e/0x940
[  168.653442]        iterate_dir+0x16a/0x1b0
[  168.657558]        __x64_sys_getdents64+0x83/0x140
[  168.662375]        do_syscall_64+0x35/0x80
[  168.666492]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[  168.672095]
                -> #2 (&type->i_mutex_dir_key#6){++++}-{3:3}:
[  168.679008]        lock_acquire+0xca/0x2e0
[  168.683122]        down_read+0x3e/0x140
[  168.686982]        path_openat+0x5b2/0xa50
[  168.691095]        do_file_open_root+0xfc/0x190
[  168.695652]        file_open_root+0xd8/0x1b0
[  168.702010]        kernel_read_file_from_path_initns+0xc4/0x140
[  168.709542]        _request_firmware+0x2e9/0x5e0
[  168.715741]        request_firmware+0x32/0x50
[  168.721667]        amdgpu_cgs_get_firmware_info+0x370/0xdd0 [amdgpu]
[  168.730060]        smu7_upload_smu_firmware_image+0x53/0x190 [amdgpu]
[  168.738414]        fiji_start_smu+0xcf/0x4e0 [amdgpu]
[  168.745539]        pp_dpm_load_fw+0x21/0x30 [amdgpu]
[  168.752503]        amdgpu_pm_load_smu_firmware+0x4b/0x80 [amdgpu]
[  168.760698]        amdgpu_device_fw_loading+0xb8/0x140 [amdgpu]
[  168.768412]        amdgpu_device_init.cold+0xdf6/0x1716 [amdgpu]
[  168.776285]        amdgpu_driver_load_kms+0x15/0x120 [amdgpu]
[  168.784034]        amdgpu_pci_probe+0x19b/0x3a0 [amdgpu]
[  168.791161]        local_pci_probe+0x40/0x80
[  168.797027]        work_for_cpu_fn+0x10/0x20
[  168.802839]        process_one_work+0x273/0x5b0
[  168.808903]        worker_thread+0x20f/0x3d0
[  168.814700]        kthread+0x176/0x1a0
[  168.819968]        ret_from_fork+0x1f/0x30
[  168.825563]
                -> #1 (&adev->pm.mutex){+.+.}-{3:3}:
[  168.834721]        lock_acquire+0xca/0x2e0
[  168.840364]        __mutex_lock+0xa2/0x930
[  168.846020]        amdgpu_dpm_get_mclk+0x37/0x60 [amdgpu]
[  168.853257]        amdgpu_amdkfd_get_local_mem_info+0xba/0xe0 [amdgpu]
[  168.861547]        kfd_create_vcrat_image_gpu+0x1b1/0xbb0 [amdgpu]
[  168.869478]        kfd_create_crat_image_virtual+0x447/0x510 [amdgpu]
[  168.877884]        kfd_topology_add_device+0x5c8/0x6f0 [amdgpu]
[  168.885556]        kgd2kfd_device_init.cold+0x385/0x4c5 [amdgpu]
[  168.893347]        amdgpu_amdkfd_device_init+0x138/0x180 [amdgpu]
[  168.901177]        amdgpu_device_init.cold+0x141b/0x1716 [amdgpu]
[  168.909025]        amdgpu_driver_load_kms+0x15/0x120 [amdgpu]
[  168.916458]        amdgpu_pci_probe+0x19b/0x3a0 [amdgpu]
[  168.923442]        local_pci_probe+0x40/0x80
[  168.929249]        work_for_cpu_fn+0x10/0x20
[  168.935008]        process_one_work+0x273/0x5b0
[  168.940944]        worker_thread+0x20f/0x3d0
[  168.946623]        kthread+0x176/0x1a0
[  168.951765]        ret_from_fork+0x1f/0x30
[  168.957277]
                -> #0 (&topology_lock){++++}-{3:3}:
[  168.965993]        check_prev_add+0x8f/0xbf0
[  168.971613]        __lock_acquire+0x1299/0x1ca0
[  168.977485]        lock_acquire+0xca/0x2e0
[  168.982877]        down_read+0x3e/0x140
[  168.987975]        kfd_topology_device_by_id+0x16/0x60 [amdgpu]
[  168.995583]        kfd_device_by_id+0xa/0x20 [amdgpu]
[  169.002180]        kfd_mmap+0x95/0x200 [amdgpu]
[  169.008293]        mmap_region+0x337/0x5a0
[  169.013679]        do_mmap+0x3aa/0x540
[  169.018678]        vm_mmap_pgoff+0xdc/0x180
[  169.024095]        ksys_mmap_pgoff+0x186/0x1f0
[  169.029734]        do_syscall_64+0x35/0x80
[  169.035005]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[  169.041754]
                other info that might help us debug this:

[  169.053276] Chain exists of:
                  &topology_lock --> &type->i_mutex_dir_key#6 --> &mm->mmap_lock#2

[  169.068389]  Possible unsafe locking scenario:

[  169.076661]        CPU0                    CPU1
[  169.082383]        ----                    ----
[  169.088087]   lock(&mm->mmap_lock#2);
[  169.092922]                                lock(&type->i_mutex_dir_key#6);
[  169.100975]                                lock(&mm->mmap_lock#2);
[  169.108320]   lock(&topology_lock);
[  169.112957]
                 *** DEADLOCK ***

[  169.122265] 1 lock held by kfdtest/3479:
[  169.127286]  #0: ffff97d303dee668 (&mm->mmap_lock#2){++++}-{3:3}, at: vm_mmap_pgoff+0xa9/0x180
[  169.137033]
                stack backtrace:
[  169.143579] CPU: 18 PID: 3479 Comm: kfdtest Tainted: G            E     5.16.0-kfd-fkuehlin #148
[  169.153480] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./EPYCD8-2T, BIOS P2.60 04/10/2020
[  169.164175] Call Trace:
[  169.167761]  <TASK>
[  169.170988]  dump_stack_lvl+0x45/0x59
[  169.175795]  check_noncircular+0xfe/0x110
[  169.180947]  check_prev_add+0x8f/0xbf0
[  169.185828]  __lock_acquire+0x1299/0x1ca0
[  169.190966]  lock_acquire+0xca/0x2e0
[  169.195677]  ? kfd_topology_device_by_id+0x16/0x60 [amdgpu]
[  169.202808]  down_read+0x3e/0x140
[  169.207333]  ? kfd_topology_device_by_id+0x16/0x60 [amdgpu]
[  169.214332]  kfd_topology_device_by_id+0x16/0x60 [amdgpu]
[  169.221123]  kfd_device_by_id+0xa/0x20 [amdgpu]
[  169.227071]  kfd_mmap+0x95/0x200 [amdgpu]
[  169.232521]  mmap_region+0x337/0x5a0
[  169.237305]  do_mmap+0x3aa/0x540
[  169.241713]  vm_mmap_pgoff+0xdc/0x180
[  169.246529]  ksys_mmap_pgoff+0x186/0x1f0
[  169.251603]  do_syscall_64+0x35/0x80
[  169.256335]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  169.262536] RIP: 0033:0x7f828e0d1916
[  169.267261] Code: 00 00 00 00 f3 0f 1e fa 41 f7 c1 ff 0f 00 00 75 2b 55 48 89 fd 53 89 cb 48 85 ff 74 37 41 89 da 48 89 ef b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 62 5b 5d c3 0f 1f 80 00 00 00 00 48 8b 05 41
[  169.288448] RSP: 002b:00007ffe9bc5e7c8 EFLAGS: 00000206 ORIG_RAX: 0000000000000009
[  169.297252] RAX: ffffffffffffffda RBX: 0000000000000011 RCX: 00007f828e0d1916
[  169.305628] RDX: 0000000000000003 RSI: 0000000000001000 RDI: 0000000000201000
[  169.313993] RBP: 0000000000201000 R08: 0000000000000003 R09: 04b4000000000000
[  169.322358] R10: 0000000000000011 R11: 0000000000000206 R12: 000055571c9c5a40
[  169.330733] R13: 0000000000000000 R14: 0000000000000011 R15: 0000000000000000
[  169.339097]  </TASK>


-- 
F e l i x   K u e h l i n g
PMTS Software Development Engineer | Linux Compute Kernel
1 Commerce Valley Dr. East, Markham, ON L3T 7X6 Canada
(O) +1(289)695-1597
     _     _   _   _____   _____
    / \   | \ / | |  _  \  \ _  |
   / A \  | \M/ | | |D) )  /|_| |
  /_/ \_\ |_| |_| |_____/ |__/ \|   facebook.com/AMD | amd.com

