* [Nouveau] nouveau lockdep deadlock report with 5.18-rc6
@ 2022-05-17 11:10 ` Hans de Goede
  0 siblings, 0 replies; 10+ messages in thread
From: Hans de Goede @ 2022-05-17 11:10 UTC (permalink / raw)
  To: Ben Skeggs, Karol Herbst, Lyude Paul; +Cc: nouveau, dri-devel

Hi All,

I just noticed the below lockdep possible deadlock report with a 5.18-rc6
kernel on a Dell Latitude E6430 laptop with the following nvidia GPU:

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108GLM [NVS 5200M] [10de:0dfc] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation GF108 High Definition Audio Controller [10de:0bea] (rev a1)

This is with the laptop in Optimus mode, so with the Intel integrated
gfx from the i5-3320M CPU driving the LCD panel and with nothing connected
to the HDMI connector, which is always routed to the NVIDIA GPU on this
laptop.

The lockdep possible deadlock warning seems to happen when the NVIDIA GPU
is runtime suspended shortly after gdm has loaded:

[   24.859171] ======================================================
[   24.859173] WARNING: possible circular locking dependency detected
[   24.859175] 5.18.0-rc6+ #34 Tainted: G            E    
[   24.859178] ------------------------------------------------------
[   24.859179] kworker/1:1/46 is trying to acquire lock:
[   24.859181] ffff92b0c0ee0518 (&cli->mutex){+.+.}-{3:3}, at: nouveau_vga_lastclose+0x910/0x1030 [nouveau]
[   24.859231] 
               but task is already holding lock:
[   24.859233] ffff92b0c4bf35a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: ttm_bo_wait+0x7d/0x140 [ttm]
[   24.859243] 
               which lock already depends on the new lock.

[   24.859244] 
               the existing dependency chain (in reverse order) is:
[   24.859246] 
               -> #1 (reservation_ww_class_mutex){+.+.}-{3:3}:
[   24.859249]        __ww_mutex_lock.constprop.0+0xb3/0xfb0
[   24.859256]        ww_mutex_lock+0x38/0xa0
[   24.859259]        nouveau_bo_pin+0x30/0x380 [nouveau]
[   24.859297]        nouveau_channel_del+0x1d7/0x3e0 [nouveau]
[   24.859328]        nouveau_channel_new+0x48/0x730 [nouveau]
[   24.859358]        nouveau_abi16_ioctl_channel_alloc+0x113/0x360 [nouveau]
[   24.859389]        drm_ioctl_kernel+0xa1/0x150
[   24.859392]        drm_ioctl+0x21c/0x410
[   24.859395]        nouveau_drm_ioctl+0x56/0x1820 [nouveau]
[   24.859431]        __x64_sys_ioctl+0x8d/0xc0
[   24.859436]        do_syscall_64+0x5b/0x80
[   24.859440]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[   24.859443] 
               -> #0 (&cli->mutex){+.+.}-{3:3}:
[   24.859446]        __lock_acquire+0x12e2/0x1f90
[   24.859450]        lock_acquire+0xad/0x290
[   24.859453]        __mutex_lock+0x90/0x830
[   24.859456]        nouveau_vga_lastclose+0x910/0x1030 [nouveau]
[   24.859493]        ttm_bo_move_to_lru_tail+0x32c/0x980 [ttm]
[   24.859498]        ttm_mem_evict_first+0x25c/0x4b0 [ttm]
[   24.859503]        ttm_resource_manager_evict_all+0x93/0x1b0 [ttm]
[   24.859509]        nouveau_debugfs_fini+0x161/0x260 [nouveau]
[   24.859545]        nouveau_drm_ioctl+0xa4a/0x1820 [nouveau]
[   24.859582]        pci_pm_runtime_suspend+0x5c/0x180
[   24.859585]        __rpm_callback+0x48/0x1b0
[   24.859589]        rpm_callback+0x5a/0x70
[   24.859591]        rpm_suspend+0x10a/0x6f0
[   24.859594]        pm_runtime_work+0xa0/0xb0
[   24.859596]        process_one_work+0x254/0x560
[   24.859601]        worker_thread+0x4f/0x390
[   24.859604]        kthread+0xe6/0x110
[   24.859607]        ret_from_fork+0x22/0x30
[   24.859611] 
               other info that might help us debug this:

[   24.859612]  Possible unsafe locking scenario:

[   24.859613]        CPU0                    CPU1
[   24.859615]        ----                    ----
[   24.859616]   lock(reservation_ww_class_mutex);
[   24.859618]                                lock(&cli->mutex);
[   24.859620]                                lock(reservation_ww_class_mutex);
[   24.859622]   lock(&cli->mutex);
[   24.859624] 
                *** DEADLOCK ***

[   24.859625] 3 locks held by kworker/1:1/46:
[   24.859627]  #0: ffff92b0c0bb4338 ((wq_completion)pm){+.+.}-{0:0}, at: process_one_work+0x1d0/0x560
[   24.859634]  #1: ffffa8ffc02dfe80 ((work_completion)(&dev->power.work)){+.+.}-{0:0}, at: process_one_work+0x1d0/0x560
[   24.859641]  #2: ffff92b0c4bf35a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: ttm_bo_wait+0x7d/0x140 [ttm]
[   24.859649] 
               stack backtrace:
[   24.859651] CPU: 1 PID: 46 Comm: kworker/1:1 Tainted: G            E     5.18.0-rc6+ #34
[   24.859654] Hardware name: Dell Inc. Latitude E6430/0H3MT5, BIOS A21 05/08/2017
[   24.859656] Workqueue: pm pm_runtime_work
[   24.859660] Call Trace:
[   24.859662]  <TASK>
[   24.859665]  dump_stack_lvl+0x5b/0x74
[   24.859669]  check_noncircular+0xdf/0x100
[   24.859672]  ? register_lock_class+0x38/0x470
[   24.859678]  __lock_acquire+0x12e2/0x1f90
[   24.859683]  lock_acquire+0xad/0x290
[   24.859686]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
[   24.859724]  ? lock_is_held_type+0xa6/0x120
[   24.859730]  __mutex_lock+0x90/0x830
[   24.859733]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
[   24.859770]  ? nvif_vmm_map+0x114/0x130 [nouveau]
[   24.859791]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
[   24.859829]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
[   24.859866]  nouveau_vga_lastclose+0x910/0x1030 [nouveau]
[   24.859905]  ttm_bo_move_to_lru_tail+0x32c/0x980 [ttm]
[   24.859912]  ttm_mem_evict_first+0x25c/0x4b0 [ttm]
[   24.859919]  ? lock_release+0x20/0x2a0
[   24.859923]  ttm_resource_manager_evict_all+0x93/0x1b0 [ttm]
[   24.859930]  nouveau_debugfs_fini+0x161/0x260 [nouveau]
[   24.859968]  nouveau_drm_ioctl+0xa4a/0x1820 [nouveau]
[   24.860005]  pci_pm_runtime_suspend+0x5c/0x180
[   24.860008]  ? pci_dev_put+0x20/0x20
[   24.860011]  __rpm_callback+0x48/0x1b0
[   24.860014]  ? pci_dev_put+0x20/0x20
[   24.860018]  rpm_callback+0x5a/0x70
[   24.860020]  ? pci_dev_put+0x20/0x20
[   24.860023]  rpm_suspend+0x10a/0x6f0
[   24.860025]  ? process_one_work+0x1d0/0x560
[   24.860031]  pm_runtime_work+0xa0/0xb0
[   24.860034]  process_one_work+0x254/0x560
[   24.860039]  worker_thread+0x4f/0x390
[   24.860043]  ? process_one_work+0x560/0x560
[   24.860046]  kthread+0xe6/0x110
[   24.860049]  ? kthread_complete_and_exit+0x20/0x20
[   24.860053]  ret_from_fork+0x22/0x30
[   24.860059]  </TASK>

Regards,

Hans




* Re: [Nouveau] nouveau lockdep deadlock report with 5.18-rc6
  2022-05-17 11:10 ` Hans de Goede
@ 2022-05-17 22:24   ` Lyude Paul
  -1 siblings, 0 replies; 10+ messages in thread
From: Lyude Paul @ 2022-05-17 22:24 UTC (permalink / raw)
  To: Hans de Goede, Ben Skeggs, Karol Herbst; +Cc: nouveau, dri-devel

Yeah, I saw this as well, will try to bisect soon.

On Tue, 2022-05-17 at 13:10 +0200, Hans de Goede wrote:
> Hi All,
> 
> I just noticed the below lockdep possible deadlock report with a 5.18-rc6
> kernel on a Dell Latitude E6430 laptop with the following nvidia GPU:
> 
> 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108GLM [NVS
> 5200M] [10de:0dfc] (rev a1)
> 01:00.1 Audio device [0403]: NVIDIA Corporation GF108 High Definition Audio
> Controller [10de:0bea] (rev a1)
> 
> This is with the laptop in Optimus mode, so with the Intel integrated
> gfx from the i5-3320M CPU driving the LCD panel and with nothing connected
> to the HDMI connector, which is always routed to the NVIDIA GPU on this
> laptop.
> 
> The lockdep possible deadlock warning seems to happen when the NVIDIA GPU
> is runtime suspended shortly after gdm has loaded:
> 
> [   24.859171] ======================================================
> [   24.859173] WARNING: possible circular locking dependency detected
> [   24.859175] 5.18.0-rc6+ #34 Tainted: G            E    
> [   24.859178] ------------------------------------------------------
> [   24.859179] kworker/1:1/46 is trying to acquire lock:
> [   24.859181] ffff92b0c0ee0518 (&cli->mutex){+.+.}-{3:3}, at:
> nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859231] 
>                but task is already holding lock:
> [   24.859233] ffff92b0c4bf35a0 (reservation_ww_class_mutex){+.+.}-{3:3},
> at: ttm_bo_wait+0x7d/0x140 [ttm]
> [   24.859243] 
>                which lock already depends on the new lock.
> 
> [   24.859244] 
>                the existing dependency chain (in reverse order) is:
> [   24.859246] 
>                -> #1 (reservation_ww_class_mutex){+.+.}-{3:3}:
> [   24.859249]        __ww_mutex_lock.constprop.0+0xb3/0xfb0
> [   24.859256]        ww_mutex_lock+0x38/0xa0
> [   24.859259]        nouveau_bo_pin+0x30/0x380 [nouveau]
> [   24.859297]        nouveau_channel_del+0x1d7/0x3e0 [nouveau]
> [   24.859328]        nouveau_channel_new+0x48/0x730 [nouveau]
> [   24.859358]        nouveau_abi16_ioctl_channel_alloc+0x113/0x360
> [nouveau]
> [   24.859389]        drm_ioctl_kernel+0xa1/0x150
> [   24.859392]        drm_ioctl+0x21c/0x410
> [   24.859395]        nouveau_drm_ioctl+0x56/0x1820 [nouveau]
> [   24.859431]        __x64_sys_ioctl+0x8d/0xc0
> [   24.859436]        do_syscall_64+0x5b/0x80
> [   24.859440]        entry_SYSCALL_64_after_hwframe+0x44/0xae
> [   24.859443] 
>                -> #0 (&cli->mutex){+.+.}-{3:3}:
> [   24.859446]        __lock_acquire+0x12e2/0x1f90
> [   24.859450]        lock_acquire+0xad/0x290
> [   24.859453]        __mutex_lock+0x90/0x830
> [   24.859456]        nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859493]        ttm_bo_move_to_lru_tail+0x32c/0x980 [ttm]
> [   24.859498]        ttm_mem_evict_first+0x25c/0x4b0 [ttm]
> [   24.859503]        ttm_resource_manager_evict_all+0x93/0x1b0 [ttm]
> [   24.859509]        nouveau_debugfs_fini+0x161/0x260 [nouveau]
> [   24.859545]        nouveau_drm_ioctl+0xa4a/0x1820 [nouveau]
> [   24.859582]        pci_pm_runtime_suspend+0x5c/0x180
> [   24.859585]        __rpm_callback+0x48/0x1b0
> [   24.859589]        rpm_callback+0x5a/0x70
> [   24.859591]        rpm_suspend+0x10a/0x6f0
> [   24.859594]        pm_runtime_work+0xa0/0xb0
> [   24.859596]        process_one_work+0x254/0x560
> [   24.859601]        worker_thread+0x4f/0x390
> [   24.859604]        kthread+0xe6/0x110
> [   24.859607]        ret_from_fork+0x22/0x30
> [   24.859611] 
>                other info that might help us debug this:
> 
> [   24.859612]  Possible unsafe locking scenario:
> 
> [   24.859613]        CPU0                    CPU1
> [   24.859615]        ----                    ----
> [   24.859616]   lock(reservation_ww_class_mutex);
> [   24.859618]                                lock(&cli->mutex);
> [   24.859620]                               
> lock(reservation_ww_class_mutex);
> [   24.859622]   lock(&cli->mutex);
> [   24.859624] 
>                 *** DEADLOCK ***
> 
> [   24.859625] 3 locks held by kworker/1:1/46:
> [   24.859627]  #0: ffff92b0c0bb4338 ((wq_completion)pm){+.+.}-{0:0}, at:
> process_one_work+0x1d0/0x560
> [   24.859634]  #1: ffffa8ffc02dfe80 ((work_completion)(&dev-
> >power.work)){+.+.}-{0:0}, at: process_one_work+0x1d0/0x560
> [   24.859641]  #2: ffff92b0c4bf35a0 (reservation_ww_class_mutex){+.+.}-
> {3:3}, at: ttm_bo_wait+0x7d/0x140 [ttm]
> [   24.859649] 
>                stack backtrace:
> [   24.859651] CPU: 1 PID: 46 Comm: kworker/1:1 Tainted: G            E    
> 5.18.0-rc6+ #34
> [   24.859654] Hardware name: Dell Inc. Latitude E6430/0H3MT5, BIOS A21
> 05/08/2017
> [   24.859656] Workqueue: pm pm_runtime_work
> [   24.859660] Call Trace:
> [   24.859662]  <TASK>
> [   24.859665]  dump_stack_lvl+0x5b/0x74
> [   24.859669]  check_noncircular+0xdf/0x100
> [   24.859672]  ? register_lock_class+0x38/0x470
> [   24.859678]  __lock_acquire+0x12e2/0x1f90
> [   24.859683]  lock_acquire+0xad/0x290
> [   24.859686]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859724]  ? lock_is_held_type+0xa6/0x120
> [   24.859730]  __mutex_lock+0x90/0x830
> [   24.859733]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859770]  ? nvif_vmm_map+0x114/0x130 [nouveau]
> [   24.859791]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859829]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859866]  nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859905]  ttm_bo_move_to_lru_tail+0x32c/0x980 [ttm]
> [   24.859912]  ttm_mem_evict_first+0x25c/0x4b0 [ttm]
> [   24.859919]  ? lock_release+0x20/0x2a0
> [   24.859923]  ttm_resource_manager_evict_all+0x93/0x1b0 [ttm]
> [   24.859930]  nouveau_debugfs_fini+0x161/0x260 [nouveau]
> [   24.859968]  nouveau_drm_ioctl+0xa4a/0x1820 [nouveau]
> [   24.860005]  pci_pm_runtime_suspend+0x5c/0x180
> [   24.860008]  ? pci_dev_put+0x20/0x20
> [   24.860011]  __rpm_callback+0x48/0x1b0
> [   24.860014]  ? pci_dev_put+0x20/0x20
> [   24.860018]  rpm_callback+0x5a/0x70
> [   24.860020]  ? pci_dev_put+0x20/0x20
> [   24.860023]  rpm_suspend+0x10a/0x6f0
> [   24.860025]  ? process_one_work+0x1d0/0x560
> [   24.860031]  pm_runtime_work+0xa0/0xb0
> [   24.860034]  process_one_work+0x254/0x560
> [   24.860039]  worker_thread+0x4f/0x390
> [   24.860043]  ? process_one_work+0x560/0x560
> [   24.860046]  kthread+0xe6/0x110
> [   24.860049]  ? kthread_complete_and_exit+0x20/0x20
> [   24.860053]  ret_from_fork+0x22/0x30
> [   24.860059]  </TASK>
> 
> Regards,
> 
> Hans
> 
> 

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat



* Re: [Nouveau] nouveau lockdep deadlock report with 5.18-rc6
  2022-05-17 11:10 ` Hans de Goede
@ 2022-05-18 17:42   ` Lyude Paul
  -1 siblings, 0 replies; 10+ messages in thread
From: Lyude Paul @ 2022-05-18 17:42 UTC (permalink / raw)
  To: Hans de Goede, Ben Skeggs, Karol Herbst; +Cc: nouveau, dri-devel

Yeah, I noticed this as well; I will try to bisect this the next chance that I get.


On Tue, 2022-05-17 at 13:10 +0200, Hans de Goede wrote:
> Hi All,
> 
> I just noticed the below lockdep possible deadlock report with a 5.18-rc6
> kernel on a Dell Latitude E6430 laptop with the following nvidia GPU:
> 
> 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108GLM [NVS
> 5200M] [10de:0dfc] (rev a1)
> 01:00.1 Audio device [0403]: NVIDIA Corporation GF108 High Definition Audio
> Controller [10de:0bea] (rev a1)
> 
> This is with the laptop in Optimus mode, so with the Intel integrated
> gfx from the i5-3320M CPU driving the LCD panel and with nothing connected
> to the HDMI connector, which is always routed to the NVIDIA GPU on this
> laptop.
> 
> The lockdep possible deadlock warning seems to happen when the NVIDIA GPU
> is runtime suspended shortly after gdm has loaded:
> 
> [   24.859171] ======================================================
> [   24.859173] WARNING: possible circular locking dependency detected
> [   24.859175] 5.18.0-rc6+ #34 Tainted: G            E    
> [   24.859178] ------------------------------------------------------
> [   24.859179] kworker/1:1/46 is trying to acquire lock:
> [   24.859181] ffff92b0c0ee0518 (&cli->mutex){+.+.}-{3:3}, at:
> nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859231] 
>                but task is already holding lock:
> [   24.859233] ffff92b0c4bf35a0 (reservation_ww_class_mutex){+.+.}-{3:3},
> at: ttm_bo_wait+0x7d/0x140 [ttm]
> [   24.859243] 
>                which lock already depends on the new lock.
> 
> [   24.859244] 
>                the existing dependency chain (in reverse order) is:
> [   24.859246] 
>                -> #1 (reservation_ww_class_mutex){+.+.}-{3:3}:
> [   24.859249]        __ww_mutex_lock.constprop.0+0xb3/0xfb0
> [   24.859256]        ww_mutex_lock+0x38/0xa0
> [   24.859259]        nouveau_bo_pin+0x30/0x380 [nouveau]
> [   24.859297]        nouveau_channel_del+0x1d7/0x3e0 [nouveau]
> [   24.859328]        nouveau_channel_new+0x48/0x730 [nouveau]
> [   24.859358]        nouveau_abi16_ioctl_channel_alloc+0x113/0x360
> [nouveau]
> [   24.859389]        drm_ioctl_kernel+0xa1/0x150
> [   24.859392]        drm_ioctl+0x21c/0x410
> [   24.859395]        nouveau_drm_ioctl+0x56/0x1820 [nouveau]
> [   24.859431]        __x64_sys_ioctl+0x8d/0xc0
> [   24.859436]        do_syscall_64+0x5b/0x80
> [   24.859440]        entry_SYSCALL_64_after_hwframe+0x44/0xae
> [   24.859443] 
>                -> #0 (&cli->mutex){+.+.}-{3:3}:
> [   24.859446]        __lock_acquire+0x12e2/0x1f90
> [   24.859450]        lock_acquire+0xad/0x290
> [   24.859453]        __mutex_lock+0x90/0x830
> [   24.859456]        nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859493]        ttm_bo_move_to_lru_tail+0x32c/0x980 [ttm]
> [   24.859498]        ttm_mem_evict_first+0x25c/0x4b0 [ttm]
> [   24.859503]        ttm_resource_manager_evict_all+0x93/0x1b0 [ttm]
> [   24.859509]        nouveau_debugfs_fini+0x161/0x260 [nouveau]
> [   24.859545]        nouveau_drm_ioctl+0xa4a/0x1820 [nouveau]
> [   24.859582]        pci_pm_runtime_suspend+0x5c/0x180
> [   24.859585]        __rpm_callback+0x48/0x1b0
> [   24.859589]        rpm_callback+0x5a/0x70
> [   24.859591]        rpm_suspend+0x10a/0x6f0
> [   24.859594]        pm_runtime_work+0xa0/0xb0
> [   24.859596]        process_one_work+0x254/0x560
> [   24.859601]        worker_thread+0x4f/0x390
> [   24.859604]        kthread+0xe6/0x110
> [   24.859607]        ret_from_fork+0x22/0x30
> [   24.859611] 
>                other info that might help us debug this:
> 
> [   24.859612]  Possible unsafe locking scenario:
> 
> [   24.859613]        CPU0                    CPU1
> [   24.859615]        ----                    ----
> [   24.859616]   lock(reservation_ww_class_mutex);
> [   24.859618]                                lock(&cli->mutex);
> [   24.859620]                               
> lock(reservation_ww_class_mutex);
> [   24.859622]   lock(&cli->mutex);
> [   24.859624] 
>                 *** DEADLOCK ***
> 
> [   24.859625] 3 locks held by kworker/1:1/46:
> [   24.859627]  #0: ffff92b0c0bb4338 ((wq_completion)pm){+.+.}-{0:0}, at:
> process_one_work+0x1d0/0x560
> [   24.859634]  #1: ffffa8ffc02dfe80 ((work_completion)(&dev-
> >power.work)){+.+.}-{0:0}, at: process_one_work+0x1d0/0x560
> [   24.859641]  #2: ffff92b0c4bf35a0 (reservation_ww_class_mutex){+.+.}-
> {3:3}, at: ttm_bo_wait+0x7d/0x140 [ttm]
> [   24.859649] 
>                stack backtrace:
> [   24.859651] CPU: 1 PID: 46 Comm: kworker/1:1 Tainted: G            E     5.18.0-rc6+ #34
> [   24.859654] Hardware name: Dell Inc. Latitude E6430/0H3MT5, BIOS A21 05/08/2017
> [   24.859656] Workqueue: pm pm_runtime_work
> [   24.859660] Call Trace:
> [   24.859662]  <TASK>
> [   24.859665]  dump_stack_lvl+0x5b/0x74
> [   24.859669]  check_noncircular+0xdf/0x100
> [   24.859672]  ? register_lock_class+0x38/0x470
> [   24.859678]  __lock_acquire+0x12e2/0x1f90
> [   24.859683]  lock_acquire+0xad/0x290
> [   24.859686]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859724]  ? lock_is_held_type+0xa6/0x120
> [   24.859730]  __mutex_lock+0x90/0x830
> [   24.859733]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859770]  ? nvif_vmm_map+0x114/0x130 [nouveau]
> [   24.859791]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859829]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859866]  nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859905]  ttm_bo_move_to_lru_tail+0x32c/0x980 [ttm]
> [   24.859912]  ttm_mem_evict_first+0x25c/0x4b0 [ttm]
> [   24.859919]  ? lock_release+0x20/0x2a0
> [   24.859923]  ttm_resource_manager_evict_all+0x93/0x1b0 [ttm]
> [   24.859930]  nouveau_debugfs_fini+0x161/0x260 [nouveau]
> [   24.859968]  nouveau_drm_ioctl+0xa4a/0x1820 [nouveau]
> [   24.860005]  pci_pm_runtime_suspend+0x5c/0x180
> [   24.860008]  ? pci_dev_put+0x20/0x20
> [   24.860011]  __rpm_callback+0x48/0x1b0
> [   24.860014]  ? pci_dev_put+0x20/0x20
> [   24.860018]  rpm_callback+0x5a/0x70
> [   24.860020]  ? pci_dev_put+0x20/0x20
> [   24.860023]  rpm_suspend+0x10a/0x6f0
> [   24.860025]  ? process_one_work+0x1d0/0x560
> [   24.860031]  pm_runtime_work+0xa0/0xb0
> [   24.860034]  process_one_work+0x254/0x560
> [   24.860039]  worker_thread+0x4f/0x390
> [   24.860043]  ? process_one_work+0x560/0x560
> [   24.860046]  kthread+0xe6/0x110
> [   24.860049]  ? kthread_complete_and_exit+0x20/0x20
> [   24.860053]  ret_from_fork+0x22/0x30
> [   24.860059]  </TASK>
> 
> Regards,
> 
> Hans
> 
> 

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: nouveau lockdep deadlock report with 5.18-rc6
@ 2022-05-18 17:42   ` Lyude Paul
  0 siblings, 0 replies; 10+ messages in thread
From: Lyude Paul @ 2022-05-18 17:42 UTC (permalink / raw)
  To: Hans de Goede, Ben Skeggs, Karol Herbst; +Cc: nouveau, dri-devel

Yeah, I noticed this as well. I will try to bisect this the next chance I
get.


On Tue, 2022-05-17 at 13:10 +0200, Hans de Goede wrote:
> Hi All,
> 
> I just noticed the below lockdep possible deadlock report with a 5.18-rc6
> kernel on a Dell Latitude E6430 laptop with the following nvidia GPU:
> 
> 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108GLM [NVS 5200M] [10de:0dfc] (rev a1)
> 01:00.1 Audio device [0403]: NVIDIA Corporation GF108 High Definition Audio Controller [10de:0bea] (rev a1)
> 
> This is with the laptop in Optimus mode, so with the Intel integrated
> gfx from the i5-3320M CPU driving the LCD panel and with nothing connected
> to the HDMI connector, which is always routed to the NVIDIA GPU on this
> laptop.
> 
> The lockdep possible deadlock warning seems to happen when the NVIDIA GPU
> is runtime suspended shortly after gdm has loaded:
> 
> [   24.859171] ======================================================
> [   24.859173] WARNING: possible circular locking dependency detected
> [   24.859175] 5.18.0-rc6+ #34 Tainted: G            E    
> [   24.859178] ------------------------------------------------------
> [   24.859179] kworker/1:1/46 is trying to acquire lock:
> [   24.859181] ffff92b0c0ee0518 (&cli->mutex){+.+.}-{3:3}, at: nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859231] 
>                but task is already holding lock:
> [   24.859233] ffff92b0c4bf35a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: ttm_bo_wait+0x7d/0x140 [ttm]
> [   24.859243] 
>                which lock already depends on the new lock.
> 
> [   24.859244] 
>                the existing dependency chain (in reverse order) is:
> [   24.859246] 
>                -> #1 (reservation_ww_class_mutex){+.+.}-{3:3}:
> [   24.859249]        __ww_mutex_lock.constprop.0+0xb3/0xfb0
> [   24.859256]        ww_mutex_lock+0x38/0xa0
> [   24.859259]        nouveau_bo_pin+0x30/0x380 [nouveau]
> [   24.859297]        nouveau_channel_del+0x1d7/0x3e0 [nouveau]
> [   24.859328]        nouveau_channel_new+0x48/0x730 [nouveau]
> [   24.859358]        nouveau_abi16_ioctl_channel_alloc+0x113/0x360 [nouveau]
> [   24.859389]        drm_ioctl_kernel+0xa1/0x150
> [   24.859392]        drm_ioctl+0x21c/0x410
> [   24.859395]        nouveau_drm_ioctl+0x56/0x1820 [nouveau]
> [   24.859431]        __x64_sys_ioctl+0x8d/0xc0
> [   24.859436]        do_syscall_64+0x5b/0x80
> [   24.859440]        entry_SYSCALL_64_after_hwframe+0x44/0xae
> [   24.859443] 
>                -> #0 (&cli->mutex){+.+.}-{3:3}:
> [   24.859446]        __lock_acquire+0x12e2/0x1f90
> [   24.859450]        lock_acquire+0xad/0x290
> [   24.859453]        __mutex_lock+0x90/0x830
> [   24.859456]        nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859493]        ttm_bo_move_to_lru_tail+0x32c/0x980 [ttm]
> [   24.859498]        ttm_mem_evict_first+0x25c/0x4b0 [ttm]
> [   24.859503]        ttm_resource_manager_evict_all+0x93/0x1b0 [ttm]
> [   24.859509]        nouveau_debugfs_fini+0x161/0x260 [nouveau]
> [   24.859545]        nouveau_drm_ioctl+0xa4a/0x1820 [nouveau]
> [   24.859582]        pci_pm_runtime_suspend+0x5c/0x180
> [   24.859585]        __rpm_callback+0x48/0x1b0
> [   24.859589]        rpm_callback+0x5a/0x70
> [   24.859591]        rpm_suspend+0x10a/0x6f0
> [   24.859594]        pm_runtime_work+0xa0/0xb0
> [   24.859596]        process_one_work+0x254/0x560
> [   24.859601]        worker_thread+0x4f/0x390
> [   24.859604]        kthread+0xe6/0x110
> [   24.859607]        ret_from_fork+0x22/0x30
> [   24.859611] 
>                other info that might help us debug this:
> 
> [   24.859612]  Possible unsafe locking scenario:
> 
> [   24.859613]        CPU0                    CPU1
> [   24.859615]        ----                    ----
> [   24.859616]   lock(reservation_ww_class_mutex);
> [   24.859618]                                lock(&cli->mutex);
> [   24.859620]                                lock(reservation_ww_class_mutex);
> [   24.859622]   lock(&cli->mutex);
> [   24.859624] 
>                 *** DEADLOCK ***
> 
> [   24.859625] 3 locks held by kworker/1:1/46:
> [   24.859627]  #0: ffff92b0c0bb4338 ((wq_completion)pm){+.+.}-{0:0}, at: process_one_work+0x1d0/0x560
> [   24.859634]  #1: ffffa8ffc02dfe80 ((work_completion)(&dev->power.work)){+.+.}-{0:0}, at: process_one_work+0x1d0/0x560
> [   24.859641]  #2: ffff92b0c4bf35a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: ttm_bo_wait+0x7d/0x140 [ttm]
> [   24.859649] 
>                stack backtrace:
> [   24.859651] CPU: 1 PID: 46 Comm: kworker/1:1 Tainted: G            E     5.18.0-rc6+ #34
> [   24.859654] Hardware name: Dell Inc. Latitude E6430/0H3MT5, BIOS A21 05/08/2017
> [   24.859656] Workqueue: pm pm_runtime_work
> [   24.859660] Call Trace:
> [   24.859662]  <TASK>
> [   24.859665]  dump_stack_lvl+0x5b/0x74
> [   24.859669]  check_noncircular+0xdf/0x100
> [   24.859672]  ? register_lock_class+0x38/0x470
> [   24.859678]  __lock_acquire+0x12e2/0x1f90
> [   24.859683]  lock_acquire+0xad/0x290
> [   24.859686]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859724]  ? lock_is_held_type+0xa6/0x120
> [   24.859730]  __mutex_lock+0x90/0x830
> [   24.859733]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859770]  ? nvif_vmm_map+0x114/0x130 [nouveau]
> [   24.859791]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859829]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859866]  nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [   24.859905]  ttm_bo_move_to_lru_tail+0x32c/0x980 [ttm]
> [   24.859912]  ttm_mem_evict_first+0x25c/0x4b0 [ttm]
> [   24.859919]  ? lock_release+0x20/0x2a0
> [   24.859923]  ttm_resource_manager_evict_all+0x93/0x1b0 [ttm]
> [   24.859930]  nouveau_debugfs_fini+0x161/0x260 [nouveau]
> [   24.859968]  nouveau_drm_ioctl+0xa4a/0x1820 [nouveau]
> [   24.860005]  pci_pm_runtime_suspend+0x5c/0x180
> [   24.860008]  ? pci_dev_put+0x20/0x20
> [   24.860011]  __rpm_callback+0x48/0x1b0
> [   24.860014]  ? pci_dev_put+0x20/0x20
> [   24.860018]  rpm_callback+0x5a/0x70
> [   24.860020]  ? pci_dev_put+0x20/0x20
> [   24.860023]  rpm_suspend+0x10a/0x6f0
> [   24.860025]  ? process_one_work+0x1d0/0x560
> [   24.860031]  pm_runtime_work+0xa0/0xb0
> [   24.860034]  process_one_work+0x254/0x560
> [   24.860039]  worker_thread+0x4f/0x390
> [   24.860043]  ? process_one_work+0x560/0x560
> [   24.860046]  kthread+0xe6/0x110
> [   24.860049]  ? kthread_complete_and_exit+0x20/0x20
> [   24.860053]  ret_from_fork+0x22/0x30
> [   24.860059]  </TASK>
> 
> Regards,
> 
> Hans
> 
> 

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat



* Re: [Nouveau] nouveau lockdep deadlock report with 5.18-rc6
  2022-05-18 17:42   ` Lyude Paul
@ 2022-05-20 11:46     ` Computer Enthusiastic
  -1 siblings, 0 replies; 10+ messages in thread
From: Computer Enthusiastic @ 2022-05-20 11:46 UTC (permalink / raw)
  To: Lyude Paul; +Cc: Hans de Goede, dri-devel, Ben Skeggs, nouveau

Hello,

On Wed, 18 May 2022 at 19:42, Lyude Paul <lyude@redhat.com> wrote:
>
> Yeah, I noticed this as well. I will try to bisect this the next chance I
> get.
>
> On Tue, 2022-05-17 at 13:10 +0200, Hans de Goede wrote:
> > Hi All,
> > I just noticed the below lockdep possible deadlock report with a 5.18-rc6
> > kernel on a Dell Latitude E6430 laptop with the following nvidia GPU:
> > [..]
I hope I'm not off topic with regard to the kernel version; if so, I
apologize in advance.

I would like to report that I'm consistently observing a similar, but
somewhat different, lockdep warning (see [1]) on kernels 5.16 and 5.17
(compiled with lockdep debugging enabled) every time I activate
Suspend To RAM (regardless of whether STR succeeds or not).

Thanks.

[1] https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/-/issues/547#note_1361411


* Re: [Nouveau] nouveau lockdep deadlock report with 5.18-rc6
  2022-05-20 11:46     ` Computer Enthusiastic
@ 2022-05-23 19:59       ` Lyude Paul
  -1 siblings, 0 replies; 10+ messages in thread
From: Lyude Paul @ 2022-05-23 19:59 UTC (permalink / raw)
  To: Computer Enthusiastic; +Cc: Hans de Goede, dri-devel, Ben Skeggs, nouveau

On Fri, 2022-05-20 at 13:46 +0200, Computer Enthusiastic wrote:
> Hello,
> 
> On Wed, 18 May 2022 at 19:42, Lyude Paul <lyude@redhat.com> wrote:
> > 
> > On Tue, 2022-05-17 at 13:10 +0200, Hans de Goede wrote:
> > > Hi All,
> > > I just noticed the below lockdep possible deadlock report with a 5.18-
> > > rc6
> > > kernel on a Dell Latitude E6430 laptop with the following nvidia GPU:
> > > [..]
> I hope I'm not off topic with regard to the kernel version; if so, I
> apologize in advance.
> 
> I would like to report that I'm consistently observing a similar, but
> somewhat different, lockdep warning (see [1]) on kernels 5.16 and 5.17
> (compiled with lockdep debugging enabled) every time I activate
> Suspend To RAM (regardless of whether STR succeeds or not).

You may be on the right track, actually: so far my investigation has shown
that this bug was definitely present in those kernel versions as well. I'm
still trying to come up with a solution for this.

> 
> Thanks.
> 
> [1]
> https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/-/issues/547#note_1361411
> 

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat



end of thread, other threads:[~2022-05-23 19:59 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-17 11:10 [Nouveau] nouveau lockdep deadlock report with 5.18-rc6 Hans de Goede
2022-05-17 11:10 ` Hans de Goede
2022-05-17 22:24 ` [Nouveau] " Lyude Paul
2022-05-17 22:24   ` Lyude Paul
2022-05-18 17:42 ` [Nouveau] " Lyude Paul
2022-05-18 17:42   ` Lyude Paul
2022-05-20 11:46   ` [Nouveau] " Computer Enthusiastic
2022-05-20 11:46     ` Computer Enthusiastic
2022-05-23 19:59     ` Lyude Paul
2022-05-23 19:59       ` Lyude Paul
