* Warnings in DRM code when removing/unbinding a driver
@ 2019-12-16 17:23 ` John Garry
  0 siblings, 0 replies; 32+ messages in thread
From: John Garry @ 2019-12-16 17:23 UTC (permalink / raw)
  To: xinliang, kongxinwei (A), kongxinwei (A), Chenfeng (puck),
	airlied, daniel
  Cc: dri-devel, linux-kernel

Hi all,

Enabling CONFIG_DEBUG_TEST_DRIVER_REMOVE causes many warnings on a system 
with HIBMC hardware:

[   27.788806] WARNING: CPU: 24 PID: 1 at drivers/gpu/drm/drm_gem_vram_helper.c:564 bo_driver_move_notify+0x8c/0x98
[   27.798969] Modules linked in:
[   27.802018] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B   5.5.0-rc1-dirty #565
[   27.810358] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
[   27.818872] pstate: 20c00009 (nzCv daif +PAN +UAO)
[   27.823654] pc : bo_driver_move_notify+0x8c/0x98
[   27.828262] lr : bo_driver_move_notify+0x40/0x98
[   27.832868] sp : ffff00236f0677e0
[   27.836173] x29: ffff00236f0677e0 x28: ffffa0001454e5e0
[   27.841476] x27: ffff002366e52128 x26: ffffa000149e67b0
[   27.846779] x25: ffff002366e523e0 x24: ffff002336936120
[   27.852082] x23: ffff0023346f4010 x22: ffff002336936128
[   27.857385] x21: ffffa000149c15c0 x20: ffff0023369361f8
[   27.862687] x19: ffff002336936000 x18: 0000000000001258
[   27.867989] x17: 0000000000001190 x16: 00000000000011d0
[   27.873292] x15: 0000000000001348 x14: ffffa00012d68190
[   27.878595] x13: 0000000000000006 x12: 1ffff40003241f91
[   27.883897] x11: ffff940003241f91 x10: dfffa00000000000
[   27.889200] x9 : ffff940003241f92 x8 : 0000000000000001
[   27.894502] x7 : ffffa0001920fc88 x6 : ffff940003241f92
[   27.899804] x5 : ffff940003241f92 x4 : ffff0023369363a0
[   27.905107] x3 : ffffa00010c104b8 x2 : dfffa00000000000
[   27.910409] x1 : 0000000000000003 x0 : 0000000000000001
[   27.915712] Call trace:
[   27.918151]  bo_driver_move_notify+0x8c/0x98
[   27.922412]  ttm_bo_cleanup_memtype_use+0x54/0x100
[   27.927194]  ttm_bo_put+0x3a0/0x5d0
[   27.930673]  drm_gem_vram_object_free+0xc/0x18
[   27.935109]  drm_gem_object_free+0x34/0xd0
[   27.939196]  drm_gem_object_put_unlocked+0xc8/0xf0
[   27.943978]  hibmc_user_framebuffer_destroy+0x20/0x40
[   27.949020]  drm_framebuffer_free+0x48/0x58
[   27.953194]  drm_mode_object_put.part.1+0x90/0xe8
[   27.957889]  drm_mode_object_put+0x28/0x38
[   27.961976]  hibmc_fbdev_fini+0x54/0x78
[   27.965802]  hibmc_unload+0x2c/0xd0
[   27.969281]  hibmc_pci_remove+0x2c/0x40
[   27.973109]  pci_device_remove+0x6c/0x140
[   27.977110]  really_probe+0x174/0x548
[   27.980763]  driver_probe_device+0x7c/0x148
[   27.984936]  device_driver_attach+0x94/0xa0
[   27.989109]  __driver_attach+0xa8/0x110
[   27.992935]  bus_for_each_dev+0xe8/0x158
[   27.996849]  driver_attach+0x30/0x40
[   28.000415]  bus_add_driver+0x234/0x2f0
[   28.004241]  driver_register+0xbc/0x1d0
[   28.008067]  __pci_register_driver+0xbc/0xd0
[   28.012329]  hibmc_pci_driver_init+0x20/0x28
[   28.016590]  do_one_initcall+0xb4/0x254
[   28.020417]  kernel_init_freeable+0x27c/0x328
[   28.024765]  kernel_init+0x10/0x118
[   28.028245]  ret_from_fork+0x10/0x18
[   28.031813] ---[ end trace 35a83b71b657878d ]---
[   28.036503] ------------[ cut here ]------------
[   28.041115] WARNING: CPU: 24 PID: 1 at drivers/gpu/drm/drm_gem_vram_helper.c:40 ttm_buffer_object_destroy+0x4c/0x80
[   28.051537] Modules linked in:
[   28.054585] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B   W   5.5.0-rc1-dirty #565
[   28.062924] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019

[snip]

Indeed, simply unbinding the device from the driver causes the same sort 
of issue:

root@(none)$ cd ./bus/pci/drivers/hibmc-drm/
root@(none)$ ls
0000:05:00.0  bind          new_id        remove_id     uevent        unbind
root@(none)$ echo 0000\:05\:00.0 > unbind
[  116.074352] ------------[ cut here ]------------
[  116.078978] WARNING: CPU: 17 PID: 1178 at drivers/gpu/drm/drm_gem_vram_helper.c:40 ttm_buffer_object_destroy+0x4c/0x80
[  116.089661] Modules linked in:
[  116.092711] CPU: 17 PID: 1178 Comm: sh Tainted: G    B   W 5.5.0-rc1-dirty #565
[  116.100704] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
[  116.109218] pstate: 20400009 (nzCv daif +PAN -UAO)
[  116.114001] pc : ttm_buffer_object_destroy+0x4c/0x80
[  116.118956] lr : ttm_buffer_object_destroy+0x18/0x80
[  116.123910] sp : ffff0022e6cef8e0
[  116.127215] x29: ffff0022e6cef8e0 x28: ffff00231b1fb000
[  116.132519] x27: 0000000000000000 x26: ffff00231b1fb000
[  116.137821] x25: ffff0022e6cefdc0 x24: 0000000000002480
[  116.143124] x23: ffff0023682b6ab0 x22: ffff0023682b6800
[  116.148427] x21: ffff0023682b6800 x20: 0000000000000000
[  116.153730] x19: ffff0023682b6800 x18: 0000000000000000
[  116.159032] x17: 000000000000000000000000001
[  116.185545] x7 : ffff0023682b6b07 x6 : ffff80046d056d61
[  116.190848] x5 : ffff80046d056d61 x4 : ffff0023682b6ba0
[  116.196151] x3 : ffffa00010197338 x2 : dfffa00000000000
[  116.201453] x1 : 0000000000000003 x0 : 0000000000000001
[  116.206756] Call trace:
[  116.209195]  ttm_buffer_object_destroy+0x4c/0x80
[  116.213803]  ttm_bo_release_list+0x184/0x220
[  116.218064]  ttm_bo_put+0x410/0x5d0
[  116.221544]  drm_gem_vram_object_free+0xc/0x18
[  116.225979]  drm_gem_object_free+0x34/0xd0
[  116.230066]  drm_gem_object_put_unlocked+0xc8/0xf0
[  116.234848]  hibmc_user_framebuffer_destroy+0x20/0x40
[  116.239890]  drm_framebuffer_free+0x48/0x58
[  116.244064]  drm_mode_object_put.part.1+0x90/0xe8
[  116.248759]  drm_mode_object_put+0x28/0x38
[  116.252846]  hibmc_fbdev_fini+0x54/0x78
[  116.256672]  hibmc_unload+0x2c/0xd0
[  116.260151]  hibmc_pci_remove+0x2c/0x40
[  116.263979]  pci_device_remove+0x6c/0x140
[  116.267980]  device_release_driver_internal+0x134/0x250
[  116.273196]  device_driver_detach+0x28/0x38
[  116.277369]  unbind_store+0xfc/0x150
[  116.280934]  drv_attr_store+0x48/0x60
[  116.284589]  sysfs_kf_write+0x80/0xb0
[  116.288241]  kernfs_fop_write+0x1d4/0x320
[  116.292243]  __vfs_write+0x54/0x98
[  116.295635]  vfs_write+0xe8/0x270
[  116.298940]  ksys_write+0xc8/0x180
[  116.302333]  __arm64_sys_write+0x40/0x50
[  116.306248]  el0_svc_common.constprop.0+0xa4/0x1f8
[  116.311029]  el0_svc_handler+0x34/0xb0
[  116.314770]  el0_sync_handler+0x10c/0x1c8
[  116.318769]  el0_sync+0x140/0x180
[  116.322074] ---[ end trace e60e43d0e316b5c8 ]---
[  116.326868] ------------[ cut here ]------------


dmesg and .config are here:
https://pastebin.com/4P5yaZBS

I'm not sure if this is a HIBMC driver issue or an issue with the framework.

john
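
For triage, the WARNING headers in a log like the one above can be tallied by source location, which makes it easier to see that the splats come from only a couple of call sites. The following is a minimal sketch in Python, not part of any kernel tooling: the function name summarize_warnings and the regular expression are assumptions made for illustration, and the pattern only covers the "WARNING: CPU: N PID: N at file:line func+0x.../0x..." header shape seen in this report. Quoted ("> ") copies of the log are not handled.

```python
import re
import sys
from collections import Counter

# Header of a kernel WARNING splat as it appears in this report, e.g.
#   WARNING: CPU: 24 PID: 1 at drivers/gpu/drm/drm_gem_vram_helper.c:564 bo_driver_move_notify+0x8c/0x98
# (Assumed shape; other kernel versions may format the header differently.)
WARN_RE = re.compile(
    r"WARNING: CPU: \d+ PID: \d+ at\s+"
    r"(?P<file>\S+):(?P<line>\d+)\s+"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_warnings(log_text):
    """Count WARNING splats per (file, line, function) in a dmesg-style log."""
    # Mail clients wrap long log lines, so collapse all whitespace runs
    # (including newlines) into single spaces before matching.
    flat = " ".join(log_text.split())
    return Counter(
        (m.group("file"), int(m.group("line")), m.group("func"))
        for m in WARN_RE.finditer(flat)
    )


if __name__ == "__main__":
    # Hypothetical usage: dmesg | python3 summarize_warnings.py
    for (path, line, func), count in summarize_warnings(sys.stdin.read()).most_common():
        print(f"{count:4d}  {path}:{line}  {func}")
```

Collapsing whitespace first is a deliberate choice so that the same pattern works on both raw dmesg output and the line-wrapped form that appears in mail.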




* Re: Warnings in DRM code when removing/unbinding a driver
  2019-12-16 17:23 ` John Garry
@ 2019-12-17  9:20   ` John Garry
  -1 siblings, 0 replies; 32+ messages in thread
From: John Garry @ 2019-12-17  9:20 UTC (permalink / raw)
  To: zourongrong, kongxinwei (A), Chenfeng (puck), airlied, daniel
  Cc: dri-devel, linux-kernel, Linuxarm, xuwei (O)

On 16/12/2019 17:23, John Garry wrote:

+, -

> Hi all,

xinliang <z.liuxinliang@hisilicon.com> is bouncing. We need to get his 
new mail address.

John

> 
> Enabling CONFIG_DEBUG_TEST_DRIVER_REMOVE causes many warns on a system 
> with the HIBMC hw:
> 
> [   27.788806] WARNING: CPU: 24 PID: 1 at 
> drivers/gpu/drm/drm_gem_vram_helper.c:564 bo_driver_move_notify+0x8c/0x98
> [   27.798969] Modules linked in:
> [   27.802018] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B 
>   5.5.0-rc1-dirty #565
> [   27.810358] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI 
> RC0 - V1.16.01 03/15/2019
> [   27.818872] pstate: 20c00009 (nzCv daif +PAN +UAO)
> [   27.823654] pc : bo_driver_move_notify+0x8c/0x98
> [   27.828262] lr : bo_driver_move_notify+0x40/0x98
> [   27.832868] sp : ffff00236f0677e0
> [   27.836173] x29: ffff00236f0677e0 x28: ffffa0001454e5e0
> [   27.841476] x27: ffff002366e52128 x26: ffffa000149e67b0
> [   27.846779] x25: ffff002366e523e0 x24: ffff002336936120
> [   27.852082] x23: ffff0023346f4010 x22: ffff002336936128
> [   27.857385] x21: ffffa000149c15c0 x20: ffff0023369361f8
> [   27.862687] x19: ffff002336936000 x18: 0000000000001258
> [   27.867989] x17: 0000000000001190 x16: 00000000000011d0
> [   27.873292] x15: 0000000000001348 x14: ffffa00012d68190
> [   27.878595] x13: 0000000000000006 x12: 1ffff40003241f91
> [   27.883897] x11: ffff940003241f91 x10: dfffa00000000000
> [   27.889200] x9 : ffff940003241f92 x8 : 0000000000000001
> [   27.894502] x7 : ffffa0001920fc88 x6 : ffff940003241f92
> [   27.899804] x5 : ffff940003241f92 x4 : ffff0023369363a0
> [   27.905107] x3 : ffffa00010c104b8 x2 : dfffa00000000000
> [   27.910409] x1 : 0000000000000003 x0 : 0000000000000001
> [   27.915712] Call trace:
> [   27.918151]  bo_driver_move_notify+0x8c/0x98
> [   27.922412]  ttm_bo_cleanup_memtype_use+0x54/0x100
> [   27.927194]  ttm_bo_put+0x3a0/0x5d0
> [   27.930673]  drm_gem_vram_object_free+0xc/0x18
> [   27.935109]  drm_gem_object_free+0x34/0xd0
> [   27.939196]  drm_gem_object_put_unlocked+0xc8/0xf0
> [   27.943978]  hibmc_user_framebuffer_destroy+0x20/0x40
> [   27.949020]  drm_framebuffer_free+0x48/0x58
> [   27.953194]  drm_mode_object_put.part.1+0x90/0xe8
> [   27.957889]  drm_mode_object_put+0x28/0x38
> [   27.961976]  hibmc_fbdev_fini+0x54/0x78
> [   27.965802]  hibmc_unload+0x2c/0xd0
> [   27.969281]  hibmc_pci_remove+0x2c/0x40
> [   27.973109]  pci_device_remove+0x6c/0x140
> [   27.977110]  really_probe+0x174/0x548
> [   27.980763]  driver_probe_device+0x7c/0x148
> [   27.984936]  device_driver_attach+0x94/0xa0
> [   27.989109]  __driver_attach+0xa8/0x110
> [   27.992935]  bus_for_each_dev+0xe8/0x158
> [   27.996849]  driver_attach+0x30/0x40
> [   28.000415]  bus_add_driver+0x234/0x2f0
> [   28.004241]  driver_register+0xbc/0x1d0
> [   28.008067]  __pci_register_driver+0xbc/0xd0
> [   28.012329]  hibmc_pci_driver_init+0x20/0x28
> [   28.016590]  do_one_initcall+0xb4/0x254
> [   28.020417]  kernel_init_freeable+0x27c/0x328
> [   28.024765]  kernel_init+0x10/0x118
> [   28.028245]  ret_from_fork+0x10/0x18
> [   28.031813] ---[ end trace 35a83b71b657878d ]---
> [   28.036503] ------------[ cut here ]------------
> [   28.041115] WARNING: CPU: 24 PID: 1 at 
> drivers/gpu/drm/drm_gem_vram_helper.c:40 
> ttm_buffer_object_destroy+0x4c/0x80
> [   28.051537] Modules linked in:
> [   28.054585] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B   W 
>   5.5.0-rc1-dirty #565
> [   28.062924] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI 
> RC0 - V1.16.01 03/15/2019
> 
> [snip]
> 
> Indeed, simply unbinding the device from the driver causes the same sort 
> of issue:
> 
> root@(none)$ cd ./bus/pci/drivers/hibmc-drm/
> root@(none)$ ls
> 0000:05:00.0  bind          new_id        remove_id     uevent        
> unbind
> root@(none)$ echo 0000\:05\:00.0 > unbind
> [  116.074352] ------------[ cut here ]------------
> [  116.078978] WARNING: CPU: 17 PID: 1178 at 
> drivers/gpu/drm/drm_gem_vram_helper.c:40 
> ttm_buffer_object_destroy+0x4c/0x80
> [  116.089661] Modules linked in:
> [  116.092711] CPU: 17 PID: 1178 Comm: sh Tainted: G    B   W 
> 5.5.0-rc1-dirty #565
> [  116.100704] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI 
> RC0 - V1.16.01 03/15/2019
> [  116.109218] pstate: 20400009 (nzCv daif +PAN -UAO)
> [  116.114001] pc : ttm_buffer_object_destroy+0x4c/0x80
> [  116.118956] lr : ttm_buffer_object_destroy+0x18/0x80
> [  116.123910] sp : ffff0022e6cef8e0
> [  116.127215] x29: ffff0022e6cef8e0 x28: ffff00231b1fb000
> [  116.132519] x27: 0000000000000000 x26: ffff00231b1fb000
> [  116.137821] x25: ffff0022e6cefdc0 x24: 0000000000002480
> [  116.143124] x23: ffff0023682b6ab0 x22: ffff0023682b6800
> [  116.148427] x21: ffff0023682b6800 x20: 0000000000000000
> [  116.153730] x19: ffff0023682b6800 x18: 0000000000000000
> [  116.159032] x17: 000000000000000000000000001
> [  116.185545] x7 : ffff0023682b6b07 x6 : ffff80046d056d61
> [  116.190848] x5 : ffff80046d056d61 x4 : ffff0023682b6ba0
> [  116.196151] x3 : ffffa00010197338 x2 : dfffa00000000000
> [  116.201453] x1 : 0000000000000003 x0 : 0000000000000001
> [  116.206756] Call trace:
> [  116.209195]  ttm_buffer_object_destroy+0x4c/0x80
> [  116.213803]  ttm_bo_release_list+0x184/0x220
> [  116.218064]  ttm_bo_put+0x410/0x5d0
> [  116.221544]  drm_gem_vram_object_free+0xc/0x18
> [  116.225979]  drm_gem_object_free+0x34/0xd0
> [  116.230066]  drm_gem_object_put_unlocked+0xc8/0xf0
> [  116.234848]  hibmc_user_framebuffer_destroy+0x20/0x40
> [  116.239890]  drm_framebuffer_free+0x48/0x58
> [  116.244064]  drm_mode_object_put.part.1+0x90/0xe8
> [  116.248759]  drm_mode_object_put+0x28/0x38
> [  116.252846]  hibmc_fbdev_fini+0x54/0x78
> [  116.256672]  hibmc_unload+0x2c/0xd0
> [  116.260151]  hibmc_pci_remove+0x2c/0x40
> [  116.263979]  pci_device_remove+0x6c/0x140
> [  116.267980]  device_release_driver_internal+0x134/0x250
> [  116.273196]  device_driver_detach+0x28/0x38
> [  116.277369]  unbind_store+0xfc/0x150
> [  116.280934]  drv_attr_store+0x48/0x60
> [  116.284589]  sysfs_kf_write+0x80/0xb0
> [  116.288241]  kernfs_fop_write+0x1d4/0x320
> [  116.292243]  __vfs_write+0x54/0x98
> [  116.295635]  vfs_write+0xe8/0x270
> [  116.298940]  ksys_write+0xc8/0x180
> [  116.302333]  __arm64_sys_write+0x40/0x50
> [  116.306248]  el0_svc_common.constprop.0+0xa4/0x1f8
> [  116.311029]  el0_svc_handler+0x34/0xb0
> [  116.314770]  el0_sync_handler+0x10c/0x1c8
> [  116.318769]  el0_sync+0x140/0x180
> [  116.322074] ---[ end trace e60e43d0e316b5c8 ]---
> [  116.326868] ------------[ cut here ]------------
> 
> 
> dmesg and .config is here:
> https://pastebin.com/4P5yaZBS
> 
> I'm not sure if this is a HIBMC driver issue or issue with the framework.
> 
> john
> 
> 



* Re: Warnings in DRM code when removing/unbinding a driver
  2019-12-17  9:20   ` John Garry
@ 2019-12-17 13:24     ` Daniel Vetter
  -1 siblings, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2019-12-17 13:24 UTC (permalink / raw)
  To: John Garry
  Cc: zourongrong, kongxinwei (A), Chenfeng (puck),
	airlied, daniel, dri-devel, linux-kernel, Linuxarm, xuwei (O)

On Tue, Dec 17, 2019 at 09:20:43AM +0000, John Garry wrote:
> On 16/12/2019 17:23, John Garry wrote:
> 
> +, -
> 
> > Hi all,
> 
> xinliang <z.liuxinliang@hisilicon.com> is bouncing. We need to get his new
> mail address.
> 
> John
> 
> > 
> > Enabling CONFIG_DEBUG_TEST_DRIVER_REMOVE causes many warns on a system
> > with the HIBMC hw:
> > 
> > [   27.788806] WARNING: CPU: 24 PID: 1 at
> > drivers/gpu/drm/drm_gem_vram_helper.c:564
> > bo_driver_move_notify+0x8c/0x98
> > [   27.798969] Modules linked in:
> > [   27.802018] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B
> >  5.5.0-rc1-dirty #565
> > [   27.810358] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
> > RC0 - V1.16.01 03/15/2019
> > [   27.818872] pstate: 20c00009 (nzCv daif +PAN +UAO)
> > [   27.823654] pc : bo_driver_move_notify+0x8c/0x98
> > [   27.828262] lr : bo_driver_move_notify+0x40/0x98
> > [   27.832868] sp : ffff00236f0677e0
> > [   27.836173] x29: ffff00236f0677e0 x28: ffffa0001454e5e0
> > [   27.841476] x27: ffff002366e52128 x26: ffffa000149e67b0
> > [   27.846779] x25: ffff002366e523e0 x24: ffff002336936120
> > [   27.852082] x23: ffff0023346f4010 x22: ffff002336936128
> > [   27.857385] x21: ffffa000149c15c0 x20: ffff0023369361f8
> > [   27.862687] x19: ffff002336936000 x18: 0000000000001258
> > [   27.867989] x17: 0000000000001190 x16: 00000000000011d0
> > [   27.873292] x15: 0000000000001348 x14: ffffa00012d68190
> > [   27.878595] x13: 0000000000000006 x12: 1ffff40003241f91
> > [   27.883897] x11: ffff940003241f91 x10: dfffa00000000000
> > [   27.889200] x9 : ffff940003241f92 x8 : 0000000000000001
> > [   27.894502] x7 : ffffa0001920fc88 x6 : ffff940003241f92
> > [   27.899804] x5 : ffff940003241f92 x4 : ffff0023369363a0
> > [   27.905107] x3 : ffffa00010c104b8 x2 : dfffa00000000000
> > [   27.910409] x1 : 0000000000000003 x0 : 0000000000000001
> > [   27.915712] Call trace:
> > [   27.918151]  bo_driver_move_notify+0x8c/0x98
> > [   27.922412]  ttm_bo_cleanup_memtype_use+0x54/0x100
> > [   27.927194]  ttm_bo_put+0x3a0/0x5d0
> > [   27.930673]  drm_gem_vram_object_free+0xc/0x18
> > [   27.935109]  drm_gem_object_free+0x34/0xd0
> > [   27.939196]  drm_gem_object_put_unlocked+0xc8/0xf0
> > [   27.943978]  hibmc_user_framebuffer_destroy+0x20/0x40
> > [   27.949020]  drm_framebuffer_free+0x48/0x58
> > [   27.953194]  drm_mode_object_put.part.1+0x90/0xe8
> > [   27.957889]  drm_mode_object_put+0x28/0x38
> > [   27.961976]  hibmc_fbdev_fini+0x54/0x78
> > [   27.965802]  hibmc_unload+0x2c/0xd0
> > [   27.969281]  hibmc_pci_remove+0x2c/0x40
> > [   27.973109]  pci_device_remove+0x6c/0x140
> > [   27.977110]  really_probe+0x174/0x548
> > [   27.980763]  driver_probe_device+0x7c/0x148
> > [   27.984936]  device_driver_attach+0x94/0xa0
> > [   27.989109]  __driver_attach+0xa8/0x110
> > [   27.992935]  bus_for_each_dev+0xe8/0x158
> > [   27.996849]  driver_attach+0x30/0x40
> > [   28.000415]  bus_add_driver+0x234/0x2f0
> > [   28.004241]  driver_register+0xbc/0x1d0
> > [   28.008067]  __pci_register_driver+0xbc/0xd0
> > [   28.012329]  hibmc_pci_driver_init+0x20/0x28
> > [   28.016590]  do_one_initcall+0xb4/0x254
> > [   28.020417]  kernel_init_freeable+0x27c/0x328
> > [   28.024765]  kernel_init+0x10/0x118
> > [   28.028245]  ret_from_fork+0x10/0x18
> > [   28.031813] ---[ end trace 35a83b71b657878d ]---
> > [   28.036503] ------------[ cut here ]------------
> > [   28.041115] WARNING: CPU: 24 PID: 1 at
> > drivers/gpu/drm/drm_gem_vram_helper.c:40
> > ttm_buffer_object_destroy+0x4c/0x80
> > [   28.051537] Modules linked in:
> > [   28.054585] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B   W
> >  5.5.0-rc1-dirty #565
> > [   28.062924] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
> > RC0 - V1.16.01 03/15/2019
> > 
> > [snip]
> > 
> > Indeed, simply unbinding the device from the driver causes the same sort
> > of issue:
> > 
> > root@(none)$ cd ./bus/pci/drivers/hibmc-drm/
> > root@(none)$ ls
> > 0000:05:00.0  bind          new_id        remove_id     uevent
> > unbind
> > root@(none)$ echo 0000\:05\:00.0 > unbind
> > [  116.074352] ------------[ cut here ]------------
> > [  116.078978] WARNING: CPU: 17 PID: 1178 at
> > drivers/gpu/drm/drm_gem_vram_helper.c:40
> > ttm_buffer_object_destroy+0x4c/0x80
> > [  116.089661] Modules linked in:
> > [  116.092711] CPU: 17 PID: 1178 Comm: sh Tainted: G    B   W
> > 5.5.0-rc1-dirty #565
> > [  116.100704] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
> > RC0 - V1.16.01 03/15/2019
> > [  116.109218] pstate: 20400009 (nzCv daif +PAN -UAO)
> > [  116.114001] pc : ttm_buffer_object_destroy+0x4c/0x80
> > [  116.118956] lr : ttm_buffer_object_destroy+0x18/0x80
> > [  116.123910] sp : ffff0022e6cef8e0
> > [  116.127215] x29: ffff0022e6cef8e0 x28: ffff00231b1fb000
> > [  116.132519] x27: 0000000000000000 x26: ffff00231b1fb000
> > [  116.137821] x25: ffff0022e6cefdc0 x24: 0000000000002480
> > [  116.143124] x23: ffff0023682b6ab0 x22: ffff0023682b6800
> > [  116.148427] x21: ffff0023682b6800 x20: 0000000000000000
> > [  116.153730] x19: ffff0023682b6800 x18: 0000000000000000
> > [  116.159032] x17: 000000000000000000000000001
> > [  116.185545] x7 : ffff0023682b6b07 x6 : ffff80046d056d61
> > [  116.190848] x5 : ffff80046d056d61 x4 : ffff0023682b6ba0
> > [  116.196151] x3 : ffffa00010197338 x2 : dfffa00000000000
> > [  116.201453] x1 : 0000000000000003 x0 : 0000000000000001
> > [  116.206756] Call trace:
> > [  116.209195]  ttm_buffer_object_destroy+0x4c/0x80
> > [  116.213803]  ttm_bo_release_list+0x184/0x220
> > [  116.218064]  ttm_bo_put+0x410/0x5d0
> > [  116.221544]  drm_gem_vram_object_free+0xc/0x18
> > [  116.225979]  drm_gem_object_free+0x34/0xd0
> > [  116.230066]  drm_gem_object_put_unlocked+0xc8/0xf0
> > [  116.234848]  hibmc_user_framebuffer_destroy+0x20/0x40
> > [  116.239890]  drm_framebuffer_free+0x48/0x58
> > [  116.244064]  drm_mode_object_put.part.1+0x90/0xe8
> > [  116.248759]  drm_mode_object_put+0x28/0x38
> > [  116.252846]  hibmc_fbdev_fini+0x54/0x78
> > [  116.256672]  hibmc_unload+0x2c/0xd0
> > [  116.260151]  hibmc_pci_remove+0x2c/0x40
> > [  116.263979]  pci_device_remove+0x6c/0x140
> > [  116.267980]  device_release_driver_internal+0x134/0x250
> > [  116.273196]  device_driver_detach+0x28/0x38
> > [  116.277369]  unbind_store+0xfc/0x150
> > [  116.280934]  drv_attr_store+0x48/0x60
> > [  116.284589]  sysfs_kf_write+0x80/0xb0
> > [  116.288241]  kernfs_fop_write+0x1d4/0x320
> > [  116.292243]  __vfs_write+0x54/0x98
> > [  116.295635]  vfs_write+0xe8/0x270
> > [  116.298940]  ksys_write+0xc8/0x180
> > [  116.302333]  __arm64_sys_write+0x40/0x50
> > [  116.306248]  el0_svc_common.constprop.0+0xa4/0x1f8
> > [  116.311029]  el0_svc_handler+0x34/0xb0
> > [  116.314770]  el0_sync_handler+0x10c/0x1c8
> > [  116.318769]  el0_sync+0x140/0x180
> > [  116.322074] ---[ end trace e60e43d0e316b5c8 ]---
> > [  116.326868] ------------[ cut here ]------------
> > 
> > 
> > dmesg and .config is here:
> > https://pastebin.com/4P5yaZBS
> > 
> > I'm not sure if this is a HIBMC driver issue or issue with the framework.

Display-only drivers shouldn't go boom like this; the drm framework is
fixed for those. Unfortunately there are still many drivers that get their
unload sequence and resource refcounting totally wrong. For a start, see
devm_drm_dev_init() and the related documentation for current best
practices:

https://dri.freedesktop.org/docs/drm/gpu/drm-internals.html#display-driver-example

Cheers, Daniel
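
To make the point about unload sequence and refcounting concrete, here is a
minimal userland sketch of one common way teardown goes wrong: the last
reference to a buffer is dropped while a mapping on it is still outstanding,
so the release path complains. This is only a toy model. The names toy_bo,
toy_bo_map(), toy_bo_put() and the teardown_*() helpers are hypothetical and
are not the DRM/TTM API, and the thread does not confirm that this is the
cause of the HIBMC warnings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for a refcounted buffer object; NOT the real DRM/TTM types. */
struct toy_bo {
	int refcount;   /* number of holders of the object */
	int map_count;  /* mappings that have not been undone yet */
	bool released;  /* set once the last reference is gone */
	int warnings;   /* how often the release path complained */
};

static void toy_bo_init(struct toy_bo *bo)
{
	bo->refcount = 1;
	bo->map_count = 0;
	bo->released = false;
	bo->warnings = 0;
}

static void toy_bo_map(struct toy_bo *bo)
{
	assert(!bo->released);
	bo->map_count++;
}

static void toy_bo_unmap(struct toy_bo *bo)
{
	assert(bo->map_count > 0);
	bo->map_count--;
}

/* Drop one reference; the final put runs the release path. */
static void toy_bo_put(struct toy_bo *bo)
{
	assert(bo->refcount > 0);
	if (--bo->refcount > 0)
		return;

	/* Release path: a mapping still held here means teardown was misordered. */
	if (bo->map_count != 0) {
		bo->warnings++;
		fprintf(stderr, "WARNING: toy_bo released with %d mapping(s) held\n",
			bo->map_count);
	}
	bo->released = true;
}

/* Teardown that drops the reference but forgets to undo the mapping. */
static int teardown_wrong(void)
{
	struct toy_bo bo;

	toy_bo_init(&bo);
	toy_bo_map(&bo);
	toy_bo_put(&bo);
	return bo.warnings;
}

/* Teardown that undoes the mapping before dropping the last reference. */
static int teardown_right(void)
{
	struct toy_bo bo;

	toy_bo_init(&bo);
	toy_bo_map(&bo);
	toy_bo_unmap(&bo);
	toy_bo_put(&bo);
	return bo.warnings;
}
```

In this model teardown_wrong() reports one warning and teardown_right()
reports none; the only difference is whether the mapping is undone before
the final put.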

> > 
> > john
> > 
> > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Warnings in DRM code when removing/unbinding a driver
  2019-12-16 17:23 ` John Garry
@ 2019-12-17 16:34   ` Ezequiel Garcia
  -1 siblings, 0 replies; 32+ messages in thread
From: Ezequiel Garcia @ 2019-12-17 16:34 UTC (permalink / raw)
  To: John Garry, xinliang, kongxinwei (A), Chenfeng (puck), airlied, daniel
  Cc: linux-kernel, dri-devel

Hi John,

On Mon, 2019-12-16 at 17:23 +0000, John Garry wrote:
> Hi all,
> 
> Enabling CONFIG_DEBUG_TEST_DRIVER_REMOVE causes many warns on a system 
> with the HIBMC hw:
> 
> [   27.788806] WARNING: CPU: 24 PID: 1 at 
> drivers/gpu/drm/drm_gem_vram_helper.c:564 bo_driver_move_notify+0x8c/0x98

A total shot in the dark. This might make no sense,
but it's worth a try:

diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
index 2fd4ca91a62d..69bb0e29da88 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
@@ -247,9 +247,8 @@ static int hibmc_unload(struct drm_device *dev)
 {
        struct hibmc_drm_private *priv = dev->dev_private;
 
-       hibmc_fbdev_fini(priv);
-
        drm_atomic_helper_shutdown(dev);
+       hibmc_fbdev_fini(priv);
 
        if (dev->irq_enabled)
                drm_irq_uninstall(dev);

Hope it helps,
Ezequiel

> [   27.798969] Modules linked in:
> [   27.802018] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B 
>   5.5.0-rc1-dirty #565
> [   27.810358] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI 
> RC0 - V1.16.01 03/15/2019
> [   27.818872] pstate: 20c00009 (nzCv daif +PAN +UAO)
> [   27.823654] pc : bo_driver_move_notify+0x8c/0x98
> [   27.828262] lr : bo_driver_move_notify+0x40/0x98
> [   27.832868] sp : ffff00236f0677e0
> [   27.836173] x29: ffff00236f0677e0 x28: ffffa0001454e5e0
> [   27.841476] x27: ffff002366e52128 x26: ffffa000149e67b0
> [   27.846779] x25: ffff002366e523e0 x24: ffff002336936120
> [   27.852082] x23: ffff0023346f4010 x22: ffff002336936128
> [   27.857385] x21: ffffa000149c15c0 x20: ffff0023369361f8
> [   27.862687] x19: ffff002336936000 x18: 0000000000001258
> [   27.867989] x17: 0000000000001190 x16: 00000000000011d0
> [   27.873292] x15: 0000000000001348 x14: ffffa00012d68190
> [   27.878595] x13: 0000000000000006 x12: 1ffff40003241f91
> [   27.883897] x11: ffff940003241f91 x10: dfffa00000000000
> [   27.889200] x9 : ffff940003241f92 x8 : 0000000000000001
> [   27.894502] x7 : ffffa0001920fc88 x6 : ffff940003241f92
> [   27.899804] x5 : ffff940003241f92 x4 : ffff0023369363a0
> [   27.905107] x3 : ffffa00010c104b8 x2 : dfffa00000000000
> [   27.910409] x1 : 0000000000000003 x0 : 0000000000000001
> [   27.915712] Call trace:
> [   27.918151]  bo_driver_move_notify+0x8c/0x98
> [   27.922412]  ttm_bo_cleanup_memtype_use+0x54/0x100
> [   27.927194]  ttm_bo_put+0x3a0/0x5d0
> [   27.930673]  drm_gem_vram_object_free+0xc/0x18
> [   27.935109]  drm_gem_object_free+0x34/0xd0
> [   27.939196]  drm_gem_object_put_unlocked+0xc8/0xf0
> [   27.943978]  hibmc_user_framebuffer_destroy+0x20/0x40
> [   27.949020]  drm_framebuffer_free+0x48/0x58
> [   27.953194]  drm_mode_object_put.part.1+0x90/0xe8
> [   27.957889]  drm_mode_object_put+0x28/0x38
> [   27.961976]  hibmc_fbdev_fini+0x54/0x78
> [   27.965802]  hibmc_unload+0x2c/0xd0
> [   27.969281]  hibmc_pci_remove+0x2c/0x40
> [   27.973109]  pci_device_remove+0x6c/0x140
> [   27.977110]  really_probe+0x174/0x548
> [   27.980763]  driver_probe_device+0x7c/0x148
> [   27.984936]  device_driver_attach+0x94/0xa0
> [   27.989109]  __driver_attach+0xa8/0x110
> [   27.992935]  bus_for_each_dev+0xe8/0x158
> [   27.996849]  driver_attach+0x30/0x40
> [   28.000415]  bus_add_driver+0x234/0x2f0
> [   28.004241]  driver_register+0xbc/0x1d0
> [   28.008067]  __pci_register_driver+0xbc/0xd0
> [   28.012329]  hibmc_pci_driver_init+0x20/0x28
> [   28.016590]  do_one_initcall+0xb4/0x254
> [   28.020417]  kernel_init_freeable+0x27c/0x328
> [   28.024765]  kernel_init+0x10/0x118
> [   28.028245]  ret_from_fork+0x10/0x18
> [   28.031813] ---[ end trace 35a83b71b657878d ]---
> [   28.036503] ------------[ cut here ]------------
> [   28.041115] WARNING: CPU: 24 PID: 1 at 
> drivers/gpu/drm/drm_gem_vram_helper.c:40 ttm_buffer_object_destroy+0x4c/0x80
> [   28.051537] Modules linked in:
> [   28.054585] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B   W 
>   5.5.0-rc1-dirty #565
> [   28.062924] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI 
> RC0 - V1.16.01 03/15/2019
> 
> [snip]
> 
> Indeed, simply unbinding the device from the driver causes the same sort 
> of issue:
> 
> root@(none)$ cd ./bus/pci/drivers/hibmc-drm/
> root@(none)$ ls
> 0000:05:00.0  bind          new_id        remove_id     uevent        unbind
> root@(none)$ echo 0000\:05\:00.0 > unbind
> [  116.074352] ------------[ cut here ]------------
> [  116.078978] WARNING: CPU: 17 PID: 1178 at 
> drivers/gpu/drm/drm_gem_vram_helper.c:40 ttm_buffer_object_destroy+0x4c/0x80
> [  116.089661] Modules linked in:
> [  116.092711] CPU: 17 PID: 1178 Comm: sh Tainted: G    B   W 
> 5.5.0-rc1-dirty #565
> [  116.100704] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI 
> RC0 - V1.16.01 03/15/2019
> [  116.109218] pstate: 20400009 (nzCv daif +PAN -UAO)
> [  116.114001] pc : ttm_buffer_object_destroy+0x4c/0x80
> [  116.118956] lr : ttm_buffer_object_destroy+0x18/0x80
> [  116.123910] sp : ffff0022e6cef8e0
> [  116.127215] x29: ffff0022e6cef8e0 x28: ffff00231b1fb000
> [  116.132519] x27: 0000000000000000 x26: ffff00231b1fb000
> [  116.137821] x25: ffff0022e6cefdc0 x24: 0000000000002480
> [  116.143124] x23: ffff0023682b6ab0 x22: ffff0023682b6800
> [  116.148427] x21: ffff0023682b6800 x20: 0000000000000000
> [  116.153730] x19: ffff0023682b6800 x18: 0000000000000000
> [  116.159032] x17: 000000000000000000000000001
> [  116.185545] x7 : ffff0023682b6b07 x6 : ffff80046d056d61
> [  116.190848] x5 : ffff80046d056d61 x4 : ffff0023682b6ba0
> [  116.196151] x3 : ffffa00010197338 x2 : dfffa00000000000
> [  116.201453] x1 : 0000000000000003 x0 : 0000000000000001
> [  116.206756] Call trace:
> [  116.209195]  ttm_buffer_object_destroy+0x4c/0x80
> [  116.213803]  ttm_bo_release_list+0x184/0x220
> [  116.218064]  ttm_bo_put+0x410/0x5d0
> [  116.221544]  drm_gem_vram_object_free+0xc/0x18
> [  116.225979]  drm_gem_object_free+0x34/0xd0
> [  116.230066]  drm_gem_object_put_unlocked+0xc8/0xf0
> [  116.234848]  hibmc_user_framebuffer_destroy+0x20/0x40
> [  116.239890]  drm_framebuffer_free+0x48/0x58
> [  116.244064]  drm_mode_object_put.part.1+0x90/0xe8
> [  116.248759]  drm_mode_object_put+0x28/0x38
> [  116.252846]  hibmc_fbdev_fini+0x54/0x78
> [  116.256672]  hibmc_unload+0x2c/0xd0
> [  116.260151]  hibmc_pci_remove+0x2c/0x40
> [  116.263979]  pci_device_remove+0x6c/0x140
> [  116.267980]  device_release_driver_internal+0x134/0x250
> [  116.273196]  device_driver_detach+0x28/0x38
> [  116.277369]  unbind_store+0xfc/0x150
> [  116.280934]  drv_attr_store+0x48/0x60
> [  116.284589]  sysfs_kf_write+0x80/0xb0
> [  116.288241]  kernfs_fop_write+0x1d4/0x320
> [  116.292243]  __vfs_write+0x54/0x98
> [  116.295635]  vfs_write+0xe8/0x270
> [  116.298940]  ksys_write+0xc8/0x180
> [  116.302333]  __arm64_sys_write+0x40/0x50
> [  116.306248]  el0_svc_common.constprop.0+0xa4/0x1f8
> [  116.311029]  el0_svc_handler+0x34/0xb0
> [  116.314770]  el0_sync_handler+0x10c/0x1c8
> [  116.318769]  el0_sync+0x140/0x180
> [  116.322074] ---[ end trace e60e43d0e316b5c8 ]---
> [  116.326868] ------------[ cut here ]------------
> 
> 
> dmesg and .config is here:
> https://pastebin.com/4P5yaZBS
> 
> I'm not sure if this is a HIBMC driver issue or issue with the framework.
> 
> john
> 
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel



^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: Warnings in DRM code when removing/unbinding a driver
  2019-12-17 16:34   ` Ezequiel Garcia
@ 2019-12-17 17:27     ` John Garry
  -1 siblings, 0 replies; 32+ messages in thread
From: John Garry @ 2019-12-17 17:27 UTC (permalink / raw)
  To: Ezequiel Garcia, kongxinwei (A), Chenfeng (puck), airlied, daniel
  Cc: Linuxarm, linux-kernel, dri-devel

Hi Ezequiel,

> On Mon, 2019-12-16 at 17:23 +0000, John Garry wrote:
>> Hi all,
>>
>> Enabling CONFIG_DEBUG_TEST_DRIVER_REMOVE causes many warns on a system
>> with the HIBMC hw:
>>
>> [   27.788806] WARNING: CPU: 24 PID: 1 at
>> drivers/gpu/drm/drm_gem_vram_helper.c:564 bo_driver_move_notify+0x8c/0x98
> 
> A total shot in the dark. This might make no sense,
> but it's worth a try:

Thanks for the suggestion, but still the same splat.

I haven't had a chance to analyze the problem myself. But perhaps we 
should just change over to the device-managed interface, as Daniel mentioned.
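
For readers unfamiliar with the device-managed idea, a rough userland sketch
follows: cleanup actions are registered as resources are set up, and on
unbind they run automatically in reverse order of registration, so the
driver does not hand-write its unload sequence. Everything here (toy_device,
toy_add_action(), toy_release_all(), toy_log) is a hypothetical illustration
and not the kernel's devres or DRM API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ACTIONS 8

typedef void (*toy_action_fn)(void *data);

/* Toy device that remembers which cleanup actions were registered. */
struct toy_device {
	struct {
		toy_action_fn fn;
		void *data;
	} actions[TOY_MAX_ACTIONS];
	size_t nr_actions;
};

/* Register a cleanup action; returns 0 on success, -1 if the table is full. */
static int toy_add_action(struct toy_device *dev, toy_action_fn fn, void *data)
{
	if (dev->nr_actions == TOY_MAX_ACTIONS)
		return -1;
	dev->actions[dev->nr_actions].fn = fn;
	dev->actions[dev->nr_actions].data = data;
	dev->nr_actions++;
	return 0;
}

/* Unbind-like step: run the actions newest-first, then forget them. */
static void toy_release_all(struct toy_device *dev)
{
	while (dev->nr_actions > 0) {
		dev->nr_actions--;
		dev->actions[dev->nr_actions].fn(dev->actions[dev->nr_actions].data);
	}
}

/* Small log so the order of the cleanup calls can be inspected. */
struct toy_log {
	char buf[16];
	size_t len;
};

static void toy_log_append(struct toy_log *log, char c)
{
	if (log->len + 1 < sizeof(log->buf)) {
		log->buf[log->len++] = c;
		log->buf[log->len] = '\0';
	}
}

static void undo_first(void *data)  { toy_log_append(data, '1'); }
static void undo_second(void *data) { toy_log_append(data, '2'); }

/* Probe-like setup registers 1 then 2; the unbind-like release undoes 2 then 1. */
static const char *toy_probe_then_unbind(struct toy_log *log)
{
	struct toy_device dev = { .nr_actions = 0 };

	log->len = 0;
	log->buf[0] = '\0';
	toy_add_action(&dev, undo_first, log);
	toy_add_action(&dev, undo_second, log);
	toy_release_all(&dev);
	return log->buf;
}
```

Calling toy_probe_then_unbind() leaves "21" in the log: the resource set up
last is torn down first, without the caller spelling out that order.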

> 
> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
> index 2fd4ca91a62d..69bb0e29da88 100644
> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
> @@ -247,9 +247,8 @@ static int hibmc_unload(struct drm_device *dev)
>   {
>          struct hibmc_drm_private *priv = dev->dev_private;
>   
> -       hibmc_fbdev_fini(priv);
> -
>          drm_atomic_helper_shutdown(dev);
> +       hibmc_fbdev_fini(priv);
>   
>          if (dev->irq_enabled)
>                  drm_irq_uninstall(dev);
> 
> Hope it helps,
> Ezequiel
> 

Thanks,
John

[EOM]

>> [   27.798969] Modules linked in:
>> [   27.802018] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B
>>    5.5.0-rc1-dirty #565
>> [   27.810358] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
>> RC0 - V1.16.01 03/15/2019
>> [   27.818872] pstate: 20c00009 (nzCv daif +PAN +UAO)
>> [   27.823654] pc : bo_driver_move_notify+0x8c/0x98
>> [   27.828262] lr : bo_driver_move_notify+0x40/0x98
>> [   27.832868] sp : ffff00236f0677e0
>> [   27.836173] x29: ffff00236f0677e0 x28: ffffa0001454e5e0
>> [   27.841476] x27: ffff002366e52128 x26: ffffa000149e67b0
>> [   27.846779] x25: ffff002366e523e0 x24: ffff002336936120
>> [   27.852082] x23: ffff0023346f4010 x22: ffff002336936128
>> [   27.857385] x21: ffffa000149c15c0 x20: ffff0023369361f8
>> [   27.862687] x19: ffff002336936000 x18: 0000000000001258
>> [   27.867989] x17: 0000000000001190 x16: 00000000000011d0
>> [   27.873292] x15: 0000000000001348 x14: ffffa00012d68190
>> [   27.878595] x13: 0000000000000006 x12: 1ffff40003241f91
>> [   27.883897] x11: ffff940003241f91 x10: dfffa00000000000
>> [   27.889200] x9 : ffff940003241f92 x8 : 0000000000000001
>> [   27.894502] x7 : ffffa0001920fc88 x6 : ffff940003241f92
>> [   27.899804] x5 : ffff940003241f92 x4 : ffff0023369363a0
>> [   27.905107] x3 : ffffa00010c104b8 x2 : dfffa00000000000
>> [   27.910409] x1 : 0000000000000003 x0 : 0000000000000001
>> [   27.915712] Call trace:
>> [   27.918151]  bo_driver_move_notify+0x8c/0x98
>> [   27.922412]  ttm_bo_cleanup_memtype_use+0x54/0x100
>> [   27.927194]  ttm_bo_put+0x3a0/0x5d0
>> [   27.930673]  drm_gem_vram_object_free+0xc/0x18
>> [   27.935109]  drm_gem_object_free+0x34/0xd0
>> [   27.939196]  drm_gem_object_put_unlocked+0xc8/0xf0
>> [   27.943978]  hibmc_user_framebuffer_destroy+0x20/0x40
>> [   27.949020]  drm_framebuffer_free+0x48/0x58
>> [   27.953194]  drm_mode_object_put.part.1+0x90/0xe8
>> [   27.957889]  drm_mode_object_put+0x28/0x38
>> [   27.961976]  hibmc_fbdev_fini+0x54/0x78
>> [   27.965802]  hibmc_unload+0x2c/0xd0
>> [   27.969281]  hibmc_pci_remove+0x2c/0x40
>> [   27.973109]  pci_device_remove+0x6c/0x140
>> [   27.977110]  really_probe+0x174/0x548
>> [   27.980763]  driver_probe_device+0x7c/0x148
>> [   27.984936]  device_driver_attach+0x94/0xa0
>> [   27.989109]  __driver_attach+0xa8/0x110
>> [   27.992935]  bus_for_each_dev+0xe8/0x158
>> [   27.996849]  driver_attach+0x30/0x40
>> [   28.000415]  bus_add_driver+0x234/0x2f0
>> [   28.004241]  driver_register+0xbc/0x1d0
>> [   28.008067]  __pci_register_driver+0xbc/0xd0
>> [   28.012329]  hibmc_pci_driver_init+0x20/0x28
>> [   28.016590]  do_one_initcall+0xb4/0x254
>> [   28.020417]  kernel_init_freeable+0x27c/0x328
>> [   28.024765]  kernel_init+0x10/0x118
>> [   28.028245]  ret_from_fork+0x10/0x18
>> [   28.031813] ---[ end trace 35a83b71b657878d ]---
>> [   28.036503] ------------[ cut here ]------------
>> [   28.041115] WARNING: CPU: 24 PID: 1 at
>> drivers/gpu/drm/drm_gem_vram_helper.c:40 ttm_buffer_object_destroy+0x4c/0x80
>> [   28.051537] Modules linked in:
>> [   28.054585] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B   W
>>    5.5.0-rc1-dirty #565
>> [   28.062924] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
>> RC0 - V1.16.01 03/15/2019
>>
>> [snip]
>>
>> Indeed, simply unbinding the device from the driver causes the same sort
>> of issue:
>>
>> root@(none)$ cd ./bus/pci/drivers/hibmc-drm/
>> root@(none)$ ls
>> 0000:05:00.0  bind          new_id        remove_id     uevent        unbind
>> root@(none)$ echo 0000\:05\:00.0 > unbind
>> [  116.074352] ------------[ cut here ]------------
>> [  116.078978] WARNING: CPU: 17 PID: 1178 at
>> drivers/gpu/drm/drm_gem_vram_helper.c:40 ttm_buffer_object_destroy+0x4c/0x80
>> [  116.089661] Modules linked in:
>> [  116.092711] CPU: 17 PID: 1178 Comm: sh Tainted: G    B   W
>> 5.5.0-rc1-dirty #565
>> [  116.100704] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
>> RC0 - V1.16.01 03/15/2019
>> [  116.109218] pstate: 20400009 (nzCv daif +PAN -UAO)
>> [  116.114001] pc : ttm_buffer_object_destroy+0x4c/0x80
>> [  116.118956] lr : ttm_buffer_object_destroy+0x18/0x80
>> [  116.123910] sp : ffff0022e6cef8e0
>> [  116.127215] x29: ffff0022e6cef8e0 x28: ffff00231b1fb000
>> [  116.132519] x27: 0000000000000000 x26: ffff00231b1fb000
>> [  116.137821] x25: ffff0022e6cefdc0 x24: 0000000000002480
>> [  116.143124] x23: ffff0023682b6ab0 x22: ffff0023682b6800
>> [  116.148427] x21: ffff0023682b6800 x20: 0000000000000000
>> [  116.153730] x19: ffff0023682b6800 x18: 0000000000000000
>> [  116.159032] x17: 000000000000000000000000001
>> [  116.185545] x7 : ffff0023682b6b07 x6 : ffff80046d056d61
>> [  116.190848] x5 : ffff80046d056d61 x4 : ffff0023682b6ba0
>> [  116.196151] x3 : ffffa00010197338 x2 : dfffa00000000000
>> [  116.201453] x1 : 0000000000000003 x0 : 0000000000000001
>> [  116.206756] Call trace:
>> [  116.209195]  ttm_buffer_object_destroy+0x4c/0x80
>> [  116.213803]  ttm_bo_release_list+0x184/0x220
>> [  116.218064]  ttm_bo_put+0x410/0x5d0
>> [  116.221544]  drm_gem_vram_object_free+0xc/0x18
>> [  116.225979]  drm_gem_object_free+0x34/0xd0
>> [  116.230066]  drm_gem_object_put_unlocked+0xc8/0xf0
>> [  116.234848]  hibmc_user_framebuffer_destroy+0x20/0x40
>> [  116.239890]  drm_framebuffer_free+0x48/0x58
>> [  116.244064]  drm_mode_object_put.part.1+0x90/0xe8
>> [  116.248759]  drm_mode_object_put+0x28/0x38
>> [  116.252846]  hibmc_fbdev_fini+0x54/0x78
>> [  116.256672]  hibmc_unload+0x2c/0xd0
>> [  116.260151]  hibmc_pci_remove+0x2c/0x40
>> [  116.263979]  pci_device_remove+0x6c/0x140
>> [  116.267980]  device_release_driver_internal+0x134/0x250
>> [  116.273196]  device_driver_detach+0x28/0x38
>> [  116.277369]  unbind_store+0xfc/0x150
>> [  116.280934]  drv_attr_store+0x48/0x60
>> [  116.284589]  sysfs_kf_write+0x80/0xb0
>> [  116.288241]  kernfs_fop_write+0x1d4/0x320
>> [  116.292243]  __vfs_write+0x54/0x98
>> [  116.295635]  vfs_write+0xe8/0x270
>> [  116.298940]  ksys_write+0xc8/0x180
>> [  116.302333]  __arm64_sys_write+0x40/0x50
>> [  116.306248]  el0_svc_common.constprop.0+0xa4/0x1f8
>> [  116.311029]  el0_svc_handler+0x34/0xb0
>> [  116.314770]  el0_sync_handler+0x10c/0x1c8
>> [  116.318769]  el0_sync+0x140/0x180
>> [  116.322074] ---[ end trace e60e43d0e316b5c8 ]---
>> [  116.326868] ------------[ cut here ]------------
>>
>>
>> dmesg and .config is here:
>> https://pastebin.com/4P5yaZBS
>>
>> I'm not sure if this is a HIBMC driver issue or issue with the framework.
>>
>> john
>>
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Warnings in DRM code when removing/unbinding a driver
@ 2019-12-17 17:27     ` John Garry
  0 siblings, 0 replies; 32+ messages in thread
From: John Garry @ 2019-12-17 17:27 UTC (permalink / raw)
  To: Ezequiel Garcia, kongxinwei (A), Chenfeng (puck), airlied, daniel
  Cc: Linuxarm, dri-devel, linux-kernel

Hi Ezequiel,

> On Mon, 2019-12-16 at 17:23 +0000, John Garry wrote:
>> Hi all,
>>
>> Enabling CONFIG_DEBUG_TEST_DRIVER_REMOVE causes many warns on a system
>> with the HIBMC hw:
>>
>> [   27.788806] WARNING: CPU: 24 PID: 1 at
>> drivers/gpu/drm/drm_gem_vram_helper.c:564 bo_driver_move_notify+0x8c/0x98
> 
> A total shot in the dark. This might make no sense,
> but it's worth a try:

Thanks for the suggestion, but still the same splat.

I haven't had a chance to analyze the problem myself. But perhaps we
should just change over to the device-managed interface, as Daniel mentioned.
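
For reference, the idea behind a device-managed interface can be sketched in
plain userspace C as below. This is not the kernel devres API: struct
managed_dev, managed_add_action(), managed_release_all() and log_release()
are hypothetical names used only for illustration. The one property it is
meant to show is that release actions registered at probe time run
automatically in reverse order of registration, so the driver would no longer
have to hand-order its unload path.

```c
#include <stddef.h>

#define MAX_ACTIONS 8

typedef void (*release_fn)(void *data);

/* Hypothetical per-device list of cleanup actions (not a kernel struct). */
struct managed_dev {
	release_fn fn[MAX_ACTIONS];
	void *data[MAX_ACTIONS];
	size_t count;
};

/* Register a cleanup action; returns -1 if the list is full. */
int managed_add_action(struct managed_dev *dev, release_fn fn, void *data)
{
	if (dev->count == MAX_ACTIONS)
		return -1;
	dev->fn[dev->count] = fn;
	dev->data[dev->count] = data;
	dev->count++;
	return 0;
}

/* Run every registered action, newest first, and empty the list. */
void managed_release_all(struct managed_dev *dev)
{
	while (dev->count > 0) {
		dev->count--;
		dev->fn[dev->count](dev->data[dev->count]);
	}
}

/* Example callback: appends the resource id it is given to a global log. */
int release_log[MAX_ACTIONS];
size_t release_log_len;

void log_release(void *data)
{
	release_log[release_log_len++] = *(int *)data;
}
```

Registering an IRQ action first and an fbdev action second would release the
fbdev resource first and the IRQ last, mirroring the setup order in reverse.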

> 
> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
> index 2fd4ca91a62d..69bb0e29da88 100644
> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
> @@ -247,9 +247,8 @@ static int hibmc_unload(struct drm_device *dev)
>   {
>          struct hibmc_drm_private *priv = dev->dev_private;
>   
> -       hibmc_fbdev_fini(priv);
> -
>          drm_atomic_helper_shutdown(dev);
> +       hibmc_fbdev_fini(priv);
>   
>          if (dev->irq_enabled)
>                  drm_irq_uninstall(dev);
> 
> Hope it helps,
> Ezequiel
> 

Thanks,
John

[EOM]

>> [   27.798969] Modules linked in:
>> [   27.802018] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B
>>    5.5.0-rc1-dirty #565
>> [   27.810358] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
>> RC0 - V1.16.01 03/15/2019
>> [   27.818872] pstate: 20c00009 (nzCv daif +PAN +UAO)
>> [   27.823654] pc : bo_driver_move_notify+0x8c/0x98
>> [   27.828262] lr : bo_driver_move_notify+0x40/0x98
>> [   27.832868] sp : ffff00236f0677e0
>> [   27.836173] x29: ffff00236f0677e0 x28: ffffa0001454e5e0
>> [   27.841476] x27: ffff002366e52128 x26: ffffa000149e67b0
>> [   27.846779] x25: ffff002366e523e0 x24: ffff002336936120
>> [   27.852082] x23: ffff0023346f4010 x22: ffff002336936128
>> [   27.857385] x21: ffffa000149c15c0 x20: ffff0023369361f8
>> [   27.862687] x19: ffff002336936000 x18: 0000000000001258
>> [   27.867989] x17: 0000000000001190 x16: 00000000000011d0
>> [   27.873292] x15: 0000000000001348 x14: ffffa00012d68190
>> [   27.878595] x13: 0000000000000006 x12: 1ffff40003241f91
>> [   27.883897] x11: ffff940003241f91 x10: dfffa00000000000
>> [   27.889200] x9 : ffff940003241f92 x8 : 0000000000000001
>> [   27.894502] x7 : ffffa0001920fc88 x6 : ffff940003241f92
>> [   27.899804] x5 : ffff940003241f92 x4 : ffff0023369363a0
>> [   27.905107] x3 : ffffa00010c104b8 x2 : dfffa00000000000
>> [   27.910409] x1 : 0000000000000003 x0 : 0000000000000001
>> [   27.915712] Call trace:
>> [   27.918151]  bo_driver_move_notify+0x8c/0x98
>> [   27.922412]  ttm_bo_cleanup_memtype_use+0x54/0x100
>> [   27.927194]  ttm_bo_put+0x3a0/0x5d0
>> [   27.930673]  drm_gem_vram_object_free+0xc/0x18
>> [   27.935109]  drm_gem_object_free+0x34/0xd0
>> [   27.939196]  drm_gem_object_put_unlocked+0xc8/0xf0
>> [   27.943978]  hibmc_user_framebuffer_destroy+0x20/0x40
>> [   27.949020]  drm_framebuffer_free+0x48/0x58
>> [   27.953194]  drm_mode_object_put.part.1+0x90/0xe8
>> [   27.957889]  drm_mode_object_put+0x28/0x38
>> [   27.961976]  hibmc_fbdev_fini+0x54/0x78
>> [   27.965802]  hibmc_unload+0x2c/0xd0
>> [   27.969281]  hibmc_pci_remove+0x2c/0x40
>> [   27.973109]  pci_device_remove+0x6c/0x140
>> [   27.977110]  really_probe+0x174/0x548
>> [   27.980763]  driver_probe_device+0x7c/0x148
>> [   27.984936]  device_driver_attach+0x94/0xa0
>> [   27.989109]  __driver_attach+0xa8/0x110
>> [   27.992935]  bus_for_each_dev+0xe8/0x158
>> [   27.996849]  driver_attach+0x30/0x40
>> [   28.000415]  bus_add_driver+0x234/0x2f0
>> [   28.004241]  driver_register+0xbc/0x1d0
>> [   28.008067]  __pci_register_driver+0xbc/0xd0
>> [   28.012329]  hibmc_pci_driver_init+0x20/0x28
>> [   28.016590]  do_one_initcall+0xb4/0x254
>> [   28.020417]  kernel_init_freeable+0x27c/0x328
>> [   28.024765]  kernel_init+0x10/0x118
>> [   28.028245]  ret_from_fork+0x10/0x18
>> [   28.031813] ---[ end trace 35a83b71b657878d ]---
>> [   28.036503] ------------[ cut here ]------------
>> [   28.041115] WARNING: CPU: 24 PID: 1 at
>> drivers/gpu/drm/drm_gem_vram_helper.c:40 ttm_buffer_object_destroy+0x4c/0x80
>> [   28.051537] Modules linked in:
>> [   28.054585] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B   W
>>    5.5.0-rc1-dirty #565
>> [   28.062924] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
>> RC0 - V1.16.01 03/15/2019
>>
>> [snip]
>>
>> Indeed, simply unbinding the device from the driver causes the same sort
>> of issue:
>>
>> root@(none)$ cd ./bus/pci/drivers/hibmc-drm/
>> root@(none)$ ls
>> 0000:05:00.0  bind          new_id        remove_id     uevent        unbind
>> root@(none)$ echo 0000\:05\:00.0 > unbind
>> [  116.074352] ------------[ cut here ]------------
>> [  116.078978] WARNING: CPU: 17 PID: 1178 at
>> drivers/gpu/drm/drm_gem_vram_helper.c:40 ttm_buffer_object_destroy+0x4c/0x80
>> [  116.089661] Modules linked in:
>> [  116.092711] CPU: 17 PID: 1178 Comm: sh Tainted: G    B   W
>> 5.5.0-rc1-dirty #565
>> [  116.100704] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
>> RC0 - V1.16.01 03/15/2019
>> [  116.109218] pstate: 20400009 (nzCv daif +PAN -UAO)
>> [  116.114001] pc : ttm_buffer_object_destroy+0x4c/0x80
>> [  116.118956] lr : ttm_buffer_object_destroy+0x18/0x80
>> [  116.123910] sp : ffff0022e6cef8e0
>> [  116.127215] x29: ffff0022e6cef8e0 x28: ffff00231b1fb000
>> [  116.132519] x27: 0000000000000000 x26: ffff00231b1fb000
>> [  116.137821] x25: ffff0022e6cefdc0 x24: 0000000000002480
>> [  116.143124] x23: ffff0023682b6ab0 x22: ffff0023682b6800
>> [  116.148427] x21: ffff0023682b6800 x20: 0000000000000000
>> [  116.153730] x19: ffff0023682b6800 x18: 0000000000000000
>> [  116.159032] x17: 000000000000000000000000001
>> [  116.185545] x7 : ffff0023682b6b07 x6 : ffff80046d056d61
>> [  116.190848] x5 : ffff80046d056d61 x4 : ffff0023682b6ba0
>> [  116.196151] x3 : ffffa00010197338 x2 : dfffa00000000000
>> [  116.201453] x1 : 0000000000000003 x0 : 0000000000000001
>> [  116.206756] Call trace:
>> [  116.209195]  ttm_buffer_object_destroy+0x4c/0x80
>> [  116.213803]  ttm_bo_release_list+0x184/0x220
>> [  116.218064]  ttm_bo_put+0x410/0x5d0
>> [  116.221544]  drm_gem_vram_object_free+0xc/0x18
>> [  116.225979]  drm_gem_object_free+0x34/0xd0
>> [  116.230066]  drm_gem_object_put_unlocked+0xc8/0xf0
>> [  116.234848]  hibmc_user_framebuffer_destroy+0x20/0x40
>> [  116.239890]  drm_framebuffer_free+0x48/0x58
>> [  116.244064]  drm_mode_object_put.part.1+0x90/0xe8
>> [  116.248759]  drm_mode_object_put+0x28/0x38
>> [  116.252846]  hibmc_fbdev_fini+0x54/0x78
>> [  116.256672]  hibmc_unload+0x2c/0xd0
>> [  116.260151]  hibmc_pci_remove+0x2c/0x40
>> [  116.263979]  pci_device_remove+0x6c/0x140
>> [  116.267980]  device_release_driver_internal+0x134/0x250
>> [  116.273196]  device_driver_detach+0x28/0x38
>> [  116.277369]  unbind_store+0xfc/0x150
>> [  116.280934]  drv_attr_store+0x48/0x60
>> [  116.284589]  sysfs_kf_write+0x80/0xb0
>> [  116.288241]  kernfs_fop_write+0x1d4/0x320
>> [  116.292243]  __vfs_write+0x54/0x98
>> [  116.295635]  vfs_write+0xe8/0x270
>> [  116.298940]  ksys_write+0xc8/0x180
>> [  116.302333]  __arm64_sys_write+0x40/0x50
>> [  116.306248]  el0_svc_common.constprop.0+0xa4/0x1f8
>> [  116.311029]  el0_svc_handler+0x34/0xb0
>> [  116.314770]  el0_sync_handler+0x10c/0x1c8
>> [  116.318769]  el0_sync+0x140/0x180
>> [  116.322074] ---[ end trace e60e43d0e316b5c8 ]---
>> [  116.326868] ------------[ cut here ]------------
>>
>>
>> dmesg and .config is here:
>> https://pastebin.com/4P5yaZBS
>>
>> I'm not sure if this is a HIBMC driver issue or issue with the framework.
>>
>> john
>>
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Warnings in DRM code when removing/unbinding a driver
  2019-12-17 17:27     ` John Garry
@ 2019-12-18 18:08       ` John Garry
  -1 siblings, 0 replies; 32+ messages in thread
From: John Garry @ 2019-12-18 18:08 UTC (permalink / raw)
  To: Ezequiel Garcia, kongxinwei (A), Chenfeng (puck),
	airlied, daniel, tzimmermann
  Cc: Linuxarm, dri-devel, linux-kernel, kraxel, dbueso

+

So the v5.4 kernel does not have this issue.

I have bisected the initial occurrence to:

commit 37a48adfba6cf6e87df9ba8b75ab85d514ed86d8
Author: Thomas Zimmermann <tzimmermann@suse.de>
Date:   Fri Sep 6 14:20:53 2019 +0200

     drm/vram: Add kmap ref-counting to GEM VRAM objects

     The kmap and kunmap operations of GEM VRAM buffers can now be called
     in interleaving pairs. The first call to drm_gem_vram_kmap() maps the
     buffer's memory to kernel address space and the final call to
     drm_gem_vram_kunmap() unmaps the memory. Intermediate calls to these
     functions increment or decrement a reference counter.

So this either exposes or creates the issue.
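
For anyone following along, here is a rough userspace sketch of the
ref-counting scheme that commit message describes. It is not the code from
drm_gem_vram_helper.c: struct vram_buf, vram_buf_kmap(), vram_buf_kunmap()
and vram_buf_destroy_warnings() are made-up names, malloc() stands in for the
real mapping, and the destroy-time check is only an assumption about the kind
of condition the WARNINGs report.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a GEM VRAM buffer object; not the kernel struct. */
struct vram_buf {
	void *vaddr;            /* mapping address, NULL while unmapped */
	unsigned int use_count; /* number of outstanding map calls */
	size_t size;
};

/* The first caller creates the mapping; later callers only bump the counter. */
void *vram_buf_kmap(struct vram_buf *buf)
{
	if (buf->use_count == 0) {
		buf->vaddr = malloc(buf->size); /* stands in for the real mapping */
		if (!buf->vaddr)
			return NULL;
	}
	buf->use_count++;
	return buf->vaddr;
}

/* The last caller tears the mapping down; returns -1 on an unbalanced call. */
int vram_buf_kunmap(struct vram_buf *buf)
{
	if (buf->use_count == 0)
		return -1;
	if (--buf->use_count == 0) {
		free(buf->vaddr);
		buf->vaddr = NULL;
	}
	return 0;
}

/*
 * Number of sanity checks that would fail if the buffer were destroyed now.
 * Assumption: the warnings in the traces are checks of this kind.
 */
int vram_buf_destroy_warnings(const struct vram_buf *buf)
{
	return (buf->use_count != 0) + (buf->vaddr != NULL);
}
```

In this model, a map that is never paired with an unmap leaves use_count
non-zero at destroy time. That would be consistent with the commit exposing an
unbalanced caller, but it is a guess and has not been checked against the
driver.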

John

>> On Mon, 2019-12-16 at 17:23 +0000, John Garry wrote:
>>> Hi all,
>>>
>>> Enabling CONFIG_DEBUG_TEST_DRIVER_REMOVE causes many warns on a system
>>> with the HIBMC hw:
>>>
>>> [   27.788806] WARNING: CPU: 24 PID: 1 at
>>> drivers/gpu/drm/drm_gem_vram_helper.c:564 
>>> bo_driver_move_notify+0x8c/0x98
>>
>> A total shot in the dark. This might make no sense,
>> but it's worth a try:
> 
> Thanks for the suggestion, but still the same splat.
> 
> I haven't had a chance to analyze the problem myself. But perhaps we 
> should just change over the device-managed interface, as Daniel mentioned.
> 
>>
>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c 
>> b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>> index 2fd4ca91a62d..69bb0e29da88 100644
>> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>> @@ -247,9 +247,8 @@ static int hibmc_unload(struct drm_device *dev)
>>   {
>>          struct hibmc_drm_private *priv = dev->dev_private;
>> -       hibmc_fbdev_fini(priv);
>> -
>>          drm_atomic_helper_shutdown(dev);
>> +       hibmc_fbdev_fini(priv);
>>          if (dev->irq_enabled)
>>                  drm_irq_uninstall(dev);
>>
>> Hope it helps,
>> Ezequiel
>>
> 
> Thanks,
> John
> 
> [EOM]
> 
>>> [   27.798969] Modules linked in:
>>> [   27.802018] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B
>>>    5.5.0-rc1-dirty #565
>>> [   27.810358] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
>>> RC0 - V1.16.01 03/15/2019
>>> [   27.818872] pstate: 20c00009 (nzCv daif +PAN +UAO)
>>> [   27.823654] pc : bo_driver_move_notify+0x8c/0x98
>>> [   27.828262] lr : bo_driver_move_notify+0x40/0x98
>>> [   27.832868] sp : ffff00236f0677e0
>>> [   27.836173] x29: ffff00236f0677e0 x28: ffffa0001454e5e0
>>> [   27.841476] x27: ffff002366e52128 x26: ffffa000149e67b0
>>> [   27.846779] x25: ffff002366e523e0 x24: ffff002336936120
>>> [   27.852082] x23: ffff0023346f4010 x22: ffff002336936128
>>> [   27.857385] x21: ffffa000149c15c0 x20: ffff0023369361f8
>>> [   27.862687] x19: ffff002336936000 x18: 0000000000001258
>>> [   27.867989] x17: 0000000000001190 x16: 00000000000011d0
>>> [   27.873292] x15: 0000000000001348 x14: ffffa00012d68190
>>> [   27.878595] x13: 0000000000000006 x12: 1ffff40003241f91
>>> [   27.883897] x11: ffff940003241f91 x10: dfffa00000000000
>>> [   27.889200] x9 : ffff940003241f92 x8 : 0000000000000001
>>> [   27.894502] x7 : ffffa0001920fc88 x6 : ffff940003241f92
>>> [   27.899804] x5 : ffff940003241f92 x4 : ffff0023369363a0
>>> [   27.905107] x3 : ffffa00010c104b8 x2 : dfffa00000000000
>>> [   27.910409] x1 : 0000000000000003 x0 : 0000000000000001
>>> [   27.915712] Call trace:
>>> [   27.918151]  bo_driver_move_notify+0x8c/0x98
>>> [   27.922412]  ttm_bo_cleanup_memtype_use+0x54/0x100
>>> [   27.927194]  ttm_bo_put+0x3a0/0x5d0
>>> [   27.930673]  drm_gem_vram_object_free+0xc/0x18
>>> [   27.935109]  drm_gem_object_free+0x34/0xd0
>>> [   27.939196]  drm_gem_object_put_unlocked+0xc8/0xf0
>>> [   27.943978]  hibmc_user_framebuffer_destroy+0x20/0x40
>>> [   27.949020]  drm_framebuffer_free+0x48/0x58
>>> [   27.953194]  drm_mode_object_put.part.1+0x90/0xe8
>>> [   27.957889]  drm_mode_object_put+0x28/0x38
>>> [   27.961976]  hibmc_fbdev_fini+0x54/0x78
>>> [   27.965802]  hibmc_unload+0x2c/0xd0
>>> [   27.969281]  hibmc_pci_remove+0x2c/0x40
>>> [   27.973109]  pci_device_remove+0x6c/0x140
>>> [   27.977110]  really_probe+0x174/0x548
>>> [   27.980763]  driver_probe_device+0x7c/0x148
>>> [   27.984936]  device_driver_attach+0x94/0xa0
>>> [   27.989109]  __driver_attach+0xa8/0x110
>>> [   27.992935]  bus_for_each_dev+0xe8/0x158
>>> [   27.996849]  driver_attach+0x30/0x40
>>> [   28.000415]  bus_add_driver+0x234/0x2f0
>>> [   28.004241]  driver_register+0xbc/0x1d0
>>> [   28.008067]  __pci_register_driver+0xbc/0xd0
>>> [   28.012329]  hibmc_pci_driver_init+0x20/0x28
>>> [   28.016590]  do_one_initcall+0xb4/0x254
>>> [   28.020417]  kernel_init_freeable+0x27c/0x328
>>> [   28.024765]  kernel_init+0x10/0x118
>>> [   28.028245]  ret_from_fork+0x10/0x18
>>> [   28.031813] ---[ end trace 35a83b71b657878d ]---
>>> [   28.036503] ------------[ cut here ]------------
>>> [   28.041115] WARNING: CPU: 24 PID: 1 at
>>> drivers/gpu/drm/drm_gem_vram_helper.c:40 
>>> ttm_buffer_object_destroy+0x4c/0x80
>>> [   28.051537] Modules linked in:
>>> [   28.054585] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B   W
>>>    5.5.0-rc1-dirty #565
>>> [   28.062924] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
>>> RC0 - V1.16.01 03/15/2019
>>>
>>> [snip]
>>>
>>> Indeed, simply unbinding the device from the driver causes the same sort
>>> of issue:
>>>
>>> root@(none)$ cd ./bus/pci/drivers/hibmc-drm/
>>> root@(none)$ ls
>>> 0000:05:00.0  bind          new_id        remove_id     uevent        
>>> unbind
>>> root@(none)$ echo 0000\:05\:00.0 > unbind
>>> [  116.074352] ------------[ cut here ]------------
>>> [  116.078978] WARNING: CPU: 17 PID: 1178 at
>>> drivers/gpu/drm/drm_gem_vram_helper.c:40 
>>> ttm_buffer_object_destroy+0x4c/0x80
>>> [  116.089661] Modules linked in:
>>> [  116.092711] CPU: 17 PID: 1178 Comm: sh Tainted: G    B   W
>>> 5.5.0-rc1-dirty #565
>>> [  116.100704] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
>>> RC0 - V1.16.01 03/15/2019
>>> [  116.109218] pstate: 20400009 (nzCv daif +PAN -UAO)
>>> [  116.114001] pc : ttm_buffer_object_destroy+0x4c/0x80
>>> [  116.118956] lr : ttm_buffer_object_destroy+0x18/0x80
>>> [  116.123910] sp : ffff0022e6cef8e0
>>> [  116.127215] x29: ffff0022e6cef8e0 x28: ffff00231b1fb000
>>> [  116.132519] x27: 0000000000000000 x26: ffff00231b1fb000
>>> [  116.137821] x25: ffff0022e6cefdc0 x24: 0000000000002480
>>> [  116.143124] x23: ffff0023682b6ab0 x22: ffff0023682b6800
>>> [  116.148427] x21: ffff0023682b6800 x20: 0000000000000000
>>> [  116.153730] x19: ffff0023682b6800 x18: 0000000000000000
>>> [  116.159032] x17: 000000000000000000000000001
>>> [  116.185545] x7 : ffff0023682b6b07 x6 : ffff80046d056d61
>>> [  116.190848] x5 : ffff80046d056d61 x4 : ffff0023682b6ba0
>>> [  116.196151] x3 : ffffa00010197338 x2 : dfffa00000000000
>>> [  116.201453] x1 : 0000000000000003 x0 : 0000000000000001
>>> [  116.206756] Call trace:
>>> [  116.209195]  ttm_buffer_object_destroy+0x4c/0x80
>>> [  116.213803]  ttm_bo_release_list+0x184/0x220
>>> [  116.218064]  ttm_bo_put+0x410/0x5d0
>>> [  116.221544]  drm_gem_vram_object_free+0xc/0x18
>>> [  116.225979]  drm_gem_object_free+0x34/0xd0
>>> [  116.230066]  drm_gem_object_put_unlocked+0xc8/0xf0
>>> [  116.234848]  hibmc_user_framebuffer_destroy+0x20/0x40
>>> [  116.239890]  drm_framebuffer_free+0x48/0x58
>>> [  116.244064]  drm_mode_object_put.part.1+0x90/0xe8
>>> [  116.248759]  drm_mode_object_put+0x28/0x38
>>> [  116.252846]  hibmc_fbdev_fini+0x54/0x78
>>> [  116.256672]  hibmc_unload+0x2c/0xd0
>>> [  116.260151]  hibmc_pci_remove+0x2c/0x40
>>> [  116.263979]  pci_device_remove+0x6c/0x140
>>> [  116.267980]  device_release_driver_internal+0x134/0x250
>>> [  116.273196]  device_driver_detach+0x28/0x38
>>> [  116.277369]  unbind_store+0xfc/0x150
>>> [  116.280934]  drv_attr_store+0x48/0x60
>>> [  116.284589]  sysfs_kf_write+0x80/0xb0
>>> [  116.288241]  kernfs_fop_write+0x1d4/0x320
>>> [  116.292243]  __vfs_write+0x54/0x98
>>> [  116.295635]  vfs_write+0xe8/0x270
>>> [  116.298940]  ksys_write+0xc8/0x180
>>> [  116.302333]  __arm64_sys_write+0x40/0x50
>>> [  116.306248]  el0_svc_common.constprop.0+0xa4/0x1f8
>>> [  116.311029]  el0_svc_handler+0x34/0xb0
>>> [  116.314770]  el0_sync_handler+0x10c/0x1c8
>>> [  116.318769]  el0_sync+0x140/0x180
>>> [  116.322074] ---[ end trace e60e43d0e316b5c8 ]---
>>> [  116.326868] ------------[ cut here ]------------
>>>
>>>
>>> dmesg and .config is here:
>>> https://pastebin.com/4P5yaZBS
>>>
>>> I'm not sure if this is a HIBMC driver issue or issue with the 
>>> framework.
>>>
>>> john
>>>
>>>
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
>>
> 
> _______________________________________________
> Linuxarm mailing list
> Linuxarm@huawei.com
> http://hulk.huawei.com/mailman/listinfo/linuxarm
> .


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Warnings in DRM code when removing/unbinding a driver
  2019-12-18 18:08       ` John Garry
@ 2019-12-19  9:54         ` Daniel Vetter
  -1 siblings, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2019-12-19  9:54 UTC (permalink / raw)
  To: John Garry
  Cc: Ezequiel Garcia, kongxinwei (A), Chenfeng (puck),
	airlied, Thomas Zimmermann, Linuxarm, dri-devel, linux-kernel,
	Gerd Hoffmann, dbueso

On Wed, Dec 18, 2019 at 7:08 PM John Garry <john.garry@huawei.com> wrote:
>
> +
>
> So the v5.4 kernel does not have this issue.
>
> I have bisected the initial occurrence to:
>
> commit 37a48adfba6cf6e87df9ba8b75ab85d514ed86d8
> Author: Thomas Zimmermann <tzimmermann@suse.de>
> Date:   Fri Sep 6 14:20:53 2019 +0200
>
>      drm/vram: Add kmap ref-counting to GEM VRAM objects
>
>      The kmap and kunmap operations of GEM VRAM buffers can now be called
>      in interleaving pairs. The first call to drm_gem_vram_kmap() maps the
>      buffer's memory to kernel address space and the final call to
>      drm_gem_vram_kunmap() unmaps the memory. Intermediate calls to these
>      functions increment or decrement a reference counter.
>
> So this either exposes or creates the issue.

Yeah that's just shooting the messenger. Like I said, for most drivers
you can pretty much assume that their unload sequence has been broken
since forever. It's not often tested, and especially the hotunbind
from a device (as opposed to driver unload) stuff wasn't even possible
to get right until just recently.
-Daniel
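
As a rough illustration of the ref-counting described in the quoted
commit message, here is a minimal userspace C model. It is only a
sketch: struct model_bo and the model_bo_* helpers are hypothetical
names, malloc() stands in for the real mapping, and none of this is the
kernel's drm_gem_vram implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a GEM VRAM buffer object; not a kernel type. */
struct model_bo {
	size_t size;
	void *vaddr;                 /* non-NULL while the buffer is mapped */
	unsigned int kmap_use_count; /* outstanding kmap calls */
};

/* The first call creates the mapping; later calls only take a reference. */
void *model_bo_kmap(struct model_bo *bo)
{
	if (bo->kmap_use_count == 0) {
		bo->vaddr = malloc(bo->size); /* stands in for the real mapping */
		if (!bo->vaddr)
			return NULL;
	}
	bo->kmap_use_count++;
	return bo->vaddr;
}

/* Drop one reference; the mapping goes away only with the last one. */
void model_bo_kunmap(struct model_bo *bo)
{
	if (bo->kmap_use_count == 0)
		return; /* unbalanced kunmap; ignored in this model */
	if (--bo->kmap_use_count == 0) {
		free(bo->vaddr);
		bo->vaddr = NULL;
	}
}

/* Destroy-time check: returns 1 if some kmap was never paired with a kunmap. */
int model_bo_leaked_mapping(const struct model_bo *bo)
{
	return bo->kmap_use_count != 0;
}
```

In this model, an object that is destroyed while kmap_use_count is
still non-zero has an unbalanced kmap somewhere in its history. Whether
that is what the hibmc teardown path hits is not established in this
thread.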

>
> John
>
> >> On Mon, 2019-12-16 at 17:23 +0000, John Garry wrote:
> >>> Hi all,
> >>>
> >>> Enabling CONFIG_DEBUG_TEST_DRIVER_REMOVE causes many warns on a system
> >>> with the HIBMC hw:
> >>>
> >>> [   27.788806] WARNING: CPU: 24 PID: 1 at
> >>> drivers/gpu/drm/drm_gem_vram_helper.c:564
> >>> bo_driver_move_notify+0x8c/0x98
> >>
> >> A total shot in the dark. This might make no sense,
> >> but it's worth a try:
> >
> > Thanks for the suggestion, but still the same splat.
> >
> > I haven't had a chance to analyze the problem myself. But perhaps we
> > should just change over to the device-managed interface, as Daniel mentioned.
> >
> >>
> >> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
> >> b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
> >> index 2fd4ca91a62d..69bb0e29da88 100644
> >> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
> >> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
> >> @@ -247,9 +247,8 @@ static int hibmc_unload(struct drm_device *dev)
> >>   {
> >>          struct hibmc_drm_private *priv = dev->dev_private;
> >> -       hibmc_fbdev_fini(priv);
> >> -
> >>          drm_atomic_helper_shutdown(dev);
> >> +       hibmc_fbdev_fini(priv);
> >>          if (dev->irq_enabled)
> >>                  drm_irq_uninstall(dev);
> >>
> >> Hope it helps,
> >> Ezequiel
> >>
> >
> > Thanks,
> > John
> >
> > [EOM]
> >
> >>> [   27.798969] Modules linked in:
> >>> [   27.802018] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B
> >>>    5.5.0-rc1-dirty #565
> >>> [   27.810358] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
> >>> RC0 - V1.16.01 03/15/2019
> >>> [   27.818872] pstate: 20c00009 (nzCv daif +PAN +UAO)
> >>> [   27.823654] pc : bo_driver_move_notify+0x8c/0x98
> >>> [   27.828262] lr : bo_driver_move_notify+0x40/0x98
> >>> [   27.832868] sp : ffff00236f0677e0
> >>> [   27.836173] x29: ffff00236f0677e0 x28: ffffa0001454e5e0
> >>> [   27.841476] x27: ffff002366e52128 x26: ffffa000149e67b0
> >>> [   27.846779] x25: ffff002366e523e0 x24: ffff002336936120
> >>> [   27.852082] x23: ffff0023346f4010 x22: ffff002336936128
> >>> [   27.857385] x21: ffffa000149c15c0 x20: ffff0023369361f8
> >>> [   27.862687] x19: ffff002336936000 x18: 0000000000001258
> >>> [   27.867989] x17: 0000000000001190 x16: 00000000000011d0
> >>> [   27.873292] x15: 0000000000001348 x14: ffffa00012d68190
> >>> [   27.878595] x13: 0000000000000006 x12: 1ffff40003241f91
> >>> [   27.883897] x11: ffff940003241f91 x10: dfffa00000000000
> >>> [   27.889200] x9 : ffff940003241f92 x8 : 0000000000000001
> >>> [   27.894502] x7 : ffffa0001920fc88 x6 : ffff940003241f92
> >>> [   27.899804] x5 : ffff940003241f92 x4 : ffff0023369363a0
> >>> [   27.905107] x3 : ffffa00010c104b8 x2 : dfffa00000000000
> >>> [   27.910409] x1 : 0000000000000003 x0 : 0000000000000001
> >>> [   27.915712] Call trace:
> >>> [   27.918151]  bo_driver_move_notify+0x8c/0x98
> >>> [   27.922412]  ttm_bo_cleanup_memtype_use+0x54/0x100
> >>> [   27.927194]  ttm_bo_put+0x3a0/0x5d0
> >>> [   27.930673]  drm_gem_vram_object_free+0xc/0x18
> >>> [   27.935109]  drm_gem_object_free+0x34/0xd0
> >>> [   27.939196]  drm_gem_object_put_unlocked+0xc8/0xf0
> >>> [   27.943978]  hibmc_user_framebuffer_destroy+0x20/0x40
> >>> [   27.949020]  drm_framebuffer_free+0x48/0x58
> >>> [   27.953194]  drm_mode_object_put.part.1+0x90/0xe8
> >>> [   27.957889]  drm_mode_object_put+0x28/0x38
> >>> [   27.961976]  hibmc_fbdev_fini+0x54/0x78
> >>> [   27.965802]  hibmc_unload+0x2c/0xd0
> >>> [   27.969281]  hibmc_pci_remove+0x2c/0x40
> >>> [   27.973109]  pci_device_remove+0x6c/0x140
> >>> [   27.977110]  really_probe+0x174/0x548
> >>> [   27.980763]  driver_probe_device+0x7c/0x148
> >>> [   27.984936]  device_driver_attach+0x94/0xa0
> >>> [   27.989109]  __driver_attach+0xa8/0x110
> >>> [   27.992935]  bus_for_each_dev+0xe8/0x158
> >>> [   27.996849]  driver_attach+0x30/0x40
> >>> [   28.000415]  bus_add_driver+0x234/0x2f0
> >>> [   28.004241]  driver_register+0xbc/0x1d0
> >>> [   28.008067]  __pci_register_driver+0xbc/0xd0
> >>> [   28.012329]  hibmc_pci_driver_init+0x20/0x28
> >>> [   28.016590]  do_one_initcall+0xb4/0x254
> >>> [   28.020417]  kernel_init_freeable+0x27c/0x328
> >>> [   28.024765]  kernel_init+0x10/0x118
> >>> [   28.028245]  ret_from_fork+0x10/0x18
> >>> [   28.031813] ---[ end trace 35a83b71b657878d ]---
> >>> [   28.036503] ------------[ cut here ]------------
> >>> [   28.041115] WARNING: CPU: 24 PID: 1 at
> >>> drivers/gpu/drm/drm_gem_vram_helper.c:40
> >>> ttm_buffer_object_destroy+0x4c/0x80
> >>> [   28.051537] Modules linked in:
> >>> [   28.054585] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B   W
> >>>    5.5.0-rc1-dirty #565
> >>> [   28.062924] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
> >>> RC0 - V1.16.01 03/15/2019
> >>>
> >>> [snip]
> >>>
> >>> Indeed, simply unbinding the device from the driver causes the same sort
> >>> of issue:
> >>>
> >>> root@(none)$ cd ./bus/pci/drivers/hibmc-drm/
> >>> root@(none)$ ls
> >>> 0000:05:00.0  bind          new_id        remove_id     uevent
> >>> unbind
> >>> root@(none)$ echo 0000\:05\:00.0 > unbind
> >>> [  116.074352] ------------[ cut here ]------------
> >>> [  116.078978] WARNING: CPU: 17 PID: 1178 at
> >>> drivers/gpu/drm/drm_gem_vram_helper.c:40
> >>> ttm_buffer_object_destroy+0x4c/0x80
> >>> [  116.089661] Modules linked in:
> >>> [  116.092711] CPU: 17 PID: 1178 Comm: sh Tainted: G    B   W
> >>> 5.5.0-rc1-dirty #565
> >>> [  116.100704] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
> >>> RC0 - V1.16.01 03/15/2019
> >>> [  116.109218] pstate: 20400009 (nzCv daif +PAN -UAO)
> >>> [  116.114001] pc : ttm_buffer_object_destroy+0x4c/0x80
> >>> [  116.118956] lr : ttm_buffer_object_destroy+0x18/0x80
> >>> [  116.123910] sp : ffff0022e6cef8e0
> >>> [  116.127215] x29: ffff0022e6cef8e0 x28: ffff00231b1fb000
> >>> [  116.132519] x27: 0000000000000000 x26: ffff00231b1fb000
> >>> [  116.137821] x25: ffff0022e6cefdc0 x24: 0000000000002480
> >>> [  116.143124] x23: ffff0023682b6ab0 x22: ffff0023682b6800
> >>> [  116.148427] x21: ffff0023682b6800 x20: 0000000000000000
> >>> [  116.153730] x19: ffff0023682b6800 x18: 0000000000000000
> >>> [  116.159032] x17: 000000000000000000000000001
> >>> [  116.185545] x7 : ffff0023682b6b07 x6 : ffff80046d056d61
> >>> [  116.190848] x5 : ffff80046d056d61 x4 : ffff0023682b6ba0
> >>> [  116.196151] x3 : ffffa00010197338 x2 : dfffa00000000000
> >>> [  116.201453] x1 : 0000000000000003 x0 : 0000000000000001
> >>> [  116.206756] Call trace:
> >>> [  116.209195]  ttm_buffer_object_destroy+0x4c/0x80
> >>> [  116.213803]  ttm_bo_release_list+0x184/0x220
> >>> [  116.218064]  ttm_bo_put+0x410/0x5d0
> >>> [  116.221544]  drm_gem_vram_object_free+0xc/0x18
> >>> [  116.225979]  drm_gem_object_free+0x34/0xd0
> >>> [  116.230066]  drm_gem_object_put_unlocked+0xc8/0xf0
> >>> [  116.234848]  hibmc_user_framebuffer_destroy+0x20/0x40
> >>> [  116.239890]  drm_framebuffer_free+0x48/0x58
> >>> [  116.244064]  drm_mode_object_put.part.1+0x90/0xe8
> >>> [  116.248759]  drm_mode_object_put+0x28/0x38
> >>> [  116.252846]  hibmc_fbdev_fini+0x54/0x78
> >>> [  116.256672]  hibmc_unload+0x2c/0xd0
> >>> [  116.260151]  hibmc_pci_remove+0x2c/0x40
> >>> [  116.263979]  pci_device_remove+0x6c/0x140
> >>> [  116.267980]  device_release_driver_internal+0x134/0x250
> >>> [  116.273196]  device_driver_detach+0x28/0x38
> >>> [  116.277369]  unbind_store+0xfc/0x150
> >>> [  116.280934]  drv_attr_store+0x48/0x60
> >>> [  116.284589]  sysfs_kf_write+0x80/0xb0
> >>> [  116.288241]  kernfs_fop_write+0x1d4/0x320
> >>> [  116.292243]  __vfs_write+0x54/0x98
> >>> [  116.295635]  vfs_write+0xe8/0x270
> >>> [  116.298940]  ksys_write+0xc8/0x180
> >>> [  116.302333]  __arm64_sys_write+0x40/0x50
> >>> [  116.306248]  el0_svc_common.constprop.0+0xa4/0x1f8
> >>> [  116.311029]  el0_svc_handler+0x34/0xb0
> >>> [  116.314770]  el0_sync_handler+0x10c/0x1c8
> >>> [  116.318769]  el0_sync+0x140/0x180
> >>> [  116.322074] ---[ end trace e60e43d0e316b5c8 ]---
> >>> [  116.326868] ------------[ cut here ]------------
> >>>
> >>>
> >>> dmesg and .config is here:
> >>> https://pastebin.com/4P5yaZBS
> >>>
> >>> I'm not sure if this is a HIBMC driver issue or issue with the
> >>> framework.
> >>>
> >>> john
> >>>
> >>>
> >>> _______________________________________________
> >>> dri-devel mailing list
> >>> dri-devel@lists.freedesktop.org
> >>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> >>
> >>
> >
> > _______________________________________________
> > Linuxarm mailing list
> > Linuxarm@huawei.com
> > http://hulk.huawei.com/mailman/listinfo/linuxarm
> > .
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Warnings in DRM code when removing/unbinding a driver
  2019-12-19  9:54         ` Daniel Vetter
@ 2019-12-19 10:03           ` John Garry
  -1 siblings, 0 replies; 32+ messages in thread
From: John Garry @ 2019-12-19 10:03 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Ezequiel Garcia, kongxinwei (A), Chenfeng (puck),
	airlied, Thomas Zimmermann, Linuxarm, dri-devel, linux-kernel,
	Gerd Hoffmann, dbueso

On 19/12/2019 09:54, Daniel Vetter wrote:
> On Wed, Dec 18, 2019 at 7:08 PM John Garry <john.garry@huawei.com> wrote:
>>
>> +
>>
>> So the v5.4 kernel does not have this issue.
>>
>> I have bisected the initial occurrence to:
>>
>> commit 37a48adfba6cf6e87df9ba8b75ab85d514ed86d8
>> Author: Thomas Zimmermann <tzimmermann@suse.de>
>> Date:   Fri Sep 6 14:20:53 2019 +0200
>>
>>       drm/vram: Add kmap ref-counting to GEM VRAM objects
>>
>>       The kmap and kunmap operations of GEM VRAM buffers can now be called
>>       in interleaving pairs. The first call to drm_gem_vram_kmap() maps the
>>       buffer's memory to kernel address space and the final call to
>>       drm_gem_vram_kunmap() unmaps the memory. Intermediate calls to these
>>       functions increment or decrement a reference counter.
>>
>> So this either exposes or creates the issue.
> 
> Yeah that's just shooting the messenger.

OK, so it exposes it.

> Like I said, for most drivers
> you can pretty much assume that their unload sequence has been broken
> since forever. It's not often tested, and especially the hotunbind
> from a device (as opposed to driver unload) stuff wasn't even possible
> to get right until just recently.

Do you think it's worth trying to fix this for 5.5 and earlier, or just 
switch to the device-managed interface for 5.6 and forget about 5.5 and 
earlier?

Thanks,
John

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Warnings in DRM code when removing/unbinding a driver
  2019-12-19 10:03           ` John Garry
@ 2019-12-19 10:10             ` Daniel Vetter
  -1 siblings, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2019-12-19 10:10 UTC (permalink / raw)
  To: John Garry
  Cc: Ezequiel Garcia, kongxinwei (A), Chenfeng (puck),
	airlied, Thomas Zimmermann, Linuxarm, dri-devel, linux-kernel,
	Gerd Hoffmann, dbueso

On Thu, Dec 19, 2019 at 11:03 AM John Garry <john.garry@huawei.com> wrote:
>
> On 19/12/2019 09:54, Daniel Vetter wrote:
> > On Wed, Dec 18, 2019 at 7:08 PM John Garry <john.garry@huawei.com> wrote:
> >>
> >> +
> >>
> >> So the v5.4 kernel does not have this issue.
> >>
> >> I have bisected the initial occurrence to:
> >>
> >> commit 37a48adfba6cf6e87df9ba8b75ab85d514ed86d8
> >> Author: Thomas Zimmermann <tzimmermann@suse.de>
> >> Date:   Fri Sep 6 14:20:53 2019 +0200
> >>
> >>       drm/vram: Add kmap ref-counting to GEM VRAM objects
> >>
> >>       The kmap and kunmap operations of GEM VRAM buffers can now be called
> >>       in interleaving pairs. The first call to drm_gem_vram_kmap() maps the
> >>       buffer's memory to kernel address space and the final call to
> >>       drm_gem_vram_kunmap() unmaps the memory. Intermediate calls to these
> >>       functions increment or decrement a reference counter.
> >>
> >> So this either exposes or creates the issue.
> >
> > Yeah that's just shooting the messenger.
>
> OK, so it exposes it.
>
> > Like I said, for most drivers
> > you can pretty much assume that their unload sequence has been broken
> > since forever. It's not often tested, and especially the hotunbind
> > from a device (as opposed to driver unload) stuff wasn't even possible
> > to get right until just recently.
>
> Do you think it's worth trying to fix this for 5.5 and earlier, or just
> switch to the device-managed interface for 5.6 and forget about 5.5 and
> earlier?

I suspect it's going to be quite some trickery to fix this properly
and everywhere, even for just one driver. Lots of drm drivers
unfortunately use anti-patterns with wrong lifetimes (e.g. you can't
use devm_kmalloc for anything that hangs off a drm_device, like
plane/crtc/connector). Except when it's for a real hotunpluggable
device (usb) we've never bothered backporting these fixes. Too much
broken stuff unfortunately.
-Daniel
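
The lifetime mismatch mentioned above can be sketched with a small
userspace C model. Everything here is hypothetical (struct
model_drm_device, struct model_connector and the model_* helpers are
invented for illustration). It assumes only that device-managed
resources are released when the driver unbinds, while the
drm_device-like object is reference counted and can stay alive longer,
for example while userspace still holds it open. A flag is used in
place of a real free() so the model itself never touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical object that hangs off the device (think plane/crtc/connector). */
struct model_connector {
	bool freed;
};

/* Hypothetical refcounted device; not the kernel's struct drm_device. */
struct model_drm_device {
	int refcount;                 /* driver binding plus other users */
	struct model_connector *conn; /* object hanging off the device */
	bool conn_devm;               /* true: lifetime tied to the driver binding */
};

/* Drop one reference; the final put releases refcount-managed objects. */
void model_dev_put(struct model_drm_device *dev)
{
	if (--dev->refcount == 0 && !dev->conn_devm)
		dev->conn->freed = true;
}

/* Driver unbind: devm-style resources go away now, then the driver's ref. */
void model_unbind(struct model_drm_device *dev)
{
	if (dev->conn_devm)
		dev->conn->freed = true;
	model_dev_put(dev);
}

/* True if the device is still referenced but its connector is already gone. */
bool model_conn_dangling(const struct model_drm_device *dev)
{
	return dev->refcount > 0 && dev->conn->freed;
}
```

With two references held (the driver and one other user), unbinding the
devm-style variant leaves a live device pointing at a released
connector, whereas the refcount-managed variant keeps the connector
until the last reference is dropped.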

>
> Thanks,
> John



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Warnings in DRM code when removing/unbinding a driver
  2019-12-19 10:10             ` Daniel Vetter
@ 2019-12-19 11:31               ` Gerd Hoffmann
  -1 siblings, 0 replies; 32+ messages in thread
From: Gerd Hoffmann @ 2019-12-19 11:31 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: John Garry, Ezequiel Garcia, kongxinwei (A), Chenfeng (puck),
	airlied, Thomas Zimmermann, Linuxarm, dri-devel, linux-kernel,
	dbueso

  Hi,

> > > Like I said, for most drivers
> > > you can pretty much assume that their unload sequence has been broken
> > > since forever. It's not often tested, and especially the hotunbind
> > > from a device (as opposed to driver unload) stuff wasn't even possible
> > > to get right until just recently.
> >
> > Do you think it's worth trying to fix this for 5.5 and earlier, or just
> > switch to the device-managed interface for 5.6 and forget about 5.5 and
> > earlier?
> 
> I suspect it's going to be quite some trickery to fix this properly
> and everywhere, even for just one driver. Lots of drm drivers
> unfortunately use anti-patterns with wrong lifetimes (e.g. you can't
> use devm_kmalloc for anything that hangs off a drm_device, like
> plane/crtc/connector). Except when it's for a real hotunpluggable
> device (usb) we've never bothered backporting these fixes. Too much
> broken stuff unfortunately.

While we're at it: how would a driver properly clean up gem
objects created by userspace on hotunbind? Specifically, a gem object
pinned to vram?

cheers,
  Gerd


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Warnings in DRM code when removing/unbinding a driver
  2019-12-19 11:31               ` Gerd Hoffmann
@ 2019-12-19 12:42                 ` Daniel Vetter
  -1 siblings, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2019-12-19 12:42 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: John Garry, Ezequiel Garcia, kongxinwei (A), Chenfeng (puck),
	airlied, Thomas Zimmermann, Linuxarm, dri-devel, linux-kernel,
	dbueso

On Thu, Dec 19, 2019 at 12:32 PM Gerd Hoffmann <kraxel@redhat.com> wrote:
>
>   Hi,
>
> > >   Like I said, for most drivers
> > > > you can pretty much assume that their unload sequence has been broken
> > > > since forever. It's not often tested, and especially the hotunbind
> > > > from a device (as opposed to driver unload) stuff wasn't even possible
> > > > to get right until just recently.
> > >
> > > Do you think it's worth trying to fix this for 5.5 and earlier, or just
> > > switch to the device-managed interface for 5.6 and forget about 5.5 and
> > > earlier?
> >
> > I suspect it's going to be quite some trickery to fix this properly
> > and everywhere, even for just one driver. Lots of drm drivers
> > unfortunately use anti-patterns with wrong lifetimes (e.g. you can't
> > use devm_kmalloc for anything that hangs of a drm_device, like
> > plane/crtc/connector). Except when it's for a real hotunpluggable
> > device (usb) we've never bothered backporting these fixes. Too much
> > broken stuff unfortunately.
>
> While being at it:  How would a driver cleanup properly cleanup gem
> objects created by userspace on hotunbind?  Specifically a gem object
> pinned to vram?

Two things:
- the mmap needs to be torn down and replaced by something which will
sigbus. Probably should have that as a helper (plus vram fault code
should use drm_dev_enter/exit to plug races).
- otherwise all data structures need to be properly refcounted.
drm_device now is (if your driver isn't broken), but any dma_fence or
dma_buf we create and export has an independent lifetime, and
currently the refcounting for those is still wobbly, I think.

So there is some work to do, both in helpers/core code and in drivers that
need updating.
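
To make the two points above concrete, here is a small userspace model, not
kernel code. All model_* names are hypothetical stand-ins: the "unplugged"
flag plays the role that drm_dev_unplug() and drm_dev_enter/exit play in the
kernel, and the plain int refcount stands in for a kref. It is single-threaded
on purpose and only shows the lifetime rule: an exported buffer keeps the
device structure alive past unbind, while access after unbind fails (the point
where a real fault handler would SIGBUS).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration, not kernel types. */
struct model_device {
    int refcount;    /* kref-like count; single-threaded model */
    bool unplugged;  /* set once at unbind */
    bool *freed;     /* observation hook: set when memory is released */
};

struct model_buffer {
    struct model_device *dev;  /* holds one device reference */
};

static struct model_device *model_device_create(bool *freed)
{
    struct model_device *dev = calloc(1, sizeof(*dev));

    if (!dev)
        return NULL;
    dev->refcount = 1;  /* reference owned by the bound driver */
    dev->freed = freed;
    return dev;
}

static void model_device_put(struct model_device *dev)
{
    assert(dev->refcount > 0);
    if (--dev->refcount == 0) {
        *dev->freed = true;
        free(dev);
    }
}

/* Exporting a buffer takes its own reference on the device. */
static struct model_buffer *model_buffer_export(struct model_device *dev)
{
    struct model_buffer *buf = calloc(1, sizeof(*buf));

    if (!buf)
        return NULL;
    dev->refcount++;
    buf->dev = dev;
    return buf;
}

/* Returns 0 on success, -1 once the device is gone. */
static int model_buffer_access(const struct model_buffer *buf)
{
    return buf->dev->unplugged ? -1 : 0;
}

static void model_buffer_release(struct model_buffer *buf)
{
    model_device_put(buf->dev);
    free(buf);
}

/* Unbind: mark the device gone and drop only the driver's reference. */
static void model_device_unbind(struct model_device *dev)
{
    dev->unplugged = true;
    model_device_put(dev);
}

/* Returns 1 if the device outlives unbind exactly as long as the buffer. */
int model_unbind_with_live_buffer(void)
{
    bool freed = false;
    struct model_device *dev = model_device_create(&freed);
    struct model_buffer *buf = dev ? model_buffer_export(dev) : NULL;

    if (!dev || !buf)
        return -1;
    if (model_buffer_access(buf) != 0)
        return 0;
    model_device_unbind(dev);
    if (freed)                           /* buffer still holds a reference */
        return 0;
    if (model_buffer_access(buf) != -1)  /* access must now fail */
        return 0;
    model_buffer_release(buf);           /* last reference: device freed */
    return freed ? 1 : 0;
}
```

Calling model_unbind_with_live_buffer() walks through export, unbind, failed
access and final release in that order. A real driver would additionally need
locking or SRCU around the unplugged check, which this model leaves out.
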
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 32+ messages in thread

* SIGBUS on device disappearance (Re: Warnings in DRM code when removing/unbinding a driver)
  2019-12-19 12:42                 ` Daniel Vetter
@ 2019-12-23  9:00                   ` Pekka Paalanen
  -1 siblings, 0 replies; 32+ messages in thread
From: Pekka Paalanen @ 2019-12-23  9:00 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Gerd Hoffmann, dbueso, airlied, Chenfeng (puck),
	John Garry, Linuxarm, dri-devel, linux-kernel, kongxinwei (A),
	Thomas Zimmermann, Ezequiel Garcia

On Thu, 19 Dec 2019 13:42:33 +0100
Daniel Vetter <daniel@ffwll.ch> wrote:

> On Thu, Dec 19, 2019 at 12:32 PM Gerd Hoffmann <kraxel@redhat.com> wrote:
> >
> > While being at it:  How would a driver cleanup properly cleanup gem
> > objects created by userspace on hotunbind?  Specifically a gem object
> > pinned to vram?  
> 
> Two things:
> - the mmap needs to be torn down and replaced by something which will
> sigbus. Probably should have that as a helper (plus vram fault code
> should use drm_dev_enter/exit to plug races).

Hi,

I assume SIGBUS is the traditional way to say "oops, the memory you
mmapped and tried to access no longer exists". Is there nothing
else for this?

I'm asking, because SIGBUS is really hard to handle right in
userspace. It can be caused by any number of wildly different
reasons, yet being a signal means that a userspace process can only
have a single global handler for it. That makes it almost
impossible to use safely in libraries, because you would want to
register independent handlers from multiple libraries in the same
process. Some libraries may also be using threads.
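
A minimal sketch of the single-handler problem, assuming two hypothetical
libraries that each call sigaction() for SIGBUS on their own (the lib_a_* and
lib_b_* names are made up for illustration). The second install silently
displaces the first; the only trace of the earlier handler is the old action
that sigaction() hands back, which the second library would have to chain to
by hand.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical handlers belonging to two unrelated libraries. */
static void lib_a_handler(int sig) { (void)sig; }
static void lib_b_handler(int sig) { (void)sig; }

/* Returns 1 if installing B's handler displaced A's handler. */
int second_install_displaces_first(void)
{
    struct sigaction a = {0}, b = {0}, old = {0}, cur = {0};

    a.sa_handler = lib_a_handler;
    sigemptyset(&a.sa_mask);
    b.sa_handler = lib_b_handler;
    sigemptyset(&b.sa_mask);

    if (sigaction(SIGBUS, &a, NULL) != 0)   /* library A registers */
        return -1;
    if (sigaction(SIGBUS, &b, &old) != 0)   /* library B registers */
        return -1;
    if (sigaction(SIGBUS, NULL, &cur) != 0) /* query what is active now */
        return -1;

    assert(cur.sa_handler != NULL);
    /* A's handler survives only as the "old" action returned to B. */
    return old.sa_handler == lib_a_handler &&
           cur.sa_handler == lib_b_handler;
}
```

Chaining through the returned old action works only if every library in the
process cooperates, and it is fragile with respect to unload order, which is
the core of the complaint above.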

How to handle a SIGBUS completely depends on what triggered it.
Almost always userspace wants it to be a non-fatal error. A Wayland
compositor can hit SIGBUS on accessing wl_shm-based client buffers
(regular mmapped files), and then it just wants to continue with
garbage data as if nothing happened and possibly send a protocol
error to the client provoking it.
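
For illustration, one conceivable way to get the "continue with garbage data"
behaviour for a regular mmapped file: the handler checks that the faulting
address lies inside the guarded mapping and replaces it with anonymous zero
pages, so the faulting load is retried and succeeds. This is a sketch under
assumptions, not any particular compositor's code. The guard_* names and the
temporary file path are hypothetical, the globals allow only one guarded
range, and mmap() is not on the POSIX list of async-signal-safe functions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Range guarded by the handler; hypothetical globals for this demo. */
static char *guard_base;
static size_t guard_len;
static volatile sig_atomic_t guard_hit;

static void guard_sigbus(int sig, siginfo_t *info, void *uctx)
{
    char *addr = info->si_addr;

    (void)sig;
    (void)uctx;
    /* A SIGBUS outside our mapping belongs to someone else: give up. */
    if (addr < guard_base || addr >= guard_base + guard_len)
        _exit(2);
    /* Replace the dead mapping with zero pages; the faulting access is
     * retried after the handler returns and then reads zeros. */
    if (mmap(guard_base, guard_len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) == MAP_FAILED)
        _exit(3);
    guard_hit = 1;
}

/* Returns 1 if the read after truncation faulted and then saw zero. */
int read_survives_truncate(void)
{
    char path[] = "/tmp/sigbus-demo-XXXXXX";
    struct sigaction sa = {0};
    volatile char *p;
    char value;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);
    guard_len = (size_t)sysconf(_SC_PAGESIZE);
    assert(guard_len > 0);
    if (ftruncate(fd, (off_t)guard_len) != 0)
        return -1;
    guard_base = mmap(NULL, guard_len, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (guard_base == MAP_FAILED)
        return -1;
    guard_base[0] = 'x';

    sa.sa_sigaction = guard_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, NULL) != 0)
        return -1;

    if (ftruncate(fd, 0) != 0)  /* the backing store vanishes */
        return -1;
    p = guard_base;
    value = p[0];               /* faults; handler remaps; load is retried */

    munmap(guard_base, guard_len);
    close(fd);
    return guard_hit == 1 && value == 0;
}
```

Note that this still needs the process-wide SIGBUS handler, so it only works
when the application owns signal handling, which is exactly what a library
like Mesa cannot assume.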

I would also imagine that Mesa, when it starts looking into
supporting GPU hotunplug, needs to handle vanished mmaps. I don't
think Mesa can ever install signal handlers, because that would
mess with the applications that may already be using SIGBUS for
handling disappearing mmapped files. It needs to start returning
errors via API calls. I cannot imagine a way to reliably prevent
such SIGBUS either by e.g. ensuring Mesa gets notified of removal
before it actually starts failing.

For now, I'm just looking for a simple "yes" or "no" on whether there is
something else. If it's "no" like I expect, creating something else
will probably take on the order of years to get into a usable state. Does
anyone already have plans towards that?


Thanks,
pq

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: SIGBUS on device disappearance (Re: Warnings in DRM code when removing/unbinding a driver)
  2019-12-23  9:00                   ` Pekka Paalanen
@ 2020-01-07 15:42                     ` Daniel Vetter
  -1 siblings, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2020-01-07 15:42 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Daniel Vetter, Gerd Hoffmann, dbueso, airlied, Chenfeng (puck),
	John Garry, Linuxarm, dri-devel, linux-kernel, kongxinwei (A),
	Thomas Zimmermann, Ezequiel Garcia

On Mon, Dec 23, 2019 at 11:00:15AM +0200, Pekka Paalanen wrote:
> On Thu, 19 Dec 2019 13:42:33 +0100
> Daniel Vetter <daniel@ffwll.ch> wrote:
> 
> > On Thu, Dec 19, 2019 at 12:32 PM Gerd Hoffmann <kraxel@redhat.com> wrote:
> > >
> > > While being at it:  How would a driver cleanup properly cleanup gem
> > > objects created by userspace on hotunbind?  Specifically a gem object
> > > pinned to vram?  
> > 
> > Two things:
> > - the mmap needs to be torn down and replaced by something which will
> > sigbus. Probably should have that as a helper (plus vram fault code
> > should use drm_dev_enter/exit to plug races).
> 
> Hi,
> 
> I assume SIGBUS is the traditional way to say "oops, the memory you
> mmapped and tried to access no longer exists". Is there nothing
> else for this?
> 
> I'm asking, because SIGBUS is really hard to handle right in
> userspace. It can be caused by any number of wildly different
> reasons, yet being a signal means that a userspace process can only
> have a single global handler for it. That makes it almost
> impossible to use safely in libraries, because you would want to
> register independent handlers from multiple libraries in the same
> process. Some libraries may also be using threads.
> 
> How to handle a SIGBUS completely depends on what triggered it.
> Almost always userspace wants it to be a non-fatal error. A Wayland
> compositor can hit SIGBUS on accessing wl_shm-based client buffers
> (regular mmapped files), and then it just wants to continue with
> garbage data as if nothing happened and possibly send a protocol
> error to the client provoking it.

For drm drivers that you actually want to hotunplug (as opposed to doing it
just for driver development), they all use system memory/shmem, so they
shouldn't sigbus. I think at least, I haven't tested anything. This is for
udl, or the tiny displays behind an spi bridge.

For pci drivers, where the mmap often points at a pci bridge, the mmio range
will be gone, so not SIGBUSing is going to be a tall order. Not
impossible, but before we enshrine this into uapi someone will have to do
some serious typing.

> I would also imagine that Mesa, when it starts looking into
> supporting GPU hotunplug, needs to handle vanished mmaps. I don't
> think Mesa can ever install signal handlers, because that would
> mess with the applications that may already be using SIGBUS for
> handling disappearing mmapped files. It needs to start returning
> errors via API calls. I cannot imagine a way to reliably prevent
> such SIGBUS either by e.g. ensuring Mesa gets notified of removal
> before it actually starts failing.

Mesa already blows up in all kinds of interesting ways when it gets an EIO
at execbuf. I think. Robust handling of gpu hotunplug for gl/vk contexts
is going to be more work on top (and mmap is probably the least issue
there, at least right now).

> For now, I'm just looking for a simple "yes" or "no" here for the
> something else. If it's "no" like I expect, creating something else
> is probably in the order of years to get into a usable state. Does
> anyone already have plans towards that?

I agree with you that SIGBUS for mmap of hotunplugged devices is
essentially unusable because of sighandlers and everything else you point
out (it would make it impossible to have robust vk/gl contexts, at least
robust against hotunplug).

So in principle I'm open to having some other uapi for this, but it's going
to be a serious amount of work across the stack.

For display only udl-style devices otoh I think we should be mostly there,
+/- driver bugs as usual.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Warnings in DRM code when removing/unbinding a driver
  2019-12-16 17:23 ` John Garry
@ 2020-01-10 10:49   ` Thomas Zimmermann
  -1 siblings, 0 replies; 32+ messages in thread
From: Thomas Zimmermann @ 2020-01-10 10:49 UTC (permalink / raw)
  To: John Garry, kongxinwei (A), Chenfeng (puck), airlied, daniel
  Cc: linux-kernel, dri-devel


Hi John

Am 16.12.19 um 18:23 schrieb John Garry:
> Hi all,
> 
> Enabling CONFIG_DEBUG_TEST_DRIVER_REMOVE causes many warns on a system
> with the HIBMC hw:
> 
> [   27.788806] WARNING: CPU: 24 PID: 1 at
> drivers/gpu/drm/drm_gem_vram_helper.c:564 bo_driver_move_notify+0x8c/0x98
> [   27.798969] Modules linked in:
> [   27.802018] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B
>  5.5.0-rc1-dirty #565
> [   27.810358] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
> RC0 - V1.16.01 03/15/2019
> [   27.818872] pstate: 20c00009 (nzCv daif +PAN +UAO)
> [   27.823654] pc : bo_driver_move_notify+0x8c/0x98
> [   27.828262] lr : bo_driver_move_notify+0x40/0x98
> [   27.832868] sp : ffff00236f0677e0
> [   27.836173] x29: ffff00236f0677e0 x28: ffffa0001454e5e0
> [   27.841476] x27: ffff002366e52128 x26: ffffa000149e67b0
> [   27.846779] x25: ffff002366e523e0 x24: ffff002336936120
> [   27.852082] x23: ffff0023346f4010 x22: ffff002336936128
> [   27.857385] x21: ffffa000149c15c0 x20: ffff0023369361f8
> [   27.862687] x19: ffff002336936000 x18: 0000000000001258
> [   27.867989] x17: 0000000000001190 x16: 00000000000011d0
> [   27.873292] x15: 0000000000001348 x14: ffffa00012d68190
> [   27.878595] x13: 0000000000000006 x12: 1ffff40003241f91
> [   27.883897] x11: ffff940003241f91 x10: dfffa00000000000
> [   27.889200] x9 : ffff940003241f92 x8 : 0000000000000001
> [   27.894502] x7 : ffffa0001920fc88 x6 : ffff940003241f92
> [   27.899804] x5 : ffff940003241f92 x4 : ffff0023369363a0
> [   27.905107] x3 : ffffa00010c104b8 x2 : dfffa00000000000
> [   27.910409] x1 : 0000000000000003 x0 : 0000000000000001
> [   27.915712] Call trace:
> [   27.918151]  bo_driver_move_notify+0x8c/0x98
> [   27.922412]  ttm_bo_cleanup_memtype_use+0x54/0x100
> [   27.927194]  ttm_bo_put+0x3a0/0x5d0
> [   27.930673]  drm_gem_vram_object_free+0xc/0x18
> [   27.935109]  drm_gem_object_free+0x34/0xd0
> [   27.939196]  drm_gem_object_put_unlocked+0xc8/0xf0
> [   27.943978]  hibmc_user_framebuffer_destroy+0x20/0x40
> [   27.949020]  drm_framebuffer_free+0x48/0x58
> [   27.953194]  drm_mode_object_put.part.1+0x90/0xe8
> [   27.957889]  drm_mode_object_put+0x28/0x38
> [   27.961976]  hibmc_fbdev_fini+0x54/0x78

drm-tip now contains

commit a88248506a2bcfeaef6837a53cde19fe11970e6c
Author: Thomas Zimmermann <tzimmermann@suse.de>
Date:   Tue Dec 3 09:38:15 2019 +0100

    drm/hisilicon/hibmc: Switch to generic fbdev emulation

which removes this entire code and switches hibmc to generic fbdev
emulation. Does that fix the problem?

Best regards
Thomas

> [   27.965802]  hibmc_unload+0x2c/0xd0
> [   27.969281]  hibmc_pci_remove+0x2c/0x40
> [   27.973109]  pci_device_remove+0x6c/0x140
> [   27.977110]  really_probe+0x174/0x548
> [   27.980763]  driver_probe_device+0x7c/0x148
> [   27.984936]  device_driver_attach+0x94/0xa0
> [   27.989109]  __driver_attach+0xa8/0x110
> [   27.992935]  bus_for_each_dev+0xe8/0x158
> [   27.996849]  driver_attach+0x30/0x40
> [   28.000415]  bus_add_driver+0x234/0x2f0
> [   28.004241]  driver_register+0xbc/0x1d0
> [   28.008067]  __pci_register_driver+0xbc/0xd0
> [   28.012329]  hibmc_pci_driver_init+0x20/0x28
> [   28.016590]  do_one_initcall+0xb4/0x254
> [   28.020417]  kernel_init_freeable+0x27c/0x328
> [   28.024765]  kernel_init+0x10/0x118
> [   28.028245]  ret_from_fork+0x10/0x18
> [   28.031813] ---[ end trace 35a83b71b657878d ]---
> [   28.036503] ------------[ cut here ]------------
> [   28.041115] WARNING: CPU: 24 PID: 1 at
> drivers/gpu/drm/drm_gem_vram_helper.c:40
> ttm_buffer_object_destroy+0x4c/0x80
> [   28.051537] Modules linked in:
> [   28.054585] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B   W
>  5.5.0-rc1-dirty #565
> [   28.062924] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
> RC0 - V1.16.01 03/15/2019
> 
> [snip]
> 
> Indeed, simply unbinding the device from the driver causes the same sort
> of issue:
> 
> root@(none)$ cd ./bus/pci/drivers/hibmc-drm/
> root@(none)$ ls
> 0000:05:00.0  bind          new_id        remove_id     uevent       
> unbind
> root@(none)$ echo 0000\:05\:00.0 > unbind
> [  116.074352] ------------[ cut here ]------------
> [  116.078978] WARNING: CPU: 17 PID: 1178 at
> drivers/gpu/drm/drm_gem_vram_helper.c:40
> ttm_buffer_object_destroy+0x4c/0x80
> [  116.089661] Modules linked in:
> [  116.092711] CPU: 17 PID: 1178 Comm: sh Tainted: G    B   W
> 5.5.0-rc1-dirty #565
> [  116.100704] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
> RC0 - V1.16.01 03/15/2019
> [  116.109218] pstate: 20400009 (nzCv daif +PAN -UAO)
> [  116.114001] pc : ttm_buffer_object_destroy+0x4c/0x80
> [  116.118956] lr : ttm_buffer_object_destroy+0x18/0x80
> [  116.123910] sp : ffff0022e6cef8e0
> [  116.127215] x29: ffff0022e6cef8e0 x28: ffff00231b1fb000
> [  116.132519] x27: 0000000000000000 x26: ffff00231b1fb000
> [  116.137821] x25: ffff0022e6cefdc0 x24: 0000000000002480
> [  116.143124] x23: ffff0023682b6ab0 x22: ffff0023682b6800
> [  116.148427] x21: ffff0023682b6800 x20: 0000000000000000
> [  116.153730] x19: ffff0023682b6800 x18: 0000000000000000
> [  116.159032] x17: 000000000000000000000000001
> [  116.185545] x7 : ffff0023682b6b07 x6 : ffff80046d056d61
> [  116.190848] x5 : ffff80046d056d61 x4 : ffff0023682b6ba0
> [  116.196151] x3 : ffffa00010197338 x2 : dfffa00000000000
> [  116.201453] x1 : 0000000000000003 x0 : 0000000000000001
> [  116.206756] Call trace:
> [  116.209195]  ttm_buffer_object_destroy+0x4c/0x80
> [  116.213803]  ttm_bo_release_list+0x184/0x220
> [  116.218064]  ttm_bo_put+0x410/0x5d0
> [  116.221544]  drm_gem_vram_object_free+0xc/0x18
> [  116.225979]  drm_gem_object_free+0x34/0xd0
> [  116.230066]  drm_gem_object_put_unlocked+0xc8/0xf0
> [  116.234848]  hibmc_user_framebuffer_destroy+0x20/0x40
> [  116.239890]  drm_framebuffer_free+0x48/0x58
> [  116.244064]  drm_mode_object_put.part.1+0x90/0xe8
> [  116.248759]  drm_mode_object_put+0x28/0x38
> [  116.252846]  hibmc_fbdev_fini+0x54/0x78
> [  116.256672]  hibmc_unload+0x2c/0xd0
> [  116.260151]  hibmc_pci_remove+0x2c/0x40
> [  116.263979]  pci_device_remove+0x6c/0x140
> [  116.267980]  device_release_driver_internal+0x134/0x250
> [  116.273196]  device_driver_detach+0x28/0x38
> [  116.277369]  unbind_store+0xfc/0x150
> [  116.280934]  drv_attr_store+0x48/0x60
> [  116.284589]  sysfs_kf_write+0x80/0xb0
> [  116.288241]  kernfs_fop_write+0x1d4/0x320
> [  116.292243]  __vfs_write+0x54/0x98
> [  116.295635]  vfs_write+0xe8/0x270
> [  116.298940]  ksys_write+0xc8/0x180
> [  116.302333]  __arm64_sys_write+0x40/0x50
> [  116.306248]  el0_svc_common.constprop.0+0xa4/0x1f8
> [  116.311029]  el0_svc_handler+0x34/0xb0
> [  116.314770]  el0_sync_handler+0x10c/0x1c8
> [  116.318769]  el0_sync+0x140/0x180
> [  116.322074] ---[ end trace e60e43d0e316b5c8 ]---
> [  116.326868] ------------[ cut here ]------------
> 
> 
> dmesg and .config is here:
> https://pastebin.com/4P5yaZBS
> 
> I'm not sure if this is a HIBMC driver issue or issue with the framework.
> 
> john
> 
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Warnings in DRM code when removing/unbinding a driver
@ 2020-01-10 10:49   ` Thomas Zimmermann
  0 siblings, 0 replies; 32+ messages in thread
From: Thomas Zimmermann @ 2020-01-10 10:49 UTC (permalink / raw)
  To: John Garry, kongxinwei (A), Chenfeng (puck), airlied, daniel
  Cc: linux-kernel, dri-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 7793 bytes --]

Hi John

Am 16.12.19 um 18:23 schrieb John Garry:
> Hi all,
> 
> Enabling CONFIG_DEBUG_TEST_DRIVER_REMOVE causes many warns on a system
> with the HIBMC hw:
> 
> [   27.788806] WARNING: CPU: 24 PID: 1 at
> drivers/gpu/drm/drm_gem_vram_helper.c:564 bo_driver_move_notify+0x8c/0x98
> [   27.798969] Modules linked in:
> [   27.802018] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B
>  5.5.0-rc1-dirty #565
> [   27.810358] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
> RC0 - V1.16.01 03/15/2019
> [   27.818872] pstate: 20c00009 (nzCv daif +PAN +UAO)
> [   27.823654] pc : bo_driver_move_notify+0x8c/0x98
> [   27.828262] lr : bo_driver_move_notify+0x40/0x98
> [   27.832868] sp : ffff00236f0677e0
> [   27.836173] x29: ffff00236f0677e0 x28: ffffa0001454e5e0
> [   27.841476] x27: ffff002366e52128 x26: ffffa000149e67b0
> [   27.846779] x25: ffff002366e523e0 x24: ffff002336936120
> [   27.852082] x23: ffff0023346f4010 x22: ffff002336936128
> [   27.857385] x21: ffffa000149c15c0 x20: ffff0023369361f8
> [   27.862687] x19: ffff002336936000 x18: 0000000000001258
> [   27.867989] x17: 0000000000001190 x16: 00000000000011d0
> [   27.873292] x15: 0000000000001348 x14: ffffa00012d68190
> [   27.878595] x13: 0000000000000006 x12: 1ffff40003241f91
> [   27.883897] x11: ffff940003241f91 x10: dfffa00000000000
> [   27.889200] x9 : ffff940003241f92 x8 : 0000000000000001
> [   27.894502] x7 : ffffa0001920fc88 x6 : ffff940003241f92
> [   27.899804] x5 : ffff940003241f92 x4 : ffff0023369363a0
> [   27.905107] x3 : ffffa00010c104b8 x2 : dfffa00000000000
> [   27.910409] x1 : 0000000000000003 x0 : 0000000000000001
> [   27.915712] Call trace:
> [   27.918151]  bo_driver_move_notify+0x8c/0x98
> [   27.922412]  ttm_bo_cleanup_memtype_use+0x54/0x100
> [   27.927194]  ttm_bo_put+0x3a0/0x5d0
> [   27.930673]  drm_gem_vram_object_free+0xc/0x18
> [   27.935109]  drm_gem_object_free+0x34/0xd0
> [   27.939196]  drm_gem_object_put_unlocked+0xc8/0xf0
> [   27.943978]  hibmc_user_framebuffer_destroy+0x20/0x40
> [   27.949020]  drm_framebuffer_free+0x48/0x58
> [   27.953194]  drm_mode_object_put.part.1+0x90/0xe8
> [   27.957889]  drm_mode_object_put+0x28/0x38
> [   27.961976]  hibmc_fbdev_fini+0x54/0x78

drm-tip now contains

commit a88248506a2bcfeaef6837a53cde19fe11970e6c
Author: Thomas Zimmermann <tzimmermann@suse.de>
Date:   Tue Dec 3 09:38:15 2019 +0100

    drm/hisilicon/hibmc: Switch to generic fbdev emulation

which removes this entire code and switches hibmc to generic fbdev
emulation. Does that fix the problem?

Best regards
Thomas

> [   27.965802]  hibmc_unload+0x2c/0xd0
> [   27.969281]  hibmc_pci_remove+0x2c/0x40
> [   27.973109]  pci_device_remove+0x6c/0x140
> [   27.977110]  really_probe+0x174/0x548
> [   27.980763]  driver_probe_device+0x7c/0x148
> [   27.984936]  device_driver_attach+0x94/0xa0
> [   27.989109]  __driver_attach+0xa8/0x110
> [   27.992935]  bus_for_each_dev+0xe8/0x158
> [   27.996849]  driver_attach+0x30/0x40
> [   28.000415]  bus_add_driver+0x234/0x2f0
> [   28.004241]  driver_register+0xbc/0x1d0
> [   28.008067]  __pci_register_driver+0xbc/0xd0
> [   28.012329]  hibmc_pci_driver_init+0x20/0x28
> [   28.016590]  do_one_initcall+0xb4/0x254
> [   28.020417]  kernel_init_freeable+0x27c/0x328
> [   28.024765]  kernel_init+0x10/0x118
> [   28.028245]  ret_from_fork+0x10/0x18
> [   28.031813] ---[ end trace 35a83b71b657878d ]---
> [   28.036503] ------------[ cut here ]------------
> [   28.041115] WARNING: CPU: 24 PID: 1 at
> drivers/gpu/drm/drm_gem_vram_helper.c:40
> ttm_buffer_object_destroy+0x4c/0x80
> [   28.051537] Modules linked in:
> [   28.054585] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B   W
>  5.5.0-rc1-dirty #565
> [   28.062924] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
> RC0 - V1.16.01 03/15/2019
> 
> [snip]
> 
> Indeed, simply unbinding the device from the driver causes the same sort
> of issue:
> 
> root@(none)$ cd ./bus/pci/drivers/hibmc-drm/
> root@(none)$ ls
> 0000:05:00.0  bind          new_id        remove_id     uevent       
> unbind
> root@(none)$ echo 0000\:05\:00.0 > unbind
> [  116.074352] ------------[ cut here ]------------
> [  116.078978] WARNING: CPU: 17 PID: 1178 at
> drivers/gpu/drm/drm_gem_vram_helper.c:40
> ttm_buffer_object_destroy+0x4c/0x80
> [  116.089661] Modules linked in:
> [  116.092711] CPU: 17 PID: 1178 Comm: sh Tainted: G    B   W
> 5.5.0-rc1-dirty #565
> [  116.100704] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
> RC0 - V1.16.01 03/15/2019
> [  116.109218] pstate: 20400009 (nzCv daif +PAN -UAO)
> [  116.114001] pc : ttm_buffer_object_destroy+0x4c/0x80
> [  116.118956] lr : ttm_buffer_object_destroy+0x18/0x80
> [  116.123910] sp : ffff0022e6cef8e0
> [  116.127215] x29: ffff0022e6cef8e0 x28: ffff00231b1fb000
> [  116.132519] x27: 0000000000000000 x26: ffff00231b1fb000
> [  116.137821] x25: ffff0022e6cefdc0 x24: 0000000000002480
> [  116.143124] x23: ffff0023682b6ab0 x22: ffff0023682b6800
> [  116.148427] x21: ffff0023682b6800 x20: 0000000000000000
> [  116.153730] x19: ffff0023682b6800 x18: 0000000000000000
> [  116.159032] x17: 000000000000000000000000001
> [  116.185545] x7 : ffff0023682b6b07 x6 : ffff80046d056d61
> [  116.190848] x5 : ffff80046d056d61 x4 : ffff0023682b6ba0
> [  116.196151] x3 : ffffa00010197338 x2 : dfffa00000000000
> [  116.201453] x1 : 0000000000000003 x0 : 0000000000000001
> [  116.206756] Call trace:
> [  116.209195]  ttm_buffer_object_destroy+0x4c/0x80
> [  116.213803]  ttm_bo_release_list+0x184/0x220
> [  116.218064]  ttm_bo_put+0x410/0x5d0
> [  116.221544]  drm_gem_vram_object_free+0xc/0x18
> [  116.225979]  drm_gem_object_free+0x34/0xd0
> [  116.230066]  drm_gem_object_put_unlocked+0xc8/0xf0
> [  116.234848]  hibmc_user_framebuffer_destroy+0x20/0x40
> [  116.239890]  drm_framebuffer_free+0x48/0x58
> [  116.244064]  drm_mode_object_put.part.1+0x90/0xe8
> [  116.248759]  drm_mode_object_put+0x28/0x38
> [  116.252846]  hibmc_fbdev_fini+0x54/0x78
> [  116.256672]  hibmc_unload+0x2c/0xd0
> [  116.260151]  hibmc_pci_remove+0x2c/0x40
> [  116.263979]  pci_device_remove+0x6c/0x140
> [  116.267980]  device_release_driver_internal+0x134/0x250
> [  116.273196]  device_driver_detach+0x28/0x38
> [  116.277369]  unbind_store+0xfc/0x150
> [  116.280934]  drv_attr_store+0x48/0x60
> [  116.284589]  sysfs_kf_write+0x80/0xb0
> [  116.288241]  kernfs_fop_write+0x1d4/0x320
> [  116.292243]  __vfs_write+0x54/0x98
> [  116.295635]  vfs_write+0xe8/0x270
> [  116.298940]  ksys_write+0xc8/0x180
> [  116.302333]  __arm64_sys_write+0x40/0x50
> [  116.306248]  el0_svc_common.constprop.0+0xa4/0x1f8
> [  116.311029]  el0_svc_handler+0x34/0xb0
> [  116.314770]  el0_sync_handler+0x10c/0x1c8
> [  116.318769]  el0_sync+0x140/0x180
> [  116.322074] ---[ end trace e60e43d0e316b5c8 ]---
> [  116.326868] ------------[ cut here ]------------
> 
> 
> dmesg and .config is here:
> https://pastebin.com/4P5yaZBS
> 
> I'm not sure if this is a HIBMC driver issue or issue with the framework.
> 
> john
> 
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Warnings in DRM code when removing/unbinding a driver
  2020-01-10 10:49   ` Thomas Zimmermann
@ 2020-01-10 12:54     ` John Garry
  -1 siblings, 0 replies; 32+ messages in thread
From: John Garry @ 2020-01-10 12:54 UTC (permalink / raw)
  To: Thomas Zimmermann, kongxinwei (A), Chenfeng (puck), airlied, daniel
  Cc: linux-kernel, dri-devel



Hi Thomas,

> drm-tip now contains

I have tested today's linux-next, which includes this:

> 
> commit a88248506a2bcfeaef6837a53cde19fe11970e6c
> Author: Thomas Zimmermann <tzimmermann@suse.de>
> Date:   Tue Dec 3 09:38:15 2019 +0100
> 
>      drm/hisilicon/hibmc: Switch to generic fbdev emulation
> 
> which removes this entire code and switches hibmc to generic fbdev
> emulation. Does that fix the problem?
> 

And I see no warnings; here's a dmesg snippet:

[   20.672787] pci 0007:90:00.0: can't derive routing for PCI INT A
[   20.678831] hibmc-drm 0007:91:00.0: PCI INT A: no GSI
[   20.686536] pci_bus 0007:90: 2-byte config write to 0007:90:00.0 offset 0x4 may corrupt adjacent RW1C bits
[   20.696888] [TTM] Zone  kernel: Available graphics memory: 57359458 KiB
[   20.703545] [TTM] Zone   dma32: Available graphics memory: 2097152 KiB
[   20.710108] [TTM] Initializing pool allocator
[   20.714561] [TTM] Initializing DMA pool allocator
[   20.720212] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[   20.726863] [drm] No driver support for vblank timestamp query.
[   20.754777] Console: switching to colour frame buffer device 100x37
[   20.778180] hibmc-drm 0007:91:00.0: fb0: hibmcdrmfb frame buffer device
[   20.786447] [drm] Initialized hibmc 1.0.0 20160828 for 0007:91:00.0 on minor 0
[   20.794346] Console: switching to colour dummy device 80x25
[   20.801884] pci 0007:90:00.0: can't derive routing for PCI INT A
[   20.807963] hibmc-drm 0007:91:00.0: PCI INT A: no GSI
[   20.813656] [TTM] Finalizing pool allocator
[   20.817905] [TTM] Finalizing DMA pool allocator
[   20.822576] [TTM] Zone  kernel: Used memory at exit: 0 KiB
[   20.828760] [TTM] Zone   dma32: Used memory at exit: 0 KiB
[   20.834978] pci 0007:90:00.0: can't derive routing for PCI INT A
[   20.841021] hibmc-drm 0007:91:00.0: PCI INT A: no GSI
[   20.848858] [TTM] Zone  kernel: Available graphics memory: 57359458 KiB
[   20.855516] [TTM] Zone   dma32: Available graphics memory: 2097152 KiB
[   20.862079] [TTM] Initializing pool allocator
[   20.866525] [TTM] Initializing DMA pool allocator
[   20.872064] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[   20.878716] [drm] No driver support for vblank timestamp query.
[   20.905996] Console: switching to colour frame buffer device 100x37
[   20.929385] hibmc-drm 0007:91:00.0: fb0: hibmcdrmfb frame buffer device
[   20.937241] [drm] Initialized hibmc 1.0.0 20160828 for 0007:91:00.0 on minor 0
[   21.171906] loop: module loaded
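
Checking a log like this for warning splats can be scripted. Below is a minimal sketch, assuming the log is available as plain text; `find_warnings` is a hypothetical helper name used only for illustration and is not part of any kernel tooling. It reports the timestamp of each line that opens a WARNING splat and tolerates mail quote markers in front of the line.

```python
import re

# First line of a kernel warning splat, e.g.
# "[   28.041115] WARNING: CPU: 24 PID: 1 at"
WARNING_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+WARNING:")


def find_warnings(log_text):
    """Return the timestamp of every WARNING line found in log_text."""
    timestamps = []
    for line in log_text.splitlines():
        # Drop mail quote markers ("> ", ">> ") and leading spaces.
        candidate = line.lstrip("> ")
        match = WARNING_RE.match(candidate)
        if match:
            timestamps.append(match.group(1))
    return timestamps


sample = (
    "[   28.036503] ------------[ cut here ]------------\n"
    "[   28.041115] WARNING: CPU: 24 PID: 1 at\n"
    "[   20.813656] [TTM] Finalizing pool allocator\n"
)
print(find_warnings(sample))  # ['28.041115']
```

An empty list means no splat was found in the text passed in; it says nothing about log lines that were never captured.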

Thanks,
John

> Best regards
> Thomas
> 
>> [   27.965802]  hibmc_unload+0x2c/0xd0
>> [   27.969281]  hibmc_pci_remove+0x2c/0x40
>> [   27.973109]  pci_device_remove+0x6c/0x140
>> [   27.977110]  really_probe+0x174/0x548
>> [   27.980763]  driver_probe_device+0x7c/0x148
>> [   27.984936]  device_driver_attach+0x94/0xa0
>> [   27.989109]  __driver_attach+0xa8/0x110
>> [   27.992935]  bus_for_each_dev+0xe8/0x158
>> [   27.996849]  driver_attach+0x30/0x40
>> [   28.000415]  bus_add_driver+0x234/0x2f0
>> [   28.004241]  driver_register+0xbc/0x1d0
>> [   28.008067]  __pci_register_driver+0xbc/0xd0
>> [   28.012329]  hibmc_pci_driver_init+0x20/0x28
>> [   28.016590]  do_one_initcall+0xb4/0x254
>> [   28.020417]  kernel_init_freeable+0x27c/0x328
>> [   28.024765]  kernel_init+0x10/0x118
>> [   28.028245]  ret_from_fork+0x10/0x18
>> [   28.031813] ---[ end trace 35a83b71b657878d ]---
>> [   28.036503] ------------[ cut here ]------------
>> [   28.041115] WARNING: CPU: 24 PID: 1 at
>> drivers/gpu/drm/drm_gem_vram_helper.c:40
>> ttm_buffer_object_destroy+0x4c/0x80
>> [   28.051537] Modules linked in:
>> [   28.054585] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B   W
>>   5.5.0-rc1-dirty #565
>> [   28.062924] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
>> RC0 - V1.16.01 03/15/2019
>>
>> [snip]
>>
>> Indeed, simply unbinding the device from the driver causes the same sort
>> of issue:
>>
>> root@(none)$ cd ./bus/pci/drivers/hibmc-drm/
>> root@(none)$ ls
>> 0000:05:00.0  bind          new_id        remove_id     uevent
>> unbind
>> root@(none)$ echo 0000\:05\:00.0 > unbind
>> [  116.074352] ------------[ cut here ]------------
>> [  116.078978] WARNING: CPU: 17 PID: 1178 at
>> drivers/gpu/drm/drm_gem_vram_helper.c:40
>> ttm_buffer_object_destroy+0x4c/0x80
>> [  116.089661] Modules linked in:
>> [  116.092711] CPU: 17 PID: 1178 Comm: sh Tainted: G    B   W
>> 5.5.0-rc1-dirty #565
>> [  116.100704] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
>> RC0 - V1.16.01 03/15/2019
>> [  116.109218] pstate: 20400009 (nzCv daif +PAN -UAO)
>> [  116.114001] pc : ttm_buffer_object_destroy+0x4c/0x80
>> [  116.118956] lr : ttm_buffer_object_destroy+0x18/0x80
>> [  116.123910] sp : ffff0022e6cef8e0
>> [  116.127215] x29: ffff0022e6cef8e0 x28: ffff00231b1fb000
>> [  116.132519] x27: 0000000000000000 x26: ffff00231b1fb000
>> [  116.137821] x25: ffff0022e6cefdc0 x24: 0000000000002480
>> [  116.143124] x23: ffff0023682b6ab0 x22: ffff0023682b6800
>> [  116.148427] x21: ffff0023682b6800 x20: 0000000000000000
>> [  116.153730] x19: ffff0023682b6800 x18: 0000000000000000
>> [  116.159032] x17: 000000000000000000000000001
>> [  116.185545] x7 : ffff0023682b6b07 x6 : ffff80046d056d61
>> [  116.190848] x5 : ffff80046d056d61 x4 : ffff0023682b6ba0
>> [  116.196151] x3 : ffffa00010197338 x2 : dfffa00000000000
>> [  116.201453] x1 : 0000000000000003 x0 : 0000000000000001
>> [  116.206756] Call trace:
>> [  116.209195]  ttm_buffer_object_destroy+0x4c/0x80
>> [  116.213803]  ttm_bo_release_list+0x184/0x220
>> [  116.218064]  ttm_bo_put+0x410/0x5d0
>> [  116.221544]  drm_gem_vram_object_free+0xc/0x18
>> [  116.225979]  drm_gem_object_free+0x34/0xd0
>> [  116.230066]  drm_gem_object_put_unlocked+0xc8/0xf0
>> [  116.234848]  hibmc_user_framebuffer_destroy+0x20/0x40
>> [  116.239890]  drm_framebuffer_free+0x48/0x58
>> [  116.244064]  drm_mode_object_put.part.1+0x90/0xe8
>> [  116.248759]  drm_mode_object_put+0x28/0x38
>> [  116.252846]  hibmc_fbdev_fini+0x54/0x78
>> [  116.256672]  hibmc_unload+0x2c/0xd0
>> [  116.260151]  hibmc_pci_remove+0x2c/0x40
>> [  116.263979]  pci_device_remove+0x6c/0x140
>> [  116.267980]  device_release_driver_internal+0x134/0x250
>> [  116.273196]  device_driver_detach+0x28/0x38
>> [  116.277369]  unbind_store+0xfc/0x150
>> [  116.280934]  drv_attr_store+0x48/0x60
>> [  116.284589]  sysfs_kf_write+0x80/0xb0
>> [  116.288241]  kernfs_fop_write+0x1d4/0x320
>> [  116.292243]  __vfs_write+0x54/0x98
>> [  116.295635]  vfs_write+0xe8/0x270
>> [  116.298940]  ksys_write+0xc8/0x180
>> [  116.302333]  __arm64_sys_write+0x40/0x50
>> [  116.306248]  el0_svc_common.constprop.0+0xa4/0x1f8
>> [  116.311029]  el0_svc_handler+0x34/0xb0
>> [  116.314770]  el0_sync_handler+0x10c/0x1c8
>> [  116.318769]  el0_sync+0x140/0x180
>> [  116.322074] ---[ end trace e60e43d0e316b5c8 ]---
>> [  116.326868] ------------[ cut here ]------------
>>
>>
>> dmesg and .config is here:
>> https://pastebin.com/4P5yaZBS
>>
>> I'm not sure if this is a HIBMC driver issue or issue with the framework.
>>
>> john
>>
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Warnings in DRM code when removing/unbinding a driver
  2020-01-10 12:54     ` John Garry
@ 2020-01-13  8:05       ` Thomas Zimmermann
  -1 siblings, 0 replies; 32+ messages in thread
From: Thomas Zimmermann @ 2020-01-13  8:05 UTC (permalink / raw)
  To: John Garry, kongxinwei (A), Chenfeng (puck), airlied, daniel
  Cc: linux-kernel, dri-devel


Hi John

On 10.01.20 at 13:54, John Garry wrote:
> 
> 
> Hi Thomas,
> 
>> drm-tip now contains
> 
> I have tested today's linux-next, which includes this:
> 
>>
>> commit a88248506a2bcfeaef6837a53cde19fe11970e6c
>> Author: Thomas Zimmermann <tzimmermann@suse.de>
>> Date:   Tue Dec 3 09:38:15 2019 +0100
>>
>>      drm/hisilicon/hibmc: Switch to generic fbdev emulation
>>
>> which removes this entire code and switches hibmc to generic fbdev
>> emulation. Does that fix the problem?
>>
> 
> And I see no warn, here's a dmesg snippet:

Great. So I'll consider this fixed. Thanks for reporting and testing.

Best regards
Thomas

> 
> [   20.672787] pci 0007:90:00.0: can't derive routing for PCI INT A
> [   20.678831] hibmc-drm 0007:91:00.0: PCI INT A: no GSI
> [   20.686536] pci_bus 0007:90: 2-byte config write to 0007:90:00.0
> offset 0x4 may corrupt adjacent RW1C bits
> [   20.696888] [TTM] Zone  kernel: Available graphics memory: 57359458 KiB
> [   20.703545] [TTM] Zone   dma32: Available graphics memory: 2097152 KiB
> [   20.710108] [TTM] Initializing pool allocator
> [   20.714561] [TTM] Initializing DMA pool allocator
> [   20.720212] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> [   20.726863] [drm] No driver support for vblank timestamp query.
> [   20.754777] Console: switching to colour frame buffer device 100x37
> [   20.778180] hibmc-drm 0007:91:00.0: fb0: hibmcdrmfb frame buffer device
> [   20.786447] [drm] Initialized hibmc 1.0.0 20160828 for 0007:91:00.0
> on minor 0
> [   20.794346] Console: switching to colour dummy device 80x25
> [   20.801884] pci 0007:90:00.0: can't derive routing for PCI INT A
> [   20.807963] hibmc-drm 0007:91:00.0: PCI INT A: no GSI
> [   20.813656] [TTM] Finalizing pool allocator
> [   20.817905] [TTM] Finalizing DMA pool allocator
> [   20.822576] [TTM] Zone  kernel: Used memory at exit: 0 KiB
> [   20.828760] [TTM] Zone   dma32: Used memory at exit: 0 KiB
> [   20.834978] pci 0007:90:00.0: can't derive routing for PCI INT A
> [   20.841021] hibmc-drm 0007:91:00.0: PCI INT A: no GSI
> [   20.848858] [TTM] Zone  kernel: Available graphics memory: 57359458 KiB
> [   20.855516] [TTM] Zone   dma32: Available graphics memory: 2097152 KiB
> [   20.862079] [TTM] Initializing pool allocator
> [   20.866525] [TTM] Initializing DMA pool allocator
> [   20.872064] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> [   20.878716] [drm] No driver support for vblank timestamp query.
> [   20.905996] Console: switching to colour frame buffer device 100x37
> [   20.929385] hibmc-drm 0007:91:00.0: fb0: hibmcdrmfb frame buffer device
> [   20.937241] [drm] Initialized hibmc 1.0.0 20160828 for 0007:91:00.0
> on minor 0
> [   21.171906] loop: module loaded
> 
> Thanks,
> John
> 
>> Best regards
>> Thomas
>>
>>> [   27.965802]  hibmc_unload+0x2c/0xd0
>>> [   27.969281]  hibmc_pci_remove+0x2c/0x40
>>> [   27.973109]  pci_device_remove+0x6c/0x140
>>> [   27.977110]  really_probe+0x174/0x548
>>> [   27.980763]  driver_probe_device+0x7c/0x148
>>> [   27.984936]  device_driver_attach+0x94/0xa0
>>> [   27.989109]  __driver_attach+0xa8/0x110
>>> [   27.992935]  bus_for_each_dev+0xe8/0x158
>>> [   27.996849]  driver_attach+0x30/0x40
>>> [   28.000415]  bus_add_driver+0x234/0x2f0
>>> [   28.004241]  driver_register+0xbc/0x1d0
>>> [   28.008067]  __pci_register_driver+0xbc/0xd0
>>> [   28.012329]  hibmc_pci_driver_init+0x20/0x28
>>> [   28.016590]  do_one_initcall+0xb4/0x254
>>> [   28.020417]  kernel_init_freeable+0x27c/0x328
>>> [   28.024765]  kernel_init+0x10/0x118
>>> [   28.028245]  ret_from_fork+0x10/0x18
>>> [   28.031813] ---[ end trace 35a83b71b657878d ]---
>>> [   28.036503] ------------[ cut here ]------------
>>> [   28.041115] WARNING: CPU: 24 PID: 1 at
>>> drivers/gpu/drm/drm_gem_vram_helper.c:40
>>> ttm_buffer_object_destroy+0x4c/0x80
>>> [   28.051537] Modules linked in:
>>> [   28.054585] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B   W
>>>   5.5.0-rc1-dirty #565
>>> [   28.062924] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
>>> RC0 - V1.16.01 03/15/2019
>>>
>>> [snip]
>>>
>>> Indeed, simply unbinding the device from the driver causes the same sort
>>> of issue:
>>>
>>> root@(none)$ cd ./bus/pci/drivers/hibmc-drm/
>>> root@(none)$ ls
>>> 0000:05:00.0  bind          new_id        remove_id     uevent
>>> unbind
>>> root@(none)$ echo 0000\:05\:00.0 > unbind
>>> [  116.074352] ------------[ cut here ]------------
>>> [  116.078978] WARNING: CPU: 17 PID: 1178 at
>>> drivers/gpu/drm/drm_gem_vram_helper.c:40
>>> ttm_buffer_object_destroy+0x4c/0x80
>>> [  116.089661] Modules linked in:
>>> [  116.092711] CPU: 17 PID: 1178 Comm: sh Tainted: G    B   W
>>> 5.5.0-rc1-dirty #565
>>> [  116.100704] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
>>> RC0 - V1.16.01 03/15/2019
>>> [  116.109218] pstate: 20400009 (nzCv daif +PAN -UAO)
>>> [  116.114001] pc : ttm_buffer_object_destroy+0x4c/0x80
>>> [  116.118956] lr : ttm_buffer_object_destroy+0x18/0x80
>>> [  116.123910] sp : ffff0022e6cef8e0
>>> [  116.127215] x29: ffff0022e6cef8e0 x28: ffff00231b1fb000
>>> [  116.132519] x27: 0000000000000000 x26: ffff00231b1fb000
>>> [  116.137821] x25: ffff0022e6cefdc0 x24: 0000000000002480
>>> [  116.143124] x23: ffff0023682b6ab0 x22: ffff0023682b6800
>>> [  116.148427] x21: ffff0023682b6800 x20: 0000000000000000
>>> [  116.153730] x19: ffff0023682b6800 x18: 0000000000000000
>>> [  116.159032] x17: 000000000000000000000000001
>>> [  116.185545] x7 : ffff0023682b6b07 x6 : ffff80046d056d61
>>> [  116.190848] x5 : ffff80046d056d61 x4 : ffff0023682b6ba0
>>> [  116.196151] x3 : ffffa00010197338 x2 : dfffa00000000000
>>> [  116.201453] x1 : 0000000000000003 x0 : 0000000000000001
>>> [  116.206756] Call trace:
>>> [  116.209195]  ttm_buffer_object_destroy+0x4c/0x80
>>> [  116.213803]  ttm_bo_release_list+0x184/0x220
>>> [  116.218064]  ttm_bo_put+0x410/0x5d0
>>> [  116.221544]  drm_gem_vram_object_free+0xc/0x18
>>> [  116.225979]  drm_gem_object_free+0x34/0xd0
>>> [  116.230066]  drm_gem_object_put_unlocked+0xc8/0xf0
>>> [  116.234848]  hibmc_user_framebuffer_destroy+0x20/0x40
>>> [  116.239890]  drm_framebuffer_free+0x48/0x58
>>> [  116.244064]  drm_mode_object_put.part.1+0x90/0xe8
>>> [  116.248759]  drm_mode_object_put+0x28/0x38
>>> [  116.252846]  hibmc_fbdev_fini+0x54/0x78
>>> [  116.256672]  hibmc_unload+0x2c/0xd0
>>> [  116.260151]  hibmc_pci_remove+0x2c/0x40
>>> [  116.263979]  pci_device_remove+0x6c/0x140
>>> [  116.267980]  device_release_driver_internal+0x134/0x250
>>> [  116.273196]  device_driver_detach+0x28/0x38
>>> [  116.277369]  unbind_store+0xfc/0x150
>>> [  116.280934]  drv_attr_store+0x48/0x60
>>> [  116.284589]  sysfs_kf_write+0x80/0xb0
>>> [  116.288241]  kernfs_fop_write+0x1d4/0x320
>>> [  116.292243]  __vfs_write+0x54/0x98
>>> [  116.295635]  vfs_write+0xe8/0x270
>>> [  116.298940]  ksys_write+0xc8/0x180
>>> [  116.302333]  __arm64_sys_write+0x40/0x50
>>> [  116.306248]  el0_svc_common.constprop.0+0xa4/0x1f8
>>> [  116.311029]  el0_svc_handler+0x34/0xb0
>>> [  116.314770]  el0_sync_handler+0x10c/0x1c8
>>> [  116.318769]  el0_sync+0x140/0x180
>>> [  116.322074] ---[ end trace e60e43d0e316b5c8 ]---
>>> [  116.326868] ------------[ cut here ]------------
>>>
>>>
>>> dmesg and .config is here:
>>> https://pastebin.com/4P5yaZBS
>>>
>>> I'm not sure if this is a HIBMC driver issue or issue with the
>>> framework.
>>>
>>> john
>>>
>>>
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Warnings in DRM code when removing/unbinding a driver
@ 2020-01-13  8:05       ` Thomas Zimmermann
  0 siblings, 0 replies; 32+ messages in thread
From: Thomas Zimmermann @ 2020-01-13  8:05 UTC (permalink / raw)
  To: John Garry, kongxinwei (A), Chenfeng (puck), airlied, daniel
  Cc: linux-kernel, dri-devel


Hi John

On 10.01.20 at 13:54, John Garry wrote:
> 
> 
> Hi Thomas,
> 
>> drm-tip now contains
> 
> I have tested today's linux-next, which includes this:
> 
>>
>> commit a88248506a2bcfeaef6837a53cde19fe11970e6c
>> Author: Thomas Zimmermann <tzimmermann@suse.de>
>> Date:   Tue Dec 3 09:38:15 2019 +0100
>>
>>      drm/hisilicon/hibmc: Switch to generic fbdev emulation
>>
>> which removes this entire code and switches hibmc to generic fbdev
>> emulation. Does that fix the problem?
>>
> 
> And I see no warn, here's a dmesg snippet:

Great. So I'll consider this fixed. Thanks for reporting and testing.

Best regards
Thomas

> 
> [   20.672787] pci 0007:90:00.0: can't derive routing for PCI INT A
> [   20.678831] hibmc-drm 0007:91:00.0: PCI INT A: no GSI
> [   20.686536] pci_bus 0007:90: 2-byte config write to 0007:90:00.0
> offset 0x4 may corrupt adjacent RW1C bits
> [   20.696888] [TTM] Zone  kernel: Available graphics memory: 57359458 KiB
> [   20.703545] [TTM] Zone   dma32: Available graphics memory: 2097152 KiB
> [   20.710108] [TTM] Initializing pool allocator
> [   20.714561] [TTM] Initializing DMA pool allocator
> [   20.720212] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> [   20.726863] [drm] No driver support for vblank timestamp query.
> [   20.754777] Console: switching to colour frame buffer device 100x37
> [   20.778180] hibmc-drm 0007:91:00.0: fb0: hibmcdrmfb frame buffer device
> [   20.786447] [drm] Initialized hibmc 1.0.0 20160828 for 0007:91:00.0
> on minor 0
> [   20.794346] Console: switching to colour dummy device 80x25
> [   20.801884] pci 0007:90:00.0: can't derive routing for PCI INT A
> [   20.807963] hibmc-drm 0007:91:00.0: PCI INT A: no GSI
> [   20.813656] [TTM] Finalizing pool allocator
> [   20.817905] [TTM] Finalizing DMA pool allocator
> [   20.822576] [TTM] Zone  kernel: Used memory at exit: 0 KiB
> [   20.828760] [TTM] Zone   dma32: Used memory at exit: 0 KiB
> [   20.834978] pci 0007:90:00.0: can't derive routing for PCI INT A
> [   20.841021] hibmc-drm 0007:91:00.0: PCI INT A: no GSI
> [   20.848858] [TTM] Zone  kernel: Available graphics memory: 57359458 KiB
> [   20.855516] [TTM] Zone   dma32: Available graphics memory: 2097152 KiB
> [   20.862079] [TTM] Initializing pool allocator
> [   20.866525] [TTM] Initializing DMA pool allocator
> [   20.872064] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> [   20.878716] [drm] No driver support for vblank timestamp query.
> [   20.905996] Console: switching to colour frame buffer device 100x37
> [   20.929385] hibmc-drm 0007:91:00.0: fb0: hibmcdrmfb frame buffer device
> [   20.937241] [drm] Initialized hibmc 1.0.0 20160828 for 0007:91:00.0
> on minor 0
> [   21.171906] loop: module loaded
> 
> Thanks,
> John
> 
>> Best regards
>> Thomas
>>
>>> [   27.965802]  hibmc_unload+0x2c/0xd0
>>> [   27.969281]  hibmc_pci_remove+0x2c/0x40
>>> [   27.973109]  pci_device_remove+0x6c/0x140
>>> [   27.977110]  really_probe+0x174/0x548
>>> [   27.980763]  driver_probe_device+0x7c/0x148
>>> [   27.984936]  device_driver_attach+0x94/0xa0
>>> [   27.989109]  __driver_attach+0xa8/0x110
>>> [   27.992935]  bus_for_each_dev+0xe8/0x158
>>> [   27.996849]  driver_attach+0x30/0x40
>>> [   28.000415]  bus_add_driver+0x234/0x2f0
>>> [   28.004241]  driver_register+0xbc/0x1d0
>>> [   28.008067]  __pci_register_driver+0xbc/0xd0
>>> [   28.012329]  hibmc_pci_driver_init+0x20/0x28
>>> [   28.016590]  do_one_initcall+0xb4/0x254
>>> [   28.020417]  kernel_init_freeable+0x27c/0x328
>>> [   28.024765]  kernel_init+0x10/0x118
>>> [   28.028245]  ret_from_fork+0x10/0x18
>>> [   28.031813] ---[ end trace 35a83b71b657878d ]---
>>> [   28.036503] ------------[ cut here ]------------
>>> [   28.041115] WARNING: CPU: 24 PID: 1 at drivers/gpu/drm/drm_gem_vram_helper.c:40 ttm_buffer_object_destroy+0x4c/0x80
>>> [   28.051537] Modules linked in:
>>> [   28.054585] CPU: 24 PID: 1 Comm: swapper/0 Tainted: G    B   W   5.5.0-rc1-dirty #565
>>> [   28.062924] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
>>>
>>> [snip]
>>>
>>> Indeed, simply unbinding the device from the driver causes the same sort
>>> of issue:
>>>
>>> root@(none)$ cd ./bus/pci/drivers/hibmc-drm/
>>> root@(none)$ ls
>>> 0000:05:00.0  bind          new_id        remove_id     uevent        unbind
>>> root@(none)$ echo 0000\:05\:00.0 > unbind
>>> [  116.074352] ------------[ cut here ]------------
>>> [  116.078978] WARNING: CPU: 17 PID: 1178 at drivers/gpu/drm/drm_gem_vram_helper.c:40 ttm_buffer_object_destroy+0x4c/0x80
>>> [  116.089661] Modules linked in:
>>> [  116.092711] CPU: 17 PID: 1178 Comm: sh Tainted: G    B   W 5.5.0-rc1-dirty #565
>>> [  116.100704] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
>>> [  116.109218] pstate: 20400009 (nzCv daif +PAN -UAO)
>>> [  116.114001] pc : ttm_buffer_object_destroy+0x4c/0x80
>>> [  116.118956] lr : ttm_buffer_object_destroy+0x18/0x80
>>> [  116.123910] sp : ffff0022e6cef8e0
>>> [  116.127215] x29: ffff0022e6cef8e0 x28: ffff00231b1fb000
>>> [  116.132519] x27: 0000000000000000 x26: ffff00231b1fb000
>>> [  116.137821] x25: ffff0022e6cefdc0 x24: 0000000000002480
>>> [  116.143124] x23: ffff0023682b6ab0 x22: ffff0023682b6800
>>> [  116.148427] x21: ffff0023682b6800 x20: 0000000000000000
>>> [  116.153730] x19: ffff0023682b6800 x18: 0000000000000000
>>> [  116.159032] x17: 000000000000000000000000001
>>> [  116.185545] x7 : ffff0023682b6b07 x6 : ffff80046d056d61
>>> [  116.190848] x5 : ffff80046d056d61 x4 : ffff0023682b6ba0
>>> [  116.196151] x3 : ffffa00010197338 x2 : dfffa00000000000
>>> [  116.201453] x1 : 0000000000000003 x0 : 0000000000000001
>>> [  116.206756] Call trace:
>>> [  116.209195]  ttm_buffer_object_destroy+0x4c/0x80
>>> [  116.213803]  ttm_bo_release_list+0x184/0x220
>>> [  116.218064]  ttm_bo_put+0x410/0x5d0
>>> [  116.221544]  drm_gem_vram_object_free+0xc/0x18
>>> [  116.225979]  drm_gem_object_free+0x34/0xd0
>>> [  116.230066]  drm_gem_object_put_unlocked+0xc8/0xf0
>>> [  116.234848]  hibmc_user_framebuffer_destroy+0x20/0x40
>>> [  116.239890]  drm_framebuffer_free+0x48/0x58
>>> [  116.244064]  drm_mode_object_put.part.1+0x90/0xe8
>>> [  116.248759]  drm_mode_object_put+0x28/0x38
>>> [  116.252846]  hibmc_fbdev_fini+0x54/0x78
>>> [  116.256672]  hibmc_unload+0x2c/0xd0
>>> [  116.260151]  hibmc_pci_remove+0x2c/0x40
>>> [  116.263979]  pci_device_remove+0x6c/0x140
>>> [  116.267980]  device_release_driver_internal+0x134/0x250
>>> [  116.273196]  device_driver_detach+0x28/0x38
>>> [  116.277369]  unbind_store+0xfc/0x150
>>> [  116.280934]  drv_attr_store+0x48/0x60
>>> [  116.284589]  sysfs_kf_write+0x80/0xb0
>>> [  116.288241]  kernfs_fop_write+0x1d4/0x320
>>> [  116.292243]  __vfs_write+0x54/0x98
>>> [  116.295635]  vfs_write+0xe8/0x270
>>> [  116.298940]  ksys_write+0xc8/0x180
>>> [  116.302333]  __arm64_sys_write+0x40/0x50
>>> [  116.306248]  el0_svc_common.constprop.0+0xa4/0x1f8
>>> [  116.311029]  el0_svc_handler+0x34/0xb0
>>> [  116.314770]  el0_sync_handler+0x10c/0x1c8
>>> [  116.318769]  el0_sync+0x140/0x180
>>> [  116.322074] ---[ end trace e60e43d0e316b5c8 ]---
>>> [  116.326868] ------------[ cut here ]------------
>>>
>>>
>>> dmesg and .config are here:
>>> https://pastebin.com/4P5yaZBS
>>>
>>> I'm not sure if this is a HIBMC driver issue or an issue with the
>>> framework.
>>>
>>> john
>>>
>>>
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



Thread overview: 32+ messages
2019-12-16 17:23 Warnings in DRM code when removing/unbinding a driver John Garry
2019-12-17  9:20 ` John Garry
2019-12-17 13:24   ` Daniel Vetter
2019-12-17 16:34 ` Ezequiel Garcia
2019-12-17 17:27   ` John Garry
2019-12-18 18:08     ` John Garry
2019-12-19  9:54       ` Daniel Vetter
2019-12-19 10:03         ` John Garry
2019-12-19 10:10           ` Daniel Vetter
2019-12-19 11:31             ` Gerd Hoffmann
2019-12-19 12:42               ` Daniel Vetter
2019-12-23  9:00                 ` SIGBUS on device disappearance (Re: Warnings in DRM code when removing/unbinding a driver) Pekka Paalanen
2020-01-07 15:42                   ` Daniel Vetter
2020-01-10 10:49 ` Warnings in DRM code when removing/unbinding a driver Thomas Zimmermann
2020-01-10 12:54   ` John Garry
2020-01-13  8:05     ` Thomas Zimmermann
