* 4.16-rc1: UBSAN warning in nouveau/nvkm/subdev/therm/base.c + oops in nvkm_therm_clkgate_fini
From: Meelis Roos @ 2018-02-13 20:04 UTC
To: nouveau, dri-devel, Ben Skeggs, Linux Kernel list

This is 4.16-rc1 + today's git on a lowly P4 with NV5, worked fine in 4.15:

[ 7.361155] nouveau 0000:01:00.0: NVIDIA NV05 (20154000)
[ 7.386601] nouveau 0000:01:00.0: bios: version 02.05.19.03.00
[ 7.386715] nouveau 0000:01:00.0: bios: DCB table not found
[ 7.386983] nouveau 0000:01:00.0: bios: DCB table not found
[ 7.387166] nouveau 0000:01:00.0: bios: DCB table not found
[ 7.387266] nouveau 0000:01:00.0: bios: DCB table not found
[ 7.397578] agpgart-intel 0000:00:00.0: AGP 2.0 bridge
[ 7.397705] agpgart-intel 0000:00:00.0: putting AGP V2 device into 4x mode
[ 7.397827] nouveau 0000:01:00.0: putting AGP V2 device into 4x mode
[ 7.398021] ================================================================================
[ 7.398163] UBSAN: Undefined behaviour in drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c:315:12
[ 7.398302] member access within null pointer of type 'struct nvkm_therm'
[ 7.398403] CPU: 0 PID: 125 Comm: systemd-udevd Not tainted 4.16.0-rc1-00010-g178e834c47b0 #65
[ 7.398543] Hardware name: /D850GB , BIOS GB85010A.86A.0078.P18.0110081719 10/08/2001
[ 7.398686] Call Trace:
[ 7.398788] dump_stack+0x16/0x18
[ 7.398885] ubsan_epilogue+0xe/0x2f
[ 7.398979] ubsan_type_mismatch_common+0xdc/0x152
[ 7.399079] __ubsan_handle_type_mismatch+0x24/0x26
[ 7.399368] nvkm_therm_clkgate_fini+0x14d/0x174 [nouveau]
[ 7.399638] ? nvkm_device_subdev+0x1b9/0x1fa [nouveau]
[ 7.399907] nvkm_device_fini+0x113/0x3e9 [nouveau]
[ 7.400010] ? ktime_get+0x4b/0x135
[ 7.400253] ? nvkm_devinit_post+0x35/0xbf [nouveau]
[ 7.400519] nvkm_device_init+0x228/0x5b0 [nouveau]
[ 7.400626] ? kmem_cache_alloc+0xbd/0x12a
[ 7.400893] nvkm_udevice_init+0x51/0xa9 [nouveau]
[ 7.401137] nvkm_object_init+0xc8/0x442 [nouveau]
[ 7.401244] ? check_preempt_wakeup+0xc2/0x1c1
[ 7.401487] ? nvkm_client_child_new+0x1d/0x38 [nouveau]
[ 7.401729] nvkm_ioctl_new+0x152/0x3d9 [nouveau]
[ 7.401835] ? default_wake_function+0x1a/0x35
[ 7.402077] ? nvif_vmm_init+0x2ce/0x2ce [nouveau]
[ 7.402345] ? nvkm_udevice_rd08+0x5b/0x5b [nouveau]
[ 7.402587] nvkm_ioctl+0x1c6/0x48d [nouveau]
[ 7.402829] ? nvif_client_init+0xc3/0x114 [nouveau]
[ 7.403094] ? nvkm_client_map+0xf/0xf [nouveau]
[ 7.403382] nvkm_client_ioctl+0x1c/0x22 [nouveau]
[ 7.403643] nvif_object_ioctl+0x6f/0xff [nouveau]
[ 7.403903] nvif_object_init+0xd4/0x1de [nouveau]
[ 7.404164] nvif_device_init+0x21/0x5c [nouveau]
[ 7.404453] nouveau_cli_init+0x21f/0xe1f [nouveau]
[ 7.404733] ? nouveau_drm_load+0x1d/0xe11 [nouveau]
[ 7.405011] nouveau_drm_load+0x54/0xe11 [nouveau]
[ 7.405112] ? kernfs_new_node+0x2b/0x8e
[ 7.405209] ? kernfs_create_link+0x55/0xcd
[ 7.405323] ? drm_dev_register+0x12f/0x2e0 [drm]
[ 7.405437] drm_dev_register+0x168/0x2e0 [drm]
[ 7.405538] ? pci_enable_device_flags+0xeb/0x15e
[ 7.405651] drm_get_pci_dev+0xbf/0x230 [drm]
[ 7.405924] nouveau_drm_probe+0x183/0x1ea [nouveau]
[ 7.406035] pci_device_probe+0xaa/0x163
[ 7.406136] driver_probe_device+0x1db/0x383
[ 7.406234] __driver_attach+0x86/0xb8
[ 7.406330] ? driver_probe_device+0x383/0x383
[ 7.406427] bus_for_each_dev+0x4e/0x83
[ 7.406522] driver_attach+0x1d/0x33
[ 7.406618] ? driver_probe_device+0x383/0x383
[ 7.406714] bus_add_driver+0x184/0x273
[ 7.406810] driver_register+0x66/0x107
[ 7.407039] ? nouveau_drm_init+0x66/0x1000 [nouveau]
[ 7.407146] __pci_register_driver+0x47/0x71
[ 7.407379] nouveau_drm_init+0x18a/0x1000 [nouveau]
[ 7.407478] ? 0xf831a000
[ 7.407575] do_one_initcall+0x4f/0x1e2
[ 7.407672] ? free_unref_page_commit.isra.88+0xd5/0x176
[ 7.407771] ? kvfree+0x3c/0x3e
[ 7.407864] ? __vunmap+0x89/0xef
[ 7.407960] ? do_init_module+0x1a/0x23f
[ 7.408055] do_init_module+0x82/0x23f
[ 7.408153] load_module+0x243c/0x36ae
[ 7.408253] ? kernel_read+0x4c/0xa1
[ 7.408350] SyS_finit_module+0x78/0x8d
[ 7.408447] do_fast_syscall_32+0xc1/0x31b
[ 7.408545] entry_SYSENTER_32+0x4e/0x7c
[ 7.408640] EIP: 0xb7ee9ad5
[ 7.408730] EFLAGS: 00000296 CPU: 0
[ 7.408823] EAX: ffffffda EBX: 00000019 ECX: b7ce0bdd EDX: 00000000
[ 7.408920] ESI: 00eb6670 EDI: 00ebe610 EBP: 00000000 ESP: bff8704c
[ 7.409017] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[ 7.409113] ================================================================================
[ 7.409344] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 7.409640] IP: nvkm_therm_clkgate_fini+0x15/0x174 [nouveau]
[ 7.409738] *pde = 00000000
[ 7.409833] Oops: 0000 [#1]
[ 7.409923] Modules linked in: nouveau(+) evdev wmi video i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops uhci_hcd ttm ehci_hcd usbcore drm pcspkr psmouse sr_mod cdrom sg drm_panel_orientation_quirks parport_pc floppy i2c_i801 parport usb_common snd_intel8x0 snd_ac97_codec button rng_core ac97_bus snd_pcm snd_timer snd soundcore eeprom adm1031 adm1025 hwmon_vid i2c_core ip_tables x_tables ipv6 autofs4
[ 7.410357] CPU: 0 PID: 125 Comm: systemd-udevd Not tainted 4.16.0-rc1-00010-g178e834c47b0 #65
[ 7.410499] Hardware name: /D850GB , BIOS GB85010A.86A.0078.P18.0110081719 10/08/2001
[ 7.410824] EIP: nvkm_therm_clkgate_fini+0x15/0x174 [nouveau]
[ 7.410921] EFLAGS: 00010286 CPU: 0
[ 7.411014] EAX: f6b3b800 EBX: 00000000 ECX: 00000006 EDX: 00000007
[ 7.411109] ESI: 00000000 EDI: 00000000 EBP: f6155858 ESP: f6155834
[ 7.411205] DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[ 7.411299] CR0: 80050033 CR2: 00000000 CR3: 3614b000 CR4: 000006d0
[ 7.411395] Call Trace:
[ 7.411662] ? nvkm_device_subdev+0x1b9/0x1fa [nouveau]
[ 7.411926] nvkm_device_fini+0x113/0x3e9 [nouveau]
[ 7.412030] ? ktime_get+0x4b/0x135
[ 7.412274] ? nvkm_devinit_post+0x35/0xbf [nouveau]
[ 7.412536] nvkm_device_init+0x228/0x5b0 [nouveau]
[ 7.412640] ? kmem_cache_alloc+0xbd/0x12a
[ 7.412906] nvkm_udevice_init+0x51/0xa9 [nouveau]
[ 7.413146] nvkm_object_init+0xc8/0x442 [nouveau]
[ 7.413248] ? check_preempt_wakeup+0xc2/0x1c1
[ 7.413602] ? nvkm_client_child_new+0x1d/0x38 [nouveau]
[ 7.413956] nvkm_ioctl_new+0x152/0x3d9 [nouveau]
[ 7.414055] ? default_wake_function+0x1a/0x35
[ 7.414409] ? nvif_vmm_init+0x2ce/0x2ce [nouveau]
[ 7.414788] ? nvkm_udevice_rd08+0x5b/0x5b [nouveau]
[ 7.415150] nvkm_ioctl+0x1c6/0x48d [nouveau]
[ 7.416466] ? nvif_client_init+0xc3/0x114 [nouveau]
[ 7.416832] ? nvkm_client_map+0xf/0xf [nouveau]
[ 7.417201] nvkm_client_ioctl+0x1c/0x22 [nouveau]
[ 7.417554] nvif_object_ioctl+0x6f/0xff [nouveau]
[ 7.417909] nvif_object_init+0xd4/0x1de [nouveau]
[ 7.418271] nvif_device_init+0x21/0x5c [nouveau]
[ 7.418536] nouveau_cli_init+0x21f/0xe1f [nouveau]
[ 7.418799] ? nouveau_drm_load+0x1d/0xe11 [nouveau]
[ 7.419058] nouveau_drm_load+0x54/0xe11 [nouveau]
[ 7.419158] ? kernfs_new_node+0x2b/0x8e
[ 7.419255] ? kernfs_create_link+0x55/0xcd
[ 7.419369] ? drm_dev_register+0x12f/0x2e0 [drm]
[ 7.419496] drm_dev_register+0x168/0x2e0 [drm]
[ 7.419596] ? pci_enable_device_flags+0xeb/0x15e
[ 7.419724] drm_get_pci_dev+0xbf/0x230 [drm]
[ 7.420102] nouveau_drm_probe+0x183/0x1ea [nouveau]
[ 7.420207] pci_device_probe+0xaa/0x163
[ 7.420305] driver_probe_device+0x1db/0x383
[ 7.420402] __driver_attach+0x86/0xb8
[ 7.420497] ? driver_probe_device+0x383/0x383
[ 7.420597] bus_for_each_dev+0x4e/0x83
[ 7.420694] driver_attach+0x1d/0x33
[ 7.420790] ? driver_probe_device+0x383/0x383
[ 7.420886] bus_add_driver+0x184/0x273
[ 7.420983] driver_register+0x66/0x107
[ 7.421215] ? nouveau_drm_init+0x66/0x1000 [nouveau]
[ 7.421322] __pci_register_driver+0x47/0x71
[ 7.421555] nouveau_drm_init+0x18a/0x1000 [nouveau]
[ 7.421654] ? 0xf831a000
[ 7.421751] do_one_initcall+0x4f/0x1e2
[ 7.421850] ? free_unref_page_commit.isra.88+0xd5/0x176
[ 7.421947] ? kvfree+0x3c/0x3e
[ 7.422041] ? __vunmap+0x89/0xef
[ 7.422136] ? do_init_module+0x1a/0x23f
[ 7.422232] do_init_module+0x82/0x23f
[ 7.422329] load_module+0x243c/0x36ae
[ 7.422428] ? kernel_read+0x4c/0xa1
[ 7.422524] SyS_finit_module+0x78/0x8d
[ 7.422624] do_fast_syscall_32+0xc1/0x31b
[ 7.422722] entry_SYSENTER_32+0x4e/0x7c
[ 7.422817] EIP: 0xb7ee9ad5
[ 7.422907] EFLAGS: 00000296 CPU: 0
[ 7.423001] EAX: ffffffda EBX: 00000019 ECX: b7ce0bdd EDX: 00000000
[ 7.423098] ESI: 00eb6670 EDI: 00ebe610 EBP: 00000000 ESP: bff8704c
[ 7.423195] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[ 7.423291] Code: e9 30 ff ff ff 31 d2 b8 78 cf b0 f8 e8 ba 07 a2 c8 e9 0f ff ff ff 55 89 e5 57 56 53 83 ec 18 89 c3 89 d6 85 c0 0f 84 2c 01 00 00 <8b> 3b 85 ff 0f 84 11 01 00 00 8b 47 30 85 c0 0f 84 a1 00 00 00
[ 7.423757] EIP: nvkm_therm_clkgate_fini+0x15/0x174 [nouveau] SS:ESP: 0068:f6155834
[ 7.423899] CR2: 0000000000000000
[ 7.424033] ---[ end trace cad535783d11d7b9 ]---

-- 
Meelis Roos (mroos@linux.ee)

* Re: 4.16-rc1: UBSAN warning in nouveau/nvkm/subdev/therm/base.c + oops in nvkm_therm_clkgate_fini
From: Meelis Roos @ 2018-02-14 14:29 UTC
To: nouveau, dri-devel, Ben Skeggs, Linux Kernel list

> This is 4.16-rc1 + today's git on a lowly P4 with NV5, worked fine in 4.15:

NV5 in another PC (secondary card in x86-64) made the system crash on
boot, in nvkm_therm_clkgate_fini.

-- 
Meelis Roos (mroos@linux.ee)

* Re: [Nouveau] 4.16-rc1: UBSAN warning in nouveau/nvkm/subdev/therm/base.c + oops in nvkm_therm_clkgate_fini
From: Ilia Mirkin @ 2018-02-14 14:35 UTC
To: Meelis Roos
Cc: nouveau, dri-devel, Ben Skeggs, Linux Kernel list

On Wed, Feb 14, 2018 at 9:29 AM, Meelis Roos <mroos@linux.ee> wrote:
>> This is 4.16-rc1 + today's git on a lowly P4 with NV5, worked fine in 4.15:
>
> NV5 in another PC (secondary card in x86-64) made the system crash on
> boot, in nvkm_therm_clkgate_fini.

Mind booting with nouveau.debug=trace? That should hopefully tell us
more exactly which thing is dying. If you have a cross-compile/distcc
setup handy, a bisect may be even more useful.

It's funny, I had an NV5 plugged into my desktop for testing, and
*just* took it out (because the box wouldn't even get to BIOS anymore
... although it was unrelated to the NV5, probably just something
mis-seated.)

-ilia

* Re: [Nouveau] 4.16-rc1: UBSAN warning in nouveau/nvkm/subdev/therm/base.c + oops in nvkm_therm_clkgate_fini
From: Ilia Mirkin @ 2018-02-14 14:36 UTC
To: Meelis Roos
Cc: nouveau, dri-devel, Ben Skeggs, Linux Kernel list

On Wed, Feb 14, 2018 at 9:35 AM, Ilia Mirkin <imirkin@alum.mit.edu> wrote:
> On Wed, Feb 14, 2018 at 9:29 AM, Meelis Roos <mroos@linux.ee> wrote:
>>> This is 4.16-rc1 + today's git on a lowly P4 with NV5, worked fine in 4.15:
>>
>> NV5 in another PC (secondary card in x86-64) made the system crash on
>> boot, in nvkm_therm_clkgate_fini.
>
> Mind booting with nouveau.debug=trace? That should hopefully tell us
> more exactly which thing is dying. If you have a cross-compile/distcc
> setup handy, a bisect may be even more useful.

Erm, sorry, never mind. You even said it -- nvkm_therm_clkgate_fini is
somehow mis-hooked up for NV5 now. A bisect result would still make
the culprit a lot more obvious.

* Re: [Nouveau] 4.16-rc1: UBSAN warning in nouveau/nvkm/subdev/therm/base.c + oops in nvkm_therm_clkgate_fini
From: Pierre Moreau @ 2018-02-14 17:41 UTC
To: Ilia Mirkin
Cc: Lyude Paul, Meelis Roos, nouveau, Ben Skeggs, dri-devel, Linux Kernel list

On 2018-02-14 09:36, Ilia Mirkin wrote:
> On Wed, Feb 14, 2018 at 9:35 AM, Ilia Mirkin <imirkin@alum.mit.edu> wrote:
> > On Wed, Feb 14, 2018 at 9:29 AM, Meelis Roos <mroos@linux.ee> wrote:
> >>> This is 4.16-rc1 + today's git on a lowly P4 with NV5, worked fine in 4.15:
> >>
> >> NV5 in another PC (secondary card in x86-64) made the system crash on
> >> boot, in nvkm_therm_clkgate_fini.
> >
> > Mind booting with nouveau.debug=trace? That should hopefully tell us
> > more exactly which thing is dying. If you have a cross-compile/distcc
> > setup handy, a bisect may be even more useful.
>
> Erm, sorry, never mind. You even said it -- nvkm_therm_clkgate_fini is
> somehow mis-hooked up for NV5 now. A bisect result would still make
> the culprit a lot more obvious.

CC'ing Lyude Paul as she hooked up the clockgating support.

Looking at the code, only NV40+ have a therm engine. Therefore, shouldn't
nvkm_therm_clkgate_enable(), nvkm_therm_clkgate_fini() and
nvkm_therm_clkgate_oneinit() all check for therm being not NULL, on top of
their check for the clkgate_* hooks being there? Or instead, maybe have the
check in nvkm_device_init()?

Pierre

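The guard Pierre asks about can be illustrated in isolation. What follows is only a minimal, self-contained sketch of the pattern, not the nouveau code: `struct therm_sketch`, `struct therm_func_sketch`, `hw_clkgate_fini_sketch()` and `therm_clkgate_fini_sketch()` are hypothetical stand-ins invented for this example, and the real `struct nvkm_therm` has a different and much larger layout. The point is just the order of the checks: the object pointer is tested before any member of it is read.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver types (assumption:
 * the real nouveau structures differ; only the guard pattern matters). */
struct therm_sketch;

struct therm_func_sketch {
	/* Optional hook; NULL when the chipset has no clockgating support. */
	void (*clkgate_fini)(struct therm_sketch *therm, bool suspend);
};

struct therm_sketch {
	const struct therm_func_sketch *func;
	bool clkgating_enabled;
	int fini_calls; /* bookkeeping for the example only */
};

/* Example hook implementation: just records that it ran. */
static void hw_clkgate_fini_sketch(struct therm_sketch *therm, bool suspend)
{
	(void)suspend;
	therm->fini_calls++;
}

/* Guarded wrapper. Returns true if the hook was invoked.
 * The NULL test on therm comes first, so a device without a therm
 * object never has therm->func or therm->clkgating_enabled read. */
static bool therm_clkgate_fini_sketch(struct therm_sketch *therm, bool suspend)
{
	if (!therm || !therm->func || !therm->func->clkgate_fini ||
	    !therm->clkgating_enabled)
		return false;

	therm->func->clkgate_fini(therm, suspend);
	return true;
}
```

If, as Pierre notes, older chipsets have no therm object at all, a caller would end up passing NULL, and a wrapper written this way returns early instead of faulting on the first member access. Putting the check in the caller instead would work too, at the cost of repeating it at every call site.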
* Re: [Nouveau] 4.16-rc1: UBSAN warning in nouveau/nvkm/subdev/therm/base.c + oops in nvkm_therm_clkgate_fini
From: Lyude Paul @ 2018-02-14 19:11 UTC
To: Pierre Moreau, Ilia Mirkin
Cc: Meelis Roos, nouveau, Ben Skeggs, dri-devel, Linux Kernel list

Actually, this was brought up to me already. There's a fix for this from
NVIDIA on the mailing list, which I reviewed a little while ago, that we
should pull in:

https://patchwork.freedesktop.org/patch/203205/

Would you guys mind confirming that this patch fixes your issues?

On Wed, 2018-02-14 at 18:41 +0100, Pierre Moreau wrote:
> On 2018-02-14 09:36, Ilia Mirkin wrote:
> > On Wed, Feb 14, 2018 at 9:35 AM, Ilia Mirkin <imirkin@alum.mit.edu> wrote:
> > > On Wed, Feb 14, 2018 at 9:29 AM, Meelis Roos <mroos@linux.ee> wrote:
> > > > > This is 4.16-rc1 + today's git on a lowly P4 with NV5, worked fine in
> > > > > 4.15:
> > > >
> > > > NV5 in another PC (secondary card in x86-64) made the system crash on
> > > > boot, in nvkm_therm_clkgate_fini.
> > >
> > > Mind booting with nouveau.debug=trace? That should hopefully tell us
> > > more exactly which thing is dying. If you have a cross-compile/distcc
> > > setup handy, a bisect may be even more useful.
> >
> > Erm, sorry, never mind. You even said it -- nvkm_therm_clkgate_fini is
> > somehow mis-hooked up for NV5 now. A bisect result would still make
> > the culprit a lot more obvious.
>
> CC'ing Lyude Paul as she hooked up the clockgating support.
>
> Looking at the code, only NV40+ have a therm engine. Therefore, shouldn't
> nvkm_therm_clkgate_enable(), nvkm_therm_clkgate_fini() and
> nvkm_therm_clkgate_oneinit() all check for therm being not NULL, on top of
> their check for the clkgate_* hooks being there? Or instead, maybe have the
> check in nvkm_device_init()?
>
> Pierre

-- 
Cheers,
	Lyude Paul

* Re: [Nouveau] 4.16-rc1: UBSAN warning in nouveau/nvkm/subdev/therm/base.c + oops in nvkm_therm_clkgate_fini
From: Meelis Roos @ 2018-02-14 20:59 UTC
To: Lyude Paul
Cc: Pierre Moreau, Ilia Mirkin, nouveau, Ben Skeggs, dri-devel, Linux Kernel list

> Actually this was brought up to me already, there's a fix on the mailing list
> for this I reviewed a little while ago from nvidia that we should pull in:
>
> https://patchwork.freedesktop.org/patch/203205/
>
> Would you guys mind confirming that this patch fixes your issues?

It works on my amd64; the P4 is still compiling.

[ 1.124987] nouveau 0000:04:05.0: NVIDIA NV05 (20154000)
[ 1.161464] nouveau 0000:04:05.0: bios: version 03.05.00.10.00
[ 1.161475] nouveau 0000:04:05.0: bios: DCB table not found
[ 1.161535] nouveau 0000:04:05.0: bios: DCB table not found
[ 1.161577] nouveau 0000:04:05.0: bios: DCB table not found
[ 1.161586] nouveau 0000:04:05.0: bios: DCB table not found
[ 1.344008] tsc: Refined TSC clocksource calibration: 2200.078 MHz
[ 1.344024] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x1fb67c69f81, max_idle_ns: 440795210317 ns
[ 1.344037] clocksource: Switched to clocksource tsc
[ 1.408102] nouveau 0000:04:05.0: tmr: unknown input clock freq
[ 1.409471] nouveau 0000:04:05.0: fb: 32 MiB SDRAM
[ 1.414459] nouveau 0000:04:05.0: DRM: VRAM: 31 MiB
[ 1.414467] nouveau 0000:04:05.0: DRM: GART: 128 MiB
[ 1.414476] nouveau 0000:04:05.0: DRM: BMP version 5.17
[ 1.414484] nouveau 0000:04:05.0: DRM: No DCB data found in VBIOS
[ 1.415629] nouveau 0000:04:05.0: DRM: Adaptor not initialised, running VBIOS init tables.
[ 1.415829] nouveau 0000:04:05.0: bios: DCB table not found
[ 1.416125] nouveau 0000:04:05.0: DRM: Saving VGA fonts
[ 1.477526] nouveau 0000:04:05.0: DRM: No DCB data found in VBIOS
[ 1.478428] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 1.478438] [drm] Driver supports precise vblank timestamp query.
[ 1.479618] nouveau 0000:04:05.0: DRM: MM: using M2MF for buffer copies
[ 1.517930] nouveau 0000:04:05.0: DRM: allocated 1024x768 fb: 0x4000, bo 00000000a09f4d1f
[ 1.519294] nouveau 0000:04:05.0: fb1: nouveaufb frame buffer device
[ 1.519313] [drm] Initialized nouveau 1.3.1 20120801 for 0000:04:05.0 on minor 1

-- 
Meelis Roos (mroos@linux.ee)