* Adreno crash on i.MX53 running 5.3-rc6
@ 2019-09-02 13:51 Fabio Estevam
  2019-09-02 14:45 ` Robin Murphy
  0 siblings, 1 reply; 13+ messages in thread
From: Fabio Estevam @ 2019-09-02 13:51 UTC (permalink / raw)
  To: Jonathan Marek, Chris Healy, Rob Clark, jcrouse; +Cc: DRI mailing list

Hi,

I am getting the following crash when booting the adreno driver on
i.MX53 running a 5.3-rc6 kernel.

This error does not happen with 5.2, though.

Before I start running a bisect, I am wondering if anyone has any
ideas about this issue.
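
In case it comes to that, the bisect I have in mind would be roughly
the following (from memory, so treat it as a sketch):

```sh
# Hypothetical bisect session; endpoints taken from the versions above.
git bisect start
git bisect bad v5.3-rc6      # crashes
git bisect good v5.2         # boots fine
# then build and boot each step, and mark it:
#   git bisect good   (or)   git bisect bad
```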

Thanks,

Fabio Estevam

[    2.083249] 8<--- cut here ---
[    2.086460] Unable to handle kernel paging request at virtual
address 50001000
[    2.094174] pgd = (ptrval)
[    2.096911] [50001000] *pgd=00000000
[    2.100606] Internal error: Oops: 805 [#1] SMP ARM
[    2.105412] Modules linked in:
[    2.108487] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
5.3.0-rc6-00271-g9f159ae07f07 #4
[    2.116411] Hardware name: Freescale i.MX53 (Device Tree Support)
[    2.122538] PC is at v7_dma_clean_range+0x20/0x38
[    2.127254] LR is at __dma_page_cpu_to_dev+0x28/0x90
[    2.132226] pc : [<c011c76c>]    lr : [<c01181c4>]    psr: 20000013
[    2.138500] sp : d80b5a88  ip : de96c000  fp : d840ce6c
[    2.143732] r10: 00000000  r9 : 00000001  r8 : d843e010
[    2.148964] r7 : 00000000  r6 : 00008000  r5 : ddb6c000  r4 : 00000000
[    2.155500] r3 : 0000003f  r2 : 00000040  r1 : 50008000  r0 : 50001000
[    2.162037] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[    2.169180] Control: 10c5387d  Table: 70004019  DAC: 00000051
[    2.174934] Process swapper/0 (pid: 1, stack limit = 0x(ptrval))
[    2.180949] Stack: (0xd80b5a88 to 0xd80b6000)
[    2.185319] 5a80:                   c011c7bc d8491780 d840ce6c
d849b380 00000000 c011822c
[    2.193509] 5aa0: c0d01a18 c0118abc c0118a78 d84a0200 00000008
c1308908 d838e800 d849a4a8
[    2.201697] 5ac0: d8491780 c06699b4 ffffffff ffffffff 00000000
d8491600 d80b5b20 d84a0200
[    2.209886] 5ae0: d8491780 d8491600 d80b5b20 d8491600 d849a4a8
d84a0200 00000003 d84a0358
[    2.218077] 5b00: c1308908 d8491600 d849a4a8 d8491780 d840ce6c
c066a55c c1308908 c066a104
[    2.226266] 5b20: 01001000 00000000 d84a0200 10700ac6 d849a480
d84a0200 00000000 d8491600
[    2.234455] 5b40: 00000000 e0845000 c1308908 c066a72c d849a480
d840ce6c d840ce00 c1308908
[    2.242643] 5b60: 00000000 c066b584 d849a488 d849a4a8 00000000
c1308908 d840ce6c c066ff40
[    2.250832] 5b80: d849a488 d849a4a8 00000000 c1308908 00000000
d81b4000 00000000 e0845000
[    2.259021] 5ba0: d838e800 c1308908 d8491600 10700ac6 d80b5bc8
d840ce00 d840ce6c 00000001
[    2.267210] 5bc0: 00000000 e0845000 d838e800 c066ece4 01000000
00000000 10ff0000 00000000
[    2.275399] 5be0: c1308908 00000001 d81b4000 00000000 01000000
00000000 00000001 10700ac6
[    2.283587] 5c00: c0d6d564 d840ce00 d81b4010 00000001 d81b4000
c0d6d564 c1308908 d80b5c48
[    2.291777] 5c20: d838e800 c061f9cc c1029dec d80b5c48 d838e800
00000000 00000000 c13e8788
[    2.299965] 5c40: ffffffff c1308928 c102a234 00000000 01000000
00000000 10ff0000 00000000
[    2.308154] 5c60: 00000001 00000000 a0000013 10700ac6 c13b7658
d840ce00 d838e800 d81b4000
[    2.316343] 5c80: d840ce00 c1308908 00000002 d838f800 00000000
c0620514 00000001 10700ac6
[    2.324531] 5ca0: d8496440 00000000 d81b4010 c1aa1c00 d838e800
c061e070 00000000 00000000
[    2.332720] 5cc0: 00000000 c0d6c534 df56cf34 000000c8 00000000
10700ac6 d81b4010 00000000
[    2.340909] 5ce0: 00000000 d8496440 d838e800 c103acd0 d8496280
00000000 c1380488 c06a3e10
[    2.349097] 5d00: 00000000 00000000 ffffffff d838f800 d838e800
d843e010 d8496440 c1308908
[    2.357286] 5d20: 00000000 d83f9640 c1380488 c0668554 00000006
00000007 c13804d4 d83f9640
[    2.365475] 5d40: c1380488 c017ec18 d80c0000 c0c43e40 d843e010
d8496440 00000001 c0182a94
[    2.373665] 5d60: 60000013 10700ac6 d843e010 d8496280 d8496400
00000018 d8496440 00000001
[    2.381854] 5d80: c13804d4 d83f9640 c1380488 c06a4280 c1380488
00000000 c0d764f8 d8496440
[    2.390044] 5da0: c1380488 d843e010 c0d764f8 c1308908 00000000
00000000 c13ef300 c06a44f0
[    2.398232] 5dc0: c0d8a0dc dffcc6f0 d843e010 dffcc6f0 00000000
d843e010 00000000 c06680b8
[    2.406421] 5de0: d84988c0 d83f9640 d84988c0 d84989a0 d8498230
10700ac6 00000001 d843e010
[    2.414610] 5e00: 00000000 c137eec0 00000000 c137eec0 00000000
00000000 c13ef300 c06ac1a0
[    2.422799] 5e20: d843e010 c1aa40dc c1aa40e0 00000000 c137eec0
c06aa014 d843e010 c137eec0
[    2.430988] 5e40: c137eec0 c1308908 c13e9880 c13e85d4 00000000
c06aa368 c1308908 c13e9880
[    2.439178] 5e60: c13e85d4 d843e010 00000000 c137eec0 c1308908
c13e9880 c13e85d4 c06aa618
[    2.447367] 5e80: 00000000 c137eec0 d843e010 c06aa6a4 00000000
c137eec0 c06aa620 c06a844c
[    2.455556] 5ea0: d80888d4 d80888a4 d84914d0 10700ac6 d80888d4
c137eec0 d8494f00 c1380d28
[    2.463745] 5ec0: 00000000 c06a946c c105f3d4 c1308908 00000000
c137eec0 c1308908 00000000
[    2.471934] 5ee0: c125fdd0 c06ab304 c1308928 c1308908 00000000
c0103178 00000109 00000000
[    2.480123] 5f00: dffffc6e dffffc00 c1126860 00000109 00000109
c014dc88 c11253ac c10607a0
[    2.488312] 5f20: 00000000 00000006 00000006 00000000 c12adeec
dffffc6e 00000000 10700ac6
[    2.496501] 5f40: c1308f18 10700ac6 00000007 c13e9880 c13ef300
c1294850 c1308928 c12ae4c4
[    2.504690] 5f60: 00000000 c12011f8 00000006 00000006 00000000
c120066c 00000000 00000109
[    2.512878] 5f80: 00000000 00000000 c0c3bb28 00000000 00000000
00000000 00000000 00000000
[    2.521066] 5fa0: 00000000 c0c3bb30 00000000 c01010b4 00000000
00000000 00000000 00000000
[    2.529255] 5fc0: 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000
[    2.537443] 5fe0: 00000000 00000000 00000000 00000000 00000013
00000000 00000000 00000000
[    2.545640] [<c011c76c>] (v7_dma_clean_range) from [<c01181c4>]
(__dma_page_cpu_to_dev+0x28/0x90)
[    2.554526] [<c01181c4>] (__dma_page_cpu_to_dev) from [<c0118abc>]
(arm_dma_sync_sg_for_device+0x44/0x64)
[    2.564121] [<c0118abc>] (arm_dma_sync_sg_for_device) from
[<c06699b4>] (get_pages+0x1ac/0x214)
[    2.572834] [<c06699b4>] (get_pages) from [<c066a55c>]
(msm_gem_get_and_pin_iova+0xb0/0x13c)
[    2.581284] [<c066a55c>] (msm_gem_get_and_pin_iova) from
[<c066a72c>] (_msm_gem_kernel_new+0x38/0xa8)
[    2.590515] [<c066a72c>] (_msm_gem_kernel_new) from [<c066b584>]
(msm_gem_kernel_new+0x24/0x2c)
[    2.599230] [<c066b584>] (msm_gem_kernel_new) from [<c066ff40>]
(msm_ringbuffer_new+0x68/0x140)
[    2.607940] [<c066ff40>] (msm_ringbuffer_new) from [<c066ece4>]
(msm_gpu_init+0x430/0x5fc)
[    2.616220] [<c066ece4>] (msm_gpu_init) from [<c061f9cc>]
(adreno_gpu_init+0x16c/0x298)
[    2.624236] [<c061f9cc>] (adreno_gpu_init) from [<c0620514>]
(a2xx_gpu_init+0x84/0x104)
[    2.632252] [<c0620514>] (a2xx_gpu_init) from [<c061e070>]
(adreno_bind+0x190/0x274)
[    2.640018] [<c061e070>] (adreno_bind) from [<c06a3e10>]
(component_bind_all+0xe8/0x22c)
[    2.648124] [<c06a3e10>] (component_bind_all) from [<c0668554>]
(msm_drm_bind+0xf4/0x610)
[    2.656315] [<c0668554>] (msm_drm_bind) from [<c06a4280>]
(try_to_bring_up_master+0x158/0x198)
[    2.664940] [<c06a4280>] (try_to_bring_up_master) from [<c06a44f0>]
(component_master_add_with_match+0xb8/0xf8)
[    2.675042] [<c06a44f0>] (component_master_add_with_match) from
[<c06680b8>] (msm_pdev_probe+0x214/0x28c)
[    2.684630] [<c06680b8>] (msm_pdev_probe) from [<c06ac1a0>]
(platform_drv_probe+0x48/0x98)
[    2.692908] [<c06ac1a0>] (platform_drv_probe) from [<c06aa014>]
(really_probe+0xec/0x2cc)
[    2.701099] [<c06aa014>] (really_probe) from [<c06aa368>]
(driver_probe_device+0x5c/0x164)
[    2.709376] [<c06aa368>] (driver_probe_device) from [<c06aa618>]
(device_driver_attach+0x58/0x60)
[    2.718259] [<c06aa618>] (device_driver_attach) from [<c06aa6a4>]
(__driver_attach+0x84/0xc0)
[    2.726796] [<c06aa6a4>] (__driver_attach) from [<c06a844c>]
(bus_for_each_dev+0x70/0xb4)
[    2.734985] [<c06a844c>] (bus_for_each_dev) from [<c06a946c>]
(bus_add_driver+0x154/0x1e0)
[    2.743262] [<c06a946c>] (bus_add_driver) from [<c06ab304>]
(driver_register+0x74/0x108)
[    2.751369] [<c06ab304>] (driver_register) from [<c0103178>]
(do_one_initcall+0x80/0x32c)
[    2.759560] [<c0103178>] (do_one_initcall) from [<c12011f8>]
(kernel_init_freeable+0x2e4/0x3c8)
[    2.768278] [<c12011f8>] (kernel_init_freeable) from [<c0c3bb30>]
(kernel_init+0x8/0x114)
[    2.776469] [<c0c3bb30>] (kernel_init) from [<c01010b4>]
(ret_from_fork+0x14/0x20)
[    2.784046] Exception stack(0xd80b5fb0 to 0xd80b5ff8)
[    2.789107] 5fa0:                                     00000000
00000000 00000000 00000000
[    2.797295] 5fc0: 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000
[    2.805482] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[    2.812111] Code: e1a02312 e2423001 e1c00003 e320f000 (ee070f3a)
[    2.818319] ---[ end trace cdc18b3504e6a4f8 ]---
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

* Re: Adreno crash on i.MX53 running 5.3-rc6
  2019-09-02 13:51 Adreno crash on i.MX53 running 5.3-rc6 Fabio Estevam
@ 2019-09-02 14:45 ` Robin Murphy
  2019-09-02 18:03   ` Fabio Estevam
  0 siblings, 1 reply; 13+ messages in thread
From: Robin Murphy @ 2019-09-02 14:45 UTC (permalink / raw)
  To: Fabio Estevam, Jonathan Marek, Chris Healy, Rob Clark, jcrouse
  Cc: DRI mailing list

On 02/09/2019 14:51, Fabio Estevam wrote:
> Hi,
> 
> I am getting the following crash when booting the adreno driver on
> i.MX53 running a 5.3-rc6 kernel.
> 
> This error does not happen with 5.2, though.
> 
> Before I start running a bisect, I am wondering if anyone has any
> ideas about this issue.

Try 0036bc73ccbe - that looks like something that CONFIG_DMA_API_DEBUG 
should have been screaming about anyway.
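
If it isn't already enabled, the relevant options are a config
fragment along these lines (option names from memory, so worth
double-checking against your tree):

```
CONFIG_DMA_API_DEBUG=y
CONFIG_DMA_API_DEBUG_SG=y
```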

Robin.

* Re: Adreno crash on i.MX53 running 5.3-rc6
  2019-09-02 14:45 ` Robin Murphy
@ 2019-09-02 18:03   ` Fabio Estevam
  2019-09-03 15:22     ` Rob Clark
  0 siblings, 1 reply; 13+ messages in thread
From: Fabio Estevam @ 2019-09-02 18:03 UTC (permalink / raw)
  To: Robin Murphy; +Cc: DRI mailing list, Chris Healy, Jonathan Marek

Hi Robin,

On Mon, Sep 2, 2019 at 11:45 AM Robin Murphy <robin.murphy@arm.com> wrote:

> Try 0036bc73ccbe - that looks like something that CONFIG_DMA_API_DEBUG
> should have been screaming about anyway.

Thanks for your suggestion.

I can successfully boot after reverting the following commits:

commit 141db5703c887f46957615cd6616ca28fe4691e0 (HEAD)
Author: Fabio Estevam <festevam@gmail.com>
Date:   Mon Sep 2 14:58:18 2019 -0300

    Revert "drm/msm: stop abusing dma_map/unmap for cache"

    This reverts commit 0036bc73ccbe7e600a3468bf8e8879b122252274.

commit fa5b1f620f2984c254877d6049214c39c24c8207
Author: Fabio Estevam <festevam@gmail.com>
Date:   Mon Sep 2 14:56:01 2019 -0300

    Revert "drm/msm: Use the correct dma_sync calls in msm_gem"

    This reverts commit 3de433c5b38af49a5fc7602721e2ab5d39f1e69c.

Rob,

What would be the recommended approach for fixing this?

Thanks

* Re: Adreno crash on i.MX53 running 5.3-rc6
  2019-09-02 18:03   ` Fabio Estevam
@ 2019-09-03 15:22     ` Rob Clark
       [not found]       ` <95ae3680-02c7-a1b8-5ea6-1a4d89293649@marek.ca>
  0 siblings, 1 reply; 13+ messages in thread
From: Rob Clark @ 2019-09-03 15:22 UTC (permalink / raw)
  To: Fabio Estevam; +Cc: DRI mailing list, Robin Murphy, Chris Healy, Jonathan Marek

On Mon, Sep 2, 2019 at 11:03 AM Fabio Estevam <festevam@gmail.com> wrote:
>
> Hi Robin,
>
> On Mon, Sep 2, 2019 at 11:45 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> > Try 0036bc73ccbe - that looks like something that CONFIG_DMA_API_DEBUG
> > should have been screaming about anyway.
>
> Thanks for your suggestion.
>
> I can successfully boot after reverting the following commits:
>
> commit 141db5703c887f46957615cd6616ca28fe4691e0 (HEAD)
> Author: Fabio Estevam <festevam@gmail.com>
> Date:   Mon Sep 2 14:58:18 2019 -0300
>
>     Revert "drm/msm: stop abusing dma_map/unmap for cache"
>
>     This reverts commit 0036bc73ccbe7e600a3468bf8e8879b122252274.
>
> commit fa5b1f620f2984c254877d6049214c39c24c8207
> Author: Fabio Estevam <festevam@gmail.com>
> Date:   Mon Sep 2 14:56:01 2019 -0300
>
>     Revert "drm/msm: Use the correct dma_sync calls in msm_gem"
>
>     This reverts commit 3de433c5b38af49a5fc7602721e2ab5d39f1e69c.
>
> Rob,
>
> What would be the recommended approach for fixing this?
>

We need a direct way to handle cache maintenance, so we can stop
trying to trick the DMA API into doing what we want.

Something like this is what I had in mind:

https://patchwork.freedesktop.org/series/65211/

I guess I could respin that.  I'm not really sure of any other way to
have things working on the different combinations of archs and dma_ops
that we have.  Lately fixing one has been breaking another.

BR,
-R

* Re: Adreno crash on i.MX53 running 5.3-rc6
       [not found]       ` <95ae3680-02c7-a1b8-5ea6-1a4d89293649@marek.ca>
@ 2019-09-03 19:32         ` Fabio Estevam
  2019-09-04  0:12           ` Rob Clark
  0 siblings, 1 reply; 13+ messages in thread
From: Fabio Estevam @ 2019-09-03 19:32 UTC (permalink / raw)
  To: Jonathan Marek; +Cc: DRI mailing list, Robin Murphy, Chris Healy

Hi Jonathan,

On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek <jonathan@marek.ca> wrote:
>
> Hi,
>
> I tried this and it works with patches 4+5 from Rob's series and
> changing gpummu to use sg_phys(sg) instead of sg->dma_address
> (dma_address isn't set now that dma_map_sg isn't used).
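
For context, the sort of gpummu change described above would
presumably look something like this sketch (locals and surrounding
code are illustrative, not the actual hunk):

```c
/* Illustrative sketch only: walk the sg_table using the CPU physical
 * address instead of the DMA address, which is no longer set once the
 * driver stops calling dma_map_sg(). */
struct scatterlist *sg;
unsigned int i;

for_each_sg(sgt->sgl, sg, sgt->nents, i) {
	phys_addr_t pa = sg_phys(sg);	/* was: sg->dma_address */

	/* ... program 'sg->length' bytes at 'pa' into the gpummu tables ... */
}
```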

Thanks for testing it. I haven't had a chance to test it yet.

Rob,

I assume your series is targeted to 5.4, correct?

If this is the case, what should we do about the i.MX5 regression on 5.3?

Would a revert of the two commits be acceptable in 5.3 in order to
avoid the regression?

Please advise.

Thanks

* Re: Adreno crash on i.MX53 running 5.3-rc6
  2019-09-03 19:32         ` Fabio Estevam
@ 2019-09-04  0:12           ` Rob Clark
  2019-09-04  0:31             ` Fabio Estevam
  2019-09-04 18:06             ` Robin Murphy
  0 siblings, 2 replies; 13+ messages in thread
From: Rob Clark @ 2019-09-04  0:12 UTC (permalink / raw)
  To: Fabio Estevam; +Cc: DRI mailing list, Robin Murphy, Chris Healy, Jonathan Marek

On Tue, Sep 3, 2019 at 12:31 PM Fabio Estevam <festevam@gmail.com> wrote:
>
> Hi Jonathan,
>
> On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek <jonathan@marek.ca> wrote:
> >
> > Hi,
> >
> > I tried this and it works with patches 4+5 from Rob's series and
> > changing gpummu to use sg_phys(sg) instead of sg->dma_address
> > (dma_address isn't set now that dma_map_sg isn't used).
>
> Thanks for testing it. I haven't had a chance to test it yet.
>
> Rob,
>
> I assume your series is targeted to 5.4, correct?

maybe, although Christoph Hellwig didn't seem like a big fan of
exposing cache ops, and would rather add a new allocation API for
uncached pages.. so I'm not entirely sure what the way forward will
be.

In the meantime, it is a bit ugly, but I guess something like this should work:

--------------------
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 7263f4373f07..5a6a79fbc9d6 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -52,7 +52,7 @@ static void sync_for_device(struct msm_gem_object *msm_obj)
 {
     struct device *dev = msm_obj->base.dev->dev;

-    if (get_dma_ops(dev)) {
+    if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) {
         dma_sync_sg_for_device(dev, msm_obj->sgt->sgl,
             msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
     } else {
@@ -65,7 +65,7 @@ static void sync_for_cpu(struct msm_gem_object *msm_obj)
 {
     struct device *dev = msm_obj->base.dev->dev;

-    if (get_dma_ops(dev)) {
+    if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) {
         dma_sync_sg_for_cpu(dev, msm_obj->sgt->sgl,
             msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
     } else {
--------------------

BR,
-R


* Re: Adreno crash on i.MX53 running 5.3-rc6
  2019-09-04  0:12           ` Rob Clark
@ 2019-09-04  0:31             ` Fabio Estevam
  2019-09-04 18:06             ` Robin Murphy
  1 sibling, 0 replies; 13+ messages in thread
From: Fabio Estevam @ 2019-09-04  0:31 UTC (permalink / raw)
  To: Rob Clark; +Cc: DRI mailing list, Robin Murphy, Chris Healy, Jonathan Marek

Hi Rob,

On Tue, Sep 3, 2019 at 9:12 PM Rob Clark <robdclark@gmail.com> wrote:

> In the meantime, it is a bit ugly, but I guess something like this should work:

Yes, this works on an i.MX53 board, thanks:

Tested-by: Fabio Estevam <festevam@gmail.com>

Is this something you could submit for 5.3?

Thanks

* Re: Adreno crash on i.MX53 running 5.3-rc6
  2019-09-04  0:12           ` Rob Clark
  2019-09-04  0:31             ` Fabio Estevam
@ 2019-09-04 18:06             ` Robin Murphy
  2019-09-05 17:03               ` Rob Clark
  1 sibling, 1 reply; 13+ messages in thread
From: Robin Murphy @ 2019-09-04 18:06 UTC (permalink / raw)
  To: Rob Clark, Fabio Estevam; +Cc: DRI mailing list, Chris Healy, Jonathan Marek

On 04/09/2019 01:12, Rob Clark wrote:
> On Tue, Sep 3, 2019 at 12:31 PM Fabio Estevam <festevam@gmail.com> wrote:
>>
>> Hi Jonathan,
>>
>> On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek <jonathan@marek.ca> wrote:
>>>
>>> Hi,
>>>
>>> I tried this and it works with patches 4+5 from Rob's series and
>>> changing gpummu to use sg_phys(sg) instead of sg->dma_address
>>> (dma_address isn't set now that dma_map_sg isn't used).
>>
>> Thanks for testing it. I haven't had a chance to test it yet.
>>
>> Rob,
>>
>> I assume your series is targeted to 5.4, correct?
> 
> maybe, although Christoph Hellwig didn't seem like a big fan of
> exposing cache ops, and would rather add a new allocation API for
> uncached pages.. so I'm not entirely sure what the way forward will
> be.

TBH, the use of map/unmap looked reasonable in the context of 
"start/stop using these pages for stuff which may include DMA", so even 
if it was cheekily ignoring sg->dma_address I'm not sure I'd really 
consider it "abuse" - in comparison, using sync without a prior map 
unquestionably violates the API, and means that CONFIG_DMA_API_DEBUG 
will be rendered useless with false positives if this driver is active 
while trying to debug something else.

The warning referenced in 0036bc73ccbe represents something being 
unmapped which didn't match a corresponding map - from what I can make 
of get_pages()/put_pages() it looks like that would need msm_obj->flags 
or msm_obj->sgt to change during the lifetime of the object, neither of 
which sounds like a thing that should legitimately happen. Are you sure 
this isn't all just hiding a subtle bug elsewhere? After all, if what 
was being unmapped wasn't right, who says that what's now being synced is?

Robin.


* Re: Adreno crash on i.MX53 running 5.3-rc6
  2019-09-04 18:06             ` Robin Murphy
@ 2019-09-05 17:03               ` Rob Clark
  2019-09-05 19:05                 ` Rob Clark
  0 siblings, 1 reply; 13+ messages in thread
From: Rob Clark @ 2019-09-05 17:03 UTC (permalink / raw)
  To: Robin Murphy; +Cc: DRI mailing list, Chris Healy, Jonathan Marek

On Wed, Sep 4, 2019 at 11:06 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 04/09/2019 01:12, Rob Clark wrote:
> > On Tue, Sep 3, 2019 at 12:31 PM Fabio Estevam <festevam@gmail.com> wrote:
> >>
> >> Hi Jonathan,
> >>
> >> On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek <jonathan@marek.ca> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I tried this and it works with patches 4+5 from Rob's series and
> >>> changing gpummu to use sg_phys(sg) instead of sg->dma_address
> >>> (dma_address isn't set now that dma_map_sg isn't used).
> >>
> >> Thanks for testing it. I haven't had a chance to test it yet.
> >>
> >> Rob,
> >>
> >> I assume your series is targeted to 5.4, correct?
> >
> > maybe, although Christoph Hellwig didn't seem like a big fan of
> > exposing cache ops, and would rather add a new allocation API for
> > uncached pages.. so I'm not entirely sure what the way forward will
> > be.
>
> TBH, the use of map/unmap looked reasonable in the context of
> "start/stop using these pages for stuff which may include DMA", so even
> if it was cheekily ignoring sg->dma_address I'm not sure I'd really
> consider it "abuse" - in comparison, using sync without a prior map
> unquestionably violates the API, and means that CONFIG_DMA_API_DEBUG
> will be rendered useless with false positives if this driver is active
> while trying to debug something else.
>
> The warning referenced in 0036bc73ccbe represents something being
> unmapped which didn't match a corresponding map - from what I can make
> of get_pages()/put_pages() it looks like that would need msm_obj->flags
> or msm_obj->sgt to change during the lifetime of the object, neither of
> which sounds like a thing that should legitimately happen. Are you sure
> this isn't all just hiding a subtle bug elsewhere? After all, if what
> was being unmapped wasn't right, who says that what's now being synced is?
>

Correct, msm_obj->flags/sgt should not change.

I reverted the various patches, and went back to the original setup
that used dma_{map,unmap}_sg() to reproduce the original issue that
prompted the change in the first place.  It is a pretty massive flood
of splats, which pretty quickly overflowed the dmesg ring buffer, so I
might be missing some things, but I'll poke around some more.

The one thing I wonder about is what would happen if the buffer is
allocated and dma_map_sg() called before drm/msm attaches its own
iommu_domains, and then dma_unmap_sg() afterwards.  We aren't actually
ever using the iommu domain that DMA API is creating for the device,
so all the extra iommu_map/unmap (and tlb flush) is at best
unnecessary.  But I'm not sure if it could be having some unintended
side effects that cause this sort of problem.

BR,
-R

* Re: Adreno crash on i.MX53 running 5.3-rc6
  2019-09-05 17:03               ` Rob Clark
@ 2019-09-05 19:05                 ` Rob Clark
  2019-09-05 22:30                   ` Rob Clark
  2019-09-10 14:35                   ` Robin Murphy
  0 siblings, 2 replies; 13+ messages in thread
From: Rob Clark @ 2019-09-05 19:05 UTC (permalink / raw)
  To: Robin Murphy; +Cc: DRI mailing list, Chris Healy, Jonathan Marek

On Thu, Sep 5, 2019 at 10:03 AM Rob Clark <robdclark@gmail.com> wrote:
>
> On Wed, Sep 4, 2019 at 11:06 AM Robin Murphy <robin.murphy@arm.com> wrote:
> >
> > On 04/09/2019 01:12, Rob Clark wrote:
> > > On Tue, Sep 3, 2019 at 12:31 PM Fabio Estevam <festevam@gmail.com> wrote:
> > >>
> > >> Hi Jonathan,
> > >>
> > >> On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek <jonathan@marek.ca> wrote:
> > >>>
> > >>> Hi,
> > >>>
> > >>> I tried this and it works with patches 4+5 from Rob's series and
> > >>> changing gpummu to use sg_phys(sg) instead of sg->dma_address
> > >>> (dma_address isn't set now that dma_map_sg isn't used).
> > >>
> > >> Thanks for testing it. I haven't had a chance to test it yet.
> > >>
> > >> Rob,
> > >>
> > >> I assume your series is targeted to 5.4, correct?
> > >
> > > maybe, although Christoph Hellwig didn't seem like a big fan of
> > > exposing cache ops, and would rather add a new allocation API for
> > > uncached pages.. so I'm not entirely sure what the way forward will
> > > be.
> >
> > TBH, the use of map/unmap looked reasonable in the context of
> > "start/stop using these pages for stuff which may include DMA", so even
> > if it was cheekily ignoring sg->dma_address I'm not sure I'd really
> > consider it "abuse" - in comparison, using sync without a prior map
> > unquestionably violates the API, and means that CONFIG_DMA_API_DEBUG
> > will be rendered useless with false positives if this driver is active
> > while trying to debug something else.
> >
> > The warning referenced in 0036bc73ccbe represents something being
> > unmapped which didn't match a corresponding map - from what I can make
> > of get_pages()/put_pages() it looks like that would need msm_obj->flags
> > or msm_obj->sgt to change during the lifetime of the object, neither of
> > which sounds like a thing that should legitimately happen. Are you sure
> > this isn't all just hiding a subtle bug elsewhere? After all, if what
> > was being unmapped wasn't right, who says that what's now being synced is?
> >
>
> Correct, msm_obj->flags/sgt should not change.
>
> I reverted the various patches, and went back to the original setup
> that used dma_{map,unmap}_sg() to reproduce the original issue that
> prompted the change in the first place.  It is a pretty massive flood
> of splats, which pretty quickly overflowed the dmesg ring buffer, so I
> might be missing some things, but I'll poke around some more.
>
> The one thing I wonder about is what would happen if the buffer is
> allocated and dma_map_sg() called before drm/msm attaches its own
> iommu_domains, and then dma_unmap_sg() afterwards.  We aren't actually
> ever using the iommu domain that DMA API is creating for the device,
> so all the extra iommu_map/unmap (and tlb flush) is at best
> unnecessary.  But I'm not sure if it could be having some unintended
> side effects that cause this sort of problem.
>

It seems like every time (or at least every time we splat), we end up
w/ iova=fffffffffffff000, which doesn't sound likely to be right.
Although from just looking at the dma-iommu.c code, I'm not sure how
this happens.  And adding some printks results in enough traces that
I can't boot, for some reason.

BR,
-R

* Re: Adreno crash on i.MX53 running 5.3-rc6
  2019-09-05 19:05                 ` Rob Clark
@ 2019-09-05 22:30                   ` Rob Clark
  2019-09-07  1:21                     ` Rob Clark
  2019-09-10 14:35                   ` Robin Murphy
  1 sibling, 1 reply; 13+ messages in thread
From: Rob Clark @ 2019-09-05 22:30 UTC (permalink / raw)
  To: Robin Murphy; +Cc: DRI mailing list, Chris Healy, Jonathan Marek

On Thu, Sep 5, 2019 at 12:05 PM Rob Clark <robdclark@gmail.com> wrote:
>
> On Thu, Sep 5, 2019 at 10:03 AM Rob Clark <robdclark@gmail.com> wrote:
> >
> > On Wed, Sep 4, 2019 at 11:06 AM Robin Murphy <robin.murphy@arm.com> wrote:
> > >
> > > On 04/09/2019 01:12, Rob Clark wrote:
> > > > On Tue, Sep 3, 2019 at 12:31 PM Fabio Estevam <festevam@gmail.com> wrote:
> > > >>
> > > >> Hi Jonathan,
> > > >>
> > > >> On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek <jonathan@marek.ca> wrote:
> > > >>>
> > > >>> Hi,
> > > >>>
> > > >>> I tried this and it works with patches 4+5 from Rob's series and
> > > >>> changing gpummu to use sg_phys(sg) instead of sg->dma_address
> > > >>> (dma_address isn't set now that dma_map_sg isn't used).
> > > >>
> > > >> Thanks for testing it. I haven't had a chance to test it yet.
> > > >>
> > > >> Rob,
> > > >>
> > > >> I assume your series is targeted to 5.4, correct?
> > > >
> > > > maybe, although Christoph Hellwig didn't seem like a big fan of
> > > > exposing cache ops, and would rather add a new allocation API for
> > > > uncached pages.. so I'm not entirely sure what the way forward will
> > > > be.
> > >
> > > TBH, the use of map/unmap looked reasonable in the context of
> > > "start/stop using these pages for stuff which may include DMA", so even
> > > if it was cheekily ignoring sg->dma_address I'm not sure I'd really
> > > consider it "abuse" - in comparison, using sync without a prior map
> > > unquestionably violates the API, and means that CONFIG_DMA_API_DEBUG
> > > will be rendered useless with false positives if this driver is active
> > > while trying to debug something else.
> > >
> > > The warning referenced in 0036bc73ccbe represents something being
> > > unmapped which didn't match a corresponding map - from what I can make
> > > of get_pages()/put_pages() it looks like that would need msm_obj->flags
> > > or msm_obj->sgt to change during the lifetime of the object, neither of
> > > which sounds like a thing that should legitimately happen. Are you sure
> > > this isn't all just hiding a subtle bug elsewhere? After all, if what
> > > was being unmapped wasn't right, who says that what's now being synced is?
> > >
> >
> > Correct, msm_obj->flags/sgt should not change.
> >
> > I reverted the various patches, and went back to the original setup
> > that used dma_{map,unmap}_sg() to reproduce the original issue that
> > prompted the change in the first place.  It is a pretty massive flood
> > of splats, which pretty quickly overflowed the dmesg ring buffer, so I
> > might be missing some things, but I'll poke around some more.
> >
> > The one thing I wonder about is what would happen if the buffer is
> > allocated and dma_map_sg() called before drm/msm attaches its own
> > iommu_domains, and then dma_unmap_sg() afterwards.  We aren't actually
> > ever using the iommu domain that DMA API is creating for the device,
> > so all the extra iommu_map/unmap (and tlb flush) is at best
> > unnecessary.  But I'm not sure if it could be having some unintended
> > side effects that cause this sort of problem.
> >
>
> it seems like every time (or at least every time we splat), we end up
> w/ iova=fffffffffffff000 .. which doesn't sound likely to be right.
> Although from just looking at the dma-iommu.c code, I'm not sure how
> this happens.  And adding some printk's results in enough traces that
> I can't boot for some reason..
>

Ok, I see better what is going on now, at least on the kernel that I'm
using on the Yoga C630 laptop, where I have a patch[1] to skip domain
attach.  That results in to_smmu_domain(domain)->pgtbl_ops being NULL,
so arm_smmu_map() fails.  We therefore skip __finalise_sg(), which is
what sets sg_dma_address(), and that in turn causes the failure on unmap.

That said, I'm pretty sure I've seen (or had reported) a similar splat
(although maybe not so frequent) on devices without that patch (where
the bootloader isn't enabling scanout).  I'll have to switch over to a
different device that doesn't light up display from bootloader, so
that I can drop that skip-domain-attach patch

All that said, this would be much easier if I could do the cache
operations without all this unneeded iommu stuff.  (Not to mention the
unnecessary TLB flushes that I suspect are also happening.)

[1] https://patchwork.kernel.org/patch/11038793/

BR,
-R

* Re: Adreno crash on i.MX53 running 5.3-rc6
  2019-09-05 22:30                   ` Rob Clark
@ 2019-09-07  1:21                     ` Rob Clark
  0 siblings, 0 replies; 13+ messages in thread
From: Rob Clark @ 2019-09-07  1:21 UTC (permalink / raw)
  To: Robin Murphy; +Cc: DRI mailing list, Chris Healy, Jonathan Marek

On Thu, Sep 5, 2019 at 3:30 PM Rob Clark <robdclark@gmail.com> wrote:
>
> On Thu, Sep 5, 2019 at 12:05 PM Rob Clark <robdclark@gmail.com> wrote:
> >
> > On Thu, Sep 5, 2019 at 10:03 AM Rob Clark <robdclark@gmail.com> wrote:
> > >
> > > On Wed, Sep 4, 2019 at 11:06 AM Robin Murphy <robin.murphy@arm.com> wrote:
> > > >
> > > > On 04/09/2019 01:12, Rob Clark wrote:
> > > > > On Tue, Sep 3, 2019 at 12:31 PM Fabio Estevam <festevam@gmail.com> wrote:
> > > > >>
> > > > >> Hi Jonathan,
> > > > >>
> > > > >> On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek <jonathan@marek.ca> wrote:
> > > > >>>
> > > > >>> Hi,
> > > > >>>
> > > > >>> I tried this and it works with patches 4+5 from Rob's series and
> > > > >>> changing gpummu to use sg_phys(sg) instead of sg->dma_address
> > > > >>> (dma_address isn't set now that dma_map_sg isn't used).
> > > > >>
> > > > >> Thanks for testing it. I haven't had a chance to test it yet.
> > > > >>
> > > > >> Rob,
> > > > >>
> > > > >> I assume your series is targeted to 5.4, correct?
> > > > >
> > > > > maybe, although Christoph Hellwig didn't seem like a big fan of
> > > > > exposing cache ops, and would rather add a new allocation API for
> > > > > uncached pages.. so I'm not entirely sure what the way forward will
> > > > > be.
> > > >
> > > > TBH, the use of map/unmap looked reasonable in the context of
> > > > "start/stop using these pages for stuff which may include DMA", so even
> > > > if it was cheekily ignoring sg->dma_address I'm not sure I'd really
> > > > consider it "abuse" - in comparison, using sync without a prior map
> > > > unquestionably violates the API, and means that CONFIG_DMA_API_DEBUG
> > > > will be rendered useless with false positives if this driver is active
> > > > while trying to debug something else.
> > > >
> > > > The warning referenced in 0036bc73ccbe represents something being
> > > > unmapped which didn't match a corresponding map - from what I can make
> > > > of get_pages()/put_pages() it looks like that would need msm_obj->flags
> > > > or msm_obj->sgt to change during the lifetime of the object, neither of
> > > > which sounds like a thing that should legitimately happen. Are you sure
> > > > this isn't all just hiding a subtle bug elsewhere? After all, if what
> > > > was being unmapped wasn't right, who says that what's now being synced is?
> > > >
> > >
> > > Correct, msm_obj->flags/sgt should not change.
> > >
> > > I reverted the various patches, and went back to the original setup
> > > that used dma_{map,unmap}_sg() to reproduce the original issue that
> > > prompted the change in the first place.  It is a pretty massive flood
> > > of splats, which pretty quickly overflowed the dmesg ring buffer, so I
> > > might be missing some things, but I'll poke around some more.
> > >
> > > The one thing I wonder about is what would happen if the buffer is
> > > allocated and dma_map_sg() called before drm/msm attaches its own
> > > iommu_domains, and then dma_unmap_sg() afterwards.  We aren't actually
> > > ever using the iommu domain that DMA API is creating for the device,
> > > so all the extra iommu_map/unmap (and tlb flush) is at best
> > > unnecessary.  But I'm not sure if it could be having some unintended
> > > side effects that cause this sort of problem.
> > >
> >
> > it seems like every time (or at least every time we splat), we end up
> > w/ iova=fffffffffffff000 .. which doesn't sound likely to be right.
> > Although from just looking at the dma-iommu.c code, I'm not sure how
> > this happens.  And adding some printk's results in enough traces that
> > I can't boot for some reason..
> >
>
> Ok, I see better what is going on.. at least on the kernel that I'm
> using on the yoga c630 laptop, where I have a patch[1] to skip domain
> attach.  That results in to_smmu_domain(domain)->pgtbl_ops being null,
> so arm_smmu_map() fails.  So we skip __finalise_sg() which sets the
> sg_dma_address().  Which causes the failure on unmap.
>
> That said, I'm pretty sure I've seen (or had reported) a similar splat
> (although maybe not so frequent) on devices without that patch (where
> the bootloader isn't enabling scanout).  I'll have to switch over to a
> different device that doesn't light up display from bootloader, so
> that I can drop that skip-domain-attach patch
>
> All that said, this would be much easier if I could do the cache
> operations without all this unneeded iommu stuff.  (Not to mention the
> unnecessary TLB flushes that I suspect are also happening.)
>
> [1] https://patchwork.kernel.org/patch/11038793/
>

fwiw, with https://patchwork.freedesktop.org/series/63096/ we could go
back to simply using dma_{map,unmap}_sg() in all cases, as the iommu
dma_ops would no longer get in the way.

BR,
-R

* Re: Adreno crash on i.MX53 running 5.3-rc6
  2019-09-05 19:05                 ` Rob Clark
  2019-09-05 22:30                   ` Rob Clark
@ 2019-09-10 14:35                   ` Robin Murphy
  1 sibling, 0 replies; 13+ messages in thread
From: Robin Murphy @ 2019-09-10 14:35 UTC (permalink / raw)
  To: Rob Clark; +Cc: DRI mailing list, Chris Healy, Jonathan Marek

On 05/09/2019 20:05, Rob Clark wrote:
> On Thu, Sep 5, 2019 at 10:03 AM Rob Clark <robdclark@gmail.com> wrote:
>>
>> On Wed, Sep 4, 2019 at 11:06 AM Robin Murphy <robin.murphy@arm.com> wrote:
>>>
>>> On 04/09/2019 01:12, Rob Clark wrote:
>>>> On Tue, Sep 3, 2019 at 12:31 PM Fabio Estevam <festevam@gmail.com> wrote:
>>>>>
>>>>> Hi Jonathan,
>>>>>
>>>>> On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek <jonathan@marek.ca> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I tried this and it works with patches 4+5 from Rob's series and
>>>>>> changing gpummu to use sg_phys(sg) instead of sg->dma_address
>>>>>> (dma_address isn't set now that dma_map_sg isn't used).
>>>>>
>>>>> Thanks for testing it. I haven't had a chance to test it yet.
>>>>>
>>>>> Rob,
>>>>>
>>>>> I assume your series is targeted to 5.4, correct?
>>>>
>>>> maybe, although Christoph Hellwig didn't seem like a big fan of
>>>> exposing cache ops, and would rather add a new allocation API for
>>>> uncached pages.. so I'm not entirely sure what the way forward will
>>>> be.
>>>
>>> TBH, the use of map/unmap looked reasonable in the context of
>>> "start/stop using these pages for stuff which may include DMA", so even
>>> if it was cheekily ignoring sg->dma_address I'm not sure I'd really
>>> consider it "abuse" - in comparison, using sync without a prior map
>>> unquestionably violates the API, and means that CONFIG_DMA_API_DEBUG
>>> will be rendered useless with false positives if this driver is active
>>> while trying to debug something else.
>>>
>>> The warning referenced in 0036bc73ccbe represents something being
>>> unmapped which didn't match a corresponding map - from what I can make
>>> of get_pages()/put_pages() it looks like that would need msm_obj->flags
>>> or msm_obj->sgt to change during the lifetime of the object, neither of
>>> which sounds like a thing that should legitimately happen. Are you sure
>>> this isn't all just hiding a subtle bug elsewhere? After all, if what
>>> was being unmapped wasn't right, who says that what's now being synced is?
>>>
>>
>> Correct, msm_obj->flags/sgt should not change.
>>
>> I reverted the various patches, and went back to the original setup
>> that used dma_{map,unmap}_sg() to reproduce the original issue that
>> prompted the change in the first place.  It is a pretty massive flood
>> of splats, which pretty quickly overflowed the dmesg ring buffer, so I
>> might be missing some things, but I'll poke around some more.
>>
>> The one thing I wonder about is what would happen if the buffer is
>> allocated and dma_map_sg() called before drm/msm attaches its own
>> iommu_domains, and then dma_unmap_sg() afterwards.  We aren't actually
>> ever using the iommu domain that DMA API is creating for the device,
>> so all the extra iommu_map/unmap (and tlb flush) is at best
>> unnecessary.  But I'm not sure if it could be having some unintended
>> side effects that cause this sort of problem.

Right, one of the semi-intentional side-effects of 43c5bf11a610 is that 
iommu-dma no longer interferes with unmanaged domains - it will still go 
and make its own redundant mappings in the unattached default domain, 
but as long as the DMA API usage is fundamentally sound then it 
shouldn't actually get in the way.
> it seems like every time (or at least every time we splat), we end up
> w/ iova=fffffffffffff000 .. which doesn't sound likely to be right.
> Although from just looking at the dma-iommu.c code, I'm not sure how
> this happens.  And adding some printk's results in enough traces that
> I can't boot for some reason..

Yeah, that's a bogus IOVA for sure, so regardless of how we actually 
make Adreno happy it would still be interesting to figure out how it 
came about. Do you see any WARNs from io-pgtable-arm before the one from 
__iommu_dma_unmap()?

Robin.

end of thread, other threads:[~2019-09-10 14:35 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-02 13:51 Adreno crash on i.MX53 running 5.3-rc6 Fabio Estevam
2019-09-02 14:45 ` Robin Murphy
2019-09-02 18:03   ` Fabio Estevam
2019-09-03 15:22     ` Rob Clark
     [not found]       ` <95ae3680-02c7-a1b8-5ea6-1a4d89293649@marek.ca>
2019-09-03 19:32         ` Fabio Estevam
2019-09-04  0:12           ` Rob Clark
2019-09-04  0:31             ` Fabio Estevam
2019-09-04 18:06             ` Robin Murphy
2019-09-05 17:03               ` Rob Clark
2019-09-05 19:05                 ` Rob Clark
2019-09-05 22:30                   ` Rob Clark
2019-09-07  1:21                     ` Rob Clark
2019-09-10 14:35                   ` Robin Murphy
