* ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support")
From: Leo Yan @ 2017-10-10 14:27 UTC
To: Mark Rutland, Catalin Marinas, linux-arm-kernel, linux-kernel

Hi Mark,

I am running a mainline kernel on a Hikey620 board and find that it is
easy to trigger the panic shown in the log below. I bisected the kernel
and narrowed it down to commit e3067861ba66 ("arm64: add basic
VMAP_STACK support"), which introduces this issue.

When I remove 'select HAVE_ARCH_VMAP_STACK' from arch/arm64/Kconfig,
the panic goes away. Could you check this and share any insight into
the issue?

[   42.384103] INFO: rcu_preempt detected stalls on CPUs/tasks:
[   42.389799]  5-...: (1 GPs behind) idle=252/140000000000000/0 softirq=1208/1236 fqs=2615
[   42.397982]  (detected by 0, t=5255 jiffies, g=238, c=237, q=10)
[   42.403999] Task dump for CPU 5:
[   42.407225] bash            R  running task        0  2202   2176 0x00000002
[   42.414281] Call trace:
[   42.416738] [<ffff0000080842a0>] ret_from_fork+0x0/0x18
[   43.308258] Unable to handle kernel paging request at virtual address 80083517baba
[   43.315864] user pgtable: 4k pages, 48-bit VAs, pgd = ffff80003a629000
[   43.322429] [000080083517baba] *pgd=0000000000000000
[   43.327419] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   43.332994] Modules linked in:
[   43.336053] CPU: 1 PID: 54 Comm: kworker/u16:2 Not tainted 4.13.0-rc3-00019-ge306786 #199
[   43.344235] Hardware name: HiKey Development Board (DT)
[   43.349472] Workqueue: writeback wb_workfn (flush-179:0)
[   43.354789] task: ffff80003be71e80 task.stack: ffff0000093a0000
[   43.360723] PC is at __kmalloc+0x64/0x250
[   43.364739] LR is at __kmalloc+0x30/0x250
[   43.368754] pc : [<ffff0000081dccc4>] lr : [<ffff0000081dcc90>] pstate: 40000145
[   43.376165] sp : ffff0000093a34f0
[   43.379484] x29: ffff0000093a34f0 x28: ffff80003aaaa228
[   43.384806] x27: 0000000000000000 x26: ffff80003be71e80
[   43.390128] x25: ffff80003be71e80 x24: 0000000000400000
[   43.395450] x23: ffff80003be71e80 x22: ffff0000087c6458
[   43.400772] x21: 0000000001011200 x20: ffff80003aff0080
[   43.406093] x19: ffff800005f04a80 x18: 000000000000000e
[   43.411415] x17: 0000ffffa714438c x16: ffff000008244400
[   43.416737] x15: 000011bb63554e12 x14: 0000000a155ddaef
[   43.422059] x13: 00000000000fffff x12: 0000000000000040
[   43.427381] x11: ffff7e0000f1bbc0 x10: 0000000000000001
[   43.432707] x9 : 0000000000000000 x8 : 0000000000000019
[   43.438035] x7 : ffff7e0000ec8020 x6 : 0000000000000040
[   43.443363] x5 : 0000000000000000 x4 : 0000000000000001
[   43.448691] x3 : 07ffffffffffffff x2 : ffff800005f04a80
[   43.454020] x1 : 000080003504f000 x0 : 000000080012caba
[   43.459350] Process kworker/u16:2 (pid: 54, stack limit = 0xffff0000093a0000)
[   43.466501] Call trace:
[   43.468958] Exception stack(0xffff0000093a33b0 to 0xffff0000093a34f0)
[   43.475415] 33a0:                                   000000080012caba 000080003504f000
[   43.483273] 33c0: ffff800005f04a80 07ffffffffffffff 0000000000000001 0000000000000000
[   43.491131] 33e0: 0000000000000040 ffff7e0000ec8020 0000000000000019 0000000000000000
[   43.498989] 3400: 0000000000000001 ffff7e0000f1bbc0 0000000000000040 00000000000fffff
[   43.506847] 3420: 0000000a155ddaef 000011bb63554e12 ffff000008244400 0000ffffa714438c
[   43.514705] 3440: 000000000000000e ffff800005f04a80 ffff80003aff0080 0000000001011200
[   43.522563] 3460: ffff0000087c6458 ffff80003be71e80 0000000000400000 ffff80003be71e80
[   43.530421] 3480: ffff80003be71e80 0000000000000000 ffff80003aaaa228 ffff0000093a34f0
[   43.538279] 34a0: ffff0000081dcc90 ffff0000093a34f0 ffff0000081dccc4 0000000040000145
[   43.546129] 34c0: 00000000ffffffff 0000000000000001 ffffffffffffffff 0000000000000000
[   43.553976] 34e0: ffff0000093a34f0 ffff0000081dccc4
[   43.558862] [<ffff0000081dccc4>] __kmalloc+0x64/0x250
[   43.563926] [<ffff0000087c6458>] mmc_alloc_sg+0x28/0x60
[   43.569162] [<ffff0000087c653c>] mmc_init_request+0xac/0xc0
[   43.574747] [<ffff000008378af4>] alloc_request_size+0x4c/0x90
[   43.580506] [<ffff00000817c734>] mempool_alloc+0x54/0x140
[   43.585915] [<ffff000008379d7c>] get_request+0x264/0x6d0
[   43.591238] [<ffff00000837ce50>] blk_queue_bio+0xe0/0x2e0
[   43.596647] [<ffff00000837acc8>] generic_make_request+0xe8/0x260
[   43.602663] [<ffff00000837aef0>] submit_bio+0xb0/0x188
[   43.607812] [<ffff0000082330f0>] submit_bh_wbc+0x130/0x188
[   43.613308] [<ffff0000082332e0>] __block_write_full_page+0x198/0x3a0
[   43.619673] [<ffff00000823374c>] block_write_full_page+0x134/0x148
[   43.625866] [<ffff000008236b20>] blkdev_writepage+0x18/0x20
[   43.631448] [<ffff00000818596c>] __writepage+0x1c/0x70
[   43.636595] [<ffff0000081861d8>] write_cache_pages+0x160/0x360
[   43.642438] [<ffff000008186418>] generic_writepages+0x40/0x78
[   43.648194] [<ffff000008236adc>] blkdev_writepages+0xc/0x18
[   43.653783] [<ffff00000818853c>] do_writepages+0x2c/0xa8
[   43.659113] [<ffff00000822986c>] __writeback_single_inode+0x34/0x1a8
[   43.665486] [<ffff000008229f34>] writeback_sb_inodes+0x1ec/0x398
[   43.671510] [<ffff00000822a17c>] __writeback_inodes_wb+0x9c/0xe0
[   43.677534] [<ffff00000822a418>] wb_writeback+0x1a8/0x1b0
[   43.682950] [<ffff00000822a9d8>] wb_workfn+0x148/0x240
[   43.688105] [<ffff0000080d93f4>] process_one_work+0x1ac/0x318
[   43.693867] [<ffff0000080d95a8>] worker_thread+0x48/0x420
[   43.699283] [<ffff0000080df664>] kthread+0xfc/0x128
[   43.704178] [<ffff0000080842b0>] ret_from_fork+0x10/0x18
[   43.709506] Code: b90012e0 f9400260 d538d081 91002000 (f8616818)
[   43.715617] ---[ end trace 9381b75685031f84 ]---
[   43.720290] note: kworker/u16:2[54] exited with preempt_count 1
* Re: ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support")
From: Mark Rutland @ 2017-10-10 15:45 UTC
To: Leo Yan
Cc: Catalin Marinas, linux-arm-kernel, linux-kernel, ard.biesheuvel

On Tue, Oct 10, 2017 at 10:27:25PM +0800, Leo Yan wrote:
> Hi Mark,

Hi Leo,

> I am running a mainline kernel on a Hikey620 board and find that it is
> easy to trigger the panic shown in the log below. I bisected the kernel
> and narrowed it down to commit e3067861ba66 ("arm64: add basic
> VMAP_STACK support"), which introduces this issue.
>
> When I remove 'select HAVE_ARCH_VMAP_STACK' from arch/arm64/Kconfig,
> the panic goes away. Could you check this and share any insight into
> the issue?

Given the stuff in the backtrace, my suspicion is that something is
trying to perform DMA to/from the stack, getting junk addresses from the
attempted virt<->phys conversions.

Could you try enabling both VMAP_STACK and CONFIG_DEBUG_VIRTUAL?

Thanks,
Mark.

> [   42.384103] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [   42.389799]  5-...: (1 GPs behind) idle=252/140000000000000/0 softirq=1208/1236 fqs=2615
> [   42.397982]  (detected by 0, t=5255 jiffies, g=238, c=237, q=10)
> [   42.403999] Task dump for CPU 5:
> [   42.407225] bash            R  running task        0  2202   2176 0x00000002
> [   42.414281] Call trace:
> [   42.416738] [<ffff0000080842a0>] ret_from_fork+0x0/0x18
> [   43.308258] Unable to handle kernel paging request at virtual address 80083517baba
> [   43.315864] user pgtable: 4k pages, 48-bit VAs, pgd = ffff80003a629000
> [   43.322429] [000080083517baba] *pgd=0000000000000000
> [   43.327419] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> [   43.332994] Modules linked in:
> [   43.336053] CPU: 1 PID: 54 Comm: kworker/u16:2 Not tainted 4.13.0-rc3-00019-ge306786 #199
> [   43.344235] Hardware name: HiKey Development Board (DT)
> [   43.349472] Workqueue: writeback wb_workfn (flush-179:0)
> [   43.354789] task: ffff80003be71e80 task.stack: ffff0000093a0000
> [   43.360723] PC is at __kmalloc+0x64/0x250
> [   43.364739] LR is at __kmalloc+0x30/0x250
> [   43.368754] pc : [<ffff0000081dccc4>] lr : [<ffff0000081dcc90>] pstate: 40000145
> [   43.376165] sp : ffff0000093a34f0
> [   43.379484] x29: ffff0000093a34f0 x28: ffff80003aaaa228
> [   43.384806] x27: 0000000000000000 x26: ffff80003be71e80
> [   43.390128] x25: ffff80003be71e80 x24: 0000000000400000
> [   43.395450] x23: ffff80003be71e80 x22: ffff0000087c6458
> [   43.400772] x21: 0000000001011200 x20: ffff80003aff0080
> [   43.406093] x19: ffff800005f04a80 x18: 000000000000000e
> [   43.411415] x17: 0000ffffa714438c x16: ffff000008244400
> [   43.416737] x15: 000011bb63554e12 x14: 0000000a155ddaef
> [   43.422059] x13: 00000000000fffff x12: 0000000000000040
> [   43.427381] x11: ffff7e0000f1bbc0 x10: 0000000000000001
> [   43.432707] x9 : 0000000000000000 x8 : 0000000000000019
> [   43.438035] x7 : ffff7e0000ec8020 x6 : 0000000000000040
> [   43.443363] x5 : 0000000000000000 x4 : 0000000000000001
> [   43.448691] x3 : 07ffffffffffffff x2 : ffff800005f04a80
> [   43.454020] x1 : 000080003504f000 x0 : 000000080012caba
> [   43.459350] Process kworker/u16:2 (pid: 54, stack limit = 0xffff0000093a0000)
> [   43.466501] Call trace:
> [   43.468958] Exception stack(0xffff0000093a33b0 to 0xffff0000093a34f0)
> [   43.475415] 33a0:                                   000000080012caba 000080003504f000
> [   43.483273] 33c0: ffff800005f04a80 07ffffffffffffff 0000000000000001 0000000000000000
> [   43.491131] 33e0: 0000000000000040 ffff7e0000ec8020 0000000000000019 0000000000000000
> [   43.498989] 3400: 0000000000000001 ffff7e0000f1bbc0 0000000000000040 00000000000fffff
> [   43.506847] 3420: 0000000a155ddaef 000011bb63554e12 ffff000008244400 0000ffffa714438c
> [   43.514705] 3440: 000000000000000e ffff800005f04a80 ffff80003aff0080 0000000001011200
> [   43.522563] 3460: ffff0000087c6458 ffff80003be71e80 0000000000400000 ffff80003be71e80
> [   43.530421] 3480: ffff80003be71e80 0000000000000000 ffff80003aaaa228 ffff0000093a34f0
> [   43.538279] 34a0: ffff0000081dcc90 ffff0000093a34f0 ffff0000081dccc4 0000000040000145
> [   43.546129] 34c0: 00000000ffffffff 0000000000000001 ffffffffffffffff 0000000000000000
> [   43.553976] 34e0: ffff0000093a34f0 ffff0000081dccc4
> [   43.558862] [<ffff0000081dccc4>] __kmalloc+0x64/0x250
> [   43.563926] [<ffff0000087c6458>] mmc_alloc_sg+0x28/0x60
> [   43.569162] [<ffff0000087c653c>] mmc_init_request+0xac/0xc0
> [   43.574747] [<ffff000008378af4>] alloc_request_size+0x4c/0x90
> [   43.580506] [<ffff00000817c734>] mempool_alloc+0x54/0x140
> [   43.585915] [<ffff000008379d7c>] get_request+0x264/0x6d0
> [   43.591238] [<ffff00000837ce50>] blk_queue_bio+0xe0/0x2e0
> [   43.596647] [<ffff00000837acc8>] generic_make_request+0xe8/0x260
> [   43.602663] [<ffff00000837aef0>] submit_bio+0xb0/0x188
> [   43.607812] [<ffff0000082330f0>] submit_bh_wbc+0x130/0x188
> [   43.613308] [<ffff0000082332e0>] __block_write_full_page+0x198/0x3a0
> [   43.619673] [<ffff00000823374c>] block_write_full_page+0x134/0x148
> [   43.625866] [<ffff000008236b20>] blkdev_writepage+0x18/0x20
> [   43.631448] [<ffff00000818596c>] __writepage+0x1c/0x70
> [   43.636595] [<ffff0000081861d8>] write_cache_pages+0x160/0x360
> [   43.642438] [<ffff000008186418>] generic_writepages+0x40/0x78
> [   43.648194] [<ffff000008236adc>] blkdev_writepages+0xc/0x18
> [   43.653783] [<ffff00000818853c>] do_writepages+0x2c/0xa8
> [   43.659113] [<ffff00000822986c>] __writeback_single_inode+0x34/0x1a8
> [   43.665486] [<ffff000008229f34>] writeback_sb_inodes+0x1ec/0x398
> [   43.671510] [<ffff00000822a17c>] __writeback_inodes_wb+0x9c/0xe0
> [   43.677534] [<ffff00000822a418>] wb_writeback+0x1a8/0x1b0
> [   43.682950] [<ffff00000822a9d8>] wb_workfn+0x148/0x240
> [   43.688105] [<ffff0000080d93f4>] process_one_work+0x1ac/0x318
> [   43.693867] [<ffff0000080d95a8>] worker_thread+0x48/0x420
> [   43.699283] [<ffff0000080df664>] kthread+0xfc/0x128
> [   43.704178] [<ffff0000080842b0>] ret_from_fork+0x10/0x18
> [   43.709506] Code: b90012e0 f9400260 d538d081 91002000 (f8616818)
> [   43.715617] ---[ end trace 9381b75685031f84 ]---
> [   43.720290] note: kworker/u16:2[54] exited with preempt_count 1
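Mark's hypothesis can be illustrated with the addresses in the log itself. The sketch below is a simplified model, not kernel code: it assumes the arm64 layout of this kernel era, where the linear map starts at PAGE_OFFSET in the upper half of the 48-bit kernel VA space, and it only classifies addresses. The function names are hypothetical, and the real conversion macros also add a platform-specific physical offset that is not modelled here.

```python
# Simplified model (assumption): for 48-bit VAs the linear map begins at
# PAGE_OFFSET = 0xffff800000000000, and only addresses at or above it are
# valid inputs for a linear-map virt->phys conversion.
VA_BITS = 48  # from the log: "48-bit VAs"
PAGE_OFFSET = (1 << 64) - (1 << (VA_BITS - 1))


def is_linear_map_address(addr: int) -> bool:
    """Return True if addr lies in the (assumed) linear map region."""
    return addr >= PAGE_OFFSET


def linear_map_offset(addr: int) -> int:
    """Hypothetical helper: offset of addr inside the linear map.

    Refuses addresses outside the linear map, in the spirit of a
    debug-time sanity check, instead of returning a junk value.
    """
    if not is_linear_map_address(addr):
        raise ValueError(f"{addr:#018x} is not a linear map address")
    return addr - PAGE_OFFSET


# Addresses taken from the oops above.
task_struct = 0xFFFF80003BE71E80   # "task: ffff80003be71e80"
stack_base = 0xFFFF0000093A0000    # "task.stack: ffff0000093a0000"
stack_ptr = 0xFFFF0000093A34F0     # "sp : ffff0000093a34f0"

for name, addr in [("task", task_struct),
                   ("task.stack", stack_base),
                   ("sp", stack_ptr)]:
    kind = "linear map" if is_linear_map_address(addr) else "not linear map"
    print(f"{name:10s} {addr:#018x} {kind}")
```

Under these assumptions the task structure sits in the linear map while the stack does not, so any code that assumes a stack buffer can go through a linear-map virt<->phys conversion would compute a meaningless address.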
* Re: ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support")
From: Robin Murphy @ 2017-10-10 16:03 UTC
To: Mark Rutland, Leo Yan
Cc: Catalin Marinas, linux-kernel, linux-arm-kernel, ard.biesheuvel

On 10/10/17 16:45, Mark Rutland wrote:
> On Tue, Oct 10, 2017 at 10:27:25PM +0800, Leo Yan wrote:
>> Hi Mark,
>
> Hi Leo,
>
>> I am running a mainline kernel on a Hikey620 board and find that it is
>> easy to trigger the panic shown in the log below. I bisected the kernel
>> and narrowed it down to commit e3067861ba66 ("arm64: add basic
>> VMAP_STACK support"), which introduces this issue.
>>
>> When I remove 'select HAVE_ARCH_VMAP_STACK' from arch/arm64/Kconfig,
>> the panic goes away. Could you check this and share any insight into
>> the issue?
>
> Given the stuff in the backtrace, my suspicion is that something is
> trying to perform DMA to/from the stack, getting junk addresses from the
> attempted virt<->phys conversions.
>
> Could you try enabling both VMAP_STACK and CONFIG_DEBUG_VIRTUAL?

CONFIG_DMA_API_DEBUG should scream about drivers trying to use stack
addresses either way, too.

Robin.

>
> Thanks,
> Mark.
>
>> [   42.384103] INFO: rcu_preempt detected stalls on CPUs/tasks:
>> [   42.389799]  5-...: (1 GPs behind) idle=252/140000000000000/0 softirq=1208/1236 fqs=2615
>> [   42.397982]  (detected by 0, t=5255 jiffies, g=238, c=237, q=10)
>> [   42.403999] Task dump for CPU 5:
>> [   42.407225] bash            R  running task        0  2202   2176 0x00000002
>> [   42.414281] Call trace:
>> [   42.416738] [<ffff0000080842a0>] ret_from_fork+0x0/0x18
>> [   43.308258] Unable to handle kernel paging request at virtual address 80083517baba
>> [   43.315864] user pgtable: 4k pages, 48-bit VAs, pgd = ffff80003a629000
>> [   43.322429] [000080083517baba] *pgd=0000000000000000
>> [   43.327419] Internal error: Oops: 96000004 [#1] PREEMPT SMP
>> [   43.332994] Modules linked in:
>> [   43.336053] CPU: 1 PID: 54 Comm: kworker/u16:2 Not tainted 4.13.0-rc3-00019-ge306786 #199
>> [   43.344235] Hardware name: HiKey Development Board (DT)
>> [   43.349472] Workqueue: writeback wb_workfn (flush-179:0)
>> [   43.354789] task: ffff80003be71e80 task.stack: ffff0000093a0000
>> [   43.360723] PC is at __kmalloc+0x64/0x250
>> [   43.364739] LR is at __kmalloc+0x30/0x250
>> [   43.368754] pc : [<ffff0000081dccc4>] lr : [<ffff0000081dcc90>] pstate: 40000145
>> [   43.376165] sp : ffff0000093a34f0
>> [   43.379484] x29: ffff0000093a34f0 x28: ffff80003aaaa228
>> [   43.384806] x27: 0000000000000000 x26: ffff80003be71e80
>> [   43.390128] x25: ffff80003be71e80 x24: 0000000000400000
>> [   43.395450] x23: ffff80003be71e80 x22: ffff0000087c6458
>> [   43.400772] x21: 0000000001011200 x20: ffff80003aff0080
>> [   43.406093] x19: ffff800005f04a80 x18: 000000000000000e
>> [   43.411415] x17: 0000ffffa714438c x16: ffff000008244400
>> [   43.416737] x15: 000011bb63554e12 x14: 0000000a155ddaef
>> [   43.422059] x13: 00000000000fffff x12: 0000000000000040
>> [   43.427381] x11: ffff7e0000f1bbc0 x10: 0000000000000001
>> [   43.432707] x9 : 0000000000000000 x8 : 0000000000000019
>> [   43.438035] x7 : ffff7e0000ec8020 x6 : 0000000000000040
>> [   43.443363] x5 : 0000000000000000 x4 : 0000000000000001
>> [   43.448691] x3 : 07ffffffffffffff x2 : ffff800005f04a80
>> [   43.454020] x1 : 000080003504f000 x0 : 000000080012caba
>> [   43.459350] Process kworker/u16:2 (pid: 54, stack limit = 0xffff0000093a0000)
>> [   43.466501] Call trace:
>> [   43.468958] Exception stack(0xffff0000093a33b0 to 0xffff0000093a34f0)
>> [   43.475415] 33a0:                                   000000080012caba 000080003504f000
>> [   43.483273] 33c0: ffff800005f04a80 07ffffffffffffff 0000000000000001 0000000000000000
>> [   43.491131] 33e0: 0000000000000040 ffff7e0000ec8020 0000000000000019 0000000000000000
>> [   43.498989] 3400: 0000000000000001 ffff7e0000f1bbc0 0000000000000040 00000000000fffff
>> [   43.506847] 3420: 0000000a155ddaef 000011bb63554e12 ffff000008244400 0000ffffa714438c
>> [   43.514705] 3440: 000000000000000e ffff800005f04a80 ffff80003aff0080 0000000001011200
>> [   43.522563] 3460: ffff0000087c6458 ffff80003be71e80 0000000000400000 ffff80003be71e80
>> [   43.530421] 3480: ffff80003be71e80 0000000000000000 ffff80003aaaa228 ffff0000093a34f0
>> [   43.538279] 34a0: ffff0000081dcc90 ffff0000093a34f0 ffff0000081dccc4 0000000040000145
>> [   43.546129] 34c0: 00000000ffffffff 0000000000000001 ffffffffffffffff 0000000000000000
>> [   43.553976] 34e0: ffff0000093a34f0 ffff0000081dccc4
>> [   43.558862] [<ffff0000081dccc4>] __kmalloc+0x64/0x250
>> [   43.563926] [<ffff0000087c6458>] mmc_alloc_sg+0x28/0x60
>> [   43.569162] [<ffff0000087c653c>] mmc_init_request+0xac/0xc0
>> [   43.574747] [<ffff000008378af4>] alloc_request_size+0x4c/0x90
>> [   43.580506] [<ffff00000817c734>] mempool_alloc+0x54/0x140
>> [   43.585915] [<ffff000008379d7c>] get_request+0x264/0x6d0
>> [   43.591238] [<ffff00000837ce50>] blk_queue_bio+0xe0/0x2e0
>> [   43.596647] [<ffff00000837acc8>] generic_make_request+0xe8/0x260
>> [   43.602663] [<ffff00000837aef0>] submit_bio+0xb0/0x188
>> [   43.607812] [<ffff0000082330f0>] submit_bh_wbc+0x130/0x188
>> [   43.613308] [<ffff0000082332e0>] __block_write_full_page+0x198/0x3a0
>> [   43.619673] [<ffff00000823374c>] block_write_full_page+0x134/0x148
>> [   43.625866] [<ffff000008236b20>] blkdev_writepage+0x18/0x20
>> [   43.631448] [<ffff00000818596c>] __writepage+0x1c/0x70
>> [   43.636595] [<ffff0000081861d8>] write_cache_pages+0x160/0x360
>> [   43.642438] [<ffff000008186418>] generic_writepages+0x40/0x78
>> [   43.648194] [<ffff000008236adc>] blkdev_writepages+0xc/0x18
>> [   43.653783] [<ffff00000818853c>] do_writepages+0x2c/0xa8
>> [   43.659113] [<ffff00000822986c>] __writeback_single_inode+0x34/0x1a8
>> [   43.665486] [<ffff000008229f34>] writeback_sb_inodes+0x1ec/0x398
>> [   43.671510] [<ffff00000822a17c>] __writeback_inodes_wb+0x9c/0xe0
>> [   43.677534] [<ffff00000822a418>] wb_writeback+0x1a8/0x1b0
>> [   43.682950] [<ffff00000822a9d8>] wb_workfn+0x148/0x240
>> [   43.688105] [<ffff0000080d93f4>] process_one_work+0x1ac/0x318
>> [   43.693867] [<ffff0000080d95a8>] worker_thread+0x48/0x420
>> [   43.699283] [<ffff0000080df664>] kthread+0xfc/0x128
>> [   43.704178] [<ffff0000080842b0>] ret_from_fork+0x10/0x18
>> [   43.709506] Code: b90012e0 f9400260 d538d081 91002000 (f8616818)
>> [   43.715617] ---[ end trace 9381b75685031f84 ]---
>> [   43.720290] note: kworker/u16:2[54] exited with preempt_count 1
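For reference, the two debugging suggestions in this thread amount to a kernel configuration along the following lines. This is only a sketch of a .config fragment: the three option names are the ones mentioned above, but each has its own Kconfig dependencies (for example the generic kernel debugging options), which are assumed to be satisfied and may need enabling separately on a given tree.

```
# Keep the feature under test enabled so the failure still reproduces.
CONFIG_VMAP_STACK=y
# Mark's suggestion: sanity-check virt<->phys conversions.
CONFIG_DEBUG_VIRTUAL=y
# Robin's suggestion: have the DMA API complain about stack buffers.
CONFIG_DMA_API_DEBUG=y
```

With these set, the expectation from the replies is that a warning fires at the offending conversion or DMA mapping call, rather than the corruption surfacing later in an unrelated allocation such as the __kmalloc call in the oops.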
* ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support") @ 2017-10-10 16:03 ` Robin Murphy 0 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2017-10-10 16:03 UTC (permalink / raw) To: linux-arm-kernel On 10/10/17 16:45, Mark Rutland wrote: > On Tue, Oct 10, 2017 at 10:27:25PM +0800, Leo Yan wrote: >> Hi Mark, > > Hi Leo, > >> I work mainline kernel on Hikey620 board, I find it's easily to >> introduce the panic and report the log as below. So I bisect the kernel >> and finally narrow down the commit e3067861ba66 ("arm64: add basic >> VMAP_STACK support") which introduce this issue. >> >> I tried to remove 'select HAVE_ARCH_VMAP_STACK' from >> arch/arm64/Kconfig, then I can see the panic issue will dismiss. So >> could you check this and have insight for this issue? > > Given the stuff in the backtrace, my suspicion is something is trying to > perform DMA to/from the stack, getting junk addresses form the attempted > virt<->phys conversions. > > Could you try enabling both VMAP_STACK and CONFIG_DEBUG_VIRTUAL? CONFIG_DMA_API_DEBUG should scream about drivers trying to use stack addresses either way, too. Robin. > > Thanks, > Mark. 
> >> [ 42.384103] INFO: rcu_preempt detected stalls on CPUs/tasks: >> [ 42.389799] 5-...: (1 GPs behind) idle=252/140000000000000/0 softirq=1208/1236 fqs=2615 >> [ 42.397982] (detected by 0, t=5255 jiffies, g=238, c=237, q=10) >> [ 42.403999] Task dump for CPU 5: >> [ 42.407225] bash R running task 0 2202 2176 0x00000002 >> [ 42.414281] Call trace: >> [ 42.416738] [<ffff0000080842a0>] ret_from_fork+0x0/0x18 >> [ 43.308258] Unable to handle kernel paging request at virtual address 80083517baba >> [ 43.315864] user pgtable: 4k pages, 48-bit VAs, pgd = ffff80003a629000 >> [ 43.322429] [000080083517baba] *pgd=0000000000000000 >> [ 43.327419] Internal error: Oops: 96000004 [#1] PREEMPT SMP >> [ 43.332994] Modules linked in: >> [ 43.336053] CPU: 1 PID: 54 Comm: kworker/u16:2 Not tainted 4.13.0-rc3-00019-ge306786 #199 >> [ 43.344235] Hardware name: HiKey Development Board (DT) >> [ 43.349472] Workqueue: writeback wb_workfn (flush-179:0) >> [ 43.354789] task: ffff80003be71e80 task.stack: ffff0000093a0000 >> [ 43.360723] PC is at __kmalloc+0x64/0x250 >> [ 43.364739] LR is at __kmalloc+0x30/0x250 >> [ 43.368754] pc : [<ffff0000081dccc4>] lr : [<ffff0000081dcc90>] pstate: 40000145 >> [ 43.376165] sp : ffff0000093a34f0 >> [ 43.379484] x29: ffff0000093a34f0 x28: ffff80003aaaa228 >> [ 43.384806] x27: 0000000000000000 x26: ffff80003be71e80 >> [ 43.390128] x25: ffff80003be71e80 x24: 0000000000400000 >> [ 43.395450] x23: ffff80003be71e80 x22: ffff0000087c6458 >> [ 43.400772] x21: 0000000001011200 x20: ffff80003aff0080 >> [ 43.406093] x19: ffff800005f04a80 x18: 000000000000000e >> [ 43.411415] x17: 0000ffffa714438c x16: ffff000008244400 >> [ 43.416737] x15: 000011bb63554e12 x14: 0000000a155ddaef >> [ 43.422059] x13: 00000000000fffff x12: 0000000000000040 >> [ 43.427381] x11: ffff7e0000f1bbc0 x10: 0000000000000001 >> [ 43.432707] x9 : 0000000000000000 x8 : 0000000000000019 >> [ 43.438035] x7 : ffff7e0000ec8020 x6 : 0000000000000040 >> [ 43.443363] x5 : 0000000000000000 x4 : 
0000000000000001 >> [ 43.448691] x3 : 07ffffffffffffff x2 : ffff800005f04a80 >> [ 43.454020] x1 : 000080003504f000 x0 : 000000080012caba >> [ 43.459350] Process kworker/u16:2 (pid: 54, stack limit = 0xffff0000093a0000) >> [ 43.466501] Call trace: >> [ 43.468958] Exception stack(0xffff0000093a33b0 to 0xffff0000093a34f0) >> [ 43.475415] 33a0: 000000080012caba 000080003504f000 >> [ 43.483273] 33c0: ffff800005f04a80 07ffffffffffffff 0000000000000001 0000000000000000 >> [ 43.491131] 33e0: 0000000000000040 ffff7e0000ec8020 0000000000000019 0000000000000000 >> [ 43.498989] 3400: 0000000000000001 ffff7e0000f1bbc0 0000000000000040 00000000000fffff >> [ 43.506847] 3420: 0000000a155ddaef 000011bb63554e12 ffff000008244400 0000ffffa714438c >> [ 43.514705] 3440: 000000000000000e ffff800005f04a80 ffff80003aff0080 0000000001011200 >> [ 43.522563] 3460: ffff0000087c6458 ffff80003be71e80 0000000000400000 ffff80003be71e80 >> [ 43.530421] 3480: ffff80003be71e80 0000000000000000 ffff80003aaaa228 ffff0000093a34f0 >> [ 43.538279] 34a0: ffff0000081dcc90 ffff0000093a34f0 ffff0000081dccc4 0000000040000145 >> [ 43.546129] 34c0: 00000000ffffffff 0000000000000001 ffffffffffffffff 0000000000000000 >> [ 43.553976] 34e0: ffff0000093a34f0 ffff0000081dccc4 >> [ 43.558862] [<ffff0000081dccc4>] __kmalloc+0x64/0x250 >> [ 43.563926] [<ffff0000087c6458>] mmc_alloc_sg+0x28/0x60 >> [ 43.569162] [<ffff0000087c653c>] mmc_init_request+0xac/0xc0 >> [ 43.574747] [<ffff000008378af4>] alloc_request_size+0x4c/0x90 >> [ 43.580506] [<ffff00000817c734>] mempool_alloc+0x54/0x140 >> [ 43.585915] [<ffff000008379d7c>] get_request+0x264/0x6d0 >> [ 43.591238] [<ffff00000837ce50>] blk_queue_bio+0xe0/0x2e0 >> [ 43.596647] [<ffff00000837acc8>] generic_make_request+0xe8/0x260 >> [ 43.602663] [<ffff00000837aef0>] submit_bio+0xb0/0x188 >> [ 43.607812] [<ffff0000082330f0>] submit_bh_wbc+0x130/0x188 >> [ 43.613308] [<ffff0000082332e0>] __block_write_full_page+0x198/0x3a0 >> [ 43.619673] [<ffff00000823374c>] 
block_write_full_page+0x134/0x148 >> [ 43.625866] [<ffff000008236b20>] blkdev_writepage+0x18/0x20 >> [ 43.631448] [<ffff00000818596c>] __writepage+0x1c/0x70 >> [ 43.636595] [<ffff0000081861d8>] write_cache_pages+0x160/0x360 >> [ 43.642438] [<ffff000008186418>] generic_writepages+0x40/0x78 >> [ 43.648194] [<ffff000008236adc>] blkdev_writepages+0xc/0x18 >> [ 43.653783] [<ffff00000818853c>] do_writepages+0x2c/0xa8 >> [ 43.659113] [<ffff00000822986c>] __writeback_single_inode+0x34/0x1a8 >> [ 43.665486] [<ffff000008229f34>] writeback_sb_inodes+0x1ec/0x398 >> [ 43.671510] [<ffff00000822a17c>] __writeback_inodes_wb+0x9c/0xe0 >> [ 43.677534] [<ffff00000822a418>] wb_writeback+0x1a8/0x1b0 >> [ 43.682950] [<ffff00000822a9d8>] wb_workfn+0x148/0x240 >> [ 43.688105] [<ffff0000080d93f4>] process_one_work+0x1ac/0x318 >> [ 43.693867] [<ffff0000080d95a8>] worker_thread+0x48/0x420 >> [ 43.699283] [<ffff0000080df664>] kthread+0xfc/0x128 >> [ 43.704178] [<ffff0000080842b0>] ret_from_fork+0x10/0x18 >> [ 43.709506] Code: b90012e0 f9400260 d538d081 91002000 (f8616818) >> [ 43.715617] ---[ end trace 9381b75685031f84 ]--- >> [ 43.720290] note: kworker/u16:2[54] exited with preempt_count 1 >> > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel at lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > ^ permalink raw reply [flat|nested] 32+ messages in thread
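The suspicion quoted above (junk addresses from virt<->phys conversions on a vmap'd stack) can be illustrated with a small user-space model. This is only a sketch, not the kernel's code: it assumes 48-bit VAs with the 4.13-era layout in which the linear map occupies the upper half of the kernel address space, and it assumes PHYS_OFFSET is 0 on this board. The function names are hypothetical; the two addresses are the `task` and `task.stack` values from the log.

```python
# Simplified model of arm64 linear-map address translation.
# Assumptions (not from the thread): 48-bit VAs, linear map in the upper
# half of the kernel VA space, DRAM starting at physical address 0.

VA_BITS = 48
MASK64 = (1 << 64) - 1
PAGE_OFFSET = (MASK64 << (VA_BITS - 1)) & MASK64  # 0xffff800000000000
PHYS_OFFSET = 0x0  # assumed start of DRAM


def is_linear_map_address(va: int) -> bool:
    """Linear-map addresses have bit (VA_BITS - 1) set in this layout."""
    return bool(va & (1 << (VA_BITS - 1)))


def linear_to_phys(va: int) -> int:
    """Offset arithmetic that is only meaningful for linear-map addresses."""
    return (va & ~PAGE_OFFSET & MASK64) + PHYS_OFFSET


def checked_virt_to_phys(va: int) -> int:
    """Refuse non-linear addresses, in the spirit of a debug check."""
    if not is_linear_map_address(va):
        raise ValueError(f"{va:#x} is not a linear-map address")
    return linear_to_phys(va)


if __name__ == "__main__":
    task = 0xFFFF80003BE71E80   # "task:" from the log, a linear-map address
    stack = 0xFFFF0000093A0000  # "task.stack:" from the log, vmalloc space

    print(hex(checked_virt_to_phys(task)))
    # Applying the same arithmetic to the vmap'd stack yields a number that
    # has no relation to the pages actually backing the stack.
    print(hex(linear_to_phys(stack)))
```

In this model the stack address fails the linear-map test, so any driver that converts a stack pointer with linear-map arithmetic and hands the result to a DMA engine would be using an unrelated physical address.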
* Re: ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support") 2017-10-10 16:03 ` Robin Murphy @ 2017-10-16 1:17 ` Leo Yan -1 siblings, 0 replies; 32+ messages in thread From: Leo Yan @ 2017-10-16 1:17 UTC (permalink / raw) To: Robin Murphy Cc: Mark Rutland, Catalin Marinas, linux-kernel, linux-arm-kernel, ard.biesheuvel On Tue, Oct 10, 2017 at 05:03:44PM +0100, Robin Murphy wrote: > On 10/10/17 16:45, Mark Rutland wrote: > > On Tue, Oct 10, 2017 at 10:27:25PM +0800, Leo Yan wrote: > >> Hi Mark, > > > > Hi Leo, > > > >> I work mainline kernel on Hikey620 board, I find it's easily to > >> introduce the panic and report the log as below. So I bisect the kernel > >> and finally narrow down the commit e3067861ba66 ("arm64: add basic > >> VMAP_STACK support") which introduce this issue. > >> > >> I tried to remove 'select HAVE_ARCH_VMAP_STACK' from > >> arch/arm64/Kconfig, then I can see the panic issue will dismiss. So > >> could you check this and have insight for this issue? > > > > Given the stuff in the backtrace, my suspicion is something is trying to > > perform DMA to/from the stack, getting junk addresses form the attempted > > virt<->phys conversions. > > > > Could you try enabling both VMAP_STACK and CONFIG_DEBUG_VIRTUAL? > > CONFIG_DMA_API_DEBUG should scream about drivers trying to use stack > addresses either way, too. Thanks for the suggestions, Mark & Robin. I enabled these debugging configs but could not get a clue from them; however, I happened to find that this issue is quite likely related to the CA53 errata, and ERRATA_A53_855873 in particular is the relevant one. So I switched to the ARM-TF mainline code with the erratum workaround enabled, and the issue goes away. Please ignore this regression report. Thanks, Leo Yan ^ permalink raw reply [flat|nested] 32+ messages in thread
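For readers who want to repeat the debugging step discussed in this message, the options named in the thread correspond to a kernel config fragment along the following lines. This is a sketch only; it assumes the architecture and kernel version in use offer all three options, and other debug options may be needed depending on the configuration.

```
# Virtually mapped kernel stacks (the feature under suspicion)
CONFIG_VMAP_STACK=y
# Sanity-check virt<->phys conversions
CONFIG_DEBUG_VIRTUAL=y
# Warn about misuse of the DMA API, such as mapping stack memory
CONFIG_DMA_API_DEBUG=y
```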
* Re: ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support") 2017-10-16 1:17 ` Leo Yan @ 2017-10-16 13:48 ` Mark Rutland -1 siblings, 0 replies; 32+ messages in thread From: Mark Rutland @ 2017-10-16 13:48 UTC (permalink / raw) To: Leo Yan Cc: Robin Murphy, Catalin Marinas, linux-kernel, linux-arm-kernel, ard.biesheuvel Hi Leo, On Mon, Oct 16, 2017 at 09:17:23AM +0800, Leo Yan wrote: > On Tue, Oct 10, 2017 at 05:03:44PM +0100, Robin Murphy wrote: > > On 10/10/17 16:45, Mark Rutland wrote: > > > On Tue, Oct 10, 2017 at 10:27:25PM +0800, Leo Yan wrote: > > >> I work mainline kernel on Hikey620 board, I find it's easily to > > >> introduce the panic and report the log as below. So I bisect the kernel > > >> and finally narrow down the commit e3067861ba66 ("arm64: add basic > > >> VMAP_STACK support") which introduce this issue. > > >> > > >> I tried to remove 'select HAVE_ARCH_VMAP_STACK' from > > >> arch/arm64/Kconfig, then I can see the panic issue will dismiss. So > > >> could you check this and have insight for this issue? > > > > > > Given the stuff in the backtrace, my suspicion is something is trying to > > > perform DMA to/from the stack, getting junk addresses form the attempted > > > virt<->phys conversions. > > > > > > Could you try enabling both VMAP_STACK and CONFIG_DEBUG_VIRTUAL? > > > > CONFIG_DMA_API_DEBUG should scream about drivers trying to use stack > > addresses either way, too. > > Thanks for suggestions, Mark & Robin. > > I enabled these debugging configs but cannot get clue from it; but > occasionally found this issue is quite likely related with CA53 errata, > especialy ERRATA_A53_855873 is the relative one. So I changed to use > ARM-TF mainline code with ERRATA fixing, this issue can be dismissed. Thanks for the update. Just to confirm, with the updated firmware you no longer see the issue? I can't immediately see how that would be related. Thanks, Mark. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support") 2017-10-16 13:48 ` Mark Rutland @ 2017-10-16 14:12 ` Robin Murphy -1 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2017-10-16 14:12 UTC (permalink / raw) To: Mark Rutland, Leo Yan Cc: Catalin Marinas, linux-kernel, linux-arm-kernel, ard.biesheuvel On 16/10/17 14:48, Mark Rutland wrote: > Hi Leo, > > On Mon, Oct 16, 2017 at 09:17:23AM +0800, Leo Yan wrote: >> On Tue, Oct 10, 2017 at 05:03:44PM +0100, Robin Murphy wrote: >>> On 10/10/17 16:45, Mark Rutland wrote: >>>> On Tue, Oct 10, 2017 at 10:27:25PM +0800, Leo Yan wrote: >>>>> I work mainline kernel on Hikey620 board, I find it's easily to >>>>> introduce the panic and report the log as below. So I bisect the kernel >>>>> and finally narrow down the commit e3067861ba66 ("arm64: add basic >>>>> VMAP_STACK support") which introduce this issue. >>>>> >>>>> I tried to remove 'select HAVE_ARCH_VMAP_STACK' from >>>>> arch/arm64/Kconfig, then I can see the panic issue will dismiss. So >>>>> could you check this and have insight for this issue? >>>> >>>> Given the stuff in the backtrace, my suspicion is something is trying to >>>> perform DMA to/from the stack, getting junk addresses form the attempted >>>> virt<->phys conversions. >>>> >>>> Could you try enabling both VMAP_STACK and CONFIG_DEBUG_VIRTUAL? >>> >>> CONFIG_DMA_API_DEBUG should scream about drivers trying to use stack >>> addresses either way, too. >> >> Thanks for suggestions, Mark & Robin. >> >> I enabled these debugging configs but cannot get clue from it; but >> occasionally found this issue is quite likely related with CA53 errata, >> especialy ERRATA_A53_855873 is the relative one. So I changed to use >> ARM-TF mainline code with ERRATA fixing, this issue can be dismissed. > > Thanks for the update. > > Just to confirm, with the updated firmware you no longer see the issue? > > I can't immediately see how that would be related. 
Cores up to r0p2 have the other errata to which ARM64_WORKAROUND_CLEAN_CACHE also applies anyway; r3p0+ have an ACTLR bit to do the CVAC->CIVAC upgrade in hardware, and our policy is that we expect firmware to enable such hardware workarounds where possible. I assume that's why we don't explicitly document 855873 anywhere in Linux. Robin. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support") 2017-10-16 14:12 ` Robin Murphy @ 2017-10-16 14:26 ` Mark Rutland -1 siblings, 0 replies; 32+ messages in thread From: Mark Rutland @ 2017-10-16 14:26 UTC (permalink / raw) To: Robin Murphy Cc: Leo Yan, Catalin Marinas, linux-kernel, linux-arm-kernel, ard.biesheuvel On Mon, Oct 16, 2017 at 03:12:45PM +0100, Robin Murphy wrote: > On 16/10/17 14:48, Mark Rutland wrote: > > Hi Leo, > > > > On Mon, Oct 16, 2017 at 09:17:23AM +0800, Leo Yan wrote: > >> On Tue, Oct 10, 2017 at 05:03:44PM +0100, Robin Murphy wrote: > >>> On 10/10/17 16:45, Mark Rutland wrote: > >>>> On Tue, Oct 10, 2017 at 10:27:25PM +0800, Leo Yan wrote: > >>>>> I work mainline kernel on Hikey620 board, I find it's easily to > >>>>> introduce the panic and report the log as below. So I bisect the kernel > >>>>> and finally narrow down the commit e3067861ba66 ("arm64: add basic > >>>>> VMAP_STACK support") which introduce this issue. > >>>>> > >>>>> I tried to remove 'select HAVE_ARCH_VMAP_STACK' from > >>>>> arch/arm64/Kconfig, then I can see the panic issue will dismiss. So > >>>>> could you check this and have insight for this issue? > >>>> > >>>> Given the stuff in the backtrace, my suspicion is something is trying to > >>>> perform DMA to/from the stack, getting junk addresses form the attempted > >>>> virt<->phys conversions. > >>>> > >>>> Could you try enabling both VMAP_STACK and CONFIG_DEBUG_VIRTUAL? > >>> > >>> CONFIG_DMA_API_DEBUG should scream about drivers trying to use stack > >>> addresses either way, too. > >> > >> Thanks for suggestions, Mark & Robin. > >> > >> I enabled these debugging configs but cannot get clue from it; but > >> occasionally found this issue is quite likely related with CA53 errata, > >> especialy ERRATA_A53_855873 is the relative one. So I changed to use > >> ARM-TF mainline code with ERRATA fixing, this issue can be dismissed. > > > > Thanks for the update. 
> > > > Just to confirm, with the updated firmware you no longer see the issue? > > > > I can't immediately see how that would be related. > > Cores up to r0p2 have the other errata to which > ARM64_WORKAROUND_CLEAN_CACHE also applies anyway; r3p0+ have an ACTLR > bit to do thee CVAC->CIVAC upgrade in hardware, and our policy is that > we expect firmware to enable such hardware workarounds where possible. I > assume that's why we don't explicitly document 855873 anywhere in Linux. Sure, I also looked it up. ;) I meant that I couldn't immediately see why VMAP'd stacks were likely to tickle issues with that more reliably. Thanks, Mark. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support") 2017-10-16 14:26 ` Mark Rutland @ 2017-10-16 14:35 ` Robin Murphy -1 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2017-10-16 14:35 UTC (permalink / raw) To: Mark Rutland Cc: Leo Yan, Catalin Marinas, linux-kernel, linux-arm-kernel, ard.biesheuvel On 16/10/17 15:26, Mark Rutland wrote: > On Mon, Oct 16, 2017 at 03:12:45PM +0100, Robin Murphy wrote: >> On 16/10/17 14:48, Mark Rutland wrote: >>> Hi Leo, >>> >>> On Mon, Oct 16, 2017 at 09:17:23AM +0800, Leo Yan wrote: >>>> On Tue, Oct 10, 2017 at 05:03:44PM +0100, Robin Murphy wrote: >>>>> On 10/10/17 16:45, Mark Rutland wrote: >>>>>> On Tue, Oct 10, 2017 at 10:27:25PM +0800, Leo Yan wrote: >>>>>>> I work mainline kernel on Hikey620 board, I find it's easily to >>>>>>> introduce the panic and report the log as below. So I bisect the kernel >>>>>>> and finally narrow down the commit e3067861ba66 ("arm64: add basic >>>>>>> VMAP_STACK support") which introduce this issue. >>>>>>> >>>>>>> I tried to remove 'select HAVE_ARCH_VMAP_STACK' from >>>>>>> arch/arm64/Kconfig, then I can see the panic issue will dismiss. So >>>>>>> could you check this and have insight for this issue? >>>>>> >>>>>> Given the stuff in the backtrace, my suspicion is something is trying to >>>>>> perform DMA to/from the stack, getting junk addresses form the attempted >>>>>> virt<->phys conversions. >>>>>> >>>>>> Could you try enabling both VMAP_STACK and CONFIG_DEBUG_VIRTUAL? >>>>> >>>>> CONFIG_DMA_API_DEBUG should scream about drivers trying to use stack >>>>> addresses either way, too. >>>> >>>> Thanks for suggestions, Mark & Robin. >>>> >>>> I enabled these debugging configs but cannot get clue from it; but >>>> occasionally found this issue is quite likely related with CA53 errata, >>>> especialy ERRATA_A53_855873 is the relative one. So I changed to use >>>> ARM-TF mainline code with ERRATA fixing, this issue can be dismissed. 
>>> >>> Thanks for the update. >>> >>> Just to confirm, with the updated firmware you no longer see the issue? >>> >>> I can't immediately see how that would be related. >> >> Cores up to r0p2 have the other errata to which >> ARM64_WORKAROUND_CLEAN_CACHE also applies anyway; r3p0+ have an ACTLR >> bit to do thee CVAC->CIVAC upgrade in hardware, and our policy is that >> we expect firmware to enable such hardware workarounds where possible. I >> assume that's why we don't explicitly document 855873 anywhere in Linux. > > Sure, I also looked it up. ;) > > I meant that I couldn't immediately see why VMAP'd stacks were likely to > tickle issues with that more reliably. Ah, right - in context, "that" appeared to refer to "updated firmware", not "VMAP_STACK". Sorry. I guess the vmap addresses might tickle the "same L2 set" condition differently to when both stack and DMA buffer are linear map addresses. Robin. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support") 2017-10-16 14:35 ` Robin Murphy @ 2017-10-17 0:30 ` Leo Yan -1 siblings, 0 replies; 32+ messages in thread From: Leo Yan @ 2017-10-17 0:30 UTC (permalink / raw) To: Robin Murphy Cc: Mark Rutland, Catalin Marinas, linux-kernel, linux-arm-kernel, ard.biesheuvel On Mon, Oct 16, 2017 at 03:35:46PM +0100, Robin Murphy wrote: > On 16/10/17 15:26, Mark Rutland wrote: > > On Mon, Oct 16, 2017 at 03:12:45PM +0100, Robin Murphy wrote: > >> On 16/10/17 14:48, Mark Rutland wrote: > >>> Hi Leo, > >>> > >>> On Mon, Oct 16, 2017 at 09:17:23AM +0800, Leo Yan wrote: > >>>> On Tue, Oct 10, 2017 at 05:03:44PM +0100, Robin Murphy wrote: > >>>>> On 10/10/17 16:45, Mark Rutland wrote: > >>>>>> On Tue, Oct 10, 2017 at 10:27:25PM +0800, Leo Yan wrote: > >>>>>>> I work mainline kernel on Hikey620 board, I find it's easily to > >>>>>>> introduce the panic and report the log as below. So I bisect the kernel > >>>>>>> and finally narrow down the commit e3067861ba66 ("arm64: add basic > >>>>>>> VMAP_STACK support") which introduce this issue. > >>>>>>> > >>>>>>> I tried to remove 'select HAVE_ARCH_VMAP_STACK' from > >>>>>>> arch/arm64/Kconfig, then I can see the panic issue will dismiss. So > >>>>>>> could you check this and have insight for this issue? > >>>>>> > >>>>>> Given the stuff in the backtrace, my suspicion is something is trying to > >>>>>> perform DMA to/from the stack, getting junk addresses form the attempted > >>>>>> virt<->phys conversions. > >>>>>> > >>>>>> Could you try enabling both VMAP_STACK and CONFIG_DEBUG_VIRTUAL? > >>>>> > >>>>> CONFIG_DMA_API_DEBUG should scream about drivers trying to use stack > >>>>> addresses either way, too. > >>>> > >>>> Thanks for suggestions, Mark & Robin. 
> >>>> > >>>> I enabled these debugging configs but cannot get clue from it; but > >>>> occasionally found this issue is quite likely related with CA53 errata, > >>>> especialy ERRATA_A53_855873 is the relative one. So I changed to use > >>>> ARM-TF mainline code with ERRATA fixing, this issue can be dismissed. > >>> > >>> Thanks for the update. > >>> > >>> Just to confirm, with the updated firmware you no longer see the issue? > >>> > >>> I can't immediately see how that would be related. > >> > >> Cores up to r0p2 have the other errata to which > >> ARM64_WORKAROUND_CLEAN_CACHE also applies anyway; r3p0+ have an ACTLR > >> bit to do thee CVAC->CIVAC upgrade in hardware, and our policy is that > >> we expect firmware to enable such hardware workarounds where possible. I > >> assume that's why we don't explicitly document 855873 anywhere in Linux. > > > > Sure, I also looked it up. ;) > > > > I meant that I couldn't immediately see why VMAP'd stacks were likely to > > tickle issues with that more reliably. > > Ah, right - in context, "that" appeared to refer to "updated firmware", > not "VMAP_STACK". Sorry. > > I guess the vmap addresses might tickle the "same L2 set" condition > differently to when both stack and DMA buffer are linear map addresses. A bit more info on this: I can reproduce this memory abort panic, and the panic locations are not consistent; usually it is related to a kmalloc address. Do you think VMAP_STACK introduces many more cache clean operations? If so, they might fall in the same *set* as other memory accesses (like kmalloc operations) and then trigger the data abort. The CA53 CPUs on Hikey are an r3 revision, so luckily the workaround for erratum 855873 can be applied directly in ARM-TF. BTW, in case I misled you, note that two other errata workarounds are also applied in ARM-TF v1.4 for Hikey:
ERRATA_A53_836870 := 1
ERRATA_A53_843419 := 1
Thanks, Leo Yan ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support")
  2017-10-17  0:30 ` Leo Yan
@ 2017-10-17  9:29 ` Mark Rutland
  -1 siblings, 0 replies; 32+ messages in thread
From: Mark Rutland @ 2017-10-17  9:29 UTC (permalink / raw)
  To: Leo Yan
  Cc: Robin Murphy, Catalin Marinas, linux-kernel, linux-arm-kernel,
	ard.biesheuvel

On Tue, Oct 17, 2017 at 08:30:54AM +0800, Leo Yan wrote:
> On Mon, Oct 16, 2017 at 03:35:46PM +0100, Robin Murphy wrote:
> > On 16/10/17 15:26, Mark Rutland wrote:
> > > On Mon, Oct 16, 2017 at 03:12:45PM +0100, Robin Murphy wrote:
> > >> On 16/10/17 14:48, Mark Rutland wrote:
> > >>> On Mon, Oct 16, 2017 at 09:17:23AM +0800, Leo Yan wrote:
> > >>>> On Tue, Oct 10, 2017 at 05:03:44PM +0100, Robin Murphy wrote:
> > >>>>> On 10/10/17 16:45, Mark Rutland wrote:
> > >>>>>> On Tue, Oct 10, 2017 at 10:27:25PM +0800, Leo Yan wrote:
> > >>>>>>> I work mainline kernel on Hikey620 board, I find it's easily to
> > >>>>>>> introduce the panic and report the log as below. So I bisect the kernel
> > >>>>>>> and finally narrow down the commit e3067861ba66 ("arm64: add basic
> > >>>>>>> VMAP_STACK support") which introduce this issue.
> > >>>>>>>
> > >>>>>>> I tried to remove 'select HAVE_ARCH_VMAP_STACK' from
> > >>>>>>> arch/arm64/Kconfig, then I can see the panic issue will dismiss. So
> > >>>>>>> could you check this and have insight for this issue?

> > >>>> I enabled these debugging configs but cannot get clue from it; but
> > >>>> occasionally found this issue is quite likely related with CA53 errata,
> > >>>> especially ERRATA_A53_855873 is the relative one. So I changed to use
> > >>>> ARM-TF mainline code with ERRATA fixing, this issue can be dismissed.

> > >>> Just to confirm, with the updated firmware you no longer see the issue?
> > >>>
> > >>> I can't immediately see how that would be related.

> > I guess the vmap addresses might tickle the "same L2 set" condition
> > differently to when both stack and DMA buffer are linear map addresses.
>
> A bit more info for this.
>
> I can reproduce this memory abort panic, and the panic places are not
> consistent; usually it's related with kmalloc address. Do you think
> "VMAP_STACK" introduces much more operations for cache clean? If
> so it might be in the same *set* with any other memory access (like
> kmalloc operations), then trigger data abort.

VMAP_STACK doesn't introduce any explicit cache maintenance, but it's
possible that it causes more natural evictions.

That might explain why it triggers the issue.

> Hikey has CA53 CPUs is r3 version so it's luck can directly apply the
> ERRATA 855873 in ARM-TF.
>
> BTW, in case I may mislead you guys, we should note there have another
> two ERRATAs applied in ARM-TFv1.4 for Hikey:
>
> ERRATA_A53_836870 := 1
> ERRATA_A53_843419 := 1

Thanks for the extra info!

AFAICT, erratum 836870 results in livelock rather than memory
corruption, so I think we can ignore that.

I'm a little worried by erratum 843419. The VMAP_STACK patches changed
{adr,ldr}_this_cpu (and some users thereof), and it's possible we're
managing to tickle that issue.

If you still have an affected kernel, could you dump the output of:

  $ aarch64-linux-gnu-objdump -d vmlinux | grep -A 3 'ff[8c]:\s\+[a-f0-9]\+\s\+adrp'

... that would show us if there are any affected sequences.

From a quick scan of my own vmlinux build from commit e3067861ba66, I
didn't see any, but it's possible this depends on the config used.

Thanks,
Mark.

^ permalink raw reply [flat|nested] 32+ messages in thread
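The grep pattern in the message above selects `adrp` instructions whose address ends in `ff8` or `ffc`, that is, the last two instruction slots of a 4 KiB page. The same test can be written as a minimal Python sketch, assuming one `objdump -d` instruction per line. The helper names `parse_insn` and `adrp_at_page_end` are hypothetical and not part of any kernel tooling, and the check only finds candidates; it does not decide whether a sequence is actually affected by erratum 843419.

```python
import re

# One line of `objdump -d` output: address, raw encoding, mnemonic, operands.
INSN_RE = re.compile(r"^\s*([0-9a-f]+):\s+([0-9a-f]{8})\s+(\S+)\s*(.*)$")

PAGE_SIZE = 0x1000               # 4 KiB pages, as in the reported kernel
RISKY_OFFSETS = {0xFF8, 0xFFC}   # the last two instruction slots of a page


def parse_insn(line):
    """Return (address, mnemonic, operands) for a disassembly line, or None."""
    m = INSN_RE.match(line)
    if m is None:
        return None
    return int(m.group(1), 16), m.group(3), m.group(4).strip()


def adrp_at_page_end(line):
    """True if the line is an adrp located at page offset 0xff8 or 0xffc."""
    insn = parse_insn(line)
    if insn is None:
        return False
    addr, mnemonic, _operands = insn
    return mnemonic == "adrp" and (addr % PAGE_SIZE) in RISKY_OFFSETS
```

Feeding the disassembly through `adrp_at_page_end` line by line should pick out the same starting instructions as the grep, without the three lines of trailing context that `-A 3` adds.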
* Re: ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support")
  2017-10-17  9:29 ` Mark Rutland
@ 2017-10-17  9:32 ` Ard Biesheuvel
  -1 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2017-10-17  9:32 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Leo Yan, Robin Murphy, Catalin Marinas, linux-kernel, linux-arm-kernel

On 17 October 2017 at 10:29, Mark Rutland <mark.rutland@arm.com> wrote:
> On Tue, Oct 17, 2017 at 08:30:54AM +0800, Leo Yan wrote:
>> On Mon, Oct 16, 2017 at 03:35:46PM +0100, Robin Murphy wrote:
>> > On 16/10/17 15:26, Mark Rutland wrote:
>> > > On Mon, Oct 16, 2017 at 03:12:45PM +0100, Robin Murphy wrote:
>> > >> On 16/10/17 14:48, Mark Rutland wrote:
>> > >>> On Mon, Oct 16, 2017 at 09:17:23AM +0800, Leo Yan wrote:
>> > >>>> On Tue, Oct 10, 2017 at 05:03:44PM +0100, Robin Murphy wrote:
>> > >>>>> On 10/10/17 16:45, Mark Rutland wrote:
>> > >>>>>> On Tue, Oct 10, 2017 at 10:27:25PM +0800, Leo Yan wrote:
>> > >>>>>>> I work mainline kernel on Hikey620 board, I find it's easily to
>> > >>>>>>> introduce the panic and report the log as below. So I bisect the kernel
>> > >>>>>>> and finally narrow down the commit e3067861ba66 ("arm64: add basic
>> > >>>>>>> VMAP_STACK support") which introduce this issue.
>> > >>>>>>>
>> > >>>>>>> I tried to remove 'select HAVE_ARCH_VMAP_STACK' from
>> > >>>>>>> arch/arm64/Kconfig, then I can see the panic issue will dismiss. So
>> > >>>>>>> could you check this and have insight for this issue?
>
>> > >>>> I enabled these debugging configs but cannot get clue from it; but
>> > >>>> occasionally found this issue is quite likely related with CA53 errata,
>> > >>>> especially ERRATA_A53_855873 is the relative one. So I changed to use
>> > >>>> ARM-TF mainline code with ERRATA fixing, this issue can be dismissed.
>
>> > >>> Just to confirm, with the updated firmware you no longer see the issue?
>> > >>>
>> > >>> I can't immediately see how that would be related.
>
>> > I guess the vmap addresses might tickle the "same L2 set" condition
>> > differently to when both stack and DMA buffer are linear map addresses.
>>
>> A bit more info for this.
>>
>> I can reproduce this memory abort panic, and the panic places are not
>> consistent; usually it's related with kmalloc address. Do you think
>> "VMAP_STACK" introduces much more operations for cache clean? If
>> so it might be in the same *set* with any other memory access (like
>> kmalloc operations), then trigger data abort.
>
> VMAP_STACK doesn't introduce any explicit cache maintenance, but it's
> possible that it causes more natural evictions.
>
> That might explain why it triggers the issue.
>
>> Hikey has CA53 CPUs is r3 version so it's luck can directly apply the
>> ERRATA 855873 in ARM-TF.
>>
>> BTW, in case I may mislead you guys, we should note there have another
>> two ERRATAs applied in ARM-TFv1.4 for Hikey:
>>
>> ERRATA_A53_836870 := 1
>> ERRATA_A53_843419 := 1
>
> Thanks for the extra info!
>
> AFAICT, erratum 836870 results in livelock rather than memory
> corruption, so I think we can ignore that.
>
> I'm a little worried by erratum 843419. The VMAP_STACK patches changed
> {adr,ldr}_this_cpu (and some users thereof), and it's possible we're
> managing to tickle that issue.
>
> If you still have an affected kernel, could you dump the output of:
>
>   $ aarch64-linux-gnu-objdump -d vmlinux | grep -A 3 'ff[8c]:\s\+[a-f0-9]\+\s\+adrp'
>
> ... that would show us if there are any affected sequences.
>
> From a quick scan of my own vmlinux build from commit e3067861ba66, I
> didn't see any, but it's possible this depends on the config used.
>

The linker should take care of that: it scans the entire executable,
and inserts a veneer if an adrp happens to end up at a vulnerable
offset in the page.

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support")
  2017-10-17  9:32 ` Ard Biesheuvel
@ 2017-10-17  9:36 ` Leo Yan
  -1 siblings, 0 replies; 32+ messages in thread
From: Leo Yan @ 2017-10-17  9:36 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Mark Rutland, Robin Murphy, Catalin Marinas, linux-kernel,
	linux-arm-kernel

On Tue, Oct 17, 2017 at 10:32:21AM +0100, Ard Biesheuvel wrote:

[...]

> > AFAICT, erratum 836870 results in livelock rather than memory
> > corruption, so I think we can ignore that.
> >
> > I'm a little worried by erratum 843419. The VMAP_STACK patches changed
> > {adr,ldr}_this_cpu (and some users thereof), and it's possible we're
> > managing to tickle that issue.
> >
> > If you still have an affected kernel, could you dump the output of:
> >
> >   $ aarch64-linux-gnu-objdump -d vmlinux | grep -A 3 'ff[8c]:\s\+[a-f0-9]\+\s\+adrp'
> >
> > ... that would show us if there are any affected sequences.
> >
> > From a quick scan of my own vmlinux build from commit e3067861ba66, I
> > didn't see any, but it's possible this depends on the config used.
> >
>
> The linker should take care of that: it scans the entire executable,
> and inserts a veneer if an adrp happens to end up at a vulnerable
> offset in the page.

Is this dependent on any GCC version?

I am using GCC 6.2.1, so I get many affected sequences with Mark's command:

leoy@leoy-linaro:~/Work/reference/opensource/linux$ aarch64-linux-gnu-objdump -d vmlinux | grep -A 3 'ff[8c]:\s\+[a-f0-9]\+\s\+adrp'
ffff0000080a1ffc:  90007340  adrp  x0, ffff000008f09000 <page_wait_table+0x1280>
ffff0000080a2000:  a900fedf  stp   xzr, xzr, [x22,#8]
ffff0000080a2004:  91374000  add   x0, x0, #0xdd0
ffff0000080a2008:  f9000ec0  str   x0, [x22,#24]
--
ffff0000080b6ff8:  b0007ce0  adrp  x0, ffff000009053000 <chunk_hash_heads+0x680>
ffff0000080b6ffc:  52901801  mov   w1, #0x80c0  // #32960
ffff0000080b7000:  72a02801  movk  w1, #0x140, lsl #16
ffff0000080b7004:  9102e273  add   x19, x19, #0xb8
--
ffff0000080f1ff8:  d00070a1  adrp  x1, ffff000008f07000 <bit_wait_table+0xd80>
ffff0000080f1ffc:  f9406402  ldr   x2, [x0,#200]
ffff0000080f2000:  f9405c03  ldr   x3, [x0,#184]
ffff0000080f2004:  14002915  b     ffff0000080fc458 <e843419@00ce_00000e94_3c4>
--
ffff0000080feff8:  90000002  adrp  x2, ffff0000080fe000 <prio_changed_rt+0x88>
ffff0000080feffc:  9136e042  add   x2, x2, #0xdb8
ffff0000080ff000:  f9000462  str   x2, [x3,#8]
ffff0000080ff004:  f9448e62  ldr   x2, [x19,#2328]
--
ffff00000810affc:  f0006fe1  adrp  x1, ffff000008f09000 <page_wait_table+0x1280>
ffff00000810b000:  f94017a2  ldr   x2, [x29,#40]
ffff00000810b004:  aa1303e0  mov   x0, x19
ffff00000810b008:  1400013d  b     ffff00000810b4fc <e843419@00e7_000010c5_338>
--
ffff00000811dff8:  d0005c80  adrp  x0, ffff000008caf000 <kallsyms_token_index+0xaf00>
ffff00000811dffc:  912be021  add   x1, x1, #0xaf8
ffff00000811e000:  912ae000  add   x0, x0, #0xab8
ffff00000811e004:  97ffe6ab  bl    ffff000008117ab0 <printk>
--
ffff000008137ff8:  f0004443  adrp  x3, ffff0000089c2000 <clock_monotonic+0x50>
ffff000008137ffc:  910ac042  add   x2, x2, #0x2b0
ffff000008138000:  9113c063  add   x3, x3, #0x4f0
ffff000008138004:  71000c9f  cmp   w4, #0x3
--
ffff00000815eff8:  b0005aa1  adrp  x1, ffff000008cb3000 <kallsyms_token_index+0xef00>
ffff00000815effc:  91362021  add   x1, x1, #0xd88
ffff00000815f000:  97ffffc8  bl    ffff00000815ef20 <audit_log_format>
ffff00000815f004:  b94023a1  ldr   w1, [x29,#32]

^ permalink raw reply [flat|nested] 32+ messages in thread
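Two of the sequences in the dump above end in a branch to a symbol named `e843419@...`. Those names look like the veneers Ard describes the linker inserting, although that reading is an assumption based on the symbol naming and is not stated in the thread. A minimal Python sketch for listing such branches from disassembly text follows; the function name `veneer_branches` is hypothetical.

```python
import re

# An unconditional branch (`b`) whose target symbol starts with "e843419@".
# Treating such symbols as linker-generated erratum veneers is an assumption.
VENEER_BRANCH_RE = re.compile(
    r"^\s*([0-9a-f]+):\s+[0-9a-f]{8}\s+b\s+([0-9a-f]+)\s+<(e843419@[^>]+)>"
)


def veneer_branches(lines):
    """Return (branch_address, target_address, veneer_name) for each match."""
    found = []
    for line in lines:
        m = VENEER_BRANCH_RE.match(line)
        if m is not None:
            found.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
    return found
```

If that reading is right, a sequence that ends in such a branch has already been rewritten by the linker, which would fit with no build warning being expected for this toolchain.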
* Re: ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support")
  2017-10-17  9:36 ` Leo Yan
@ 2017-10-17  9:56 ` Mark Rutland
  -1 siblings, 0 replies; 32+ messages in thread
From: Mark Rutland @ 2017-10-17  9:56 UTC (permalink / raw)
  To: Leo Yan
  Cc: Ard Biesheuvel, Robin Murphy, Catalin Marinas, linux-kernel,
	linux-arm-kernel

On Tue, Oct 17, 2017 at 05:36:58PM +0800, Leo Yan wrote:
> On Tue, Oct 17, 2017 at 10:32:21AM +0100, Ard Biesheuvel wrote:
>
> [...]
>
> > > AFAICT, erratum 836870 results in livelock rather than memory
> > > corruption, so I think we can ignore that.
> > >
> > > I'm a little worried by erratum 843419. The VMAP_STACK patches changed
> > > {adr,ldr}_this_cpu (and some users thereof), and it's possible we're
> > > managing to tickle that issue.
> > >
> > > If you still have an affected kernel, could you dump the output of:
> > >
> > >   $ aarch64-linux-gnu-objdump -d vmlinux | grep -A 3 'ff[8c]:\s\+[a-f0-9]\+\s\+adrp'
> > >
> > > ... that would show us if there are any affected sequences.
> > >
> > > From a quick scan of my own vmlinux build from commit e3067861ba66, I
> > > didn't see any, but it's possible this depends on the config used.
> > >
> >
> > The linker should take care of that: it scans the entire executable,
> > and inserts a veneer if an adrp happens to end up at a vulnerable
> > offset in the page.
>
> Is this dependent on any GCC version?

It is, but we should warn if CONFIG_ARM64_ERRATUM_843419 is selected and
the linker doesn't support the --fix-cortex-a53-843419 option:

  ld does not support --fix-cortex-a53-843419; kernel may be susceptible to erratum

... do you see this when building the kernel?

> I am using GCC 6.2.1, so I get many affected sequences with Mark's command:

I believe these are all benign. AFAICT, none of these meet the conditions for
sequence 1 or sequence 2 affected by the erratum, e.g. many don't have
loads/stores using the adrp result.

Thanks,
Mark.

> leoy@leoy-linaro:~/Work/reference/opensource/linux$ aarch64-linux-gnu-objdump -d vmlinux | grep -A 3 'ff[8c]:\s\+[a-f0-9]\+\s\+adrp'
> ffff0000080a1ffc:  90007340  adrp  x0, ffff000008f09000 <page_wait_table+0x1280>
> ffff0000080a2000:  a900fedf  stp   xzr, xzr, [x22,#8]
> ffff0000080a2004:  91374000  add   x0, x0, #0xdd0
> ffff0000080a2008:  f9000ec0  str   x0, [x22,#24]
> --
> ffff0000080b6ff8:  b0007ce0  adrp  x0, ffff000009053000 <chunk_hash_heads+0x680>
> ffff0000080b6ffc:  52901801  mov   w1, #0x80c0  // #32960
> ffff0000080b7000:  72a02801  movk  w1, #0x140, lsl #16
> ffff0000080b7004:  9102e273  add   x19, x19, #0xb8
> --
> ffff0000080f1ff8:  d00070a1  adrp  x1, ffff000008f07000 <bit_wait_table+0xd80>
> ffff0000080f1ffc:  f9406402  ldr   x2, [x0,#200]
> ffff0000080f2000:  f9405c03  ldr   x3, [x0,#184]
> ffff0000080f2004:  14002915  b     ffff0000080fc458 <e843419@00ce_00000e94_3c4>
> --
> ffff0000080feff8:  90000002  adrp  x2, ffff0000080fe000 <prio_changed_rt+0x88>
> ffff0000080feffc:  9136e042  add   x2, x2, #0xdb8
> ffff0000080ff000:  f9000462  str   x2, [x3,#8]
> ffff0000080ff004:  f9448e62  ldr   x2, [x19,#2328]
> --
> ffff00000810affc:  f0006fe1  adrp  x1, ffff000008f09000 <page_wait_table+0x1280>
> ffff00000810b000:  f94017a2  ldr   x2, [x29,#40]
> ffff00000810b004:  aa1303e0  mov   x0, x19
> ffff00000810b008:  1400013d  b     ffff00000810b4fc <e843419@00e7_000010c5_338>
> --
> ffff00000811dff8:  d0005c80  adrp  x0, ffff000008caf000 <kallsyms_token_index+0xaf00>
> ffff00000811dffc:  912be021  add   x1, x1, #0xaf8
> ffff00000811e000:  912ae000  add   x0, x0, #0xab8
> ffff00000811e004:  97ffe6ab  bl    ffff000008117ab0 <printk>
> --
> ffff000008137ff8:  f0004443  adrp  x3, ffff0000089c2000 <clock_monotonic+0x50>
> ffff000008137ffc:  910ac042  add   x2, x2, #0x2b0
> ffff000008138000:  9113c063  add   x3, x3, #0x4f0
> ffff000008138004:  71000c9f  cmp   w4, #0x3
> --
> ffff00000815eff8:  b0005aa1  adrp  x1, ffff000008cb3000 <kallsyms_token_index+0xef00>
> ffff00000815effc:  91362021  add   x1, x1, #0xd88
> ffff00000815f000:  97ffffc8  bl    ffff00000815ef20 <audit_log_format>
> ffff00000815f004:  b94023a1  ldr   w1, [x29,#32]

^ permalink raw reply [flat|nested] 32+ messages in thread
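Mark's observation that many of the sequences have no load or store using the adrp result can be checked mechanically. The Python sketch below is a deliberately simplified heuristic and not the full condition for sequence 1 or sequence 2 of erratum 843419: it only asks whether any load or store after the `adrp` in a window uses the `adrp` destination register as its base. The function name `adrp_result_used_as_base` is hypothetical, and treating every mnemonic that starts with `ld` or `st` as a load or store is an assumption made for brevity.

```python
import re

# One line of `objdump -d` output: address, raw encoding, mnemonic, operands.
INSN_RE = re.compile(r"^\s*([0-9a-f]+):\s+([0-9a-f]{8})\s+(\S+)\s*(.*)$")
# The base register is the first token inside the square brackets.
BASE_REG_RE = re.compile(r"\[\s*(\w+)")


def adrp_result_used_as_base(window):
    """window: disassembly lines, the first of which should be the adrp.

    Returns True if a later load/store in the window addresses memory
    through the register written by the adrp (simplified heuristic).
    """
    parsed = [INSN_RE.match(line) for line in window]
    if not parsed or parsed[0] is None or parsed[0].group(3) != "adrp":
        return False
    dest = parsed[0].group(4).split(",")[0].strip()
    for m in parsed[1:]:
        if m is None:
            continue
        mnemonic, operands = m.group(3), m.group(4)
        if not mnemonic.startswith(("ld", "st")):
            continue  # assumption: only ld*/st* mnemonics access memory here
        base = BASE_REG_RE.search(operands)
        if base is not None and base.group(1) == dest:
            return True
    return False
```

Applied to the first group in the quoted dump, where `adrp` writes `x0` and both stores go through `x22`, the helper returns `False`, which matches the reasoning in the reply.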
* Re: ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support")
  2017-10-17 9:56 ` Mark Rutland
@ 2017-10-18 6:33 ` Leo Yan
  -1 siblings, 0 replies; 32+ messages in thread
From: Leo Yan @ 2017-10-18 6:33 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Ard Biesheuvel, Robin Murphy, Catalin Marinas, linux-kernel, linux-arm-kernel

On Tue, Oct 17, 2017 at 10:56:43AM +0100, Mark Rutland wrote:
> On Tue, Oct 17, 2017 at 05:36:58PM +0800, Leo Yan wrote:
> > On Tue, Oct 17, 2017 at 10:32:21AM +0100, Ard Biesheuvel wrote:
> >
> > [...]
> >
> > > > AFAICT, erratum 836870 results in livelock rather than memory
> > > > corruption, so I think we can ignore that.
> > > >
> > > > I'm a little worried by erratum 843419. The VMAP_STACK patches changed
> > > > {adr,ldr}_this_cpu (and some users thereof), and it's possible we're
> > > > managing to tickle that issue.
> > > >
> > > > If you still have an affected kernel, could you dump the output of:
> > > >
> > > > $ aarch64-linux-gnu-objdump -d vmlinux | grep -A 3 'ff[8c]:\s\+[a-f0-9]\+\s\+adrp'
> > > >
> > > > ... that would show us if there are any affected sequences.
> > > >
> > > > From a quick scan of my own vmlinux build from commit e3067861ba66, I
> > > > didn't see any, but it's possible this depends on the config used.
> > > >
> > >
> > > The linker should take care of that: it scans the entire executable,
> > > and inserts a veneer if an adrp happens to end up at a vulnerable
> > > offset in the page.
> >
> > Is this dependent on any GCC version?
>
> It is, but we should warn if CONFIG_ARM64_ERRATUM_843419 is selected and
> the linker doesn't support the --fix-cortex-a53-843419 option:
>
> ld does not support --fix-cortex-a53-843419; kernel may be susceptible to erratum)
>
> ... do you see this when building the kernel?

No, I don't see this build warning. Thank you and Ard for the confirmation.

> > I am using GCC 6.2.1, so I get many affected sequences with Mark's command:
>
> I believe these are all benign. AFAICT, none of these meet the conditions for
> sequence 1 or sequence 2 affected by the erratum. e.g. many don't have
> loads/stores using the adrp result.
>
> Thanks,
> Mark.
>
> > leoy@leoy-linaro:~/Work/reference/opensource/linux$ aarch64-linux-gnu-objdump -d vmlinux | grep -A 3 'ff[8c]:\s\+[a-f0-9]\+\s\+adrp'
> > ffff0000080a1ffc: 90007340 adrp x0, ffff000008f09000 <page_wait_table+0x1280>
> > ffff0000080a2000: a900fedf stp xzr, xzr, [x22,#8]
> > ffff0000080a2004: 91374000 add x0, x0, #0xdd0
> > ffff0000080a2008: f9000ec0 str x0, [x22,#24]
> > --
> > ffff0000080b6ff8: b0007ce0 adrp x0, ffff000009053000 <chunk_hash_heads+0x680>
> > ffff0000080b6ffc: 52901801 mov w1, #0x80c0 // #32960
> > ffff0000080b7000: 72a02801 movk w1, #0x140, lsl #16
> > ffff0000080b7004: 9102e273 add x19, x19, #0xb8
> > --
> > ffff0000080f1ff8: d00070a1 adrp x1, ffff000008f07000 <bit_wait_table+0xd80>
> > ffff0000080f1ffc: f9406402 ldr x2, [x0,#200]
> > ffff0000080f2000: f9405c03 ldr x3, [x0,#184]
> > ffff0000080f2004: 14002915 b ffff0000080fc458 <e843419@00ce_00000e94_3c4>
> > --
> > ffff0000080feff8: 90000002 adrp x2, ffff0000080fe000 <prio_changed_rt+0x88>
> > ffff0000080feffc: 9136e042 add x2, x2, #0xdb8
> > ffff0000080ff000: f9000462 str x2, [x3,#8]
> > ffff0000080ff004: f9448e62 ldr x2, [x19,#2328]
> > --
> > ffff00000810affc: f0006fe1 adrp x1, ffff000008f09000 <page_wait_table+0x1280>
> > ffff00000810b000: f94017a2 ldr x2, [x29,#40]
> > ffff00000810b004: aa1303e0 mov x0, x19
> > ffff00000810b008: 1400013d b ffff00000810b4fc <e843419@00e7_000010c5_338>
> > --
> > ffff00000811dff8: d0005c80 adrp x0, ffff000008caf000 <kallsyms_token_index+0xaf00>
> > ffff00000811dffc: 912be021 add x1, x1, #0xaf8
> > ffff00000811e000: 912ae000 add x0, x0, #0xab8
> > ffff00000811e004: 97ffe6ab bl ffff000008117ab0 <printk>
> > --
> > ffff000008137ff8: f0004443 adrp x3, ffff0000089c2000 <clock_monotonic+0x50>
> > ffff000008137ffc: 910ac042 add x2, x2, #0x2b0
> > ffff000008138000: 9113c063 add x3, x3, #0x4f0
> > ffff000008138004: 71000c9f cmp w4, #0x3
> > --
> > ffff00000815eff8: b0005aa1 adrp x1, ffff000008cb3000 <kallsyms_token_index+0xef00>
> > ffff00000815effc: 91362021 add x1, x1, #0xd88
> > ffff00000815f000: 97ffffc8 bl ffff00000815ef20 <audit_log_format>
> > ffff00000815f004: b94023a1 ldr w1, [x29,#32]

^ permalink raw reply [flat|nested] 32+ messages in thread
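Mark's observation that many of these groups have no loads/stores using the adrp result can be checked mechanically. The following is only a rough Python sketch of that one check, not the full sequence 1 / sequence 2 conditions of erratum 843419. The function names are hypothetical, the input is assumed to be `objdump -d` lines in the layout quoted above, and the synthetic "positive" group in the usage note is invented for illustration.

```python
import re

# One objdump -d line: "<address>: <8-hex-digit encoding> <mnemonic> <operands>"
LINE_RE = re.compile(r"^\s*([0-9a-f]+):\s+([0-9a-f]{8})\s+(\S+)\s*(.*)$")

# Mnemonic prefixes treated as loads/stores (a simplification, e.g. no ldur/stur).
MEM_PREFIXES = ("ldr", "str", "ldp", "stp")


def parse(line):
    """Return (address, mnemonic, operands) or None if the line does not match."""
    m = LINE_RE.match(line)
    if not m:
        return None
    addr, _encoding, mnemonic, operands = m.groups()
    return int(addr, 16), mnemonic, operands


def uses_as_base(operands, reg):
    """True if `reg` is the base register inside the [...] addressing part."""
    m = re.search(r"\[(\w+)", operands)
    return bool(m) and m.group(1) == reg


def adrp_result_used_as_base(group):
    """group: the adrp line followed by the next few instructions.

    Returns True only if the adrp sits at page offset 0xff8 or 0xffc and a
    later load/store in the group uses the adrp destination as its base.
    """
    insns = [p for p in map(parse, group) if p]
    if not insns or insns[0][1] != "adrp":
        return False
    addr, _mnemonic, operands = insns[0]
    if addr & 0xfff not in (0xff8, 0xffc):
        return False
    dest = operands.split(",")[0].strip()
    return any(
        mnemonic.startswith(MEM_PREFIXES) and uses_as_base(ops, dest)
        for _addr, mnemonic, ops in insns[1:]
    )
```

Applied to the first group quoted above (adrp x0 followed by stores based on x22), the check returns False, which matches Mark's reading. A True result would only mean the group deserves a closer look against the erratum notice.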
* Re: ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support")
  2017-10-17 9:36 ` Leo Yan
@ 2017-10-17 9:57 ` Ard Biesheuvel
  -1 siblings, 0 replies; 32+ messages in thread
From: Ard Biesheuvel @ 2017-10-17 9:57 UTC (permalink / raw)
  To: Leo Yan
  Cc: Mark Rutland, Robin Murphy, Catalin Marinas, linux-kernel, linux-arm-kernel

On 17 October 2017 at 10:36, Leo Yan <leo.yan@linaro.org> wrote:
> On Tue, Oct 17, 2017 at 10:32:21AM +0100, Ard Biesheuvel wrote:
>
> [...]
>
>> > AFAICT, erratum 836870 results in livelock rather than memory
>> > corruption, so I think we can ignore that.
>> >
>> > I'm a little worried by erratum 843419. The VMAP_STACK patches changed
>> > {adr,ldr}_this_cpu (and some users thereof), and it's possible we're
>> > managing to tickle that issue.
>> >
>> > If you still have an affected kernel, could you dump the output of:
>> >
>> > $ aarch64-linux-gnu-objdump -d vmlinux | grep -A 3 'ff[8c]:\s\+[a-f0-9]\+\s\+adrp'
>> >
>> > ... that would show us if there are any affected sequences.
>> >
>> > From a quick scan of my own vmlinux build from commit e3067861ba66, I
>> > didn't see any, but it's possible this depends on the config used.
>> >
>>
>> The linker should take care of that: it scans the entire executable,
>> and inserts a veneer if an adrp happens to end up at a vulnerable
>> offset in the page.
>
> Is this dependent on any GCC version?
>
> I am using GCC 6.2.1, so I get many affected sequences with Mark's command:
>
> leoy@leoy-linaro:~/Work/reference/opensource/linux$ aarch64-linux-gnu-objdump -d vmlinux | grep -A 3 'ff[8c]:\s\+[a-f0-9]\+\s\+adrp'
> ffff0000080a1ffc: 90007340 adrp x0, ffff000008f09000 <page_wait_table+0x1280>
> ffff0000080a2000: a900fedf stp xzr, xzr, [x22,#8]
> ffff0000080a2004: 91374000 add x0, x0, #0xdd0
> ffff0000080a2008: f9000ec0 str x0, [x22,#24]
> --
> ffff0000080b6ff8: b0007ce0 adrp x0, ffff000009053000 <chunk_hash_heads+0x680>
> ffff0000080b6ffc: 52901801 mov w1, #0x80c0 // #32960
> ffff0000080b7000: 72a02801 movk w1, #0x140, lsl #16
> ffff0000080b7004: 9102e273 add x19, x19, #0xb8
> --
> ffff0000080f1ff8: d00070a1 adrp x1, ffff000008f07000 <bit_wait_table+0xd80>
> ffff0000080f1ffc: f9406402 ldr x2, [x0,#200]
> ffff0000080f2000: f9405c03 ldr x3, [x0,#184]
> ffff0000080f2004: 14002915 b ffff0000080fc458 <e843419@00ce_00000e94_3c4>

This is a branch to a veneer

> --
> ffff0000080feff8: 90000002 adrp x2, ffff0000080fe000 <prio_changed_rt+0x88>
> ffff0000080feffc: 9136e042 add x2, x2, #0xdb8
> ffff0000080ff000: f9000462 str x2, [x3,#8]
> ffff0000080ff004: f9448e62 ldr x2, [x19,#2328]
> --
> ffff00000810affc: f0006fe1 adrp x1, ffff000008f09000 <page_wait_table+0x1280>
> ffff00000810b000: f94017a2 ldr x2, [x29,#40]
> ffff00000810b004: aa1303e0 mov x0, x19
> ffff00000810b008: 1400013d b ffff00000810b4fc <e843419@00e7_000010c5_338>

And this as well

> --
> ffff00000811dff8: d0005c80 adrp x0, ffff000008caf000 <kallsyms_token_index+0xaf00>
> ffff00000811dffc: 912be021 add x1, x1, #0xaf8
> ffff00000811e000: 912ae000 add x0, x0, #0xab8
> ffff00000811e004: 97ffe6ab bl ffff000008117ab0 <printk>
> --
> ffff000008137ff8: f0004443 adrp x3, ffff0000089c2000 <clock_monotonic+0x50>
> ffff000008137ffc: 910ac042 add x2, x2, #0x2b0
> ffff000008138000: 9113c063 add x3, x3, #0x4f0
> ffff000008138004: 71000c9f cmp w4, #0x3
> --
> ffff00000815eff8: b0005aa1 adrp x1, ffff000008cb3000 <kallsyms_token_index+0xef00>
> ffff00000815effc: 91362021 add x1, x1, #0xd88
> ffff00000815f000: 97ffffc8 bl ffff00000815ef20 <audit_log_format>
> ffff00000815f004: b94023a1 ldr w1, [x29,#32]
>

... so it seems the linker is doing its job, and updating the affected
sequences, and leaving the other ones alone.

^ permalink raw reply [flat|nested] 32+ messages in thread
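The two veneer branches Ard points out can also be picked out of the listing automatically. The sketch below assumes, based only on the symbol names visible in the listing above, that veneer targets carry an `e843419@` prefix; the function name `veneer_target` is hypothetical and the pattern is not taken from any linker documentation.

```python
import re

# An unconditional branch "b <address> <e843419@...>" as printed by objdump -d.
# The symbol character set is inferred from the two examples in the listing.
VENEER_RE = re.compile(r"\bb\s+[0-9a-f]+\s+<(e843419@[0-9a-f_]+)>")


def veneer_target(line):
    """Return the veneer symbol a line branches to, or None if there is none."""
    m = VENEER_RE.search(line)
    return m.group(1) if m else None
```

Running `veneer_target` over each line of the listing should report a symbol only for the two `b` instructions annotated above, while `bl` calls such as the one to `printk` are ignored.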
* Re: ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support")
  2017-10-16 13:48 ` Mark Rutland
@ 2017-10-17 0:33 ` Leo Yan
  -1 siblings, 0 replies; 32+ messages in thread
From: Leo Yan @ 2017-10-17 0:33 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Robin Murphy, Catalin Marinas, linux-kernel, linux-arm-kernel, ard.biesheuvel

On Mon, Oct 16, 2017 at 02:48:19PM +0100, Mark Rutland wrote:
> Hi Leo,
>
> On Mon, Oct 16, 2017 at 09:17:23AM +0800, Leo Yan wrote:
> > On Tue, Oct 10, 2017 at 05:03:44PM +0100, Robin Murphy wrote:
> > > On 10/10/17 16:45, Mark Rutland wrote:
> > > > On Tue, Oct 10, 2017 at 10:27:25PM +0800, Leo Yan wrote:
> > > >> I am running the mainline kernel on a Hikey620 board, and I find it
> > > >> easy to trigger the panic reported in the log below. I bisected the
> > > >> kernel and narrowed it down to commit e3067861ba66 ("arm64: add basic
> > > >> VMAP_STACK support"), which introduces this issue.
> > > >>
> > > >> I tried removing 'select HAVE_ARCH_VMAP_STACK' from
> > > >> arch/arm64/Kconfig, and the panic goes away. Could you check this
> > > >> and share any insight into the issue?
> > > >
> > > > Given the stuff in the backtrace, my suspicion is something is trying to
> > > > perform DMA to/from the stack, getting junk addresses from the attempted
> > > > virt<->phys conversions.
> > > >
> > > > Could you try enabling both VMAP_STACK and CONFIG_DEBUG_VIRTUAL?
> > >
> > > CONFIG_DMA_API_DEBUG should scream about drivers trying to use stack
> > > addresses either way, too.
> >
> > Thanks for the suggestions, Mark & Robin.
> >
> > I enabled these debugging configs but could not get a clue from them.
> > However, I happened to find that this issue is quite likely related to
> > the CA53 errata, with ERRATA_A53_855873 being the relevant one. After I
> > switched to ARM-TF mainline code with the errata fixes, the issue went
> > away.
>
> Thanks for the update.
>
> Just to confirm, with the updated firmware you no longer see the issue?

Yes.

> I can't immediately see how that would be related.
>
> Thanks,
> Mark.

^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2017-10-18 6:33 UTC | newest] Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-10-10 14:27 ARM64: Regression with commit e3067861ba66 ("arm64: add basic VMAP_STACK support") Leo Yan 2017-10-10 14:27 ` Leo Yan 2017-10-10 15:45 ` Mark Rutland 2017-10-10 15:45 ` Mark Rutland 2017-10-10 16:03 ` Robin Murphy 2017-10-10 16:03 ` Robin Murphy 2017-10-16 1:17 ` Leo Yan 2017-10-16 1:17 ` Leo Yan 2017-10-16 13:48 ` Mark Rutland 2017-10-16 13:48 ` Mark Rutland 2017-10-16 14:12 ` Robin Murphy 2017-10-16 14:12 ` Robin Murphy 2017-10-16 14:26 ` Mark Rutland 2017-10-16 14:26 ` Mark Rutland 2017-10-16 14:35 ` Robin Murphy 2017-10-16 14:35 ` Robin Murphy 2017-10-17 0:30 ` Leo Yan 2017-10-17 0:30 ` Leo Yan 2017-10-17 9:29 ` Mark Rutland 2017-10-17 9:29 ` Mark Rutland 2017-10-17 9:32 ` Ard Biesheuvel 2017-10-17 9:32 ` Ard Biesheuvel 2017-10-17 9:36 ` Leo Yan 2017-10-17 9:36 ` Leo Yan 2017-10-17 9:56 ` Mark Rutland 2017-10-17 9:56 ` Mark Rutland 2017-10-18 6:33 ` Leo Yan 2017-10-18 6:33 ` Leo Yan 2017-10-17 9:57 ` Ard Biesheuvel 2017-10-17 9:57 ` Ard Biesheuvel 2017-10-17 0:33 ` Leo Yan 2017-10-17 0:33 ` Leo Yan