linux-edac.vger.kernel.org archive mirror
* edac KASAN warning in experimental arm64 allmodconfig boot
@ 2019-10-14 15:18 John Garry
  2019-10-14 16:09 ` Borislav Petkov
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: John Garry @ 2019-10-14 15:18 UTC
  To: Borislav Petkov, Mauro Carvalho Chehab, James Morse, tony.luck,
	Robert Richter
  Cc: linux-edac, linux-kernel

Hi guys,

I'm experimenting by trying to boot an allmodconfig arm64 kernel, as 
mentioned here:
https://lore.kernel.org/linux-arm-kernel/507325a3-030e-2843-0f46-7e18c60257de@huawei.com/

One thing that I noticed - it's hard to miss actually - is the amount of 
complaining from KASAN about the EDAC/ghes code. Maybe this is something 
I should not care about/red herring, or maybe something genuine. Let me 
know what you think.

The kernel is v5.4-rc3, and I raised the EDAC mc debug level to get 
extra debug prints.

Log below, Thanks,
John

Log snippet (I cut off after the first KASAN warning):

[   70.471011][    T1] random: get_random_u32 called from 
new_slab+0x360/0x698 with crng_init=0
[   70.478671][    T1] [Firmware Bug]: APEI: Invalid bit width + offset 
in GAR [0x94110034/64/0/3/0]
[   70.526585][    T1] EDAC DEBUG: edac_mc_alloc: allocating 3524 bytes 
for mci data (32 dimms, 32 csrows/channels)
[   70.542013][    T1] EDAC DEBUG: ghes_edac_dmidecode: DIMM2: 
Registered-DDR4 size = 16384 MB(ECC)
[   70.551044][    T1] EDAC DEBUG: ghes_edac_dmidecode:         type 26, 
detail 0x2080, width 72(total 64)
[   70.559986][    T1] EDAC DEBUG: edac_mc_add_mc_with_groups:
[   70.567082][    T1] EDAC DEBUG: edac_create_sysfs_mci_device: device 
mc0 created
[   70.575608][    T1] EDAC DEBUG: edac_create_dimm_object: device dimm2 
created at location memory 2
[   70.585818][    T1] EDAC DEBUG: edac_create_csrow_object: device 
csrow2 created
[   70.594110][    T1] EDAC MC0: Giving out device to module ghes_edac.c 
controller ghes_edac: DEV ghes (INTERRUPT)
[   70.605936][    T1] EDAC DEBUG: edac_mc_del_mc:
[   70.611188][    T1] EDAC DEBUG: edac_remove_sysfs_mci_device:
[   70.619443][    T1] random: get_random_u32 called from 
kobject_put+0x8c/0x190 with crng_init=0
[   70.628163][    T1] kobject: 'csrow2' ((____ptrval____)): 
kobject_release, parent (____ptrval____) (delayed 750)
[   70.638477][    T1] EDAC DEBUG: edac_remove_sysfs_mci_device: 
unregistering device dimm2
[   70.647903][    T1] kobject: 'dimm2' ((____ptrval____)): 
kobject_release, parent (____ptrval____) (delayed 250)
[   70.658105][    T1] EDAC MC: Removed device 0 for ghes_edac.c 
ghes_edac: DEV ghes
[   70.665673][    T1] EDAC DEBUG: edac_mc_free:
[   70.670211][    T1] EDAC DEBUG: edac_unregister_sysfs: unregistering 
device mc0
[   70.679027][    T1] kobject: 'mc0' ((____ptrval____)): 
kobject_release, parent (____ptrval____) (delayed 500)
[   70.690987][    T1] EDAC DEBUG: edac_mc_del_mc:
[   70.695769][    T1] EDAC DEBUG: edac_mc_free:
[   70.700412][    T1] ------------[ cut here ]------------
[   70.705832][    T1] ODEBUG: free active (active state 0) object type: 
timer_list hint: delayed_work_timer_fn+0x0/0x48
[   70.716663][    T1] WARNING: CPU: 50 PID: 1 at lib/debugobjects.c:484 
debug_print_object+0xec/0x130
[   70.725721][    T1] Modules linked in:
[   70.729491][    T1] CPU: 50 PID: 1 Comm: swapper/0 Not tainted 
5.4.0-rc3+ #1146
[   70.736811][    T1] Hardware name: Huawei D06 /D06, BIOS Hisilicon 
D06 UEFI RC0 - V1.16.01 03/15/2019
[   70.746039][    T1] pstate: 80800009 (Nzcv daif -PAN +UAO)
[   70.746056][    T1] pc : debug_print_object+0xec/0x130
[   70.756681][    T1] lr : debug_print_object+0xec/0x130
[   70.756691][    T1] sp : ffff0020bf2c7740
[   70.756699][    T1] x29: ffff0020bf2c7740 x28: ffff0023242c5000
[   70.756715][    T1] x27: ffff0023242c5090 x26: ffffa00017543de0
[   70.756730][    T1] x25: ffffa000101cd558 x24: ffffa00012051fc0
[   70.756750][    T1] x23: ffffa000150d2200 x22: ffffa000120523a0
[   70.765894][    T1] x21: ffffa00012051640 x20: 0000000000000000
[   70.765910][    T1] x19: ffffa00015019000 x18: 00000000000025a8
[   70.765924][    T1] x17: 00000000000025a0 x16: 00000000000026b0
[   70.765939][    T1] x15: 0000000000001470 x14: 64203a746e696820
[   70.765954][    T1] x13: 7473696c5f72656d x12: 1fffe00417e58e5a
[   70.777974][    T1] x11: ffff800417e58e5a x10: dfffa00000000000
[   70.789995][    T1] x9 : ffff800417e58e5b x8 : 0000000000000001
[   70.790011][    T1] x7 : ffff0020bf2c72d7 x6 : ffff800417e58e5b
[   70.790026][    T1] x5 : 1fffe00417e57936 x4 : ffff0020bf2bc058
[   70.790041][    T1] x3 : ffffa00010000000 x2 : ffff800417e58eb0
[   70.790055][    T1] x1 : f8aafc30f531b000 x0 : 0000000000000000
[   70.802080][    T1] Call trace:
[   70.802093][    T1]  debug_print_object+0xec/0x130
[   70.802106][    T1]  __debug_check_no_obj_freed+0x114/0x290
[   70.802119][    T1]  debug_check_no_obj_freed+0x18/0x28
[   70.802130][    T1]  slab_free_freelist_hook+0x18c/0x228
[   70.802140][    T1]  kfree+0x264/0x420
[   70.802157][    T1]  _edac_mc_free+0x6c/0x210
[   70.814163][    T1]  edac_mc_free+0x68/0x88
[   70.814177][    T1]  ghes_edac_unregister+0x44/0x70
[   70.814193][    T1]  ghes_remove+0x274/0x2a0
[   70.814207][    T1]  platform_drv_remove+0x44/0x78
[   70.814217][    T1]  really_probe+0x404/0x840
[   70.814228][    T1]  driver_probe_device+0x190/0x1f0
[   70.814239][    T1]  device_driver_attach+0x7c/0xb0
[   70.814249][    T1]  __driver_attach+0x1b8/0x1d0
[   70.814261][    T1]  bus_for_each_dev+0xf8/0x190
[   70.814277][    T1]  driver_attach+0x34/0x40
[   70.826289][    T1]  bus_add_driver+0x1d8/0x340
[   70.826301][    T1]  driver_register+0x168/0x1e8
[   70.826312][    T1]  __platform_driver_register+0x80/0x90
[   70.826326][    T1]  ghes_init+0xc4/0x174
[   70.826338][    T1]  do_one_initcall+0x328/0x788
[   70.826356][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   70.838361][    T1]  kernel_init+0x18/0x178
[   70.838373][    T1]  ret_from_fork+0x10/0x18
[   70.838381][    T1] irq event stamp: 4398006
[   70.838394][    T1] hardirqs last  enabled at (4398005): 
[<ffffa000100c0e78>] el1_irq+0x138/0x200
[   70.838409][    T1] hardirqs last disabled at (4398006): 
[<ffffa000100fd884>] debug_exception_enter+0x8c/0x190
[   70.838422][    T1] softirqs last  enabled at (4398004): 
[<ffffa000100bf4a4>] __do_softirq+0x894/0x920
[   70.838439][    T1] softirqs last disabled at (4397997): 
[<ffffa000101965e4>] irq_exit+0x114/0x1a0
[   70.875171][    T1] ---[ end trace a9b7b2cbbb0f7263 ]---
[   70.885805][    T1] ------------[ cut here ]------------
[   70.892929][    T1] ODEBUG: free active (active state 0) object type: 
timer_list hint: delayed_work_timer_fn+0x0/0x48
[   70.907197][    T1] WARNING: CPU: 50 PID: 1 at lib/debugobjects.c:484 
debug_print_object+0xec/0x130
[   70.916349][    T1] Modules linked in:
[   70.916368][    T1] CPU: 50 PID: 1 Comm: swapper/0 Tainted: G 
W         5.4.0-rc3+ #1146
[   70.916378][    T1] Hardware name: Huawei D06 /D06, BIOS Hisilicon 
D06 UEFI RC0 - V1.16.01 03/15/2019
[   70.916388][    T1] pstate: 80800009 (Nzcv daif -PAN +UAO)
[   70.916400][    T1] pc : debug_print_object+0xec/0x130
[   70.916412][    T1] lr : debug_print_object+0xec/0x130
[   70.916424][    T1] sp : ffff0020bf2c7740
[   70.925916][    T1] x29: ffff0020bf2c7740 x28: ffff00232427a000
[   70.925933][    T1] x27: ffff00232427a090 x26: ffffa00017543de0
[   70.925948][    T1] x25: ffffa000101cd558 x24: ffffa00012051fc0
[   70.925963][    T1] x23: ffffa000150d2200 x22: ffffa000120523a0
[   70.971505][    T1] x21: ffffa00012051640 x20: 0000000000000000
[   70.984654][    T1] x19: ffffa00015019000 x18: 00000000000025a8
[   70.984671][    T1] x17: 00000000000025a0 x16: 00000000000026b0
[   70.984685][    T1] x15: 0000000000001470 x14: 726f775f64657961
[   70.984701][    T1] x13: 6c6564203a746e69 x12: 1fffe00417e58e5a
[   71.004012][    T1] x11: ffff800417e58e5a x10: dfffa00000000000
[   71.004028][    T1] x9 : ffff800417e58e5b x8 : 0000000000000001
[   71.004043][    T1] x7 : ffff0020bf2c72d7 x6 : ffff800417e58e5b
[   71.004058][    T1] x5 : 1fffe00417e57936 x4 : ffff0020bf2bc058
[   71.034246][    T1] x3 : ffffa00010000000 x2 : ffff800417e58eb0
[   71.047049][    T1] x1 : f8aafc30f531b000 x0 : 0000000000000000
[   71.047065][    T1] Call trace:
[   71.047078][    T1]  debug_print_object+0xec/0x130
[   71.047090][    T1]  __debug_check_no_obj_freed+0x114/0x290
[   71.047103][    T1]  debug_check_no_obj_freed+0x18/0x28
[   71.047114][    T1]  slab_free_freelist_hook [...] edac_mc_free+0x68/0x88
[   71.065065][    T1]  ghes_edac_unregister+0x44/0x70
[   71.065077][    T1]  ghes_remove+0x274/0x2a0
[   71.065088][    T1]  platform_drv_remove+0x44/0x78
[   71.065099][    T1]  really_probe+0x404/0x840
[   71.065112][    T1]  driver_probe_device+0x190/0x1f0
[   71.132887][    T1]  device_driver_attach+0x7c/0xb0
[   71.132898][    T1]  __driver_attach+0x1b8/0x1d0
[   71.132911][    T1]  bus_for_each_dev+0xf8/0x190
[   71.132921][    T1]  driver_attach+0x34/0x40
[   71.132931][    T1]  bus_add_driver+0x1d8/0x340
[   71.132942][    T1]  driver_register+0x168/0x1e8
[   71.132953][    T1]  __platform_driver_register+0x80/0x90
[   71.132964][    T1]  ghes_init+0xc4/0x174
[   71.132975][    T1]  do_one_initcall+0x328/0x788
[   71.132989][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.144995][    T1]  kernel_init+0x18/0x178
[   71.145006][    T1]  ret_from_fork+0x10/0x18
[   71.145015][    T1] irq event stamp: 4398362
[   71.145027][    T1] hardirqs last  enabled at (4398361): 
[<ffffa000100c0e78>] el1_irq+0x138/0x200
[   71.145042][    T1] hardirqs last disabled at (4398362): 
[<ffffa000100fd884>] debug_exception_enter+0x8c/0x190
[   71.145056][    T1] softirqs last  enabled at (4398360): 
[...] irq_exit+0x114/0x1a0
[   71.157069][    T1] ---[ end trace a9b7b2cbbb0f7264 ]---
[   71.158439][    T1] ------------[ cut here ]------------
[   71.194319][    T1] ODEBUG: free active (active state 0) object type: 
timer_list hint: delayed_work_timer_fn+0x0/0x48
[   71.203588][    T1] WARNING: CPU: 50 PID: 1 at lib/debugobjects.c:484 
debug_print_object+0xec/0x130
[   71.212094][    T1] Modules linked in:
[   71.212112][    T1] CPU: 50 PID: 1 Comm: swapper/0 Tainted: G 
W         5.4.0-rc3+ #1146
[   71.212121][    T1] Hardware name: Huawei D06 /D06, BIOS Hisilicon 
D06 UEFI RC0 - V1.16.01 03/15/2019
[   71.212131][    T1] pstate: 80800009 (Nzcv daif -PAN +UAO)
[   71.212144][    T1] pc : debug_print_object+0xec/0x130
[   71.212158][    T1] lr : debug_print_object+0xec/0x130
[   71.224447][  T830] kobject: 'brcm-gisb-arb' ((____ptrval____)): 
kobject_cleanup, parent (____ptrval____)
[   71.226086][    T1] sp : ffff0020bf2c7740
[   71.226099][    T1] x29: ffff0020bf2c7740 x28: ffff002324274000
[   71.230557][  T830] kobject: 'brcm-gisb-arb' ((____ptrval____)): auto 
cleanup 'remove' event
[   71.235419][    T1] x27: ffff002324274090 x26: ffffa00017543de0
[   71.235435][    T1] x25: ffffa000101cd558 x24: ffffa00012051fc0
[   71.235450][    T1] x23: ffffa000150d2200 x22: ffffa000120523a0
[   71.235465][    T1] x21: ffffa00012051640 x20: 0000000000000000
[   71.240402][  T830] kobject: 'brcm-gisb-arb' ((____ptrval____)): 
kobject_uevent_env
[   71.244968][    T1] x19: ffffa00015019000 x18: 00000000000025a8
[   71.244984][    T1] x17: 00000000000025a0 x16: 00000000000026b0
[   71.244999][    T1] x15: 0000000000001470 x14: 726f775f64657961
[   71.245014][    T1] x13: 6c6564203a746e69 x12: 1fffe00417e58e5a
[   71.249837][  T830] kobject: 'brcm-gisb-arb' ((____ptrval____)): 
fill_kobj_path: path = '/bus/platform/drivers/brcm-gisb-arb'
[   71.253908][    T1] x11: ffff800417e58e5a x10: dfffa00000000000
[   71.253925][    T1] x9 : ffff800417e58e5b x8 : 0000000000000001
[   71.253939][    T1] x7 : ffff0020bf2c72d7 x6 : ffff800417e58e5b
[   71.253954][    T1] x5 : 1fffe00417e57936 x4 : ffff0020bf2bc058
[   71.256447][  T832] kobject: 'wakeup40' ((____ptrval____)): 
kobject_cleanup, parent (____ptrval____)
[   71.256466][  T832] kobject: 'wakeup40' ((____ptrval____)): calling 
ktype release
[   71.256516][  T832] kobject: 'wakeup40': free name
[   71.258600][  T830] kobject: 'brcm-gisb-arb' ((____ptrval____)): auto 
cleanup kobject_del
[   71.263109][    T1] x3 : ffffa00010000000 x2 : ffff800417e58eb0
[   71.263125][    T1] x1 : f8aafc30f531b000 x0 : 0000000000000000
[   71.263139][    T1] Call trace:
[   71.263152][    T1]  debug_print_object+0xec/0x130
[   71.263169][    T1]  __debug_check_no_obj_freed+0x114/0x290
[   71.268667][  T830] kobject: 'brcm-gisb-arb' ((____ptrval____)): 
calling ktype release
[   71.272574][    T1]  debug_check_no_obj_freed+0x18/0x28
[   71.272586][    T1]  slab_free_freelist_hook+0x18c/0x228
[   71.272596][    T1]  kfree+0x264/0x420
[   71.272608][    T1]  _edac_mc_free+0x1f8/0x210
[   71.272619][    T1]  edac_mc_free+0x68/0x88
[   71.272632][    T1]  ghes_edac_unregister+0x44/0x70
[   71.277292][  T830] driver: 'brcm-gisb-arb': driver_release
[   71.282298][    T1]  ghes_remove+0x274/0x2a0
[   71.282310][    T1]  platform_drv_remove+0x44/0x78
[   71.282321][    T1]  really_probe+0x404/0x840
[   71.282331][    T1]  driver_probe_device+0x190/0x1f0
[   71.282342][    T1]  device_driver_attach+0x7c/0xb0
[   71.282352][    T1]  __driver_attach+0x1b8/0x1d0
[   71.282368][    T1]  bus_for_each_dev+0xf8/0x190
[   71.286608][  T830] kobject: 'brcm-gisb-arb': free name
[   71.290816][    T1]  driver_attach+0x34/0x40
[   71.290826][    T1]  bus_add_driver+0x1d8/0x340
[   71.290838][    T1]  driver_register+0x168/0x1e8
[   71.290849][    T1]  __platform_driver_register+0x80/0x90
[   71.290859][    T1]  ghes_init+0xc4/0x174
[   71.290872][    T1]  do_one_initcall+0x328/0x788
[   71.320457][  T833] kobject: 'wakeup' ((____ptrval____)): 
kobject_cleanup, parent (____ptrval____)
[   71.323307][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.323324][    T1]  kernel_init+0x18/0x178
[   71.332431][  T833] kobject: 'wakeup' ((____ptrval____)): calling 
ktype release
[   71.337592][    T1]  ret_from_fork+0x10/0x18
[   71.337601][    T1] irq event stamp: 4399038
[   71.337613][    T1] hardirqs last  enabled at (4399037): 
[<ffffa000100c0e78>] el1_irq+0x138/0x200
[   71.337627][    T1] hardirqs last disabled at (4399038): 
[<ffffa000100fd884>] debug_exception_enter+0x8c/0x190
[   71.337640][    T1] softirqs last  enabled at (4399036): 
[<ffffa000100bf4a4>] __do_softirq+0x894/0x920
[   71.337655][    T1] softirqs last disabled at (4399029): 
[<ffffa000101965e4>] irq_exit+0x114/0x1a0
[   71.343025][  T833] kobject: 'wakeup': free name
[   71.352445][  T834] kobject: 'stmpe-pwm' ((____ptrval____)): 
kobject_cleanup, parent (____ptrval____)
[   71.352463][  T834] kobject: 'stmpe-pwm' ((____ptrval____)): auto 
cleanup 'remove' event
[   71.352481][  T834] kobject: 'stmpe-pwm' ((____ptrval____)): 
kobject_uevent_env
[   71.352587][  T834] kobject: 'stmpe-pwm' ((____ptrval____)): 
fill_kobj_path: path = '/bus/platform/drivers/stmpe-pwm'
[   71.352645][  T834] kobject: 'stmpe-pwm' ((____ptrval____)): auto 
cleanup kobject_del
[   71.352713][  T834] kobject: 'stmpe-pwm' ((____ptrval____)): calling 
ktype release
[   71.352730][  T834] driver: 'stmpe-pwm': driver_release
[   71.352763][  T834] kobject: 'stmpe-pwm': free name
[   71.353566][    T1] ---[ end trace a9b7b2cbbb0f7265 ]---
[   71.353899][    T1] GHES GHES.1: no default pinctrl state
[   71.384529][  T851] kobject: 'wakeup15' ((____ptrval____)): 
kobject_cleanup, parent (____ptrval____)
[   71.384654][  T848] kobject: 'wakeup' ((____ptrval____)): 
kobject_cleanup, parent (____ptrval____)
[   71.386131][    T1] driver: 'GHES': driver_bound: bound to device 
'GHES.1'
[   71.386163][    T1] kobject: 'GHES.1' ((____ptrval____)): 
kobject_uevent_env
[   71.386272][    T1] kobject: 'GHES.1' ((____ptrval____)): 
fill_kobj_path: path = '/devices/platform/GHES.1'
[   71.386334][    T1] bus: 'platform': really_probe: bound device 
GHES.1 to driver GHES
[   71.386378][    T1] bus: 'platform': driver_probe_device: matched 
device GHES.2 with driver GHES
[   71.386410][    T1] bus: 'platform': really_probe: probing driver 
GHES with device GHES.2
[   71.386512][    T1] GHES GHES.2: no default pinctrl state
[   71.390169][  T851] kobject: 'wakeup15' ((____ptrval____)): calling 
ktype release
[   71.395406][  T848] kobject: 'wakeup' ((____ptrval____)): calling 
ktype release
[   71.395681][    T1] 
==================================================================
[   71.395716][    T1] BUG: KASAN: use-after-free in 
ghes_edac_unregister+0x28/0x70
[   71.395728][    T1] Read of size 8 at addr ffff002324274bdc by task 
swapper/0/1
[   71.395735][    T1]
[   71.395749][    T1] CPU: 48 PID: 1 Comm: swapper/0 Tainted: G 
W         5.4.0-rc3+ #1146
[   71.395759][    T1] Hardware name: Huawei D06 /D06, BIOS Hisilicon 
D06 UEFI RC0 - V1.16.01 03/15/2019
[   71.395768][    T1] Call trace:
[   71.395780][    T1]  dump_backtrace+0x0/0x298
[   71.395790][    T1]  show_stack+0x20/0x30
[   71.395802][    T1]  dump_stack+0x190/0x21c
[   71.395815][    T1]  print_address_description.isra.6+0x80/0x3d0
[   71.395827][    T1]  __kasan_report+0x174/0x23c
[   71.395838][    T1]  kasan_report+0xc/0x18
[   71.395849][    T1]  __asan_load8+0xa4/0xb0
[   71.395861][    T1]  ghes_edac_unregister+0x28/0x70
[   71.395873][    T1]  ghes_remove+0x274/0x2a0
[   71.395884][    T1]  platform_drv_remove+0x44/0x78
[   71.395895][    T1]  really_probe+0x404/0x840
[   71.395905][    T1]  driver_probe_device+0x190/0x1f0
[   71.395916][    T1]  device_driver_attach+0x7c/0xb0
[   71.395927][    T1]  __driver_attach+0x1b8/0x1d0
[   71.395939][    T1]  bus_for_each_dev+0xf8/0x190
[   71.395949][    T1]  driver_attach+0x34/0x40
[   71.395960][    T1]  bus_add_driver+0x1d8/0x340
[   71.395970][    T1]  driver_register+0x168/0x1e8
[   71.395982][    T1]  __platform_driver_register+0x80/0x90
[   71.395993][    T1]  ghes_init+0xc4/0x174
[   71.396004][    T1]  do_one_initcall+0x328/0x788
[   71.396017][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.396028][    T1]  kernel_init+0x18/0x178
[   71.396039][    T1]  ret_from_fork+0x10/0x18
[   71.396047][    T1]
[   71.396056][    T1] Allocated by task 1:
[   71.396068][    T1]  save_stack+0x28/0xb0
[   71.396080][    T1]  __kasan_kmalloc.isra.9+0xa0/0xc8
[   71.396091][    T1]  kasan_kmalloc+0xc/0x18
[   71.396102][    T1]  __kmalloc+0x2d0/0x338
[   71.396114][    T1]  edac_mc_alloc+0xaa8/0xb18
[   71.396125][    T1]  ghes_edac_register+0x164/0x398
[   71.396137][    T1]  ghes_probe+0x648/0x6d8
[   71.396148][    T1]  platform_drv_probe+0x8c/0x110
[   71.396159][    T1]  really_probe+0x32c/0x840
[   71.396170][    T1]  driver_probe_device+0x190/0x1f0
[   71.396181][    T1]  device_driver_attach+0x7c/0xb0
[   71.396192][    T1]  __driver_attach+0x1b8/0x1d0
[   71.396203][    T1]  bus_for_each_dev+0xf8/0x190
[   71.396214][    T1]  driver_attach+0x34/0x40
[   71.396224][    T1]  bus_add_driver+0x1d8/0x340
[   71.396235][    T1]  driver_register+0x168/0x1e8
[   71.396247][    T1]  __platform_driver_register+0x80/0x90
[   71.396257][    T1]  ghes_init+0xc4/0x174
[   71.396268][    T1]  do_one_initcall+0x328/0x788
[   71.396281][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.396292][    T1]  kernel_init+0x18/0x178
[   71.396303][    T1]  ret_from_fork+0x10/0x18
[   71.396310][    T1]
[   71.396318][    T1] Freed by task 1:
[   71.396330][    T1]  save_stack+0x28/0xb0
[   71.396341][    T1]  __kasan_slab_free+0x140/0x170
[   71.396353][    T1]  kasan_slab_free+0x10/0x18
[   71.396364][    T1]  slab_free_freelist_hook+0x19c/0x228
[   71.396375][    T1]  kfree+0x264/0x420
[   71.396386][    T1]  _edac_mc_free+0x1f8/0x210
[   71.396398][    T1]  edac_mc_free+0x68/0x88
[   71.396409][    T1]  ghes_edac_unregister+0x44/0x70
[   71.396420][    T1]  ghes_remove+0x274/0x2a0
[   71.396432][    T1]  platform_drv_remove+0x44/0x78
[   71.396442][    T1]  really_probe+0x404/0x840
[   71.396453][    T1]  driver_probe_device+0x190/0x1f0
[   71.396464][    T1]  device_driver_attach+0x7c/0xb0
[   71.396475][    T1]  __driver_attach+0x1b8/0x1d0
[   71.396487][    T1]  bus_for_each_dev+0xf8/0x190
[   71.396497][    T1]  driver_attach+0x34/0x40
[   71.396508][    T1]  bus_add_driver+0x1d8/0x340
[   71.396519][    T1]  driver_register+0x168/0x1e8
[   71.396530][    T1]  __platform_driver_register+0x80/0x90
[   71.396541][    T1]  ghes_init+0xc4/0x174
[   71.396552][    T1]  do_one_initcall+0x328/0x788
[   71.396564][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.396575][    T1]  kernel_init+0x18/0x178
[   71.396586][    T1]  ret_from_fork+0x10/0x18
[   71.396593][    T1]
[   71.396604][    T1] The buggy address belongs to the object at 
ffff002324274000
[   71.396604][    T1]  which belongs to the cache kmalloc-4k of size 4096
[   71.396615][    T1] The buggy address is located 3036 bytes inside of
[   71.396615][    T1]  4096-byte region [ffff002324274000, 
ffff002324275000)
[   71.396624][    T1] The buggy address belongs to the page:
[   71.396637][    T1] page:fffffe008c709c00 refcount:1 mapcount:0 
mapping:ffff0020bfc16980 index:0x0 compound_mapcount: 0
[   71.396655][    T1] flags: 0x1ffff00000010200(slab|head)
[   71.396671][    T1] raw: 1ffff00000010200 fffffe008c709a08 
fffffe008c70c408 ffff0020bfc16980
[   71.396685][    T1] raw: 0000000000000000 0000000000020002 
00000001ffffffff 0000000000000000
[   71.396693][    T1] page dumped because: kasan: bad access detected
[   71.396701][    T1]
[   71.396709][    T1] Memory state around the buggy address:
[   71.396721][    T1]  ffff002324274a80: fb fb fb fb fb fb fb fb fb fb 
fb fb fb fb fb fb
[   71.396732][    T1]  ffff002324274b00: fb fb fb fb fb fb fb fb fb fb 
fb fb fb fb fb fb
[   71.396743][    T1] >ffff002324274b80: fb fb fb fb fb fb fb fb fb fb 
fb fb fb fb fb fb
[   71.396751][    T1]                                                     ^
[   71.396762][    T1]  ffff002324274c00: fb fb fb fb fb fb fb fb fb fb 
fb fb fb fb fb fb
[   71.396773][    T1]  ffff002324274c80: fb fb fb fb fb fb fb fb fb fb 
fb fb fb fb fb fb
[   71.396781][    T1] 
==================================================================
[   71.396789][    T1] Disabling lock debugging due to kernel taint
[   71.396834][    T1] EDAC DEBUG: edac_mc_del_mc:
[   71.396846][    T1] EDAC DEBUG: edac_mc_free:
[   71.396866][    T1] 
==================================================================
[   71.396874][    T1] BUG: KASAN: double-free or invalid-free in 
kfree+0x264/0x420
[   71.396877][    T1]
[   71.396886][    T1] CPU: 48 PID: 1 Comm: swapper/0 Tainted: G    B 
W         5.4.0-rc3+ #1146
[   71.396891][    T1] Hardware name: Huawei D06 /D06, BIOS Hisilicon 
D06 UEFI RC0 - V1.16.01 03/15/2019
[   71.396895][    T1] Call trace:
[   71.396902][    T1]  dump_backtrace+0x0/0x298
[   71.396909][    T1]  show_stack+0x20/0x30
[   71.396915][    T1]  dump_stack+0x190/0x21c
[   71.396923][    T1]  print_address_description.isra.6+0x80/0x3d0
[   71.396931][    T1]  kasan_report_invalid_free+0x78/0xa0
[   71.396939][    T1]  __kasan_slab_free+0xbc/0x170
[   71.396946][    T1]  kasan_slab_free+0x10/0x18
[   71.396953][    T1]  slab_free_freelist_hook+0x19c/0x228
[   71.396959][    T1]  kfree+0x264/0x420
[   71.396967][    T1]  _edac_mc_free+0x6c/0x210
[   71.396974][    T1]  edac_mc_free+0x68/0x88
[   71.396981][    T1]  ghes_edac_unregister+0x44/0x70
[   71.396989][    T1]  ghes_remove+0x274/0x2a0
[   71.396996][    T1]  platform_drv_remove+0x44/0x78
[   71.397002][    T1]  really_probe+0x404/0x840
[   71.397009][    T1]  driver_probe_device+0x190/0x1f0
[   71.397016][    T1]  device_driver_attach+0x7c/0xb0
[   71.397022][    T1]  __driver_attach+0x1b8/0x1d0
[   71.397030][    T1]  bus_for_each_dev+0xf8/0x190
[   71.397037][    T1]  driver_attach+0x34/0x40
[   71.397043][    T1]  bus_add_driver+0x1d8/0x340
[   71.397049][    T1]  driver_register+0x168/0x1e8
[   71.397057][    T1]  __platform_driver_register+0x80/0x90
[   71.397063][    T1]  ghes_init+0xc4/0x174
[   71.397070][    T1]  do_one_initcall+0x328/0x788
[   71.397078][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.397085][    T1]  kernel_init+0x18/0x178
[   71.397092][    T1]  ret_from_fork+0x10/0x18
[   71.397096][    T1]
[   71.397100][    T1] Allocated by task 1:
[   71.397108][    T1]  save_stack+0x28/0xb0
[   71.397116][    T1]  __kasan_kmalloc.isra.9+0xa0/0xc8
[   71.397123][    T1]  kasan_kmalloc+0xc/0x18
[   71.397130][    T1]  kmem_cache_alloc_trace+0x2a0/0x2e8
[   71.397138][    T1]  edac_mc_alloc+0x5d4/0xb18
[   71.397145][    T1]  ghes_edac_register+0x164/0x398
[   71.397152][    T1]  ghes_probe+0x648/0x6d8
[   71.397160][    T1]  platform_drv_probe+0x8c/0x110
[   71.397166][    T1]  really_probe+0x32c/0x840
[   71.397173][    T1]  driver_probe_device+0x190/0x1f0
[   71.397180][    T1]  device_driver_attach+0x7c/0xb0
[   71.397186][    T1]  __driver_attach+0x1b8/0x1d0
[   71.397194][    T1]  bus_for_each_dev+0xf8/0x190
[   71.397201][    T1]  driver_attach+0x34/0x40
[   71.397207][    T1]  bus_add_driver+0x1d8/0x340
[   71.397213][    T1]  driver_register+0x168/0x1e8
[   71.397221][    T1]  __platform_driver_register+0x80/0x90
[   71.397227][    T1]  ghes_init+0xc4/0x174
[   71.397235][    T1]  do_one_initcall+0x328/0x788
[   71.397243][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.397250][    T1]  kernel_init+0x18/0x178
[   71.397257][    T1]  ret_from_fork+0x10/0x18
[   71.397260][    T1]
[   71.397264][    T1] Freed by task 1:
[   71.397272][    T1]  save_stack+0x28/0xb0
[   71.397279][    T1]  __kasan_slab_free+0x140/0x170
[   71.397286][    T1]  kasan_slab_free+0x10/0x18
[   71.397294][    T1]  slab_free_freelist_hook+0x19c/0x228
[   71.397300][    T1]  kfree+0x264/0x420
[   71.397307][    T1]  _edac_mc_free+0x15c/0x210
[   71.397315][    T1]  edac_mc_free+0x68/0x88
[   71.397322][    T1]  ghes_edac_unregister+0x44/0x70
[   71.397329][    T1]  ghes_remove+0x274/0x2a0
[   71.397337][    T1]  platform_drv_remove+0x44/0x78
[   71.397343][    T1]  really_probe+0x404/0x840
[   71.397350][    T1]  driver_probe_device+0x190/0x1f0
[   71.397357][    T1]  device_driver_attach+0x7c/0xb0
[   71.397363][    T1]  __driver_attach+0x1b8/0x1d0
[   71.397371][    T1]  bus_for_each_dev+0xf8/0x190
[   71.397377][    T1]  driver_attach+0x34/0x40
[   71.397384][    T1]  bus_add_driver+0x1d8/0x340
[   71.397391][    T1]  driver_register+0x168/0x1e8
[   71.397398][    T1]  __platform_driver_register+0x80/0x90
[   71.397404][    T1]  ghes_init+0xc4/0x174
[   71.397411][    T1]  do_one_initcall+0x328/0x788
[   71.397419][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.397427][    T1]  kernel_init+0x18/0x178
[   71.397433][    T1]  ret_from_fork+0x10/0x18
[   71.397437][    T1]
[   71.397443][    T1] The buggy address belongs to the object at 
ffff0023245a9200
[   71.397443][    T1]  which belongs to the cache kmalloc-128 of size 128
[   71.397451][    T1] The buggy address is located 0 bytes inside of
[   71.397451][    T1]  128-byte region [ffff0023245a9200, ffff0023245a9280)
[   71.397455][    T1] The buggy address belongs to the page:
[   71.397462][    T1] page:fffffe008c716a00 refcount:1 mapcount:0 
mapping:ffff0020bfc10580 index:0xffff0023245ada80 compound_mapcount: 0
[   71.397471][    T1] flags: 0x1ffff00000010200(slab|head)
[   71.397482][    T1] raw: 1ffff00000010200 fffffe008c716808 
fffffe008c70a008 ffff0020bfc10580
[   71.397492][    T1] raw: ffff0023245ada80 0000000000330016 
00000001ffffffff 0000000000000000
[   71.397496][    T1] page dumped because: kasan: bad access detected
[   71.397499][    T1]
[   71.397503][    T1] Memory state around the buggy address:
[   71.397510][    T1]  ffff0023245a9100: fc fc fc fc fc fc fc fc fc fc 
fc fc fc fc fc fc
[   71.397517][    T1]  ffff0023245a9180: fc fc fc fc fc fc fc fc fc fc 
fc fc fc fc fc fc
[   71.397523][    T1] >ffff0023245a9200: fb fb fb fb fb fb fb fb fb fb 
fb fb fb fb fb fb
[   71.397527][    T1]                    ^
[   71.397534][    T1]  ffff0023245a9280: fc fc fc fc fc fc fc fc fc fc 
fc fc fc fc fc fc
[   71.397541][    T1]  ffff0023245a9300: fc fc fc fc fc fc fc fc fc fc 
fc fc fc fc fc fc
[   71.397545][    T1] 
==================================================================
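
A quick arithmetic check on the use-after-free report above, as a sketch only: subtracting the object base from the faulting address should reproduce the 3036-byte offset that KASAN prints. If, as the allocation trace through edac_mc_alloc suggests, this kmalloc-4k object is the mci buffer, the offset can also be compared with the 3524 bytes logged by edac_mc_alloc. The helper name offset_in_object is made up for illustration; the numbers are copied from the log.

```python
# Values copied from the log above; the helper name is hypothetical.
FAULT_ADDR = 0xffff002324274bdc   # "Read of size 8 at addr ..."
OBJECT_BASE = 0xffff002324274000  # "belongs to the object at ..."
READ_SIZE = 8                     # "Read of size 8"
MCI_ALLOC_SIZE = 3524             # "edac_mc_alloc: allocating 3524 bytes"


def offset_in_object(addr: int, base: int) -> int:
    """Return the byte offset of addr from the start of the object."""
    return addr - base


offset = offset_in_object(FAULT_ADDR, OBJECT_BASE)
print(offset)                                # 3036, as KASAN reports
print(offset + READ_SIZE <= MCI_ALLOC_SIZE)  # True: read is inside the mci data
```

So the 8-byte read falls inside the region that was handed out for the mci data, which is consistent with a field of the freed mci being read in ghes_edac_unregister().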








* Re: edac KASAN warning in experimental arm64 allmodconfig boot
  2019-10-14 15:18 edac KASAN warning in experimental arm64 allmodconfig boot John Garry
@ 2019-10-14 16:09 ` Borislav Petkov
  2019-10-14 16:44   ` John Garry
  2019-10-14 16:15 ` James Morse
  2019-11-21 12:34 ` linuxnext-2019119 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot) John Garry
  2 siblings, 1 reply; 21+ messages in thread
From: Borislav Petkov @ 2019-10-14 16:09 UTC
  To: John Garry
  Cc: Mauro Carvalho Chehab, James Morse, tony.luck, Robert Richter,
	linux-edac, linux-kernel

On Mon, Oct 14, 2019 at 04:18:49PM +0100, John Garry wrote:
> Hi guys,
> 
> I'm experimenting by trying to boot an allmodconfig arm64 kernel, as
> mentioned here:
> https://lore.kernel.org/linux-arm-kernel/507325a3-030e-2843-0f46-7e18c60257de@huawei.com/
> 
> One thing that I noticed - it's hard to miss actually - is the amount of
> complaining from KASAN about the EDAC/ghes code. Maybe this is something I
> should not care about/red herring, or maybe something genuine. Let me know
> what you think.
> 
> The kernel is v5.4-rc3, and I raised the EDAC mc debug level to get extra
> debug prints.
> 
> Log below, Thanks,
> John
> Log snippet (I cut off after the first KASAN warning):
> 
> [   70.471011][    T1] random: get_random_u32 called from new_slab+0x360/0x698 with crng_init=0
> [   70.478671][    T1] [Firmware Bug]: APEI: Invalid bit width + offset in GAR [0x94110034/64/0/3/0]
> [   70.526585][    T1] EDAC DEBUG: edac_mc_alloc: allocating 3524 bytes for mci data (32 dimms, 32 csrows/channels)
> [   70.542013][    T1] EDAC DEBUG: ghes_edac_dmidecode: DIMM2: Registered-DDR4 size = 16384 MB(ECC)
> [   70.551044][    T1] EDAC DEBUG: ghes_edac_dmidecode:         type 26, detail 0x2080, width 72(total 64)
> [   70.559986][    T1] EDAC DEBUG: edac_mc_add_mc_with_groups:
> [   70.567082][    T1] EDAC DEBUG: edac_create_sysfs_mci_device: device mc0 created
> [   70.575608][    T1] EDAC DEBUG: edac_create_dimm_object: device dimm2 created at location memory 2
> [   70.585818][    T1] EDAC DEBUG: edac_create_csrow_object: device csrow2 created
> [   70.594110][    T1] EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
> [   70.605936][    T1] EDAC DEBUG: edac_mc_del_mc:
> [   70.611188][    T1] EDAC DEBUG: edac_remove_sysfs_mci_device:
> [   70.619443][    T1] random: get_random_u32 called from kobject_put+0x8c/0x190 with crng_init=0
> [   70.628163][    T1] kobject: 'csrow2' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 750)
> [   70.638477][    T1] EDAC DEBUG: edac_remove_sysfs_mci_device: unregistering device dimm2
> [   70.647903][    T1] kobject: 'dimm2' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 250)
> [   70.658105][    T1] EDAC MC: Removed device 0 for ghes_edac.c ghes_edac: DEV ghes
> [   70.665673][    T1] EDAC DEBUG: edac_mc_free:
> [   70.670211][    T1] EDAC DEBUG: edac_unregister_sysfs: unregistering device mc0
> [   70.679027][    T1] kobject: 'mc0' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 500)
> [   70.690987][    T1] EDAC DEBUG: edac_mc_del_mc:
> [   70.695769][    T1] EDAC DEBUG: edac_mc_free:
> [   70.700412][    T1] ------------[ cut here ]------------
> [   70.705832][    T1] ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x48
> [   70.716663][    T1] WARNING: CPU: 50 PID: 1 at lib/debugobjects.c:484 debug_print_object+0xec/0x130

If I am parsing these unwrapped messages correctly (btw, pls use another
mail client for pasting log lines - thunderbird is usually ok but I
guess you need to configure it properly), that must be some workqueue
object of sorts.

Now, ghes_edac doesn't init the workqueue:

[   70.594110][    T1] EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)

as it is in interrupt mode.

So the only other workqueue I see is that "delayed XXX" stuff which is in
kobject_release().

AFAICT.

Do you have CONFIG_DEBUG_KOBJECT_RELEASE enabled and if so, does the
warning go away if you disable it?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: edac KASAN warning in experimental arm64 allmodconfig boot
  2019-10-14 15:18 edac KASAN warning in experimental arm64 allmodconfig boot John Garry
  2019-10-14 16:09 ` Borislav Petkov
@ 2019-10-14 16:15 ` James Morse
  2019-10-14 16:56   ` John Garry
  2019-11-21 12:34 ` linuxnext-2019119 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot) John Garry
  2 siblings, 1 reply; 21+ messages in thread
From: James Morse @ 2019-10-14 16:15 UTC (permalink / raw)
  To: John Garry
  Cc: Borislav Petkov, Mauro Carvalho Chehab, tony.luck,
	Robert Richter, linux-edac, linux-kernel

Hi John,

On 14/10/2019 16:18, John Garry wrote:
> I'm experimenting by trying to boot an allmodconfig arm64 kernel, as mentioned here:

Crumbs!


> One thing that I noticed - it's hard to miss actually - is the amount of complaining from
> KASAN about the EDAC/ghes code. Maybe this is something I should not care about/red
> herring, or maybe something genuine. Let me know what you think.

Hmmm, I thought I tested this recently...

> Log snippet (I cut off after the first KASAN warning):
> 
> [   70.471011][    T1] random: get_random_u32 called from new_slab+0x360/0x698 with
> crng_init=0

> [   70.478671][    T1] [Firmware Bug]: APEI: Invalid bit width + offset in GAR
> [0x94110034/64/0/3/0]

(this one's for you right?)

> [   70.700412][    T1] ------------[ cut here ]------------

> [   70.802080][    T1] Call trace:
> [   70.802093][    T1]  debug_print_object+0xec/0x130
> [   70.802106][    T1]  __debug_check_no_obj_freed+0x114/0x290
> [   70.802119][    T1]  debug_check_no_obj_freed+0x18/0x28
> [   70.802130][    T1]  slab_free_freelist_hook+0x18c/0x228
> [   70.802140][    T1]  kfree+0x264/0x420
> [   70.802157][    T1]  _edac_mc_free+0x6c/0x210
> [   70.814163][    T1]  edac_mc_free+0x68/0x88
> [   70.814177][    T1]  ghes_edac_unregister+0x44/0x70
> [   70.814193][    T1]  ghes_remove+0x274/0x2a0

Ugh. This must be the test driver remove thing.

I've reproduced this, but had to remove the parent GHES twice. It looks like it tries to
use the first ghes_edac global variables when freeing the second. ghes_init prevents it
from re-allocating over the top.

The below diff fixes it for me. (I'll post it as a proper patch once I've done the
archaeology)

-----------%<-----------
diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index d413a0bdc9ad..955b59b6aade 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -554,6 +554,7 @@ void ghes_edac_unregister(struct ghes *ghes)
                return;

        mci = ghes_pvt->mci;
+       ghes_pvt = NULL;
        edac_mc_del_mc(mci->pdev);
        edac_mc_free(mci);
 }

-----------%<-----------


Thanks!

James

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: edac KASAN warning in experimental arm64 allmodconfig boot
  2019-10-14 16:09 ` Borislav Petkov
@ 2019-10-14 16:44   ` John Garry
  0 siblings, 0 replies; 21+ messages in thread
From: John Garry @ 2019-10-14 16:44 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Mauro Carvalho Chehab, James Morse, tony.luck, Robert Richter,
	linux-edac, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3915 bytes --]

On 14/10/2019 17:09, Borislav Petkov wrote:
> On Mon, Oct 14, 2019 at 04:18:49PM +0100, John Garry wrote:
>> Hi guys,
>>
>> I'm experimenting by trying to boot an allmodconfig arm64 kernel, as
>> mentioned here:
>> https://lore.kernel.org/linux-arm-kernel/507325a3-030e-2843-0f46-7e18c60257de@huawei.com/
>>
>> One thing that I noticed - it's hard to miss actually - is the amount of
>> complaining from KASAN about the EDAC/ghes code. Maybe this is something I
>> should not care about/red herring, or maybe something genuine. Let me know
>> what you think.
>>
>> The kernel is v5.4-rc3, and I raised the EDAC mc debug level to get extra
>> debug prints.
>>
>> Log below, Thanks,
>> John
>> Log snippet (I cut off after the first KASAN warning):
>>
>> [   70.471011][    T1] random: get_random_u32 called from new_slab+0x360/0x698 with crng_init=0
>> [   70.478671][    T1] [Firmware Bug]: APEI: Invalid bit width + offset in GAR [0x94110034/64/0/3/0]
>> [   70.526585][    T1] EDAC DEBUG: edac_mc_alloc: allocating 3524 bytes for mci data (32 dimms, 32 csrows/channels)
>> [   70.542013][    T1] EDAC DEBUG: ghes_edac_dmidecode: DIMM2: Registered-DDR4 size = 16384 MB(ECC)
>> [   70.551044][    T1] EDAC DEBUG: ghes_edac_dmidecode:         type 26, detail 0x2080, width 72(total 64)
>> [   70.559986][    T1] EDAC DEBUG: edac_mc_add_mc_with_groups:
>> [   70.567082][    T1] EDAC DEBUG: edac_create_sysfs_mci_device: device mc0 created
>> [   70.575608][    T1] EDAC DEBUG: edac_create_dimm_object: device dimm2 created at location memory 2
>> [   70.585818][    T1] EDAC DEBUG: edac_create_csrow_object: device csrow2 created
>> [   70.594110][    T1] EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
>> [   70.605936][    T1] EDAC DEBUG: edac_mc_del_mc:
>> [   70.611188][    T1] EDAC DEBUG: edac_remove_sysfs_mci_device:
>> [   70.619443][    T1] random: get_random_u32 called from kobject_put+0x8c/0x190 with crng_init=0
>> [   70.628163][    T1] kobject: 'csrow2' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 750)
>> [   70.638477][    T1] EDAC DEBUG: edac_remove_sysfs_mci_device: unregistering device dimm2
>> [   70.647903][    T1] kobject: 'dimm2' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 250)
>> [   70.658105][    T1] EDAC MC: Removed device 0 for ghes_edac.c ghes_edac: DEV ghes
>> [   70.665673][    T1] EDAC DEBUG: edac_mc_free:
>> [   70.670211][    T1] EDAC DEBUG: edac_unregister_sysfs: unregistering device mc0
>> [   70.679027][    T1] kobject: 'mc0' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 500)
>> [   70.690987][    T1] EDAC DEBUG: edac_mc_del_mc:
>> [   70.695769][    T1] EDAC DEBUG: edac_mc_free:
>> [   70.700412][    T1] ------------[ cut here ]------------
>> [   70.705832][    T1] ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x48
>> [   70.716663][    T1] WARNING: CPU: 50 PID: 1 at lib/debugobjects.c:484 debug_print_object+0xec/0x130
>
> If I am parsing these unwrapped messages correctly (btw, pls use another
> mail client for pasting log lines - thunderbird is usually ok but I
> guess you need to configure it properly

Maybe you can receive the cutdown log attachment while I figure out how 
to do that...

), that must be some workqueue
> object of sorts.
>
> Now, ghes_edac doesn't init the workqueue:
>
> [   70.594110][    T1] EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
>
> as it is in interrupt mode.
>
> So the only other workqueue I see is that "delayed XXX" stuff which is in
> kobject_release().
>
> AFAICT.
>
> Do you have CONFIG_DEBUG_KOBJECT_RELEASE enabled and if so, does the
> warning go away if you disable it?
>

Yes, it's enabled with allmodconfig, but no, it does not go away with 
disabling (see log #2).

Cheers,
John

> Thx.
>


[-- Attachment #2: kasan edac log 2 --]
[-- Type: text/plain, Size: 14252 bytes --]

[   69.915028][    T1] debugfs: File '\_SB_.MB5D' in directory 'domains' already present!
[   70.055740][    T1] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[   70.106050][    T1] gbefb: couldn't reserve mmio region
[   70.111495][    T1] gbefb: probe of gbefb.0 failed with error -16
[   70.122848][    T2] _warn_unseeded_randomness: 103 callbacks suppressed
[   70.122867][    T2] random: get_random_u64 called from copy_process+0x444/0x2bf0 with crng_init=0
[   70.161416][    T1] [Firmware Bug]: APEI: Invalid bit width + offset in GAR [0x94110034/64/0/3/0]
[   70.171690][    T1] EDAC DEBUG: edac_mc_alloc: allocating 3332 bytes for mci data (32 dimms, 32 csrows/channels)
[   70.186961][    T1] EDAC DEBUG: ghes_edac_dmidecode: DIMM2: Registered-DDR4 size = 16384 MB(ECC)
[   70.195905][    T1] EDAC DEBUG: ghes_edac_dmidecode:         type 26, detail 0x2080, width 72(total 64)
[   70.204856][    T1] EDAC DEBUG: edac_mc_add_mc_with_groups: 
[   70.211902][    T1] EDAC DEBUG: edac_create_sysfs_mci_device: device mc0 created
[   70.220567][    T1] EDAC DEBUG: edac_create_dimm_object: device dimm2 created at location memory 2 
[   70.230772][    T1] EDAC DEBUG: edac_create_csrow_object: device csrow2 created
[   70.239012][    T1] EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[   70.250886][    T1] EDAC DEBUG: edac_mc_del_mc: 
[   70.256169][    T1] EDAC DEBUG: edac_remove_sysfs_mci_device: 
[   70.264999][    T1] EDAC DEBUG: csrow_attr_release: device csrow2 released
[   70.272080][    T1] EDAC DEBUG: edac_remove_sysfs_mci_device: unregistering device dimm2
[   70.281573][    T1] EDAC DEBUG: dimm_attr_release: device dimm2 released
[   70.288461][    T1] EDAC MC: Removed device 0 for ghes_edac.c ghes_edac: DEV ghes
[   70.296035][    T1] EDAC DEBUG: edac_mc_free: 
[   70.300580][    T1] EDAC DEBUG: edac_unregister_sysfs: unregistering device mc0
[   70.309379][    T1] EDAC DEBUG: mci_attr_release: device mc0 released
[   70.318165][    T1] ==================================================================
[   70.326165][    T1] BUG: KASAN: use-after-free in ghes_edac_unregister+0x28/0x70
[   70.333575][    T1] Read of size 8 at addr ffff002323ca9b1c by task swapper/0/1
[   70.340894][    T1] 
[   70.343099][    T1] CPU: 57 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3+ #1147
[   70.350421][    T1] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
[   70.359652][    T1] Call trace:
[   70.362811][    T1]  dump_backtrace+0x0/0x298
[   70.367183][    T1]  show_stack+0x20/0x30
[   70.371209][    T1]  dump_stack+0x190/0x21c
[   70.375410][    T1]  print_address_description.isra.6+0x80/0x3d0
[   70.381431][    T1]  __kasan_report+0x174/0x23c
[   70.385977][    T1]  kasan_report+0xc/0x18
[   70.390088][    T1]  __asan_load8+0xa4/0xb0
[   70.394286][    T1]  ghes_edac_unregister+0x28/0x70
[   70.399181][    T1]  ghes_remove+0x274/0x2a0
[   70.403468][    T1]  platform_drv_remove+0x44/0x78
[   70.408273][    T1]  really_probe+0x404/0x840
[   70.412644][    T1]  driver_probe_device+0x190/0x1f0
[   70.417623][    T1]  device_driver_attach+0x7c/0xb0
[   70.422515][    T1]  __driver_attach+0x1b8/0x1d0
[   70.427148][    T1]  bus_for_each_dev+0xf8/0x190
[   70.431779][    T1]  driver_attach+0x34/0x40
[   70.436062][    T1]  bus_add_driver+0x1d8/0x340
[   70.440607][    T1]  driver_register+0x168/0x1e8
[   70.445239][    T1]  __platform_driver_register+0x80/0x90
[   70.450656][    T1]  ghes_init+0xc4/0x174
[   70.454680][    T1]  do_one_initcall+0x328/0x788
[   70.459314][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   70.464381][    T1]  kernel_init+0x18/0x178
[   70.468578][    T1]  ret_from_fork+0x10/0x18
[   70.472859][    T1] 
[   70.475058][    T1] Allocated by task 1:
[   70.478996][    T1]  save_stack+0x28/0xb0
[   70.483021][    T1]  __kasan_kmalloc.isra.9+0xa0/0xc8
[   70.488087][    T1]  kasan_kmalloc+0xc/0x18
[   70.492284][    T1]  __kmalloc+0x2d0/0x338
[   70.496397][    T1]  edac_mc_alloc+0xaa8/0xb18
[   70.500856][    T1]  ghes_edac_register+0x164/0x398
[   70.505748][    T1]  ghes_probe+0x648/0x6d8
[   70.509946][    T1]  platform_drv_probe+0x8c/0x110
[   70.514751][    T1]  really_probe+0x32c/0x840
[   70.519122][    T1]  driver_probe_device+0x190/0x1f0
[   70.524100][    T1]  device_driver_attach+0x7c/0xb0
[   70.528992][    T1]  __driver_attach+0x1b8/0x1d0
[   70.533624][    T1]  bus_for_each_dev+0xf8/0x190
[   70.538255][    T1]  driver_attach+0x34/0x40
[   70.542539][    T1]  bus_add_driver+0x1d8/0x340
[   70.547083][    T1]  driver_register+0x168/0x1e8
[   70.551715][    T1]  __platform_driver_register+0x80/0x90
[   70.557127][    T1]  ghes_init+0xc4/0x174
[   70.561151][    T1]  do_one_initcall+0x328/0x788
[   70.565784][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   70.570850][    T1]  kernel_init+0x18/0x178
[   70.575047][    T1]  ret_from_fork+0x10/0x18
[   70.579327][    T1] 
[   70.581525][    T1] Freed by task 1:
[   70.585115][    T1]  save_stack+0x28/0xb0
[   70.589139][    T1]  __kasan_slab_free+0x140/0x170
[   70.593945][    T1]  kasan_slab_free+0x10/0x18
[   70.598405][    T1]  slab_free_freelist_hook+0x19c/0x228
[   70.603730][    T1]  kfree+0x264/0x420
[   70.607494][    T1]  mci_attr_release+0x74/0x80
[   70.612040][    T1]  device_release+0xa4/0x108
[   70.616499][    T1]  kobject_put+0x250/0x2c0
[   70.620784][    T1]  device_unregister+0x88/0x98
[   70.625415][    T1]  edac_unregister_sysfs+0x78/0x88
[   70.630395][    T1]  edac_mc_free+0x78/0x88
[   70.634592][    T1]  ghes_edac_unregister+0x44/0x70
[   70.639485][    T1]  ghes_remove+0x274/0x2a0
[   70.643769][    T1]  platform_drv_remove+0x44/0x78
[   70.648574][    T1]  really_probe+0x404/0x840
[   70.652944][    T1]  driver_probe_device+0x190/0x1f0
[   70.657924][    T1]  device_driver_attach+0x7c/0xb0
[   70.662815][    T1]  __driver_attach+0x1b8/0x1d0
[   70.667447][    T1]  bus_for_each_dev+0xf8/0x190
[   70.672078][    T1]  driver_attach+0x34/0x40
[   70.676361][    T1]  bus_add_driver+0x1d8/0x340
[   70.680906][    T1]  driver_register+0x168/0x1e8
[   70.685539][    T1]  __platform_driver_register+0x80/0x90
[   70.690951][    T1]  ghes_init+0xc4/0x174
[   70.694975][    T1]  do_one_initcall+0x328/0x788
[   70.699607][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   70.704673][    T1]  kernel_init+0x18/0x178
[   70.708870][    T1]  ret_from_fork+0x10/0x18
[   70.713151][    T1] 
[   70.715352][    T1] The buggy address belongs to the object at ffff002323ca9000
[   70.715352][    T1]  which belongs to the cache kmalloc-4k of size 4096
[   70.729272][    T1] The buggy address is located 2844 bytes inside of
[   70.729272][    T1]  4096-byte region [ffff002323ca9000, ffff002323caa000)
[   70.742582][    T1] The buggy address belongs to the page:
[   70.748083][    T1] page:fffffe008c6f2a00 refcount:1 mapcount:0 mapping:ffff0020bfc17080 index:0x0 compound_mapcount: 0
[   70.758886][    T1] flags: 0x1ffff00000010200(slab|head)
[   70.764217][    T1] raw: 1ffff00000010200 fffffe008c6f2408 fffffe008c6f2808 ffff0020bfc17080
[   70.772671][    T1] raw: 0000000000000000 0000000000020002 00000001ffffffff 0000000000000000
[   70.781119][    T1] page dumped because: kasan: bad access detected
[   70.787397][    T1] 
[   70.789595][    T1] Memory state around the buggy address:
[   70.795096][    T1]  ffff002323ca9a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   70.803027][    T1]  ffff002323ca9a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   70.810957][    T1] >ffff002323ca9b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   70.818884][    T1]                             ^
[   70.823603][    T1]  ffff002323ca9b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   70.831534][    T1]  ffff002323ca9c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   70.839461][    T1] ==================================================================
[   70.847388][    T1] Disabling lock debugging due to kernel taint
[   70.853571][    T1] EDAC DEBUG: edac_mc_del_mc: 
[   70.858302][    T1] EDAC DEBUG: edac_mc_free: 
[   70.862829][    T1] ==================================================================
[   70.870751][    T1] BUG: KASAN: double-free or invalid-free in kfree+0x264/0x420
[   70.878142][    T1] 
[   70.880331][    T1] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B             5.4.0-rc3+ #1147
[   70.888939][    T1] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
[   70.898154][    T1] Call trace:
[   70.901296][    T1]  dump_backtrace+0x0/0x298
[   70.905651][    T1]  show_stack+0x20/0x30
[   70.909660][    T1]  dump_stack+0x190/0x21c
[   70.913844][    T1]  print_address_description.isra.6+0x80/0x3d0
[   70.919850][    T1]  kasan_report_invalid_free+0x78/0xa0
[   70.925161][    T1]  __kasan_slab_free+0xbc/0x170
[   70.929864][    T1]  kasan_slab_free+0x10/0x18
[   70.934306][    T1]  slab_free_freelist_hook+0x19c/0x228
[   70.939616][    T1]  kfree+0x264/0x420
[   70.943365][    T1]  _edac_mc_free+0x6c/0x210
[   70.947721][    T1]  edac_mc_free+0x68/0x88
[   70.951903][    T1]  ghes_edac_unregister+0x44/0x70
[   70.956782][    T1]  ghes_remove+0x274/0x2a0
[   70.961052][    T1]  platform_drv_remove+0x44/0x78
[   70.965841][    T1]  really_probe+0x404/0x840
[   70.970196][    T1]  driver_probe_device+0x190/0x1f0
[   70.975159][    T1]  device_driver_attach+0x7c/0xb0
[   70.980035][    T1]  __driver_attach+0x1b8/0x1d0
[   70.984652][    T1]  bus_for_each_dev+0xf8/0x190
[   70.989267][    T1]  driver_attach+0x34/0x40
[   70.993535][    T1]  bus_add_driver+0x1d8/0x340
[   70.998063][    T1]  driver_register+0x168/0x1e8
[   71.002680][    T1]  __platform_driver_register+0x80/0x90
[   71.008078][    T1]  ghes_init+0xc4/0x174
[   71.012086][    T1]  do_one_initcall+0x328/0x788
[   71.016704][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.021754][    T1]  kernel_init+0x18/0x178
[   71.025936][    T1]  ret_from_fork+0x10/0x18
[   71.030202][    T1] 
[   71.032385][    T1] Allocated by task 1:
[   71.036308][    T1]  save_stack+0x28/0xb0
[   71.040317][    T1]  __kasan_kmalloc.isra.9+0xa0/0xc8
[   71.045367][    T1]  kasan_kmalloc+0xc/0x18
[   71.049549][    T1]  kmem_cache_alloc_trace+0x2a0/0x2e8
[   71.054773][    T1]  edac_mc_alloc+0x7c4/0xb18
[   71.059216][    T1]  ghes_edac_register+0x164/0x398
[   71.064093][    T1]  ghes_probe+0x648/0x6d8
[   71.068275][    T1]  platform_drv_probe+0x8c/0x110
[   71.073064][    T1]  really_probe+0x32c/0x840
[   71.077419][    T1]  driver_probe_device+0x190/0x1f0
[   71.082381][    T1]  device_driver_attach+0x7c/0xb0
[   71.087257][    T1]  __driver_attach+0x1b8/0x1d0
[   71.091874][    T1]  bus_for_each_dev+0xf8/0x190
[   71.096489][    T1]  driver_attach+0x34/0x40
[   71.100757][    T1]  bus_add_driver+0x1d8/0x340
[   71.105286][    T1]  driver_register+0x168/0x1e8
[   71.109902][    T1]  __platform_driver_register+0x80/0x90
[   71.115299][    T1]  ghes_init+0xc4/0x174
[   71.119307][    T1]  do_one_initcall+0x328/0x788
[   71.123923][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.128973][    T1]  kernel_init+0x18/0x178
[   71.133155][    T1]  ret_from_fork+0x10/0x18
[   71.137420][    T1] 
[   71.139603][    T1] Freed by task 1:
[   71.143178][    T1]  save_stack+0x28/0xb0
[   71.147186][    T1]  __kasan_slab_free+0x140/0x170
[   71.151976][    T1]  kasan_slab_free+0x10/0x18
[   71.156418][    T1]  slab_free_freelist_hook+0x19c/0x228
[   71.161728][    T1]  kfree+0x264/0x420
[   71.165477][    T1]  dimm_attr_release+0x78/0x88
[   71.170093][    T1]  device_release+0xa4/0x108
[   71.174536][    T1]  kobject_put+0x250/0x2c0
[   71.178805][    T1]  device_unregister+0x88/0x98
[   71.183421][    T1]  edac_remove_sysfs_mci_device+0x20c/0x248
[   71.189166][    T1]  edac_mc_del_mc+0xec/0x158
[   71.193609][    T1]  ghes_edac_unregister+0x3c/0x70
[   71.198486][    T1]  ghes_remove+0x274/0x2a0
[   71.202755][    T1]  platform_drv_remove+0x44/0x78
[   71.207543][    T1]  really_probe+0x404/0x840
[   71.211899][    T1]  driver_probe_device+0x190/0x1f0
[   71.216861][    T1]  device_driver_attach+0x7c/0xb0
[   71.221737][    T1]  __driver_attach+0x1b8/0x1d0
[   71.226354][    T1]  bus_for_each_dev+0xf8/0x190
[   71.230969][    T1]  driver_attach+0x34/0x40
[   71.235237][    T1]  bus_add_driver+0x1d8/0x340
[   71.239766][    T1]  driver_register+0x168/0x1e8
[   71.244382][    T1]  __platform_driver_register+0x80/0x90
[   71.249778][    T1]  ghes_init+0xc4/0x174
[   71.253787][    T1]  do_one_initcall+0x328/0x788
[   71.258403][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.263453][    T1]  kernel_init+0x18/0x178
[   71.267635][    T1]  ret_from_fork+0x10/0x18
[   71.271900][    T1] 
[   71.274085][    T1] The buggy address belongs to the object at ffff002323ce2000
[   71.274085][    T1]  which belongs to the cache kmalloc-2k of size 2048
[   71.287989][    T1] The buggy address is located 0 bytes inside of
[   71.287989][    T1]  2048-byte region [ffff002323ce2000, ffff002323ce2800)
[   71.301022][    T1] The buggy address belongs to the page:
[   71.306508][    T1] page:fffffe008c6f3800 refcount:1 mapcount:0 mapping:ffff0020bfc10c80 index:0x0 compound_mapcount: 0
[   71.317291][    T1] flags: 0x1ffff00000010200(slab|head)
[   71.322606][    T1] raw: 1ffff00000010200 fffffe008c6f3608 fffffe008c6f3a08 ffff0020bfc10c80
[   71.331044][    T1] raw: 0000000000000000 0000000000050005 00000001ffffffff 0000000000000000
[   71.339477][    T1] page dumped because: kasan: bad access detected
[   71.345738][    T1] 
[   71.347920][    T1] Memory state around the buggy address:
[   71.353405][    T1]  ffff002323ce1f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   71.361319][    T1]  ffff002323ce1f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   71.369234][    T1] >ffff002323ce2000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.377145][    T1]                    ^
[   71.381066][    T1]  ffff002323ce2080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.388981][    T1]  ffff002323ce2100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.396892][    T1] ==================================================================


[-- Attachment #3: kasan edac log --]
[-- Type: text/plain, Size: 25719 bytes --]


[   70.234085][    T1] gbefb: probe of gbefb.0 failed with error -16
[   70.249643][    T1] kobject: 'wakeup' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 750)
[   70.260091][    T1] kobject: 'wakeup63' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 750)
[   70.268834][    T1] kobject: 'wakeup' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 500)
[   70.268879][    T1] kobject: 'wakeup64' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 250)
[   70.296399][    T1] [Firmware Bug]: APEI: Invalid bit width + offset in GAR [0x94110034/64/0/3/0]
[   70.306670][    T1] EDAC DEBUG: edac_mc_alloc: allocating 3524 bytes for mci data (32 dimms, 32 csrows/channels)
[   70.322002][    T1] EDAC DEBUG: ghes_edac_dmidecode: DIMM2: Registered-DDR4 size = 16384 MB(ECC)
[   70.330897][    T1] EDAC DEBUG: ghes_edac_dmidecode:         type 26, detail 0x2080, width 72(total 64)
[   70.339844][    T1] EDAC DEBUG: edac_mc_add_mc_with_groups: 
[   70.346860][    T1] EDAC DEBUG: edac_create_sysfs_mci_device: device mc0 created
[   70.355347][    T1] EDAC DEBUG: edac_create_dimm_object: device dimm2 created at location memory 2 
[   70.365595][    T1] EDAC DEBUG: edac_create_csrow_object: device csrow2 created
[   70.373817][    T1] EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[   70.385243][    T1] EDAC DEBUG: edac_mc_del_mc: 
[   70.390527][    T1] EDAC DEBUG: edac_remove_sysfs_mci_device: 
[   70.398823][    T1] _warn_unseeded_randomness: 49 callbacks suppressed
[   70.398845][    T1] random: get_random_u32 called from kobject_put+0x8c/0x190 with crng_init=0
[   70.414150][    T1] kobject: 'csrow2' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 500)
[   70.424461][    T1] EDAC DEBUG: edac_remove_sysfs_mci_device: unregistering device dimm2
[   70.433873][    T1] kobject: 'dimm2' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 750)
[   70.444066][    T1] EDAC MC: Removed device 0 for ghes_edac.c ghes_edac: DEV ghes
[   70.451689][    T1] EDAC DEBUG: edac_mc_free: 
[   70.456229][    T1] EDAC DEBUG: edac_unregister_sysfs: unregistering device mc0
[   70.465009][    T1] kobject: 'mc0' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 500)
[   70.475868][    T1] random: get_random_u32 called from new_slab+0x360/0x698 with crng_init=0
[   70.485594][    T1] EDAC DEBUG: edac_mc_del_mc: 
[   70.490369][    T1] EDAC DEBUG: edac_mc_free: 
[   70.495532][    T1] ------------[ cut here ]------------
[   70.500956][    T1] ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x48
[   70.511845][    T1] WARNING: CPU: 51 PID: 1 at lib/debugobjects.c:484 debug_print_object+0xec/0x130
[   70.520900][    T1] Modules linked in:
[   70.524671][    T1] CPU: 51 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3+ #1146
[   70.531991][    T1] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
[   70.541221][    T1] pstate: 80800009 (Nzcv daif -PAN +UAO)
[   70.541246][    T1] pc : debug_print_object+0xec/0x130
[   70.551881][    T1] lr : debug_print_object+0xec/0x130
[   70.551890][    T1] sp : ffff0020bf2c7740
[   70.551899][    T1] x29: ffff0020bf2c7740 x28: ffff002324575000 
[   70.551914][    T1] x27: ffff002324575090 x26: ffffa00017543de0 
[   70.551929][    T1] x25: ffffa000101cd558 x24: ffffa00012051fc0 
[   70.551952][    T1] x23: ffffa000150d2200 x22: ffffa000120523a0 
[   70.561099][    T1] x21: ffffa00012051640 x20: 0000000000000000 
[   70.561116][    T1] x19: ffffa00015019000 x18: 0000000000000000 
[   70.561131][    T1] x17: 0000000000000000 x16: 00000000000026b0 
[   70.561145][    T1] x15: 0000000000000000 x14: 6e6968207473696c 
[   70.561160][    T1] x13: 5f72656d6974203a x12: 1fffe00417e58e5a 
[   70.573187][    T1] x11: ffff800417e58e5a x10: dfffa00000000000 
[   70.585213][    T1] x9 : ffff800417e58e5b x8 : 0000000000000001 
[   70.585228][    T1] x7 : ffff0020bf2c72d7 x6 : ffff800417e58e5b 
[   70.585243][    T1] x5 : 1fffe00417e57936 x4 : ffff0020bf2bc058 
[   70.585258][    T1] x3 : ffffa00010000000 x2 : ffff800417e58eb0 
[   70.585273][    T1] x1 : 28c26c7bd9c65300 x0 : 0000000000000000 
[   70.597298][    T1] Call trace:
[   70.597312][    T1]  debug_print_object+0xec/0x130
[   70.597325][    T1]  __debug_check_no_obj_freed+0x114/0x290
[   70.597337][    T1]  debug_check_no_obj_freed+0x18/0x28
[   70.597349][    T1]  slab_free_freelist_hook+0x18c/0x228
[   70.597359][    T1]  kfree+0x264/0x420
[   70.597376][    T1]  _edac_mc_free+0x6c/0x210
[   70.609382][    T1]  edac_mc_free+0x68/0x88
[   70.609396][    T1]  ghes_edac_unregister+0x44/0x70
[   70.609410][    T1]  ghes_remove+0x274/0x2a0
[   70.609424][    T1]  platform_drv_remove+0x44/0x78
[   70.609434][    T1]  really_probe+0x404/0x840
[   70.609445][    T1]  driver_probe_device+0x190/0x1f0
[   70.609456][    T1]  device_driver_attach+0x7c/0xb0
[   70.609466][    T1]  __driver_attach+0x1b8/0x1d0
[   70.609478][    T1]  bus_for_each_dev+0xf8/0x190
[   70.609488][    T1]  driver_attach+0x34/0x40
[   70.609499][    T1]  bus_add_driver+0x1d8/0x340
[   70.609509][    T1]  driver_register+0x168/0x1e8
[   70.609529][    T1]  __platform_driver_register+0x80/0x90
[   70.621543][    T1]  ghes_init+0xc4/0x174
[   70.621556][    T1]  do_one_initcall+0x328/0x788
[   70.621571][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   70.621584][    T1]  kernel_init+0x18/0x178
[   70.621594][    T1]  ret_from_fork+0x10/0x18
[   70.621610][    T1] irq event stamp: 4389198
[   70.633626][    T1] hardirqs last  enabled at (4389197): [<ffffa00010272398>] console_unlock+0x8d8/0x990
[   70.633643][    T1] hardirqs last disabled at (4389198): [<ffffa000100fd884>] debug_exception_enter+0x8c/0x190
[   70.633655][    T1] softirqs last  enabled at (4389194): [<ffffa000100bf4a4>] __do_softirq+0x894/0x920
[   70.633670][    T1] softirqs last disabled at (4389187): [<ffffa000101965e4>] irq_exit+0x114/0x1a0
[   70.633687][    T1] random: get_random_bytes called from print_oops_end_marker+0x30/0x68 with crng_init=0
[   70.633709][    T1] ---[ end trace f366d53b6f843ce8 ]---
[   70.702660][    T1] ------------[ cut here ]------------
[   70.711430][    T1] ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x48
[   70.721167][    T1] WARNING: CPU: 51 PID: 1 at lib/debugobjects.c:484 debug_print_object+0xec/0x130
[   70.734461][    T1] Modules linked in:
[   70.744498][    T1] CPU: 51 PID: 1 Comm: swapper/0 Tainted: G        W         5.4.0-rc3+ #1146
[   70.744508][    T1] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
[   70.744519][    T1] pstate: 80800009 (Nzcv daif -PAN +UAO)
[   70.744531][    T1] pc : debug_print_object+0xec/0x130
[   70.744543][    T1] lr : debug_print_object+0xec/0x130
[   70.744555][    T1] sp : ffff0020bf2c7740
[   70.753182][    T1] x29: ffff0020bf2c7740 x28: ffff00232453a000 
[   70.753199][    T1] x27: ffff00232453a090 x26: ffffa00017543de0 
[   70.753215][    T1] x25: ffffa000101cd558 x24: ffffa00012051fc0 
[   70.753231][    T1] x23: ffffa000150d2200 x22: ffffa000120523a0 
[   70.766743][    T1] x21: ffffa00012051640 x20: 0000000000000000 
[   70.780503][    T1] x19: ffffa00015019000 x18: 0000000000000000 
[   70.780519][    T1] x17: 0000000000000000 x16: 00000000000026b0 
[   70.780534][    T1] x15: 0000000000000000 x14: 726f775f64657961 
[   70.780549][    T1] x13: 6c6564203a746e69 x12: 1fffe00417e58e5a 
[   70.799861][    T1] x11: ffff800417e58e5a x10: dfffa00000000000 
[   70.799877][    T1] x9 : ffff800417e58e5b x8 : 0000000000000001 
[   70.799892][    T1] x7 : ffff0020bf2c72d7 x6 : ffff800417e58e5b 
[   70.799907][    T1] x5 : 1fffe00417e57936 x4 : ffff0020bf2bc058 
[   70.799922][    T1] x3 : ffffa00010000000 x2 : ffff800417e58eb0 
[   70.829068][    T1] x1 : 28c26c7bd9c65300 x0 : 0000000000000000 
[   70.848735][    T1] Call trace:
[   70.848749][    T1]  debug_print_object+0xec/0x130
[   70.848762][    T1]  __debug_check_no_obj_freed+0x114/0x290
[   70.848774][    T1]  debug_check_no_obj_freed+0x18/0x28
[   70.848786][    T1]  slab_free_freelist_hook+0x18c/0x228
[   70.848801][    T1]  kfree+0x264/0x420
[   70.861248][    T1]  _edac_mc_free+0x1b0/0x210
[   70.861260][    T1]  edac_mc_free+0x68/0x88
[   70.861272][    T1]  ghes_edac_unregister+0x44/0x70
[   70.861283][    T1]  ghes_remove+0x274/0x2a0
[   70.861295][    T1]  platform_drv_remove+0x44/0x78
[   70.861305][    T1]  really_probe+0x404/0x840
[   70.861317][    T1]  driver_probe_device+0x190/0x1f0
[   70.861331][    T1]  device_driver_attach+0x7c/0xb0
[   70.926321][    T1]  __driver_attach+0x1b8/0x1d0
[   70.926338][    T1]  bus_for_each_dev+0xf8/0x190
[   70.938348][    T1]  driver_attach+0x34/0x40
[   70.938360][    T1]  bus_add_driver+0x1d8/0x340
[   70.938370][    T1]  driver_register+0x168/0x1e8
[   70.938382][    T1]  __platform_driver_register+0x80/0x90
[   70.938393][    T1]  ghes_init+0xc4/0x174
[   70.938407][    T1]  do_one_initcall+0x328/0x788
[   70.950417][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   70.950429][    T1]  kernel_init+0x18/0x178
[   70.950440][    T1]  ret_from_fork+0x10/0x18
[   70.950448][    T1] irq event stamp: 4389536
[   70.950461][    T1] hardirqs last  enabled at (4389535): [<ffffa000100c0e78>] el1_irq+0x138/0x200
[   70.950478][    T1] hardirqs last disabled at (4389536): [<ffffa000100fd884>] debug_exception_enter+0x8c/0x190
[   71.118261][    T1] softirqs last  enabled at (4389534): [<ffffa000100bf4a4>] __do_softirq+0x894/0x920
[   71.118278][    T1] softirqs last disabled at (4389527): [<ffffa000101965e4>] irq_exit+0x114/0x1a0
[   71.136533][    T1] ---[ end trace f366d53b6f843ce9 ]---
[   71.137908][    T1] ------------[ cut here ]------------
[   71.147364][    T1] ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x48
[   71.158178][    T1] WARNING: CPU: 51 PID: 1 at lib/debugobjects.c:484 debug_print_object+0xec/0x130
[   71.167232][    T1] Modules linked in:
[   71.167251][    T1] CPU: 51 PID: 1 Comm: swapper/0 Tainted: G        W         5.4.0-rc3+ #1146
[   71.167261][    T1] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
[   71.167271][    T1] pstate: 80800009 (Nzcv daif -PAN +UAO)
[   71.167283][    T1] pc : debug_print_object+0xec/0x130
[   71.167301][    T1] lr : debug_print_object+0xec/0x130
[   71.179747][    T1] sp : ffff0020bf2c7740
[   71.179756][    T1] x29: ffff0020bf2c7740 x28: ffff002324534000 
[   71.179772][    T1] x27: ffff002324534090 x26: ffffa00017543de0 
[   71.179787][    T1] x25: ffffa000101cd558 x24: ffffa00012051fc0 
[   71.179802][    T1] x23: ffffa000150d2200 x22: ffffa000120523a0 
[   71.179821][    T1] x21: ffffa00012051640 x20: 0000000000000000 
[   71.194524][    T1] x19: ffffa00015019000 x18: 0000000000000000 
[   71.194540][    T1] x17: 0000000000000000 x16: 00000000000026b0 
[   71.194555][    T1] x15: 0000000000000000 x14: 775f646579616c65 
[   71.194569][    T1] x13: 64203a746e696820 x12: 1fffe00417e58e5a 
[   71.204857][    T1] x11: ffff800417e58e5a x10: dfffa00000000000 
[   71.204873][    T1] x9 : ffff800417e58e5b x8 : 0000000000000001 
[   71.204889][    T1] x7 : ffff0020bf2c72d7 x6 : ffff800417e58e5b 
[   71.204904][    T1] x5 : 1fffe00417e57936 x4 : ffff0020bf2bc058 
[   71.214930][    T1] x3 : ffffa00010000000 x2 : ffff800417e58eb0 
[   71.214947][    T1] x1 : 28c26c7bd9c65300 x0 : 0000000000000000 
[   71.214961][    T1] Call trace:
[   71.214974][    T1]  debug_print_object+0xec/0x130
[   71.214986][    T1]  __debug_check_no_obj_freed+0x114/0x290
[   71.215006][    T1]  debug_check_no_obj_freed+0x18/0x28
[   71.281033][    T1]  slab_free_freelist_hook+0x18c/0x228
[   71.281044][    T1]  kfree+0x264/0x420
[   71.281055][    T1]  _edac_mc_free+0x1f8/0x210
[   71.281066][    T1]  edac_mc_free+0x68/0x88
[   71.281078][    T1]  ghes_edac_unregister+0x44/0x70
[   71.281089][    T1]  ghes_remove+0x274/0x2a0
[   71.281100][    T1]  platform_drv_remove+0x44/0x78
[   71.281111][    T1]  really_probe+0x404/0x840
[   71.281121][    T1]  driver_probe_device+0x190/0x1f0
[   71.281132][    T1]  device_driver_attach+0x7c/0xb0
[   71.281142][    T1]  __driver_attach+0x1b8/0x1d0
[   71.281154][    T1]  bus_for_each_dev+0xf8/0x190
[   71.281166][    T1]  driver_attach+0x34/0x40
[   71.293176][    T1]  bus_add_driver+0x1d8/0x340
[   71.293186][    T1]  driver_register+0x168/0x1e8
[   71.293198][    T1]  __platform_driver_register+0x80/0x90
[   71.293208][    T1]  ghes_init+0xc4/0x174
[   71.293219][    T1]  do_one_initcall+0x328/0x788
[   71.293231][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.302370][    T1]  kernel_init+0x18/0x178
[   71.302381][    T1]  ret_from_fork+0x10/0x18
[   71.302389][    T1] irq event stamp: 4390142
[   71.302401][    T1] hardirqs last  enabled at (4390141): [<ffffa000100c0e78>] el1_irq+0x138/0x200
[   71.302416][    T1] hardirqs last disabled at (4390142): [<ffffa000100fd884>] debug_exception_enter+0x8c/0x190
[   71.302429][    T1] softirqs last  enabled at (4390140): [<ffffa000100bf4a4>] __do_softirq+0x894/0x920
[   71.312787][    T1] softirqs last disabled at (4390133): [<ffffa000101965e4>] irq_exit+0x114/0x1a0
[   71.312796][    T1] ---[ end trace f366d53b6f843cea ]---
[   71.374558][    T1] ==================================================================
[   71.382943][    T1] BUG: KASAN: use-after-free in ghes_edac_unregister+0x28/0x70
[   71.382954][    T1] Read of size 8 at addr ffff002324534bdc by task swapper/0/1
[   71.382961][    T1] 
[   71.382977][    T1] CPU: 52 PID: 1 Comm: swapper/0 Tainted: G        W         5.4.0-rc3+ #1146
[   71.382986][    T1] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
[   71.382995][    T1] Call trace:
[   71.383010][    T1]  dump_backtrace+0x0/0x298
[   71.393017][    T1]  show_stack+0x20/0x30
[   71.393029][    T1]  dump_stack+0x190/0x21c
[   71.393043][    T1]  print_address_description.isra.6+0x80/0x3d0
[   71.393055][    T1]  __kasan_report+0x174/0x2s_edac_unregister+0x28/0x70
[   71.469817][    T1]  ghes_remove+0x274/0x2a0
[   71.469837][    T1]  platform_drv_remove+0x44/0x78
[   71.484544][    T1]  really_probe+0x404/0x840
[   71.484556][    T1]  driver_probe_device+0x190/0x1f0
[   71.484567][    T1]  device_driver_attach+0x7c/0xb0
[   71.484578][    T1]  __driver_attach+0x1b8/0x1d0
[   71.484589][    T1]  bus_for_each_dev+0xf8/0x190
[   71.484600][    T1]  driver_attach+0x34/0x40
[   71.484618][    T1]  bus_add_driver+0x1d8/0x340
[   71.495501][    T1]  driver_register+0x168/0x1e8
[   71.495514][    T1]  __platform_driver_register+0x80/0x90
[   71.495525][    T1]  ghes_init+0xc4/0x174
[   71.495536][    T1]  do_one_initcall+0x328/0x788
[   71.495548][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.495560][    T1]  kernel_init+0x18/0x178
[   71.495571][    T1]  ret_from_fork+0x10/0x18
[   71.495582][    T1] 
[   71.535102][    T1] Allocated by task 1:
[   71.535115][    T1]  save_stack+0x28/0xb0
[   71.544170][    T1]  __kasan_kmalloc.isra.9+0xa0/0xc8
[   71.544181][    T1]  kasan_kmalloc+0xc/0x18
[   71.544192][    T1]  __kmalloc+0x2d0/0x338
[   71.544205][    T1]  edac_mc_alloc+0xaa8/0xb18
[   71.544216][    T1]  ghes_edac_register+0x164/0x398
[   71.544227][    T1]  ghes_probe+0x648/0x6d8
[   71.544239][    T1]  platform_drv_probe+0x8c/0x110
[   71.544250][    T1]  really_probe+0x32c/0x840
[   71.553304][    T1]  driver_probe_device+0x190/0x1f0
[   71.553315][    T1]  device_driver_attach+0x7c/0xb0
[   71.553326][    T1]  __driver_attach+0x1b8/0x1d0
[   71.553338][    T1]  bus_for_each_dev+0xf8/0x190
[   71.553348][    T1]  driver_attach+0x34/0x40
[   71.553359][    T1]  bus_add_driver+0x1d8/0x340
[   71.553369][    T1]  driver_register+0x168/0x1e8
[   71.553382][    T1]  __platform_driver_register+0x80/0x90
[   71.567572][    T1]  ghes_init+0xc4/0x174
[   71.567588][    T1]  do_one_initcall+0x328/0x788
[   71.576829][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.576841][    T1]  kernel_init+0x18/0x178
[   71.576852][    T1]  ret_from_fork+0x10/0x18
[   71.576859][    T1] 
[   71.576868][    T1] Freed by task 1:
[   71.576879][    T1]  save_stack+0x28/0xb0
[   71.576891][    T1]  __kasan_slab_free+0x140/0x170
[   71.576908][    T1]  kasan_slab_free+0x10/0x18
[   71.585708][    T1]  slab_free_freelist_hook+0x19c/0x228
[   71.585720][    T1]  kfree+0x264/0x420
[   71.585732][    T1]  _edac_mc_free+0x1f8/0x210
[   71.585743][    T1]  edac_mc_free+0x68/0x88
[   71.585754][    T1]  ghes_edac_unregister+0x44/0x70
[   71.585766][    T1]  ghes_remove+0x274/0x2a0
[   71.585777][    T1]  platform_drv_remove+0x44/0x78
[   71.585792][    T1]  really_probe+0x404/0x840
[   71.659765][  T904] kobject: 'wakeup54' ((____ptrval____)): kobject_cleanup, parent (____ptrval____)
[   71.663982][    T1]  driver_probe_device+0x190/0x1f0
[   71.663994][    T1]  device_driver_attach+0x7c/0xb0
[   71.664006][    T1]  __driver_attach+0x1b8/0x1d0
[   71.664017][    T1]  bus_for_each_dev+0xf8/0x190
[   71.664028][    T1]  driver_attach+0x34/0x40
[   71.664038][    T1]  bus_add_driver+0x1d8/0x340
[   71.664049][    T1]  driver_register+0x168/0x1e8
[   71.664061][    T1]  __platform_driver_register+0x80/0x90
[   71.664071][    T1]  ghes_init+0xc4/0x174
[   71.664082][    T1]  do_one_initcall+0x328/0x788
[   71.664094][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.664105][    T1]  kernel_init+0x18/0x178
[   71.664116][    T1]  ret_from_fork+0x10/0x18
[   71.664129][    T1] 
[   71.669171][  T904] kobject: 'wakeup54' ((____ptrval____)): calling ktype release
[   71.673978][    T1] The buggy address belongs to the object at ffff002324534000
[   71.673978][    T1]  which belongs to the cache kmalloc-4k of size 4096
[   71.673990][    T1] The buggy address is located 3036 bytes inside of
[   71.673990][    T1]  4096-byte region [ffff002324534000, ffff002324535000)
[   71.673999][    T1] The buggy address belongs to the page:
[   71.674013][    T1] page:fffffe008c714c00 refcount:1 mapcount:0 mapping:ffff0020bfc16980 index:0x0 compound_mapcount: 0
[   71.674032][    T1] flags: 0x1ffff00000010200(slab|head)
[   71.674055][    T1] raw: 1ffff00000010200 fffffe008c714808 fffffe008c716e08 ffff0020bfc16980
[   71.678784][  T904] kobject: 'wakeup54': free name
[   71.683294][    T1] raw: 0000000000000000 0000000000020002 00000001ffffffff 0000000000000000
[   71.683303][    T1] page dumped because: kasan: bad access detected
[   71.683310][    T1] 
[   71.683318][    T1] Memory state around the buggy address:
[   71.683330][    T1]  ffff002324534a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.683341][    T1]  ffff002324534b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.683352][    T1] >ffff002324534b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.683368][    T1]                                                     ^
[   71.755750][  T853] kobject: 'wakeup' ((____ptrval____)): kobject_cleanup, parent (____ptrval____)
[   71.756770][    T1]  ffff002324534c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.756781][    T1]  ffff002324534c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.761102][  T853] kobject: 'wakeup' ((____ptrval____)): calling ktype release
[   71.765835][    T1] ==================================================================
[   71.765843][    T1] Disabling lock debugging due to kernel taint
[   71.765935][  T850] kobject: 'wakeup21' ((____ptrval____)): kobject_cleanup, parent (____ptrval____)
[   71.766851][    T1] EDAC DEBUG: edac_mc_del_mc: 
[   71.766864][    T1] EDAC DEBUG: edac_mc_free: 
[   71.766881][    T1] ==================================================================
[   71.766891][    T1] BUG: KASAN: double-free or invalid-free in kfree+0x264/0x420
[   71.766895][    T1] 
[   71.766904][    T1] CPU: 48 PID: 1 Comm: swapper/0 Tainted: G    B   W         5.4.0-rc3+ #1146
[   71.766910][    T1] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
[   71.766915][    T1] Call trace:
[   71.766923][    T1]  dump_backtrace+0x0/0x298
[   71.766929][    T1]  show_stack+0x20/0x30
[   71.766936][    T1]  dump_stack+0x190/0x21c
[   71.766945][    T1]  print_address_description.isra.6+0x80/0x3d0
[   71.766953][    T1]  kasan_report_invalid_free+0x78/0xa0
[   71.766960][    T1]  __kasan_slab_free+0xbc/0x170
[   71.766968][    T1]  kasan_slab_free+0x10/0x18
[   71.766975][    T1]  slab_free_freelist_hook+0x19c/0x228
[   71.766981][    T1]  kfree+0x264/0x420
[   71.766989][    T1]  _edac_mc_free+0x6c/0x210
[   71.766997][    T1]  edac_mc_free+0x68/0x88
[   71.767004][    T1]  ghes_edac_unregister+0x44/0x70
[   71.767012][    T1]  ghes_remove+0x274/0x2a0
[   71.767019][    T1]  platform_drv_remove+0x44/0x78
[   71.767026][    T1]  really_probe+0x404/0x840
[   71.767033][    T1]  driver_probe_device+0x190/0x1f0
[   71.767039][    T1]  device_driver_attach+0x7c/0xb0
[   71.767046][    T1]  __driver_attach+0x1b8/0x1d0
[   71.767054][    T1]  bus_for_each_dev+0xf8/0x190
[   71.767060][    T1]  driver_attach+0x34/0x40
[   71.767067][    T1]  bus_add_driver+0x1d8/0x340
[   71.767073][    T1]  driver_register+0x168/0x1e8
[   71.767081][    T1]  __platform_driver_register+0x80/0x90
[   71.767088][    T1]  ghes_init+0xc4/0x174
[   71.767095][    T1]  do_one_initcall+0x328/0x788
[   71.767104][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.767111][    T1]  kernel_init+0x18/0x178
[   71.767118][    T1]  ret_from_fork+0x10/0x18
[   71.767122][    T1] 
[   71.767127][    T1] Allocated by task 1:
[   71.767135][    T1]  save_stack+0x28/0xb0
[   71.767143][    T1]  __kasan_kmalloc.isra.9+0xa0/0xc8
[   71.767150][    T1]  kasan_kmalloc+0xc/0x18
[   71.767157][    T1]  kmem_cache_alloc_trace+0x2a0/0x2e8
[   71.767165][    T1]  edac_mc_alloc+0x5d4/0xb18
[   71.767172][    T1]  ghes_edac_register+0x164/0x398
[   71.767180][    T1]  ghes_probe+0x648/0x6d8
[   71.767187][    T1]  platform_drv_probe+0x8c/0x110
[   71.767193][    T1]  really_probe+0x32c/0x840
[   71.767201][    T1]  driver_probe_device+0x190/0x1f0
[   71.767207][    T1]  device_driver_attach+0x7c/0xb0
[   71.767214][    T1]  __driver_attach+0x1b8/0x1d0
[   71.767222][    T1]  bus_for_each_dev+0xf8/0x190
[   71.767228][    T1]  driver_attach+0x34/0x40
[   71.767234][    T1]  bus_add_driver+0x1d8/0x340
[   71.767241][    T1]  driver_register+0x168/0x1e8
[   71.767249][    T1]  __platform_driver_register+0x80/0x90
[   71.767255][    T1]  ghes_init+0xc4/0x174
[   71.767262][    T1]  do_one_initcall+0x328/0x788
[   71.767270][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.767277][    T1]  kernel_init+0x18/0x178
[   71.767284][    T1]  ret_from_fork+0x10/0x18
[   71.767287][    T1] 
[   71.767292][    T1] Freed by task 1:
[   71.767299][    T1]  save_stack+0x28/0xb0
[   71.767306][    T1]  __kasan_slab_free+0x140/0x170
[   71.767314][    T1]  kasan_slab_free+0x10/0x18
[   71.767321][    T1]  slab_free_freelist_hook+0x19c/0x228
[   71.767327][    T1]  kfree+0x264/0x420
[   71.767335][    T1]  _edac_mc_free+0x15c/0x210
[   71.767342][    T1]  edac_mc_free+0x68/0x88
[   71.767349][    T1]  ghes_edac_unregister+0x44/0x70
[   71.767357][    T1]  ghes_remove+0x274/0x2a0
[   71.767364][    T1]  platform_drv_remove+0x44/0x78
[   71.767371][    T1]  really_probe+0x404/0x840
[   71.767377][    T1]  driver_probe_device+0x190/0x1f0
[   71.767384][    T1]  device_driver_attach+0x7c/0xb0
[   71.767391][    T1]  __driver_attach+0x1b8/0x1d0
[   71.767398][    T1]  bus_for_each_dev+0xf8/0x190
[   71.767405][    T1]  driver_attach+0x34/0x40
[   71.767411][    T1]  bus_add_driver+0x1d8/0x340
[   71.767418][    T1]  driver_register+0x168/0x1e8
[   71.767426][    T1]  __platform_driver_register+0x80/0x90
[   71.767432][    T1]  ghes_init+0xc4/0x174
[   71.767439][    T1]  do_one_initcall+0x328/0x788
[   71.767447][    T1]  kernel_init_freeable+0x2fc/0x3d4
[   71.767454][    T1]  kernel_init+0x18/0x178
[   71.767461][    T1]  ret_from_fork+0x10/0x18
[   71.767464][    T1] 
[   71.767471][    T1] The buggy address belongs to the object at ffff002324528800
[   71.767471][    T1]  which belongs to the cache kmalloc-128 of size 128
[   71.767478][    T1] The buggy address is located 0 bytes inside of
[   71.767478][    T1]  128-byte region [ffff002324528800, ffff002324528880)
[   71.767482][    T1] The buggy address belongs to the page:
[   71.767490][    T1] page:fffffe008c714a00 refcount:1 mapcount:0 mapping:ffff0020bfc10580 index:0xffff00232452e480 compound_mapcount: 0
[   71.767500][    T1] flags: 0x1ffff00000010200(slab|head)
[   71.767511][    T1] raw: 1ffff00000010200 fffffe008c72b408 fffffe008c715408 ffff0020bfc10580
[   71.767521][    T1] raw: ffff00232452e480 0000000000330019 00000001ffffffff 0000000000000000
[   71.767525][    T1] page dumped because: kasan: bad access detected
[   71.767529][    T1] 
[   71.767532][    T1] Memory state around the buggy address:
[   71.767540][    T1]  ffff002324528700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   71.767547][    T1]  ffff002324528780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   71.767553][    T1] >ffff002324528800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.767557][    T1]                    ^
[   71.767564][    T1]  ffff002324528880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   71.767571][    T1]  ffff002324528900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   71.767575][    T1] ==================================================================

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: edac KASAN warning in experimental arm64 allmodconfig boot
  2019-10-14 16:15 ` James Morse
@ 2019-10-14 16:56   ` John Garry
  2019-10-14 16:57     ` Borislav Petkov
  0 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2019-10-14 16:56 UTC (permalink / raw)
  To: James Morse
  Cc: Borislav Petkov, Mauro Carvalho Chehab, tony.luck,
	Robert Richter, linux-edac, linux-kernel

On 14/10/2019 17:15, James Morse wrote:
> Hi John,
>

Hi James,

> On 14/10/2019 16:18, John Garry wrote:
>> I'm experimenting by trying to boot an allmodconfig arm64 kernel, as mentioned here:
>
> Crumbs!
>
>
>> One thing that I noticed - it's hard to miss actually - is the amount of complaining from
>> KASAN about the EDAC/ghes code. Maybe this is something I should not care about/red
>> herring, or maybe something genuine. Let me know what you think.
>
> Hmmm, I thought I tested this recently...
>
>> Log snippet (I cut off after the first KASAN warning):
>>
>> [   70.471011][    T1] random: get_random_u32 called from new_slab+0x360/0x698 with
>> crng_init=0
>
>> [   70.478671][    T1] [Firmware Bug]: APEI: Invalid bit width + offset in GAR
>> [0x94110034/64/0/3/0]
>
> (this one's for you right?)

Yeah, I'll report it. It might be already fixed.

>
>> [   70.700412][    T1] ------------[ cut here ]------------
>
>> [   70.802080][    T1] Call trace:
>> [   70.802093][    T1]  debug_print_object+0xec/0x130
>> [   70.802106][    T1]  __debug_check_no_obj_freed+0x114/0x290
>> [   70.802119][    T1]  debug_check_no_obj_freed+0x18/0x28
>> [   70.802130][    T1]  slab_free_freelist_hook+0x18c/0x228
>> [   70.802140][    T1]  kfree+0x264/0x420
>> [   70.802157][    T1]  _edac_mc_free+0x6c/0x210
>> [   70.814163][    T1]  edac_mc_free+0x68/0x88
>> [   70.814177][    T1]  ghes_edac_unregister+0x44/0x70
>> [   70.814193][    T1]  ghes_remove+0x274/0x2a0
>
> Ugh. This must be the test driver remove thing.

Yeah, the probe, remove, probe again flow from 
CONFIG_DEBUG_TEST_DRIVER_REMOVE.

>
> I've reproduced this, but had to remove the parent GHES twice. It looks like it tries to
> use the first ghes_edac global variables when freeing the second. ghes_init prevents it
> from re-allocating over the top.
>
> The below diff fixes it for me.

And for me by the looks of it. That's with CONFIG_DEBUG_KOBJECT_RELEASE 
now unset, but I expect the same with it set.

> (I'll post it as a proper patch once I've done the
> archaeology)
>
> -----------%<-----------
> diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> index d413a0bdc9ad..955b59b6aade 100644
> --- a/drivers/edac/ghes_edac.c
> +++ b/drivers/edac/ghes_edac.c
> @@ -554,6 +554,7 @@ void ghes_edac_unregister(struct ghes *ghes)
>                 return;
>
>         mci = ghes_pvt->mci;
> +       ghes_pvt = NULL;
>         edac_mc_del_mc(mci->pdev);
>         edac_mc_free(mci);
>  }
>
> -----------%<-----------
>
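
For anyone following along, the probe, remove, probe again flow and the
effect of clearing the global can be modelled in userspace. This is only
a minimal sketch under stated assumptions: fake_pvt, fake_mci,
fake_register(), fake_unregister(), g_pvt and g_free_calls are all
hypothetical names, and the two separate allocations are a
simplification that does not mirror how the driver lays out its data.
The only point it shows is that clearing the global before freeing makes
a second unregister a no-op instead of a second free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's data; not the kernel structs. */
struct fake_mci { int id; };
struct fake_pvt { struct fake_mci *mci; };

static struct fake_pvt *g_pvt; /* plays the role of the ghes_pvt global */
static int g_free_calls;       /* how many times the instance was released */

/* Register: allocate only when no instance is recorded yet. */
static bool fake_register(void)
{
    if (g_pvt)
        return true; /* already registered, nothing to allocate */

    struct fake_pvt *pvt = malloc(sizeof(*pvt));
    struct fake_mci *mci = malloc(sizeof(*mci));
    if (!pvt || !mci) {
        free(pvt);
        free(mci);
        return false;
    }
    mci->id = 0;
    pvt->mci = mci;
    g_pvt = pvt;
    return true;
}

/* Unregister: clear the global first, then release the memory. */
static void fake_unregister(void)
{
    if (!g_pvt)
        return; /* nothing registered, or already torn down */

    struct fake_pvt *pvt = g_pvt;
    g_pvt = NULL; /* later callers can no longer reach the freed data */
    assert(pvt->mci != NULL);
    free(pvt->mci);
    free(pvt);
    g_free_calls++;
}
```

Calling fake_unregister() twice in a row releases the memory once, and a
following fake_register() allocates a fresh instance rather than
trusting a stale pointer.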

Thanks,
John

BTW, I am not sure if my response to Boris was rejected due to 
attachments, but it is here:

https://lore.kernel.org/linux-edac/dc974549-6ea4-899d-7f3a-b2fcfafe1528@arm.com/T/#ma0e122ca0eda9d80e869af179352f75037146d3c

>
> Thanks!
>
> James
>
> .
>



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: edac KASAN warning in experimental arm64 allmodconfig boot
  2019-10-14 16:56   ` John Garry
@ 2019-10-14 16:57     ` Borislav Petkov
  0 siblings, 0 replies; 21+ messages in thread
From: Borislav Petkov @ 2019-10-14 16:57 UTC (permalink / raw)
  To: John Garry
  Cc: James Morse, Mauro Carvalho Chehab, tony.luck, Robert Richter,
	linux-edac, linux-kernel

On Mon, Oct 14, 2019 at 05:56:02PM +0100, John Garry wrote:
> BTW, I am not sure if my response to Boris was rejected due to attachments,
> but it is here:
> 
> https://lore.kernel.org/linux-edac/dc974549-6ea4-899d-7f3a-b2fcfafe1528@arm.com/T/#ma0e122ca0eda9d80e869af179352f75037146d3c

No, all good. It went through.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 21+ messages in thread

* linuxnext-2019119 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot)
  2019-10-14 15:18 edac KASAN warning in experimental arm64 allmodconfig boot John Garry
  2019-10-14 16:09 ` Borislav Petkov
  2019-10-14 16:15 ` James Morse
@ 2019-11-21 12:34 ` John Garry
  2019-11-21 14:23   ` Robert Richter
  2 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2019-11-21 12:34 UTC (permalink / raw)
  To: Borislav Petkov, Mauro Carvalho Chehab, James Morse, tony.luck,
	Robert Richter
  Cc: linux-edac, linux-kernel

On 14/10/2019 16:18, John Garry wrote:


Hi guys,

JFYI, I see an issue on linux-next 20191119, as follows:

[   21.645388] io scheduler kyber registered
[   21.734011] input: Power Button as 
/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input0
[   21.743295] ACPI: Power Button [PWRB]
[   21.809644] [Firmware Bug]: APEI: Invalid bit width + offset in GAR 
[0x94110034/64/0/3/0]
[   21.821974] EDAC MC0: Giving out device to module ghes_edac.c 
controller ghes_edac: DEV ghes (INTERRUPT)
[   21.831763] ------------[ cut here ]------------
[   21.836374] refcount_t: increment on 0; use-after-free.
[   21.841620] WARNING: CPU: 36 PID: 1 at lib/refcount.c:156 
refcount_inc_checked+0x44/0x50
[   21.849697] Modules linked in:
[   21.852745] CPU: 36 PID: 1 Comm: swapper/0 Not tainted 
5.4.0-rc8-next-20191119-00003-g141a9fef5092-dirty #650
[   21.862645] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI 
RC0 - V1.16.01 03/15/2019
[   21.871157] pstate: 60c00009 (nZCv daif +PAN +UAO)
[   21.875936] pc : refcount_inc_checked+0x44/0x50
[   21.880455] lr : refcount_inc_checked+0x44/0x50
[   21.884972] sp : ffff00236ffbf8a0
[   21.888274] x29: ffff00236ffbf8a0 x28: 0000000000000002
[   21.893576] x27: ffff00236cd07900 x26: ffff002369063010
[   21.898876] x25: 0000000000000000 x24: ffff00233c236824
[   21.904177] x23: ffffa000137b9000 x22: ffffa00016fbb7c0
[   21.909477] x21: ffffa00012dfd000 x20: 1fffe0046dff7f24
[   21.914777] x19: ffff00233c236000 x18: 0000000000000000
[   21.920077] x17: 0000000000000000 x16: 0000000000000000
[   21.925377] x15: 0000000000007700 x14: 64655f7365686720
[   21.930677] x13: 72656c6c6f72746e x12: 1ffff40002719618
[   21.935977] x11: ffff940002719618 x10: dfffa00000000000
[   21.941278] x9 : ffff940002719619 x8 : 0000000000000001
[   21.946578] x7 : 0000000000000000 x6 : 0000000000000001
[   21.951877] x5 : ffff940002719618 x4 : ffff00236ffb0010
[   21.957178] x3 : ffffa000112415e4 x2 : ffff80046dff7ede
[   21.962478] x1 : 5aff78756b1cf400 x0 : 0000000000000000
[   21.967779] Call trace:
[   21.970214]  refcount_inc_checked+0x44/0x50
[   21.974389]  ghes_edac_register+0x258/0x388
[   21.978562]  ghes_probe+0x28c/0x5f0
[   21.982041]  platform_drv_probe+0x70/0xd8
[   21.986039]  really_probe+0x174/0x468
[   21.989690]  driver_probe_device+0x7c/0x148
[   21.993862]  device_driver_attach+0x94/0xa0
[   21.998033]  __driver_attach+0xa4/0x110
[   22.001857]  bus_for_each_dev+0xe8/0x158
[   22.005768]  driver_attach+0x30/0x40
[   22.009331]  bus_add_driver+0x234/0x2f0
[   22.013156]  driver_register+0xbc/0x1d0
[   22.016981]  __platform_driver_register+0x7c/0x88
[   22.021675]  ghes_init+0xbc/0x14c
[   22.024979]  do_one_initcall+0xb4/0x254
[   22.028805]  kernel_init_freeable+0x248/0x2f4
[   22.033151]  kernel_init+0x10/0x118
[   22.036628]  ret_from_fork+0x10/0x18
[   22.040194] ---[ end trace 33655bb65a9835fe ]---
[   22.046666] EDAC MC: bug in low-level driver: attempt to assign
[   22.046666]     duplicate mc_idx 0 in add_mc_to_global_list()
[   22.058311] ghes_edac: Can't register at EDAC core
[   22.065402] EDAC MC: bug in low-level driver: attempt to assign
[   22.065402]     duplicate mc_idx 0 in add_mc_to_global_list()
[   22.077080] ghes_edac: Can't register at EDAC core
[   22.084140] EDAC MC: bug in low-level driver: attempt to assign
[   22.084140]     duplicate mc_idx 0 in add_mc_to_global_list()
[   22.095789] ghes_edac: Can't register at EDAC core
[   22.102873] EDAC MC: bug in low-level driver: attempt to assign
[   22.102873]     duplicate mc_idx 0 in add_mc_to_global_list()
[   22.115442] ghes_edac: Can't register at EDAC core
[   22.122536] EDAC MC: bug in low-level driver: attempt to assign
[   22.122536]     duplicate mc_idx 0 in add_mc_to_global_list()
[   22.134344] ghes_edac: Can't register at EDAC core
[   22.141441] EDAC MC: bug in low-level driver: attempt to assign
[   22.141441]     duplicate mc_idx 0 in add_mc_to_global_list()
[   22.153089] ghes_edac: Can't register at EDAC core
[   22.160161] EDAC MC: bug in low-level driver: attempt to assign
[   22.160161]     duplicate mc_idx 0 in add_mc_to_global_list()
[   22.171810] ghes_edac: Can't register at EDAC core
[   22.178933] GHES: APEI firmware first mode is enabled by APEI bit and 
WHEA _OSC.

This time I'm using a standard arm64 defconfig, except that KASAN and 
kmemleak are enabled (I need to enable them when developing software - 
joke). Maybe it's a known issue, I don't know.

Cheers,
John

> Hi guys,
> 
> I'm experimenting by trying to boot an allmodconfig arm64 kernel, as 
> mentioned here:
> https://lore.kernel.org/linux-arm-kernel/507325a3-030e-2843-0f46-7e18c60257de@huawei.com/ 
> 
> 
> One thing that I noticed - it's hard to miss actually - is the amount of 
> complaining from KASAN about the EDAC/ghes code. Maybe this is something 
> I should not care about/red herring, or maybe something genuine. Let me 
> know what you think.
> 
> The kernel is v5.4-rc3, and I raised the EDAC mc debug level to get 
> extra debug prints.
> 

[cut]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: linuxnext-2019119 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot)
  2019-11-21 12:34 ` linuxnext-2019119 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot) John Garry
@ 2019-11-21 14:23   ` Robert Richter
  2019-11-21 15:23     ` John Garry
  0 siblings, 1 reply; 21+ messages in thread
From: Robert Richter @ 2019-11-21 14:23 UTC (permalink / raw)
  To: John Garry
  Cc: Borislav Petkov, Mauro Carvalho Chehab, James Morse, tony.luck,
	linux-edac, linux-kernel

Hi John,

thanks for testing and reporting this. See inline.

On 21.11.19 12:34:22, John Garry wrote:
> On 14/10/2019 16:18, John Garry wrote:
> JFYI, I see an issue on linux-next 20191119, as follows:
> 
> [   21.645388] io scheduler kyber registered
> [   21.734011] input: Power Button as
> /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input0
> [   21.743295] ACPI: Power Button [PWRB]
> [   21.809644] [Firmware Bug]: APEI: Invalid bit width + offset in GAR
> [0x94110034/64/0/3/0]
> [   21.821974] EDAC MC0: Giving out device to module ghes_edac.c controller
> ghes_edac: DEV ghes (INTERRUPT)
> [   21.831763] ------------[ cut here ]------------
> [   21.836374] refcount_t: increment on 0; use-after-free.
> [   21.841620] WARNING: CPU: 36 PID: 1 at lib/refcount.c:156
> refcount_inc_checked+0x44/0x50
> [   21.849697] Modules linked in:
> [   21.852745] CPU: 36 PID: 1 Comm: swapper/0 Not tainted
> 5.4.0-rc8-next-20191119-00003-g141a9fef5092-dirty #650
> [   21.862645] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 -
> V1.16.01 03/15/2019
> [   21.871157] pstate: 60c00009 (nZCv daif +PAN +UAO)
> [   21.875936] pc : refcount_inc_checked+0x44/0x50
> [   21.880455] lr : refcount_inc_checked+0x44/0x50

This is a warning from the refcount framework. It warns if we
increment from zero. This is reasonable, as typically a kernel object
is created with a refcount of 1 and thrown away once the refcount
drops to zero. An increment from zero therefore usually means the
object is being used after it was freed.

For ghes the refcount is initialized to zero, and that is why we see
this message. However, we protect the refcount with the ghes_reg_mutex
and thus there is no use-after-free. The device is allocated and
registered if the refcount is zero. So this works fine.
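
To illustrate the difference between incrementing from zero and setting
the count to 1, here is a minimal userspace sketch. It is not the
kernel's refcount_t: model_refcount, model_refcount_set(),
model_refcount_inc(), model_refcount_dec_and_test() and model_warnings
are hypothetical names, and leaving the counter unchanged on an
increment from zero is an assumption of this model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a checked refcount. */
struct model_refcount { unsigned int refs; };

static int model_warnings; /* counts "increment on 0" events */

/* Set the counter unconditionally, e.g. when an object is first created. */
static void model_refcount_set(struct model_refcount *r, unsigned int n)
{
    r->refs = n;
}

/* Increment, but treat an increment from zero as suspicious. */
static void model_refcount_inc(struct model_refcount *r)
{
    if (r->refs == 0) {
        model_warnings++; /* a checked implementation would warn here */
        return;           /* model assumption: counter stays at zero */
    }
    r->refs++;
}

/* Drop one reference; returns true when the last reference is gone. */
static bool model_refcount_dec_and_test(struct model_refcount *r)
{
    assert(r->refs > 0);
    r->refs--;
    return r->refs == 0;
}
```

With this model, a first registration that uses model_refcount_inc() on
a zeroed counter is flagged, while one that uses
model_refcount_set(&r, 1) is not, and later increments behave the same
in both cases.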

Enclosed a fix that avoids the warning, please test.

But see below...

> [   21.884972] sp : ffff00236ffbf8a0
> [   21.888274] x29: ffff00236ffbf8a0 x28: 0000000000000002
> [   21.893576] x27: ffff00236cd07900 x26: ffff002369063010
> [   21.898876] x25: 0000000000000000 x24: ffff00233c236824
> [   21.904177] x23: ffffa000137b9000 x22: ffffa00016fbb7c0
> [   21.909477] x21: ffffa00012dfd000 x20: 1fffe0046dff7f24
> [   21.914777] x19: ffff00233c236000 x18: 0000000000000000
> [   21.920077] x17: 0000000000000000 x16: 0000000000000000
> [   21.925377] x15: 0000000000007700 x14: 64655f7365686720
> [   21.930677] x13: 72656c6c6f72746e x12: 1ffff40002719618
> [   21.935977] x11: ffff940002719618 x10: dfffa00000000000
> [   21.941278] x9 : ffff940002719619 x8 : 0000000000000001
> [   21.946578] x7 : 0000000000000000 x6 : 0000000000000001
> [   21.951877] x5 : ffff940002719618 x4 : ffff00236ffb0010
> [   21.957178] x3 : ffffa000112415e4 x2 : ffff80046dff7ede
> [   21.962478] x1 : 5aff78756b1cf400 x0 : 0000000000000000
> [   21.967779] Call trace:
> [   21.970214]  refcount_inc_checked+0x44/0x50
> [   21.974389]  ghes_edac_register+0x258/0x388
> [   21.978562]  ghes_probe+0x28c/0x5f0
> [   21.982041]  platform_drv_probe+0x70/0xd8
> [   21.986039]  really_probe+0x174/0x468
> [   21.989690]  driver_probe_device+0x7c/0x148
> [   21.993862]  device_driver_attach+0x94/0xa0
> [   21.998033]  __driver_attach+0xa4/0x110
> [   22.001857]  bus_for_each_dev+0xe8/0x158
> [   22.005768]  driver_attach+0x30/0x40
> [   22.009331]  bus_add_driver+0x234/0x2f0
> [   22.013156]  driver_register+0xbc/0x1d0
> [   22.016981]  __platform_driver_register+0x7c/0x88
> [   22.021675]  ghes_init+0xbc/0x14c
> [   22.024979]  do_one_initcall+0xb4/0x254
> [   22.028805]  kernel_init_freeable+0x248/0x2f4
> [   22.033151]  kernel_init+0x10/0x118
> [   22.036628]  ret_from_fork+0x10/0x18
> [   22.040194] ---[ end trace 33655bb65a9835fe ]---
> [   22.046666] EDAC MC: bug in low-level driver: attempt to assign
> [   22.046666]     duplicate mc_idx 0 in add_mc_to_global_list()
> [   22.058311] ghes_edac: Can't register at EDAC core
> [   22.065402] EDAC MC: bug in low-level driver: attempt to assign
> [   22.065402]     duplicate mc_idx 0 in add_mc_to_global_list()
> [   22.077080] ghes_edac: Can't register at EDAC core
> [   22.084140] EDAC MC: bug in low-level driver: attempt to assign
> [   22.084140]     duplicate mc_idx 0 in add_mc_to_global_list()
> [   22.095789] ghes_edac: Can't register at EDAC core
> [   22.102873] EDAC MC: bug in low-level driver: attempt to assign
> [   22.102873]     duplicate mc_idx 0 in add_mc_to_global_list()
> [   22.115442] ghes_edac: Can't register at EDAC core
> [   22.122536] EDAC MC: bug in low-level driver: attempt to assign
> [   22.122536]     duplicate mc_idx 0 in add_mc_to_global_list()
> [   22.134344] ghes_edac: Can't register at EDAC core
> [   22.141441] EDAC MC: bug in low-level driver: attempt to assign
> [   22.141441]     duplicate mc_idx 0 in add_mc_to_global_list()
> [   22.153089] ghes_edac: Can't register at EDAC core
> [   22.160161] EDAC MC: bug in low-level driver: attempt to assign
> [   22.160161]     duplicate mc_idx 0 in add_mc_to_global_list()
> [   22.171810] ghes_edac: Can't register at EDAC core

What I am more concerned about is this. In total this implies 8 ghes
users that all try to register a (single-instance) ghes mc device. For
non-x86 only one instance is allowed (see ghes_edac_register(), idx =
0).

So on your platform, when parsing the HEST table
(hest_ghes_dev_register()), more than one "GHES" device is parsed,
allocated and registered. Mind sending me your HEST table
(/sys/firmware/acpi/tables/HEST), or explaining what happens here? If
this is a valid use case, we need to change ghes_edac_register() to
support more than one instance.
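
The arithmetic behind the seven "duplicate mc_idx 0" messages can be
sketched in userspace. This is a toy model under stated assumptions, not
the EDAC core: MODEL_MAX_MC, model_mc_used, model_add_mc() and
model_count_rejected() are hypothetical names. It only shows that when
every caller asks for index 0, the first one succeeds and all later ones
are rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_MC 8

/* Hypothetical model of a global controller list keyed by index. */
static bool model_mc_used[MODEL_MAX_MC];

/* Returns true on success, false if idx is out of range or already taken. */
static bool model_add_mc(size_t idx)
{
    if (idx >= MODEL_MAX_MC || model_mc_used[idx])
        return false;
    model_mc_used[idx] = true;
    return true;
}

/* Counts how many of n callers, all asking for index 0, are rejected. */
static size_t model_count_rejected(size_t n)
{
    size_t rejected = 0;

    for (size_t i = 0; i < n; i++) {
        if (!model_add_mc(0))
            rejected++;
    }
    assert(rejected <= n);
    return rejected;
}
```

Starting from an empty list, model_count_rejected(8) reports 7 rejected
callers, which matches one successful registration followed by seven
failures.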

Again, please try the patch below.

Thanks,

-Robert


From 6962f8af4a7c1051c9e87a5ac60571f70d2b6814 Mon Sep 17 00:00:00 2001
From: Robert Richter <rrichter@marvell.com>
Date: Thu, 21 Nov 2019 15:01:28 +0100
Subject: [PATCH] EDAC/ghes: Do not warn when incrementing refcount on 0

Signed-off-by: Robert Richter <rrichter@marvell.com>
---
 drivers/edac/ghes_edac.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 47f4e7f90ef0..b99080d8a10c 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -556,8 +556,8 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	ghes_pvt = pvt;
 	spin_unlock_irqrestore(&ghes_lock, flags);
 
-	/* only increment on success */
-	refcount_inc(&ghes_refcount);
+	/* only set on success */
+	refcount_set(&ghes_refcount, 1);
 
 unlock:
 	mutex_unlock(&ghes_reg_mutex);
-- 
2.20.1



* Re: linuxnext-2019119 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot)
  2019-11-21 14:23   ` Robert Richter
@ 2019-11-21 15:23     ` John Garry
  2019-11-21 21:36       ` [PATCH] EDAC/ghes: Do not warn when incrementing refcount on 0 Robert Richter
  2019-11-22 11:28       ` linuxnext-2019119 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot) Robert Richter
  0 siblings, 2 replies; 21+ messages in thread
From: John Garry @ 2019-11-21 15:23 UTC (permalink / raw)
  To: Robert Richter
  Cc: Borislav Petkov, Mauro Carvalho Chehab, James Morse, tony.luck,
	linux-edac, linux-kernel, wanghuiqiang, Xiaofei Tan, Linuxarm,
	Huangming (Mark)

On 21/11/2019 14:23, Robert Richter wrote:
> Hi John,
> 
> thanks for testing and reporting this. See inline.
> 
> On 21.11.19 12:34:22, John Garry wrote:
>> On 14/10/2019 16:18, John Garry wrote:
>> JFYI, I see an issue on linuxnext-2019119, as follows:
>>
>> [   21.645388] io scheduler kyber registered
>> [   21.734011] input: Power Button as
>> /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input0
>> [   21.743295] ACPI: Power Button [PWRB]
>> [   21.809644] [Firmware Bug]: APEI: Invalid bit width + offset in GAR
>> [0x94110034/64/0/3/0]
>> [   21.821974] EDAC MC0: Giving out device to module ghes_edac.c controller
>> ghes_edac: DEV ghes (INTERRUPT)
>> [   21.831763] ------------[ cut here ]------------
>> [   21.836374] refcount_t: increment on 0; use-after-free.
>> [   21.841620] WARNING: CPU: 36 PID: 1 at lib/refcount.c:156
>> refcount_inc_checked+0x44/0x50
>> [   21.849697] Modules linked in:
>> [   21.852745] CPU: 36 PID: 1 Comm: swapper/0 Not tainted
>> 5.4.0-rc8-next-20191119-00003-g141a9fef5092-dirty #650
>> [   21.862645] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 -
>> V1.16.01 03/15/2019
>> [   21.871157] pstate: 60c00009 (nZCv daif +PAN +UAO)
>> [   21.875936] pc : refcount_inc_checked+0x44/0x50
>> [   21.880455] lr : refcount_inc_checked+0x44/0x50
> 
> This is a warning from the refcount framework. It warns if we
> increment from zero. This is reasonable as typically a kernel object
> is created with a refcount of 1 and thrown away once the refcount is
> zero. Afterwards the object is used-after-free.
> 
> For ghes the refcount is initialized with zero, and that is why we see
> this message. However, we protect the refcount with the ghes_reg_mutex
> and thus there is no use after free. The device is allocated and
> registered if the refcount is zero. So this works fine.
> 
> Enclosed a fix that avoids the warning, please test.
> 
> But see below...
> 
>> [   21.884972] sp : ffff00236ffbf8a0
>> [   21.888274] x29: ffff00236ffbf8a0 x28: 0000000000000002
>> [   21.893576] x27: ffff00236cd07900 x26: ffff002369063010
>> [   21.898876] x25: 0000000000000000 x24: ffff00233c236824
>> [   21.904177] x23: ffffa000137b9000 x22: ffffa00016fbb7c0
>> [   21.909477] x21: ffffa00012dfd000 x20: 1fffe0046dff7f24
>> [   21.914777] x19: ffff00233c236000 x18: 0000000000000000
>> [   21.920077] x17: 0000000000000000 x16: 0000000000000000
>> [   21.925377] x15: 0000000000007700 x14: 64655f7365686720
>> [   21.930677] x13: 72656c6c6f72746e x12: 1ffff40002719618
>> [   21.935977] x11: ffff940002719618 x10: dfffa00000000000
>> [   21.941278] x9 : ffff940002719619 x8 : 0000000000000001
>> [   21.946578] x7 : 0000000000000000 x6 : 0000000000000001
>> [   21.951877] x5 : ffff940002719618 x4 : ffff00236ffb0010
>> [   21.957178] x3 : ffffa000112415e4 x2 : ffff80046dff7ede
>> [   21.962478] x1 : 5aff78756b1cf400 x0 : 0000000000000000
>> [   21.967779] Call trace:
>> [   21.970214]  refcount_inc_checked+0x44/0x50
>> [   21.974389]  ghes_edac_register+0x258/0x388
>> [   21.978562]  ghes_probe+0x28c/0x5f0
>> [   21.982041]  platform_drv_probe+0x70/0xd8
>> [   21.986039]  really_probe+0x174/0x468
>> [   21.989690]  driver_probe_device+0x7c/0x148
>> [   21.993862]  device_driver_attach+0x94/0xa0
>> [   21.998033]  __driver_attach+0xa4/0x110
>> [   22.001857]  bus_for_each_dev+0xe8/0x158
>> [   22.005768]  driver_attach+0x30/0x40
>> [   22.009331]  bus_add_driver+0x234/0x2f0
>> [   22.013156]  driver_register+0xbc/0x1d0
>> [   22.016981]  __platform_driver_register+0x7c/0x88
>> [   22.021675]  ghes_init+0xbc/0x14c
>> [   22.024979]  do_one_initcall+0xb4/0x254
>> [   22.028805]  kernel_init_freeable+0x248/0x2f4
>> [   22.033151]  kernel_init+0x10/0x118
>> [   22.036628]  ret_from_fork+0x10/0x18
>> [   22.040194] ---[ end trace 33655bb65a9835fe ]---
>> [   22.046666] EDAC MC: bug in low-level driver: attempt to assign
>> [   22.046666]     duplicate mc_idx 0 in add_mc_to_global_list()
>> [   22.058311] ghes_edac: Can't register at EDAC core
>> [   22.065402] EDAC MC: bug in low-level driver: attempt to assign
>> [   22.065402]     duplicate mc_idx 0 in add_mc_to_global_list()
>> [   22.077080] ghes_edac: Can't register at EDAC core
>> [   22.084140] EDAC MC: bug in low-level driver: attempt to assign
>> [   22.084140]     duplicate mc_idx 0 in add_mc_to_global_list()
>> [   22.095789] ghes_edac: Can't register at EDAC core
>> [   22.102873] EDAC MC: bug in low-level driver: attempt to assign
>> [   22.102873]     duplicate mc_idx 0 in add_mc_to_global_list()
>> [   22.115442] ghes_edac: Can't register at EDAC core
>> [   22.122536] EDAC MC: bug in low-level driver: attempt to assign
>> [   22.122536]     duplicate mc_idx 0 in add_mc_to_global_list()
>> [   22.134344] ghes_edac: Can't register at EDAC core
>> [   22.141441] EDAC MC: bug in low-level driver: attempt to assign
>> [   22.141441]     duplicate mc_idx 0 in add_mc_to_global_list()
>> [   22.153089] ghes_edac: Can't register at EDAC core
>> [   22.160161] EDAC MC: bug in low-level driver: attempt to assign
>> [   22.160161]     duplicate mc_idx 0 in add_mc_to_global_list()
>> [   22.171810] ghes_edac: Can't register at EDAC core
> 
> What I am more concerned is this here. In total this implies 8 ghes
> users that all try to register a (single-instance) ghes mc device. For
> non-x86 only one instance is allowed (see ghes_edac_register(), idx =
> 0).
> 

[cc some guys about HEST]

> So on your platform, when parsing the HEST table
> (hest_ghes_dev_register()), more than one "GHES" device is parsed,
> allocated and registered. Mind sending me your HEST table
> (/sys/firmware/acpi/tables/HEST), or explain what happens here? 

I think that this should be the same:
https://github.com/tianocore/edk2-platforms/tree/master/Silicon/Hisilicon/Hi1620/Drivers/Apei/Hest


> If this is a valid use case, we need to change ghes_edac_register() to
> support more than one instance.
> 
> Again, please try the patch below.
> 
> Thanks,
> 
> -Robert
> 
> 
> From 6962f8af4a7c1051c9e87a5ac60571f70d2b6814 Mon Sep 17 00:00:00 2001
> From: Robert Richter <rrichter@marvell.com>
> Date: Thu, 21 Nov 2019 15:01:28 +0100
> Subject: [PATCH] EDAC/ghes: Do not warn on when increment refcount on 0
> 
> Signed-off-by: Robert Richter <rrichter@marvell.com>
> ---
>   drivers/edac/ghes_edac.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> index 47f4e7f90ef0..b99080d8a10c 100644
> --- a/drivers/edac/ghes_edac.c
> +++ b/drivers/edac/ghes_edac.c
> @@ -556,8 +556,8 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
>   	ghes_pvt = pvt;
>   	spin_unlock_irqrestore(&ghes_lock, flags);
>   
> -	/* only increment on success */
> -	refcount_inc(&ghes_refcount);
> +	/* only set on success */
> +	refcount_set(&ghes_refcount, 1);

Yep, that seems to have silenced it all:

[   21.739895] input: Power Button as 
/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input0
[   21.749848] ACPI: Power Button [PWRB]
[   21.816977] [Firmware Bug]: APEI: Invalid bit width + offset in GAR 
[0x94110034/64/0/3/0]
[   21.827880] EDAC MC0: Giving out device to module ghes_edac.c 
controller ghes_edac: DEV ghes (INTERRUPT)
[   21.841313] GHES: APEI firmware first mode is enabled by APEI bit and 
WHEA _OSC.
[   21.849176] EINJ: Error INJection is initialized.
[   21.855135] ACPI GTDT: found 1 SBSA generic Watchdog(s).

Thanks,
John

>   
>   unlock:
>   	mutex_unlock(&ghes_reg_mutex);
> 


Cheers,
John


* [PATCH] EDAC/ghes: Do not warn when incrementing refcount on 0
  2019-11-21 15:23     ` John Garry
@ 2019-11-21 21:36       ` Robert Richter
  2019-11-22  9:01         ` Borislav Petkov
  2019-11-22 11:28       ` linuxnext-2019119 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot) Robert Richter
  1 sibling, 1 reply; 21+ messages in thread
From: Robert Richter @ 2019-11-21 21:36 UTC (permalink / raw)
  To: john.garry, Mauro Carvalho Chehab, Borislav Petkov, Tony Luck
  Cc: huangming23, james.morse, linux-edac, linux-kernel, linuxarm,
	Robert Richter, tanxiaofei, wanghuiqiang

The following warning from the refcount framework is seen during ghes
initialization:

 EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
 ------------[ cut here ]------------
 refcount_t: increment on 0; use-after-free.
 WARNING: CPU: 36 PID: 1 at lib/refcount.c:156 refcount_inc_checked+0x44/0x50
[...]
 Call trace:
  refcount_inc_checked+0x44/0x50
  ghes_edac_register+0x258/0x388
  ghes_probe+0x28c/0x5f0

It warns if the refcount is incremented from zero. This warning is
reasonable as a kernel object is typically created with a refcount of
one and freed once the refcount is zero. Afterwards the object would
be "used-after-free".

For ghes the refcount is initialized with zero, and that is why this
message is seen when initializing the first instance. However,
whenever the refcount is zero, the device will be allocated and
registered. Since the ghes_reg_mutex protects the refcount and
serializes allocation and freeing of ghes devices, a use-after-free
cannot happen here.

Instead of using refcount_inc() for the first instance, use
refcount_set(). This can be used here because the refcount is zero at
this point and cannot change due to its protection by the mutex.

Reported-by: John Garry <john.garry@huawei.com>
Tested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Robert Richter <rrichter@marvell.com>
---
 drivers/edac/ghes_edac.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 47f4e7f90ef0..b99080d8a10c 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -556,8 +556,8 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	ghes_pvt = pvt;
 	spin_unlock_irqrestore(&ghes_lock, flags);
 
-	/* only increment on success */
-	refcount_inc(&ghes_refcount);
+	/* only set on success */
+	refcount_set(&ghes_refcount, 1);
 
 unlock:
 	mutex_unlock(&ghes_reg_mutex);
-- 
2.20.1
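
For illustration only, the register/unregister lifecycle described in the
commit message can be modelled in plain userspace C. This is a minimal
sketch, not the driver code: struct shared_dev, dev_register() and
dev_unregister() are hypothetical names, the lock_held flag merely stands in
for ghes_reg_mutex, and the plain counter stands in for ghes_refcount.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct shared_dev {
	bool lock_held;      /* stands in for ghes_reg_mutex */
	unsigned int refs;   /* stands in for ghes_refcount */
	bool allocated;      /* stands in for the single mc device */
};

static void dev_lock(struct shared_dev *d)
{
	assert(!d->lock_held);
	d->lock_held = true;
}

static void dev_unlock(struct shared_dev *d)
{
	assert(d->lock_held);
	d->lock_held = false;
}

/* Returns true if this call created the device, false if it shared it. */
static bool dev_register(struct shared_dev *d)
{
	bool created = false;

	dev_lock(d);
	if (d->refs > 0) {
		d->refs++;           /* later users: plain increment */
	} else {
		d->allocated = true; /* first user: allocate and register */
		d->refs = 1;         /* set to one, no increment from zero */
		created = true;
	}
	dev_unlock(d);
	return created;
}

/* Returns true if this call dropped the last reference and freed the device. */
static bool dev_unregister(struct shared_dev *d)
{
	bool freed = false;

	dev_lock(d);
	assert(d->refs > 0);
	if (--d->refs == 0) {
		d->allocated = false;
		freed = true;
	}
	dev_unlock(d);
	return freed;
}
```

In this model the counter only moves from zero to one while the lock is
held and the device is being created, which is the argument for why setting
the count is safe at that point.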



* Re: [PATCH] EDAC/ghes: Do not warn when incrementing refcount on 0
  2019-11-21 21:36       ` [PATCH] EDAC/ghes: Do not warn when incrementing refcount on 0 Robert Richter
@ 2019-11-22  9:01         ` Borislav Petkov
  2019-11-26  9:57           ` John Garry
  0 siblings, 1 reply; 21+ messages in thread
From: Borislav Petkov @ 2019-11-22  9:01 UTC (permalink / raw)
  To: Robert Richter
  Cc: john.garry, Mauro Carvalho Chehab, Tony Luck, huangming23,
	james.morse, linux-edac, linux-kernel, linuxarm, tanxiaofei,
	wanghuiqiang

On Thu, Nov 21, 2019 at 09:36:57PM +0000, Robert Richter wrote:
> Following warning from the refcount framework is seen during ghes
> initialization:
> 
>  EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
>  ------------[ cut here ]------------
>  refcount_t: increment on 0; use-after-free.
>  WARNING: CPU: 36 PID: 1 at lib/refcount.c:156 refcount_inc_checked+0x44/0x50
> [...]
>  Call trace:
>   refcount_inc_checked+0x44/0x50
>   ghes_edac_register+0x258/0x388
>   ghes_probe+0x28c/0x5f0
> 
> It warns if the refcount is incremented from zero. This warning is
> reasonable as a kernel object is typically created with a refcount of
> one and freed once the refcount is zero. Afterwards the object would
> be "used-after-free".
> 
> For ghes the refcount is initialized with zero, and that is why this
> message is seen when initializing the first instance. However,
> whenever the refcount is zero, the device will be allocated and
> registered. Since the ghes_reg_mutex protects the refcount and
> serializes allocation and freeing of ghes devices, a use-after-free
> cannot happen here.
> 
> Instead of using refcount_inc() for the first instance, use
> refcount_set(). This can be used here because the refcount is zero at
> this point and can not change due to its protection by the mutex.
> 
> Reported-by: John Garry <john.garry@huawei.com>
> Tested-by: John Garry <john.garry@huawei.com>
> Signed-off-by: Robert Richter <rrichter@marvell.com>
> ---
>  drivers/edac/ghes_edac.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Queued, thanks.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: linuxnext-2019119 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot)
  2019-11-21 15:23     ` John Garry
  2019-11-21 21:36       ` [PATCH] EDAC/ghes: Do not warn when incrementing refcount on 0 Robert Richter
@ 2019-11-22 11:28       ` Robert Richter
  2019-11-26  9:59         ` John Garry
  1 sibling, 1 reply; 21+ messages in thread
From: Robert Richter @ 2019-11-22 11:28 UTC (permalink / raw)
  To: John Garry
  Cc: Borislav Petkov, Mauro Carvalho Chehab, James Morse, tony.luck,
	linux-edac, linux-kernel, wanghuiqiang, Xiaofei Tan, Linuxarm,
	Huangming (Mark)

On 21.11.19 15:23:42, John Garry wrote:
> On 21/11/2019 14:23, Robert Richter wrote:
> > On 21.11.19 12:34:22, John Garry wrote:

> > > [   22.046666] EDAC MC: bug in low-level driver: attempt to assign
> > > [   22.046666]     duplicate mc_idx 0 in add_mc_to_global_list()
> > > [   22.058311] ghes_edac: Can't register at EDAC core
> > > [   22.065402] EDAC MC: bug in low-level driver: attempt to assign
> > > [   22.065402]     duplicate mc_idx 0 in add_mc_to_global_list()
> > > [   22.077080] ghes_edac: Can't register at EDAC core
> > > [   22.084140] EDAC MC: bug in low-level driver: attempt to assign
> > > [   22.084140]     duplicate mc_idx 0 in add_mc_to_global_list()
> > > [   22.095789] ghes_edac: Can't register at EDAC core
> > > [   22.102873] EDAC MC: bug in low-level driver: attempt to assign
> > > [   22.102873]     duplicate mc_idx 0 in add_mc_to_global_list()
> > > [   22.115442] ghes_edac: Can't register at EDAC core
> > > [   22.122536] EDAC MC: bug in low-level driver: attempt to assign
> > > [   22.122536]     duplicate mc_idx 0 in add_mc_to_global_list()
> > > [   22.134344] ghes_edac: Can't register at EDAC core
> > > [   22.141441] EDAC MC: bug in low-level driver: attempt to assign
> > > [   22.141441]     duplicate mc_idx 0 in add_mc_to_global_list()
> > > [   22.153089] ghes_edac: Can't register at EDAC core
> > > [   22.160161] EDAC MC: bug in low-level driver: attempt to assign
> > > [   22.160161]     duplicate mc_idx 0 in add_mc_to_global_list()
> > > [   22.171810] ghes_edac: Can't register at EDAC core
> > 
> > What I am more concerned is this here. In total this implies 8 ghes
> > users that all try to register a (single-instance) ghes mc device. For
> > non-x86 only one instance is allowed (see ghes_edac_register(), idx =
> > 0).

I also looked into this: With refcount_inc_checked() enabled, the
refcount is *not* increased from 0 to 1. Under the hood only
refcount_inc_not_zero() is called instead of refcount_inc(). So the
refcount is still zero after an edac mc device was registered. Instead
of sharing the edac mc device, the driver tries to allocate another mc
device for each GHESv2 entry in the HEST table. This causes the
'duplicate mc_idx' message. Also, it is ok to have multiple GHESv2
entries (your system seems to have 8 entries), e.g. to serve different
kinds of errors in the system.
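
To make the sequence concrete, here is a rough userspace model (a sketch
under assumptions, not the driver code). All names in it are hypothetical:
model_inc_checked() imitates a checked increment that refuses to go from
zero to one, and model_probe_all() imitates n GHES entries being probed
against a single mc device with index 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; all names here are hypothetical. */
struct model_refcount {
	unsigned int refs;
};

/*
 * Models the checked increment described above: an increment from zero
 * is refused (the kernel additionally warns), so the counter stays at 0.
 */
static bool model_inc_checked(struct model_refcount *r)
{
	if (r->refs == 0)
		return false;
	r->refs++;
	return true;
}

static void model_set(struct model_refcount *r, unsigned int n)
{
	r->refs = n;
}

struct model_result {
	int registered;      /* successful mc device registrations */
	int duplicates;      /* "duplicate mc_idx 0" failures */
	unsigned int refs;   /* final reference count */
};

/* Probe n entries that all want the single mc device with index 0. */
static struct model_result model_probe_all(int n, bool use_set)
{
	struct model_refcount rc = { 0 };
	struct model_result res = { 0, 0, 0 };
	bool mc0_taken = false;

	assert(n >= 0);
	for (int i = 0; i < n; i++) {
		if (rc.refs > 0) {
			/* Device already registered: share it. */
			model_inc_checked(&rc);
			continue;
		}
		/* Count is zero: try to allocate and register mc0. */
		if (mc0_taken) {
			res.duplicates++;
			continue;
		}
		mc0_taken = true;
		res.registered++;
		if (use_set)
			model_set(&rc, 1);
		else
			model_inc_checked(&rc); /* refused, stays at 0 */
	}
	res.refs = rc.refs;
	return res;
}
```

In this model, model_probe_all(8, false) gives one registration, seven
duplicates and a final count of zero, which matches the seven 'duplicate
mc_idx' messages in the log above. model_probe_all(8, true) gives one
registration, no duplicates and a final count of eight.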

Thanks,

-Robert


* Re: [PATCH] EDAC/ghes: Do not warn when incrementing refcount on 0
  2019-11-22  9:01         ` Borislav Petkov
@ 2019-11-26  9:57           ` John Garry
  0 siblings, 0 replies; 21+ messages in thread
From: John Garry @ 2019-11-26  9:57 UTC (permalink / raw)
  To: Borislav Petkov, Robert Richter
  Cc: Mauro Carvalho Chehab, Tony Luck, huangming23, james.morse,
	linux-edac, linux-kernel, linuxarm, tanxiaofei, wanghuiqiang

On 22/11/2019 09:01, Borislav Petkov wrote:
> On Thu, Nov 21, 2019 at 09:36:57PM +0000, Robert Richter wrote:
>> Following warning from the refcount framework is seen during ghes
>> initialization:
>>
>>   EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
>>   ------------[ cut here ]------------
>>   refcount_t: increment on 0; use-after-free.
>>   WARNING: CPU: 36 PID: 1 at lib/refcount.c:156 refcount_inc_checked+0x44/0x50
>> [...]
>>   Call trace:
>>    refcount_inc_checked+0x44/0x50
>>    ghes_edac_register+0x258/0x388
>>    ghes_probe+0x28c/0x5f0
>>
>> It warns if the refcount is incremented from zero. This warning is
>> reasonable as a kernel object is typically created with a refcount of
>> one and freed once the refcount is zero. Afterwards the object would
>> be "used-after-free".
>>
>> For ghes the refcount is initialized with zero, and that is why this
>> message is seen when initializing the first instance. However,
>> whenever the refcount is zero, the device will be allocated and
>> registered. Since the ghes_reg_mutex protects the refcount and
>> serializes allocation and freeing of ghes devices, a use-after-free
>> cannot happen here.
>>
>> Instead of using refcount_inc() for the first instance, use
>> refcount_set(). This can be used here because the refcount is zero at
>> this point and can not change due to its protection by the mutex.
>>
>> Reported-by: John Garry <john.garry@huawei.com>
>> Tested-by: John Garry <john.garry@huawei.com>

According to the kernel development process documentation, this tag
should be explicitly granted, so:
Tested-by: John Garry <john.garry@huawei.com>

Thanks,
John

>> Signed-off-by: Robert Richter <rrichter@marvell.com>
>> ---
>>   drivers/edac/ghes_edac.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> Queued, thanks.
> 



* Re: linuxnext-2019119 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot)
  2019-11-22 11:28       ` linuxnext-2019119 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot) Robert Richter
@ 2019-11-26  9:59         ` John Garry
  2019-11-27 17:07           ` linuxnext-2019127 " John Garry
  0 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2019-11-26  9:59 UTC (permalink / raw)
  To: Robert Richter
  Cc: Borislav Petkov, Mauro Carvalho Chehab, James Morse, tony.luck,
	linux-edac, linux-kernel, wanghuiqiang, Xiaofei Tan, Linuxarm,
	Huangming (Mark)

On 22/11/2019 11:28, Robert Richter wrote:
> On 21.11.19 15:23:42, John Garry wrote:
>> On 21/11/2019 14:23, Robert Richter wrote:
>>> On 21.11.19 12:34:22, John Garry wrote:
> 
>>>> [   22.046666] EDAC MC: bug in low-level driver: attempt to assign
>>>> [   22.046666]     duplicate mc_idx 0 in add_mc_to_global_list()
>>>> [   22.058311] ghes_edac: Can't register at EDAC core
>>>> [   22.065402] EDAC MC: bug in low-level driver: attempt to assign
>>>> [   22.065402]     duplicate mc_idx 0 in add_mc_to_global_list()
>>>> [   22.077080] ghes_edac: Can't register at EDAC core
>>>> [   22.084140] EDAC MC: bug in low-level driver: attempt to assign
>>>> [   22.084140]     duplicate mc_idx 0 in add_mc_to_global_list()
>>>> [   22.095789] ghes_edac: Can't register at EDAC core
>>>> [   22.102873] EDAC MC: bug in low-level driver: attempt to assign
>>>> [   22.102873]     duplicate mc_idx 0 in add_mc_to_global_list()
>>>> [   22.115442] ghes_edac: Can't register at EDAC core
>>>> [   22.122536] EDAC MC: bug in low-level driver: attempt to assign
>>>> [   22.122536]     duplicate mc_idx 0 in add_mc_to_global_list()
>>>> [   22.134344] ghes_edac: Can't register at EDAC core
>>>> [   22.141441] EDAC MC: bug in low-level driver: attempt to assign
>>>> [   22.141441]     duplicate mc_idx 0 in add_mc_to_global_list()
>>>> [   22.153089] ghes_edac: Can't register at EDAC core
>>>> [   22.160161] EDAC MC: bug in low-level driver: attempt to assign
>>>> [   22.160161]     duplicate mc_idx 0 in add_mc_to_global_list()
>>>> [   22.171810] ghes_edac: Can't register at EDAC core
>>>
>>> What I am more concerned is this here. In total this implies 8 ghes
>>> users that all try to register a (single-instance) ghes mc device. For
>>> non-x86 only one instance is allowed (see ghes_edac_register(), idx =
>>> 0).
> 
> I also looked into this: With refcount_inc_checked() enabled, the
> refcount is *not* increased from 0 to 1. 

Yeah, I had quickly checked this back then and I think you're right.

Thanks,
John

> Under the hood only
> refcount_inc_not_zero() is called instead of refcount_inc(). So the
> refcount is still zero after an edac mc device was registered. Instead
> of sharing the edac mc device, the driver tries to allocate another mc
> device for each GHESv2 entry in the HEST table. This causes the
> 'duplicate mc_idx' message. Also, it is ok to have multiple GHESv2
> entries (your system seems to have 8 entries), e.g. to serve different
> kind of errors in the system.
> 
> Thanks,
> 
> -Robert
> .
> 



* Re: linuxnext-2019127 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot)
  2019-11-26  9:59         ` John Garry
@ 2019-11-27 17:07           ` John Garry
  2019-11-27 20:54             ` Robert Richter
  2019-11-28 21:12             ` linuxnext-2019127 " Robert Richter
  0 siblings, 2 replies; 21+ messages in thread
From: John Garry @ 2019-11-27 17:07 UTC (permalink / raw)
  To: Robert Richter, Borislav Petkov
  Cc: Mauro Carvalho Chehab, James Morse, tony.luck, linux-edac,
	linux-kernel, wanghuiqiang, Xiaofei Tan, Linuxarm,
	Huangming (Mark)

On 26/11/2019 09:59, John Garry wrote:
> On 22/11/2019 11:28, Robert Richter wrote:
>> On 21.11.19 15:23:42, John Garry wrote:
>>> On 21/11/2019 14:23, Robert Richter wrote:
>>>> On 21.11.19 12:34:22, John Garry wrote:
>>
>>>>> [   22.046666] EDAC MC: bug in low-level driver: attempt to assign
>>>>> [   22.046666]     duplicate mc_idx 0 in add_mc_to_global_list()
>>>>> [   22.058311] ghes_edac: Can't register at EDAC core
>>>>> [   22.065402] EDAC MC: bug in low-level driver: attempt to assign
>>>>> [   22.065402]     duplicate mc_idx 0 in add_mc_to_global_list()
>>>>> [   22.077080] ghes_edac: Can't register at EDAC core
>>>>> [   22.084140] EDAC MC: bug in low-level driver: attempt to assign
>>>>> [   22.084140]     duplicate mc_idx 0 in add_mc_to_global_list()
>>>>> [   22.095789] ghes_edac: Can't register at EDAC core
>>>>> [   22.102873] EDAC MC: bug in low-level driver: attempt to assign
>>>>> [   22.102873]     duplicate mc_idx 0 in add_mc_to_global_list()
>>>>> [   22.115442] ghes_edac: Can't register at EDAC core
>>>>> [   22.122536] EDAC MC: bug in low-level driver: attempt to assign
>>>>> [   22.122536]     duplicate mc_idx 0 in add_mc_to_global_list()
>>>>> [   22.134344] ghes_edac: Can't register at EDAC core
>>>>> [   22.141441] EDAC MC: bug in low-level driver: attempt to assign
>>>>> [   22.141441]     duplicate mc_idx 0 in add_mc_to_global_list()
>>>>> [   22.153089] ghes_edac: Can't register at EDAC core
>>>>> [   22.160161] EDAC MC: bug in low-level driver: attempt to assign
>>>>> [   22.160161]     duplicate mc_idx 0 in add_mc_to_global_list()
>>>>> [   22.171810] ghes_edac: Can't register at EDAC core
>>>>
>>>> What I am more concerned is this here. In total this implies 8 ghes
>>>> users that all try to register a (single-instance) ghes mc device. For
>>>> non-x86 only one instance is allowed (see ghes_edac_register(), idx =
>>>> 0).
>>
>> I also looked into this: With refcount_inc_checked() enabled, the
>> refcount is *not* increased from 0 to 1. 
> 
> Yeah, I had quickly checked this back then and I think you're right.
> 
> Thanks,
> John

Hi guys,

Me again ... With linux-next from 27 Nov, I now see this on the same
arm64 system:

[   21.936616] ACPI: Power Button [PWRB]
[   22.074582] [Firmware Bug]: APEI: Invalid bit width + offset in GAR 
[0x94110034/64/0/3/0]
[   22.086095] EDAC MC0: Giving out device to module ghes_edac.c 
controller ghes_edac: DEV ghes (INTERRUPT)
[   22.097276] 
==================================================================
[   22.104498] BUG: KASAN: use-after-free in 
edac_remove_sysfs_mci_device+0x148/0x180
[   22.112055] Read of size 4 at addr ffff00233bc69338 by task swapper/0/1
[   22.118656]
[   22.120139] CPU: 33 PID: 1 Comm: swapper/0 Not tainted 
5.4.0-next-20191127-dirty #667
[   22.127956] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI 
RC0 - V1.16.01 03/15/2019
[   22.136467] Call trace:
[   22.138907]  dump_backtrace+0x0/0x290
[   22.142558]  show_stack+0x14/0x20
[   22.145865]  dump_stack+0xf0/0x14c
[   22.149258]  print_address_description.isra.11+0x6c/0x3b8
[   22.154645]  __kasan_report+0x12c/0x23c
[   22.158470]  kasan_report+0xc/0x18
[   22.161860]  __asan_load4+0x94/0xb8
[   22.165338]  edac_remove_sysfs_mci_device+0x148/0x180
[   22.170378]  edac_mc_del_mc+0x154/0x1b8
[   22.174203]  ghes_edac_unregister+0xa0/0x188
[   22.178465]  ghes_remove+0x11c/0x1f8
[   22.182033]  platform_drv_remove+0x3c/0x68
[   22.186119]  really_probe+0x174/0x548
[   22.189770]  driver_probe_device+0x7c/0x148
[   22.193942]  device_driver_attach+0x94/0xa0
[   22.198113]  __driver_attach+0xa4/0x110
[   22.201938]  bus_for_each_dev+0xe8/0x158
[   22.205849]  driver_attach+0x30/0x40
[   22.209413]  bus_add_driver+0x234/0x2f0
[   22.213237]  driver_register+0xbc/0x1d0
[   22.217063]  __platform_driver_register+0x7c/0x88
[   22.221757]  ghes_init+0xbc/0x14c
[   22.225060]  do_one_initcall+0xb4/0x254
[   22.228887]  kernel_init_freeable+0x248/0x2f4
[   22.233233]  kernel_init+0x10/0x118
[   22.236710]  ret_from_fork+0x10/0x18
[   22.240273]
[   22.241753] Allocated by task 1:
[   22.244971]  save_stack+0x28/0xc8
[   22.248274]  __kasan_kmalloc.isra.9+0xbc/0xd8
[   22.252619]  kasan_kmalloc+0xc/0x18
[   22.256096]  edac_mc_alloc+0x62c/0x888
[   22.259834]  ghes_edac_register+0x1c8/0x3f0
[   22.264006]  ghes_probe+0x28c/0x5f0
[   22.267484]  platform_drv_probe+0x70/0xd8
[   22.271482]  really_probe+0x118/0x548
[   22.275133]  driver_probe_device+0x7c/0x148
[   22.279305]  device_driver_attach+0x94/0xa0
[   22.283476]  __driver_attach+0xa4/0x110
[   22.287301]  bus_for_each_dev+0xe8/0x158
[   22.291212]  driver_attach+0x30/0x40
[   22.294776]  bus_add_driver+0x234/0x2f0
[   22.298600]  driver_register+0xbc/0x1d0
[   22.302425]  __platform_driver_register+0x7c/0x88
[   22.307118]  ghes_init+0xbc/0x14c
[   22.310421]  do_one_initcall+0xb4/0x254
[   22.314246]  kernel_init_freeable+0x248/0x2f4
[   22.318591]  kernel_init+0x10/0x118
[   22.322068]  ret_from_fork+0x10/0x18
[   22.325630]
[   22.327109] Freed by task 1:
[   22.329978]  save_stack+0x28/0xc8
[   22.333282]  __kasan_slab_free+0x118/0x180
[   22.337366]  kasan_slab_free+0x10/0x18
[   22.341106]  kfree+0x110/0x2b0
[   22.344150]  dimm_attr_release+0xc/0x18
[   22.347978]  device_release+0x7c/0xe0
[   22.351629]  kobject_put+0xb0/0x180
[   22.355106]  device_unregister+0x20/0x30
[   22.359018]  edac_remove_sysfs_mci_device+0x140/0x180
[   22.364057]  edac_mc_del_mc+0x154/0x1b8
[   22.367882]  ghes_edac_unregister+0xa0/0x188
[   22.372140]  ghes_remove+0x11c/0x1f8
[   22.375705]  platform_drv_remove+0x3c/0x68
[   22.379789]  really_probe+0x174/0x548
[   22.383440]  driver_probe_device+0x7c/0x148
[   22.387612]  device_driver_attach+0x94/0xa0
[   22.391783]  __driver_attach+0xa4/0x110
[   22.395608]  bus_for_each_dev+0xe8/0x158
[   22.399519]  driver_attach+0x30/0x40
[   22.403083]  bus_add_driver+0x234/0x2f0
[   22.406907]  driver_register+0xbc/0x1d0
[   22.410732]  __platform_driver_register+0x7c/0x88
[   22.415424]  ghes_init+0xbc/0x14c
[   22.418728]  do_one_initcall+0xb4/0x254
[   22.422553]  kernel_init_freeable+0x248/0x2f4
[   22.426898]  kernel_init+0x10/0x118
[   22.430375]  ret_from_fork+0x10/0x18
[   22.433937]
[   22.435417] The buggy address belongs to the object at ffff00233bc69000
[   22.435417]  which belongs to the cache kmalloc-1k of size 1024
[   22.447922] The buggy address is located 824 bytes inside of
[   22.447922]  1024-byte region [ffff00233bc69000, ffff00233bc69400)
[   22.459731] The buggy address belongs to the page:
[   22.464512] page:fffffe008ccf1a00 refcount:1 mapcount:0 
mapping:ffff00237080f600 index:0x0 compound_mapcount: 0
[   22.474590] raw: 2ffff00000010200 dead000000000100 dead00000000012ge 
dumped because: kasan: bad access detected
[   22.495608]
[   22.497087] Memory state around the buggy address:
[   22.501867]  ffff00233bc69200: fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb fb fb
[   22.509076]  ffff00233bc69280: fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb fb fb
[   22.516286] >ffff00233bc69300: fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb fb fb
[   22.523494]                                         ^
[   22.528534]  ffff00233bc69380: fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb fb fb
[   22.535744]  ffff00233bc69400: fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc fc fc
[   22.542952] 
==================================================================
[   22.550161] Disabling lock debugging due to kernel taint
[   22.555511] EDAC MC: Removed device 0 for ghes_edac.c ghes_edac: DEV ghes
[   22.564893] EDAC MC0: Giving out device to module ghes_edac.c 
controller ghes_edac: DEV ghes (INTERRUPT)
[   22.578292] GHES: APEI firmware first mode is enabled by APEI bit and 
WHEA _OSC.
[   22.586264] EINJ: Error INJection is initialized.


root@(none)$  cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff00236c273600 (size 256):
   comm "swapper/0", pid 1, jiffies 4294897813 (age 177.596s)
   hex dump (first 32 bytes):
     00 00 c5 3b 23 00 ff ff 00 08 c5 3b 23 00 ff ff  ...;#......;#...
     00 10 c5 3b 23 00 ff ff 00 18 c5 3b 23 00 ff ff  ...;#......;#...
   backtrace:
     [<000000007144931a>] __kmalloc+0x1e0/0x2c0
     [<00000000ffb454a9>] edac_mc_alloc+0x31c/0x888
     [<00000000f71ac8ce>] ghes_edac_register+0x1c8/0x3f0
     [<00000000c9708978>] ghes_probe+0x28c/0x5f0
     [<0000000082688646>] platform_drv_probe+0x70/0xd8
     [<0000000040ba35c7>] really_probe+0x118/0x548
     [<00000000603befc1>] driver_probe_device+0x7c/0x148
     [<000000002b50a9eb>] device_driver_attach+0x94/0xa0
     [<000000000d74ae48>] __driver_attach+0xa4/0x110
     [<0000000080f51922>] bus_for_each_dev+0xe8/0x158
     [<00000000300e9429>] driver_attach+0x30/0x40
     [<00000000721f69ab>] bus_add_driver+0x234/0x2f0
     [<00000000bc8fe749>] driver_register+0xbc/0x1d0
     [<000000001cc8671e>] __platform_driver_register+0x7c/0x88
     [<00000000324890ef>] ghes_init+0xbc/0x14c
     [<00000000bbe18b33>] do_one_initcall+0xb4/0x254
unreferenced object 0xffff00233bc50000 (size 1024):
   comm "swapper/0", pid 1, jiffies 4294897813 (age 177.596s)
   hex dump (first 32 bytes):
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   backtrace:
     [<00000000f51a8341>] kmem_cache_alloc+0x188/0x260
     [<000000006c66db0a>] edac_mc_alloc+0x38c/0x888
     [<00000000f71ac8ce>] ghes_edac_register+0x1c8/0x3f0
     [<00000000c9708978>] ghes_probe+0x28c/0x5f0
     [<0000000082688646>] platform_drv_probe+0x70/0xd8
     [<0000000040ba35c7>] really_probe+0x118/0x548
     [<00000000603befc1>] driver_probe_device+0x7c/0x148
     [<000000002b50a9eb>] device_driver_attach+0x94/0xa0
     [<000000000d74ae48>] __driver_attach+0xa4/0x110
     [<0000000080f51922>] bus_for_each_dev+0xe8/0x158
     [<00000000300e9429>] driver_attach+0x30/0x40
     [<00000000721f69ab>] bus_add_driver+0x234/0x2f0
     [<00000000bc8fe749>] driver_register+0xbc/0x1d0
     [<000000001cc8671e>] __platform_driver_register+0x7c/0x88
     [<00000000324890ef>] ghes_init+0xbc/0x14c
     [<00000000bbe18b33>] do_one_initcall+0xb4/0x254
unreferenced object 0xffff00236daa2b00 (size 128):
   comm "swapper/0", pid 1, jiffies 4294897813 (age 177.596s)
   hex dump (first 32 bytes):
     00 2a aa 6d 23 00 ff ff 00 00 00 00 00 00 00 00  .*.m#...........
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   backtrace:
     [<000000007144931a>] __kmalloc+0x1e0/0x2c0
     [<000000003b8ce7e7>] edac_mc_alloc+0x400/0x888
     [<00000000f71ac8ce>] ghes_edac_register+0x1c8/0x3f0
     [<00000000c9708978>] ghes_probe+0x28c/0x5f0
     [<0000000082688646>] platform_drv_probe+0x70/0xd8
     [<0000000040ba35c7>] really_probe+0x118/0x548
     [<00000000603befc1>] driver_probe_device+0x7c/0x148
     [<000000002b50a9eb>] device_driver_attach+0x94/0xa0
     [<000000000d74ae48>] __driver_attach+0xa4/0x110
     [<0000000080f51922>] bus_for_each_dev+0xe8/0x158
     [<00000000300e9429>] driver_attach+0x30/0x40
     [<00000000721f69ab>] bus_add_driver+0x234/0x2f0
     [<00000000bc8fe749>] driver_register+0xbc/0x1d0
     [<000000001cc8671e>] __platform_driver_register+0x7c/0x88
     [<00000000324890ef>] ghes_init+0xbc/0x14c
     [<00000000bbe18b33>] do_one_initcall+0xb4/0x254
unreferenced object 0xffff00236daa2a00 (size 128):

[snip]

I have these test options enabled:
+CONFIG_DEBUG_TEST_DRIVER_REMOVE=y
+CONFIG_KASAN=y
+CONFIG_DEBUG_KMEMLEAK=y

Cheers,
John


* Re: linuxnext-2019127 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot)
  2019-11-27 17:07           ` linuxnext-2019127 " John Garry
@ 2019-11-27 20:54             ` Robert Richter
  2019-11-28 11:02               ` linuxnext-20191127 " John Garry
  2019-11-28 21:12             ` linuxnext-2019127 " Robert Richter
  1 sibling, 1 reply; 21+ messages in thread
From: Robert Richter @ 2019-11-27 20:54 UTC (permalink / raw)
  To: John Garry
  Cc: Borislav Petkov, Mauro Carvalho Chehab, James Morse, tony.luck,
	linux-edac, linux-kernel, wanghuiqiang, Xiaofei Tan, Linuxarm,
	Huangming (Mark)

Hi John,

thank you for testing.

On 27.11.19 17:07:33, John Garry wrote:

> [snip]
> 
> I have these test options enabled:
> +CONFIG_DEBUG_TEST_DRIVER_REMOVE=y
> +CONFIG_KASAN=y
> +CONFIG_DEBUG_KMEMLEAK=y

Is this a regression (did it work before?), or a new test that you
are running for the first time?

Thanks,

-Robert


* Re: linuxnext-20191127 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot)
  2019-11-27 20:54             ` Robert Richter
@ 2019-11-28 11:02               ` John Garry
  2019-11-28 16:44                 ` Borislav Petkov
  0 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2019-11-28 11:02 UTC (permalink / raw)
  To: Robert Richter
  Cc: Borislav Petkov, Mauro Carvalho Chehab, James Morse, tony.luck,
	linux-edac, linux-kernel, wanghuiqiang, Xiaofei Tan, Linuxarm,
	Huangming (Mark)


Hi Robert,

> thank you for testing.

I'm just stumbling across these, TBH.

> 
> On 27.11.19 17:07:33, John Garry wrote:
> 
>> [snip]
>>
>> I have these test options enabled:
>> +CONFIG_DEBUG_TEST_DRIVER_REMOVE=y
>> +CONFIG_KASAN=y
>> +CONFIG_DEBUG_KMEMLEAK=y
> 
> Is this a regression (did it work before?), or a new test that you
> are running for the first time?

linuxnext-20191119 does not appear to have the issue - that is with 
your refcount fix cherry-picked - but it has lots of memory leaks:

root@(none)$
root@(none)$ echo scan > /sys/kernel/debug/kmemleak
root@(none)$ [  121.639978] kmemleak: 128 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

root@(none)$ cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff00236c24ba00 (size 256):
   comm "swapper/0", pid 1, jiffies 4294897826 (age 107.824s)
   hex dump (first 32 bytes):
     00 40 2d 3c 23 00 ff ff 00 48 2d 3c 23 00 ff ff  .@-<#....H-<#...
     00 50 2d 3c 23 00 ff ff 00 58 2d 3c 23 00 ff ff  .P-<#....X-<#...
   backtrace:
     [<0000000009aed8e3>] __kmalloc+0x1e0/0x2c0
     [<00000000bf599427>] edac_mc_alloc+0x31c/0x888
     [<00000000c070e314>] ghes_edac_register+0x15c/0x390
     [<00000000e4aad1c2>] ghes_probe+0x28c/0x5f0
     [<0000000079c357cb>] platform_drv_probe+0x70/0xd8
     [<00000000d4ab9188>] really_probe+0x118/0x548
     [<00000000763d50f1>] driver_probe_device+0x7c/0x148
     [<0000000058e623c3>] device_driver_attach+0x94/0xa0
     [<00000000d7cb679d>] __driver_attach+0xa4/0x110
     [<000000007d0942a0>] bus_for_each_dev+0xe8/0x158
     [<000000004cf734d1>] driver_attach+0x30/0x40
     [<000000009aa3536e>] bus_add_driver+0x234/0x2f0
     [<00000000d163cfe0>] driver_register+0xbc/0x1d0
     [<000000007e4f0ac1>] __platform_driver_register+0x7c/0x88
     [<00000000a63c8dd0>] ghes_init+0xbc/0x14c
     [<00000000356c8a7f>] do_one_initcall+0xb4/0x254
unreferenced object 0xffff00233c2d4000 (size 1024):
   comm "swapper/0", pid 1, jiffies 4294897826 (age 107.824s)
   hex dump (first 32 bytes):
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   backtrace:
     [<000000004945469f>] kmem_cache_alloc+0x188/0x260
     [<0000000032ea779d>] edac_mc_alloc+0x38c/0x888

Unfortunately v5.4 has similar memory leaks.

Thanks,
John


* Re: linuxnext-20191127 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot)
  2019-11-28 11:02               ` linuxnext-20191127 " John Garry
@ 2019-11-28 16:44                 ` Borislav Petkov
  0 siblings, 0 replies; 21+ messages in thread
From: Borislav Petkov @ 2019-11-28 16:44 UTC (permalink / raw)
  To: John Garry
  Cc: Robert Richter, Mauro Carvalho Chehab, James Morse, tony.luck,
	linux-edac, linux-kernel, wanghuiqiang, Xiaofei Tan, Linuxarm,
	Huangming (Mark)

On Thu, Nov 28, 2019 at 11:02:32AM +0000, John Garry wrote:
> linuxnext-20191119 does not appear to have the issue - that is with
> your refcount fix cherry-picked - but it has lots of memory leaks:

Can you forget linux-next for a while and test the latest Linus master
branch?

Also, pls send your .config and full dmesg. Privately's fine too.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: linuxnext-2019127 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot)
  2019-11-27 17:07           ` linuxnext-2019127 " John Garry
  2019-11-27 20:54             ` Robert Richter
@ 2019-11-28 21:12             ` Robert Richter
  2019-12-02 10:23               ` John Garry
  1 sibling, 1 reply; 21+ messages in thread
From: Robert Richter @ 2019-11-28 21:12 UTC (permalink / raw)
  To: John Garry
  Cc: Borislav Petkov, Mauro Carvalho Chehab, James Morse, tony.luck,
	linux-edac, linux-kernel, wanghuiqiang, Xiaofei Tan, Linuxarm,
	Huangming (Mark)

On 27.11.19 17:07:33, John Garry wrote:
> [   22.104498] BUG: KASAN: use-after-free in
> edac_remove_sysfs_mci_device+0x148/0x180

It is triggered in edac_remove_sysfs_mci_device().

device_unregister(&dimm->dev) not only removes the sysfs entry, it
also frees the dimm struct in dimm_attr_release(). When incrementing
the loop in mci_for_each_dimm(), the dimm struct is accessed again
which causes the use-after-free. But the dimm struct shouldn't be
released here already.

edac_remove_sysfs_mci_device() should not release the devices at this
point. We need clean release functions for mci and dimm_info and
refcounts to protect pdev/dev mappings. We must also check how
mci_for_each_dimm() handles device removals and whether it is safe.
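
The pattern described above can be illustrated with a small userspace
sketch. This is not the EDAC code and not a proposed fix: struct dimm,
dimm_release() and remove_all_dimms() are hypothetical stand-ins, and
the loop shown in the comment only assumes what is stated above, namely
that the iterator touches the current element again when it advances.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the dimm struct; not the kernel definition. */
struct dimm {
	int idx;
};

/* Stand-in for a release callback that frees the object it is given. */
static void dimm_release(struct dimm *d)
{
	free(d);
}

/*
 * Unsafe shape, shown only as a comment because running it would be
 * undefined behaviour: the loop step reads d->idx after d was freed.
 *
 *	for (d = dimms[0]; d; d = dimms[d->idx + 1])
 *		dimm_release(d);
 */

/*
 * Safe variant: the loop advances using an index that does not live in
 * the element being released, and the slot is cleared before the free.
 * Returns the number of elements released.
 */
static int remove_all_dimms(struct dimm **dimms, int n)
{
	int released = 0;

	assert(dimms != NULL);
	for (int i = 0; i < n; i++) {
		struct dimm *d = dimms[i];

		if (!d)
			continue;
		dimms[i] = NULL;	/* nothing can reach d through the table now */
		dimm_release(d);
		released++;
	}
	return released;
}
```

Calling remove_all_dimms() a second time on the same table releases
nothing, because every slot was cleared on the first pass.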

Let's see how this can be fixed.

Thanks for reporting the issue.

-Robert


* Re: linuxnext-2019127 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot)
  2019-11-28 21:12             ` linuxnext-2019127 " Robert Richter
@ 2019-12-02 10:23               ` John Garry
  2019-12-02 11:46                 ` Robert Richter
  0 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2019-12-02 10:23 UTC (permalink / raw)
  To: Robert Richter
  Cc: Borislav Petkov, Mauro Carvalho Chehab, James Morse, tony.luck,
	linux-edac, linux-kernel, wanghuiqiang, Xiaofei Tan, Linuxarm,
	Huangming (Mark)

On 28/11/2019 21:12, Robert Richter wrote:
> On 27.11.19 17:07:33, John Garry wrote:
>> [   22.104498] BUG: KASAN: use-after-free in
>> edac_remove_sysfs_mci_device+0x148/0x180
> 
> It is triggered in edac_remove_sysfs_mci_device().
> 
> device_unregister(&dimm->dev) not only removes the sysfs entry, it
> also frees the dimm struct in dimm_attr_release(). When incrementing
> the loop in mci_for_each_dimm(), the dimm struct is accessed again
> which causes the use-after-free. But the dimm struct shouldn't be
> released here already.
> 
> edac_remove_sysfs_mci_device() should not release the devices at this
> point. We need clean release functions for mci and dimm_info and
> refcounts to protect pdev/dev mappings. We must also check how
> mci_for_each_dimm() handles device removals and whether it is safe.
> 
> Let's see how this can be fixed.
> 
> Thanks for reporting the issue.

Fine. Would any fix also deal with the v5.4 memory leak that I 
mentioned?

Cheers,
John


* Re: linuxnext-2019127 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot)
  2019-12-02 10:23               ` John Garry
@ 2019-12-02 11:46                 ` Robert Richter
  0 siblings, 0 replies; 21+ messages in thread
From: Robert Richter @ 2019-12-02 11:46 UTC (permalink / raw)
  To: John Garry
  Cc: Borislav Petkov, Mauro Carvalho Chehab, James Morse, tony.luck,
	linux-edac, linux-kernel, wanghuiqiang, Xiaofei Tan, Linuxarm,
	Huangming (Mark)

On 02.12.19 10:23:29, John Garry wrote:
> On 28/11/2019 21:12, Robert Richter wrote:
> > On 27.11.19 17:07:33, John Garry wrote:
> > > [   22.104498] BUG: KASAN: use-after-free in
> > > edac_remove_sysfs_mci_device+0x148/0x180
> > 
> > It is triggered in edac_remove_sysfs_mci_device().
> > 
> > device_unregister(&dimm->dev) not only removes the sysfs entry, it
> > also frees the dimm struct in dimm_attr_release(). When incrementing
> > the loop in mci_for_each_dimm(), the dimm struct is accessed again
> > which causes the use-after-free. But the dimm struct shouldn't be
> > released here already.
> > 
> > edac_remove_sysfs_mci_device() should not release the devices at this
> > point. We need clean release functions for mci and dimm_info and
> > refcounts to protect pdev/dev mappings. We must also check how
> > mci_for_each_dimm() handles device removals and whether it is safe.
> > 
> > Let's see how this can be fixed.
> > 
> > Thanks for reporting the issue.
> 
> Fine. Would any fix also deal with the v5.4 memory leak that I
> mentioned?

Yes, I have identified the leaks:

# grep edac /sys/kernel/debug/kmemleak | sort | uniq -c
      1     [<000000003c0f58f9>] edac_mc_alloc+0x3bc/0x9d0	# mci->csrows
     16     [<00000000bb932dc0>] edac_mc_alloc+0x49c/0x9d0	# csr->channels
     16     [<00000000e2734dba>] edac_mc_alloc+0x518/0x9d0	# csr->channels[chn]
      1     [<00000000eb040168>] edac_mc_alloc+0x5c8/0x9d0	# mci->dimms
     34     [<00000000ef737c29>] ghes_edac_register+0x1c8/0x3f8	# see edac_mc_alloc()
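
The annotations above name nested allocations (the csrows array, each
row's channels array, and each channel). A rough userspace sketch of how
such a layout can be released in reverse order is below. It is only an
illustration under assumptions: struct mc, struct csrow, struct chan,
xcalloc(), xfree() and live_allocs are hypothetical names, the dimms
array is left out for brevity, and this is not the actual kernel fix.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the structures named above. */
struct chan { int idx; };
struct csrow { struct chan **channels; int nr_channels; };
struct mc { struct csrow **csrows; int nr_csrows; };

/* Counts outstanding allocations so a leak shows up as a non-zero value. */
static long live_allocs;

static void *xcalloc(size_t n, size_t size)
{
	void *p = calloc(n, size);

	if (p)
		live_allocs++;
	return p;
}

static void xfree(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

/* Release every level, innermost first; tolerates partially built objects. */
static void mc_free(struct mc *mci)
{
	if (!mci)
		return;
	for (int r = 0; mci->csrows && r < mci->nr_csrows; r++) {
		struct csrow *csr = mci->csrows[r];

		if (!csr)
			continue;
		for (int c = 0; csr->channels && c < csr->nr_channels; c++)
			xfree(csr->channels[c]);	/* csr->channels[chn] */
		xfree(csr->channels);			/* csr->channels */
		xfree(csr);
	}
	xfree(mci->csrows);				/* mci->csrows */
	xfree(mci);
}

/* Build the nested layout; on any failure, undo what was built so far. */
static struct mc *mc_alloc(int nr_csrows, int nr_channels)
{
	struct mc *mci;

	assert(nr_csrows > 0 && nr_channels > 0);
	mci = xcalloc(1, sizeof(*mci));
	if (!mci)
		return NULL;
	mci->nr_csrows = nr_csrows;
	mci->csrows = xcalloc(nr_csrows, sizeof(*mci->csrows));
	if (!mci->csrows)
		goto err;
	for (int r = 0; r < nr_csrows; r++) {
		struct csrow *csr = xcalloc(1, sizeof(*csr));

		if (!csr)
			goto err;
		mci->csrows[r] = csr;
		csr->nr_channels = nr_channels;
		csr->channels = xcalloc(nr_channels, sizeof(*csr->channels));
		if (!csr->channels)
			goto err;
		for (int c = 0; c < nr_channels; c++) {
			csr->channels[c] = xcalloc(1, sizeof(struct chan));
			if (!csr->channels[c])
				goto err;
		}
	}
	return mci;
err:
	mc_free(mci);
	return NULL;
}
```

With two rows of two channels each, mc_alloc() makes ten allocations in
this sketch, and live_allocs returns to zero after mc_free(). Skipping
any level in mc_free() would leave the counter above zero, which is the
kind of leftover that kmemleak reports.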

Thanks,

-Robert


end of thread, other threads:[~2019-12-02 11:47 UTC | newest]

Thread overview: 21+ messages
2019-10-14 15:18 edac KASAN warning in experimental arm64 allmodconfig boot John Garry
2019-10-14 16:09 ` Borislav Petkov
2019-10-14 16:44   ` John Garry
2019-10-14 16:15 ` James Morse
2019-10-14 16:56   ` John Garry
2019-10-14 16:57     ` Borislav Petkov
2019-11-21 12:34 ` linuxnext-2019119 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot) John Garry
2019-11-21 14:23   ` Robert Richter
2019-11-21 15:23     ` John Garry
2019-11-21 21:36       ` [PATCH] EDAC/ghes: Do not warn when incrementing refcount on 0 Robert Richter
2019-11-22  9:01         ` Borislav Petkov
2019-11-26  9:57           ` John Garry
2019-11-22 11:28       ` linuxnext-2019119 edac warns (was Re: edac KASAN warning in experimental arm64 allmodconfig boot) Robert Richter
2019-11-26  9:59         ` John Garry
2019-11-27 17:07           ` linuxnext-2019127 " John Garry
2019-11-27 20:54             ` Robert Richter
2019-11-28 11:02               ` linuxnext-20191127 " John Garry
2019-11-28 16:44                 ` Borislav Petkov
2019-11-28 21:12             ` linuxnext-2019127 " Robert Richter
2019-12-02 10:23               ` John Garry
2019-12-02 11:46                 ` Robert Richter
