linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* BUG: KASAN: use-after-free in enqueue_timer+0x4f/0x1e0
@ 2021-10-05 18:10 Kim Phillips
  2021-10-11 16:07 ` Kim Phillips
  0 siblings, 1 reply; 3+ messages in thread
From: Kim Phillips @ 2021-10-05 18:10 UTC (permalink / raw)
  To: Ainux, Thomas Zimmermann
  Cc: airlied, airlied, daniel, sterlingteng, chenhuacai, dri-devel,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 14839 bytes --]

Hi, I occasionally see the below trace with Linus' master on an
AMD Milan system:

[   25.657322] BUG: kernel NULL pointer dereference, address: 0000000000000000

[   25.665097] #PF: supervisor instruction fetch in kernel mode

[   25.671448] #PF: error_code(0x0010) - not-present page

[   25.677178] PGD 0 P4D 0

[   25.680000] Oops: 0010 [#1] PREEMPT SMP NOPTI

[   25.684857] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.15.0-rc3+ #71

[   25.692043] Hardware name: AMD Corporation ETHANOL_X/ETHANOL_X, BIOS RXM1006B 08/20/2021

[   25.701069] RIP: 0010:0x0

[   25.703993] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.

[   25.711661] RSP: 0018:ffffc90000003eb8 EFLAGS: 00010002

[   25.717488] RAX: 0000000000000101 RBX: 0000000000000101 RCX: 0000000000000000

[   25.725449] RDX: 00000000fffef080 RSI: 0000000000000000 RDI: ffff88817751e350

[   25.733408] RBP: ffffc90000003ed8 R08: 0000000000000000 R09: 00000000000002e6

[   25.741368] R10: ffff88810dc8b2d0 R11: ffff88bf4de290b0 R12: ffff88817751e350

[   25.749327] R13: 0000000000000000 R14: ffff88bf4de18dc0 R15: ffff88817751e350

[   25.757286] FS:  0000000000000000(0000) GS:ffff88bf4de00000(0000) knlGS:0000000000000000

[   25.766312] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033

[   25.772720] CR2: ffffffffffffffd6 CR3: 0000000166650002 CR4: 0000000000770ef0

[   25.780754] PKRU: 55555554

[   25.783768] Call Trace:

[   25.786492]  <IRQ>

[   25.788729]  call_timer_fn+0x2e/0x130

[   25.792814]  run_timer_softirq+0x361/0x4a0

[   25.797381]  ? lapic_next_event+0x21/0x30

[   25.801852]  ? clockevents_program_event+0x8f/0xe0

[   25.807195]  __do_softirq+0xfb/0x2db

[   25.811183]  irq_exit_rcu+0x98/0xd0

[   25.815073]  sysvec_apic_timer_interrupt+0xac/0xd0

[   25.820415]  </IRQ>

[   25.822751]  asm_sysvec_apic_timer_interrupt+0x12/0x20

[   25.828481] RIP: 0010:native_safe_halt+0xb/0x10

[   25.833533] Code: 08 74 81 eb c0 cc cc cc cc cc cc cc cc cc cc eb 07 0f 00 2d 69 68 60 00 f4 c3 0f 1f 44 00 00 eb 07 0f 00 2d 59 68 60 00 fb f4 <c3> cc cc cc cc 0f 1f 44 00 00 55 48 89 e5 53 e8 b1 17 ff ff 66 90

[   25.854485] RSP: 0018:ffffffff82803e10 EFLAGS: 00000206

[   25.860311] RAX: ffffffff81c06c60 RBX: 0000000000000000 RCX: 0000000000000000

[   25.868270] RDX: 0000000000000000 RSI: ffffffff825afb2f RDI: ffffffff825bda91

[   25.876229] RBP: ffffffff82803e18 R08: 00000066a1710f94 R09: 0000000000000001

[   25.884188] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff82813940

[   25.892146] R13: ffffffff82813940 R14: 0000000000000000 R15: 0000000000000000

[   25.900098]  ? __sched_text_end+0x6/0x6

[   25.904462]  ? default_idle+0xe/0x20

[   25.908446]  arch_cpu_idle+0x15/0x20

[   25.912432]  default_idle_call+0x33/0xf0

[   25.916804]  do_idle+0x209/0x270

[   25.920402]  cpu_startup_entry+0x20/0x30

[   25.924774]  rest_init+0xd4/0xe0

[   25.928370]  arch_call_rest_init+0xe/0x1b

[   25.932840]  start_kernel+0x6bc/0x6e2

[   25.936922]  x86_64_start_reservations+0x24/0x26

[   25.942069]  x86_64_start_kernel+0x75/0x79

[   25.946635]  secondary_startup_64_no_verify+0xb0/0xbb

[   25.952270] Modules linked in: intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm ipmi_si(+) ipmi_devintf wmi_bmof rapl ipmi_msghandler input_leds ccp k10temp mac_hid sch_fq_codel msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm crct10dif_pclmul drm_kms_helper crc32_pclmul ghash_clmulni_intel syscopyarea sysfillrect hid_generic aesni_intel sysimgblt crypto_simd fb_sys_fops cryptd cec usbhid nvme ahci drm e1000e nvme_core libahci hid i2c_piix4 wmi

[   26.017199] CR2: 0000000000000000

[   26.020893] ---[ end trace 4ffc93f5c298732e ]---

[   26.026040] RIP: 0010:0x0

[   26.028958] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.

[   26.036626] RSP: 0018:ffffc90000003eb8 EFLAGS: 00010002

[   26.042557] RAX: 0000000000000101 RBX: 0000000000000101 RCX: 0000000000000000

[   26.050506] RDX: 00000000fffef080 RSI: 0000000000000000 RDI: ffff88817751e350

[   26.058464] RBP: ffffc90000003ed8 R08: 0000000000000000 R09: 00000000000002e6

[   26.066423] R10: ffff88810dc8b2d0 R11: ffff88bf4de290b0 R12: ffff88817751e350

[   26.074380] R13: 0000000000000000 R14: ffff88bf4de18dc0 R15: ffff88817751e350

[   26.082330] FS:  0000000000000000(0000) GS:ffff88bf4de00000(0000) knlGS:0000000000000000

[   26.091356] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033

[   26.097765] CR2: ffffffffffffffd6 CR3: 0000000166650002 CR4: 0000000000770ef0

[   26.105722] PKRU: 55555554

[   26.108736] Kernel panic - not syncing: Fatal exception in interrupt

[   26.116349] Kernel Offset: disabled

[   26.120237] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---


Adding 'nokaslr' and CONFIG_DEBUG_OBJECTS_TIMERS=y didn't make
it more reproducible, but CONFIG_KASAN=y changed the signature
to this use-after-free that I used to drive bisect:

[   40.503396] ==================================================================

[   40.514471] input: HID 0b1f:03e9 as /devices/pci0000:20/0000:20:08.1/0000:22:00.3/usb1/1-2/1-2.2/1-2.2:1.1/0003:0B1F:03E9.0004/input/input6

[   40.521823] BUG: KASAN: use-after-free in enqueue_timer+0x4f/0x1e0

[   40.536634] hid-generic 0003:0B1F:03E9.0004: input,hidraw1: USB HID v1.00 Mouse [HID 0b1f:03e9] on usb-0000:22:00.3-2.2/input1

[   40.542696] Write of size 8 at addr ffff8881feb80358 by task kworker/0:2/786



[   40.542705] CPU: 0 PID: 786 Comm: kworker/0:2 Not tainted 5.15.0-rc4+ #105

[   40.572608] Hardware name: AMD Corporation ETHANOL_X/ETHANOL_X, BIOS RXM1006B 08/20/2021

[   40.581639] Workqueue: events work_for_cpu_fn

[   40.586507] Call Trace:

[   40.589234]  dump_stack_lvl+0x4a/0x5f

[   40.593324]  print_address_description.constprop.0+0x28/0x150

[   40.599750]  ? enqueue_timer+0x4f/0x1e0

[   40.604032]  kasan_report.cold+0x82/0xdb

[   40.608435]  ? ast_pci_probe+0xb1/0x100 [ast]

[   40.613305]  ? enqueue_timer+0x4f/0x1e0

[   40.617582]  __asan_store8+0x6c/0x90

[   40.621570]  enqueue_timer+0x4f/0x1e0

[   40.625654]  internal_add_timer+0x9b/0xd0

[   40.630127]  ? enqueue_timer+0x1e0/0x1e0

[   40.634501]  ? debug_smp_processor_id+0x17/0x20

[   40.639557]  ? get_nohz_timer_target+0x6f/0x290

[   40.644610]  ? lock_timer_base+0xb6/0xe0

[   40.648986]  add_timer+0x29a/0x3a0

[   40.652781]  ? run_timer_softirq+0xa40/0xa40

[   40.657547]  ? preempt_count_sub+0x18/0xc0

[   40.662120]  ? _raw_spin_unlock_irqrestore+0x22/0x40

[   40.667670]  ? drm_connector_list_iter_next+0x1bb/0x230 [drm]

[   40.674163]  __queue_delayed_work+0xdc/0x140

[   40.678932]  queue_delayed_work_on+0x6c/0x90

[   40.683698]  drm_kms_helper_poll_enable.part.0+0xf5/0x140 [drm_kms_helper]

[   40.691411]  ? drm_kms_helper_poll_fini+0x40/0x40 [drm_kms_helper]

[   40.698344]  ? __queue_work+0x640/0x640

[   40.702623]  ? debug_object_init+0x16/0x20

[   40.707195]  drm_kms_helper_poll_init+0xb8/0xc0 [drm_kms_helper]

[   40.713968]  ast_mode_config_init+0x5f8/0x800 [ast]

[   40.719419]  ? ast_crtc_reset+0xa0/0xa0 [ast]

[   40.724287]  ast_device_create.cold+0xab6/0xbdd [ast]

[   40.729931]  ast_pci_probe+0xba/0x100 [ast]

[   40.734602]  ? __pm_runtime_resume+0x65/0xb0

[   40.739366]  ? ast_pm_suspend+0x70/0x70 [ast]

[   40.739887] ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)

[   40.744231]  local_pci_probe+0x80/0xd0

[   40.744237]  ? pci_device_shutdown+0x90/0x90

[   40.753180] ata7.00: supports DRM functions and may not be fully accessible

[   40.755316]  work_for_cpu_fn+0x2e/0x50

[   40.760086] ata7.00: ATA-11: Samsung SSD 860 PRO 256GB, RVM01B6Q, max UDMA/133

[   40.767845]  process_one_work+0x4b9/0x7f0

[   40.767852]  worker_thread+0x47a/0x6b0

[   40.772971] ata7.00: disabling queued TRIM support

[   40.780081]  kthread+0x209/0x240

[   40.780085]  ? process_one_work+0x7f0/0x7f0

[   40.780089]  ? set_kthread_struct+0x80/0x80

[   40.784562] ata7.00: 500118192 sectors, multi 1: LBA48 NCQ (depth 32), AA

[   40.788736]  ret_from_fork+0x22/0x30

[   40.796196] ata7.00: Features: Trust Dev-Sleep NCQ-sndrcv



[   40.802735] ata7.00: supports DRM functions and may not be fully accessible

[   40.807011] Allocated by task 786:

[   40.807015]  kasan_save_stack+0x23/0x50

[   40.815459] ata7.00: disabling queued TRIM support

[   40.818577]  __kasan_kmalloc+0x83/0xa0

[   40.818581]  __kmalloc+0x2f7/0x4c0

[   40.818585]  __devm_drm_dev_alloc+0x2f/0xb0 [drm]

[   40.826718] ata7.00: configured for UDMA/133

[   40.834032]  ast_device_create+0x32/0x2c0 [ast]

[   40.838646] scsi 6:0:0:0: Direct-Access     ATA      Samsung SSD 860  1B6Q PQ: 0 ANSI: 5

[   40.842107]  ast_pci_probe+0xba/0x100 [ast]

[   40.842117]  local_pci_probe+0x80/0xd0

[   40.842121]  work_for_cpu_fn+0x2e/0x50

[   40.848592] ata7.00: Enabling discard_zeroes_data

[   40.851648]  process_one_work+0x4b9/0x7f0

[   40.855629] sd 6:0:0:0: [sda] 500118192 512-byte logical blocks: (256 GB/238 GiB)

[   40.860689]  worker_thread+0x47a/0x6b0

[   40.860694]  kthread+0x209/0x240

[   40.860696]  ret_from_fork+0x22/0x30

[   40.865913] sd 6:0:0:0: Attached scsi generic sg0 type 0



[   40.870516] Freed by task 1190:

[   40.870519]  kasan_save_stack+0x23/0x50

[   40.870523]  kasan_set_track+0x20/0x30

[   40.871945] sd 6:0:0:0: [sda] Write Protect is off

[   40.871968] sd 6:0:0:0: [sda] Mode Sense: 00 3a 00 00

[   40.872098] sd 6:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

[   40.879881] ata7.00: disabled

[   40.884218]  kasan_set_free_info+0x24/0x40

[   40.884222]  __kasan_slab_free+0xef/0x120

[   40.884227]  kfree+0xa1/0x3e0

[   40.978619]  drm_dev_release+0x58/0x60 [drm]

[   40.983446]  devm_drm_dev_init_release+0x46/0x60 [drm]

[   40.989235]  devm_action_release+0x2c/0x40

[   40.993803]  release_nodes+0x62/0x150

[   40.997887]  devres_release_all+0x10a/0x150

[   41.002551]  really_probe+0x177/0x660

[   41.006636]  __driver_probe_device+0x19e/0x230

[   41.011592]  driver_probe_device+0x4e/0xf0

[   41.016161]  __driver_attach+0x102/0x1d0

[   41.020536]  bus_for_each_dev+0xfb/0x150

[   41.024911]  driver_attach+0x2d/0x40

[   41.028898]  bus_add_driver+0x23f/0x300

[   41.033177]  driver_register+0xdc/0x170

[   41.037455]  __pci_register_driver+0xfc/0x110

[   41.042314]  ast_init+0x48/0x1000 [ast]

[   41.046598]  do_one_initcall+0x99/0x2d0

[   41.050877]  do_init_module+0x10f/0x3c0

[   41.055154]  load_module+0x43fa/0x4710

[   41.059336]  __do_sys_finit_module+0x12a/0x1b0

[   41.064291]  __x64_sys_finit_module+0x43/0x50

[   41.069149]  do_syscall_64+0x3b/0xc0

[   41.073137]  entry_SYSCALL_64_after_hwframe+0x44/0xae



[   41.080433] Last potentially related work creation:

[   41.085873]  kasan_save_stack+0x23/0x50

[   41.090151]  kasan_record_aux_stack+0xac/0xc0

[   41.095010]  insert_work+0x37/0x170

[   41.098899]  __queue_work+0x208/0x640

[   41.102982]  queue_work_on+0x62/0x80

[   41.106969]  __drm_connector_put_safe+0x80/0xa0 [drm]

[   41.112675]  drm_connector_list_iter_next+0x1a0/0x230 [drm]

[   41.118961]  drm_mode_config_cleanup+0x12e/0x500 [drm]

[   41.124760]  drm_mode_config_init_release+0xe/0x10 [drm]

[   41.130753]  drm_managed_release+0x11c/0x240 [drm]

[   41.136164]  drm_dev_release+0x46/0x60 [drm]

[   41.140992]  devm_drm_dev_init_release+0x46/0x60 [drm]

[   41.146790]  devm_action_release+0x2c/0x40

[   41.151358]  release_nodes+0x62/0x150

[   41.155441]  devres_release_all+0x10a/0x150

[   41.160106]  really_probe+0x177/0x660

[   41.164190]  __driver_probe_device+0x19e/0x230

[   41.169147]  driver_probe_device+0x4e/0xf0

[   41.173716]  __driver_attach+0x102/0x1d0

[   41.178091]  bus_for_each_dev+0xfb/0x150

[   41.182466]  driver_attach+0x2d/0x40

[   41.186452]  bus_add_driver+0x23f/0x300

[   41.190729]  driver_register+0xdc/0x170

[   41.195006]  __pci_register_driver+0xfc/0x110

[   41.199868]  ast_init+0x48/0x1000 [ast]

[   41.204151]  do_one_initcall+0x99/0x2d0

[   41.208429]  do_init_module+0x10f/0x3c0

[   41.212707]  load_module+0x43fa/0x4710

[   41.216888]  __do_sys_finit_module+0x12a/0x1b0

[   41.221843]  __x64_sys_finit_module+0x43/0x50

[   41.226704]  do_syscall_64+0x3b/0xc0

[   41.230691]  entry_SYSCALL_64_after_hwframe+0x44/0xae



[   41.237985] The buggy address belongs to the object at ffff8881feb80000

                 which belongs to the cache kmalloc-8k of size 8192

[   41.251959] The buggy address is located 856 bytes inside of

                 8192-byte region [ffff8881feb80000, ffff8881feb82000)

[   41.265157] The buggy address belongs to the page:

[   41.270501] page:00000000eecae4f9 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1feb80

[   41.280985] head:00000000eecae4f9 order:3 compound_mapcount:0 compound_pincount:0

[   41.289336] flags: 0x17ffffc0010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)

[   41.297595] raw: 0017ffffc0010200 0000000000000000 dead000000000122 ffff88810004d180

[   41.306236] raw: 0000000000000000 0000000000020002 00000001ffffffff 0000000000000000

[   41.314875] page dumped because: kasan: bad access detected



[   41.322749] Memory state around the buggy address:

[   41.328093]  ffff8881feb80200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

[   41.336152]  ffff8881feb80280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

[   41.344211] >ffff8881feb80300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

[   41.352267]                                                     ^

[   41.359066]  ffff8881feb80380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

[   41.367123]  ffff8881feb80400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

[   41.375181] ==================================================================


That bisection led to this commit:

commit aae74ff9caa8de9a45ae2e46068c417817392a26

Author: Ainux <ainux.wang@gmail.com>

Date:   Wed May 26 19:15:15 2021 +0800



     drm/ast: Add detect function support



     The existence of the connector cannot be detected,

     so add the detect function to support.



     Signed-off-by: Ainux <ainux.wang@gmail.com>

     Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>

     Link: https://patchwork.freedesktop.org/patch/msgid/20210526111515.40015-1-ainux.wang@gmail.com



  drivers/gpu/drm/ast/ast_mode.c | 18 +++++++++++++++++-

  1 file changed, 17 insertions(+), 1 deletion(-)


I confirmed that if I revert it from v5.15-rc4 (after reverting
its dependent 572994bf18ff "drm/ast: Zero is missing in detect
function"), the problem goes away.

Full .config, dmesg attached.

I can test any possible fixes...

Thanks,

Kim

[-- Attachment #2: aae74ff9caa8-failure.config.xz --]
[-- Type: application/x-xz, Size: 54020 bytes --]

[-- Attachment #3: f6274b06e326.dmesg.xz --]
[-- Type: application/x-xz, Size: 33848 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-10-12  1:48 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-05 18:10 BUG: KASAN: use-after-free in enqueue_timer+0x4f/0x1e0 Kim Phillips
2021-10-11 16:07 ` Kim Phillips
2021-10-12  1:48   ` David Airlie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).