All of lore.kernel.org
 help / color / mirror / Atom feed
From: Shung-Hsi Yu <shung-hsi.yu@suse.com>
To: Majd Dibbiny <majd@mellanox.com>,
	Maor Gottlieb <maorg@mellanox.com>,
	Saeed Mahameed <saeedm@nvidia.com>,
	netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org
Cc: Jakub Kicinski <kuba@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Leon Romanovsky <leon@kernel.org>
Subject: BUG: double free of mlx5_cmd_work_ent during shutdown
Date: Thu, 22 Jul 2021 14:49:04 +0800	[thread overview]
Message-ID: <YPkU4HZwKMf9kuBH@syu-laptop> (raw)

Hi,

When force teardown is used (for devices that doesn't support fast
teardown), mlx5_cmd_force_teardown_hca() sends out a TEARDOWN_HCA command,
wait for the command to complete, and call cmd_ent_put() from the
following stack.

    cmd_exec() {
        // ...
        out_free:
            cmd_ent_put(ent); // HERE
        out:
            return err;
    }
      mlx5_cmd_exec_polling()
        mlx5_cmd_force_teardown_hca()
          shutdown()

However, there's another path which cmd_ent_put() is called for the same
command from the async events chain.

    mlx5_cmd_comp_handler() {
        for (i = 0; i < (1 << cmd->log_sz); i++) {
            if (test_bit(i, &vector)) {
                // ...
                if (!forced || /* Real FW completion */
                    pci_channel_offline(dev->pdev) || /* FW is inaccessible */
                    dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
                    cmd_ent_put(ent); // HERE
        // ...
    }
      cmd_comp_notifier()
        notifier_call_chain()
          atomic_notifier_call_chain()
            mlx5_eq_async_int()
              notifier_call_chain()
                atomic_notifier_call_chain()
                  mlx5_irq_int_handler()

This leads to to "refcount_t: underflow; use-after-free" warning during
shutdown.

---

Here's the stack trace on v5.13-based kernel with KASAN enabled.
(Line number doesn't exactly match though)

    systemd-shutdown[1]: Rebooting with kexec.
    kvm: exiting hardware virtualization
    sd 0:0:0:0: [sda] Synchronizing SCSI cache
    mlx5_core 0000:0a:00.1: Shutdown was called
    mlx5_try_fast_unload:1629: mlx5_core 0000:0a:00.1: mlx5_try_fast_unload:1629:(pid 4959): force teardown firmware support=1
    mlx5_try_fast_unload:1630: mlx5_core 0000:0a:00.1: mlx5_try_fast_unload:1630:(pid 4959): fast teardown firmware support=0
    mlx5_cmd_fast_teardown_hca:330: mlx5_core 0000:0a:00.1: mlx5_cmd_fast_teardown_hca:330:(pid 4959): fast teardown is not supported in the firmware
    dump_command:805: mlx5_core 0000:0a:00.1: dump_command:805:(pid 4647): cmd[0]: start dump
    dump_command:814: mlx5_core 0000:0a:00.1: dump_command:814:(pid 4647): cmd[0]: dump command TEARDOWN_HCA(0x103) INPUT
    dump_buf:272: cmd[0]: 000: 07000000 00000010 00000000 00000000
    dump_buf:272: cmd[0]: 010: 01030000 00000000 00000001 00000000
    dump_buf:272: cmd[0]: 020: 00000000 00000000 00000000 00000000
    dump_buf:272: cmd[0]: 030: 00000000 00000000 00000010 3fc50001
    dump_buf:279: 
    dump_command:848: mlx5_core 0000:0a:00.1: dump_command:848:(pid 4647): cmd[0]: end dump
    cmd_work_handler:1003: mlx5_core 0000:0a:00.1: cmd_work_handler:1003:(pid 4647): writing 0x1 to command doorbell
    dump_command:805: mlx5_core 0000:0a:00.1: dump_command:805:(pid 4647): cmd[0]: start dump
    mlx5_core 0000:0a:00.1: mlx5_cmd_comp_handler:1613:(pid 0): Command completion arrived after timeout (entry idx = 0).
    dump_command:814: mlx5_core 0000:0a:00.1: dump_command:814:(pid 4647): cmd[0]: dump command TEARDOWN_HCA(0x103) OUTPUT

    cmd_ent_put(00000000fbaea4c6) will be called in mlx5_cmd_comp_handler:1615 for TEARDOWN_HCA(0x103), polling 1, ret 0
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    any_notifier:148: mlx5_core 0000:0a:00.1: any_notifier:148:(pid 0): Async eqe type MLX5_EVENT_TYPE_CMD, subtype (0)
    ==================================================================
    BUG: KASAN: use-after-free in dump_command+0x9d9/0xa40 [mlx5_core]
    Read of size 4 at addr ffff8881120620e0 by task kworker/u96:0/4647
    
    CPU: 10 PID: 4647 Comm: kworker/u96:0 Tainted: G            E     5.14.0-rc1-1-default+ #7 6ab45f24446e456129f9451daed46c2a2aa80406
    Hardware name: Dell Inc. PowerEdge T320/0FDT3J, BIOS 2.9.0 01/08/2020
    Workqueue: mlx5_cmd_0000:0a:00.1 cmd_work_handler [mlx5_core]
    Call Trace:
     dump_stack_lvl+0x57/0x72
     print_address_description.constprop.0+0x1f/0x140
     ? dump_command+0x9d9/0xa40 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     kasan_report.cold+0x7f/0x11b
     ? dump_command+0x9d9/0xa40 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     dump_command+0x9d9/0xa40 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? seqcount_lockdep_reader_access.constprop.0+0x84/0x90
     ? seqcount_lockdep_reader_access.constprop.0+0x84/0x90
     mlx5_cmd_comp_handler+0x498/0x13e0 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? lock_release+0x1ea/0x6c0
     ? mlx5_cmd_check+0x2f0/0x2f0 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? lockdep_hardirqs_on_prepare+0x278/0x3e0
     ? seqcount_lockdep_reader_access.constprop.0+0x84/0x90
     ? cmd_work_handler+0xfa2/0x19a0 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? __cond_resched+0x15/0x30
     cmd_work_handler+0xfa2/0x19a0 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? cmd_comp_notifier+0x90/0x90 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? lock_is_held_type+0xe0/0x110
     ? lockdep_hardirqs_on_prepare+0x278/0x3e0
     process_one_work+0x8b2/0x14c0
     ? lock_release+0x6c0/0x6c0
     ? pwq_dec_nr_in_flight+0x260/0x260
     ? rwlock_bug.part.0+0x90/0x90
     worker_thread+0x544/0x12d0
     ? get_cpu_entry_area+0x13/0x20
     ? __kthread_parkme+0xa5/0x130
     ? process_one_work+0x14c0/0x14c0
     kthread+0x349/0x410
     ? _raw_spin_unlock_irq+0x24/0x40
     ? set_kthread_struct+0x100/0x100
     ret_from_fork+0x22/0x30
    
    Allocated by task 4959:
     kasan_save_stack+0x1b/0x40
     __kasan_kmalloc+0x7c/0x90
     cmd_exec+0x6f0/0x1740 [mlx5_core]
     mlx5_cmd_exec_polling+0x1f/0x40 [mlx5_core]
     mlx5_cmd_force_teardown_hca+0xfc/0x200 [mlx5_core]
     shutdown+0x2e7/0x3b3 [mlx5_core]
     pci_device_shutdown+0x75/0x110
     device_shutdown+0x2ce/0x5e0
     kernel_kexec+0x57/0xb0
     __do_sys_reboot+0x284/0x2f0
     do_syscall_64+0x5c/0xc0
     entry_SYSCALL_64_after_hwframe+0x44/0xae
    
    Freed by task 0:
     kasan_save_stack+0x1b/0x40
     kasan_set_track+0x1c/0x30
     kasan_set_free_info+0x20/0x30
     __kasan_slab_free+0xec/0x120
     slab_free_freelist_hook+0x90/0x1b0
     kfree+0xd1/0x410
     mlx5_cmd_comp_handler.cold+0x138/0x16c [mlx5_core]
     cmd_comp_notifier+0x72/0x90 [mlx5_core]
     atomic_notifier_call_chain+0xdd/0x1c0
     mlx5_eq_async_int+0x34c/0x8f0 [mlx5_core]
     atomic_notifier_call_chain+0xdd/0x1c0
     irq_int_handler+0x11/0x20 [mlx5_core]
     __handle_irq_event_percpu+0x264/0x640
     handle_irq_event+0x10b/0x270
     handle_edge_irq+0x21a/0xbe0
     __common_interrupt+0x72/0x160
     common_interrupt+0x7b/0xa0
     asm_common_interrupt+0x1e/0x40
    
    Last potentially related work creation:
     kasan_save_stack+0x1b/0x40
     kasan_record_aux_stack+0xa5/0xb0
     insert_work+0x44/0x280
     __queue_work+0x4b7/0xc50
     queue_work_on+0x8d/0x90
     cmd_exec+0xfac/0x1740 [mlx5_core]
     mlx5_cmd_exec_polling+0x1f/0x40 [mlx5_core]
     mlx5_cmd_force_teardown_hca+0xfc/0x200 [mlx5_core]
     shutdown+0x2e7/0x3b3 [mlx5_core]
     pci_device_shutdown+0x75/0x110
     device_shutdown+0x2ce/0x5e0
     kernel_kexec+0x57/0xb0
     __do_sys_reboot+0x284/0x2f0
     do_syscall_64+0x5c/0xc0
     entry_SYSCALL_64_after_hwframe+0x44/0xae
    
    The buggy address belongs to the object at ffff888112062000
     which belongs to the cache kmalloc-1k of size 1024
    The buggy address is located 224 bytes inside of
     1024-byte region [ffff888112062000, ffff888112062400)
    The buggy address belongs to the page:
    page:000000008b314a5c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x112060
    head:000000008b314a5c order:3 compound_mapcount:0 compound_pincount:0
    flags: 0x2ffff800010200(slab|head|node=0|zone=2|lastcpupid=0x1ffff)
    raw: 002ffff800010200 ffffea00041e3a00 0000000500000005 ffff888100042dc0
    raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
    page dumped because: kasan: bad access detected
    
    Memory state around the buggy address:
     ffff888112061f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
     ffff888112062000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    >ffff888112062080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                           ^
     ffff888112062100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
     ffff888112062180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ==================================================================
    Disabling lock debugging due to kernel taint
    dump_buf:272: cmd[0]: 000: 07000000 00000010 00000000 00000000
    dump_buf:272: cmd[0]: 010: 01030000 00000000 00000001 00000000
    dump_buf:272: cmd[0]: 020: 00000000 00000000 00000000 00000001
    dump_buf:272: cmd[0]: 030: 00000000 00000000 00000010 3fc50000
    dump_buf:279: 
    dump_command:848: mlx5_core 0000:0a:00.1: dump_command:848:(pid 4647): cmd[0]: end dump
    mlx5_cmd_comp_handler:1648: mlx5_core 0000:0a:00.1: mlx5_cmd_comp_handler:1648:(pid 4647): command completed. ret 0x0, delivery status no errors(0x0)
    wait_func:1102: mlx5_core 0000:0a:00.1: wait_func:1102:(pid 4959): err 0, delivery status no errors(0)

    cmd_ent_put(00000000fbaea4c6) will be called in mlx5_cmd_invoke:1178 for TEARDOWN_HCA(0x103), polling 1, ret 0
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    ------------[ cut here ]------------
    refcount_t: underflow; use-after-free.
    WARNING: CPU: 2 PID: 4959 at lib/refcount.c:28 refcount_warn_saturate+0xaa/0x140
    Modules linked in: af_packet(E) bridge(E) stp(E) llc(E) iscsi_ibft(E) iscsi_boot_sysfs(E) rfkill(E) mlx5_core(E) intel_rapl_msr(E) intel_rapl_common(E) sb_edac(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) mgag200(E) coretemp(E) kvm_intel(E) drm_kms_helper(E) kvm(E) syscopyarea(E) sysfillrect(E) ipmi_ssif(E) sysimgblt(E) joydev(E) fb_sys_fops(E) dcdbas(E) cec(E) tg3(E) nls_iso8859_1(E) iTCO_wdt(E) nls_cp437(E) irqbypass(E) rc_core(E) libphy(E) intel_pmc_bxt(E) mei_me(E) ipmi_si(E) vfat(E) i2c_algo_bit(E) mlxfw(E) efi_pstore(E) fat(E) ipmi_devintf(E) iTCO_vendor_support(E) pcspkr(E) pci_hyperv_intf(E) ipmi_msghandler(E) psample(E) tiny_power_button(E) mei(E) lpc_ich(E) wmi(E) button(E) drm(E) fuse(E) configfs(E) hid_microsoft(E) ff_memless(E) hid_generic(E) usbhid(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) aesni_intel(E) crypto_simd(E) cryptd(E) ehci_pci(E) ehci_hcd(E) usbcore(E) sg(E) efivarfs(E)
    CPU: 2 PID: 4959 Comm: kexec Tainted: G    B       E     5.14.0-rc1-1-default+ #7 6ab45f24446e456129f9451daed46c2a2aa80406
    Hardware name: Dell Inc. PowerEdge T320/0FDT3J, BIOS 2.9.0 01/08/2020
    RIP: 0010:refcount_warn_saturate+0xaa/0x140
    Code: e9 0c b5 02 01 e8 be 45 f9 00 0f 0b eb d5 80 3d d7 0c b5 02 00 75 cc 48 c7 c7 e0 f0 a4 ad c6 05 c7 0c b5 02 01 e8 9e 45 f9 00 <0f> 0b eb b5 80 3d b5 0c b5 02 00 75 ac 48 c7 c7 a0 f1 a4 ad c6 05
    RSP: 0018:ffff8881593b79d8 EFLAGS: 00010286
    RAX: 0000000000000000 RBX: ffff888112062214 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffed102b276f2d
    RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
    R10: ffffed102b276eca R11: 0000000000000001 R12: 0000000000000000
    R13: 0000000000000000 R14: ffff888159585908 R15: ffff8881079fa080
    FS:  00007f8846e63740(0000) GS:ffff8881d3700000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f884700b670 CR3: 0000000112ef0004 CR4: 00000000001706e0
    Call Trace:
     cmd_exec+0xda6/0x1740 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? cb_timeout_handler+0xa0/0xa0 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? mark_held_locks+0x9e/0xe0
     mlx5_cmd_exec_polling+0x1f/0x40 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     mlx5_cmd_force_teardown_hca+0xfc/0x200 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? mlx5_cmd_teardown_hca+0xd0/0xd0 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     shutdown+0x2e7/0x3b3 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     pci_device_shutdown+0x75/0x110
     device_shutdown+0x2ce/0x5e0
     kernel_kexec+0x57/0xb0
     __do_sys_reboot+0x284/0x2f0
     ? poweroff_work_func+0x90/0x90
     ? lockdep_hardirqs_on_prepare+0x278/0x3e0
     ? kasan_quarantine_put+0x8e/0x1e0
     ? lockdep_hardirqs_on+0x77/0xf0
     ? kasan_quarantine_put+0x8e/0x1e0
     ? lock_is_held_type+0xe0/0x110
     ? slab_free_freelist_hook+0x90/0x1b0
     ? __fput+0x2a1/0x780
     ? lock_is_held_type+0x53/0x110
     ? rcu_read_lock_sched_held+0x3f/0x80
     ? lockdep_hardirqs_on_prepare+0x278/0x3e0
     ? call_rcu+0x5b2/0xa80
     ? lock_is_held_type+0x53/0x110
     ? lockdep_hardirqs_on_prepare+0x278/0x3e0
     ? syscall_enter_from_user_mode+0x21/0x70
     do_syscall_64+0x5c/0xc0
     ? lockdep_hardirqs_on_prepare+0x278/0x3e0
     ? lockdep_hardirqs_on+0x77/0xf0
     ? do_syscall_64+0x69/0xc0
     ? do_syscall_64+0x69/0xc0
     ? lock_is_held_type+0x53/0x110
     ? irqentry_exit_to_user_mode+0xa/0x40
     ? asm_exc_page_fault+0x8/0x30
     ? lockdep_hardirqs_on+0x77/0xf0
     entry_SYSCALL_64_after_hwframe+0x44/0xae
    RIP: 0033:0x7f8846faa0d7
    Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 89 fa be 69 19 12 28 bf ad de e1 fe b8 a9 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 69 ad 0c 00 f7 d8 64 89 02 b8
    RSP: 002b:00007fffeb69d648 EFLAGS: 00000206 ORIG_RAX: 00000000000000a9
    RAX: ffffffffffffffda RBX: 0000561e00f46520 RCX: 00007f8846faa0d7
    RDX: 0000000045584543 RSI: 0000000028121969 RDI: 00000000fee1dead
    RBP: 00007f8847076600 R08: 000000000000000b R09: 0000000000000000
    R10: 00007f8846ebe0d8 R11: 0000000000000206 R12: 0000561dffc738fe
    R13: 0000561dffc73905 R14: 00007fffeb69d7a0 R15: 00000000ffffffff
    irq event stamp: 11228
    hardirqs last  enabled at (11227): [<ffffffffad2dde14>] _raw_spin_unlock_irq+0x24/0x40
    hardirqs last disabled at (11228): [<ffffffffad2c9f39>] __schedule+0x1229/0x2300
    softirqs last  enabled at (11162): [<ffffffffab3bd94e>] __irq_exit_rcu+0x12e/0x170
    softirqs last disabled at (11157): [<ffffffffab3bd94e>] __irq_exit_rcu+0x12e/0x170
    ---[ end trace 151c2ed42dd09eaf ]---
    cmd_exec:1864: mlx5_core 0000:0a:00.1: cmd_exec:1864:(pid 4959): err 0, status 0
    mlx5_core 0000:0a:00.1: mlx5_cmd_force_teardown_hca:313:(pid 4959): teardown with force mode failed, doing normal teardown
    mlx5_try_fast_unload:1654: mlx5_core 0000:0a:00.1: mlx5_try_fast_unload:1654:(pid 4959): Firmware couldn't do fast unload error: -5
    mlx5_wait_for_pages:739: mlx5_core 0000:0a:00.1: mlx5_wait_for_pages:739:(pid 4959): Waiting for 0 pages
    mlx5_wait_for_pages:752: mlx5_core 0000:0a:00.1: mlx5_wait_for_pages:752:(pid 4959): All pages received
    dump_command:805: mlx5_core 0000:0a:00.1: dump_command:805:(pid 4647): cmd[0]: start dump
    dump_command:814: mlx5_core 0000:0a:00.1: dump_command:814:(pid 4647): cmd[0]: dump command DESTROY_FLOW_TABLE(0x931) INPUT
    dump_buf:272: cmd[0]: 000: 07000000 00000040 00000001 347e4400
    dump_buf:272: cmd[0]: 010: 09310000 00000000 00000000 00000000
    dump_buf:272: cmd[0]: 020: 00000000 00000000 00000000 00000000
    dump_buf:272: cmd[0]: 030: 00000000 00000000 00000010 40de0001


Best,
Shung-Hsi Yu


             reply	other threads:[~2021-07-22  6:49 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-22  6:49 Shung-Hsi Yu [this message]
2021-07-26 13:30 ` BUG: double free of mlx5_cmd_work_ent during shutdown Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YPkU4HZwKMf9kuBH@syu-laptop \
    --to=shung-hsi.yu@suse.com \
    --cc=davem@davemloft.net \
    --cc=kuba@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=majd@mellanox.com \
    --cc=maorg@mellanox.com \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.