From: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
To: lorenzo@kernel.org, sujuan.chen@mediatek.com,
	Felix Fietkau <nbd@nbd.name>,
	Linux List Kernel Mailing <linux-wireless@vger.kernel.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>
Subject: [6.2][regression] after commit cd372b8c99c5a5cf6a464acebb7e4a79af7ec8ae stopping working wifi mt7921e
Date: Wed, 21 Dec 2022 06:10:41 +0500
Message-ID: <CABXGCsMEnQd=gYKTd1knRsWuxCb=Etv5nAre+XJS_s5FgVteYA@mail.gmail.com>

Hi,
The 6.2 development cycle has begun, and after I updated the kernel
on my laptop, Wi-Fi stopped working.

Bisecting blames this commit:
cd372b8c99c5a5cf6a464acebb7e4a79af7ec8ae is the first bad commit
commit cd372b8c99c5a5cf6a464acebb7e4a79af7ec8ae
Author: Lorenzo Bianconi <lorenzo@kernel.org>
Date:   Sat Nov 12 16:40:35 2022 +0100

    wifi: mt76: add WED RX support to mt76_dma_{add,get}_buf

    Introduce the capability to configure RX WED in mt76_dma_{add,get}_buf
    utility routines.

    Tested-by: Daniel Golle <daniel@makrotopia.org>
    Co-developed-by: Sujuan Chen <sujuan.chen@mediatek.com>
    Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com>
    Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
    Signed-off-by: Felix Fietkau <nbd@nbd.name>

 drivers/net/wireless/mediatek/mt76/dma.c  | 125 ++++++++++++++++++++----------
 drivers/net/wireless/mediatek/mt76/mt76.h |   2 +
 2 files changed, 88 insertions(+), 39 deletions(-)

Unfortunately, I can't confirm that reverting this commit fixes the
problem, because after the revert the kernel fails to compile with the
following error:
drivers/net/wireless/mediatek/mt76/mt7915/dma.c: In function ‘mt7915_dma_init’:
drivers/net/wireless/mediatek/mt76/mt7915/dma.c:489:33: error: implicit declaration of function ‘MT_WED_Q_RX’; did you mean ‘MT_WED_Q_TX’? [-Werror=implicit-function-declaration]
  489 |                                 MT_WED_Q_RX(MT7915_RXQ_BAND0);
      |                                 ^~~~~~~~~~~
      |                                 MT_WED_Q_TX
cc1: some warnings being treated as errors
  CC [M]  drivers/net/ethernet/intel/igb/e1000_phy.o
make[7]: *** [scripts/Makefile.build:252:
drivers/net/wireless/mediatek/mt76/mt7915/dma.o] Error 1
make[7]: *** Waiting for unfinished jobs....


With commit cd372b8c99c5a5cf6a464acebb7e4a79af7ec8ae applied, I see the
following two error traces in the kernel log:

1)
[   23.642036] ======================================================
[   23.642304] WARNING: possible circular locking dependency detected
[   23.642304] 6.1.0-rc5-13-cd372b8c99c5a5cf6a464acebb7e4a79af7ec8ae+
#13 Tainted: G        W    L
[   23.642304] ------------------------------------------------------
[   23.642304] kworker/u32:10/831 is trying to acquire lock:
[   23.642304] ffff8c43b2043c78 (&dev->mutex#3){+.+.}-{3:3}, at:
mt7921_roc_work+0x37/0xa0 [mt7921_common]
[   23.642304]
               but task is already holding lock:
[   23.642304] ffffaa0501a8fe78
((work_completion)(&dev->phy.roc_work)){+.+.}-{0:0}, at:
process_one_work+0x20b/0x5b0
[   23.642304]
               which lock already depends on the new lock.

[   23.642304]
               the existing dependency chain (in reverse order) is:
[   23.642304]
               -> #1 ((work_completion)(&dev->phy.roc_work)){+.+.}-{0:0}:
[   23.642304]        __flush_work+0x84/0x4b0
[   23.642304]        __cancel_work_timer+0xfc/0x190
[   23.642304]        mt7921_abort_roc+0x3b/0x60 [mt7921_common]
[   23.642304]        mt7921_mgd_complete_tx+0x4c/0x70 [mt7921_common]
[   23.642304]        drv_mgd_complete_tx+0x8c/0x190 [mac80211]
[   23.642304]        ieee80211_sta_rx_queued_mgmt+0x2a5/0x8e0 [mac80211]
[   23.642304]        ieee80211_iface_work+0x328/0x450 [mac80211]
[   23.642304]        process_one_work+0x294/0x5b0
[   23.642304]        worker_thread+0x4f/0x3a0
[   23.642304]        kthread+0xf5/0x120
[   23.642304]        ret_from_fork+0x22/0x30
[   23.642304]
               -> #0 (&dev->mutex#3){+.+.}-{3:3}:
[   23.642304]        __lock_acquire+0x12b1/0x1ef0
[   23.642304]        lock_acquire+0xc2/0x2b0
[   23.642304]        __mutex_lock+0xbb/0x850
[   23.642304]        mt7921_roc_work+0x37/0xa0 [mt7921_common]
[   23.642304]        process_one_work+0x294/0x5b0
[   23.642304]        worker_thread+0x4f/0x3a0
[   23.642304]        kthread+0xf5/0x120
[   23.642304]        ret_from_fork+0x22/0x30
[   23.642304]
               other info that might help us debug this:

[   23.642304]  Possible unsafe locking scenario:

[   23.642304]        CPU0                    CPU1
[   23.642304]        ----                    ----
[   23.642304]   lock((work_completion)(&dev->phy.roc_work));
[   23.642304]                                lock(&dev->mutex#3);
[   23.669750]                                lock((work_completion)(&dev->phy.roc_work));
[   23.669750]   lock(&dev->mutex#3);
[   23.669750]
                *** DEADLOCK ***

[   23.671578] 2 locks held by kworker/u32:10/831:
[   23.671578]  #0: ffff8c43ba7aa148
((wq_completion)phy0){+.+.}-{0:0}, at: process_one_work+0x20b/0x5b0
[   23.671578]  #1: ffffaa0501a8fe78
((work_completion)(&dev->phy.roc_work)){+.+.}-{0:0}, at:
process_one_work+0x20b/0x5b0
[   23.673701]
               stack backtrace:
[   23.673701] CPU: 8 PID: 831 Comm: kworker/u32:10 Tainted: G
W    L     6.1.0-rc5-13-cd372b8c99c5a5cf6a464acebb7e4a79af7ec8ae+ #13
[   23.673701] Hardware name: ASUSTeK COMPUTER INC. ROG Strix
G513QY_G513QY/G513QY, BIOS G513QY.320 09/07/2022
[   23.673701] Workqueue: phy0 mt7921_roc_work [mt7921_common]
[   23.673701] Call Trace:
[   23.673701]  <TASK>
[   23.677973]  dump_stack_lvl+0x5b/0x77
[   23.677973]  check_noncircular+0xff/0x110
[   23.677973]  ? sched_clock_local+0xe/0x80
[   23.677973]  __lock_acquire+0x12b1/0x1ef0
[   23.677973]  lock_acquire+0xc2/0x2b0
[   23.677973]  ? mt7921_roc_work+0x37/0xa0 [mt7921_common]
[   23.677973]  __mutex_lock+0xbb/0x850
[   23.681699]  ? mt7921_roc_work+0x37/0xa0 [mt7921_common]
[   23.681699]  ? mt7921_roc_work+0x37/0xa0 [mt7921_common]
[   23.681699]  ? mt7921_roc_work+0x37/0xa0 [mt7921_common]
[   23.681699]  mt7921_roc_work+0x37/0xa0 [mt7921_common]
[   23.681699]  process_one_work+0x294/0x5b0
[   23.681699]  worker_thread+0x4f/0x3a0
[   23.681699]  ? process_one_work+0x5b0/0x5b0
[   23.681699]  kthread+0xf5/0x120
[   23.685767]  ? kthread_complete_and_exit+0x20/0x20
[   23.685767]  ret_from_fork+0x22/0x30
[   23.685767]  </TASK>
[   24.599971] wlp5s0: authentication with 24:cf:24:c2:72:d0 timed out
[   24.749911] amdgpu 0000:03:00.0: amdgpu: free PSP TMR buffer
[   27.607726] mt7921e 0000:05:00.0: Message 00020003 (seq 10) timeout
[   30.615933] mt7921e 0000:05:00.0: Message 00020002 (seq 11) timeout
[   30.703139] mt7921e 0000:05:00.0: HW/SW Version: 0x8a108a10, Build
Time: 20220908210919a
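
For readers less familiar with lockdep output, the "Possible unsafe
locking scenario" in trace 1 can be read as a cycle in a graph of lock
dependencies. The following is a minimal user-space sketch in Python,
not kernel code and not how lockdep is implemented. The two lock class
names are copied from the report. The helper find_cycle and the
dictionary representation are hypothetical and exist only for
illustration.

```python
from typing import Dict, List, Optional

# Lock classes named in the lockdep report.
MUTEX = "&dev->mutex#3"
ROC_WORK = "(work_completion)(&dev->phy.roc_work)"


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle in a 'held -> acquired next' graph, or None.

    Hypothetical helper: a plain depth-first search, far simpler than
    what lockdep actually does.
    """
    path: List[str] = []  # nodes on the current DFS path
    done = set()          # nodes fully explored without finding a cycle

    def visit(node: str) -> Optional[List[str]]:
        if node in path:
            # Back edge: the cycle is the path suffix starting at node.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        for nxt in graph.get(node, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        done.add(node)
        return None

    for start in list(graph):
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# An edge "A -> B" means: B was acquired while A was held.
deps = {
    # CPU1 column of the scenario: mutex first, then the work completion.
    MUTEX: [ROC_WORK],
    # CPU0 column: the running work item (mt7921_roc_work) takes the mutex.
    ROC_WORK: [MUTEX],
}

cycle = find_cycle(deps)
print(" -> ".join(cycle))
```

With both edges present the search finds a cycle, which corresponds to
the potential deadlock lockdep warns about. With either edge removed
it finds none.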



2)
[   57.627571] ------------[ cut here ]------------
[   57.627575] WARNING: CPU: 10 PID: 831 at
drivers/iommu/dma-iommu.c:1038 iommu_dma_unmap_page+0x79/0x90
[   57.627586] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer
nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet
nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack
nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr bnep
intel_rapl_msr intel_rapl_common sunrpc snd_sof_amd_rembrandt
snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_hda_codec_realtek
mt7921e snd_sof snd_hda_codec_generic snd_hda_codec_hdmi mt7921_common
snd_sof_utils edac_mce_amd snd_soc_core binfmt_misc snd_hda_intel
mt76_connac_lib snd_intel_dspcfg btusb snd_compress snd_intel_sdw_acpi
ac97_bus mt76 btrtl snd_pcm_dmaengine kvm_amd snd_hda_codec snd_pci_ps
btbcm snd_hda_core snd_rpl_pci_acp6x btintel vfat snd_pci_acp6x
snd_hwdep mac80211 fat btmtk kvm snd_seq libarc4 snd_seq_device
bluetooth irqbypass snd_pcm cfg80211 snd_pci_acp5x snd_rn_pci_acp3x
snd_timer snd_acp_config rapl snd snd_soc_acpi asus_nb_wmi wmi_bmof
[   57.627650]  pcspkr i2c_piix4 snd_pci_acp3x k10temp soundcore
joydev asus_wireless amd_pmc zram amdgpu crct10dif_pclmul hid_asus
crc32_pclmul drm_ttm_helper crc32c_intel asus_wmi polyval_clmulni ttm
ledtrig_audio sparse_keymap polyval_generic platform_profile iommu_v2
gpu_sched nvme drm_buddy nvme_core drm_display_helper rfkill
ghash_clmulni_intel ucsi_acpi hid_multitouch sha512_ssse3 serio_raw
typec_ucsi ccp r8169 cec sp5100_tco nvme_common typec i2c_hid_acpi
video i2c_hid wmi ip6_tables ip_tables fuse
[   57.627702] CPU: 10 PID: 831 Comm: kworker/u32:10 Tainted: G
W    L     6.1.0-rc5-13-cd372b8c99c5a5cf6a464acebb7e4a79af7ec8ae+ #13
[   57.627706] Hardware name: ASUSTeK COMPUTER INC. ROG Strix
G513QY_G513QY/G513QY, BIOS G513QY.320 09/07/2022
[   57.627708] Workqueue: mt76 mt7921_mac_reset_work [mt7921_common]
[   57.627720] RIP: 0010:iommu_dma_unmap_page+0x79/0x90
[   57.627724] Code: 2b 48 3b 28 72 26 48 3b 68 08 73 20 4d 89 f8 44
89 f1 4c 89 ea 48 89 ee 48 89 df 5b 5d 41 5c 41 5d 41 5e 41 5f e9 d7
76 7e ff <0f> 0b 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 66 0f 1f
44 00
[   57.627727] RSP: 0018:ffffaa0501a8fcb8 EFLAGS: 00010246
[   57.627730] RAX: 0000000000000000 RBX: ffff8c43933500d0 RCX: 0000000000000000
[   57.627732] RDX: 0000000000000000 RSI: 0000000000000177 RDI: ffffaa0501a8fca0
[   57.627734] RBP: ffff8c43933500d0 R08: 00000000ffd77800 R09: 0000000000000081
[   57.627735] R10: 0000000000000001 R11: 000ffffffffff000 R12: 00000000ffd77800
[   57.627737] R13: 00000000000006c0 R14: 0000000000000002 R15: 0000000000000000
[   57.627739] FS:  0000000000000000(0000) GS:ffff8c5258a00000(0000)
knlGS:0000000000000000
[   57.627740] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   57.627742] CR2: 000055bcc13dc800 CR3: 0000000479228000 CR4: 0000000000750ee0
[   57.627744] PKRU: 55555554
[   57.627745] Call Trace:
[   57.627749]  <TASK>
[   57.627753]  dma_unmap_page_attrs+0x4c/0x1d0
[   57.627763]  mt76_dma_get_buf+0xaf/0x190 [mt76]
[   57.627774]  ? free_unref_page+0x1a7/0x280
[   57.627780]  mt76_dma_rx_cleanup+0xa0/0x150 [mt76]
[   57.627787]  mt7921_wpdma_reset+0xb6/0x1d0 [mt7921e]
[   57.627795]  mt7921e_mac_reset+0x141/0x2e0 [mt7921e]
[   57.627800]  mt7921_mac_reset_work+0x8b/0x160 [mt7921_common]
[   57.627808]  process_one_work+0x294/0x5b0
[   57.627817]  worker_thread+0x4f/0x3a0
[   57.627820]  ? process_one_work+0x5b0/0x5b0
[   57.627822]  kthread+0xf5/0x120
[   57.627826]  ? kthread_complete_and_exit+0x20/0x20
[   57.627830]  ret_from_fork+0x22/0x30
[   57.627838]  </TASK>
[   57.627840] irq event stamp: 135539
[   57.627841] hardirqs last  enabled at (135539):
[<ffffffff92f7a214>] _raw_spin_unlock_irq+0x24/0x50
[   57.627848] hardirqs last disabled at (135538):
[<ffffffff92f79f18>] _raw_spin_lock_irq+0x68/0x90
[   57.627851] softirqs last  enabled at (135534):
[<ffffffffc2fc2fe8>] __ieee80211_tx_skb_tid_band+0x68/0x250 [mac80211]
[   57.627896] softirqs last disabled at (135494):
[<ffffffffc2fc2fe8>] __ieee80211_tx_skb_tid_band+0x68/0x250 [mac80211]
[   57.627924] ---[ end trace 0000000000000000 ]---
[   57.711796] mt7921e 0000:05:00.0: HW/SW Version: 0x8a108a10, Build
Time: 20220908210919a

Full kernel log is here: https://pastebin.com/ALHUDvSQ

I hope my report helps fix the problem quickly.

-- 
Best Regards,
Mike Gavrilov.
