From: Ben Greear <greearb@candelatech.com>
To: Michal Kazior <michal.kazior@tieto.com>
Cc: linux-wireless <linux-wireless@vger.kernel.org>,
	"ath10k@lists.infradead.org" <ath10k@lists.infradead.org>
Subject: Re: [PATCH 1/3] ath10k: Ensure there are no stale ar->txqs entries.
Date: Thu, 1 Dec 2016 14:52:59 -0800
Message-ID: <ba1cc818-6194-11d6-7cc3-1db28036a519@candelatech.com>
In-Reply-To: <57B70AED.2010200@candelatech.com>

On 08/19/2016 06:34 AM, Ben Greear wrote:
>
>
> On 08/18/2016 11:59 PM, Michal Kazior wrote:
>> On 19 August 2016 at 03:26,  <greearb@candelatech.com> wrote:
>>> From: Ben Greear <greearb@candelatech.com>
>>>
>>> I was seeing kernel crashes due to accessing freed memory
>>> while debugging a 9984 firmware that was crashing often.
>>>
>>> This patch fixes the crashes.  I am not certain if there
>>> is a better way or not.


I did some more hacking on this today, and I think I found a better clue.

I added this code:

static void ath10k_mac_txq_init(struct ath10k *ar, struct ieee80211_txq *txq)
{
	struct ath10k_txq *artxq;
	struct ath10k_txq *tmp, *walker;
	struct ieee80211_txq *txq_tmp;
	int i = 0;

	if (!txq)
		return;

	/* Only look at txq->drv_priv once txq is known to be non-NULL. */
	artxq = (void *)txq->drv_priv;

	spin_lock_bh(&ar->txqs_lock);
	ar->txqs_lock.rlock.dbg1 = 104;

	/* Remove from ar->txqs in case it still exists there. */
	list_for_each_entry_safe(walker, tmp, &ar->txqs, list) {
		txq_tmp = container_of((void *)walker, struct ieee80211_txq,
				       drv_priv);
		if ((++i % 10000) == 0) {
			ath10k_err(ar, "txq-init: Checking txq_tmp: %p i: %d\n", txq_tmp, i);
			ath10k_err(ar, "txq-init: txqs: %p walker->list: %p w->next: %p  w->prev: %p ar->txqs: %p\n",
				   &ar->txqs, &(walker->list), walker->list.next, walker->list.prev, &ar->txqs);
		}

		if (txq_tmp == txq) {
			WARN_ON_ONCE(1);
			ath10k_err(ar, "txq-init: Found txq when it should be deleted, txq_tmp: %p  txq: %p\n",
				   txq_tmp, txq);
			list_del(&walker->list);
		}
	}
	spin_unlock_bh(&ar->txqs_lock);

	INIT_LIST_HEAD(&artxq->list);
}
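
The failure mode the check above guards against can be reproduced in userspace. What follows is only a minimal sketch, not ath10k code: `struct list_head` and the helpers are simplified stand-ins for the kernel's list API, and `node_reinit()` is a hypothetical name. The point it illustrates is that re-initializing a node that is still linked leaves its neighbours pointing at it while the node points only at itself, so a later walk that reaches the node can never get back to the list head. Unlinking the stale node first avoids that.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = NULL;
	n->prev = NULL;
}

/* Walk the list and report whether node is still linked on it. */
static bool list_contains(const struct list_head *head,
			  const struct list_head *node)
{
	const struct list_head *p;

	for (p = head->next; p != head; p = p->next)
		if (p == node)
			return true;
	return false;
}

/*
 * Hypothetical helper mirroring the idea of the debug code: unlink the
 * node if it is still on the list, then re-initialize it.  Returns true
 * when a stale entry was found.
 */
static bool node_reinit(struct list_head *head, struct list_head *node)
{
	bool was_stale = list_contains(head, node);

	if (was_stale)
		list_del(node);
	init_list_head(node);
	return was_stale;
}
```

With two nodes queued on a list, calling `node_reinit()` on the first one reports a stale entry and leaves the second node correctly linked to the head; calling it again on the same node reports nothing stale.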


[firmware has just crashed]

Dec 01 14:43:06 wave2 kernel: ------------[ cut here ]------------
Dec 01 14:43:06 wave2 kernel: WARNING: CPU: 0 PID: 193 at /home/greearb/git/linux-4.7.dev.y/drivers/net/wireless/ath/ath10k/mac.c:4217 ath10k_mac_txq_init+0x1a7/0x1b0 [ath10k_core]
Dec 01 14:43:06 wave2 kernel: Modules linked in: nf_conntrack_netlink nf_conntrack nfnetlink nf_defrag_ipv4 bridge 8021q garp mrp stp llc bnep bluetooth fuse macvlan pktgen rpcsec_gss_krb5 nfsv4 nfs fscache coretemp hwmon intel_rapl x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic kvm iTCO_wdt irqbypass iTCO_vendor_support ath10k_pci ath10k_core joydev ath snd_hda_intel mac80211 snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device pcspkr cfg80211 snd_pcm snd_timer snd i2c_i801 lpc_ich shpchp soundcore tpm_tis tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc serio_raw i915 i2c_algo_bit ata_generic drm_kms_helper pata_acpi e1000e ptp drm pps_core i2c_core fjes video ipv6 [last unloaded: nf_conntrack]
Dec 01 14:43:06 wave2 kernel: CPU: 0 PID: 193 Comm: kworker/0:1 Not tainted 4.7.10+ #14
Dec 01 14:43:06 wave2 kernel: Hardware name: To be filled by O.E.M. To be filled by O.E.M./ChiefRiver, BIOS 4.6.5 06/07/2013
Dec 01 14:43:06 wave2 kernel: Workqueue: events_freezable ieee80211_restart_work [mac80211]
Dec 01 14:43:06 wave2 kernel:  ffffffffa0e29507 ffff8801d14f7920 ffffffff8169ed08 0000000000000000
Dec 01 14:43:06 wave2 kernel:  0000000000000000 ffff8801d14f7968 ffffffff811569bc ffff8801d14e4f00
Dec 01 14:43:06 wave2 kernel:  0000107900000001 ffff8800c43ec9a0 0000000000000027 ffff8800c43ec988
Dec 01 14:43:06 wave2 kernel: Call Trace:
Dec 01 14:43:06 wave2 kernel:  [<ffffffffa0e29507>] ? ath10k_mac_txq_init+0x1a7/0x1b0 [ath10k_core]
Dec 01 14:43:06 wave2 kernel:  [<ffffffff8169ed08>] dump_stack+0x85/0xcd
Dec 01 14:43:06 wave2 kernel:  [<ffffffff811569bc>] __warn+0x10c/0x130
Dec 01 14:43:06 wave2 kernel:  [<ffffffff81156b58>] warn_slowpath_null+0x18/0x20
Dec 01 14:43:06 wave2 kernel:  [<ffffffffa0e29507>] ath10k_mac_txq_init+0x1a7/0x1b0 [ath10k_core]
Dec 01 14:43:06 wave2 kernel:  [<ffffffffa0e3766f>] ath10k_sta_state+0x4ef/0x1350 [ath10k_core]
Dec 01 14:43:06 wave2 kernel:  [<ffffffff811e10ed>] ? mark_lock+0x6d/0x8a0
Dec 01 14:43:06 wave2 kernel:  [<ffffffffa0e37180>] ? ath10k_station_assoc+0x920/0x920 [ath10k_core]
Dec 01 14:43:06 wave2 kernel:  [<ffffffff811ddd14>] ? __lock_is_held+0x84/0xc0
Dec 01 14:43:06 wave2 kernel:  [<ffffffffa0c0095f>] drv_sta_state+0xef/0xc50 [mac80211]
Dec 01 14:43:06 wave2 kernel:  [<ffffffffa0c6b1b0>] ieee80211_reconfig+0x10a0/0x2890 [mac80211]
Dec 01 14:43:06 wave2 kernel:  [<ffffffffa0bf8361>] ieee80211_restart_work+0xb1/0xf0 [mac80211]
Dec 01 14:43:06 wave2 kernel:  [<ffffffff81184dad>] process_one_work+0x42d/0xac0
Dec 01 14:43:06 wave2 kernel:  [<ffffffff81184cf4>] ? process_one_work+0x374/0xac0
Dec 01 14:43:06 wave2 kernel:  [<ffffffff81184980>] ? pwq_dec_nr_in_flight+0x110/0x110
Dec 01 14:43:06 wave2 kernel:  [<ffffffff811854c6>] worker_thread+0x86/0x730
Dec 01 14:43:06 wave2 kernel:  [<ffffffff81df25aa>] ? _raw_spin_unlock_irqrestore+0x5a/0x70
Dec 01 14:43:06 wave2 kernel:  [<ffffffff81185440>] ? process_one_work+0xac0/0xac0
Dec 01 14:43:06 wave2 kernel:  [<ffffffff8118f381>] kthread+0x191/0x1b0
Dec 01 14:43:06 wave2 kernel:  [<ffffffff8118f1f0>] ? kthread_create_on_node+0x320/0x320
Dec 01 14:43:06 wave2 kernel:  [<ffffffff8119baa3>] ? preempt_count_sub+0x13/0xd0
Dec 01 14:43:06 wave2 kernel:  [<ffffffff81df2f8f>] ret_from_fork+0x1f/0x40
Dec 01 14:43:06 wave2 kernel:  [<ffffffff8118f1f0>] ? kthread_create_on_node+0x320/0x320
Dec 01 14:43:06 wave2 kernel: ---[ end trace e64bc8f0c1a2531b ]---
Dec 01 14:43:06 wave2 kernel: ath10k_pci 0000:05:00.0: txq-init: Found txq when it should be deleted, txq_tmp: ffff8800c43ec988  txq: ffff8800c43ec988
Dec 01 14:43:07 wave2 kernel: ath10k_pci 0000:05:00.0: dropping dbg buffer due to crash since read
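
The identical txq_tmp and txq pointers in the "Found txq" line come from the container_of() step in the loop, which maps a driver-private area back to the structure that embeds it. Here is a rough userspace sketch of that mapping, under stated assumptions: `struct fake_txq`, its fields, the 32-byte size and both helper names are made up for illustration and do not match the real mac80211 layout.

```c
#include <stddef.h>

/* Userspace equivalent of the kernel macro, built on offsetof(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Hypothetical stand-in for struct ieee80211_txq: some core fields
 * followed by a pointer-aligned private area owned by the driver.
 */
struct fake_txq {
	int tid;
	_Alignas(void *) unsigned char drv_priv[32];
};

/* Core structure -> driver-private area. */
static void *txq_to_priv(struct fake_txq *txq)
{
	return txq->drv_priv;
}

/* Driver-private area -> core structure that embeds it. */
static struct fake_txq *priv_to_txq(void *priv)
{
	return container_of(priv, struct fake_txq, drv_priv);
}
```

The two helpers are inverses of each other, so a list entry that still lives inside a given txq's private area always maps back to that same txq pointer, which is the comparison the debug loop relies on.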


Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
