All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/3] ath10k:  Increase BMI timeout.
@ 2017-05-15 21:25 greearb
  2017-05-15 21:25 ` [PATCH 2/3] ath10k: Fix napi related crash when otp fails greearb
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: greearb @ 2017-05-15 21:25 UTC (permalink / raw)
  To: ath10k; +Cc: Ben Greear

From: Ben Greear <greearb@candelatech.com>

When testing a 9888 chipset NIC, I notice it often takes
almost 2 seconds, and then many times OTP fails, probably due
to the two-second timeout.

[ 2269.841842] ath10k_pci 0000:05:00.0: bmi cmd took: 1984 jiffies (HZ: 1000), rv: 0
[ 2273.608185] ath10k_pci 0000:05:00.0: bmi cmd took: 1986 jiffies (HZ: 1000), rv: 0
[ 2277.294732] ath10k_pci 0000:05:00.0: bmi cmd took: 1989 jiffies (HZ: 1000), rv: 0

So, increase the BMI timeout to 3 seconds.

Signed-off-by: Ben Greear <greearb@candelatech.com>
---
 drivers/net/wireless/ath/ath10k/bmi.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/bmi.h b/drivers/net/wireless/ath/ath10k/bmi.h
index 7d3231a..1054c0c 100644
--- a/drivers/net/wireless/ath/ath10k/bmi.h
+++ b/drivers/net/wireless/ath/ath10k/bmi.h
@@ -187,8 +187,8 @@ struct bmi_target_info {
 	u32 type;
 };
 
-/* in msec */
-#define BMI_COMMUNICATION_TIMEOUT_HZ (2 * HZ)
+/* in jiffies */
+#define BMI_COMMUNICATION_TIMEOUT_HZ (3 * HZ)
 
 #define BMI_CE_NUM_TO_TARG 0
 #define BMI_CE_NUM_TO_HOST 1
-- 
2.4.11


_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/3] ath10k:  Fix napi related crash when otp fails.
  2017-05-15 21:25 [PATCH 1/3] ath10k: Increase BMI timeout greearb
@ 2017-05-15 21:25 ` greearb
  2017-05-31 12:27   ` [2/3] " Kalle Valo
  2017-05-15 21:25 ` [PATCH 3/3] ath10k: Log when longer bmi cmds happen greearb
  2017-06-01 12:56 ` [1/3] ath10k: Increase BMI timeout Kalle Valo
  2 siblings, 1 reply; 9+ messages in thread
From: greearb @ 2017-05-15 21:25 UTC (permalink / raw)
  To: ath10k; +Cc: Ben Greear

From: Ben Greear <greearb@candelatech.com>

The failure case is something like this:

otp fails (probably because BMI timeout was too short)
so core_start fails
so the entire firmware load fails..but since it was not ever
started properly, then it is not stopped as far as PCI is concerned.
So, napi is not released.

A bit later, maybe, the driver succeeds in loading and then we have a
BUG splat.

------------[ cut here ]------------
kernel BUG at /home/greearb/git/linux-4.9.dev.y/include/linux/netdevice.h:515!
invalid opcode: 0000 [#1] PREEMPT SMP
Modules linked in: nf_conntrack_netlink nf_conntrack nfnetlink nf_defrag_ipv4 bridge ath10k_pci ath10k_core 8021q garp mrp stp llc bnep bluetooth fuse macv]
CPU: 1 PID: 3651 Comm: wpa_supplicant Not tainted 4.9.27+ #35
Hardware name: To be filled by O.E.M. To be filled by O.E.M./ChiefRiver, BIOS 4.6.5 06/07/2013
task: ffff8802111f0000 task.stack: ffffc90001fb4000
RIP: 0010:[<ffffffffa1498d33>]  [<ffffffffa1498d33>] ath10k_pci_hif_power_up+0x173/0x180 [ath10k_pci]
RSP: 0018:ffffc90001fb7c30  EFLAGS: 00010246
RAX: 0000000000000008 RBX: ffff880212bc2bc0 RCX: 0000000000082004
RDX: ffffc9000d282000 RSI: ffffc9000d282000 RDI: 000000000fd0a000
RBP: ffffc90001fb7c40 R08: 0000000000200000 R09: 0000000000000101
R10: 0000000000000d00 R11: 0000000000000003 R12: 0000000000082000
R13: ffff880212beaef8 R14: 0000000000000000 R15: ffff8802134c1118
FS:  00007f476575c800(0000) GS:ffff88021e240000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f3da950b490 CR3: 0000000212b5a000 CR4: 00000000001406e0
Stack:
 ffff880212bc2bc0 ffff880212bc0700 ffffc90001fb7c68 ffffffffa1429281
 ffff8802134c0000 ffff880212bc0700 0000000000000000 ffffc90001fb7c90
 ffffffffa07cb818 ffff8802134c0000 ffff880212bc0700 0000000000000000
Call Trace:
 [<ffffffffa1429281>] ath10k_start+0x51/0x5c0 [ath10k_core]
 [<ffffffffa07cb818>] drv_start+0x38/0x140 [mac80211]
 [<ffffffffa07e2cc5>] ieee80211_do_open+0x2c5/0x990 [mac80211]
 [<ffffffffa07e33e0>] ieee80211_open+0x50/0x60 [mac80211]
 [<ffffffff817a9f2a>] __dev_open+0xaa/0x120
 [<ffffffff817aa208>] __dev_change_flags+0x98/0x160
 [<ffffffff817aa2f4>] dev_change_flags+0x24/0x60
 [<ffffffff8182388e>] devinet_ioctl+0x5ee/0x6c0
 [<ffffffff8182535b>] inet_ioctl+0x4b/0x70
 [<ffffffff81787430>] sock_do_ioctl+0x20/0x50
 [<ffffffff81787936>] sock_ioctl+0x1d6/0x2a0
 [<ffffffff8128d24b>] do_vfs_ioctl+0x8b/0x5b0
 [<ffffffff8178adbd>] ? __sys_recvmsg+0x3d/0x70
 [<ffffffff8128d7e4>] SyS_ioctl+0x74/0x80
 [<ffffffff8188a83b>] entry_SYSCALL_64_fastpath+0x1e/0xad
Code: ff ff ff 89 c2 48 89 df 48 c7 c6 10 d3 49 a1 e8 34 1d f9 ff 48 89 df e8 2c f9 ff ff 44 89 e0 c6 83 0e 74 02 00 01 5b 41 5c 5d c3 <0f> 0b 66 66 2e 0f
RIP  [<ffffffffa1498d33>] ath10k_pci_hif_power_up+0x173/0x180 [ath10k_pci]
 RSP <ffffc90001fb7c30>
---[ end trace b6dede286ed70e39 ]---

The BUG in question is this:

/**
 *      napi_enable - enable NAPI scheduling
 *      @n: NAPI context
 *
 * Resume NAPI from being scheduled on this context.
 * Must be paired with napi_disable.
 */
static inline void napi_enable(struct napi_struct *n)
{
        BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
        smp_mb__before_atomic();
        clear_bit(NAPI_STATE_SCHED, &n->state);
        clear_bit(NAPI_STATE_NPSVC, &n->state);
}

Signed-off-by: Ben Greear <greearb@candelatech.com>
---
 drivers/net/wireless/ath/ath10k/core.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
index a715677..ad544b2 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -1806,7 +1806,7 @@ static int ath10k_core_pre_cal_config(struct ath10k *ar)
 
 	ret = ath10k_download_and_run_otp(ar);
 	if (ret) {
-		ath10k_err(ar, "failed to run otp: %d\n", ret);
+		ath10k_err(ar, "failed to run otp: %d  (pre-cal-config)\n", ret);
 		return ret;
 	}
 
@@ -1860,7 +1860,7 @@ static int ath10k_download_cal_data(struct ath10k *ar)
 
 	ret = ath10k_download_and_run_otp(ar);
 	if (ret) {
-		ath10k_err(ar, "failed to run otp: %d\n", ret);
+		ath10k_err(ar, "failed to run otp: %d (download-cal-data)\n", ret);
 		return ret;
 	}
 
@@ -2783,7 +2783,11 @@ static void ath10k_core_register_work(struct work_struct *work)
 
 	status = ath10k_core_probe_fw(ar);
 	if (status) {
-		ath10k_err(ar, "could not probe fw (%d)\n", status);
+		ath10k_err(ar, "could not probe fw (%d), releasing napi\n", status);
+		/* Make sure that NAPI state is cleaned up, it can otherwise fail
+		 * to be cleaned up in some error cases.
+		 */
+		napi_disable(&ar->napi);
 		goto err;
 	}
 
-- 
2.4.11


_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/3] ath10k:  Log when longer bmi cmds happen.
  2017-05-15 21:25 [PATCH 1/3] ath10k: Increase BMI timeout greearb
  2017-05-15 21:25 ` [PATCH 2/3] ath10k: Fix napi related crash when otp fails greearb
@ 2017-05-15 21:25 ` greearb
  2017-05-31 12:46   ` [3/3] " Kalle Valo
  2017-06-01 12:56 ` [1/3] ath10k: Increase BMI timeout Kalle Valo
  2 siblings, 1 reply; 9+ messages in thread
From: greearb @ 2017-05-15 21:25 UTC (permalink / raw)
  To: ath10k; +Cc: Ben Greear

From: Ben Greear <greearb@candelatech.com>

This lets one have a clue that maybe timeouts are happening
when we just aren't waiting long enough.

Signed-off-by: Ben Greear <greearb@candelatech.com>
---
 drivers/net/wireless/ath/ath10k/pci.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c
index 52b9ebb..0311b07 100644
--- a/drivers/net/wireless/ath/ath10k/pci.c
+++ b/drivers/net/wireless/ath/ath10k/pci.c
@@ -101,7 +101,8 @@ static int ath10k_pci_init_irq(struct ath10k *ar);
 static int ath10k_pci_deinit_irq(struct ath10k *ar);
 static int ath10k_pci_request_irq(struct ath10k *ar);
 static void ath10k_pci_free_irq(struct ath10k *ar);
-static int ath10k_pci_bmi_wait(struct ath10k_ce_pipe *tx_pipe,
+static int ath10k_pci_bmi_wait(struct ath10k *ar,
+			       struct ath10k_ce_pipe *tx_pipe,
 			       struct ath10k_ce_pipe *rx_pipe,
 			       struct bmi_xfer *xfer);
 static int ath10k_pci_qca99x0_chip_reset(struct ath10k *ar);
@@ -2156,7 +2157,7 @@ int ath10k_pci_hif_exchange_bmi_msg(struct ath10k *ar,
 	if (ret)
 		goto err_resp;
 
-	ret = ath10k_pci_bmi_wait(ce_tx, ce_rx, &xfer);
+	ret = ath10k_pci_bmi_wait(ar, ce_tx, ce_rx, &xfer);
 	if (ret) {
 		u32 unused_buffer;
 		unsigned int unused_nbytes;
@@ -2225,23 +2226,35 @@ static void ath10k_pci_bmi_recv_data(struct ath10k_ce_pipe *ce_state)
 	xfer->rx_done = true;
 }
 
-static int ath10k_pci_bmi_wait(struct ath10k_ce_pipe *tx_pipe,
+static int ath10k_pci_bmi_wait(struct ath10k *ar,
+			       struct ath10k_ce_pipe *tx_pipe,
 			       struct ath10k_ce_pipe *rx_pipe,
 			       struct bmi_xfer *xfer)
 {
 	unsigned long timeout = jiffies + BMI_COMMUNICATION_TIMEOUT_HZ;
+	unsigned long started = jiffies;
+	unsigned long dur;
+	int rv = -ETIMEDOUT;
 
 	while (time_before_eq(jiffies, timeout)) {
 		ath10k_pci_bmi_send_done(tx_pipe);
 		ath10k_pci_bmi_recv_data(rx_pipe);
 
-		if (xfer->tx_done && (xfer->rx_done == xfer->wait_for_resp))
-			return 0;
+		if (xfer->tx_done && (xfer->rx_done == xfer->wait_for_resp)) {
+			rv = 0;
+			goto out;
+		}
 
 		schedule();
 	}
 
-	return -ETIMEDOUT;
+out:
+	dur = jiffies - started;
+	if (dur > HZ)
+		ath10k_dbg(ar, ATH10K_DBG_BMI,
+			   "bmi cmd took: %lu jiffies (HZ: %d), rv: %d\n",
+			   dur, HZ, rv);
+	return rv;
 }
 
 /*
-- 
2.4.11


_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [2/3] ath10k: Fix napi related crash when otp fails.
  2017-05-15 21:25 ` [PATCH 2/3] ath10k: Fix napi related crash when otp fails greearb
@ 2017-05-31 12:27   ` Kalle Valo
  2017-05-31 13:11     ` Ben Greear
  0 siblings, 1 reply; 9+ messages in thread
From: Kalle Valo @ 2017-05-31 12:27 UTC (permalink / raw)
  To: Ben Greear; +Cc: ath10k

Ben Greear <greearb@candelatech.com> wrote:

> The failure case is something like this:
> 
> otp fails (probably because BMI timeout was too short)
> so core_start fails
> so the entire firmware load fails..but since it was not ever
> started properly, then it is not stopped as far as PCI is concerned.
> So, napi is not released.
> 
> A bit later, maybe, the driver succeeds in loading and then we have a
> BUG splat.
> 
> ------------[ cut here ]------------
> kernel BUG at /home/greearb/git/linux-4.9.dev.y/include/linux/netdevice.h:515!
> invalid opcode: 0000 [#1] PREEMPT SMP
> Modules linked in: nf_conntrack_netlink nf_conntrack nfnetlink nf_defrag_ipv4 bridge ath10k_pci ath10k_core 8021q garp mrp stp llc bnep bluetooth fuse macv]
> CPU: 1 PID: 3651 Comm: wpa_supplicant Not tainted 4.9.27+ #35
> Hardware name: To be filled by O.E.M. To be filled by O.E.M./ChiefRiver, BIOS 4.6.5 06/07/2013
> task: ffff8802111f0000 task.stack: ffffc90001fb4000
> RIP: 0010:[<ffffffffa1498d33>]  [<ffffffffa1498d33>] ath10k_pci_hif_power_up+0x173/0x180 [ath10k_pci]
> RSP: 0018:ffffc90001fb7c30  EFLAGS: 00010246
> RAX: 0000000000000008 RBX: ffff880212bc2bc0 RCX: 0000000000082004
> RDX: ffffc9000d282000 RSI: ffffc9000d282000 RDI: 000000000fd0a000
> RBP: ffffc90001fb7c40 R08: 0000000000200000 R09: 0000000000000101
> R10: 0000000000000d00 R11: 0000000000000003 R12: 0000000000082000
> R13: ffff880212beaef8 R14: 0000000000000000 R15: ffff8802134c1118
> FS:  00007f476575c800(0000) GS:ffff88021e240000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f3da950b490 CR3: 0000000212b5a000 CR4: 00000000001406e0
> Stack:
>  ffff880212bc2bc0 ffff880212bc0700 ffffc90001fb7c68 ffffffffa1429281
>  ffff8802134c0000 ffff880212bc0700 0000000000000000 ffffc90001fb7c90
>  ffffffffa07cb818 ffff8802134c0000 ffff880212bc0700 0000000000000000
> Call Trace:
>  [<ffffffffa1429281>] ath10k_start+0x51/0x5c0 [ath10k_core]
>  [<ffffffffa07cb818>] drv_start+0x38/0x140 [mac80211]
>  [<ffffffffa07e2cc5>] ieee80211_do_open+0x2c5/0x990 [mac80211]
>  [<ffffffffa07e33e0>] ieee80211_open+0x50/0x60 [mac80211]
>  [<ffffffff817a9f2a>] __dev_open+0xaa/0x120
>  [<ffffffff817aa208>] __dev_change_flags+0x98/0x160
>  [<ffffffff817aa2f4>] dev_change_flags+0x24/0x60
>  [<ffffffff8182388e>] devinet_ioctl+0x5ee/0x6c0
>  [<ffffffff8182535b>] inet_ioctl+0x4b/0x70
>  [<ffffffff81787430>] sock_do_ioctl+0x20/0x50
>  [<ffffffff81787936>] sock_ioctl+0x1d6/0x2a0
>  [<ffffffff8128d24b>] do_vfs_ioctl+0x8b/0x5b0
>  [<ffffffff8178adbd>] ? __sys_recvmsg+0x3d/0x70
>  [<ffffffff8128d7e4>] SyS_ioctl+0x74/0x80
>  [<ffffffff8188a83b>] entry_SYSCALL_64_fastpath+0x1e/0xad
> Code: ff ff ff 89 c2 48 89 df 48 c7 c6 10 d3 49 a1 e8 34 1d f9 ff 48 89 df e8 2c f9 ff ff 44 89 e0 c6 83 0e 74 02 00 01 5b 41 5c 5d c3 <0f> 0b 66 66 2e 0f
> RIP  [<ffffffffa1498d33>] ath10k_pci_hif_power_up+0x173/0x180 [ath10k_pci]
>  RSP <ffffc90001fb7c30>
> ---[ end trace b6dede286ed70e39 ]---
> 
> The BUG in question is this:
> 
> /**
>  *      napi_enable - enable NAPI scheduling
>  *      @n: NAPI context
>  *
>  * Resume NAPI from being scheduled on this context.
>  * Must be paired with napi_disable.
>  */
> static inline void napi_enable(struct napi_struct *n)
> {
>         BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
>         smp_mb__before_atomic();
>         clear_bit(NAPI_STATE_SCHED, &n->state);
>         clear_bit(NAPI_STATE_NPSVC, &n->state);
> }
> 
> Signed-off-by: Ben Greear <greearb@candelatech.com>

Doesn't the commit below fix this?

commit 1427228d5869f5804b03d47acfa4a88122572a78
Author: Kalle Valo <kvalo@qca.qualcomm.com>
Date:   Mon Feb 13 12:38:39 2017 +0200

    ath10k: fix napi crash during rmmod when probe firmware fails

-- 
https://patchwork.kernel.org/patch/9727993/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [3/3] ath10k: Log when longer bmi cmds happen.
  2017-05-15 21:25 ` [PATCH 3/3] ath10k: Log when longer bmi cmds happen greearb
@ 2017-05-31 12:46   ` Kalle Valo
  2017-05-31 13:00     ` Ben Greear
  0 siblings, 1 reply; 9+ messages in thread
From: Kalle Valo @ 2017-05-31 12:46 UTC (permalink / raw)
  To: Ben Greear; +Cc: ath10k

Ben Greear <greearb@candelatech.com> wrote:

> This lets one have a clue that maybe timeouts are happening
> when we just aren't waiting long enough.
> 
> Signed-off-by: Ben Greear <greearb@candelatech.com>
> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

I did some cosmetic changes and ret is now initialised after the loop:

https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=13e31da60b738512ab1ac4429789d79a983ae5be

-- 
https://patchwork.kernel.org/patch/9727991/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [3/3] ath10k: Log when longer bmi cmds happen.
  2017-05-31 12:46   ` [3/3] " Kalle Valo
@ 2017-05-31 13:00     ` Ben Greear
  0 siblings, 0 replies; 9+ messages in thread
From: Ben Greear @ 2017-05-31 13:00 UTC (permalink / raw)
  To: Kalle Valo; +Cc: ath10k



On 05/31/2017 05:46 AM, Kalle Valo wrote:
> Ben Greear <greearb@candelatech.com> wrote:
>
>> This lets one have a clue that maybe timeouts are happening
>> when we just aren't waiting long enough.
>>
>> Signed-off-by: Ben Greear <greearb@candelatech.com>
>> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
>
> I did some cosmetic changes and ret is now initialised after the loop:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=13e31da60b738512ab1ac4429789d79a983ae5be

Looks OK to me.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2/3] ath10k: Fix napi related crash when otp fails.
  2017-05-31 12:27   ` [2/3] " Kalle Valo
@ 2017-05-31 13:11     ` Ben Greear
  2017-06-01 13:10       ` Kalle Valo
  0 siblings, 1 reply; 9+ messages in thread
From: Ben Greear @ 2017-05-31 13:11 UTC (permalink / raw)
  To: Kalle Valo; +Cc: ath10k



On 05/31/2017 05:27 AM, Kalle Valo wrote:
> Ben Greear <greearb@candelatech.com> wrote:
>
>> The failure case is something like this:
>>
>> otp fails (probably because BMI timeout was too short)
>> so core_start fails
>> so the entire firmware load fails..but since it was not ever
>> started properly, then it is not stopped as far as PCI is concerned.
>> So, napi is not released.
>>
>> A bit later, maybe, the driver succeeds in loading and then we have a
>> BUG splat.
>>
>> ------------[ cut here ]------------
>> kernel BUG at /home/greearb/git/linux-4.9.dev.y/include/linux/netdevice.h:515!
>> invalid opcode: 0000 [#1] PREEMPT SMP
>> Modules linked in: nf_conntrack_netlink nf_conntrack nfnetlink nf_defrag_ipv4 bridge ath10k_pci ath10k_core 8021q garp mrp stp llc bnep bluetooth fuse macv]
>> CPU: 1 PID: 3651 Comm: wpa_supplicant Not tainted 4.9.27+ #35
>> Hardware name: To be filled by O.E.M. To be filled by O.E.M./ChiefRiver, BIOS 4.6.5 06/07/2013
>> task: ffff8802111f0000 task.stack: ffffc90001fb4000
>> RIP: 0010:[<ffffffffa1498d33>]  [<ffffffffa1498d33>] ath10k_pci_hif_power_up+0x173/0x180 [ath10k_pci]
>> RSP: 0018:ffffc90001fb7c30  EFLAGS: 00010246
>> RAX: 0000000000000008 RBX: ffff880212bc2bc0 RCX: 0000000000082004
>> RDX: ffffc9000d282000 RSI: ffffc9000d282000 RDI: 000000000fd0a000
>> RBP: ffffc90001fb7c40 R08: 0000000000200000 R09: 0000000000000101
>> R10: 0000000000000d00 R11: 0000000000000003 R12: 0000000000082000
>> R13: ffff880212beaef8 R14: 0000000000000000 R15: ffff8802134c1118
>> FS:  00007f476575c800(0000) GS:ffff88021e240000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f3da950b490 CR3: 0000000212b5a000 CR4: 00000000001406e0
>> Stack:
>>  ffff880212bc2bc0 ffff880212bc0700 ffffc90001fb7c68 ffffffffa1429281
>>  ffff8802134c0000 ffff880212bc0700 0000000000000000 ffffc90001fb7c90
>>  ffffffffa07cb818 ffff8802134c0000 ffff880212bc0700 0000000000000000
>> Call Trace:
>>  [<ffffffffa1429281>] ath10k_start+0x51/0x5c0 [ath10k_core]
>>  [<ffffffffa07cb818>] drv_start+0x38/0x140 [mac80211]
>>  [<ffffffffa07e2cc5>] ieee80211_do_open+0x2c5/0x990 [mac80211]
>>  [<ffffffffa07e33e0>] ieee80211_open+0x50/0x60 [mac80211]
>>  [<ffffffff817a9f2a>] __dev_open+0xaa/0x120
>>  [<ffffffff817aa208>] __dev_change_flags+0x98/0x160
>>  [<ffffffff817aa2f4>] dev_change_flags+0x24/0x60
>>  [<ffffffff8182388e>] devinet_ioctl+0x5ee/0x6c0
>>  [<ffffffff8182535b>] inet_ioctl+0x4b/0x70
>>  [<ffffffff81787430>] sock_do_ioctl+0x20/0x50
>>  [<ffffffff81787936>] sock_ioctl+0x1d6/0x2a0
>>  [<ffffffff8128d24b>] do_vfs_ioctl+0x8b/0x5b0
>>  [<ffffffff8178adbd>] ? __sys_recvmsg+0x3d/0x70
>>  [<ffffffff8128d7e4>] SyS_ioctl+0x74/0x80
>>  [<ffffffff8188a83b>] entry_SYSCALL_64_fastpath+0x1e/0xad
>> Code: ff ff ff 89 c2 48 89 df 48 c7 c6 10 d3 49 a1 e8 34 1d f9 ff 48 89 df e8 2c f9 ff ff 44 89 e0 c6 83 0e 74 02 00 01 5b 41 5c 5d c3 <0f> 0b 66 66 2e 0f
>> RIP  [<ffffffffa1498d33>] ath10k_pci_hif_power_up+0x173/0x180 [ath10k_pci]
>>  RSP <ffffc90001fb7c30>
>> ---[ end trace b6dede286ed70e39 ]---
>>
>> The BUG in question is this:
>>
>> /**
>>  *      napi_enable - enable NAPI scheduling
>>  *      @n: NAPI context
>>  *
>>  * Resume NAPI from being scheduled on this context.
>>  * Must be paired with napi_disable.
>>  */
>> static inline void napi_enable(struct napi_struct *n)
>> {
>>         BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
>>         smp_mb__before_atomic();
>>         clear_bit(NAPI_STATE_SCHED, &n->state);
>>         clear_bit(NAPI_STATE_NPSVC, &n->state);
>> }
>>
>> Signed-off-by: Ben Greear <greearb@candelatech.com>
>
> Doesn't the commit below fix this?

I don't know, but the problem in my case is that napi wasn't properly stopped,
and the patch you reference is just about starting napi earlier, right?

I do think you should make the effort to provide such important
fixes to stable trees...if you had, then it would have been in my 4.9 tree already
and it would have either fixed the problem or I would know for certain it was needed.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [1/3] ath10k: Increase BMI timeout.
  2017-05-15 21:25 [PATCH 1/3] ath10k: Increase BMI timeout greearb
  2017-05-15 21:25 ` [PATCH 2/3] ath10k: Fix napi related crash when otp fails greearb
  2017-05-15 21:25 ` [PATCH 3/3] ath10k: Log when longer bmi cmds happen greearb
@ 2017-06-01 12:56 ` Kalle Valo
  2 siblings, 0 replies; 9+ messages in thread
From: Kalle Valo @ 2017-06-01 12:56 UTC (permalink / raw)
  To: Ben Greear; +Cc: ath10k

Ben Greear <greearb@candelatech.com> wrote:

> When testing a 9888 chipset NIC, I notice it often takes
> almost 2 seconds, and then many times OTP fails, probably due
> to the two-second timeout.
> 
> [ 2269.841842] ath10k_pci 0000:05:00.0: bmi cmd took: 1984 jiffies (HZ: 1000), rv: 0
> [ 2273.608185] ath10k_pci 0000:05:00.0: bmi cmd took: 1986 jiffies (HZ: 1000), rv: 0
> [ 2277.294732] ath10k_pci 0000:05:00.0: bmi cmd took: 1989 jiffies (HZ: 1000), rv: 0
> 
> So, increase the BMI timeout to 3 seconds.
> 
> Signed-off-by: Ben Greear <greearb@candelatech.com>
> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

2 patches applied to ath-next branch of ath.git, thanks.

13c3b26a71df ath10k: increase BMI timeout
6bb099b088c7 ath10k: log when longer bmi cmds happen

-- 
https://patchwork.kernel.org/patch/9727989/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2/3] ath10k: Fix napi related crash when otp fails.
  2017-05-31 13:11     ` Ben Greear
@ 2017-06-01 13:10       ` Kalle Valo
  0 siblings, 0 replies; 9+ messages in thread
From: Kalle Valo @ 2017-06-01 13:10 UTC (permalink / raw)
  To: Ben Greear; +Cc: ath10k

Ben Greear <greearb@candelatech.com> writes:

>> Doesn't the commit below fix this?
>
> I don't know, but the problem in my case is that napi wasn't properly stopped,
> and the patch you reference is just about starting napi earlier, right?

IIRC I moved it to a different function which was called later. This was
just to match the symmetry with napi_disable() and fix the error
handling.

-- 
Kalle Valo
_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-06-01 13:11 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-15 21:25 [PATCH 1/3] ath10k: Increase BMI timeout greearb
2017-05-15 21:25 ` [PATCH 2/3] ath10k: Fix napi related crash when otp fails greearb
2017-05-31 12:27   ` [2/3] " Kalle Valo
2017-05-31 13:11     ` Ben Greear
2017-06-01 13:10       ` Kalle Valo
2017-05-15 21:25 ` [PATCH 3/3] ath10k: Log when longer bmi cmds happen greearb
2017-05-31 12:46   ` [3/3] " Kalle Valo
2017-05-31 13:00     ` Ben Greear
2017-06-01 12:56 ` [1/3] ath10k: Increase BMI timeout Kalle Valo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.