All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] ath10k: speed up recovery
@ 2014-10-09  9:38 ` Michal Kazior
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Kazior @ 2014-10-09  9:38 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

Hi,

Patch #1 fixes a bug I've found while testing
patches #2 and #3. It's safe to cherry-pick it in
case patches #2 and/or #3 aren't accepted or need
a re-spin.

I've mainly used patch #2 to test reset and
recovery. It's pretty handy for (stress-)testing.

Patch #3 should improve recovery speed in some
cases. Notably when there's a long chain of WMI
commands involved which take a painfully long time
to timeout/complete if firmware crashes mid-way.

Note: Patches #2 and #3 depend on `ath10k: pci
related fixes 2014-10-09`. Without the patchset
patches #2 and #3 must not be applied. Patch #1 is
fine though.


Michal Kazior (3):
  ath10k: fix possible bmi crash
  ath10k: expose hw restart via debugfs
  ath10k: speed up hw recovery

 drivers/net/wireless/ath/ath10k/ce.c    |  7 +++++++
 drivers/net/wireless/ath/ath10k/core.c  | 16 ++++++++++++++++
 drivers/net/wireless/ath/ath10k/core.h  |  5 +++++
 drivers/net/wireless/ath/ath10k/debug.c |  7 ++++++-
 drivers/net/wireless/ath/ath10k/mac.c   | 14 ++++++++++++--
 drivers/net/wireless/ath/ath10k/mac.h   |  1 +
 drivers/net/wireless/ath/ath10k/pci.c   |  3 +++
 drivers/net/wireless/ath/ath10k/txrx.c  |  3 ++-
 drivers/net/wireless/ath/ath10k/wmi.c   |  4 ++++
 9 files changed, 56 insertions(+), 4 deletions(-)

-- 
1.8.5.3


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 0/3] ath10k: speed up recovery
@ 2014-10-09  9:38 ` Michal Kazior
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Kazior @ 2014-10-09  9:38 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

Hi,

Patch #1 fixes a bug I've found while testing
patches #2 and #3. It's safe to cherry-pick it in
case patches #2 and/or #3 aren't accepted or need
a re-spin.

I've mainly used patch #2 to test reset and
recovery. It's pretty handy for (stress-)testing.

Patch #3 should improve recovery speed in some
cases. Notably when there's a long chain of WMI
commands involved which take a painfully long time
to timeout/complete if firmware crashes mid-way.

Note: Patches #2 and #3 depend on `ath10k: pci
related fixes 2014-10-09`. Without the patchset
patches #2 and #3 must not be applied. Patch #1 is
fine though.


Michal Kazior (3):
  ath10k: fix possible bmi crash
  ath10k: expose hw restart via debugfs
  ath10k: speed up hw recovery

 drivers/net/wireless/ath/ath10k/ce.c    |  7 +++++++
 drivers/net/wireless/ath/ath10k/core.c  | 16 ++++++++++++++++
 drivers/net/wireless/ath/ath10k/core.h  |  5 +++++
 drivers/net/wireless/ath/ath10k/debug.c |  7 ++++++-
 drivers/net/wireless/ath/ath10k/mac.c   | 14 ++++++++++++--
 drivers/net/wireless/ath/ath10k/mac.h   |  1 +
 drivers/net/wireless/ath/ath10k/pci.c   |  3 +++
 drivers/net/wireless/ath/ath10k/txrx.c  |  3 ++-
 drivers/net/wireless/ath/ath10k/wmi.c   |  4 ++++
 9 files changed, 56 insertions(+), 4 deletions(-)

-- 
1.8.5.3


_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/3] ath10k: fix possible bmi crash
  2014-10-09  9:38 ` Michal Kazior
@ 2014-10-09  9:38   ` Michal Kazior
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Kazior @ 2014-10-09  9:38 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

While testing other things I've found that CE
items aren't cleared properly. This could lead to
null dereferences in BMI.

To prevent that make sure CE revoking clears the
nbytes value (which is used as a buffer completion
indication) and memset the entire CE rings when
(re)initializing.

Also make sure to check BMI xfer pointer and print
a splat instead of crashing the kernel.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
---
 drivers/net/wireless/ath/ath10k/ce.c  | 7 +++++++
 drivers/net/wireless/ath/ath10k/pci.c | 3 +++
 2 files changed, 10 insertions(+)

diff --git a/drivers/net/wireless/ath/ath10k/ce.c b/drivers/net/wireless/ath/ath10k/ce.c
index 4745ef2..134dea5 100644
--- a/drivers/net/wireless/ath/ath10k/ce.c
+++ b/drivers/net/wireless/ath/ath10k/ce.c
@@ -558,6 +558,7 @@ int ath10k_ce_revoke_recv_next(struct ath10k_ce_pipe *ce_state,
 
 		/* sanity */
 		dest_ring->per_transfer_context[sw_index] = NULL;
+		desc->nbytes = 0;
 
 		/* Update sw_index */
 		sw_index = CE_RING_IDX_INCR(nentries_mask, sw_index);
@@ -837,6 +838,10 @@ static int ath10k_ce_init_src_ring(struct ath10k *ar,
 
 	memset(src_ring->per_transfer_context, 0,
 	       nentries * sizeof(*src_ring->per_transfer_context));
+	memset(src_ring->base_addr_owner_space, 0,
+	       nentries * sizeof(struct ce_desc));
+	memset(src_ring->shadow_base, 0,
+	       nentries * sizeof(struct ce_desc));
 
 	src_ring->sw_index = ath10k_ce_src_ring_read_index_get(ar, ctrl_addr);
 	src_ring->sw_index &= src_ring->nentries_mask;
@@ -874,6 +879,8 @@ static int ath10k_ce_init_dest_ring(struct ath10k *ar,
 
 	memset(dest_ring->per_transfer_context, 0,
 	       nentries * sizeof(*dest_ring->per_transfer_context));
+	memset(dest_ring->base_addr_owner_space, 0,
+	       nentries * sizeof(struct ce_desc));
 
 	dest_ring->sw_index = ath10k_ce_dest_ring_read_index_get(ar, ctrl_addr);
 	dest_ring->sw_index &= dest_ring->nentries_mask;
diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c
index 6e8e098..ca2adbd 100644
--- a/drivers/net/wireless/ath/ath10k/pci.c
+++ b/drivers/net/wireless/ath/ath10k/pci.c
@@ -1428,6 +1428,9 @@ static void ath10k_pci_bmi_recv_data(struct ath10k_ce_pipe *ce_state)
 					  &nbytes, &transfer_id, &flags))
 		return;
 
+	if (WARN_ON_ONCE(!xfer))
+		return;
+
 	if (!xfer->wait_for_resp) {
 		ath10k_warn(ar, "unexpected: BMI data received; ignoring\n");
 		return;
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 1/3] ath10k: fix possible bmi crash
@ 2014-10-09  9:38   ` Michal Kazior
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Kazior @ 2014-10-09  9:38 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

While testing other things I've found that CE
items aren't cleared properly. This could lead to
null dereferences in BMI.

To prevent that make sure CE revoking clears the
nbytes value (which is used as a buffer completion
indication) and memset the entire CE rings when
(re)initializing.

Also make sure to check BMI xfer pointer and print
a splat instead of crashing the kernel.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
---
 drivers/net/wireless/ath/ath10k/ce.c  | 7 +++++++
 drivers/net/wireless/ath/ath10k/pci.c | 3 +++
 2 files changed, 10 insertions(+)

diff --git a/drivers/net/wireless/ath/ath10k/ce.c b/drivers/net/wireless/ath/ath10k/ce.c
index 4745ef2..134dea5 100644
--- a/drivers/net/wireless/ath/ath10k/ce.c
+++ b/drivers/net/wireless/ath/ath10k/ce.c
@@ -558,6 +558,7 @@ int ath10k_ce_revoke_recv_next(struct ath10k_ce_pipe *ce_state,
 
 		/* sanity */
 		dest_ring->per_transfer_context[sw_index] = NULL;
+		desc->nbytes = 0;
 
 		/* Update sw_index */
 		sw_index = CE_RING_IDX_INCR(nentries_mask, sw_index);
@@ -837,6 +838,10 @@ static int ath10k_ce_init_src_ring(struct ath10k *ar,
 
 	memset(src_ring->per_transfer_context, 0,
 	       nentries * sizeof(*src_ring->per_transfer_context));
+	memset(src_ring->base_addr_owner_space, 0,
+	       nentries * sizeof(struct ce_desc));
+	memset(src_ring->shadow_base, 0,
+	       nentries * sizeof(struct ce_desc));
 
 	src_ring->sw_index = ath10k_ce_src_ring_read_index_get(ar, ctrl_addr);
 	src_ring->sw_index &= src_ring->nentries_mask;
@@ -874,6 +879,8 @@ static int ath10k_ce_init_dest_ring(struct ath10k *ar,
 
 	memset(dest_ring->per_transfer_context, 0,
 	       nentries * sizeof(*dest_ring->per_transfer_context));
+	memset(dest_ring->base_addr_owner_space, 0,
+	       nentries * sizeof(struct ce_desc));
 
 	dest_ring->sw_index = ath10k_ce_dest_ring_read_index_get(ar, ctrl_addr);
 	dest_ring->sw_index &= dest_ring->nentries_mask;
diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c
index 6e8e098..ca2adbd 100644
--- a/drivers/net/wireless/ath/ath10k/pci.c
+++ b/drivers/net/wireless/ath/ath10k/pci.c
@@ -1428,6 +1428,9 @@ static void ath10k_pci_bmi_recv_data(struct ath10k_ce_pipe *ce_state)
 					  &nbytes, &transfer_id, &flags))
 		return;
 
+	if (WARN_ON_ONCE(!xfer))
+		return;
+
 	if (!xfer->wait_for_resp) {
 		ath10k_warn(ar, "unexpected: BMI data received; ignoring\n");
 		return;
-- 
1.8.5.3


_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/3] ath10k: expose hw restart via debugfs
  2014-10-09  9:38 ` Michal Kazior
@ 2014-10-09  9:38   ` Michal Kazior
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Kazior @ 2014-10-09  9:38 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

Until now it was possible to simulate soft and
hard fw crashes but it wasn't possible to trigger
an immediately hw restart itself (without the fw
crash).

This can be useful when stress testing hw
restarting stability, e.g. during heavy tx/rx
traffic.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
---
 drivers/net/wireless/ath/ath10k/debug.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/debug.c b/drivers/net/wireless/ath/ath10k/debug.c
index 0d94feb..20f6719 100644
--- a/drivers/net/wireless/ath/ath10k/debug.c
+++ b/drivers/net/wireless/ath/ath10k/debug.c
@@ -694,7 +694,8 @@ static ssize_t ath10k_read_simulate_fw_crash(struct file *file,
 		"To simulate firmware crash write one of the keywords to this file:\n"
 		"`soft` - this will send WMI_FORCE_FW_HANG_ASSERT to firmware if FW supports that command.\n"
 		"`hard` - this will send to firmware command with illegal parameters causing firmware crash.\n"
-		"`assert` - this will send special illegal parameter to firmware to cause assert failure and crash.\n";
+		"`assert` - this will send special illegal parameter to firmware to cause assert failure and crash.\n"
+		"`request` - this will simply queue hw restart without fw/hw actually crashing.\n";
 
 	return simple_read_from_buffer(user_buf, count, ppos, buf, strlen(buf));
 }
@@ -747,6 +748,10 @@ static ssize_t ath10k_write_simulate_fw_crash(struct file *file,
 	} else if (!strcmp(buf, "assert")) {
 		ath10k_info(ar, "simulating firmware assert crash\n");
 		ret = ath10k_debug_fw_assert(ar);
+	} else if (!strcmp(buf, "request")) {
+		ath10k_info(ar, "user requested hw restart\n");
+		queue_work(ar->workqueue, &ar->restart_work);
+		ret = 0;
 	} else {
 		ret = -EINVAL;
 		goto exit;
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/3] ath10k: expose hw restart via debugfs
@ 2014-10-09  9:38   ` Michal Kazior
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Kazior @ 2014-10-09  9:38 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

Until now it was possible to simulate soft and
hard fw crashes but it wasn't possible to trigger
an immediately hw restart itself (without the fw
crash).

This can be useful when stress testing hw
restarting stability, e.g. during heavy tx/rx
traffic.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
---
 drivers/net/wireless/ath/ath10k/debug.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/debug.c b/drivers/net/wireless/ath/ath10k/debug.c
index 0d94feb..20f6719 100644
--- a/drivers/net/wireless/ath/ath10k/debug.c
+++ b/drivers/net/wireless/ath/ath10k/debug.c
@@ -694,7 +694,8 @@ static ssize_t ath10k_read_simulate_fw_crash(struct file *file,
 		"To simulate firmware crash write one of the keywords to this file:\n"
 		"`soft` - this will send WMI_FORCE_FW_HANG_ASSERT to firmware if FW supports that command.\n"
 		"`hard` - this will send to firmware command with illegal parameters causing firmware crash.\n"
-		"`assert` - this will send special illegal parameter to firmware to cause assert failure and crash.\n";
+		"`assert` - this will send special illegal parameter to firmware to cause assert failure and crash.\n"
+		"`request` - this will simply queue hw restart without fw/hw actually crashing.\n";
 
 	return simple_read_from_buffer(user_buf, count, ppos, buf, strlen(buf));
 }
@@ -747,6 +748,10 @@ static ssize_t ath10k_write_simulate_fw_crash(struct file *file,
 	} else if (!strcmp(buf, "assert")) {
 		ath10k_info(ar, "simulating firmware assert crash\n");
 		ret = ath10k_debug_fw_assert(ar);
+	} else if (!strcmp(buf, "request")) {
+		ath10k_info(ar, "user requested hw restart\n");
+		queue_work(ar->workqueue, &ar->restart_work);
+		ret = 0;
 	} else {
 		ret = -EINVAL;
 		goto exit;
-- 
1.8.5.3


_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/3] ath10k: speed up hw recovery
  2014-10-09  9:38 ` Michal Kazior
@ 2014-10-09  9:38   ` Michal Kazior
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Kazior @ 2014-10-09  9:38 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

In some cases hw recovery was taking an absurdly
long time due to ath10k waiting for things that
would never really complete.

Instead of waiting for inevitable timeouts poke
all completions and wakequeues and check if it's
still worth waiting.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
---
 drivers/net/wireless/ath/ath10k/core.c | 16 ++++++++++++++++
 drivers/net/wireless/ath/ath10k/core.h |  5 +++++
 drivers/net/wireless/ath/ath10k/mac.c  | 14 ++++++++++++--
 drivers/net/wireless/ath/ath10k/mac.h  |  1 +
 drivers/net/wireless/ath/ath10k/txrx.c |  3 ++-
 drivers/net/wireless/ath/ath10k/wmi.c  |  4 ++++
 6 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
index 5000348..11c213f 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -685,6 +685,20 @@ static void ath10k_core_restart(struct work_struct *work)
 {
 	struct ath10k *ar = container_of(work, struct ath10k, restart_work);
 
+	set_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags);
+	barrier();
+	ieee80211_stop_queues(ar->hw);
+	ath10k_drain_tx(ar);
+	complete_all(&ar->scan.started);
+	complete_all(&ar->scan.completed);
+	complete_all(&ar->scan.on_channel);
+	complete_all(&ar->offchan_tx_completed);
+	complete_all(&ar->install_key_done);
+	complete_all(&ar->vdev_setup_done);
+	wake_up(&ar->htt.empty_tx_wq);
+	wake_up(&ar->wmi.tx_credits_wq);
+	wake_up(&ar->peer_mapping_wq);
+
 	mutex_lock(&ar->conf_mutex);
 
 	switch (ar->state) {
@@ -722,6 +736,8 @@ int ath10k_core_start(struct ath10k *ar, enum ath10k_firmware_mode mode)
 
 	lockdep_assert_held(&ar->conf_mutex);
 
+	clear_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags);
+
 	ath10k_bmi_start(ar);
 
 	if (ath10k_init_configure_target(ar)) {
diff --git a/drivers/net/wireless/ath/ath10k/core.h b/drivers/net/wireless/ath/ath10k/core.h
index 1ac2f14..528802d 100644
--- a/drivers/net/wireless/ath/ath10k/core.h
+++ b/drivers/net/wireless/ath/ath10k/core.h
@@ -371,6 +371,11 @@ enum ath10k_dev_flags {
 	/* Indicates that ath10k device is during CAC phase of DFS */
 	ATH10K_CAC_RUNNING,
 	ATH10K_FLAG_CORE_REGISTERED,
+
+	/* Device has crashed and needs to restart. This indicates any pending
+	 * waiters should immediately cancel instead of waiting for a time out.
+	 */
+	ATH10K_FLAG_CRASH_FLUSH,
 };
 
 enum ath10k_scan_state {
diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 001ff1a..f6b4db0 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -519,6 +519,9 @@ static inline int ath10k_vdev_setup_sync(struct ath10k *ar)
 
 	lockdep_assert_held(&ar->conf_mutex);
 
+	if (test_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags))
+		return -ESHUTDOWN;
+
 	ret = wait_for_completion_timeout(&ar->vdev_setup_done,
 					  ATH10K_VDEV_SETUP_TIMEOUT_HZ);
 	if (ret == 0)
@@ -551,6 +554,8 @@ static int ath10k_monitor_vdev_start(struct ath10k *ar, int vdev_id)
 	arg.channel.max_reg_power = channel->max_reg_power * 2;
 	arg.channel.max_antenna_gain = channel->max_antenna_gain * 2;
 
+	reinit_completion(&ar->vdev_setup_done);
+
 	ret = ath10k_wmi_vdev_start(ar, &arg);
 	if (ret) {
 		ath10k_warn(ar, "failed to request monitor vdev %i start: %d\n",
@@ -598,6 +603,8 @@ static int ath10k_monitor_vdev_stop(struct ath10k *ar)
 		ath10k_warn(ar, "failed to put down monitor vdev %i: %d\n",
 			    ar->monitor_vdev_id, ret);
 
+	reinit_completion(&ar->vdev_setup_done);
+
 	ret = ath10k_wmi_vdev_stop(ar, ar->monitor_vdev_id);
 	if (ret)
 		ath10k_warn(ar, "failed to to request monitor vdev %i stop: %d\n",
@@ -2362,7 +2369,7 @@ static void ath10k_tx(struct ieee80211_hw *hw,
 }
 
 /* Must not be called with conf_mutex held as workers can use that also. */
-static void ath10k_drain_tx(struct ath10k *ar)
+void ath10k_drain_tx(struct ath10k *ar)
 {
 	/* make sure rcu-protected mac80211 tx path itself is drained */
 	synchronize_net();
@@ -3883,7 +3890,9 @@ static void ath10k_flush(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
 			empty = (ar->htt.num_pending_tx == 0);
 			spin_unlock_bh(&ar->htt.tx_lock);
 
-			skip = (ar->state == ATH10K_STATE_WEDGED);
+			skip = (ar->state == ATH10K_STATE_WEDGED) ||
+			       test_bit(ATH10K_FLAG_CRASH_FLUSH,
+					&ar->dev_flags);
 
 			(empty || skip);
 		}), ATH10K_FLUSH_TIMEOUT_HZ);
@@ -3980,6 +3989,7 @@ static void ath10k_restart_complete(struct ieee80211_hw *hw)
 	if (ar->state == ATH10K_STATE_RESTARTED) {
 		ath10k_info(ar, "device successfully recovered\n");
 		ar->state = ATH10K_STATE_ON;
+		ieee80211_wake_queues(ar->hw);
 	}
 
 	mutex_unlock(&ar->conf_mutex);
diff --git a/drivers/net/wireless/ath/ath10k/mac.h b/drivers/net/wireless/ath/ath10k/mac.h
index 965c511..4e3c989 100644
--- a/drivers/net/wireless/ath/ath10k/mac.h
+++ b/drivers/net/wireless/ath/ath10k/mac.h
@@ -40,6 +40,7 @@ void ath10k_mgmt_over_wmi_tx_purge(struct ath10k *ar);
 void ath10k_mgmt_over_wmi_tx_work(struct work_struct *work);
 void ath10k_halt(struct ath10k *ar);
 void ath10k_mac_vif_beacon_free(struct ath10k_vif *arvif);
+void ath10k_drain_tx(struct ath10k *ar);
 
 static inline struct ath10k_vif *ath10k_vif_to_arvif(struct ieee80211_vif *vif)
 {
diff --git a/drivers/net/wireless/ath/ath10k/txrx.c b/drivers/net/wireless/ath/ath10k/txrx.c
index f9c90e3..7579de8 100644
--- a/drivers/net/wireless/ath/ath10k/txrx.c
+++ b/drivers/net/wireless/ath/ath10k/txrx.c
@@ -146,7 +146,8 @@ static int ath10k_wait_for_peer_common(struct ath10k *ar, int vdev_id,
 			mapped = !!ath10k_peer_find(ar, vdev_id, addr);
 			spin_unlock_bh(&ar->data_lock);
 
-			mapped == expect_mapped;
+			(mapped == expect_mapped ||
+			 test_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags));
 		}), 3*HZ);
 
 	if (ret <= 0)
diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
index 829fccf..0adfdb8 100644
--- a/drivers/net/wireless/ath/ath10k/wmi.c
+++ b/drivers/net/wireless/ath/ath10k/wmi.c
@@ -779,6 +779,10 @@ int ath10k_wmi_cmd_send(struct ath10k *ar, struct sk_buff *skb, u32 cmd_id)
 		ath10k_wmi_tx_beacons_nowait(ar);
 
 		ret = ath10k_wmi_cmd_send_nowait(ar, skb, cmd_id);
+
+		if (ret && test_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags))
+			ret = -ESHUTDOWN;
+
 		(ret != -EAGAIN);
 	}), 3*HZ);
 
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/3] ath10k: speed up hw recovery
@ 2014-10-09  9:38   ` Michal Kazior
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Kazior @ 2014-10-09  9:38 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

In some cases hw recovery was taking an absurdly
long time due to ath10k waiting for things that
would never really complete.

Instead of waiting for inevitable timeouts poke
all completions and wakequeues and check if it's
still worth waiting.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
---
 drivers/net/wireless/ath/ath10k/core.c | 16 ++++++++++++++++
 drivers/net/wireless/ath/ath10k/core.h |  5 +++++
 drivers/net/wireless/ath/ath10k/mac.c  | 14 ++++++++++++--
 drivers/net/wireless/ath/ath10k/mac.h  |  1 +
 drivers/net/wireless/ath/ath10k/txrx.c |  3 ++-
 drivers/net/wireless/ath/ath10k/wmi.c  |  4 ++++
 6 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
index 5000348..11c213f 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -685,6 +685,20 @@ static void ath10k_core_restart(struct work_struct *work)
 {
 	struct ath10k *ar = container_of(work, struct ath10k, restart_work);
 
+	set_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags);
+	barrier();
+	ieee80211_stop_queues(ar->hw);
+	ath10k_drain_tx(ar);
+	complete_all(&ar->scan.started);
+	complete_all(&ar->scan.completed);
+	complete_all(&ar->scan.on_channel);
+	complete_all(&ar->offchan_tx_completed);
+	complete_all(&ar->install_key_done);
+	complete_all(&ar->vdev_setup_done);
+	wake_up(&ar->htt.empty_tx_wq);
+	wake_up(&ar->wmi.tx_credits_wq);
+	wake_up(&ar->peer_mapping_wq);
+
 	mutex_lock(&ar->conf_mutex);
 
 	switch (ar->state) {
@@ -722,6 +736,8 @@ int ath10k_core_start(struct ath10k *ar, enum ath10k_firmware_mode mode)
 
 	lockdep_assert_held(&ar->conf_mutex);
 
+	clear_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags);
+
 	ath10k_bmi_start(ar);
 
 	if (ath10k_init_configure_target(ar)) {
diff --git a/drivers/net/wireless/ath/ath10k/core.h b/drivers/net/wireless/ath/ath10k/core.h
index 1ac2f14..528802d 100644
--- a/drivers/net/wireless/ath/ath10k/core.h
+++ b/drivers/net/wireless/ath/ath10k/core.h
@@ -371,6 +371,11 @@ enum ath10k_dev_flags {
 	/* Indicates that ath10k device is during CAC phase of DFS */
 	ATH10K_CAC_RUNNING,
 	ATH10K_FLAG_CORE_REGISTERED,
+
+	/* Device has crashed and needs to restart. This indicates any pending
+	 * waiters should immediately cancel instead of waiting for a time out.
+	 */
+	ATH10K_FLAG_CRASH_FLUSH,
 };
 
 enum ath10k_scan_state {
diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 001ff1a..f6b4db0 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -519,6 +519,9 @@ static inline int ath10k_vdev_setup_sync(struct ath10k *ar)
 
 	lockdep_assert_held(&ar->conf_mutex);
 
+	if (test_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags))
+		return -ESHUTDOWN;
+
 	ret = wait_for_completion_timeout(&ar->vdev_setup_done,
 					  ATH10K_VDEV_SETUP_TIMEOUT_HZ);
 	if (ret == 0)
@@ -551,6 +554,8 @@ static int ath10k_monitor_vdev_start(struct ath10k *ar, int vdev_id)
 	arg.channel.max_reg_power = channel->max_reg_power * 2;
 	arg.channel.max_antenna_gain = channel->max_antenna_gain * 2;
 
+	reinit_completion(&ar->vdev_setup_done);
+
 	ret = ath10k_wmi_vdev_start(ar, &arg);
 	if (ret) {
 		ath10k_warn(ar, "failed to request monitor vdev %i start: %d\n",
@@ -598,6 +603,8 @@ static int ath10k_monitor_vdev_stop(struct ath10k *ar)
 		ath10k_warn(ar, "failed to put down monitor vdev %i: %d\n",
 			    ar->monitor_vdev_id, ret);
 
+	reinit_completion(&ar->vdev_setup_done);
+
 	ret = ath10k_wmi_vdev_stop(ar, ar->monitor_vdev_id);
 	if (ret)
 		ath10k_warn(ar, "failed to to request monitor vdev %i stop: %d\n",
@@ -2362,7 +2369,7 @@ static void ath10k_tx(struct ieee80211_hw *hw,
 }
 
 /* Must not be called with conf_mutex held as workers can use that also. */
-static void ath10k_drain_tx(struct ath10k *ar)
+void ath10k_drain_tx(struct ath10k *ar)
 {
 	/* make sure rcu-protected mac80211 tx path itself is drained */
 	synchronize_net();
@@ -3883,7 +3890,9 @@ static void ath10k_flush(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
 			empty = (ar->htt.num_pending_tx == 0);
 			spin_unlock_bh(&ar->htt.tx_lock);
 
-			skip = (ar->state == ATH10K_STATE_WEDGED);
+			skip = (ar->state == ATH10K_STATE_WEDGED) ||
+			       test_bit(ATH10K_FLAG_CRASH_FLUSH,
+					&ar->dev_flags);
 
 			(empty || skip);
 		}), ATH10K_FLUSH_TIMEOUT_HZ);
@@ -3980,6 +3989,7 @@ static void ath10k_restart_complete(struct ieee80211_hw *hw)
 	if (ar->state == ATH10K_STATE_RESTARTED) {
 		ath10k_info(ar, "device successfully recovered\n");
 		ar->state = ATH10K_STATE_ON;
+		ieee80211_wake_queues(ar->hw);
 	}
 
 	mutex_unlock(&ar->conf_mutex);
diff --git a/drivers/net/wireless/ath/ath10k/mac.h b/drivers/net/wireless/ath/ath10k/mac.h
index 965c511..4e3c989 100644
--- a/drivers/net/wireless/ath/ath10k/mac.h
+++ b/drivers/net/wireless/ath/ath10k/mac.h
@@ -40,6 +40,7 @@ void ath10k_mgmt_over_wmi_tx_purge(struct ath10k *ar);
 void ath10k_mgmt_over_wmi_tx_work(struct work_struct *work);
 void ath10k_halt(struct ath10k *ar);
 void ath10k_mac_vif_beacon_free(struct ath10k_vif *arvif);
+void ath10k_drain_tx(struct ath10k *ar);
 
 static inline struct ath10k_vif *ath10k_vif_to_arvif(struct ieee80211_vif *vif)
 {
diff --git a/drivers/net/wireless/ath/ath10k/txrx.c b/drivers/net/wireless/ath/ath10k/txrx.c
index f9c90e3..7579de8 100644
--- a/drivers/net/wireless/ath/ath10k/txrx.c
+++ b/drivers/net/wireless/ath/ath10k/txrx.c
@@ -146,7 +146,8 @@ static int ath10k_wait_for_peer_common(struct ath10k *ar, int vdev_id,
 			mapped = !!ath10k_peer_find(ar, vdev_id, addr);
 			spin_unlock_bh(&ar->data_lock);
 
-			mapped == expect_mapped;
+			(mapped == expect_mapped ||
+			 test_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags));
 		}), 3*HZ);
 
 	if (ret <= 0)
diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
index 829fccf..0adfdb8 100644
--- a/drivers/net/wireless/ath/ath10k/wmi.c
+++ b/drivers/net/wireless/ath/ath10k/wmi.c
@@ -779,6 +779,10 @@ int ath10k_wmi_cmd_send(struct ath10k *ar, struct sk_buff *skb, u32 cmd_id)
 		ath10k_wmi_tx_beacons_nowait(ar);
 
 		ret = ath10k_wmi_cmd_send_nowait(ar, skb, cmd_id);
+
+		if (ret && test_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags))
+			ret = -ESHUTDOWN;
+
 		(ret != -EAGAIN);
 	}), 3*HZ);
 
-- 
1.8.5.3


_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ath10k: speed up hw recovery
  2014-10-09  9:38   ` Michal Kazior
@ 2014-10-10  6:56     ` Michal Kazior
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Kazior @ 2014-10-10  6:56 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

On 9 October 2014 11:38, Michal Kazior <michal.kazior@tieto.com> wrote:
> In some cases hw recovery was taking an absurdly
> long time due to ath10k waiting for things that
> would never really complete.
>
> Instead of waiting for inevitable timeouts poke
> all completions and wakequeues and check if it's
> still worth waiting.
>
> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
> ---
[...]
> @@ -685,6 +685,20 @@ static void ath10k_core_restart(struct work_struct *work)
>  {
>         struct ath10k *ar = container_of(work, struct ath10k, restart_work);
>
> +       set_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags);
> +       barrier();
> +       ieee80211_stop_queues(ar->hw);
> +       ath10k_drain_tx(ar);
> +       complete_all(&ar->scan.started);
> +       complete_all(&ar->scan.completed);
> +       complete_all(&ar->scan.on_channel);
> +       complete_all(&ar->offchan_tx_completed);
> +       complete_all(&ar->install_key_done);
> +       complete_all(&ar->vdev_setup_done);

> +       wake_up(&ar->htt.empty_tx_wq);
> +       wake_up(&ar->wmi.tx_credits_wq);
> +       wake_up(&ar->peer_mapping_wq);

Janusz found a problem with waitqueues. I'll need to move waitqueue
initialization to core_create or else if, e.g. fw probing fails
there's a chance a spurious interrupt will cause the kernel to deref
an invalid pointer.

I'll post a v2 later.


Michał

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ath10k: speed up hw recovery
@ 2014-10-10  6:56     ` Michal Kazior
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Kazior @ 2014-10-10  6:56 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

On 9 October 2014 11:38, Michal Kazior <michal.kazior@tieto.com> wrote:
> In some cases hw recovery was taking an absurdly
> long time due to ath10k waiting for things that
> would never really complete.
>
> Instead of waiting for inevitable timeouts poke
> all completions and wakequeues and check if it's
> still worth waiting.
>
> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
> ---
[...]
> @@ -685,6 +685,20 @@ static void ath10k_core_restart(struct work_struct *work)
>  {
>         struct ath10k *ar = container_of(work, struct ath10k, restart_work);
>
> +       set_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags);
> +       barrier();
> +       ieee80211_stop_queues(ar->hw);
> +       ath10k_drain_tx(ar);
> +       complete_all(&ar->scan.started);
> +       complete_all(&ar->scan.completed);
> +       complete_all(&ar->scan.on_channel);
> +       complete_all(&ar->offchan_tx_completed);
> +       complete_all(&ar->install_key_done);
> +       complete_all(&ar->vdev_setup_done);

> +       wake_up(&ar->htt.empty_tx_wq);
> +       wake_up(&ar->wmi.tx_credits_wq);
> +       wake_up(&ar->peer_mapping_wq);

Janusz found a problem with waitqueues. I'll need to move waitqueue
initialization to core_create or else if, e.g. fw probing fails
there's a chance a spurious interrupt will cause the kernel to deref
an invalid pointer.

I'll post a v2 later.


Michał

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] ath10k: expose hw restart via debugfs
  2014-10-09  9:38   ` Michal Kazior
@ 2014-10-13  8:07     ` Kalle Valo
  -1 siblings, 0 replies; 20+ messages in thread
From: Kalle Valo @ 2014-10-13  8:07 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k, linux-wireless

Michal Kazior <michal.kazior@tieto.com> writes:

> Until now it was possible to simulate soft and
> hard fw crashes but it wasn't possible to trigger
> an immediately hw restart itself (without the fw
> crash).
>
> This can be useful when stress testing hw
> restarting stability, e.g. during heavy tx/rx
> traffic.
>
> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>

Nice idea!

> --- a/drivers/net/wireless/ath/ath10k/debug.c
> +++ b/drivers/net/wireless/ath/ath10k/debug.c
> @@ -694,7 +694,8 @@ static ssize_t ath10k_read_simulate_fw_crash(struct file *file,
>  		"To simulate firmware crash write one of the keywords to this file:\n"
>  		"`soft` - this will send WMI_FORCE_FW_HANG_ASSERT to firmware if FW supports that command.\n"
>  		"`hard` - this will send to firmware command with illegal parameters causing firmware crash.\n"
> -		"`assert` - this will send special illegal parameter to firmware to cause assert failure and crash.\n";
> +		"`assert` - this will send special illegal parameter to firmware to cause assert failure and crash.\n"
> +		"`request` - this will simply queue hw restart without fw/hw actually crashing.\n";

"request" is not really very descriptive command. Maybe call it
"hw-restart" because that's what it really does, right?

-- 
Kalle Valo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] ath10k: expose hw restart via debugfs
@ 2014-10-13  8:07     ` Kalle Valo
  0 siblings, 0 replies; 20+ messages in thread
From: Kalle Valo @ 2014-10-13  8:07 UTC (permalink / raw)
  To: Michal Kazior; +Cc: linux-wireless, ath10k

Michal Kazior <michal.kazior@tieto.com> writes:

> Until now it was possible to simulate soft and
> hard fw crashes but it wasn't possible to trigger
> an immediately hw restart itself (without the fw
> crash).
>
> This can be useful when stress testing hw
> restarting stability, e.g. during heavy tx/rx
> traffic.
>
> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>

Nice idea!

> --- a/drivers/net/wireless/ath/ath10k/debug.c
> +++ b/drivers/net/wireless/ath/ath10k/debug.c
> @@ -694,7 +694,8 @@ static ssize_t ath10k_read_simulate_fw_crash(struct file *file,
>  		"To simulate firmware crash write one of the keywords to this file:\n"
>  		"`soft` - this will send WMI_FORCE_FW_HANG_ASSERT to firmware if FW supports that command.\n"
>  		"`hard` - this will send to firmware command with illegal parameters causing firmware crash.\n"
> -		"`assert` - this will send special illegal parameter to firmware to cause assert failure and crash.\n";
> +		"`assert` - this will send special illegal parameter to firmware to cause assert failure and crash.\n"
> +		"`request` - this will simply queue hw restart without fw/hw actually crashing.\n";

"request" is not really very descriptive command. Maybe call it
"hw-restart" because that's what it really does, right?

-- 
Kalle Valo

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ath10k: speed up hw recovery
  2014-10-09  9:38   ` Michal Kazior
@ 2014-10-13  8:15     ` Kalle Valo
  -1 siblings, 0 replies; 20+ messages in thread
From: Kalle Valo @ 2014-10-13  8:15 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k, linux-wireless

Michal Kazior <michal.kazior@tieto.com> writes:

> In some cases hw recovery was taking an absurdly
> long time due to ath10k waiting for things that
> would never really complete.
>
> Instead of waiting for inevitable timeouts poke
> all completions and wakequeues and check if it's
> still worth waiting.
>
> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>

[...]

> --- a/drivers/net/wireless/ath/ath10k/core.c
> +++ b/drivers/net/wireless/ath/ath10k/core.c
> @@ -685,6 +685,20 @@ static void ath10k_core_restart(struct work_struct *work)
>  {
>  	struct ath10k *ar = container_of(work, struct ath10k, restart_work);
>  
> +	set_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags);
> +	barrier();

Please document why the barrier is needed.

> --- a/drivers/net/wireless/ath/ath10k/core.h
> +++ b/drivers/net/wireless/ath/ath10k/core.h
> @@ -371,6 +371,11 @@ enum ath10k_dev_flags {
>  	/* Indicates that ath10k device is during CAC phase of DFS */
>  	ATH10K_CAC_RUNNING,
>  	ATH10K_FLAG_CORE_REGISTERED,
> +
> +	/* Device has crashed and needs to restart. This indicates any pending
> +	 * waiters should immediately cancel instead of waiting for a time out.
> +	 */
> +	ATH10K_FLAG_CRASH_FLUSH,
>  };

Instead of a dev flag, should this actually be a new state to enum
ath10k_state? The reason I ask, I don't see how we would use this flag
with other states than with ATH10K_STATE_ON. Or is this needed because
of locking?

-- 
Kalle Valo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ath10k: speed up hw recovery
@ 2014-10-13  8:15     ` Kalle Valo
  0 siblings, 0 replies; 20+ messages in thread
From: Kalle Valo @ 2014-10-13  8:15 UTC (permalink / raw)
  To: Michal Kazior; +Cc: linux-wireless, ath10k

Michal Kazior <michal.kazior@tieto.com> writes:

> In some cases hw recovery was taking an absurdly
> long time due to ath10k waiting for things that
> would never really complete.
>
> Instead of waiting for inevitable timeouts poke
> all completions and wakequeues and check if it's
> still worth waiting.
>
> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>

[...]

> --- a/drivers/net/wireless/ath/ath10k/core.c
> +++ b/drivers/net/wireless/ath/ath10k/core.c
> @@ -685,6 +685,20 @@ static void ath10k_core_restart(struct work_struct *work)
>  {
>  	struct ath10k *ar = container_of(work, struct ath10k, restart_work);
>  
> +	set_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags);
> +	barrier();

Please document why the barrier is needed.

> --- a/drivers/net/wireless/ath/ath10k/core.h
> +++ b/drivers/net/wireless/ath/ath10k/core.h
> @@ -371,6 +371,11 @@ enum ath10k_dev_flags {
>  	/* Indicates that ath10k device is during CAC phase of DFS */
>  	ATH10K_CAC_RUNNING,
>  	ATH10K_FLAG_CORE_REGISTERED,
> +
> +	/* Device has crashed and needs to restart. This indicates any pending
> +	 * waiters should immediately cancel instead of waiting for a time out.
> +	 */
> +	ATH10K_FLAG_CRASH_FLUSH,
>  };

Instead of a dev flag, should this actually be a new state to enum
ath10k_state? The reason I ask, I don't see how we would use this flag
with other states than with ATH10K_STATE_ON. Or is this needed because
of locking?

-- 
Kalle Valo

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ath10k: speed up hw recovery
  2014-10-13  8:15     ` Kalle Valo
@ 2014-10-13  8:29       ` Michal Kazior
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Kazior @ 2014-10-13  8:29 UTC (permalink / raw)
  To: Kalle Valo; +Cc: ath10k, linux-wireless

On 13 October 2014 10:15, Kalle Valo <kvalo@qca.qualcomm.com> wrote:
> Michal Kazior <michal.kazior@tieto.com> writes:
>
>> In some cases hw recovery was taking an absurdly
>> long time due to ath10k waiting for things that
>> would never really complete.
>>
>> Instead of waiting for inevitable timeouts poke
>> all completions and wakequeues and check if it's
>> still worth waiting.
>>
>> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
>
> [...]
>
>> --- a/drivers/net/wireless/ath/ath10k/core.c
>> +++ b/drivers/net/wireless/ath/ath10k/core.c
>> @@ -685,6 +685,20 @@ static void ath10k_core_restart(struct work_struct *work)
>>  {
>>       struct ath10k *ar = container_of(work, struct ath10k, restart_work);
>>
>> +     set_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags);
>> +     barrier();
>
> Please document why the barrier is needed.

Okay. It's just to make sure compiler doesn't re-order it (so that
following complete/wake_ups will be called _after_ the bit is set).


>> --- a/drivers/net/wireless/ath/ath10k/core.h
>> +++ b/drivers/net/wireless/ath/ath10k/core.h
>> @@ -371,6 +371,11 @@ enum ath10k_dev_flags {
>>       /* Indicates that ath10k device is during CAC phase of DFS */
>>       ATH10K_CAC_RUNNING,
>>       ATH10K_FLAG_CORE_REGISTERED,
>> +
>> +     /* Device has crashed and needs to restart. This indicates any pending
>> +      * waiters should immediately cancel instead of waiting for a time out.
>> +      */
>> +     ATH10K_FLAG_CRASH_FLUSH,
>>  };
>
> Instead of a dev flag, should this actually be a new state to enum
> ath10k_state? The reason I ask, I don't see how we would use this flag
> with other states than with ATH10K_STATE_ON. Or is this needed because
> of locking?

Locking.

Reading/writing ar->state requires conf_mutex. The problem is waitiers
might be holding it so you wouldn't even be able to tell the waiters
to not wait anymore.

A long term idea might involve removing conf_mutex though but I recall
there's a couple of problems with that when I looked at it.


Michał

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ath10k: speed up hw recovery
@ 2014-10-13  8:29       ` Michal Kazior
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Kazior @ 2014-10-13  8:29 UTC (permalink / raw)
  To: Kalle Valo; +Cc: linux-wireless, ath10k

On 13 October 2014 10:15, Kalle Valo <kvalo@qca.qualcomm.com> wrote:
> Michal Kazior <michal.kazior@tieto.com> writes:
>
>> In some cases hw recovery was taking an absurdly
>> long time due to ath10k waiting for things that
>> would never really complete.
>>
>> Instead of waiting for inevitable timeouts poke
>> all completions and wakequeues and check if it's
>> still worth waiting.
>>
>> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
>
> [...]
>
>> --- a/drivers/net/wireless/ath/ath10k/core.c
>> +++ b/drivers/net/wireless/ath/ath10k/core.c
>> @@ -685,6 +685,20 @@ static void ath10k_core_restart(struct work_struct *work)
>>  {
>>       struct ath10k *ar = container_of(work, struct ath10k, restart_work);
>>
>> +     set_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags);
>> +     barrier();
>
> Please document why the barrier is needed.

Okay. It's just to make sure compiler doesn't re-order it (so that
following complete/wake_ups will be called _after_ the bit is set).


>> --- a/drivers/net/wireless/ath/ath10k/core.h
>> +++ b/drivers/net/wireless/ath/ath10k/core.h
>> @@ -371,6 +371,11 @@ enum ath10k_dev_flags {
>>       /* Indicates that ath10k device is during CAC phase of DFS */
>>       ATH10K_CAC_RUNNING,
>>       ATH10K_FLAG_CORE_REGISTERED,
>> +
>> +     /* Device has crashed and needs to restart. This indicates any pending
>> +      * waiters should immediately cancel instead of waiting for a time out.
>> +      */
>> +     ATH10K_FLAG_CRASH_FLUSH,
>>  };
>
> Instead of a dev flag, should this actually be a new state to enum
> ath10k_state? The reason I ask, I don't see how we would use this flag
> with other states than with ATH10K_STATE_ON. Or is this needed because
> of locking?

Locking.

Reading/writing ar->state requires conf_mutex. The problem is waitiers
might be holding it so you wouldn't even be able to tell the waiters
to not wait anymore.

A long term idea might involve removing conf_mutex though but I recall
there's a couple of problems with that when I looked at it.


Michał

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ath10k: speed up hw recovery
  2014-10-13  8:29       ` Michal Kazior
@ 2014-10-13  8:44         ` Kalle Valo
  -1 siblings, 0 replies; 20+ messages in thread
From: Kalle Valo @ 2014-10-13  8:44 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k, linux-wireless

Michal Kazior <michal.kazior@tieto.com> writes:

> On 13 October 2014 10:15, Kalle Valo <kvalo@qca.qualcomm.com> wrote:
>> Michal Kazior <michal.kazior@tieto.com> writes:
>>
>>> In some cases hw recovery was taking an absurdly
>>> long time due to ath10k waiting for things that
>>> would never really complete.
>>>
>>> Instead of waiting for inevitable timeouts poke
>>> all completions and wakequeues and check if it's
>>> still worth waiting.
>>>
>>> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
>>
>> [...]
>>
>>> --- a/drivers/net/wireless/ath/ath10k/core.c
>>> +++ b/drivers/net/wireless/ath/ath10k/core.c
>>> @@ -685,6 +685,20 @@ static void ath10k_core_restart(struct work_struct *work)
>>>  {
>>>       struct ath10k *ar = container_of(work, struct ath10k, restart_work);
>>>
>>> +     set_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags);
>>> +     barrier();
>>
>> Please document why the barrier is needed.
>
> Okay. It's just to make sure compiler doesn't re-order it (so that
> following complete/wake_ups will be called _after_ the bit is set).

Ok. Can you add a short comment to the code saying why you added the
barrier?

>>> --- a/drivers/net/wireless/ath/ath10k/core.h
>>> +++ b/drivers/net/wireless/ath/ath10k/core.h
>>> @@ -371,6 +371,11 @@ enum ath10k_dev_flags {
>>>       /* Indicates that ath10k device is during CAC phase of DFS */
>>>       ATH10K_CAC_RUNNING,
>>>       ATH10K_FLAG_CORE_REGISTERED,
>>> +
>>> +     /* Device has crashed and needs to restart. This indicates any pending
>>> +      * waiters should immediately cancel instead of waiting for a time out.
>>> +      */
>>> +     ATH10K_FLAG_CRASH_FLUSH,
>>>  };
>>
>> Instead of a dev flag, should this actually be a new state to enum
>> ath10k_state? The reason I ask, I don't see how we would use this flag
>> with other states than with ATH10K_STATE_ON. Or is this needed because
>> of locking?
>
> Locking.
>
> Reading/writing ar->state requires conf_mutex. The problem is waitiers
> might be holding it so you wouldn't even be able to tell the waiters
> to not wait anymore.
>
> A long term idea might involve removing conf_mutex though but I recall
> there's a couple of problems with that when I looked at it.

Makes sense. Can you add this to the commit log, please?

-- 
Kalle Valo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ath10k: speed up hw recovery
@ 2014-10-13  8:44         ` Kalle Valo
  0 siblings, 0 replies; 20+ messages in thread
From: Kalle Valo @ 2014-10-13  8:44 UTC (permalink / raw)
  To: Michal Kazior; +Cc: linux-wireless, ath10k

Michal Kazior <michal.kazior@tieto.com> writes:

> On 13 October 2014 10:15, Kalle Valo <kvalo@qca.qualcomm.com> wrote:
>> Michal Kazior <michal.kazior@tieto.com> writes:
>>
>>> In some cases hw recovery was taking an absurdly
>>> long time due to ath10k waiting for things that
>>> would never really complete.
>>>
>>> Instead of waiting for inevitable timeouts poke
>>> all completions and wakequeues and check if it's
>>> still worth waiting.
>>>
>>> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
>>
>> [...]
>>
>>> --- a/drivers/net/wireless/ath/ath10k/core.c
>>> +++ b/drivers/net/wireless/ath/ath10k/core.c
>>> @@ -685,6 +685,20 @@ static void ath10k_core_restart(struct work_struct *work)
>>>  {
>>>       struct ath10k *ar = container_of(work, struct ath10k, restart_work);
>>>
>>> +     set_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags);
>>> +     barrier();
>>
>> Please document why the barrier is needed.
>
> Okay. It's just to make sure compiler doesn't re-order it (so that
> following complete/wake_ups will be called _after_ the bit is set).

Ok. Can you add a short comment to the code saying why you added the
barrier?

>>> --- a/drivers/net/wireless/ath/ath10k/core.h
>>> +++ b/drivers/net/wireless/ath/ath10k/core.h
>>> @@ -371,6 +371,11 @@ enum ath10k_dev_flags {
>>>       /* Indicates that ath10k device is during CAC phase of DFS */
>>>       ATH10K_CAC_RUNNING,
>>>       ATH10K_FLAG_CORE_REGISTERED,
>>> +
>>> +     /* Device has crashed and needs to restart. This indicates any pending
>>> +      * waiters should immediately cancel instead of waiting for a time out.
>>> +      */
>>> +     ATH10K_FLAG_CRASH_FLUSH,
>>>  };
>>
>> Instead of a dev flag, should this actually be a new state to enum
>> ath10k_state? The reason I ask, I don't see how we would use this flag
>> with other states than with ATH10K_STATE_ON. Or is this needed because
>> of locking?
>
> Locking.
>
> Reading/writing ar->state requires conf_mutex. The problem is waitiers
> might be holding it so you wouldn't even be able to tell the waiters
> to not wait anymore.
>
> A long term idea might involve removing conf_mutex though but I recall
> there's a couple of problems with that when I looked at it.

Makes sense. Can you add this to the commit log, please?

-- 
Kalle Valo

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] ath10k: expose hw restart via debugfs
  2014-10-13  8:07     ` Kalle Valo
@ 2014-10-13  8:45       ` Michal Kazior
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Kazior @ 2014-10-13  8:45 UTC (permalink / raw)
  To: Kalle Valo; +Cc: ath10k, linux-wireless

On 13 October 2014 10:07, Kalle Valo <kvalo@qca.qualcomm.com> wrote:
> Michal Kazior <michal.kazior@tieto.com> writes:
[...]
>> --- a/drivers/net/wireless/ath/ath10k/debug.c
>> +++ b/drivers/net/wireless/ath/ath10k/debug.c
>> @@ -694,7 +694,8 @@ static ssize_t ath10k_read_simulate_fw_crash(struct file *file,
>>               "To simulate firmware crash write one of the keywords to this file:\n"
>>               "`soft` - this will send WMI_FORCE_FW_HANG_ASSERT to firmware if FW supports that command.\n"
>>               "`hard` - this will send to firmware command with illegal parameters causing firmware crash.\n"
>> -             "`assert` - this will send special illegal parameter to firmware to cause assert failure and crash.\n";
>> +             "`assert` - this will send special illegal parameter to firmware to cause assert failure and crash.\n"
>> +             "`request` - this will simply queue hw restart without fw/hw actually crashing.\n";
>
> "request" is not really very descriptive command. Maybe call it
> "hw-restart" because that's what it really does, right?

Good idea. I'll rename it.


Michał

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] ath10k: expose hw restart via debugfs
@ 2014-10-13  8:45       ` Michal Kazior
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Kazior @ 2014-10-13  8:45 UTC (permalink / raw)
  To: Kalle Valo; +Cc: linux-wireless, ath10k

On 13 October 2014 10:07, Kalle Valo <kvalo@qca.qualcomm.com> wrote:
> Michal Kazior <michal.kazior@tieto.com> writes:
[...]
>> --- a/drivers/net/wireless/ath/ath10k/debug.c
>> +++ b/drivers/net/wireless/ath/ath10k/debug.c
>> @@ -694,7 +694,8 @@ static ssize_t ath10k_read_simulate_fw_crash(struct file *file,
>>               "To simulate firmware crash write one of the keywords to this file:\n"
>>               "`soft` - this will send WMI_FORCE_FW_HANG_ASSERT to firmware if FW supports that command.\n"
>>               "`hard` - this will send to firmware command with illegal parameters causing firmware crash.\n"
>> -             "`assert` - this will send special illegal parameter to firmware to cause assert failure and crash.\n";
>> +             "`assert` - this will send special illegal parameter to firmware to cause assert failure and crash.\n"
>> +             "`request` - this will simply queue hw restart without fw/hw actually crashing.\n";
>
> "request" is not really very descriptive command. Maybe call it
> "hw-restart" because that's what it really does, right?

Good idea. I'll rename it.


Michał

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2014-10-13  8:45 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-10-09  9:38 [PATCH 0/3] ath10k: speed up recovery Michal Kazior
2014-10-09  9:38 ` Michal Kazior
2014-10-09  9:38 ` [PATCH 1/3] ath10k: fix possible bmi crash Michal Kazior
2014-10-09  9:38   ` Michal Kazior
2014-10-09  9:38 ` [PATCH 2/3] ath10k: expose hw restart via debugfs Michal Kazior
2014-10-09  9:38   ` Michal Kazior
2014-10-13  8:07   ` Kalle Valo
2014-10-13  8:07     ` Kalle Valo
2014-10-13  8:45     ` Michal Kazior
2014-10-13  8:45       ` Michal Kazior
2014-10-09  9:38 ` [PATCH 3/3] ath10k: speed up hw recovery Michal Kazior
2014-10-09  9:38   ` Michal Kazior
2014-10-10  6:56   ` Michal Kazior
2014-10-10  6:56     ` Michal Kazior
2014-10-13  8:15   ` Kalle Valo
2014-10-13  8:15     ` Kalle Valo
2014-10-13  8:29     ` Michal Kazior
2014-10-13  8:29       ` Michal Kazior
2014-10-13  8:44       ` Kalle Valo
2014-10-13  8:44         ` Kalle Valo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.