All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] ath10k: pci fixes 2014-06-05
@ 2014-06-05  7:48 ` Michal Kazior
  0 siblings, 0 replies; 10+ messages in thread
From: Michal Kazior @ 2014-06-05  7:48 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

Hi,

I've found a few issues while I was testing ath10k
in QEMU/KVM with PCI passthrough. The second patch
is something that I've found while analysing the
bmi race and I don't think anyone experienced
problems associated with it.

There's still a weird synchronization issue with
(what I think is) iomap write propagation. The
initial fw bootup interrupt doesn't come in and
it seems CE interrupts are unmasked with a lag
because explicit polling after ctl_resp timeout
seems to be sufficient to work around it. I
suspect this might be a virtualization problem
rather than ath10k. Ideas, anyone?


Michal Kazior (2):
  ath10k: fix bmi exchange tx/rx race
  ath10k: sanitize tx ring index access properly

 drivers/net/wireless/ath/ath10k/ce.c  | 11 +++++++----
 drivers/net/wireless/ath/ath10k/pci.c | 11 +++--------
 drivers/net/wireless/ath/ath10k/pci.h |  3 ++-
 3 files changed, 12 insertions(+), 13 deletions(-)

-- 
1.8.5.3


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 0/2] ath10k: pci fixes 2014-06-05
@ 2014-06-05  7:48 ` Michal Kazior
  0 siblings, 0 replies; 10+ messages in thread
From: Michal Kazior @ 2014-06-05  7:48 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

Hi,

I've found a few issues while I was testing ath10k
in QEMU/KVM with PCI passthrough. The second patch
is something that I've found while analysing the
bmi race and I don't think anyone experienced
problems associated with it.

There's still a weird synchronization issue with
(what I think is) iomap write propagation. The
initial fw bootup interrupt doesn't come in and
it seems CE interrupts are unmasked with a lag
because explicit polling after ctl_resp timeout
seems to be sufficient to work around it. I
suspect this might be a virtualization problem
rather than ath10k. Ideas, anyone?


Michal Kazior (2):
  ath10k: fix bmi exchange tx/rx race
  ath10k: sanitize tx ring index access properly

 drivers/net/wireless/ath/ath10k/ce.c  | 11 +++++++----
 drivers/net/wireless/ath/ath10k/pci.c | 11 +++--------
 drivers/net/wireless/ath/ath10k/pci.h |  3 ++-
 3 files changed, 12 insertions(+), 13 deletions(-)

-- 
1.8.5.3


_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] ath10k: fix bmi exchange tx/rx race
  2014-06-05  7:48 ` Michal Kazior
@ 2014-06-05  7:48   ` Michal Kazior
  -1 siblings, 0 replies; 10+ messages in thread
From: Michal Kazior @ 2014-06-05  7:48 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

It was possible for tx completion not to be
processed. In that case an old stack pointer was
left on copy engine tx ring. Next bmi exchange
would immediately pop it and use complete() on the
completion struct there causing corruption.

Make sure to wait for both tx and rx completions
properly.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
---
 drivers/net/wireless/ath/ath10k/pci.c | 11 +++--------
 drivers/net/wireless/ath/ath10k/pci.h |  3 ++-
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c
index d0004d5..06840d1 100644
--- a/drivers/net/wireless/ath/ath10k/pci.c
+++ b/drivers/net/wireless/ath/ath10k/pci.c
@@ -1362,8 +1362,6 @@ static int ath10k_pci_hif_exchange_bmi_msg(struct ath10k *ar,
 		ath10k_ce_recv_buf_enqueue(ce_rx, &xfer, resp_paddr);
 	}
 
-	init_completion(&xfer.done);
-
 	ret = ath10k_ce_send(ce_tx, &xfer, req_paddr, req_len, -1, 0);
 	if (ret)
 		goto err_resp;
@@ -1414,10 +1412,7 @@ static void ath10k_pci_bmi_send_done(struct ath10k_ce_pipe *ce_state)
 					  &nbytes, &transfer_id))
 		return;
 
-	if (xfer->wait_for_resp)
-		return;
-
-	complete(&xfer->done);
+	xfer->tx_done = true;
 }
 
 static void ath10k_pci_bmi_recv_data(struct ath10k_ce_pipe *ce_state)
@@ -1438,7 +1433,7 @@ static void ath10k_pci_bmi_recv_data(struct ath10k_ce_pipe *ce_state)
 	}
 
 	xfer->resp_len = nbytes;
-	complete(&xfer->done);
+	xfer->rx_done = true;
 }
 
 static int ath10k_pci_bmi_wait(struct ath10k_ce_pipe *tx_pipe,
@@ -1451,7 +1446,7 @@ static int ath10k_pci_bmi_wait(struct ath10k_ce_pipe *tx_pipe,
 		ath10k_pci_bmi_send_done(tx_pipe);
 		ath10k_pci_bmi_recv_data(rx_pipe);
 
-		if (completion_done(&xfer->done))
+		if (xfer->tx_done && (xfer->rx_done == xfer->wait_for_resp))
 			return 0;
 
 		schedule();
diff --git a/drivers/net/wireless/ath/ath10k/pci.h b/drivers/net/wireless/ath/ath10k/pci.h
index dfdebb4..9401292 100644
--- a/drivers/net/wireless/ath/ath10k/pci.h
+++ b/drivers/net/wireless/ath/ath10k/pci.h
@@ -38,7 +38,8 @@
 #define DIAG_TRANSFER_LIMIT 2048
 
 struct bmi_xfer {
-	struct completion done;
+	bool tx_done;
+	bool rx_done;
 	bool wait_for_resp;
 	u32 resp_len;
 };
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 1/2] ath10k: fix bmi exchange tx/rx race
@ 2014-06-05  7:48   ` Michal Kazior
  0 siblings, 0 replies; 10+ messages in thread
From: Michal Kazior @ 2014-06-05  7:48 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

It was possible for tx completion not to be
processed. In that case an old stack pointer was
left on copy engine tx ring. Next bmi exchange
would immediately pop it and use complete() on the
completion struct there causing corruption.

Make sure to wait for both tx and rx completions
properly.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
---
 drivers/net/wireless/ath/ath10k/pci.c | 11 +++--------
 drivers/net/wireless/ath/ath10k/pci.h |  3 ++-
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c
index d0004d5..06840d1 100644
--- a/drivers/net/wireless/ath/ath10k/pci.c
+++ b/drivers/net/wireless/ath/ath10k/pci.c
@@ -1362,8 +1362,6 @@ static int ath10k_pci_hif_exchange_bmi_msg(struct ath10k *ar,
 		ath10k_ce_recv_buf_enqueue(ce_rx, &xfer, resp_paddr);
 	}
 
-	init_completion(&xfer.done);
-
 	ret = ath10k_ce_send(ce_tx, &xfer, req_paddr, req_len, -1, 0);
 	if (ret)
 		goto err_resp;
@@ -1414,10 +1412,7 @@ static void ath10k_pci_bmi_send_done(struct ath10k_ce_pipe *ce_state)
 					  &nbytes, &transfer_id))
 		return;
 
-	if (xfer->wait_for_resp)
-		return;
-
-	complete(&xfer->done);
+	xfer->tx_done = true;
 }
 
 static void ath10k_pci_bmi_recv_data(struct ath10k_ce_pipe *ce_state)
@@ -1438,7 +1433,7 @@ static void ath10k_pci_bmi_recv_data(struct ath10k_ce_pipe *ce_state)
 	}
 
 	xfer->resp_len = nbytes;
-	complete(&xfer->done);
+	xfer->rx_done = true;
 }
 
 static int ath10k_pci_bmi_wait(struct ath10k_ce_pipe *tx_pipe,
@@ -1451,7 +1446,7 @@ static int ath10k_pci_bmi_wait(struct ath10k_ce_pipe *tx_pipe,
 		ath10k_pci_bmi_send_done(tx_pipe);
 		ath10k_pci_bmi_recv_data(rx_pipe);
 
-		if (completion_done(&xfer->done))
+		if (xfer->tx_done && (xfer->rx_done == xfer->wait_for_resp))
 			return 0;
 
 		schedule();
diff --git a/drivers/net/wireless/ath/ath10k/pci.h b/drivers/net/wireless/ath/ath10k/pci.h
index dfdebb4..9401292 100644
--- a/drivers/net/wireless/ath/ath10k/pci.h
+++ b/drivers/net/wireless/ath/ath10k/pci.h
@@ -38,7 +38,8 @@
 #define DIAG_TRANSFER_LIMIT 2048
 
 struct bmi_xfer {
-	struct completion done;
+	bool tx_done;
+	bool rx_done;
 	bool wait_for_resp;
 	u32 resp_len;
 };
-- 
1.8.5.3


_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] ath10k: sanitize tx ring index access properly
  2014-06-05  7:48 ` Michal Kazior
@ 2014-06-05  7:48   ` Michal Kazior
  -1 siblings, 0 replies; 10+ messages in thread
From: Michal Kazior @ 2014-06-05  7:48 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

The tx ring index was immediately trimmed with a
bitmask. This discarded the 0xFFFFFFFF error case
(which theoretically can happen when a device is
abruptly disconnected) and led to using an invalid
tx ring index. This could lead to memory
corruption.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
---
 drivers/net/wireless/ath/ath10k/ce.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/ce.c b/drivers/net/wireless/ath/ath10k/ce.c
index d185dc0..7c6c7d5 100644
--- a/drivers/net/wireless/ath/ath10k/ce.c
+++ b/drivers/net/wireless/ath/ath10k/ce.c
@@ -603,16 +603,19 @@ static int ath10k_ce_completed_send_next_nolock(struct ath10k_ce_pipe *ce_state,
 		if (ret)
 			return ret;
 
-		src_ring->hw_index =
-			ath10k_ce_src_ring_read_index_get(ar, ctrl_addr);
-		src_ring->hw_index &= nentries_mask;
+		read_index = ath10k_ce_src_ring_read_index_get(ar, ctrl_addr);
+		if (read_index == 0xFFFFFFFF)
+			return -ENODEV;
+
+		read_index &= nentries_mask;
+		src_ring->hw_index = read_index;
 
 		ath10k_pci_sleep(ar);
 	}
 
 	read_index = src_ring->hw_index;
 
-	if ((read_index == sw_index) || (read_index == 0xffffffff))
+	if (read_index == sw_index)
 		return -EIO;
 
 	sbase = src_ring->shadow_base;
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] ath10k: sanitize tx ring index access properly
@ 2014-06-05  7:48   ` Michal Kazior
  0 siblings, 0 replies; 10+ messages in thread
From: Michal Kazior @ 2014-06-05  7:48 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless, Michal Kazior

The tx ring index was immediately trimmed with a
bitmask. This discarded the 0xFFFFFFFF error case
(which theoretically can happen when a device is
abruptly disconnected) and led to using an invalid
tx ring index. This could lead to memory
corruption.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
---
 drivers/net/wireless/ath/ath10k/ce.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/ce.c b/drivers/net/wireless/ath/ath10k/ce.c
index d185dc0..7c6c7d5 100644
--- a/drivers/net/wireless/ath/ath10k/ce.c
+++ b/drivers/net/wireless/ath/ath10k/ce.c
@@ -603,16 +603,19 @@ static int ath10k_ce_completed_send_next_nolock(struct ath10k_ce_pipe *ce_state,
 		if (ret)
 			return ret;
 
-		src_ring->hw_index =
-			ath10k_ce_src_ring_read_index_get(ar, ctrl_addr);
-		src_ring->hw_index &= nentries_mask;
+		read_index = ath10k_ce_src_ring_read_index_get(ar, ctrl_addr);
+		if (read_index == 0xFFFFFFFF)
+			return -ENODEV;
+
+		read_index &= nentries_mask;
+		src_ring->hw_index = read_index;
 
 		ath10k_pci_sleep(ar);
 	}
 
 	read_index = src_ring->hw_index;
 
-	if ((read_index == sw_index) || (read_index == 0xffffffff))
+	if (read_index == sw_index)
 		return -EIO;
 
 	sbase = src_ring->shadow_base;
-- 
1.8.5.3


_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] ath10k: sanitize tx ring index access properly
  2014-06-05  7:48   ` Michal Kazior
@ 2014-07-14 13:20     ` Kalle Valo
  -1 siblings, 0 replies; 10+ messages in thread
From: Kalle Valo @ 2014-07-14 13:20 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k, linux-wireless

Michal Kazior <michal.kazior@tieto.com> writes:

> The tx ring index was immediately trimmed with a
> bitmask. This discarded the 0xFFFFFFFF error case
> (which theoretically can happen when a device is
> abruptly disconnected) and led to using an invalid
> tx ring index. This could lead to memory
> corruption.
>
> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
> ---
>  drivers/net/wireless/ath/ath10k/ce.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath10k/ce.c b/drivers/net/wireless/ath/ath10k/ce.c
> index d185dc0..7c6c7d5 100644
> --- a/drivers/net/wireless/ath/ath10k/ce.c
> +++ b/drivers/net/wireless/ath/ath10k/ce.c
> @@ -603,16 +603,19 @@ static int ath10k_ce_completed_send_next_nolock(struct ath10k_ce_pipe *ce_state,
>  		if (ret)
>  			return ret;
>  
> -		src_ring->hw_index =
> -			ath10k_ce_src_ring_read_index_get(ar, ctrl_addr);
> -		src_ring->hw_index &= nentries_mask;
> +		read_index = ath10k_ce_src_ring_read_index_get(ar, ctrl_addr);
> +		if (read_index == 0xFFFFFFFF)
> +			return -ENODEV;

I changed this to lower case, as it was before. Let's use lower case hex
values in ath10k.

-- 
Kalle Valo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] ath10k: sanitize tx ring index access properly
@ 2014-07-14 13:20     ` Kalle Valo
  0 siblings, 0 replies; 10+ messages in thread
From: Kalle Valo @ 2014-07-14 13:20 UTC (permalink / raw)
  To: Michal Kazior; +Cc: linux-wireless, ath10k

Michal Kazior <michal.kazior@tieto.com> writes:

> The tx ring index was immediately trimmed with a
> bitmask. This discarded the 0xFFFFFFFF error case
> (which theoretically can happen when a device is
> abruptly disconnected) and led to using an invalid
> tx ring index. This could lead to memory
> corruption.
>
> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
> ---
>  drivers/net/wireless/ath/ath10k/ce.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath10k/ce.c b/drivers/net/wireless/ath/ath10k/ce.c
> index d185dc0..7c6c7d5 100644
> --- a/drivers/net/wireless/ath/ath10k/ce.c
> +++ b/drivers/net/wireless/ath/ath10k/ce.c
> @@ -603,16 +603,19 @@ static int ath10k_ce_completed_send_next_nolock(struct ath10k_ce_pipe *ce_state,
>  		if (ret)
>  			return ret;
>  
> -		src_ring->hw_index =
> -			ath10k_ce_src_ring_read_index_get(ar, ctrl_addr);
> -		src_ring->hw_index &= nentries_mask;
> +		read_index = ath10k_ce_src_ring_read_index_get(ar, ctrl_addr);
> +		if (read_index == 0xFFFFFFFF)
> +			return -ENODEV;

I changed this to lower case, as it was before. Let's use lower case hex
values in ath10k.

-- 
Kalle Valo

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] ath10k: pci fixes 2014-06-05
  2014-06-05  7:48 ` Michal Kazior
@ 2014-07-15  8:22   ` Kalle Valo
  -1 siblings, 0 replies; 10+ messages in thread
From: Kalle Valo @ 2014-07-15  8:22 UTC (permalink / raw)
  To: Michal Kazior; +Cc: ath10k, linux-wireless

Michal Kazior <michal.kazior@tieto.com> writes:

> I've found a few issues while I was testing ath10k
> in QEMU/KVM with PCI passthrough. The second patch
> is something that I've found while analysing the
> bmi race and I don't think anyone experienced
> problems associated with it.
>
> There's still a weird synchronization issue with
> (what I think is) iomap write propagation. The
> initial fw bootup interrupt doesn't come in and
> it seems CE interrupts are unmasked with a lag
> because explicit polling after ctl_resp timeout
> seems to be sufficient to work around it. I
> suspect this might be a virtualization problem
> rather than ath10k. Ideas, anyone?

No ideas, but it would be great to get ath10k working in KVM. If we
cannot find a better solution, could we add the explicit polling anyway?
(With an approriate comment why such a hack is needed, of course)

> Michal Kazior (2):
>   ath10k: fix bmi exchange tx/rx race
>   ath10k: sanitize tx ring index access properly

Thanks, both patches applied.

-- 
Kalle Valo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] ath10k: pci fixes 2014-06-05
@ 2014-07-15  8:22   ` Kalle Valo
  0 siblings, 0 replies; 10+ messages in thread
From: Kalle Valo @ 2014-07-15  8:22 UTC (permalink / raw)
  To: Michal Kazior; +Cc: linux-wireless, ath10k

Michal Kazior <michal.kazior@tieto.com> writes:

> I've found a few issues while I was testing ath10k
> in QEMU/KVM with PCI passthrough. The second patch
> is something that I've found while analysing the
> bmi race and I don't think anyone experienced
> problems associated with it.
>
> There's still a weird synchronization issue with
> (what I think is) iomap write propagation. The
> initial fw bootup interrupt doesn't come in and
> it seems CE interrupts are unmasked with a lag
> because explicit polling after ctl_resp timeout
> seems to be sufficient to work around it. I
> suspect this might be a virtualization problem
> rather than ath10k. Ideas, anyone?

No ideas, but it would be great to get ath10k working in KVM. If we
cannot find a better solution, could we add the explicit polling anyway?
(With an approriate comment why such a hack is needed, of course)

> Michal Kazior (2):
>   ath10k: fix bmi exchange tx/rx race
>   ath10k: sanitize tx ring index access properly

Thanks, both patches applied.

-- 
Kalle Valo

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-07-15  8:33 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-05  7:48 [PATCH 0/2] ath10k: pci fixes 2014-06-05 Michal Kazior
2014-06-05  7:48 ` Michal Kazior
2014-06-05  7:48 ` [PATCH 1/2] ath10k: fix bmi exchange tx/rx race Michal Kazior
2014-06-05  7:48   ` Michal Kazior
2014-06-05  7:48 ` [PATCH 2/2] ath10k: sanitize tx ring index access properly Michal Kazior
2014-06-05  7:48   ` Michal Kazior
2014-07-14 13:20   ` Kalle Valo
2014-07-14 13:20     ` Kalle Valo
2014-07-15  8:22 ` [PATCH 0/2] ath10k: pci fixes 2014-06-05 Kalle Valo
2014-07-15  8:22   ` Kalle Valo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.