linux-wireless.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/3] ath10k: improve throughout of RX of sdio
@ 2019-09-25  9:10 Wen Gong
  2019-09-25  9:10 ` [PATCH v6 1/3] ath10k: enable RX bundle receive for sdio Wen Gong
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Wen Gong @ 2019-09-25  9:10 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless

v6: change code style, change commit log to add test value of RX, use sk_buff_head to replace the rx queue
this is 3 patches of the 7 patches from ath10k: improve throughout of tcp/udp TX/RX of sdio

v5: no change

v4: add err handle in ath10k_sdio_mbox_rx_fetch_bundle, change commit log

v3: change some code style
split fix incorrect skb tail of rx bundle to patch "adjust skb length in ath10k_sdio_mbox_rx_packet"

v2: fix incorrect skb tail of rx bundle in ath10k_sdio_mbox_rx_process_packet, change macro HTC_GET_BUNDLE_COUNT

The bottleneck of throughout on sdio chip is the bus bandwidth, to the
patches are all to increase the use ratio of sdio bus.

These patches only affect sdio bus chip, explanation is mentioned in each
patch's commit log.

Alagu Sankar (1):
  ath10k: enable RX bundle receive for sdio

Wen Gong (2):
  ath10k: change max RX bundle size from 8 to 32 for sdio
  ath10k: add workqueue for RX path of sdio

 drivers/net/wireless/ath/ath10k/htc.h  |  12 ++-
 drivers/net/wireless/ath/ath10k/sdio.c | 147 +++++++++++++++++++++++----------
 drivers/net/wireless/ath/ath10k/sdio.h |  26 +++++-
 3 files changed, 134 insertions(+), 51 deletions(-)

-- 
1.9.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v6 1/3] ath10k: enable RX bundle receive for sdio
  2019-09-25  9:10 [PATCH v6 0/3] ath10k: improve throughout of RX of sdio Wen Gong
@ 2019-09-25  9:10 ` Wen Gong
  2019-10-24  8:29   ` Kalle Valo
  2019-11-25 11:49   ` Kalle Valo
  2019-09-25  9:10 ` [PATCH v6 2/3] ath10k: change max RX bundle size from 8 to 32 " Wen Gong
  2019-09-25  9:10 ` [PATCH v6 3/3] ath10k: add workqueue for RX path of sdio Wen Gong
  2 siblings, 2 replies; 22+ messages in thread
From: Wen Gong @ 2019-09-25  9:10 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless

From: Alagu Sankar <alagusankar@silex-india.com>

The existing implementation of initiating multiple sdio transfers for
receive bundling is slowing down the receive speed. Combining the
transfers using a bundle method would be ideal.

The transmission utilization ratio for sdio bus for small packet is
slow, because the space and time cost for sdio bus is same for large
length packet and small length packet. So the speed of data for large
length packet is higher than small length.

Test result of different length of data:
data packet(byte)   cost time(us)   calculated rate(Mbps)
      256               28                73
      512               33               124
     1024               35               234
     1792               45               318
    14336              168               682
    28672              333               688
    57344              660               695

Tested with QCA6174 SDIO with firmware
WLAN.RMH.4.4.1-00017-QCARMSWPZ-1

Signed-off-by: Alagu Sankar <alagusankar@silex-india.com>
Signed-off-by: Wen Gong <wgong@codeaurora.org>
---
 drivers/net/wireless/ath/ath10k/sdio.c | 110 +++++++++++++++++++++------------
 drivers/net/wireless/ath/ath10k/sdio.h |  11 +++-
 2 files changed, 79 insertions(+), 42 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c
index 120200a..f545626 100644
--- a/drivers/net/wireless/ath/ath10k/sdio.c
+++ b/drivers/net/wireless/ath/ath10k/sdio.c
@@ -24,6 +24,8 @@
 #include "trace.h"
 #include "sdio.h"
 
+#define ATH10K_SDIO_VSG_BUF_SIZE	(32 * 1024)
+
 /* inlined helper functions */
 
 static inline int ath10k_sdio_calc_txrx_padded_len(struct ath10k_sdio *ar_sdio,
@@ -484,11 +486,11 @@ static int ath10k_sdio_mbox_rx_process_packets(struct ath10k *ar,
 	return ret;
 }
 
-static int ath10k_sdio_mbox_alloc_pkt_bundle(struct ath10k *ar,
-					     struct ath10k_sdio_rx_data *rx_pkts,
-					     struct ath10k_htc_hdr *htc_hdr,
-					     size_t full_len, size_t act_len,
-					     size_t *bndl_cnt)
+static int ath10k_sdio_mbox_alloc_bundle(struct ath10k *ar,
+					 struct ath10k_sdio_rx_data *rx_pkts,
+					 struct ath10k_htc_hdr *htc_hdr,
+					 size_t full_len, size_t act_len,
+					 size_t *bndl_cnt)
 {
 	int ret, i;
 
@@ -529,6 +531,7 @@ static int ath10k_sdio_mbox_rx_alloc(struct ath10k *ar,
 	size_t full_len, act_len;
 	bool last_in_bundle;
 	int ret, i;
+	int pkt_cnt = 0;
 
 	if (n_lookaheads > ATH10K_SDIO_MAX_RX_MSGS) {
 		ath10k_warn(ar,
@@ -572,20 +575,19 @@ static int ath10k_sdio_mbox_rx_alloc(struct ath10k *ar,
 			 */
 			size_t bndl_cnt;
 
-			ret = ath10k_sdio_mbox_alloc_pkt_bundle(ar,
-								&ar_sdio->rx_pkts[i],
-								htc_hdr,
-								full_len,
-								act_len,
-								&bndl_cnt);
+			ret = ath10k_sdio_mbox_alloc_bundle(ar,
+							    &ar_sdio->rx_pkts[pkt_cnt],
+							    htc_hdr,
+							    full_len,
+							    act_len,
+							    &bndl_cnt);
 
 			if (ret) {
 				ath10k_warn(ar, "alloc_bundle error %d\n", ret);
 				goto err;
 			}
 
-			n_lookaheads += bndl_cnt;
-			i += bndl_cnt;
+			pkt_cnt += bndl_cnt;
 			/*Next buffer will be the last in the bundle */
 			last_in_bundle = true;
 		}
@@ -597,7 +599,7 @@ static int ath10k_sdio_mbox_rx_alloc(struct ath10k *ar,
 		if (htc_hdr->flags & ATH10K_HTC_FLAGS_RECV_1MORE_BLOCK)
 			full_len += ATH10K_HIF_MBOX_BLOCK_SIZE;
 
-		ret = ath10k_sdio_mbox_alloc_rx_pkt(&ar_sdio->rx_pkts[i],
+		ret = ath10k_sdio_mbox_alloc_rx_pkt(&ar_sdio->rx_pkts[pkt_cnt],
 						    act_len,
 						    full_len,
 						    last_in_bundle,
@@ -606,9 +608,11 @@ static int ath10k_sdio_mbox_rx_alloc(struct ath10k *ar,
 			ath10k_warn(ar, "alloc_rx_pkt error %d\n", ret);
 			goto err;
 		}
+
+		pkt_cnt++;
 	}
 
-	ar_sdio->n_rx_pkts = i;
+	ar_sdio->n_rx_pkts = pkt_cnt;
 
 	return 0;
 
@@ -622,59 +626,76 @@ static int ath10k_sdio_mbox_rx_alloc(struct ath10k *ar,
 	return ret;
 }
 
-static int ath10k_sdio_mbox_rx_packet(struct ath10k *ar,
-				      struct ath10k_sdio_rx_data *pkt)
+static int ath10k_sdio_mbox_rx_fetch(struct ath10k *ar)
 {
 	struct ath10k_sdio *ar_sdio = ath10k_sdio_priv(ar);
+	struct ath10k_sdio_rx_data *pkt = &ar_sdio->rx_pkts[0];
 	struct sk_buff *skb = pkt->skb;
 	struct ath10k_htc_hdr *htc_hdr;
 	int ret;
 
 	ret = ath10k_sdio_readsb(ar, ar_sdio->mbox_info.htc_addr,
 				 skb->data, pkt->alloc_len);
-	if (ret)
-		goto out;
 
-	/* Update actual length. The original length may be incorrect,
-	 * as the FW will bundle multiple packets as long as their sizes
-	 * fit within the same aligned length (pkt->alloc_len).
-	 */
-	htc_hdr = (struct ath10k_htc_hdr *)skb->data;
-	pkt->act_len = le16_to_cpu(htc_hdr->len) + sizeof(*htc_hdr);
-	if (pkt->act_len > pkt->alloc_len) {
-		ath10k_warn(ar, "rx packet too large (%zu > %zu)\n",
-			    pkt->act_len, pkt->alloc_len);
-		ret = -EMSGSIZE;
-		goto out;
+	if (ret) {
+		ar_sdio->n_rx_pkts = 0;
+		ath10k_sdio_mbox_free_rx_pkt(pkt);
+		return ret;
 	}
 
-	skb_put(skb, pkt->act_len);
-
-out:
+	htc_hdr = (struct ath10k_htc_hdr *)skb->data;
+	pkt->act_len = le16_to_cpu(htc_hdr->len) + sizeof(*htc_hdr);
 	pkt->status = ret;
+	skb_put(skb, pkt->act_len);
 
 	return ret;
 }
 
-static int ath10k_sdio_mbox_rx_fetch(struct ath10k *ar)
+static int ath10k_sdio_mbox_rx_fetch_bundle(struct ath10k *ar)
 {
 	struct ath10k_sdio *ar_sdio = ath10k_sdio_priv(ar);
+	struct ath10k_sdio_rx_data *pkt;
+	struct ath10k_htc_hdr *htc_hdr;
 	int ret, i;
+	u32 pkt_offset, virt_pkt_len;
 
+	virt_pkt_len = 0;
+	for (i = 0; i < ar_sdio->n_rx_pkts; i++)
+		virt_pkt_len += ar_sdio->rx_pkts[i].alloc_len;
+
+	if (virt_pkt_len > ATH10K_SDIO_VSG_BUF_SIZE) {
+		ath10k_err(ar, "size exceeding limit %d\n", virt_pkt_len);
+		ret = -E2BIG;
+		goto err;
+	}
+
+	ret = ath10k_sdio_readsb(ar, ar_sdio->mbox_info.htc_addr,
+				 ar_sdio->vsg_buffer, virt_pkt_len);
+	if (ret) {
+		ath10k_warn(ar, "failed to read bundle packets: %d", ret);
+		goto err;
+	}
+
+	pkt_offset = 0;
 	for (i = 0; i < ar_sdio->n_rx_pkts; i++) {
-		ret = ath10k_sdio_mbox_rx_packet(ar,
-						 &ar_sdio->rx_pkts[i]);
-		if (ret)
-			goto err;
+		pkt = &ar_sdio->rx_pkts[i];
+		htc_hdr = (struct ath10k_htc_hdr *)(ar_sdio->vsg_buffer + pkt_offset);
+		pkt->act_len = le16_to_cpu(htc_hdr->len) + sizeof(*htc_hdr);
+
+		skb_put_data(pkt->skb, htc_hdr, pkt->act_len);
+		pkt->status = 0;
+		pkt_offset += pkt->alloc_len;
 	}
 
 	return 0;
 
 err:
 	/* Free all packets that was not successfully fetched. */
-	for (; i < ar_sdio->n_rx_pkts; i++)
+	for (i = 0; i < ar_sdio->n_rx_pkts; i++)
 		ath10k_sdio_mbox_free_rx_pkt(&ar_sdio->rx_pkts[i]);
 
+	ar_sdio->n_rx_pkts = 0;
+
 	return ret;
 }
 
@@ -717,7 +738,10 @@ static int ath10k_sdio_mbox_rxmsg_pending_handler(struct ath10k *ar,
 			 */
 			*done = false;
 
-		ret = ath10k_sdio_mbox_rx_fetch(ar);
+		if (ar_sdio->n_rx_pkts > 1)
+			ret = ath10k_sdio_mbox_rx_fetch_bundle(ar);
+		else
+			ret = ath10k_sdio_mbox_rx_fetch(ar);
 
 		/* Process fetched packets. This will potentially update
 		 * n_lookaheads depending on if the packets contain lookahead
@@ -2020,6 +2044,12 @@ static int ath10k_sdio_probe(struct sdio_func *func,
 		goto err_core_destroy;
 	}
 
+	ar_sdio->vsg_buffer = devm_kmalloc(ar->dev, ATH10K_SDIO_VSG_BUF_SIZE, GFP_KERNEL);
+	if (!ar_sdio->vsg_buffer) {
+		ret = -ENOMEM;
+		goto err_core_destroy;
+	}
+
 	ar_sdio->irq_data.irq_en_reg =
 		devm_kzalloc(ar->dev, sizeof(struct ath10k_sdio_irq_enable_regs),
 			     GFP_KERNEL);
diff --git a/drivers/net/wireless/ath/ath10k/sdio.h b/drivers/net/wireless/ath/ath10k/sdio.h
index b8c7ac0..8d5b09f 100644
--- a/drivers/net/wireless/ath/ath10k/sdio.h
+++ b/drivers/net/wireless/ath/ath10k/sdio.h
@@ -138,8 +138,8 @@ struct ath10k_sdio_irq_proc_regs {
 	u8 rx_lookahead_valid;
 	u8 host_int_status2;
 	u8 gmbox_rx_avail;
-	__le32 rx_lookahead[2];
-	__le32 rx_gmbox_lookahead_alias[2];
+	__le32 rx_lookahead[2 * ATH10K_HIF_MBOX_NUM_MAX];
+	__le32 int_status_enable;
 };
 
 struct ath10k_sdio_irq_enable_regs {
@@ -196,6 +196,13 @@ struct ath10k_sdio {
 	struct ath10k *ar;
 	struct ath10k_sdio_irq_data irq_data;
 
+	/* temporary buffer for sdio read.
+	 * It is allocated when probe, and used for receive bundled packets,
+	 * the read for bundled packets is not parallel, so it does not need
+	 * protected.
+	 */
+	u8 *vsg_buffer;
+
 	/* temporary buffer for BMI requests */
 	u8 *bmi_buf;
 
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v6 2/3] ath10k: change max RX bundle size from 8 to 32 for sdio
  2019-09-25  9:10 [PATCH v6 0/3] ath10k: improve throughout of RX of sdio Wen Gong
  2019-09-25  9:10 ` [PATCH v6 1/3] ath10k: enable RX bundle receive for sdio Wen Gong
@ 2019-09-25  9:10 ` Wen Gong
  2019-10-24  9:25   ` Kalle Valo
                     ` (2 more replies)
  2019-09-25  9:10 ` [PATCH v6 3/3] ath10k: add workqueue for RX path of sdio Wen Gong
  2 siblings, 3 replies; 22+ messages in thread
From: Wen Gong @ 2019-09-25  9:10 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless

The max bundle size support by firmware is 32, change it from 8 to 32
will help performance. This results in significant performance
improvement on RX path.

Tested with QCA6174 SDIO with firmware
WLAN.RMH.4.4.1-00017-QCARMSWPZ-1

Signed-off-by: Wen Gong <wgong@codeaurora.org>
---
 drivers/net/wireless/ath/ath10k/htc.h  | 12 +++++++++---
 drivers/net/wireless/ath/ath10k/sdio.c |  4 ++--
 drivers/net/wireless/ath/ath10k/sdio.h |  4 ++--
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/htc.h b/drivers/net/wireless/ath/ath10k/htc.h
index f55d3ca..7055156 100644
--- a/drivers/net/wireless/ath/ath10k/htc.h
+++ b/drivers/net/wireless/ath/ath10k/htc.h
@@ -39,7 +39,7 @@
  * 4-byte aligned.
  */
 
-#define HTC_HOST_MAX_MSG_PER_RX_BUNDLE        8
+#define HTC_HOST_MAX_MSG_PER_RX_BUNDLE        32
 
 enum ath10k_htc_tx_flags {
 	ATH10K_HTC_FLAG_NEED_CREDIT_UPDATE = 0x01,
@@ -48,10 +48,16 @@ enum ath10k_htc_tx_flags {
 
 enum ath10k_htc_rx_flags {
 	ATH10K_HTC_FLAGS_RECV_1MORE_BLOCK = 0x01,
-	ATH10K_HTC_FLAG_TRAILER_PRESENT = 0x02,
-	ATH10K_HTC_FLAG_BUNDLE_MASK     = 0xF0
+	ATH10K_HTC_FLAG_TRAILER_PRESENT = 0x02
 };
 
+#define ATH10K_HTC_FLAG_BUNDLE_MASK 0xF0
+#define ATH10K_HTC_BUNDLE_EXTRA_MASK GENMASK(3, 2)
+
+#define ATH10K_HTC_GET_BUNDLE_COUNT(flags) \
+	    (FIELD_GET(ATH10K_HTC_FLAG_BUNDLE_MASK, (flags)) +  \
+	    (FIELD_GET(ATH10K_HTC_BUNDLE_EXTRA_MASK, (flags)) << 4))
+
 struct ath10k_htc_hdr {
 	u8 eid; /* @enum ath10k_htc_ep_id */
 	u8 flags; /* @enum ath10k_htc_tx_flags, ath10k_htc_rx_flags */
diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c
index f545626..a510101 100644
--- a/drivers/net/wireless/ath/ath10k/sdio.c
+++ b/drivers/net/wireless/ath/ath10k/sdio.c
@@ -24,7 +24,7 @@
 #include "trace.h"
 #include "sdio.h"
 
-#define ATH10K_SDIO_VSG_BUF_SIZE	(32 * 1024)
+#define ATH10K_SDIO_VSG_BUF_SIZE	(64 * 1024)
 
 /* inlined helper functions */
 
@@ -494,7 +494,7 @@ static int ath10k_sdio_mbox_alloc_bundle(struct ath10k *ar,
 {
 	int ret, i;
 
-	*bndl_cnt = FIELD_GET(ATH10K_HTC_FLAG_BUNDLE_MASK, htc_hdr->flags);
+	*bndl_cnt = ATH10K_HTC_GET_BUNDLE_COUNT(htc_hdr->flags);
 
 	if (*bndl_cnt > HTC_HOST_MAX_MSG_PER_RX_BUNDLE) {
 		ath10k_warn(ar,
diff --git a/drivers/net/wireless/ath/ath10k/sdio.h b/drivers/net/wireless/ath/ath10k/sdio.h
index 8d5b09f..00bd4ca 100644
--- a/drivers/net/wireless/ath/ath10k/sdio.h
+++ b/drivers/net/wireless/ath/ath10k/sdio.h
@@ -89,10 +89,10 @@
  * to the maximum value (HTC_HOST_MAX_MSG_PER_RX_BUNDLE).
  *
  * in this case the driver must allocate
- * (HTC_HOST_MAX_MSG_PER_RX_BUNDLE * HTC_HOST_MAX_MSG_PER_RX_BUNDLE) skb's.
+ * (HTC_HOST_MAX_MSG_PER_RX_BUNDLE * 2) skb's.
  */
 #define ATH10K_SDIO_MAX_RX_MSGS \
-	(HTC_HOST_MAX_MSG_PER_RX_BUNDLE * HTC_HOST_MAX_MSG_PER_RX_BUNDLE)
+	(HTC_HOST_MAX_MSG_PER_RX_BUNDLE * 2)
 
 #define ATH10K_FIFO_TIMEOUT_AND_CHIP_CONTROL   0x00000868u
 #define ATH10K_FIFO_TIMEOUT_AND_CHIP_CONTROL_DISABLE_SLEEP_OFF 0xFFFEFFFF
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v6 3/3] ath10k: add workqueue for RX path of sdio
  2019-09-25  9:10 [PATCH v6 0/3] ath10k: improve throughout of RX of sdio Wen Gong
  2019-09-25  9:10 ` [PATCH v6 1/3] ath10k: enable RX bundle receive for sdio Wen Gong
  2019-09-25  9:10 ` [PATCH v6 2/3] ath10k: change max RX bundle size from 8 to 32 " Wen Gong
@ 2019-09-25  9:10 ` Wen Gong
  2019-10-24  8:14   ` Kalle Valo
  2019-10-31  9:08   ` Kalle Valo
  2 siblings, 2 replies; 22+ messages in thread
From: Wen Gong @ 2019-09-25  9:10 UTC (permalink / raw)
  To: ath10k; +Cc: linux-wireless

For RX, it has two parts, one is to read data from sdio, another
is to indicate the packets to upper stack. Recently it has only
one thread to do all RX things, it results that it is sequential
for RX and low throughout, change RX to parallel for the two parts
will increase throughout.

This patch move the indication to a workqueue, it results in
significant performance improvement on RX path.

Udp rx throughout is 200Mbps without this patch, and it arrives
400Mbps with this patch.

Tested with QCA6174 SDIO with firmware
WLAN.RMH.4.4.1-00017-QCARMSWPZ-1

Signed-off-by: Wen Gong <wgong@codeaurora.org>
---
 drivers/net/wireless/ath/ath10k/sdio.c | 35 +++++++++++++++++++++++++++++++---
 drivers/net/wireless/ath/ath10k/sdio.h | 11 +++++++++++
 2 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c
index a510101..ff02833 100644
--- a/drivers/net/wireless/ath/ath10k/sdio.c
+++ b/drivers/net/wireless/ath/ath10k/sdio.c
@@ -419,6 +419,7 @@ static int ath10k_sdio_mbox_rx_process_packets(struct ath10k *ar,
 	struct ath10k_htc *htc = &ar->htc;
 	struct ath10k_sdio_rx_data *pkt;
 	struct ath10k_htc_ep *ep;
+	struct ath10k_skb_cb *cb;
 	enum ath10k_htc_ep_id id;
 	int ret, i, *n_lookahead_local;
 	u32 *lookaheads_local;
@@ -464,10 +465,16 @@ static int ath10k_sdio_mbox_rx_process_packets(struct ath10k *ar,
 		if (ret)
 			goto out;
 
-		if (!pkt->trailer_only)
-			ep->ep_ops.ep_rx_complete(ar_sdio->ar, pkt->skb);
-		else
+		if (!pkt->trailer_only) {
+			cb = ATH10K_SKB_CB(pkt->skb);
+			cb->eid = id;
+
+			skb_queue_tail(&ar_sdio->rx_head, pkt->skb);
+			queue_work(ar->workqueue_aux,
+				   &ar_sdio->async_work_rx);
+		} else {
 			kfree_skb(pkt->skb);
+		}
 
 		/* The RX complete handler now owns the skb...*/
 		pkt->skb = NULL;
@@ -1317,6 +1324,25 @@ static void __ath10k_sdio_write_async(struct ath10k *ar,
 	ath10k_sdio_free_bus_req(ar, req);
 }
 
+static void ath10k_rx_indication_async_work(struct work_struct *work)
+{
+	struct ath10k_sdio *ar_sdio = container_of(work, struct ath10k_sdio,
+						   async_work_rx);
+	struct ath10k *ar = ar_sdio->ar;
+	struct ath10k_htc_ep *ep;
+	struct ath10k_skb_cb *cb;
+	struct sk_buff *skb;
+
+	while (true) {
+		skb = skb_dequeue(&ar_sdio->rx_head);
+		if (!skb)
+			break;
+		cb = ATH10K_SKB_CB(skb);
+		ep = &ar->htc.endpoint[cb->eid];
+		ep->ep_ops.ep_rx_complete(ar, skb);
+	}
+}
+
 static void ath10k_sdio_write_async_work(struct work_struct *work)
 {
 	struct ath10k_sdio *ar_sdio = container_of(work, struct ath10k_sdio,
@@ -2087,6 +2113,9 @@ static int ath10k_sdio_probe(struct sdio_func *func,
 	for (i = 0; i < ATH10K_SDIO_BUS_REQUEST_MAX_NUM; i++)
 		ath10k_sdio_free_bus_req(ar, &ar_sdio->bus_req[i]);
 
+	skb_queue_head_init(&ar_sdio->rx_head);
+	INIT_WORK(&ar_sdio->async_work_rx, ath10k_rx_indication_async_work);
+
 	dev_id_base = FIELD_GET(QCA_MANUFACTURER_ID_BASE, id->device);
 	switch (dev_id_base) {
 	case QCA_MANUFACTURER_ID_AR6005_BASE:
diff --git a/drivers/net/wireless/ath/ath10k/sdio.h b/drivers/net/wireless/ath/ath10k/sdio.h
index 00bd4ca..8aa0dbc 100644
--- a/drivers/net/wireless/ath/ath10k/sdio.h
+++ b/drivers/net/wireless/ath/ath10k/sdio.h
@@ -98,6 +98,12 @@
 #define ATH10K_FIFO_TIMEOUT_AND_CHIP_CONTROL_DISABLE_SLEEP_OFF 0xFFFEFFFF
 #define ATH10K_FIFO_TIMEOUT_AND_CHIP_CONTROL_DISABLE_SLEEP_ON 0x10000
 
+struct ath10k_sdio_rx_request {
+	struct list_head list;
+	struct sk_buff *skb;
+	struct ath10k_htc_ep *ep;
+};
+
 struct ath10k_sdio_bus_request {
 	struct list_head list;
 
@@ -187,6 +193,9 @@ struct ath10k_sdio {
 	struct ath10k_sdio_bus_request bus_req[ATH10K_SDIO_BUS_REQUEST_MAX_NUM];
 	/* free list of bus requests */
 	struct list_head bus_req_freeq;
+
+	struct sk_buff_head rx_head;
+
 	/* protects access to bus_req_freeq */
 	spinlock_t lock;
 
@@ -213,6 +222,8 @@ struct ath10k_sdio {
 	struct list_head wr_asyncq;
 	/* protects access to wr_asyncq */
 	spinlock_t wr_async_lock;
+
+	struct work_struct async_work_rx;
 };
 
 static inline struct ath10k_sdio *ath10k_sdio_priv(struct ath10k *ar)
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 3/3] ath10k: add workqueue for RX path of sdio
  2019-09-25  9:10 ` [PATCH v6 3/3] ath10k: add workqueue for RX path of sdio Wen Gong
@ 2019-10-24  8:14   ` Kalle Valo
  2019-10-31  9:08   ` Kalle Valo
  1 sibling, 0 replies; 22+ messages in thread
From: Kalle Valo @ 2019-10-24  8:14 UTC (permalink / raw)
  To: Wen Gong; +Cc: ath10k, linux-wireless

Wen Gong <wgong@codeaurora.org> writes:

> For RX, it has two parts, one is to read data from sdio, another
> is to indicate the packets to upper stack. Recently it has only
> one thread to do all RX things, it results that it is sequential
> for RX and low throughout, change RX to parallel for the two parts
> will increase throughout.
>
> This patch move the indication to a workqueue, it results in
> significant performance improvement on RX path.
>
> Udp rx throughout is 200Mbps without this patch, and it arrives
> 400Mbps with this patch.
>
> Tested with QCA6174 SDIO with firmware
> WLAN.RMH.4.4.1-00017-QCARMSWPZ-1
>
> Signed-off-by: Wen Gong <wgong@codeaurora.org>

[...]

> --- a/drivers/net/wireless/ath/ath10k/sdio.h
> +++ b/drivers/net/wireless/ath/ath10k/sdio.h
> @@ -98,6 +98,12 @@
>  #define ATH10K_FIFO_TIMEOUT_AND_CHIP_CONTROL_DISABLE_SLEEP_OFF 0xFFFEFFFF
>  #define ATH10K_FIFO_TIMEOUT_AND_CHIP_CONTROL_DISABLE_SLEEP_ON 0x10000
>  
> +struct ath10k_sdio_rx_request {
> +	struct list_head list;
> +	struct sk_buff *skb;
> +	struct ath10k_htc_ep *ep;
> +};

This is not used anymore, I removed it in the pending branch.

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 1/3] ath10k: enable RX bundle receive for sdio
  2019-09-25  9:10 ` [PATCH v6 1/3] ath10k: enable RX bundle receive for sdio Wen Gong
@ 2019-10-24  8:29   ` Kalle Valo
  2019-11-25 11:49   ` Kalle Valo
  1 sibling, 0 replies; 22+ messages in thread
From: Kalle Valo @ 2019-10-24  8:29 UTC (permalink / raw)
  To: Wen Gong; +Cc: ath10k, linux-wireless

Wen Gong <wgong@codeaurora.org> writes:

> From: Alagu Sankar <alagusankar@silex-india.com>
>
> The existing implementation of initiating multiple sdio transfers for
> receive bundling is slowing down the receive speed. Combining the
> transfers using a bundle method would be ideal.
>
> The transmission utilization ratio for sdio bus for small packet is
> slow, because the space and time cost for sdio bus is same for large
> length packet and small length packet. So the speed of data for large
> length packet is higher than small length.
>
> Test result of different length of data:
> data packet(byte)   cost time(us)   calculated rate(Mbps)
>       256               28                73
>       512               33               124
>      1024               35               234
>      1792               45               318
>     14336              168               682
>     28672              333               688
>     57344              660               695
>
> Tested with QCA6174 SDIO with firmware
> WLAN.RMH.4.4.1-00017-QCARMSWPZ-1
>
> Signed-off-by: Alagu Sankar <alagusankar@silex-india.com>
> Signed-off-by: Wen Gong <wgong@codeaurora.org>

[...]

> +static int ath10k_sdio_mbox_rx_fetch_bundle(struct ath10k *ar)
>  {
>  	struct ath10k_sdio *ar_sdio = ath10k_sdio_priv(ar);
> +	struct ath10k_sdio_rx_data *pkt;
> +	struct ath10k_htc_hdr *htc_hdr;
>  	int ret, i;
> +	u32 pkt_offset, virt_pkt_len;
>  
> +	virt_pkt_len = 0;
> +	for (i = 0; i < ar_sdio->n_rx_pkts; i++)
> +		virt_pkt_len += ar_sdio->rx_pkts[i].alloc_len;
> +
> +	if (virt_pkt_len > ATH10K_SDIO_VSG_BUF_SIZE) {
> +		ath10k_err(ar, "size exceeding limit %d\n", virt_pkt_len);
> +		ret = -E2BIG;
> +		goto err;
> +	}

This should use ath10k_warn(), fixed in the pending branch. I also
improved the log message.

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 2/3] ath10k: change max RX bundle size from 8 to 32 for sdio
  2019-09-25  9:10 ` [PATCH v6 2/3] ath10k: change max RX bundle size from 8 to 32 " Wen Gong
@ 2019-10-24  9:25   ` Kalle Valo
  2019-10-24  9:40     ` Wen Gong
  2019-10-24  9:30   ` Kalle Valo
  2019-10-24 10:05   ` Kalle Valo
  2 siblings, 1 reply; 22+ messages in thread
From: Kalle Valo @ 2019-10-24  9:25 UTC (permalink / raw)
  To: Wen Gong; +Cc: ath10k, linux-wireless

Wen Gong <wgong@codeaurora.org> writes:

> The max bundle size support by firmware is 32, change it from 8 to 32
> will help performance. This results in significant performance
> improvement on RX path.
>
> Tested with QCA6174 SDIO with firmware
> WLAN.RMH.4.4.1-00017-QCARMSWPZ-1
>
> Signed-off-by: Wen Gong <wgong@codeaurora.org>
> ---
>  drivers/net/wireless/ath/ath10k/htc.h  | 12 +++++++++---
>  drivers/net/wireless/ath/ath10k/sdio.c |  4 ++--
>  drivers/net/wireless/ath/ath10k/sdio.h |  4 ++--
>  3 files changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath10k/htc.h b/drivers/net/wireless/ath/ath10k/htc.h
> index f55d3ca..7055156 100644
> --- a/drivers/net/wireless/ath/ath10k/htc.h
> +++ b/drivers/net/wireless/ath/ath10k/htc.h
> @@ -39,7 +39,7 @@
>   * 4-byte aligned.
>   */
>  
> -#define HTC_HOST_MAX_MSG_PER_RX_BUNDLE        8
> +#define HTC_HOST_MAX_MSG_PER_RX_BUNDLE        32

So how do I know that this change doesn't break any other hardware? I
did a quick review and I think it's safe, but the commit log mentions
nothing about this.

Please clarify and I can update the commit log.

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 2/3] ath10k: change max RX bundle size from 8 to 32 for sdio
  2019-09-25  9:10 ` [PATCH v6 2/3] ath10k: change max RX bundle size from 8 to 32 " Wen Gong
  2019-10-24  9:25   ` Kalle Valo
@ 2019-10-24  9:30   ` Kalle Valo
  2019-10-24  9:47     ` Wen Gong
  2019-10-24 10:05   ` Kalle Valo
  2 siblings, 1 reply; 22+ messages in thread
From: Kalle Valo @ 2019-10-24  9:30 UTC (permalink / raw)
  To: Wen Gong; +Cc: ath10k, linux-wireless

Wen Gong <wgong@codeaurora.org> writes:

> The max bundle size support by firmware is 32, change it from 8 to 32
> will help performance. This results in significant performance
> improvement on RX path.
>
> Tested with QCA6174 SDIO with firmware
> WLAN.RMH.4.4.1-00017-QCARMSWPZ-1
>
> Signed-off-by: Wen Gong <wgong@codeaurora.org>

[...]

> --- a/drivers/net/wireless/ath/ath10k/sdio.c
> +++ b/drivers/net/wireless/ath/ath10k/sdio.c
> @@ -24,7 +24,7 @@
>  #include "trace.h"
>  #include "sdio.h"
>  
> -#define ATH10K_SDIO_VSG_BUF_SIZE	(32 * 1024)
> +#define ATH10K_SDIO_VSG_BUF_SIZE	(64 * 1024)

Is allocating 64 kb with kmalloc() reliable, especially on smaller
systems? I hope it is, but checking if someone else knows better. We
only do this only once in probe(), though.

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 2/3] ath10k: change max RX bundle size from 8 to 32 for sdio
  2019-10-24  9:25   ` Kalle Valo
@ 2019-10-24  9:40     ` Wen Gong
  2019-10-24 10:14       ` Kalle Valo
  0 siblings, 1 reply; 22+ messages in thread
From: Wen Gong @ 2019-10-24  9:40 UTC (permalink / raw)
  To: Kalle Valo; +Cc: ath10k, linux-wireless

On 2019-10-24 17:25, Kalle Valo wrote:
> Wen Gong <wgong@codeaurora.org> writes:
> 
>> The max bundle size support by firmware is 32, change it from 8 to 32
>> will help performance. This results in significant performance
>> improvement on RX path.
>> 
>> Tested with QCA6174 SDIO with firmware
>> WLAN.RMH.4.4.1-00017-QCARMSWPZ-1
>> 
>> Signed-off-by: Wen Gong <wgong@codeaurora.org>
>> ---
>>  drivers/net/wireless/ath/ath10k/htc.h  | 12 +++++++++---
>>  drivers/net/wireless/ath/ath10k/sdio.c |  4 ++--
>>  drivers/net/wireless/ath/ath10k/sdio.h |  4 ++--
>>  3 files changed, 13 insertions(+), 7 deletions(-)
>> 
>> diff --git a/drivers/net/wireless/ath/ath10k/htc.h 
>> b/drivers/net/wireless/ath/ath10k/htc.h
>> index f55d3ca..7055156 100644
>> --- a/drivers/net/wireless/ath/ath10k/htc.h
>> +++ b/drivers/net/wireless/ath/ath10k/htc.h
>> @@ -39,7 +39,7 @@
>>   * 4-byte aligned.
>>   */
>> 
>> -#define HTC_HOST_MAX_MSG_PER_RX_BUNDLE        8
>> +#define HTC_HOST_MAX_MSG_PER_RX_BUNDLE        32
> 
> So how do I know that this change doesn't break any other hardware? I
> did a quick review and I think it's safe, but the commit log mentions
> nothing about this.
the real max rx bundle is decided in ath10k_htc_wait_target.
it is the min value of HTC_HOST_MAX_MSG_PER_RX_BUNDLE and the value 
reported from firmware.
htc->max_msgs_per_htc_bundle =
			min_t(u8, msg->ready_ext.max_msgs_per_htc_bundle,
			      HTC_HOST_MAX_MSG_PER_RX_BUNDLE);
> 
> Please clarify and I can update the commit log.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 2/3] ath10k: change max RX bundle size from 8 to 32 for sdio
  2019-10-24  9:30   ` Kalle Valo
@ 2019-10-24  9:47     ` Wen Gong
  0 siblings, 0 replies; 22+ messages in thread
From: Wen Gong @ 2019-10-24  9:47 UTC (permalink / raw)
  To: Kalle Valo; +Cc: ath10k, linux-wireless

On 2019-10-24 17:30, Kalle Valo wrote:
> Wen Gong <wgong@codeaurora.org> writes:
> 
>> The max bundle size support by firmware is 32, change it from 8 to 32
>> will help performance. This results in significant performance
>> improvement on RX path.
>> 
>> Tested with QCA6174 SDIO with firmware
>> WLAN.RMH.4.4.1-00017-QCARMSWPZ-1
>> 
>> Signed-off-by: Wen Gong <wgong@codeaurora.org>
> 
> [...]
> 
>> --- a/drivers/net/wireless/ath/ath10k/sdio.c
>> +++ b/drivers/net/wireless/ath/ath10k/sdio.c
>> @@ -24,7 +24,7 @@
>>  #include "trace.h"
>>  #include "sdio.h"
>> 
>> -#define ATH10K_SDIO_VSG_BUF_SIZE	(32 * 1024)
>> +#define ATH10K_SDIO_VSG_BUF_SIZE	(64 * 1024)
> 
> Is allocating 64 kb with kmalloc() reliable, especially on smaller
> systems? I hope it is, but checking if someone else knows better. We
> only do this only once in probe(), though.
rx packet is more than 1500 bytes for performance test, so for 32 
packets, 32*1024 is not enough.
yes, it is allocated only one time for probe.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 2/3] ath10k: change max RX bundle size from 8 to 32 for sdio
  2019-09-25  9:10 ` [PATCH v6 2/3] ath10k: change max RX bundle size from 8 to 32 " Wen Gong
  2019-10-24  9:25   ` Kalle Valo
  2019-10-24  9:30   ` Kalle Valo
@ 2019-10-24 10:05   ` Kalle Valo
  2019-10-24 10:48     ` Wen Gong
  2 siblings, 1 reply; 22+ messages in thread
From: Kalle Valo @ 2019-10-24 10:05 UTC (permalink / raw)
  To: Wen Gong; +Cc: ath10k, linux-wireless

Wen Gong <wgong@codeaurora.org> writes:

> The max bundle size support by firmware is 32, change it from 8 to 32
> will help performance. This results in significant performance
> improvement on RX path.
>
> Tested with QCA6174 SDIO with firmware
> WLAN.RMH.4.4.1-00017-QCARMSWPZ-1
>
> Signed-off-by: Wen Gong <wgong@codeaurora.org>

[...]

> --- a/drivers/net/wireless/ath/ath10k/htc.h
> +++ b/drivers/net/wireless/ath/ath10k/htc.h
> @@ -39,7 +39,7 @@
>   * 4-byte aligned.
>   */
>  
> -#define HTC_HOST_MAX_MSG_PER_RX_BUNDLE        8
> +#define HTC_HOST_MAX_MSG_PER_RX_BUNDLE        32
>  
>  enum ath10k_htc_tx_flags {
>  	ATH10K_HTC_FLAG_NEED_CREDIT_UPDATE = 0x01,
> @@ -48,10 +48,16 @@ enum ath10k_htc_tx_flags {
>  
>  enum ath10k_htc_rx_flags {
>  	ATH10K_HTC_FLAGS_RECV_1MORE_BLOCK = 0x01,
> -	ATH10K_HTC_FLAG_TRAILER_PRESENT = 0x02,
> -	ATH10K_HTC_FLAG_BUNDLE_MASK     = 0xF0
> +	ATH10K_HTC_FLAG_TRAILER_PRESENT = 0x02
>  };

I left the comma in ATH10K_HTC_FLAG_TRAILER_PRESENT to make the diff cleaner.

> +#define ATH10K_HTC_FLAG_BUNDLE_MASK 0xF0
> +#define ATH10K_HTC_BUNDLE_EXTRA_MASK GENMASK(3, 2)
> +
> +#define ATH10K_HTC_GET_BUNDLE_COUNT(flags) \
> +	    (FIELD_GET(ATH10K_HTC_FLAG_BUNDLE_MASK, (flags)) +  \
> +	    (FIELD_GET(ATH10K_HTC_BUNDLE_EXTRA_MASK, (flags)) << 4))

I think I asked you about the shift of 4 bits earlier but now I figured
it out (I hope) and documented it like this:

#define ATH10K_HTC_FLAG_BUNDLE_MASK GENMASK(7,4)

/* bits 2-3 are for extra bundle count bits 4-5 */
#define ATH10K_HTC_BUNDLE_EXTRA_MASK GENMASK(3, 2)
#define ATH10K_HTC_BUNDLE_EXTRA_SHIFT 4

static inline unsigned int ath10k_htc_get_bundle_count(u8 flags)
{
	unsigned int count, extra_count;

	count = FIELD_GET(ATH10K_HTC_FLAG_BUNDLE_MASK, flags);
	extra_count = FIELD_GET(ATH10K_HTC_BUNDLE_EXTRA_MASK, flags) <<
		ATH10K_HTC_BUNDLE_EXTRA_SHIFT;

	return count + extra_count;
}

As you can see I also changed the macro to a function, as I prefer C
over CPP :) And changed ATH10K_HTC_FLAG_BUNDLE_MASK to use GENMASK().

But this only compiled tested, please do properly test the patches from
pending branch and let me know if I broke something:

https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=afd85ca1b086695cfd26bf484442eaf3bccb6bdd

https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=4225b4d50a4f6a1159dc3316d068398f1b5edb57

https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=911e0fc846cfc46fb4ccd1d223cb153681ff05bd

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 2/3] ath10k: change max RX bundle size from 8 to 32 for sdio
  2019-10-24  9:40     ` Wen Gong
@ 2019-10-24 10:14       ` Kalle Valo
  0 siblings, 0 replies; 22+ messages in thread
From: Kalle Valo @ 2019-10-24 10:14 UTC (permalink / raw)
  To: Wen Gong; +Cc: ath10k, linux-wireless

Wen Gong <wgong@codeaurora.org> writes:

> On 2019-10-24 17:25, Kalle Valo wrote:
>> Wen Gong <wgong@codeaurora.org> writes:
>>
>>> The max bundle size support by firmware is 32, change it from 8 to 32
>>> will help performance. This results in significant performance
>>> improvement on RX path.
>>>
>>> Tested with QCA6174 SDIO with firmware
>>> WLAN.RMH.4.4.1-00017-QCARMSWPZ-1
>>>
>>> Signed-off-by: Wen Gong <wgong@codeaurora.org>
>>> ---
>>>  drivers/net/wireless/ath/ath10k/htc.h  | 12 +++++++++---
>>>  drivers/net/wireless/ath/ath10k/sdio.c |  4 ++--
>>>  drivers/net/wireless/ath/ath10k/sdio.h |  4 ++--
>>>  3 files changed, 13 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/net/wireless/ath/ath10k/htc.h
>>> b/drivers/net/wireless/ath/ath10k/htc.h
>>> index f55d3ca..7055156 100644
>>> --- a/drivers/net/wireless/ath/ath10k/htc.h
>>> +++ b/drivers/net/wireless/ath/ath10k/htc.h
>>> @@ -39,7 +39,7 @@
>>>   * 4-byte aligned.
>>>   */
>>>
>>> -#define HTC_HOST_MAX_MSG_PER_RX_BUNDLE        8
>>> +#define HTC_HOST_MAX_MSG_PER_RX_BUNDLE        32
>>
>> So how do I know that this change doesn't break any other hardware? I
>> did a quick review and I think it's safe, but the commit log mentions
>> nothing about this.
>
> the real max rx bundle is decided in ath10k_htc_wait_target.
> it is the min value of HTC_HOST_MAX_MSG_PER_RX_BUNDLE and the value
> reported from firmware.
> htc->max_msgs_per_htc_bundle =
> 			min_t(u8, msg->ready_ext.max_msgs_per_htc_bundle,
> 			      HTC_HOST_MAX_MSG_PER_RX_BUNDLE);

And we assume that no other firmware than QCA6174 SDIO uses value bigger
than 8? Because if there is a such firmware using, for example, value 9
this might cause a regression.

Anyway, I added this comment to the commit log:

  The real max rx bundle is decided in ath10k_htc_wait_target(), it is
  the min value of HTC_HOST_MAX_MSG_PER_RX_BUNDLE and the value reported
  from firmware. So this change shouldn't cause any regressions with
  other hardware supported by ath10k.

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 2/3] ath10k: change max RX bundle size from 8 to 32 for sdio
  2019-10-24 10:05   ` Kalle Valo
@ 2019-10-24 10:48     ` Wen Gong
  2019-10-31  9:09       ` Kalle Valo
  0 siblings, 1 reply; 22+ messages in thread
From: Wen Gong @ 2019-10-24 10:48 UTC (permalink / raw)
  To: Kalle Valo; +Cc: ath10k, linux-wireless

On 2019-10-24 18:05, Kalle Valo wrote:
> Wen Gong <wgong@codeaurora.org> writes:

> As you can see I also changed the macro to a function, as I prefer C
> over CPP :) And changed ATH10K_HTC_FLAG_BUNDLE_MASK to use GENMASK().
> 
yes.
> But this only compiled tested, please do properly test the patches from
> pending branch and let me know if I broke something:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=afd85ca1b086695cfd26bf484442eaf3bccb6bdd
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=4225b4d50a4f6a1159dc3316d068398f1b5edb57
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=911e0fc846cfc46fb4ccd1d223cb153681ff05bd

I will test the 3 patches.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 3/3] ath10k: add workqueue for RX path of sdio
  2019-09-25  9:10 ` [PATCH v6 3/3] ath10k: add workqueue for RX path of sdio Wen Gong
  2019-10-24  8:14   ` Kalle Valo
@ 2019-10-31  9:08   ` Kalle Valo
  2019-11-01  7:42     ` Wen Gong
  1 sibling, 1 reply; 22+ messages in thread
From: Kalle Valo @ 2019-10-31  9:08 UTC (permalink / raw)
  To: Wen Gong; +Cc: ath10k, linux-wireless

Wen Gong <wgong@codeaurora.org> writes:

> For RX, it has two parts, one is to read data from sdio, another
> is to indicate the packets to upper stack. Recently it has only
> one thread to do all RX things, it results that it is sequential
> for RX and low throughout, change RX to parallel for the two parts
> will increase throughout.
>
> This patch move the indication to a workqueue, it results in
> significant performance improvement on RX path.
>
> Udp rx throughout is 200Mbps without this patch, and it arrives
> 400Mbps with this patch.
>
> Tested with QCA6174 SDIO with firmware
> WLAN.RMH.4.4.1-00017-QCARMSWPZ-1
>
> Signed-off-by: Wen Gong <wgong@codeaurora.org>

[...]

> +static void ath10k_rx_indication_async_work(struct work_struct *work)
> +{
> +	struct ath10k_sdio *ar_sdio = container_of(work, struct ath10k_sdio,
> +						   async_work_rx);
> +	struct ath10k *ar = ar_sdio->ar;
> +	struct ath10k_htc_ep *ep;
> +	struct ath10k_skb_cb *cb;
> +	struct sk_buff *skb;
> +
> +	while (true) {
> +		skb = skb_dequeue(&ar_sdio->rx_head);
> +		if (!skb)
> +			break;
> +		cb = ATH10K_SKB_CB(skb);
> +		ep = &ar->htc.endpoint[cb->eid];
> +		ep->ep_ops.ep_rx_complete(ar, skb);
> +	}
> +}

I just realised that this is RX path so we should use ATH10K_SKB_RXCB()
instead. I made the change below to this commit in pending branch:

https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=28da1fe7a3ffa5c55c52328c21165d9efdf2e94c

diff --git a/drivers/net/wireless/ath/ath10k/core.h b/drivers/net/wireless/ath/ath10k/core.h
index af68eb5d0776..c5407f5080b2 100644
--- a/drivers/net/wireless/ath/ath10k/core.h
+++ b/drivers/net/wireless/ath/ath10k/core.h
@@ -124,6 +124,7 @@ struct ath10k_skb_cb {
 struct ath10k_skb_rxcb {
 	dma_addr_t paddr;
 	struct hlist_node hlist;
+	u8 eid;
 };
 
 static inline struct ath10k_skb_cb *ATH10K_SKB_CB(struct sk_buff *skb)
diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c
index c34637881219..c7d09b07a382 100644
--- a/drivers/net/wireless/ath/ath10k/sdio.c
+++ b/drivers/net/wireless/ath/ath10k/sdio.c
@@ -419,7 +419,7 @@ static int ath10k_sdio_mbox_rx_process_packets(struct ath10k *ar,
 	struct ath10k_htc *htc = &ar->htc;
 	struct ath10k_sdio_rx_data *pkt;
 	struct ath10k_htc_ep *ep;
-	struct ath10k_skb_cb *cb;
+	struct ath10k_skb_rxcb *cb;
 	enum ath10k_htc_ep_id id;
 	int ret, i, *n_lookahead_local;
 	u32 *lookaheads_local;
@@ -466,7 +466,7 @@ static int ath10k_sdio_mbox_rx_process_packets(struct ath10k *ar,
 			goto out;
 
 		if (!pkt->trailer_only) {
-			cb = ATH10K_SKB_CB(pkt->skb);
+			cb = ATH10K_SKB_RXCB(pkt->skb);
 			cb->eid = id;
 
 			skb_queue_tail(&ar_sdio->rx_head, pkt->skb);
@@ -1333,14 +1333,14 @@ static void ath10k_rx_indication_async_work(struct work_struct *work)
 						   async_work_rx);
 	struct ath10k *ar = ar_sdio->ar;
 	struct ath10k_htc_ep *ep;
-	struct ath10k_skb_cb *cb;
+	struct ath10k_skb_rxcb *cb;
 	struct sk_buff *skb;
 
 	while (true) {
 		skb = skb_dequeue(&ar_sdio->rx_head);
 		if (!skb)
 			break;
-		cb = ATH10K_SKB_CB(skb);
+		cb = ATH10K_SKB_RXCB(skb);
 		ep = &ar->htc.endpoint[cb->eid];
 		ep->ep_ops.ep_rx_complete(ar, skb);
 	}

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 2/3] ath10k: change max RX bundle size from 8 to 32 for sdio
  2019-10-24 10:48     ` Wen Gong
@ 2019-10-31  9:09       ` Kalle Valo
  2019-11-01  7:41         ` Wen Gong
  0 siblings, 1 reply; 22+ messages in thread
From: Kalle Valo @ 2019-10-31  9:09 UTC (permalink / raw)
  To: Wen Gong; +Cc: linux-wireless, ath10k

Wen Gong <wgong@codeaurora.org> writes:

> On 2019-10-24 18:05, Kalle Valo wrote:
>> Wen Gong <wgong@codeaurora.org> writes:
>
>> As you can see I also changed the macro to a function, as I prefer C
>> over CPP :) And changed ATH10K_HTC_FLAG_BUNDLE_MASK to use GENMASK().
>>
> yes.
>> But this only compiled tested, please do properly test the patches from
>> pending branch and let me know if I broke something:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=afd85ca1b086695cfd26bf484442eaf3bccb6bdd
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=4225b4d50a4f6a1159dc3316d068398f1b5edb57
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=911e0fc846cfc46fb4ccd1d223cb153681ff05bd
>
> I will test the 3 patches.

Did you have a chance to test them? Do note that I did one minor change
today:

https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=28da1fe7a3ffa5c55c52328c21165d9efdf2e94c

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 2/3] ath10k: change max RX bundle size from 8 to 32 for sdio
  2019-10-31  9:09       ` Kalle Valo
@ 2019-11-01  7:41         ` Wen Gong
  0 siblings, 0 replies; 22+ messages in thread
From: Wen Gong @ 2019-11-01  7:41 UTC (permalink / raw)
  To: Kalle Valo; +Cc: linux-wireless, ath10k

On 2019-10-31 17:09, Kalle Valo wrote:

>> I will test the 3 patches.
> 
> Did you have a chance to test them? Do note that I did one minor change
> today:
Kalle,
Yes, I will test the 7 patches together later.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=28da1fe7a3ffa5c55c52328c21165d9efdf2e94c

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 3/3] ath10k: add workqueue for RX path of sdio
  2019-10-31  9:08   ` Kalle Valo
@ 2019-11-01  7:42     ` Wen Gong
  2019-11-11 10:47       ` Wen Gong
  0 siblings, 1 reply; 22+ messages in thread
From: Wen Gong @ 2019-11-01  7:42 UTC (permalink / raw)
  To: Kalle Valo; +Cc: ath10k, linux-wireless

On 2019-10-31 17:08, Kalle Valo wrote:
、> I just realised that this is RX path so we should use 
ATH10K_SKB_RXCB()
> instead. I made the change below to this commit in pending branch:
> 
I will test with the new patch together with other performance patches.
> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=28da1fe7a3ffa5c55c52328c21165d9efdf2e94c
> 
> diff --git a/drivers/net/wireless/ath/ath10k/core.h
> b/drivers/net/wireless/ath/ath10k/core.h
> index af68eb5d0776..c5407f5080b2 100644
> --- a/drivers/net/wireless/ath/ath10k/core.h
> +++ b/drivers/net/wireless/ath/ath10k/core.h
> @@ -124,6 +124,7 @@ struct ath10k_skb_cb {
>  struct ath10k_skb_rxcb {
>  	dma_addr_t paddr;
>  	struct hlist_node hlist;
> +	u8 eid;
>  };
> 
>  static inline struct ath10k_skb_cb *ATH10K_SKB_CB(struct sk_buff *skb)
> diff --git a/drivers/net/wireless/ath/ath10k/sdio.c
> b/drivers/net/wireless/ath/ath10k/sdio.c
> index c34637881219..c7d09b07a382 100644
> --- a/drivers/net/wireless/ath/ath10k/sdio.c
> +++ b/drivers/net/wireless/ath/ath10k/sdio.c
> @@ -419,7 +419,7 @@ static int
> ath10k_sdio_mbox_rx_process_packets(struct ath10k *ar,
>  	struct ath10k_htc *htc = &ar->htc;
>  	struct ath10k_sdio_rx_data *pkt;
>  	struct ath10k_htc_ep *ep;
> -	struct ath10k_skb_cb *cb;
> +	struct ath10k_skb_rxcb *cb;
>  	enum ath10k_htc_ep_id id;
>  	int ret, i, *n_lookahead_local;
>  	u32 *lookaheads_local;
> @@ -466,7 +466,7 @@ static int
> ath10k_sdio_mbox_rx_process_packets(struct ath10k *ar,
>  			goto out;
> 
>  		if (!pkt->trailer_only) {
> -			cb = ATH10K_SKB_CB(pkt->skb);
> +			cb = ATH10K_SKB_RXCB(pkt->skb);
>  			cb->eid = id;
> 
>  			skb_queue_tail(&ar_sdio->rx_head, pkt->skb);
> @@ -1333,14 +1333,14 @@ static void
> ath10k_rx_indication_async_work(struct work_struct *work)
>  						   async_work_rx);
>  	struct ath10k *ar = ar_sdio->ar;
>  	struct ath10k_htc_ep *ep;
> -	struct ath10k_skb_cb *cb;
> +	struct ath10k_skb_rxcb *cb;
>  	struct sk_buff *skb;
> 
>  	while (true) {
>  		skb = skb_dequeue(&ar_sdio->rx_head);
>  		if (!skb)
>  			break;
> -		cb = ATH10K_SKB_CB(skb);
> +		cb = ATH10K_SKB_RXCB(skb);
>  		ep = &ar->htc.endpoint[cb->eid];
>  		ep->ep_ops.ep_rx_complete(ar, skb);
>  	}

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 3/3] ath10k: add workqueue for RX path of sdio
  2019-11-01  7:42     ` Wen Gong
@ 2019-11-11 10:47       ` Wen Gong
  2019-11-11 12:23         ` Kalle Valo
  0 siblings, 1 reply; 22+ messages in thread
From: Wen Gong @ 2019-11-11 10:47 UTC (permalink / raw)
  To: Kalle Valo; +Cc: ath10k, linux-wireless

On 2019-11-01 15:42, Wen Gong wrote:
> On 2019-10-31 17:08, Kalle Valo wrote:
> 、> I just realised that this is RX path so we should use 
> ATH10K_SKB_RXCB()
>> instead. I made the change below to this commit in pending branch:
>> 
> I will test with the new patch together with other performance patches.
Hi Kalle, I have tested with the patches of pending branch, it is 
success.
result is same with the public review which I tested before.

the patches I tested on pending branch:

ath10k: enable alt data of TX path for sdio
ath10k: add htt TX bundle for sdio
ath10k: disable TX complete indication of htt for sdio
ath10k: enable napi on RX path for sdio
ath10k: sdio: remove struct ath10k_sdio_rx_data::status
ath10k: sdio: cosmetic cleanup
ath10k: add workqueue for RX path of sdio
ath10k: change max RX bundle size from 8 to 32 for sdio
ath10k: enable RX bundle receive for sdio


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 3/3] ath10k: add workqueue for RX path of sdio
  2019-11-11 10:47       ` Wen Gong
@ 2019-11-11 12:23         ` Kalle Valo
  2019-11-13  3:37           ` Wen Gong
  0 siblings, 1 reply; 22+ messages in thread
From: Kalle Valo @ 2019-11-11 12:23 UTC (permalink / raw)
  To: Wen Gong; +Cc: ath10k, linux-wireless

Wen Gong <wgong@codeaurora.org> writes:

> On 2019-11-01 15:42, Wen Gong wrote:
>> On 2019-10-31 17:08, Kalle Valo wrote:
>> 、> I just realised that this is RX path so we should use
>> ATH10K_SKB_RXCB()
>>> instead. I made the change below to this commit in pending branch:
>>>
>> I will test with the new patch together with other performance patches.
> Hi Kalle, I have tested with the patches of pending branch, it is
> success.
> result is same with the public review which I tested before.
>
> the patches I tested on pending branch:
>
> ath10k: enable alt data of TX path for sdio
> ath10k: add htt TX bundle for sdio
> ath10k: disable TX complete indication of htt for sdio
> ath10k: enable napi on RX path for sdio
> ath10k: sdio: remove struct ath10k_sdio_rx_data::status
> ath10k: sdio: cosmetic cleanup
> ath10k: add workqueue for RX path of sdio
> ath10k: change max RX bundle size from 8 to 32 for sdio
> ath10k: enable RX bundle receive for sdio

Very good, thanks for testing.

-- 
Kalle Valo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 3/3] ath10k: add workqueue for RX path of sdio
  2019-11-11 12:23         ` Kalle Valo
@ 2019-11-13  3:37           ` Wen Gong
  2019-11-22 10:02             ` Kalle Valo
  0 siblings, 1 reply; 22+ messages in thread
From: Wen Gong @ 2019-11-13  3:37 UTC (permalink / raw)
  To: Kalle Valo; +Cc: ath10k, linux-wireless

On 2019-11-11 20:23, Kalle Valo wrote:
> Wen Gong <wgong@codeaurora.org> writes:
> 
>> On 2019-11-01 15:42, Wen Gong wrote:
>>> On 2019-10-31 17:08, Kalle Valo wrote:
>>> 、> I just realised that this is RX path so we should use
>>> ATH10K_SKB_RXCB()
>>>> instead. I made the change below to this commit in pending branch:
>>>> 
>>> I will test with the new patch together with other performance 
>>> patches.
>> Hi Kalle, I have tested with the patches of pending branch, it is
>> success.
>> result is same with the public review which I tested before.
>> 
>> the patches I tested on pending branch:
>> 
>> ath10k: enable alt data of TX path for sdio
>> ath10k: add htt TX bundle for sdio
>> ath10k: disable TX complete indication of htt for sdio
>> ath10k: enable napi on RX path for sdio
>> ath10k: sdio: remove struct ath10k_sdio_rx_data::status
>> ath10k: sdio: cosmetic cleanup
>> ath10k: add workqueue for RX path of sdio
>> ath10k: change max RX bundle size from 8 to 32 for sdio
>> ath10k: enable RX bundle receive for sdio
> 
> Very good, thanks for testing.
Hi Kalle,

this patch will trigger connect fail for PSK mode AP:
ath10k: add workqueue for RX path of sdio

I have sent patch to fix it:
ath10k: clear ieee80211_rx_status for sdio

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 3/3] ath10k: add workqueue for RX path of sdio
  2019-11-13  3:37           ` Wen Gong
@ 2019-11-22 10:02             ` Kalle Valo
  0 siblings, 0 replies; 22+ messages in thread
From: Kalle Valo @ 2019-11-22 10:02 UTC (permalink / raw)
  To: Wen Gong; +Cc: linux-wireless, ath10k

Wen Gong <wgong@codeaurora.org> writes:

> On 2019-11-11 20:23, Kalle Valo wrote:
>> Wen Gong <wgong@codeaurora.org> writes:
>>
>>> On 2019-11-01 15:42, Wen Gong wrote:
>>>> On 2019-10-31 17:08, Kalle Valo wrote:
>>>> 、> I just realised that this is RX path so we should use
>>>> ATH10K_SKB_RXCB()
>>>>> instead. I made the change below to this commit in pending branch:
>>>>>
>>>> I will test with the new patch together with other performance
>>>> patches.
>>> Hi Kalle, I have tested with the patches of pending branch, it is
>>> success.
>>> result is same with the public review which I tested before.
>>>
>>> the patches I tested on pending branch:
>>>
>>> ath10k: enable alt data of TX path for sdio
>>> ath10k: add htt TX bundle for sdio
>>> ath10k: disable TX complete indication of htt for sdio
>>> ath10k: enable napi on RX path for sdio
>>> ath10k: sdio: remove struct ath10k_sdio_rx_data::status
>>> ath10k: sdio: cosmetic cleanup
>>> ath10k: add workqueue for RX path of sdio
>>> ath10k: change max RX bundle size from 8 to 32 for sdio
>>> ath10k: enable RX bundle receive for sdio
>>
>> Very good, thanks for testing.
>
> this patch will trigger connect fail for PSK mode AP:
> ath10k: add workqueue for RX path of sdio
>
> I have sent patch to fix it:
> ath10k: clear ieee80211_rx_status for sdio

Good catch! But as this patch is not yet applied I fixed this patch
instead with this:

--- a/drivers/net/wireless/ath/ath10k/htt_rx.c
+++ b/drivers/net/wireless/ath/ath10k/htt_rx.c
@@ -2235,7 +2235,10 @@ static bool ath10k_htt_rx_proc_rx_ind_hl(struct ath10k_htt *htt,
 
        hdr = (struct ieee80211_hdr *)skb->data;
        qos = ieee80211_is_data_qos(hdr->frame_control);
+
        rx_status = IEEE80211_SKB_RXCB(skb);
+       memset(rx_status, 0, sizeof(*rx_status));
+
        rx_status->chains |= BIT(0);
        if (rx->ppdu.combined_rssi == 0) {
                /* SDIO firmware does not provide signal */

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 1/3] ath10k: enable RX bundle receive for sdio
  2019-09-25  9:10 ` [PATCH v6 1/3] ath10k: enable RX bundle receive for sdio Wen Gong
  2019-10-24  8:29   ` Kalle Valo
@ 2019-11-25 11:49   ` Kalle Valo
  1 sibling, 0 replies; 22+ messages in thread
From: Kalle Valo @ 2019-11-25 11:49 UTC (permalink / raw)
  To: Wen Gong; +Cc: ath10k, linux-wireless

Wen Gong <wgong@codeaurora.org> wrote:

> The existing implementation of initiating multiple sdio transfers for
> receive bundling is slowing down the receive speed. Combining the
> transfers using a bundle method would be ideal.
> 
> The transmission utilization ratio for sdio bus for small packet is
> slow, because the space and time cost for sdio bus is same for large
> length packet and small length packet. So the speed of data for large
> length packet is higher than small length.
> 
> Test result of different length of data:
> data packet(byte)   cost time(us)   calculated rate(Mbps)
>       256               28                73
>       512               33               124
>      1024               35               234
>      1792               45               318
>     14336              168               682
>     28672              333               688
>     57344              660               695
> 
> Tested with QCA6174 SDIO with firmware
> WLAN.RMH.4.4.1-00017-QCARMSWPZ-1
> 
> Signed-off-by: Alagu Sankar <alagusankar@silex-india.com>
> Signed-off-by: Wen Gong <wgong@codeaurora.org>
> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>

3 patches applied to ath-next branch of ath.git, thanks.

8d985555ddaa ath10k: enable RX bundle receive for sdio
224776520ead ath10k: change max RX bundle size from 8 to 32 for sdio
67654b26c903 ath10k: add workqueue for RX path of sdio

-- 
https://patchwork.kernel.org/patch/11160247/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2019-11-25 11:49 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-25  9:10 [PATCH v6 0/3] ath10k: improve throughout of RX of sdio Wen Gong
2019-09-25  9:10 ` [PATCH v6 1/3] ath10k: enable RX bundle receive for sdio Wen Gong
2019-10-24  8:29   ` Kalle Valo
2019-11-25 11:49   ` Kalle Valo
2019-09-25  9:10 ` [PATCH v6 2/3] ath10k: change max RX bundle size from 8 to 32 " Wen Gong
2019-10-24  9:25   ` Kalle Valo
2019-10-24  9:40     ` Wen Gong
2019-10-24 10:14       ` Kalle Valo
2019-10-24  9:30   ` Kalle Valo
2019-10-24  9:47     ` Wen Gong
2019-10-24 10:05   ` Kalle Valo
2019-10-24 10:48     ` Wen Gong
2019-10-31  9:09       ` Kalle Valo
2019-11-01  7:41         ` Wen Gong
2019-09-25  9:10 ` [PATCH v6 3/3] ath10k: add workqueue for RX path of sdio Wen Gong
2019-10-24  8:14   ` Kalle Valo
2019-10-31  9:08   ` Kalle Valo
2019-11-01  7:42     ` Wen Gong
2019-11-11 10:47       ` Wen Gong
2019-11-11 12:23         ` Kalle Valo
2019-11-13  3:37           ` Wen Gong
2019-11-22 10:02             ` Kalle Valo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).