linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 net 0/2] Fix crash caused by reporting inconsistent skb->len to BQL
@ 2017-04-14  3:19 sean.wang
  2017-04-14  3:19 ` [PATCH v2 net 1/2] net: ethernet: mediatek: fix inconsistency between TXD and the used buffer sean.wang
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: sean.wang @ 2017-04-14  3:19 UTC (permalink / raw)
  To: john, davem
  Cc: nbd, netdev, linux-kernel, linux-mediatek, keyhaede, Sean Wang

From: Sean Wang <sean.wang@mediatek.com>

Changes since v1:
- fix inconsistent enumeration which easily causes the potential bug

The series fixes kernel BUG caused by inconsistent SKB length reported
into BQL. The reason for inconsistent length comes from hardware BUG which
results in different port number carried on the TXD within the lifecycle of
SKB. So patch 2) is proposed for use a software way to track which port
the SKB involving instead of hardware way. And patch 1) is given for another
issue I found which causes TXD and SKB inconsistency that is not expected
in the initial logic, so it is also being corrected it in the series.

The log for the kernel BUG caused by the issue is posted as below.

[  120.825955] kernel BUG at ... lib/dynamic_queue_limits.c:26!
[  120.837684] Internal error: Oops - BUG: 0 [#1] SMP ARM
[  120.842778] Modules linked in:
[  120.845811] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc1-191576-gdbcef47 #35
[  120.853488] Hardware name: Mediatek Cortex-A7 (Device Tree)
[  120.859012] task: c1007480 task.stack: c1000000
[  120.863510] PC is at dql_completed+0x108/0x17c
[  120.867915] LR is at 0x46
[  120.870512] pc : [<c03c19c8>]    lr : [<00000046>]    psr: 80000113
[  120.870512] sp : c1001d58  ip : c1001d80  fp : c1001d7c
[  120.881895] r10: 0000003e  r9 : df6b3400  r8 : 0ed86506
[  120.887075] r7 : 00000001  r6 : 00000001  r5 : 0ed8654c  r4 : df0135d8
[  120.893546] r3 : 00000001  r2 : df016800  r1 : 0000fece  r0 : df6b3480
[  120.900018] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[  120.907093] Control: 10c5387d  Table: 9e27806a  DAC: 00000051
[  120.912789] Process swapper/0 (pid: 0, stack limit = 0xc1000218)
[  120.918744] Stack: (0xc1001d58 to 0xc1002000)

....

121.085331] 1fc0: 00000000 c0a52a28 00000000 c10855d4 c1003c58 c0a52a24 c100885c 8000406a
[  121.093444] 1fe0: 410fc073 00000000 00000000 c1001ff8 8000807c c0a009cc 00000000 00000000
[  121.101575] [<c03c19c8>] (dql_completed) from [<c04cb010>] (mtk_napi_tx+0x1d0/0x37c)
[  121.109263] [<c04cb010>] (mtk_napi_tx) from [<c05e28cc>] (net_rx_action+0x24c/0x3b8)
[  121.116951] [<c05e28cc>] (net_rx_action) from [<c010152c>] (__do_softirq+0xe4/0x35c)
[  121.124638] [<c010152c>] (__do_softirq) from [<c012a624>] (irq_exit+0xe8/0x150)
[  121.131895] [<c012a624>] (irq_exit) from [<c017750c>] (__handle_domain_irq+0x70/0xc4)
[  121.139666] [<c017750c>] (__handle_domain_irq) from [<c0101404>] (gic_handle_irq+0x58/0x9c)
[  121.147953] [<c0101404>] (gic_handle_irq) from [<c010e18c>] (__irq_svc+0x6c/0x90)
[  121.155373] Exception stack(0xc1001ef8 to 0xc1001f40)

Sean Wang (2):
  net: ethernet: mediatek: fix inconsistency between TXD and the used
    buffer
  net: ethernet: mediatek: fix inconsistency of port number carried in
    TXD

 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 31 ++++++++++++++++-------------
 drivers/net/ethernet/mediatek/mtk_eth_soc.h | 12 ++++++++---
 2 files changed, 26 insertions(+), 17 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v2 net 1/2] net: ethernet: mediatek: fix inconsistency between TXD and the used buffer
  2017-04-14  3:19 [PATCH v2 net 0/2] Fix crash caused by reporting inconsistent skb->len to BQL sean.wang
@ 2017-04-14  3:19 ` sean.wang
  2017-04-14  3:19 ` [PATCH v2 net 2/2] net: ethernet: mediatek: fix inconsistency of port number carried in TXD sean.wang
  2017-04-17 17:34 ` [PATCH v2 net 0/2] Fix crash caused by reporting inconsistent skb->len to BQL David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: sean.wang @ 2017-04-14  3:19 UTC (permalink / raw)
  To: john, davem
  Cc: nbd, netdev, linux-kernel, linux-mediatek, keyhaede, Sean Wang

From: Sean Wang <sean.wang@mediatek.com>

Fix inconsistency between the TXD descriptor and the used buffer that
would cause unexpected logic at mtk_tx_unmap() during skb housekeeping.

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 14e1bd1..48ba617 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -613,7 +613,7 @@ static int mtk_tx_map(struct sk_buff *skb, struct net_device *dev,
 	struct mtk_mac *mac = netdev_priv(dev);
 	struct mtk_eth *eth = mac->hw;
 	struct mtk_tx_dma *itxd, *txd;
-	struct mtk_tx_buf *tx_buf;
+	struct mtk_tx_buf *itx_buf, *tx_buf;
 	dma_addr_t mapped_addr;
 	unsigned int nr_frags;
 	int i, n_desc = 1;
@@ -627,8 +627,8 @@ static int mtk_tx_map(struct sk_buff *skb, struct net_device *dev,
 	fport = (mac->id + 1) << TX_DMA_FPORT_SHIFT;
 	txd4 |= fport;
 
-	tx_buf = mtk_desc_to_tx_buf(ring, itxd);
-	memset(tx_buf, 0, sizeof(*tx_buf));
+	itx_buf = mtk_desc_to_tx_buf(ring, itxd);
+	memset(itx_buf, 0, sizeof(*itx_buf));
 
 	if (gso)
 		txd4 |= TX_DMA_TSO;
@@ -647,9 +647,9 @@ static int mtk_tx_map(struct sk_buff *skb, struct net_device *dev,
 		return -ENOMEM;
 
 	WRITE_ONCE(itxd->txd1, mapped_addr);
-	tx_buf->flags |= MTK_TX_FLAGS_SINGLE0;
-	dma_unmap_addr_set(tx_buf, dma_addr0, mapped_addr);
-	dma_unmap_len_set(tx_buf, dma_len0, skb_headlen(skb));
+	itx_buf->flags |= MTK_TX_FLAGS_SINGLE0;
+	dma_unmap_addr_set(itx_buf, dma_addr0, mapped_addr);
+	dma_unmap_len_set(itx_buf, dma_len0, skb_headlen(skb));
 
 	/* TX SG offload */
 	txd = itxd;
@@ -685,10 +685,9 @@ static int mtk_tx_map(struct sk_buff *skb, struct net_device *dev,
 					       last_frag * TX_DMA_LS0));
 			WRITE_ONCE(txd->txd4, fport);
 
-			tx_buf->skb = (struct sk_buff *)MTK_DMA_DUMMY_DESC;
 			tx_buf = mtk_desc_to_tx_buf(ring, txd);
 			memset(tx_buf, 0, sizeof(*tx_buf));
-
+			tx_buf->skb = (struct sk_buff *)MTK_DMA_DUMMY_DESC;
 			tx_buf->flags |= MTK_TX_FLAGS_PAGE0;
 			dma_unmap_addr_set(tx_buf, dma_addr0, mapped_addr);
 			dma_unmap_len_set(tx_buf, dma_len0, frag_map_size);
@@ -698,7 +697,7 @@ static int mtk_tx_map(struct sk_buff *skb, struct net_device *dev,
 	}
 
 	/* store skb to cleanup */
-	tx_buf->skb = skb;
+	itx_buf->skb = skb;
 
 	WRITE_ONCE(itxd->txd4, txd4);
 	WRITE_ONCE(itxd->txd3, (TX_DMA_SWC | TX_DMA_PLEN0(skb_headlen(skb)) |
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v2 net 2/2] net: ethernet: mediatek: fix inconsistency of port number carried in TXD
  2017-04-14  3:19 [PATCH v2 net 0/2] Fix crash caused by reporting inconsistent skb->len to BQL sean.wang
  2017-04-14  3:19 ` [PATCH v2 net 1/2] net: ethernet: mediatek: fix inconsistency between TXD and the used buffer sean.wang
@ 2017-04-14  3:19 ` sean.wang
  2017-04-17 17:34 ` [PATCH v2 net 0/2] Fix crash caused by reporting inconsistent skb->len to BQL David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: sean.wang @ 2017-04-14  3:19 UTC (permalink / raw)
  To: john, davem
  Cc: nbd, netdev, linux-kernel, linux-mediatek, keyhaede, Sean Wang

From: Sean Wang <sean.wang@mediatek.com>

Fix port inconsistency on TXD due to hardware BUG that would cause
different port number is carried on the same TXD between tx_map()
and tx_unmap() with the iperf test. It would cause confusing BQL
logic which leads to kernel panic when dual GMAC runs concurrently.

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 14 +++++++++-----
 drivers/net/ethernet/mediatek/mtk_eth_soc.h | 12 +++++++++---
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 48ba617..6313c53 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -648,6 +648,8 @@ static int mtk_tx_map(struct sk_buff *skb, struct net_device *dev,
 
 	WRITE_ONCE(itxd->txd1, mapped_addr);
 	itx_buf->flags |= MTK_TX_FLAGS_SINGLE0;
+	itx_buf->flags |= (!mac->id) ? MTK_TX_FLAGS_FPORT0 :
+			  MTK_TX_FLAGS_FPORT1;
 	dma_unmap_addr_set(itx_buf, dma_addr0, mapped_addr);
 	dma_unmap_len_set(itx_buf, dma_len0, skb_headlen(skb));
 
@@ -689,6 +691,9 @@ static int mtk_tx_map(struct sk_buff *skb, struct net_device *dev,
 			memset(tx_buf, 0, sizeof(*tx_buf));
 			tx_buf->skb = (struct sk_buff *)MTK_DMA_DUMMY_DESC;
 			tx_buf->flags |= MTK_TX_FLAGS_PAGE0;
+			tx_buf->flags |= (!mac->id) ? MTK_TX_FLAGS_FPORT0 :
+					 MTK_TX_FLAGS_FPORT1;
+
 			dma_unmap_addr_set(tx_buf, dma_addr0, mapped_addr);
 			dma_unmap_len_set(tx_buf, dma_len0, frag_map_size);
 			frag_size -= frag_map_size;
@@ -1011,17 +1016,16 @@ static int mtk_poll_tx(struct mtk_eth *eth, int budget)
 
 	while ((cpu != dma) && budget) {
 		u32 next_cpu = desc->txd2;
-		int mac;
+		int mac = 0;
 
 		desc = mtk_qdma_phys_to_virt(ring, desc->txd2);
 		if ((desc->txd3 & TX_DMA_OWNER_CPU) == 0)
 			break;
 
-		mac = (desc->txd4 >> TX_DMA_FPORT_SHIFT) &
-		       TX_DMA_FPORT_MASK;
-		mac--;
-
 		tx_buf = mtk_desc_to_tx_buf(ring, desc);
+		if (tx_buf->flags & MTK_TX_FLAGS_FPORT1)
+			mac = 1;
+
 		skb = tx_buf->skb;
 		if (!skb) {
 			condition = 1;
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
index 996024d..3c46a3b 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
@@ -410,12 +410,18 @@ struct mtk_hw_stats {
 	struct u64_stats_sync	syncp;
 };
 
-/* PDMA descriptor can point at 1-2 segments. This enum allows us to track how
- * memory was allocated so that it can be freed properly
- */
 enum mtk_tx_flags {
+	/* PDMA descriptor can point at 1-2 segments. This enum allows us to
+	 * track how memory was allocated so that it can be freed properly.
+	 */
 	MTK_TX_FLAGS_SINGLE0	= 0x01,
 	MTK_TX_FLAGS_PAGE0	= 0x02,
+
+	/* MTK_TX_FLAGS_FPORTx allows tracking which port the transmitted
+	 * SKB out instead of looking up through hardware TX descriptor.
+	 */
+	MTK_TX_FLAGS_FPORT0	= 0x04,
+	MTK_TX_FLAGS_FPORT1	= 0x08,
 };
 
 /* This enum allows us to identify how the clock is defined on the array of the
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2 net 0/2] Fix crash caused by reporting inconsistent skb->len to BQL
  2017-04-14  3:19 [PATCH v2 net 0/2] Fix crash caused by reporting inconsistent skb->len to BQL sean.wang
  2017-04-14  3:19 ` [PATCH v2 net 1/2] net: ethernet: mediatek: fix inconsistency between TXD and the used buffer sean.wang
  2017-04-14  3:19 ` [PATCH v2 net 2/2] net: ethernet: mediatek: fix inconsistency of port number carried in TXD sean.wang
@ 2017-04-17 17:34 ` David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: David Miller @ 2017-04-17 17:34 UTC (permalink / raw)
  To: sean.wang; +Cc: john, nbd, netdev, linux-kernel, linux-mediatek, keyhaede

From: <sean.wang@mediatek.com>
Date: Fri, 14 Apr 2017 11:19:10 +0800

> From: Sean Wang <sean.wang@mediatek.com>
> 
> Changes since v1:
> - fix inconsistent enumeration which easily causes the potential bug

Series applied, thanks.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-04-17 17:34 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-14  3:19 [PATCH v2 net 0/2] Fix crash caused by reporting inconsistent skb->len to BQL sean.wang
2017-04-14  3:19 ` [PATCH v2 net 1/2] net: ethernet: mediatek: fix inconsistency between TXD and the used buffer sean.wang
2017-04-14  3:19 ` [PATCH v2 net 2/2] net: ethernet: mediatek: fix inconsistency of port number carried in TXD sean.wang
2017-04-17 17:34 ` [PATCH v2 net 0/2] Fix crash caused by reporting inconsistent skb->len to BQL David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).