linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] net: macb: Fixes
@ 2014-05-04 22:42 Soren Brinkmann
  2014-05-04 22:42 ` [PATCH 1/5] net: macb: Pass same size to DMA_UNMAP as used for DMA_MAP Soren Brinkmann
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Soren Brinkmann @ 2014-05-04 22:42 UTC (permalink / raw)
  To: Michal Simek, Nicolas Ferre
  Cc: Sören Brinkmann, git, linux-kernel, linux-arm-kernel, netdev


Hi Nicolas,

I think I found the cause of the issue I told you about. Looks like
driver and HW are racing (a few more details in the commit message).
On my way finding that, I found a few more minor issues which are fixed
in the first patches.
The last one is "fixing" the actual race. I don't know if I overlooked
anything here, but this seems to work fine on Zynq.

	Thanks,
	Sören


Soren Brinkmann (5):
  net: macb: Pass same size to DMA_UNMAP as used for DMA_MAP
  net: macb: Clear interrupt flags
  net: macb: Re-enable RX interrupt only when RX is done
  net: macb: Remove 'unlikely' optimization
  net: macb: Fix race between HW and driver

 drivers/net/ethernet/cadence/macb.c | 35 +++++++++++++++++------------------
 1 file changed, 17 insertions(+), 18 deletions(-)

-- 
1.9.2.1.g06c4abd


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/5] net: macb: Pass same size to DMA_UNMAP as used for DMA_MAP
  2014-05-04 22:42 [PATCH 0/5] net: macb: Fixes Soren Brinkmann
@ 2014-05-04 22:42 ` Soren Brinkmann
  2014-05-04 22:42 ` [PATCH 2/5] net: macb: Clear interrupt flags Soren Brinkmann
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Soren Brinkmann @ 2014-05-04 22:42 UTC (permalink / raw)
  To: Michal Simek, Nicolas Ferre
  Cc: Sören Brinkmann, git, linux-kernel, linux-arm-kernel, netdev

Just as commit "net: macb: DMA-unmap full rx-buffer"
(48330e08fa168395b9fd9f369f06cca1df204361), pass the size that
was used for mapping the memory also to the unmap routine to
avoid warnings from the DMA_API.

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
---

 drivers/net/ethernet/cadence/macb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index ca97005e24b4..18fdcd9d51b3 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -1113,7 +1113,7 @@ static void gem_free_rx_buffers(struct macb *bp)
 
 		desc = &bp->rx_ring[i];
 		addr = MACB_BF(RX_WADDR, MACB_BFEXT(RX_WADDR, desc->addr));
-		dma_unmap_single(&bp->pdev->dev, addr, skb->len,
+		dma_unmap_single(&bp->pdev->dev, addr, bp->rx_buffer_size,
 				 DMA_FROM_DEVICE);
 		dev_kfree_skb_any(skb);
 		skb = NULL;
-- 
1.9.2.1.g06c4abd


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/5] net: macb: Clear interrupt flags
  2014-05-04 22:42 [PATCH 0/5] net: macb: Fixes Soren Brinkmann
  2014-05-04 22:42 ` [PATCH 1/5] net: macb: Pass same size to DMA_UNMAP as used for DMA_MAP Soren Brinkmann
@ 2014-05-04 22:42 ` Soren Brinkmann
  2014-05-04 22:43 ` [PATCH 3/5] net: macb: Re-enable RX interrupt only when RX is done Soren Brinkmann
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Soren Brinkmann @ 2014-05-04 22:42 UTC (permalink / raw)
  To: Michal Simek, Nicolas Ferre
  Cc: Sören Brinkmann, git, linux-kernel, linux-arm-kernel, netdev

A few interrupt flags were not cleared in the ISR, resulting in a sytem
trapped in the ISR in cases one of those interrupts occurred. Clear all
flags to avoid such situations.

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
---

 drivers/net/ethernet/cadence/macb.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 18fdcd9d51b3..e38fe39d9589 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -951,6 +951,10 @@ static irqreturn_t macb_interrupt(int irq, void *dev_id)
 		if (unlikely(status & (MACB_TX_ERR_FLAGS))) {
 			macb_writel(bp, IDR, MACB_TX_INT_FLAGS);
 			schedule_work(&bp->tx_error_task);
+
+			if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
+				macb_writel(bp, ISR, MACB_TX_ERR_FLAGS);
+
 			break;
 		}
 
@@ -968,6 +972,9 @@ static irqreturn_t macb_interrupt(int irq, void *dev_id)
 				bp->hw_stats.gem.rx_overruns++;
 			else
 				bp->hw_stats.macb.rx_overruns++;
+
+			if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
+				macb_writel(bp, ISR, MACB_BIT(ISR_ROVR));
 		}
 
 		if (status & MACB_BIT(HRESP)) {
@@ -977,6 +984,9 @@ static irqreturn_t macb_interrupt(int irq, void *dev_id)
 			 * (work queue?)
 			 */
 			netdev_err(dev, "DMA bus error: HRESP not OK\n");
+
+			if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
+				macb_writel(bp, ISR, MACB_BIT(HRESP));
 		}
 
 		status = macb_readl(bp, ISR);
-- 
1.9.2.1.g06c4abd


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/5] net: macb: Re-enable RX interrupt only when RX is done
  2014-05-04 22:42 [PATCH 0/5] net: macb: Fixes Soren Brinkmann
  2014-05-04 22:42 ` [PATCH 1/5] net: macb: Pass same size to DMA_UNMAP as used for DMA_MAP Soren Brinkmann
  2014-05-04 22:42 ` [PATCH 2/5] net: macb: Clear interrupt flags Soren Brinkmann
@ 2014-05-04 22:43 ` Soren Brinkmann
  2014-05-04 22:43 ` [PATCH 4/5] net: macb: Remove 'unlikely' optimization Soren Brinkmann
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Soren Brinkmann @ 2014-05-04 22:43 UTC (permalink / raw)
  To: Michal Simek, Nicolas Ferre
  Cc: Sören Brinkmann, git, linux-kernel, linux-arm-kernel, netdev

When data is received during the driver processing received data the
NAPI is re-scheduled. In that case the RX interrupt should not be
re-enabled.

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
---

 drivers/net/ethernet/cadence/macb.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index e38fe39d9589..3f4b8ee0b0e7 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -891,16 +891,15 @@ static int macb_poll(struct napi_struct *napi, int budget)
 	if (work_done < budget) {
 		napi_complete(napi);
 
-		/*
-		 * We've done what we can to clean the buffers. Make sure we
-		 * get notified when new packets arrive.
-		 */
-		macb_writel(bp, IER, MACB_RX_INT_FLAGS);
-
 		/* Packets received while interrupts were disabled */
 		status = macb_readl(bp, RSR);
-		if (unlikely(status))
+		if (unlikely(status)) {
+			if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
+				macb_writel(bp, ISR, MACB_BIT(RCOMP));
 			napi_reschedule(napi);
+		} else {
+			macb_writel(bp, IER, MACB_RX_INT_FLAGS);
+		}
 	}
 
 	/* TODO: Handle errors */
-- 
1.9.2.1.g06c4abd


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/5] net: macb: Remove 'unlikely' optimization
  2014-05-04 22:42 [PATCH 0/5] net: macb: Fixes Soren Brinkmann
                   ` (2 preceding siblings ...)
  2014-05-04 22:43 ` [PATCH 3/5] net: macb: Re-enable RX interrupt only when RX is done Soren Brinkmann
@ 2014-05-04 22:43 ` Soren Brinkmann
  2014-05-04 22:43 ` [PATCH 5/5] net: macb: Fix race between HW and driver Soren Brinkmann
  2014-05-22 13:18 ` [PATCH 0/5] net: macb: Fixes Alexandre Belloni
  5 siblings, 0 replies; 12+ messages in thread
From: Soren Brinkmann @ 2014-05-04 22:43 UTC (permalink / raw)
  To: Michal Simek, Nicolas Ferre
  Cc: Sören Brinkmann, git, linux-kernel, linux-arm-kernel, netdev

Coverage data suggests that the unlikely case of receiving data while
the receive handler is running may not be that unlikely.
Coverage data after running iperf for a while:
    91320:  891:	work_done = bp->macbgem_ops.mog_rx(bp, budget);
    91320:  892:	if (work_done < budget) {
     2362:  893:		napi_complete(napi);
        -:  894:
        -:  895:		/* Packets received while interrupts were disabled */
     4724:  896:		status = macb_readl(bp, RSR);
     2362:  897:		if (unlikely(status)) {
      762:  898:			if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
      762:  899:				macb_writel(bp, ISR, MACB_BIT(RCOMP));
        -:  900:			napi_reschedule(napi);
        -:  901:		} else {
     1600:  902:			macb_writel(bp, IER, MACB_RX_INT_FLAGS);
        -:  903:		}
        -:  904:	}

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
---

 drivers/net/ethernet/cadence/macb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 3f4b8ee0b0e7..3e13aa31548a 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -893,7 +893,7 @@ static int macb_poll(struct napi_struct *napi, int budget)
 
 		/* Packets received while interrupts were disabled */
 		status = macb_readl(bp, RSR);
-		if (unlikely(status)) {
+		if (status) {
 			if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
 				macb_writel(bp, ISR, MACB_BIT(RCOMP));
 			napi_reschedule(napi);
-- 
1.9.2.1.g06c4abd


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/5] net: macb: Fix race between HW and driver
  2014-05-04 22:42 [PATCH 0/5] net: macb: Fixes Soren Brinkmann
                   ` (3 preceding siblings ...)
  2014-05-04 22:43 ` [PATCH 4/5] net: macb: Remove 'unlikely' optimization Soren Brinkmann
@ 2014-05-04 22:43 ` Soren Brinkmann
  2014-05-05 20:56   ` David Miller
  2014-05-22 13:18 ` [PATCH 0/5] net: macb: Fixes Alexandre Belloni
  5 siblings, 1 reply; 12+ messages in thread
From: Soren Brinkmann @ 2014-05-04 22:43 UTC (permalink / raw)
  To: Michal Simek, Nicolas Ferre
  Cc: Sören Brinkmann, git, linux-kernel, linux-arm-kernel, netdev

Under "heavy" RX load, the driver cannot handle the descriptors fast
enough. In detail, when a descriptor is consumed, its used flag is
cleared and once the RX budget is consumed all descriptors with a
cleared used flag are prepared to receive more data. Under load though,
the HW may constantly receive more data and use those descriptors with a
cleared used flag before they are actually prepared for next usage.

The head and tail pointers into the RX-ring should always be valid and
we can omit clearing and checking of the used flag.

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
---

---
 drivers/net/ethernet/cadence/macb.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 3e13aa31548a..e9daa072ebb4 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -599,25 +599,16 @@ static void gem_rx_refill(struct macb *bp)
 {
 	unsigned int		entry;
 	struct sk_buff		*skb;
-	struct macb_dma_desc	*desc;
 	dma_addr_t		paddr;
 
 	while (CIRC_SPACE(bp->rx_prepared_head, bp->rx_tail, RX_RING_SIZE) > 0) {
-		u32 addr, ctrl;
-
 		entry = macb_rx_ring_wrap(bp->rx_prepared_head);
-		desc = &bp->rx_ring[entry];
 
 		/* Make hw descriptor updates visible to CPU */
 		rmb();
 
-		addr = desc->addr;
-		ctrl = desc->ctrl;
 		bp->rx_prepared_head++;
 
-		if ((addr & MACB_BIT(RX_USED)))
-			continue;
-
 		if (bp->rx_skbuff[entry] == NULL) {
 			/* allocate sk_buff for this free entry in ring */
 			skb = netdev_alloc_skb(bp->dev, bp->rx_buffer_size);
@@ -698,7 +689,6 @@ static int gem_rx(struct macb *bp, int budget)
 		if (!(addr & MACB_BIT(RX_USED)))
 			break;
 
-		desc->addr &= ~MACB_BIT(RX_USED);
 		bp->rx_tail++;
 		count++;
 
-- 
1.9.2.1.g06c4abd


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 5/5] net: macb: Fix race between HW and driver
  2014-05-04 22:43 ` [PATCH 5/5] net: macb: Fix race between HW and driver Soren Brinkmann
@ 2014-05-05 20:56   ` David Miller
  2014-05-05 21:05     ` Sören Brinkmann
  0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2014-05-05 20:56 UTC (permalink / raw)
  To: soren.brinkmann
  Cc: michal.simek, nicolas.ferre, git, linux-kernel, linux-arm-kernel, netdev

From: Soren Brinkmann <soren.brinkmann@xilinx.com>
Date: Sun,  4 May 2014 15:43:02 -0700

> Under "heavy" RX load, the driver cannot handle the descriptors fast
> enough. In detail, when a descriptor is consumed, its used flag is
> cleared and once the RX budget is consumed all descriptors with a
> cleared used flag are prepared to receive more data. Under load though,
> the HW may constantly receive more data and use those descriptors with a
> cleared used flag before they are actually prepared for next usage.
> 
> The head and tail pointers into the RX-ring should always be valid and
> we can omit clearing and checking of the used flag.
> 
> Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>

Isn't the RX_USED bit the only thing that controls what RX entries
the chip will try to use?

I can't see how you can just remove the RX_USED bit handling
altogether.

The problem actually seems to be that in the current code we clear the
RX_USED bit before we actually reallocate the buffer and set it up.

It should be a bug to see the RX_USED bit set in gem_rx_refill(), and
the only reason why it can happen is exactly because you're clearing it
too early in gem_rx().

This change doesn't seem to be correct, I'm not applying this series
sorry.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 5/5] net: macb: Fix race between HW and driver
  2014-05-05 20:56   ` David Miller
@ 2014-05-05 21:05     ` Sören Brinkmann
  2014-05-05 21:09       ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Sören Brinkmann @ 2014-05-05 21:05 UTC (permalink / raw)
  To: David Miller
  Cc: michal.simek, nicolas.ferre, git, linux-kernel, linux-arm-kernel, netdev

On Mon, 2014-05-05 at 04:56PM -0400, David Miller wrote:
> From: Soren Brinkmann <soren.brinkmann@xilinx.com>
> Date: Sun,  4 May 2014 15:43:02 -0700
> 
> > Under "heavy" RX load, the driver cannot handle the descriptors fast
> > enough. In detail, when a descriptor is consumed, its used flag is
> > cleared and once the RX budget is consumed all descriptors with a
> > cleared used flag are prepared to receive more data. Under load though,
> > the HW may constantly receive more data and use those descriptors with a
> > cleared used flag before they are actually prepared for next usage.
> > 
> > The head and tail pointers into the RX-ring should always be valid and
> > we can omit clearing and checking of the used flag.
> > 
> > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> 
> Isn't the RX_USED bit the only thing that controls what RX entries
> the chip will try to use?
> 
> I can't see how you can just remove the RX_USED bit handling
> altogether.
> 
> The problem actually seems to be that in the current code we clear the
> RX_USED bit before we actually reallocate the buffer and set it up.
> 
> It should be a bug to see the RX_USED bit set in gem_rx_refill(), and
> the only reason why it can happen is exactly because you're clearing it
> too early in gem_rx().

I don't follow. The HW uses the descriptor and the driver handles the
received data. So, in gem_rx_refill we should actually only replace
descriptor which have the RX_USED _set_, not? Currently the test tests
for the opposite, since SW clears RX_USED in gem_rx. This patch just
removes those two parts. The RX_USED is left as is (HW should have set
it). And in gem_rx_refill we simply rely on the head and tail pointers
to refill the used descriptors. I didn't see a reason to do the additional
checking of the RX_USED bit.
After the refill the RX_USED flags are of course cleared for all
refilled descriptors.

	Thanks,
	Sören



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 5/5] net: macb: Fix race between HW and driver
  2014-05-05 21:05     ` Sören Brinkmann
@ 2014-05-05 21:09       ` David Miller
  2014-05-05 21:10         ` Sören Brinkmann
  0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2014-05-05 21:09 UTC (permalink / raw)
  To: soren.brinkmann
  Cc: michal.simek, nicolas.ferre, git, linux-kernel, linux-arm-kernel, netdev

From: Sören Brinkmann <soren.brinkmann@xilinx.com>
Date: Mon, 5 May 2014 14:05:19 -0700

> On Mon, 2014-05-05 at 04:56PM -0400, David Miller wrote:
>> From: Soren Brinkmann <soren.brinkmann@xilinx.com>
>> Date: Sun,  4 May 2014 15:43:02 -0700
>> 
>> > Under "heavy" RX load, the driver cannot handle the descriptors fast
>> > enough. In detail, when a descriptor is consumed, its used flag is
>> > cleared and once the RX budget is consumed all descriptors with a
>> > cleared used flag are prepared to receive more data. Under load though,
>> > the HW may constantly receive more data and use those descriptors with a
>> > cleared used flag before they are actually prepared for next usage.
>> > 
>> > The head and tail pointers into the RX-ring should always be valid and
>> > we can omit clearing and checking of the used flag.
>> > 
>> > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
>> 
>> Isn't the RX_USED bit the only thing that controls what RX entries
>> the chip will try to use?
>> 
>> I can't see how you can just remove the RX_USED bit handling
>> altogether.
>> 
>> The problem actually seems to be that in the current code we clear the
>> RX_USED bit before we actually reallocate the buffer and set it up.
>> 
>> It should be a bug to see the RX_USED bit set in gem_rx_refill(), and
>> the only reason why it can happen is exactly because you're clearing it
>> too early in gem_rx().
> 
> I don't follow. The HW uses the descriptor and the driver handles the
> received data. So, in gem_rx_refill we should actually only replace
> descriptor which have the RX_USED _set_, not? Currently the test tests
> for the opposite, since SW clears RX_USED in gem_rx. This patch just
> removes those two parts. The RX_USED is left as is (HW should have set
> it). And in gem_rx_refill we simply rely on the head and tail pointers
> to refill the used descriptors. I didn't see a reason to do the additional
> checking of the RX_USED bit.
> After the refill the RX_USED flags are of course cleared for all
> refilled descriptors.

Aha, that makes sense, I didn't notice that RX_USED is cleared as a side
effect of gem_rx_refill()'s work.

I'll apply this series, thanks for explaining.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 5/5] net: macb: Fix race between HW and driver
  2014-05-05 21:09       ` David Miller
@ 2014-05-05 21:10         ` Sören Brinkmann
  2014-05-05 21:12           ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Sören Brinkmann @ 2014-05-05 21:10 UTC (permalink / raw)
  To: David Miller
  Cc: michal.simek, nicolas.ferre, git, linux-kernel, linux-arm-kernel, netdev

On Mon, 2014-05-05 at 05:09PM -0400, David Miller wrote:
> From: Sören Brinkmann <soren.brinkmann@xilinx.com>
> Date: Mon, 5 May 2014 14:05:19 -0700
> 
> > On Mon, 2014-05-05 at 04:56PM -0400, David Miller wrote:
> >> From: Soren Brinkmann <soren.brinkmann@xilinx.com>
> >> Date: Sun,  4 May 2014 15:43:02 -0700
> >> 
> >> > Under "heavy" RX load, the driver cannot handle the descriptors fast
> >> > enough. In detail, when a descriptor is consumed, its used flag is
> >> > cleared and once the RX budget is consumed all descriptors with a
> >> > cleared used flag are prepared to receive more data. Under load though,
> >> > the HW may constantly receive more data and use those descriptors with a
> >> > cleared used flag before they are actually prepared for next usage.
> >> > 
> >> > The head and tail pointers into the RX-ring should always be valid and
> >> > we can omit clearing and checking of the used flag.
> >> > 
> >> > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> >> 
> >> Isn't the RX_USED bit the only thing that controls what RX entries
> >> the chip will try to use?
> >> 
> >> I can't see how you can just remove the RX_USED bit handling
> >> altogether.
> >> 
> >> The problem actually seems to be that in the current code we clear the
> >> RX_USED bit before we actually reallocate the buffer and set it up.
> >> 
> >> It should be a bug to see the RX_USED bit set in gem_rx_refill(), and
> >> the only reason why it can happen is exactly because you're clearing it
> >> too early in gem_rx().
> > 
> > I don't follow. The HW uses the descriptor and the driver handles the
> > received data. So, in gem_rx_refill we should actually only replace
> > descriptor which have the RX_USED _set_, not? Currently the test tests
> > for the opposite, since SW clears RX_USED in gem_rx. This patch just
> > removes those two parts. The RX_USED is left as is (HW should have set
> > it). And in gem_rx_refill we simply rely on the head and tail pointers
> > to refill the used descriptors. I didn't see a reason to do the additional
> > checking of the RX_USED bit.
> > After the refill the RX_USED flags are of course cleared for all
> > refilled descriptors.
> 
> Aha, that makes sense, I didn't notice that RX_USED is cleared as a side
> effect of gem_rx_refill()'s work.
> 
> I'll apply this series, thanks for explaining.

Thanks, but we probably wanna wait for Nicolas to test on his platforms
using macb?

	Sören



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 5/5] net: macb: Fix race between HW and driver
  2014-05-05 21:10         ` Sören Brinkmann
@ 2014-05-05 21:12           ` David Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2014-05-05 21:12 UTC (permalink / raw)
  To: soren.brinkmann
  Cc: michal.simek, nicolas.ferre, git, linux-kernel, linux-arm-kernel, netdev

From: Sören Brinkmann <soren.brinkmann@xilinx.com>
Date: Mon, 5 May 2014 14:10:23 -0700

> On Mon, 2014-05-05 at 05:09PM -0400, David Miller wrote:
>> From: Sören Brinkmann <soren.brinkmann@xilinx.com>
>> Date: Mon, 5 May 2014 14:05:19 -0700
>> 
>> > On Mon, 2014-05-05 at 04:56PM -0400, David Miller wrote:
>> >> From: Soren Brinkmann <soren.brinkmann@xilinx.com>
>> >> Date: Sun,  4 May 2014 15:43:02 -0700
>> >> 
>> >> > Under "heavy" RX load, the driver cannot handle the descriptors fast
>> >> > enough. In detail, when a descriptor is consumed, its used flag is
>> >> > cleared and once the RX budget is consumed all descriptors with a
>> >> > cleared used flag are prepared to receive more data. Under load though,
>> >> > the HW may constantly receive more data and use those descriptors with a
>> >> > cleared used flag before they are actually prepared for next usage.
>> >> > 
>> >> > The head and tail pointers into the RX-ring should always be valid and
>> >> > we can omit clearing and checking of the used flag.
>> >> > 
>> >> > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
>> >> 
>> >> Isn't the RX_USED bit the only thing that controls what RX entries
>> >> the chip will try to use?
>> >> 
>> >> I can't see how you can just remove the RX_USED bit handling
>> >> altogether.
>> >> 
>> >> The problem actually seems to be that in the current code we clear the
>> >> RX_USED bit before we actually reallocate the buffer and set it up.
>> >> 
>> >> It should be a bug to see the RX_USED bit set in gem_rx_refill(), and
>> >> the only reason why it can happen is exactly because you're clearing it
>> >> too early in gem_rx().
>> > 
>> > I don't follow. The HW uses the descriptor and the driver handles the
>> > received data. So, in gem_rx_refill we should actually only replace
>> > descriptor which have the RX_USED _set_, not? Currently the test tests
>> > for the opposite, since SW clears RX_USED in gem_rx. This patch just
>> > removes those two parts. The RX_USED is left as is (HW should have set
>> > it). And in gem_rx_refill we simply rely on the head and tail pointers
>> > to refill the used descriptors. I didn't see a reason to do the additional
>> > checking of the RX_USED bit.
>> > After the refill the RX_USED flags are of course cleared for all
>> > refilled descriptors.
>> 
>> Aha, that makes sense, I didn't notice that RX_USED is cleared as a side
>> effect of gem_rx_refill()'s work.
>> 
>> I'll apply this series, thanks for explaining.
> 
> Thanks, but we probably wanna wait for Nicolas to test on his platforms
> using macb?

We can always apply a fixup or revert.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] net: macb: Fixes
  2014-05-04 22:42 [PATCH 0/5] net: macb: Fixes Soren Brinkmann
                   ` (4 preceding siblings ...)
  2014-05-04 22:43 ` [PATCH 5/5] net: macb: Fix race between HW and driver Soren Brinkmann
@ 2014-05-22 13:18 ` Alexandre Belloni
  5 siblings, 0 replies; 12+ messages in thread
From: Alexandre Belloni @ 2014-05-22 13:18 UTC (permalink / raw)
  To: Soren Brinkmann
  Cc: Michal Simek, Nicolas Ferre, linux-arm-kernel, netdev, git, linux-kernel

Hi,

On 04/05/2014 at 15:42:57 -0700, Soren Brinkmann wrote :
> 
> Hi Nicolas,
> 
> I think I found the cause of the issue I told you about. Looks like
> driver and HW are racing (a few more details in the commit message).
> On my way finding that, I found a few more minor issues which are fixed
> in the first patches.
> The last one is "fixing" the actual race. I don't know if I overlooked
> anything here, but this seems to work fine on Zynq.
> 

Tested on at91sam9g45ek with iperf showing good performances. For the
whole series:
Tested-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>


-- 
Alexandre Belloni, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2014-05-22 13:18 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-04 22:42 [PATCH 0/5] net: macb: Fixes Soren Brinkmann
2014-05-04 22:42 ` [PATCH 1/5] net: macb: Pass same size to DMA_UNMAP as used for DMA_MAP Soren Brinkmann
2014-05-04 22:42 ` [PATCH 2/5] net: macb: Clear interrupt flags Soren Brinkmann
2014-05-04 22:43 ` [PATCH 3/5] net: macb: Re-enable RX interrupt only when RX is done Soren Brinkmann
2014-05-04 22:43 ` [PATCH 4/5] net: macb: Remove 'unlikely' optimization Soren Brinkmann
2014-05-04 22:43 ` [PATCH 5/5] net: macb: Fix race between HW and driver Soren Brinkmann
2014-05-05 20:56   ` David Miller
2014-05-05 21:05     ` Sören Brinkmann
2014-05-05 21:09       ` David Miller
2014-05-05 21:10         ` Sören Brinkmann
2014-05-05 21:12           ` David Miller
2014-05-22 13:18 ` [PATCH 0/5] net: macb: Fixes Alexandre Belloni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).