All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] macb: RLE and BNA handling
@ 2009-01-13 16:21 Erik Waling
  2009-01-14  7:47 ` Erik Waling
  2009-01-14 10:15 ` Haavard Skinnemoen
  0 siblings, 2 replies; 11+ messages in thread
From: Erik Waling @ 2009-01-13 16:21 UTC (permalink / raw)
  To: hskinnemoen; +Cc: linux-kernel

This patch (against 2.6.28) solves two issues we have been experiencing
when connecting the EMAC module on our AT91SAM9260 board to an ethernet
repeater hub. When transfering large amounts of data we sometimes
experienced that the Retry Limit Exceeded (RLE) bit got set in TSR
during transmission attempts. When this happened the driver would stall
in a state that prevented any more data from being sent.

The other issue experienced was in the RX part of the driver. Sometimes
the driver runs out of unused slots in the RX ring buffer. All slots
will be filled with data but the driver is not able to extract a
complete frame from the fragments in the buffer. When this happens the
Buffer Not Available (BNA) bit is set in RSR.

This patch adds handling for RLE and BNA.

Signed-off-by: Erik Waling <erik.waling@konftel.com>

---

diff -urpN linux-2.6.28.orig/drivers/net/macb.c linux-2.6.28.rle_and_bna_fix/drivers/net/macb.c
--- linux-2.6.28.orig/drivers/net/macb.c        2009-01-13 10:36:13.000000000 +0100
+++ linux-2.6.28.rle_and_bna_fix/drivers/net/macb.c     2009-01-13 16:50:34.000000000 +0100
@@ -2,6 +2,7 @@
  * Atmel MACB Ethernet Controller driver
  *
  * Copyright (C) 2004-2006 Atmel Corporation
+ * Copyright (C) 2008-2009 Konftel AB
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -316,10 +317,11 @@ static void macb_tx(struct macb *bp)
        dev_dbg(&bp->pdev->dev, "macb_tx status = %02lx\n",
                (unsigned long)status);

-       if (status & MACB_BIT(UND)) {
+       if (status & (MACB_BIT(UND) | MACB_BIT(TSR_RLE))) {
                int i;
-               printk(KERN_ERR "%s: TX underrun, resetting buffers\n",
-                       bp->dev->name);
+               printk(KERN_ERR "%s: TX %s, resetting buffers\n",
+                       bp->dev->name, status & MACB_BIT(UND) ?
+                       "underrun" : "retry limit exceeded");

                head = bp->tx_head;

@@ -527,18 +529,31 @@ static int macb_poll(struct napi_struct
        dev_dbg(&bp->pdev->dev, "poll: status = %08lx, budget = %d\n",
                (unsigned long)status, budget);

-       if (!(status & MACB_BIT(REC))) {
+       if (status & MACB_BIT(REC)) {
+               work_done = macb_rx(bp, budget);
+               if (work_done < budget)
+                       netif_rx_complete(dev, napi);
+       } else if (status & MACB_BIT(BNA)) {
+               /* No slots available in RX ring. Mark all slots
+                * as unused.
+                */
+               int i;
+
+               dev_warn(&bp->pdev->dev,
+                               "No free RX buffers. Marking all as unused.\n");
+
+               for (i = 0; i < RX_RING_SIZE; i++)
+                       bp->rx_ring[i].addr &= ~MACB_BIT(RX_USED);
+
+               wmb();
+               netif_rx_complete(dev, napi);
+       } else if (!(status & MACB_BIT(REC))) {
                dev_warn(&bp->pdev->dev,
                         "No RX buffers complete, status = %02lx\n",
                         (unsigned long)status);
                netif_rx_complete(dev, napi);
-               goto out;
        }

-       work_done = macb_rx(bp, budget);
-       if (work_done < budget)
-               netif_rx_complete(dev, napi);
-
        /*
         * We've done what we can to clean the buffers. Make sure we
         * get notified when new packets arrive.
@@ -584,7 +599,8 @@ static irqreturn_t macb_interrupt(int ir
                        }
                }

-               if (status & (MACB_BIT(TCOMP) | MACB_BIT(ISR_TUND)))
+               if (status & (MACB_BIT(TCOMP) | MACB_BIT(ISR_TUND) |
+                            MACB_BIT(ISR_RLE)))
                        macb_tx(bp);

                /*



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] macb: RLE and BNA handling
  2009-01-13 16:21 [PATCH] macb: RLE and BNA handling Erik Waling
@ 2009-01-14  7:47 ` Erik Waling
  2009-01-14 10:15 ` Haavard Skinnemoen
  1 sibling, 0 replies; 11+ messages in thread
From: Erik Waling @ 2009-01-14  7:47 UTC (permalink / raw)
  To: hskinnemoen; +Cc: linux-kernel

Woops, just realized that the tabs in my previous post might have been
messed up. One more try...

Signed-off-by: Erik Waling <erik.waling@konftel.com>

---

diff -urpN linux-2.6.28.orig/drivers/net/macb.c linux-2.6.28.rle_and_bna_fix/drivers/net/macb.c
--- linux-2.6.28.orig/drivers/net/macb.c	2009-01-13 10:36:13.000000000 +0100
+++ linux-2.6.28.rle_and_bna_fix/drivers/net/macb.c	2009-01-13 16:50:34.000000000 +0100
@@ -2,6 +2,7 @@
  * Atmel MACB Ethernet Controller driver
  *
  * Copyright (C) 2004-2006 Atmel Corporation
+ * Copyright (C) 2008-2009 Konftel AB
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -316,10 +317,11 @@ static void macb_tx(struct macb *bp)
 	dev_dbg(&bp->pdev->dev, "macb_tx status = %02lx\n",
 		(unsigned long)status);
 
-	if (status & MACB_BIT(UND)) {
+	if (status & (MACB_BIT(UND) | MACB_BIT(TSR_RLE))) {
 		int i;
-		printk(KERN_ERR "%s: TX underrun, resetting buffers\n",
-			bp->dev->name);
+		printk(KERN_ERR "%s: TX %s, resetting buffers\n",
+			bp->dev->name, status & MACB_BIT(UND) ? 
+			"underrun" : "retry limit exceeded");
 
 		head = bp->tx_head;
 
@@ -527,18 +529,31 @@ static int macb_poll(struct napi_struct 
 	dev_dbg(&bp->pdev->dev, "poll: status = %08lx, budget = %d\n",
 		(unsigned long)status, budget);
 
-	if (!(status & MACB_BIT(REC))) {
+	if (status & MACB_BIT(REC)) {
+		work_done = macb_rx(bp, budget);
+		if (work_done < budget)
+			netif_rx_complete(dev, napi);
+	} else if (status & MACB_BIT(BNA)) {
+		/* No slots available in RX ring. Mark all slots
+		 * as unused.
+		 */
+		int i;
+
+		dev_warn(&bp->pdev->dev, 
+				"No free RX buffers. Marking all as unused.\n");
+
+		for (i = 0; i < RX_RING_SIZE; i++) 
+			bp->rx_ring[i].addr &= ~MACB_BIT(RX_USED);
+		
+		wmb();
+		netif_rx_complete(dev, napi);
+	} else if (!(status & MACB_BIT(REC))) {
 		dev_warn(&bp->pdev->dev,
 			 "No RX buffers complete, status = %02lx\n",
 			 (unsigned long)status);
 		netif_rx_complete(dev, napi);
-		goto out;
 	}
 
-	work_done = macb_rx(bp, budget);
-	if (work_done < budget)
-		netif_rx_complete(dev, napi);
-
 	/*
 	 * We've done what we can to clean the buffers. Make sure we
 	 * get notified when new packets arrive.
@@ -584,7 +599,8 @@ static irqreturn_t macb_interrupt(int ir
 			}
 		}
 
-		if (status & (MACB_BIT(TCOMP) | MACB_BIT(ISR_TUND)))
+		if (status & (MACB_BIT(TCOMP) | MACB_BIT(ISR_TUND) | 
+                            MACB_BIT(ISR_RLE)))
 			macb_tx(bp);
 
 		/*



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] macb: RLE and BNA handling
  2009-01-13 16:21 [PATCH] macb: RLE and BNA handling Erik Waling
  2009-01-14  7:47 ` Erik Waling
@ 2009-01-14 10:15 ` Haavard Skinnemoen
  2009-01-14 10:27   ` Erik Waling
  1 sibling, 1 reply; 11+ messages in thread
From: Haavard Skinnemoen @ 2009-01-14 10:15 UTC (permalink / raw)
  To: Erik Waling; +Cc: linux-kernel

Erik Waling wrote:
> This patch (against 2.6.28) solves two issues we have been experiencing
> when connecting the EMAC module on our AT91SAM9260 board to an ethernet
> repeater hub. When transfering large amounts of data we sometimes
> experienced that the Retry Limit Exceeded (RLE) bit got set in TSR
> during transmission attempts. When this happened the driver would stall
> in a state that prevented any more data from being sent.

Right. This part of your patch looks good...

> The other issue experienced was in the RX part of the driver. Sometimes
> the driver runs out of unused slots in the RX ring buffer. All slots
> will be filled with data but the driver is not able to extract a
> complete frame from the fragments in the buffer. When this happens the
> Buffer Not Available (BNA) bit is set in RSR.

This part also looks like a real problem, however...

> @@ -527,18 +529,31 @@ static int macb_poll(struct napi_struct
>         dev_dbg(&bp->pdev->dev, "poll: status = %08lx, budget = %d\n",
>                 (unsigned long)status, budget);
> 
> -       if (!(status & MACB_BIT(REC))) {
> +       if (status & MACB_BIT(REC)) {
> +               work_done = macb_rx(bp, budget);
> +               if (work_done < budget)
> +                       netif_rx_complete(dev, napi);
> +       } else if (status & MACB_BIT(BNA)) {
> +               /* No slots available in RX ring. Mark all slots
> +                * as unused.
> +                */
> +               int i;
> +
> +               dev_warn(&bp->pdev->dev,
> +                               "No free RX buffers. Marking all as unused.\n");
> +
> +               for (i = 0; i < RX_RING_SIZE; i++)
> +                       bp->rx_ring[i].addr &= ~MACB_BIT(RX_USED);
> +
> +               wmb();
> +               netif_rx_complete(dev, napi);

I'm not sure if nuking all the buffers that were received successfully
is the right thing to do here...?

> +       } else if (!(status & MACB_BIT(REC))) {

This should probably be just "else", since you checked the REC bit
above.

Haavard

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] macb: RLE and BNA handling
  2009-01-14 10:15 ` Haavard Skinnemoen
@ 2009-01-14 10:27   ` Erik Waling
  2009-01-14 12:03     ` Haavard Skinnemoen
  0 siblings, 1 reply; 11+ messages in thread
From: Erik Waling @ 2009-01-14 10:27 UTC (permalink / raw)
  To: Haavard Skinnemoen; +Cc: Erik Waling, linux-kernel

See comments below...

On Wed, 2009-01-14 at 11:15 +0100, Haavard Skinnemoen wrote:
> Erik Waling wrote:
> > This patch (against 2.6.28) solves two issues we have been experiencing
> > when connecting the EMAC module on our AT91SAM9260 board to an ethernet
> > repeater hub. When transfering large amounts of data we sometimes
> > experienced that the Retry Limit Exceeded (RLE) bit got set in TSR
> > during transmission attempts. When this happened the driver would stall
> > in a state that prevented any more data from being sent.
> 
> Right. This part of your patch looks good...
> 
> > The other issue experienced was in the RX part of the driver. Sometimes
> > the driver runs out of unused slots in the RX ring buffer. All slots
> > will be filled with data but the driver is not able to extract a
> > complete frame from the fragments in the buffer. When this happens the
> > Buffer Not Available (BNA) bit is set in RSR.
> 
> This part also looks like a real problem, however...
> 
> > @@ -527,18 +529,31 @@ static int macb_poll(struct napi_struct
> >         dev_dbg(&bp->pdev->dev, "poll: status = %08lx, budget = %d\n",
> >                 (unsigned long)status, budget);
> >
> > -       if (!(status & MACB_BIT(REC))) {
> > +       if (status & MACB_BIT(REC)) {
> > +               work_done = macb_rx(bp, budget);
> > +               if (work_done < budget)
> > +                       netif_rx_complete(dev, napi);
> > +       } else if (status & MACB_BIT(BNA)) {
> > +               /* No slots available in RX ring. Mark all slots
> > +                * as unused.
> > +                */
> > +               int i;
> > +
> > +               dev_warn(&bp->pdev->dev,
> > +                               "No free RX buffers. Marking all as unused.\n");
> > +
> > +               for (i = 0; i < RX_RING_SIZE; i++)
> > +                       bp->rx_ring[i].addr &= ~MACB_BIT(RX_USED);
> > +
> > +               wmb();
> > +               netif_rx_complete(dev, napi);
> 
> I'm not sure if nuking all the buffers that were received successfully
> is the right thing to do here...?
> 

This was the only solution I could think of since all slots are marked
as used and we are not able to assemble a complete frame. Do you think
there is a better solution?

> > +       } else if (!(status & MACB_BIT(REC))) {
> 
> This should probably be just "else", since you checked the REC bit
> above.

You are right. That is probably more correct.

> Haavard

Erik


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] macb: RLE and BNA handling
  2009-01-14 10:27   ` Erik Waling
@ 2009-01-14 12:03     ` Haavard Skinnemoen
  2009-01-15 13:37       ` Erik Waling
  0 siblings, 1 reply; 11+ messages in thread
From: Haavard Skinnemoen @ 2009-01-14 12:03 UTC (permalink / raw)
  To: Erik Waling; +Cc: Erik Waling, linux-kernel

Erik Waling wrote:
> > > @@ -527,18 +529,31 @@ static int macb_poll(struct napi_struct
> > >         dev_dbg(&bp->pdev->dev, "poll: status = %08lx, budget = %d\n",
> > >                 (unsigned long)status, budget);
> > >
> > > -       if (!(status & MACB_BIT(REC))) {
> > > +       if (status & MACB_BIT(REC)) {
> > > +               work_done = macb_rx(bp, budget);
> > > +               if (work_done < budget)
> > > +                       netif_rx_complete(dev, napi);
> > > +       } else if (status & MACB_BIT(BNA)) {
> > > +               /* No slots available in RX ring. Mark all slots
> > > +                * as unused.
> > > +                */
> > > +               int i;
> > > +
> > > +               dev_warn(&bp->pdev->dev,
> > > +                               "No free RX buffers. Marking all as unused.\n");
> > > +
> > > +               for (i = 0; i < RX_RING_SIZE; i++)
> > > +                       bp->rx_ring[i].addr &= ~MACB_BIT(RX_USED);
> > > +
> > > +               wmb();
> > > +               netif_rx_complete(dev, napi);  
> > 
> > I'm not sure if nuking all the buffers that were received successfully
> > is the right thing to do here...?
> >   
> 
> This was the only solution I could think of since all slots are marked
> as used and we are not able to assemble a complete frame. Do you think
> there is a better solution?

No, I guess that might be a good fallback solution if we really can't
seem to make any progress...

However, I'm wondering if there might be a different bug in the code
above: Suppose that we receive lots of frames, start processing them,
but exhaust our budget so that we return before we had a chance to look
at all of them.

Then, when the network layer calls us again, we will only continue
processing the buffers if the REC bit was set in the mean time, which
it might not be if there was a brief pause in the flow of packets. If
this happens, we'll simply display a warning and call
netif_rx_complete() with potentially lots of unprocessed packets in the
RX ring...

So I'm wondering if the right thing to do is to just call macb_rx()
regardless of the state of the REC bit. If it exhausts the budget, and
the BNA bit is set, we could flush the ring in order to sort of relieve
the pressure a bit...

Btw, I suspect you'll need to update rx_tail when you do that, or it
might not point to where the next used buffer will be.

Haavard

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] macb: RLE and BNA handling
  2009-01-14 12:03     ` Haavard Skinnemoen
@ 2009-01-15 13:37       ` Erik Waling
  2009-03-26  9:38         ` Erik Waling
  0 siblings, 1 reply; 11+ messages in thread
From: Erik Waling @ 2009-01-15 13:37 UTC (permalink / raw)
  To: Haavard Skinnemoen; +Cc: linux-kernel

On Wed, 2009-01-14 at 13:03 +0100, Haavard Skinnemoen wrote:

[...]

> >
> > This was the only solution I could think of since all slots are marked
> > as used and we are not able to assemble a complete frame. Do you think
> > there is a better solution?
> 
> No, I guess that might be a good fallback solution if we really can't
> seem to make any progress...
> 
> However, I'm wondering if there might be a different bug in the code
> above: Suppose that we receive lots of frames, start processing them,
> but exhaust our budget so that we return before we had a chance to look
> at all of them.

True. That could probably happen.

> Then, when the network layer calls us again, we will only continue
> processing the buffers if the REC bit was set in the mean time, which
> it might not be if there was a brief pause in the flow of packets. If
> this happens, we'll simply display a warning and call
> netif_rx_complete() with potentially lots of unprocessed packets in the
> RX ring...
> 
> So I'm wondering if the right thing to do is to just call macb_rx()
> regardless of the state of the REC bit. If it exhausts the budget, and
> the BNA bit is set, we could flush the ring in order to sort of relieve
> the pressure a bit...

I agree. The best solution would probably be one that does not depend on
the REC bit.

I did some quick tests and modified macb_poll() so that it calls
macb_rx() before anything else is checked. When macb_rx() returns the
poll routine checks if the BNA bit was set in RSR. If that is the case
and macb_rx() was not able to process any complete frames we flush the
whole ring. In the case were we manage to process at least one frame and
BNA is set we just clear the bit without flushing the ring since the
processed frame(s) will leave one or more empty slots in the ring. 

I also had to modify macb_rx() slightly since we, after a buffer flush,
might receive an EOF fragment before any SOF frags are received (the
case were we received the SOF and then flushed the buffer). 

In order to test this quickly I changed the size of the RX ring to 16
slots since it is quite hard to reproduce the BNA issue using the
standard size of 512.

I have been testing (by constant bi-directional transfer) for an hour or
so and the driver always seems recover after BNA. Anyway, I think the
BNA part of this patch needs some more review and testing before even
thinking about sending it upstream.

> Btw, I suspect you'll need to update rx_tail when you do that, or it
> might not point to where the next used buffer will be.

Are you sure rx_tail needs updating? If I understand the specs correctly
the EMAC module will continue writing to the address currently stored in
RBQP. This address should be the same as before the flush (if no new
fragments has been recvd). Right?

> Haavard

Erik


Signed-off-by: Erik Waling <erik.waling@konftel.com>


---

diff -urpN linux-2.6.28.orig/drivers/net/macb.c linux-2.6.28.rle_and_bna_fix/drivers/net/macb.c
--- linux-2.6.28.orig/drivers/net/macb.c	2009-01-13 10:36:13.000000000 +0100
+++ linux-2.6.28.rle_and_bna_fix/drivers/net/macb.c	2009-01-15 14:23:52.000000000 +0100
@@ -2,6 +2,7 @@
  * Atmel MACB Ethernet Controller driver
  *
  * Copyright (C) 2004-2006 Atmel Corporation
+ * Copyright (C) 2008-2009 Konftel AB
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -316,10 +317,11 @@ static void macb_tx(struct macb *bp)
 	dev_dbg(&bp->pdev->dev, "macb_tx status = %02lx\n",
 		(unsigned long)status);
 
-	if (status & MACB_BIT(UND)) {
+	if (status & (MACB_BIT(UND) | MACB_BIT(TSR_RLE))) {
 		int i;
-		printk(KERN_ERR "%s: TX underrun, resetting buffers\n",
-			bp->dev->name);
+		printk(KERN_ERR "%s: TX %s, resetting buffers\n",
+			bp->dev->name, status & MACB_BIT(UND) ? 
+			"underrun" : "retry limit exceeded");
 
 		head = bp->tx_head;
 
@@ -484,13 +486,28 @@ static int macb_rx(struct macb *bp, int 
 
 		if (ctrl & MACB_BIT(RX_EOF)) {
 			int dropped;
-			BUG_ON(first_frag == -1);
-
-			dropped = macb_rx_frame(bp, first_frag, tail);
-			first_frag = -1;
-			if (!dropped) {
-				received++;
-				budget--;
+			if (first_frag != -1) {
+				dropped = macb_rx_frame(bp, first_frag, tail);
+				first_frag = -1;
+				
+				if (!dropped) {
+					received++;
+					budget--;
+				}
+			} else {
+				/* We might end up here if we previously
+				 * flushed the RX buffer ring because of 
+				 * no available buffers.
+				 */
+				printk(KERN_INFO 
+					"%s: Found EOF frag without any "
+					"previous SOF frag. Discarding "
+					"everything up until EOF.\n",
+					bp->dev->name);
+				/* This might also "discard" a couple of 
+				 * fragments that are already marked as unused.
+				 */
+				discard_partial_frame(bp, bp->rx_tail, tail);
 			}
 		}
 	}
@@ -514,28 +531,39 @@ static int macb_poll(struct napi_struct 
 	macb_writel(bp, RSR, status);
 
 	work_done = 0;
-	if (!status) {
-		/*
-		 * This may happen if an interrupt was pending before
-		 * this function was called last time, and no packets
-		 * have been received since.
-		 */
-		netif_rx_complete(dev, napi);
-		goto out;
-	}
 
 	dev_dbg(&bp->pdev->dev, "poll: status = %08lx, budget = %d\n",
 		(unsigned long)status, budget);
 
-	if (!(status & MACB_BIT(REC))) {
-		dev_warn(&bp->pdev->dev,
-			 "No RX buffers complete, status = %02lx\n",
-			 (unsigned long)status);
-		netif_rx_complete(dev, napi);
-		goto out;
+	work_done = macb_rx(bp, budget);
+	
+	if (status & MACB_BIT(BNA)) {
+		
+		printk(KERN_INFO
+			"%s: No RX buffers available. "
+			"Packets processed = %d\n", dev->name, work_done);
+
+		/* Only flush buffers if we did not manage to assemble
+		 * any complete frames in macb_rx() since each assembled
+		 * frame will result in free slots in the ring buffer.
+		 */
+		if (!work_done) {
+			/* No slots available in RX ring and no frames were
+			 * assembled. Mark all slots as unused. 
+			 * Hopefully this will let us recover... 
+			 */
+			int i;
+
+			printk(KERN_INFO "%s: No free RX buffers. "
+				"Flushing everything.\n", dev->name);
+
+			for (i = 0; i < RX_RING_SIZE; i++) 
+				bp->rx_ring[i].addr &= ~MACB_BIT(RX_USED);
+
+			wmb();
+		}
 	}
 
-	work_done = macb_rx(bp, budget);
 	if (work_done < budget)
 		netif_rx_complete(dev, napi);
 
@@ -543,7 +571,6 @@ static int macb_poll(struct napi_struct 
 	 * We've done what we can to clean the buffers. Make sure we
 	 * get notified when new packets arrive.
 	 */
-out:
 	macb_writel(bp, IER, MACB_RX_INT_FLAGS);
 
 	/* TODO: Handle errors */
@@ -584,7 +611,8 @@ static irqreturn_t macb_interrupt(int ir
 			}
 		}
 
-		if (status & (MACB_BIT(TCOMP) | MACB_BIT(ISR_TUND)))
+		if (status & (MACB_BIT(TCOMP) | MACB_BIT(ISR_TUND) | 
+                            MACB_BIT(ISR_RLE)))
 			macb_tx(bp);
 
 		/*



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] macb: RLE and BNA handling
  2009-01-15 13:37       ` Erik Waling
@ 2009-03-26  9:38         ` Erik Waling
  2009-03-30 12:04           ` Haavard Skinnemoen
  0 siblings, 1 reply; 11+ messages in thread
From: Erik Waling @ 2009-03-26  9:38 UTC (permalink / raw)
  To: Haavard Skinnemoen; +Cc: linux-kernel

Hi Haavard,

Have you had the time to check the patch below? These two issues still
needs to be addressed.

If you are unsure of the BNA part you could at least pass the RLE part
upstreams since it happens more frequently. 

Regards,

Erik

On Thu, 2009-01-15 at 14:37 +0100, Erik Waling wrote:
> On Wed, 2009-01-14 at 13:03 +0100, Haavard Skinnemoen wrote:
> 
> [...]
> 
> > >
> > > This was the only solution I could think of since all slots are marked
> > > as used and we are not able to assemble a complete frame. Do you think
> > > there is a better solution?
> >
> > No, I guess that might be a good fallback solution if we really can't
> > seem to make any progress...
> >
> > However, I'm wondering if there might be a different bug in the code
> > above: Suppose that we receive lots of frames, start processing them,
> > but exhaust our budget so that we return before we had a chance to look
> > at all of them.
> 
> True. That could probably happen.
> 
> > Then, when the network layer calls us again, we will only continue
> > processing the buffers if the REC bit was set in the mean time, which
> > it might not be if there was a brief pause in the flow of packets. If
> > this happens, we'll simply display a warning and call
> > netif_rx_complete() with potentially lots of unprocessed packets in the
> > RX ring...
> >
> > So I'm wondering if the right thing to do is to just call macb_rx()
> > regardless of the state of the REC bit. If it exhausts the budget, and
> > the BNA bit is set, we could flush the ring in order to sort of relieve
> > the pressure a bit...
> 
> I agree. The best solution would probably be one that does not depend on
> the REC bit.
> 
> I did some quick tests and modified macb_poll() so that it calls
> macb_rx() before anything else is checked. When macb_rx() returns the
> poll routine checks if the BNA bit was set in RSR. If that is the case
> and macb_rx() was not able to process any complete frames we flush the
> whole ring. In the case were we manage to process at least one frame and
> BNA is set we just clear the bit without flushing the ring since the
> processed frame(s) will leave one or more empty slots in the ring.
> 
> I also had to modify macb_rx() slightly since we, after a buffer flush,
> might receive an EOF fragment before any SOF frags are received (the
> case were we received the SOF and then flushed the buffer).
> 
> In order to test this quickly I changed the size of the RX ring to 16
> slots since it is quite hard to reproduce the BNA issue using the
> standard size of 512.
> 
> I have been testing (by constant bi-directional transfer) for an hour or
> so and the driver always seems recover after BNA. Anyway, I think the
> BNA part of this patch needs some more review and testing before even
> thinking about sending it upstream.
> 
> > Btw, I suspect you'll need to update rx_tail when you do that, or it
> > might not point to where the next used buffer will be.
> 
> Are you sure rx_tail needs updating? If I understand the specs correctly
> the EMAC module will continue writing to the address currently stored in
> RBQP. This address should be the same as before the flush (if no new
> fragments has been recvd). Right?
> 
> > Haavard
> 
> Erik
> 
> 
> Signed-off-by: Erik Waling <erik.waling@konftel.com>
> 
> 
> ---
> 
> diff -urpN linux-2.6.28.orig/drivers/net/macb.c linux-2.6.28.rle_and_bna_fix/drivers/net/macb.c
> --- linux-2.6.28.orig/drivers/net/macb.c        2009-01-13 10:36:13.000000000 +0100
> +++ linux-2.6.28.rle_and_bna_fix/drivers/net/macb.c     2009-01-15 14:23:52.000000000 +0100
> @@ -2,6 +2,7 @@
>   * Atmel MACB Ethernet Controller driver
>   *
>   * Copyright (C) 2004-2006 Atmel Corporation
> + * Copyright (C) 2008-2009 Konftel AB
>   *
>   * This program is free software; you can redistribute it and/or modify
>   * it under the terms of the GNU General Public License version 2 as
> @@ -316,10 +317,11 @@ static void macb_tx(struct macb *bp)
>         dev_dbg(&bp->pdev->dev, "macb_tx status = %02lx\n",
>                 (unsigned long)status);
> 
> -       if (status & MACB_BIT(UND)) {
> +       if (status & (MACB_BIT(UND) | MACB_BIT(TSR_RLE))) {
>                 int i;
> -               printk(KERN_ERR "%s: TX underrun, resetting buffers\n",
> -                       bp->dev->name);
> +               printk(KERN_ERR "%s: TX %s, resetting buffers\n",
> +                       bp->dev->name, status & MACB_BIT(UND) ?
> +                       "underrun" : "retry limit exceeded");
> 
>                 head = bp->tx_head;
> 
> @@ -484,13 +486,28 @@ static int macb_rx(struct macb *bp, int
> 
>                 if (ctrl & MACB_BIT(RX_EOF)) {
>                         int dropped;
> -                       BUG_ON(first_frag == -1);
> -
> -                       dropped = macb_rx_frame(bp, first_frag, tail);
> -                       first_frag = -1;
> -                       if (!dropped) {
> -                               received++;
> -                               budget--;
> +                       if (first_frag != -1) {
> +                               dropped = macb_rx_frame(bp, first_frag, tail);
> +                               first_frag = -1;
> +
> +                               if (!dropped) {
> +                                       received++;
> +                                       budget--;
> +                               }
> +                       } else {
> +                               /* We might end up here if we previously
> +                                * flushed the RX buffer ring because of
> +                                * no available buffers.
> +                                */
> +                               printk(KERN_INFO
> +                                       "%s: Found EOF frag without any "
> +                                       "previous SOF frag. Discarding "
> +                                       "everything up until EOF.\n",
> +                                       bp->dev->name);
> +                               /* This might also "discard" a couple of
> +                                * fragments that are already marked as unused.
> +                                */
> +                               discard_partial_frame(bp, bp->rx_tail, tail);
>                         }
>                 }
>         }
> @@ -514,28 +531,39 @@ static int macb_poll(struct napi_struct
>         macb_writel(bp, RSR, status);
> 
>         work_done = 0;
> -       if (!status) {
> -               /*
> -                * This may happen if an interrupt was pending before
> -                * this function was called last time, and no packets
> -                * have been received since.
> -                */
> -               netif_rx_complete(dev, napi);
> -               goto out;
> -       }
> 
>         dev_dbg(&bp->pdev->dev, "poll: status = %08lx, budget = %d\n",
>                 (unsigned long)status, budget);
> 
> -       if (!(status & MACB_BIT(REC))) {
> -               dev_warn(&bp->pdev->dev,
> -                        "No RX buffers complete, status = %02lx\n",
> -                        (unsigned long)status);
> -               netif_rx_complete(dev, napi);
> -               goto out;
> +       work_done = macb_rx(bp, budget);
> +
> +       if (status & MACB_BIT(BNA)) {
> +
> +               printk(KERN_INFO
> +                       "%s: No RX buffers available. "
> +                       "Packets processed = %d\n", dev->name, work_done);
> +
> +               /* Only flush buffers if we did not manage to assemble
> +                * any complete frames in macb_rx() since each assembled
> +                * frame will result in free slots in the ring buffer.
> +                */
> +               if (!work_done) {
> +                       /* No slots available in RX ring and no frames were
> +                        * assembled. Mark all slots as unused.
> +                        * Hopefully this will let us recover...
> +                        */
> +                       int i;
> +
> +                       printk(KERN_INFO "%s: No free RX buffers. "
> +                               "Flushing everything.\n", dev->name);
> +
> +                       for (i = 0; i < RX_RING_SIZE; i++)
> +                               bp->rx_ring[i].addr &= ~MACB_BIT(RX_USED);
> +
> +                       wmb();
> +               }
>         }
> 
> -       work_done = macb_rx(bp, budget);
>         if (work_done < budget)
>                 netif_rx_complete(dev, napi);
> 
> @@ -543,7 +571,6 @@ static int macb_poll(struct napi_struct
>          * We've done what we can to clean the buffers. Make sure we
>          * get notified when new packets arrive.
>          */
> -out:
>         macb_writel(bp, IER, MACB_RX_INT_FLAGS);
> 
>         /* TODO: Handle errors */
> @@ -584,7 +611,8 @@ static irqreturn_t macb_interrupt(int ir
>                         }
>                 }
> 
> -               if (status & (MACB_BIT(TCOMP) | MACB_BIT(ISR_TUND)))
> +               if (status & (MACB_BIT(TCOMP) | MACB_BIT(ISR_TUND) |
> +                            MACB_BIT(ISR_RLE)))
>                         macb_tx(bp);
> 
>                 /*



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] macb: RLE and BNA handling
  2009-03-26  9:38         ` Erik Waling
@ 2009-03-30 12:04           ` Haavard Skinnemoen
  2009-03-31 14:50             ` Erik Waling
  0 siblings, 1 reply; 11+ messages in thread
From: Haavard Skinnemoen @ 2009-03-30 12:04 UTC (permalink / raw)
  To: Erik Waling; +Cc: linux-kernel

Erik Waling wrote:
> Have you had the time to check the patch below? These two issues still
> needs to be addressed.

Yeah...sorry for not responding earlier.

> If you are unsure of the BNA part you could at least pass the RLE part
> upstreams since it happens more frequently.

Right. I have to admit I'm not at all convinced about the BNA part...it
seems to me like the only scenario when it makes a difference is when
the ring is so small that it can't hold an entire frame. And if that's
the case, isn't the easiest solution to just increase the size of the
ring?

So, could you please split the patch up into the following parts:
  1. TX RLE handling
  2. Call macb_rx() regardless of the status
  3. Handle incomplete RX frames
  4. Special case for RX BNA

Then I can pass on the first three, and we can keep discussing the last
one, which is the one I'm not sure about.

Haavard

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] macb: RLE and BNA handling
  2009-03-30 12:04           ` Haavard Skinnemoen
@ 2009-03-31 14:50             ` Erik Waling
  2009-04-01  9:31               ` Haavard Skinnemoen
  0 siblings, 1 reply; 11+ messages in thread
From: Erik Waling @ 2009-03-31 14:50 UTC (permalink / raw)
  To: Haavard Skinnemoen; +Cc: linux-kernel

Hi Haavard,

Thanks for your reply. See my comments below.

On Mon, 2009-03-30 at 14:04 +0200, Haavard Skinnemoen wrote: 
> Erik Waling wrote:
> > Have you had the time to check the patch below? These two issues still
> > needs to be addressed.
> 
> Yeah...sorry for not responding earlier.
> 
> > If you are unsure of the BNA part you could at least pass the RLE part
> > upstreams since it happens more frequently.
> 
> Right. I have to admit I'm not at all convinced about the BNA part...it
> seems to me like the only scenario when it makes a difference is when
> the ring is so small that it can't hold an entire frame. And if that's
> the case, isn't the easiest solution to just increase the size of the
> ring?


Regarding BNA I think there will always be a possibility that we will
fill the entire ring with start and mid fragments and no actual complete
frame can be assembled. The probability that it happens will decrease
with a larger buffer, but I still think there is a chance of it
happening no matter the size of the buffer. I think we have two options:

Option 1 - Discard one or a couple of fragments to enable us to receive
a new fragment. Hopefully this will be an EOF fragment for a partial
frame in our ring and we could assemble a complete frame and hence free
some more space in the buffer. This option would however require some
kind of an algorithm for choosing which fragment(s) to discard.

Option 2 - Flush the entire ring. This will result in loss of more data
than in option 1, but higher layers should be able to handle this. This
option will not require any algorithm for choosing fragments to discard
and will be easier to implement. I would vote for this option since BNA
seems to happen very rarely with the current buffer ring size.

> So, could you please split the patch up into the following parts:
>   1. TX RLE handling
>   2. Call macb_rx() regardless of the status
>   3. Handle incomplete RX frames
>   4. Special case for RX BNA

I did as you suggested and split my patches:

Patch 1: TX RLE handling
Patch 2: Call macb_rx() regardless of the status
Patch 3: BNA and incomplete RX frame handling

I didn't think it made any sense to make separate patches for incomplete
RX frames and BNA handling since the only time we will encounter
incomplete RX frames is if we have nuked some fragments in the buffer
during BNA handling.


> Then I can pass on the first three, and we can keep discussing the last
> one, which is the one I'm not sure about.
> 
> Haavard

All patches should apply against 2.6.29.

Signed-off-by: Erik Waling <erik.waling@konftel.com>


---

diff -urN linux-2.6.29/drivers/net/macb.c linux-2.6.29_p1_rle_fix/drivers/net/macb.c
--- linux-2.6.29/drivers/net/macb.c	2009-03-24 00:12:14.000000000 +0100
+++ linux-2.6.29_p1_rle_fix/drivers/net/macb.c	2009-03-31 13:31:39.000000000 +0200
@@ -2,6 +2,7 @@
  * Atmel MACB Ethernet Controller driver
  *
  * Copyright (C) 2004-2006 Atmel Corporation
+ * Copyright (C) 2008-2009 Konftel AB
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -316,10 +317,11 @@
 	dev_dbg(&bp->pdev->dev, "macb_tx status = %02lx\n",
 		(unsigned long)status);
 
-	if (status & MACB_BIT(UND)) {
+	if (status & (MACB_BIT(UND) | MACB_BIT(TSR_RLE))) {
 		int i;
-		printk(KERN_ERR "%s: TX underrun, resetting buffers\n",
-			bp->dev->name);
+		printk(KERN_ERR "%s: TX %s, resetting buffers\n",
+			bp->dev->name, status & MACB_BIT(UND) ? 
+			"underrun" : "retry limit exceeded");
 
 		/* Transfer ongoing, disable transmitter, to avoid confusion */
 		if (status & MACB_BIT(TGO))
@@ -591,7 +593,8 @@
 			}
 		}
 
-		if (status & (MACB_BIT(TCOMP) | MACB_BIT(ISR_TUND)))
+		if (status & (MACB_BIT(TCOMP) | MACB_BIT(ISR_TUND) | 
+			    MACB_BIT(ISR_RLE)))
 			macb_tx(bp);
 
 		/*







diff -urN linux-2.6.29/drivers/net/macb.c linux-2.6.29_p2_no_rx_status_check/drivers/net/macb.c
--- linux-2.6.29/drivers/net/macb.c	2009-03-24 00:12:14.000000000 +0100
+++ linux-2.6.29_p2_no_rx_status_check/drivers/net/macb.c	2009-03-31 13:45:01.000000000 +0200
@@ -521,27 +521,10 @@
 	macb_writel(bp, RSR, status);
 
 	work_done = 0;
-	if (!status) {
-		/*
-		 * This may happen if an interrupt was pending before
-		 * this function was called last time, and no packets
-		 * have been received since.
-		 */
-		netif_rx_complete(napi);
-		goto out;
-	}
 
 	dev_dbg(&bp->pdev->dev, "poll: status = %08lx, budget = %d\n",
 		(unsigned long)status, budget);
 
-	if (!(status & MACB_BIT(REC))) {
-		dev_warn(&bp->pdev->dev,
-			 "No RX buffers complete, status = %02lx\n",
-			 (unsigned long)status);
-		netif_rx_complete(napi);
-		goto out;
-	}
-
 	work_done = macb_rx(bp, budget);
 	if (work_done < budget)
 		netif_rx_complete(napi);
@@ -550,7 +533,6 @@
 	 * We've done what we can to clean the buffers. Make sure we
 	 * get notified when new packets arrive.
 	 */
-out:
 	macb_writel(bp, IER, MACB_RX_INT_FLAGS);
 
 	/* TODO: Handle errors */






diff -urN linux-2.6.29/drivers/net/macb.c linux-2.6.29_p3_handle_bna_and_incomplete_rx_frames/drivers/net/macb.c
--- linux-2.6.29/drivers/net/macb.c	2009-03-24 00:12:14.000000000 +0100
+++ linux-2.6.29_p3_handle_bna_and_incomplete_rx_frames/drivers/net/macb.c	2009-03-31 13:44:33.000000000 +0200
@@ -491,13 +491,28 @@
 
 		if (ctrl & MACB_BIT(RX_EOF)) {
 			int dropped;
-			BUG_ON(first_frag == -1);
-
-			dropped = macb_rx_frame(bp, first_frag, tail);
-			first_frag = -1;
-			if (!dropped) {
-				received++;
-				budget--;
+			if (first_frag != -1) {
+				dropped = macb_rx_frame(bp, first_frag, tail);
+				first_frag = -1;
+
+				if (!dropped) {
+					received++;
+					budget--;
+				}
+			} else {
+				/* We might end up here if we previously
+				 * flushed the RX buffer ring because of 
+				 * no available buffers.
+				 */
+				printk(KERN_INFO
+					"%s: Found EOF frag without any "
+					"previous SOF frag. Discarding "
+					"everything up until EOF.\n",
+					bp->dev->name);
+				/* This might also "discard" a couple of 
+				 * fragments that are already marked as unused.
+				 */
+				discard_partial_frame(bp, bp->rx_tail, tail);
 			}
 		}
 	}
@@ -543,6 +558,34 @@
 	}
 
 	work_done = macb_rx(bp, budget);
+
+	if (status & MACB_BIT(BNA)) {
+
+		printk(KERN_INFO
+			"%s: No RX buffers available. "
+			"Packets processed = %d\n", dev->name, work_done);
+
+		/* Only flush buffers if we did not manage to assemble
+		 * any complete frames in macb_rx() since each assembled
+		 * frame will result in free slots in the ring buffer.
+	 	 */
+		if (!work_done) {
+			/* No slots available in RX ring and no frames were
+		         * assembled. Mark all slots as unused. 
+		 	 * Hopefully this will let us recover... 
+			 */
+			int i;
+
+			printk(KERN_INFO "%s: No free RX buffers. "
+				"Flushing everything.\n", dev->name);
+
+			for (i = 0; i < RX_RING_SIZE; i++)
+				bp->rx_ring[i].addr &= ~MACB_BIT(RX_USED);
+
+			wmb();
+		}
+	}
+
 	if (work_done < budget)
 		netif_rx_complete(napi);
 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] macb: RLE and BNA handling
  2009-03-31 14:50             ` Erik Waling
@ 2009-04-01  9:31               ` Haavard Skinnemoen
  2009-04-01 13:05                 ` Erik Waling
  0 siblings, 1 reply; 11+ messages in thread
From: Haavard Skinnemoen @ 2009-04-01  9:31 UTC (permalink / raw)
  To: Erik Waling; +Cc: linux-kernel

Erik Waling wrote:
> Regarding BNA I think there will always be a possibility that we will
> fill the entire ring with start and mid fragments and no actual complete
> frame can be assembled. The probability that it happens will decrease
> with a larger buffer, but I still think there is a chance of it
> happening no matter the size of the buffer. I think we have two options:

Right, but the existing code will already discard frames which don't
have an end marker, won't it? So the only way things will get stuck is
if the ring is so small that the last incomplete frame will prevent the
next full frame to be stored.

> Option 1 - Discard one or a couple of fragments to enable us to receive
> a new fragment. Hopefully this will be an EOF fragment for a partial
> frame in our ring and we could assemble a complete frame and hence free
> some more space in the buffer. This option would however require some
> kind of an algorithm for choosing which fragment(s) to discard.

I'm not sure if I understand. We shouldn't kill the last frame even if
it doesn't have an EOF marker because it could be that it's still under
transfer. And as for other frames, they are either delivered or
discarded.

> Option 2 - Flush the entire ring. This will result in loss of more data
> than in option 1, but higher layers should be able to handle this. This
> option will not require any algorithm for choosing fragments to discard
> and will be easier to implement. I would vote for this option since BNA
> seems to happen very rarely with the current buffer ring size.

Ok...if BNA still happens with the other fixes applied, I guess we
should still consider this.

> > So, could you please split the patch up into the following parts:
> >   1. TX RLE handling
> >   2. Call macb_rx() regardless of the status
> >   3. Handle incomplete RX frames
> >   4. Special case for RX BNA
> 
> I did as you suggested and split my patches:
> 
> Patch 1: TX RLE handling
> Patch 2: Call macb_rx() regardless of the status
> Patch 3: BNA and incomplete RX frame handling
> 
> I didn't think it made any sense to make separate patches for incomplete
> RX frames and BNA handling since the only time we will encounter
> incomplete RX frames is if we have nuked some fragments in the buffer
> during BNA handling.

Ok, I was sort of thinking it might be better to handle such frames
properly instead BUG()ing in any case, but I agree it isn't actually
_supposed_ to happen without the BNA stuff applied as well.

> diff -urN linux-2.6.29/drivers/net/macb.c linux-2.6.29_p1_rle_fix/drivers/net/macb.c
> --- linux-2.6.29/drivers/net/macb.c	2009-03-24 00:12:14.000000000 +0100
> +++ linux-2.6.29_p1_rle_fix/drivers/net/macb.c	2009-03-31 13:31:39.000000000 +0200
> @@ -2,6 +2,7 @@
>   * Atmel MACB Ethernet Controller driver
>   *
>   * Copyright (C) 2004-2006 Atmel Corporation
> + * Copyright (C) 2008-2009 Konftel AB

I'm no copyright expert, but are you sure this stuff is significant
enough to claim copyright in the whole thing?

>   * This program is free software; you can redistribute it and/or modify
>   * it under the terms of the GNU General Public License version 2 as
> @@ -316,10 +317,11 @@
>  	dev_dbg(&bp->pdev->dev, "macb_tx status = %02lx\n",
>  		(unsigned long)status);
>  
> -	if (status & MACB_BIT(UND)) {
> +	if (status & (MACB_BIT(UND) | MACB_BIT(TSR_RLE))) {
>  		int i;
> -		printk(KERN_ERR "%s: TX underrun, resetting buffers\n",
> -			bp->dev->name);
> +		printk(KERN_ERR "%s: TX %s, resetting buffers\n",
> +			bp->dev->name, status & MACB_BIT(UND) ? 
> +			"underrun" : "retry limit exceeded");
>  
>  		/* Transfer ongoing, disable transmitter, to avoid confusion */
>  		if (status & MACB_BIT(TGO))
> @@ -591,7 +593,8 @@
>  			}
>  		}
>  
> -		if (status & (MACB_BIT(TCOMP) | MACB_BIT(ISR_TUND)))
> +		if (status & (MACB_BIT(TCOMP) | MACB_BIT(ISR_TUND) | 
> +			    MACB_BIT(ISR_RLE)))
>  			macb_tx(bp);
>  
>  		/*

I'll call this "macb: Handle Retry Limit Exceeded errors"

> diff -urN linux-2.6.29/drivers/net/macb.c linux-2.6.29_p2_no_rx_status_check/drivers/net/macb.c
> --- linux-2.6.29/drivers/net/macb.c	2009-03-24 00:12:14.000000000 +0100
> +++ linux-2.6.29_p2_no_rx_status_check/drivers/net/macb.c	2009-03-31 13:45:01.000000000 +0200
> @@ -521,27 +521,10 @@
>  	macb_writel(bp, RSR, status);
>  
>  	work_done = 0;
> -	if (!status) {
> -		/*
> -		 * This may happen if an interrupt was pending before
> -		 * this function was called last time, and no packets
> -		 * have been received since.
> -		 */
> -		netif_rx_complete(napi);
> -		goto out;
> -	}
>  
>  	dev_dbg(&bp->pdev->dev, "poll: status = %08lx, budget = %d\n",
>  		(unsigned long)status, budget);
>  
> -	if (!(status & MACB_BIT(REC))) {
> -		dev_warn(&bp->pdev->dev,
> -			 "No RX buffers complete, status = %02lx\n",
> -			 (unsigned long)status);
> -		netif_rx_complete(napi);
> -		goto out;
> -	}
> -
>  	work_done = macb_rx(bp, budget);
>  	if (work_done < budget)
>  		netif_rx_complete(napi);
> @@ -550,7 +533,6 @@
>  	 * We've done what we can to clean the buffers. Make sure we
>  	 * get notified when new packets arrive.
>  	 */
> -out:
>  	macb_writel(bp, IER, MACB_RX_INT_FLAGS);
>  
>  	/* TODO: Handle errors */

Hmm...I'm not sure how valid that TODO is. Should probably look into
that.

Anyway, I'll call this "macb: process the RX ring regardless of
interrupt status" and include some of the discussion.

> diff -urN linux-2.6.29/drivers/net/macb.c linux-2.6.29_p3_handle_bna_and_incomplete_rx_frames/drivers/net/macb.c
> --- linux-2.6.29/drivers/net/macb.c	2009-03-24 00:12:14.000000000 +0100
> +++ linux-2.6.29_p3_handle_bna_and_incomplete_rx_frames/drivers/net/macb.c	2009-03-31 13:44:33.000000000 +0200
> @@ -491,13 +491,28 @@
>  
>  		if (ctrl & MACB_BIT(RX_EOF)) {
>  			int dropped;
> -			BUG_ON(first_frag == -1);
> -
> -			dropped = macb_rx_frame(bp, first_frag, tail);
> -			first_frag = -1;
> -			if (!dropped) {
> -				received++;
> -				budget--;
> +			if (first_frag != -1) {
> +				dropped = macb_rx_frame(bp, first_frag, tail);
> +				first_frag = -1;
> +
> +				if (!dropped) {
> +					received++;
> +					budget--;
> +				}
> +			} else {
> +				/* We might end up here if we previously
> +				 * flushed the RX buffer ring because of 
> +				 * no available buffers.
> +				 */
> +				printk(KERN_INFO
> +					"%s: Found EOF frag without any "
> +					"previous SOF frag. Discarding "
> +					"everything up until EOF.\n",
> +					bp->dev->name);
> +				/* This might also "discard" a couple of 
> +				 * fragments that are already marked as unused.
> +				 */
> +				discard_partial_frame(bp, bp->rx_tail, tail);
>  			}
>  		}
>  	}
> @@ -543,6 +558,34 @@
>  	}
>  
>  	work_done = macb_rx(bp, budget);
> +
> +	if (status & MACB_BIT(BNA)) {
> +
> +		printk(KERN_INFO
> +			"%s: No RX buffers available. "
> +			"Packets processed = %d\n", dev->name, work_done);
> +
> +		/* Only flush buffers if we did not manage to assemble
> +		 * any complete frames in macb_rx() since each assembled
> +		 * frame will result in free slots in the ring buffer.
> +	 	 */
> +		if (!work_done) {
> +			/* No slots available in RX ring and no frames were
> +		         * assembled. Mark all slots as unused. 
> +		 	 * Hopefully this will let us recover... 
> +			 */
> +			int i;
> +
> +			printk(KERN_INFO "%s: No free RX buffers. "
> +				"Flushing everything.\n", dev->name);
> +
> +			for (i = 0; i < RX_RING_SIZE; i++)
> +				bp->rx_ring[i].addr &= ~MACB_BIT(RX_USED);

Hmm...ok, I guess this is reasonably safe, but I still suspect a race
which might mark a buffer as unused right after it was filled by the
macb. It's very unlikely to happen, but a nasty long-running interrupt
handler could cause problems if it happens at _just_ the wrong time...

And if it happens, I suspect the ring indices in the driver will get
out of sync with the hardware, which is very bad. So if you really need
to do this, I think it's safest to disable RX while flushing the ring
and reset all the pointers afterwards.

Haavard

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] macb: RLE and BNA handling
  2009-04-01  9:31               ` Haavard Skinnemoen
@ 2009-04-01 13:05                 ` Erik Waling
  0 siblings, 0 replies; 11+ messages in thread
From: Erik Waling @ 2009-04-01 13:05 UTC (permalink / raw)
  To: Haavard Skinnemoen; +Cc: linux-kernel

Hi Haavard,

See my comments below.

On Wed, 2009-04-01 at 11:31 +0200, Haavard Skinnemoen wrote:
> Erik Waling wrote:
> > Regarding BNA I think there will always be a possibility that we will
> > fill the entire ring with start and mid fragments and no actual complete
> > frame can be assembled. The probability that it happens will decrease
> > with a larger buffer, but I still think there is a chance of it
> > happening no matter the size of the buffer. I think we have two options:
> 
> Right, but the existing code will already discard frames which don't
> have an end marker, won't it? So the only way things will get stuck is
> if the ring is so small that the last incomplete frame will prevent the
> next full frame to be stored.

Thats true. The code will discard a partial frame if two SOF fragments
are discovered before an EOF fragment. I'm not sure how the buffer looks
when BNA happens, but it has to contain none or one start fragment and
the rest of the slots will contain mid fragments. I have only
experienced this during high network load when working behind an old
repeater hub.

> > Option 1 - Discard one or a couple of fragments to enable us to receive
> > a new fragment. Hopefully this will be an EOF fragment for a partial
> > frame in our ring and we could assemble a complete frame and hence free
> > some more space in the buffer. This option would however require some
> > kind of an algorithm for choosing which fragment(s) to discard.
> 
> I'm not sure if I understand. We shouldn't kill the last frame even if
> it doesn't have an EOF marker because it could be that it's still under
> transfer. And as for other frames, they are either delivered or
> discarded.
> 
> > Option 2 - Flush the entire ring. This will result in loss of more data
> > than in option 1, but higher layers should be able to handle this. This
> > option will not require any algorithm for choosing fragments to discard
> > and will be easier to implement. I would vote for this option since BNA
> > seems to happen very rarely with the current buffer ring size.
> 
> Ok...if BNA still happens with the other fixes applied, I guess we
> should still consider this.

I haven't managed to trigger the BNA issue with the "macb: process the
RX ring regardless of interrupt status" patch. Maybe it would be a good
idea to just add a print if BNA happens. Then you should get some bug
reports if people see that message when their network traffic stalls.

> > > So, could you please split the patch up into the following parts:
> > >   1. TX RLE handling
> > >   2. Call macb_rx() regardless of the status
> > >   3. Handle incomplete RX frames
> > >   4. Special case for RX BNA
> >
> > I did as you suggested and split my patches:
> >
> > Patch 1: TX RLE handling
> > Patch 2: Call macb_rx() regardless of the status
> > Patch 3: BNA and incomplete RX frame handling
> >
> > I didn't think it made any sense to make separate patches for incomplete
> > RX frames and BNA handling since the only time we will encounter
> > incomplete RX frames is if we have nuked some fragments in the buffer
> > during BNA handling.
> 
> Ok, I was sort of thinking it might be better to handle such frames
> properly instead BUG()ing in any case, but I agree it isn't actually
> _supposed_ to happen without the BNA stuff applied as well.

Do you still want to apply that part even though it should not be
necessary? I can add that as a seperate patch if you want.

> > diff -urN linux-2.6.29/drivers/net/macb.c linux-2.6.29_p1_rle_fix/drivers/net/macb.c
> > --- linux-2.6.29/drivers/net/macb.c   2009-03-24 00:12:14.000000000 +0100
> > +++ linux-2.6.29_p1_rle_fix/drivers/net/macb.c        2009-03-31 13:31:39.000000000 +0200
> > @@ -2,6 +2,7 @@
> >   * Atmel MACB Ethernet Controller driver
> >   *
> >   * Copyright (C) 2004-2006 Atmel Corporation
> > + * Copyright (C) 2008-2009 Konftel AB
> 
> I'm no copyright expert, but are you sure this stuff is significant
> enough to claim copyright in the whole thing?

I wouldn't personally claim copyright for it. I just added it to one
patch since I didn't write this code on my spare time and because of
that I don't own the code myself.           

> >   * This program is free software; you can redistribute it and/or modify
> >   * it under the terms of the GNU General Public License version 2 as
> > @@ -316,10 +317,11 @@
> >       dev_dbg(&bp->pdev->dev, "macb_tx status = %02lx\n",
> >               (unsigned long)status);
> >
> > -     if (status & MACB_BIT(UND)) {
> > +     if (status & (MACB_BIT(UND) | MACB_BIT(TSR_RLE))) {
> >               int i;
> > -             printk(KERN_ERR "%s: TX underrun, resetting buffers\n",
> > -                     bp->dev->name);
> > +             printk(KERN_ERR "%s: TX %s, resetting buffers\n",
> > +                     bp->dev->name, status & MACB_BIT(UND) ?
> > +                     "underrun" : "retry limit exceeded");
> >
> >               /* Transfer ongoing, disable transmitter, to avoid confusion */
> >               if (status & MACB_BIT(TGO))
> > @@ -591,7 +593,8 @@
> >                       }
> >               }
> >
> > -             if (status & (MACB_BIT(TCOMP) | MACB_BIT(ISR_TUND)))
> > +             if (status & (MACB_BIT(TCOMP) | MACB_BIT(ISR_TUND) |
> > +                         MACB_BIT(ISR_RLE)))
> >                       macb_tx(bp);
> >
> >               /*
> 
> I'll call this "macb: Handle Retry Limit Exceeded errors"
> 
> > diff -urN linux-2.6.29/drivers/net/macb.c linux-2.6.29_p2_no_rx_status_check/drivers/net/macb.c
> > --- linux-2.6.29/drivers/net/macb.c   2009-03-24 00:12:14.000000000 +0100
> > +++ linux-2.6.29_p2_no_rx_status_check/drivers/net/macb.c     2009-03-31 13:45:01.000000000 +0200
> > @@ -521,27 +521,10 @@
> >       macb_writel(bp, RSR, status);
> >
> >       work_done = 0;
> > -     if (!status) {
> > -             /*
> > -              * This may happen if an interrupt was pending before
> > -              * this function was called last time, and no packets
> > -              * have been received since.
> > -              */
> > -             netif_rx_complete(napi);
> > -             goto out;
> > -     }
> >
> >       dev_dbg(&bp->pdev->dev, "poll: status = %08lx, budget = %d\n",
> >               (unsigned long)status, budget);
> >
> > -     if (!(status & MACB_BIT(REC))) {
> > -             dev_warn(&bp->pdev->dev,
> > -                      "No RX buffers complete, status = %02lx\n",
> > -                      (unsigned long)status);
> > -             netif_rx_complete(napi);
> > -             goto out;
> > -     }
> > -
> >       work_done = macb_rx(bp, budget);
> >       if (work_done < budget)
> >               netif_rx_complete(napi);
> > @@ -550,7 +533,6 @@
> >        * We've done what we can to clean the buffers. Make sure we
> >        * get notified when new packets arrive.
> >        */
> > -out:
> >       macb_writel(bp, IER, MACB_RX_INT_FLAGS);
> >
> >       /* TODO: Handle errors */
> 
> Hmm...I'm not sure how valid that TODO is. Should probably look into
> that.
> 
> Anyway, I'll call this "macb: process the RX ring regardless of
> interrupt status" and include some of the discussion.

Sounds good. I didn't want to remove it since I didn't know if you had
any other issues you wanted to resolve in the future.


> > diff -urN linux-2.6.29/drivers/net/macb.c linux-2.6.29_p3_handle_bna_and_incomplete_rx_frames/drivers/net/macb.c
> > --- linux-2.6.29/drivers/net/macb.c   2009-03-24 00:12:14.000000000 +0100
> > +++ linux-2.6.29_p3_handle_bna_and_incomplete_rx_frames/drivers/net/macb.c    2009-03-31 13:44:33.000000000 +0200
> > @@ -491,13 +491,28 @@
> >
> >               if (ctrl & MACB_BIT(RX_EOF)) {
> >                       int dropped;
> > -                     BUG_ON(first_frag == -1);
> > -
> > -                     dropped = macb_rx_frame(bp, first_frag, tail);
> > -                     first_frag = -1;
> > -                     if (!dropped) {
> > -                             received++;
> > -                             budget--;
> > +                     if (first_frag != -1) {
> > +                             dropped = macb_rx_frame(bp, first_frag, tail);
> > +                             first_frag = -1;
> > +
> > +                             if (!dropped) {
> > +                                     received++;
> > +                                     budget--;
> > +                             }
> > +                     } else {
> > +                             /* We might end up here if we previously
> > +                              * flushed the RX buffer ring because of
> > +                              * no available buffers.
> > +                              */
> > +                             printk(KERN_INFO
> > +                                     "%s: Found EOF frag without any "
> > +                                     "previous SOF frag. Discarding "
> > +                                     "everything up until EOF.\n",
> > +                                     bp->dev->name);
> > +                             /* This might also "discard" a couple of
> > +                              * fragments that are already marked as unused.
> > +                              */
> > +                             discard_partial_frame(bp, bp->rx_tail, tail);
> >                       }
> >               }
> >       }
> > @@ -543,6 +558,34 @@
> >       }
> >
> >       work_done = macb_rx(bp, budget);
> > +
> > +     if (status & MACB_BIT(BNA)) {
> > +
> > +             printk(KERN_INFO
> > +                     "%s: No RX buffers available. "
> > +                     "Packets processed = %d\n", dev->name, work_done);
> > +
> > +             /* Only flush buffers if we did not manage to assemble
> > +              * any complete frames in macb_rx() since each assembled
> > +              * frame will result in free slots in the ring buffer.
> > +              */
> > +             if (!work_done) {
> > +                     /* No slots available in RX ring and no frames were
> > +                      * assembled. Mark all slots as unused.
> > +                      * Hopefully this will let us recover...
> > +                      */
> > +                     int i;
> > +
> > +                     printk(KERN_INFO "%s: No free RX buffers. "
> > +                             "Flushing everything.\n", dev->name);
> > +
> > +                     for (i = 0; i < RX_RING_SIZE; i++)
> > +                             bp->rx_ring[i].addr &= ~MACB_BIT(RX_USED);
> 
> Hmm...ok, I guess this is reasonably safe, but I still suspect a race
> which might mark a buffer as unused right after it was filled by the
> macb. It's very unlikely to happen, but a nasty long-running interrupt
> handler could cause problems if it happens at _just_ the wrong time...
> 
> And if it happens, I suspect the ring indices in the driver will get
> out of sync with the hardware, which is very bad. So if you really need
> to do this, I think it's safest to disable RX while flushing the ring
> and reset all the pointers afterwards.

You are probably right. RX should be disabled while flushing the buffer.

-- Erik



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2009-04-01 13:05 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-01-13 16:21 [PATCH] macb: RLE and BNA handling Erik Waling
2009-01-14  7:47 ` Erik Waling
2009-01-14 10:15 ` Haavard Skinnemoen
2009-01-14 10:27   ` Erik Waling
2009-01-14 12:03     ` Haavard Skinnemoen
2009-01-15 13:37       ` Erik Waling
2009-03-26  9:38         ` Erik Waling
2009-03-30 12:04           ` Haavard Skinnemoen
2009-03-31 14:50             ` Erik Waling
2009-04-01  9:31               ` Haavard Skinnemoen
2009-04-01 13:05                 ` Erik Waling

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.