* Gianfar: RX Recycle skb->len error @ 2010-03-20 19:54 Ben Menchaca (ben@bigfootnetworks.com) 2010-03-22 4:46 ` David Miller 0 siblings, 1 reply; 7+ messages in thread From: Ben Menchaca (ben@bigfootnetworks.com) @ 2010-03-20 19:54 UTC (permalink / raw) To: netdev We are seeing some random skb data length errors on RX after long-running, full-gigabit traffic. First, my debugging and solution are based on the following invariant assumption: (skb->tail - skb->data) == skb->len If this is wrong, please educate. After some tracing, here is where the error packets seem to originate: 1. We are cleaning rx, in gfar_clean_rx_ring; 2. A new RX skb is drawn from the rx_recycle queue, and obey the above invariant (so, in gfar_new_skb(), __skb_dequeue returns an skb); 3. At this point skb_reserve is called, which moves data and tail by the same calculated alignamount; 4. So, newskb is not NULL. However, !(bdp->status & RXBD_LAST) || (bdp->status & RXBD_ERR)) is evaluates to true; 5. Since newskb is not NULL, we arrive at the else if (skb), which is true; 6. skb->data = skb->head + NET_SKB_PAD is applied, and then the skb is requeued for recycling. At this point, skb->data != skb->tail, but skb->len == 0. When this skb is used for the next RX, it is causing issues later when we skb_put trailers, and then trust skb->len. I would propose something like: --- a/drivers/net/gianfar.c +++ b/drivers/net/gianfar.c @@ -2540,6 +2540,7 @@ * recycle list. */ skb->data = skb->head + NET_SKB_PAD; + skb_reset_tail_pointer(skb); __skb_queue_head(&priv->rx_recycle, skb); } } else { Ben Menchaca Bigfoot Networks ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Gianfar: RX Recycle skb->len error 2010-03-20 19:54 Gianfar: RX Recycle skb->len error Ben Menchaca (ben@bigfootnetworks.com) @ 2010-03-22 4:46 ` David Miller 2010-03-22 17:24 ` Anton Vorontsov 0 siblings, 1 reply; 7+ messages in thread From: David Miller @ 2010-03-22 4:46 UTC (permalink / raw) To: ben; +Cc: netdev, avorontsov, Sandeep.Kumar From: "Ben Menchaca (ben@bigfootnetworks.com)" <ben@bigfootnetworks.com> Date: Sat, 20 Mar 2010 12:54:59 -0700 > We are seeing some random skb data length errors on RX after long-running, full-gigabit traffic. First, my debugging and solution are based on the following invariant assumption: > (skb->tail - skb->data) == skb->len > > If this is wrong, please educate. > > After some tracing, here is where the error packets seem to originate: > 1. We are cleaning rx, in gfar_clean_rx_ring; > 2. A new RX skb is drawn from the rx_recycle queue, and obey the above invariant (so, in gfar_new_skb(), __skb_dequeue returns an skb); > 3. At this point skb_reserve is called, which moves data and tail by the same calculated alignamount; > 4. So, newskb is not NULL. However, !(bdp->status & RXBD_LAST) || (bdp->status & RXBD_ERR)) is evaluates to true; > 5. Since newskb is not NULL, we arrive at the else if (skb), which is true; > 6. skb->data = skb->head + NET_SKB_PAD is applied, and then the skb is requeued for recycling. > > At this point, skb->data != skb->tail, but skb->len == 0. When this skb is used for the next RX, it is causing issues later when we skb_put trailers, and then trust skb->len. > > I would propose something like: Thanks for debugging this, some gianfar developers CC:'d. > @@ -2540,6 +2540,7 @@ > * recycle list. > */ > skb->data = skb->head + NET_SKB_PAD; > + skb_reset_tail_pointer(skb); > __skb_queue_head(&priv->rx_recycle, skb); > } > } else { This code is essentially trying to undo skb_reserve() but as you found it's doing so in a buggy manner. skb_reserve() adjusts both the 'data' and 'tail' pointers, but this attempt at a reversal is only modifying 'data'. Your fix is fine, but really any by-hand modification of skb->data is a bug, and we should provide an skb_unreserve() or similar to hide such details away, and use it here. Anton? ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Gianfar: RX Recycle skb->len error 2010-03-22 4:46 ` David Miller @ 2010-03-22 17:24 ` Anton Vorontsov 2010-03-22 21:10 ` Ben Menchaca (ben@bigfootnetworks.com) 0 siblings, 1 reply; 7+ messages in thread From: Anton Vorontsov @ 2010-03-22 17:24 UTC (permalink / raw) To: David Miller; +Cc: ben, netdev, Sandeep.Kumar On Sun, Mar 21, 2010 at 09:46:42PM -0700, David Miller wrote: [...] > > * recycle list. > > */ > > skb->data = skb->head + NET_SKB_PAD; > > + skb_reset_tail_pointer(skb); > > __skb_queue_head(&priv->rx_recycle, skb); > > } > > } else { > > This code is essentially trying to undo skb_reserve() > but as you found it's doing so in a buggy manner. > > skb_reserve() adjusts both the 'data' and 'tail' pointers, > but this attempt at a reversal is only modifying 'data'. > > Your fix is fine, but really any by-hand modification of > skb->data is a bug, and we should provide an skb_unreserve() > or similar to hide such details away, and use it here. > > Anton? Yes, skb_unreserve() (or skb_reset_reserved() for naming consistency?) would be great. Ben, note that ucc_geth.c driver is also affected by that bug, so I guess it needs a similar fix. Thanks, -- Anton Vorontsov email: cbouatmailru@gmail.com irc://irc.freenode.net/bd2 ^ permalink raw reply [flat|nested] 7+ messages in thread
* RE: Gianfar: RX Recycle skb->len error 2010-03-22 17:24 ` Anton Vorontsov @ 2010-03-22 21:10 ` Ben Menchaca (ben@bigfootnetworks.com) 2010-03-23 3:30 ` David Miller 0 siblings, 1 reply; 7+ messages in thread From: Ben Menchaca (ben@bigfootnetworks.com) @ 2010-03-22 21:10 UTC (permalink / raw) To: avorontsov, David Miller; +Cc: netdev, Sandeep.Kumar > Yes, skb_unreserve() (or skb_reset_reserved() for naming consistency?) > would be great. Just a couple of notes: It's yucky, but skb_reserve(skb, -alignamount) works, if you know the alignamount (which I discuss a bit below). If that's not intended, then should len be unsigned int? skb_put, skb_push, and skb_pull all seem to be unsigned int. It seems in both these cases for gianfar, the amount of the alignment is not immediately available to the code the recognizes that the skb_reset_reserved() is required. A bit larger rework appears to be needed. I took a shot at using a new heuristic for the rx_ring to prevent having to skb_reset_reserved() at all. The idea is to guarantee that the elements in ring are reserve()-ed, and that if we encounter an error condition that yields the two-skb case, free the new skb, since it has not yet been reserve()-ed. Looking forward to hearing from FS on the "right" way, though. Ben Menchaca Bigfoot Networks diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c index b671555..82f8486 100644 --- a/drivers/net/gianfar.c +++ b/drivers/net/gianfar.c @@ -109,6 +109,7 @@ static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev); static void gfar_reset_task(struct work_struct *work); static void gfar_timeout(struct net_device *dev); static int gfar_close(struct net_device *dev); +static void gfar_skb_reserve_aligned(struct sk_buff *skb); struct sk_buff *gfar_new_skb(struct net_device *dev); static void gfar_new_rxbdp(struct gfar_priv_rx_q *rx_queue, struct rxbd8 *bdp, struct sk_buff *skb); @@ -214,6 +215,7 @@ static int gfar_init_bds(struct net_device *ndev) ndev->name); goto err_rxalloc_fail; } + gfar_skb_reserve_aligned(ndev); rx_queue->rx_skbuff[j] = skb; gfar_new_rxbdp(rx_queue, rxbdp, skb); @@ -2372,6 +2374,20 @@ static void gfar_new_rxbdp(struct gfar_priv_rx_q *rx_queue, struct rxbd8 *bdp, } +static void gfar_skb_reserve_aligned(struct sk_buff *skb) +{ + unsigned int alignamount; + + alignamount = RXBUF_ALIGNMENT - + (((unsigned long) skb->data) & (RXBUF_ALIGNMENT - 1)); + + /* We need the data buffer to be aligned properly. We will reserve + * as many bytes as needed to align the data properly + */ + skb_reserve(skb, alignamount); +} + + struct sk_buff * gfar_new_skb(struct net_device *dev) { unsigned int alignamount; @@ -2383,17 +2399,6 @@ struct sk_buff * gfar_new_skb(struct net_device *dev) skb = netdev_alloc_skb(dev, priv->rx_buffer_size + RXBUF_ALIGNMENT); - if (!skb) - return NULL; - - alignamount = RXBUF_ALIGNMENT - - (((unsigned long) skb->data) & (RXBUF_ALIGNMENT - 1)); - - /* We need the data buffer to be aligned properly. We will reserve - * as many bytes as needed to align the data properly - */ - skb_reserve(skb, alignamount); - return skb; } @@ -2532,21 +2537,17 @@ int gfar_clean_rx_ring(struct gfar_priv_rx_q *rx_queue, int rx_work_limit) if (unlikely(!newskb)) newskb = skb; else if (skb) { - /* - * We need to reset ->data to what it - * was before gfar_new_skb() re-aligned - * it to an RXBUF_ALIGNMENT boundary - * before we put the skb back on the - * recycle list. - */ - skb->data = skb->head + NET_SKB_PAD; - __skb_queue_head(&priv->rx_recycle, skb); + __skb_queue_head(&priv->rx_recycle, newskb); + newskb = skb; + } else { + gfar_skb_reserve_aligned(newskb); } } else { /* Increment the number of packets */ rx_queue->stats.rx_packets++; howmany++; + gfar_skb_reserve_aligned(newskb); if (likely(skb)) { pkt_len = bdp->length - ETH_FCS_LEN; /* Remove the FCS from the packet length */ ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: Gianfar: RX Recycle skb->len error 2010-03-22 21:10 ` Ben Menchaca (ben@bigfootnetworks.com) @ 2010-03-23 3:30 ` David Miller 2010-03-23 14:16 ` Ben Menchaca (ben@bigfootnetworks.com) 0 siblings, 1 reply; 7+ messages in thread From: David Miller @ 2010-03-23 3:30 UTC (permalink / raw) To: ben; +Cc: avorontsov, netdev, Sandeep.Kumar From: "Ben Menchaca (ben@bigfootnetworks.com)" <ben@bigfootnetworks.com> Date: Mon, 22 Mar 2010 14:10:48 -0700 > It's yucky, but skb_reserve(skb, -alignamount) works, I have no problem with people using that. > It seems in both these cases for gianfar, the amount of the > alignment is not immediately available to the code the > recognizes that the skb_reset_reserved() is required. A bit > larger rework appears to be needed. There's no need to make this so complicated. Just remember the value and then refer to it later, when needed. struct gianfar_skb_cb { int alignamount; }; #define GIANFAR_CB(skb) ((struct gianfar_skb_cb *)((skb)->cb)) ... GIANFAR_CB(skb)->alignamount = alignamount; ... skb_reserve(skb, -GIANFAR_CB(skb)->alignamount); ^ permalink raw reply [flat|nested] 7+ messages in thread
* RE: Gianfar: RX Recycle skb->len error 2010-03-23 3:30 ` David Miller @ 2010-03-23 14:16 ` Ben Menchaca (ben@bigfootnetworks.com) 2010-03-23 20:00 ` David Miller 0 siblings, 1 reply; 7+ messages in thread From: Ben Menchaca (ben@bigfootnetworks.com) @ 2010-03-23 14:16 UTC (permalink / raw) To: David Miller; +Cc: avorontsov, netdev, Sandeep.Kumar, kumar.gala > There's no need to make this so complicated. Just remember the > value and then refer to it later, when needed. Thanks for the sanity adjustment! As suggested, then...hope to hear from FS soon. diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c index b671555..669de02 100644 --- a/drivers/net/gianfar.c +++ b/drivers/net/gianfar.c @@ -2393,6 +2393,7 @@ struct sk_buff * gfar_new_skb(struct net_device *dev) * as many bytes as needed to align the data properly */ skb_reserve(skb, alignamount); + GFAR_CB(skb)->alignamount = alignamount; return skb; } @@ -2533,13 +2534,13 @@ int gfar_clean_rx_ring(struct gfar_priv_rx_q *rx_queue, int rx_work_limit) newskb = skb; else if (skb) { /* - * We need to reset ->data to what it + * We need to un-reserve() the skb to what it * was before gfar_new_skb() re-aligned * it to an RXBUF_ALIGNMENT boundary * before we put the skb back on the * recycle list. */ - skb->data = skb->head + NET_SKB_PAD; + skb_reserve(skb, -GFAR_CB(skb)->alignamount); __skb_queue_head(&priv->rx_recycle, skb); } } else { diff --git a/drivers/net/gianfar.h b/drivers/net/gianfar.h index 3d72dc4..3ae2c77 100644 --- a/drivers/net/gianfar.h +++ b/drivers/net/gianfar.h @@ -566,6 +566,12 @@ struct rxfcb { u16 vlctl; /* VLAN control word */ }; +struct gfar_skb_cb { + int alignamount; +}; + +#define GFAR_CB(skb) ((struct gfar_skb_cb *)((skb)->cb)) + struct rmon_mib { u32 tr64; /* 0x.680 - Transmit and Receive 64-byte Frame Counter */ ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: Gianfar: RX Recycle skb->len error 2010-03-23 14:16 ` Ben Menchaca (ben@bigfootnetworks.com) @ 2010-03-23 20:00 ` David Miller 0 siblings, 0 replies; 7+ messages in thread From: David Miller @ 2010-03-23 20:00 UTC (permalink / raw) To: ben; +Cc: avorontsov, netdev, Sandeep.Kumar, kumar.gala From: "Ben Menchaca (ben@bigfootnetworks.com)" <ben@bigfootnetworks.com> Date: Tue, 23 Mar 2010 07:16:58 -0700 >> There's no need to make this so complicated. Just remember the >> value and then refer to it later, when needed. > > Thanks for the sanity adjustment! As suggested, then...hope to hear > from FS soon. Let's get this tested and with a proper commit message and signoff so it can get applied. Thanks! ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2010-03-23 19:59 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2010-03-20 19:54 Gianfar: RX Recycle skb->len error Ben Menchaca (ben@bigfootnetworks.com) 2010-03-22 4:46 ` David Miller 2010-03-22 17:24 ` Anton Vorontsov 2010-03-22 21:10 ` Ben Menchaca (ben@bigfootnetworks.com) 2010-03-23 3:30 ` David Miller 2010-03-23 14:16 ` Ben Menchaca (ben@bigfootnetworks.com) 2010-03-23 20:00 ` David Miller
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.