All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/1] bnx2x: Tx barriers and locks
@ 2010-02-28 10:12 Vladislav Zolotarov
  2010-03-01  2:49 ` David Miller
  2010-03-01 13:33 ` Stanislaw Gruszka
  0 siblings, 2 replies; 15+ messages in thread
From: Vladislav Zolotarov @ 2010-02-28 10:12 UTC (permalink / raw)
  To: netdev, davem; +Cc: sgruszka, eilong

[Resending with the proper subject. Sorry for the mess. ]

This patch is based on the RFC of Stanislaw Gruszka.

More specifically it fixes two possible races:
- One, described by Stanislaw, may lead to permanent disabling of the Tx
queue.
This is fixed by adding the smp_wmb() to propagate the BD consumer
change towards the memory.
- Second may lead to bnx2x_start_xmit() returning NETDEV_TX_BUSY.
This is fixed by taking a tx_lock() before rechecking the number of
available Tx BDs.

thanks,
vlad 

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x_main.c |   31 ++++++++++++++++++++++---------
 1 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index 5adf2a0..ed785a3 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -57,8 +57,8 @@
 #include "bnx2x_init_ops.h"
 #include "bnx2x_dump.h"
 
-#define DRV_MODULE_VERSION	"1.52.1-6"
-#define DRV_MODULE_RELDATE	"2010/02/16"
+#define DRV_MODULE_VERSION	"1.52.1-7"
+#define DRV_MODULE_RELDATE	"2010/02/28"
 #define BNX2X_BC_VER		0x040200
 
 #include <linux/firmware.h>
@@ -957,21 +957,34 @@ static int bnx2x_tx_int(struct bnx2x_fastpath *fp)
 	fp->tx_pkt_cons = sw_cons;
 	fp->tx_bd_cons = bd_cons;
 
+	/* Need to make the tx_bd_cons update visible to start_xmit()
+	 * before checking for netif_tx_queue_stopped().  Without the
+	 * memory barrier, there is a small possibility that
+	 * start_xmit() will miss it and cause the queue to be stopped
+	 * forever.
+	 */
+	smp_wmb();
+
 	/* TBD need a thresh? */
 	if (unlikely(netif_tx_queue_stopped(txq))) {
-
-		/* Need to make the tx_bd_cons update visible to start_xmit()
-		 * before checking for netif_tx_queue_stopped().  Without the
-		 * memory barrier, there is a small possibility that
-		 * start_xmit() will miss it and cause the queue to be stopped
-		 * forever.
+		/* Taking tx_lock() is needed to prevent reenabling the queue
+		 * while it's empty. This could have happen if rx_action() gets
+		 * suspended in bnx2x_tx_int() after the condition before
+		 * netif_tx_wake_queue(), while tx_action (bnx2x_start_xmit()):
+		 *
+		 * stops the queue->sees fresh tx_bd_cons->releases the queue->
+		 * sends some packets consuming the whole queue again->
+		 * stops the queue
 		 */
-		smp_mb();
+
+		__netif_tx_lock(txq, smp_processor_id());
 
 		if ((netif_tx_queue_stopped(txq)) &&
 		    (bp->state == BNX2X_STATE_OPEN) &&
 		    (bnx2x_tx_avail(fp) >= MAX_SKB_FRAGS + 3))
 			netif_tx_wake_queue(txq);
+
+		__netif_tx_unlock(txq);
 	}
 	return 0;
 }
-- 
1.6.3.3





^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] bnx2x: Tx barriers and locks
  2010-02-28 10:12 [PATCH 1/1] bnx2x: Tx barriers and locks Vladislav Zolotarov
@ 2010-03-01  2:49 ` David Miller
  2010-03-01 13:33 ` Stanislaw Gruszka
  1 sibling, 0 replies; 15+ messages in thread
From: David Miller @ 2010-03-01  2:49 UTC (permalink / raw)
  To: vladz; +Cc: netdev, sgruszka, eilong

From: "Vladislav Zolotarov" <vladz@broadcom.com>
Date: Sun, 28 Feb 2010 12:12:02 +0200

> [Resending with the proper subject. Sorry for the mess. ]
> 
> This patch is based on the RFC of Stanislaw Gruszka.
> 
> More specifically it fixes two possible races:
> - One, described by Stanislaw, may lead to permanent disabling of the Tx
> queue.
> This is fixed by adding the smp_wmb() to propagate the BD consumer
> change towards the memory.
> - Second may lead to bnx2x_start_xmit() returning NETDEV_TX_BUSY.
> This is fixed by taking a tx_lock() before rechecking the number of
> available Tx BDs.
> 
> thanks,
> vlad 
> 
> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
> Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
> Signed-off-by: Eilon Greenstein <eilong@broadcom.com>

Applied, thanks everyone.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] bnx2x: Tx barriers and locks
  2010-02-28 10:12 [PATCH 1/1] bnx2x: Tx barriers and locks Vladislav Zolotarov
  2010-03-01  2:49 ` David Miller
@ 2010-03-01 13:33 ` Stanislaw Gruszka
  2010-03-01 17:59   ` Michael Chan
  1 sibling, 1 reply; 15+ messages in thread
From: Stanislaw Gruszka @ 2010-03-01 13:33 UTC (permalink / raw)
  To: Vladislav Zolotarov; +Cc: netdev, davem, eilong, Matt Carlson, Michael Chan

On Sun, Feb 28, 2010 at 12:12:02PM +0200, Vladislav Zolotarov wrote:
> --- a/drivers/net/bnx2x_main.c
> +++ b/drivers/net/bnx2x_main.c
> @@ -57,8 +57,8 @@
>  #include "bnx2x_init_ops.h"
>  #include "bnx2x_dump.h"
>  
> -#define DRV_MODULE_VERSION	"1.52.1-6"
> -#define DRV_MODULE_RELDATE	"2010/02/16"
> +#define DRV_MODULE_VERSION	"1.52.1-7"
> +#define DRV_MODULE_RELDATE	"2010/02/28"
>  #define BNX2X_BC_VER		0x040200
>  
>  #include <linux/firmware.h>
> @@ -957,21 +957,34 @@ static int bnx2x_tx_int(struct bnx2x_fastpath *fp)
>  	fp->tx_pkt_cons = sw_cons;
>  	fp->tx_bd_cons = bd_cons;
>  
> +	/* Need to make the tx_bd_cons update visible to start_xmit()
> +	 * before checking for netif_tx_queue_stopped().  Without the
> +	 * memory barrier, there is a small possibility that
> +	 * start_xmit() will miss it and cause the queue to be stopped
> +	 * forever.
> +	 */
> +	smp_wmb();
> +
>  	/* TBD need a thresh? */
>  	if (unlikely(netif_tx_queue_stopped(txq))) {
> -
> -		/* Need to make the tx_bd_cons update visible to start_xmit()
> -		 * before checking for netif_tx_queue_stopped().  Without the
> -		 * memory barrier, there is a small possibility that
> -		 * start_xmit() will miss it and cause the queue to be stopped
> -		 * forever.
> +		/* Taking tx_lock() is needed to prevent reenabling the queue
> +		 * while it's empty. This could have happen if rx_action() gets
> +		 * suspended in bnx2x_tx_int() after the condition before
> +		 * netif_tx_wake_queue(), while tx_action (bnx2x_start_xmit()):
> +		 *
> +		 * stops the queue->sees fresh tx_bd_cons->releases the queue->
> +		 * sends some packets consuming the whole queue again->
> +		 * stops the queue
>  		 */
> -		smp_mb();
> +
> +		__netif_tx_lock(txq, smp_processor_id());
>  
>  		if ((netif_tx_queue_stopped(txq)) &&
>  		    (bp->state == BNX2X_STATE_OPEN) &&
>  		    (bnx2x_tx_avail(fp) >= MAX_SKB_FRAGS + 3))
>  			netif_tx_wake_queue(txq);
> +
> +		__netif_tx_unlock(txq);
>  	}
>  	return 0;
>  }

There is still difference between what we have in bnx2x and bnx2/tg3
regarding memory barriers in tx_poll/start_xmit code. Mainly we have
smp_mb() in bnx2/tg3_tx_avail(), and in bnx2/tg3_tx_int() is smp_mb()
not smp_wmb(). I do not see that bnx2x is wrong, but would like to know
why there is a difference, maybe bnx2/tg3 should be changed?

Stanislaw

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] bnx2x: Tx barriers and locks
  2010-03-01 13:33 ` Stanislaw Gruszka
@ 2010-03-01 17:59   ` Michael Chan
  2010-03-02 10:38     ` Vladislav Zolotarov
  2010-03-02 11:30     ` Stanislaw Gruszka
  0 siblings, 2 replies; 15+ messages in thread
From: Michael Chan @ 2010-03-01 17:59 UTC (permalink / raw)
  To: Stanislaw Gruszka
  Cc: Vladislav Zolotarov, netdev, davem, Eilon Greenstein, Matthew Carlson


On Mon, 2010-03-01 at 05:33 -0800, Stanislaw Gruszka wrote:
> On Sun, Feb 28, 2010 at 12:12:02PM +0200, Vladislav Zolotarov wrote:
> > --- a/drivers/net/bnx2x_main.c
> > +++ b/drivers/net/bnx2x_main.c
> > @@ -57,8 +57,8 @@
> >  #include "bnx2x_init_ops.h"
> >  #include "bnx2x_dump.h"
> >  
> > -#define DRV_MODULE_VERSION	"1.52.1-6"
> > -#define DRV_MODULE_RELDATE	"2010/02/16"
> > +#define DRV_MODULE_VERSION	"1.52.1-7"
> > +#define DRV_MODULE_RELDATE	"2010/02/28"
> >  #define BNX2X_BC_VER		0x040200
> >  
> >  #include <linux/firmware.h>
> > @@ -957,21 +957,34 @@ static int bnx2x_tx_int(struct bnx2x_fastpath *fp)
> >  	fp->tx_pkt_cons = sw_cons;
> >  	fp->tx_bd_cons = bd_cons;
> >  
> > +	/* Need to make the tx_bd_cons update visible to start_xmit()
> > +	 * before checking for netif_tx_queue_stopped().  Without the
> > +	 * memory barrier, there is a small possibility that
> > +	 * start_xmit() will miss it and cause the queue to be stopped
> > +	 * forever.
> > +	 */
> > +	smp_wmb();
> > +
> >  	/* TBD need a thresh? */
> >  	if (unlikely(netif_tx_queue_stopped(txq))) {
> > -
> > -		/* Need to make the tx_bd_cons update visible to start_xmit()
> > -		 * before checking for netif_tx_queue_stopped().  Without the
> > -		 * memory barrier, there is a small possibility that
> > -		 * start_xmit() will miss it and cause the queue to be stopped
> > -		 * forever.
> > +		/* Taking tx_lock() is needed to prevent reenabling the queue
> > +		 * while it's empty. This could have happen if rx_action() gets
> > +		 * suspended in bnx2x_tx_int() after the condition before
> > +		 * netif_tx_wake_queue(), while tx_action (bnx2x_start_xmit()):
> > +		 *
> > +		 * stops the queue->sees fresh tx_bd_cons->releases the queue->
> > +		 * sends some packets consuming the whole queue again->
> > +		 * stops the queue
> >  		 */
> > -		smp_mb();
> > +
> > +		__netif_tx_lock(txq, smp_processor_id());
> >  
> >  		if ((netif_tx_queue_stopped(txq)) &&
> >  		    (bp->state == BNX2X_STATE_OPEN) &&
> >  		    (bnx2x_tx_avail(fp) >= MAX_SKB_FRAGS + 3))
> >  			netif_tx_wake_queue(txq);
> > +
> > +		__netif_tx_unlock(txq);
> >  	}
> >  	return 0;
> >  }
> 
> There is still difference between what we have in bnx2x and bnx2/tg3
> regarding memory barriers in tx_poll/start_xmit code. Mainly we have
> smp_mb() in bnx2/tg3_tx_avail(), and in bnx2/tg3_tx_int() is smp_mb()
> not smp_wmb(). I do not see that bnx2x is wrong, but would like to know
> why there is a difference, maybe bnx2/tg3 should be changed?
> 

The memory barrier in tx_int() is to make the tx index update happen
before the netif_tx_queue_stopped() check.  The barrier is to prevent a
situation like this:

    CPU0					CPU1
    start_xmit()
    	if (tx_ring_full) {
    						tx_int()
    							if (!netif_tx_queue_stopped)
    		netif_tx_stop_queue()
    		if (!tx_ring_full)
    							update_tx_index 
    			netif_tx_wake_queue()
    	}
    

The update_tx_index code is before the if statement in program order,
but the CPU and/or compiler can reorder it as shown above. smp_mb() will
prevent that.  Will smp_wmb() prevent that as well?



^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH 1/1] bnx2x: Tx barriers and locks
  2010-03-01 17:59   ` Michael Chan
@ 2010-03-02 10:38     ` Vladislav Zolotarov
  2010-03-02 11:38       ` Stanislaw Gruszka
  2010-03-02 11:30     ` Stanislaw Gruszka
  1 sibling, 1 reply; 15+ messages in thread
From: Vladislav Zolotarov @ 2010-03-02 10:38 UTC (permalink / raw)
  To: Michael Chan, Stanislaw Gruszka
  Cc: netdev, davem, Eilon Greenstein, Matthew Carlson

I was assuming that there is an implicit read memory barrier in the construction

if (unlikely(netif_tx_queue_stopped(txq)))

I assumed it looking at the other kernel code using test_bit() in conditional constructions (like net_tx_action()).

After reading a Pentium Developers Manual I'm afraid I might have assumed wrong and there is needed a read memory barrier to ensure that the bit testing is performed not earlier the specified location in the code flow (due to CPU reordering).

Dave, as an author of atomic_ops.txt paper I think u r the best man to ask. Could u pls. clarify that if we need to ensure that the bit testing is needed AFTER the consumer update (namely after the smp_wmb()) I need to replace it with the smp_mb().

If yes, it's a clear bug and I'll prepare an appropriate patch immediately.

Thanks,
vlad

> -----Original Message-----
> From: Michael Chan [mailto:mchan@broadcom.com] 
> Sent: Monday, March 01, 2010 7:59 PM
> To: Stanislaw Gruszka
> Cc: Vladislav Zolotarov; netdev@vger.kernel.org; 
> davem@davemloft.net; Eilon Greenstein; Matthew Carlson
> Subject: Re: [PATCH 1/1] bnx2x: Tx barriers and locks
> 
> 
> On Mon, 2010-03-01 at 05:33 -0800, Stanislaw Gruszka wrote:
> > On Sun, Feb 28, 2010 at 12:12:02PM +0200, Vladislav Zolotarov wrote:
> > > --- a/drivers/net/bnx2x_main.c
> > > +++ b/drivers/net/bnx2x_main.c
> > > @@ -57,8 +57,8 @@
> > >  #include "bnx2x_init_ops.h"
> > >  #include "bnx2x_dump.h"
> > >  
> > > -#define DRV_MODULE_VERSION	"1.52.1-6"
> > > -#define DRV_MODULE_RELDATE	"2010/02/16"
> > > +#define DRV_MODULE_VERSION	"1.52.1-7"
> > > +#define DRV_MODULE_RELDATE	"2010/02/28"
> > >  #define BNX2X_BC_VER		0x040200
> > >  
> > >  #include <linux/firmware.h>
> > > @@ -957,21 +957,34 @@ static int bnx2x_tx_int(struct 
> bnx2x_fastpath *fp)
> > >  	fp->tx_pkt_cons = sw_cons;
> > >  	fp->tx_bd_cons = bd_cons;
> > >  
> > > +	/* Need to make the tx_bd_cons update visible to start_xmit()
> > > +	 * before checking for netif_tx_queue_stopped().  Without the
> > > +	 * memory barrier, there is a small possibility that
> > > +	 * start_xmit() will miss it and cause the queue to be stopped
> > > +	 * forever.
> > > +	 */
> > > +	smp_wmb();
> > > +
> > >  	/* TBD need a thresh? */
> > >  	if (unlikely(netif_tx_queue_stopped(txq))) {
> > > -
> > > -		/* Need to make the tx_bd_cons update visible 
> to start_xmit()
> > > -		 * before checking for 
> netif_tx_queue_stopped().  Without the
> > > -		 * memory barrier, there is a small possibility that
> > > -		 * start_xmit() will miss it and cause the 
> queue to be stopped
> > > -		 * forever.
> > > +		/* Taking tx_lock() is needed to prevent 
> reenabling the queue
> > > +		 * while it's empty. This could have happen if 
> rx_action() gets
> > > +		 * suspended in bnx2x_tx_int() after the 
> condition before
> > > +		 * netif_tx_wake_queue(), while tx_action 
> (bnx2x_start_xmit()):
> > > +		 *
> > > +		 * stops the queue->sees fresh 
> tx_bd_cons->releases the queue->
> > > +		 * sends some packets consuming the whole queue again->
> > > +		 * stops the queue
> > >  		 */
> > > -		smp_mb();
> > > +
> > > +		__netif_tx_lock(txq, smp_processor_id());
> > >  
> > >  		if ((netif_tx_queue_stopped(txq)) &&
> > >  		    (bp->state == BNX2X_STATE_OPEN) &&
> > >  		    (bnx2x_tx_avail(fp) >= MAX_SKB_FRAGS + 3))
> > >  			netif_tx_wake_queue(txq);
> > > +
> > > +		__netif_tx_unlock(txq);
> > >  	}
> > >  	return 0;
> > >  }
> > 
> > There is still difference between what we have in bnx2x and bnx2/tg3
> > regarding memory barriers in tx_poll/start_xmit code. Mainly we have
> > smp_mb() in bnx2/tg3_tx_avail(), and in bnx2/tg3_tx_int() 
> is smp_mb()
> > not smp_wmb(). I do not see that bnx2x is wrong, but would 
> like to know
> > why there is a difference, maybe bnx2/tg3 should be changed?
> > 
> 
> The memory barrier in tx_int() is to make the tx index update happen
> before the netif_tx_queue_stopped() check.  The barrier is to 
> prevent a
> situation like this:
> 
>     CPU0					CPU1
>     start_xmit()
>     	if (tx_ring_full) {
>     						tx_int()
>     							if 
> (!netif_tx_queue_stopped)
>     		netif_tx_stop_queue()
>     		if (!tx_ring_full)
>     							update_tx_index 
>     			netif_tx_wake_queue()
>     	}
>     
> 
> The update_tx_index code is before the if statement in program order,
> but the CPU and/or compiler can reorder it as shown above. 
> smp_mb() will
> prevent that.  Will smp_wmb() prevent that as well?
> 
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] bnx2x: Tx barriers and locks
  2010-03-01 17:59   ` Michael Chan
  2010-03-02 10:38     ` Vladislav Zolotarov
@ 2010-03-02 11:30     ` Stanislaw Gruszka
  2010-03-02 12:50       ` Vladislav Zolotarov
  1 sibling, 1 reply; 15+ messages in thread
From: Stanislaw Gruszka @ 2010-03-02 11:30 UTC (permalink / raw)
  To: Michael Chan
  Cc: Vladislav Zolotarov, netdev, davem, Eilon Greenstein, Matthew Carlson

On Mon, Mar 01, 2010 at 09:59:07AM -0800, Michael Chan wrote:
> > There is still difference between what we have in bnx2x and bnx2/tg3
> > regarding memory barriers in tx_poll/start_xmit code. Mainly we have
> > smp_mb() in bnx2/tg3_tx_avail(), and in bnx2/tg3_tx_int() is smp_mb()
> > not smp_wmb(). I do not see that bnx2x is wrong, but would like to know
> > why there is a difference, maybe bnx2/tg3 should be changed?
> > 
> 
> The memory barrier in tx_int() is to make the tx index update happen
> before the netif_tx_queue_stopped() check.  The barrier is to prevent a
> situation like this:
> 
>     CPU0					CPU1
>     start_xmit()
>     	if (tx_ring_full) {
>     						tx_int()
>     							if (!netif_tx_queue_stopped)
>     		netif_tx_stop_queue()
>     		if (!tx_ring_full)
>     							update_tx_index 
>     			netif_tx_wake_queue()
>     	}
>     
> 
> The update_tx_index code is before the if statement in program order,
> but the CPU and/or compiler can reorder it as shown above. smp_mb() will
> prevent that.  Will smp_wmb() prevent that as well?

No. smp_wmb() affect only write orders on CPU1 performing tx_int(), so
that should be fixed in bnx2x.

Regarding memory barrier in tx_avail(), I don't think it its needed for
anything, except maybe usage at the beginning of start_xmit(), but we can
just remove that like in the patch below. I going to post "official"
patches for tg3, bnx2 and bnx2x, if no nobody has nothing against.

diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index ed785a3..0f406b8 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -893,7 +893,6 @@ static inline u16 bnx2x_tx_avail(struct bnx2x_fastpath *fp)
 	u16 prod;
 	u16 cons;
 
-	barrier(); /* Tell compiler that prod and cons can change */
 	prod = fp->tx_bd_prod;
 	cons = fp->tx_bd_cons;
 
@@ -963,9 +962,8 @@ static int bnx2x_tx_int(struct bnx2x_fastpath *fp)
 	 * start_xmit() will miss it and cause the queue to be stopped
 	 * forever.
 	 */
-	smp_wmb();
+	smp_mb();
 
-	/* TBD need a thresh? */
 	if (unlikely(netif_tx_queue_stopped(txq))) {
 		/* Taking tx_lock() is needed to prevent reenabling the queue
 		 * while it's empty. This could have happen if rx_action() gets
@@ -11177,10 +11175,9 @@ static netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct eth_tx_bd *tx_data_bd, *total_pkt_bd = NULL;
 	struct eth_tx_parse_bd *pbd = NULL;
 	u16 pkt_prod, bd_prod;
-	int nbd, fp_index;
+	int nbd, fp_index, i, ret;
 	dma_addr_t mapping;
 	u32 xmit_type = bnx2x_xmit_type(bp, skb);
-	int i;
 	u8 hlen = 0;
 	__le16 pkt_size = 0;
 
@@ -11195,10 +11192,9 @@ static netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	fp = &bp->fp[fp_index];
 
 	if (unlikely(bnx2x_tx_avail(fp) < (skb_shinfo(skb)->nr_frags + 3))) {
-		fp->eth_q_stats.driver_xoff++;
-		netif_tx_stop_queue(txq);
 		BNX2X_ERR("BUG! Tx ring full when queue awake!\n");
-		return NETDEV_TX_BUSY;
+		ret = NETDEV_TX_BUSY;						
+		goto stop_queue;
 	}
 
 	DP(NETIF_MSG_TX_QUEUED, "SKB: summed %x  protocol %x  protocol(%x,%x)"
@@ -11426,19 +11422,24 @@ static netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	mmiowb();
 
 	fp->tx_bd_prod += nbd;
-
-	if (unlikely(bnx2x_tx_avail(fp) < MAX_SKB_FRAGS + 3)) {
-		netif_tx_stop_queue(txq);
-		/* We want bnx2x_tx_int to "see" the updated tx_bd_prod
-		   if we put Tx into XOFF state. */
-		smp_mb();
-		fp->eth_q_stats.driver_xoff++;
-		if (bnx2x_tx_avail(fp) >= MAX_SKB_FRAGS + 3)
-			netif_tx_wake_queue(txq);
-	}
 	fp->tx_pkt++;
+	
+	ret = NETDEV_TX_OK;
+	if (unlikely(bnx2x_tx_avail(fp) < MAX_SKB_FRAGS + 3))
+		goto stop_queue;
+
+	return ret;
 
-	return NETDEV_TX_OK;
+stop_queue:
+	netif_tx_stop_queue(txq);
+	/* paired barrier is in bnx2x_tx_int(), update of tx_bd_cons
+	 * have to be visable here, after we XOFF bit setting */
+	smp_mb();
+	fp->eth_q_stats.driver_xoff++;
+	if (bnx2x_tx_avail(fp) >= MAX_SKB_FRAGS + 3)
+		netif_tx_wake_queue(txq);
+	
+	return ret;
 }
 
 /* called with rtnl_lock */

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] bnx2x: Tx barriers and locks
  2010-03-02 10:38     ` Vladislav Zolotarov
@ 2010-03-02 11:38       ` Stanislaw Gruszka
  0 siblings, 0 replies; 15+ messages in thread
From: Stanislaw Gruszka @ 2010-03-02 11:38 UTC (permalink / raw)
  To: Vladislav Zolotarov
  Cc: Michael Chan, netdev, davem, Eilon Greenstein, Matthew Carlson

On Tue, Mar 02, 2010 at 02:38:39AM -0800, Vladislav Zolotarov wrote:
> After reading a Pentium Developers Manual I'm afraid I might have assumed wrong and there is needed a read memory barrier to ensure that the bit testing is performed not earlier the specified location in the code flow (due to CPU reordering).

Linux supports more relaxed cpu's than Pentium :)

> Dave, as an author of atomic_ops.txt paper I think u r the best man to ask. Could u pls. clarify that if we need to ensure that the bit testing is needed AFTER the consumer update (namely after the smp_wmb()) I need to replace it with the smp_mb().
> 
> If yes, it's a clear bug and I'll prepare an appropriate patch immediately.

Yes, it's a bug.

Thanks
Stanislaw

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH 1/1] bnx2x: Tx barriers and locks
  2010-03-02 11:30     ` Stanislaw Gruszka
@ 2010-03-02 12:50       ` Vladislav Zolotarov
  2010-03-02 13:55         ` Stanislaw Gruszka
  0 siblings, 1 reply; 15+ messages in thread
From: Vladislav Zolotarov @ 2010-03-02 12:50 UTC (permalink / raw)
  To: Stanislaw Gruszka, Michael Chan
  Cc: netdev, davem, Eilon Greenstein, Matthew Carlson

Stanislaw barrier() is not a memory barrier - it's a compiler barrier. I don't think removing it from bnx2x_tx_avail() will improve anything. If u think I'm wrong, could u, pls., provide a specific example.

Regarding the patch u've proposed - I don't see any obvious benefit in the code change u've suggested except for replacing the smp_wmb() to smp_mb() in the bnx2x_tx_int() and would prefer to wait for Dave's reply on that matter.

Let's recall that the bxn2x_start_xmit() is called under tx_lock(), so no additional memory barriers needed to ensure that bnx2x_tx_avail() has fresh values of BD consumer and producer at the beginning of bnx2x_start_xmit(). That's why we don't need the more complicated logic putting the queue asleep at the beginning of bnx2x_start_xmit() as we need at the end. That's why I think your "goto" change in the beginning of bnx2x_start_xmit() is suboptimal.

Thanks,
vlad

> -----Original Message-----
> From: Stanislaw Gruszka [mailto:sgruszka@redhat.com] 
> Sent: Tuesday, March 02, 2010 1:31 PM
> To: Michael Chan
> Cc: Vladislav Zolotarov; netdev@vger.kernel.org; 
> davem@davemloft.net; Eilon Greenstein; Matthew Carlson
> Subject: Re: [PATCH 1/1] bnx2x: Tx barriers and locks
> 
> On Mon, Mar 01, 2010 at 09:59:07AM -0800, Michael Chan wrote:
> > > There is still difference between what we have in bnx2x 
> and bnx2/tg3
> > > regarding memory barriers in tx_poll/start_xmit code. 
> Mainly we have
> > > smp_mb() in bnx2/tg3_tx_avail(), and in bnx2/tg3_tx_int() 
> is smp_mb()
> > > not smp_wmb(). I do not see that bnx2x is wrong, but 
> would like to know
> > > why there is a difference, maybe bnx2/tg3 should be changed?
> > > 
> > 
> > The memory barrier in tx_int() is to make the tx index update happen
> > before the netif_tx_queue_stopped() check.  The barrier is 
> to prevent a
> > situation like this:
> > 
> >     CPU0					CPU1
> >     start_xmit()
> >     	if (tx_ring_full) {
> >     						tx_int()
> >     							
> if (!netif_tx_queue_stopped)
> >     		netif_tx_stop_queue()
> >     		if (!tx_ring_full)
> >     							
> update_tx_index 
> >     			netif_tx_wake_queue()
> >     	}
> >     
> > 
> > The update_tx_index code is before the if statement in 
> program order,
> > but the CPU and/or compiler can reorder it as shown above. 
> smp_mb() will
> > prevent that.  Will smp_wmb() prevent that as well?
> 
> No. smp_wmb() affect only write orders on CPU1 performing tx_int(), so
> that should be fixed in bnx2x.
> 
> Regarding memory barrier in tx_avail(), I don't think it its 
> needed for
> anything, except maybe usage at the beginning of 
> start_xmit(), but we can
> just remove that like in the patch below. I going to post "official"
> patches for tg3, bnx2 and bnx2x, if no nobody has nothing against.
> 
> diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
> index ed785a3..0f406b8 100644
> --- a/drivers/net/bnx2x_main.c
> +++ b/drivers/net/bnx2x_main.c
> @@ -893,7 +893,6 @@ static inline u16 bnx2x_tx_avail(struct 
> bnx2x_fastpath *fp)
>  	u16 prod;
>  	u16 cons;
>  
> -	barrier(); /* Tell compiler that prod and cons can change */
>  	prod = fp->tx_bd_prod;
>  	cons = fp->tx_bd_cons;
>  
> @@ -963,9 +962,8 @@ static int bnx2x_tx_int(struct bnx2x_fastpath *fp)
>  	 * start_xmit() will miss it and cause the queue to be stopped
>  	 * forever.
>  	 */
> -	smp_wmb();
> +	smp_mb();
>  
> -	/* TBD need a thresh? */
>  	if (unlikely(netif_tx_queue_stopped(txq))) {
>  		/* Taking tx_lock() is needed to prevent 
> reenabling the queue
>  		 * while it's empty. This could have happen if 
> rx_action() gets
> @@ -11177,10 +11175,9 @@ static netdev_tx_t 
> bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  	struct eth_tx_bd *tx_data_bd, *total_pkt_bd = NULL;
>  	struct eth_tx_parse_bd *pbd = NULL;
>  	u16 pkt_prod, bd_prod;
> -	int nbd, fp_index;
> +	int nbd, fp_index, i, ret;
>  	dma_addr_t mapping;
>  	u32 xmit_type = bnx2x_xmit_type(bp, skb);
> -	int i;
>  	u8 hlen = 0;
>  	__le16 pkt_size = 0;
>  
> @@ -11195,10 +11192,9 @@ static netdev_tx_t 
> bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  	fp = &bp->fp[fp_index];
>  
>  	if (unlikely(bnx2x_tx_avail(fp) < 
> (skb_shinfo(skb)->nr_frags + 3))) {
> -		fp->eth_q_stats.driver_xoff++;
> -		netif_tx_stop_queue(txq);
>  		BNX2X_ERR("BUG! Tx ring full when queue awake!\n");
> -		return NETDEV_TX_BUSY;
> +		ret = NETDEV_TX_BUSY;				
> 		
> +		goto stop_queue;
>  	}
>  
>  	DP(NETIF_MSG_TX_QUEUED, "SKB: summed %x  protocol %x  
> protocol(%x,%x)"
> @@ -11426,19 +11422,24 @@ static netdev_tx_t 
> bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  	mmiowb();
>  
>  	fp->tx_bd_prod += nbd;
> -
> -	if (unlikely(bnx2x_tx_avail(fp) < MAX_SKB_FRAGS + 3)) {
> -		netif_tx_stop_queue(txq);
> -		/* We want bnx2x_tx_int to "see" the updated tx_bd_prod
> -		   if we put Tx into XOFF state. */
> -		smp_mb();
> -		fp->eth_q_stats.driver_xoff++;
> -		if (bnx2x_tx_avail(fp) >= MAX_SKB_FRAGS + 3)
> -			netif_tx_wake_queue(txq);
> -	}
>  	fp->tx_pkt++;
> +	
> +	ret = NETDEV_TX_OK;
> +	if (unlikely(bnx2x_tx_avail(fp) < MAX_SKB_FRAGS + 3))
> +		goto stop_queue;
> +
> +	return ret;
>  
> -	return NETDEV_TX_OK;
> +stop_queue:
> +	netif_tx_stop_queue(txq);
> +	/* paired barrier is in bnx2x_tx_int(), update of tx_bd_cons
> +	 * have to be visable here, after we XOFF bit setting */
> +	smp_mb();
> +	fp->eth_q_stats.driver_xoff++;
> +	if (bnx2x_tx_avail(fp) >= MAX_SKB_FRAGS + 3)
> +		netif_tx_wake_queue(txq);
> +	
> +	return ret;
>  }
>  
>  /* called with rtnl_lock */
> 
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] bnx2x: Tx barriers and locks
  2010-03-02 12:50       ` Vladislav Zolotarov
@ 2010-03-02 13:55         ` Stanislaw Gruszka
  2010-03-02 16:18           ` Michael Chan
  2010-03-02 16:21           ` Vladislav Zolotarov
  0 siblings, 2 replies; 15+ messages in thread
From: Stanislaw Gruszka @ 2010-03-02 13:55 UTC (permalink / raw)
  To: Vladislav Zolotarov
  Cc: Michael Chan, netdev, davem, Eilon Greenstein, Matthew Carlson

On Tue, Mar 02, 2010 at 04:50:59AM -0800, Vladislav Zolotarov wrote:
> Stanislaw barrier() is not a memory barrier - it's a compiler barrier. I don't think removing it from bnx2x_tx_avail() will improve anything. If u think I'm wrong, could u, pls., provide a specific example.

Only improvement is removing confusing code, And comment like
"Tell compiler that prod and cons can change" is even more
confusing. If you think I'm wrong, just tell as why that
barrier is needed :)

For bnx2 and tg3 that would be removal of smp_mb().

> Let's recall that the bxn2x_start_xmit() is called under tx_lock(), so no additional memory barriers needed to ensure that bnx2x_tx_avail() has fresh values of BD consumer and producer at the beginning of bnx2x_start_xmit().

Memory barriers are not about refreshing anything, they are
about ordering.

> That's why we don't need the more complicated logic putting the queue asleep at the beginning of bnx2x_start_xmit() as we need at the end. That's why I think your "goto" change in the beginning of bnx2x_start_xmit() is suboptimal.

You may have right, about that smb_mb() is not needed at the beginning
of start_xmit(), but I'm not sure for that, since we are changing tx_bd_cons
without netif_tx_lock and setting XOFF bit there. If we can assume that
at the beginning of start_xmit() there is always enough space it hw queue,
we can just remove tx_avail() check. Otherwise IMHO is better to put queue
asleep in safe way.

About performance, adding smp_mb() on bug/unused code path does not hurt
much :)

Cheers
Stanislaw

> > -----Original Message-----
> > From: Stanislaw Gruszka [mailto:sgruszka@redhat.com] 
> > Sent: Tuesday, March 02, 2010 1:31 PM
> > To: Michael Chan
> > Cc: Vladislav Zolotarov; netdev@vger.kernel.org; 
> > davem@davemloft.net; Eilon Greenstein; Matthew Carlson
> > Subject: Re: [PATCH 1/1] bnx2x: Tx barriers and locks
> > 
> > On Mon, Mar 01, 2010 at 09:59:07AM -0800, Michael Chan wrote:
> > > > There is still difference between what we have in bnx2x 
> > and bnx2/tg3
> > > > regarding memory barriers in tx_poll/start_xmit code. 
> > Mainly we have
> > > > smp_mb() in bnx2/tg3_tx_avail(), and in bnx2/tg3_tx_int() 
> > is smp_mb()
> > > > not smp_wmb(). I do not see that bnx2x is wrong, but 
> > would like to know
> > > > why there is a difference, maybe bnx2/tg3 should be changed?
> > > > 
> > > 
> > > The memory barrier in tx_int() is to make the tx index update happen
> > > before the netif_tx_queue_stopped() check.  The barrier is 
> > to prevent a
> > > situation like this:
> > > 
> > >     CPU0					CPU1
> > >     start_xmit()
> > >     	if (tx_ring_full) {
> > >     						tx_int()
> > >     							
> > if (!netif_tx_queue_stopped)
> > >     		netif_tx_stop_queue()
> > >     		if (!tx_ring_full)
> > >     							
> > update_tx_index 
> > >     			netif_tx_wake_queue()
> > >     	}
> > >     
> > > 
> > > The update_tx_index code is before the if statement in 
> > program order,
> > > but the CPU and/or compiler can reorder it as shown above. 
> > smp_mb() will
> > > prevent that.  Will smp_wmb() prevent that as well?
> > 
> > No. smp_wmb() affect only write orders on CPU1 performing tx_int(), so
> > that should be fixed in bnx2x.
> > 
> > Regarding memory barrier in tx_avail(), I don't think it its 
> > needed for
> > anything, except maybe usage at the beginning of 
> > start_xmit(), but we can
> > just remove that like in the patch below. I going to post "official"
> > patches for tg3, bnx2 and bnx2x, if no nobody has nothing against.
> > 
> > diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
> > index ed785a3..0f406b8 100644
> > --- a/drivers/net/bnx2x_main.c
> > +++ b/drivers/net/bnx2x_main.c
> > @@ -893,7 +893,6 @@ static inline u16 bnx2x_tx_avail(struct 
> > bnx2x_fastpath *fp)
> >  	u16 prod;
> >  	u16 cons;
> >  
> > -	barrier(); /* Tell compiler that prod and cons can change */
> >  	prod = fp->tx_bd_prod;
> >  	cons = fp->tx_bd_cons;
> >  
> > @@ -963,9 +962,8 @@ static int bnx2x_tx_int(struct bnx2x_fastpath *fp)
> >  	 * start_xmit() will miss it and cause the queue to be stopped
> >  	 * forever.
> >  	 */
> > -	smp_wmb();
> > +	smp_mb();
> >  
> > -	/* TBD need a thresh? */
> >  	if (unlikely(netif_tx_queue_stopped(txq))) {
> >  		/* Taking tx_lock() is needed to prevent 
> > reenabling the queue
> >  		 * while it's empty. This could have happen if 
> > rx_action() gets
> > @@ -11177,10 +11175,9 @@ static netdev_tx_t 
> > bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
> >  	struct eth_tx_bd *tx_data_bd, *total_pkt_bd = NULL;
> >  	struct eth_tx_parse_bd *pbd = NULL;
> >  	u16 pkt_prod, bd_prod;
> > -	int nbd, fp_index;
> > +	int nbd, fp_index, i, ret;
> >  	dma_addr_t mapping;
> >  	u32 xmit_type = bnx2x_xmit_type(bp, skb);
> > -	int i;
> >  	u8 hlen = 0;
> >  	__le16 pkt_size = 0;
> >  
> > @@ -11195,10 +11192,9 @@ static netdev_tx_t 
> > bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
> >  	fp = &bp->fp[fp_index];
> >  
> >  	if (unlikely(bnx2x_tx_avail(fp) < 
> > (skb_shinfo(skb)->nr_frags + 3))) {
> > -		fp->eth_q_stats.driver_xoff++;
> > -		netif_tx_stop_queue(txq);
> >  		BNX2X_ERR("BUG! Tx ring full when queue awake!\n");
> > -		return NETDEV_TX_BUSY;
> > +		ret = NETDEV_TX_BUSY;				
> > 		
> > +		goto stop_queue;
> >  	}
> >  
> >  	DP(NETIF_MSG_TX_QUEUED, "SKB: summed %x  protocol %x  
> > protocol(%x,%x)"
> > @@ -11426,19 +11422,24 @@ static netdev_tx_t 
> > bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
> >  	mmiowb();
> >  
> >  	fp->tx_bd_prod += nbd;
> > -
> > -	if (unlikely(bnx2x_tx_avail(fp) < MAX_SKB_FRAGS + 3)) {
> > -		netif_tx_stop_queue(txq);
> > -		/* We want bnx2x_tx_int to "see" the updated tx_bd_prod
> > -		   if we put Tx into XOFF state. */
> > -		smp_mb();
> > -		fp->eth_q_stats.driver_xoff++;
> > -		if (bnx2x_tx_avail(fp) >= MAX_SKB_FRAGS + 3)
> > -			netif_tx_wake_queue(txq);
> > -	}
> >  	fp->tx_pkt++;
> > +	
> > +	ret = NETDEV_TX_OK;
> > +	if (unlikely(bnx2x_tx_avail(fp) < MAX_SKB_FRAGS + 3))
> > +		goto stop_queue;
> > +
> > +	return ret;
> >  
> > -	return NETDEV_TX_OK;
> > +stop_queue:
> > +	netif_tx_stop_queue(txq);
> > +	/* paired barrier is in bnx2x_tx_int(), update of tx_bd_cons
> > +	 * have to be visable here, after we XOFF bit setting */
> > +	smp_mb();
> > +	fp->eth_q_stats.driver_xoff++;
> > +	if (bnx2x_tx_avail(fp) >= MAX_SKB_FRAGS + 3)
> > +		netif_tx_wake_queue(txq);
> > +	
> > +	return ret;
> >  }
> >  
> >  /* called with rtnl_lock */
> > 
> > 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] bnx2x: Tx barriers and locks
  2010-03-02 13:55         ` Stanislaw Gruszka
@ 2010-03-02 16:18           ` Michael Chan
  2010-03-02 16:59             ` Stanislaw Gruszka
  2010-03-02 16:21           ` Vladislav Zolotarov
  1 sibling, 1 reply; 15+ messages in thread
From: Michael Chan @ 2010-03-02 16:18 UTC (permalink / raw)
  To: 'Stanislaw Gruszka', Vladislav Zolotarov
  Cc: netdev, davem, Eilon Greenstein, Matthew Carlson

Stanislaw Gruszka wrote:

> On Tue, Mar 02, 2010 at 04:50:59AM -0800, Vladislav Zolotarov wrote:
> > Stanislaw barrier() is not a memory barrier - it's a
> compiler barrier. I don't think removing it from
> bnx2x_tx_avail() will improve anything. If u think I'm wrong,
> could u, pls., provide a specific example.
>
> Only improvement is removing confusing code, And comment like
> "Tell compiler that prod and cons can change" is even more
> confusing. If you think I'm wrong, just tell as why that
> barrier is needed :)

The barrier (compiler barrier at least) is required in
bnx2x_tx_avail().  The status block index can be updated by DMA and
the compiler doesn't know it (because it is considered wrong to
declare the status block as volatile).  Near the end of
bnx2x_start_xmit() where we call bnx2x_tx_avail() twice.  It is
possible that the compiler will optimize it and not look at the
status block in memory the second time.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH 1/1] bnx2x: Tx barriers and locks
  2010-03-02 13:55         ` Stanislaw Gruszka
  2010-03-02 16:18           ` Michael Chan
@ 2010-03-02 16:21           ` Vladislav Zolotarov
  1 sibling, 0 replies; 15+ messages in thread
From: Vladislav Zolotarov @ 2010-03-02 16:21 UTC (permalink / raw)
  To: Stanislaw Gruszka
  Cc: Michael Chan, netdev, davem, Eilon Greenstein, Matthew Carlson

 

> -----Original Message-----
> From: Stanislaw Gruszka [mailto:sgruszka@redhat.com] 
> Sent: Tuesday, March 02, 2010 3:56 PM
> To: Vladislav Zolotarov
> Cc: Michael Chan; netdev@vger.kernel.org; 
> davem@davemloft.net; Eilon Greenstein; Matthew Carlson
> Subject: Re: [PATCH 1/1] bnx2x: Tx barriers and locks
> 
> On Tue, Mar 02, 2010 at 04:50:59AM -0800, Vladislav Zolotarov wrote:
> > Stanislaw barrier() is not a memory barrier - it's a 
> compiler barrier. I don't think removing it from 
> bnx2x_tx_avail() will improve anything. If u think I'm wrong, 
> could u, pls., provide a specific example.
> 
> Only improvement is removing confusing code, And comment like
> "Tell compiler that prod and cons can change" is even more
> confusing. If you think I'm wrong, just tell as why that
> barrier is needed :)

Both the coment and the barrier() ensures that compiler doesn't optimize the reading of the consumer and producers outside the inline function. This is what was meant when the function has been written so I don't think that either the remark or a barrier() itself are confusing.

> 
> For bnx2 and tg3 that would be removal of smp_mb().
> 
> > Let's recall that the bxn2x_start_xmit() is called under 
> tx_lock(), so no additional memory barriers needed to ensure 
> that bnx2x_tx_avail() has fresh values of BD consumer and 
> producer at the beginning of bnx2x_start_xmit().
> 
> Memory barriers are not about refreshing anything, they are
> about ordering.
> 
> > That's why we don't need the more complicated logic putting 
> the queue asleep at the beginning of bnx2x_start_xmit() as we 
> need at the end. That's why I think your "goto" change in the 
> beginning of bnx2x_start_xmit() is suboptimal.
> 
> You may have right, about that smb_mb() is not needed at the beginning
> of start_xmit(), but I'm not sure for that, since we are 
> changing tx_bd_cons
> without netif_tx_lock and setting XOFF bit there. If we can 
> assume that
> at the beginning of start_xmit() there is always enough space 
> it hw queue,
> we can just remove tx_avail() check. Otherwise IMHO is better 
> to put queue
> asleep in safe way.

The driver (bnx2x, Michael and Matthuew may elaborate on bnx2 and tg3 better) is written the way that is meant to ensure that there is always place on the ring at the beginning of bnx2x_start_xmit(). The check at the beginning ensures that nothing is broken. And if it's broken - it's a bug that should be fixed and we may not remove that check because if we do we will not even know that there was something wrong. But as long as there is nothing wrong - the current implementation of the tx_avail() check at the beginning of bnx2x_start_xmit() is absolutely safe as it is now.

> 
> About performance, adding smp_mb() on bug/unused code path 
> does not hurt
> much :)

It doesn't help either.... ;)

> 
> Cheers
> Stanislaw
> 
> > > -----Original Message-----
> > > From: Stanislaw Gruszka [mailto:sgruszka@redhat.com] 
> > > Sent: Tuesday, March 02, 2010 1:31 PM
> > > To: Michael Chan
> > > Cc: Vladislav Zolotarov; netdev@vger.kernel.org; 
> > > davem@davemloft.net; Eilon Greenstein; Matthew Carlson
> > > Subject: Re: [PATCH 1/1] bnx2x: Tx barriers and locks
> > > 
> > > On Mon, Mar 01, 2010 at 09:59:07AM -0800, Michael Chan wrote:
> > > > > There is still difference between what we have in bnx2x 
> > > and bnx2/tg3
> > > > > regarding memory barriers in tx_poll/start_xmit code. 
> > > Mainly we have
> > > > > smp_mb() in bnx2/tg3_tx_avail(), and in bnx2/tg3_tx_int() 
> > > is smp_mb()
> > > > > not smp_wmb(). I do not see that bnx2x is wrong, but 
> > > would like to know
> > > > > why there is a difference, maybe bnx2/tg3 should be changed?
> > > > > 
> > > > 
> > > > The memory barrier in tx_int() is to make the tx index 
> update happen
> > > > before the netif_tx_queue_stopped() check.  The barrier is 
> > > to prevent a
> > > > situation like this:
> > > > 
> > > >     CPU0					CPU1
> > > >     start_xmit()
> > > >     	if (tx_ring_full) {
> > > >     						tx_int()
> > > >     							
> > > if (!netif_tx_queue_stopped)
> > > >     		netif_tx_stop_queue()
> > > >     		if (!tx_ring_full)
> > > >     							
> > > update_tx_index 
> > > >     			netif_tx_wake_queue()
> > > >     	}
> > > >     
> > > > 
> > > > The update_tx_index code is before the if statement in 
> > > program order,
> > > > but the CPU and/or compiler can reorder it as shown above. 
> > > smp_mb() will
> > > > prevent that.  Will smp_wmb() prevent that as well?
> > > 
> > > No. smp_wmb() affect only write orders on CPU1 performing 
> tx_int(), so
> > > that should be fixed in bnx2x.
> > > 
> > > Regarding memory barrier in tx_avail(), I don't think it its 
> > > needed for
> > > anything, except maybe usage at the beginning of 
> > > start_xmit(), but we can
> > > just remove that like in the patch below. I going to post 
> "official"
> > > patches for tg3, bnx2 and bnx2x, if no nobody has nothing against.
> > > 
> > > diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
> > > index ed785a3..0f406b8 100644
> > > --- a/drivers/net/bnx2x_main.c
> > > +++ b/drivers/net/bnx2x_main.c
> > > @@ -893,7 +893,6 @@ static inline u16 bnx2x_tx_avail(struct 
> > > bnx2x_fastpath *fp)
> > >  	u16 prod;
> > >  	u16 cons;
> > >  
> > > -	barrier(); /* Tell compiler that prod and cons can change */
> > >  	prod = fp->tx_bd_prod;
> > >  	cons = fp->tx_bd_cons;
> > >  
> > > @@ -963,9 +962,8 @@ static int bnx2x_tx_int(struct 
> bnx2x_fastpath *fp)
> > >  	 * start_xmit() will miss it and cause the queue to be stopped
> > >  	 * forever.
> > >  	 */
> > > -	smp_wmb();
> > > +	smp_mb();
> > >  
> > > -	/* TBD need a thresh? */
> > >  	if (unlikely(netif_tx_queue_stopped(txq))) {
> > >  		/* Taking tx_lock() is needed to prevent 
> > > reenabling the queue
> > >  		 * while it's empty. This could have happen if 
> > > rx_action() gets
> > > @@ -11177,10 +11175,9 @@ static netdev_tx_t 
> > > bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
> > >  	struct eth_tx_bd *tx_data_bd, *total_pkt_bd = NULL;
> > >  	struct eth_tx_parse_bd *pbd = NULL;
> > >  	u16 pkt_prod, bd_prod;
> > > -	int nbd, fp_index;
> > > +	int nbd, fp_index, i, ret;
> > >  	dma_addr_t mapping;
> > >  	u32 xmit_type = bnx2x_xmit_type(bp, skb);
> > > -	int i;
> > >  	u8 hlen = 0;
> > >  	__le16 pkt_size = 0;
> > >  
> > > @@ -11195,10 +11192,9 @@ static netdev_tx_t 
> > > bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
> > >  	fp = &bp->fp[fp_index];
> > >  
> > >  	if (unlikely(bnx2x_tx_avail(fp) < 
> > > (skb_shinfo(skb)->nr_frags + 3))) {
> > > -		fp->eth_q_stats.driver_xoff++;
> > > -		netif_tx_stop_queue(txq);
> > >  		BNX2X_ERR("BUG! Tx ring full when queue awake!\n");
> > > -		return NETDEV_TX_BUSY;
> > > +		ret = NETDEV_TX_BUSY;				
> > > 		
> > > +		goto stop_queue;
> > >  	}
> > >  
> > >  	DP(NETIF_MSG_TX_QUEUED, "SKB: summed %x  protocol %x  
> > > protocol(%x,%x)"
> > > @@ -11426,19 +11422,24 @@ static netdev_tx_t 
> > > bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
> > >  	mmiowb();
> > >  
> > >  	fp->tx_bd_prod += nbd;
> > > -
> > > -	if (unlikely(bnx2x_tx_avail(fp) < MAX_SKB_FRAGS + 3)) {
> > > -		netif_tx_stop_queue(txq);
> > > -		/* We want bnx2x_tx_int to "see" the updated tx_bd_prod
> > > -		   if we put Tx into XOFF state. */
> > > -		smp_mb();
> > > -		fp->eth_q_stats.driver_xoff++;
> > > -		if (bnx2x_tx_avail(fp) >= MAX_SKB_FRAGS + 3)
> > > -			netif_tx_wake_queue(txq);
> > > -	}
> > >  	fp->tx_pkt++;
> > > +	
> > > +	ret = NETDEV_TX_OK;
> > > +	if (unlikely(bnx2x_tx_avail(fp) < MAX_SKB_FRAGS + 3))
> > > +		goto stop_queue;
> > > +
> > > +	return ret;
> > >  
> > > -	return NETDEV_TX_OK;
> > > +stop_queue:
> > > +	netif_tx_stop_queue(txq);
> > > +	/* paired barrier is in bnx2x_tx_int(), update of tx_bd_cons
> > > +	 * have to be visable here, after we XOFF bit setting */
> > > +	smp_mb();
> > > +	fp->eth_q_stats.driver_xoff++;
> > > +	if (bnx2x_tx_avail(fp) >= MAX_SKB_FRAGS + 3)
> > > +		netif_tx_wake_queue(txq);
> > > +	
> > > +	return ret;
> > >  }
> > >  
> > >  /* called with rtnl_lock */
> > > 
> > > 
> 
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] bnx2x: Tx barriers and locks
  2010-03-02 16:18           ` Michael Chan
@ 2010-03-02 16:59             ` Stanislaw Gruszka
  2010-03-02 17:26               ` Michael Chan
  0 siblings, 1 reply; 15+ messages in thread
From: Stanislaw Gruszka @ 2010-03-02 16:59 UTC (permalink / raw)
  To: Michael Chan
  Cc: Vladislav Zolotarov, netdev, davem, Eilon Greenstein, Matthew Carlson

On Tue, Mar 02, 2010 at 08:18:44AM -0800, Michael Chan wrote:
> Stanislaw Gruszka wrote:
> 
> > On Tue, Mar 02, 2010 at 04:50:59AM -0800, Vladislav Zolotarov wrote:
> > > Stanislaw barrier() is not a memory barrier - it's a
> > compiler barrier. I don't think removing it from
> > bnx2x_tx_avail() will improve anything. If u think I'm wrong,
> > could u, pls., provide a specific example.
> >
> > Only improvement is removing confusing code, And comment like
> > "Tell compiler that prod and cons can change" is even more
> > confusing. If you think I'm wrong, just tell as why that
> > barrier is needed :)
> 
> The barrier (compiler barrier at least) is required in
> bnx2x_tx_avail().  The status block index can be updated by DMA and
> the compiler doesn't know it (because it is considered wrong to

If you are telling status block index you mean which variable ?
 
> declare the status block as volatile).  Near the end of
> bnx2x_start_xmit() where we call bnx2x_tx_avail() twice.  It is
> possible that the compiler will optimize it and not look at the
> status block in memory the second time.

Ok, I'm trying to understand.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] bnx2x: Tx barriers and locks
  2010-03-02 16:59             ` Stanislaw Gruszka
@ 2010-03-02 17:26               ` Michael Chan
  2010-03-08 15:38                 ` Stanislaw Gruszka
  0 siblings, 1 reply; 15+ messages in thread
From: Michael Chan @ 2010-03-02 17:26 UTC (permalink / raw)
  To: Stanislaw Gruszka
  Cc: Vladislav Zolotarov, netdev, davem, Eilon Greenstein, Matthew Carlson


On Tue, 2010-03-02 at 08:59 -0800, Stanislaw Gruszka wrote:
> On Tue, Mar 02, 2010 at 08:18:44AM -0800, Michael Chan wrote:
> > Stanislaw Gruszka wrote:
> > 
> > > On Tue, Mar 02, 2010 at 04:50:59AM -0800, Vladislav Zolotarov wrote:
> > > > Stanislaw barrier() is not a memory barrier - it's a
> > > compiler barrier. I don't think removing it from
> > > bnx2x_tx_avail() will improve anything. If u think I'm wrong,
> > > could u, pls., provide a specific example.
> > >
> > > Only improvement is removing confusing code, And comment like
> > > "Tell compiler that prod and cons can change" is even more
> > > confusing. If you think I'm wrong, just tell as why that
> > > barrier is needed :)
> > 
> > The barrier (compiler barrier at least) is required in
> > bnx2x_tx_avail().  The status block index can be updated by DMA and
> > the compiler doesn't know it (because it is considered wrong to
> 
> If you are telling status block index you mean which variable ?

The fp-> fields which can be updated by NAPI poll based on new status
block DMA.

>  
> > declare the status block as volatile).  Near the end of
> > bnx2x_start_xmit() where we call bnx2x_tx_avail() twice.  It is
> > possible that the compiler will optimize it and not look at the
> > status block in memory the second time.
> 
> Ok, I'm trying to understand.
> --




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/1] bnx2x: Tx barriers and locks
  2010-03-02 17:26               ` Michael Chan
@ 2010-03-08 15:38                 ` Stanislaw Gruszka
  0 siblings, 0 replies; 15+ messages in thread
From: Stanislaw Gruszka @ 2010-03-08 15:38 UTC (permalink / raw)
  To: Michael Chan
  Cc: Vladislav Zolotarov, netdev, davem, Eilon Greenstein, Matthew Carlson

On Tue, Mar 02, 2010 at 09:26:27AM -0800, Michael Chan wrote:
> 
> On Tue, 2010-03-02 at 08:59 -0800, Stanislaw Gruszka wrote:
> > On Tue, Mar 02, 2010 at 08:18:44AM -0800, Michael Chan wrote:
> > > Stanislaw Gruszka wrote:
> > > 
> > > > On Tue, Mar 02, 2010 at 04:50:59AM -0800, Vladislav Zolotarov wrote:
> > > > > Stanislaw barrier() is not a memory barrier - it's a
> > > > compiler barrier. I don't think removing it from
> > > > bnx2x_tx_avail() will improve anything. If u think I'm wrong,
> > > > could u, pls., provide a specific example.
> > > >
> > > > Only improvement is removing confusing code, And comment like
> > > > "Tell compiler that prod and cons can change" is even more
> > > > confusing. If you think I'm wrong, just tell as why that
> > > > barrier is needed :)
> > > 
> > > The barrier (compiler barrier at least) is required in
> > > bnx2x_tx_avail().  The status block index can be updated by DMA and
> > > the compiler doesn't know it (because it is considered wrong to
> > 
> > If you are telling status block index you mean which variable ?
> 
> The fp-> fields which can be updated by NAPI poll based on new status
> block DMA.

So, we are talking about fp->tx_bd_prod and fp->tx_bd_cons.

> > > declare the status block as volatile).  Near the end of
> > > bnx2x_start_xmit() where we call bnx2x_tx_avail() twice.  It is
> > > possible that the compiler will optimize it and not look at the
> > > status block in memory the second time.

Still think barrier() in bnx2x_tx_avail is not necessary.

First of all, in what we have now in net-next-2.6 tree all fp->tx_bd_prod
modifications and bnx2x_tx_avail() function calls are done with
__netif_tx_lock taken, so we don't need any barriers for that. I'm omitting
here bnx2x_run_loopback() as this function is called when interface
is disabled, and bnx2x_stats_update() when output of bnx2x_tx_avail()
is printed in debug mode.

Regarding fp->tx_bd_cons: situation is a bit more complicated. It is modified
outside __netif_tx_lock protection and can happen in parallel with
bnx2x_start_xmit(). However barrier() in bnx2_tx_avail do not help when
fp->tx_bd_cons is modified on bnx2x_tx_int() on other cpu, since there
is no guarantees that registers/cache will be flushed on that cpu. Even
taking into account smp_wmb() after "fp->tx_bd_cons = bd_cons", as
smp_wmb() only guarantee preserve of ordering, not flushing registers/cache
in any particular time. For the same reason changing barrier() to smb_mb()
in bnx2x_tx_avail have no sense.

I plan to split my bnx2x patch into smaller peaces and post them. Small
patches perhaps will be more clean and You would not have objections
against :)

Stanislaw

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/1] bnx2x: Tx barriers and locks
@ 2010-02-28 10:03 Vladislav Zolotarov
  0 siblings, 0 replies; 15+ messages in thread
From: Vladislav Zolotarov @ 2010-02-28 10:03 UTC (permalink / raw)
  To: netdev; +Cc: Eilon Greenstein; sgruszka

[Resending with the proper subject. Sorry for the mess. ]

This patch is based on the RFC of Stanislaw Gruszka.

More specifically it fixes two possible races:
- One, described by Stanislaw, may lead to permanent disabling of the Tx
queue.
This is fixed by adding the smp_wmb() to propagate the BD consumer
change towards the memory.
- Second may lead to bnx2x_start_xmit() returning NETDEV_TX_BUSY.
This is fixed by taking a tx_lock() before rechecking the number of
available Tx BDs.

thanks,
vlad 

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x_main.c |   31 ++++++++++++++++++++++---------
 1 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index 5adf2a0..ed785a3 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -57,8 +57,8 @@
 #include "bnx2x_init_ops.h"
 #include "bnx2x_dump.h"
 
-#define DRV_MODULE_VERSION	"1.52.1-6"
-#define DRV_MODULE_RELDATE	"2010/02/16"
+#define DRV_MODULE_VERSION	"1.52.1-7"
+#define DRV_MODULE_RELDATE	"2010/02/28"
 #define BNX2X_BC_VER		0x040200
 
 #include <linux/firmware.h>
@@ -957,21 +957,34 @@ static int bnx2x_tx_int(struct bnx2x_fastpath *fp)
 	fp->tx_pkt_cons = sw_cons;
 	fp->tx_bd_cons = bd_cons;
 
+	/* Need to make the tx_bd_cons update visible to start_xmit()
+	 * before checking for netif_tx_queue_stopped().  Without the
+	 * memory barrier, there is a small possibility that
+	 * start_xmit() will miss it and cause the queue to be stopped
+	 * forever.
+	 */
+	smp_wmb();
+
 	/* TBD need a thresh? */
 	if (unlikely(netif_tx_queue_stopped(txq))) {
-
-		/* Need to make the tx_bd_cons update visible to start_xmit()
-		 * before checking for netif_tx_queue_stopped().  Without the
-		 * memory barrier, there is a small possibility that
-		 * start_xmit() will miss it and cause the queue to be stopped
-		 * forever.
+		/* Taking tx_lock() is needed to prevent reenabling the queue
+		 * while it's empty. This could have happen if rx_action() gets
+		 * suspended in bnx2x_tx_int() after the condition before
+		 * netif_tx_wake_queue(), while tx_action (bnx2x_start_xmit()):
+		 *
+		 * stops the queue->sees fresh tx_bd_cons->releases the queue->
+		 * sends some packets consuming the whole queue again->
+		 * stops the queue
 		 */
-		smp_mb();
+
+		__netif_tx_lock(txq, smp_processor_id());
 
 		if ((netif_tx_queue_stopped(txq)) &&
 		    (bp->state == BNX2X_STATE_OPEN) &&
 		    (bnx2x_tx_avail(fp) >= MAX_SKB_FRAGS + 3))
 			netif_tx_wake_queue(txq);
+
+		__netif_tx_unlock(txq);
 	}
 	return 0;
 }
-- 
1.6.3.3





^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2010-03-08 15:41 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-28 10:12 [PATCH 1/1] bnx2x: Tx barriers and locks Vladislav Zolotarov
2010-03-01  2:49 ` David Miller
2010-03-01 13:33 ` Stanislaw Gruszka
2010-03-01 17:59   ` Michael Chan
2010-03-02 10:38     ` Vladislav Zolotarov
2010-03-02 11:38       ` Stanislaw Gruszka
2010-03-02 11:30     ` Stanislaw Gruszka
2010-03-02 12:50       ` Vladislav Zolotarov
2010-03-02 13:55         ` Stanislaw Gruszka
2010-03-02 16:18           ` Michael Chan
2010-03-02 16:59             ` Stanislaw Gruszka
2010-03-02 17:26               ` Michael Chan
2010-03-08 15:38                 ` Stanislaw Gruszka
2010-03-02 16:21           ` Vladislav Zolotarov
  -- strict thread matches above, loose matches on Subject: below --
2010-02-28 10:03 Vladislav Zolotarov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.