All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] amd-xgbe: Use wmb before updating current descriptor count
@ 2015-10-21 20:37 Tom Lendacky
  2015-10-23  9:59 ` David Miller
  2015-10-23 16:29 ` Alexander Duyck
  0 siblings, 2 replies; 7+ messages in thread
From: Tom Lendacky @ 2015-10-21 20:37 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Christoffer Dall

The code currently uses the lightweight dma_wmb barrier before updating
the current descriptor count. Under heavy load, the Tx cleanup routine
was seeing the updated current descriptor count before the updated
descriptor information. As a result, the Tx descriptor was being cleaned
up before it was used because it was not "owned" by the hardware yet,
resulting in a Tx queue hang.

Using the wmb barrier insures that the descriptor is updated before the
descriptor counter preventing the Tx queue hang. For extra insurance,
the Tx cleanup routine is changed to grab the current decriptor count on
entry and uses that initial value in the processing loop rather than
trying to chase the current value.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 drivers/net/ethernet/amd/xgbe/xgbe-dev.c |    2 +-
 drivers/net/ethernet/amd/xgbe/xgbe-drv.c |    4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
index a4473d8..e9ab8b9 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
@@ -1595,7 +1595,7 @@ static void xgbe_dev_xmit(struct xgbe_channel *channel)
 				  packet->rdesc_count, 1);
 
 	/* Make sure ownership is written to the descriptor */
-	dma_wmb();
+	wmb();
 
 	ring->cur = cur_index + 1;
 	if (!packet->skb->xmit_more ||
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
index aae9d5e..d2b77d9 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
@@ -1807,6 +1807,7 @@ static int xgbe_tx_poll(struct xgbe_channel *channel)
 	struct netdev_queue *txq;
 	int processed = 0;
 	unsigned int tx_packets = 0, tx_bytes = 0;
+	unsigned int cur;
 
 	DBGPR("-->xgbe_tx_poll\n");
 
@@ -1814,10 +1815,11 @@ static int xgbe_tx_poll(struct xgbe_channel *channel)
 	if (!ring)
 		return 0;
 
+	cur = ring->cur;
 	txq = netdev_get_tx_queue(netdev, channel->queue_index);
 
 	while ((processed < XGBE_TX_DESC_MAX_PROC) &&
-	       (ring->dirty != ring->cur)) {
+	       (ring->dirty != cur)) {
 		rdata = XGBE_GET_DESC_DATA(ring, ring->dirty);
 		rdesc = rdata->rdesc;
 

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net] amd-xgbe: Use wmb before updating current descriptor count
  2015-10-21 20:37 [PATCH net] amd-xgbe: Use wmb before updating current descriptor count Tom Lendacky
@ 2015-10-23  9:59 ` David Miller
  2015-10-23 13:35   ` Tom Lendacky
  2015-10-23 16:29 ` Alexander Duyck
  1 sibling, 1 reply; 7+ messages in thread
From: David Miller @ 2015-10-23  9:59 UTC (permalink / raw)
  To: thomas.lendacky; +Cc: netdev, christoffer.dall

From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Wed, 21 Oct 2015 15:37:05 -0500

> The code currently uses the lightweight dma_wmb barrier before updating
> the current descriptor count. Under heavy load, the Tx cleanup routine
> was seeing the updated current descriptor count before the updated
> descriptor information. As a result, the Tx descriptor was being cleaned
> up before it was used because it was not "owned" by the hardware yet,
> resulting in a Tx queue hang.
> 
> Using the wmb barrier insures that the descriptor is updated before the
> descriptor counter preventing the Tx queue hang. For extra insurance,
> the Tx cleanup routine is changed to grab the current decriptor count on
> entry and uses that initial value in the processing loop rather than
> trying to chase the current value.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> Tested-by: Christoffer Dall <christoffer.dall@linaro.org>

Applied, thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] amd-xgbe: Use wmb before updating current descriptor count
  2015-10-23  9:59 ` David Miller
@ 2015-10-23 13:35   ` Tom Lendacky
  2015-10-23 14:02     ` David Miller
  0 siblings, 1 reply; 7+ messages in thread
From: Tom Lendacky @ 2015-10-23 13:35 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, christoffer.dall

On 10/23/2015 04:59 AM, David Miller wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> Date: Wed, 21 Oct 2015 15:37:05 -0500
>
>> The code currently uses the lightweight dma_wmb barrier before updating
>> the current descriptor count. Under heavy load, the Tx cleanup routine
>> was seeing the updated current descriptor count before the updated
>> descriptor information. As a result, the Tx descriptor was being cleaned
>> up before it was used because it was not "owned" by the hardware yet,
>> resulting in a Tx queue hang.
>>
>> Using the wmb barrier insures that the descriptor is updated before the
>> descriptor counter preventing the Tx queue hang. For extra insurance,
>> the Tx cleanup routine is changed to grab the current decriptor count on
>> entry and uses that initial value in the processing loop rather than
>> trying to chase the current value.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> Tested-by: Christoffer Dall <christoffer.dall@linaro.org>
>
> Applied, thanks.

Hi David,

Can you queue this up for stable?  It is applicable to 4.1 and 4.2.

Thanks,
Tom

>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] amd-xgbe: Use wmb before updating current descriptor count
  2015-10-23 13:35   ` Tom Lendacky
@ 2015-10-23 14:02     ` David Miller
  0 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2015-10-23 14:02 UTC (permalink / raw)
  To: thomas.lendacky; +Cc: netdev, christoffer.dall

From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Fri, 23 Oct 2015 08:35:52 -0500

> Can you queue this up for stable?  It is applicable to 4.1 and 4.2.

Ok, done.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] amd-xgbe: Use wmb before updating current descriptor count
  2015-10-21 20:37 [PATCH net] amd-xgbe: Use wmb before updating current descriptor count Tom Lendacky
  2015-10-23  9:59 ` David Miller
@ 2015-10-23 16:29 ` Alexander Duyck
  2015-10-23 16:57   ` Tom Lendacky
  1 sibling, 1 reply; 7+ messages in thread
From: Alexander Duyck @ 2015-10-23 16:29 UTC (permalink / raw)
  To: Tom Lendacky, netdev; +Cc: David Miller, Christoffer Dall

On 10/21/2015 01:37 PM, Tom Lendacky wrote:
> The code currently uses the lightweight dma_wmb barrier before updating
> the current descriptor count. Under heavy load, the Tx cleanup routine
> was seeing the updated current descriptor count before the updated
> descriptor information. As a result, the Tx descriptor was being cleaned
> up before it was used because it was not "owned" by the hardware yet,
> resulting in a Tx queue hang.
>
> Using the wmb barrier insures that the descriptor is updated before the
> descriptor counter preventing the Tx queue hang. For extra insurance,
> the Tx cleanup routine is changed to grab the current decriptor count on
> entry and uses that initial value in the processing loop rather than
> trying to chase the current value.
>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> Tested-by: Christoffer Dall <christoffer.dall@linaro.org>

This shouldn't be using wmb() or dma_wmb().  It looks like you are just 
using the barriers to synchronize between CPUs.  If that is the case you 
should probably be using smp_wmb()/smp_rmb().

> ---
>   drivers/net/ethernet/amd/xgbe/xgbe-dev.c |    2 +-
>   drivers/net/ethernet/amd/xgbe/xgbe-drv.c |    4 +++-
>   2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
> index a4473d8..e9ab8b9 100644
> --- a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
> +++ b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
> @@ -1595,7 +1595,7 @@ static void xgbe_dev_xmit(struct xgbe_channel *channel)
>   				  packet->rdesc_count, 1);
>   
>   	/* Make sure ownership is written to the descriptor */
> -	dma_wmb();
> +	wmb();
>   
>   	ring->cur = cur_index + 1;
>   	if (!packet->skb->xmit_more ||

If anything you could probably just replace it with an smp_wmb() since 
the device doesn't ever read the ring->cur value.  Using a wmb() here is 
just doing unnecessary harm to  your performance as you aren't dealing 
with multiple memory domains.

> diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> index aae9d5e..d2b77d9 100644
> --- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> +++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> @@ -1807,6 +1807,7 @@ static int xgbe_tx_poll(struct xgbe_channel *channel)
>   	struct netdev_queue *txq;
>   	int processed = 0;
>   	unsigned int tx_packets = 0, tx_bytes = 0;
> +	unsigned int cur;
>   
>   	DBGPR("-->xgbe_tx_poll\n");
>   
> @@ -1814,10 +1815,11 @@ static int xgbe_tx_poll(struct xgbe_channel *channel)
>   	if (!ring)
>   		return 0;
>   
> +	cur = ring->cur;
>   	txq = netdev_get_tx_queue(netdev, channel->queue_index);
>   
>   	while ((processed < XGBE_TX_DESC_MAX_PROC) &&
> -	       (ring->dirty != ring->cur)) {
> +	       (ring->dirty != cur)) {
>   		rdata = XGBE_GET_DESC_DATA(ring, ring->dirty);
>   		rdesc = rdata->rdesc;
>   
>

I believe your bug is down here.  You likely need an smp_rmb() between 
the read of ring->cur and rdata.  Otherwise there is nothing to prevent 
the compiler from reordering this so that it reads the descriptor data 
ahead of ring->cur.

Moving the read out of the loop likely resolves the issue as it becomes 
harder for the compiler to optimize the read versus the code in the 
loop.  You may want to just try reverting this patch and starting the 
loop with a smp_rmb() here and replace the dma_wmb() with an smp_wmb() 
above and you should see the race between the interrupt routine and the 
transmit path resolved as I believe the current fix is still a bit racy.

- Alex

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] amd-xgbe: Use wmb before updating current descriptor count
  2015-10-23 16:29 ` Alexander Duyck
@ 2015-10-23 16:57   ` Tom Lendacky
  2015-10-24  0:43     ` David Miller
  0 siblings, 1 reply; 7+ messages in thread
From: Tom Lendacky @ 2015-10-23 16:57 UTC (permalink / raw)
  To: Alexander Duyck, netdev; +Cc: David Miller, Christoffer Dall

On 10/23/2015 11:29 AM, Alexander Duyck wrote:
> On 10/21/2015 01:37 PM, Tom Lendacky wrote:
>> The code currently uses the lightweight dma_wmb barrier before updating
>> the current descriptor count. Under heavy load, the Tx cleanup routine
>> was seeing the updated current descriptor count before the updated
>> descriptor information. As a result, the Tx descriptor was being cleaned
>> up before it was used because it was not "owned" by the hardware yet,
>> resulting in a Tx queue hang.
>>
>> Using the wmb barrier insures that the descriptor is updated before the
>> descriptor counter preventing the Tx queue hang. For extra insurance,
>> the Tx cleanup routine is changed to grab the current decriptor count on
>> entry and uses that initial value in the processing loop rather than
>> trying to chase the current value.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> Tested-by: Christoffer Dall <christoffer.dall@linaro.org>
>
> This shouldn't be using wmb() or dma_wmb().  It looks like you are just
> using the barriers to synchronize between CPUs.  If that is the case you
> should probably be using smp_wmb()/smp_rmb().
>
>> ---
>>   drivers/net/ethernet/amd/xgbe/xgbe-dev.c |    2 +-
>>   drivers/net/ethernet/amd/xgbe/xgbe-drv.c |    4 +++-
>>   2 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
>> b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
>> index a4473d8..e9ab8b9 100644
>> --- a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
>> +++ b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
>> @@ -1595,7 +1595,7 @@ static void xgbe_dev_xmit(struct xgbe_channel
>> *channel)
>>                     packet->rdesc_count, 1);
>>       /* Make sure ownership is written to the descriptor */
>> -    dma_wmb();
>> +    wmb();
>>       ring->cur = cur_index + 1;
>>       if (!packet->skb->xmit_more ||
>
> If anything you could probably just replace it with an smp_wmb() since
> the device doesn't ever read the ring->cur value.  Using a wmb() here is
> just doing unnecessary harm to  your performance as you aren't dealing
> with multiple memory domains.
>
>> diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
>> b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
>> index aae9d5e..d2b77d9 100644
>> --- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
>> +++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
>> @@ -1807,6 +1807,7 @@ static int xgbe_tx_poll(struct xgbe_channel
>> *channel)
>>       struct netdev_queue *txq;
>>       int processed = 0;
>>       unsigned int tx_packets = 0, tx_bytes = 0;
>> +    unsigned int cur;
>>       DBGPR("-->xgbe_tx_poll\n");
>> @@ -1814,10 +1815,11 @@ static int xgbe_tx_poll(struct xgbe_channel
>> *channel)
>>       if (!ring)
>>           return 0;
>> +    cur = ring->cur;
>>       txq = netdev_get_tx_queue(netdev, channel->queue_index);
>>       while ((processed < XGBE_TX_DESC_MAX_PROC) &&
>> -           (ring->dirty != ring->cur)) {
>> +           (ring->dirty != cur)) {
>>           rdata = XGBE_GET_DESC_DATA(ring, ring->dirty);
>>           rdesc = rdata->rdesc;
>>
>
> I believe your bug is down here.  You likely need an smp_rmb() between
> the read of ring->cur and rdata.  Otherwise there is nothing to prevent
> the compiler from reordering this so that it reads the descriptor data
> ahead of ring->cur.
>
> Moving the read out of the loop likely resolves the issue as it becomes
> harder for the compiler to optimize the read versus the code in the
> loop.  You may want to just try reverting this patch and starting the
> loop with a smp_rmb() here and replace the dma_wmb() with an smp_wmb()
> above and you should see the race between the interrupt routine and the
> transmit path resolved as I believe the current fix is still a bit racy.

Alex, thanks for the advice, it makes a lot of sense. I'll test your
suggested fix.

David, if this is indeed the proper fix would you want to me to send a
new patch based on this old patch or a new patch based on you having
reverted the old patch?

Thanks,
Tom

>
> - Alex
>
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] amd-xgbe: Use wmb before updating current descriptor count
  2015-10-23 16:57   ` Tom Lendacky
@ 2015-10-24  0:43     ` David Miller
  0 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2015-10-24  0:43 UTC (permalink / raw)
  To: thomas.lendacky; +Cc: alexander.duyck, netdev, christoffer.dall

From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Fri, 23 Oct 2015 11:57:27 -0500

> David, if this is indeed the proper fix would you want to me to send a
> new patch based on this old patch or a new patch based on you having
> reverted the old patch?

One never sends me a "new patch" for a patch I've already applied, once I
apply it, it is there forever and must either be fixed with a relative
patch or reverted.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-10-24  0:27 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-21 20:37 [PATCH net] amd-xgbe: Use wmb before updating current descriptor count Tom Lendacky
2015-10-23  9:59 ` David Miller
2015-10-23 13:35   ` Tom Lendacky
2015-10-23 14:02     ` David Miller
2015-10-23 16:29 ` Alexander Duyck
2015-10-23 16:57   ` Tom Lendacky
2015-10-24  0:43     ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.