[PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path

* [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path
@ 2015-01-21 12:54 Ezequiel Garcia
  2015-01-21 12:54 ` [PATCH 1/2] net: mvneta: Fix highmem support in the non-TSO egress path Ezequiel Garcia
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Ezequiel Garcia @ 2015-01-21 12:54 UTC (permalink / raw)
  To: netdev, Russell King, David Miller; +Cc: B38611, fabio.estevam, Ezequiel Garcia

These two commits fix the issue reported by Russell King on mv643xx_eth:
commit 69ad0dd7af22 introduced a regression by removing support for
highmem skb fragments. The guilty commit assumed that a fragment's payload
is always located in lowmem pages.

A similar pattern can be found in the original mvneta driver (in fact, the
regression was introduced by copy-pasting the mvneta code).

These fixes are for the non-TSO egress path in mvneta and mv643xx_eth drivers.
The TSO path needs a more intrusive change, as the TSO API needs to be fixed
(e.g. to make it work in skb fragments, instead of pointers to data).

Russell, as I'm still unable to reproduce this, do you think you can
give it a spin over there?

Ezequiel Garcia (2):
  net: mvneta: Fix highmem support in the non-TSO egress path
  net: mv643xx_eth: Fix highmem support in non-TSO egress path

 drivers/net/ethernet/marvell/mv643xx_eth.c | 26 ++++++++++++++------
 drivers/net/ethernet/marvell/mvneta.c      | 39 ++++++++++++++++++------------
 2 files changed, 43 insertions(+), 22 deletions(-)

-- 
2.2.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/2] net: mvneta: Fix highmem support in the non-TSO egress path
  2015-01-21 12:54 [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path Ezequiel Garcia
@ 2015-01-21 12:54 ` Ezequiel Garcia
  2015-01-26 22:40   ` David Miller
  2015-01-21 12:54 ` [PATCH 2/2] net: mv643xx_eth: Fix highmem support in " Ezequiel Garcia
  2015-01-21 15:01 ` [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path Russell King - ARM Linux
  2 siblings, 1 reply; 19+ messages in thread
From: Ezequiel Garcia @ 2015-01-21 12:54 UTC (permalink / raw)
  To: netdev, Russell King, David Miller; +Cc: B38611, fabio.estevam, Ezequiel Garcia

The current implementation is broken and does not support
an skb fragment being in a highmem page. By using page_address()
to get the address of a fragment's page, we are assuming a
lowmem page. However, such an assumption is incorrect, and proper
highmem support is required instead.

This commit fixes this by using the skb_frag_dma_map() helper,
which takes care of mapping the skb fragment properly.

Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Reported-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
---
 drivers/net/ethernet/marvell/mvneta.c | 39 +++++++++++++++++++++--------------
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 96208f1..adec923 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -1296,10 +1296,22 @@ static void mvneta_txq_bufs_free(struct mvneta_port *pp,
 
 		mvneta_txq_inc_get(txq);
 
-		if (!IS_TSO_HEADER(txq, tx_desc->buf_phys_addr))
-			dma_unmap_single(pp->dev->dev.parent,
-					 tx_desc->buf_phys_addr,
-					 tx_desc->data_size, DMA_TO_DEVICE);
+		if (!IS_TSO_HEADER(txq, tx_desc->buf_phys_addr)) {
+
+			/* The first descriptor is either a TSO header or
+			 * the linear part of the skb.
+			 */
+			if (tx_desc->command & MVNETA_TXD_F_DESC)
+				dma_unmap_single(pp->dev->dev.parent,
+						 tx_desc->buf_phys_addr,
+						 tx_desc->data_size,
+						 DMA_TO_DEVICE);
+			else
+				dma_unmap_page(pp->dev->dev.parent,
+					       tx_desc->buf_phys_addr,
+					       tx_desc->data_size,
+					       DMA_TO_DEVICE);
+		}
 		if (!skb)
 			continue;
 		dev_kfree_skb_any(skb);
@@ -1669,14 +1681,11 @@ static int mvneta_tx_frag_process(struct mvneta_port *pp, struct sk_buff *skb,
 
 	for (i = 0; i < nr_frags; i++) {
 		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
-		void *addr = page_address(frag->page.p) + frag->page_offset;
-
 		tx_desc = mvneta_txq_next_desc_get(txq);
-		tx_desc->data_size = frag->size;
-
-		tx_desc->buf_phys_addr =
-			dma_map_single(pp->dev->dev.parent, addr,
-				       tx_desc->data_size, DMA_TO_DEVICE);
+		tx_desc->data_size = skb_frag_size(frag);
+		tx_desc->buf_phys_addr = skb_frag_dma_map(pp->dev->dev.parent,
+						frag, 0, tx_desc->data_size,
+						DMA_TO_DEVICE);
 
 		if (dma_mapping_error(pp->dev->dev.parent,
 				      tx_desc->buf_phys_addr)) {
@@ -1704,10 +1713,10 @@ error:
 	 */
 	for (i = i - 1; i >= 0; i--) {
 		tx_desc = txq->descs + i;
-		dma_unmap_single(pp->dev->dev.parent,
-				 tx_desc->buf_phys_addr,
-				 tx_desc->data_size,
-				 DMA_TO_DEVICE);
+		dma_unmap_page(pp->dev->dev.parent,
+			       tx_desc->buf_phys_addr,
+			       tx_desc->data_size,
+			       DMA_TO_DEVICE);
 		mvneta_txq_desc_put(txq);
 	}
 
-- 
2.2.1

* [PATCH 2/2] net: mv643xx_eth: Fix highmem support in non-TSO egress path
  2015-01-21 12:54 [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path Ezequiel Garcia
  2015-01-21 12:54 ` [PATCH 1/2] net: mvneta: Fix highmem support in the non-TSO egress path Ezequiel Garcia
@ 2015-01-21 12:54 ` Ezequiel Garcia
  2015-01-21 17:40   ` Russell King - ARM Linux
  2015-01-26 22:40   ` David Miller
  2015-01-21 15:01 ` [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path Russell King - ARM Linux
  2 siblings, 2 replies; 19+ messages in thread
From: Ezequiel Garcia @ 2015-01-21 12:54 UTC (permalink / raw)
  To: netdev, Russell King, David Miller; +Cc: B38611, fabio.estevam, Ezequiel Garcia

Commit 69ad0dd7af22b61d9e0e68e56b6290121618b0fb
Author: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Date:   Mon May 19 13:59:59 2014 -0300

    net: mv643xx_eth: Use dma_map_single() to map the skb fragments

caused a nasty regression by removing the support for highmem skb
fragments. By using page_address() to get the address of a fragment's
page, we are assuming a lowmem page. However, such assumption is incorrect,
as fragments can be in highmem pages, resulting in very nasty issues.

This commit fixes this by using the skb_frag_dma_map() helper,
which takes care of mapping the skb fragment properly.

Fixes: 69ad0dd7af22 ("net: mv643xx_eth: Use dma_map_single() to map the skb fragments")
Reported-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
---
 drivers/net/ethernet/marvell/mv643xx_eth.c | 26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c
index a62fc38..0c77f0e 100644
--- a/drivers/net/ethernet/marvell/mv643xx_eth.c
+++ b/drivers/net/ethernet/marvell/mv643xx_eth.c
@@ -879,10 +879,8 @@ static void txq_submit_frag_skb(struct tx_queue *txq, struct sk_buff *skb)
 		skb_frag_t *this_frag;
 		int tx_index;
 		struct tx_desc *desc;
-		void *addr;
 
 		this_frag = &skb_shinfo(skb)->frags[frag];
-		addr = page_address(this_frag->page.p) + this_frag->page_offset;
 		tx_index = txq->tx_curr_desc++;
 		if (txq->tx_curr_desc == txq->tx_ring_size)
 			txq->tx_curr_desc = 0;
@@ -902,8 +900,9 @@ static void txq_submit_frag_skb(struct tx_queue *txq, struct sk_buff *skb)
 
 		desc->l4i_chk = 0;
 		desc->byte_cnt = skb_frag_size(this_frag);
-		desc->buf_ptr = dma_map_single(mp->dev->dev.parent, addr,
-					       desc->byte_cnt, DMA_TO_DEVICE);
+		desc->buf_ptr = skb_frag_dma_map(mp->dev->dev.parent,
+						 this_frag, 0, desc->byte_cnt,
+						 DMA_TO_DEVICE);
 	}
 }
 
@@ -1065,9 +1064,22 @@ static int txq_reclaim(struct tx_queue *txq, int budget, int force)
 		reclaimed++;
 		txq->tx_desc_count--;
 
-		if (!IS_TSO_HEADER(txq, desc->buf_ptr))
-			dma_unmap_single(mp->dev->dev.parent, desc->buf_ptr,
-					 desc->byte_cnt, DMA_TO_DEVICE);
+		if (!IS_TSO_HEADER(txq, desc->buf_ptr)) {
+
+			/* The first descriptor is either a TSO header or
+			 * the linear part of the skb.
+			 */
+			if (desc->cmd_sts & TX_FIRST_DESC)
+				dma_unmap_single(mp->dev->dev.parent,
+						 desc->buf_ptr,
+						 desc->byte_cnt,
+						 DMA_TO_DEVICE);
+			else
+				dma_unmap_page(mp->dev->dev.parent,
+					       desc->buf_ptr,
+					       desc->byte_cnt,
+					       DMA_TO_DEVICE);
+		}
 
 		if (cmd_sts & TX_ENABLE_INTERRUPT) {
 			struct sk_buff *skb = __skb_dequeue(&txq->tx_skb);
-- 
2.2.1

* Re: [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path
  2015-01-21 12:54 [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path Ezequiel Garcia
  2015-01-21 12:54 ` [PATCH 1/2] net: mvneta: Fix highmem support in the non-TSO egress path Ezequiel Garcia
  2015-01-21 12:54 ` [PATCH 2/2] net: mv643xx_eth: Fix highmem support in " Ezequiel Garcia
@ 2015-01-21 15:01 ` Russell King - ARM Linux
  2015-01-22 18:41   ` Dean Gehnert
  2 siblings, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2015-01-21 15:01 UTC (permalink / raw)
  To: Ezequiel Garcia; +Cc: netdev, David Miller, B38611, fabio.estevam

On Wed, Jan 21, 2015 at 09:54:08AM -0300, Ezequiel Garcia wrote:
> These two commits are fixes to the issue reported by Russell King on
> mv643xx_eth. Namely, the introduction of a regression by commit 69ad0dd7af22
> which removed the support for highmem skb fragments. The guilty commit
> introduced the assumption of fragment's payload being located in lowmem pages.

I do wonder whether 69ad0dd7af22 is the real culprit, or whether there is
some other change in the netdev layer that we're missing.  That commit is
in 3.16, but from what I remember, 3.17 works fine, it's 3.18 which fails.

> A similar pattern can be found in the original mvneta driver (in fact, the
> regression was introduced by copy-pasting the mvneta code).
> 
> These fixes are for the non-TSO egress path in mvneta and mv643xx_eth drivers.
> The TSO path needs a more intrusive change, as the TSO API needs to be fixed
> (e.g. to make it work in skb fragments, instead of pointers to data).
> 
> Russell, as I'm still unable to reproduce this, do you think you can
> give it a spin over there?

Sure - I think the only one I can test is mv643xx_eth, I don't think I
have any device which supports mv_neta.

The test scenario is for a NFS mount (the Marvell device as the NFS
client) over IPv6.

Initial testing looks good, I'll let it run for a while with various
builds on the NFS share (which iirc was one of the triggering
workloads).

Thanks.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

* Re: [PATCH 2/2] net: mv643xx_eth: Fix highmem support in non-TSO egress path
  2015-01-21 12:54 ` [PATCH 2/2] net: mv643xx_eth: Fix highmem support in " Ezequiel Garcia
@ 2015-01-21 17:40   ` Russell King - ARM Linux
  2015-01-21 23:34     ` Ezequiel Garcia
  2015-01-26 22:40   ` David Miller
  1 sibling, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2015-01-21 17:40 UTC (permalink / raw)
  To: Ezequiel Garcia; +Cc: netdev, David Miller, B38611, fabio.estevam

On Wed, Jan 21, 2015 at 09:54:10AM -0300, Ezequiel Garcia wrote:
> Commit 69ad0dd7af22b61d9e0e68e56b6290121618b0fb
> Author: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
> Date:   Mon May 19 13:59:59 2014 -0300
> 
>     net: mv643xx_eth: Use dma_map_single() to map the skb fragments
> 
> caused a nasty regression by removing the support for highmem skb
> fragments. By using page_address() to get the address of a fragment's
> page, we are assuming a lowmem page. However, such assumption is incorrect,
> as fragments can be in highmem pages, resulting in very nasty issues.
> 
> This commit fixes this by using the skb_frag_dma_map() helper,
> which takes care of mapping the skb fragment properly.

This seems fine, so:

> Fixes: 69ad0dd7af22 ("net: mv643xx_eth: Use dma_map_single() to map the skb fragments")
> Reported-by: Russell King <linux@arm.linux.org.uk>

Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Russell King <rmk+kernel@arm.linux.org.uk>

Thanks.

> Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
> ---
>  drivers/net/ethernet/marvell/mv643xx_eth.c | 26 +++++++++++++++++++-------
>  1 file changed, 19 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c
> index a62fc38..0c77f0e 100644
> --- a/drivers/net/ethernet/marvell/mv643xx_eth.c
> +++ b/drivers/net/ethernet/marvell/mv643xx_eth.c
> @@ -879,10 +879,8 @@ static void txq_submit_frag_skb(struct tx_queue *txq, struct sk_buff *skb)
>  		skb_frag_t *this_frag;
>  		int tx_index;
>  		struct tx_desc *desc;
> -		void *addr;
>  
>  		this_frag = &skb_shinfo(skb)->frags[frag];
> -		addr = page_address(this_frag->page.p) + this_frag->page_offset;
>  		tx_index = txq->tx_curr_desc++;
>  		if (txq->tx_curr_desc == txq->tx_ring_size)
>  			txq->tx_curr_desc = 0;
> @@ -902,8 +900,9 @@ static void txq_submit_frag_skb(struct tx_queue *txq, struct sk_buff *skb)
>  
>  		desc->l4i_chk = 0;
>  		desc->byte_cnt = skb_frag_size(this_frag);
> -		desc->buf_ptr = dma_map_single(mp->dev->dev.parent, addr,
> -					       desc->byte_cnt, DMA_TO_DEVICE);
> +		desc->buf_ptr = skb_frag_dma_map(mp->dev->dev.parent,
> +						 this_frag, 0, desc->byte_cnt,
> +						 DMA_TO_DEVICE);
>  	}
>  }
>  
> @@ -1065,9 +1064,22 @@ static int txq_reclaim(struct tx_queue *txq, int budget, int force)
>  		reclaimed++;
>  		txq->tx_desc_count--;
>  
> -		if (!IS_TSO_HEADER(txq, desc->buf_ptr))
> -			dma_unmap_single(mp->dev->dev.parent, desc->buf_ptr,
> -					 desc->byte_cnt, DMA_TO_DEVICE);
> +		if (!IS_TSO_HEADER(txq, desc->buf_ptr)) {
> +
> +			/* The first descriptor is either a TSO header or
> +			 * the linear part of the skb.
> +			 */
> +			if (desc->cmd_sts & TX_FIRST_DESC)
> +				dma_unmap_single(mp->dev->dev.parent,
> +						 desc->buf_ptr,
> +						 desc->byte_cnt,
> +						 DMA_TO_DEVICE);
> +			else
> +				dma_unmap_page(mp->dev->dev.parent,
> +					       desc->buf_ptr,
> +					       desc->byte_cnt,
> +					       DMA_TO_DEVICE);
> +		}
>  
>  		if (cmd_sts & TX_ENABLE_INTERRUPT) {
>  			struct sk_buff *skb = __skb_dequeue(&txq->tx_skb);
> -- 
> 2.2.1
> 

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

* Re: [PATCH 2/2] net: mv643xx_eth: Fix highmem support in non-TSO egress path
  2015-01-21 17:40   ` Russell King - ARM Linux
@ 2015-01-21 23:34     ` Ezequiel Garcia
  2015-01-22  0:11       ` Russell King - ARM Linux
  0 siblings, 1 reply; 19+ messages in thread
From: Ezequiel Garcia @ 2015-01-21 23:34 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: netdev, David Miller, B38611, fabio.estevam

On 01/21/2015 02:40 PM, Russell King - ARM Linux wrote:
> On Wed, Jan 21, 2015 at 09:54:10AM -0300, Ezequiel Garcia wrote:
>> Commit 69ad0dd7af22b61d9e0e68e56b6290121618b0fb
>> Author: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
>> Date:   Mon May 19 13:59:59 2014 -0300
>>
>>     net: mv643xx_eth: Use dma_map_single() to map the skb fragments
>>
>> caused a nasty regression by removing the support for highmem skb
>> fragments. By using page_address() to get the address of a fragment's
>> page, we are assuming a lowmem page. However, such assumption is incorrect,
>> as fragments can be in highmem pages, resulting in very nasty issues.
>>
>> This commit fixes this by using the skb_frag_dma_map() helper,
>> which takes care of mapping the skb fragment properly.
> 
> This seems fine, so:
> 

I have just realised that the non-TSO and the TSO paths must work
simultaneously (we don't know which path an egress skb will take).

So, with these patches, the unmapping is done using dma_unmap_page() which
is only correct if the skb took the non-TSO paths. In other words,
these fixes are wrong (although I have no idea the effect of
using dma_unmap_page on a mapping done with dma_map_single).

And the problem is that in the TSO path, the linear and the non-linear
fragments use the same kind of descriptors, so we can't distinguish
them in the cleanup, and can't decide if _single or _page should be used.

Any ideas?

I guess we could keep track in some data structure of the type of mapping
on each descriptor. Or alternatively, avoid highmem fragments altogether
by mapping to a lowmem page.

I'll try to come up with some more patches following the first idea.
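
As a rough userspace sketch of the first idea (not driver code; the ring
layout, the names tx_ring, ring_record and ring_reclaim, and the counters
standing in for the real unmap calls are all hypothetical), the mapping
type could be recorded per descriptor at submit time and consulted at
reclaim time:

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8

/* How a descriptor's buffer was mapped (hypothetical bookkeeping). */
enum map_type { MAP_NONE, MAP_SINGLE, MAP_PAGE };

struct tx_ring {
	uint32_t buf_addr[RING_SIZE];      /* stand-in for the DMA address */
	enum map_type mapping[RING_SIZE];  /* parallel array, one per descriptor */
	unsigned int single_unmaps;        /* counters standing in for the */
	unsigned int page_unmaps;          /* real dma_unmap_*() calls */
};

/* Record the mapping kind at submit time, next to the descriptor. */
static void ring_record(struct tx_ring *r, unsigned int idx,
			uint32_t addr, enum map_type type)
{
	r->buf_addr[idx % RING_SIZE] = addr;
	r->mapping[idx % RING_SIZE] = type;
}

/* At reclaim time, pick the unmap flavour from the recorded type. */
static enum map_type ring_reclaim(struct tx_ring *r, unsigned int idx)
{
	unsigned int i = idx % RING_SIZE;
	enum map_type type = r->mapping[i];

	switch (type) {
	case MAP_SINGLE:
		r->single_unmaps++;	/* would be dma_unmap_single() */
		break;
	case MAP_PAGE:
		r->page_unmaps++;	/* would be dma_unmap_page() */
		break;
	case MAP_NONE:
		break;			/* e.g. a TSO header: nothing to unmap */
	}
	r->mapping[i] = MAP_NONE;	/* slot is free again */
	return type;
}
```

With something like this, the cleanup path would no longer need to infer
the mapping kind from descriptor flags, so it should not matter whether
the skb took the TSO or the non-TSO path.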

Sorry for the crappiness,
-- 
Ezequiel García, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com

* Re: [PATCH 2/2] net: mv643xx_eth: Fix highmem support in non-TSO egress path
  2015-01-21 23:34     ` Ezequiel Garcia
@ 2015-01-22  0:11       ` Russell King - ARM Linux
  2015-01-22 12:17         ` Ezequiel Garcia
  0 siblings, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2015-01-22  0:11 UTC (permalink / raw)
  To: Ezequiel Garcia; +Cc: netdev, David Miller, B38611, fabio.estevam

On Wed, Jan 21, 2015 at 08:34:30PM -0300, Ezequiel Garcia wrote:
> I have just realised that the non-TSO and the TSO paths must work
> simultaneously (we don't know which path an egress skb will take).
> 
> So, with these patches, the unmapping is done using dma_unmap_page() which
> is only correct if the skb took the non-TSO paths. In other words,
> these fixes are wrong (although I have no idea the effect of
> using dma_unmap_page on a mapping done with dma_map_single).
> 
> And the problem is that in the TSO path, the linear and the non-linear
> fragments use the same kind of descriptors, so we can't distinguish
> them in the cleanup, and can't decide if _single or _page should be used.
> 
> Any ideas?

Or, maybe, if davem would reply, we might come to the conclusion (as
I previously pointed out) that it's not a driver issue, but a netdev
core issue:

static netdev_features_t harmonize_features(struct sk_buff *skb,
        netdev_features_t features)
{
...
        if (skb->ip_summed != CHECKSUM_NONE &&
            !can_checksum_protocol(features, type)) {
                features &= ~NETIF_F_ALL_CSUM;
        } else if (illegal_highdma(skb->dev, skb)) {
                features &= ~NETIF_F_SG;
        }

The problem is when the first "if" is true (as is the case with IPv6 on
mv643xx_eth.c), we clear NETIF_F_ALL_CSUM, but leave NETIF_F_SG set.

Had that first if been false, we would've called illegal_highdma(), and
found that the skb contains some highmem fragments, but the device does
*not* have NETIF_F_HIGHDMA set, and so that second "if" would be true.
The result of that is NETIF_F_SG is cleared.

In this case, in validate_xmit_skb(), skb_needs_linearize() would be
true for a skb with fragments, causing the skb to be linearised.  I've
not completely traced the GSO path, but I'd assume that does something
similar (which I think skb_segment() handles.)

So, I'm wondering whether the above should be:

static netdev_features_t harmonize_features(struct sk_buff *skb,
        netdev_features_t features)
{
...
        if (skb->ip_summed != CHECKSUM_NONE &&
            !can_checksum_protocol(features, type)) {
                features &= ~NETIF_F_ALL_CSUM;
        }

        if (illegal_highdma(skb->dev, skb)) {
                features &= ~NETIF_F_SG;
        }

So that we get NETIF_F_SG turned off for all cases (irrespective of the
NETIF_F_ALL_CSUM test) if we see a skb with highmem and the device
does not support highdma.
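
To make the difference concrete, here is a small userspace model of the
two variants. It is only a sketch: the F_* bits, struct model_skb and the
helper names are hypothetical stand-ins for the real netdev definitions,
and the checksum test is reduced to two booleans.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real netdev feature bits. */
#define F_SG        (1u << 0)
#define F_ALL_CSUM  (1u << 1)
#define F_HIGHDMA   (1u << 2)

struct model_skb {
	bool needs_csum;      /* models skb->ip_summed != CHECKSUM_NONE */
	bool csum_supported;  /* models can_checksum_protocol() */
	bool has_highmem;     /* at least one fragment in a highmem page */
};

/* Models illegal_highdma(): highmem fragments, device lacks HIGHDMA. */
static bool model_illegal_highdma(const struct model_skb *skb,
				  uint32_t features)
{
	return skb->has_highmem && !(features & F_HIGHDMA);
}

/* Current logic: the highdma check is skipped when the csum branch hits. */
static uint32_t harmonize_current(const struct model_skb *skb,
				  uint32_t features)
{
	if (skb->needs_csum && !skb->csum_supported)
		features &= ~F_ALL_CSUM;
	else if (model_illegal_highdma(skb, features))
		features &= ~F_SG;
	return features;
}

/* Proposed logic: the two checks are independent. */
static uint32_t harmonize_proposed(const struct model_skb *skb,
				   uint32_t features)
{
	if (skb->needs_csum && !skb->csum_supported)
		features &= ~F_ALL_CSUM;
	if (model_illegal_highdma(skb, features))
		features &= ~F_SG;
	return features;
}
```

For a model skb with needs_csum set, csum_supported clear and has_highmem
set, on a device advertising F_SG | F_ALL_CSUM but not F_HIGHDMA,
harmonize_current() leaves F_SG set while harmonize_proposed() clears it.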

Yes, the code above hasn't changed in functionality for a long time, but
that doesn't mean it isn't buggy, and isn't the cause of our current bug.

However, it would be far better to have the drivers fixed for the sake
of performance - it's only this dma_map_page() thing that is the real
cause of the problem in these drivers.

Looking at TSO, it seems madness that it doesn't support highmem:

void tso_start(struct sk_buff *skb, struct tso_t *tso)
{
...
        tso->data = skb->data + hdr_len;
...
                tso->data = page_address(frag->page.p) + frag->page_offset;

Of course, this would all be a lot easier for drivers if all drivers had
to worry about was a struct page, offset and size, rather than having to
track whether each individual mapping of a transmit packet was mapped
with dma_map_single() or dma_map_page().

That all said, what I really care about is the regression which basically
makes 3.18 unusable on this hardware and seeing _some_ kind of resolution
to that regression - I don't care if it doesn't quite perform, what I care
about is that the network driver doesn't oops the kernel.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

* Re: [PATCH 2/2] net: mv643xx_eth: Fix highmem support in non-TSO egress path
  2015-01-22  0:11       ` Russell King - ARM Linux
@ 2015-01-22 12:17         ` Ezequiel Garcia
  0 siblings, 0 replies; 19+ messages in thread
From: Ezequiel Garcia @ 2015-01-22 12:17 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: netdev, David Miller, B38611, fabio.estevam

On 01/21/2015 09:11 PM, Russell King - ARM Linux wrote:
> On Wed, Jan 21, 2015 at 08:34:30PM -0300, Ezequiel Garcia wrote:
>> I have just realised that the non-TSO and the TSO paths must work
>> simultaneously (we don't know which path an egress skb will take).
>>
>> So, with these patches, the unmapping is done using dma_unmap_page() which
>> is only correct if the skb took the non-TSO paths. In other words,
>> these fixes are wrong (although I have no idea the effect of
>> using dma_unmap_page on a mapping done with dma_map_single).
>>
>> And the problem is that in the TSO path, the linear and the non-linear
>> fragments use the same kind of descriptors, so we can't distinguish
>> them in the cleanup, and can't decide if _single or _page should be used.
>>
>> Any ideas?
> 
> Or, maybe, if davem would reply, we might come to the conclusion (as
> I previously pointed out) that it's not a driver issue, but a netdev
> core issue:
> 
> static netdev_features_t harmonize_features(struct sk_buff *skb,
>         netdev_features_t features)
> {
> ...
>         if (skb->ip_summed != CHECKSUM_NONE &&
>             !can_checksum_protocol(features, type)) {
>                 features &= ~NETIF_F_ALL_CSUM;
>         } else if (illegal_highdma(skb->dev, skb)) {
>                 features &= ~NETIF_F_SG;
>         }
> 
> The problem is when the first "if" is true (as is the case with IPv6 on
> mv643xx_eth.c), we clear NETIF_F_ALL_CSUM, but leave NETIF_F_SG set.
> 
> Had that first if been false, we would've called illegal_highdma(), and
> found that the skb contains some highmem fragments, but the device does
> *not* have NETIF_F_HIGHDMA set, and so that second "if" would be true.
> The result of that is NETIF_F_SG is cleared.
> 
> In this case, in validate_xmit_skb(), skb_needs_linearize() would be
> true for a skb with fragments, causing the skb to be linearised.  I've
> not completely traced the GSO path, but I'd assume that does something
> similar (which I think skb_segment() handles.)
> 
> So, I'm wondering whether the above should be:
> 
> static netdev_features_t harmonize_features(struct sk_buff *skb,
>         netdev_features_t features)
> {
> ...
>         if (skb->ip_summed != CHECKSUM_NONE &&
>             !can_checksum_protocol(features, type)) {
>                 features &= ~NETIF_F_ALL_CSUM;
>         }
> 
>         if (illegal_highdma(skb->dev, skb)) {
>                 features &= ~NETIF_F_SG;
>         }
> 
> So that we get NETIF_F_SG turned off for all cases (irrespective of the
> NETIF_F_ALL_CSUM test) if we see a skb with highmem and the device
> does not support highdma.
> 
> Yes, the code above hasn't changed in functionality for a long time, but
> that doesn't mean it isn't buggy, and isn't the cause of our current bug.
> 

Hm, that's interesting.

> However, it would be far better to have the drivers fixed for the sake
> of performance - it's only this dma_map_page() thing that is the real
> cause of the problem in these drivers.
> 

Yes, I have just sent a v2 to fix the mv643xx_eth driver (non-TSO path).
If that works, I'll see about preparing a fix for mvneta, and for both
egress paths.

> Looking at TSO, it seems madness that it doesn't support highmem:
> 
> void tso_start(struct sk_buff *skb, struct tso_t *tso)
> {
> ...
>         tso->data = skb->data + hdr_len;
> ...
>                 tso->data = page_address(frag->page.p) + frag->page_offset;
> 
> Of course, this would all be a lot easier for drivers if all drivers had
> to worry about was a struct page, offset and size, rather than having to
> track whether each individual mapping of a transmit packet was mapped
> with dma_map_single() or dma_map_page().
> 
> That all said, what I really care about is the regression which basically
> makes 3.18 unusable on this hardware and seeing _some_ kind of resolution
> to that regression - I don't care if it doesn't quite perform, what I care
> about is that the network driver doesn't oops the kernel.
> 

Thanks for all the info!
-- 
Ezequiel García, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com

* Re: [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path
  2015-01-21 15:01 ` [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path Russell King - ARM Linux
@ 2015-01-22 18:41   ` Dean Gehnert
  2015-01-22 18:45     ` Ezequiel Garcia
  2015-01-22 21:09     ` Russell King - ARM Linux
  0 siblings, 2 replies; 19+ messages in thread
From: Dean Gehnert @ 2015-01-22 18:41 UTC (permalink / raw)
  To: Russell King - ARM Linux, Ezequiel Garcia
  Cc: netdev, David Miller, B38611, fabio.estevam

On 01/21/2015 07:01 AM, Russell King - ARM Linux wrote:
> On Wed, Jan 21, 2015 at 09:54:08AM -0300, Ezequiel Garcia wrote:
>> These two commits are fixes to the issue reported by Russell King on
>> mv643xx_eth. Namely, the introduction of a regression by commit 69ad0dd7af22
>> which removed the support for highmem skb fragments. The guilty commit
>> introduced the assumption of fragment's payload being located in lowmem pages.
> I do wonder whether 69ad0dd7af22 is the real culprit, or whether there is
> some other change in the netdev layer that we're missing.  That commit is
> in 3.16, but from what I remember, 3.17 works fine, it's 3.18 which fails.
>
>> A similar pattern can be found in the original mvneta driver (in fact, the
>> regression was introduced by copy-pasting the mvneta code).
>>
>> These fixes are for the non-TSO egress path in mvneta and mv643xx_eth drivers.
>> The TSO path needs a more intrusive change, as the TSO API needs to be fixed
>> (e.g. to make it work in skb fragments, instead of pointers to data).
>>
>> Russell, as I'm still unable to reproduce this, do you think you can
>> give it a spin over there?
> Sure - I think the only one I can test is mv643xx_eth, I don't think I
> have any device which supports mv_neta.
>
> The test scenario is for a NFS mount (the Marvell device as the NFS
> client) over IPv6.
>
> Initial testing looks good, I'll let it run for a while with various
> builds on the NFS share (which iirc was one of the triggering
> workloads).
>
> Thanks.
>
FYI, I found a way to reproduce the mv643xx_eth transmit corruption 
without using a network filesystem by using SOCAT (should also be able 
to use NETCAT or NC) and I have a bit more information about the 
corruption that looks like it is somehow related to the cache line size.

1) Create a "large" input file with known data on the target (saved to 
RAM disk or other storage):
     % php -r 'for ($x = 0; $x < 0x2000000; $x++) { printf("%08X\n", 
$x); }' > ExpectData.in
       or
     % perl -e 'for ($x = 0; $x < 0x2000000; $x++) { printf("%08X\n", 
$x); }' > ExpectData.in
     % md5sum ExpectData.in
     4a4727232209b85badc1ca25ed4df222  ExpectData.in
2) Start SOCAT on the host system to perform Ethernet receive MD5 
checksum of the data:
     % socat -s -u TCP4-LISTEN:4000,fork,reuseaddr EXEC:md5sum
3) Enable TSO on the target:
     % ethtool -K eth0 tso on
4)  Send the data file from the target to the host using SOCAT with a 
non-cache aligned block size:
     % socat -b$(((1024*10)+1)) -u ExpectData.in TCP:192.168.1.212:4000
5) The SOCAT running on the host system will report the MD5 checksum. If 
the MD5 is correct, it should be 4a4727232209b85badc1ca25ed4df222.
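
If neither php nor perl is available on the target, the same test pattern
from step 1 can be produced by a small C helper. This is only a sketch:
write_expect_data is a hypothetical name, and it assumes the intended
format is exactly the "%08X\n" lines printed by the one-liners above.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Write `count` lines of the form "%08X\n" (0, 1, 2, ...) to `out`,
 * mirroring the php/perl one-liners from step 1.
 * Returns 0 on success, -1 on a write error.
 */
static int write_expect_data(FILE *out, uint32_t count)
{
	for (uint32_t x = 0; x < count; x++) {
		if (fprintf(out, "%08" PRIX32 "\n", x) < 0)
			return -1;
	}
	return 0;
}
```

Calling write_expect_data(stdout, 0x2000000) from a main() and redirecting
the output to ExpectData.in should give the same bytes as the one-liners,
so md5sum is expected to report the same checksum.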

What I am seeing is that, every now and then, 32 bits (4 bytes) of
data in the transmit Ethernet stream are corrupted. If I change the
SOCAT block size to something that is Armada 300 (Kirkwood) cache line
aligned (i.e. -b$(((1024*10)+0)) or -b$(((1024*10)+8))), it works just
fine... If you want to capture the actual file and look at it, you can
use SOCAT:
   % socat -u TCP4-LISTEN:4000,fork,reuseaddr OPEN:ActualData.in,creat
and since the data file is text, it is really easy to see the corruption 
(diff ExpectData.in ActualData.in | less).

I can disable TSO (ethtool -K eth0 tso off) and re-run the tests and the 
corruption does not occur.

I will give Ezequiel's latest patches a test today and let you know if
they change the behavior.

Dean

* Re: [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path
  2015-01-22 18:41   ` Dean Gehnert
@ 2015-01-22 18:45     ` Ezequiel Garcia
  2015-01-22 19:01       ` Dean Gehnert
  2015-01-22 21:09     ` Russell King - ARM Linux
  1 sibling, 1 reply; 19+ messages in thread
From: Ezequiel Garcia @ 2015-01-22 18:45 UTC (permalink / raw)
  To: deang, Russell King - ARM Linux
  Cc: netdev, David Miller, B38611, fabio.estevam

On 01/22/2015 03:41 PM, Dean Gehnert wrote:
> On 01/21/2015 07:01 AM, Russell King - ARM Linux wrote:
>> On Wed, Jan 21, 2015 at 09:54:08AM -0300, Ezequiel Garcia wrote:
>>> These two commits are fixes to the issue reported by Russell King on
>>> mv643xx_eth. Namely, the introduction of a regression by commit
>>> 69ad0dd7af22
>>> which removed the support for highmem skb fragments. The guilty commit
>>> introduced the assumption of fragment's payload being located in
>>> lowmem pages.
>> I do wonder whether 69ad0dd7af22 is the real culprit, or whether there is
>> some other change in the netdev layer that we're missing.  That commit is
>> in 3.16, but from what I remember, 3.17 works fine, it's 3.18 which
>> fails.
>>
>>> A similar pattern can be found in the original mvneta driver (in
>>> fact, the
>>> regression was introduced by copy-pasting the mvneta code).
>>>
>>> These fixes are for the non-TSO egress path in mvneta and mv643xx_eth
>>> drivers.
>>> The TSO path needs a more intrusive change, as the TSO API needs to
>>> be fixed
>>> (e.g. to make it work in skb fragments, instead of pointers to data).
>>>
>>> Russell, as I'm still unable to reproduce this, do you think you can
>>> give it a spin over there?
>> Sure - I think the only one I can test is mv643xx_eth, I don't think I
>> have any device which supports mv_neta.
>>
>> The test scenario is for a NFS mount (the Marvell device as the NFS
>> client) over IPv6.
>>
>> Initial testing looks good, I'll let it run for a while with various
>> builds on the NFS share (which iirc was one of the triggering
>> workloads).
>>
>> Thanks.
>>
> FYI, I found a way to reproduce the mv643xx_eth transmit corruption
> without using a network filesystem by using SOCAT (should also be able
> to use NETCAT or NC) and I have a bit more information about the
> corruption that looks like it is somehow related to the cache line size.
> 
> 1) Create a "large" input file with known data on the target (saved to
> RAM disk or other storage):
>     % php -r 'for ($x = 0; $x < 0x2000000; $x++) { printf("%08X\n", $x);
> }' > ExpectData.in
>       or
>     % perl -e 'for ($x = 0; $x < 0x2000000; $x++) { printf("%08X\n",
> $x); }' > ExpectData.in
>     % md5sum ExpectData.in
>     4a4727232209b85badc1ca25ed4df222  ExpectData.in
> 2) Start SOCAT on the host system to perform Ethernet receive MD5
> checksum of the data:
>     % socat -s -u TCP4-LISTEN:4000,fork,reuseaddr EXEC:md5sum
> 3) Enable TSO on the target:
>     % ethtool -K eth0 tso on
> 4)  Send the data file from the target to the host using SOCAT with a
> non-cache aligned block size:
>     % socat -b$(((1024*10)+1)) -u ExpectData.in TCP:192.168.1.212:4000
> 5) The SOCAT running on the host system will report the MD5 checksum. If
> the MD5 is correct, it should be 4a4727232209b85badc1ca25ed4df222.
> 
> What I am seeing is every now and then, there are 32-bits (4 bytes) of
> data in the transmit Ethernet stream that are corrupted. If I change the
> SOCAT block size to something that is Armada 300 (Kirkwood) cache line
> aligned (ie. -b$(((1024*10)+0)) or -b$(((1024*10)+8))), it works just
> fine... If you want to capture the actual file and look at it, you can
> use SOCAT:
>   % socat -u TCP4-LISTEN:4000,fork,reuseaddr OPEN:ActualData.in,creat
> and since the data file is text, it is really easy to see the corruption
> (diff ExpectData.in ActualData.in | less).
> 
> I can disable TSO (ethtool -K eth0 tso off) and re-run the tests and the
> corruption does not occur.
> 
> I will give Ezequiel's latest patches a test today and let you know if
> they change the behavior.
> 

Sigh, this smells like a completely different bug. Which kernel version
are you testing?
-- 
Ezequiel García, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path
  2015-01-22 18:45     ` Ezequiel Garcia
@ 2015-01-22 19:01       ` Dean Gehnert
  0 siblings, 0 replies; 19+ messages in thread
From: Dean Gehnert @ 2015-01-22 19:01 UTC (permalink / raw)
  To: Ezequiel Garcia, Russell King - ARM Linux
  Cc: netdev, David Miller, B38611, fabio.estevam

On 01/22/2015 10:45 AM, Ezequiel Garcia wrote:
> On 01/22/2015 03:41 PM, Dean Gehnert wrote:
>> On 01/21/2015 07:01 AM, Russell King - ARM Linux wrote:
>>> On Wed, Jan 21, 2015 at 09:54:08AM -0300, Ezequiel Garcia wrote:
>>>> These two commits are fixes to the issue reported by Russell King on
>>>> mv643xx_eth. Namely, the introduction of a regression by commit
>>>> 69ad0dd7af22
>>>> which removed the support for highmem skb fragments. The guilty commit
>>>> introduced the assumption of fragment's payload being located in
>>>> lowmem pages.
>>> I do wonder whether 69ad0dd7af22 is the real culprit, or whether there is
>>> some other change in the netdev layer that we're missing.  That commit is
>>> in 3.16, but from what I remember, 3.17 works fine, it's 3.18 which
>>> fails.
>>>
>>>> A similar pattern can be found in the original mvneta driver (in
>>>> fact, the
>>>> regression was introduced by copy-pasting the mvneta code).
>>>>
>>>> These fixes are for the non-TSO egress path in mvneta and mv643xx_eth
>>>> drivers.
>>>> The TSO path needs a more intrusive change, as the TSO API needs to
>>>> be fixed
>>>> (e.g. to make it work in skb fragments, instead of pointers to data).
>>>>
>>>> Russell, as I'm still unable to reproduce this, do you think you can
>>>> give it a spin over there?
>>> Sure - I think the only one I can test is mv643xx_eth, I don't think I
>>> have any device which supports mv_neta.
>>>
>>> The test scenario is for a NFS mount (the Marvell device as the NFS
>>> client) over IPv6.
>>>
>>> Initial testing looks good, I'll let it run for a while with various
>>> builds on the NFS share (which iirc was one of the triggering
>>> workloads).
>>>
>>> Thanks.
>>>
>> FYI, I found a way to reproduce the mv643xx_eth transmit corruption
>> without using a network filesystem by using SOCAT (should also be able
>> to use NETCAT or NC) and I have a bit more information about the
>> corruption that looks like it is somehow related to the cache line size.
>>
>> 1) Create a "large" input file with known data on the target (saved to
>> RAM disk or other storage):
>>      % php -r 'for ($x = 0; $x < 0x2000000; $x++) { printf("%08X\n", $x);
>> }' > ExpectData.in
>>        or
>>      % perl -e 'for ($x = 0; $x < 0x2000000; $x++) { printf("%08X\n",
>> $x); }' > ExpectData.in
>>      % md5sum ExpectData.in
>>      4a4727232209b85badc1ca25ed4df222  ExpectData.in
>> 2) Start SOCAT on the host system to perform Ethernet receive MD5
>> checksum of the data:
>>      % socat -s -u TCP4-LISTEN:4000,fork,reuseaddr EXEC:md5sum
>> 3) Enable TSO on the target:
>>      % ethtool -K eth0 tso on
>> 4)  Send the data file from the target to the host using SOCAT with a
>> non-cache aligned block size:
>>      % socat -b$(((1024*10)+1)) -u ExpectData.in TCP:192.168.1.212:4000
>> 5) The SOCAT running on the host system will report the MD5 checksum. If
>> the MD5 is correct, it should be 4a4727232209b85badc1ca25ed4df222.
>>
>> What I am seeing is every now and then, there are 32-bits (4 bytes) of
>> data in the transmit Ethernet stream that are corrupted. If I change the
>> SOCAT block size to something that is Armada 300 (Kirkwood) cache line
>> aligned (ie. -b$(((1024*10)+0)) or -b$(((1024*10)+8))), it works just
>> fine... If you want to capture the actual file and look at it, you can
>> use SOCAT:
>>    % socat -u TCP4-LISTEN:4000,fork,reuseaddr OPEN:ActualData.in,creat
>> and since the data file is text, it is really easy to see the corruption
>> (diff ExpectData.in ActualData.in | less).
>>
>> I can disable TSO (ethtool -K eth0 tso off) and re-run the tests and the
>> corruption does not occur.
>>
>> I will give Ezequiel's latest patches a test today and let you know if
>> they change the behavior.
>>
> Sigh, this smells like a completely different bug. Which kernel version
> are you testing?
This was tested on v3.16.7 with all the mv643xx_eth driver patches up to
tip of tree applied. We are still not device tree enabled, so v3.16.7
is where we are stuck until I get the Armada XP and MV-EBU enabled.

BTW, I did check and I already have all the latest NetDev patches for 
the driver and the problem still occurs...

It almost feels like a low-level DMA issue is the root cause...
A bit off topic for the NetDev discussion, but we have also seen some
odd behavior with the Firewire OHCI driver with respect to the Kirkwood
platform and DMA... Not sure whether they are related problems or not.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path
  2015-01-22 18:41   ` Dean Gehnert
  2015-01-22 18:45     ` Ezequiel Garcia
@ 2015-01-22 21:09     ` Russell King - ARM Linux
  2015-01-22 21:27       ` Dean Gehnert
  1 sibling, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2015-01-22 21:09 UTC (permalink / raw)
  To: Dean Gehnert; +Cc: Ezequiel Garcia, netdev, David Miller, B38611, fabio.estevam

On Thu, Jan 22, 2015 at 10:41:00AM -0800, Dean Gehnert wrote:
> FYI, I found a way to reproduce the mv643xx_eth transmit corruption without
> using a network filesystem by using SOCAT (should also be able to use NETCAT
> or NC) and I have a bit more information about the corruption that looks
> like it is somehow related to the cache line size.

That's not quite what I'm seeing.  What I'm seeing with NFS is that the
machine is basically unusable.  I have the etna_viv source in a NFS
share (it's shared amongst not only the Dove box but also my collection
of iMX6 based hardware.)

I'm fairly fully IPv6 enabled here, which includes NFS.

On the Dove, if I try to build this without any fixes, and then try to
build the etna_viv sources, it will take the machine out to the extent
that I have to reboot it - either the machine will freeze solidly, or
the kernel will oops in the DMA API functions, in a path which was
called from an interrupt handler.  That takes out the entire machine
because we miss acknowledging the interrupt.

Either way, it's effectively a power cycle as there's no reset button on
the machine.

I have yet to see any sign of data corruption.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path
  2015-01-22 21:09     ` Russell King - ARM Linux
@ 2015-01-22 21:27       ` Dean Gehnert
  2015-01-22 21:49         ` Russell King - ARM Linux
  0 siblings, 1 reply; 19+ messages in thread
From: Dean Gehnert @ 2015-01-22 21:27 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Ezequiel Garcia, netdev, David Miller, B38611, fabio.estevam

On 01/22/2015 01:09 PM, Russell King - ARM Linux wrote:
> On Thu, Jan 22, 2015 at 10:41:00AM -0800, Dean Gehnert wrote:
>> FYI, I found a way to reproduce the mv643xx_eth transmit corruption without
>> using a network filesystem by using SOCAT (should also be able to use NETCAT
>> or NC) and I have a bit more information about the corruption that looks
>> like it is somehow related to the cache line size.
> That's not quite what I'm seeing.  What I'm seeing with NFS is that the
> machine is basically unusable.  I have the etna_viv source in a NFS
> share (it's shared amongst not only the Dove box but also my collection
> of iMX6 based hardware.)
>
> I'm fairly fully IPv6 enabled here, which includes NFS.
>
> On the Dove, if I try to build this without any fixes, and then try to
> build the etna_viv sources, it will take the machine out to the extent
> that I have to reboot it - either the machine will freeze solidly, or
> the kernel will oops in the DMA API functions, in a path which was
> called from an interrupt handler.  That takes out the entire machine
> because we miss acknowledging the interrupt.
I am wondering whether the root cause of this could be in the arch DMA
layer... From my testing with SOCAT and different cache line alignments,
I am seeing 4-byte corruptions in the Ethernet transmit data. My fear is
that this may not be restricted to Ethernet transmit and that the root
cause may be a DMA / cache issue... I have no way to prove that theory.
Your DMA API oops is a bit concerning: maybe there is some corruption
going on during the DMA operation.
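
For reference, the arithmetic behind the three block sizes in the SOCAT
test can be sketched in C. The helper name is made up for illustration,
and the 32-byte cache line size used in the note below is an assumption
that is not stated anywhere in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes by which block_size overshoots
 * the previous multiple of align. Zero means the size is aligned.
 */
size_t misalignment(size_t block_size, size_t align)
{
	assert(align > 0);
	return block_size % align;
}
```

With an assumed 32-byte line, 1024*10 is aligned, (1024*10)+1 is off by
one byte, and (1024*10)+8 is off by eight bytes, although it is a
multiple of 8. If the line really is 32 bytes, the +8 case passing might
point at an 8-byte alignment rather than a full cache line, but that is
only a guess.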
>
> Either way, it's effectively a power cycle as there's no reset button on
> the machine.
>
> I have yet to see any sign of data corruption.
>
Can you try the SOCAT test on your Dove platform and see if it passes
the non-cache-line-aligned test case? I think what the SOCAT test does is
take the NFS "variable" out of the equation. My theory is that if there is
DMA corruption, then it is hard to tell what kinds of problems will occur.
It might be that the payload of a file is corrupted, or, if the NFS
structures are corrupted, it could manifest itself as a problem in the NFS
code.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path
  2015-01-22 21:27       ` Dean Gehnert
@ 2015-01-22 21:49         ` Russell King - ARM Linux
  2015-01-22 23:06           ` Russell King - ARM Linux
  2015-01-22 23:08           ` Dean Gehnert
  0 siblings, 2 replies; 19+ messages in thread
From: Russell King - ARM Linux @ 2015-01-22 21:49 UTC (permalink / raw)
  To: Dean Gehnert; +Cc: Ezequiel Garcia, netdev, David Miller, B38611, fabio.estevam

On Thu, Jan 22, 2015 at 01:27:31PM -0800, Dean Gehnert wrote:
> On 01/22/2015 01:09 PM, Russell King - ARM Linux wrote:
> >On Thu, Jan 22, 2015 at 10:41:00AM -0800, Dean Gehnert wrote:
> >>FYI, I found a way to reproduce the mv643xx_eth transmit corruption without
> >>using a network filesystem by using SOCAT (should also be able to use NETCAT
> >>or NC) and I have a bit more information about the corruption that looks
> >>like it is somehow related to the cache line size.
> >That's not quite what I'm seeing.  What I'm seeing with NFS is that the
> >machine is basically unusable.  I have the etna_viv source in a NFS
> >share (it's shared amongst not only the Dove box but also my collection
> >of iMX6 based hardware.)
> >
> >I'm fairly fully IPv6 enabled here, which includes NFS.
> >
> >On the Dove, if I try to build this without any fixes, and then try to
> >build the etna_viv sources, it will take the machine out to the extent
> >that I have to reboot it - either the machine will freeze solidly, or
> >the kernel will oops in the DMA API functions, in a path which was
> >called from an interrupt handler.  That takes out the entire machine
> >because we miss acknowledging the interrupt.
> 
> I am wondering if there is a possibility of the root cause of this being in
> the arch DMA layer... From my testing with SOCAT and different cache line
> alignments, I am seeing Ethernet 4 byte transmit corruptions. My fear is
> this may not be restricted to the Ethernet transmit and maybe the root cause
> is a DMA / cache issue... I have no way to prove that theory. Your DMA API
> oops is a bit concerning that maybe there is some corruption going on during
> DMA operation.

We're careful in the arch code to do the best we can in all cases; that's
not to say that drivers aren't buggy (in that, they don't respect the DMA
API rules) but what I can say is that the ARM arch code gets it right.

Provided the ethernet driver maps the DMA buffer with DMA_TO_DEVICE prior
to the transfer being initiated, transfers _from_ the Marvell platform(s)
should be fine.

Provided the ethernet driver maps the DMA buffer with DMA_FROM_DEVICE
prior to handing it to the device, and then does not write to any cache
line associated with that DMA buffer before the ethernet driver has
completed, and then unmaps it with DMA_FROM_DEVICE, then again,
everything should be fine.

(The detail above "does not write to any cache line associated with
the DMA buffer" is subtle; what it means is that if the DMA buffer is
not aligned to a cache line, then nothing must write to the cache lines
which overlap the buffer, otherwise data corruption will occur.)
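
To make the overlap rule concrete, here is a small user-space sketch in
C. It is not code from the driver or from the arch layer: both function
names are invented for illustration, and the 32-byte cache line size is
an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed L1 cache line size in bytes; not stated in this thread. */
#define CACHE_LINE_SIZE 32u

/* Hypothetical helper: true if [addr, addr + len) starts and ends on a
 * cache line boundary, i.e. it shares no cache line with other data.
 */
bool dma_buf_owns_its_cache_lines(uintptr_t addr, size_t len)
{
	assert(len > 0);
	return (addr % CACHE_LINE_SIZE) == 0 &&
	       ((addr + len) % CACHE_LINE_SIZE) == 0;
}

/* Hypothetical helper: number of cache lines the buffer touches. */
size_t dma_buf_cache_lines(uintptr_t addr, size_t len)
{
	uintptr_t first, last;

	assert(len > 0);
	first = addr / CACHE_LINE_SIZE;
	last = (addr + len - 1) / CACHE_LINE_SIZE;
	return (size_t)(last - first + 1);
}
```

A 64-byte buffer at an aligned address covers exactly two lines and
shares none of them. Move it four bytes forward and it touches three
lines, and the first and last of those are also home to neighbouring
data that must not be written while the mapping is live.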

> Can you try the SOCAT test on your Dove platform and see if that passes
> the non-cache line aligned test case? I think what the SOCAT test does is
> take the NFS "variable" out of the equation. My theory is that if there is a
> DMA corruption, then hard telling what kinds of problems will occur. It
> might be the payload of a file is corrupted, or if the NFS structures are
> corrupted, it could manifest itself as a problem in the NFS code.

This is one of the problems of having the TCP/UDP checksums offloaded to
the adapter - if the data is cocked up at the DMA stage, these checksums
won't detect it.

Anyway, I'm running the test now, but I had to change the socat line to:

# socat -b$(((1024*10)+1)) -u open:ExpectData.in TCP:192.168.1.212:4000

The receiving end is getting:

4a4727232209b85badc1ca25ed4df222  -
4a4727232209b85badc1ca25ed4df222  -
4a4727232209b85badc1ca25ed4df222  -
4a4727232209b85badc1ca25ed4df222  -
4a4727232209b85badc1ca25ed4df222  -
...

and I'm up to over 24 of these without any problem being visible - how
long does it take to show?

For reference, the features on my Dove box are:

Features for eth0:
rx-checksumming: on
tx-checksumming: on
        tx-checksum-ipv4: on
        tx-checksum-ip-generic: off [fixed]
        tx-checksum-ipv6: off [fixed]
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp6-segmentation: off [fixed]
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: off [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-mpls-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]


-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path
  2015-01-22 21:49         ` Russell King - ARM Linux
@ 2015-01-22 23:06           ` Russell King - ARM Linux
  2015-01-22 23:09             ` Dean Gehnert
  2015-01-22 23:08           ` Dean Gehnert
  1 sibling, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2015-01-22 23:06 UTC (permalink / raw)
  To: Dean Gehnert; +Cc: Ezequiel Garcia, netdev, David Miller, B38611, fabio.estevam

On Thu, Jan 22, 2015 at 09:49:10PM +0000, Russell King - ARM Linux wrote:
> On Thu, Jan 22, 2015 at 01:27:31PM -0800, Dean Gehnert wrote:
> > Can you try the SOCAT test on your Dove platform and see if that passes
> > the non-cache line aligned test case? I think what the SOCAT test does is
> > take the NFS "variable" out of the equation. My theory is that if there is a
> > DMA corruption, then hard telling what kinds of problems will occur. It
> > might be the payload of a file is corrupted, or if the NFS structures are
> > corrupted, it could manifest itself as a problem in the NFS code.
> 
> Anyway, I'm running the test now, but I had to change the socat line to:
> 
> # socat -b$(((1024*10)+1)) -u open:ExpectData.in TCP:192.168.1.212:4000
> 
> The receiving end is getting:
> 
> 4a4727232209b85badc1ca25ed4df222  -
> 4a4727232209b85badc1ca25ed4df222  -
> 4a4727232209b85badc1ca25ed4df222  -
> 4a4727232209b85badc1ca25ed4df222  -
> 4a4727232209b85badc1ca25ed4df222  -
> ...

It's been running for about an hour now with no sign of any problem.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path
  2015-01-22 21:49         ` Russell King - ARM Linux
  2015-01-22 23:06           ` Russell King - ARM Linux
@ 2015-01-22 23:08           ` Dean Gehnert
  1 sibling, 0 replies; 19+ messages in thread
From: Dean Gehnert @ 2015-01-22 23:08 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Ezequiel Garcia, netdev, David Miller, B38611, fabio.estevam

On 01/22/2015 01:49 PM, Russell King - ARM Linux wrote:
> On Thu, Jan 22, 2015 at 01:27:31PM -0800, Dean Gehnert wrote:
>> On 01/22/2015 01:09 PM, Russell King - ARM Linux wrote:
>>> On Thu, Jan 22, 2015 at 10:41:00AM -0800, Dean Gehnert wrote:
>>>> FYI, I found a way to reproduce the mv643xx_eth transmit corruption without
>>>> using a network filesystem by using SOCAT (should also be able to use NETCAT
>>>> or NC) and I have a bit more information about the corruption that looks
>>>> like it is somehow related to the cache line size.
>>> That's not quite what I'm seeing.  What I'm seeing with NFS is that the
>>> machine is basically unusable.  I have the etna_viv source in a NFS
>>> share (it's shared amongst not only the Dove box but also my collection
>>> of iMX6 based hardware.)
>>>
>>> I'm fairly fully IPv6 enabled here, which includes NFS.
>>>
>>> On the Dove, if I try to build this without any fixes, and then try to
>>> build the etna_viv sources, it will take the machine out to the extent
>>> that I have to reboot it - either the machine will freeze solidly, or
>>> the kernel will oops in the DMA API functions, in a path which was
>>> called from an interrupt handler.  That takes out the entire machine
>>> because we miss acknowledging the interrupt.
>> I am wondering if there is a possibility of the root cause of this being in
>> the arch DMA layer... From my testing with SOCAT and different cache line
>> alignments, I am seeing Ethernet 4 byte transmit corruptions. My fear is
>> this may not be restricted to the Ethernet transmit and maybe the root cause
>> is a DMA / cache issue... I have no way to prove that theory. Your DMA API
>> oops is a bit concerning that maybe there is some corruption going on during
>> DMA operation.
> We're careful in the arch code to do the best we can in all cases; that's
> not to say that drivers aren't buggy (in that, they don't respect the DMA
> API rules) but what I can say is that the ARM arch code gets it right.
Agreed. I have not seen problems like this before on other ARM 
implementations.
>
> Provided the ethernet driver maps the DMA buffer with DMA_TO_DEVICE prior
> to the transfer being initiated, transfers _from_ the Marvell platform(s)
> should be fine.
>
> Provided the ethernet driver maps the DMA buffer with DMA_FROM_DEVICE
> prior to handing it to the device, and then does not write to any cache
> line associated with that DMA buffer before the ethernet driver has
> completed, and then unmaps it with DMA_FROM_DEVICE, then again,
> everything should be fine.
>
> (The detail above "does not write to any cache line associated with
> the DMA buffer" is subtle; what it means is that if the DMA buffer is
> not aligned to a cache line, then nothing must write to the cache lines
> which overlap the buffer, otherwise data corruption will occur.)
I wonder if that is a clue for me to chase... The cache line should be 
completely flushed to hardware before the DMA operation is started. The 
DMA mapping routines should be making sure all the buffers associated 
with the DMA operation are locked down and flushed before completing the 
DMA map operation. However, if there is other code that was modifying 
the DMA buffers after the lock down and before the DMA has completed and 
the buffers have been un-mapped, that would be bad.
>
>> Can you try the SOCAT test on your Dove platform and see if that passes
>> the non-cache line aligned test case? I think what the SOCAT test does is
>> take the NFS "variable" out of the equation. My theory is that if there is a
>> DMA corruption, then hard telling what kinds of problems will occur. It
>> might be the payload of a file is corrupted, or if the NFS structures are
>> corrupted, it could manifest itself as a problem in the NFS code.
> This is one of the problems of having the TCP/UDP checksums offloaded to
> the adapter - if the data is cocked up at the DMA stage, these checksums
> won't detect it.
I am going to noodle a bit on a way to check whether the buffer has
changed between the DMA map and un-map calls... I might be able to add
some code to checksum the buffer at both of those calls. If the checksum
changes, that would indicate that someone is changing the buffer.
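
Roughly what I have in mind, as a user-space sketch in C rather than
actual driver code. The struct and function names are hypothetical, and
the FNV-1a hash is just one convenient choice; any checksum that reacts
to a changed byte would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one mapped transmit buffer. */
struct tx_buf_check {
	const uint8_t *data;
	size_t len;
	uint32_t sum_at_map;
};

/* 32-bit FNV-1a hash over the buffer contents. */
uint32_t buf_checksum(const uint8_t *data, size_t len)
{
	uint32_t hash = 2166136261u;
	size_t i;

	for (i = 0; i < len; i++) {
		hash ^= data[i];
		hash *= 16777619u;
	}
	return hash;
}

/* Record the checksum at the point where the buffer would be mapped. */
void tx_buf_check_at_map(struct tx_buf_check *chk,
			 const uint8_t *data, size_t len)
{
	assert(chk != NULL && data != NULL);
	chk->data = data;
	chk->len = len;
	chk->sum_at_map = buf_checksum(data, len);
}

/* Recompute at the point where the buffer would be unmapped; false
 * means the contents changed while the mapping was live.
 */
bool tx_buf_unchanged_at_unmap(const struct tx_buf_check *chk)
{
	assert(chk != NULL && chk->data != NULL);
	return buf_checksum(chk->data, chk->len) == chk->sum_at_map;
}
```

In the driver the two calls would sit next to the DMA map and un-map of
each transmit buffer. This only catches changes that are visible to the
CPU, so it would not show a problem that happens purely on the device
side.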
>
> Anyway, I'm running the test now, but I had to change the socat line to:
>
> # socat -b$(((1024*10)+1)) -u open:ExpectData.in TCP:192.168.1.212:4000
>
> The receiving end is getting:
>
> 4a4727232209b85badc1ca25ed4df222  -
> 4a4727232209b85badc1ca25ed4df222  -
> 4a4727232209b85badc1ca25ed4df222  -
> 4a4727232209b85badc1ca25ed4df222  -
> 4a4727232209b85badc1ca25ed4df222  -
> ...
>
> and I'm up to over 24 of these without any problem being visible - how
> long does it take to show?
It should show up in the 1st or 2nd iteration and in all following ones.
For smaller files it seems to work for a while, but the 256MB file
stresses the system enough that the corruption is all but guaranteed to
occur. It looks like the Dove is working correctly. You have TSO enabled,
a large buffer, etc., so your results look good.

Refresh my memory... What version of Marvell Armada is the Dove? I was 
thinking the Dove was later than the Kirkwood and Armada 300 and was 
maybe an early Armada 370 or ???...
>
> For reference, the features on my Dove box are:
>
> Features for eth0:
> rx-checksumming: on
> tx-checksumming: on
>          tx-checksum-ipv4: on
>          tx-checksum-ip-generic: off [fixed]
>          tx-checksum-ipv6: off [fixed]
>          tx-checksum-fcoe-crc: off [fixed]
>          tx-checksum-sctp: off [fixed]
> scatter-gather: on
>          tx-scatter-gather: on
>          tx-scatter-gather-fraglist: off [fixed]
> tcp-segmentation-offload: on
>          tx-tcp-segmentation: on
>          tx-tcp-ecn-segmentation: off [fixed]
>          tx-tcp6-segmentation: off [fixed]
> udp-fragmentation-offload: off [fixed]
> generic-segmentation-offload: on
> generic-receive-offload: on
> large-receive-offload: off [fixed]
> rx-vlan-offload: off [fixed]
> tx-vlan-offload: off [fixed]
> ntuple-filters: off [fixed]
> receive-hashing: off [fixed]
> highdma: off [fixed]
> rx-vlan-filter: off [fixed]
> vlan-challenged: off [fixed]
> tx-lockless: off [fixed]
> netns-local: off [fixed]
> tx-gso-robust: off [fixed]
> tx-fcoe-segmentation: off [fixed]
> tx-gre-segmentation: off [fixed]
> tx-ipip-segmentation: off [fixed]
> tx-sit-segmentation: off [fixed]
> tx-udp_tnl-segmentation: off [fixed]
> tx-mpls-segmentation: off [fixed]
> fcoe-mtu: off [fixed]
> tx-nocache-copy: off
> loopback: off [fixed]
> rx-fcs: off [fixed]
> rx-all: off [fixed]
> tx-vlan-stag-hw-insert: off [fixed]
> rx-vlan-stag-hw-parse: off [fixed]
> rx-vlan-stag-filter: off [fixed]
> l2-fwd-offload: off [fixed]
> busy-poll: off [fixed]
>
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path
  2015-01-22 23:06           ` Russell King - ARM Linux
@ 2015-01-22 23:09             ` Dean Gehnert
  0 siblings, 0 replies; 19+ messages in thread
From: Dean Gehnert @ 2015-01-22 23:09 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Ezequiel Garcia, netdev, David Miller, B38611, fabio.estevam

On 01/22/2015 03:06 PM, Russell King - ARM Linux wrote:
> On Thu, Jan 22, 2015 at 09:49:10PM +0000, Russell King - ARM Linux wrote:
>> On Thu, Jan 22, 2015 at 01:27:31PM -0800, Dean Gehnert wrote:
>>> Can you try the SOCAT test on your Dove platform and see if that passes
>>> the non-cache line aligned test case? I think what the SOCAT test does is
>>> take the NFS "variable" out of the equation. My theory is that if there is a
>>> DMA corruption, then hard telling what kinds of problems will occur. It
>>> might be the payload of a file is corrupted, or if the NFS structures are
>>> corrupted, it could manifest itself as a problem in the NFS code.
>> Anyway, I'm running the test now, but I had to change the socat line to:
>>
>> # socat -b$(((1024*10)+1)) -u open:ExpectData.in TCP:192.168.1.212:4000
>>
>> The receiving end is getting:
>>
>> 4a4727232209b85badc1ca25ed4df222  -
>> 4a4727232209b85badc1ca25ed4df222  -
>> 4a4727232209b85badc1ca25ed4df222  -
>> 4a4727232209b85badc1ca25ed4df222  -
>> 4a4727232209b85badc1ca25ed4df222  -
>> ...
> It's been running for about an hour now with no sign of any problem.
>
It should show the problem in the 1st or 2nd iteration and then continue 
after that... I don't think your Dove shows the problem.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] net: mvneta: Fix highmem support in the non-TSO egress path
  2015-01-21 12:54 ` [PATCH 1/2] net: mvneta: Fix highmem support in the non-TSO egress path Ezequiel Garcia
@ 2015-01-26 22:40   ` David Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2015-01-26 22:40 UTC (permalink / raw)
  To: ezequiel.garcia; +Cc: netdev, linux, B38611, fabio.estevam

From: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Date: Wed, 21 Jan 2015 09:54:09 -0300

> +		if (!IS_TSO_HEADER(txq, tx_desc->buf_phys_addr)) {
> +
> +			/* The first descriptor is either a TSO header or
> +			 * the linear part of the skb.
> +			 */

This empty line is unnecessary and inappropriate, please remove it.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] net: mv643xx_eth: Fix highmem support in non-TSO egress path
  2015-01-21 12:54 ` [PATCH 2/2] net: mv643xx_eth: Fix highmem support in " Ezequiel Garcia
  2015-01-21 17:40   ` Russell King - ARM Linux
@ 2015-01-26 22:40   ` David Miller
  1 sibling, 0 replies; 19+ messages in thread
From: David Miller @ 2015-01-26 22:40 UTC (permalink / raw)
  To: ezequiel.garcia; +Cc: netdev, linux, B38611, fabio.estevam

From: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Date: Wed, 21 Jan 2015 09:54:10 -0300

> +		if (!IS_TSO_HEADER(txq, desc->buf_ptr)) {
> +
> +			/* The first descriptor is either a TSO header or
> +			 * the linear part of the skb.
> +			 */

Similar to the first patch, please remove this empty line.

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread
