From: Russell King - ARM Linux <linux@arm.linux.org.uk>
To: Dean Gehnert <deang@tpi.com>
Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>,
	netdev@vger.kernel.org, David Miller <davem@davemloft.net>,
	B38611@freescale.com, fabio.estevam@freescale.com
Subject: Re: [PATCH net 0/2] net: marvell: Fix highmem support on non-TSO path
Date: Thu, 22 Jan 2015 21:49:11 +0000
Message-ID: <20150122214910.GD26493@n2100.arm.linux.org.uk>
In-Reply-To: <54C16B43.5040504@tpi.com>

On Thu, Jan 22, 2015 at 01:27:31PM -0800, Dean Gehnert wrote:
> On 01/22/2015 01:09 PM, Russell King - ARM Linux wrote:
> >On Thu, Jan 22, 2015 at 10:41:00AM -0800, Dean Gehnert wrote:
> >>FYI, I found a way to reproduce the mv643xx_eth transmit corruption without
> >>using a network filesystem by using SOCAT (should also be able to use NETCAT
> >>or NC) and I have a bit more information about the corruption that looks
> >>like it is somehow related to the cache line size.
> >That's not quite what I'm seeing.  What I'm seeing with NFS is that the
> >machine is basically unusable.  I have the etna_viv source in a NFS
> >share (it's shared amongst not only the Dove box but also my collection
> >of iMX6 based hardware.)
> >
> >I'm fairly fully IPv6 enabled here, which includes NFS.
> >
> >On the Dove, if I try to build this without any fixes, and then try to
> >build the etna_viv sources, it will take the machine out to the extent
> >that I have to reboot it - either the machine will freeze solidly, or
> >the kernel will oops in the DMA API functions, in a path which was
> >called from an interrupt handler.  That takes out the entire machine
> >because we miss acknowledging the interrupt.
> 
> I am wondering if there is a possibility of the root cause of this being in
> the arch DMA layer... From my testing with SOCAT and different cache line
> alignments, I am seeing Ethernet 4 byte transmit corruptions. My fear is
> this may not be restricted to the Ethernet transmit and maybe the root cause
> is a DMA / cache issue... I have no way to prove that theory. Your DMA API
> oops is a bit concerning that maybe there is some corruption going on during
> DMA operation.

We're careful in the arch code to do the best we can in all cases; that's
not to say that drivers aren't buggy (in that they don't respect the DMA
API rules) but what I can say is that the ARM arch code gets it right.

Provided the ethernet driver maps the DMA buffer with DMA_TO_DEVICE prior
to the transfer being initiated, transfers _from_ the Marvell platform(s)
should be fine.

Provided the ethernet driver maps the DMA buffer with DMA_FROM_DEVICE
prior to handing it to the device, and then does not write to any cache
line associated with that DMA buffer before the ethernet driver has
completed, and then unmaps it with DMA_FROM_DEVICE, then again,
everything should be fine.

(The detail above "does not write to any cache line associated with
the DMA buffer" is subtle; what it means is that if the DMA buffer is
not aligned to a cache line, then nothing must write to the cache lines
which overlap the buffer, otherwise data corruption will occur.)
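
A minimal sketch of the cache-line overlap rule described above. The
32-byte line size, the function name and the addresses are assumptions
chosen for illustration; none of them come from the driver or the arch
code. It counts the bytes that lie outside a buffer yet share a cache
line with it, which are the bytes nothing may write while the buffer is
mapped with DMA_FROM_DEVICE.

```python
def unsafe_neighbours(addr, length, line_size=32):
    """Count bytes outside [addr, addr+length) that share a cache line with it.

    Returns (head, tail): bytes before the buffer in its first cache line,
    and bytes after the buffer in its last cache line. line_size=32 is an
    assumed value, not one taken from any particular SoC manual.
    """
    first_line_start = addr - (addr % line_size)
    end = addr + length
    # Round the end of the buffer up to the next cache-line boundary.
    last_line_end = -(-end // line_size) * line_size
    return addr - first_line_start, last_line_end - end


# Hypothetical addresses, for illustration only.
print(unsafe_neighbours(0x1000, 64))  # aligned start and length
print(unsafe_neighbours(0x1004, 64))  # misaligned start, so both ends share lines
```

A result of (0, 0) means the buffer owns its cache lines outright; any
non-zero count marks neighbouring bytes whose modification during the
transfer risks the corruption described above.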

> Can you try the SOCAT test on your Dove platform and see if that passes
> the non-cache line aligned test case? I think what the SOCAT test does is
> take the NFS "variable" out of the equation. My theory is that if there is a
> DMA corruption, then it is hard to tell what kinds of problems will occur. It
> might be the payload of a file is corrupted, or if the NFS structures are
> corrupted, it could manifest itself as a problem in the NFS code.

This is one of the problems of having the TCP/UDP checksums offloaded to
the adapter - if the data is cocked up at the DMA stage, these checksums
won't detect it.
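
To illustrate why offloaded checksums miss this class of corruption,
here is a simplified sketch using the 16-bit one's-complement Internet
checksum (RFC 1071). It is an assumption-laden model: the real TCP
checksum also covers a pseudo-header, and the payload and the 4-byte
corruption pattern are made up for the example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


payload = bytes(range(64))            # what the stack intended to send
on_wire = bytearray(payload)
on_wire[16:20] = b"\xde\xad\xbe\xef"  # hypothetical 4-byte DMA corruption
on_wire = bytes(on_wire)

# Checksum computed in software, before DMA: the receiver sees a mismatch.
sw_csum = internet_checksum(payload)
software_detects = internet_checksum(on_wire) != sw_csum

# Checksum computed by the adapter over the data it actually fetched:
# the receiver's check passes even though the data is wrong.
hw_csum = internet_checksum(on_wire)
offload_detects = internet_checksum(on_wire) != hw_csum

print(software_detects, offload_detects)
```

In this model the adapter checksums whatever it fetched, so the check
at the receiver can only confirm that the wire data matches itself.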

Anyway, I'm running the test now, but I had to change the socat line to:

# socat -b$(((1024*10)+1)) -u open:ExpectData.in TCP:192.168.1.212:4000
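
A short sketch of the arithmetic behind the -b value, assuming a
32-byte cache line (an assumption, not a figure from this thread). It
only shows how successive block offsets within the input file fall
relative to that line size; it does not model where the kernel places
the resulting DMA buffers.

```python
BLOCK = 1024 * 10 + 1  # value the shell computes for $(((1024*10)+1))
LINE = 32              # assumed cache line size in bytes


def block_offsets_mod_line(count, block=BLOCK, line=LINE):
    """Offset of each of the first `count` blocks, modulo the line size."""
    return [(k * block) % line for k in range(count)]


print(BLOCK)
print(block_offsets_mod_line(8))
```

Because the block size is one more than a multiple of the line size,
each successive block starts one byte further into a cache line.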

The receiving end is getting:

4a4727232209b85badc1ca25ed4df222  -
4a4727232209b85badc1ca25ed4df222  -
4a4727232209b85badc1ca25ed4df222  -
4a4727232209b85badc1ca25ed4df222  -
4a4727232209b85badc1ca25ed4df222  -
...

and I'm up to over 24 of these without any problem being visible - how
long does it take to show?
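
A small sketch for watching a long run unattended: it scans md5sum-style
output and reports the first line whose digest differs from the first
one seen. The function name and the idea of collecting the receiver's
output into a list of lines are assumptions for illustration.

```python
def first_mismatch(lines):
    """Return the 1-based number of the first line whose digest differs
    from the first digest seen, or None if every digest matches."""
    expected = None
    for number, line in enumerate(lines, 1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        digest = fields[0]
        if expected is None:
            expected = digest
        elif digest != expected:
            return number
    return None


good = "4a4727232209b85badc1ca25ed4df222  -"
print(first_mismatch([good] * 24))
```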

For reference, the features on my Dove box are:

Features for eth0:
rx-checksumming: on
tx-checksumming: on
        tx-checksum-ipv4: on
        tx-checksum-ip-generic: off [fixed]
        tx-checksum-ipv6: off [fixed]
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp6-segmentation: off [fixed]
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: off [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-mpls-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]


-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
