* [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
@ 2017-04-21 12:39 Rafa Corvillo
  2017-04-24  6:45 ` Rafa Corvillo
  2017-04-25 15:27 ` Stephen Hemminger
  0 siblings, 2 replies; 18+ messages in thread
From: Rafa Corvillo @ 2017-04-21 12:39 UTC (permalink / raw)
  To: netdev

We are working on an ARMv7 embedded system running kernel 4.9 (LEDE build).
It is an imx6 board with 2 ethernet interfaces. One of them is connected to
a Marvell switch.

The schema of the system is the following:

  +-------------------+ eth0
  |                   +--+
  |                   |  |
  | Embedded system   +--+
  |                   |
  |      ARMv7        |
  |                   | Marvell 88E8057(sky2) +-------------+
  |                   +--+ +--+             +--+ eth1
  |                   |  +---------------------+ |             |  +------+
  |                   +--+      CPU port       +--+ mv88e6176  +--+
  +------+--+---------+ |             |
emulated|  | |             |
GPIO    +--+ +--+             +--+ eth2
MDIO      +-----------------------------------+ |             |  +------+
                               MDIO +--+             +--+
+-------------+

There is a bridge (br-lan) which includes eth0/eth1/eth2

If I connect eth1/eth2, the link comes up and I can ping through it.
But once I start sending a heavy traffic load, the link fails and the
kernel prints the following messages:

[   48.557140] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518
[   48.564964] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518
[   48.572110] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518
[   48.579263] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518
[   48.586417] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518
[   48.593573] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518
[   48.600718] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518
[   54.877567] net_ratelimit: 6 callbacks suppressed
[   54.882293] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518
[   61.413552] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518

I have a modified imx6 device tree which includes the mdio and dsa
nodes. This device tree works with kernel 4.1.6, but I know that these
parts of the kernel have changed a lot. The mdio and dsa changes in the
device tree are the following
(diff arch/arm/boot/dts/imx6qdl-gw53xx.dtsi arch/arm/boot/dts/imx6qdl-gw53xx-mdio.dtsi):

16a17,18
 >     can0 = &can1;
 >     ethernet0 = &fec;
21a24
 >     sky2 = &eth1;
24a28,29
 >     usdhc2 = &usdhc3;
 >     mdio-gpio0 = &mdio0;
62a68,125
 >   mdio0: mdio {
 >     compatible = "virtual,mdio-gpio";
 >     #address-cells = <1>;
 >     #size-cells = <0>;
 >     /* MDC = gpio-17, MDIO = gpio-20 */
 >     gpios =  <&gpio1 17 1
 >         &gpio1 20 0>;
 >                 ethernet-phy@0  {
 >           compatible = "marvell,dsa";
 >                 };
 >   };
 >
 >   dsa {
 >     compatible = "marvell,dsa";
 >     #address-cells = <2>;
 >     #size-cells = <0>;
 >
 >     interrupts = <10>;
 >     dsa,ethernet = <&eth1>;
 >     dsa,mii-bus = <&mdio0>;
 >
 >     switch@0 {
 >       #address-cells = <1>;
 >       #size-cells = <0>;
 >       reg = <0 0>;  /* MDIO address 0, switch 0 in tree */
 >
 >       port@0 {
 >         reg = <0>;
 >         label = "cpu";
 >       };
 >
 >       port@1 {
 >         reg = <1>;
 >         label = "eth1";
 >       };
 >
 >       port@2 {
 >         reg = <2>;
 >         label = "eth2";
 >       };
 >
 >       port@3 {
 >         reg = <3>;
 >         label = "eth3";
 >       };
 >
 >       port@4 {
 >         reg = <4>;
 >         label = "eth4";
 >       };
 >     };
 >   };
 >
361a425,430
 > &mdio0 {
 >   pinctrl-names = "default";
 >   pinctrl-0 = <&pinctrl_mdio>;
 >   status = "okay";
 > };
 >
363c432
<   imx6qdl-gw53xx {
---
 >   imx6qdl-gw53xx-mdio {
448a518,524
 >       >;
 >     };
 >
 >     pinctrl_mdio: mdiogrp {
 >       fsl,pins = <
 >         MX6QDL_PAD_SD1_DAT1__GPIO1_IO17   0x1b0b9
 >         MX6QDL_PAD_SD1_CLK__GPIO1_IO20    0x1b0b9

Do you know of any possible reason why this could be happening?

Thanks in advance.

Rafa


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-04-21 12:39 [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176 Rafa Corvillo
@ 2017-04-24  6:45 ` Rafa Corvillo
  2017-04-25 15:27 ` Stephen Hemminger
  1 sibling, 0 replies; 18+ messages in thread
From: Rafa Corvillo @ 2017-04-24  6:45 UTC (permalink / raw)
  To: netdev

I resend the mail with the schema fixed. Sorry for the inconvenience.

We are working on an ARMv7 embedded system running kernel 4.9 (LEDE build).
It is an imx6 board with 2 ethernet interfaces. One of them is connected to
a Marvell switch.

The schema of the system is the following:

   +-------------------+ eth0
   |                   +--+
   |                   |  |
   | Embedded system   +--+
   |                   |
   |      ARMv7        |
   |                   | Marvell 88E8057(sky2)     +-------------+
   |                   +--+                     +--+             +--+ eth1
   |                   |  +---------------------+  |             |  +------+
   |                   +--+      CPU port       +--+ mv88e6176   +--+
   +------+--+---------+                           |             |
emulated  |  |                                     |             |
GPIO      +--+                                  +--+             +--+ eth2
MDIO       +-----------------------------------+   |             |  +------+
                                MDIO            +--+             +--+
                                                   +-------------+

There is a bridge (br-lan) which includes eth0/eth1/eth2

If I connect eth1/eth2, the link comes up and I can ping through it.
But once I start sending a heavy traffic load, the link fails and the
kernel prints the following messages:

[   48.557140] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518
[   48.564964] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518
[   48.572110] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518
[   48.579263] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518
[   48.586417] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518
[   48.593573] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518
[   48.600718] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518
[   54.877567] net_ratelimit: 6 callbacks suppressed
[   54.882293] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518
[   61.413552] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 length 1518

I have a modified imx6 device tree which includes the mdio and dsa
nodes. This device tree works with kernel 4.1.6, but I know that these
parts of the kernel have changed a lot. The mdio and dsa changes in the
device tree are the following
(diff arch/arm/boot/dts/imx6qdl-gw53xx.dtsi arch/arm/boot/dts/imx6qdl-gw53xx-mdio.dtsi):

16a17,18
 >     can0 = &can1;
 >     ethernet0 = &fec;
21a24
 >     sky2 = &eth1;
24a28,29
 >     usdhc2 = &usdhc3;
 >     mdio-gpio0 = &mdio0;
62a68,125
 >   mdio0: mdio {
 >     compatible = "virtual,mdio-gpio";
 >     #address-cells = <1>;
 >     #size-cells = <0>;
 >     /* MDC = gpio-17, MDIO = gpio-20 */
 >     gpios =  <&gpio1 17 1
 >         &gpio1 20 0>;
 >                 ethernet-phy@0  {
 >           compatible = "marvell,dsa";
 >                 };
 >   };
 >
 >   dsa {
 >     compatible = "marvell,dsa";
 >     #address-cells = <2>;
 >     #size-cells = <0>;
 >
 >     interrupts = <10>;
 >     dsa,ethernet = <&eth1>;
 >     dsa,mii-bus = <&mdio0>;
 >
 >     switch@0 {
 >       #address-cells = <1>;
 >       #size-cells = <0>;
 >       reg = <0 0>;  /* MDIO address 0, switch 0 in tree */
 >
 >       port@0 {
 >         reg = <0>;
 >         label = "cpu";
 >       };
 >
 >       port@1 {
 >         reg = <1>;
 >         label = "eth1";
 >       };
 >
 >       port@2 {
 >         reg = <2>;
 >         label = "eth2";
 >       };
 >
 >       port@3 {
 >         reg = <3>;
 >         label = "eth3";
 >       };
 >
 >       port@4 {
 >         reg = <4>;
 >         label = "eth4";
 >       };
 >     };
 >   };
 >
361a425,430
 > &mdio0 {
 >   pinctrl-names = "default";
 >   pinctrl-0 = <&pinctrl_mdio>;
 >   status = "okay";
 > };
 >
363c432
<   imx6qdl-gw53xx {
---
 >   imx6qdl-gw53xx-mdio {
448a518,524
 >       >;
 >     };
 >
 >     pinctrl_mdio: mdiogrp {
 >       fsl,pins = <
 >         MX6QDL_PAD_SD1_DAT1__GPIO1_IO17   0x1b0b9
 >         MX6QDL_PAD_SD1_CLK__GPIO1_IO20    0x1b0b9

Do you know of any possible reason why this could be happening?

Thanks in advance.

Rafa


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-04-21 12:39 [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176 Rafa Corvillo
  2017-04-24  6:45 ` Rafa Corvillo
@ 2017-04-25 15:27 ` Stephen Hemminger
  2017-04-27 12:05   ` Rafa Corvillo
  1 sibling, 1 reply; 18+ messages in thread
From: Stephen Hemminger @ 2017-04-25 15:27 UTC (permalink / raw)
  To: Rafa Corvillo; +Cc: netdev

On Fri, 21 Apr 2017 14:39:00 +0200
Rafa Corvillo <rafael.corvillo@aoifes.com> wrote:

> We are working in an ARMv7 embedded system running kernel 4.9 (LEDE build).
> It is an imx6 board with 2 ethernet interfaces. One of them is connected to
> a Marvell switch.
> 
> The schema of the system is the following:
> 
>   +-------------------+ eth0
>   |                   +--+
>   |                   |  |
>   | Embedded system   +--+
>   |                   |
>   |      ARMv7        |
>   |                   | Marvell 88E8057(sky2) +-------------+
>   |                   +--+ +--+             +--+ eth1
>   |                   |  +---------------------+ |             |  +------+
>   |                   +--+      CPU port       +--+ mv88e6176  +--+
>   +------+--+---------+ |             |
> emulated|  | |             |
> GPIO    +--+ +--+             +--+ eth2
> MDIO      +-----------------------------------+ |             |  +------+
>                                MDIO +--+             +--+
> +-------------+
> 
> There is a bridge (br-lan) which includes eth0/eth1/eth2
> 
> If I connect the eth1/eth2, the link is up and I can do ping through it. 
> But, once
> I start sending a heavy traffic load the link fails and the kernel sends the
> following messages:
> 
> [   48.557140] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518
> [   48.564964] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518
> [   48.572110] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518
> [   48.579263] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518
> [   48.586417] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518
> [   48.593573] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518
> [   48.600718] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518
> [   54.877567] net_ratelimit: 6 callbacks suppressed
> [   54.882293] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518
> [   61.413552] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010 
> length 1518

The status error bits are in sky2.h
0x5f20010 is
     05f2 frame length => 1522 
     0010 Too long err

That means the packet was longer than the configured MTU.
You are probably getting packets with VLAN tag but have not configured
a VLAN.
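
The decode above can be reproduced with a few lines of C. This is only a
sketch of the bit arithmetic, assuming the layout given above (frame
length in the upper 16 bits, "too long" flag at 0x0010). The macro and
function names are made up for illustration and are not the identifiers
used in sky2.h.

```c
#include <stdint.h>

/*
 * Field layout as decoded in the message above. The names are
 * illustrative only, not the ones defined in sky2.h.
 */
#define RX_STATUS_LEN_SHIFT 16       /* frame length lives in the upper 16 bits */
#define RX_STATUS_TOO_LONG  0x0010u  /* "Too long err" flag */

/* Extract the frame length reported in the status word. */
static unsigned rx_status_frame_len(uint32_t status)
{
	return status >> RX_STATUS_LEN_SHIFT;
}

/* Return 1 if the "too long" flag is set, 0 otherwise. */
static int rx_status_too_long(uint32_t status)
{
	return (status & RX_STATUS_TOO_LONG) != 0;
}
```

For the value in the log, rx_status_frame_len(0x5f20010) gives 1522 and
rx_status_too_long(0x5f20010) gives 1.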


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-04-25 15:27 ` Stephen Hemminger
@ 2017-04-27 12:05   ` Rafa Corvillo
  2017-04-27 13:04     ` Andrew Lunn
  0 siblings, 1 reply; 18+ messages in thread
From: Rafa Corvillo @ 2017-04-27 12:05 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

On 25/04/17 17:27, Stephen Hemminger wrote:
> On Fri, 21 Apr 2017 14:39:00 +0200
> Rafa Corvillo <rafael.corvillo@aoifes.com> wrote:
>
>> We are working in an ARMv7 embedded system running kernel 4.9 (LEDE build).
>> It is an imx6 board with 2 ethernet interfaces. One of them is connected to
>> a Marvell switch.
>>
>> The schema of the system is the following:
>>
>>    +-------------------+ eth0
>>    |                   +--+
>>    |                   |  |
>>    | Embedded system   +--+
>>    |                   |
>>    |      ARMv7        |
>>    |                   | Marvell 88E8057(sky2) +-------------+
>>    |                   +--+ +--+             +--+ eth1
>>    |                   |  +---------------------+ |             |  +------+
>>    |                   +--+      CPU port       +--+ mv88e6176  +--+
>>    +------+--+---------+ |             |
>> emulated|  | |             |
>> GPIO    +--+ +--+             +--+ eth2
>> MDIO      +-----------------------------------+ |             |  +------+
>>                                 MDIO +--+             +--+
>> +-------------+
>>
>> There is a bridge (br-lan) which includes eth0/eth1/eth2
>>
>> If I connect the eth1/eth2, the link is up and I can do ping through it.
>> But, once
>> I start sending a heavy traffic load the link fails and the kernel sends the
>> following messages:
>>
>> [   48.557140] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>> length 1518
>> [   48.564964] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>> length 1518
>> [   48.572110] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>> length 1518
>> [   48.579263] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>> length 1518
>> [   48.586417] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>> length 1518
>> [   48.593573] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>> length 1518
>> [   48.600718] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>> length 1518
>> [   54.877567] net_ratelimit: 6 callbacks suppressed
>> [   54.882293] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>> length 1518
>> [   61.413552] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>> length 1518
>
> The status error bits are in sky2.h
> 0x5f20010 is
>       05f2 frame length => 1522
>       0010 Too long err
>
> That means the packet was longer than the configured MTU.
> You are probably getting packets with VLAN tag but have not configured
> a VLAN.
>
>
>

Thanks for the information. I have increased the MTU value to 1550
(workaround) and it works if I send traffic (with iperf) from my computer
to the unit. But if I send traffic from the unit to the outside, I get a
new error message and the link goes down:

[ 4901.032989] sky2 0000:04:00.0 marvell: tx timeout
[ 4904.722670] sky2 0000:04:00.0 marvell: Link is up at 1000 Mbps, full duplex, flow control both

Rafa


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-04-27 12:05   ` Rafa Corvillo
@ 2017-04-27 13:04     ` Andrew Lunn
  2017-04-28 11:54       ` Rafa Corvillo
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Lunn @ 2017-04-27 13:04 UTC (permalink / raw)
  To: Rafa Corvillo; +Cc: Stephen Hemminger, netdev

On Thu, Apr 27, 2017 at 02:05:51PM +0200, Rafa Corvillo wrote:
> On 25/04/17 17:27, Stephen Hemminger wrote:
> >On Fri, 21 Apr 2017 14:39:00 +0200
> >Rafa Corvillo <rafael.corvillo@aoifes.com> wrote:
> >
> >>We are working in an ARMv7 embedded system running kernel 4.9 (LEDE build).
> >>It is an imx6 board with 2 ethernet interfaces. One of them is connected to
> >>a Marvell switch.
> >>
> >>The schema of the system is the following:
> >>

Hi Rafa

Your ASCII art got messed up somewhere. Is this the correct
reconstruction?

   +-------------------+ eth0
   |                   +--+
   |                   |  |
   | Embedded system   +--+
   |                   |
   |      ARMv7        |
   |                   | Marvell 88E8057(sky2)     +-------------+
   |                   +--+                     +--+             +--+ eth1
   |                   |  +---------------------+  |             |  +------+
   |                   +--+      CPU port       +--+ mv88e6176   +--+
   +------+--+---------+                           |             |
emulated  |  |                                     |             |
GPIO      +--+                                  +--+             +--+ eth2
MDIO       +-----------------------------------+   |             |  +------+
                                MDIO            +--+             +--+
                                                   +-------------+

I assume you are using DSA? Since this is LEDE, it could be swconfig,
but the bridge configuration you mentioned would not make sense for
swconfig.

> >>If I connect the eth1/eth2, the link is up and I can do ping through it.
> >>But, once
> >>I start sending a heavy traffic load the link fails and the kernel sends the
> >>following messages:
> >>
> >>[   48.557140] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
> >>length 1518
> >>[   48.564964] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
> >>length 1518
> >>[   48.572110] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
> >>length 1518
> >>[   48.579263] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
> >>length 1518
> >>[   48.586417] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
> >>length 1518
> >>[   48.593573] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
> >>length 1518
> >>[   48.600718] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
> >>length 1518
> >>[   54.877567] net_ratelimit: 6 callbacks suppressed
> >>[   54.882293] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
> >>length 1518
> >>[   61.413552] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
> >>length 1518
> >
> >The status error bits are in sky2.h
> >0x5f20010 is
> >      05f2 frame length => 1522
> >      0010 Too long err
> >
> >That means the packet was longer than the configured MTU.
> >You are probably getting packets with VLAN tag but have not configured
> >a VLAN.

Since you are using DSA, you will have DSA tags enabled on frames
to/from the switch. This adds an extra 8 byte header in the frame.  My
guess is, it is this header, not the VLAN tag which is causing you MTU
issues.

I think this is the first time i've seen sky2 used in a DSA
setup. mv643xx or mvneta is generally what is used, when using Marvell
chipsets. These drivers are more lenient about MTU, and are happy to
pass frames with additional headers.

> Thanks for the information. I have increased the MTU value to 1550
> (workaround) and it works if sends traffic (with iperf) from my
> computer to the unit. But, if I send traffic outside the unit, I get
> a new error message and link goes down:

Changing the MTU like this is not a good fix. It will allow you to
receive frames which are bigger, but it also means the local network
stack will generate bigger frames to be transmitted. You probably need
to modify the sky2 driver to allow it to receive frames bigger than
the interface MTU, by about 8 bytes.
 
> [ 4901.032989] sky2 0000:04:00.0 marvell: tx timeout
> [ 4904.722670] sky2 0000:04:00.0 marvell: Link is up at 1000 Mbps,
> full duplex, flow control both

Between the sky2 and the switch, do you have two back-to-back PHYs or
are you connecting the RGMII interfaces together?

    Andrew


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-04-27 13:04     ` Andrew Lunn
@ 2017-04-28 11:54       ` Rafa Corvillo
  2017-04-28 12:22         ` Andrew Lunn
  0 siblings, 1 reply; 18+ messages in thread
From: Rafa Corvillo @ 2017-04-28 11:54 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Stephen Hemminger, netdev

On 27/04/17 15:04, Andrew Lunn wrote:
> On Thu, Apr 27, 2017 at 02:05:51PM +0200, Rafa Corvillo wrote:
>> On 25/04/17 17:27, Stephen Hemminger wrote:
>>> On Fri, 21 Apr 2017 14:39:00 +0200
>>> Rafa Corvillo <rafael.corvillo@aoifes.com> wrote:
>>>
>>>> We are working in an ARMv7 embedded system running kernel 4.9 (LEDE build).
>>>> It is an imx6 board with 2 ethernet interfaces. One of them is connected to
>>>> a Marvell switch.
>>>>
>>>> The schema of the system is the following:
>>>>
>
> Hi Rafa
>
> Your ASCII art got messed up somewhere. Is this the correct
> reconstruction?

Yes, this is the schema.

>
>     +-------------------+ eth0
>     |                   +--+
>     |                   |  |
>     | Embedded system   +--+
>     |                   |
>     |      ARMv7        |
>     |                   | Marvell 88E8057(sky2)     +-------------+
>     |                   +--+                     +--+             +--+ eth1
>     |                   |  +---------------------+  |             |  +------+
>     |                   +--+      CPU port       +--+ mv88e6176   +--+
>     +------+--+---------+                           |             |
> emulated  |  |                                     |             |
> GPIO      +--+                                  +--+             +--+ eth2
> MDIO       +-----------------------------------+   |             |  +------+
>                                  MDIO            +--+             +--+
>                                                     +-------------+
>
> I assume you are using DSA? Since this is LEDE, it could be swconfig,
> but the bridge configuration you mentioned would not make sense for
> swconfig.

Yes, we use the DSA driver. We don't use swconfig to configure the Marvell
switch. Our board has two Ethernet interfaces (eth0 and marvell) using
the sky2 driver. The marvell interface is connected to an external Marvell
switch (mv88e6176) with four Ethernet ports (but we only use two of
them, eth1 and eth2). The Marvell switch is configured over MDIO, which
we emulate through GPIOs (mdio-gpio kernel module), and the DSA driver
is used to work with the Marvell switch.

We have the ethernet interfaces in the same bridge:

config interface 'lan'
         option type 'bridge'
         option ifname 'eth0 eth1 eth2'
         option proto 'static'
         option ipaddr '192.168.1.100'
         option netmask '255.255.255.0'
         option ip6assign '60'

root@LEDE:/# brctl show
bridge name     bridge id               STP enabled     interfaces
br-lan          7fff.00d01274f069       no              eth0
                                                         eth1
                                                         eth2
root@LEDE:/# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
     inet 127.0.0.1/8 scope host lo
        valid_lft forever preferred_lft forever
     inet6 ::1/128 scope host
        valid_lft forever preferred_lft forever
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel master br-lan state DOWN group default qlen 1000
     link/ether 00:d0:12:74:f0:69 brd ff:ff:ff:ff:ff:ff
3: ifb0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 32
     link/ether be:80:bc:5e:63:c3 brd ff:ff:ff:ff:ff:ff
4: ifb1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 32
     link/ether 0a:1d:8d:06:e3:5d brd ff:ff:ff:ff:ff:ff
5: gre0@NONE: <NOARP> mtu 1476 qdisc noop state DOWN group default qlen 1
     link/gre 0.0.0.0 brd 0.0.0.0
6: gretap0@NONE: <BROADCAST,MULTICAST> mtu 1462 qdisc noop state DOWN group default qlen 1000
     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
7: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN group default qlen 1000
     link/ether e2:0b:10:b8:b7:b0 brd ff:ff:ff:ff:ff:ff
8: teql0: <NOARP> mtu 1500 qdisc noop state DOWN group default qlen 100
     link/void
9: can0: <NOARP,ECHO> mtu 16 qdisc noop state DOWN group default qlen 10
     link/can
10: marvell: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
     link/ether aa:64:73:91:09:a9 brd ff:ff:ff:ff:ff:ff
     inet6 fe80::a864:73ff:fe91:9a9/64 scope link
        valid_lft forever preferred_lft forever
11: eth1@marvell: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue master br-lan switchid 00000000 state LOWERLAYERDOWN group default qlen 1000
     link/ether aa:64:73:91:09:a9 brd ff:ff:ff:ff:ff:ff
12: eth2@marvell: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-lan switchid 00000000 state UP group default qlen 1000
     link/ether aa:64:73:91:09:a9 brd ff:ff:ff:ff:ff:ff
13: eth3@marvell: <BROADCAST,MULTICAST> mtu 1500 qdisc noop switchid 00000000 state DOWN group default qlen 1000
     link/ether aa:64:73:91:09:a9 brd ff:ff:ff:ff:ff:ff
14: eth4@marvell: <BROADCAST,MULTICAST> mtu 1500 qdisc noop switchid 00000000 state DOWN group default qlen 1000
     link/ether aa:64:73:91:09:a9 brd ff:ff:ff:ff:ff:ff
15: br-lan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
     link/ether 00:d0:12:74:f0:69 brd ff:ff:ff:ff:ff:ff
     inet 192.168.1.100/24 brd 192.168.1.255 scope global br-lan
        valid_lft forever preferred_lft forever
     inet6 fd7b:a43b:e93e::1/60 scope global noprefixroute
        valid_lft forever preferred_lft forever
     inet6 fe80::2d0:12ff:fe74:f069/64 scope link
        valid_lft forever preferred_lft forever


We have this configuration working on kernel 4.1, including patches
that upgrade dsa/mv88e6xxx to kernel version 4.3 (5acf4d0, Wed, 27 May
2015 15:32:15 -0700) "[PATCH] blk: rq_data_dir() should not return a
boolean."

>
>>>> If I connect the eth1/eth2, the link is up and I can do ping through it.
>>>> But, once
>>>> I start sending a heavy traffic load the link fails and the kernel sends the
>>>> following messages:
>>>>
>>>> [   48.557140] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>>>> length 1518
>>>> [   48.564964] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>>>> length 1518
>>>> [   48.572110] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>>>> length 1518
>>>> [   48.579263] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>>>> length 1518
>>>> [   48.586417] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>>>> length 1518
>>>> [   48.593573] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>>>> length 1518
>>>> [   48.600718] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>>>> length 1518
>>>> [   54.877567] net_ratelimit: 6 callbacks suppressed
>>>> [   54.882293] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>>>> length 1518
>>>> [   61.413552] sky2 0000:04:00.0 marvell: rx error, status 0x5f20010
>>>> length 1518
>>>
>>> The status error bits are in sky2.h
>>> 0x5f20010 is
>>>       05f2 frame length => 1522
>>>       0010 Too long err
>>>
>>> That means the packet was longer than the configured MTU.
>>> You are probably getting packets with VLAN tag but have not configured
>>> a VLAN.
>
> Since you are using DSA, you will have DSA tags enabled on frames
> to/from the switch. This adds an extra 8 byte header in the frame.  My
> guess is, it is this header, not the VLAN tag which is causing you MTU
> issues.

But it is strange because, as I have said above, we have the same 
configuration working properly on a kernel 4.1 (with OpenWrt), and we 
have the MTU set to 1500.

>
> I think this is the first time i've seen sky2 used in a DSA
> setup. mv643xx or mvneta is generally what is used, when using Marvell
> chipsets. These drivers are more lenient about MTU, and are happy to
> pass frames with additional headers.
>

We use the mv88e6xxx driver (as our switch is an mv88e6176) and it depends
on the DSA driver in the kernel (doesn't it?).

>> Thanks for the information. I have increased the MTU value to 1550
>> (workaround) and it works if sends traffic (with iperf) from my
>> computer to the unit. But, if I send traffic outside the unit, I get
>> a new error message and link goes down:
>
> Changing the MTU like this is not a good fix. It will allow you to
> receive frames which are bigger, but it also means the local network
> stack will generate bigger frames to be transmitted. You probably need
> to modify the sky2 driver to allow it to receive frames bigger than
> the interface MTU, by about 8 bytes.

Should the DSA driver remove the DSA tags before passing the frames to the
sky2 interface?

>
>> [ 4901.032989] sky2 0000:04:00.0 marvell: tx timeout
>> [ 4904.722670] sky2 0000:04:00.0 marvell: Link is up at 1000 Mbps,
>> full duplex, flow control both
>
> Between the sky2 and the switch, do you have two back-to-back PHYs or
> are you connecting the RGMII interfaces together?

I think that we have two back-to-back PHYs, but I am going to double 
check this with the hardware team.

Thanks,
Rafa

>
>      Andrew
>


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-04-28 11:54       ` Rafa Corvillo
@ 2017-04-28 12:22         ` Andrew Lunn
  2017-05-08 12:03           ` Rafa Corvillo
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Lunn @ 2017-04-28 12:22 UTC (permalink / raw)
  To: Rafa Corvillo; +Cc: Stephen Hemminger, netdev

> >Since you are using DSA, you will have DSA tags enabled on frames
> >to/from the switch. This adds an extra 8 byte header in the frame.  My
> >guess is, it is this header, not the VLAN tag which is causing you MTU
> >issues.
> 
> But it is strange because, as I have said above, we have the same
> configuration working properly on a kernel 4.1 (with OpenWrt), and
> we have the MTU set to 1500.

If you look at sky2.c:

static unsigned sky2_get_rx_threshold(struct sky2_port *sky2)
{
        unsigned size;

        /* Space needed for frame data + headers rounded up */
        size = roundup(sky2->netdev->mtu + ETH_HLEN + VLAN_HLEN, 8);

        /* Stopping point for hardware truncation */
        return (size - 8) / sizeof(u32);
}

This is not going to be big enough for a frame with a DSA header.
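
To put numbers on this, here is a minimal user-space sketch of the same
arithmetic. It is not driver code and not a proposed patch:
rx_threshold_words() and its extra_hdr parameter are hypothetical, and
ETH_HLEN (14), VLAN_HLEN (4) and the round-up macro are redefined
locally so the snippet builds outside the kernel.

```c
#include <stdint.h>

/* Local copies of the kernel values, so this builds in user space. */
#define ETH_HLEN  14   /* Ethernet header length in bytes */
#define VLAN_HLEN 4    /* 802.1Q tag length in bytes */

/* Round x up to the next multiple of y. */
#define ROUNDUP(x, y) ((((x) + (y) - 1) / (y)) * (y))

/*
 * Same arithmetic as the sky2_get_rx_threshold() quoted above, plus a
 * hypothetical extra_hdr term standing in for a switch tag.
 * Returns the hardware truncation point in 32-bit words.
 */
static unsigned rx_threshold_words(unsigned mtu, unsigned extra_hdr)
{
	/* Space needed for frame data + headers, rounded up to 8 bytes */
	unsigned size = ROUNDUP(mtu + ETH_HLEN + VLAN_HLEN + extra_hdr, 8);

	return (size - 8) / sizeof(uint32_t);
}
```

With an MTU of 1500, rx_threshold_words(1500, 0) gives 378 words (1512
bytes), and rx_threshold_words(1500, 8) gives 380 words (1520 bytes).
Whether this threshold is the only limit in the driver that would need
raising is not something this sketch can show.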

> >I think this is the first time i've seen sky2 used in a DSA
> >setup. mv643xx or mvneta is generally what is used, when using Marvell
> >chipsets. These drivers are more lenient about MTU, and are happy to
> >pass frames with additional headers.
> >
> 
> We use the mv88e6xxx (as our switch is mv88e6176) and it depends on
> DSA driver in the kernel (isn't it?).

That is correct. But i was talking about the Ethernet interface. All
the designs i've seen use an mv643xx Ethernet interface, or an mvneta
interface. This is the first time i've seen a sky2 used, which is why
i'm not too surprised you have issues.

> >Changing the MTU like this is not a good fix. It will allow you to
> >receive frames which are bigger, but it also means the local network
> >stack will generate bigger frames to be transmitted. You probably need
> >to modify the sky2 driver to allow it to receive frames bigger than
> >the interface MTU, by about 8 bytes.
> 
> Should the DSA driver remove the DSA tags before pass the frames to
> sky2 interface?

The DSA driver is adding the DSA tags to the frame and passing these
tagged frames to the sky2 interface. Frames going to/from the switch
will always have such tags.

> >>[ 4901.032989] sky2 0000:04:00.0 marvell: tx timeout
> >>[ 4904.722670] sky2 0000:04:00.0 marvell: Link is up at 1000 Mbps,
> >>full duplex, flow control both
> >
> >Between the sky2 and the switch, do you have two back-to-back PHYs or
> >are you connecting the RGMII interfaces together?
> 
> I think that we have two back-to-back PHYs, but I am going to double
> check this with the hardware team.

This could be your problem, then. The mv88e6xxx switch driver assumes
there is a straight RGMII-to-RGMII connection with no PHYs, so it
hard-configures the 'CPU' port to its fastest speed, with the link
forced up. If you actually have a PHY there, this might not work so
well. I don't know if the switch PHY is going to do autoneg correctly.
Try using ethtool to look at the sky2 PHY and see what state it is in.

      Andrew


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-04-28 12:22         ` Andrew Lunn
@ 2017-05-08 12:03           ` Rafa Corvillo
  2017-05-08 12:38             ` Andrew Lunn
  0 siblings, 1 reply; 18+ messages in thread
From: Rafa Corvillo @ 2017-05-08 12:03 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Stephen Hemminger, netdev

On 28/04/17 14:22, Andrew Lunn wrote:
>>> Since you are using DSA, you will have DSA tags enabled on frames
>>> to/from the switch. This adds an extra 8 byte header in the frame.  My
>>> guess is, it is this header, not the VLAN tag which is causing you MTU
>>> issues.
>>
>> But it is strange because, as I have said above, we have the same
>> configuration working properly on a kernel 4.1 (with OpenWrt), and
>> we have the MTU set to 1500.

Hi Andrew,

Sorry for the delay in my answer, I was out of the office.

>
> If you look at sky2.c:
>
> static unsigned sky2_get_rx_threshold(struct sky2_port *sky2)
> {
>          unsigned size;
>
>          /* Space needed for frame data + headers rounded up */
>          size = roundup(sky2->netdev->mtu + ETH_HLEN + VLAN_HLEN, 8);
>
>          /* Stopping point for hardware truncation */
>          return (size - 8) / sizeof(u32);
> }
>
> This is not going to be big enough for a frame with a DSA header.
>

Then, would adding 8 bytes to the size variable in this function be a
good fix?

>>> I think this is the first time i've seen sky2 used in a DSA
>>> setup. mv643xx or mvneta is generally what is used, when using Marvell
>>> chipsets. These drivers are more lenient about MTU, and are happy to
>>> pass frames with additional headers.
>>>
>>
>> We use the mv88e6xxx (as our switch is mv88e6176) and it depends on
>> DSA driver in the kernel (isn't it?).
>
> That is correct. But i was talking about the Ethernet interface. All
> the designs i've seen use an mv643xxx Ethernet interface, or an mvneta
> interface. This is the first time i've seen a sky2 used, which is why
> i'm not too surprised you have issues.
>
>>> Changing the MTU like this is not a good fix. It will allow you to
>>> receive frames which are bigger, but it also means the local network
>>> stack will generate bigger frames to be transmitted. You probably need
>>> to modify the sky2 driver to allow it to receive frames bigger than
>>> the interface MTU, by about 8 bytes.
>>
>> Should the DSA driver remove the DSA tags before pass the frames to
>> sky2 interface?
>
> The DSA driver is adding the DSA tags to the frame and passing these
> tagged frames to the sky2 interface. Frames going to/from the switch
> will always have such tags.
>
>>>> [ 4901.032989] sky2 0000:04:00.0 marvell: tx timeout
>>>> [ 4904.722670] sky2 0000:04:00.0 marvell: Link is up at 1000 Mbps,
>>>> full duplex, flow control both
>>>
>>> Between the sky2 and the switch, do you have two back-to-back PHYs or
>>> are you connecting the RGMII interfaces together?
>>
>> I think that we have two back-to-back PHYs, but I am going to double
>> check this with the hardware team.
>
> This could be your problem them. The mv88e6xxx switch driver assumes
> there is a straight rgmii-rgmii connection, no PHYs. So it hard
> configures the 'CPU' port to its fastest speed, with the link forced
> up. If you actually have a PHY there, this might not work so well. I
> don't know if the switch PHY is going to do autoneg correctly. Try
> using ethtool to look at the sky2 PHY and see what state it is in.
>
>        Andrew
>

The ethtool output for the sky2 interface is the following:

Settings for marvell:
         Supported ports: [ TP ]
         Supported link modes:   10baseT/Half 10baseT/Full
                                 100baseT/Half 100baseT/Full
                                 1000baseT/Half 1000baseT/Full
         Supported pause frame use: No
         Supports auto-negotiation: Yes
         Advertised link modes:  10baseT/Half 10baseT/Full
                                 100baseT/Half 100baseT/Full
                                 1000baseT/Half 1000baseT/Full
         Advertised pause frame use: No
         Advertised auto-negotiation: No
         Speed: 1000Mb/s
         Duplex: Full
         Port: Twisted Pair
         PHYAD: 0
         Transceiver: internal
         Auto-negotiation: on
         MDI-X: Unknown
         Supports Wake-on: pg
         Wake-on: d
         Current message level: 0x000000ff (255)
                                drv probe link timer ifdown ifup rx_err tx_err
         Link detected: yes


And the ethtool output for eth2@marvell (the interface that I have
connected):

Settings for eth2:
         Supported ports: [ TP MII ]
         Supported link modes:   10baseT/Half 10baseT/Full
                                 100baseT/Half 100baseT/Full
                                 1000baseT/Half 1000baseT/Full
         Supported pause frame use: No
         Supports auto-negotiation: Yes
         Advertised link modes:  10baseT/Half 10baseT/Full
                                 100baseT/Half 100baseT/Full
                                 1000baseT/Half 1000baseT/Full
         Advertised pause frame use: No
         Advertised auto-negotiation: Yes
         Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                              100baseT/Half 100baseT/Full
         Link partner advertised pause frame use: No
         Link partner advertised auto-negotiation: No
         Speed: 100Mb/s
         Duplex: Full
         Port: MII
         PHYAD: 2
         Transceiver: external
         Auto-negotiation: on
         Supports Wake-on: d
         Wake-on: d
         Link detected: yes


Do you see something strange in these outputs?

Thanks,

Rafa


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-05-08 12:03           ` Rafa Corvillo
@ 2017-05-08 12:38             ` Andrew Lunn
  2017-05-16 10:50               ` Rafa Corvillo
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Lunn @ 2017-05-08 12:38 UTC (permalink / raw)
  To: Rafa Corvillo; +Cc: Stephen Hemminger, netdev

> >static unsigned sky2_get_rx_threshold(struct sky2_port *sky2)
> >{
> >         unsigned size;
> >
> >         /* Space needed for frame data + headers rounded up */
> >         size = roundup(sky2->netdev->mtu + ETH_HLEN + VLAN_HLEN, 8);
> >
> >         /* Stopping point for hardware truncation */
> >         return (size - 8) / sizeof(u32);
> >}
> >
> >This is not going to be big enough for a frame with a DSA header.
> >
> 
> Then, would be a good fix add 8 bytes to the size variable in this function?

Yes. Also look at the transmit code and check whether there is again a
limit based on the MTU.

> Settings for marvell:
>         Supported ports: [ TP ]
>         Supported link modes:   10baseT/Half 10baseT/Full
>                                 100baseT/Half 100baseT/Full
>                                 1000baseT/Half 1000baseT/Full
>         Supported pause frame use: No
>         Supports auto-negotiation: Yes
>         Advertised link modes:  10baseT/Half 10baseT/Full
>                                 100baseT/Half 100baseT/Full
>                                 1000baseT/Half 1000baseT/Full
>         Advertised pause frame use: No
>         Advertised auto-negotiation: No
>         Speed: 1000Mb/s
>         Duplex: Full
>         Port: Twisted Pair
>         PHYAD: 0
>         Transceiver: internal
>         Auto-negotiation: on
>         MDI-X: Unknown
>         Supports Wake-on: pg
>         Wake-on: d
>         Current message level: 0x000000ff (255)
>                                drv probe link timer ifdown ifup
> rx_err tx_err
>         Link detected: yes
> 

So this suggests there is a real PHY there, and it is
auto-negotiating.

What we cannot see is the status for the PHY it connects to. But since
this PHY has established a link, the other PHY is probably O.K. It is
just a bit unsafe, since you are relying on reset behaviour. There is
nothing in software configuring the second PHY to make it
auto-negotiate.

	Andrew


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-05-08 12:38             ` Andrew Lunn
@ 2017-05-16 10:50               ` Rafa Corvillo
  2017-05-16 12:47                 ` Andrew Lunn
  0 siblings, 1 reply; 18+ messages in thread
From: Rafa Corvillo @ 2017-05-16 10:50 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Stephen Hemminger, netdev

On 08/05/17 14:38, Andrew Lunn wrote:
>>> static unsigned sky2_get_rx_threshold(struct sky2_port *sky2)
>>> {
>>>          unsigned size;
>>>
>>>          /* Space needed for frame data + headers rounded up */
>>>          size = roundup(sky2->netdev->mtu + ETH_HLEN + VLAN_HLEN, 8);
>>>
>>>          /* Stopping point for hardware truncation */
>>>          return (size - 8) / sizeof(u32);
>>> }
>>>
>>> This is not going to be big enough for a frame with a DSA header.
>>>
>>
>> Then, would be a good fix add 8 bytes to the size variable in this function?
>
> Yes. Also look at the transmit code, is there again a limit based on
> the MTU.

Hi Andrew,

Adding 8 bytes (sky2->netdev->mtu + ETH_HLEN + VLAN_HLEN + 8
(EDSA_HLEN)) does not fix the error, because the interface still
reports a maximum length of 1518 bytes (sky2->netdev->mtu + ETH_HLEN +
VLAN_HLEN).

The polling function of the sky2 driver (sky2_poll) calls
sky2_status_intr (with the parameter struct sky2_hw *hw).
sky2_status_intr reads the status list elements (struct
sky2_status_le) from the sky2_hw parameter and, from each
sky2_status_le, gets the length (1518) and the status code
(0x5f20010). When it then calls sky2_receive with the length and
status as parameters, sky2_receive reports the error (rx error, status
0x5f20010 length 1518).

I don't know who sets the maximum length (1518) and the status code
(0x5f20010) of the packets. Is it possible that these values are set
outside the sky2 code?
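
For reference, the status word can be picked apart in userspace. The
sketch below assumes a GMAC receive status layout with the frame
length in bits 30..16, "too long packet" in bit 4 and "receive OK" in
bit 8. Treat those bit positions as an assumption and check them
against sky2.h in the tree in use. The struct and helper names are
made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit layout of the GMAC rx status word; verify against sky2.h. */
#define RX_STATUS_LEN_SHIFT     16
#define RX_STATUS_LEN_MASK      0x7fffu
#define RX_STATUS_RX_OK         (1u << 8)       /* assumed: good packet */
#define RX_STATUS_LONG_ERR      (1u << 4)       /* assumed: too long packet */

/* Hypothetical container for the decoded fields. */
struct rx_status {
        unsigned len;           /* frame length as seen by the MAC */
        bool rx_ok;             /* frame received without error */
        bool too_long;          /* frame exceeded the MAC length limit */
};

static void rx_status_decode(uint32_t status, struct rx_status *out)
{
        assert(out != NULL);
        out->len = (status >> RX_STATUS_LEN_SHIFT) & RX_STATUS_LEN_MASK;
        out->rx_ok = (status & RX_STATUS_RX_OK) != 0;
        out->too_long = (status & RX_STATUS_LONG_ERR) != 0;
}
```

Under that assumption, 0x5f20010 decodes to a frame length of 1522
bytes with the too-long bit set and the receive-OK bit clear.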

Thanks,

Rafa

>
>> Settings for marvell:
>>          Supported ports: [ TP ]
>>          Supported link modes:   10baseT/Half 10baseT/Full
>>                                  100baseT/Half 100baseT/Full
>>                                  1000baseT/Half 1000baseT/Full
>>          Supported pause frame use: No
>>          Supports auto-negotiation: Yes
>>          Advertised link modes:  10baseT/Half 10baseT/Full
>>                                  100baseT/Half 100baseT/Full
>>                                  1000baseT/Half 1000baseT/Full
>>          Advertised pause frame use: No
>>          Advertised auto-negotiation: No
>>          Speed: 1000Mb/s
>>          Duplex: Full
>>          Port: Twisted Pair
>>          PHYAD: 0
>>          Transceiver: internal
>>          Auto-negotiation: on
>>          MDI-X: Unknown
>>          Supports Wake-on: pg
>>          Wake-on: d
>>          Current message level: 0x000000ff (255)
>>                                 drv probe link timer ifdown ifup
>> rx_err tx_err
>>          Link detected: yes
>>
>
> So this suggests there is a real PHY there, and it is
> auto-negotiating.
>
> What we cannot see is the status for the PHY it connects to. But since
> this PHY has established a link, the other PHY is probably O.K. It is
> just a bit unsafe, since you are relying on reset behaviour. There is
> nothing in software configuring the second PHY to make it
> auto-negotiate.
>
> 	Andrew
>
>


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-05-16 10:50               ` Rafa Corvillo
@ 2017-05-16 12:47                 ` Andrew Lunn
  2017-05-16 13:09                   ` Rafa Corvillo
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Lunn @ 2017-05-16 12:47 UTC (permalink / raw)
  To: Rafa Corvillo; +Cc: Stephen Hemminger, netdev

> Adding 8 bytes (sky2->netdev->mtu + ETH_HLEN + VLAN_HLEN + 8
> (EDSA_HLEN)) does not fix the error, because the interface keep
> having a maximum length of 1518 bytes (sky2->netdev->mtu + ETH_HLEN
> + VLAN_HLEN).

Did you check the value being written to here:

        /*
         * The receiver hangs if it receives frames larger than the
         * packet buffer. As a workaround, truncate oversize frames, but
         * the register is limited to 9 bits, so if you do frames > 2052
         * you better get the MTU right!
         */
        thresh = sky2_get_rx_threshold(sky2);
        if (thresh > 0x1ff)
                sky2_write32(hw, SK_REG(sky2->port, RX_GMF_CTRL_T), RX_TRUNC_OFF);
        else {
                sky2_write16(hw, SK_REG(sky2->port, RX_GMF_TR_THR), thresh);
                sky2_write32(hw, SK_REG(sky2->port, RX_GMF_CTRL_T), RX_TRUNC_ON);
        }


What is thresh?

     Andrew


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-05-16 12:47                 ` Andrew Lunn
@ 2017-05-16 13:09                   ` Rafa Corvillo
  2017-05-16 13:21                     ` Andrew Lunn
  0 siblings, 1 reply; 18+ messages in thread
From: Rafa Corvillo @ 2017-05-16 13:09 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Stephen Hemminger, netdev

>> Adding 8 bytes (sky2->netdev->mtu + ETH_HLEN + VLAN_HLEN + 8
>> (EDSA_HLEN)) does not fix the error, because the interface keep
>> having a maximum length of 1518 bytes (sky2->netdev->mtu + ETH_HLEN
>> + VLAN_HLEN).
>
> Did you check the value being written to here:
>
>          /*
>           * The receiver hangs if it receives frames larger than the
>           * packet buffer. As a workaround, truncate oversize frames, but
>           * the register is limited to 9 bits, so if you do frames > 2052
>           * you better get the MTU right!
>           */
>          thresh = sky2_get_rx_threshold(sky2);
>          if (thresh > 0x1ff)
>                  sky2_write32(hw, SK_REG(sky2->port, RX_GMF_CTRL_T), RX_TRUNC_OFF);
>          else {
>                  sky2_write16(hw, SK_REG(sky2->port, RX_GMF_TR_THR), thresh);
>                  sky2_write32(hw, SK_REG(sky2->port, RX_GMF_CTRL_T), RX_TRUNC_ON);
>          }
>
>
> What is thresh?

The value of thresh is 380.

Rafa

>
>       Andrew
>


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-05-16 13:09                   ` Rafa Corvillo
@ 2017-05-16 13:21                     ` Andrew Lunn
  2017-05-16 15:50                       ` Rafa Corvillo
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Lunn @ 2017-05-16 13:21 UTC (permalink / raw)
  To: Rafa Corvillo; +Cc: Stephen Hemminger, netdev

On Tue, May 16, 2017 at 03:09:17PM +0200, Rafa Corvillo wrote:
> >>Adding 8 bytes (sky2->netdev->mtu + ETH_HLEN + VLAN_HLEN + 8
> >>(EDSA_HLEN)) does not fix the error, because the interface keep
> >>having a maximum length of 1518 bytes (sky2->netdev->mtu + ETH_HLEN
> >>+ VLAN_HLEN).
> >
> >Did you check the value being written to here:
> >
> >         /*
> >          * The receiver hangs if it receives frames larger than the
> >          * packet buffer. As a workaround, truncate oversize frames, but
> >          * the register is limited to 9 bits, so if you do frames > 2052
> >          * you better get the MTU right!
> >          */
> >         thresh = sky2_get_rx_threshold(sky2);
> >         if (thresh > 0x1ff)
> >                 sky2_write32(hw, SK_REG(sky2->port, RX_GMF_CTRL_T), RX_TRUNC_OFF);
> >         else {
> >                 sky2_write16(hw, SK_REG(sky2->port, RX_GMF_TR_THR), thresh);
> >                 sky2_write32(hw, SK_REG(sky2->port, RX_GMF_CTRL_T), RX_TRUNC_ON);
> >         }
> >
> >
> >What is thresh?
> 
> The value of thresh is 380.

So that is 1528.

You could hack it and try 0x1ff.
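
The conversion is just the inverse of sky2_get_rx_threshold(). A small
userspace sketch, with a made-up helper name, assuming the 9-bit
register limit mentioned in the driver comment quoted above:

```c
#include <assert.h>
#include <stdint.h>

#define RX_TRUNC_THRESH_MAX     0x1ffu  /* register is limited to 9 bits */

/*
 * Inverse of "(size - 8) / sizeof(u32)": the frame size in bytes that
 * corresponds to a given truncation threshold.
 */
static unsigned thresh_to_bytes(unsigned thresh)
{
        assert(thresh <= RX_TRUNC_THRESH_MAX);
        return thresh * (unsigned)sizeof(uint32_t) + 8;
}
```

thresh_to_bytes(380) is 1528, and thresh_to_bytes(0x1ff) is 2052,
which is where the "frames > 2052" in that comment comes from.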

Also, check that in sky2_rx_add(), le->length is set to 4K.

      Andrew


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-05-16 13:21                     ` Andrew Lunn
@ 2017-05-16 15:50                       ` Rafa Corvillo
  2017-05-16 15:58                         ` Andrew Lunn
  0 siblings, 1 reply; 18+ messages in thread
From: Rafa Corvillo @ 2017-05-16 15:50 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Stephen Hemminger, netdev

> On Tue, May 16, 2017 at 03:09:17PM +0200, Rafa Corvillo wrote:
>>>> Adding 8 bytes (sky2->netdev->mtu + ETH_HLEN + VLAN_HLEN + 8
>>>> (EDSA_HLEN)) does not fix the error, because the interface keep
>>>> having a maximum length of 1518 bytes (sky2->netdev->mtu + ETH_HLEN
>>>> + VLAN_HLEN).
>>>
>>> Did you check the value being written to here:
>>>
>>>          /*
>>>           * The receiver hangs if it receives frames larger than the
>>>           * packet buffer. As a workaround, truncate oversize frames, but
>>>           * the register is limited to 9 bits, so if you do frames > 2052
>>>           * you better get the MTU right!
>>>           */
>>>          thresh = sky2_get_rx_threshold(sky2);
>>>          if (thresh > 0x1ff)
>>>                  sky2_write32(hw, SK_REG(sky2->port, RX_GMF_CTRL_T), RX_TRUNC_OFF);
>>>          else {
>>>                  sky2_write16(hw, SK_REG(sky2->port, RX_GMF_TR_THR), thresh);
>>>                  sky2_write32(hw, SK_REG(sky2->port, RX_GMF_CTRL_T), RX_TRUNC_ON);
>>>          }
>>>
>>>
>>> What is thresh?
>>
>> The value of thresh is 380.
>
> So that is 1528.

Yes, this is the result of the roundup function.

>
> You could hack it and try 0x1ff.

I have forced the value of thresh to 0x1ff and the rx error still appears.

>
> Also, check that in sky2_rx_add(), le->length is set to 4K.
>

The value of le->length is set to 1520.

Rafa


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-05-16 15:50                       ` Rafa Corvillo
@ 2017-05-16 15:58                         ` Andrew Lunn
  2017-05-16 16:19                           ` Rafa Corvillo
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Lunn @ 2017-05-16 15:58 UTC (permalink / raw)
  To: Rafa Corvillo; +Cc: Stephen Hemminger, netdev

> >Also, check that in sky2_rx_add(), le->length is set to 4K.
> >
> 
> The value of le->length is set to 1520.

> Rafa

Ah.

You probably need to change sky2_get_rx_data_size() as well, to add in
the extra 8 bytes.
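
As a rough check of the numbers, here is a userspace sketch. It
assumes sky2_get_rx_data_size() starts from the same "mtu + ETH_HLEN +
VLAN_HLEN, rounded up to 8" computation as sky2_get_rx_threshold(),
which would explain the 1520 you see, and it ignores any page-fragment
or copybreak handling. rx_data_size() and its "extra" parameter are
made up for illustration.

```c
#include <assert.h>

#define ETH_HLEN  14    /* Ethernet header length */
#define VLAN_HLEN  4    /* 802.1Q tag length */

/* Round x up to the next multiple of align (align must be non-zero). */
static unsigned roundup_to(unsigned x, unsigned align)
{
        assert(align != 0);
        return ((x + align - 1) / align) * align;
}

/*
 * Simplified model of the receive buffer size for one frame.
 * "extra" is a hypothetical allowance for additional header bytes,
 * e.g. 8 for the DSA header.
 */
static unsigned rx_data_size(unsigned mtu, unsigned extra)
{
        return roundup_to(mtu + ETH_HLEN + VLAN_HLEN + extra, 8);
}
```

With an MTU of 1500, rx_data_size(1500, 0) is 1520 and
rx_data_size(1500, 8) would be 1528.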

    Andrew


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-05-16 15:58                         ` Andrew Lunn
@ 2017-05-16 16:19                           ` Rafa Corvillo
  2017-05-26 10:13                             ` Rafa Corvillo
  0 siblings, 1 reply; 18+ messages in thread
From: Rafa Corvillo @ 2017-05-16 16:19 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Stephen Hemminger, netdev

>>> Also, check that in sky2_rx_add(), le->length is set to 4K.
>>>
>>
>> The value of le->length is set to 1520.
>
>> Rafa
>
> Ah.
>
> You probably need to change sky2_get_rx_data_size() as well, to add in
> the extra 8 bytes.
>
>      Andrew
>

If I add the extra 8 bytes in sky2_get_rx_data_size(), le->length is
set to 1528, but the rx error still appears. Because of that, I think
the maximum length could be set outside the sky2 code.

Rafa


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-05-16 16:19                           ` Rafa Corvillo
@ 2017-05-26 10:13                             ` Rafa Corvillo
  2017-05-31 19:31                               ` Andrew Lunn
  0 siblings, 1 reply; 18+ messages in thread
From: Rafa Corvillo @ 2017-05-26 10:13 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Stephen Hemminger, netdev

Since I could not find a solution by modifying the sky2 code, I have
changed some parameters on the ethernet interface instead. Using an
MTU of 1503 and disabling TSO (TCP Segmentation Offload), all
communication errors disappear, both the rx errors due to too-long
packets and the tx timeout errors. I have run iperf tests for 1 hour
(downstream and upstream) and the performance of the ethernet link is
stable. Furthermore, disabling SG (scatter-gather) improves the
throughput of the ethernet interface for upstream traffic.

Is it possible that these offload mechanisms don't work well with the
sky2 driver?

Thanks,

Rafa


* Re: [ISSUE: sky2 - rx error] Link stops working under heavy traffic load connected to a mv88e6176
  2017-05-26 10:13                             ` Rafa Corvillo
@ 2017-05-31 19:31                               ` Andrew Lunn
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Lunn @ 2017-05-31 19:31 UTC (permalink / raw)
  To: Rafa Corvillo; +Cc: Stephen Hemminger, netdev

On Fri, May 26, 2017 at 12:13:03PM +0200, Rafa Corvillo wrote:
> As modifying sky2 code I have not could get any solution, then I
> have modified some parameters on the ethernet interface and using a
> MTU = 1503 and disabling TSO (TCP Segmentation Offload) mechanism
> all communication errors disappear, both the rx errors due to too
> long packets and tx timeout errors.

Hi Rafa

That at least makes some sense. It would be interesting to sniff the
frames going from the sky2 to the switch. I suspect you will find
that the first frame TSO sends for a big segment has the DSA header
and the following frames don't. You might want to look at how TSO is
set up on transmit, and whether you can tell it which parts of the
header need to be placed on each segment, so that it includes the DSA
header.
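
If you capture raw frames on that link, a check along these lines
could tell tagged frames from untagged ones. It is only a sketch: it
assumes the switch uses EDSA tagging, where the 8-byte tag starts
right after the two MAC addresses with EtherType 0xDADA (ETH_P_EDSA),
and frame_has_edsa_tag() is a made-up helper, not anything in the
kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN        6       /* MAC address length */
#define ETH_P_EDSA      0xDADA  /* EtherType used by the EDSA tag */

/*
 * Return true if the captured frame carries an EDSA tag, i.e. the two
 * bytes following the destination and source MAC addresses hold the
 * EDSA EtherType. "frame" points at the destination MAC address.
 */
static bool frame_has_edsa_tag(const uint8_t *frame, size_t len)
{
        assert(frame != NULL);

        /* Too short to contain both addresses and an EtherType */
        if (len < 2 * ETH_ALEN + 2)
                return false;

        uint16_t ethertype = (uint16_t)(frame[2 * ETH_ALEN] << 8 |
                                        frame[2 * ETH_ALEN + 1]);

        return ethertype == ETH_P_EDSA;
}
```

Running it over each frame of one large TCP send would show whether
only the first segment is tagged.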

    Andrew


