linux-arm-kernel.lists.infradead.org archive mirror
* Issue found in Armada 370: "No buffer space available" error during continuous ping
@ 2014-07-08  2:20 Maggie Mae Roxas
  2014-07-08  2:27 ` Maggie Mae Roxas
  0 siblings, 1 reply; 27+ messages in thread
From: Maggie Mae Roxas @ 2014-07-08  2:20 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Thomas,
Good day.

We previously discussed Armada 370 Ethernet issues
(resolved via your suggestions/patches).

We recently found that the Ethernet (SGMII) connection hits a
"No buffer space available" error during continuous ping to other
nodes in the same network (i.e., even when pinging the server). The
error usually appears around the 17th packet.
# Please see attached log (ping_error.txt) for more info.

Here are the details:

Processor: Marvell Armada 370 88F6707
Ethernet module: Marvell 88E1512
Board: Custom
Kernel versions tried:
- 3.13.9 (issue exists)
- 3.10.24 (issue does NOT exist)
- 3.13.9 (issue does NOT exist)
U-Boot versions tried: 2013_Q1 and 2013_Q3 from Marvell extranet

Some more important notes:
- This does not happen in wlan connections.
- This happens in 10, 100 and 1000Mbps connections.

Is there a known issue for this?

As always, thank you very much.

Regards,
Maggie Roxas
-------------- next part --------------
root@localhost:~# ethtool eth0
Settings for eth0:                                                              
        Supported ports: [ TP MII ]                                             
        Supported link modes:   10baseT/Half 10baseT/Full                       
                                100baseT/Half 100baseT/Full                     
                                1000baseT/Half 1000baseT/Full                   
        Supported pause frame use: No                                           
        Supports auto-negotiation: Yes                                          
        Advertised link modes:  10baseT/Half 10baseT/Full                       
                                100baseT/Half 100baseT/Full                     
                                1000baseT/Half 1000baseT/Full                   
        Advertised pause frame use: No                                          
        Advertised auto-negotiation: Yes                                        
        Speed: 1000Mb/s                                                         
        Duplex: Full                                                            
        Port: MII                                                               
        PHYAD: 0                                                                
        Transceiver: external                                                   
        Auto-negotiation: on                                                    
        Link detected: yes                                                      
root@localhost:~# ping 10.42.0.1
PING 10.42.0.1 (10.42.0.1) 56(84) bytes of data.                                
64 bytes from 10.42.0.1: icmp_seq=1 ttl=64 time=0.222 ms                        
64 bytes from 10.42.0.1: icmp_seq=2 ttl=64 time=0.188 ms                        
64 bytes from 10.42.0.1: icmp_seq=3 ttl=64 time=0.191 ms                        
64 bytes from 10.42.0.1: icmp_seq=4 ttl=64 time=0.181 ms                        
64 bytes from 10.42.0.1: icmp_seq=5 ttl=64 time=0.186 ms                        
64 bytes from 10.42.0.1: icmp_seq=6 ttl=64 time=0.184 ms                        
64 bytes from 10.42.0.1: icmp_seq=7 ttl=64 time=0.184 ms                        
64 bytes from 10.42.0.1: icmp_seq=8 ttl=64 time=0.181 ms                        
64 bytes from 10.42.0.1: icmp_seq=9 ttl=64 time=0.183 ms                        
64 bytes from 10.42.0.1: icmp_seq=10 ttl=64 time=0.181 ms                       
64 bytes from 10.42.0.1: icmp_seq=11 ttl=64 time=0.185 ms                       
64 bytes from 10.42.0.1: icmp_seq=12 ttl=64 time=0.180 ms                       
64 bytes from 10.42.0.1: icmp_seq=13 ttl=64 time=0.193 ms                       
ping: sendmsg: No buffer space available                                        
ping: sendmsg: No buffer space available                                        
ping: sendmsg: No buffer space available                                        
ping: sendmsg: No buffer space available                                        
^C                                                                              
--- 10.42.0.1 ping statistics ---                                               
17 packets transmitted, 13 received, 23% packet loss, time 34995ms              
rtt min/avg/max/mdev = 0.180/0.187/0.222/0.018 ms                               
root@localhost:~# ethtool -s eth0 speed 100 duplex full
root@localhost:~# ethtool eth0
Settings for eth0:                                                              
        Supported ports: [ TP MII ]                                             
        Supported link modes:   10baseT/Half 10baseT/Full                       
                                100baseT/Half 100baseT/Full                     
                                1000baseT/Half 1000baseT/Full                   
        Supported pause frame use: No                                           
        Supports auto-negotiation: Yes                                          
        Advertised link modes:  100baseT/Full                                   
        Advertised pause frame use: No                                          
        Advertised auto-negotiation: Yes                                        
        Speed: 100Mb/s                                                          
        Duplex: Full                                                            
        Port: MII                                                               
        PHYAD: 0                                                                
        Transceiver: external                                                   
        Auto-negotiation: on                                                    
        Link detected: yes                                                      
root@localhost:~# ping 10.42.0.1
PING 10.42.0.1 (10.42.0.1) 56(84) bytes of data.                                
64 bytes from 10.42.0.1: icmp_seq=1 ttl=64 time=0.441 ms                        
64 bytes from 10.42.0.1: icmp_seq=2 ttl=64 time=0.213 ms                        
64 bytes from 10.42.0.1: icmp_seq=3 ttl=64 time=0.199 ms                        
64 bytes from 10.42.0.1: icmp_seq=4 ttl=64 time=0.201 ms                        
64 bytes from 10.42.0.1: icmp_seq=5 ttl=64 time=0.206 ms                        
64 bytes from 10.42.0.1: icmp_seq=6 ttl=64 time=0.202 ms                        
64 bytes from 10.42.0.1: icmp_seq=7 ttl=64 time=0.199 ms                        
64 bytes from 10.42.0.1: icmp_seq=8 ttl=64 time=0.201 ms                        
64 bytes from 10.42.0.1: icmp_seq=9 ttl=64 time=0.189 ms                        
64 bytes from 10.42.0.1: icmp_seq=10 ttl=64 time=0.202 ms                       
64 bytes from 10.42.0.1: icmp_seq=11 ttl=64 time=0.196 ms                       
64 bytes from 10.42.0.1: icmp_seq=12 ttl=64 time=0.207 ms                       
64 bytes from 10.42.0.1: icmp_seq=13 ttl=64 time=0.204 ms                       
64 bytes from 10.42.0.1: icmp_seq=14 ttl=64 time=0.199 ms                       
64 bytes from 10.42.0.1: icmp_seq=15 ttl=64 time=0.196 ms                       
64 bytes from 10.42.0.1: icmp_seq=16 ttl=64 time=0.202 ms                       
64 bytes from 10.42.0.1: icmp_seq=17 ttl=64 time=0.199 ms                       
ping: sendmsg: No buffer space available                                        
ping: sendmsg: No buffer space available                                        
ping: sendmsg: No buffer space available                                        
ping: sendmsg: No buffer space available                                        
^C                                                                              
--- 10.42.0.1 ping statistics ---                                               
21 packets transmitted, 17 received, 19% packet loss, time 38996ms              
rtt min/avg/max/mdev = 0.189/0.215/0.441/0.056 ms                               
root@localhost:~# ethtool -s eth0 speed 10 duplex full
root@localhost:~# ethtool eth0
Settings for eth0:                                                              
        Supported ports: [ TP MII ]                                             
        Supported link modes:   10baseT/Half 10baseT/Full                       
                                100baseT/Half 100baseT/Full                     
                                1000baseT/Half 1000baseT/Full                   
        Supported pause frame use: No                                           
        Supports auto-negotiation: Yes                                          
        Advertised link modes:  10baseT/Full                                    
        Advertised pause frame use: No                                          
        Advertised auto-negotiation: Yes                                        
        Speed: 10Mb/s                                                           
        Duplex: Full                                                            
        Port: MII                                                               
        PHYAD: 0                                                                
        Transceiver: external                                                   
        Auto-negotiation: on                                                    
        Link detected: yes                                                      
root@localhost:~# ping 10.42.0.1
PING 10.42.0.1 (10.42.0.1) 56(84) bytes of data.                                
64 bytes from 10.42.0.1: icmp_seq=1 ttl=64 time=0.447 ms                        
64 bytes from 10.42.0.1: icmp_seq=2 ttl=64 time=0.396 ms                        
64 bytes from 10.42.0.1: icmp_seq=3 ttl=64 time=0.390 ms                        
64 bytes from 10.42.0.1: icmp_seq=4 ttl=64 time=0.388 ms                        
64 bytes from 10.42.0.1: icmp_seq=5 ttl=64 time=0.390 ms                        
64 bytes from 10.42.0.1: icmp_seq=6 ttl=64 time=0.391 ms                        
64 bytes from 10.42.0.1: icmp_seq=7 ttl=64 time=0.387 ms                        
64 bytes from 10.42.0.1: icmp_seq=8 ttl=64 time=0.385 ms                        
64 bytes from 10.42.0.1: icmp_seq=9 ttl=64 time=0.386 ms                        
64 bytes from 10.42.0.1: icmp_seq=10 ttl=64 time=0.385 ms                       
64 bytes from 10.42.0.1: icmp_seq=11 ttl=64 time=0.386 ms                       
64 bytes from 10.42.0.1: icmp_seq=12 ttl=64 time=0.377 ms                       
64 bytes from 10.42.0.1: icmp_seq=13 ttl=64 time=0.373 ms                       
64 bytes from 10.42.0.1: icmp_seq=14 ttl=64 time=0.378 ms                       
64 bytes from 10.42.0.1: icmp_seq=15 ttl=64 time=0.384 ms                       
64 bytes from 10.42.0.1: icmp_seq=16 ttl=64 time=0.393 ms                       
64 bytes from 10.42.0.1: icmp_seq=17 ttl=64 time=0.384 ms                       
ping: sendmsg: No buffer space available                                        
ping: sendmsg: No buffer space available                                        
ping: sendmsg: No buffer space available                                        
ping: sendmsg: No buffer space available                                        
^C                                                                              
--- 10.42.0.1 ping statistics ---                                               
21 packets transmitted, 17 received, 19% packet loss, time 38996ms              
rtt min/avg/max/mdev = 0.373/0.389/0.447/0.023 ms         
root@localhost:~# uname -a
Linux localhost.localdomain 3.13.9 #1 SMP Tue Jul 8 09:23:06 PHT 2014 armv7l armv7l armv7l GNU/Linux
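
To check the "around the 17th packet" observation against a log like the one above, the ping runs can be summarized with a short script. This is only a sketch: the function name summarize_ping_runs and the shortened SAMPLE_LOG are made up for illustration, and the parsing assumes the iputils ping output format shown in the log.

```python
import re

# Matches the sequence number in a successful reply line.
REPLY = re.compile(r"icmp_seq=(\d+)")
ENOBUFS = "No buffer space available"


def summarize_ping_runs(log_text):
    """Return one dict per ping run: last good icmp_seq and ENOBUFS count."""
    runs = []
    current = None
    for line in log_text.splitlines():
        if line.startswith("PING "):
            # The "PING <host> ..." banner marks the start of a new run.
            current = {"last_ok_seq": 0, "enobufs": 0}
            runs.append(current)
        elif current is not None:
            match = REPLY.search(line)
            if match:
                current["last_ok_seq"] = int(match.group(1))
            elif ENOBUFS in line:
                current["enobufs"] += 1
    return runs


# Shortened, hypothetical sample in the same format as the attached log.
SAMPLE_LOG = """\
PING 10.42.0.1 (10.42.0.1) 56(84) bytes of data.
64 bytes from 10.42.0.1: icmp_seq=1 ttl=64 time=0.222 ms
64 bytes from 10.42.0.1: icmp_seq=2 ttl=64 time=0.188 ms
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
"""

if __name__ == "__main__":
    for number, run in enumerate(summarize_ping_runs(SAMPLE_LOG), start=1):
        print(f"run {number}: last reply seq={run['last_ok_seq']}, "
              f"ENOBUFS errors={run['enobufs']}")
```

Feeding it the full ping_error.txt instead of SAMPLE_LOG would give one entry per link speed tested.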


* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-08  2:20 Issue found in Armada 370: "No buffer space available" error during continuous ping Maggie Mae Roxas
@ 2014-07-08  2:27 ` Maggie Mae Roxas
  2014-07-08  8:21   ` Thomas Petazzoni
  0 siblings, 1 reply; 27+ messages in thread
From: Maggie Mae Roxas @ 2014-07-08  2:27 UTC (permalink / raw)
  To: linux-arm-kernel

Sorry, correcting typo:

- 3.13.9 (issue exists)
- 3.10.24 (issue does NOT exist)
- 3.13.5 (issue does NOT exist)



* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-08  2:27 ` Maggie Mae Roxas
@ 2014-07-08  8:21   ` Thomas Petazzoni
  2014-07-09  6:35     ` Maggie Mae Roxas
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Petazzoni @ 2014-07-08  8:21 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Maggie Mae Roxas,

On Mon, 7 Jul 2014 19:27:22 -0700, Maggie Mae Roxas wrote:
> Sorry, correcting typo:
> 
> - 3.13.9 (issue exists)
> - 3.10.24 (issue does NOT exist)
> - 3.13.5 (issue does NOT exist)

Ok, thanks again for the report. Unfortunately, you're hitting a known
problem that was fixed, but after the 3.13-stable cycle was closed.
Basically, between 3.13.5 and 3.13.9, I introduced two patches to the
network driver:

$ git slog --author=free-electrons v3.13.5..v3.13.9
396b229b683fdc08d8705883860ec5a1b810546a net: mvneta: fix usage as a module on RGMII configurations
ea64e1f33d9d627da5d38da035e5d7443276e84e net: mvneta: rename MVNETA_GMAC2_PSC_ENABLE to MVNETA_GMAC2_PCS_ENABLE

They were meant to fix the usage of mvneta on RGMII configurations.
However, by doing so, I broke SGMII configurations. So, I sent a patch
to revert "net: mvneta: fix usage as a module on RGMII configurations",
which was accepted. But in the meantime, the 3.13-stable cycle was
closed, so this revert was never merged in the 3.13.x series.

If you want to stay on 3.13.x, you should therefore apply:

cd71e246c16b30e3f396a85943d5f596202737ba Revert "net: mvneta: fix usage as a module on RGMII configurations"

This commit is from the 3.14-stable branch; in Linus' master branch,
it's:

cc6ca3023f2c2bbcd062e9d4cf6afc2ba2821ada Revert "net: mvneta: fix usage as a module on RGMII configurations"

Some other commits, merged after that, fix the usage of mvneta as a
module on both RGMII and SGMII configurations.

Any reason you're still using 3.13.x? You should really consider
switching at least to v3.14.x, which is a long-term version, and
therefore still maintained. If you use v3.14.x, all of your bug reports
that end up in patches will ultimately be fixed in v3.14.x.

Thanks again for your report!

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-08  8:21   ` Thomas Petazzoni
@ 2014-07-09  6:35     ` Maggie Mae Roxas
  2014-07-14  3:55       ` Maggie Mae Roxas
  2014-07-15 12:24       ` Thomas Petazzoni
  0 siblings, 2 replies; 27+ messages in thread
From: Maggie Mae Roxas @ 2014-07-09  6:35 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Thomas,
Good day.

As much as we'd like to switch to the latest v3.14.x, we need to stay
at kernel v3.13.9 as that is a hard requirement of our customer (I
think it's because it's the base platform for Ubuntu 14.04 FS which
we'll use).

So I applied your patch in our v3.13.9 as suggested:
http://kernel.opensuse.org/cgit/kernel/patch/?id=cd71e246c16b30e3f396a85943d5f596202737ba

Unfortunately, the issue still exists after we applied it.

> Basically, between 3.13.5 and 3.13.9, I introduced two patches to the
> network driver:

Given this, we tried replacing the mvneta.c in our v3.13.9 tree with
v3.13.5's mvneta.c.
The issue does not occur when we do that, but of course we would surely
miss some changes that way, so we wanted to confirm this further with you.
It seems that applying cd71e246c16b30e3f396a85943d5f596202737ba to
v3.13.9 is not sufficient?
Possibly some of the other differences between v3.13.5 and v3.13.9
(see attached diff) are involved, apart from
cd71e246c16b30e3f396a85943d5f596202737ba?

> Thanks again for your report!
No problem. We're also thankful for your support!

Regards,
Maggie Roxas

-------------- next part --------------
A non-text attachment was scrubbed...
Name: diff_13.5_vs_13.9.patch
Type: application/octet-stream
Size: 15452 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20140708/30d70391/attachment.obj>


* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-09  6:35     ` Maggie Mae Roxas
@ 2014-07-14  3:55       ` Maggie Mae Roxas
  2014-07-15 12:24       ` Thomas Petazzoni
  1 sibling, 0 replies; 27+ messages in thread
From: Maggie Mae Roxas @ 2014-07-14  3:55 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Thomas,
Good day.

Any update on my previous inquiry?
Thank you.

Regards,
Maggie Roxas



* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-09  6:35     ` Maggie Mae Roxas
  2014-07-14  3:55       ` Maggie Mae Roxas
@ 2014-07-15 12:24       ` Thomas Petazzoni
  2014-07-15 12:43         ` Willy Tarreau
  1 sibling, 1 reply; 27+ messages in thread
From: Thomas Petazzoni @ 2014-07-15 12:24 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Maggie Mae Roxas,

On Tue, 8 Jul 2014 23:35:36 -0700, Maggie Mae Roxas wrote:

> As much as we'd like to switch to the latest v3.14.x, we need to stay
> at kernel v3.13.9 as that is a hard requirement of our customer (I
> think it's because it's the base platform for Ubuntu 14.04 FS which
> we'll use).
> 
> So I applied your patch in our v3.13.9 as suggested:
> http://kernel.opensuse.org/cgit/kernel/patch/?id=cd71e246c16b30e3f396a85943d5f596202737ba
> 
> Unfortunately, the issue still exists after we applied it.
> 
> > Basically, between 3.13.5 and 3.13.9, I introduced two patches to the
> > network driver:
> 
> Given this, we tried replacing the mvneta.c in our v3.13.9 tree with
> v3.13.5's mvneta.c.
> The issue does not occur when we do that, but of course we would surely
> miss some changes that way, so we wanted to confirm this further with you.
> It seems that applying cd71e246c16b30e3f396a85943d5f596202737ba to
> v3.13.9 is not sufficient?
> Possibly some of the other differences between v3.13.5 and v3.13.9
> (see attached diff) are involved, apart from
> cd71e246c16b30e3f396a85943d5f596202737ba?

Hum, there are indeed more commits than I thought between 3.13.5 and
3.13.9:

396b229b683fdc08d8705883860ec5a1b810546a net: mvneta: fix usage as a module on RGMII configurations
ea64e1f33d9d627da5d38da035e5d7443276e84e net: mvneta: rename MVNETA_GMAC2_PSC_ENABLE to MVNETA_GMAC2_PCS_ENABLE
4f3a4f701b59a3e4b5c8503ac3d905c0a326f922 net: mvneta: replace Tx timer with a real interrupt
0ce58acf529bacd25dbe01298ff51a5c4d59a4f4 net: mvneta: add missing bit descriptions for interrupt masks and causes
8c2c9b1efcc4b04b4625d0613b9e74ef17016dea net: mvneta: do not schedule in mvneta_tx_timeout
92817335465090aecfde6caef9bab6923f209664 net: mvneta: use per_cpu stats to fix an SMP lock up
fbfbed33a5effba7dc6f33e3ed598f9bd31b0cdf net: mvneta: increase the 64-bit rx/tx stats out of the hot path

Willy: many of the patches in this list are yours. The patch "net:
mvneta: fix usage as a module on RGMII configurations" is known to
break SGMII configurations, but even after reverting it, Maggie Mae
reports that mvneta in 3.13.9 doesn't work, while mvneta in 3.13.5 does.
Do you see any patch missing from the TX rework that you did and that was
included between 3.13.5 and 3.13.9?

Thanks,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-15 12:24       ` Thomas Petazzoni
@ 2014-07-15 12:43         ` Willy Tarreau
  2014-07-17  5:37           ` Maggie Mae Roxas
  0 siblings, 1 reply; 27+ messages in thread
From: Willy Tarreau @ 2014-07-15 12:43 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Thomas,

On Tue, Jul 15, 2014 at 02:24:31PM +0200, Thomas Petazzoni wrote:
> Dear Maggie Mae Roxas,
> 
> On Tue, 8 Jul 2014 23:35:36 -0700, Maggie Mae Roxas wrote:
> 
> > As much as we'd like to switch to the latest v3.14.x, we need to stay
> > at kernel v3.13.9 as that is a hard requirement of our customer (I
> > think it's because it's the base platform for Ubuntu 14.04 FS which
> > we'll use).
> > 
> > So I applied your patch in our v3.13.9 as suggested:
> > http://kernel.opensuse.org/cgit/kernel/patch/?id=cd71e246c16b30e3f396a85943d5f596202737ba
> > 
> > Unfortunately, the issue still exists after we applied it.
> > 
> > > Basically, between 3.13.5 and 3.13.9, I introduced two patches to the
> > > network driver:
> > 
> > Given this, we tried replacing the mvneta.c in our v3.13.9 tree with
> > v3.13.5's mvneta.c.
> > The issue does not occur when we do that, but of course we would surely
> > miss some changes that way, so we wanted to confirm this further with you.
> > It seems that applying cd71e246c16b30e3f396a85943d5f596202737ba to
> > v3.13.9 is not sufficient?
> > Possibly some of the other differences between v3.13.5 and v3.13.9
> > (see attached diff) are involved, apart from
> > cd71e246c16b30e3f396a85943d5f596202737ba?
> 
> Hum, there are indeed more commits than I thought between 3.13.5 and
> 3.13.9:
> 
> 396b229b683fdc08d8705883860ec5a1b810546a net: mvneta: fix usage as a module on RGMII configurations
> ea64e1f33d9d627da5d38da035e5d7443276e84e net: mvneta: rename MVNETA_GMAC2_PSC_ENABLE to MVNETA_GMAC2_PCS_ENABLE
> 4f3a4f701b59a3e4b5c8503ac3d905c0a326f922 net: mvneta: replace Tx timer with a real interrupt
> 0ce58acf529bacd25dbe01298ff51a5c4d59a4f4 net: mvneta: add missing bit descriptions for interrupt masks and causes
> 8c2c9b1efcc4b04b4625d0613b9e74ef17016dea net: mvneta: do not schedule in mvneta_tx_timeout
> 92817335465090aecfde6caef9bab6923f209664 net: mvneta: use per_cpu stats to fix an SMP lock up
> fbfbed33a5effba7dc6f33e3ed598f9bd31b0cdf net: mvneta: increase the 64-bit rx/tx stats out of the hot path
> 
> Willy: many of the patches in this list are yours. The patch "net:
> mvneta: fix usage as a module on RGMII configurations" is known to
> break SGMII configurations, but even after reverting it, Maggie Mae
> reports that mvneta in 3.13.9 doesn't work, while mvneta in 3.13.5 does.
> Do you see any patch missing from the TX rework that you did and that was
> included between 3.13.5 and 3.13.9?

No, everything seems to be there. Additionally, my local development branch
was actually based on the commits above as it was rebased on v3.13.9.

Maggie, do you know if it is possible that for any reason your board
would not deliver an IRQ on Tx completion? That could explain things.
You can easily test reverting commit 4f3a4f701b just in case. If that's
the case, then the next step will be to figure out how it is possible
that IRQs are disabled!
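
The hypothesis above can be pictured with a toy model: if transmitted buffers are only released when a completion event arrives, and that event never arrives, a bounded queue fills up and further sends fail. The sketch below is not the mvneta driver. The class ToyTxRing, the helper first_failed_send and the ring size of 16 are all hypothetical; the size is picked only so that the toy fails near the packet count seen in the log, and it says nothing about the real descriptor count or about where in the stack buffers actually run out.

```python
from collections import deque


class ToyTxRing:
    """Toy model of a fixed-size transmit ring; NOT the mvneta driver."""

    def __init__(self, size, completion_irq_works):
        self.size = size
        self.completion_irq_works = completion_irq_works
        self.in_flight = deque()  # packets queued but not yet reclaimed

    def send(self, seq):
        """Queue one packet; return False when no slot is free."""
        if len(self.in_flight) >= self.size:
            return False  # stands in for ENOBUFS in this toy
        self.in_flight.append(seq)
        if self.completion_irq_works:
            # A delivered completion event frees the oldest slot.
            self.in_flight.popleft()
        return True


def first_failed_send(ring, attempts):
    """Return the first sequence number whose send fails, or None."""
    for seq in range(1, attempts + 1):
        if not ring.send(seq):
            return seq
    return None


if __name__ == "__main__":
    print("without completions:", first_failed_send(ToyTxRing(16, False), 30))
    print("with completions:   ", first_failed_send(ToyTxRing(16, True), 30))
```

With completions disabled the toy ring fails on the first send after it is full; with completions enabled it never fails, which is the behavioral difference the revert test is meant to expose.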

Regards,
Willy


* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-15 12:43         ` Willy Tarreau
@ 2014-07-17  5:37           ` Maggie Mae Roxas
  2014-07-17  8:15             ` Willy Tarreau
  0 siblings, 1 reply; 27+ messages in thread
From: Maggie Mae Roxas @ 2014-07-17  5:37 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Thomas, Willy,
Good day.

First of all, thanks again for looking further into this.

> Maggie, do you know if it is possible that for any reason your board
> would not deliver an IRQ on Tx completion? That could explain things.
I'm sorry, but I'm not really sure how to check this. All I know is
that the only difference between the working and non-working setups
(both software and hardware) is mvneta.c.
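
One rough way to see whether the interface's interrupt fires at all is to compare two snapshots of /proc/interrupts taken a few seconds apart while ping is running. The sketch below sums the per-CPU counters for lines whose description contains a given name. The name "mvneta" and the BEFORE/AFTER snapshots are assumptions for illustration; the actual label on a given board may differ. Note also that this only shows whether the interrupt line is active: it cannot tell Tx-completion causes apart from Rx ones.

```python
def irq_totals(interrupts_text, name):
    """Sum per-CPU counts for /proc/interrupts lines mentioning `name`."""
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row lists CPU0, CPU1, ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep or name not in rest:
            continue
        counts = rest.split()[:ncpu]  # first ncpu fields are the counters
        totals[label.strip()] = sum(int(c) for c in counts if c.isdigit())
    return totals


def irq_delta(before, after):
    """Return how much each IRQ counter grew between two snapshots."""
    return {irq: after[irq] - before.get(irq, 0) for irq in after}


# Hypothetical snapshots; real ones would be read from /proc/interrupts.
BEFORE = """\
           CPU0
 24:       1000  armada_370_xp_irq   8  mvneta
 41:         50  armada_370_xp_irq  41  serial
"""
AFTER = """\
           CPU0
 24:       1017  armada_370_xp_irq   8  mvneta
 41:         52  armada_370_xp_irq  41  serial
"""

if __name__ == "__main__":
    delta = irq_delta(irq_totals(BEFORE, "mvneta"), irq_totals(AFTER, "mvneta"))
    print(delta)
```

A delta that stays at zero while packets are being sent and received would point toward interrupts not being delivered; a growing delta does not rule the hypothesis out, for the reason given above.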

> You can easily test reverting commit 4f3a4f701b just in case. If that's
> the case, then the next step will be to figure out how it is possible
> that IRQs are disabled!
I'll try this one, specifically this combination:
- use 3.13.9 mvneta.c
- apply cd71e246c16b30e3f396a85943d5f596202737ba
- revert 4f3a4f701b59a3e4b5c8503ac3d905c0a326f922

I will update you with the results today, or tomorrow at the latest.

Thank you for your support as usual!

Regards,
Maggie Roxas

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-17  5:37           ` Maggie Mae Roxas
@ 2014-07-17  8:15             ` Willy Tarreau
  2014-07-21  1:57               ` Maggie Mae Roxas
  0 siblings, 1 reply; 27+ messages in thread
From: Willy Tarreau @ 2014-07-17  8:15 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Maggie,

On Wed, Jul 16, 2014 at 10:37:47PM -0700, Maggie Mae Roxas wrote:
> Hi Thomas, Willy,
> Good day.
> 
> First of all, thanks again for looking further into this.
> 
> > Maggie, do you know if it is possible that for any reason your board
> > would not deliver an IRQ on Tx completion ? That could explain things.
>
> I'm sorry, but I'm not really sure how to check this - all I know is
> that the difference in the working and not-working setup (both
> software and hardware) is the mvneta.c.

In fact I don't know if you're running your own board or a "standard"
one (a mirabox or any NAS board). Because that could also be one of the
differences between what you observe on your side and our respective
experiences with our boards.

> > You can easily test reverting commit 4f3a4f701b just in case. If that's
> > the case, then the next step will be to figure out how it is possible
> > that IRQs are disabled!
> I'll try this, specifically this combination:
> - use 3.13.9 mvneta.c
> - apply cd71e246c16b30e3f396a85943d5f596202737ba
> - revert 4f3a4f701b59a3e4b5c8503ac3d905c0a326f922
> 
> I will update you with the results today, or tomorrow at the latest.

OK fine, thank you!

Willy

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-17  8:15             ` Willy Tarreau
@ 2014-07-21  1:57               ` Maggie Mae Roxas
  2014-07-21  2:45                 ` Maggie Mae Roxas
  0 siblings, 1 reply; 27+ messages in thread
From: Maggie Mae Roxas @ 2014-07-21  1:57 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Willy, Thomas,
Good day.

> I'll try this, specifically this combination:
> - use 3.13.9 mvneta.c
> - apply cd71e246c16b30e3f396a85943d5f596202737ba
> - revert 4f3a4f701b59a3e4b5c8503ac3d905c0a326f922

This is to confirm that the "No buffer space available" issue is resolved
after I applied the above combination to v3.13.9.
Thanks a lot for the help!

BTW, sorry for the late update - our network connections and electricity
were unstable for the past week because of a typhoon, and we only got
everything working again today.

Thank you again for attending to our reports.

Regards,
Maggie Roxas

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-21  1:57               ` Maggie Mae Roxas
@ 2014-07-21  2:45                 ` Maggie Mae Roxas
  2014-07-21  5:44                   ` Willy Tarreau
  0 siblings, 1 reply; 27+ messages in thread
From: Maggie Mae Roxas @ 2014-07-21  2:45 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Willy,
Good day.

BTW, here are some answers to your questions.

> In fact I don't know if you're running your own board or a "standard"
> one (a mirabox or any NAS board).
We are using a custom board, not a standard one.
We based the design on the Armada 370 RD evaluation board, but we used a
Marvell 88E1512 as the Ethernet PHY and a Marvell 88F6707 as the processor
instead of the ones on the Armada 370 RD (I think it uses a Marvell
88E1310 as the Ethernet PHY and a Marvell 88F6W11 as the processor).

> Because that could also be one of the
> differences between what you observe on your side and our respective
> experiences with our boards.
Acknowledged.

> Maggie, do you know if it is possible that for any reason your board
> would not deliver an IRQ on Tx completion ? That could explain things.
> You can easily test reverting commit 4f3a4f701b just in case.
> If that's the case, then the next step will be to figure out how it is possible
> that IRQs are disabled!
After reverting 4f3a4f701b, as I reported, the issue no longer occurs.
Please let me know how to "figure out how it is possible that IRQs are
disabled".

Also, what is the impact if I use this combination?
> - use 3.13.9 mvneta.c
> - apply cd71e246c16b30e3f396a85943d5f596202737ba
> - revert 4f3a4f701b59a3e4b5c8503ac3d905c0a326f922

Are there functionalities that won't work?

Thank you very much for your support.

Regards,
Maggie Roxas

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-21  2:45                 ` Maggie Mae Roxas
@ 2014-07-21  5:44                   ` Willy Tarreau
  2014-07-21  6:33                     ` Maggie Mae Roxas
  0 siblings, 1 reply; 27+ messages in thread
From: Willy Tarreau @ 2014-07-21  5:44 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Maggie,

On Sun, Jul 20, 2014 at 07:45:13PM -0700, Maggie Mae Roxas wrote:
> Hi Willy,
> Good day.
> 
> BTW, here are some answers to your questions.
> 
> > In fact I don't know if you're running your own board or a "standard"
> > one (a mirabox or any NAS board).
> We are using a "customized" one, not a "standard" one.
> We based the design on Armada 370 RD Evaluation Board, but we used
> Marvell 88E1512 as Ethernet PHY and Marvell 88F6707 as processor
> instead of the ones in the Armada 370 RD (I think it uses Marvell
> 88E1310 as Ethernet PHY and Marvell 88F6W11 as processor).

OK. For information, the mirabox uses a 88F6707 and a 88E1510 for
the phy. So it's very close to what you have.

> > Maggie, do you know if it is possible that for any reason your board
> > would not deliver an IRQ on Tx completion ? That could explain things.
> > You can easily test reverting commit 4f3a4f701b just in case.
> > If that's the case, then the next step will be to figure out how it is possible
> > that IRQs are disabled!
> After reverting 4f3a4f701b, as I reported, issue does not happen anymore.

As you said that you both applied cd71e2 and reverted 4f3a4f, could you
please confirm that applying cd71e2 alone was not enough? I'm
finding it really strange, because as you use the same CPU as the
mirabox, I'm seeing no reason why the IRQ wouldn't work, and since
you're using a slightly different phy from us, the first patch which
changes the RGMII configuration (cd71e2) would be a more likely
candidate.

> Please let me know how to "figure out how it is possible that IRQs are
> disabled".

Checking /proc/interrupts when you're sending some traffic should show
that the IRQ is increasing from time to time.
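
A minimal sketch of that check, assuming the driver's interrupt shows up
in /proc/interrupts on a line containing "mvneta". The sample line layout
in the comments and the helper names irq_counts, irq_delta and snapshot
are illustrative assumptions, not taken from this thread. It sums the
per-CPU counters of each matching line and reports how much they grew
between two snapshots:

```python
def irq_counts(text, name="mvneta"):
    """Return {irq_label: total_count} for lines that mention `name`.

    Assumed line layout (illustrative):
        " 24:     500000  some_irq_chip   8  mvneta"
    i.e. "<irq>:" followed by one counter per CPU, then descriptive fields.
    """
    counts = {}
    for line in text.splitlines():
        if name not in line or ":" not in line:
            continue
        label, _, rest = line.partition(":")
        total = 0
        for field in rest.split():
            # Per-CPU counters come first; stop at the first non-numeric field.
            if not field.isdigit():
                break
            total += int(field)
        counts[label.strip()] = total
    return counts


def irq_delta(before, after, name="mvneta"):
    """Return {irq_label: increase} between two /proc/interrupts snapshots."""
    old = irq_counts(before, name)
    new = irq_counts(after, name)
    return {irq: new[irq] - old.get(irq, 0) for irq in new}


def snapshot(path="/proc/interrupts"):
    """Read the current interrupt table as text."""
    with open(path) as f:
        return f.read()


# Typical use on the board (not executed here):
#   before = snapshot()
#   ... send traffic, e.g. ping or iperf ...
#   after = snapshot()
#   print(irq_delta(before, after))
```

A delta that stays at zero while traffic is flowing would suggest that
the interrupt is not being delivered.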

> Also, what is the impact if I use this combination?

First you're not using a mainline kernel which means that you'll always
be bothered. Second, removing support for the Tx IRQ means that your
Tx traffic can become very slow (typically 134 Mbps instead of 987 for
unidirectional traffic), which can be a problem if your board is used
as a router for example. If you're building a NAS, you'll have less
impact. Third, considering that other boards work without applying
these changes, it might be possible that there's an issue on your
board, and maybe detecting it early would allow you to fix it for all
future batches, and maybe only apply these patches for the few very
first ones.

Regards,
Willy

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-21  5:44                   ` Willy Tarreau
@ 2014-07-21  6:33                     ` Maggie Mae Roxas
  2014-07-21  7:03                       ` Willy Tarreau
  0 siblings, 1 reply; 27+ messages in thread
From: Maggie Mae Roxas @ 2014-07-21  6:33 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Willy,
Good day.

> OK. For information, the mirabox uses a 88F6707 and a 88E1510 for
> the phy.
> So it's very close to what you have.
Noted.

> As you said that you both applied cd71e2 and reverted 4f3a4f, could you
> please confirm that with cd71 applied only it was not enough?
Yes.
With v3.13.9 mvneta.c + cd71e2, the issue still occurs.
With v3.13.9 mvneta.c + cd71e2 - 4f3a4f, the issue no longer occurs.

> I'm finding it really strange, because as you use the same CPU as the
> mirabox,
> I'm seeing no reason why the IRQ wouldn't work, and since
> you're using a slightly different phy from us, the first patch which
> changes the RGMII configuration (cd71e2) would be a more likely
> candidate.

Okay. First, I'll check whether the interrupts are working, as you
suggested:
> Checking /proc/interrupts when you're sending some traffic should show
> that the IRQ is increasing from time to time.
I'll inform you of the results within the next 2-3 days.

> First you're not using a mainline kernel which means that you'll always
> be bothered.
> Second, removing support for the Tx IRQ means that your Tx traffic can
> become very slow (typically 134 Mbps instead of 987 for unidirectional
> traffic), which can be a problem if your board is used as a router for
> example.
> If you're building a NAS, you'll have less impact.
We'll be using it as a router, so it would really be a problem for us.
We will check with our customer whether we can move to v3.14+,
especially if we find the Ethernet performance problems you mentioned.
Do you have any recommendation on which version to use, specifically?

> Third, considering that other boards work without applying these
> changes, it might be possible that there's an issue on your board, and
> maybe detecting it early would allow you to fix it for all future
> batches, and maybe only apply these patches for the few very first ones.
Acknowledged.
Once we have verified that performance is indeed slower (or that the
interrupt count is not increasing), we will inform our hardware team and
have them investigate this issue further for possible hardware bugs.

Thanks a lot for the help again, I'll let you know as soon as I have more info.

Regards,
Maggie Roxas

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-21  6:33                     ` Maggie Mae Roxas
@ 2014-07-21  7:03                       ` Willy Tarreau
  2014-07-23  2:24                         ` Maggie Mae Roxas
  0 siblings, 1 reply; 27+ messages in thread
From: Willy Tarreau @ 2014-07-21  7:03 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Maggie,

On Sun, Jul 20, 2014 at 11:33:22PM -0700, Maggie Mae Roxas wrote:
> > As you said that you both applied cd71e2 and reverted 4f3a4f, could you
> > please confirm that with cd71 applied only it was not enough?
> Yes.
> If mvneta.c used is the v3.13.9 + cd71e2, issue still occurs.
> If mvneta.c used is the v3.13.9 + cd71e2 - 4f3a4f, issue does not occur anymore.

Rather strange then.

> Okay. First, I'll check if the interrupts are working by checking
> this, as you suggested:
> <snip>
> Checking /proc/interrupts when you're sending some traffic should show
> that the IRQ is increasing from time to time.
> <snip>
> I'll inform you the results within the next 2-3 days.

OK.

> We'll be using it as a router, thus, it would really be a problem for us.

OK so clearly the issue must be found.
Just thinking about something, do you have a custom boot loader ? It
would be possible that in our case, the Tx IRQ works only because some
obscure or undocumented bits are set by the boot loader and that in your
case it's not pre-initialized.

> Will check possibilities of shifting to v3.14+ with our customer -
> especially if we found problems in ethernet performance as you
> mentioned.
> Any recommendations on which version to use, specifically?

3.14 would probably even interest your customer as it's an LTS version.
In this case, always pick the most recent one (3.14.12 today). You may
even be interested in 3.15.6 which contains another phy fix supposed to
fix cd71e2, but if you're saying that it doesn't change anything for you
I guess it will have no effect (might be worth testing for the purpose of
helping troubleshooting though).

> > Third, considering that other boards work without applying these changes, it might be possible that there's an issue on your board, and maybe detecting it early would allow you to fix it for all future batches, and maybe only apply these patches for the few very first ones.
> Acknowledged.
> Once we verified that indeed, the performance was slower (or
> interrupts were not increasing) - we will inform our hardware team and
> have them investigate this issue further for possible hardware bugs.

OK. I still have a hard time imagining how hardware itself could prevent
an IRQ from being delivered from a NIC which is located inside the SoC,
but there must be an explanation somewhere :-/

> Thanks a lot for the help again, I'll let you know as soon as I have more info.

Thanks,
Willy

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-21  7:03                       ` Willy Tarreau
@ 2014-07-23  2:24                         ` Maggie Mae Roxas
  2014-07-23  6:16                           ` Willy Tarreau
  0 siblings, 1 reply; 27+ messages in thread
From: Maggie Mae Roxas @ 2014-07-23  2:24 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Willy,
Good day.

> OK so clearly the issue must be found.

Actually we have 2 products using Armada 370.
One has only 1 ethernet port, so it is expected to act as Client only.
The other one has 2 ethernet ports, so it's more router-like.

For the product with one port, we have checked the combination patch
and it seems the Tx IRQ count is increasing, so it's OK. We checked this
via /proc/interrupts: mvneta's value there changed from 500000+ to
around 900000+ after we performed a 10-iteration iperf run to the server.
The throughput is also OK: we're getting around 850 Mbit/s on a
1 Gbit/s connection, which is roughly the same as what we were getting
when we were still using 3.10.x (even 3.2.x).

As for the other product with two ports, we do expect that we might be
encountering the slow performance you mentioned.
But we are not focusing on this project yet so once it's active again,
I'll let you know.

> Just thinking about something, do you have a custom boot loader ?
> It would be possible that in our case, the Tx IRQ works only because some
> obscure or undocumented bits are set by the boot loader and that in your
> case it's not pre-initialized.

We are indeed using a "custom" boot loader.
We are using Marvell u-boot 2014_T1.1 (latest QA release, I think).
We applied some patches to memory (since we have 1Gb DDR), some bits
and pieces for the interfaces we're going to support and not to
support, and of course our own environment variables.
As for the DDR memory/register patches, they came directly from our
Marvell contact.

But with what I mentioned above, I think our Tx interrupt is working...?

BTW, for both products we've designed from the Armada 370 RD, we didn't
use a switch. So we removed all switch-related code in the boot
loader.
I'm not sure whether not having a switch affects the behavior?

How about you? May I know what boot loader you are using?

> LTS would probably even interest your customer as it's an LTS version.
> In this case, always pick the most recent one (3.14.12 today). You may
> even be interested in 3.15.6 which contains another phy fix supposed to
> fix cd71e2, but if you're saying that it doesn't change anything for you
> I guess it will have no effect (might be worth testing for the purpose of
> helping troubleshooting though).

Thank you for this advice, we'll take note of it.
We plan to stick to LTS releases from now on, as much as possible.

> OK. I still have a hard time imagining how hardware itself could prevent
> an IRQ from being delivered from a NIC which is located inside the SoC,
> but there must be an explanation somewhere :-/
I also would like to know how. :-/
But maybe it's our difference in boot loader as you speculated.

In any case, thanks a lot again for your assistance!

Regards,
Maggie Roxas

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-23  2:24                         ` Maggie Mae Roxas
@ 2014-07-23  6:16                           ` Willy Tarreau
  2014-07-24  7:24                             ` Maggie Mae Roxas
  0 siblings, 1 reply; 27+ messages in thread
From: Willy Tarreau @ 2014-07-23  6:16 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Maggie,

On Tue, Jul 22, 2014 at 07:24:35PM -0700, Maggie Mae Roxas wrote:
> Hi Willy,
> Good day.
> 
> > OK so clearly the issue must be found.
> 
> Actually we have 2 products using Armada 370.
> One has only 1 ethernet port, so it is expected to act as Client only.
> The other one has 2 ethernet ports, so it's more router-like.
> 
> For the product with one port, we have checked the combination patch
> and it seems like Tx IRQ is increasing so it's OK. We checked this via
> /proc/interrupts and mvneta's value there changed from 500000+ to
> around 900000+ after we perform a 10-iteration iperf to the server.
> The throughput is also OK, we're getting around 850Mbits when we use a
> 1Gbit connection, which is roughly just the same as what we've been
> experiencing when we're still using 3.10.x (even 3.2.x).

OK.

> As for the other product with two ports, we do expect that we might be
> encountering the slow performance you mentioned.
> But we are not focusing on this project yet so once it's active again,
> I'll let you know.
> 
> > Just thinking about something, do you have a custom boot loader ?
> > It would be possible that in our case, the Tx IRQ works only because some
> > obscure or undocumented bits are set by the boot loader and that in your
> > case it's not pre-initialized.
> 
> We are indeed using a "custom" boot loader.
> We are using Marvell u-boot 2014_T1.1 (latest QA release, I think).
> We applied some patches to memory (since we have 1Gb DDR), some bits
> and pieces for the interfaces we're going to support and not to
> support, and of course our own environment variables.
> As for the DDR memory/register patches, they came directly from our
> Marvell contact.
> 
> But with what I mentioned above, I think our Tx interrupt is working...?

Yes, seems so.

> BTW, for both products we've designed from Armada 370 RD, we didn't
> use a switch. So we removed all switch-related codes in the boot
> loader.
> I'm not sure if not having switch affects the behavior?

I have no idea, I remember that this code is deeply buried in the
original neta code. There was also a large amount of code for the network
classifier and something like buffer management in the original
Marvell driver if my memory serves me correctly, I have no idea
if these ones set up anything special.

> How about you? May I know what boot loader you are using?

Just the original ones. I have a mirabox with its original boot loader :

    U-Boot 2009.08 (Sep 16 2012 - 22:50:06)Marvell version: 1.1.2 NQ
    U-Boot Addressing:
           Code:            00600000:006AFFF0
           BSS:             006F8E40
           Stack:           0x5fff70
           PageTable:       0x8e0000
           Heap address:    0x900000:0xe00000
    Board: DB-88F6710-BP
    SoC:   MV6710 A1
    CPU:   Marvell PJ4B v7 UP (Rev 1) LE
           CPU @ 1200Mhz, L2 @ 600Mhz
           DDR @ 600Mhz, TClock @ 200Mhz
           DDR 16Bit Width, FastPath Memory Access
    PEX 0: Detected No Link.
    PEX 1: Root Complex Interface, Detected Link X1
    DRAM:   1 GB
           CS 0: base 0x00000000 size 512 MB
           CS 1: base 0x20000000 size 512 MB
           Addresses 14M - 0M are saved for the U-Boot usage.
    NAND:  1024 MiB
    Bad block table found at page 262016, version 0x01
    Bad block table found at page 261888, version 0x01
    FPU not initialized
    USB 0: Host Mode
    USB 1: Host Mode
    Modules/Interfaces Detected:
           RGMII0 Phy
           RGMII1 Phy
           PEX0 (Lane 0)
           PEX1 (Lane 1)
    phy16= 72 
    phy16= 72 
    MMC:   MRVL_MMC: 0
    Net:   egiga0 [PRIME], egiga1
    Hit any key to stop autoboot:  0 

> > LTS would probably even interest your customer as it's an LTS version.
> > In this case, always pick the most recent one (3.14.12 today). You may
> > even be interested in 3.15.6 which contains another phy fix supposed to
> > fix cd71e2, but if you're saying that it doesn't change anything for you
> > I guess it will have no effect (might be worth testing for the purpose of
> > helping troubleshooting though).
> 
> Thank you for this advice, we'll take note of it.
> We plan to stick to LTS releases from now on, as much as possible.
> 
> > OK. I still have a hard time imagining how hardware itself could prevent
> > an IRQ from being delivered from a NIC which is located inside the SoC,
> > but there must be an explanation somewhere :-/
> I also would like to know how. :-/
> But maybe it's our difference in boot loader as you speculated.

I think we could try to dump all of our respective mvneta registers and
compare them, though I have very little time for this today. And if it
comes from extra SoC functions like buffer management or network classifier,
I have no idea how they work nor what to dump :-/

Regards,
Willy

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-23  6:16                           ` Willy Tarreau
@ 2014-07-24  7:24                             ` Maggie Mae Roxas
  2014-12-01  6:35                               ` Maggie Mae Roxas
       [not found]                               ` <CAB8gEUtgo-8nets3tRtqiZ8qRx+SyCq2d8v05scavWNwE5TNXg@mail.gmail.com>
  0 siblings, 2 replies; 27+ messages in thread
From: Maggie Mae Roxas @ 2014-07-24  7:24 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Willy,
Good day.

> I have no idea, I remember that this code is deeply buried into the
> original neta code. There was also a large code for the network
> classifier and something like buffer management in the original
> Marvell's driver if my memory serves me correctly, I have no idea
> if these ones set up anything special.
I also have no idea on this.

> Just the original ones. I have a mirabox with its original boot loader :
> U-Boot 2009.08 (Sep 16 2012 - 22:50:06)Marvell version: 1.1.2 NQ
This, I think, is a 2012_Q4 (QA) release if it's based on 2009.08.
We use a 2014_T1 (QA) release based on 2011.12.
# So we're using different boot loaders.

> I think we could try to dump all of our respective mvneta registers and
> compare them, though I have very little time for this today. And if it
> comes from extra SoC functions like buffer management or network classifier,
> I have no idea how they work nor what to dump :-/
Yes, I think the difference in boot loaders, especially in the mvneta
registers' values, accounts for our difference in behavior.

And indeed, it would take much of our time if we need to compare all
mvneta register dumps, and we'd need to forward those values directly
to a Marvell contact for more effective evaluation since it's more of
a hardware matter. :-/

In any case, since performance and Tx interrupts were both still OK on
our side, I think it's enough.
We'll just inform you if we notice some irregularities.

Thank you very much for discussing this issue with us! :-)

Until our future discussions,
Maggie Roxas


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-07-24  7:24                             ` Maggie Mae Roxas
@ 2014-12-01  6:35                               ` Maggie Mae Roxas
       [not found]                               ` <CAB8gEUtgo-8nets3tRtqiZ8qRx+SyCq2d8v05scavWNwE5TNXg@mail.gmail.com>
  1 sibling, 0 replies; 27+ messages in thread
From: Maggie Mae Roxas @ 2014-12-01  6:35 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Willy, Thomas.
Good day.

I am reopening this discussion because we found an unusual behavior
after using this combination that we thought was OK as discussed in
the previous messages of this thread:

> - use 3.13.9 mvneta.c
> - apply cd71e246c16b30e3f396a85943d5f596202737ba
> - revert 4f3a4f701b59a3e4b5c8503ac3d905c0a326f922

Specifically, if we apply above, the "No buffer space available" error
during continuous ping does NOT occur anymore.
# Link to logs: https://app.box.com/s/hq04v0c0j896em9s7nql

However, after further continuous testing, we encountered the following issues:
1. Low throughput during iperf when the Armada 370 device acts as the
iperf client. For example, on a 1000Mbits/s link, we get less than
140Mbits/s. Low throughput does NOT occur when the device acts as the
iperf server instead.
# Link to logs: https://app.box.com/s/09xes3pl2bud38hcftbq
2. At some point the kernel crashes and hangs, and the device needs to
be rebooted.
# Link to logs: https://app.box.com/s/kahj81rnue6wygs7ylvd

If we only use bare 3.13.9 (without patches), we do NOT encounter (1)
and (2) above, but we have the "No buffer space available" issue back
again.
# Link to logs: https://app.box.com/s/x2dkxomhbqor9tzun5b3

If we use bare 3.13.5 (without patches), we also encounter (1) and (2)
above, but do NOT encounter "No buffer space available" issue.
# Same behavior as 3.13.9 with patches.

Again, we're using a "customized" board with Armada 370, not a
"standard" one.
We based the design on Armada 370 RD Evaluation Board, but we used
Marvell 88E1512 as Ethernet PHY and Marvell 88F6707 as processor
instead of the ones in the Armada 370 RD (I think it uses Marvell
88E1310 as Ethernet PHY and Marvell 88F6W11 as processor).

Please advise what could be going wrong.
Basically, we have to fix the "No buffer space available", low
throughput and kernel crash tendency while sticking to Linux Kernel
3.13.x.
We will provide the info you need or do whatever you advise us to so
that we can hopefully debug this with you.
# Our suppliers already created 3.13.x-compatible drivers for the
modules we placed on our board, so we can't move to 3.14.x (otherwise
we'd have to ask our suppliers to re-create the drivers for 3.14.x,
which would cost manpower and time that we can't afford because the
market release of this product is near).

Hoping for your consideration and assistance.

Thank you very much.

Regards,
Maggie Roxas


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Issue found in Armada 370: "No buffer space available" error during continuous ping
       [not found]                               ` <CAB8gEUtgo-8nets3tRtqiZ8qRx+SyCq2d8v05scavWNwE5TNXg@mail.gmail.com>
@ 2014-12-01  7:28                                 ` Willy Tarreau
  2014-12-01  8:27                                   ` Maggie Mae Roxas
  0 siblings, 1 reply; 27+ messages in thread
From: Willy Tarreau @ 2014-12-01  7:28 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Maggie,

On Mon, Dec 01, 2014 at 02:26:49PM +0800, Maggie Mae Roxas wrote:
> Hi Willy, Thomas.
> Good day.
> 
> I am reopening this discussion because we found an unusual behavior
> after using this combination that we thought was OK as discussed in
> the previous messages of this thread:
> 
> > - use 3.13.9 mvneta.c
> > - apply cd71e246c16b30e3f396a85943d5f596202737ba
> > - revert 4f3a4f701b59a3e4b5c8503ac3d905c0a326f922
> 
> Specifically, if we apply above, the "No buffer space available" error
> during continuous ping does NOT occur anymore.
> # Attached: with_patch_3_13_9_no_buffer_space_solved.txt
> 
> However, after further continuous testing, we encountered the following issues:
> 1. Low throughput during iperf when Armada 370 device is set as iperf
> client. For example, in 1000Mbits/s, we only get below 140Mbits/s.

Yes that was the intent of the original fix.

We recently diagnosed the issue related to "no buffer space available".
What happens is that the "ping" utility uses a very small socket buffer.
It sends a few packets, and the NIC doesn't send interrupts until the
TX interrupt count is reached, so the Tx skbs are not freed and the
socket buffers remain full.

The only solution at the moment is to make the NIC emit an IRQ for each
Tx packet. I'm still trying to find a better way to do this (either find
a way to make the NIC emit an IRQ once the Tx queue is empty or adjust
the IRQ delay when adding more packets, though it creates a race condition).

In the meantime you can apply the attached patch. I haven't submitted it
yet, only for lack of time :-(

Best regards,
Willy

-------------- next part --------------
From 01b23da3607dbce1d1abfe5b7f092de11ae327cf Mon Sep 17 00:00:00 2001
From: Willy Tarreau <w@1wt.eu>
Date: Sat, 25 Oct 2014 19:12:49 +0200
Subject: net: mvneta: fix TX coalesce interrupt mode

The mvneta driver sets the amount of Tx coalesce packets to 16 by
default. Normally that does not cause any trouble since the driver
uses a much larger Tx ring size (532 packets). But some sockets
might run with very small buffers, much smaller than the equivalent
of 16 packets. This is what ping is doing for example, by setting
SNDBUF to 324 bytes rounded up to 2kB by the kernel.

The problem is that there is no documented method to force a specific
packet to emit an interrupt (eg: the last of the ring) nor is it
possible to make the NIC emit an interrupt after a given delay.

In this case, it causes trouble, because when ping sends packets over
its raw socket, the few first packets leave the system, and the first
15 packets will be emitted without an IRQ being generated, so without
the skbs being freed. And since the socket's buffer is small, there's
no way to reach that amount of packets, and the ping ends up with
"send: no buffer available" after sending 6 packets. Running with 3
instances of ping in parallel is enough to hide the problem, because
with 6 packets per instance, that's 18 packets total, which is enough
to grant a Tx interrupt before all are sent.

The original driver in the LSP kernel worked around this design flaw
by using a software timer to clean up the Tx descriptors. This timer
was slow and caused terrible network performance on some Tx-bound
workloads (such as routing) but was enough to make tools like ping
work correctly.

Instead here, we simply set the packet counts before interrupt to 1.
This ensures that each packet sent will produce an interrupt. NAPI
takes care of coalescing interrupts since the interrupt is disabled
once generated.

No measurable performance impact nor CPU usage were observed on small
nor large packets, including when saturating the link on Tx, and this
fixes tools like ping which rely on too small a send buffer.

This fix needs to be backported to stable kernels starting with 3.10.

Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/ethernet/marvell/mvneta.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 4762994..35bfba7 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -214,7 +214,7 @@
 /* Various constants */
 
 /* Coalescing */
-#define MVNETA_TXDONE_COAL_PKTS		16
+#define MVNETA_TXDONE_COAL_PKTS		1
 #define MVNETA_RX_COAL_PKTS		32
 #define MVNETA_RX_COAL_USEC		100
 
-- 
1.7.12.2.21.g234cd45.dirty

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-12-01  7:28                                 ` Willy Tarreau
@ 2014-12-01  8:27                                   ` Maggie Mae Roxas
  2014-12-01  9:28                                     ` Willy Tarreau
  0 siblings, 1 reply; 27+ messages in thread
From: Maggie Mae Roxas @ 2014-12-01  8:27 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Willy,
Good day.

Thank you for the quick feedback.

> In the mean time you can apply the attached patch. I haven't submitted it yet only by lack of time :-(
Should we apply this patch on:
1. 3.13.9? or
2. 3.13.9 with cd71e246c16b30e3f396a85943d5f596202737ba and reverted
4f3a4f701b59a3e4b5c8503ac3d905c0a326f922?

Is this patch expected to resolve the low throughput and the kernel crash as well?
# And not just the "No buffer space available" error?

Thank you.

Regards,
Maggie Roxas


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-12-01  8:27                                   ` Maggie Mae Roxas
@ 2014-12-01  9:28                                     ` Willy Tarreau
  2014-12-01  9:32                                       ` Thomas Petazzoni
  0 siblings, 1 reply; 27+ messages in thread
From: Willy Tarreau @ 2014-12-01  9:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 01, 2014 at 04:27:46PM +0800, Maggie Mae Roxas wrote:
> Hi Willy,
> Good day.
> 
> Thank you for the quick feedback.
> 
> > In the mean time you can apply the attached patch. I haven't submitted it yet only by lack of time :-(
> Should we apply this patch on:
> 1. 3.13.9? or
> 2. 3.13.9 with cd71e246c16b30e3f396a85943d5f596202737ba and reverted
> 4f3a4f701b59a3e4b5c8503ac3d905c0a326f922?
> 
> This patch is expected resolve the low throughput and the kernel crash as well?
> # Not just the "No buffer space available" error?

Yes absolutely. The low throughput is caused by the use of a timer instead
of an interrupt to flush Tx descriptors. The "No buffer space available"
is caused by the Tx coalesce of 16 which only flushes the buffers after 16
packets have been emitted. When socket buffers are too small for 16 packets
(eg: ping) you get the error above. Thus setting Tx coalesce to 1 fixes it
for all situations. It's slightly less performant than coalesce 16 but you
can change it using ethtool if you want (4 still works with ping and shows
better performance).

Best regards,
Willy

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-12-01  9:28                                     ` Willy Tarreau
@ 2014-12-01  9:32                                       ` Thomas Petazzoni
  2014-12-01  9:58                                         ` Willy Tarreau
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Petazzoni @ 2014-12-01  9:32 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Willy Tarreau,

On Mon, 1 Dec 2014 10:28:51 +0100, Willy Tarreau wrote:

> > This patch is expected resolve the low throughput and the kernel crash as well?
> > # Not just the "No buffer space available" error?
> 
> Yes absolutely. The low throughput is caused by the use of a timer instead
> of an interrupt to flush Tx descriptors. The "No buffer space available"
> is caused by the Tx coalesce of 16 which only flushes the buffers after 16
> packets have been emitted. When socket buffers are too small for 16 packets
> (eg: ping) you get the error above. Thus setting Tx coalesce to 1 fixes it
> for all situations. It's slightly less performant than coalesce 16 but you
> can change it using ethtool if you want (4 still works with ping and shows
> better performance).

If I understood correctly, on RX the interrupt coalescing can be done
every X packets, or after N milliseconds. However, on TX, it's only
after Y packets, there is no way to configure a delay.

But in any case, with NAPI implemented in software, are these hardware
interrupt coalescing features very important? As soon as the number of
interrupts becomes high, the kernel will disable the interrupt and
switch to polling, no?

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-12-01  9:32                                       ` Thomas Petazzoni
@ 2014-12-01  9:58                                         ` Willy Tarreau
  2014-12-01 10:15                                           ` Maggie Mae Roxas
  0 siblings, 1 reply; 27+ messages in thread
From: Willy Tarreau @ 2014-12-01  9:58 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Thomas,

On Mon, Dec 01, 2014 at 10:32:21AM +0100, Thomas Petazzoni wrote:
> If I understood correctly, on RX the interrupt coalescing can be done
> every X packets, or after N milliseconds. However, on TX, it's only
> after Y packets, there is no way to configure a delay.

That was my understanding of the datasheet as well.

> But in any case, with NAPI implemented in software, are these hardware
> interrupt coalescing features very important? As soon as the number of
> interrupts becomes high, the kernel will disable the interrupt and
> switch to polling, no?

Absolutely, but despite this, the interrupts still impact the system's
performance, because instead of waking up the driver once the Tx queue
is about to be empty (let's say twice per Tx queue), we wake up as often
as the system supports it. The net effect is a performance loss of about
5% on small packets, which is not huge of course, but would rather be
spent doing some more useful stuff.

Uri gave me a contact at Marvell who knows this device well, I'll ask
him if there's no other way to work with this chip. Sending an interrupt
at least when the Tx queue is empty would be nice :-/

Best regards,
Willy

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-12-01  9:58                                         ` Willy Tarreau
@ 2014-12-01 10:15                                           ` Maggie Mae Roxas
  2014-12-02  4:09                                             ` Maggie Mae Roxas
  0 siblings, 1 reply; 27+ messages in thread
From: Maggie Mae Roxas @ 2014-12-01 10:15 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Thomas, Willy.
Good day.

Thank you for your assistance as always.

We are currently testing 3.13.9 with the attached patch applied.
We'll let you know the results as soon as we've conducted enough tests,
tomorrow at the latest.

Will keep you posted.

Regards,
Maggie Roxas




* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-12-01 10:15                                           ` Maggie Mae Roxas
@ 2014-12-02  4:09                                             ` Maggie Mae Roxas
  2014-12-02  6:56                                               ` Willy Tarreau
  0 siblings, 1 reply; 27+ messages in thread
From: Maggie Mae Roxas @ 2014-12-02  4:09 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Willy, Thomas,
Good day.

This is just to confirm that the patch Willy sent yesterday, applied on
its own to 3.13.9, resolves these issues:
- "No buffer space available" around the 17th packet
- Low throughput with iperf as client (i.e., ~130 Mbits/s on a 1000 Mbits/s link)
- Kernel crash

We confirmed it after quite some testing.
# Full logs at:
# as iperf client: https://app.box.com/s/5pzh753dmmnqt4q8s5iy
# as iperf server: https://app.box.com/s/t9ohxrcphpmnfer9dtj9

Thank you very much for your assistance again!!! :)

Regards,
Maggie Roxas



* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-12-02  4:09                                             ` Maggie Mae Roxas
@ 2014-12-02  6:56                                               ` Willy Tarreau
  2014-12-02  7:04                                                 ` Maggie Mae Roxas
  0 siblings, 1 reply; 27+ messages in thread
From: Willy Tarreau @ 2014-12-02  6:56 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Maggie,

On Tue, Dec 02, 2014 at 12:09:05PM +0800, Maggie Mae Roxas wrote:
> Hi Willy, Thomas,
> Good day.
> 
> This is just to confirm that the patch Willy sent yesterday, patched
> to 3.13.9 alone, resolves these issues:
> - "No buffer space available" around 17th packet
> - Low throughput on iperf as client (ie, ~130 Mbits/s on 1000 Mbits/s)
> - Kernel crash
> 
> We confirmed it after quite some testing.
> # Full logs at:
> # as iperf client: https://app.box.com/s/5pzh753dmmnqt4q8s5iy
> # as iperf server: https://app.box.com/s/t9ohxrcphpmnfer9dtj9
> 
> Thank you very much for your assistance again!!! :)

Thanks for your valuable feedback. I'll add your Tested-by and will
send it to netdev for inclusion before I switch to something else
and forget again!

Willy


* Issue found in Armada 370: "No buffer space available" error during continuous ping
  2014-12-02  6:56                                               ` Willy Tarreau
@ 2014-12-02  7:04                                                 ` Maggie Mae Roxas
  0 siblings, 0 replies; 27+ messages in thread
From: Maggie Mae Roxas @ 2014-12-02  7:04 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Willy,
Good day.

No problem. Thanks again!

Regards,
Maggie Roxas



Thread overview: 27+ messages
2014-07-08  2:20 Issue found in Armada 370: "No buffer space available" error during continuous ping Maggie Mae Roxas
2014-07-08  2:27 ` Maggie Mae Roxas
2014-07-08  8:21   ` Thomas Petazzoni
2014-07-09  6:35     ` Maggie Mae Roxas
2014-07-14  3:55       ` Maggie Mae Roxas
2014-07-15 12:24       ` Thomas Petazzoni
2014-07-15 12:43         ` Willy Tarreau
2014-07-17  5:37           ` Maggie Mae Roxas
2014-07-17  8:15             ` Willy Tarreau
2014-07-21  1:57               ` Maggie Mae Roxas
2014-07-21  2:45                 ` Maggie Mae Roxas
2014-07-21  5:44                   ` Willy Tarreau
2014-07-21  6:33                     ` Maggie Mae Roxas
2014-07-21  7:03                       ` Willy Tarreau
2014-07-23  2:24                         ` Maggie Mae Roxas
2014-07-23  6:16                           ` Willy Tarreau
2014-07-24  7:24                             ` Maggie Mae Roxas
2014-12-01  6:35                               ` Maggie Mae Roxas
     [not found]                               ` <CAB8gEUtgo-8nets3tRtqiZ8qRx+SyCq2d8v05scavWNwE5TNXg@mail.gmail.com>
2014-12-01  7:28                                 ` Willy Tarreau
2014-12-01  8:27                                   ` Maggie Mae Roxas
2014-12-01  9:28                                     ` Willy Tarreau
2014-12-01  9:32                                       ` Thomas Petazzoni
2014-12-01  9:58                                         ` Willy Tarreau
2014-12-01 10:15                                           ` Maggie Mae Roxas
2014-12-02  4:09                                             ` Maggie Mae Roxas
2014-12-02  6:56                                               ` Willy Tarreau
2014-12-02  7:04                                                 ` Maggie Mae Roxas
