* Network performance - iperf
@ 2010-03-29 11:33 Michal Simek
  2010-03-29 12:16 ` Eric Dumazet
  ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread

From: Michal Simek @ 2010-03-29 11:33 UTC (permalink / raw)
To: LKML
Cc: John Williams, netdev, Grant Likely, John Linn, Steven J. Magnani, Arnd Bergmann, akpm

Hi All,

I am running several network benchmarks on a MicroBlaze CPU with MMU, and I am seeing one weird issue that I would like to understand. I am using the same hardware design and the same Linux kernel; the only change I made is the memory size (in the DTS).

32MB:  18.3Mb/s
64MB:  15.2Mb/s
128MB: 10.6Mb/s
256MB:  3.8Mb/s

There is a huge difference between the systems with 32MB and 256MB of RAM. I am running iperf TCP tests with these commands.

On x86: iperf -c 192.168.0.105 -i 5 -t 50
On microblaze: iperf -s

I looked at PTE misses, which are the same on all configurations, which means the number of do_page_fault exceptions is also the same everywhere. I then added some hooks to the low-level kernel code to count TLB misses, and there is a big difference in the number of misses between the 256MB and 32MB systems. I measured two kernel configurations: the first column is a kernel with asm-optimized memcpy/memmove functions, the second is without that optimization. (The kernel with asm-optimized lib functions is 30% faster than the one without.)

32MB:    12703     13641
64MB:  1021750    655644
128MB: 1031644    531879
256MB: 1011322    430027

Most of them are data TLB misses. The MicroBlaze MMU doesn't use any LRU mechanism to pick a TLB victim; instead there is a naive replacement strategy based on an incrementing counter. Two TLB entries are reserved for the kernel itself and are never replaced, which is why we can use "only" 62 of the 64 TLB entries.

I am using two LL_TEMAC interfaces, whose driver uses DMA, and I observe the same results on both, which is why I think the problem is in the kernel itself. It could be connected with memory management or with cache behavior.

Have you ever seen this system behavior?
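The incrementing-counter replacement described above can be modeled in a few lines of plain C. This is a hypothetical sketch for illustration, not the actual MicroBlaze low-level code: the victim index simply cycles through the 62 replaceable slots, so recency of use plays no role and a working set larger than 62 pages causes continuous eviction — consistent with the miss explosion seen once more RAM is mapped.

```c
#include <stddef.h>

#define TLB_ENTRIES  64
#define TLB_RESERVED 2	/* entries pinned for the kernel, never replaced */

/* Hypothetical model of counter-based victim selection: the victim
 * walks slots 2..63 in order, ignoring how recently each entry was
 * used. */
static unsigned int tlb_next_victim;

unsigned int tlb_pick_victim(void)
{
	unsigned int victim = TLB_RESERVED + tlb_next_victim;

	tlb_next_victim = (tlb_next_victim + 1) % (TLB_ENTRIES - TLB_RESERVED);
	return victim;
}
```

With this policy, any access pattern touching more than 62 pages evicts entries that are still hot, which byte-oriented copies make worse because they revisit the same pages many times.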
Do you know of any tests I could run?

I also ran several tests via QEMU to identify weak places in the kernel; these are the most-called functions. The "unknown" label means functions outside the kernel. Numbers are in %.

TCP
31.47 - memcpy
15.00 - do_csum
11.93 - unknown
 5.62 - __copy_tofrom_user
 2.94 - memset
 2.49 - default_idle
 1.66 - __invalidate_dcache_range
 1.57 - __kmalloc
 1.32 - skb_copy_bits
 1.23 - __alloc_skb

UDP
51.86 - unknown
 9.31 - default_idle
 6.01 - __copy_tofrom_user
 4.00 - do_csum
 2.05 - schedule
 1.92 - __muldi3
 1.39 - update_curr
 1.20 - __invalidate_dcache_range
 1.12 - __enqueue_entity

I optimized the copy_tofrom_user function to support word copying (covering only the aligned cases, because most copying is aligned). Uaccess unification was also done.

Do you have any ideas on how to improve TCP/UDP performance in general? Or tests which could point me at weak places?

I am using the microblaze-next branch. The same code is in the linux-next tree.

Thanks,
Michal

--
Michal Simek, Ing. (M.Eng)
PetaLogix - Linux Solutions for a Reconfigurable World
w: www.petalogix.com  p: +61-7-30090663,+42-0-721842854  f: +61-7-30090663

^ permalink raw reply [flat|nested] 11+ messages in thread
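The aligned word-copy fast path mentioned for copy_tofrom_user looks roughly like this in portable C. This is an illustrative sketch, not the actual MicroBlaze implementation, which is hand-written assembly and also handles user-access faults: when source, destination, and length are all word-aligned, data moves one 32-bit word per iteration instead of one byte.

```c
#include <stdint.h>
#include <stddef.h>

/* Sketch of an aligned word-copy fast path with a byte-loop fallback.
 * The alignment test ORs dst, src, and len together: if any of them
 * has a low bit set, the word path cannot be used. */
void *word_copy(void *dst, const void *src, size_t len)
{
	if ((((uintptr_t)dst | (uintptr_t)src | len) & 3) == 0) {
		uint32_t *d = dst;
		const uint32_t *s = src;
		size_t n;

		for (n = len / 4; n; n--)
			*d++ = *s++;
	} else {
		unsigned char *d = dst;
		const unsigned char *s = src;

		while (len--)
			*d++ = *s++;
	}
	return dst;
}
```

On a CPU where memcpy and __copy_tofrom_user dominate the profile, cutting the loop iterations by 4x for the common aligned case directly reduces both instruction count and the number of page touches per buffer.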
* Re: Network performance - iperf 2010-03-29 11:33 Network performance - iperf Michal Simek @ 2010-03-29 12:16 ` Eric Dumazet 2010-03-29 14:54 ` Michal Simek 2010-03-29 16:47 ` Rick Jones 2010-03-29 20:07 ` Eric Dumazet 2 siblings, 1 reply; 11+ messages in thread From: Eric Dumazet @ 2010-03-29 12:16 UTC (permalink / raw) To: michal.simek Cc: LKML, John Williams, netdev, Grant Likely, John Linn, Steven J. Magnani, Arnd Bergmann, akpm Le lundi 29 mars 2010 à 13:33 +0200, Michal Simek a écrit : > Do you have any idea howto improve TCP/UDP performance in general? > Or tests which can point me on weak places. Could you post "netstat -s" on your receiver, after fresh boot and your iperf session, for 32 MB and 256 MB ram case ? ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Network performance - iperf 2010-03-29 12:16 ` Eric Dumazet @ 2010-03-29 14:54 ` Michal Simek 2010-03-29 15:27 ` Michal Simek 0 siblings, 1 reply; 11+ messages in thread From: Michal Simek @ 2010-03-29 14:54 UTC (permalink / raw) To: Eric Dumazet Cc: LKML, John Williams, netdev, Grant Likely, John Linn, Steven J. Magnani, Arnd Bergmann, akpm Eric Dumazet wrote: > Le lundi 29 mars 2010 à 13:33 +0200, Michal Simek a écrit : > >> Do you have any idea howto improve TCP/UDP performance in general? >> Or tests which can point me on weak places. > > Could you post "netstat -s" on your receiver, after fresh boot and your > iperf session, for 32 MB and 256 MB ram case ? > I am not sure if is helpful but look below. Thanks, Michal ~ # ./netstat -s Ip: 0 total packets received 0 forwarded 0 incoming packets discarded 0 incoming packets delivered 0 requests sent out Icmp: 0 ICMP messages received 0 input ICMP message failed. ICMP input histogram: 0 ICMP messages sent 0 ICMP messages failed ICMP output histogram: Tcp: 0 active connections openings 0 passive connection openings 0 failed connection attempts 0 connection resets received 0 connections established 0 segments received 0 segments send out 0 segments retransmited 0 bad segments received. 0 resets sent Udp: 0 packets received 0 packets to unknown port received. 0 packet receive errors 0 packets sent RcvbufErrors: 0 SndbufErrors: 0 UdpLite: InDatagrams: 0 NoPorts: 0 InErrors: 0 OutDatagrams: 0 RcvbufErrors: 0 SndbufErrors: 0 error parsing /proc/net/snmp: Success -- Michal Simek, Ing. (M.Eng) PetaLogix - Linux Solutions for a Reconfigurable World w: www.petalogix.com p: +61-7-30090663,+42-0-721842854 f: +61-7-30090663 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Network performance - iperf 2010-03-29 14:54 ` Michal Simek @ 2010-03-29 15:27 ` Michal Simek 2010-03-29 17:45 ` Eric Dumazet 0 siblings, 1 reply; 11+ messages in thread From: Michal Simek @ 2010-03-29 15:27 UTC (permalink / raw) Cc: Eric Dumazet, LKML, John Williams, netdev, Grant Likely, John Linn, Steven J. Magnani, Arnd Bergmann, akpm Michal Simek wrote: > Eric Dumazet wrote: >> Le lundi 29 mars 2010 à 13:33 +0200, Michal Simek a écrit : >> >>> Do you have any idea howto improve TCP/UDP performance in general? >>> Or tests which can point me on weak places. >> >> Could you post "netstat -s" on your receiver, after fresh boot and your >> iperf session, for 32 MB and 256 MB ram case ? >> > > I am not sure if is helpful but look below. > Sorry I forget to c&p that second part. :-( Look below. Michal 32MB ~ # cat /proc/meminfo | head -n 1 MemTotal: 30024 kB ~ # iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 85.3 KByte (default) ------------------------------------------------------------ [ 6] local 192.168.0.10 port 5001 connected with 192.168.0.101 port 43577 [ ID] Interval Transfer Bandwidth [ 6] 0.0-50.0 sec 78.0 MBytes 13.1 Mbits/sec ~ # ./netstat -s Ip: 56596 total packets received 0 forwarded 0 incoming packets discarded 56596 incoming packets delivered 15752 requests sent out Icmp: 0 ICMP messages received 0 input ICMP message failed. ICMP input histogram: 0 ICMP messages sent 0 ICMP messages failed ICMP output histogram: Tcp: 0 active connections openings 1 passive connection openings 0 failed connection attempts 0 connection resets received 0 connections established 56596 segments received 15752 segments send out 0 segments retransmited 0 bad segments received. 0 resets sent Udp: 0 packets received 0 packets to unknown port received. 
0 packet receive errors 0 packets sent RcvbufErrors: 0 SndbufErrors: 0 UdpLite: InDatagrams: 0 NoPorts: 0 InErrors: 0 OutDatagrams: 0 RcvbufErrors: 0 SndbufErrors: 0 error parsing /proc/net/snmp: Success 256MB ~ # cat /proc/meminfo | head -n 1 MemTotal: 257212 kB ~ # iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 85.3 KByte (default) ------------------------------------------------------------ [ 6] local 192.168.0.10 port 5001 connected with 192.168.0.101 port 46069 [ ID] Interval Transfer Bandwidth [ 6] 0.0-50.2 sec 19.5 MBytes 3.26 Mbits/sec ~ # ./netstat -s Ip: 14163 total packets received 0 forwarded 0 incoming packets discarded 14163 incoming packets delivered 5209 requests sent out Icmp: 0 ICMP messages received 0 input ICMP message failed. ICMP input histogram: 0 ICMP messages sent 0 ICMP messages failed ICMP output histogram: Tcp: 0 active connections openings 1 passive connection openings 0 failed connection attempts 0 connection resets received 0 connections established 14163 segments received 5209 segments send out 0 segments retransmited 0 bad segments received. 0 resets sent Udp: 0 packets received 0 packets to unknown port received. 0 packet receive errors 0 packets sent RcvbufErrors: 0 SndbufErrors: 0 UdpLite: InDatagrams: 0 NoPorts: 0 InErrors: 0 OutDatagrams: 0 RcvbufErrors: 0 SndbufErrors: 0 error parsing /proc/net/snmp: Success -- Michal Simek, Ing. (M.Eng) PetaLogix - Linux Solutions for a Reconfigurable World w: www.petalogix.com p: +61-7-30090663,+42-0-721842854 f: +61-7-30090663 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Network performance - iperf 2010-03-29 15:27 ` Michal Simek @ 2010-03-29 17:45 ` Eric Dumazet 2010-03-30 9:34 ` Michal Simek 0 siblings, 1 reply; 11+ messages in thread From: Eric Dumazet @ 2010-03-29 17:45 UTC (permalink / raw) To: michal.simek Cc: LKML, John Williams, netdev, Grant Likely, John Linn, Steven J. Magnani, Arnd Bergmann, akpm Le lundi 29 mars 2010 à 17:27 +0200, Michal Simek a écrit : > Michal Simek wrote: > > Eric Dumazet wrote: > >> Le lundi 29 mars 2010 à 13:33 +0200, Michal Simek a écrit : > >> > >>> Do you have any idea howto improve TCP/UDP performance in general? > >>> Or tests which can point me on weak places. > >> > >> Could you post "netstat -s" on your receiver, after fresh boot and your > >> iperf session, for 32 MB and 256 MB ram case ? > >> > > > > I am not sure if is helpful but look below. > > > Sorry I forget to c&p that second part. :-( > Sorry, your netstat is not up2date. If you cannot correct it to last version [ net-tools 1.60 , netstat 1.42 ], please send cat /proc/net/snmp cat /proc/net/netstat ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Network performance - iperf
2010-03-29 17:45 ` Eric Dumazet
@ 2010-03-30  9:34 ` Michal Simek
  2010-03-30 12:11 ` Steve Magnani
  2010-03-30 12:41 ` Eric Dumazet
  0 siblings, 2 replies; 11+ messages in thread

From: Michal Simek @ 2010-03-30 9:34 UTC (permalink / raw)
To: Eric Dumazet
Cc: LKML, John Williams, netdev, Grant Likely, John Linn, Steven J. Magnani, Arnd Bergmann, akpm

[-- Attachment #1: Type: text/plain, Size: 8888 bytes --]

Eric Dumazet wrote:
> Le lundi 29 mars 2010 à 17:27 +0200, Michal Simek a écrit :
>> Michal Simek wrote:
>>> Eric Dumazet wrote:
>>>> Le lundi 29 mars 2010 à 13:33 +0200, Michal Simek a écrit :
>>>>
>>>>> Do you have any idea howto improve TCP/UDP performance in general?
>>>>> Or tests which can point me on weak places.
>>>> Could you post "netstat -s" on your receiver, after fresh boot and your
>>>> iperf session, for 32 MB and 256 MB ram case ?
>>>>
>>> I am not sure if is helpful but look below.
>>>
>> Sorry I forget to c&p that second part. :-(
>>
>
> Sorry, your netstat is not up2date.

I am afraid it is up to date.

>
> If you cannot correct it to last version
> [ net-tools 1.60 , netstat 1.42 ], please send
>
> cat /proc/net/snmp
> cat /proc/net/netstat

netstat's parsing buffer for /proc/net/netstat is too small; it needs to be enlarged, because one line of that file is longer than 1024 characters.
~ # head -n 1 /proc/net/netstat TcpExt: SyncookiesSent SyncookiesRecv SyncookiesFailed EmbryonicRsts PruneCalled RcvPruned OfoPruned OutOfWindowIcmps LockDroppedIcmps ArpFilter TW TWRecycled TWKilled PAWSPassive PAWSActive PAWSEstab DelayedACKs DelayedACKLocked DelayedACKLost ListenOverflows ListenDrops TCPPrequeued TCPDirectCopyFromBacklog TCPDirectCopyFromPrequeue TCPPrequeueDropped TCPHPHits TCPHPHitsToUser TCPPureAcks TCPHPAcks TCPRenoRecovery TCPSackRecovery TCPSACKReneging TCPFACKReorder TCPSACKReorder TCPRenoReorder TCPTSReorder TCPFullUndo TCPPartialUndo TCPDSACKUndo TCPLossUndo TCPLoss TCPLostRetransmit TCPRenoFailures TCPSackFailures TCPLossFailures TCPFastRetrans TCPForwardRetrans TCPSlowStartRetrans TCPTimeouts TCPRenoRecoveryFail TCPSackRecoveryFail TCPSchedulerFailed TCPRcvCollapsed TCPDSACKOldSent TCPDSACKOfoSent TCPDSACKRecv TCPDSACKOfoRecv TCPAbortOnSyn TCPAbortOnData TCPAbortOnClose TCPAbortOnMemory TCPAbortOnTimeout TCPAbortOnLinger TCPAbortFailed TCPMemoryPressures TCPSACKDiscard TCPDSACKIgnoredOld TCPDSACKIgnoredNoUndo TCPSpuriousRTOs TCPMD5NotFound TCPMD5Unexpected TCPSackShifted TCPSackMerged TCPSackShiftFallback TCPBacklogDrop TCPMinTTLDrop Look at attached patch. And updated results are below. Thanks, Michal 256M ~ # iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 85.3 KByte (default) ------------------------------------------------------------ [ 6] local 192.168.0.10 port 5001 connected with 192.168.0.101 port 33261 [ ID] Interval Transfer Bandwidth [ 6] 0.0-50.2 sec 22.9 MBytes 3.83 Mbits/sec ~ # ./netstat -s Ip: 16618 total packets received 0 forwarded 0 incoming packets discarded 16618 incoming packets delivered 6490 requests sent out Icmp: 0 ICMP messages received 0 input ICMP message failed. 
ICMP input histogram: 0 ICMP messages sent 0 ICMP messages failed ICMP output histogram: Tcp: 0 active connections openings 1 passive connection openings 0 failed connection attempts 0 connection resets received 0 connections established 16618 segments received 6490 segments send out 0 segments retransmited 0 bad segments received. 0 resets sent Udp: 0 packets received 0 packets to unknown port received. 0 packet receive errors 0 packets sent RcvbufErrors: 0 SndbufErrors: 0 UdpLite: InDatagrams: 0 NoPorts: 0 InErrors: 0 OutDatagrams: 0 RcvbufErrors: 0 SndbufErrors: 0 TcpExt: 2233 packets pruned from receive queue because of socket buffer overrun ArpFilter: 0 1 delayed acks sent 5519 packets header predicted TCPPureAcks: 2 TCPHPAcks: 0 TCPRenoRecovery: 0 TCPSackRecovery: 0 TCPSACKReneging: 0 TCPFACKReorder: 0 TCPSACKReorder: 0 TCPRenoReorder: 0 TCPTSReorder: 0 TCPFullUndo: 0 TCPPartialUndo: 0 TCPDSACKUndo: 0 TCPLossUndo: 0 TCPLoss: 0 TCPLostRetransmit: 0 TCPRenoFailures: 0 TCPSackFailures: 0 TCPLossFailures: 0 TCPFastRetrans: 0 TCPForwardRetrans: 0 TCPSlowStartRetrans: 0 TCPTimeouts: 0 TCPRenoRecoveryFail: 0 TCPSackRecoveryFail: 0 TCPSchedulerFailed: 0 TCPRcvCollapsed: 207654 TCPDSACKOldSent: 0 TCPDSACKOfoSent: 0 TCPDSACKRecv: 0 TCPDSACKOfoRecv: 0 TCPAbortOnSyn: 0 TCPAbortOnData: 0 TCPAbortOnClose: 0 TCPAbortOnMemory: 0 TCPAbortOnTimeout: 0 TCPAbortOnLinger: 0 TCPAbortFailed: 0 TCPMemoryPressures: 0 TCPSACKDiscard: 0 TCPDSACKIgnoredOld: 0 TCPDSACKIgnoredNoUndo: 0 TCPSpuriousRTOs: 0 TCPMD5NotFound: 0 TCPMD5Unexpected: 0 TCPSackShifted: 0 TCPSackMerged: 0 TCPSackShiftFallback: 0 TCPBacklogDrop: 0 TCPMinTTLDrop: 0 IpExt: InNoRoutes: 0 InTruncatedPkts: 0 InMcastPkts: 0 OutMcastPkts: 0 InBcastPkts: 0 OutBcastPkts: 0 InOctets: 24915880 OutOctets: 337488 InMcastOctets: 0 OutMcastOctets: 0 InBcastOctets: 0 OutBcastOctets: 0 ~ # ./netstat --version net-tools 1.60 netstat 1.42 (2001-04-15) Fred Baumgarten, Alan Cox, Bernd Eckenfels, Phil Blundell, Tuan Hoang and others 
+NEW_ADDRT +RTF_IRTT +RTF_REJECT +FW_MASQUERADE -I18N AF: (inet) +UNIX +INET -INET6 -IPX -AX25 -NETROM -X25 -ATALK -ECONET -ROSE HW: +ETHER -ARC +SLIP +PPP -TUNNEL -TR -AX25 -NETROM -X25 -FR -ROSE -ASH -SIT -FDDI -HIPPI -HDLC/LAPB ~ # head -n 1 /proc/meminfo MemTotal: 257108 kB 32MB ~ # head -n 1 /proc/meminfo MemTotal: 29920 kB ~ # iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 85.3 KByte (default) ------------------------------------------------------------ [ 6] local 192.168.0.10 port 5001 connected with 192.168.0.101 port 50088 [ ID] Interval Transfer Bandwidth [ 6] 0.0-50.0 sec 109 MBytes 18.3 Mbits/sec ~ # ./netstat -s Ip: 79040 total packets received 0 forwarded 0 incoming packets discarded 79040 incoming packets delivered 29655 requests sent out Icmp: 0 ICMP messages received 0 input ICMP message failed. ICMP input histogram: 0 ICMP messages sent 0 ICMP messages failed ICMP output histogram: Tcp: 0 active connections openings 1 passive connection openings 0 failed connection attempts 0 connection resets received 0 connections established 79040 segments received 29655 segments send out 0 segments retransmited 0 bad segments received. 0 resets sent Udp: 0 packets received 0 packets to unknown port received. 0 packet receive errors 0 packets sent RcvbufErrors: 0 SndbufErrors: 0 UdpLite: InDatagrams: 0 NoPorts: 0 InErrors: 0 OutDatagrams: 0 RcvbufErrors: 0 SndbufErrors: 0 TcpExt: 9773 packets pruned from receive queue because of socket buffer overrun ArpFilter: 0 1 delayed acks sent 101 packets directly queued to recvmsg prequeue. 
558928 packets directly received from prequeue 33274 packets header predicted 378 packets header predicted and directly queued to user TCPPureAcks: 2 TCPHPAcks: 0 TCPRenoRecovery: 0 TCPSackRecovery: 0 TCPSACKReneging: 0 TCPFACKReorder: 0 TCPSACKReorder: 0 TCPRenoReorder: 0 TCPTSReorder: 0 TCPFullUndo: 0 TCPPartialUndo: 0 TCPDSACKUndo: 0 TCPLossUndo: 0 TCPLoss: 0 TCPLostRetransmit: 0 TCPRenoFailures: 0 TCPSackFailures: 0 TCPLossFailures: 0 TCPFastRetrans: 0 TCPForwardRetrans: 0 TCPSlowStartRetrans: 0 TCPTimeouts: 0 TCPRenoRecoveryFail: 0 TCPSackRecoveryFail: 0 TCPSchedulerFailed: 0 TCPRcvCollapsed: 120195 TCPDSACKOldSent: 0 TCPDSACKOfoSent: 0 TCPDSACKRecv: 0 TCPDSACKOfoRecv: 0 TCPAbortOnSyn: 0 TCPAbortOnData: 0 TCPAbortOnClose: 0 TCPAbortOnMemory: 0 TCPAbortOnTimeout: 0 TCPAbortOnLinger: 0 TCPAbortFailed: 0 TCPMemoryPressures: 0 TCPSACKDiscard: 0 TCPDSACKIgnoredOld: 0 TCPDSACKIgnoredNoUndo: 0 TCPSpuriousRTOs: 0 TCPMD5NotFound: 0 TCPMD5Unexpected: 0 TCPSackShifted: 0 TCPSackMerged: 0 TCPSackShiftFallback: 0 TCPBacklogDrop: 0 TCPMinTTLDrop: 0 IpExt: InNoRoutes: 0 InTruncatedPkts: 0 InMcastPkts: 0 OutMcastPkts: 0 InBcastPkts: 0 OutBcastPkts: 0 InOctets: 118232864 OutOctets: 1542068 InMcastOctets: 0 OutMcastOctets: 0 InBcastOctets: 0 OutBcastOctets: 0 ~ # -- Michal Simek, Ing. 
(M.Eng) PetaLogix - Linux Solutions for a Reconfigurable World w: www.petalogix.com p: +61-7-30090663,+42-0-721842854 f: +61-7-30090663 [-- Attachment #2: 0001-Extend-buffer-size-because-line-size-is-greater-than.patch --] [-- Type: text/x-patch, Size: 988 bytes --] >From e2d160e2e235103af3e08b5bbbc451982bc0fed0 Mon Sep 17 00:00:00 2001 From: Michal Simek <monstr@monstr.eu> Date: Tue, 30 Mar 2010 10:45:06 +0200 Subject: [PATCH] Extend buffer size because line size is greater than 1024 chars Error shown on console UdpLite: InDatagrams: 0 NoPorts: 0 InErrors: 0 OutDatagrams: 0 RcvbufErrors: 0 SndbufErrors: 0 error parsing /proc/net/snmp: Success It is easy to check size of line which is necessary to check. 1151 Signed-off-by: Michal Simek <monstr@monstr.eu> --- statistics.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/statistics.c b/statistics.c index a878df8..51beb8a 100644 --- a/statistics.c +++ b/statistics.c @@ -291,7 +291,7 @@ struct tabtab *newtable(struct tabtab *tabs, char *title) void process_fd(FILE *f) { - char buf1[1024], buf2[1024]; + char buf1[2048], buf2[2048]; char *sp, *np, *p; while (fgets(buf1, sizeof buf1, f)) { int endflag; -- 1.5.5.1 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: Network performance - iperf
2010-03-30 9:34 ` Michal Simek
@ 2010-03-30 12:11 ` Steve Magnani
  2010-03-30 12:41 ` Eric Dumazet
  1 sibling, 0 replies; 11+ messages in thread

From: Steve Magnani @ 2010-03-30 12:11 UTC (permalink / raw)
To: Michal Simek, Eric Dumazet
Cc: LKML, John Williams, netdev, Grant Likely, John Linn, Arnd Bergmann, akpm

Looks like a lot of time is spent in tcp_collapse(). I think there is an assumption that the memcpys involved in collapsing lead to better throughput than packet retransmission, which on our platforms may not be the case. If so, is there a way to tune when tcp_collapse() gets invoked? I haven't seen one.

Steve

-----Original Message-----
From: Michal Simek <michal.simek@petalogix.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: LKML <linux-kernel@vger.kernel.org>, John Williams <john.williams@petalogix.com>, netdev@vger.kernel.org, Grant Likely <grant.likely@secretlab.ca>, John Linn <John.Linn@xilinx.com>, "Steven J. Magnani" <steve@digidescorp.com>, Arnd Bergmann <arnd@arndb.de>, akpm@linux-foundation.org
Date: Tue, 30 Mar 2010 11:34:29 +0200
Subject: Re: Network performance - iperf

> Eric Dumazet wrote:
> > Le lundi 29 mars 2010 à 17:27 +0200, Michal Simek a écrit :
> >> Michal Simek wrote:
> >>> Eric Dumazet wrote:
> >>>> Le lundi 29 mars 2010 à 13:33 +0200, Michal Simek a écrit :
> >>>>
> >>>>> Do you have any idea howto improve TCP/UDP performance in general?
> >>>>> Or tests which can point me on weak places.
> >>>> Could you post "netstat -s" on your receiver, after fresh boot and your
> >>>> iperf session, for 32 MB and 256 MB ram case ?
> >>>>
> >>> I am not sure if is helpful but look below.
> >>>
> >> Sorry I forget to c&p that second part. :-(
> >>
> >
> > Sorry, your netstat is not up2date.
>
> I am afraid that is up2date.
^ permalink raw reply [flat|nested] 11+ messages in thread
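On Steve's question: there is no dedicated tunable for tcp_collapse(); it is invoked from the pruning path when the receive queue outgrows the socket's receive buffer (sk_rcvbuf). So the practical knob is the receive buffer itself, via the net.ipv4.tcp_rmem sysctl or per-socket SO_RCVBUF. A minimal userspace sketch (set_rcvbuf is an illustrative helper name, not an existing API):

```c
#include <sys/socket.h>

/* Request a larger receive buffer on fd and return the size the kernel
 * actually granted.  Linux doubles the requested value to account for
 * bookkeeping overhead, and caps it at net.core.rmem_max (unless
 * SO_RCVBUFFORCE is used by a privileged process). */
int set_rcvbuf(int fd, int bytes)
{
	socklen_t len = sizeof(bytes);
	int granted = 0;

	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
		return -1;
	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &granted, &len) < 0)
		return -1;
	return granted;
}
```

A larger buffer only postpones collapsing, of course; on a memory-constrained MicroBlaze target the underlying fix is to stop oversized skbs from filling the queue in the first place, which is what the copybreak approach below addresses.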
* Re: Network performance - iperf
2010-03-30 9:34 ` Michal Simek
2010-03-30 12:11 ` Steve Magnani
@ 2010-03-30 12:41 ` Eric Dumazet
  1 sibling, 0 replies; 11+ messages in thread

From: Eric Dumazet @ 2010-03-30 12:41 UTC (permalink / raw)
To: michal.simek
Cc: LKML, John Williams, netdev, Grant Likely, John Linn, Steven J. Magnani, Arnd Bergmann, akpm

Le mardi 30 mars 2010 à 11:34 +0200, Michal Simek a écrit :

> 2233 packets pruned from receive queue because of socket buffer overrun
> TCPRcvCollapsed: 207654

That's a problem. A big one :(

If I remember correctly, you use the LL_TEMAC driver. This driver allocates big skbs for the RX ring (more than 9000 bytes per skb). Given your 32 MB memory size, this seems plain wrong.

You might try to copybreak them before handing the skb to the network stack, so each queued packet consumes the minimum space. This would also help the driver survive in low-memory conditions, avoiding death when high-order pages are not available.

I cannot even compile this driver on my x86 platform, but here is a preliminary patch to give you the idea:

[PATCH] ll_temac: Fix some memory allocation problems

Driver uses high-order allocations that might fail after a while. When receiving a buffer from the card, copy it out, keeping a pool of pre-allocated high-order buffers.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
Please please please note I did not test this patch.
 drivers/net/ll_temac_main.c |   48 +++++++++++++++-------------------
 1 file changed, 22 insertions(+), 26 deletions(-)

diff --git a/drivers/net/ll_temac_main.c b/drivers/net/ll_temac_main.c
index a18e348..412b72e 100644
--- a/drivers/net/ll_temac_main.c
+++ b/drivers/net/ll_temac_main.c
@@ -155,8 +155,8 @@ static int temac_dma_bd_init(struct net_device *ndev)
 		lp->rx_bd_v[i].next = lp->rx_bd_p +
 				sizeof(*lp->rx_bd_v) * ((i + 1) % RX_BD_NUM);
 
-		skb = alloc_skb(XTE_MAX_JUMBO_FRAME_SIZE
-				+ XTE_ALIGN, GFP_ATOMIC);
+		skb = alloc_skb(XTE_MAX_JUMBO_FRAME_SIZE + XTE_ALIGN,
+				GFP_KERNEL);
 		if (skb == 0) {
 			dev_err(&ndev->dev, "alloc_skb error %d\n", i);
 			return -1;
@@ -625,34 +625,30 @@ static void ll_temac_recv(struct net_device *ndev)
 		skb = lp->rx_skb[lp->rx_bd_ci];
 		length = cur_p->app4 & 0x3FFF;
 
-		skb_vaddr = virt_to_bus(skb->data);
+		new_skb = netdev_alloc_skb_ip_align(length);
+		if (new_skb) {
+			skb_copy_to_linear_data(new_skb, skb->data, length);
+			skb_put(new_skb, length);
+			skb_vaddr = virt_to_bus(skb->data);
+			dma_sync_single_for_device(ndev->dev.parent,
+						   skb_vaddr,
+						   XTE_MAX_JUMBO_FRAME_SIZE,
+						   PCI_DMA_FROMDEVICE);
+			new_skb->dev = ndev;
+			new_skb->protocol = eth_type_trans(new_skb, ndev);
+			new_skb->ip_summed = CHECKSUM_NONE;
+
+			netif_rx(new_skb);
+
+			ndev->stats.rx_packets++;
+			ndev->stats.rx_bytes += length;
+		} else
+			ndev->stats.rx_dropped++;
+
 		dma_unmap_single(ndev->dev.parent, skb_vaddr, length,
 				 DMA_FROM_DEVICE);
-		skb_put(skb, length);
-		skb->dev = ndev;
-		skb->protocol = eth_type_trans(skb, ndev);
-		skb->ip_summed = CHECKSUM_NONE;
-
-		netif_rx(skb);
-
-		ndev->stats.rx_packets++;
-		ndev->stats.rx_bytes += length;
-
-		new_skb = alloc_skb(XTE_MAX_JUMBO_FRAME_SIZE + XTE_ALIGN,
-				GFP_ATOMIC);
-		if (new_skb == 0) {
-			dev_err(&ndev->dev, "no memory for new sk_buff\n");
-			spin_unlock_irqrestore(&lp->rx_lock, flags);
-			return;
-		}
-
-		skb_reserve(new_skb, BUFFER_ALIGN(new_skb->data));
-
-		cur_p->app0 = STS_CTRL_APP0_IRQONEND;
-		cur_p->phys = dma_map_single(ndev->dev.parent, new_skb->data,
-					     XTE_MAX_JUMBO_FRAME_SIZE,
-					     DMA_FROM_DEVICE);
 		cur_p->len = XTE_MAX_JUMBO_FRAME_SIZE;
 		lp->rx_skb[lp->rx_bd_ci] = new_skb;

^ permalink raw reply related	[flat|nested] 11+ messages in thread
* Re: Network performance - iperf
  2010-03-29 11:33 Network performance - iperf Michal Simek
  2010-03-29 12:16 ` Eric Dumazet
@ 2010-03-29 16:47 ` Rick Jones
  2010-03-29 16:57   ` Rick Jones
  2010-03-29 20:07 ` Eric Dumazet
  2 siblings, 1 reply; 11+ messages in thread
From: Rick Jones @ 2010-03-29 16:47 UTC (permalink / raw)
To: michal.simek
Cc: LKML, John Williams, netdev, Grant Likely, John Linn,
	Steven J. Magnani, Arnd Bergmann, akpm

I don't know how to set fixed socket buffer sizes in iperf. If you were
running netperf, though, I would suggest fixing the socket buffer sizes
with the test-specific -s (affects local) and -S (affects remote)
options:

netperf -t TCP_STREAM -H <remote> -l 30 -- -s 32K -S 32K -m 32K

to test the hypothesis that the autotuning of the socket buffers/window
size is allowing the windows to grow, in the larger-memory cases, beyond
what the TLB in your processor is comfortable with. Particularly if you
didn't see much degradation as RAM is increased on something like:

netperf -t TCP_RR -H <remote> -l 30 -- -r 1

which is a simple request/response test that will never try to have more
than one packet in flight at a time, regardless of how large the window
gets.

happy benchmarking,

rick jones
http://www.netperf.org/

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: Network performance - iperf
  2010-03-29 16:47 ` Rick Jones
@ 2010-03-29 16:57   ` Rick Jones
  0 siblings, 0 replies; 11+ messages in thread
From: Rick Jones @ 2010-03-29 16:57 UTC (permalink / raw)
To: michal.simek
Cc: LKML, John Williams, netdev, Grant Likely, John Linn,
	Steven J. Magnani, Arnd Bergmann, akpm

Rick Jones wrote:
> I don't know how to set fixed socket buffer sizes in iperf, if you were
> running netperf though I would suggest fixing the socket buffer sizes
> with the test-specific -s (affects local) and -S (affects remote) options:
>
> netperf -t TCP_STREAM -H <remote> -l 30 -- -s 32K -S 32K -m 32K
>
> to test the hypothesis that the autotuning of the socket buffers/window
> size is allowing the windows to grow in the larger memory cases beyond
> what the TLB in your processor is comfortable with.

BTW, by default, netperf will allocate a "ring" of send buffers - the
number allocated will be one more than the socket buffer size divided by
the send size - so in the example above, there will be two 32KB buffers
allocated in netperf's send ring. A similar calculation may happen on
the receive side. That can be controlled via the global (before the
"--") -W option:

-W send,recv      Set the number of send,recv buffers

So, you might make the netperf command:

netperf -t TCP_STREAM -H <remote> -l 30 -W 1,1 -- -s 32K -S 32K -m 32K

happy benchmarking,

rick jones

^ permalink raw reply	[flat|nested] 11+ messages in thread
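[Editorial sketch, not part of the original thread.] Rick's ring-sizing
rule can be sanity-checked with a little shell arithmetic; the values
mirror his -s 32K / -m 32K example, and the script is illustrative, not
netperf source:

```shell
# Rick's rule: send-ring buffers = socket buffer size / send size + 1.
sock_bytes=$((32 * 1024))    # -s 32K
send_bytes=$((32 * 1024))    # -m 32K
ring=$((sock_bytes / send_bytes + 1))
echo "send ring buffers: ${ring}"
```

With equal socket buffer and send size this yields two buffers, which is
why -W 1,1 is needed to force a single send buffer.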
* Re: Network performance - iperf
  2010-03-29 11:33 Network performance - iperf Michal Simek
  2010-03-29 12:16 ` Eric Dumazet
  2010-03-29 16:47 ` Rick Jones
@ 2010-03-29 20:07 ` Eric Dumazet
  2 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2010-03-29 20:07 UTC (permalink / raw)
To: michal.simek
Cc: LKML, John Williams, netdev, Grant Likely, John Linn,
	Steven J. Magnani, Arnd Bergmann, akpm

On Mon, 29 Mar 2010 at 13:33 +0200, Michal Simek wrote:
> Hi All,
>
> I am doing several network benchmarks on a Microblaze cpu with MMU.
> I am seeing one issue which is weird and I would like to know where the
> problem is.
> I am using the same hw design and the same Linux kernel. I have changed
> only the memory size (in DTS).
>
> 32MB:  18.3Mb/s
> 64MB:  15.2Mb/s
> 128MB: 10.6Mb/s
> 256MB:  3.8Mb/s
>
> There is a huge difference between systems with 32MB and 256MB ram.
>
> I am running iperf TCP tests with these commands.
> On x86: iperf -c 192.168.0.105 -i 5 -t 50
> On microblaze: iperf -s
>
> I looked at pte misses, which are the same on all configurations, which
> means that the number of do_page_fault exceptions is the same on all
> configurations.
> I added some hooks to low-level kernel code to be able to see the number
> of tlb misses. There is a big difference between the number of misses on
> the 256MB and 32MB systems. I measured two kernel settings. The first
> column is a kernel with asm-optimized memcpy/memmove functions and the
> second is without that optimization. (The kernel with asm-optimized lib
> functions is 30% faster than the system without it.)
>
> 32MB:    12703   13641
> 64MB:  1021750  655644
> 128MB: 1031644  531879
> 256MB: 1011322  430027
>
> Most of them are data tlb misses. The Microblaze MMU doesn't use any LRU
> mechanism to find a TLB victim, so there is a naive TLB replacement
> strategy based on an incrementing counter. We use 2 tlbs for the kernel
> itself which are not updated, which is why we can use "only" 62 TLBs
> out of 64.
This probably has nothing to do with the TCP stack, but with thrashing
the TLB in some pathological cases: you have 62 entries, which is good
for a working set of up to 248 Kbytes, everything included (program
stack, program static & dynamic data), given the microblaze 4 Kbyte
page size.

You could try:

echo "4096 8192 32768" >/proc/sys/net/ipv4/tcp_rmem

to reduce the memory footprint of iperf (or use iperf parameters).

Of course, I suppose kernel memory is 32 MB max, if you use only two
tlbs (16 Mbytes each) for the kernel...

^ permalink raw reply	[flat|nested] 11+ messages in thread
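[Editorial sketch, not part of the original thread.] Eric's 248 Kbyte
figure follows directly from the TLB geometry he describes; the
62-entry count and 4 KB page size come from the thread, and the script
is a back-of-envelope illustration:

```shell
# Data TLB coverage on this MicroBlaze config.
entries=62    # usable TLB entries (64 minus 2 pinned for the kernel)
page_kb=4     # MicroBlaze page size in KB
echo "data TLB coverage: $((entries * page_kb)) KB"
```

The suggested tcp_rmem "4096 8192 32768" caps the default receive
buffer at 8 KB and the maximum at 32 KB, well inside that coverage,
whereas the autotuned defaults on a 256 MB box can grow far past it.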
end of thread, other threads:[~2010-03-30 12:41 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-29 11:33 Network performance - iperf Michal Simek
2010-03-29 12:16 ` Eric Dumazet
2010-03-29 14:54   ` Michal Simek
2010-03-29 15:27     ` Michal Simek
2010-03-29 17:45       ` Eric Dumazet
2010-03-30  9:34         ` Michal Simek
2010-03-30 12:11           ` Steve Magnani
2010-03-30 12:41           ` Eric Dumazet
2010-03-29 16:47 ` Rick Jones
2010-03-29 16:57   ` Rick Jones
2010-03-29 20:07 ` Eric Dumazet