* network / performance problems
@ 2004-02-21  2:44 Ron Peterson
  2004-02-21  3:08 ` Andrew Morton
  0 siblings, 1 reply; 26+ messages in thread
From: Ron Peterson @ 2004-02-21  2:44 UTC (permalink / raw)
  To: linux-kernel


I have several new Dell 2650s in various stages of production.  They are
dual Xeon machines, hyperthreaded, w/ built-in Broadcom Gbit
adapters.  I've also installed Intel PRO/1000 MT dual-port
adapters.  I have tried various combinations of these adapters and kernels
2.4.24 and 2.6.3, but continue to have problems with slowly degrading
network performance.  Under heavy load, things really go sour, and I've
actually had to reboot to get things back again.

I've assembled some data at the following location, which I hope provides
some additional insight into the nature of my difficulties.  These include
smokeping graphs, sundry stats, and some commentary.  I am of course happy
to provide any additional information that would be helpful.  It's almost
certain that I've neglected to mention the one crucial detail that makes
everything clear (likely just me being obtuse)... ;)

http://depot.mtholyoke.edu:8080/tmp/

I've not subscribed to the lkml (I know, boo), so would appreciate
CC's.  I will happily subscribe if anyone feels I'm being outrageously
gauche.

(Thanks in general for all the stuff you guys do.  Amazing!)

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College



* Re: network / performance problems
  2004-02-21  2:44 network / performance problems Ron Peterson
@ 2004-02-21  3:08 ` Andrew Morton
  2004-02-21 14:36   ` Ron Peterson
  0 siblings, 1 reply; 26+ messages in thread
From: Andrew Morton @ 2004-02-21  3:08 UTC (permalink / raw)
  To: Ron Peterson; +Cc: linux-kernel

Ron Peterson <rpeterso@MtHolyoke.edu> wrote:
>
>  http://depot.mtholyoke.edu:8080/tmp/

Could you chmod user.log and vmstat.log for us?

There are a few things you should try - you probably already have:

- Stop all applications, restart them

- Unload net driver module, reload and reconfigure it.

If either of those (or similar operations) are found to bring the latency
back to normal then that would be a big hint.  ie: we need to find
something which brings the performance back apart from a complete reboot.

Also, look out for consistent increases in either user or system CPU time.
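
A minimal sketch of one way to watch the user/system split, assuming the
aggregate "cpu" line of /proc/stat is sampled twice.  The function name
cpu_split and the 60-second interval in the usage comment are illustrative
only and do not come from this thread.

```shell
#!/bin/sh
# cpu_split FIRST SECOND
# Takes two samples of the aggregate "cpu" line from /proc/stat and prints
# the share of elapsed CPU time spent in user (user + nice) and system mode.
# Fields after the label: user nice system idle [iowait irq softirq ...]
cpu_split() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= NF; i++) a[i] = $i }
        NR == 2 {
            total = 0
            for (i = 2; i <= NF; i++) { d[i] = $i - a[i]; total += d[i] }
            if (total <= 0) { print "user=0.0% system=0.0%"; exit }
            printf "user=%.1f%% system=%.1f%%\n", 100 * (d[2] + d[3]) / total, 100 * d[4] / total
        }'
}

# Usage (hypothetical interval):
#   before=$(grep "^cpu " /proc/stat); sleep 60; after=$(grep "^cpu " /proc/stat)
#   cpu_split "$before" "$after"
```

Logging this line at a fixed interval alongside the ping graphs would show
whether either share creeps upward as latency does.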



* Re: network / performance problems
  2004-02-21  3:08 ` Andrew Morton
@ 2004-02-21 14:36   ` Ron Peterson
  2004-02-22 17:32     ` Ron Peterson
  0 siblings, 1 reply; 26+ messages in thread
From: Ron Peterson @ 2004-02-21 14:36 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel


On Fri, 20 Feb 2004, Andrew Morton wrote:
> 
> Could you chmod user.log and vmstat.log for us?

Oops.  All set.  :\

> There are a few things you should try - you probably already have:
> 
> - Stop all applications, restart them
> 
> - Unload net driver module, reload and reconfigure it.
> 
> If either of those (or similar operations) are found to bring the latency
> back to normal then that would be a big hint.  ie: we need to find
> something which brings the performance back apart from a complete reboot.

I have managed to get the system back without reboot.  I stopped all
sendmail and mimedefang processes.  After a few minutes, I brought down
the (virtual) interface eth0:2 that served to host the MX IP 
address.  Shortly later, the machine came back.

That was on mist, the heavily loaded mail gateway.  I just added a couple
more graphs in folder 'other' that show a couple of other machines being
monitored by tap.  Ping response times grow over time, no matter what
machine is being monitored (not just linux).  Tap does not have any
virtual interfaces.
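
For anyone who wants the same kind of numbers without smokeping, here is a
rough sketch that pulls the average round-trip time out of ping's summary
line.  The function names avg_rtt and log_rtt are made up for illustration,
and the parsing assumes the summary contains "min/avg/max" followed by
" = " and slash-separated values, as iputils and BSD-style ping print it.

```shell
#!/bin/sh
# avg_rtt: read ping output on stdin and print the average RTT in ms.
# Expected summary line, for example:
#   rtt min/avg/max/mdev = 0.101/0.203/0.310/0.050 ms
#   round-trip min/avg/max = 0.1/0.2/0.3 ms
avg_rtt() {
    awk -F' = ' '/min\/avg\/max/ { split($2, v, "/"); print v[2] }'
}

# log_rtt HOST: print "epoch-seconds average-rtt" for five quiet pings.
log_rtt() {
    printf '%s %s\n' "$(date +%s)" "$(ping -c 5 -q "$1" | avg_rtt)"
}

# Usage (hypothetical log path):
#   while :; do log_rtt mist >> /var/tmp/mist-rtt.log; sleep 300; done
```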

> Also, look out for consistent increases in either user or system CPU time.

OK.  Have to run to the dump and other errands now, though... ;)

Thanks.

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College



* Re: network / performance problems
  2004-02-21 14:36   ` Ron Peterson
@ 2004-02-22 17:32     ` Ron Peterson
  2004-02-22 21:25       ` Ron Peterson
  0 siblings, 1 reply; 26+ messages in thread
From: Ron Peterson @ 2004-02-22 17:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel


On Sat, 21 Feb 2004, Ron Peterson wrote:
> On Fri, 20 Feb 2004, Andrew Morton wrote:
> 
> > There are a few things you should try - you probably already have:
> > 
> > - Stop all applications, restart them
> > 
> > - Unload net driver module, reload and reconfigure it.
> > 
> > If either of those (or similar operations) are found to bring the latency
> > back to normal then that would be a big hint.  ie: we need to find
> > something which brings the performance back apart from a complete reboot.

This machine is starting to head south again.  I've updated user.log (a
bunch of stats I'm syslogging) at
http://depot.mtholyoke.edu:8080/tmp/.  I've also added
http://depot.mtholyoke.edu:8080/tmp/mist/10/, which contains the latest 
smokeping graphs.

In about an hour or two, I'll likely have to head in to get this thing
back on its feet.  I will try the suggestions above, to see if that will
do the trick.  Is there anything else I should try to do or look at?  Once
I reboot (if that's what I have to do), I expect it will be a couple of
days before this happens again.

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College



* Re: network / performance problems
  2004-02-22 17:32     ` Ron Peterson
@ 2004-02-22 21:25       ` Ron Peterson
  2004-02-23 16:32         ` Ron Peterson
  0 siblings, 1 reply; 26+ messages in thread
From: Ron Peterson @ 2004-02-22 21:25 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel


On Sun, 22 Feb 2004, Ron Peterson wrote:
> On Sat, 21 Feb 2004, Ron Peterson wrote:
> > On Fri, 20 Feb 2004, Andrew Morton wrote:
> > 
> > > There are a few things you should try - you probably already have:
> > > 
> > > - Stop all applications, restart them
> > > 
> > > - Unload net driver module, reload and reconfigure it.
> > > 
> > > If either of those (or similar operations) are found to bring the latency
> > > back to normal then that would be a big hint.  ie: we need to find
> > > something which brings the performance back apart from a complete reboot.
> 
> This machine is starting to head south again.  I've updated user.log (a
> bunch of stats I'm syslogging) at
> http://depot.mtholyoke.edu:8080/tmp/.  I've also added
> http://depot.mtholyoke.edu:8080/tmp/mist/10/, which contains the latest 
> smokeping graphs.
> 
> In about an hour or two, I'll likely have to head in to get this thing
> back on its feet.  I will try the suggestions above, to see if that will
> do the trick.  Is there anything else I should try to do or look at?  Once
> I reboot (if that's what I have to do), I expect it will be a couple of
> days before this happens again.

I couldn't get it back.  I stopped apache, mimedefang, sendmail,
imapproxyd, and ifdown'd the interfaces.  Waited a bit, put them back up,
and still had large ping times.  I then ifdown'd the interface and removed
the e1000 module.  I couldn't get it back again though, I think because I
had some nfs mounts I couldn't unmount.

I rebooted.  I set the BIOS to not run hyperthreaded, to see if that has
any effect.  Now wait...

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College



* Re: network / performance problems
  2004-02-22 21:25       ` Ron Peterson
@ 2004-02-23 16:32         ` Ron Peterson
  2004-02-23 22:17           ` Ron Peterson
  0 siblings, 1 reply; 26+ messages in thread
From: Ron Peterson @ 2004-02-23 16:32 UTC (permalink / raw)
  To: linux-kernel


On Sun, 22 Feb 2004, Ron Peterson wrote:

> I rebooted.  I set the BIOS to not run hyperthreaded, to see if that has
> any effect.  Now wait...

Turning hyperthreading off hasn't helped.  Ping response times are still
slowly increasing.

I just put up another set of graphs that might be the most interesting
yet.

http://depot.mtholyoke.edu:8080/tmp/depot-depot/2002-02-12_10.30/

They show depot, a very lightly used machine, monitoring itself, running
e1000, then being rebooted to use a new kernel with ACPI support, then
having the e1000 module swapped out for the bcm5700 module (switching back to
the built-in NIC).  Reboot brought response times down to almost
zero.  Switching module/NIC had no effect at all.

Does anyone have any ideas of things they'd like me to try?  My
imagination is running dry.  I have non-production 2650's and beige boxes
I can try stuff on.

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College



* Re: network / performance problems
  2004-02-23 16:32         ` Ron Peterson
@ 2004-02-23 22:17           ` Ron Peterson
  2004-02-23 22:54             ` Ron Peterson
  2004-02-24  4:28             ` Ron Peterson
  0 siblings, 2 replies; 26+ messages in thread
From: Ron Peterson @ 2004-02-23 22:17 UTC (permalink / raw)
  To: linux-kernel


more graphs, and more graphs.

http://depot.mtholyoke.edu:8080/tmp/must-mhc/2002-02-23_17:00/

The monitoring machine, must, had until noon been up for 223 days,
running kernel 2.4.20.  I ping'd 'mhc' from must this morning, and
consistently received a response time of 0.2 ms.  It is using the 3c59x
module, w/ a 3c905C-TX card.

options 3c59x options=4 full_duplex=1

Just before noon, I compiled 2.4.24 on must.  Everything else is the
same, except I started running smokeping.

Now, these graphs aren't really long enough to reliably indicate a
trend.  But so far, they are showing ever-increasing mhc ping response
times.  The same trend can be seen for other machines being monitored.
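
A short graph can be checked numerically as well as by eye.  The sketch
below fits a least-squares line to "time value" pairs and prints the slope;
a slope that stays positive as more samples arrive supports the trend.  The
function name rtt_slope and the input format are assumptions for
illustration, not something smokeping produces directly.

```shell
#!/bin/sh
# rtt_slope: read "time value" pairs on stdin and print the least-squares
# slope (change in value per unit of time).  Times are shifted so the first
# sample is zero, which keeps the sums small when epoch seconds are used.
rtt_slope() {
    awk '
        NF >= 2 {
            if (n == 0) t0 = $1
            x = $1 - t0
            n++; sx += x; sy += $2; sxx += x * x; sxy += x * $2
        }
        END {
            denom = n * sxx - sx * sx
            if (n < 2 || denom == 0) { print "nan"; exit }
            printf "%.4f\n", (n * sxy - sx * sy) / denom
        }'
}

# Usage (hypothetical log of "epoch-seconds rtt-ms" lines):
#   rtt_slope < /var/tmp/mhc-rtt.log
```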

I plan to let this run this way for awhile.  Then I will boot back to
2.4.20 for comparison.

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College



* Re: network / performance problems
  2004-02-23 22:17           ` Ron Peterson
@ 2004-02-23 22:54             ` Ron Peterson
  2004-02-24  4:28             ` Ron Peterson
  1 sibling, 0 replies; 26+ messages in thread
From: Ron Peterson @ 2004-02-23 22:54 UTC (permalink / raw)
  To: linux-kernel


On Mon, 23 Feb 2004, Ron Peterson wrote:
> more graphs, and more graphs.

...and more graphs still.  This time the machine I *want* to be my primary
monitoring station, running 2.6.3, pinging a Compaq DS20.

http://depot.mtholyoke.edu:8080/tmp/tap-mhc/2002-02-23_17:30/

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College



* Re: network / performance problems
  2004-02-23 22:17           ` Ron Peterson
  2004-02-23 22:54             ` Ron Peterson
@ 2004-02-24  4:28             ` Ron Peterson
  2004-02-24 14:26               ` Ron Peterson
  1 sibling, 1 reply; 26+ messages in thread
From: Ron Peterson @ 2004-02-24  4:28 UTC (permalink / raw)
  To: linux-kernel


On Mon, 23 Feb 2004, Ron Peterson wrote:
> 
> more graphs, and more graphs.
> 
> http://depot.mtholyoke.edu:8080/tmp/must-mhc/2002-02-23_17:00/

...and another graph to follow up.  Same setup, but now running
2.4.20.  Looks much better.

http://depot.mtholyoke.edu:8080/tmp/must-mhc/2002-02-23_23:00/

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College



* Re: network / performance problems
  2004-02-24  4:28             ` Ron Peterson
@ 2004-02-24 14:26               ` Ron Peterson
  2004-02-24 18:22                 ` David S. Miller
  2004-02-25 19:16                 ` Ron Peterson
  0 siblings, 2 replies; 26+ messages in thread
From: Ron Peterson @ 2004-02-24 14:26 UTC (permalink / raw)
  To: linux-kernel


On Mon, 23 Feb 2004, Ron Peterson wrote:
> On Mon, 23 Feb 2004, Ron Peterson wrote:
> > 
> > more graphs, and more graphs.
> > 
> > http://depot.mtholyoke.edu:8080/tmp/must-mhc/2002-02-23_17:00/
> 
> ...and another graph to follow up.  Same setup, but now running
> 2.4.20.  Looks much better.
> 
> http://depot.mtholyoke.edu:8080/tmp/must-mhc/2002-02-23_23:00/

And a follow up to the follow up.  Things have stabilized long enough now
that the trend appears real.

http://depot.mtholyoke.edu:8080/tmp/must-mhc/2002-02-24_8:40/

Was it Mark Twain who said that extrapolation is like standing on the
south side of a cliff and walking north because the ground has been flat
so far?

I've also added some graphs of must monitoring mist (which is the machine
I actually care about the most right now).  Mist ping latencies are
predictably on the upswing again.  I'll likely be rebooting soon.

http://depot.mtholyoke.edu:8080/tmp/must-mist/2002-02-24_8:40/

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College



* Re: network / performance problems
  2004-02-24 14:26               ` Ron Peterson
@ 2004-02-24 18:22                 ` David S. Miller
  2004-02-24 18:42                   ` Ron Peterson
  2004-02-25 19:16                 ` Ron Peterson
  1 sibling, 1 reply; 26+ messages in thread
From: David S. Miller @ 2004-02-24 18:22 UTC (permalink / raw)
  To: Ron Peterson; +Cc: linux-kernel


So Ron, as performance gets worse and worse, take a look at what the firewall
rules in the kernel look like.   I bet you're accumulating netfilter ipchains
rules over time and this makes packet processing go slower and slower, and it's
due to some bug in whatever is dynamically adding firewall rules to your system.

I'm guessing all of this because that is exactly what was causing problems for
someone who reported something basically identical to what you're reporting now
the other week.
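
One way to check for that kind of accumulation, sketched under the
assumption that iptables-save is available: count the "-A" lines per table
and compare the totals over time.  The function name count_rules is made up
for this example.

```shell
#!/bin/sh
# count_rules: read iptables-save output on stdin and print one
# "table rule-count" line per table, in the order the tables appear.
count_rules() {
    awk '
        /^\*/  { table = substr($0, 2); count[table] += 0; order[++n] = table }
        /^-A / { count[table]++ }
        END    { for (i = 1; i <= n; i++) print order[i], count[order[i]] }'
}

# Usage (as root), for example from cron so the totals can be compared:
#   iptables-save | count_rules
```

If the totals stay constant while latency climbs, rule accumulation is
probably not the cause.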


* Re: network / performance problems
  2004-02-24 18:22                 ` David S. Miller
@ 2004-02-24 18:42                   ` Ron Peterson
  2004-02-24 18:47                     ` David S. Miller
  0 siblings, 1 reply; 26+ messages in thread
From: Ron Peterson @ 2004-02-24 18:42 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel


On Tue, 24 Feb 2004, David S. Miller wrote:
> 
> So Ron, as performance gets worse and worse, take a look at what the firewall
> rules in the kernel look like.   I bet you're accumulating netfilter ipchains
> rules over time and this makes packet processing go slower and slower, and it's
> due to some bug in whatever is dynamically adding firewall rules to your system.

Thanks David.

I'm not dynamically altering the rules, though.  On machine 'must', you
can see in the following graph that using 2.4.24, must's latency starts
growing.  With 2.4.20, it doesn't.  I've included the iptables rules I use
on this machine below.

http://depot.mtholyoke.edu:8080/tmp/must-mhc/2002-02-24_8:40/mhc_last_108000.png

The current smokeping graph looks exactly the same, just longer.

I did an 'iptables -v -L' and 'iptables -v -L -t nat' on a couple of
machines that are still running, but slowing down, and see exactly what
I'd expect.  Is there something else I can look at for you?

I *do* run iptables on all of these machines.  The script below
essentially reflects how I do this, with minor variations according to
what ports I want open.  The rules on 'mist' are a little more
complicated; we DHCP mist to be the gateway for unregistered machines,
and I do some SNAT stuff to redirect off-campus web traffic to a
registration page.

########################################################################
IFACE="eth0"
IPTABLES="/sbin/iptables"
echo "1" > /proc/sys/net/ipv4/ip_forward
########################################################################


########################################################################
# Flush existing rules for all chains.
$IPTABLES -F
$IPTABLES -t nat -F

# The default policy for each chain is to DROP the packet.
$IPTABLES -P INPUT DROP
$IPTABLES -P OUTPUT DROP
$IPTABLES -P FORWARD DROP
########################################################################


########################################################################
# Allow ping from on-campus
$IPTABLES -A INPUT -s 138.110.0.0/16 --protocol icmp --icmp-type echo-request -j ACCEPT
########################################################################



########################################################################
# Allow this host to establish new connections.  Otherwise only accept
# established connections.
$IPTABLES -A OUTPUT --match state --state NEW,ESTABLISHED,RELATED -j ACCEPT
$IPTABLES -A INPUT --match state --state ESTABLISHED,RELATED -j ACCEPT

# Allow incoming ssh connections from on campus
$IPTABLES -A INPUT -s 138.110.0.0/16 --protocol tcp --destination-port 22 -j ACCEPT

# Allow NetBIOS
$IPTABLES -A INPUT --protocol tcp --destination-port 137:139 -j ACCEPT
$IPTABLES -A INPUT --protocol udp --destination-port 137:139 -j ACCEPT
$IPTABLES -A INPUT --protocol tcp --destination-port 445 -j ACCEPT
$IPTABLES -A INPUT --protocol udp --destination-port 445 -j ACCEPT
# Need to do this to allow nmblookup broadcasts to receive a reply
$IPTABLES -A INPUT --protocol udp --source-port 137:139 -j ACCEPT

# Allow secure web connections
$IPTABLES -A INPUT --protocol tcp --destination-port 443 -j ACCEPT

# Allow incoming postgresql connections from on campus
# $IPTABLES -A INPUT -s 138.110.0.0/16 --protocol tcp --destination-port 5432 -j ACCEPT

# Allow this host to talk to itself.
$IPTABLES -A INPUT -d 127.0.0.1 -i lo -j ACCEPT
#######################################################################

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College



* Re: network / performance problems
  2004-02-24 18:42                   ` Ron Peterson
@ 2004-02-24 18:47                     ` David S. Miller
  2004-02-24 19:35                       ` Ron Peterson
  0 siblings, 1 reply; 26+ messages in thread
From: David S. Miller @ 2004-02-24 18:47 UTC (permalink / raw)
  To: Ron Peterson; +Cc: linux-kernel


Hmmm, I wonder if the connection tracking tables are simply never shrinking.

Can you get some kernel profiles when the problem hits?  If you don't know how
to do this, it's got to be documented somewhere and I'm sure someone can point
you at how to do it.

I bet we'll see netfilter at the top of the profiles or something like that.
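
A rough sketch for checking whether the connection tracking table keeps
growing.  The /proc paths are assumptions that vary by kernel version:
/proc/net/ip_conntrack is the 2.4-era listing, while newer kernels may
expose /proc/net/nf_conntrack instead.  The function names are illustrative.

```shell
#!/bin/sh
# conntrack_entries [FILE]: count tracked connections (one per line).
# The default path is an assumption; pass another file if your kernel
# exposes the table elsewhere.
conntrack_entries() {
    wc -l < "${1:-/proc/net/ip_conntrack}" | tr -d ' '
}

# conntrack_pct COUNT MAX: print table occupancy as a whole percentage.
conntrack_pct() {
    if [ "$2" -gt 0 ]; then
        echo $(( $1 * 100 / $2 ))
    else
        echo 0
    fi
}

# Usage (the path of the "max" setting is an assumption; check your kernel):
#   max=$(cat /proc/sys/net/ipv4/ip_conntrack_max)
#   echo "$(date +%s) $(conntrack_pct "$(conntrack_entries)" "$max")%"
```

An entry count that rises steadily and never falls back during quiet
periods would point at the tracking table rather than the rule set.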


* Re: network / performance problems
  2004-02-24 18:47                     ` David S. Miller
@ 2004-02-24 19:35                       ` Ron Peterson
  2004-02-24 23:20                         ` Andrew Morton
  0 siblings, 1 reply; 26+ messages in thread
From: Ron Peterson @ 2004-02-24 19:35 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel


On Tue, 24 Feb 2004, David S. Miller wrote:

> Hmmm, I wonder if the connection tracking tables are simply never shrinking.
> 
> Can you get some kernel profiles when the problem hits?  If you don't know how
> to do this, it's got to be documented somewhere and I'm sure someone can point
> you at how to do it.
> 
> I bet we'll see netfilter at the top of the profiles or something like that.

OK.

I haven't done kernel profiling before.  Did a little googling and this is
what I think I know (2.4.x)

In lilo.conf, do append="profile=2" (is 2 a good number?)
reboot
echo > /proc/profile
readprofile -m System.map-2.4.24 (or whatever)

Is that correct?

Of course, this problem happens fastest on production machines, which I
hate to put out of commission...  I need to turn my attention to some
other stuff for a bit today, but tonight I'll see if I can't work
something up on a non-production machine to make it go bad.  Then do some
profiling.

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College




* Re: network / performance problems
  2004-02-24 19:35                       ` Ron Peterson
@ 2004-02-24 23:20                         ` Andrew Morton
  0 siblings, 0 replies; 26+ messages in thread
From: Andrew Morton @ 2004-02-24 23:20 UTC (permalink / raw)
  To: Ron Peterson; +Cc: davem, linux-kernel

Ron Peterson <rpeterso@MtHolyoke.edu> wrote:
>
> I haven't done kernel profiling before.  Did a little googling and this is
> what I think I know (2.4.x)
> 
> In lilo.conf, do append="profile=2" (is 2 a good number?)
> reboot
> echo > /proc/profile
> readprofile -m System.map-2.4.24 (or whatever)

Do this:

- boot with `profile=1'

- make sure that /boot/System.map pertains to the currently-running kernel

sudo readprofile -r
sudo readprofile -M10
sleep 60
sudo readprofile -n -v -m /boot/System.map | sort -n -k3 > prof.out
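
To read the result, a small filter that lists the busiest symbols may help.
This is only a sketch: it assumes the verbose readprofile layout of address,
symbol, ticks, load, and the function name top_symbols is made up here.

```shell
#!/bin/sh
# top_symbols [N]: read verbose readprofile output on stdin
# (columns: address, symbol, ticks, load) and print "ticks symbol" for the
# N symbols with the most ticks (default 10), skipping the "total" row.
top_symbols() {
    awk '$2 != "total"' | sort -k3,3nr | head -n "${1:-10}" | awk '{ print $3, $2 }'
}

# Usage:
#   sudo readprofile -n -v -m /boot/System.map | top_symbols 20
```

If netfilter is the culprit, its functions should rise toward the top of
this list as latency grows.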



* Re: network / performance problems
  2004-02-24 14:26               ` Ron Peterson
  2004-02-24 18:22                 ` David S. Miller
@ 2004-02-25 19:16                 ` Ron Peterson
  2004-03-04 19:24                   ` Ron Peterson
  1 sibling, 1 reply; 26+ messages in thread
From: Ron Peterson @ 2004-02-25 19:16 UTC (permalink / raw)
  To: linux-kernel


On Tue, 24 Feb 2004, Ron Peterson wrote:

> I've also added some graphs of must monitoring mist (which is the machine
> I actually care about the most right now).  Mist ping latencies are
> predictably on the upswing again.  I'll likely be rebooting soon.
> 
> http://depot.mtholyoke.edu:8080/tmp/must-mist/2002-02-24_8:40/

I've had to turn my attention to some other responsibilities, so I haven't
done any kernel profiling yet.  However, I can report that I rebooted
'mist' into 2.4.20 yesterday, and I have seen rock solid .15 ms response
times for more than 24 hours.  Host 'must' is likewise now stable, running
2.4.20 for two days now.  I have graphs, logs, etc. if anyone cares to see
them.

mist is a hyperthreaded dual Xeon, now back on the built-in Broadcom adapter
(tg3 module).  must is a single-CPU ASUS P4PE w/ a 3Com adapter.

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College



* Re: network / performance problems
  2004-02-25 19:16                 ` Ron Peterson
@ 2004-03-04 19:24                   ` Ron Peterson
  2004-03-04 19:29                     ` David S. Miller
                                       ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Ron Peterson @ 2004-03-04 19:24 UTC (permalink / raw)
  To: linux-kernel


On Wed, 25 Feb 2004, Ron Peterson wrote:

> On Tue, 24 Feb 2004, Ron Peterson wrote:
> 
> > I've also added some graphs of must monitoring mist (which is the machine
> > I actually care about the most right now).  Mist ping latencies are
> > predictably on the upswing again.  I'll likely be rebooting soon.
> > 
> > http://depot.mtholyoke.edu:8080/tmp/must-mist/2002-02-24_8:40/
> 
> I've had to turn my attention to some other responsibilities, so I haven't
> done any kernel profiling yet.  However, I can report that I rebooted
> 'mist' into 2.4.20 yesterday, and I have seen rock solid .15 ms response
> times for more than 24 hours.  Host 'must' is likewise now stable, running
> 2.4.20 for two days now.  I have graphs, logs, etc. if anyone cares to see
> them.

These machines remain very stable at 2.4.20.

I don't know where things currently stand vis-a-vis knowing what's
causing this network/system load creep problem, but I thought I'd report
that I installed 2.4.21 on a single processor about a week ago (1GHz PIII,
500MB, Intel 82820 (ICH2) Chipset w/ eepro100 module), and am seeing the
same bad behaviour.  I have very clear graphs, if that's useful, but
haven't been logging system stats as aggressively as on some other
machines.

So something between 2.4.20 and 2.4.21, I think.  I wish I could be more
helpful.

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College



* Re: network / performance problems
  2004-03-04 19:24                   ` Ron Peterson
@ 2004-03-04 19:29                     ` David S. Miller
  2004-03-04 19:39                       ` Ron Peterson
  2004-03-06 14:55                     ` Ron Peterson
  2004-04-12 15:03                     ` Ron Peterson
  2 siblings, 1 reply; 26+ messages in thread
From: David S. Miller @ 2004-03-04 19:29 UTC (permalink / raw)
  To: Ron Peterson; +Cc: linux-kernel


You're not providing any new information until you work on those kernel
profiles we asked for the other week.


* Re: network / performance problems
  2004-03-04 19:29                     ` David S. Miller
@ 2004-03-04 19:39                       ` Ron Peterson
  0 siblings, 0 replies; 26+ messages in thread
From: Ron Peterson @ 2004-03-04 19:39 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel


On Thu, 4 Mar 2004, David S. Miller wrote:

> You're not providing any new information until you work on those kernel
> profiles we asked for the other week.

I didn't hear back when I wrote and asked if you still wanted those.  I'm
rebooting with profiling turned on.  It takes a few days for things to get
out of control, but I'll provide data when that happens.

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College



* Re: network / performance problems
  2004-03-04 19:24                   ` Ron Peterson
  2004-03-04 19:29                     ` David S. Miller
@ 2004-03-06 14:55                     ` Ron Peterson
  2004-03-06 15:00                       ` Ron Peterson
  2004-03-09  7:34                       ` David S. Miller
  2004-04-12 15:03                     ` Ron Peterson
  2 siblings, 2 replies; 26+ messages in thread
From: Ron Peterson @ 2004-03-06 14:55 UTC (permalink / raw)
  To: linux-kernel


On Thu, 4 Mar 2004, Ron Peterson wrote:

> ... thought I'd report that I installed 2.4.21 on a single processor
> about a week ago (1GHz PIII, 500MB, Intel 82820 (ICH2) Chipset w/
> eepro100 module), and am seeing the same bad behaviour.

I've booted with kernel profiling turned on.  I've posted some preliminary
results.  I don't have profile data yet, but you can see in the following
that when I turn off my iptables rules, the ping latency graph flattens
out.

http://depot.mtholyoke.edu:8080/tmp/tap-sam/2004-03-06_9:30/sam_last_108000.png
http://depot.mtholyoke.edu:8080/tmp/tap-sam/

My understanding is that the kernel profile information will become
interesting when the machine starts thrashing.  If it would be useful for
me to dump anything before then, let me know.

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College



* Re: network / performance problems
  2004-03-06 14:55                     ` Ron Peterson
@ 2004-03-06 15:00                       ` Ron Peterson
  2004-03-09  7:34                       ` David S. Miller
  1 sibling, 0 replies; 26+ messages in thread
From: Ron Peterson @ 2004-03-06 15:00 UTC (permalink / raw)
  To: linux-kernel


On Sat, 6 Mar 2004, Ron Peterson wrote:

> My understanding is that the kernel profile information will become
> interesting when the machine starts thrashing.  If it would be useful for
> me to dump anything before then, let me know.

On a related note...

What kind of performance hit do you take for booting with kernel profiling
turned on?  If not much, I would consider always booting this way, so that
if a machine starts sinking, I could maybe capture some useful
information.  Is that wise?

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College



* Re: network / performance problems
  2004-03-06 14:55                     ` Ron Peterson
  2004-03-06 15:00                       ` Ron Peterson
@ 2004-03-09  7:34                       ` David S. Miller
  2004-03-09 15:01                         ` Ron Peterson
  1 sibling, 1 reply; 26+ messages in thread
From: David S. Miller @ 2004-03-09  7:34 UTC (permalink / raw)
  To: Ron Peterson; +Cc: linux-kernel

On Sat, 6 Mar 2004 09:55:09 -0500 (EST)
Ron Peterson <rpeterso@MtHolyoke.edu> wrote:

> My understanding is that the kernel profile information will become
> interesting when the machine starts thrashing.

Yes, now please, pretty please, get us the profiles...


* Re: network / performance problems
  2004-03-09  7:34                       ` David S. Miller
@ 2004-03-09 15:01                         ` Ron Peterson
  2004-03-09 21:11                           ` Ron Peterson
  0 siblings, 1 reply; 26+ messages in thread
From: Ron Peterson @ 2004-03-09 15:01 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel


On Mon, 8 Mar 2004, David S. Miller wrote:

> Date: Mon, 8 Mar 2004 23:34:31 -0800
> From: David S. Miller <davem@redhat.com>
> To: Ron Peterson <rpeterso@MtHolyoke.edu>
> Cc: linux-kernel@vger.kernel.org
> Subject: Re: network / performance problems
> 
> On Sat, 6 Mar 2004 09:55:09 -0500 (EST)
> Ron Peterson <rpeterso@MtHolyoke.edu> wrote:
> 
> > My understanding is that the kernel profile information will become
> > interesting when the machine starts thrashing.
> 
> Yes, now please, pretty please, get us the profiles...

http://depot.mtholyoke.edu:8080/tmp/tap-sam/2004-03-09_09:30/

The machine is not really thrashing yet, but I'd expect in another couple
days, if experience holds, that it will be gonzo.  I'd like to revert back
to 2.4.20 before then, as this is a production machine.  I'll leave it
going as is for a short while, however, in case anyone has any suggestions
about things I should look at while it's misbehaving.

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College




* Re: network / performance problems
  2004-03-09 15:01                         ` Ron Peterson
@ 2004-03-09 21:11                           ` Ron Peterson
  0 siblings, 0 replies; 26+ messages in thread
From: Ron Peterson @ 2004-03-09 21:11 UTC (permalink / raw)
  To: linux-kernel


On Tue, 9 Mar 2004, Ron Peterson wrote:
> 
> The machine is not really thrashing yet, but I'd expect in another couple
> days, if experience holds, that it will be gonzo.  I'd like to revert back
> to 2.4.20 before then, as this is a production machine.  I'll leave it
> going as is for a short while, however, in case anyone has any suggestions
> about things I should look at while it's misbehaving.

I'm now dumping profile information from sam to the following location
every fifteen minutes:

http://depot.mtholyoke.edu:8080/tmp/sam-profile/

I'm thinking I'll reboot sam to 2.4.20 tomorrow morning, unless someone
says they'd like some more data.

Best.

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College



* Re: network / performance problems
  2004-03-04 19:24                   ` Ron Peterson
  2004-03-04 19:29                     ` David S. Miller
  2004-03-06 14:55                     ` Ron Peterson
@ 2004-04-12 15:03                     ` Ron Peterson
  2004-04-14  4:54                       ` Ron Peterson
  2 siblings, 1 reply; 26+ messages in thread
From: Ron Peterson @ 2004-04-12 15:03 UTC (permalink / raw)
  To: linux-kernel


On Thu, 4 Mar 2004, Ron Peterson wrote:

> These machines remain very stable at 2.4.20.
> 
> I don't know where things currently stand vis-a-vis knowing what's
> causing this network/system load creep problem, but I thought I'd report
> that I installed 2.4.21 on a single processor about a week ago (1GHz PIII,
> 500MB, Intel 82820 (ICH2) Chipset w/ eepro100 module), and am seeing the
> same bad behaviour.

I still don't know the root cause of my ever increasing ping
latencies.  However, I can report that if I compile all the netfilter
helpers as modules, rather than statically linking them, everything
runs fine.
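
To see which way a given kernel was built, one can inspect its .config.
The sketch below assumes that the IPv4 netfilter options are named
CONFIG_IP_NF_* (as CONFIG_IP_NF_CONNTRACK and CONFIG_IP_NF_IPTABLES are
in these kernels); the function name and config path are illustrative
only.

```shell
#!/bin/sh
# Sketch only: list IPv4 netfilter options by how they were built.
# Assumes option names start with CONFIG_IP_NF_; options under other
# prefixes are not covered.

# Usage: nf_options y|m /path/to/.config
#   y = statically linked into the kernel image, m = built as a module
nf_options() {
    grep "^CONFIG_IP_NF_[A-Z0-9_]*=$1\$" "$2" | sed 's/=.*//'
}
```

For example, "nf_options y /usr/src/linux/.config" would print the
statically linked options; empty output means every matching option is
either modular or disabled.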

This has solved my immediate problem, so I've turned my attention to other
things.  As far as I know, though, there's still something amiss.

I have another machine that's not in production yet running 2.6.5.  I've
adopted the habit of compiling netfilter stuff as modules, but I'll
statically link everything and run it that way to see what I can see.

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: network / performance problems
  2004-04-12 15:03                     ` Ron Peterson
@ 2004-04-14  4:54                       ` Ron Peterson
  0 siblings, 0 replies; 26+ messages in thread
From: Ron Peterson @ 2004-04-14  4:54 UTC (permalink / raw)
  To: linux-kernel


On Mon, 12 Apr 2004, Ron Peterson wrote:

> I have another machine that's not in production yet running 2.6.5.  I've
> adopted the habit of compiling netfilter stuff as modules, but I'll
> statically link everything and run it that way to see what I can see.

Results here:

http://depot.mtholyoke.edu:8080/tmp/tap-stow/2004-04-14/

The problem persists.  To the best of my knowledge, starting with kernel 
version 2.4.21, and including 2.6 series kernels, if you statically link
netfilter code, and use iptables to set up connection tracking rules (as 
below), ksoftirqd will consume increasing cpu%, and ping latencies
will grow.  Eventually the machine will be unusable.


#! /bin/sh

IPTABLES=/usr/local/sbin/iptables

IFPUB=eth0
IFPRIV=eth1
PUBIP=...
PUBNET=...
PRIVIP=...
PRIVNET=...

# The default policy for each chain is to DROP the packet.
$IPTABLES -P INPUT DROP
$IPTABLES -P OUTPUT DROP
$IPTABLES -P FORWARD DROP

# Flush existing rules for all chains.
$IPTABLES -F
$IPTABLES -t nat -F
$IPTABLES -X

# Allow this host to establish new connections.  Otherwise only accept
# established connections.
$IPTABLES -A OUTPUT --match state --state NEW,ESTABLISHED,RELATED -j ACCEPT
$IPTABLES -A INPUT --match state --state ESTABLISHED,RELATED -j ACCEPT

# Allow ping from on-campus
$IPTABLES -A INPUT -i $IFPUB -s $PUBNET --protocol icmp --icmp-type echo-request -j ACCEPT
$IPTABLES -A INPUT -i $IFPRIV -s $PRIVNET --protocol icmp --icmp-type echo-request -j ACCEPT

# Allow incoming ssh connections.
$IPTABLES -A INPUT --protocol tcp --destination-port 22 -j ACCEPT

# Allow incoming https connections.
# $IPTABLES -A INPUT --protocol tcp --destination-port 443 -j ACCEPT

# Allow Samba/SMB/NetBIOS
$IPTABLES -A INPUT --protocol tcp --destination-port 137:139 -j ACCEPT
$IPTABLES -A INPUT --protocol tcp --destination-port 445 -j ACCEPT

# Allow CUPS
$IPTABLES -A INPUT --protocol tcp --destination-port 631 -j ACCEPT

# Allow this host to talk to itself.
$IPTABLES -A INPUT -d 127.0.0.1 -i lo -j ACCEPT
$IPTABLES -A INPUT -s $PUBIP -d $PUBIP -j ACCEPT
$IPTABLES -A INPUT -s $PRIVIP -d $PRIVIP -j ACCEPT
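
The symptom described above (growing ping latency alongside growing
ksoftirqd CPU use) could be tracked over time with something like the
following sketch.  It logs the average ping round-trip time and the
accumulated CPU time of the ksoftirqd threads.  The target host, log
path, and function names are placeholders, and the ps format options
assume procps.

```shell
#!/bin/sh
# Sketch only: TARGET, LOG, and the function names are placeholders.
TARGET=${TARGET:-127.0.0.1}
LOG=${LOG:-/tmp/latency.log}

# Read ping output on stdin and print the average round-trip time in ms.
# Handles both "rtt min/avg/max/mdev = a/b/c/d ms" and
# "round-trip min/avg/max = a/b/c ms" summary lines.
avg_rtt() {
    awk '/min\/avg\/max/ { sub(/^.*= */, ""); split($0, f, "/"); print f[2] }'
}

# Print accumulated CPU time for each ksoftirqd thread, space-separated.
ksoftirqd_time() {
    ps -eo comm,time | awk '/^ksoftirqd/ { printf "%s=%s ", $1, $2 }'
}

# Append one sample: timestamp, average RTT, ksoftirqd CPU time.
log_sample() {
    rtt=$(ping -c 5 -q "$TARGET" | avg_rtt)
    echo "$(date +%Y-%m-%d_%H:%M) rtt_avg_ms=${rtt:-NA} $(ksoftirqd_time)" >> "$LOG"
}

# Only sample when asked to, so the file can also be sourced.
if [ "${1:-}" = "sample" ]; then
    log_sample
fi
```

Run periodically with the "sample" argument from another host on the
same network, this gives a crude record of whether latency and softirq
CPU time climb together.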

_________________________
Ron Peterson
Network & Systems Manager
Mount Holyoke College


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2004-04-14  4:54 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
2004-02-21  2:44 network / performance problems Ron Peterson
2004-02-21  3:08 ` Andrew Morton
2004-02-21 14:36   ` Ron Peterson
2004-02-22 17:32     ` Ron Peterson
2004-02-22 21:25       ` Ron Peterson
2004-02-23 16:32         ` Ron Peterson
2004-02-23 22:17           ` Ron Peterson
2004-02-23 22:54             ` Ron Peterson
2004-02-24  4:28             ` Ron Peterson
2004-02-24 14:26               ` Ron Peterson
2004-02-24 18:22                 ` David S. Miller
2004-02-24 18:42                   ` Ron Peterson
2004-02-24 18:47                     ` David S. Miller
2004-02-24 19:35                       ` Ron Peterson
2004-02-24 23:20                         ` Andrew Morton
2004-02-25 19:16                 ` Ron Peterson
2004-03-04 19:24                   ` Ron Peterson
2004-03-04 19:29                     ` David S. Miller
2004-03-04 19:39                       ` Ron Peterson
2004-03-06 14:55                     ` Ron Peterson
2004-03-06 15:00                       ` Ron Peterson
2004-03-09  7:34                       ` David S. Miller
2004-03-09 15:01                         ` Ron Peterson
2004-03-09 21:11                           ` Ron Peterson
2004-04-12 15:03                     ` Ron Peterson
2004-04-14  4:54                       ` Ron Peterson
