* Re: NFS Problems (kernel locks up)
@ 2003-03-19 20:35 Heflin, Roger A.
  2003-03-20 19:31 ` Kresimir Kukulj
  0 siblings, 1 reply; 7+ messages in thread
From: Heflin, Roger A. @ 2003-03-19 20:35 UTC (permalink / raw)
  To: nfs; +Cc: madmax




	I would suggest running a machine stress test on the machine.

	I did have a situation where a large NFS load would quickly take
	down a machine, and finally determined that the actual hardware
	was bad and would crash when put under stress. I swapped
	out the hardware (case+mb+memory+cpu) with another (I used
	all of the same hd's) and the machine quit crashing even under
	the same kind of load.   The original machine lasted 5-10 minutes
	under heavy NFS load, but would last days under light NFS loads.

	We have had good luck with 2.4.19 and 2.4.21pre[34] as nfs
	servers.

	The only thing to watch out for on the number of files is that
	there are issues on unix (unix in general) with lots of files
	in a single directory; quite a number of things get slow with
	lots of files in a single dir.
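
To check whether a mail spool has directories with very many entries, something like the following could be used. It is only a minimal sketch: the function name `largest_directories` and the `top_n` parameter are made up for illustration, and walking a multi-million-file tree this way will itself generate a noticeable metadata load.

```python
import heapq
import os


def largest_directories(root, top_n=10):
    """Return up to top_n (entry_count, path) pairs under root, largest first."""
    counts = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Count both files and subdirectories: each one is a directory entry.
        counts.append((len(dirnames) + len(filenames), dirpath))
    return heapq.nlargest(top_n, counts)


# Example use (the path is hypothetical):
#   for count, path in largest_directories("/var/mail"):
#       print(count, path)
```

Directories that show up at the top of such a list with tens of thousands of entries are the ones where lookups and listings are most likely to be slow.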

	You might try one of the cpu burn in type programs and see if
	that also makes it fail, and maybe a disk benchmark and see if
	that makes it fail.

	If either of those make it fail, it is a hardware problem of some
	sort.
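
If no ready-made disk benchmark is at hand, a crude write-and-verify loop gives a similar kind of load. The sketch below is an assumption-laden illustration, not a replacement for a real burn-in tool: the name `write_and_verify` and its parameters are invented here, and because the read-back may be served from the page cache it exercises memory and the write path more reliably than the platters.

```python
import hashlib
import os


def write_and_verify(directory, file_count=8, file_size=1 << 20, passes=1):
    """Write random files, read them back, and return the number of checksum mismatches."""
    mismatches = 0
    for _ in range(passes):
        expected = {}
        for i in range(file_count):
            path = os.path.join(directory, f"stress_{i}.bin")
            data = os.urandom(file_size)
            with open(path, "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())  # ask the kernel to push the data to the device
            expected[path] = hashlib.sha256(data).hexdigest()
        for path, digest in expected.items():
            with open(path, "rb") as f:
                if hashlib.sha256(f.read()).hexdigest() != digest:
                    mismatches += 1
            os.remove(path)  # clean up so repeated passes do not fill the disk
    return mismatches
```

A non-zero return value, or a lockup while this runs with a large `file_size` and many `passes`, would point toward hardware rather than NFS.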

	I have a large number of NFS servers and we get a few odd crashes
	that generally are traced back to hardware issues.
						Roger

> Message: 4
> Date: Wed, 19 Mar 2003 19:22:41 +0100
> From: Kresimir Kukulj <madmax@iskon.hr>
> To: nfs@lists.sourceforge.net
> Subject: [NFS] NFS problems (kernel locks up)
>
> Hi
>
> We are trying to assess if linux could perform as a NFS server to linux
> client(s). In our test we moved part of mailboxes of a freemail service
> (after some initial testing) to a NFS storage (linux NFS server). It worked
> ok, and used very little resources. But, during the nightly backup, NFS
> server crashed. Symptoms were that:
>   1. client detected that NFS server is not responding
>   2. NFS server responded to ping, but you could not log in to it. Every
>      attempt to log-in stopped at TCP connection being established, but
>      daemon did not respond (I presume, that at that particular moment
>      TCP/IP stack was still working).
>   3. After cca 10 minutes, it locks up (not ping-able).
>   4. I have serial console attached to the server, and kernel did not
>      respond to SYS-REQ.
>   5. After turning off the power and then back on, server booted, and
>      resumed its function.
>
> This happened three times, every time during the backup (Networker),
> sometimes only 5 minutes after backup started, sometimes after 1.5 hours.
> This was all using 2.4.20 kernel (no extra patches), using NFSv3, udp, async.
> NFS client was using: rw,hard,intr,udp,rsize=8192,wsize=8192,nodev,nosuid
> NFS server used: rw,no_root_squash (default is async).
>
> Then, I have put 2.4.21-pre5 because it contained some NFS fixes. After
> that, server survived three days (2 incrementals and one full backup
> completed successfully). Then it crashed during the day for no apparent
> reason (we have the server monitored with 'cricket', and there were no
> unusual activities...).
>
> I have changed to NFSv2,sync,udp and it crashed during the backup that night,
> and then again during the day. This resulted with filesystem corruption
> (replaying the ext3 journal caused fsck to be invoked - couple of hours was
> wasted on checking).
>
> Now I have reverted back to NFSv3,udp, but kept 'sync'. I will see tonight
> will it survive or not.
>
> Filesystem is 99Gb ext3 partition, with 1024 block size, internal journal.
> That fs is 50% full, and contains around 290000 files (13.7% fragmentation).
> Files are between few kilobytes up to 10 Mb.
>
> Normal filesystem usage is ~200kb read, 300Kb write per second with < 5%
> disk utilization. When backup runs, reading gets ~ 5Mb/sec with disk
> utilization of ~ 100%.
>
> Client and server are connected to the same switch, with no dropped packets.
>
> We are satisfied with performance (while the server works).
>
> Can anybody give a suggestion ? I have tried everything I can think of.
> We would like to use linux as a NFS server, but if this does not work, we
> will be forced to consider alternatives like Solaris x86.
> Can anyone here suggest a good alternative NFS server OS (for x86) with a
> good support for SCSI HW RAID controllers ? ICP Vortex unfortunately is
> not supported under Solaris x86, but what other controllers (let's say for
> Solaris x86) do you recommend ?
>
> Also, I am concerned about filesystem. Will ext3 be able to handle, let's
> say, 10 million files ? If not, will Solaris x86 UFS be any better.
> [ For us, reiser proved to be sometimes difficult, and we had couple of fs
> related crashes, so we are trying to find alternatives. Filesystem check
> on that amount of files is measured in days. ]
>
> Some info about hardware:
> Dell PowerApp 200 with 2 x Pentium III (Coppermine), each 1GHz.
> 1Gb memory, with CONFIG_HIGHMEM4G=y.
> eepro100 ethernet
> ServerWorks chipset but nothing except CDROM is connected to it.
> ICP Vortex Hardware RAID model GDT8523RZ
> Driver for this (SCSI) controller is from 2.4.20 kernel (it's pretty new).
> 5 FUJITSU MAJ3364MC 34Gb drives in RAID5 (4+hotfix).
> Filesystem is ext3 with journal=ordered.
>
> Kernel is vanilla 2.4.20, and 2.4.21-pre5.
> I can provide 'dmesg' and '.config' for that kernel.
>
> Distribution is Debian stable 3.0.
> These packages are installed:
> ii  nfs-common              1.0-2                   NFS support files common to client and server
> ii  nfs-kernel-server       1.0-2                   Kernel NFS server support
>
> NFS server and client use fixed ports as described at NFS-Howto:
> Kernel command line: root=/dev/sda2 lockd.udpport=32768 \
>                      lockd.tcpport=32768 console=tty0 console=ttyS0,9600
> statd, mountd are fixed as well, and iptables are configured to pass
> fragmented packets. By default, NFS server runs with 8 kernel threads
> (knfsd). According to /proc/net/rpc/nfsd there is no need for more kernel
> threads.
>
> Services that run on NFS client are POP3 and SMTP daemons and a web based
> frontend that uses them. Both daemons are configured to use their version of
> dot locking (as recommended).
>
> Thanks.
>
> --
> Kresimir Kukulj
> Iskon Internet d.d.
> ISS
> Savska 41/X.
> 10000 Zagreb


-------------------------------------------------------
This SF.net email is sponsored by: Does your code think in ink? 
You could win a Tablet PC. Get a free Tablet PC hat just for playing. 
What are you waiting for?
http://ads.sourceforge.net/cgi-bin/redirect.pl?micr5043en
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs


* Re: Re: NFS Problems (kernel locks up)
  2003-03-19 20:35 NFS Problems (kernel locks up) Heflin, Roger A.
@ 2003-03-20 19:31 ` Kresimir Kukulj
  0 siblings, 0 replies; 7+ messages in thread
From: Kresimir Kukulj @ 2003-03-20 19:31 UTC (permalink / raw)
  To: Heflin, Roger A.; +Cc: nfs

Quoting Heflin, Roger A. (Roger.A.Heflin@conocophillips.com):
> 	I would suggest running a machine stress test on the machine.
> 
> 	I did have a situation where a large NFS load would quickly take
> 	down a machine, and finally determined that the actual hardware
> 	was bad and would crash when put under stress. I swapped
> 	out the hardware (case+mb+memory+cpu) with another (I used
> 	all of the same hd's) and the machine quit crashing even under
> 	the same kind of load.   The original machine lasted 5-10 minutes
> 	under heavy NFS load, but would last days under light NFS loads.
> 
> 	We have had good luck with 2.4.19 and 2.4.21pre[34] as nfs 
> 	servers.
> 
> 	The only thing to watch out for on the number of files is that
> 	there are issues on unix (unix in general) with lots of files
> 	in a single directory, quite a number of things get slow with
> 	lots of files in a single dir.  
> 
> 	You might try one of the cpu burn in type programs and see if
> 	that also makes it fail, and maybe a disk benchmark and see if 
> 	that makes it fail.
> 
> 	If either of those make it fail, it is a hardware problem of some
> 	sort.
> 
> 	I have a large number of NFS servers and we get a few odd crashes
> 	that generally are traced back to hardware issues.

Thanks for the reply.

I have tested the local RAID array with bonnie, IOzone, postmark and home-made
tools to benchmark file system performance. I tested the local fs more than 10
times, and NFS (1 client load) with the same tools using various
combinations of NFSv2, NFSv3, sync, async.
Not a single crash.

I reverted the problematic server to 2.4.21-pre5 with NFSv3,udp,sync and it
survived the nightly backup. We'll see how long it will take before it
crashes again. I don't think this is hardware related. Crashes are not
random. Kernel version and protocol version determine when it will crash.

-- 
Kresimir Kukulj                      madmax@iskon.hr
+--------------------------------------------------+
Old PC's never die. They just become Unix terminals.



* Re: NFS problems (kernel locks up)
  2003-03-19 18:22 NFS problems " Kresimir Kukulj
  2003-03-21 19:49 ` Bernd Schubert
@ 2003-03-24 17:19 ` David Dougall
  1 sibling, 0 replies; 7+ messages in thread
From: David Dougall @ 2003-03-24 17:19 UTC (permalink / raw)
  To: Kresimir Kukulj; +Cc: nfs

You might want to try out XFS on linux.  We have been running 2.4.19rc3 on
machines similar to the ones you describe for almost a year now with
little to no problems (including networker backups).  In my experience,
XFS is more stable and performs better than ext3.
Unfortunately, you need to get a huge kernel patch from SGI.  It has been
worth it for us.
--David Dougall


On Wed, 19 Mar 2003, Kresimir Kukulj wrote:

> Hi
>
> We are trying to assess if linux could perform as a NFS server to linux
> client(s). In our test we moved part of mailboxes of a freemail service
> (after some initial testing) to a NFS storage (linux NFS server). It worked
> ok, and used very little resources. But, during the nightly backup, NFS
> server crashed. Symptoms were that:
>   1. client detected that NFS server is not responding
>   2. NFS server responded to ping, but you could not log in to it. Every
>      attempt to log-in stopped at TCP connection being established, but
>      daemon did not respond (I presume, that at that particular moment
>      TCP/IP stack was still working).
>   3. After cca 10 minutes, it locks up (not ping-able).
>   4. I have serial console attached to the server, and kernel did not
>      respond to SYS-REQ.
>   5. After turning off the power and then back on, server booted, and
>      resumed its function.
>
> This happened three times, every time during the backup (Networker),
> sometimes only 5 minutes after backup started, sometimes after 1.5 hours.
> This was all using 2.4.20 kernel (no extra patches), using NFSv3, udp, async.
> NFS client was using: rw,hard,intr,udp,rsize=8192,wsize=8192,nodev,nosuid
> NFS server used: rw,no_root_squash (default is async).
>
> Then, I have put 2.4.21-pre5 because it contained some NFS fixes. After
> that, server survived three days (2 incrementals and one full backup
> completed successfully). Then it crashed during the day for no apparent
> reason (we have the server monitored with 'cricket', and there were no
> unusual activities...).
>
> I have changed to NFSv2,sync,udp and it crashed during the backup that night,
> and then again during the day. This resulted with filesystem corruption
> (replaying the ext3 journal caused fsck to be invoked - couple of hours was
> wasted on checking).
>
> Now I have reverted back to NFSv3,udp, but kept 'sync'. I will see tonight
> will it survive or not.
>
> Filesystem is 99Gb ext3 partition, with 1024 block size, internal journal.
> That fs is 50% full, and contains around 290000 files (13.7% fragmentation).
> Files are between few kilobytes up to 10 Mb.
>
> Normal filesystem usage is ~200kb read, 300Kb write per second with < 5%
> disk utilization. When backup runs, reading gets ~ 5Mb/sec with disk
> utilization of ~ 100%.
>
> Client and server are connected to the same switch, with no dropped packets.
>
> We are satisfied with performance (while the server works).
>
> Can anybody give a suggestion ? I have tried everything I can think of.
> We would like to use linux as a NFS server, but if this does not work, we
> will be forced to consider alternatives like Solaris x86.
> Can anyone here suggest a good alternative NFS server OS (for x86) with a
> good support for SCSI HW RAID controllers ? ICP Vortex unfortunately is
> not supported under Solaris x86, but what other controllers (let's say for
> Solaris x86) do you recommend ?
>
> Also, I am concerned about filesystem. Will ext3 be able to handle, let's
> say, 10 million files ? If not, will Solaris x86 UFS be any better.
> [ For us, reiser proved to be sometimes difficult, and we had couple of fs
> related crashes, so we are trying to find alternatives. Filesystem check
> on that amount of files is measured in days. ]
>
> Some info about hardware:
> Dell PowerApp 200 with 2 x Pentium III (Coppermine), each 1GHz.
> 1Gb memory, with CONFIG_HIGHMEM4G=y.
> eepro100 ethernet
> ServerWorks chipset but nothing except CDROM is connected to it.
> ICP Vortex Hardware RAID model GDT8523RZ
> Driver for this (SCSI) controller is from 2.4.20 kernel (it's pretty new).
> 5 FUJITSU MAJ3364MC 34Gb drives in RAID5 (4+hotfix).
> Filesystem is ext3 with journal=ordered.
>
> Kernel is vanilla 2.4.20, and 2.4.21-pre5.
> I can provide 'dmesg' and '.config' for that kernel.
>
> Distribution is Debian stable 3.0.
> These packages are installed:
> ii  nfs-common              1.0-2                   NFS support files common to client and server
> ii  nfs-kernel-server       1.0-2                   Kernel NFS server support
>
> NFS server and client use fixed ports as described at NFS-Howto:
> Kernel command line: root=/dev/sda2 lockd.udpport=32768 \
>                      lockd.tcpport=32768 console=tty0 console=ttyS0,9600
> statd, mountd are fixed as well, and iptables are configured to pass
> fragmented packets. By default, NFS server runs with 8 kernel threads
> (knfsd). According to /proc/net/rpc/nfsd there is no need for more kernel
> threads.
>
> Services that run on NFS client are POP3 and SMTP daemons and a web based
> frontend that uses them. Both daemons are configured to use their version of
> dot locking (as recommended).
>
> Thanks.
>
> --
> Kresimir Kukulj
> Iskon Internet d.d.
> ISS
> Savska 41/X.
> 10000 Zagreb

* Re: NFS problems (kernel locks up)
  2003-03-21 19:49 ` Bernd Schubert
  2003-03-21 22:54   ` Kresimir Kukulj
@ 2003-03-21 22:57   ` Kresimir Kukulj
  1 sibling, 0 replies; 7+ messages in thread
From: Kresimir Kukulj @ 2003-03-21 22:57 UTC (permalink / raw)
  To: Bernd Schubert; +Cc: nfs

Quoting Bernd Schubert (bernd-schubert@web.de):
> > Some info about hardware:
> > Dell PowerApp 200 with 2 x Pentium III (Coppermine), each 1GHz.
> > 1Gb memory, with CONFIG_HIGHMEM4G=y.
> > eepro100 ethernet
> > ServerWorks chipset but nothing except CDROM is connected to it.
> > ICP Vortex Hardware RAID model GDT8523RZ
> > Driver for this (SCSI) controller is from 2.4.20 kernel (it's pretty new).
> > 5 FUJITSU MAJ3364MC 34Gb drives in RAID5 (4+hotfix).
> > Filesystem is ext3 with journal=ordered.
> >
> 
> We have a rather similar machine (well, without the raid and not from Dell) 
> and with 2GB. Actually we had some trouble with the serverworks chipset and 
> the memory.
> The lockups you are describing are probably not nfs related, but due to your 
> hardware. Try to update your bios, disable mtrr, agp, similar speed 
> optimizing things in your kernel configuration. 
> Run memtest86 (the full test), even if you have ECC.
> If all of this still doesn't help, try to disable dual-cpu support.
> As far as I know, ext3 causes more problems with nfs than reiserfs does, but
> this shouldn't cause the lockups.

Uh, sorry to follow up again, but I forgot to ask.
Could you elaborate a bit on what problems ext3 has with NFS?
Have you tried XFS, and what are your experiences using it with NFS?
I hope this is not too off-topic for this list.

-- 
Kresimir Kukulj                      madmax@iskon.hr
+--------------------------------------------------+
Old PC's never die. They just become Unix terminals.



* Re: NFS problems (kernel locks up)
  2003-03-21 19:49 ` Bernd Schubert
@ 2003-03-21 22:54   ` Kresimir Kukulj
  2003-03-21 22:57   ` Kresimir Kukulj
  1 sibling, 0 replies; 7+ messages in thread
From: Kresimir Kukulj @ 2003-03-21 22:54 UTC (permalink / raw)
  To: Bernd Schubert; +Cc: nfs

Quoting Bernd Schubert (bernd-schubert@web.de):
> > Some info about hardware:
> > Dell PowerApp 200 with 2 x Pentium III (Coppermine), each 1GHz.
> > 1Gb memory, with CONFIG_HIGHMEM4G=y.
> > eepro100 ethernet
> > ServerWorks chipset but nothing except CDROM is connected to it.
> > ICP Vortex Hardware RAID model GDT8523RZ
> > Driver for this (SCSI) controller is from 2.4.20 kernel (it's pretty new).
> > 5 FUJITSU MAJ3364MC 34Gb drives in RAID5 (4+hotfix).
> > Filesystem is ext3 with journal=ordered.
> >
> 
> We have a rather similar machine (well, without the raid and not from Dell) 
> and with 2GB. Actually we had some trouble with the serverworks chipset and 
> the memory.
> The lockups you are describing are probably not nfs related, but due to your 
> hardware. Try to update your bios, disable mtrr, agp, similar speed 
> optimizing things in your kernel configuration. 
> Run memtest86 (the full test), even if you have ECC.
> If all of this still doesn't help, try to disable dual-cpu support.
> As far as I know, ext3 causes more problems with nfs than reiserfs does, but
> this shouldn't cause the lockups.

Thanks for replying. I will try your advice.
Next time it crashes, I will use the same kernel but without support for
mtrr, SMP, HIGHMEM, and the IDE ATA subsystem compiled in, as they are not really
essential. Unfortunately, the server is in a semi production/testing phase, so I
cannot use memtest86 for now.

-- 
Kresimir Kukulj                      madmax@iskon.hr
+--------------------------------------------------+
Old PC's never die. They just become Unix terminals.



* Re: NFS problems (kernel locks up)
  2003-03-19 18:22 NFS problems " Kresimir Kukulj
@ 2003-03-21 19:49 ` Bernd Schubert
  2003-03-21 22:54   ` Kresimir Kukulj
  2003-03-21 22:57   ` Kresimir Kukulj
  2003-03-24 17:19 ` David Dougall
  1 sibling, 2 replies; 7+ messages in thread
From: Bernd Schubert @ 2003-03-21 19:49 UTC (permalink / raw)
  To: Kresimir Kukulj; +Cc: nfs

> Some info about hardware:
> Dell PowerApp 200 with 2 x Pentium III (Coppermine), each 1GHz.
> 1Gb memory, with CONFIG_HIGHMEM4G=y.
> eepro100 ethernet
> ServerWorks chipset but nothing except CDROM is connected to it.
> ICP Vortex Hardware RAID model GDT8523RZ
> Driver for this (SCSI) controller is from 2.4.20 kernel (it's pretty new).
> 5 FUJITSU MAJ3364MC 34Gb drives in RAID5 (4+hotfix).
> Filesystem is ext3 with journal=ordered.
>

We have a rather similar machine (well, without the raid and not from Dell) 
and with 2GB. Actually we had some trouble with the serverworks chipset and 
the memory.
The lockups you are describing are probably not nfs related, but due to your 
hardware. Try to update your bios, disable mtrr, agp, similar speed 
optimizing things in your kernel configuration. 
Run memtest86 (the full test), even if you have ECC.
If all of this still doesn't help, try to disable dual-cpu support.
As far as I know, ext3 causes more problems with nfs than reiserfs does, but
this shouldn't cause the lockups.

Hope it helps,
	Bernd



* NFS problems (kernel locks up)
@ 2003-03-19 18:22 Kresimir Kukulj
  2003-03-21 19:49 ` Bernd Schubert
  2003-03-24 17:19 ` David Dougall
  0 siblings, 2 replies; 7+ messages in thread
From: Kresimir Kukulj @ 2003-03-19 18:22 UTC (permalink / raw)
  To: nfs

Hi

We are trying to assess if linux could perform as a NFS server to linux
client(s). In our test we moved part of mailboxes of a freemail service
(after some initial testing) to a NFS storage (linux NFS server). It worked
ok, and used very little resources. But, during the nightly backup, NFS
server crashed. Symptoms were that:
  1. client detected that NFS server is not responding
  2. NFS server responded to ping, but you could not log in to it. Every
     attempt to log-in stopped at TCP connection being established, but
     daemon did not respond (I presume, that at that particular moment
     TCP/IP stack was still working).
  3. After cca 10 minutes, it locks up (not ping-able).
  4. I have serial console attached to the server, and kernel did not
     respond to SYS-REQ.
  5. After turning off the power and then back on, server booted, and
     resumed its function.
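
Symptom 2 above (the TCP connection is established but the daemon never answers) can be told apart from an unreachable host with a small probe. This is a minimal sketch, assuming a service that sends a greeting first, such as SSH, POP3 or SMTP; the function name `probe_service` and the returned labels are invented for illustration.

```python
import socket


def probe_service(host, port, timeout=5.0):
    """Classify a TCP service as refused, unreachable, silent after connect, or responding."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"          # host is up, nothing is listening on the port
    except OSError:
        return "unreachable"      # connect timed out or no route to the host
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(128)  # wait for the service greeting
        except socket.timeout:
            # The kernel completed the handshake, but user space never wrote anything.
            return "connected_no_response"
        except OSError:
            return "connection_error"
        return "responded" if data else "closed_by_peer"
```

A result of `connected_no_response` while the host still answers ping would match the state described in point 2: the network stack is alive but processes are no longer being scheduled or are blocked.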

This happened three times, every time during the backup (Networker),
sometimes only 5 minutes after backup started, sometimes after 1.5 hours.
This was all using 2.4.20 kernel (no extra patches), using NFSv3, udp, async.
NFS client was using: rw,hard,intr,udp,rsize=8192,wsize=8192,nodev,nosuid
NFS server used: rw,no_root_squash (default is async).
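
When several option combinations are being tried, it helps to compare them as key/value sets rather than by eye. The helper below is a hypothetical sketch (the name `parse_mount_options` is not from any NFS tool); it only splits an option string of the form shown above.

```python
def parse_mount_options(option_string):
    """Split 'rw,hard,rsize=8192' into {'rw': True, 'hard': True, 'rsize': '8192'}."""
    options = {}
    for item in option_string.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        # Flags without '=' are stored as True; valued options keep their string value.
        options[key] = value if sep else True
    return options


client = parse_mount_options("rw,hard,intr,udp,rsize=8192,wsize=8192,nodev,nosuid")
server = parse_mount_options("rw,no_root_squash")
print(sorted(client.items()))
print(sorted(server.items()))
```

The same function can be applied to the options column of /proc/mounts on the client to confirm which settings are actually in effect.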

Then, I have put 2.4.21-pre5 because it contained some NFS fixes. After
that, server survived three days (2 incrementals and one full backup
completed successfully). Then it crashed during the day for no apparent
reason (we have the server monitored with 'cricket', and there were no
unusual activities...).

I have changed to NFSv2,sync,udp and it crashed during the backup that night,
and then again during the day. This resulted with filesystem corruption
(replaying the ext3 journal caused fsck to be invoked - couple of hours was
wasted on checking).

Now I have reverted back to NFSv3,udp, but kept 'sync'. I will see tonight
will it survive or not. 

Filesystem is 99Gb ext3 partition, with 1024 block size, internal journal.
That fs is 50% full, and contains around 290000 files (13.7% fragmentation).
Files are between few kilobytes up to 10 Mb.

Normal filesystem usage is ~200kb read, 300Kb write per second with < 5%
disk utilization. When backup runs, reading gets ~ 5Mb/sec with disk
utilization of ~ 100%.

Client and server are connected to the same switch, with no dropped packets.

We are satisfied with performance (while the server works).

Can anybody give a suggestion ? I have tried everything I can think of.
We would like to use linux as a NFS server, but if this does not work, we
will be forced to consider alternatives like Solaris x86.
Can anyone here suggest a good alternative NFS server OS (for x86) with a
good support for SCSI HW RAID controllers ? ICP Vortex unfortunately is
not supported under Solaris x86, but what other controllers (let's say for
Solaris x86) do you recommend ?

Also, I am concerned about filesystem. Will ext3 be able to handle, let's
say, 10 million files ? If not, will Solaris x86 UFS be any better.
[ For us, reiser proved to be sometimes difficult, and we had couple of fs
related crashes, so we are trying to find alternatives. Filesystem check
on that amount of files is measured in days. ]
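
For a sense of scale, the figures given earlier in this message (a 99Gb filesystem, 50% full, about 290000 files) imply an average of roughly 180 KB per file. The arithmetic below is a rough sketch that assumes "Gb" means GiB and that the average file size would stay the same at 10 million files; neither assumption is stated in the message.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4


def average_file_size(fs_bytes, used_fraction, file_count):
    """Average bytes per file for a filesystem of fs_bytes that is used_fraction full."""
    return fs_bytes * used_fraction / file_count


avg = average_file_size(99 * GIB, 0.50, 290_000)
projected = avg * 10_000_000  # space needed if 10 million files had the same average size

print(f"average file size: {avg / 1024:.0f} KiB")
print(f"10 million files at that average: {projected / TIB:.2f} TiB")
```

Under those assumptions, 10 million files would need well over a terabyte, so the question is as much about total capacity and fsck time as about the file count alone.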

Some info about hardware:
Dell PowerApp 200 with 2 x Pentium III (Coppermine), each 1GHz.
1Gb memory, with CONFIG_HIGHMEM4G=y.
eepro100 ethernet
ServerWorks chipset but nothing except CDROM is connected to it.
ICP Vortex Hardware RAID model GDT8523RZ
Driver for this (SCSI) controller is from 2.4.20 kernel (it's pretty new).
5 FUJITSU MAJ3364MC 34Gb drives in RAID5 (4+hotfix).
Filesystem is ext3 with journal=ordered.

Kernel is vanilla 2.4.20, and 2.4.21-pre5.
I can provide 'dmesg' and '.config' for that kernel.

Distribution is Debian stable 3.0.
These packages are installed:
ii  nfs-common              1.0-2                   NFS support files common to client and server
ii  nfs-kernel-server       1.0-2                   Kernel NFS server support

NFS server and client use fixed ports as described at NFS-Howto:
Kernel command line: root=/dev/sda2 lockd.udpport=32768 \
                     lockd.tcpport=32768 console=tty0 console=ttyS0,9600
statd, mountd are fixed as well, and iptables are configured to pass
fragmented packets. By default, NFS server runs with 8 kernel threads
(knfsd). According to /proc/net/rpc/nfsd there is no need for more kernel
threads.
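
The thread-usage check mentioned above can be scripted. The sketch below assumes the "th" line layout described in the NFS-HOWTO for 2.4-era kernels (thread count, number of times all threads were busy, then ten per-decile busy-time values); later kernels changed or dropped this line, and the sample text is made up for illustration rather than taken from the server in question.

```python
def parse_nfsd_threads(stats_text):
    """Extract thread statistics from the 'th' line of /proc/net/rpc/nfsd contents."""
    for line in stats_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "th":
            return {
                "threads": int(fields[1]),
                "times_all_busy": int(fields[2]),
                # Seconds spent at 0-10%, 10-20%, ... 90-100% thread utilisation.
                "busy_histogram": [float(value) for value in fields[3:]],
            }
    return None  # no 'th' line present (for example on a newer kernel)


# Made-up sample in the assumed format, not real output from this machine.
SAMPLE = "rc 0 0 0\nth 8 0 12.500 0.300 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000\n"
stats = parse_nfsd_threads(SAMPLE)
print(stats)
```

Large values in the last few histogram buckets, or a non-zero `times_all_busy`, would suggest that 8 threads are not enough; zeros there support the conclusion that more threads are not needed.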

Services that run on NFS client are POP3 and SMTP daemons and a web based
frontend that uses them. Both daemons are configured to use their version of
dot locking (as recommended).

Thanks.

-- 
Kresimir Kukulj
Iskon Internet d.d.
ISS
Savska 41/X.
10000 Zagreb


