linux-kernel.vger.kernel.org archive mirror
* processes stuck in D state
@ 2003-05-05  5:52 Zeev Fisher
  2003-05-05 14:56 ` Michael Buesch
  0 siblings, 1 reply; 7+ messages in thread
From: Zeev Fisher @ 2003-05-05  5:52 UTC (permalink / raw)
  To: linux-kernel

Hi!

I have a persistent problem with unkillable processes stuck in D state
(uninterruptible sleep) on my Linux servers.
It happens at random, each time on a different server and in a different
process (all the servers are configured identically, with the 2.4.18-10
kernel). Here's an example:

[root@lnx35 /]# ps -el|grep D
  F S   UID   PID  PPID  C PRI  NI ADDR    SZ WCHAN  TTY          TIME CMD
000 D   911 29327     1  0  75   0    -  9382 lock_p ?        00:00:00 calibre
000 D   894 30049 15854  0  75   0    -  8995 lock_p ?        00:00:01 calibrewb
000 D   894 30092  8661  0  75   0    -  8995 lock_p ?        00:00:01 calibrewb
000 D   894 29773 26052  0  75   0    -  8977 lock_p ?        00:00:01 calibrewb
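
Note that `grep D` also matches the header line (through ADDR) and any
command whose name contains a capital D. A minimal sketch of a stricter
filter, assuming `ps -el` output in the column layout shown above; the helper
name d_state_tasks and the extra init row in the sample are made up for
illustration:

```python
# Sample `ps -el` text. The first two rows come from the listing above;
# the init row is an invented non-D task, added only to show the filtering.
SAMPLE = """\
  F S   UID   PID  PPID  C PRI  NI ADDR    SZ WCHAN  TTY          TIME CMD
000 D   911 29327     1  0  75   0    -  9382 lock_p ?        00:00:00 calibre
000 D   894 30049 15854  0  75   0    -  8995 lock_p ?        00:00:01 calibrewb
000 S     0     1     0  0  75   0    -   353 schedu ?        00:00:04 init
"""


def d_state_tasks(ps_output):
    """Return (pid, wchan, cmd) for every task whose state column is exactly D."""
    lines = ps_output.splitlines()
    header = lines[0].split()
    s_col = header.index("S")
    pid_col = header.index("PID")
    wchan_col = header.index("WCHAN")
    cmd_col = header.index("CMD")
    found = []
    for line in lines[1:]:
        # Split at most cmd_col times so a command containing spaces stays whole.
        fields = line.split(None, cmd_col)
        if len(fields) <= cmd_col:
            continue  # blank or truncated line
        if fields[s_col] == "D":
            found.append((int(fields[pid_col]), fields[wchan_col],
                          fields[cmd_col].strip()))
    return found


if __name__ == "__main__":
    for pid, wchan, cmd in d_state_tasks(SAMPLE):
        print(pid, wchan, cmd)
```

Matching on the S column alone avoids false hits from other columns.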


They were probably stuck while trying to acquire a lock (which was
certainly free) on an NFS volume mounted from a NetApp server.

Enabling RPC debugging ( echo '65535' >/proc/sys/sunrpc/rpc_debug )
didn't give me any clue.
Tracing the PID of a stuck process doesn't produce any output.

Those processes have been there for a few days already and will stay there
until the next reboot.

The load average is now 4 (although the machine is 100% idle), and the
system seems to work fine.
If other programs are started, they run and use the same mounts
that the processes above are stuck on.
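
On Linux the load average counts tasks in uninterruptible sleep as well as
runnable ones, so four tasks parked in D state would by themselves hold the
figure near 4 on an otherwise idle machine. A rough floating-point sketch of
the 1-minute average, assuming one sample every 5 seconds (the kernel itself
uses fixed-point arithmetic; the function name is made up for illustration):

```python
import math

SAMPLE_INTERVAL = 5.0  # seconds between load samples (assumed)
WINDOW = 60.0          # time constant of the 1-minute average


def simulate_loadavg(active_tasks, seconds, start=0.0):
    """Exponentially damped average of a constant number of active tasks.

    "Active" here means runnable or in uninterruptible (D) sleep.
    """
    decay = math.exp(-SAMPLE_INTERVAL / WINDOW)
    load = start
    for _ in range(int(seconds / SAMPLE_INTERVAL)):
        load = load * decay + active_tasks * (1.0 - decay)
    return load


if __name__ == "__main__":
    # Four tasks stuck in D state and nothing else running, for ten minutes.
    print(round(simulate_loadavg(4, 600), 2))
```

In this model the average converges on the number of stuck tasks and stays
there for as long as they remain in D state.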

Another detail: I think these problems started when I added the 'intr'
option to my NFS-mounted filesystems, but I'm not sure. I also can't easily
check that, since the problem is not reproducible.

Has anyone noticed the same behavior? Is this a well-known problem?


Thanks for your help.

-- 
Zeev Fisher - Unix System Administrator
Marvell Semiconductor Israel Ltd
Moshav Manof, D.N. Misgav 20184, ISRAEL
Email    -  Zeev.Fisher@il.marvell.com
Tel      -  + 972 4 9091402
Cell     -  + 972 54 995402
Fax      -  + 972 4 9091501
WWW Page:     http://www.marvell.com





* Re: processes stuck in D state
  2003-05-05  5:52 processes stuck in D state Zeev Fisher
@ 2003-05-05 14:56 ` Michael Buesch
  2003-05-05 15:24   ` Mike Waychison
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Buesch @ 2003-05-05 14:56 UTC (permalink / raw)
  To: Zeev Fisher; +Cc: linux kernel mailing list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Monday 05 May 2003 07:52, Zeev Fisher wrote:
> Hi!

Hi Zeev!

> I have a persistent problem with unkillable processes stuck in D state
> (uninterruptible sleep) on my Linux servers.
> It happens at random, each time on a different server and in a different
> process (all the servers are configured identically, with the 2.4.18-10
> kernel). Here's an example:
[snip]
> Has anyone noticed the same behavior ? Is this a well known problem ?

I've had the same problem with some 2.4.21-preX kernel twice (or maybe more
often, I don't remember) on one of my machines.
IMHO it has something to do with NFS (I'm using this box as an NFS client).
I wish I could reproduce it once more to do some traces on it, but I have
not found a way to reproduce it yet.

- -- 
Regards Michael Büsch
http://www.8ung.at/tuxsoft
 16:50:44 up 52 min,  1 user,  load average: 1.00, 1.00, 0.94
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE+tnugoxoigfggmSgRAt8BAJ0deufnL/E6acpz4pIPZll8f48TIgCfWmcI
auSRmi6oyrTbqMVe+MrfuV4=
=ahIZ
-----END PGP SIGNATURE-----



* Re: processes stuck in D state
  2003-05-05 14:56 ` Michael Buesch
@ 2003-05-05 15:24   ` Mike Waychison
  2003-05-05 16:25     ` Michael Buesch
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Waychison @ 2003-05-05 15:24 UTC (permalink / raw)
  To: Michael Buesch; +Cc: Zeev Fisher, linux kernel mailing list



On Mon, 5 May 2003, Michael Buesch wrote:

> I've had the same problem with some 2.4.21-preX twice (or maybe more times,
> don't remember) on one of my machines.
> IMHO it has something to do with NFS. (I'm using this box as a NFS-client).
> I wish, I could reproduce it one more time, to do some traces, etc
> on it. But I've not found a way to reproduce it, yet.

This happens when you mount an NFS filesystem with the 'hard' option (the
default) and the mount's handle expires unexpectedly (e.g. after a server
crash). Read the mount manpage for an explanation of the downsides of using
the 'soft' option.


Mike Waychison



* Re: processes stuck in D state
  2003-05-05 15:24   ` Mike Waychison
@ 2003-05-05 16:25     ` Michael Buesch
  2003-05-05 22:12       ` jw schultz
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Buesch @ 2003-05-05 16:25 UTC (permalink / raw)
  To: Mike Waychison; +Cc: Zeev Fisher, linux kernel mailing list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Monday 05 May 2003 17:24, Mike Waychison wrote:
> This happens when you mount an NFS mount with the 'hard' option (default)
> and a mount's handle expires incorrectly (eg: server crash).
> Read the mount manpage for an explanation to the downsides of using
> the 'soft' option.
>
>
> Mike Waychison

my fstab-entry:
192.168.0.50:/mnt/nfs_1 /mnt/nfs_1      nfs             rw,hard,intr,user,nodev,nosuid,exec         0 0

from man mount:
[snip] The process cannot be interrupted or killed unless you also specify intr. [/snip]

I can't interrupt any process that was accessing the NFS server
while the server was shutting down, even though intr is specified.
_That's_ my problem. :)
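
For reference, the options in such an fstab entry can be checked
mechanically. A minimal sketch that splits the entry above into fields and
reports whether the mount is hard or soft and whether intr is set; the helper
names are hypothetical, and 'hard' is assumed to be the default, as stated
earlier in the thread:

```python
def parse_fstab_line(line):
    """Split one fstab entry into its fields and an options dictionary."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected at least 4 whitespace-separated fields")
    device, mountpoint, fstype, raw_opts = fields[:4]
    options = {}
    for opt in raw_opts.split(","):
        # Options are either bare flags (intr) or key=value pairs.
        key, _, value = opt.partition("=")
        options[key] = value if value else True
    return {"device": device, "mountpoint": mountpoint,
            "fstype": fstype, "options": options}


def nfs_recovery_mode(entry):
    """Return ("hard" or "soft", intr_requested) for a parsed NFS entry."""
    opts = entry["options"]
    mode = "soft" if "soft" in opts else "hard"  # assumption: hard is the default
    return mode, "intr" in opts


if __name__ == "__main__":
    entry = parse_fstab_line(
        "192.168.0.50:/mnt/nfs_1 /mnt/nfs_1 nfs "
        "rw,hard,intr,user,nodev,nosuid,exec 0 0")
    print(nfs_recovery_mode(entry))
```

This only confirms what was requested in fstab; it says nothing about whether
the kernel honours intr for a task that is already in D state.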

- -- 
Regards Michael Büsch
http://www.8ung.at/tuxsoft
 18:23:58 up 48 min,  3 users,  load average: 1.20, 1.05, 0.93
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE+tpCZoxoigfggmSgRAkmdAJwM/L8mZpS+DE2WzjzrXuRdxuY98QCgin1l
aKik6/WGFwWXMjd8pjwHIXw=
=akJd
-----END PGP SIGNATURE-----



* Re: processes stuck in D state
  2003-05-05 16:25     ` Michael Buesch
@ 2003-05-05 22:12       ` jw schultz
  0 siblings, 0 replies; 7+ messages in thread
From: jw schultz @ 2003-05-05 22:12 UTC (permalink / raw)
  To: linux kernel mailing list

On Mon, May 05, 2003 at 06:25:48PM +0200, Michael Buesch wrote:
> my fstab-entry:
> 192.168.0.50:/mnt/nfs_1 /mnt/nfs_1      nfs             rw,hard,intr,user,nodev,nosuid,exec         0 0
> 
> from man mount:
> [snip] The process cannot be interrupted or killed unless you also specify intr. [/snip]
> 
> I can't interrupt any process that accessed the NFS-server
> while shutting down the server, although intr is specified.
> _That's_ my problem. :)

I had a similar problem with SuSE's 2.4.18.  Random processes
seemed to go into D state, from which intr is useless.

I rebuilt the kernel with NFSv3 disabled and that problem
went away.  The logs are now full of
    May  5 14:54:15 duncan kernel: NFS: NFSv3 not supported.
    May  5 14:54:15 duncan kernel: nfs warning: mount version older than kernel
but that I can live with.  Hung processes and a failing umount
I cannot abide.

If there is a better answer, I'm listening.


-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw@pegasys.ws

		Remember Cernan and Schmitt


* Re: processes stuck in D state
  2001-04-04 15:47 Pau Aliagas
@ 2001-04-07 22:07 ` Barry K. Nathan
  0 siblings, 0 replies; 7+ messages in thread
From: Barry K. Nathan @ 2001-04-07 22:07 UTC (permalink / raw)
  To: Pau Aliagas; +Cc: lkml

Pau Aliagas wrote:
> Since 2.2.4-ac28 and 2.4.3 I keep on getting processes in D state that I
> cannot kill, usually mozilla or nautilus which use a large amount of RAM.

I don't have time to help debug this, but I'm getting this too, with
2.4.3 final. The previous kernel I ran was 2.4.3-pre4, and it did not
have this problem.

In my case, it's usually mozilla (I'm seeing this with the daily
snapshots, but not with mozilla-0.8.1, at least not yet), but at least
once I saw it with freeamp (2.1rc5) too.

-Barry K. Nathan <barryn@pobox.com>


* processes stuck in D state
@ 2001-04-04 15:47 Pau Aliagas
  2001-04-07 22:07 ` Barry K. Nathan
  0 siblings, 1 reply; 7+ messages in thread
From: Pau Aliagas @ 2001-04-04 15:47 UTC (permalink / raw)
  To: lkml


Since 2.2.4-ac28 and 2.4.3 I keep getting processes in D state that I
cannot kill, usually mozilla or nautilus, which use a large amount of RAM.
Today it is galeon:

A ps -eo pid,stat,pcpu,nwchan,wchan=WIDE-WCHAN-COLUMN -o args shows the
following:
11520 D     0.0 105db1 down_write_failed /usr/bin/galeon-bin
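
On kernels that expose /proc/<pid>/stat in the layout documented in proc(5),
the state letter can also be read directly rather than through ps. A minimal
sketch of the parsing, assuming that layout; the function names are
hypothetical and the fields after the state letter in the sample are made up:

```python
def parse_stat(stat_line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')' characters, so search for the last ')' on the line.
    """
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left])
    comm = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, comm, state


def is_uninterruptible(stat_line):
    """True if the task is in uninterruptible sleep (state D)."""
    return parse_stat(stat_line)[2] == "D"


if __name__ == "__main__":
    # Illustrative line: only the pid, name and state mirror the ps output above.
    sample = "11520 (galeon-bin) D 1 11520 11520 0 -1 0"
    print(parse_stat(sample), is_uninterruptible(sample))
```

On a live system the input line would come from reading /proc/<pid>/stat for
each numeric directory under /proc.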

This happened with neither 2.4.2 nor 2.4.3-pre7; I'm not sure
about pre8.

Pau


