* 2.6.9-rc2 hangs in posix_locks_deadlock
@ 2004-09-19 16:03 Vladimir B. Savkin
  2004-09-19 20:05 ` Vladimir B. Savkin
  0 siblings, 1 reply; 7+ messages in thread
From: Vladimir B. Savkin @ 2004-09-19 16:03 UTC
  To: linux-kernel

I was experiencing kernel hangs with versions 2.6.9-rc2 and
2.6.9-rc2-mm1 on two different boxes.

Today I managed to see the output of Alt+SysRq+P on the
hung box and write down the call trace (from the screen, so it is incomplete).

EIP (c015da89) was in function posix_locks_deadlock,
and the call trace was:
 __posix_lock_file
 fcntl_setlk


The offending process was saslauthd (version 2.1.15).
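
For context, here is a minimal C sketch of the kind of userspace call that reaches this path. It assumes (from memory of 2.6-era fs/locks.c, not from this report) that fcntl_setlk() is the kernel side of fcntl(2) with F_SETLK/F_SETLKW, and that the deadlock check only runs for blocking requests that hit a conflicting lock. The helpers open_scratch_file() and lock_whole_file() are invented for illustration; this is not what saslauthd actually does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create an unlinked scratch file, return its fd (-1 on error). */
int open_scratch_file(void)
{
	char path[] = "/tmp/posix-lock-demo-XXXXXX";
	int fd = mkstemp(path);

	if (fd >= 0)
		unlink(path);	/* the file lives on until fd is closed */
	return fd;
}

/*
 * Hypothetical helper: request a POSIX write lock on the whole file.
 * cmd is F_SETLK (fail at once on conflict) or F_SETLKW (sleep until granted).
 * Returns 0 on success, -1 with errno set otherwise.
 */
int lock_whole_file(int fd, int cmd)
{
	struct flock fl;

	assert(cmd == F_SETLK || cmd == F_SETLKW);
	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;	/* 0 means "through end of file" */
	return fcntl(fd, cmd, &fl);
}
```

If two processes call lock_whole_file(fd, F_SETLKW) on the same file, the second one sleeps on the conflict, which is, as far as I understand, the situation in which the kernel consults posix_locks_deadlock().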

~
:wq
                                        With best regards, 
                                           Vladimir Savkin. 



* Re: 2.6.9-rc2 hangs in posix_locks_deadlock
  2004-09-19 16:03 2.6.9-rc2 hangs in posix_locks_deadlock Vladimir B. Savkin
@ 2004-09-19 20:05 ` Vladimir B. Savkin
  2004-09-19 20:32   ` Trond Myklebust
  0 siblings, 1 reply; 7+ messages in thread
From: Vladimir B. Savkin @ 2004-09-19 20:05 UTC
  To: linux-kernel

On Sun, Sep 19, 2004 at 08:03:42PM +0400, Vladimir B. Savkin wrote:
> I was experiencing kernel hangs with versions 2.6.9-rc2 and
> 2.6.9-rc2-mm1 on two different boxes.

FYI: I have reverted the posix-locking-* patches (as found in the
2.6.9-rc2-mm1 patch set), and there have been no hangs since then.

> 
> Today I managed to see the output of Alt+SysRq+P on the
> hung box and write down the call trace (from the screen, so it is incomplete).
> 
> EIP (c015da89) was in function posix_locks_deadlock,
> and the call trace was:
>  __posix_lock_file
>  fcntl_setlk
~
:wq
                                        With best regards, 
                                           Vladimir Savkin. 



* Re: 2.6.9-rc2 hangs in posix_locks_deadlock
  2004-09-19 20:05 ` Vladimir B. Savkin
@ 2004-09-19 20:32   ` Trond Myklebust
  2004-09-19 20:36     ` Vladimir B. Savkin
  0 siblings, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2004-09-19 20:32 UTC
  To: Vladimir B. Savkin; +Cc: linux-kernel

On Sun, 19/09/2004 at 13:05, Vladimir B. Savkin wrote:
> > 
> > Today I managed to see the output of Alt+SysRq+P on the
> > hung box and write down the call trace (from the screen, so it is incomplete).
> > 
> > EIP (c015da89) was in function posix_locks_deadlock,
> > and the call trace was:
> >  __posix_lock_file
> >  fcntl_setlk

What filesystems are you using on that box?

Cheers,
  Trond



* Re: 2.6.9-rc2 hangs in posix_locks_deadlock
  2004-09-19 20:32   ` Trond Myklebust
@ 2004-09-19 20:36     ` Vladimir B. Savkin
  2004-09-19 22:51       ` Trond Myklebust
  0 siblings, 1 reply; 7+ messages in thread
From: Vladimir B. Savkin @ 2004-09-19 20:36 UTC
  To: Trond Myklebust; +Cc: linux-kernel

On Sun, Sep 19, 2004 at 01:32:08PM -0700, Trond Myklebust wrote:
> On Sun, 19/09/2004 at 13:05, Vladimir B. Savkin wrote:
> > > 
> > > Today I managed to see the output of Alt+SysRq+P on the
> > > hung box and write down the call trace (from the screen, so it is incomplete).
> > > 
> > > EIP (c015da89) was in function posix_locks_deadlock,
> > > and the call trace was:
> > >  __posix_lock_file
> > >  fcntl_setlk
> 
> What filesystems are you using on that box?

reiserfs on all partitions except / and /boot, which are ext2.

~
:wq
                                        With best regards, 
                                           Vladimir Savkin. 



* Re: 2.6.9-rc2 hangs in posix_locks_deadlock
  2004-09-19 20:36     ` Vladimir B. Savkin
@ 2004-09-19 22:51       ` Trond Myklebust
  2004-09-20 11:47         ` Vladimir B. Savkin
  2004-10-30 10:39         ` Vladimir B. Savkin
  0 siblings, 2 replies; 7+ messages in thread
From: Trond Myklebust @ 2004-09-19 22:51 UTC
  To: Vladimir B. Savkin; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 853 bytes --]

On Sun, 19/09/2004 at 13:36, Vladimir B. Savkin wrote:
> On Sun, Sep 19, 2004 at 01:32:08PM -0700, Trond Myklebust wrote:
> > On Sun, 19/09/2004 at 13:05, Vladimir B. Savkin wrote:
> > > > 
> > > > Today I managed to see the output of Alt+SysRq+P on the
> > > > hung box and write down the call trace (from the screen, so it is incomplete).
> > > > 
> > > > EIP (c015da89) was in function posix_locks_deadlock,
> > > > and the call trace was:
> > > >  __posix_lock_file
> > > >  fcntl_setlk

Hmm...  It appears that it is indeed possible for both leases and flocks
to be on the global "blocked_list", so the appended check is *not*
redundant.
Since flocks in particular do not initialize fl_owner, I suspect that
you might be seeing weird loops that were previously being avoided due
to the ->fl_pid checks...

Cheers,
 Trond


[-- Attachment #2: fix_posix_locks_deadlock.dif --]
[-- Type: text/plain, Size: 1163 bytes --]

[PATCH] fix posix_locks_deadlock().

"blocked_list" may contain both leases and flock locks. Since the latter in
particular do not initialize the fl_owner field, we have to beware not to
call posix_same_owner() on them.

Signed-off-by: Trond Myklebust <trond.myklebust@fys.uio.no>
---
 locks.c |    7 +++----
 1 files changed, 3 insertions(+), 4 deletions(-)

Index: linux-2.6.9-rc2-up/fs/locks.c
===================================================================
--- linux-2.6.9-rc2-up.orig/fs/locks.c	2004-09-19 13:55:33.680258334 -0700
+++ linux-2.6.9-rc2-up/fs/locks.c	2004-09-19 15:37:32.595634679 -0700
@@ -634,14 +634,13 @@
 int posix_locks_deadlock(struct file_lock *caller_fl,
 				struct file_lock *block_fl)
 {
-	struct list_head *tmp;
+	struct file_lock *fl;
 
 next_task:
 	if (posix_same_owner(caller_fl, block_fl))
 		return 1;
-	list_for_each(tmp, &blocked_list) {
-		struct file_lock *fl = list_entry(tmp, struct file_lock, fl_link);
-		if (posix_same_owner(fl, block_fl)) {
+	list_for_each_entry(fl, &blocked_list, fl_link) {
+		if (IS_POSIX(fl) && posix_same_owner(fl, block_fl)) {
 			fl = fl->fl_next;
 			block_fl = fl;
 			goto next_task;
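
To illustrate why the IS_POSIX() filter matters, here is a simplified userspace model of the walk; it is a sketch, not the kernel code. All names (toy_lock, toy_deadlock, KIND_*) are invented: the integer owner stands in for fl_owner, waits_on stands in for fl_next, and the hop limit is an addition so that the model always terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum lock_kind { KIND_POSIX, KIND_FLOCK, KIND_LEASE };

struct toy_lock {
	enum lock_kind kind;
	int owner;			/* only meaningful for KIND_POSIX */
	struct toy_lock *waits_on;	/* lock this blocked request waits for */
};

static bool same_owner(const struct toy_lock *a, const struct toy_lock *b)
{
	return a->owner == b->owner;
}

/*
 * Would letting "caller" sleep on "block" close a cycle of waiters?
 * "blocked" models the global blocked_list, which may also hold
 * non-POSIX entries whose owner field is garbage.
 */
bool toy_deadlock(const struct toy_lock *caller, const struct toy_lock *block,
		  struct toy_lock *const blocked[], size_t n)
{
	assert(caller != NULL && block != NULL);

	/* Each hop follows one waiter; n + 1 hops cover every entry. */
	for (size_t hops = 0; hops <= n; hops++) {
		const struct toy_lock *next = NULL;

		if (same_owner(caller, block))
			return true;

		for (size_t i = 0; i < n; i++) {
			const struct toy_lock *fl = blocked[i];

			/* Skip flocks and leases: their owner is not valid. */
			if (fl->kind == KIND_POSIX && same_owner(fl, block)) {
				next = fl->waits_on;
				break;
			}
		}
		if (next == NULL)
			return false;	/* owner of "block" is not waiting */
		block = next;
	}
	return false;	/* hop limit reached without finding the caller */
}
```

Without the kind check, a flock entry whose stale owner value happens to match would be followed as if it were a POSIX waiter, which is the sort of spurious chain the patch above guards against.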


* Re: 2.6.9-rc2 hangs in posix_locks_deadlock
  2004-09-19 22:51       ` Trond Myklebust
@ 2004-09-20 11:47         ` Vladimir B. Savkin
  2004-10-30 10:39         ` Vladimir B. Savkin
  1 sibling, 0 replies; 7+ messages in thread
From: Vladimir B. Savkin @ 2004-09-20 11:47 UTC
  To: Trond Myklebust; +Cc: linux-kernel

On Sun, Sep 19, 2004 at 03:51:43PM -0700, Trond Myklebust wrote:
> Hmm...  It appears that it is indeed possible for both leases and flocks
> to be on the global "blocked_list", so the appended check is *not*
> redundant.
> Since flocks in particular do not initialize fl_owner, I suspect that
> you might be seeing weird loops that were previously being avoided due
> to the ->fl_pid checks...
> 
> Cheers,
>  Trond
> 

> [PATCH] fix posix_locks_deadlock().

2.6.9-rc2-mm1 with this patch seems to be doing fine, thanks.

> 
~
:wq
                                        With best regards, 
                                           Vladimir Savkin. 



* Re: 2.6.9-rc2 hangs in posix_locks_deadlock
  2004-09-19 22:51       ` Trond Myklebust
  2004-09-20 11:47         ` Vladimir B. Savkin
@ 2004-10-30 10:39         ` Vladimir B. Savkin
  1 sibling, 0 replies; 7+ messages in thread
From: Vladimir B. Savkin @ 2004-10-30 10:39 UTC
  To: Trond Myklebust; +Cc: linux-kernel

On Sun, Sep 19, 2004 at 03:51:43PM -0700, Trond Myklebust wrote:
> Hmm...  It appears that it is indeed possible for both leases and flocks
> to be on the global "blocked_list", so the appended check is *not*
> redundant.
> Since flocks in particular do not initialize fl_owner, I suspect that
> you might be seeing weird loops that were previously being avoided due
> to the ->fl_pid checks...

I just noticed that this fix didn't make it into 2.6.9.

> [PATCH] fix posix_locks_deadlock().
> 
> "blocked_list" may contain both leases and flock locks. Since the latter in
> particular do not initialize the fl_owner field, we have to beware not to
> call posix_same_owner() on them.
> 
> Signed-off-by: Trond Myklebust <trond.myklebust@fys.uio.no>
> ---
>  locks.c |    7 +++----
>  1 files changed, 3 insertions(+), 4 deletions(-)
> 
> Index: linux-2.6.9-rc2-up/fs/locks.c
> ===================================================================
> --- linux-2.6.9-rc2-up.orig/fs/locks.c	2004-09-19 13:55:33.680258334 -0700
> +++ linux-2.6.9-rc2-up/fs/locks.c	2004-09-19 15:37:32.595634679 -0700
> @@ -634,14 +634,13 @@
>  int posix_locks_deadlock(struct file_lock *caller_fl,
>  				struct file_lock *block_fl)
>  {
> -	struct list_head *tmp;
> +	struct file_lock *fl;
>  
>  next_task:
>  	if (posix_same_owner(caller_fl, block_fl))
>  		return 1;
> -	list_for_each(tmp, &blocked_list) {
> -		struct file_lock *fl = list_entry(tmp, struct file_lock, fl_link);
> -		if (posix_same_owner(fl, block_fl)) {
> +	list_for_each_entry(fl, &blocked_list, fl_link) {
> +		if (IS_POSIX(fl) && posix_same_owner(fl, block_fl)) {
>  			fl = fl->fl_next;
>  			block_fl = fl;
>  			goto next_task;

~
:wq
                                        With best regards, 
                                           Vladimir Savkin. 

