2.4: Fix steal_locks race
diff mbox series

Message ID 20030808105321.GA5096@gondor.apana.org.au
State New, archived
Headers show
Series
  • 2.4: Fix steal_locks race
Related show

Commit Message

Herbert Xu Aug. 8, 2003, 10:53 a.m. UTC
Hi:

The steal_locks() call in binfmt_elf.c is buggy.  It steals locks from
a files entry whose reference was dropped much earlier.  This allows it
to steal other process's locks.

The following patch calls steal_locks() earlier so that this does not
happen.

Cheers,

Comments

Herbert Xu Aug. 9, 2003, 1:07 a.m. UTC | #1
On Fri, Aug 08, 2003 at 08:53:21PM +1000, herbert wrote:
> 
> The steal_locks() call in binfmt_elf.c is buggy.  It steals locks from
> a files entry whose reference was dropped much earlier.  This allows it
> to steal other process's locks.
> 
> The following patch calls steal_locks() earlier so that this does not
> happen.

My patch is buggy too.  If a file is closed by another clone between
the two steal_locks calls the lock will again be lost.  Fortunately
this much harder to trigger than the previous bug.

The following patch fixes that.
Andreas Gruenbacher Aug. 9, 2003, 1:46 a.m. UTC | #2
Hello,

thanks a lot for analyzing this. Please see the patch I just posted unter
the subject:

  [PATCH] 2.4.22-rc2 steal_locks and load_elf_binary cleanups

That patch fixes the bug in a slightly cleaner way.

On Sat, 9 Aug 2003, Herbert Xu wrote:

> On Fri, Aug 08, 2003 at 08:53:21PM +1000, herbert wrote:
> >
> > The steal_locks() call in binfmt_elf.c is buggy.  It steals locks from
> > a files entry whose reference was dropped much earlier.  This allows it
> > to steal other process's locks.

This makes sense.

> > The following patch calls steal_locks() earlier so that this does not
> > happen.
>
> My patch is buggy too.  If a file is closed by another clone between
> the two steal_locks calls the lock will again be lost.  Fortunately
> this much harder to trigger than the previous bug.
>
> The following patch fixes that.
>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Andreas Gruenbacher Aug. 9, 2003, 2:04 a.m. UTC | #3
On Sat, 9 Aug 2003, Herbert Xu wrote:

> On Fri, Aug 08, 2003 at 08:53:21PM +1000, herbert wrote:
> >
> > The steal_locks() call in binfmt_elf.c is buggy.  It steals locks from
> > a files entry whose reference was dropped much earlier.  This allows it
> > to steal other process's locks.
> >
> > The following patch calls steal_locks() earlier so that this does not
> > happen.
>
> My patch is buggy too.  If a file is closed by another clone between
> the two steal_locks calls the lock will again be lost.  Fortunately
> this much harder to trigger than the previous bug.

I think this is not a strict bug---this scenario is not covered by POSIX
in the first place. Unless lock stealing is done atomically with
unshare_files there is a window of oportunity between unshare_files() and
steal_locks(), so locks can still get lost.


Cheers,
Andreas.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Herbert Xu Aug. 9, 2003, 2:52 a.m. UTC | #4
On Sat, Aug 09, 2003 at 04:04:53AM +0200, Andreas Gruenbacher wrote:
>
> > My patch is buggy too.  If a file is closed by another clone between
> > the two steal_locks calls the lock will again be lost.  Fortunately
> > this much harder to trigger than the previous bug.
> 
> I think this is not a strict bug---this scenario is not covered by POSIX
> in the first place. Unless lock stealing is done atomically with
> unshare_files there is a window of oportunity between unshare_files() and
> steal_locks(), so locks can still get lost.

It's not a standard compliance issue.  In this case the lock will never
be released and it will eventually lead to a crash when someone reads
/proc/locks.
Andreas Gruenbacher Aug. 9, 2003, 2:59 a.m. UTC | #5
Here is an accumulate patch of Herbert's and my changes.

I could see no reason for ftmp in flush_old_exec and load_elf_binary, so I
removed that. Please correct me if that is wrong.

Cheers,
Andreas.


On Fri, 8 Aug 2003, Herbert Xu wrote:

> Hi:
>
> The steal_locks() call in binfmt_elf.c is buggy.  It steals locks from
> a files entry whose reference was dropped much earlier.  This allows it
> to steal other process's locks.
>
> The following patch calls steal_locks() earlier so that this does not
> happen.
Index: linux-2.4.22-rc2.orig/fs/exec.c
===================================================================
--- linux-2.4.22-rc2.orig.orig/fs/exec.c	2003-08-09 04:44:07.000000000 +0200
+++ linux-2.4.22-rc2.orig/fs/exec.c	2003-08-09 04:47:53.000000000 +0200
@@ -582,8 +582,6 @@ int flush_old_exec(struct linux_binprm *
 	retval = unshare_files();
 	if(retval)
 		goto flush_failed;
-	steal_locks(files, current->files);
-	put_files_struct(files);
 	
 	/* 
 	 * Release all of the old mmap stuff
@@ -592,6 +590,8 @@ int flush_old_exec(struct linux_binprm *
 	if (retval) goto mmap_failed;
 
 	/* This is the point of no return */
+	steal_locks(files);
+	put_files_struct(files);
 	release_old_signals(oldsig);
 
 	current->sas_ss_sp = current->sas_ss_size = 0;
@@ -629,6 +629,8 @@ int flush_old_exec(struct linux_binprm *
 	return 0;
 
 mmap_failed:
+	put_files_struct(current->files);
+	current->files = files;
 flush_failed:
 	spin_lock_irq(&current->sigmask_lock);
 	if (current->sig != oldsig) {
Index: linux-2.4.22-rc2.orig/include/linux/fs.h
===================================================================
--- linux-2.4.22-rc2.orig.orig/include/linux/fs.h	2003-08-09 04:44:07.000000000 +0200
+++ linux-2.4.22-rc2.orig/include/linux/fs.h	2003-08-09 04:46:05.000000000 +0200
@@ -674,7 +674,7 @@ extern int __get_lease(struct inode *ino
 extern time_t lease_get_mtime(struct inode *);
 extern int lock_may_read(struct inode *, loff_t start, unsigned long count);
 extern int lock_may_write(struct inode *, loff_t start, unsigned long count);
-extern void steal_locks(fl_owner_t from, fl_owner_t to);
+extern void steal_locks(fl_owner_t from);
 
 struct fasync_struct {
 	int	magic;
Index: linux-2.4.22-rc2.orig/fs/binfmt_elf.c
===================================================================
--- linux-2.4.22-rc2.orig.orig/fs/binfmt_elf.c	2003-08-09 04:44:07.000000000 +0200
+++ linux-2.4.22-rc2.orig/fs/binfmt_elf.c	2003-08-09 04:48:27.000000000 +0200
@@ -444,7 +444,7 @@ static int load_elf_binary(struct linux_
 	struct elfhdr interp_elf_ex;
   	struct exec interp_ex;
 	char passed_fileno[6];
-	struct files_struct *files, *ftmp;
+	struct files_struct *files;
 	
 	/* Get the exec-header */
 	elf_ex = *((struct elfhdr *) bprm->buf);
@@ -480,7 +480,6 @@ static int load_elf_binary(struct linux_
 	files = current->files;		/* Refcounted so ok */
 	if(unshare_files() < 0)
 		goto out_free_ph;
-	steal_locks(files, current->files);
 
 	/* exec will make our files private anyway, but for the a.out
 	   loader stuff we need to do it earlier */
@@ -603,7 +602,9 @@ static int load_elf_binary(struct linux_
 		goto out_free_dentry;
 
 	/* Discard our unneeded old files struct */
+	steal_locks(files);
 	put_files_struct(files);
+	files = NULL;
 
 	/* OK, This is the point of no return */
 	current->mm->start_data = 0;
@@ -714,18 +715,16 @@ static int load_elf_binary(struct linux_
 			elf_entry = load_elf_interp(&interp_elf_ex,
 						    interpreter,
 						    &interp_load_addr);
-
-		allow_write_access(interpreter);
-		fput(interpreter);
-		kfree(elf_interpreter);
-
 		if (BAD_ADDR(elf_entry)) {
 			printk(KERN_ERR "Unable to load interpreter\n");
-			kfree(elf_phdata);
 			send_sig(SIGSEGV, current, 0);
 			retval = -ENOEXEC; /* Nobody gets to see this, but.. */
-			goto out;
+			goto out_free_dentry;
 		}
+
+		allow_write_access(interpreter);
+		fput(interpreter);
+		kfree(elf_interpreter);
 	}
 
 	kfree(elf_phdata);
@@ -811,10 +810,10 @@ out_free_interp:
 out_free_file:
 	sys_close(elf_exec_fileno);
 out_free_fh:
-	ftmp = current->files;
-	current->files = files;
-	steal_locks(ftmp, current->files);
-	put_files_struct(ftmp);
+	if (files) {
+		put_files_struct(current->files);
+		current->files = files;
+	}
 out_free_ph:
 	kfree(elf_phdata);
 	goto out;
Index: linux-2.4.22-rc2.orig/fs/locks.c
===================================================================
--- linux-2.4.22-rc2.orig.orig/fs/locks.c	2003-08-09 04:44:07.000000000 +0200
+++ linux-2.4.22-rc2.orig/fs/locks.c	2003-08-09 04:46:05.000000000 +0200
@@ -1937,11 +1937,11 @@ done:
 	return length;
 }
 
-void steal_locks(fl_owner_t from, fl_owner_t to)
+void steal_locks(fl_owner_t from)
 {
 	struct list_head *tmp;
 
-	if (from == to)
+	if (from == current->files)
 		return;
 
 	lock_kernel();
@@ -1949,7 +1949,7 @@ void steal_locks(fl_owner_t from, fl_own
 		struct file_lock *fl = list_entry(tmp, struct file_lock,
 						  fl_link);
 		if (fl->fl_owner == from)
-			fl->fl_owner = to;
+			fl->fl_owner = current->files;
 	}
 	unlock_kernel();
 }
Herbert Xu Aug. 9, 2003, 3:04 a.m. UTC | #6
On Sat, Aug 09, 2003 at 04:59:03AM +0200, Andreas Gruenbacher wrote:
>
> @@ -714,18 +715,16 @@ static int load_elf_binary(struct linux_
>  			elf_entry = load_elf_interp(&interp_elf_ex,
>  						    interpreter,
>  						    &interp_load_addr);
> -
> -		allow_write_access(interpreter);
> -		fput(interpreter);
> -		kfree(elf_interpreter);
> -
>  		if (BAD_ADDR(elf_entry)) {
>  			printk(KERN_ERR "Unable to load interpreter\n");
> -			kfree(elf_phdata);
>  			send_sig(SIGSEGV, current, 0);
>  			retval = -ENOEXEC; /* Nobody gets to see this, but.. */
> -			goto out;
> +			goto out_free_dentry;
>  		}
> +
> +		allow_write_access(interpreter);
> +		fput(interpreter);
> +		kfree(elf_interpreter);
>  	}

This looks bad since you're past the point of no return.
Andreas Gruenbacher Aug. 9, 2003, 3:13 a.m. UTC | #7
On Sat, 9 Aug 2003, Herbert Xu wrote:

> On Sat, Aug 09, 2003 at 04:04:53AM +0200, Andreas Gruenbacher wrote:
> >
> > > My patch is buggy too.  If a file is closed by another clone between
> > > the two steal_locks calls the lock will again be lost.  Fortunately
> > > this much harder to trigger than the previous bug.
> >
> > I think this is not a strict bug---this scenario is not covered by POSIX
> > in the first place. Unless lock stealing is done atomically with
> > unshare_files there is a window of oportunity between unshare_files() and
> > steal_locks(), so locks can still get lost.
>
> It's not a standard compliance issue.  In this case the lock will never
> be released and it will eventually lead to a crash when someone reads
> /proc/locks.

I don't see how this would happen. Could you please elaborate?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Andreas Gruenbacher Aug. 9, 2003, 3:17 a.m. UTC | #8
On Sat, 9 Aug 2003, Herbert Xu wrote:

> On Sat, Aug 09, 2003 at 04:59:03AM +0200, Andreas Gruenbacher wrote:
> >
> > @@ -714,18 +715,16 @@ static int load_elf_binary(struct linux_
> >  			elf_entry = load_elf_interp(&interp_elf_ex,
> >  						    interpreter,
> >  						    &interp_load_addr);
> > -
> > -		allow_write_access(interpreter);
> > -		fput(interpreter);
> > -		kfree(elf_interpreter);
> > -
> >  		if (BAD_ADDR(elf_entry)) {
> >  			printk(KERN_ERR "Unable to load interpreter\n");
> > -			kfree(elf_phdata);
> >  			send_sig(SIGSEGV, current, 0);
> >  			retval = -ENOEXEC; /* Nobody gets to see this, but.. */
> > -			goto out;
> > +			goto out_free_dentry;
> >  		}
> > +
> > +		allow_write_access(interpreter);
> > +		fput(interpreter);
> > +		kfree(elf_interpreter);
> >  	}
>
> This looks bad since you're past the point of no return.

This is an equivalence transformation except for the explicit
sys_close(elf_exec_fileno) in the unwind code, which would eventually
happen, anyway.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Herbert Xu Aug. 9, 2003, 3:19 a.m. UTC | #9
On Sat, Aug 09, 2003 at 05:13:52AM +0200, Andreas Gruenbacher wrote:
> On Sat, 9 Aug 2003, Herbert Xu wrote:
> 
> > On Sat, Aug 09, 2003 at 04:04:53AM +0200, Andreas Gruenbacher wrote:
> > >
> > > > My patch is buggy too.  If a file is closed by another clone between
> > > > the two steal_locks calls the lock will again be lost.  Fortunately
> > > > this much harder to trigger than the previous bug.
> > >
> > > I think this is not a strict bug---this scenario is not covered by POSIX
> > > in the first place. Unless lock stealing is done atomically with
> > > unshare_files there is a window of oportunity between unshare_files() and
> > > steal_locks(), so locks can still get lost.
> >
> > It's not a standard compliance issue.  In this case the lock will never
> > be released and it will eventually lead to a crash when someone reads
> > /proc/locks.
> 
> I don't see how this would happen. Could you please elaborate?

Suppose that A and B share current->files and fd has a POSIX lock on it.

A			B
unshare_files
steal_locks
			close(fd)
exec fails
steal_locks
put_files_struct

The close in B fails to release the lock as it has been stolen by the
new files structure.  The second steal_locks sets the fl_owner back to
the original files structure which no longer has fd in it and hence can
never release that lock.  The put_files_struct doesn't release the lock
either since it is now owned by the original file structure.
Andreas Gruenbacher Aug. 9, 2003, 3:30 a.m. UTC | #10
On Sat, 2003-08-09 at 05:19, Herbert Xu wrote:
> On Sat, Aug 09, 2003 at 05:13:52AM +0200, Andreas Gruenbacher wrote:
> > On Sat, 9 Aug 2003, Herbert Xu wrote:
> > 
> > > On Sat, Aug 09, 2003 at 04:04:53AM +0200, Andreas Gruenbacher wrote:
> > > >
> > > > > My patch is buggy too.  If a file is closed by another clone between
> > > > > the two steal_locks calls the lock will again be lost.  Fortunately
> > > > > this much harder to trigger than the previous bug.
> > > >
> > > > I think this is not a strict bug---this scenario is not covered by POSIX
> > > > in the first place. Unless lock stealing is done atomically with
> > > > unshare_files there is a window of oportunity between unshare_files() and
> > > > steal_locks(), so locks can still get lost.
> > >
> > > It's not a standard compliance issue.  In this case the lock will never
> > > be released and it will eventually lead to a crash when someone reads
> > > /proc/locks.
> > 
> > I don't see how this would happen. Could you please elaborate?
> 
> Suppose that A and B share current->files and fd has a POSIX lock on it.
> 
> A			B
> unshare_files
> steal_locks
> 			close(fd)
> exec fails
> steal_locks
> put_files_struct
> 
> The close in B fails to release the lock as it has been stolen by the
> new files structure.  The second steal_locks sets the fl_owner back to
> the original files structure which no longer has fd in it and hence can
> never release that lock.  The put_files_struct doesn't release the lock
> either since it is now owned by the original file structure.

In the patch I've sent there is no stealing back of locks, so that case
does not exist.


Cheers,

Patch
diff mbox series

--- kernel-source-2.4/fs/binfmt_elf.c.orig	2003-08-08 20:46:56.000000000 +1000
+++ kernel-source-2.4/fs/binfmt_elf.c	2003-08-08 20:47:05.000000000 +1000
@@ -480,6 +480,7 @@ 
 	files = current->files;		/* Refcounted so ok */
 	if(unshare_files() < 0)
 		goto out_free_ph;
+	steal_locks(files, current->files);
 
 	/* exec will make our files private anyway, but for the a.out
 	   loader stuff we need to do it earlier */
@@ -797,7 +798,6 @@ 
 	if (current->ptrace & PT_PTRACED)
 		send_sig(SIGTRAP, current, 0);
 	retval = 0;
-	steal_locks(files, current->files);
 out:
 	return retval;
 
@@ -813,6 +813,7 @@ 
 out_free_fh:
 	ftmp = current->files;
 	current->files = files;
+	steal_locks(ftmp, current->files);
 	put_files_struct(ftmp);
 out_free_ph:
 	kfree(elf_phdata);