linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [GIT PULL] core kernel fixes
Date: Mon, 18 May 2009 21:20:10 +0200 (CEST)	[thread overview]
Message-ID: <alpine.LFD.2.00.0905181801250.3516@localhost.localdomain> (raw)
In-Reply-To: <alpine.LFD.2.01.0905180840420.3301@localhost.localdomain>

Linus,

On Mon, 18 May 2009, Linus Torvalds wrote:
> On Mon, 18 May 2009, Ingo Molnar wrote:
> > 
> > Thomas Gleixner (1):
> >       futex: futex mapping needs to be writable
> 
> I do not believe this is right.

You are right to believe that :)

> Just a few lines later, we have:
> 
>          * NOTE: When userspace waits on a MAP_SHARED mapping, even if
>          * it's a read-only handle, it's expected that futexes attach to   
>          * the object not the particular process.
> 
> note how we are _supposed_ to be able to wait for something that is 
> read-only. As such, asking for a writable page is bogus.

We write access the user space address in various places in the futex
functions, so we really need a writeable mapping for a bunch of the
futex ops. 

We have an explicit check for the private futexes a few lines up.

      if (unlikely(!access_ok(VERIFY_WRITE, uaddr, sizeof(u32))))

There are some of the futex ops which can be done with a RO mapping
though. I'm not sure if it makes much sense as user space (at least
glibc) always writes the variable and/or the surrounding members in
the user space data structures, but for now we should leave it RO for
the ones which do not modify the user value.

> I'm not going to pull this. I can well imagine that there was a real bug, 
> but this is _not_ the real fix.
> 
> The commentary is also TOTAL CRAP as far as I can tell. It starts out 
> with:
> 
>     commit 734b05b10e51d4ba38c8fc3ee02e846aab09eedf (futex: use
>     fast_gup()) calls get_user_pages_fast() with the write argument set to
>     0. This went unnoticed [...]
> 
> and that is pure and utter SHIT. The fact is, the write argument was 
> ALWAYS zero, and commit 734b05b10e51d4ba38c8fc3ee02e846aab09eedf has 
> nothing to do with anything what-so-ever, and nothing went unnoticed 
> anywhere.

Sorry, I misread the GUP commit.

> The real bug was apparently just commit e4dc5b7a3 ("clean up").

So we always had that write=0 mapping. And we did not notice that we
always faulted in the futex functions which modify the user space
variable simply because the fault was fixed up in the private futex
fault handling code. The removal of that code led to the problem which
we have right now.

Correct fix below.

Thanks,

	tglx
------>
futex: setup writeable mapping for futex ops which modify user space data

The futex code installs a read only mapping via get_user_pages_fast()
even if the futex op function has to modify user space data. The
eventual fault was fixed up by futex_handle_fault() which walked the
VMA with mmap_sem held.

After the cleanup patches which removed the mmap_sem dependency of the
futex code commit 4dc5b7a36a49eff97050894cf1b3a9a02523717 (futex:
clean up fault logic) removed the private VMA walk logic from the
futex code. This change results in a stale RO mapping which is not
fixed up.

Instead of reintroducing the previous fault logic we set up the
mapping in get_user_pages_fast() read/write for all operations which
modify user space data. Also handle private futexes in the same way
and make the current unconditional access_ok(VERIFY_WRITE) depend on
the futex op.

Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
CC: stable@kernel.org

diff --git a/kernel/futex.c b/kernel/futex.c
index eef8cd2..3d7519d 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -193,6 +193,7 @@ static void drop_futex_key_refs(union futex_key *key)
  * @uaddr: virtual address of the futex
  * @fshared: 0 for a PROCESS_PRIVATE futex, 1 for PROCESS_SHARED
  * @key: address where result is stored.
+ * @rw: mapping needs to be read/write (values: VERIFY_READ, VERIFY_WRITE)
  *
  * Returns a negative error code or 0
  * The key words are stored in *key on success.
@@ -203,7 +204,8 @@ static void drop_futex_key_refs(union futex_key *key)
  *
  * lock_page() might sleep, the caller should not hold a spinlock.
  */
-static int get_futex_key(u32 __user *uaddr, int fshared, union futex_key *key)
+static int
+get_futex_key(u32 __user *uaddr, int fshared, union futex_key *key, int rw)
 {
 	unsigned long address = (unsigned long)uaddr;
 	struct mm_struct *mm = current->mm;
@@ -226,7 +228,7 @@ static int get_futex_key(u32 __user *uaddr, int fshared, union futex_key *key)
 	 *        but access_ok() should be faster than find_vma()
 	 */
 	if (!fshared) {
-		if (unlikely(!access_ok(VERIFY_WRITE, uaddr, sizeof(u32))))
+		if (unlikely(!access_ok(rw, uaddr, sizeof(u32))))
 			return -EFAULT;
 		key->private.mm = mm;
 		key->private.address = address;
@@ -235,7 +237,7 @@ static int get_futex_key(u32 __user *uaddr, int fshared, union futex_key *key)
 	}
 
 again:
-	err = get_user_pages_fast(address, 1, 0, &page);
+	err = get_user_pages_fast(address, 1, rw == VERIFY_WRITE, &page);
 	if (err < 0)
 		return err;
 
@@ -677,7 +679,7 @@ static int futex_wake(u32 __user *uaddr, int fshared, int nr_wake, u32 bitset)
 	if (!bitset)
 		return -EINVAL;
 
-	ret = get_futex_key(uaddr, fshared, &key);
+	ret = get_futex_key(uaddr, fshared, &key, VERIFY_READ);
 	if (unlikely(ret != 0))
 		goto out;
 
@@ -723,10 +725,10 @@ futex_wake_op(u32 __user *uaddr1, int fshared, u32 __user *uaddr2,
 	int ret, op_ret;
 
 retry:
-	ret = get_futex_key(uaddr1, fshared, &key1);
+	ret = get_futex_key(uaddr1, fshared, &key1, VERIFY_READ);
 	if (unlikely(ret != 0))
 		goto out;
-	ret = get_futex_key(uaddr2, fshared, &key2);
+	ret = get_futex_key(uaddr2, fshared, &key2, VERIFY_WRITE);
 	if (unlikely(ret != 0))
 		goto out_put_key1;
 
@@ -814,10 +816,10 @@ static int futex_requeue(u32 __user *uaddr1, int fshared, u32 __user *uaddr2,
 	int ret, drop_count = 0;
 
 retry:
-	ret = get_futex_key(uaddr1, fshared, &key1);
+	ret = get_futex_key(uaddr1, fshared, &key1, VERIFY_READ);
 	if (unlikely(ret != 0))
 		goto out;
-	ret = get_futex_key(uaddr2, fshared, &key2);
+	ret = get_futex_key(uaddr2, fshared, &key2, VERIFY_READ);
 	if (unlikely(ret != 0))
 		goto out_put_key1;
 
@@ -1140,7 +1142,7 @@ static int futex_wait(u32 __user *uaddr, int fshared,
 	q.bitset = bitset;
 retry:
 	q.key = FUTEX_KEY_INIT;
-	ret = get_futex_key(uaddr, fshared, &q.key);
+	ret = get_futex_key(uaddr, fshared, &q.key, VERIFY_READ);
 	if (unlikely(ret != 0))
 		goto out;
 
@@ -1330,7 +1332,7 @@ static int futex_lock_pi(u32 __user *uaddr, int fshared,
 	q.pi_state = NULL;
 retry:
 	q.key = FUTEX_KEY_INIT;
-	ret = get_futex_key(uaddr, fshared, &q.key);
+	ret = get_futex_key(uaddr, fshared, &q.key, VERIFY_WRITE);
 	if (unlikely(ret != 0))
 		goto out;
 
@@ -1594,7 +1596,7 @@ retry:
 	if ((uval & FUTEX_TID_MASK) != task_pid_vnr(current))
 		return -EPERM;
 
-	ret = get_futex_key(uaddr, fshared, &key);
+	ret = get_futex_key(uaddr, fshared, &key, VERIFY_WRITE);
 	if (unlikely(ret != 0))
 		goto out;
 


  reply	other threads:[~2009-05-18 19:20 UTC|newest]

Thread overview: 98+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-05-18 14:23 [GIT PULL] core kernel fixes Ingo Molnar
2009-05-18 15:48 ` Linus Torvalds
2009-05-18 19:20   ` Thomas Gleixner [this message]
2009-05-19 20:52     ` Linus Torvalds
2009-05-19 21:45       ` Thomas Gleixner
2009-05-19 22:20     ` Darren Hart
2009-05-18 16:59 ` [GIT PULL, v2] " Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2020-01-29 11:53 [GIT PULL] " Ingo Molnar
2020-01-29 19:10 ` pr-tracker-bot
2019-05-18  8:51 Ingo Molnar
2019-05-19 17:45 ` pr-tracker-bot
2019-05-16 15:51 Ingo Molnar
2019-05-16 18:20 ` pr-tracker-bot
2018-07-21 11:58 Ingo Molnar
2017-12-06 22:01 Ingo Molnar
2017-11-05 14:33 Ingo Molnar
2017-11-05 18:09 ` Linus Torvalds
2017-11-05 19:53   ` Josh Poimboeuf
2017-11-05 20:12     ` Linus Torvalds
2017-11-05 21:01       ` Josh Poimboeuf
2017-07-21 10:01 Ingo Molnar
2016-10-18 10:14 Ingo Molnar
2016-07-13 10:55 Ingo Molnar
2016-04-03 10:45 Ingo Molnar
2015-02-06 18:28 Ingo Molnar
2012-10-23 10:57 Ingo Molnar
2012-08-03 16:31 Ingo Molnar
2012-08-03 16:55 ` Darren Hart
2012-08-03 17:01   ` Ingo Molnar
2012-08-03 17:24     ` Darren Hart
2012-06-15 18:45 Ingo Molnar
2012-01-26 18:05 Ingo Molnar
2011-08-04 20:45 Ingo Molnar
2011-04-02 10:21 Ingo Molnar
2011-03-25 12:52 Ingo Molnar
2011-01-21  2:11 Ingo Molnar
2011-01-15 15:15 Ingo Molnar
2010-10-05 19:12 Ingo Molnar
2010-10-05 20:15 ` Linus Torvalds
2010-10-05 21:09   ` Paul E. McKenney
2010-10-05 21:45     ` Linus Torvalds
2010-10-05 22:05       ` Paul E. McKenney
2010-10-06  2:56         ` Eric Dumazet
2010-10-06  4:59           ` Paul E. McKenney
2010-10-06 18:20             ` Ingo Molnar
2010-10-06 21:27               ` Paul E. McKenney
2010-10-07  8:11                 ` Ingo Molnar
2010-10-07 17:42                   ` Paul E. McKenney
2010-09-08 13:04 Ingo Molnar
2010-03-26 14:53 Ingo Molnar
2010-03-13 16:35 Ingo Molnar
2009-12-18 18:52 Ingo Molnar
2009-11-10 17:53 Ingo Molnar
2009-10-23 14:53 Ingo Molnar
2009-10-13 18:29 Ingo Molnar
2009-10-08 19:06 Ingo Molnar
2009-10-08 19:16 ` Linus Torvalds
2009-10-08 19:20   ` Ingo Molnar
2009-09-21 13:13 Ingo Molnar
2009-08-13 18:54 Ingo Molnar
2009-08-09 16:07 Ingo Molnar
2009-08-09 18:41 ` Darren Hart
2009-07-10 16:28 Ingo Molnar
2009-07-10 19:06 ` Linus Torvalds
2009-07-10 19:31   ` Ingo Molnar
2009-07-10 19:52     ` Linus Torvalds
2009-07-10 20:02       ` Ingo Molnar
2009-07-13 14:52   ` Joerg Roedel
2009-06-20 17:30 Ingo Molnar
2009-06-20 18:49 ` Linus Torvalds
2009-06-20 19:01   ` Linus Torvalds
2009-06-20 20:27     ` Ingo Molnar
2009-06-21 17:12     ` Thomas Gleixner
2009-06-21 17:37       ` Linus Torvalds
2009-06-21 17:57         ` Linus Torvalds
2009-06-21 19:26           ` Thomas Gleixner
2009-05-05  9:33 Ingo Molnar
2009-01-30 23:12 [git pull] " Ingo Molnar
2009-01-26 17:24 Ingo Molnar
2009-01-11 14:36 Ingo Molnar
2008-12-04 19:39 Ingo Molnar
2008-11-29 19:36 Ingo Molnar
2008-11-18 14:14 Ingo Molnar
2008-11-07 16:28 Ingo Molnar
2008-10-30 23:29 Ingo Molnar
2008-10-15 12:50 [git pull] core kernel updates for v2.6.28 Ingo Molnar
2008-10-16 22:32 ` Linus Torvalds
2008-10-17  6:23   ` [git pull] core kernel fixes Ingo Molnar
2008-08-28 11:44 Ingo Molnar
2008-08-18 18:35 Ingo Molnar
2008-07-24 15:13 Ingo Molnar
2008-06-30 15:32 Ingo Molnar
2008-06-30 17:02 ` Vegard Nossum
2008-06-30 18:20   ` Ingo Molnar
2008-06-30 18:43     ` Vegard Nossum
2008-06-30 19:46       ` Thomas Gleixner
2008-06-30 19:51         ` Vegard Nossum
2008-06-30 19:54           ` Thomas Gleixner
2008-06-23 19:45 Ingo Molnar
2008-06-19 15:16 Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LFD.2.00.0905181801250.3516@localhost.localdomain \
    --to=tglx@linutronix.de \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).