From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Shan Hai <haishan.bai@gmail.com>, Peter Zijlstra <peterz@infradead.org>, Peter Zijlstra <a.p.zijlstra@chello.nl>, paulus@samba.org, tglx@linutronix.de, walken@google.com, dhowells@redhat.com, cmetcalf@tilera.com, tony.luck@intel.com, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC/PATCH] mm/futex: Fix futex writes on archs with SW tracking of dirty & young
Date: Fri, 22 Jul 2011 08:57:33 +1000
Message-ID: <1311289053.25044.550.camel@pasglop>
In-Reply-To: <1311288726.25044.545.camel@pasglop>

On Fri, 2011-07-22 at 08:52 +1000, Benjamin Herrenschmidt wrote:
> > um, what problem. There's no description here of the user-visible
> > effects of the bug hence it's hard to work out what kernel version(s)
> > should receive this patch.
>
> Shan could give you an actual example (it was in the previous thread),
> but basically, livelock as the kernel keeps trying and trying the
> in_atomic op and never resolves it.
>
> > What kernel version(s) should receive this patch?
>
> I haven't dug. Probably anything it applies on as far as we did that
> trick of atomic + gup() for futex.

Oops, I just realized I didn't document the problem at all in the
changelog, sorry. I meant to say:

On archs that use SW tracking of dirty & young, a page without dirty is
effectively mapped read-only, and a page without young is effectively
inaccessible in the PTE.

Additionally, some architectures might lazily flush the TLB when
relaxing write protection (by doing only a local flush), and expect a
fault to invalidate the stale entry if it's still present on another
processor.

The futex code assumes that if the "in_atomic()" access returns -EFAULT,
it can "fix it up" by calling get_user_pages(), which would then be
equivalent to taking the fault.

However that isn't the case. get_user_pages() will not call
handle_mm_fault() in the case where the PTE seems to have the right
permissions, regardless of the dirty and young state. It will
eventually update those bits ... in the struct page, but not in the PTE.

Additionally, it will not handle the lazy TLB flushing that can be
required by some architectures in the fault case.

Basically, gup is the wrong interface for the job. The patch provides a
more appropriate one which boils down to just calling handle_mm_fault(),
since what we are trying to do is simulate a real page fault.

Cheers,
Ben.

> > > since I'm
> > > starting to have the nasty feeling that you are hitting what is
> > > somewhat a subtly different issue or my previous patch should
> > > have worked (but then I might have done a stupid mistake as well)
> > > but let us know anyway.
> >
> > I assume that Shan reported the secret problem so I added the
> > reported-by to the changelog.
>
> He did :-) Shan, care to provide a rough explanation of what you
> observed ?
>
> Also Russell confirmed that ARM should be affected as well.
>
> Cheers,
> Ben.
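
The retry behaviour described in the message above can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions, not kernel code: every name in it (toy_pte, toy_page, toy_atomic_write, toy_gup_fixup, toy_fault_fixup, toy_futex_write) is hypothetical and invented for illustration, and the bounded retry count merely stands in for what would be an unbounded livelock.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_EFAULT   14   /* stand-in for the kernel's EFAULT value */
#define TOY_LIVELOCK (-1) /* retry budget exhausted: models the livelock */

/* Hypothetical PTE with software-tracked dirty & young bits. */
struct toy_pte {
    bool present;
    bool write_perm; /* mapping allows writes */
    bool dirty;      /* SW-tracked: clear means effectively read-only */
    bool young;      /* SW-tracked: clear means effectively inaccessible */
};

/* Hypothetical per-page state, standing in for struct page flags. */
struct toy_page {
    bool dirty;
    bool accessed;
};

/* Models the in-atomic futex write: it faults unless the PTE is
 * present, writable, young and dirty. */
static int toy_atomic_write(const struct toy_pte *pte)
{
    if (!pte->present || !pte->write_perm || !pte->young || !pte->dirty)
        return -TOY_EFAULT;
    return 0;
}

/* Fault-like fixup: updates young/dirty in the PTE itself. */
static int toy_fault_fixup(struct toy_pte *pte, struct toy_page *page)
{
    if (!pte->present || !pte->write_perm)
        return -TOY_EFAULT; /* a genuine access violation */
    pte->young = true;
    pte->dirty = true;
    page->accessed = true;
    page->dirty = true;
    return 0;
}

/* gup-like fixup: if the PTE permissions look right, it only marks the
 * page and leaves the PTE bits alone; otherwise it takes the fault path. */
static int toy_gup_fixup(struct toy_pte *pte, struct toy_page *page)
{
    if (pte->present && pte->write_perm) {
        page->accessed = true;
        page->dirty = true;
        return 0; /* PTE young/dirty left untouched */
    }
    return toy_fault_fixup(pte, page);
}

typedef int (*toy_fixup_fn)(struct toy_pte *, struct toy_page *);

/* Retry loop: try the atomic write, run the fixup on failure, retry.
 * Returns the attempt number that succeeded, -TOY_EFAULT if the fixup
 * itself failed, or TOY_LIVELOCK if max_tries attempts all faulted. */
static int toy_futex_write(struct toy_pte *pte, struct toy_page *page,
                           toy_fixup_fn fixup, int max_tries)
{
    assert(max_tries > 0);
    for (int attempt = 1; attempt <= max_tries; attempt++) {
        if (toy_atomic_write(pte) == 0)
            return attempt;
        if (fixup(pte, page) != 0)
            return -TOY_EFAULT;
    }
    return TOY_LIVELOCK;
}
```

In this model, calling toy_futex_write() with toy_gup_fixup on a PTE that is present and writable but neither young nor dirty exhausts the retry budget, because the fixup only ever touches the page state. With toy_fault_fixup the second attempt succeeds, since the PTE bits themselves are updated. The lazy TLB flush aspect mentioned in the message is not modeled here.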