On Wed, 2018-08-29 at 08:36 -0700, Andy Lutomirski wrote: > On Wed, Aug 29, 2018 at 8:17 AM, Rik van Riel > wrote: > > On Tue, 2018-08-28 at 20:46 -0700, Andy Lutomirski wrote: > > > On Tue, Aug 28, 2018 at 10:56 AM, Rik van Riel > > > wrote: > > > > On Mon, 27 Aug 2018 16:04:16 -0700 > > > > Andy Lutomirski wrote: > > > > > > > > > The 0day bot is still chewing on this, but I've tested it a > > > > > bit > > > > > locally > > > > > and it seems to do the right thing. > > > > > > > > Hi Andy, > > > > > > > > the version of the patch below should fix the bug we talked > > > > about > > > > in email yesterday. It should automatically cover kernel > > > > threads > > > > in lazy TLB mode, because current->mm will be NULL, while the > > > > cpu_tlbstate.loaded_mm should never be NULL. > > > > > > > > > > That's better than mine. I tweaked it a bit and added some > > > debugging, > > > and I got this: > > > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/fixes&id=dd956eba16646fd0b15c3c0741269dfd84452dac > > > > > > I made the loaded_mm handling a little more conservative to make > > > it > > > more obvious that switch_mm_irqs_off() is safe regardless of > > > exactly > > > when it gets called relative to switching current. > > > > I am not convinced that the dance of writing > > cpu_tlbstate.loaded_mm twice, with a barrier on > > each end, is useful or necessary. > > > > At the time switch_mm_irqs_off returns, nmi_uaccess_ok() > > will still return false, because we have not switched > > "current" to the task that owns the next mm_struct yet. > > > > We just have to make sure to: > > 1) Change cpu_tlbstate.loaded_mm before we manipulate > > CR3, and > > 2) Change "current" only once enough of the mm stuff has > > been switched, __switch_to seems to get that right. > > > > Between the time switch_mm_irqs_off() sets cpu_tlbstate > > to the next mm, and __switch_to moves() over current, > > nmi_uaccess_ok() will return false. > > All true, but I think it stops working as soon as someone starts > calling switch_mm_irqs_off() for some other reason, such as during > text_poke(). And that was the original motivation for this patch. How does calling switch_mm_irqs_off() for text_poke() change around current->mm and cpu_tlbstate.loaded_mm? Does current->mm stay the same throughout the entire text_poke() chain, while cpustate.loaded_mm is the only thing that is changed out? If so, then yes the double assignment is indeed necessary. Good point. -- All Rights Reversed.