live-patching.vger.kernel.org archive mirror
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: gor@linux.ibm.com, jpoimboe@redhat.com, jikos@kernel.org,
	mbenes@suse.cz, pmladek@suse.com, mingo@kernel.org,
	linux-kernel@vger.kernel.org, joe.lawrence@redhat.com,
	fweisbec@gmail.com, tglx@linutronix.de, hca@linux.ibm.com,
	svens@linux.ibm.com, sumanthk@linux.ibm.com,
	live-patching@vger.kernel.org, paulmck@kernel.org,
	rostedt@goodmis.org, x86@kernel.org
Subject: Re: [RFC][PATCH v2 11/11] context_tracking,x86: Fix text_poke_sync() vs NOHZ_FULL
Date: Thu, 21 Oct 2021 16:57:09 -0300
Message-ID: <20211021195709.GA22422@fuller.cnet>
In-Reply-To: <20211021192543.GV174703@worktop.programming.kicks-ass.net>

On Thu, Oct 21, 2021 at 09:25:43PM +0200, Peter Zijlstra wrote:
> On Thu, Oct 21, 2021 at 03:39:35PM -0300, Marcelo Tosatti wrote:
> > Peter,
> > 
> > static __always_inline void arch_exit_to_user_mode(void)
> > {
> >         mds_user_clear_cpu_buffers();
> > }
> > 
> > /**
> >  * mds_user_clear_cpu_buffers - Mitigation for MDS and TAA vulnerability
> >  *
> >  * Clear CPU buffers if the corresponding static key is enabled
> >  */
> > static __always_inline void mds_user_clear_cpu_buffers(void)
> > {
> >         if (static_branch_likely(&mds_user_clear))
> >                 mds_clear_cpu_buffers();
> > }
> > 
> > We were discussing how to perform objtool style validation 
> > that no code after the check for 
> 
> I'm not sure what the point of the above is... Were you trying to ask
> for validation that nothing runs after the mds_user_clear_cpu_buffer()?
> 
> That isn't strictly true today, there's lockdep code after it. I can't
> recall why that order is as it is though.
> 
> Pretty much everything in noinstr is magical, we just have to think
> harder there (and possibly start writing more comments there).

In your patchset, mds_user_clear_cpu_buffers() happens after sync_core(),
if I am not mistaken.

> > > +             /* NMI happens here and must still do/finish CT_WORK_n */
> > > +             sync_core();
> > 
> > But after the discussion with you, it seems doing the TLB checking 
> > and (also sync_core) checking very late/very early on exit/entry 
> > makes things easier to review.
> 
> I don't know about late, it must happen *very* early in entry. The
> sync_core() must happen before any self-modifying code gets called
> (static_branch, static_call, etc..) with possible exception of the
> context_tracking static_branch.
> 
> The TLBi must also happen super early, possibly while still on the
> entry stack (since the task stack is vmap'ed).

But will it ever be freed/remapped from other CPUs while the task
is running?

> We currently don't run C
> code on the entry stack, that needs quite a bit of careful work to make
> happen.

I was thinking of coding this in ASM, as early as possible after the
CR3 write that switches to the kernel page tables:

 Kernel entry:
 -------------

       cpu = smp_processor_id();

       if (isolation_enabled(cpu)) {
               reqs = atomic_xchg(&percpudata->user_kernel_state, IN_KERNEL_MODE);
               if (reqs & CPU_REQ_FLUSH_TLB)
                       flush_tlb_all();
               if (reqs & CPU_REQ_SYNC_CORE)
                       sync_core();
       }

Exit to userspace (as close as possible to the CR3 write that loads the
user pagetable pointer):
 -----------------

       cpu = smp_processor_id();

       if (isolation_enabled(cpu)) {
               atomic_or(IN_USER_MODE, &percpudata->user_kernel_state);
       }

Do you think that is a bad idea (in ASM, not C)?
The request side can be in C:

 Request side:
 -------------

       int targetcpu;
       int old_state, new_state;
       struct percpudata *pcpudata = per_cpu_ptr(&percpudata, targetcpu);

       do {
               old_state = atomic_read(&pcpudata->user_kernel_state);

               /* in kernel mode ? */
               if (!(old_state & IN_USER_MODE)) {
                       smp_call_function_single(targetcpu, request_fn, NULL, 1);
                       break;
               }
               /* in user mode: defer the work to the next kernel entry */
               new_state = old_state | CPU_REQ_FLUSH_TLB; /* (or CPU_REQ_SYNC_CORE) */
       } while (atomic_cmpxchg(&pcpudata->user_kernel_state, old_state, new_state) != old_state);

(This needs logic to guard against atomic_cmpxchg always failing, but
that shouldn't be difficult.)
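
To make the three sides concrete, here is a minimal userspace sketch of
the same state-word protocol, using C11 atomics rather than the kernel
primitives. Everything in it is an assumption for illustration: the
flag values, struct cpu_state, the counters that stand in for
flush_tlb_all(), sync_core() and smp_call_function_single(), and the
max_tries bound, which is just one possible way to keep the cmpxchg
loop from spinning forever.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical encoding of the state word; the values are assumptions. */
enum {
	IN_KERNEL_MODE    = 0,
	IN_USER_MODE      = 1 << 0,
	CPU_REQ_FLUSH_TLB = 1 << 1,
	CPU_REQ_SYNC_CORE = 1 << 2,
};

struct cpu_state {
	atomic_int user_kernel_state;
	int tlb_flushes;	/* stands in for flush_tlb_all() */
	int core_syncs;		/* stands in for sync_core() */
	int ipis;		/* stands in for smp_call_function_single() */
};

/* Kernel entry: claim kernel mode and collect deferred requests in one step. */
static void kernel_entry(struct cpu_state *cs)
{
	int reqs = atomic_exchange(&cs->user_kernel_state, IN_KERNEL_MODE);

	assert(reqs & IN_USER_MODE);	/* entry only makes sense from user mode */
	if (reqs & CPU_REQ_FLUSH_TLB)
		cs->tlb_flushes++;
	if (reqs & CPU_REQ_SYNC_CORE)
		cs->core_syncs++;
}

/* Exit to userspace: from this point on, requests may be deferred. */
static void kernel_exit(struct cpu_state *cs)
{
	atomic_fetch_or(&cs->user_kernel_state, IN_USER_MODE);
}

/*
 * Request side: defer the work if the target is in user mode. If it is
 * in kernel mode, or the cmpxchg failed max_tries times, fall back to
 * the "IPI". Returns true if the request was deferred.
 */
static bool request_work(struct cpu_state *cs, int req, int max_tries)
{
	int old_state = atomic_load(&cs->user_kernel_state);

	while (max_tries-- > 0) {
		if (!(old_state & IN_USER_MODE))
			break;		/* in kernel mode: interrupt it */
		/* On failure, old_state is reloaded with the current value. */
		if (atomic_compare_exchange_strong(&cs->user_kernel_state,
						   &old_state, old_state | req))
			return true;
	}
	cs->ipis++;
	return false;
}
```

With a zero-initialised struct cpu_state (kernel mode), request_work()
takes the IPI path; after kernel_exit() it sets the request bit instead,
and the next kernel_entry() picks the bit up and clears the whole word.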

> > Can then use a single atomic variable with USER/KERNEL state and cmpxchg
> > loops.
> 
> We're not going to add an atomic to context tracking. There is one, we
> just got to extract/share it with RCU.

Again, to avoid kernel TLB flushes you'd have to ensure:

kernel entry:
        instrA addr1,addr2,addr3
        instrB addr2,addr3,addr4  <--- that no address here has had its
                                       TLB entry modified and flushed
        instrC addr5,addr6,addr7
        reqs = atomic_xchg(&percpudata->user_kernel_state, IN_KERNEL_MODE);
        if (reqs & CPU_REQ_FLUSH_TLB)
                flush_tlb_all();

kernel exit:

        atomic_or(IN_USER_MODE, &percpudata->user_kernel_state);
        instrA addr1,addr2,addr3
        instrB addr2,addr3,addr4  <--- that no address here has had its
                                       TLB entry modified and flushed

This could be conditional on "task isolated mode" being enabled (it
would be better if it were not, though).
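
The window above can be illustrated with a toy single-threaded model,
sketched here in userspace C. It is not kernel code: struct toy_cpu,
translate() and remote_remap() are made-up names, the "TLB" is a single
cached integer, and remote_remap() simply assumes the target CPU is in
user mode so the flush is always deferred.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical flag values, chosen only for this sketch. */
enum {
	IN_KERNEL_MODE    = 0,
	IN_USER_MODE      = 1 << 0,
	CPU_REQ_FLUSH_TLB = 1 << 1,
};

struct toy_cpu {
	atomic_int user_kernel_state;
	int page_table;		/* authoritative translation for one address */
	int tlb;		/* this CPU's cached copy of that translation */
};

/* A memory access goes through the cached translation, stale or not. */
static int translate(const struct toy_cpu *cpu)
{
	return cpu->tlb;
}

/*
 * Remote CPU changes the mapping and defers the flush instead of
 * sending an IPI (assumes the target is currently in user mode).
 */
static void remote_remap(struct toy_cpu *cpu, int new_target)
{
	cpu->page_table = new_target;
	atomic_fetch_or(&cpu->user_kernel_state, CPU_REQ_FLUSH_TLB);
}

/*
 * Kernel entry. Returns what an instruction running *before* the xchg
 * (instrA/instrB/instrC in the diagram) would have seen.
 */
static int kernel_entry(struct toy_cpu *cpu)
{
	int early = translate(cpu);	/* access inside the unsafe window */
	int reqs = atomic_exchange(&cpu->user_kernel_state, IN_KERNEL_MODE);

	if (reqs & CPU_REQ_FLUSH_TLB)
		cpu->tlb = cpu->page_table;	/* flush_tlb_all() stand-in */
	return early;
}
```

If the mapping is changed from 1 to 2 while the toy CPU is in user mode,
kernel_entry() returns the stale 1 for the early access, and only
accesses after the xchg see 2. That is why no address touched before the
xchg may have had its translation changed.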

			

