All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org, Nitesh Lal <nilal@redhat.com>,
	Nicolas Saenz Julienne <nsaenzju@redhat.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	Christoph Lameter <cl@linux.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Alex Belits <abelits@belits.com>, Peter Xu <peterx@redhat.com>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Oscar Shiang <oscar0225@livemail.tw>
Subject: Re: [patch v12 09/13] task isolation: add preempt notifier to sync per-CPU vmstat dirty info to thread info
Date: Wed, 4 May 2022 13:32:38 -0300	[thread overview]
Message-ID: <YnKqpkdATqqlDHvK@fuller.cnet> (raw)
In-Reply-To: <87ilquybgz.ffs@tglx>

On Wed, Apr 27, 2022 at 02:09:16PM +0200, Thomas Gleixner wrote:
> On Wed, Apr 27 2022 at 09:11, Thomas Gleixner wrote:
> > On Tue, Mar 15 2022 at 12:31, Marcelo Tosatti wrote:
> >> If a thread has task isolation activated, is preempted by thread B,
> >> which marks vmstat information dirty, and is preempted back in,
> >> one might return to userspace with vmstat dirty information on the 
> >> CPU in question.
> >>
> >> To address this problem, add a preempt notifier that transfers vmstat dirty
> >> information to TIF_TASK_ISOL thread flag.
> >
> > How does this compile with CONFIG_KVM=n?
> 
> Aside of that, the existance of this preempt notifier alone tells me
> that this is either a design fail or has no design in the first place.
> 
> The state of vmstat does not matter at all at the point where a task is
> scheduled in. It matters when an isolated task goes out to user space or
> enters a VM.

If the following happens, with two threads with names that mean whether
a thread has task isolation enabled or not:

Thread-no-task-isol, Thread-task-isol.

Events:

not-runnable  		Thread-task-isol
runnable      		Thread-task-no-isol
marks vmstat dirty	Thread-task-no-isol (writes to some per-CPU vmstat
counter)
not-runnable		Thread-task-no-isol
runnable		Thread-task-isol

Then we have to transfer the "vmstat dirty" information from per-CPU 
bool to per-thread TIF_TASK_ISOL bit (so that the
task_isolation_process_work thing executes on return to userspace).

> We already have something similar in the exit to user path:
> 
>    tick_nohz_user_enter_prepare()
> 
> So you can do something like the below and have:
> 
> static inline void task_isol_exit_to_user_prepare(void)
> {
>         if (unlikely(current_needs_isol_exit_to_user())
>         	__task_isol_exit_to_user_prepare();
> }
> 
> where current_needs_isol_exit_to_user() is a simple check of either an
> existing mechanism like
> 
>          task->syscall_work & SYSCALL_WORK_TASK_ISOL_EXIT
> 
> or of some new task isolation specific member of task_struct which is
> placed so it is cache hot at that point:
> 
>         task->isol_work & SYSCALL_TASK_ISOL_EXIT
> 
> which is going to be almost zero overhead for any non isolated task.

Sure, but who sets SYSCALL_TASK_ISOL_EXIT or SYSCALL_TASK_ISOL_EXIT ?

> It's trivial enough to encode the real stuff into task->isol_work and
> I'm pretty sure, that a 32bit member is sufficient for that. There is
> absolutely no need for a potential 64x64 bit feature matrix.

Well, OK, the meaning of TIF_TASK_ISOL thread flag is ambiguous:

1) We set it when quiescing vmstat feature of task isolation.
2) We set it when switching between tasks A and B, B has 
task isolation configured and activated, and per-CPU vmstat information 
was dirty.
3) We clear it on return to userspace:

	if (test_bit(TIF_TASK_ISOL, thread->flags)) {
		clear_bit(TIF_TASK_ISOL, thread->flags))
		process_task_isol_work();
	}

So you prefer to separate:

Use TIF_TASK_ISOL for "task isolation configured and activated,
quiesce vmstat work on return to userspace" only, and then have
the "is vmstat per-CPU data dirty?" information held on 
task->syscall_work or task->isol_work ? (that will be probably be two
cachelines).

You'd still need the preempt notifier, though (unless i am missing
something).

Happy with either case.

Thanks for the review!

> Thanks,
> 
>         tglx
> ---
> --- a/kernel/entry/common.c
> +++ b/kernel/entry/common.c
> @@ -142,6 +142,12 @@ void noinstr exit_to_user_mode(void)
>  /* Workaround to allow gradual conversion of architecture code */
>  void __weak arch_do_signal_or_restart(struct pt_regs *regs) { }
>  
> +static void exit_to_user_update_work(void)
> +{
> +	tick_nohz_user_enter_prepare();
> +	task_isol_exit_to_user_prepare();
> +}
> +
>  static unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
>  					    unsigned long ti_work)
>  {
> @@ -178,8 +184,7 @@ static unsigned long exit_to_user_mode_l
>  		 */
>  		local_irq_disable_exit_to_user();
>  
> -		/* Check if any of the above work has queued a deferred wakeup */
> -		tick_nohz_user_enter_prepare();
> +		exit_to_user_update_work();
>  
>  		ti_work = read_thread_flags();
>  	}
> @@ -194,8 +199,7 @@ static void exit_to_user_mode_prepare(st
>  
>  	lockdep_assert_irqs_disabled();
>  
> -	/* Flush pending rcuog wakeup before the last need_resched() check */
> -	tick_nohz_user_enter_prepare();
> +	exit_to_user_update_work();
>  
>  	if (unlikely(ti_work & EXIT_TO_USER_MODE_WORK))
>  		ti_work = exit_to_user_mode_loop(regs, ti_work);
> 
> 


  reply	other threads:[~2022-05-04 16:33 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-15 15:31 [patch v12 00/13] extensible prctl task isolation interface and vmstat sync Marcelo Tosatti
2022-03-15 15:31 ` [patch v12 01/13] s390: add support for TIF_TASK_ISOL Marcelo Tosatti
2022-03-15 15:31 ` [patch v12 02/13] x86: " Marcelo Tosatti
2022-03-15 15:31 ` [patch v12 03/13] add basic task isolation prctl interface Marcelo Tosatti
2022-04-25 22:23   ` Thomas Gleixner
2022-03-15 15:31 ` [patch v12 04/13] add prctl task isolation prctl docs and samples Marcelo Tosatti
2022-04-26  0:15   ` Thomas Gleixner
2022-03-15 15:31 ` [patch v12 05/13] task isolation: sync vmstats on return to userspace Marcelo Tosatti
2022-04-25 23:06   ` Thomas Gleixner
2022-04-27  6:56   ` Thomas Gleixner
2022-03-15 15:31 ` [patch v12 06/13] procfs: add per-pid task isolation state Marcelo Tosatti
2022-04-25 23:27   ` Thomas Gleixner
2022-03-15 15:31 ` [patch v12 07/13] task isolation: sync vmstats conditional on changes Marcelo Tosatti
2022-03-17 14:51   ` Frederic Weisbecker
2022-04-27  8:03   ` Thomas Gleixner
2022-03-15 15:31 ` [patch v12 08/13] task isolation: enable return to userspace processing Marcelo Tosatti
2022-03-15 15:31 ` [patch v12 09/13] task isolation: add preempt notifier to sync per-CPU vmstat dirty info to thread info Marcelo Tosatti
2022-03-16  2:41   ` Oscar Shiang
2022-04-27  7:11   ` Thomas Gleixner
2022-04-27 12:09     ` Thomas Gleixner
2022-05-04 16:32       ` Marcelo Tosatti [this message]
2022-05-04 17:39         ` Thomas Gleixner
2022-03-15 15:31 ` [patch v12 10/13] KVM: x86: process isolation work from VM-entry code path Marcelo Tosatti
2022-03-15 15:31 ` [patch v12 11/13] mm: vmstat: move need_update Marcelo Tosatti
2022-03-15 15:31 ` [patch v12 12/13] mm: vmstat_refresh: avoid queueing work item if cpu stats are clean Marcelo Tosatti
2022-04-27  7:23   ` Thomas Gleixner
2022-05-03 19:17     ` Marcelo Tosatti
2022-03-15 15:31 ` [patch v12 13/13] task isolation: only TIF_TASK_ISOL if task isolation is enabled Marcelo Tosatti
2022-04-27  7:45   ` Thomas Gleixner
2022-05-03 19:12     ` Marcelo Tosatti
2022-05-04 13:03       ` Thomas Gleixner
2022-03-17 15:08 ` [patch v12 00/13] extensible prctl task isolation interface and vmstat sync Frederic Weisbecker
2022-04-25 16:29   ` Marcelo Tosatti
2022-04-25 21:12     ` Thomas Gleixner
2022-05-03 18:57       ` Marcelo Tosatti
2022-04-27  9:19 ` Christoph Lameter
2022-05-03 18:57   ` Marcelo Tosatti
2022-05-04 13:20     ` Thomas Gleixner
2022-05-04 18:56       ` Marcelo Tosatti
2022-05-04 20:15         ` Thomas Gleixner
2022-05-05 16:52           ` Marcelo Tosatti
2022-06-01 16:14             ` Marcelo Tosatti
2022-05-04 17:01 ` Tim Chen
2022-05-04 20:08   ` Marcelo Tosatti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YnKqpkdATqqlDHvK@fuller.cnet \
    --to=mtosatti@redhat.com \
    --cc=abelits@belits.com \
    --cc=bristot@redhat.com \
    --cc=cl@linux.com \
    --cc=frederic@kernel.org \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nilal@redhat.com \
    --cc=nsaenzju@redhat.com \
    --cc=oscar0225@livemail.tw \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.