linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Chris Metcalf <cmetcalf@mellanox.com>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christoph Lameter <cl@linux.com>,
	Gilad Ben Yossef <giladb@mellanox.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@kernel.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>, Tejun Heo <tj@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Michal Hocko <mhocko@suse.com>, <linux-mm@kvack.org>,
	<linux-doc@vger.kernel.org>, <linux-api@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v14 04/14] task_isolation: add initial support
Date: Tue, 30 Aug 2016 13:36:00 -0400	[thread overview]
Message-ID: <aea6b90e-4b43-302a-636f-36516f30f5d6@mellanox.com> (raw)
In-Reply-To: <20160830171003.GA14200@lerouge>

On 8/30/2016 1:10 PM, Frederic Weisbecker wrote:
> On Tue, Aug 30, 2016 at 11:41:36AM -0400, Chris Metcalf wrote:
>> On 8/29/2016 8:55 PM, Frederic Weisbecker wrote:
>>> On Mon, Aug 15, 2016 at 10:59:55AM -0400, Chris Metcalf wrote:
>>>> On 8/11/2016 2:50 PM, Christoph Lameter wrote:
>>>>> On Thu, 11 Aug 2016, Frederic Weisbecker wrote:
>>>>>
>>>>>> Do we need to quiesce vmstat everytime before entering userspace?
>>>>>> I thought that vmstat only need to be offlined once and for all?
>>>>> Once is sufficient after disabling the tick.
>>>> It's true that task_isolation_enter() is called every time before
>>>> returning to user space while task isolation is enabled.
>>>>
>>>> But once we enter the kernel again after returning from the initial
>>>> prctl() -- assuming we are in NOSIG mode so doing so is legal in the
>>>> first place -- almost anything can happen, certainly including
>>>> restarting the tick.  Thus, we have to make sure that normal quiescing
>>>> happens again before we return to userspace.
>>> Yes but we need to sort out what needs to be called only once on prctl().
>>>
>>> Once vmstat is quiesced, it's not going to need quiescing again even if we
>>> restart the tick.
>> That's true, but I really do like the idea of having a clean structure
>> where we verify all our prerequisites in task_isolation_ready(), and
>> have code to try to get things fixed up in task_isolation_enter().
>> If we start moving some things here and some things there, it gets
>> harder to manage.  I think by testing "!vmstat_idle()" in
>> task_isolation_enter() we are avoiding any substantial overhead.
> I think that making the code clearer on what needs to be done once for
> all on prctl() and what needs to be done on all actual syscall return
> is more important for readability.

We don't need to just do it on prctl(), though.  We also need to do
it whenever we have been in the kernel for another reason, which
can happen with NOSIG.  So we need to do this on the common return
to userspace path no matter what, right?  Or am I missing something?
It seems like if, for example, we do mmaps or page faults, then on return
to userspace, some of those counters will have been incremented and
we'll need to run the quiet_vmstat_sync() code.

>>>>>> +	if (!tick_nohz_tick_stopped())
>>>>>> +		set_tsk_need_resched(current);
>>>>>> Again, that won't help
>>>> It won't be better than spinning in a loop if there aren't any other
>>>> schedulable processes, but it won't be worse either.  If there is
>>>> another schedulable process, we at least will schedule it sooner than
>>>> if we just sat in a busy loop and waited for the scheduler to kick
>>>> us. But there's nothing else we can do anyway if we want to maintain
>>>> the guarantee that the dyn tick is stopped before return to userspace.
>>> I don't think it helps either way. If reschedule is pending, the current
>>> task already has TIF_RESCHED set.
>> See the other thread with Peter Z for the longer discussion of this.
>> At this point I'm leaning towards replacing the set_tsk_need_resched() with
>>
>>      set_current_state(TASK_INTERRUPTIBLE);
>>      schedule();
>>      __set_current_state(TASK_RUNNING);
> I don't see how that helps. What will wake the thread up except a signal?

The answer is that the scheduler will keep bringing us back to this
point (either after running another runnable task if there is one,
or else just returning to this point immediately without doing a
context switch), and we will then go around the "prepare exit to
userspace" loop and perhaps discover that enough time has gone
by that the last dyntick interrupt has triggered and the kernel has
quiesced the dynticks.  At that point we stop calling schedule()
over and over and can return normally to userspace.

It's very counter-intuitive to burn cpu time intentionally in the kernel.
I really don't see another way to resolve this, though.  I don't think
it would be safe, for example, to just promote the next dyntick to
running immediately (rather than waiting a few microseconds until
it is scheduled to go off).

-- 
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com

  reply	other threads:[~2016-08-30 23:10 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-09 20:29 [PATCH v14 00/14] support "task_isolation" mode Chris Metcalf
2016-08-09 20:29 ` [PATCH v14 01/14] vmstat: add quiet_vmstat_sync function Chris Metcalf
2016-08-09 20:29 ` [PATCH v14 02/14] vmstat: add vmstat_idle function Chris Metcalf
2016-08-09 20:29 ` [PATCH v14 03/14] lru_add_drain_all: factor out lru_add_drain_needed Chris Metcalf
2016-08-09 20:29 ` [PATCH v14 04/14] task_isolation: add initial support Chris Metcalf
2016-08-11 18:11   ` Frederic Weisbecker
2016-08-11 18:50     ` Christoph Lameter
2016-08-15 14:59       ` Chris Metcalf
2016-08-30  0:55         ` Frederic Weisbecker
2016-08-30 15:41           ` Chris Metcalf
2016-08-30 17:10             ` Frederic Weisbecker
2016-08-30 17:36               ` Chris Metcalf [this message]
2016-08-30 18:17                 ` Chris Metcalf
2016-09-03 15:31                   ` Frederic Weisbecker
2016-09-09 18:54                     ` Chris Metcalf
2016-08-09 20:29 ` [PATCH v14 05/14] task_isolation: track asynchronous interrupts Chris Metcalf
2016-08-09 20:29 ` [PATCH v14 06/14] arch/x86: enable task isolation functionality Chris Metcalf
2016-08-10  7:52   ` Andy Lutomirski
2016-08-10 14:30     ` Chris Metcalf
2016-08-10 19:17       ` Andy Lutomirski
2016-08-10 19:40         ` Chris Metcalf
2016-08-10 20:06           ` Andy Lutomirski
2016-08-09 20:29 ` [PATCH v14 07/14] arm64: factor work_pending state machine to C Chris Metcalf
2016-08-09 20:29 ` [PATCH v14 08/14] arch/arm64: enable task isolation functionality Chris Metcalf
2016-08-09 20:29 ` [PATCH v14 09/14] arch/tile: " Chris Metcalf
2016-08-09 20:29 ` [PATCH v14 10/14] arm, tile: turn off timer tick for oneshot_stopped state Chris Metcalf
2016-08-09 20:29 ` [PATCH v14 11/14] clocksource: Do not schedule watchdog on isolated or NOHZ cpus Chris Metcalf
2016-08-09 20:29 ` [PATCH v14 12/14] task_isolation: support CONFIG_TASK_ISOLATION_ALL Chris Metcalf
2016-08-09 20:29 ` [PATCH v14 13/14] task_isolation: add user-settable notification signal Chris Metcalf
2016-08-09 20:29 ` [PATCH v14 14/14] task_isolation self test Chris Metcalf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aea6b90e-4b43-302a-636f-36516f30f5d6@mellanox.com \
    --to=cmetcalf@mellanox.com \
    --cc=akpm@linux-foundation.org \
    --cc=catalin.marinas@arm.com \
    --cc=cl@linux.com \
    --cc=fweisbec@gmail.com \
    --cc=giladb@mellanox.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@amacapital.net \
    --cc=mhocko@suse.com \
    --cc=mingo@kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=viresh.kumar@linaro.org \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).