linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Chris Hyser <chris.hyser@oracle.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>,
	Nishanth Aravamudan <naravamudan@digitalocean.com>,
	Julien Desfossez <jdesfossez@digitalocean.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Vineeth Pillai <viremana@linux.microsoft.com>,
	Aaron Lu <aaron.lwe@gmail.com>,
	Aubrey Li <aubrey.intel@gmail.com>,
	tglx@linutronix.de, linux-kernel@vger.kernel.org,
	mingo@kernel.org, torvalds@linux-foundation.org,
	fweisbec@gmail.com, keescook@chromium.org, kerrnel@google.com,
	Phil Auld <pauld@redhat.com>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Pawan Gupta <pawan.kumar.gupta@linux.intel.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	vineeth@bitbyteword.org, Chen Yu <yu.c.chen@intel.com>,
	Christian Brauner <christian.brauner@ubuntu.com>,
	Agata Gruza <agata.gruza@intel.com>,
	Antonio Gomez Iglesias <antonio.gomez.iglesias@intel.com>,
	graf@amazon.com, konrad.wilk@oracle.com, dfaggioli@suse.com,
	pjt@google.com, rostedt@goodmis.org, derkling@google.com,
	benbjiang@tencent.com,
	Alexandre Chartre <alexandre.chartre@oracle.com>,
	James.Bottomley@hansenpartnership.com, OWeisse@umich.edu,
	Dhaval Giani <dhaval.giani@oracle.com>,
	Junaid Shahid <junaids@google.com>,
	jsbarnes@google.com, Ben Segall <bsegall@google.com>,
	Josh Don <joshdon@google.com>, Hao Luo <haoluo@google.com>,
	Tom Lendacky <thomas.lendacky@amd.com>
Subject: Re: [PATCH v10 2/5] sched: CGroup tagging interface for core scheduling
Date: Fri, 5 Feb 2021 17:19:04 -0500	[thread overview]
Message-ID: <7c5d083b-b722-76e5-37ba-ccd1c1a03e13@oracle.com> (raw)
In-Reply-To: <YB0hN5XG5dB0xiBh@hirez.programming.kicks-ass.net>

On 2/5/21 5:43 AM, Peter Zijlstra wrote:
> On Thu, Feb 04, 2021 at 03:52:55PM -0500, Chris Hyser wrote:
> 
>> A second complication was a decision that new processes (not threads) do not
>> inherit their parents cookie. Thus forking is also not a means to share a
>> cookie. Basically with a "from-only" interface, the new task would need to
>> be modified to call prctl() itself. From-only also does not allow for
>> setting a cookie on an unmodified already running task. This can be fixed by
>> providing both a "to" and "from" sharing interface that allows helper
>> programs to construct arbitrary configurations from unmodified programs.
> 
> Do we really want to inhibit on fork() or would exec() be a better
> place? What about those applications that use fork() based workers?

I like exec-time as a fan of fork-based workers. I suppose the counter argument would be security, but the new process 
is in a state to be trusted to lower it's privileges, change permissions, core sched cookie etc before it makes itself 
differently vulnerable and because it is the same code, it "knows" if it did.

>>> Also, how do I set a unique cookie on myself with this interface?
>>
>> The v10 patch still uses the overloaded v9 mechanism (which as mentioned
>> above is if two tasks w/o cookies share a cookie they get a new shared
>> unique cookie). Yes, that is clearly an inconsistency and kludgy. The
>> mechanism is documented in the docs, but clearly not obvious from the
> 
> I've not seen a document so far (also, I'm not one to actually read
> documentation, much preferring comments and Changelogs).

Understood. I think we should split this patch into separate prctl and cgroup patches. The rationale decided here would 
then go into the prctl patch commit message. We can also use this split to address any dependencies we've created on 
cgroups that you mentioned in a different email.

>> So based on the above, how about we add a "create" to pair with "clear" and
>> call it "create" vs "set" since we are creating a unique opaque cookie
>> versus setting a particular value. And as mentioned, because one can't
>> specify a cookie directly but only thru sharing relationships, we need both
>> "to" and "from" to make it completely usable.
>>
>> So we end up with something like this:
>>      PR_SCHED_CORE_CREATE                       -- give yourself a unique cookie
>>      PR_SCHED_CORE_CLEAR                        -- clear your core sched cookie
>>      PR_SCHED_CORE_SHARE_FROM <src_task>        -- get their cookie for you
>>      PR_SCHED_CORE_SHARE_TO   <dest_task>       -- push your cookie to them
> 
> I'm still wondering why we need _FROM/_TO. What exactly will we miss
> with just _SHARE <pid>?
> 
> 	current		arg_task
> 	<none>		<none>		-EDAFT
> 	<none>		<cookie>	current gets cookie
> 	<cookie>	<none>		arg_task gets cookie
> 	<cookie>	<cookie>	-EDAFTER
> 
> (I have a suspicion, but I want to see it spelled out).

The min requirements of the interface I see are:

1. create a my own cookie
2. clear my own cookie
3. create a cookie for a running unmodified task
4. clear a cookie for a running unmodified task
5. copy a cookie from one running unmodified task to another unmodified task

So from your example above:
 > 	<none>		<cookie>	current gets cookie

could also mean clear the cookie of arg_task and satisfy requirement 4 above.

"Share" is a fuzzy term. I should have used COPY as that is more the semantics I was thinking ... specified directional 
transfer. So we could have a single command with two arguments where argument order determines direction. In the v10 
interface proposal, as one argument, current, was always implied, direction had to be specified in the command.

So a single copy command could be something like:

PR_SCHED_CORE_COPY   <src_task> <dst_task>

to replace the two. The very first util you write to do any thing useful w/ all of this is a "copy_cookie <spid> 
<dpid>". :-)

> Also, do we wants this interface to be able to work on processes? Things
> like fcntl(F_SETOWN_EX) allow you to specify a PID type.

Yes and creating/clearing a cookie on a PGID and SID seem like useful shortcuts to look into.

>> An additional question is should the inheritability of a process' cookie be
>> configurable? The current code gives the child process their own unique
>> cookie if the parent had a cookie. That is useful in some cases, but many
>> other configurations could be made much easier with inheritance.
> 
> What was the argument for not following the traditional fork() semantics
> and inheriting everything?

The code just showed up with no explanation :-), but I think I know what was intended and it touches on the same 
security policy type problem you mentioned in a comment on the CLEAR code. In a secure context, you can't just allow a 
random user to clear their cookie, i.e. make themselves trusted. At the same time, in a non-secure context, and several 
use cases have been put forward, I can't think of anything more annoying then being able to set a cookie on my task and 
then not having permission to clear it.

The fork scenario has a similar problem. A child inheriting the cookie means just that, but not inheriting is likely 
different depending on whether its secure vs non-secure (and obviously we can't use those terms. Who wants to advocate 
for non-security :-). For non-secure, don't inherit means the child gets no cookie; secure means the child gets their 
own unique cookie and not the parent's. In the absence of any way to set a policy, the current code chose the secure 
default which makes sense.

So I guess that raises the ugly question, do we need some kind of globally scoped, "secure/not-secure" core sched policy 
flag?

-chrish

  reply	other threads:[~2021-02-06  4:18 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-23  1:16 [PATCH v10 0/5] Core scheduling remaining patches Joel Fernandes (Google)
2021-01-23  1:17 ` [PATCH v10 1/5] sched: migration changes for core scheduling Joel Fernandes (Google)
2021-01-23  1:17 ` [PATCH v10 2/5] sched: CGroup tagging interface " Joel Fernandes (Google)
2021-02-03 16:51   ` Peter Zijlstra
2021-02-04 13:59     ` Peter Zijlstra
2021-02-05 16:42       ` Joel Fernandes
2021-02-04 13:54   ` Peter Zijlstra
2021-02-05  3:45     ` Josh Don
2021-02-04 13:57   ` Peter Zijlstra
2021-02-04 20:52     ` Chris Hyser
2021-02-05 10:43       ` Peter Zijlstra
2021-02-05 22:19         ` Chris Hyser [this message]
2021-02-04 14:01   ` Peter Zijlstra
2021-02-05  3:55     ` Josh Don
2021-02-04 14:35   ` Peter Zijlstra
2021-02-05  4:07     ` Josh Don
2021-02-04 14:52   ` Peter Zijlstra
2021-02-05 16:37     ` Joel Fernandes
2021-02-05 11:41   ` Peter Zijlstra
2021-02-05 11:52   ` Peter Zijlstra
2021-02-06  1:15     ` Josh Don
2021-02-05 12:00   ` Peter Zijlstra
2021-02-23  4:00   ` Chris Hyser
2021-02-23  9:05     ` Peter Zijlstra
2021-02-23 19:25       ` Chris Hyser
2021-02-24  5:15         ` Josh Don
2021-02-24 13:02           ` Chris Hyser
2021-02-24 13:52             ` chris hyser
2021-02-24 15:47               ` chris hyser
2021-02-26 20:07                 ` Chris Hyser
2021-03-01 21:01                   ` Josh Don
2021-01-23  1:17 ` [PATCH v10 3/5] kselftest: Add tests for core-sched interface Joel Fernandes (Google)
2021-01-23  1:17 ` [PATCH v10 4/5] Documentation: Add core scheduling documentation Joel Fernandes (Google)
2021-01-23  1:17 ` [PATCH v10 5/5] sched: Debug bits Joel Fernandes (Google)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7c5d083b-b722-76e5-37ba-ccd1c1a03e13@oracle.com \
    --to=chris.hyser@oracle.com \
    --cc=James.Bottomley@hansenpartnership.com \
    --cc=OWeisse@umich.edu \
    --cc=aaron.lwe@gmail.com \
    --cc=agata.gruza@intel.com \
    --cc=alexandre.chartre@oracle.com \
    --cc=antonio.gomez.iglesias@intel.com \
    --cc=aubrey.intel@gmail.com \
    --cc=benbjiang@tencent.com \
    --cc=bsegall@google.com \
    --cc=christian.brauner@ubuntu.com \
    --cc=derkling@google.com \
    --cc=dfaggioli@suse.com \
    --cc=dhaval.giani@oracle.com \
    --cc=fweisbec@gmail.com \
    --cc=graf@amazon.com \
    --cc=haoluo@google.com \
    --cc=jdesfossez@digitalocean.com \
    --cc=joel@joelfernandes.org \
    --cc=joshdon@google.com \
    --cc=jsbarnes@google.com \
    --cc=junaids@google.com \
    --cc=keescook@chromium.org \
    --cc=kerrnel@google.com \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=mingo@kernel.org \
    --cc=naravamudan@digitalocean.com \
    --cc=pauld@redhat.com \
    --cc=pawan.kumar.gupta@linux.intel.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=valentin.schneider@arm.com \
    --cc=vineeth@bitbyteword.org \
    --cc=viremana@linux.microsoft.com \
    --cc=yu.c.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).