* [PATCH 1/1] psi: do not require setsched permission from the trigger creator @ 2019-07-30 1:33 Suren Baghdasaryan 2019-07-30 8:11 ` Peter Zijlstra 2019-08-06 11:20 ` [tip:sched/urgent] sched/psi: Do " tip-bot for Suren Baghdasaryan 0 siblings, 2 replies; 8+ messages in thread From: Suren Baghdasaryan @ 2019-07-30 1:33 UTC (permalink / raw) To: mingo, peterz Cc: lizefan, hannes, axboe, dennis, dennisszhou, akpm, linux-mm, linux-doc, linux-kernel, kernel-team, Suren Baghdasaryan, Nick Kralevich When a process creates a new trigger by writing into /proc/pressure/* files, permissions to write such a file should be used to determine whether the process is allowed to do so or not. Current implementation would also require such a process to have setsched capability. Setting of psi trigger thread's scheduling policy is an implementation detail and should not be exposed to the user level. Remove the permission check by using _nocheck version of the function. Suggested-by: Nick Kralevich <nnk@google.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> --- kernel/sched/psi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c index 7acc632c3b82..ed9a1d573cb1 100644 --- a/kernel/sched/psi.c +++ b/kernel/sched/psi.c @@ -1061,7 +1061,7 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group, mutex_unlock(&group->trigger_lock); return ERR_CAST(kworker); } - sched_setscheduler(kworker->task, SCHED_FIFO, ¶m); + sched_setscheduler_nocheck(kworker->task, SCHED_FIFO, ¶m); kthread_init_delayed_work(&group->poll_work, psi_poll_work); rcu_assign_pointer(group->poll_kworker, kworker); -- 2.22.0.709.g102302147b-goog ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] psi: do not require setsched permission from the trigger creator 2019-07-30 1:33 [PATCH 1/1] psi: do not require setsched permission from the trigger creator Suren Baghdasaryan @ 2019-07-30 8:11 ` Peter Zijlstra 2019-07-30 17:44 ` Suren Baghdasaryan 2019-08-06 11:20 ` [tip:sched/urgent] sched/psi: Do " tip-bot for Suren Baghdasaryan 1 sibling, 1 reply; 8+ messages in thread From: Peter Zijlstra @ 2019-07-30 8:11 UTC (permalink / raw) To: Suren Baghdasaryan Cc: mingo, lizefan, hannes, axboe, dennis, dennisszhou, akpm, linux-mm, linux-doc, linux-kernel, kernel-team, Nick Kralevich, Thomas Gleixner On Mon, Jul 29, 2019 at 06:33:10PM -0700, Suren Baghdasaryan wrote: > When a process creates a new trigger by writing into /proc/pressure/* > files, permissions to write such a file should be used to determine whether > the process is allowed to do so or not. Current implementation would also > require such a process to have setsched capability. Setting of psi trigger > thread's scheduling policy is an implementation detail and should not be > exposed to the user level. Remove the permission check by using _nocheck > version of the function. > > Suggested-by: Nick Kralevich <nnk@google.com> > Signed-off-by: Suren Baghdasaryan <surenb@google.com> > --- > kernel/sched/psi.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c > index 7acc632c3b82..ed9a1d573cb1 100644 > --- a/kernel/sched/psi.c > +++ b/kernel/sched/psi.c > @@ -1061,7 +1061,7 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group, > mutex_unlock(&group->trigger_lock); > return ERR_CAST(kworker); > } > - sched_setscheduler(kworker->task, SCHED_FIFO, ¶m); > + sched_setscheduler_nocheck(kworker->task, SCHED_FIFO, ¶m); ARGGH, wtf is there a FIFO-99!! thread here at all !? > kthread_init_delayed_work(&group->poll_work, > psi_poll_work); > rcu_assign_pointer(group->poll_kworker, kworker); > -- > 2.22.0.709.g102302147b-goog > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] psi: do not require setsched permission from the trigger creator 2019-07-30 8:11 ` Peter Zijlstra @ 2019-07-30 17:44 ` Suren Baghdasaryan 2019-08-01 9:51 ` Peter Zijlstra 0 siblings, 1 reply; 8+ messages in thread From: Suren Baghdasaryan @ 2019-07-30 17:44 UTC (permalink / raw) To: Peter Zijlstra Cc: Ingo Molnar, lizefan, Johannes Weiner, axboe, Dennis Zhou, Dennis Zhou, Andrew Morton, linux-mm, linux-doc, LKML, kernel-team, Nick Kralevich, Thomas Gleixner On Tue, Jul 30, 2019 at 1:11 AM Peter Zijlstra <peterz@infradead.org> wrote: > > On Mon, Jul 29, 2019 at 06:33:10PM -0700, Suren Baghdasaryan wrote: > > When a process creates a new trigger by writing into /proc/pressure/* > > files, permissions to write such a file should be used to determine whether > > the process is allowed to do so or not. Current implementation would also > > require such a process to have setsched capability. Setting of psi trigger > > thread's scheduling policy is an implementation detail and should not be > > exposed to the user level. Remove the permission check by using _nocheck > > version of the function. > > > > Suggested-by: Nick Kralevich <nnk@google.com> > > Signed-off-by: Suren Baghdasaryan <surenb@google.com> > > --- > > kernel/sched/psi.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c > > index 7acc632c3b82..ed9a1d573cb1 100644 > > --- a/kernel/sched/psi.c > > +++ b/kernel/sched/psi.c > > @@ -1061,7 +1061,7 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group, > > mutex_unlock(&group->trigger_lock); > > return ERR_CAST(kworker); > > } > > - sched_setscheduler(kworker->task, SCHED_FIFO, ¶m); > > + sched_setscheduler_nocheck(kworker->task, SCHED_FIFO, ¶m); > > ARGGH, wtf is there a FIFO-99!! thread here at all !? We need psi poll_kworker to be an rt-priority thread so that psi notifications are delivered to the userspace without delay even when the CPUs are very congested. Otherwise it's easy to delay psi notifications by running a simple CPU hogger executing "chrt -f 50 dd if=/dev/zero of=/dev/null". Because these notifications are time-critical for reacting to memory shortages we can't allow for such delays. Notice that this kworker is created only if userspace creates a psi trigger. So unless you are using psi triggers you will never see this kthread created. > > kthread_init_delayed_work(&group->poll_work, > > psi_poll_work); > > rcu_assign_pointer(group->poll_kworker, kworker); > > -- > > 2.22.0.709.g102302147b-goog > > > > -- > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com. > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] psi: do not require setsched permission from the trigger creator 2019-07-30 17:44 ` Suren Baghdasaryan @ 2019-08-01 9:51 ` Peter Zijlstra 2019-08-01 18:28 ` Suren Baghdasaryan 0 siblings, 1 reply; 8+ messages in thread From: Peter Zijlstra @ 2019-08-01 9:51 UTC (permalink / raw) To: Suren Baghdasaryan Cc: Ingo Molnar, lizefan, Johannes Weiner, axboe, Dennis Zhou, Dennis Zhou, Andrew Morton, linux-mm, linux-doc, LKML, kernel-team, Nick Kralevich, Thomas Gleixner On Tue, Jul 30, 2019 at 10:44:51AM -0700, Suren Baghdasaryan wrote: > On Tue, Jul 30, 2019 at 1:11 AM Peter Zijlstra <peterz@infradead.org> wrote: > > > > On Mon, Jul 29, 2019 at 06:33:10PM -0700, Suren Baghdasaryan wrote: > > > When a process creates a new trigger by writing into /proc/pressure/* > > > files, permissions to write such a file should be used to determine whether > > > the process is allowed to do so or not. Current implementation would also > > > require such a process to have setsched capability. Setting of psi trigger > > > thread's scheduling policy is an implementation detail and should not be > > > exposed to the user level. Remove the permission check by using _nocheck > > > version of the function. > > > > > > Suggested-by: Nick Kralevich <nnk@google.com> > > > Signed-off-by: Suren Baghdasaryan <surenb@google.com> > > > --- > > > kernel/sched/psi.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c > > > index 7acc632c3b82..ed9a1d573cb1 100644 > > > --- a/kernel/sched/psi.c > > > +++ b/kernel/sched/psi.c > > > @@ -1061,7 +1061,7 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group, > > > mutex_unlock(&group->trigger_lock); > > > return ERR_CAST(kworker); > > > } > > > - sched_setscheduler(kworker->task, SCHED_FIFO, ¶m); > > > + sched_setscheduler_nocheck(kworker->task, SCHED_FIFO, ¶m); > > > > ARGGH, wtf is there a FIFO-99!! thread here at all !? > > We need psi poll_kworker to be an rt-priority thread so that psi There is a giant difference between 'needs to be higher than OTHER' and FIFO-99. > notifications are delivered to the userspace without delay even when > the CPUs are very congested. Otherwise it's easy to delay psi > notifications by running a simple CPU hogger executing "chrt -f 50 dd > if=/dev/zero of=/dev/null". Because these notifications are So what; at that point that's exactly what you're asking for. Using RT is for those who know what they're doing. > time-critical for reacting to memory shortages we can't allow for such > delays. Furthermore, actual RT programs will have pre-allocated and locked any memory they rely on. They don't give a crap about your pressure nonsense. > Notice that this kworker is created only if userspace creates a psi > trigger. So unless you are using psi triggers you will never see this > kthread created. By marking it FIFO-99 you're in effect saying that your stupid statistics gathering is more important than your life. It will preempt the task that's in control of the band-saw emergency break, it will preempt the task that's adjusting the electromagnetic field containing this plasma flow. That's insane. I'm going to queue a patch to reduce this to FIFO-1, that will preempt regular OTHER tasks but will not perturb (much) actual RT bits. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] psi: do not require setsched permission from the trigger creator 2019-08-01 9:51 ` Peter Zijlstra @ 2019-08-01 18:28 ` Suren Baghdasaryan 2019-08-01 21:59 ` Peter Zijlstra 0 siblings, 1 reply; 8+ messages in thread From: Suren Baghdasaryan @ 2019-08-01 18:28 UTC (permalink / raw) To: Peter Zijlstra Cc: Ingo Molnar, lizefan, Johannes Weiner, axboe, Dennis Zhou, Dennis Zhou, Andrew Morton, linux-mm, linux-doc, LKML, kernel-team, Nick Kralevich, Thomas Gleixner Hi Peter, Thanks for sharing your thoughts. I understand your point and I tend to agree with it. I originally designed this using watchdog as the example of a critical system health signal and in the context of mobile device memory pressure is critical but I agree that there are more important things in life. I checked and your proposal to change it to FIFO-1 should still work for our purposes. Will test to make sure and reply to your patch. Couple clarifications in-line. On Thu, Aug 1, 2019 at 2:51 AM Peter Zijlstra <peterz@infradead.org> wrote: > > On Tue, Jul 30, 2019 at 10:44:51AM -0700, Suren Baghdasaryan wrote: > > On Tue, Jul 30, 2019 at 1:11 AM Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > On Mon, Jul 29, 2019 at 06:33:10PM -0700, Suren Baghdasaryan wrote: > > > > When a process creates a new trigger by writing into /proc/pressure/* > > > > files, permissions to write such a file should be used to determine whether > > > > the process is allowed to do so or not. Current implementation would also > > > > require such a process to have setsched capability. Setting of psi trigger > > > > thread's scheduling policy is an implementation detail and should not be > > > > exposed to the user level. Remove the permission check by using _nocheck > > > > version of the function. > > > > > > > > Suggested-by: Nick Kralevich <nnk@google.com> > > > > Signed-off-by: Suren Baghdasaryan <surenb@google.com> > > > > --- > > > > kernel/sched/psi.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c > > > > index 7acc632c3b82..ed9a1d573cb1 100644 > > > > --- a/kernel/sched/psi.c > > > > +++ b/kernel/sched/psi.c > > > > @@ -1061,7 +1061,7 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group, > > > > mutex_unlock(&group->trigger_lock); > > > > return ERR_CAST(kworker); > > > > } > > > > - sched_setscheduler(kworker->task, SCHED_FIFO, ¶m); > > > > + sched_setscheduler_nocheck(kworker->task, SCHED_FIFO, ¶m); > > > > > > ARGGH, wtf is there a FIFO-99!! thread here at all !? > > > > We need psi poll_kworker to be an rt-priority thread so that psi > > There is a giant difference between 'needs to be higher than OTHER' and > FIFO-99. > > > notifications are delivered to the userspace without delay even when > > the CPUs are very congested. Otherwise it's easy to delay psi > > notifications by running a simple CPU hogger executing "chrt -f 50 dd > > if=/dev/zero of=/dev/null". Because these notifications are > > So what; at that point that's exactly what you're asking for. Using RT > is for those who know what they're doing. > > > time-critical for reacting to memory shortages we can't allow for such > > delays. > > Furthermore, actual RT programs will have pre-allocated and locked any > memory they rely on. They don't give a crap about your pressure > nonsense. > This signal is used not to protect other RT tasks but to monitor overall system memory health for the sake of system responsiveness. > > Notice that this kworker is created only if userspace creates a psi > > trigger. So unless you are using psi triggers you will never see this > > kthread created. > > By marking it FIFO-99 you're in effect saying that your stupid > statistics gathering is more important than your life. It will preempt > the task that's in control of the band-saw emergency break, it will > preempt the task that's adjusting the electromagnetic field containing > this plasma flow. > > That's insane. IMHO an opt-in feature stops being "stupid" as soon as the user opted in to use it, therefore explicitly indicating interest in it. However I assume you are using "stupid" here to indicate that it's "less important" rather than it's "useless". > I'm going to queue a patch to reduce this to FIFO-1, that will preempt > regular OTHER tasks but will not perturb (much) actual RT bits. > Thanks for posting the fix. > -- > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com. > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] psi: do not require setsched permission from the trigger creator 2019-08-01 18:28 ` Suren Baghdasaryan @ 2019-08-01 21:59 ` Peter Zijlstra 2019-08-02 1:19 ` Suren Baghdasaryan 0 siblings, 1 reply; 8+ messages in thread From: Peter Zijlstra @ 2019-08-01 21:59 UTC (permalink / raw) To: Suren Baghdasaryan Cc: Ingo Molnar, lizefan, Johannes Weiner, axboe, Dennis Zhou, Dennis Zhou, Andrew Morton, linux-mm, linux-doc, LKML, kernel-team, Nick Kralevich, Thomas Gleixner On Thu, Aug 01, 2019 at 11:28:30AM -0700, Suren Baghdasaryan wrote: > > By marking it FIFO-99 you're in effect saying that your stupid > > statistics gathering is more important than your life. It will preempt > > the task that's in control of the band-saw emergency break, it will > > preempt the task that's adjusting the electromagnetic field containing > > this plasma flow. > > > > That's insane. > > IMHO an opt-in feature stops being "stupid" as soon as the user opted > in to use it, therefore explicitly indicating interest in it. However > I assume you are using "stupid" here to indicate that it's "less > important" rather than it's "useless". Quite; PSI does have its uses. RT just isn't one of them. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] psi: do not require setsched permission from the trigger creator 2019-08-01 21:59 ` Peter Zijlstra @ 2019-08-02 1:19 ` Suren Baghdasaryan 0 siblings, 0 replies; 8+ messages in thread From: Suren Baghdasaryan @ 2019-08-02 1:19 UTC (permalink / raw) To: Peter Zijlstra Cc: Ingo Molnar, lizefan, Johannes Weiner, axboe, Dennis Zhou, Dennis Zhou, Andrew Morton, linux-mm, linux-doc, LKML, kernel-team, Nick Kralevich, Thomas Gleixner On Thu, Aug 1, 2019 at 2:59 PM Peter Zijlstra <peterz@infradead.org> wrote: > > On Thu, Aug 01, 2019 at 11:28:30AM -0700, Suren Baghdasaryan wrote: > > > By marking it FIFO-99 you're in effect saying that your stupid > > > statistics gathering is more important than your life. It will preempt > > > the task that's in control of the band-saw emergency break, it will > > > preempt the task that's adjusting the electromagnetic field containing > > > this plasma flow. > > > > > > That's insane. > > > > IMHO an opt-in feature stops being "stupid" as soon as the user opted > > in to use it, therefore explicitly indicating interest in it. However > > I assume you are using "stupid" here to indicate that it's "less > > important" rather than it's "useless". > > Quite; PSI does have its uses. RT just isn't one of them. Sorry about messing it up in the first place. If you don't see any issues with my patch replacing sched_setscheduler() with sched_setscheduler_nocheck(), would you mind taking it too? I applied it over your patch onto Linus' ToT with no merge conflicts. Thanks, Suren. > -- > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com. > ^ permalink raw reply [flat|nested] 8+ messages in thread
* [tip:sched/urgent] sched/psi: Do not require setsched permission from the trigger creator 2019-07-30 1:33 [PATCH 1/1] psi: do not require setsched permission from the trigger creator Suren Baghdasaryan 2019-07-30 8:11 ` Peter Zijlstra @ 2019-08-06 11:20 ` tip-bot for Suren Baghdasaryan 1 sibling, 0 replies; 8+ messages in thread From: tip-bot for Suren Baghdasaryan @ 2019-08-06 11:20 UTC (permalink / raw) To: linux-tip-commits; +Cc: hpa, nnk, surenb, peterz, tglx, linux-kernel, mingo Commit-ID: 04e048cf09d7b5fc995817cdc5ae1acd4482429c Gitweb: https://git.kernel.org/tip/04e048cf09d7b5fc995817cdc5ae1acd4482429c Author: Suren Baghdasaryan <surenb@google.com> AuthorDate: Mon, 29 Jul 2019 18:33:10 -0700 Committer: Peter Zijlstra <peterz@infradead.org> CommitDate: Tue, 6 Aug 2019 12:49:18 +0200 sched/psi: Do not require setsched permission from the trigger creator When a process creates a new trigger by writing into /proc/pressure/* files, permissions to write such a file should be used to determine whether the process is allowed to do so or not. Current implementation would also require such a process to have setsched capability. Setting of psi trigger thread's scheduling policy is an implementation detail and should not be exposed to the user level. Remove the permission check by using _nocheck version of the function. Suggested-by: Nick Kralevich <nnk@google.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: lizefan@huawei.com Cc: mingo@redhat.com Cc: akpm@linux-foundation.org Cc: kernel-team@android.com Cc: dennisszhou@gmail.com Cc: dennis@kernel.org Cc: hannes@cmpxchg.org Cc: axboe@kernel.dk Link: https://lkml.kernel.org/r/20190730013310.162367-1-surenb@google.com --- kernel/sched/psi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c index 7fe2c5fd26b5..23fbbcc414d5 100644 --- a/kernel/sched/psi.c +++ b/kernel/sched/psi.c @@ -1061,7 +1061,7 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group, mutex_unlock(&group->trigger_lock); return ERR_CAST(kworker); } - sched_setscheduler(kworker->task, SCHED_FIFO, ¶m); + sched_setscheduler_nocheck(kworker->task, SCHED_FIFO, ¶m); kthread_init_delayed_work(&group->poll_work, psi_poll_work); rcu_assign_pointer(group->poll_kworker, kworker); ^ permalink raw reply related [flat|nested] 8+ messages in thread
end of thread, other threads:[~2019-08-06 11:21 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-07-30 1:33 [PATCH 1/1] psi: do not require setsched permission from the trigger creator Suren Baghdasaryan 2019-07-30 8:11 ` Peter Zijlstra 2019-07-30 17:44 ` Suren Baghdasaryan 2019-08-01 9:51 ` Peter Zijlstra 2019-08-01 18:28 ` Suren Baghdasaryan 2019-08-01 21:59 ` Peter Zijlstra 2019-08-02 1:19 ` Suren Baghdasaryan 2019-08-06 11:20 ` [tip:sched/urgent] sched/psi: Do " tip-bot for Suren Baghdasaryan
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).