From: Johannes Weiner <hannes@cmpxchg.org>
To: Suren Baghdasaryan <surenb@google.com>
Cc: tj@kernel.org, lizefan.x@bytedance.com, peterz@infradead.org,
	johunt@akamai.com, mhocko@suse.com, keescook@chromium.org,
	quic_sudaraja@quicinc.com, cgroups@vger.kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/1] psi: remove 500ms min window size limitation for triggers
Date: Wed, 1 Mar 2023 15:07:50 -0500
Message-ID: <Y/+wlg5L8A1iebya@cmpxchg.org>
In-Reply-To: <20230301193403.1507484-1-surenb@google.com>

On Wed, Mar 01, 2023 at 11:34:03AM -0800, Suren Baghdasaryan wrote:
> Current 500ms min window size for psi triggers limits polling interval
> to 50ms to prevent polling threads from using too much cpu bandwidth by
> polling too frequently. However the number of cgroups with triggers is
> unlimited, so this protection can be defeated by creating multiple
> cgroups with psi triggers (triggers in each cgroup are served by a single
> "psimon" kernel thread).
> Instead of limiting min polling period, which also limits the latency of
> psi events, it's better to limit psi trigger creation to authorized users
> only, like we do for system-wide psi triggers (/proc/pressure/* files can
> be written only by processes with CAP_SYS_RESOURCE capability). This also
> makes access rules for cgroup psi files consistent with system-wide ones.
> Add a CAP_SYS_RESOURCE capability check for cgroup psi file writers and
> remove the psi window min size limitation.
> 
> Suggested-by: Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>
> Link: https://lore.kernel.org/all/cover.1676067791.git.quic_sudaraja@quicinc.com/
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> ---
>  kernel/cgroup/cgroup.c | 10 ++++++++++
>  kernel/sched/psi.c     |  4 +---
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index 935e8121b21e..b600a6baaeca 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -3867,6 +3867,12 @@ static __poll_t cgroup_pressure_poll(struct kernfs_open_file *of,
>  	return psi_trigger_poll(&ctx->psi.trigger, of->file, pt);
>  }
>  
> +static int cgroup_pressure_open(struct kernfs_open_file *of)
> +{
> +	return (of->file->f_mode & FMODE_WRITE && !capable(CAP_SYS_RESOURCE)) ?
> +		-EPERM : 0;
> +}

I agree with the change, but it's a bit unfortunate that this check is
duplicated between system and cgroup.

What do you think about psi_trigger_create() taking the file and
checking FMODE_WRITE and CAP_SYS_RESOURCE against file->f_cred?
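
To make the suggestion concrete, here is a rough userspace sketch of what a single shared check could look like. It is not the kernel implementation: struct file, struct cred, the FMODE_* values and the has_sys_resource flag are simplified stand-ins, and psi_trigger_may_create() is a hypothetical helper name. The only point is that both the /proc/pressure and the cgroup pressure paths could call one function that looks at the opener's credentials instead of each carrying its own open-time check.

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace stand-ins for kernel definitions; simplified for illustration. */
#define FMODE_READ  0x1u
#define FMODE_WRITE 0x2u

/* Hypothetical model of the opener's credentials (file->f_cred). */
struct cred {
	bool has_sys_resource;	/* stands in for holding CAP_SYS_RESOURCE */
};

struct file {
	unsigned int f_mode;
	const struct cred *f_cred;
};

/* Assumption: a capability lookup against the credentials saved at open time. */
static bool cred_has_sys_resource(const struct cred *cred)
{
	return cred->has_sys_resource;
}

/*
 * Hypothetical shared check that a psi_trigger_create()-like function
 * could run for both system-wide and cgroup pressure files.
 * Returns 0 if the trigger may be created, -EPERM otherwise.
 */
static int psi_trigger_may_create(const struct file *file)
{
	if ((file->f_mode & FMODE_WRITE) &&
	    !cred_has_sys_resource(file->f_cred))
		return -EPERM;
	return 0;
}
```

In this model, a writable file opened without the capability gets -EPERM, while read-only opens and privileged writers pass. Checking the credentials stored in the file, rather than those of the task doing the write, ties the decision to whoever opened the file.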
