linux-kernel.vger.kernel.org archive mirror
* [PATCH v2] perf/core: fix mlock accounting in perf_mmap()
From: Song Liu @ 2020-01-22 19:04 UTC
  To: linux-kernel
  Cc: kernel-team, Song Liu, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra

sysctl_perf_event_mlock and user->locked_vm can change value
independently, so we can't guarantee:

   user->locked_vm <= user_lock_limit

When user->locked_vm is larger than user_lock_limit, we cannot simply
update extra and user_extra as:

   extra = user_locked - user_lock_limit;
   user_extra -= extra;

Otherwise, user_extra will be negative. In extreme cases, this may lead to
negative user->locked_vm (until this perf-mmap is closed), which breaks
locked_vm badly.

Fix this by adjusting user_locked before calculating extra and user_extra.

Fixes: c4b75479741c ("perf/core: Make the mlock accounting simple again")
Signed-off-by: Song Liu <songliubraving@fb.com>
Suggested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
---
 kernel/events/core.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2173c23c25b4..d25f2de45996 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5916,8 +5916,19 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
 	 */
 	user_lock_limit *= num_online_cpus();
 
-	user_locked = atomic_long_read(&user->locked_vm) + user_extra;
+	user_locked = atomic_long_read(&user->locked_vm);
 
+	/*
+	 * sysctl_perf_event_mlock and user->locked_vm can change value
+	 * independently, so we can't guarantee:
+	 *     user->locked_vm <= user_lock_limit
+	 *
+	 * Adjust user_locked to be <= user_lock_limit so we can calculate
+	 * correct extra and user_extra.
+	 */
+	user_locked = min_t(unsigned long, user_locked, user_lock_limit);
+
+	user_locked += user_extra;
 	if (user_locked > user_lock_limit) {
 		/*
 		 * charge locked_vm until it hits user_lock_limit;
-- 
2.17.1
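
The arithmetic described in the commit message can be reproduced outside
the kernel. What follows is a minimal userspace sketch, not the kernel
code: struct charge, min_ul(), charge_old() and charge_fixed() are
hypothetical names introduced for illustration, plain integers stand in
for the atomic counters, and the pre-patch split is assumed to be exactly
the two statements quoted in the commit message.

```c
#include <assert.h>

/* Hypothetical result type: how one mmap's pages are split. */
struct charge {
	long user_extra;	/* pages charged to user->locked_vm */
	long extra;		/* pages charged to mm->pinned_vm */
};

/* Local stand-in for the kernel's min_t(unsigned long, a, b). */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Model of the accounting before the patch. */
static struct charge charge_old(unsigned long locked_vm,
				unsigned long user_lock_limit,
				long nr_pages)
{
	struct charge c = { .user_extra = nr_pages, .extra = 0 };
	unsigned long user_locked;

	assert(nr_pages >= 0);
	user_locked = locked_vm + c.user_extra;

	if (user_locked > user_lock_limit) {
		/* extra can exceed nr_pages when locked_vm > limit */
		c.extra = user_locked - user_lock_limit;
		c.user_extra -= c.extra;
	}
	return c;
}

/* Model of the accounting with the patch applied. */
static struct charge charge_fixed(unsigned long locked_vm,
				  unsigned long user_lock_limit,
				  long nr_pages)
{
	struct charge c = { .user_extra = nr_pages, .extra = 0 };
	unsigned long user_locked;

	assert(nr_pages >= 0);
	/* Clamp first, so extra can never exceed nr_pages. */
	user_locked = min_ul(locked_vm, user_lock_limit);
	user_locked += c.user_extra;

	if (user_locked > user_lock_limit) {
		c.extra = user_locked - user_lock_limit;
		c.user_extra -= c.extra;
	}
	return c;
}
```

With 200 pages already locked, a limit lowered to 100 pages, and a 9-page
mapping, charge_old() yields user_extra = -100 and extra = 109, whereas
charge_fixed() yields user_extra = 0 and extra = 9.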



* Re: [PATCH v2] perf/core: fix mlock accounting in perf_mmap()
From: Alexander Shishkin @ 2020-01-23  9:33 UTC
  To: Song Liu, linux-kernel
  Cc: kernel-team, Song Liu, Arnaldo Carvalho de Melo, Jiri Olsa,
	Peter Zijlstra, alexander.shishkin

Song Liu <songliubraving@fb.com> writes:

> sysctl_perf_event_mlock and user->locked_vm can change value
> independently, so we can't guarantee:

Looks good, I still have some suggestions below.

>
>    user->locked_vm <= user_lock_limit
>
> When user->locked_vm is larger than user_lock_limit, we cannot simply
> update extra and user_extra as:
>
>    extra = user_locked - user_lock_limit;
>    user_extra -= extra;
>
> Otherwise, user_extra will be negative. In extreme cases, this may lead to
> negative user->locked_vm (until this perf-mmap is closed), which breaks
> locked_vm badly.
>
> Fix this by adjusting user_locked before calculating extra and user_extra.

The commit message is just talking about the code. We can see the code
when we scroll down to the diff. What this can be instead is:

1. Problem statement: decreasing sysctl_perf_event_mlock between two
consecutive mmap()s of a perf ring buffer may lead to an integer
underflow in locked memory accounting. This may lead to the following
undesired behavior: <an example of bad behavior as opposed to expected
behavior>.

2. Fix description: address this by adjusting the accounting logic to
take into account the possibility that the amount of already locked
memory may exceed the current limit.

> Fixes: c4b75479741c ("perf/core: Make the mlock accounting simple again")
> Signed-off-by: Song Liu <songliubraving@fb.com>
> Suggested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> ---
>  kernel/events/core.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 2173c23c25b4..d25f2de45996 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5916,8 +5916,19 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
>  	 */
>  	user_lock_limit *= num_online_cpus();
>  
> -	user_locked = atomic_long_read(&user->locked_vm) + user_extra;
> +	user_locked = atomic_long_read(&user->locked_vm);
>  
> +	/*
> +	 * sysctl_perf_event_mlock and user->locked_vm can change value
> +	 * independently, so we can't guarantee:
> +	 *     user->locked_vm <= user_lock_limit

"sysctl_perf_event_mlock may have changed, so that user->locked_vm >
user_lock_limit".

> +	 *
> +	 * Adjust user_locked to be <= user_lock_limit so we can calculate
> +	 * correct extra and user_extra.

This comment is also verbalizing the C code that follows. I don't think
it's necessary.

> +	 */
> +	user_locked = min_t(unsigned long, user_locked, user_lock_limit);

A matter of preference, but to me the "if (user_locked >=
user_lock_limit)" is easier to read.
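
For comparison, the two spellings of the clamp can be put side by side. This
is only a sketch under an assumption: the body of the suggested branch is
not spelled out above, so it is taken to be a plain assignment of the limit.
min_ul(), clamp_with_min(), clamp_with_if() and check_equivalent() are
hypothetical userspace names, with min_ul() standing in for the kernel's
min_t(unsigned long, ...).

```c
#include <assert.h>

/* Local stand-in for the kernel's min_t(unsigned long, a, b). */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* The form used in the patch. */
static unsigned long clamp_with_min(unsigned long user_locked,
				    unsigned long user_lock_limit)
{
	return min_ul(user_locked, user_lock_limit);
}

/* The branch form; the assignment in the body is an assumption. */
static unsigned long clamp_with_if(unsigned long user_locked,
				   unsigned long user_lock_limit)
{
	if (user_locked >= user_lock_limit)
		user_locked = user_lock_limit;
	return user_locked;
}

/* Both forms return the same value for any pair of inputs. */
static void check_equivalent(unsigned long user_locked,
			     unsigned long user_lock_limit)
{
	assert(clamp_with_min(user_locked, user_lock_limit) ==
	       clamp_with_if(user_locked, user_lock_limit));
}
```

The two forms behave identically, so the choice between them is purely
about readability.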

> +
> +	user_locked += user_extra;
>  	if (user_locked > user_lock_limit) {
>  		/*
>  		 * charge locked_vm until it hits user_lock_limit;

Thanks,
--
Alex
