linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm: get_cmdline use arg_lock instead of mmap_sem
@ 2019-04-17 12:03 Michal Koutný
  2019-04-17 13:41 ` Michal Hocko
  0 siblings, 1 reply; 30+ messages in thread
From: Michal Koutný @ 2019-04-17 12:03 UTC (permalink / raw)
  To: Andrew Morton, Mateusz Guzik, Michal Hocko
  Cc: Mike Rapoport, Vlastimil Babka, Geert Uytterhoeven, Arun KS,
	Bartosz Golaszewski, linux-mm, linux-kernel

The commit a3b609ef9f8b ("proc read mm's {arg,env}_{start,end} with mmap
semaphore taken.") added synchronization of reading argument/environment
boundaries under mmap_sem. Later commit 88aa7cc688d4 ("mm: introduce
arg_lock to protect arg_start|end and env_start|end in mm_struct")
avoided the coarse use of mmap_sem in similar situations.

get_cmdline can also use arg_lock instead of mmap_sem when it reads the
boundaries.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 mm/util.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/util.c b/mm/util.c
index d559bde497a9..568575cceefc 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -758,12 +758,12 @@ int get_cmdline(struct task_struct *task, char *buffer, int buflen)
 	if (!mm->arg_end)
 		goto out_mm;	/* Shh! No looking before we're done */
 
-	down_read(&mm->mmap_sem);
+	spin_lock(&mm->arg_lock);
 	arg_start = mm->arg_start;
 	arg_end = mm->arg_end;
 	env_start = mm->env_start;
 	env_end = mm->env_end;
-	up_read(&mm->mmap_sem);
+	spin_unlock(&mm->arg_lock);
 
 	len = arg_end - arg_start;
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: get_cmdline use arg_lock instead of mmap_sem
  2019-04-17 12:03 [PATCH] mm: get_cmdline use arg_lock instead of mmap_sem Michal Koutný
@ 2019-04-17 13:41 ` Michal Hocko
  2019-04-17 14:41   ` Michal Koutný
  0 siblings, 1 reply; 30+ messages in thread
From: Michal Hocko @ 2019-04-17 13:41 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Andrew Morton, Mateusz Guzik, Mike Rapoport, Vlastimil Babka,
	Geert Uytterhoeven, Arun KS, Bartosz Golaszewski, linux-mm,
	linux-kernel

On Wed 17-04-19 14:03:47, Michal Koutny wrote:
> The commit a3b609ef9f8b ("proc read mm's {arg,env}_{start,end} with mmap
> semaphore taken.") added synchronization of reading argument/environment
> boundaries under mmap_sem. Later commit 88aa7cc688d4 ("mm: introduce
> arg_lock to protect arg_start|end and env_start|end in mm_struct")
> avoided the coarse use of mmap_sem in similar situations.
> 
> get_cmdline can also use arg_lock instead of mmap_sem when it reads the
> boundaries.

Don't we need to use the lock in prctl_set_mm as well then?

> Signed-off-by: Michal Koutný <mkoutny@suse.com>
> ---
>  mm/util.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/util.c b/mm/util.c
> index d559bde497a9..568575cceefc 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -758,12 +758,12 @@ int get_cmdline(struct task_struct *task, char *buffer, int buflen)
>  	if (!mm->arg_end)
>  		goto out_mm;	/* Shh! No looking before we're done */
>  
> -	down_read(&mm->mmap_sem);
> +	spin_lock(&mm->arg_lock);
>  	arg_start = mm->arg_start;
>  	arg_end = mm->arg_end;
>  	env_start = mm->env_start;
>  	env_end = mm->env_end;
> -	up_read(&mm->mmap_sem);
> +	spin_unlock(&mm->arg_lock);
>  
>  	len = arg_end - arg_start;
>  
> -- 
> 2.16.4

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: get_cmdline use arg_lock instead of mmap_sem
  2019-04-17 13:41 ` Michal Hocko
@ 2019-04-17 14:41   ` Michal Koutný
  2019-04-17 14:55     ` Michal Hocko
  0 siblings, 1 reply; 30+ messages in thread
From: Michal Koutný @ 2019-04-17 14:41 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Bartosz Golaszewski, Arun KS, Geert Uytterhoeven, linux-mm,
	Andrew Morton, Mike Rapoport, Mateusz Guzik, Vlastimil Babka,
	linux-kernel

On Wed, Apr 17, 2019 at 03:41:52PM +0200, Michal Hocko <mhocko@kernel.org> wrote:
> Don't we need to use the lock in prctl_set_mm as well then?

Correct. The patch alone just moves the race from
get_cmdline/prctl_set_mm_map to get_cmdline/prctl_set_mm.

arg_lock could be used in prctl_set_mm but the better idea (IMO) is
complete removal of that code in favor of prctl_set_mm_map [1].

Michal

[1] https://lore.kernel.org/lkml/20180405182651.GM15783@uranus.lan/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: get_cmdline use arg_lock instead of mmap_sem
  2019-04-17 14:41   ` Michal Koutný
@ 2019-04-17 14:55     ` Michal Hocko
  2019-04-18 13:50       ` [PATCH] prctl_set_mm: downgrade mmap_sem to read lock Michal Koutný
  0 siblings, 1 reply; 30+ messages in thread
From: Michal Hocko @ 2019-04-17 14:55 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Bartosz Golaszewski, Arun KS, Geert Uytterhoeven, linux-mm,
	Andrew Morton, Mike Rapoport, Mateusz Guzik, Vlastimil Babka,
	linux-kernel

On Wed 17-04-19 16:41:42, Michal Koutny wrote:
> On Wed, Apr 17, 2019 at 03:41:52PM +0200, Michal Hocko <mhocko@kernel.org> wrote:
> > Don't we need to use the lock in prctl_set_mm as well then?
> 
> Correct. The patch alone just moves the race from
> get_cmdline/prctl_set_mm_map to get_cmdline/prctl_set_mm.
> 
> arg_lock could be used in prctl_set_mm but the better idea (IMO) is
> complete removal of that code in favor of prctl_set_mm_map [1].

Ohh, I have missed that patch. As long as both are merged together then
no objections from me and you can add

Acked-by: Michal Hocko <mhocko@suse.com>

> Michal
> 
> [1] https://lore.kernel.org/lkml/20180405182651.GM15783@uranus.lan/

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH] prctl_set_mm: downgrade mmap_sem to read lock
  2019-04-17 14:55     ` Michal Hocko
@ 2019-04-18 13:50       ` Michal Koutný
  2019-04-18 14:09         ` Cyrill Gorcunov
                           ` (3 more replies)
  0 siblings, 4 replies; 30+ messages in thread
From: Michal Koutný @ 2019-04-18 13:50 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: mhocko, akpm, arunks, brgl, geert+renesas, linux-kernel,
	linux-mm, mguzik, mkoutny, rppt, vbabka, Laurent Dufour

I learnt, it's, alas, too late to drop the non PRCTL_SET_MM_MAP calls
[1], so at least downgrade the write acquisition of mmap_sem as in the
patch below (that should be stacked on the previous one or squashed).

Cyrill, you mentioned lock changes in [1] but the link seems empty. Is
it supposed to be [2]? That could be an alternative to this patch after
some refreshments and clarifications.


[1] https://lore.kernel.org/lkml/20190417165632.GC3040@uranus.lan/
[2] https://lore.kernel.org/lkml/20180507075606.870903028@gmail.com/

========

Since commit 88aa7cc688d4 ("mm: introduce arg_lock to protect
arg_start|end and env_start|end in mm_struct") we use arg_lock for
boundaries modifications. Synchronize prctl_set_mm with this lock and
keep mmap_sem for reading only (analogous to what we already do in
prctl_set_mm_map).

Also, save few cycles by looking up VMA only after performing basic
arguments validation.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 kernel/sys.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index 12df0e5434b8..bbce0f26d707 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2125,8 +2125,12 @@ static int prctl_set_mm(int opt, unsigned long addr,
 
 	error = -EINVAL;
 
-	down_write(&mm->mmap_sem);
-	vma = find_vma(mm, addr);
+	/*
+	 * arg_lock protects concurent updates of arg boundaries, we need mmap_sem for
+	 * a) concurrent sys_brk, b) finding VMA for addr validation.
+	 */
+	down_read(&mm->mmap_sem);
+	spin_lock(&mm->arg_lock);
 
 	prctl_map.start_code	= mm->start_code;
 	prctl_map.end_code	= mm->end_code;
@@ -2185,6 +2189,7 @@ static int prctl_set_mm(int opt, unsigned long addr,
 	if (error)
 		goto out;
 
+	vma = find_vma(mm, addr);
 	switch (opt) {
 	/*
 	 * If command line arguments and environment
@@ -2218,7 +2223,8 @@ static int prctl_set_mm(int opt, unsigned long addr,
 
 	error = 0;
 out:
-	up_write(&mm->mmap_sem);
+	spin_unlock(&mm->arg_lock);
+	up_read(&mm->mmap_sem);
 	return error;
 }
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] prctl_set_mm: downgrade mmap_sem to read lock
  2019-04-18 13:50       ` [PATCH] prctl_set_mm: downgrade mmap_sem to read lock Michal Koutný
@ 2019-04-18 14:09         ` Cyrill Gorcunov
  2019-04-18 14:15         ` Michal Hocko
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 30+ messages in thread
From: Cyrill Gorcunov @ 2019-04-18 14:09 UTC (permalink / raw)
  To: Michal Koutný
  Cc: mhocko, akpm, arunks, brgl, geert+renesas, linux-kernel,
	linux-mm, mguzik, rppt, vbabka, Laurent Dufour

On Thu, Apr 18, 2019 at 03:50:39PM +0200, Michal Koutný wrote:
> I learnt, it's, alas, too late to drop the non PRCTL_SET_MM_MAP calls
> [1], so at least downgrade the write acquisition of mmap_sem as in the
> patch below (that should be stacked on the previous one or squashed).
> 
> Cyrill, you mentioned lock changes in [1] but the link seems empty. Is
> it supposed to be [2]? That could be an alternative to this patch after
> some refreshments and clarifications.

Yes, seems so. From a glance the patch shold be ok. Michal will review
it more carefully today. Thanks!

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] prctl_set_mm: downgrade mmap_sem to read lock
  2019-04-18 13:50       ` [PATCH] prctl_set_mm: downgrade mmap_sem to read lock Michal Koutný
  2019-04-18 14:09         ` Cyrill Gorcunov
@ 2019-04-18 14:15         ` Michal Hocko
  2019-04-18 14:27         ` Laurent Dufour
  2019-04-18 18:23         ` Cyrill Gorcunov
  3 siblings, 0 replies; 30+ messages in thread
From: Michal Hocko @ 2019-04-18 14:15 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Cyrill Gorcunov, akpm, arunks, brgl, geert+renesas, linux-kernel,
	linux-mm, mguzik, rppt, vbabka, Laurent Dufour

On Thu 18-04-19 15:50:39, Michal Koutny wrote:
> I learnt, it's, alas, too late to drop the non PRCTL_SET_MM_MAP calls
> [1], so at least downgrade the write acquisition of mmap_sem as in the
> patch below (that should be stacked on the previous one or squashed).
> 
> Cyrill, you mentioned lock changes in [1] but the link seems empty. Is
> it supposed to be [2]? That could be an alternative to this patch after
> some refreshments and clarifications.
> 
> 
> [1] https://lore.kernel.org/lkml/20190417165632.GC3040@uranus.lan/
> [2] https://lore.kernel.org/lkml/20180507075606.870903028@gmail.com/
> 
> ========
> 
> Since commit 88aa7cc688d4 ("mm: introduce arg_lock to protect
> arg_start|end and env_start|end in mm_struct") we use arg_lock for
> boundaries modifications. Synchronize prctl_set_mm with this lock and
> keep mmap_sem for reading only (analogous to what we already do in
> prctl_set_mm_map).
> 
> Also, save few cycles by looking up VMA only after performing basic
> arguments validation.
> 
> Signed-off-by: Michal Koutný <mkoutny@suse.com>

Looks good to me. Please send both patches in one series once you get a
review feedback from other people.

> ---
>  kernel/sys.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 12df0e5434b8..bbce0f26d707 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2125,8 +2125,12 @@ static int prctl_set_mm(int opt, unsigned long addr,
>  
>  	error = -EINVAL;
>  
> -	down_write(&mm->mmap_sem);
> -	vma = find_vma(mm, addr);
> +	/*
> +	 * arg_lock protects concurent updates of arg boundaries, we need mmap_sem for
> +	 * a) concurrent sys_brk, b) finding VMA for addr validation.
> +	 */
> +	down_read(&mm->mmap_sem);
> +	spin_lock(&mm->arg_lock);
>  
>  	prctl_map.start_code	= mm->start_code;
>  	prctl_map.end_code	= mm->end_code;
> @@ -2185,6 +2189,7 @@ static int prctl_set_mm(int opt, unsigned long addr,
>  	if (error)
>  		goto out;
>  
> +	vma = find_vma(mm, addr);
>  	switch (opt) {
>  	/*
>  	 * If command line arguments and environment
> @@ -2218,7 +2223,8 @@ static int prctl_set_mm(int opt, unsigned long addr,
>  
>  	error = 0;
>  out:
> -	up_write(&mm->mmap_sem);
> +	spin_unlock(&mm->arg_lock);
> +	up_read(&mm->mmap_sem);
>  	return error;
>  }
>  
> -- 
> 2.16.4

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] prctl_set_mm: downgrade mmap_sem to read lock
  2019-04-18 13:50       ` [PATCH] prctl_set_mm: downgrade mmap_sem to read lock Michal Koutný
  2019-04-18 14:09         ` Cyrill Gorcunov
  2019-04-18 14:15         ` Michal Hocko
@ 2019-04-18 14:27         ` Laurent Dufour
  2019-04-18 18:23         ` Cyrill Gorcunov
  3 siblings, 0 replies; 30+ messages in thread
From: Laurent Dufour @ 2019-04-18 14:27 UTC (permalink / raw)
  To: Michal Koutný, Cyrill Gorcunov
  Cc: mhocko, akpm, arunks, brgl, geert+renesas, linux-kernel,
	linux-mm, mguzik, rppt, vbabka

Le 18/04/2019 à 15:50, Michal Koutný a écrit :
> I learnt, it's, alas, too late to drop the non PRCTL_SET_MM_MAP calls
> [1], so at least downgrade the write acquisition of mmap_sem as in the
> patch below (that should be stacked on the previous one or squashed).
> 
> Cyrill, you mentioned lock changes in [1] but the link seems empty. Is
> it supposed to be [2]? That could be an alternative to this patch after
> some refreshments and clarifications.
> 
> 
> [1] https://lore.kernel.org/lkml/20190417165632.GC3040@uranus.lan/
> [2] https://lore.kernel.org/lkml/20180507075606.870903028@gmail.com/
> 
> ========
> 
> Since commit 88aa7cc688d4 ("mm: introduce arg_lock to protect
> arg_start|end and env_start|end in mm_struct") we use arg_lock for
> boundaries modifications. Synchronize prctl_set_mm with this lock and
> keep mmap_sem for reading only (analogous to what we already do in
> prctl_set_mm_map).
> 
> Also, save few cycles by looking up VMA only after performing basic
> arguments validation.
> 
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
> ---
>   kernel/sys.c | 12 +++++++++---
>   1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 12df0e5434b8..bbce0f26d707 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2125,8 +2125,12 @@ static int prctl_set_mm(int opt, unsigned long addr,
>   
>   	error = -EINVAL;
>   
> -	down_write(&mm->mmap_sem);
> -	vma = find_vma(mm, addr);
> +	/*
> +	 * arg_lock protects concurent updates of arg boundaries, we need mmap_sem for
> +	 * a) concurrent sys_brk, b) finding VMA for addr validation.
> +	 */
> +	down_read(&mm->mmap_sem);
> +	spin_lock(&mm->arg_lock);
>   
>   	prctl_map.start_code	= mm->start_code;
>   	prctl_map.end_code	= mm->end_code;
> @@ -2185,6 +2189,7 @@ static int prctl_set_mm(int opt, unsigned long addr,
>   	if (error)
>   		goto out;
>   
> +	vma = find_vma(mm, addr);

Why is find_vma() called while holding the arg_lock ?

To limit the time the spinlock is held, would it be better to
    	read_lock(mmap_sem)
    	find_vma()
    	spin_lock(arg_lock)
    	..
out:
	spin_unlock()
	up_read(mmap_sem)

Not sure this would change a lot the performance anyway.

>   	switch (opt) {
>   	/*
>   	 * If command line arguments and environment
> @@ -2218,7 +2223,8 @@ static int prctl_set_mm(int opt, unsigned long addr,
>   
>   	error = 0;
>   out:
> -	up_write(&mm->mmap_sem);
> +	spin_unlock(&mm->arg_lock);
> +	up_read(&mm->mmap_sem);
>   	return error;
>   }
>   
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] prctl_set_mm: downgrade mmap_sem to read lock
  2019-04-18 13:50       ` [PATCH] prctl_set_mm: downgrade mmap_sem to read lock Michal Koutný
                           ` (2 preceding siblings ...)
  2019-04-18 14:27         ` Laurent Dufour
@ 2019-04-18 18:23         ` Cyrill Gorcunov
  2019-04-30  8:18           ` [PATCH 0/3] Reduce mmap_sem usage for args manipulation Michal Koutný
  3 siblings, 1 reply; 30+ messages in thread
From: Cyrill Gorcunov @ 2019-04-18 18:23 UTC (permalink / raw)
  To: Michal Koutný
  Cc: mhocko, akpm, arunks, brgl, geert+renesas, linux-kernel,
	linux-mm, mguzik, rppt, vbabka, Laurent Dufour

On Thu, Apr 18, 2019 at 03:50:39PM +0200, Michal Koutný wrote:
> I learnt, it's, alas, too late to drop the non PRCTL_SET_MM_MAP calls
> [1], so at least downgrade the write acquisition of mmap_sem as in the
> patch below (that should be stacked on the previous one or squashed).
> 
> Cyrill, you mentioned lock changes in [1] but the link seems empty. Is
> it supposed to be [2]? That could be an alternative to this patch after
> some refreshments and clarifications.
> 
> 
> [1] https://lore.kernel.org/lkml/20190417165632.GC3040@uranus.lan/
> [2] https://lore.kernel.org/lkml/20180507075606.870903028@gmail.com/
> 
> ========
> 
> Since commit 88aa7cc688d4 ("mm: introduce arg_lock to protect
> arg_start|end and env_start|end in mm_struct") we use arg_lock for
> boundaries modifications. Synchronize prctl_set_mm with this lock and
> keep mmap_sem for reading only (analogous to what we already do in
> prctl_set_mm_map).
> 
> Also, save few cycles by looking up VMA only after performing basic
> arguments validation.
> 
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>

As Laurent mentioned we might move vma lookup before the spinlock,
but this might be done on top of the series.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 0/3] Reduce mmap_sem usage for args manipulation
  2019-04-18 18:23         ` Cyrill Gorcunov
@ 2019-04-30  8:18           ` Michal Koutný
  2019-04-30  8:18             ` [PATCH 1/3] mm: get_cmdline use arg_lock instead of mmap_sem Michal Koutný
                               ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Michal Koutný @ 2019-04-30  8:18 UTC (permalink / raw)
  To: gorcunov
  Cc: akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, mkoutny, rppt, vbabka, ktkhai

Hello.

(apologies for late reply) I've aggregated the two previously discussed patches
into one series and based on responses made some changes summed below.

v2
- insert a patch refactoring validate_prctl_map
- move find_vma out of the arg_lock critical section


Michal Koutný (3):
  mm: get_cmdline use arg_lock instead of mmap_sem
  prctl_set_mm: Refactor checks from validate_prctl_map
  prctl_set_mm: downgrade mmap_sem to read lock

 kernel/sys.c | 55 ++++++++++++++++++++++++++++---------------------------
 mm/util.c    |  4 ++--
 2 files changed, 30 insertions(+), 29 deletions(-)

-- 
2.16.4


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 1/3] mm: get_cmdline use arg_lock instead of mmap_sem
  2019-04-30  8:18           ` [PATCH 0/3] Reduce mmap_sem usage for args manipulation Michal Koutný
@ 2019-04-30  8:18             ` Michal Koutný
  2019-04-30  9:09               ` Kirill Tkhai
  2019-04-30  8:18             ` [PATCH 2/3] prctl_set_mm: Refactor checks from validate_prctl_map Michal Koutný
  2019-04-30  8:18             ` [PATCH 3/3] prctl_set_mm: downgrade mmap_sem to read lock Michal Koutný
  2 siblings, 1 reply; 30+ messages in thread
From: Michal Koutný @ 2019-04-30  8:18 UTC (permalink / raw)
  To: gorcunov
  Cc: akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, mkoutny, rppt, vbabka, ktkhai

The commit a3b609ef9f8b ("proc read mm's {arg,env}_{start,end} with mmap
semaphore taken.") added synchronization of reading argument/environment
boundaries under mmap_sem. Later commit 88aa7cc688d4 ("mm: introduce
arg_lock to protect arg_start|end and env_start|end in mm_struct")
avoided the coarse use of mmap_sem in similar situations.

get_cmdline can also use arg_lock instead of mmap_sem when it reads the
boundaries.

Fixes: 88aa7cc688d4 ("mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct")
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Mateusz Guzik <mguzik@redhat.com>
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
---
 mm/util.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/util.c b/mm/util.c
index 43a2984bccaa..5cf0e84a0823 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -758,12 +758,12 @@ int get_cmdline(struct task_struct *task, char *buffer, int buflen)
 	if (!mm->arg_end)
 		goto out_mm;	/* Shh! No looking before we're done */
 
-	down_read(&mm->mmap_sem);
+	spin_lock(&mm->arg_lock);
 	arg_start = mm->arg_start;
 	arg_end = mm->arg_end;
 	env_start = mm->env_start;
 	env_end = mm->env_end;
-	up_read(&mm->mmap_sem);
+	spin_unlock(&mm->arg_lock);
 
 	len = arg_end - arg_start;
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 2/3] prctl_set_mm: Refactor checks from validate_prctl_map
  2019-04-30  8:18           ` [PATCH 0/3] Reduce mmap_sem usage for args manipulation Michal Koutný
  2019-04-30  8:18             ` [PATCH 1/3] mm: get_cmdline use arg_lock instead of mmap_sem Michal Koutný
@ 2019-04-30  8:18             ` Michal Koutný
  2019-04-30  9:27               ` Kirill Tkhai
  2019-04-30  8:18             ` [PATCH 3/3] prctl_set_mm: downgrade mmap_sem to read lock Michal Koutný
  2 siblings, 1 reply; 30+ messages in thread
From: Michal Koutný @ 2019-04-30  8:18 UTC (permalink / raw)
  To: gorcunov
  Cc: akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, mkoutny, rppt, vbabka, ktkhai

Despite comment of validate_prctl_map claims there are no capability
checks, it is not completely true since commit 4d28df6152aa ("prctl:
Allow local CAP_SYS_ADMIN changing exe_file"). Extract the check out of
the function and make the function perform purely arithmetic checks.

This patch should not change any behavior, it is mere refactoring for
following patch.

CC: Kirill Tkhai <ktkhai@virtuozzo.com>
CC: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 kernel/sys.c | 45 ++++++++++++++++++++-------------------------
 1 file changed, 20 insertions(+), 25 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index 12df0e5434b8..e1acb444d7b0 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1882,10 +1882,12 @@ static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
 }
 
 /*
+ * Check arithmetic relations of passed addresses.
+ *
  * WARNING: we don't require any capability here so be very careful
  * in what is allowed for modification from userspace.
  */
-static int validate_prctl_map(struct prctl_mm_map *prctl_map)
+static int validate_prctl_map_addr(struct prctl_mm_map *prctl_map)
 {
 	unsigned long mmap_max_addr = TASK_SIZE;
 	struct mm_struct *mm = current->mm;
@@ -1949,24 +1951,6 @@ static int validate_prctl_map(struct prctl_mm_map *prctl_map)
 			      prctl_map->start_data))
 			goto out;
 
-	/*
-	 * Someone is trying to cheat the auxv vector.
-	 */
-	if (prctl_map->auxv_size) {
-		if (!prctl_map->auxv || prctl_map->auxv_size > sizeof(mm->saved_auxv))
-			goto out;
-	}
-
-	/*
-	 * Finally, make sure the caller has the rights to
-	 * change /proc/pid/exe link: only local sys admin should
-	 * be allowed to.
-	 */
-	if (prctl_map->exe_fd != (u32)-1) {
-		if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
-			goto out;
-	}
-
 	error = 0;
 out:
 	return error;
@@ -1993,11 +1977,17 @@ static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long data
 	if (copy_from_user(&prctl_map, addr, sizeof(prctl_map)))
 		return -EFAULT;
 
-	error = validate_prctl_map(&prctl_map);
+	error = validate_prctl_map_addr(&prctl_map);
 	if (error)
 		return error;
 
 	if (prctl_map.auxv_size) {
+		/*
+		 * Someone is trying to cheat the auxv vector.
+		 */
+		if (!prctl_map.auxv || prctl_map.auxv_size > sizeof(mm->saved_auxv))
+			return -EINVAL;
+
 		memset(user_auxv, 0, sizeof(user_auxv));
 		if (copy_from_user(user_auxv,
 				   (const void __user *)prctl_map.auxv,
@@ -2010,6 +2000,14 @@ static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long data
 	}
 
 	if (prctl_map.exe_fd != (u32)-1) {
+		/*
+		 * Make sure the caller has the rights to
+		 * change /proc/pid/exe link: only local sys admin should
+		 * be allowed to.
+		 */
+		if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
+			return -EINVAL;
+
 		error = prctl_set_mm_exe_file(mm, prctl_map.exe_fd);
 		if (error)
 			return error;
@@ -2097,7 +2095,7 @@ static int prctl_set_mm(int opt, unsigned long addr,
 			unsigned long arg4, unsigned long arg5)
 {
 	struct mm_struct *mm = current->mm;
-	struct prctl_mm_map prctl_map;
+	struct prctl_mm_map prctl_map = { .auxv = NULL, .auxv_size = 0, .exe_fd = -1 };
 	struct vm_area_struct *vma;
 	int error;
 
@@ -2139,9 +2137,6 @@ static int prctl_set_mm(int opt, unsigned long addr,
 	prctl_map.arg_end	= mm->arg_end;
 	prctl_map.env_start	= mm->env_start;
 	prctl_map.env_end	= mm->env_end;
-	prctl_map.auxv		= NULL;
-	prctl_map.auxv_size	= 0;
-	prctl_map.exe_fd	= -1;
 
 	switch (opt) {
 	case PR_SET_MM_START_CODE:
@@ -2181,7 +2176,7 @@ static int prctl_set_mm(int opt, unsigned long addr,
 		goto out;
 	}
 
-	error = validate_prctl_map(&prctl_map);
+	error = validate_prctl_map_addr(&prctl_map);
 	if (error)
 		goto out;
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 3/3] prctl_set_mm: downgrade mmap_sem to read lock
  2019-04-30  8:18           ` [PATCH 0/3] Reduce mmap_sem usage for args manipulation Michal Koutný
  2019-04-30  8:18             ` [PATCH 1/3] mm: get_cmdline use arg_lock instead of mmap_sem Michal Koutný
  2019-04-30  8:18             ` [PATCH 2/3] prctl_set_mm: Refactor checks from validate_prctl_map Michal Koutný
@ 2019-04-30  8:18             ` Michal Koutný
  2019-04-30  8:55               ` Kirill Tkhai
  2 siblings, 1 reply; 30+ messages in thread
From: Michal Koutný @ 2019-04-30  8:18 UTC (permalink / raw)
  To: gorcunov
  Cc: akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, mkoutny, rppt, vbabka, ktkhai

Since commit 88aa7cc688d4 ("mm: introduce arg_lock to protect
arg_start|end and env_start|end in mm_struct") we use arg_lock for
boundaries modifications. Synchronize prctl_set_mm with this lock and
keep mmap_sem for reading only (analogous to what we already do in
prctl_set_mm_map).

v2: call find_vma without arg_lock held

CC: Cyrill Gorcunov <gorcunov@gmail.com>
CC: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 kernel/sys.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index e1acb444d7b0..641fda756575 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2123,9 +2123,14 @@ static int prctl_set_mm(int opt, unsigned long addr,
 
 	error = -EINVAL;
 
-	down_write(&mm->mmap_sem);
+	/*
+	 * arg_lock protects concurent updates of arg boundaries, we need mmap_sem for
+	 * a) concurrent sys_brk, b) finding VMA for addr validation.
+	 */
+	down_read(&mm->mmap_sem);
 	vma = find_vma(mm, addr);
 
+	spin_lock(&mm->arg_lock);
 	prctl_map.start_code	= mm->start_code;
 	prctl_map.end_code	= mm->end_code;
 	prctl_map.start_data	= mm->start_data;
@@ -2213,7 +2218,8 @@ static int prctl_set_mm(int opt, unsigned long addr,
 
 	error = 0;
 out:
-	up_write(&mm->mmap_sem);
+	spin_unlock(&mm->arg_lock);
+	up_read(&mm->mmap_sem);
 	return error;
 }
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] prctl_set_mm: downgrade mmap_sem to read lock
  2019-04-30  8:18             ` [PATCH 3/3] prctl_set_mm: downgrade mmap_sem to read lock Michal Koutný
@ 2019-04-30  8:55               ` Kirill Tkhai
  2019-04-30  9:08                 ` Cyrill Gorcunov
  0 siblings, 1 reply; 30+ messages in thread
From: Kirill Tkhai @ 2019-04-30  8:55 UTC (permalink / raw)
  To: Michal Koutný, gorcunov
  Cc: akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, rppt, vbabka

On 30.04.2019 11:18, Michal Koutný wrote:
> Since commit 88aa7cc688d4 ("mm: introduce arg_lock to protect
> arg_start|end and env_start|end in mm_struct") we use arg_lock for
> boundaries modifications. Synchronize prctl_set_mm with this lock and
> keep mmap_sem for reading only (analogous to what we already do in
> prctl_set_mm_map).
> 
> v2: call find_vma without arg_lock held
> 
> CC: Cyrill Gorcunov <gorcunov@gmail.com>
> CC: Laurent Dufour <ldufour@linux.ibm.com>
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
> ---
>  kernel/sys.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sys.c b/kernel/sys.c
> index e1acb444d7b0..641fda756575 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2123,9 +2123,14 @@ static int prctl_set_mm(int opt, unsigned long addr,
>  
>  	error = -EINVAL;
>  
> -	down_write(&mm->mmap_sem);
> +	/*
> +	 * arg_lock protects concurent updates of arg boundaries, we need mmap_sem for
> +	 * a) concurrent sys_brk, b) finding VMA for addr validation.
> +	 */
> +	down_read(&mm->mmap_sem);
>  	vma = find_vma(mm, addr);
>  
> +	spin_lock(&mm->arg_lock);
>  	prctl_map.start_code	= mm->start_code;
>  	prctl_map.end_code	= mm->end_code;
>  	prctl_map.start_data	= mm->start_data;
> @@ -2213,7 +2218,8 @@ static int prctl_set_mm(int opt, unsigned long addr,
>  
>  	error = 0;
>  out:
> -	up_write(&mm->mmap_sem);
> +	spin_unlock(&mm->arg_lock);
> +	up_read(&mm->mmap_sem);
>  	return error;

Hm, shouldn't spin_lock()/spin_unlock() pair go as a fixup to existing code
in a separate patch? 

Without them, the existing code has a problem at least in get_mm_cmdline().

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] prctl_set_mm: downgrade mmap_sem to read lock
  2019-04-30  8:55               ` Kirill Tkhai
@ 2019-04-30  9:08                 ` Cyrill Gorcunov
  2019-04-30  9:11                   ` Kirill Tkhai
  0 siblings, 1 reply; 30+ messages in thread
From: Cyrill Gorcunov @ 2019-04-30  9:08 UTC (permalink / raw)
  To: Kirill Tkhai
  Cc: Michal Koutný,
	akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, rppt, vbabka

On Tue, Apr 30, 2019 at 11:55:45AM +0300, Kirill Tkhai wrote:
> > -	up_write(&mm->mmap_sem);
> > +	spin_unlock(&mm->arg_lock);
> > +	up_read(&mm->mmap_sem);
> >  	return error;
> 
> Hm, shouldn't spin_lock()/spin_unlock() pair go as a fixup to existing code
> in a separate patch? 
> 
> Without them, the existing code has a problem at least in get_mm_cmdline().

Seems reasonable to merge it into patch 1.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/3] mm: get_cmdline use arg_lock instead of mmap_sem
  2019-04-30  8:18             ` [PATCH 1/3] mm: get_cmdline use arg_lock instead of mmap_sem Michal Koutný
@ 2019-04-30  9:09               ` Kirill Tkhai
  2019-04-30  9:38                 ` Cyrill Gorcunov
  0 siblings, 1 reply; 30+ messages in thread
From: Kirill Tkhai @ 2019-04-30  9:09 UTC (permalink / raw)
  To: Michal Koutný, gorcunov
  Cc: akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, rppt, vbabka

On 30.04.2019 11:18, Michal Koutný wrote:
> The commit a3b609ef9f8b ("proc read mm's {arg,env}_{start,end} with mmap
> semaphore taken.") added synchronization of reading argument/environment
> boundaries under mmap_sem. Later commit 88aa7cc688d4 ("mm: introduce
> arg_lock to protect arg_start|end and env_start|end in mm_struct")
> avoided the coarse use of mmap_sem in similar situations.
> 
> get_cmdline can also use arg_lock instead of mmap_sem when it reads the
> boundaries.
> 
> Fixes: 88aa7cc688d4 ("mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct")
> Cc: Yang Shi <yang.shi@linux.alibaba.com>
> Cc: Mateusz Guzik <mguzik@redhat.com>
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> ---
>  mm/util.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/util.c b/mm/util.c
> index 43a2984bccaa..5cf0e84a0823 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -758,12 +758,12 @@ int get_cmdline(struct task_struct *task, char *buffer, int buflen)
>  	if (!mm->arg_end)
>  		goto out_mm;	/* Shh! No looking before we're done */
>  
> -	down_read(&mm->mmap_sem);
> +	spin_lock(&mm->arg_lock);
>  	arg_start = mm->arg_start;
>  	arg_end = mm->arg_end;
>  	env_start = mm->env_start;
>  	env_end = mm->env_end;
> -	up_read(&mm->mmap_sem);
> +	spin_unlock(&mm->arg_lock);
>  
>  	len = arg_end - arg_start;

This looks OK for me.

But speaking about existing code it's a secret for me, why we ignore arg_lock
in binfmt code, e.g. in load_elf_binary().

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] prctl_set_mm: downgrade mmap_sem to read lock
  2019-04-30  9:08                 ` Cyrill Gorcunov
@ 2019-04-30  9:11                   ` Kirill Tkhai
  2019-05-02 12:52                     ` [PATCH v3 0/2] Reduce mmap_sem usage for args manipulation Michal Koutný
  0 siblings, 1 reply; 30+ messages in thread
From: Kirill Tkhai @ 2019-04-30  9:11 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Michal Koutný,
	akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, rppt, vbabka

On 30.04.2019 12:08, Cyrill Gorcunov wrote:
> On Tue, Apr 30, 2019 at 11:55:45AM +0300, Kirill Tkhai wrote:
>>> -	up_write(&mm->mmap_sem);
>>> +	spin_unlock(&mm->arg_lock);
>>> +	up_read(&mm->mmap_sem);
>>>  	return error;
>>
>> Hm, shouldn't spin_lock()/spin_unlock() pair go as a fixup to existing code
>> in a separate patch? 
>>
>> Without them, the existing code has a problem at least in get_mm_cmdline().
> 
> Seems reasonable to merge it into patch 1.

Sounds sensibly.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/3] prctl_set_mm: Refactor checks from validate_prctl_map
  2019-04-30  8:18             ` [PATCH 2/3] prctl_set_mm: Refactor checks from validate_prctl_map Michal Koutný
@ 2019-04-30  9:27               ` Kirill Tkhai
  0 siblings, 0 replies; 30+ messages in thread
From: Kirill Tkhai @ 2019-04-30  9:27 UTC (permalink / raw)
  To: Michal Koutný, gorcunov
  Cc: akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, rppt, vbabka

On 30.04.2019 11:18, Michal Koutný wrote:
> Despite comment of validate_prctl_map claims there are no capability
> checks, it is not completely true since commit 4d28df6152aa ("prctl:
> Allow local CAP_SYS_ADMIN changing exe_file"). Extract the check out of
> the function and make the function perform purely arithmetic checks.
> 
> This patch should not change any behavior, it is mere refactoring for
> following patch.
> 
> CC: Kirill Tkhai <ktkhai@virtuozzo.com>
> CC: Cyrill Gorcunov <gorcunov@gmail.com>
> Signed-off-by: Michal Koutný <mkoutny@suse.com>

Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>

> ---
>  kernel/sys.c | 45 ++++++++++++++++++++-------------------------
>  1 file changed, 20 insertions(+), 25 deletions(-)
> 
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 12df0e5434b8..e1acb444d7b0 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -1882,10 +1882,12 @@ static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
>  }
>  
>  /*
> + * Check arithmetic relations of passed addresses.
> + *
>   * WARNING: we don't require any capability here so be very careful
>   * in what is allowed for modification from userspace.
>   */
> -static int validate_prctl_map(struct prctl_mm_map *prctl_map)
> +static int validate_prctl_map_addr(struct prctl_mm_map *prctl_map)
>  {
>  	unsigned long mmap_max_addr = TASK_SIZE;
>  	struct mm_struct *mm = current->mm;
> @@ -1949,24 +1951,6 @@ static int validate_prctl_map(struct prctl_mm_map *prctl_map)
>  			      prctl_map->start_data))
>  			goto out;
>  
> -	/*
> -	 * Someone is trying to cheat the auxv vector.
> -	 */
> -	if (prctl_map->auxv_size) {
> -		if (!prctl_map->auxv || prctl_map->auxv_size > sizeof(mm->saved_auxv))
> -			goto out;
> -	}
> -
> -	/*
> -	 * Finally, make sure the caller has the rights to
> -	 * change /proc/pid/exe link: only local sys admin should
> -	 * be allowed to.
> -	 */
> -	if (prctl_map->exe_fd != (u32)-1) {
> -		if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
> -			goto out;
> -	}
> -
>  	error = 0;
>  out:
>  	return error;
> @@ -1993,11 +1977,17 @@ static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long data
>  	if (copy_from_user(&prctl_map, addr, sizeof(prctl_map)))
>  		return -EFAULT;
>  
> -	error = validate_prctl_map(&prctl_map);
> +	error = validate_prctl_map_addr(&prctl_map);
>  	if (error)
>  		return error;
>  
>  	if (prctl_map.auxv_size) {
> +		/*
> +		 * Someone is trying to cheat the auxv vector.
> +		 */
> +		if (!prctl_map.auxv || prctl_map.auxv_size > sizeof(mm->saved_auxv))
> +			return -EINVAL;
> +
>  		memset(user_auxv, 0, sizeof(user_auxv));
>  		if (copy_from_user(user_auxv,
>  				   (const void __user *)prctl_map.auxv,
> @@ -2010,6 +2000,14 @@ static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long data
>  	}
>  
>  	if (prctl_map.exe_fd != (u32)-1) {
> +		/*
> +		 * Make sure the caller has the rights to
> +		 * change /proc/pid/exe link: only local sys admin should
> +		 * be allowed to.
> +		 */
> +		if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
> +			return -EINVAL;
> +
>  		error = prctl_set_mm_exe_file(mm, prctl_map.exe_fd);
>  		if (error)
>  			return error;
> @@ -2097,7 +2095,7 @@ static int prctl_set_mm(int opt, unsigned long addr,
>  			unsigned long arg4, unsigned long arg5)
>  {
>  	struct mm_struct *mm = current->mm;
> -	struct prctl_mm_map prctl_map;
> +	struct prctl_mm_map prctl_map = { .auxv = NULL, .auxv_size = 0, .exe_fd = -1 };
>  	struct vm_area_struct *vma;
>  	int error;
>  
> @@ -2139,9 +2137,6 @@ static int prctl_set_mm(int opt, unsigned long addr,
>  	prctl_map.arg_end	= mm->arg_end;
>  	prctl_map.env_start	= mm->env_start;
>  	prctl_map.env_end	= mm->env_end;
> -	prctl_map.auxv		= NULL;
> -	prctl_map.auxv_size	= 0;
> -	prctl_map.exe_fd	= -1;
>  
>  	switch (opt) {
>  	case PR_SET_MM_START_CODE:
> @@ -2181,7 +2176,7 @@ static int prctl_set_mm(int opt, unsigned long addr,
>  		goto out;
>  	}
>  
> -	error = validate_prctl_map(&prctl_map);
> +	error = validate_prctl_map_addr(&prctl_map);
>  	if (error)
>  		goto out;
>  
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/3] mm: get_cmdline use arg_lock instead of mmap_sem
  2019-04-30  9:09               ` Kirill Tkhai
@ 2019-04-30  9:38                 ` Cyrill Gorcunov
  2019-04-30  9:53                   ` Kirill Tkhai
  0 siblings, 1 reply; 30+ messages in thread
From: Cyrill Gorcunov @ 2019-04-30  9:38 UTC (permalink / raw)
  To: Kirill Tkhai
  Cc: Michal Koutný,
	akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, rppt, vbabka

On Tue, Apr 30, 2019 at 12:09:57PM +0300, Kirill Tkhai wrote:
> 
> This looks OK for me.
> 
> But speaking about existing code it's a secret for me, why we ignore arg_lock
> in binfmt code, e.g. in load_elf_binary().

Well, strictly speaking we probably should but you know setup of
the @arg_start by kernel's elf loader doesn't cause any side
effects as far as I can tell (its been working this lockless
way for years, mmap_sem is taken later in the loader code).
Though for consistency sake we probably should set it up
under the spinlock.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/3] mm: get_cmdline use arg_lock instead of mmap_sem
  2019-04-30  9:38                 ` Cyrill Gorcunov
@ 2019-04-30  9:53                   ` Kirill Tkhai
  2019-04-30 10:45                     ` Cyrill Gorcunov
  0 siblings, 1 reply; 30+ messages in thread
From: Kirill Tkhai @ 2019-04-30  9:53 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Michal Koutný,
	akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, rppt, vbabka

On 30.04.2019 12:38, Cyrill Gorcunov wrote:
> On Tue, Apr 30, 2019 at 12:09:57PM +0300, Kirill Tkhai wrote:
>>
>> This looks OK for me.
>>
>> But speaking about existing code it's a secret for me, why we ignore arg_lock
>> in binfmt code, e.g. in load_elf_binary().
> 
> Well, strictly speaking we probably should but you know setup of
> the @arg_start by kernel's elf loader doesn't cause any side
> effects as far as I can tell (its been working this lockless
> way for years, mmap_sem is taken later in the loader code).
> Though for consistency sake we probably should set it up
> under the spinlock.

Ok, so elf loader doesn't change these parameters. Thanks for the explanation.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/3] mm: get_cmdline use arg_lock instead of mmap_sem
  2019-04-30  9:53                   ` Kirill Tkhai
@ 2019-04-30 10:45                     ` Cyrill Gorcunov
  2019-04-30 10:56                       ` Michal Koutný
  0 siblings, 1 reply; 30+ messages in thread
From: Cyrill Gorcunov @ 2019-04-30 10:45 UTC (permalink / raw)
  To: Kirill Tkhai
  Cc: Michal Koutný,
	akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, rppt, vbabka

On Tue, Apr 30, 2019 at 12:53:51PM +0300, Kirill Tkhai wrote:
> > 
> > Well, strictly speaking we probably should but you know setup of
> > the @arg_start by kernel's elf loader doesn't cause any side
> > effects as far as I can tell (its been working this lockless
> > way for years, mmap_sem is taken later in the loader code).
> > Though for consistency sake we probably should set it up
> > under the spinlock.
> 
> Ok, so elf loader doesn't change these parameters.
> Thanks for the explanation.

It setups these parameters unconditionally. I need to revisit
this moment. Technically (if only I'm not missing something
obvious) we might have a race here with prctl setting up new
params, but this should be harmless since most of them (except
stack setup) are purely informative data.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/3] mm: get_cmdline use arg_lock instead of mmap_sem
  2019-04-30 10:45                     ` Cyrill Gorcunov
@ 2019-04-30 10:56                       ` Michal Koutný
  2019-04-30 13:24                         ` Cyrill Gorcunov
  0 siblings, 1 reply; 30+ messages in thread
From: Michal Koutný @ 2019-04-30 10:56 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Kirill Tkhai, brgl, arunks, geert+renesas, mhocko, linux-mm,
	akpm, ldufour, rppt, mguzik, mkoutny, vbabka, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 791 bytes --]

On Tue, Apr 30, 2019 at 01:45:17PM +0300, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> It setups these parameters unconditionally. I need to revisit
> this moment. Technically (if only I'm not missing something
> obvious) we might have a race here with prctl setting up new
> params, but this should be harmless since most of them (except
> stack setup) are purely informative data.
FTR, when I reviewed that usage, I noticed it was missing the
synchronization. My understanding was that the mm_struct isn't yet
shared at this moment. I can see some of the operations take place after
flush_old_exec (where current->mm = mm_struct), so potentially it is
shared since then. OTOH, I guess there aren't concurrent parties that
could access the field at this stage of exec.

My 2 cents,
Michal

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/3] mm: get_cmdline use arg_lock instead of mmap_sem
  2019-04-30 10:56                       ` Michal Koutný
@ 2019-04-30 13:24                         ` Cyrill Gorcunov
  0 siblings, 0 replies; 30+ messages in thread
From: Cyrill Gorcunov @ 2019-04-30 13:24 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Kirill Tkhai, brgl, arunks, geert+renesas, mhocko, linux-mm,
	akpm, ldufour, rppt, mguzik, mkoutny, vbabka, linux-kernel

On Tue, Apr 30, 2019 at 12:56:10PM +0200, Michal Koutný wrote:
> On Tue, Apr 30, 2019 at 01:45:17PM +0300, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> > It setups these parameters unconditionally. I need to revisit
> > this moment. Technically (if only I'm not missing something
> > obvious) we might have a race here with prctl setting up new
> > params, but this should be harmless since most of them (except
> > stack setup) are purely informative data.
>
> FTR, when I reviewed that usage, I noticed it was missing the
> synchronization. My understanding was that the mm_struct isn't yet
> shared at this moment. I can see some of the operations take place after
> flush_old_exec (where current->mm = mm_struct), so potentially it is
> shared since then. OTOH, I guess there aren't concurrent parties that
> could access the field at this stage of exec.

Just revisited this code -- we're either executing prctl, either execve.
Since both operates with current task we're safe.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 0/2] Reduce mmap_sem usage for args manipulation
  2019-04-30  9:11                   ` Kirill Tkhai
@ 2019-05-02 12:52                     ` Michal Koutný
  2019-05-02 12:52                       ` [PATCH v3 1/2] prctl_set_mm: Refactor checks from validate_prctl_map Michal Koutný
  2019-05-02 12:52                       ` [PATCH v3 2/2] prctl_set_mm: downgrade mmap_sem to read lock Michal Koutný
  0 siblings, 2 replies; 30+ messages in thread
From: Michal Koutný @ 2019-05-02 12:52 UTC (permalink / raw)
  To: gorcunov
  Cc: akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, mkoutny, rppt, vbabka, ktkhai

Hello.

Just merged the two dicussed patches and fixed an overlooked warning.

v2
- insert a patch refactoring validate_prctl_map
- move find_vma out of the arg_lock critical section

v3
- squash get_cmdline and prctl_set_mm patches (to keep locking
  consistency)
- validate_prctl_map_addr: remove unused variable mm

Michal Koutný (2):
  prctl_set_mm: Refactor checks from validate_prctl_map
  prctl_set_mm: downgrade mmap_sem to read lock

 kernel/sys.c | 56 ++++++++++++++++++++++++++++----------------------------
 mm/util.c    |  4 ++--
 2 files changed, 30 insertions(+), 30 deletions(-)

-- 
2.16.4


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 1/2] prctl_set_mm: Refactor checks from validate_prctl_map
  2019-05-02 12:52                     ` [PATCH v3 0/2] Reduce mmap_sem usage for args manipulation Michal Koutný
@ 2019-05-02 12:52                       ` Michal Koutný
  2019-05-02 20:57                         ` Cyrill Gorcunov
  2019-05-02 12:52                       ` [PATCH v3 2/2] prctl_set_mm: downgrade mmap_sem to read lock Michal Koutný
  1 sibling, 1 reply; 30+ messages in thread
From: Michal Koutný @ 2019-05-02 12:52 UTC (permalink / raw)
  To: gorcunov
  Cc: akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, mkoutny, rppt, vbabka, ktkhai

Despite comment of validate_prctl_map claims there are no capability
checks, it is not completely true since commit 4d28df6152aa ("prctl:
Allow local CAP_SYS_ADMIN changing exe_file"). Extract the check out of
the function and make the function perform purely arithmetic checks.

This patch should not change any behavior, it is mere refactoring for
following patch.

v1, v2: ---
v3: Remove unused mm variable from validate_prctl_map_addr

CC: Kirill Tkhai <ktkhai@virtuozzo.com>
CC: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 kernel/sys.c | 46 ++++++++++++++++++++--------------------------
 1 file changed, 20 insertions(+), 26 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index 12df0e5434b8..5e0a5edf47f8 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1882,13 +1882,14 @@ static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
 }
 
 /*
+ * Check arithmetic relations of passed addresses.
+ *
  * WARNING: we don't require any capability here so be very careful
  * in what is allowed for modification from userspace.
  */
-static int validate_prctl_map(struct prctl_mm_map *prctl_map)
+static int validate_prctl_map_addr(struct prctl_mm_map *prctl_map)
 {
 	unsigned long mmap_max_addr = TASK_SIZE;
-	struct mm_struct *mm = current->mm;
 	int error = -EINVAL, i;
 
 	static const unsigned char offsets[] = {
@@ -1949,24 +1950,6 @@ static int validate_prctl_map(struct prctl_mm_map *prctl_map)
 			      prctl_map->start_data))
 			goto out;
 
-	/*
-	 * Someone is trying to cheat the auxv vector.
-	 */
-	if (prctl_map->auxv_size) {
-		if (!prctl_map->auxv || prctl_map->auxv_size > sizeof(mm->saved_auxv))
-			goto out;
-	}
-
-	/*
-	 * Finally, make sure the caller has the rights to
-	 * change /proc/pid/exe link: only local sys admin should
-	 * be allowed to.
-	 */
-	if (prctl_map->exe_fd != (u32)-1) {
-		if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
-			goto out;
-	}
-
 	error = 0;
 out:
 	return error;
@@ -1993,11 +1976,17 @@ static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long data
 	if (copy_from_user(&prctl_map, addr, sizeof(prctl_map)))
 		return -EFAULT;
 
-	error = validate_prctl_map(&prctl_map);
+	error = validate_prctl_map_addr(&prctl_map);
 	if (error)
 		return error;
 
 	if (prctl_map.auxv_size) {
+		/*
+		 * Someone is trying to cheat the auxv vector.
+		 */
+		if (!prctl_map.auxv || prctl_map.auxv_size > sizeof(mm->saved_auxv))
+			return -EINVAL;
+
 		memset(user_auxv, 0, sizeof(user_auxv));
 		if (copy_from_user(user_auxv,
 				   (const void __user *)prctl_map.auxv,
@@ -2010,6 +1999,14 @@ static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long data
 	}
 
 	if (prctl_map.exe_fd != (u32)-1) {
+		/*
+		 * Make sure the caller has the rights to
+		 * change /proc/pid/exe link: only local sys admin should
+		 * be allowed to.
+		 */
+		if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
+			return -EINVAL;
+
 		error = prctl_set_mm_exe_file(mm, prctl_map.exe_fd);
 		if (error)
 			return error;
@@ -2097,7 +2094,7 @@ static int prctl_set_mm(int opt, unsigned long addr,
 			unsigned long arg4, unsigned long arg5)
 {
 	struct mm_struct *mm = current->mm;
-	struct prctl_mm_map prctl_map;
+	struct prctl_mm_map prctl_map = { .auxv = NULL, .auxv_size = 0, .exe_fd = -1 };
 	struct vm_area_struct *vma;
 	int error;
 
@@ -2139,9 +2136,6 @@ static int prctl_set_mm(int opt, unsigned long addr,
 	prctl_map.arg_end	= mm->arg_end;
 	prctl_map.env_start	= mm->env_start;
 	prctl_map.env_end	= mm->env_end;
-	prctl_map.auxv		= NULL;
-	prctl_map.auxv_size	= 0;
-	prctl_map.exe_fd	= -1;
 
 	switch (opt) {
 	case PR_SET_MM_START_CODE:
@@ -2181,7 +2175,7 @@ static int prctl_set_mm(int opt, unsigned long addr,
 		goto out;
 	}
 
-	error = validate_prctl_map(&prctl_map);
+	error = validate_prctl_map_addr(&prctl_map);
 	if (error)
 		goto out;
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 2/2] prctl_set_mm: downgrade mmap_sem to read lock
  2019-05-02 12:52                     ` [PATCH v3 0/2] Reduce mmap_sem usage for args manipulation Michal Koutný
  2019-05-02 12:52                       ` [PATCH v3 1/2] prctl_set_mm: Refactor checks from validate_prctl_map Michal Koutný
@ 2019-05-02 12:52                       ` Michal Koutný
  2019-05-02 20:57                         ` Cyrill Gorcunov
                                           ` (2 more replies)
  1 sibling, 3 replies; 30+ messages in thread
From: Michal Koutný @ 2019-05-02 12:52 UTC (permalink / raw)
  To: gorcunov
  Cc: akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, mkoutny, rppt, vbabka, ktkhai

The commit a3b609ef9f8b ("proc read mm's {arg,env}_{start,end} with mmap
semaphore taken.") added synchronization of reading argument/environment
boundaries under mmap_sem. Later commit 88aa7cc688d4 ("mm: introduce
arg_lock to protect arg_start|end and env_start|end in mm_struct")
avoided the coarse use of mmap_sem in similar situations. But there
still remained two places that (mis)use mmap_sem.

get_cmdline should also use arg_lock instead of mmap_sem when it reads the
boundaries.

The second place that should use arg_lock is in prctl_set_mm. By
protecting the boundaries fields with the arg_lock, we can downgrade
mmap_sem to reader lock (analogous to what we already do in
prctl_set_mm_map).

v2: call find_vma without arg_lock held
v3: squashed get_cmdline arg_lock patch

Fixes: 88aa7cc688d4 ("mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct")
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Mateusz Guzik <mguzik@redhat.com>
CC: Cyrill Gorcunov <gorcunov@gmail.com>
Co-developed-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 kernel/sys.c | 10 ++++++++--
 mm/util.c    |  4 ++--
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index 5e0a5edf47f8..14be57840511 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2122,9 +2122,14 @@ static int prctl_set_mm(int opt, unsigned long addr,
 
 	error = -EINVAL;
 
-	down_write(&mm->mmap_sem);
+	/*
+	 * arg_lock protects concurent updates of arg boundaries, we need mmap_sem for
+	 * a) concurrent sys_brk, b) finding VMA for addr validation.
+	 */
+	down_read(&mm->mmap_sem);
 	vma = find_vma(mm, addr);
 
+	spin_lock(&mm->arg_lock);
 	prctl_map.start_code	= mm->start_code;
 	prctl_map.end_code	= mm->end_code;
 	prctl_map.start_data	= mm->start_data;
@@ -2212,7 +2217,8 @@ static int prctl_set_mm(int opt, unsigned long addr,
 
 	error = 0;
 out:
-	up_write(&mm->mmap_sem);
+	spin_unlock(&mm->arg_lock);
+	up_read(&mm->mmap_sem);
 	return error;
 }
 
diff --git a/mm/util.c b/mm/util.c
index 43a2984bccaa..5cf0e84a0823 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -758,12 +758,12 @@ int get_cmdline(struct task_struct *task, char *buffer, int buflen)
 	if (!mm->arg_end)
 		goto out_mm;	/* Shh! No looking before we're done */
 
-	down_read(&mm->mmap_sem);
+	spin_lock(&mm->arg_lock);
 	arg_start = mm->arg_start;
 	arg_end = mm->arg_end;
 	env_start = mm->env_start;
 	env_end = mm->env_end;
-	up_read(&mm->mmap_sem);
+	spin_unlock(&mm->arg_lock);
 
 	len = arg_end - arg_start;
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v3 1/2] prctl_set_mm: Refactor checks from validate_prctl_map
  2019-05-02 12:52                       ` [PATCH v3 1/2] prctl_set_mm: Refactor checks from validate_prctl_map Michal Koutný
@ 2019-05-02 20:57                         ` Cyrill Gorcunov
  0 siblings, 0 replies; 30+ messages in thread
From: Cyrill Gorcunov @ 2019-05-02 20:57 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, rppt, vbabka, ktkhai

On Thu, May 02, 2019 at 02:52:02PM +0200, Michal Koutný wrote:
> Despite comment of validate_prctl_map claims there are no capability
> checks, it is not completely true since commit 4d28df6152aa ("prctl:
> Allow local CAP_SYS_ADMIN changing exe_file"). Extract the check out of
> the function and make the function perform purely arithmetic checks.
> 
> This patch should not change any behavior, it is mere refactoring for
> following patch.
> 
> v1, v2: ---
> v3: Remove unused mm variable from validate_prctl_map_addr
> 
> CC: Kirill Tkhai <ktkhai@virtuozzo.com>
> CC: Cyrill Gorcunov <gorcunov@gmail.com>
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3 2/2] prctl_set_mm: downgrade mmap_sem to read lock
  2019-05-02 12:52                       ` [PATCH v3 2/2] prctl_set_mm: downgrade mmap_sem to read lock Michal Koutný
@ 2019-05-02 20:57                         ` Cyrill Gorcunov
  2019-05-06  9:28                         ` Kirill Tkhai
  2019-05-07 17:42                         ` Michal Hocko
  2 siblings, 0 replies; 30+ messages in thread
From: Cyrill Gorcunov @ 2019-05-02 20:57 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, rppt, vbabka, ktkhai

On Thu, May 02, 2019 at 02:52:03PM +0200, Michal Koutný wrote:
> The commit a3b609ef9f8b ("proc read mm's {arg,env}_{start,end} with mmap
> semaphore taken.") added synchronization of reading argument/environment
> boundaries under mmap_sem. Later commit 88aa7cc688d4 ("mm: introduce
> arg_lock to protect arg_start|end and env_start|end in mm_struct")
> avoided the coarse use of mmap_sem in similar situations. But there
> still remained two places that (mis)use mmap_sem.
> 
> get_cmdline should also use arg_lock instead of mmap_sem when it reads the
> boundaries.
> 
> The second place that should use arg_lock is in prctl_set_mm. By
> protecting the boundaries fields with the arg_lock, we can downgrade
> mmap_sem to reader lock (analogous to what we already do in
> prctl_set_mm_map).
> 
> v2: call find_vma without arg_lock held
> v3: squashed get_cmdline arg_lock patch
> 
> Fixes: 88aa7cc688d4 ("mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct")
> Cc: Yang Shi <yang.shi@linux.alibaba.com>
> Cc: Mateusz Guzik <mguzik@redhat.com>
> CC: Cyrill Gorcunov <gorcunov@gmail.com>
> Co-developed-by: Laurent Dufour <ldufour@linux.ibm.com>
> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3 2/2] prctl_set_mm: downgrade mmap_sem to read lock
  2019-05-02 12:52                       ` [PATCH v3 2/2] prctl_set_mm: downgrade mmap_sem to read lock Michal Koutný
  2019-05-02 20:57                         ` Cyrill Gorcunov
@ 2019-05-06  9:28                         ` Kirill Tkhai
  2019-05-07 17:42                         ` Michal Hocko
  2 siblings, 0 replies; 30+ messages in thread
From: Kirill Tkhai @ 2019-05-06  9:28 UTC (permalink / raw)
  To: Michal Koutný, gorcunov
  Cc: akpm, arunks, brgl, geert+renesas, ldufour, linux-kernel,
	linux-mm, mguzik, mhocko, rppt, vbabka

On 02.05.2019 15:52, Michal Koutný wrote:
> The commit a3b609ef9f8b ("proc read mm's {arg,env}_{start,end} with mmap
> semaphore taken.") added synchronization of reading argument/environment
> boundaries under mmap_sem. Later commit 88aa7cc688d4 ("mm: introduce
> arg_lock to protect arg_start|end and env_start|end in mm_struct")
> avoided the coarse use of mmap_sem in similar situations. But there
> still remained two places that (mis)use mmap_sem.
> 
> get_cmdline should also use arg_lock instead of mmap_sem when it reads the
> boundaries.
> 
> The second place that should use arg_lock is in prctl_set_mm. By
> protecting the boundaries fields with the arg_lock, we can downgrade
> mmap_sem to reader lock (analogous to what we already do in
> prctl_set_mm_map).
> 
> v2: call find_vma without arg_lock held
> v3: squashed get_cmdline arg_lock patch
> 
> Fixes: 88aa7cc688d4 ("mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct")
> Cc: Yang Shi <yang.shi@linux.alibaba.com>
> Cc: Mateusz Guzik <mguzik@redhat.com>
> CC: Cyrill Gorcunov <gorcunov@gmail.com>
> Co-developed-by: Laurent Dufour <ldufour@linux.ibm.com>
> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> Signed-off-by: Michal Koutný <mkoutny@suse.com>

Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>

> ---
>  kernel/sys.c | 10 ++++++++--
>  mm/util.c    |  4 ++--
>  2 files changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 5e0a5edf47f8..14be57840511 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2122,9 +2122,14 @@ static int prctl_set_mm(int opt, unsigned long addr,
>  
>  	error = -EINVAL;
>  
> -	down_write(&mm->mmap_sem);
> +	/*
> +	 * arg_lock protects concurent updates of arg boundaries, we need mmap_sem for
> +	 * a) concurrent sys_brk, b) finding VMA for addr validation.
> +	 */
> +	down_read(&mm->mmap_sem);
>  	vma = find_vma(mm, addr);
>  
> +	spin_lock(&mm->arg_lock);
>  	prctl_map.start_code	= mm->start_code;
>  	prctl_map.end_code	= mm->end_code;
>  	prctl_map.start_data	= mm->start_data;
> @@ -2212,7 +2217,8 @@ static int prctl_set_mm(int opt, unsigned long addr,
>  
>  	error = 0;
>  out:
> -	up_write(&mm->mmap_sem);
> +	spin_unlock(&mm->arg_lock);
> +	up_read(&mm->mmap_sem);
>  	return error;
>  }
>  
> diff --git a/mm/util.c b/mm/util.c
> index 43a2984bccaa..5cf0e84a0823 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -758,12 +758,12 @@ int get_cmdline(struct task_struct *task, char *buffer, int buflen)
>  	if (!mm->arg_end)
>  		goto out_mm;	/* Shh! No looking before we're done */
>  
> -	down_read(&mm->mmap_sem);
> +	spin_lock(&mm->arg_lock);
>  	arg_start = mm->arg_start;
>  	arg_end = mm->arg_end;
>  	env_start = mm->env_start;
>  	env_end = mm->env_end;
> -	up_read(&mm->mmap_sem);
> +	spin_unlock(&mm->arg_lock);
>  
>  	len = arg_end - arg_start;
>  
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3 2/2] prctl_set_mm: downgrade mmap_sem to read lock
  2019-05-02 12:52                       ` [PATCH v3 2/2] prctl_set_mm: downgrade mmap_sem to read lock Michal Koutný
  2019-05-02 20:57                         ` Cyrill Gorcunov
  2019-05-06  9:28                         ` Kirill Tkhai
@ 2019-05-07 17:42                         ` Michal Hocko
  2 siblings, 0 replies; 30+ messages in thread
From: Michal Hocko @ 2019-05-07 17:42 UTC (permalink / raw)
  To: Michal Koutný
  Cc: gorcunov, akpm, arunks, brgl, geert+renesas, ldufour,
	linux-kernel, linux-mm, mguzik, rppt, vbabka, ktkhai

On Thu 02-05-19 14:52:03, Michal Koutny wrote:
> The commit a3b609ef9f8b ("proc read mm's {arg,env}_{start,end} with mmap
> semaphore taken.") added synchronization of reading argument/environment
> boundaries under mmap_sem. Later commit 88aa7cc688d4 ("mm: introduce
> arg_lock to protect arg_start|end and env_start|end in mm_struct")
> avoided the coarse use of mmap_sem in similar situations. But there
> still remained two places that (mis)use mmap_sem.
> 
> get_cmdline should also use arg_lock instead of mmap_sem when it reads the
> boundaries.
> 
> The second place that should use arg_lock is in prctl_set_mm. By
> protecting the boundaries fields with the arg_lock, we can downgrade
> mmap_sem to reader lock (analogous to what we already do in
> prctl_set_mm_map).
> 
> v2: call find_vma without arg_lock held
> v3: squashed get_cmdline arg_lock patch
> 
> Fixes: 88aa7cc688d4 ("mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct")
> Cc: Yang Shi <yang.shi@linux.alibaba.com>
> Cc: Mateusz Guzik <mguzik@redhat.com>
> CC: Cyrill Gorcunov <gorcunov@gmail.com>
> Co-developed-by: Laurent Dufour <ldufour@linux.ibm.com>
> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> Signed-off-by: Michal Koutný <mkoutny@suse.com>

Just a nit. S-o-b chain is not correct here. The first s-o-b should
match the author (From) of the patch.

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  kernel/sys.c | 10 ++++++++--
>  mm/util.c    |  4 ++--
>  2 files changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 5e0a5edf47f8..14be57840511 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2122,9 +2122,14 @@ static int prctl_set_mm(int opt, unsigned long addr,
>  
>  	error = -EINVAL;
>  
> -	down_write(&mm->mmap_sem);
> +	/*
> +	 * arg_lock protects concurent updates of arg boundaries, we need mmap_sem for
> +	 * a) concurrent sys_brk, b) finding VMA for addr validation.
> +	 */
> +	down_read(&mm->mmap_sem);
>  	vma = find_vma(mm, addr);
>  
> +	spin_lock(&mm->arg_lock);
>  	prctl_map.start_code	= mm->start_code;
>  	prctl_map.end_code	= mm->end_code;
>  	prctl_map.start_data	= mm->start_data;
> @@ -2212,7 +2217,8 @@ static int prctl_set_mm(int opt, unsigned long addr,
>  
>  	error = 0;
>  out:
> -	up_write(&mm->mmap_sem);
> +	spin_unlock(&mm->arg_lock);
> +	up_read(&mm->mmap_sem);
>  	return error;
>  }
>  
> diff --git a/mm/util.c b/mm/util.c
> index 43a2984bccaa..5cf0e84a0823 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -758,12 +758,12 @@ int get_cmdline(struct task_struct *task, char *buffer, int buflen)
>  	if (!mm->arg_end)
>  		goto out_mm;	/* Shh! No looking before we're done */
>  
> -	down_read(&mm->mmap_sem);
> +	spin_lock(&mm->arg_lock);
>  	arg_start = mm->arg_start;
>  	arg_end = mm->arg_end;
>  	env_start = mm->env_start;
>  	env_end = mm->env_end;
> -	up_read(&mm->mmap_sem);
> +	spin_unlock(&mm->arg_lock);
>  
>  	len = arg_end - arg_start;
>  
> -- 
> 2.16.4

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2019-05-07 17:42 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-17 12:03 [PATCH] mm: get_cmdline use arg_lock instead of mmap_sem Michal Koutný
2019-04-17 13:41 ` Michal Hocko
2019-04-17 14:41   ` Michal Koutný
2019-04-17 14:55     ` Michal Hocko
2019-04-18 13:50       ` [PATCH] prctl_set_mm: downgrade mmap_sem to read lock Michal Koutný
2019-04-18 14:09         ` Cyrill Gorcunov
2019-04-18 14:15         ` Michal Hocko
2019-04-18 14:27         ` Laurent Dufour
2019-04-18 18:23         ` Cyrill Gorcunov
2019-04-30  8:18           ` [PATCH 0/3] Reduce mmap_sem usage for args manipulation Michal Koutný
2019-04-30  8:18             ` [PATCH 1/3] mm: get_cmdline use arg_lock instead of mmap_sem Michal Koutný
2019-04-30  9:09               ` Kirill Tkhai
2019-04-30  9:38                 ` Cyrill Gorcunov
2019-04-30  9:53                   ` Kirill Tkhai
2019-04-30 10:45                     ` Cyrill Gorcunov
2019-04-30 10:56                       ` Michal Koutný
2019-04-30 13:24                         ` Cyrill Gorcunov
2019-04-30  8:18             ` [PATCH 2/3] prctl_set_mm: Refactor checks from validate_prctl_map Michal Koutný
2019-04-30  9:27               ` Kirill Tkhai
2019-04-30  8:18             ` [PATCH 3/3] prctl_set_mm: downgrade mmap_sem to read lock Michal Koutný
2019-04-30  8:55               ` Kirill Tkhai
2019-04-30  9:08                 ` Cyrill Gorcunov
2019-04-30  9:11                   ` Kirill Tkhai
2019-05-02 12:52                     ` [PATCH v3 0/2] Reduce mmap_sem usage for args manipulation Michal Koutný
2019-05-02 12:52                       ` [PATCH v3 1/2] prctl_set_mm: Refactor checks from validate_prctl_map Michal Koutný
2019-05-02 20:57                         ` Cyrill Gorcunov
2019-05-02 12:52                       ` [PATCH v3 2/2] prctl_set_mm: downgrade mmap_sem to read lock Michal Koutný
2019-05-02 20:57                         ` Cyrill Gorcunov
2019-05-06  9:28                         ` Kirill Tkhai
2019-05-07 17:42                         ` Michal Hocko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).