linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] binder: Don't use mmput() from shrinker function.
       [not found] <0000000000001fbbb605aa805c9b@google.com>
@ 2020-07-15 23:36 ` Tetsuo Handa
  2020-07-16  8:35   ` Michal Hocko
  0 siblings, 1 reply; 9+ messages in thread
From: Tetsuo Handa @ 2020-07-15 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjonnevag, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Christian Brauner, Michal Hocko
  Cc: syzbot, acme, alexander.shishkin, jolsa, linux-kernel,
	mark.rutland, mingo, namhyung, peterz, syzkaller-bugs,
	open list:ANDROID DRIVERS, linux-mm

syzbot is reporting that mmput() from shrinker function has a risk of
deadlock [1]. Don't start synchronous teardown of mm when called from
shrinker function.

[1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d45

Reported-by: syzbot <syzbot+1068f09c44d151250c33@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+e5344baa319c9a96edec@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 drivers/android/binder_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 42c672f1584e..cbe6aa77d50d 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -947,7 +947,7 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
 		trace_binder_unmap_user_end(alloc, index);
 	}
 	mmap_read_unlock(mm);
-	mmput(mm);
+	mmput_async(mm);
 
 	trace_binder_unmap_kernel_start(alloc, index);
 
-- 
2.18.4




^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] binder: Don't use mmput() from shrinker function.
  2020-07-15 23:36 ` [PATCH] binder: Don't use mmput() from shrinker function Tetsuo Handa
@ 2020-07-16  8:35   ` Michal Hocko
  2020-07-16 13:41     ` Tetsuo Handa
  0 siblings, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2020-07-16  8:35 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Greg Kroah-Hartman, Arve Hjonnevag, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Christian Brauner, syzbot, acme,
	alexander.shishkin, jolsa, linux-kernel, mark.rutland, mingo,
	namhyung, peterz, syzkaller-bugs, open list:ANDROID DRIVERS,
	linux-mm

On Thu 16-07-20 08:36:52, Tetsuo Handa wrote:
> syzbot is reporting that mmput() from shrinker function has a risk of
> deadlock [1]. Don't start synchronous teardown of mm when called from
> shrinker function.

Please add the actual lock dependency to the changelog.

Anyway is this deadlock real? Mayve I have missed some details but the
call graph points to these two paths.
uprobe_mmap					do_shrink_slab	
  uprobes_mmap_hash #lock
  install_breakpoint				  binder_shrink_scan
    set_swbp					    binder_alloc_free_page
      uprobe_write_opcode			      __mmput
	update_ref_ctr				        uprobe_clear_state
    	  mutex_lock(&delayed_uprobe_lock)	          mutex_lock(&delayed_uprobe_lock);
	    allocation -> reclaim

But in order for this to happen the shrinker would have to do the last
put on the mm. But mm cannot go away from under uprobe_mmap so those two
paths cannot race with each other.

Unless I am missing something this is a false positive. I do not mind
using mmput_async from the shrinker as a workaround but the changelog
should be explicit about the fact.

> [1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d45
> 
> Reported-by: syzbot <syzbot+1068f09c44d151250c33@syzkaller.appspotmail.com>
> Reported-by: syzbot <syzbot+e5344baa319c9a96edec@syzkaller.appspotmail.com>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
>  drivers/android/binder_alloc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> index 42c672f1584e..cbe6aa77d50d 100644
> --- a/drivers/android/binder_alloc.c
> +++ b/drivers/android/binder_alloc.c
> @@ -947,7 +947,7 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
>  		trace_binder_unmap_user_end(alloc, index);
>  	}
>  	mmap_read_unlock(mm);
> -	mmput(mm);
> +	mmput_async(mm);
>  
>  	trace_binder_unmap_kernel_start(alloc, index);
>  
> -- 
> 2.18.4
> 

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] binder: Don't use mmput() from shrinker function.
  2020-07-16  8:35   ` Michal Hocko
@ 2020-07-16 13:41     ` Tetsuo Handa
  2020-07-16 13:54       ` Michal Hocko
  0 siblings, 1 reply; 9+ messages in thread
From: Tetsuo Handa @ 2020-07-16 13:41 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, Arve Hjonnevag, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Christian Brauner, syzbot, acme,
	alexander.shishkin, jolsa, linux-kernel, mark.rutland, mingo,
	namhyung, peterz, syzkaller-bugs, open list:ANDROID DRIVERS,
	linux-mm

On 2020/07/16 17:35, Michal Hocko wrote:
> On Thu 16-07-20 08:36:52, Tetsuo Handa wrote:
>> syzbot is reporting that mmput() from shrinker function has a risk of
>> deadlock [1]. Don't start synchronous teardown of mm when called from
>> shrinker function.
> 
> Please add the actual lock dependency to the changelog.
> 
> Anyway is this deadlock real? Mayve I have missed some details but the
> call graph points to these two paths.
> uprobe_mmap					do_shrink_slab	
>   uprobes_mmap_hash #lock
>   install_breakpoint				  binder_shrink_scan
>     set_swbp					    binder_alloc_free_page
>       uprobe_write_opcode			      __mmput
> 	update_ref_ctr				        uprobe_clear_state
>     	  mutex_lock(&delayed_uprobe_lock)	          mutex_lock(&delayed_uprobe_lock);
> 	    allocation -> reclaim
> 

static int update_ref_ctr(struct uprobe *uprobe, struct mm_struct *mm, short d) {
  mutex_lock(&delayed_uprobe_lock);
  ret = delayed_uprobe_add(uprobe, mm1) {
    du = kzalloc(sizeof(*du), GFP_KERNEL) {
      do_shrink_slab() {
        binder_shrink_scan() {
          binder_alloc_free_page() {
            mmget_not_zero(mm2);
            mmput(mm2) {
              __mmput(mm2) {
                uprobe_clear_state(mm2) {
                  mutex_lock(&delayed_uprobe_lock);
                  delayed_uprobe_remove(NULL, mm2);
                  mutex_unlock(&delayed_uprobe_lock);
                }
              }
            }
          }
        }
      }
    }
  }
  mutex_unlock(&delayed_uprobe_lock);
}

> But in order for this to happen the shrinker would have to do the last
> put on the mm. But mm cannot go away from under uprobe_mmap so those two
> paths cannot race with each other.

and mm1 != mm2 is possible, isn't it?

> 
> Unless I am missing something this is a false positive. I do not mind
> using mmput_async from the shrinker as a workaround but the changelog
> should be explicit about the fact.
> 

binder_alloc_free_page() is already using mmput_async() 14 lines later.
It just took 18 months to hit this race for the third time, for it is
quite difficult to let the owner of mm2 to call mmput(mm2) between
binder_alloc_free_page() calls mmget_not_zero(mm2) and mmput(mm2).

The reason I added you is to see whether we can do

 void mmput(struct mm_struct *mm)
 {
 	might_sleep();
+	/* Calling mmput() from shrinker context can deadlock. */
+	WARN_ON(current->flags & PF_MEMALLOC);
 
 	if (atomic_dec_and_test(&mm->mm_users))
 		__mmput(mm);
 }

in order to catch this bug easier.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] binder: Don't use mmput() from shrinker function.
  2020-07-16 13:41     ` Tetsuo Handa
@ 2020-07-16 13:54       ` Michal Hocko
  2020-07-16 15:12         ` [PATCH v2] " Tetsuo Handa
  0 siblings, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2020-07-16 13:54 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Greg Kroah-Hartman, Arve Hjonnevag, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Christian Brauner, syzbot, acme,
	alexander.shishkin, jolsa, linux-kernel, mark.rutland, mingo,
	namhyung, peterz, syzkaller-bugs, open list:ANDROID DRIVERS,
	linux-mm

On Thu 16-07-20 22:41:14, Tetsuo Handa wrote:
> On 2020/07/16 17:35, Michal Hocko wrote:
[...]
> > But in order for this to happen the shrinker would have to do the last
> > put on the mm. But mm cannot go away from under uprobe_mmap so those two
> > paths cannot race with each other.
> 
> and mm1 != mm2 is possible, isn't it?

OK, I have missed that information. You are right. Can you make this
into the changelog please?
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2] binder: Don't use mmput() from shrinker function.
  2020-07-16 13:54       ` Michal Hocko
@ 2020-07-16 15:12         ` Tetsuo Handa
  2020-07-16 15:17           ` Michal Hocko
  0 siblings, 1 reply; 9+ messages in thread
From: Tetsuo Handa @ 2020-07-16 15:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjonnevag, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Christian Brauner
  Cc: Michal Hocko, syzbot, acme, alexander.shishkin, jolsa,
	linux-kernel, mark.rutland, mingo, namhyung, peterz,
	syzkaller-bugs, open list:ANDROID DRIVERS, linux-mm

syzbot is reporting that mmput() from shrinker function has a risk of
deadlock [1], for delayed_uprobe_add() from update_ref_ctr() calls
kzalloc(GFP_KERNEL) with delayed_uprobe_lock held, and
uprobe_clear_state() from __mmput() also holds delayed_uprobe_lock.

Commit a1b2289cef92ef0e ("android: binder: drop lru lock in isolate
callback") replaced mmput() with mmput_async() in order to avoid sleeping
with spinlock held. But this patch replaces mmput() with mmput_async() in
order not to start __mmput() from shrinker context.

[1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d45

Reported-by: syzbot <syzbot+1068f09c44d151250c33@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+e5344baa319c9a96edec@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 drivers/android/binder_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 42c672f1584e..cbe6aa77d50d 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -947,7 +947,7 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
 		trace_binder_unmap_user_end(alloc, index);
 	}
 	mmap_read_unlock(mm);
-	mmput(mm);
+	mmput_async(mm);
 
 	trace_binder_unmap_kernel_start(alloc, index);
 
-- 
2.18.4



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] binder: Don't use mmput() from shrinker function.
  2020-07-16 15:12         ` [PATCH v2] " Tetsuo Handa
@ 2020-07-16 15:17           ` Michal Hocko
  2020-07-16 16:29             ` Christian Brauner
  2020-07-16 23:53             ` Todd Kjos
  0 siblings, 2 replies; 9+ messages in thread
From: Michal Hocko @ 2020-07-16 15:17 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Greg Kroah-Hartman, Arve Hjonnevag, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Christian Brauner, syzbot, acme,
	alexander.shishkin, jolsa, linux-kernel, mark.rutland, mingo,
	namhyung, peterz, syzkaller-bugs, open list:ANDROID DRIVERS,
	linux-mm

On Fri 17-07-20 00:12:15, Tetsuo Handa wrote:
> syzbot is reporting that mmput() from shrinker function has a risk of
> deadlock [1], for delayed_uprobe_add() from update_ref_ctr() calls
> kzalloc(GFP_KERNEL) with delayed_uprobe_lock held, and
> uprobe_clear_state() from __mmput() also holds delayed_uprobe_lock.
> 
> Commit a1b2289cef92ef0e ("android: binder: drop lru lock in isolate
> callback") replaced mmput() with mmput_async() in order to avoid sleeping
> with spinlock held. But this patch replaces mmput() with mmput_async() in
> order not to start __mmput() from shrinker context.
> 
> [1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d45
> 
> Reported-by: syzbot <syzbot+1068f09c44d151250c33@syzkaller.appspotmail.com>
> Reported-by: syzbot <syzbot+e5344baa319c9a96edec@syzkaller.appspotmail.com>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>

Reviewed-by: Michal Hocko <mhocko@suse.com>

Thanks!

> ---
>  drivers/android/binder_alloc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> index 42c672f1584e..cbe6aa77d50d 100644
> --- a/drivers/android/binder_alloc.c
> +++ b/drivers/android/binder_alloc.c
> @@ -947,7 +947,7 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
>  		trace_binder_unmap_user_end(alloc, index);
>  	}
>  	mmap_read_unlock(mm);
> -	mmput(mm);
> +	mmput_async(mm);
>  
>  	trace_binder_unmap_kernel_start(alloc, index);
>  
> -- 
> 2.18.4

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] binder: Don't use mmput() from shrinker function.
  2020-07-16 15:17           ` Michal Hocko
@ 2020-07-16 16:29             ` Christian Brauner
  2020-07-16 22:27               ` Tetsuo Handa
  2020-07-16 23:53             ` Todd Kjos
  1 sibling, 1 reply; 9+ messages in thread
From: Christian Brauner @ 2020-07-16 16:29 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Tetsuo Handa, Greg Kroah-Hartman, Arve Hjonnevag, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, syzbot, acme,
	alexander.shishkin, jolsa, linux-kernel, mark.rutland, mingo,
	namhyung, peterz, syzkaller-bugs, open list:ANDROID DRIVERS,
	linux-mm

On Thu, Jul 16, 2020 at 05:17:56PM +0200, Michal Hocko wrote:
> On Fri 17-07-20 00:12:15, Tetsuo Handa wrote:
> > syzbot is reporting that mmput() from shrinker function has a risk of
> > deadlock [1], for delayed_uprobe_add() from update_ref_ctr() calls
> > kzalloc(GFP_KERNEL) with delayed_uprobe_lock held, and
> > uprobe_clear_state() from __mmput() also holds delayed_uprobe_lock.
> > 
> > Commit a1b2289cef92ef0e ("android: binder: drop lru lock in isolate
> > callback") replaced mmput() with mmput_async() in order to avoid sleeping
> > with spinlock held. But this patch replaces mmput() with mmput_async() in
> > order not to start __mmput() from shrinker context.
> > 
> > [1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d45
> > 
> > Reported-by: syzbot <syzbot+1068f09c44d151250c33@syzkaller.appspotmail.com>
> > Reported-by: syzbot <syzbot+e5344baa319c9a96edec@syzkaller.appspotmail.com>
> > Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> 
> Reviewed-by: Michal Hocko <mhocko@suse.com>

Thanks for the careful review Michal!
Does this need a Cc: stable?

Otherwise

Acked-by: Christian Brauner <christian.brauner@ubuntu.com>

Thanks!
Christian


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] binder: Don't use mmput() from shrinker function.
  2020-07-16 16:29             ` Christian Brauner
@ 2020-07-16 22:27               ` Tetsuo Handa
  0 siblings, 0 replies; 9+ messages in thread
From: Tetsuo Handa @ 2020-07-16 22:27 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Michal Hocko, Greg Kroah-Hartman, Arve Hjonnevag, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, syzbot, acme,
	alexander.shishkin, jolsa, linux-kernel, mark.rutland, mingo,
	namhyung, peterz, syzkaller-bugs, open list:ANDROID DRIVERS,
	linux-mm

On 2020/07/17 1:29, Christian Brauner wrote:
> Does this need a Cc: stable?

Up to someone who applies this patch. I think this race is hard to hit.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] binder: Don't use mmput() from shrinker function.
  2020-07-16 15:17           ` Michal Hocko
  2020-07-16 16:29             ` Christian Brauner
@ 2020-07-16 23:53             ` Todd Kjos
  1 sibling, 0 replies; 9+ messages in thread
From: Todd Kjos @ 2020-07-16 23:53 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Tetsuo Handa, Greg Kroah-Hartman, Arve Hjonnevag, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, syzbot, acme,
	alexander.shishkin, jolsa, LKML, Mark Rutland, Ingo Molnar,
	namhyung, Peter Zijlstra, syzkaller-bugs,
	open list:ANDROID DRIVERS, linux-mm

On Thu, Jul 16, 2020 at 8:18 AM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Fri 17-07-20 00:12:15, Tetsuo Handa wrote:
> > syzbot is reporting that mmput() from shrinker function has a risk of
> > deadlock [1], for delayed_uprobe_add() from update_ref_ctr() calls
> > kzalloc(GFP_KERNEL) with delayed_uprobe_lock held, and
> > uprobe_clear_state() from __mmput() also holds delayed_uprobe_lock.
> >
> > Commit a1b2289cef92ef0e ("android: binder: drop lru lock in isolate
> > callback") replaced mmput() with mmput_async() in order to avoid sleeping
> > with spinlock held. But this patch replaces mmput() with mmput_async() in
> > order not to start __mmput() from shrinker context.
> >
> > [1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d45
> >
> > Reported-by: syzbot <syzbot+1068f09c44d151250c33@syzkaller.appspotmail.com>
> > Reported-by: syzbot <syzbot+e5344baa319c9a96edec@syzkaller.appspotmail.com>
> > Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
>
> Reviewed-by: Michal Hocko <mhocko@suse.com>

Acked-by: Todd Kjos <tkjos@google.com>

>
> Thanks!
>
> > ---
> >  drivers/android/binder_alloc.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> > index 42c672f1584e..cbe6aa77d50d 100644
> > --- a/drivers/android/binder_alloc.c
> > +++ b/drivers/android/binder_alloc.c
> > @@ -947,7 +947,7 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
> >               trace_binder_unmap_user_end(alloc, index);
> >       }
> >       mmap_read_unlock(mm);
> > -     mmput(mm);
> > +     mmput_async(mm);
> >
> >       trace_binder_unmap_kernel_start(alloc, index);
> >
> > --
> > 2.18.4
>
> --
> Michal Hocko
> SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-07-16 23:53 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <0000000000001fbbb605aa805c9b@google.com>
2020-07-15 23:36 ` [PATCH] binder: Don't use mmput() from shrinker function Tetsuo Handa
2020-07-16  8:35   ` Michal Hocko
2020-07-16 13:41     ` Tetsuo Handa
2020-07-16 13:54       ` Michal Hocko
2020-07-16 15:12         ` [PATCH v2] " Tetsuo Handa
2020-07-16 15:17           ` Michal Hocko
2020-07-16 16:29             ` Christian Brauner
2020-07-16 22:27               ` Tetsuo Handa
2020-07-16 23:53             ` Todd Kjos

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).