* Re: possible deadlock in lru_add_drain_all
From: Michal Hocko @ 2017-10-27  9:34 UTC
To: syzbot
Cc: akpm, dan.j.williams, hannes, jack, jglisse, linux-kernel, linux-mm,
    shli, syzkaller-bugs, tglx, vbabka, ying.huang

On Fri 27-10-17 02:22:40, syzbot wrote:
> Hello,
>
> syzkaller hit the following crash on
> a31cc455c512f3f1dd5f79cac8e29a7c8a617af8
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.

I do not see such a commit. My linux-next top is next-20171018.

[...]
> Chain exists of:
>   cpu_hotplug_lock.rw_sem --> &pipe->mutex/1 --> &sb->s_type->i_mutex_key#9
>
>  Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&sb->s_type->i_mutex_key#9);
>                                lock(&pipe->mutex/1);
>                                lock(&sb->s_type->i_mutex_key#9);
>   lock(cpu_hotplug_lock.rw_sem);

I am quite confused about this report. Where exactly is the deadlock?
I do not see where we would get the pipe mutex from inside of the
hotplug lock. Is it possible this is just a false positive due to the
crossrelease feature?
--
Michal Hocko
SUSE Labs

* Re: possible deadlock in lru_add_drain_all
From: Dmitry Vyukov @ 2017-10-27  9:44 UTC
To: Michal Hocko
Cc: syzbot, Andrew Morton, Dan Williams, Johannes Weiner, Jan Kara,
    jglisse, LKML, linux-mm, shli, syzkaller-bugs, Thomas Gleixner,
    Vlastimil Babka, ying.huang

On Fri, Oct 27, 2017 at 11:34 AM, Michal Hocko <mhocko@kernel.org> wrote:
> On Fri 27-10-17 02:22:40, syzbot wrote:
>> Hello,
>>
>> syzkaller hit the following crash on
>> a31cc455c512f3f1dd5f79cac8e29a7c8a617af8
>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>
> I do not see such a commit. My linux-next top is next-20171018.
>
> [...]
>> Chain exists of:
>>   cpu_hotplug_lock.rw_sem --> &pipe->mutex/1 --> &sb->s_type->i_mutex_key#9
>>
>>  Possible unsafe locking scenario:
>>
>>        CPU0                    CPU1
>>        ----                    ----
>>   lock(&sb->s_type->i_mutex_key#9);
>>                                lock(&pipe->mutex/1);
>>                                lock(&sb->s_type->i_mutex_key#9);
>>   lock(cpu_hotplug_lock.rw_sem);
>
> I am quite confused about this report. Where exactly is the deadlock?
> I do not see where we would get the pipe mutex from inside of the
> hotplug lock. Is it possible this is just a false positive due to the
> crossrelease feature?

As far as I understand, this CPU0/CPU1 scheme works only for simple
cases with 2 mutexes. This one seems to have a larger cycle, as denoted
by the "the existing dependency chain (in reverse order) is:" section.

* Re: possible deadlock in lru_add_drain_all
From: Dmitry Vyukov @ 2017-10-27  9:47 UTC
To: Michal Hocko
Cc: syzbot, Andrew Morton, Dan Williams, Johannes Weiner, Jan Kara,
    jglisse, LKML, linux-mm, shli, syzkaller-bugs, Thomas Gleixner,
    Vlastimil Babka, ying.huang

On Fri, Oct 27, 2017 at 11:44 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Fri, Oct 27, 2017 at 11:34 AM, Michal Hocko <mhocko@kernel.org> wrote:
>> On Fri 27-10-17 02:22:40, syzbot wrote:
>>> Hello,
>>>
>>> syzkaller hit the following crash on
>>> a31cc455c512f3f1dd5f79cac8e29a7c8a617af8
>>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>>> compiler: gcc (GCC) 7.1.1 20170620
>>> .config is attached
>>> Raw console output is attached.
>>
>> I do not see such a commit. My linux-next top is next-20171018.

As far as I understand, linux-next constantly recreates its tree, so
all commit hashes are destroyed. Somebody mentioned some time ago a
linux-next-something tree which keeps all of the history (but I don't
remember it off the top of my head).

>> [...]
>>> Chain exists of:
>>>   cpu_hotplug_lock.rw_sem --> &pipe->mutex/1 --> &sb->s_type->i_mutex_key#9
>>>
>>>  Possible unsafe locking scenario:
>>>
>>>        CPU0                    CPU1
>>>        ----                    ----
>>>   lock(&sb->s_type->i_mutex_key#9);
>>>                                lock(&pipe->mutex/1);
>>>                                lock(&sb->s_type->i_mutex_key#9);
>>>   lock(cpu_hotplug_lock.rw_sem);
>>
>> I am quite confused about this report. Where exactly is the deadlock?
>> I do not see where we would get the pipe mutex from inside of the
>> hotplug lock. Is it possible this is just a false positive due to the
>> crossrelease feature?
>
> As far as I understand, this CPU0/CPU1 scheme works only for simple
> cases with 2 mutexes. This one seems to have a larger cycle, as denoted
> by the "the existing dependency chain (in reverse order) is:" section.

* Re: possible deadlock in lru_add_drain_all
From: Michal Hocko @ 2017-10-27 13:42 UTC
To: Dmitry Vyukov
Cc: syzbot, Andrew Morton, Dan Williams, Johannes Weiner, Jan Kara,
    jglisse, LKML, linux-mm, shli, syzkaller-bugs, Thomas Gleixner,
    Vlastimil Babka, ying.huang

On Fri 27-10-17 11:44:58, Dmitry Vyukov wrote:
> On Fri, Oct 27, 2017 at 11:34 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Fri 27-10-17 02:22:40, syzbot wrote:
> >> Hello,
> >>
> >> syzkaller hit the following crash on
> >> a31cc455c512f3f1dd5f79cac8e29a7c8a617af8
> >> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> >> compiler: gcc (GCC) 7.1.1 20170620
> >> .config is attached
> >> Raw console output is attached.
> >
> > I do not see such a commit. My linux-next top is next-20171018.
> >
> > [...]
> >> Chain exists of:
> >>   cpu_hotplug_lock.rw_sem --> &pipe->mutex/1 --> &sb->s_type->i_mutex_key#9
> >>
> >>  Possible unsafe locking scenario:
> >>
> >>        CPU0                    CPU1
> >>        ----                    ----
> >>   lock(&sb->s_type->i_mutex_key#9);
> >>                                lock(&pipe->mutex/1);
> >>                                lock(&sb->s_type->i_mutex_key#9);
> >>   lock(cpu_hotplug_lock.rw_sem);
> >
> > I am quite confused about this report. Where exactly is the deadlock?
> > I do not see where we would get the pipe mutex from inside of the
> > hotplug lock. Is it possible this is just a false positive due to the
> > crossrelease feature?
>
> As far as I understand, this CPU0/CPU1 scheme works only for simple
> cases with 2 mutexes. This one seems to have a larger cycle, as denoted
> by the "the existing dependency chain (in reverse order) is:" section.

My point was that lru_add_drain_all doesn't take any external locks
other than lru_lock, and that one is not anywhere in the chain AFAICS.
--
Michal Hocko
SUSE Labs

* Re: possible deadlock in lru_add_drain_all
From: Michal Hocko @ 2017-10-30  8:22 UTC
To: Dmitry Vyukov
Cc: syzbot, Andrew Morton, Dan Williams, Johannes Weiner, Jan Kara,
    jglisse, LKML, linux-mm, shli, syzkaller-bugs, Thomas Gleixner,
    Vlastimil Babka, ying.huang, Byungchul Park

[Cc Byungchul. The original full report is
http://lkml.kernel.org/r/089e0825eec8955c1f055c83d476@google.com]

Could you have a look please? This smells like a false positive to me.

On Fri 27-10-17 15:42:34, Michal Hocko wrote:
> On Fri 27-10-17 11:44:58, Dmitry Vyukov wrote:
> > On Fri, Oct 27, 2017 at 11:34 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > > On Fri 27-10-17 02:22:40, syzbot wrote:
> > >> Hello,
> > >>
> > >> syzkaller hit the following crash on
> > >> a31cc455c512f3f1dd5f79cac8e29a7c8a617af8
> > >> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> > >> compiler: gcc (GCC) 7.1.1 20170620
> > >> .config is attached
> > >> Raw console output is attached.
> > >
> > > I do not see such a commit. My linux-next top is next-20171018.
> > >
> > > [...]
> > >> Chain exists of:
> > >>   cpu_hotplug_lock.rw_sem --> &pipe->mutex/1 --> &sb->s_type->i_mutex_key#9
> > >>
> > >>  Possible unsafe locking scenario:
> > >>
> > >>        CPU0                    CPU1
> > >>        ----                    ----
> > >>   lock(&sb->s_type->i_mutex_key#9);
> > >>                                lock(&pipe->mutex/1);
> > >>                                lock(&sb->s_type->i_mutex_key#9);
> > >>   lock(cpu_hotplug_lock.rw_sem);
> > >
> > > I am quite confused about this report. Where exactly is the deadlock?
> > > I do not see where we would get the pipe mutex from inside of the
> > > hotplug lock. Is it possible this is just a false positive due to the
> > > crossrelease feature?
> >
> > As far as I understand, this CPU0/CPU1 scheme works only for simple
> > cases with 2 mutexes. This one seems to have a larger cycle, as denoted
> > by the "the existing dependency chain (in reverse order) is:" section.
>
> My point was that lru_add_drain_all doesn't take any external locks
> other than lru_lock, and that one is not anywhere in the chain AFAICS.
--
Michal Hocko
SUSE Labs

* Re: possible deadlock in lru_add_drain_all
From: Byungchul Park @ 2017-10-30 10:09 UTC
To: Michal Hocko
Cc: Dmitry Vyukov, syzbot, Andrew Morton, Dan Williams, Johannes Weiner,
    Jan Kara, jglisse, LKML, linux-mm, shli, syzkaller-bugs,
    Thomas Gleixner, Vlastimil Babka, ying.huang, kernel-team, peterz

On Mon, Oct 30, 2017 at 09:22:03AM +0100, Michal Hocko wrote:
> [Cc Byungchul. The original full report is
> http://lkml.kernel.org/r/089e0825eec8955c1f055c83d476@google.com]
>
> Could you have a look please? This smells like a false positive to me.

+cc peterz@infradead.org

Hello,

IMHO, the false positive was caused by the lockdep_map of 'cpuhp_state',
which couldn't distinguish between cpu-up and cpu-down.

And it was solved with the following commit by Peter and Thomas:

  5f4b55e10645b7371322c800a5ec745cab487a6c
  smp/hotplug: Differentiate the AP-work lockdep class between up and down

Therefore, we can avoid the false positive on kernels later than that
commit.

Peter and Thomas, could you confirm it?

Thanks,
Byungchul

> On Fri 27-10-17 15:42:34, Michal Hocko wrote:
> > On Fri 27-10-17 11:44:58, Dmitry Vyukov wrote:
> > > On Fri, Oct 27, 2017 at 11:34 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > > > On Fri 27-10-17 02:22:40, syzbot wrote:
> > > >> Hello,
> > > >>
> > > >> syzkaller hit the following crash on
> > > >> a31cc455c512f3f1dd5f79cac8e29a7c8a617af8
> > > >> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> > > >> compiler: gcc (GCC) 7.1.1 20170620
> > > >> .config is attached
> > > >> Raw console output is attached.
> > > >
> > > > I do not see such a commit. My linux-next top is next-20171018.
> > > >
> > > > [...]
> > > >> Chain exists of:
> > > >>   cpu_hotplug_lock.rw_sem --> &pipe->mutex/1 --> &sb->s_type->i_mutex_key#9
> > > >>
> > > >>  Possible unsafe locking scenario:
> > > >>
> > > >>        CPU0                    CPU1
> > > >>        ----                    ----
> > > >>   lock(&sb->s_type->i_mutex_key#9);
> > > >>                                lock(&pipe->mutex/1);
> > > >>                                lock(&sb->s_type->i_mutex_key#9);
> > > >>   lock(cpu_hotplug_lock.rw_sem);
> > > >
> > > > I am quite confused about this report. Where exactly is the deadlock?
> > > > I do not see where we would get the pipe mutex from inside of the
> > > > hotplug lock. Is it possible this is just a false positive due to the
> > > > crossrelease feature?
> > >
> > > As far as I understand, this CPU0/CPU1 scheme works only for simple
> > > cases with 2 mutexes. This one seems to have a larger cycle, as denoted
> > > by the "the existing dependency chain (in reverse order) is:" section.
> >
> > My point was that lru_add_drain_all doesn't take any external locks
> > other than lru_lock, and that one is not anywhere in the chain AFAICS.
> >
> > --
> > Michal Hocko
> > SUSE Labs
>
> --
> Michal Hocko
> SUSE Labs

* Re: possible deadlock in lru_add_drain_all
From: Peter Zijlstra @ 2017-10-30 15:10 UTC
To: Byungchul Park
Cc: Michal Hocko, Dmitry Vyukov, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, jglisse, LKML, linux-mm, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team

On Mon, Oct 30, 2017 at 07:09:21PM +0900, Byungchul Park wrote:
> On Mon, Oct 30, 2017 at 09:22:03AM +0100, Michal Hocko wrote:
> > [Cc Byungchul. The original full report is
> > http://lkml.kernel.org/r/089e0825eec8955c1f055c83d476@google.com]
> >
> > Could you have a look please? This smells like a false positive to me.
>
> +cc peterz@infradead.org
>
> Hello,
>
> IMHO, the false positive was caused by the lockdep_map of 'cpuhp_state',
> which couldn't distinguish between cpu-up and cpu-down.
>
> And it was solved with the following commit by Peter and Thomas:
>
>   5f4b55e10645b7371322c800a5ec745cab487a6c
>   smp/hotplug: Differentiate the AP-work lockdep class between up and down
>
> Therefore, we can avoid the false positive on kernels later than that
> commit.
>
> Peter and Thomas, could you confirm it?

I can indeed confirm it's running old code; cpuhp_state is no more.

However, that splat translates like:

      __cpuhp_setup_state()
#0      cpus_read_lock()
        __cpuhp_setup_state_cpuslocked()
#1        mutex_lock(&cpuhp_state_mutex)


      __cpuhp_state_add_instance()
#2      mutex_lock(&cpuhp_state_mutex)
        cpuhp_issue_call()
          cpuhp_invoke_ap_callback()
#3          wait_for_completion()

                                        msr_device_create()
                                          ...
#4                                          filename_create()
#3                                      complete()


      do_splice()
#4      file_start_write()
        do_splice_from()
          iter_file_splice_write()
#5          pipe_lock()
            vfs_iter_write()
              ...
#6              inode_lock()


      sys_fcntl()
        do_fcntl()
          shmem_fcntl()
#5          inode_lock()
            shmem_wait_for_pins()
              if (!scan)
                lru_add_drain_all()
#0                cpus_read_lock()


Which is an actual real deadlock, there is no mixing of up and down.

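The essential shape of that cycle -- a reader-lock holder waiting on a
completion that can only be signalled after taking a lock that a third
party holds while itself trying to take the reader lock -- can be
modelled in a few dozen lines of userspace C. The sketch below is
illustrative only: pthread mutexes and a condition variable stand in
for cpu_hotplug_lock.rw_sem (#0), the inode lock (#5/#6), and the
completion (#3); none of the names are kernel code, and the sleeps are
only a crude way to force the fatal interleaving.

	/* Minimal userspace model of the three-party cycle above.
	 * Illustrative stand-ins only, not kernel code. */
	#include <pthread.h>
	#include <stdio.h>
	#include <unistd.h>

	static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER; /* #0 */
	static pthread_mutex_t fs_lock = PTHREAD_MUTEX_INITIALIZER;      /* #5/#6 */
	static pthread_mutex_t done_mtx = PTHREAD_MUTEX_INITIALIZER;
	static pthread_cond_t done_cv = PTHREAD_COND_INITIALIZER;        /* #3 */
	static int done;

	static void *setup_state(void *arg)   /* models __cpuhp_setup_state() */
	{
		pthread_mutex_lock(&hotplug_lock);      /* cpus_read_lock() */
		pthread_mutex_lock(&done_mtx);
		while (!done)                           /* wait_for_completion() */
			pthread_cond_wait(&done_cv, &done_mtx);
		pthread_mutex_unlock(&done_mtx);
		pthread_mutex_unlock(&hotplug_lock);
		return NULL;
	}

	static void *ap_callback(void *arg)   /* models the msr_device_create() side */
	{
		pthread_mutex_lock(&fs_lock);           /* filename_create() */
		pthread_mutex_lock(&done_mtx);
		done = 1;
		pthread_cond_signal(&done_cv);          /* complete() */
		pthread_mutex_unlock(&done_mtx);
		pthread_mutex_unlock(&fs_lock);
		return NULL;
	}

	static void *fcntl_path(void *arg)    /* models shmem_fcntl() */
	{
		pthread_mutex_lock(&fs_lock);           /* inode_lock() */
		sleep(2);                               /* let the others queue up */
		pthread_mutex_lock(&hotplug_lock);      /* lru_add_drain_all() ->
							   cpus_read_lock(): blocks */
		pthread_mutex_unlock(&hotplug_lock);
		pthread_mutex_unlock(&fs_lock);
		return NULL;
	}

	int main(void)
	{
		pthread_t t1, t2, t3;

		pthread_create(&t3, NULL, fcntl_path, NULL);  /* grabs fs_lock */
		sleep(1);
		pthread_create(&t1, NULL, setup_state, NULL); /* grabs hotplug_lock */
		pthread_create(&t2, NULL, ap_callback, NULL); /* blocks on fs_lock */
		pthread_join(t1, NULL);                       /* never returns */
		puts("no deadlock");
		return 0;
	}

All three threads end up blocked on each other: t1 waits for a
completion only t2 can signal, t2 waits for the lock t3 holds, and t3
waits for the lock t1 holds.
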
* Re: possible deadlock in lru_add_drain_all
From: Peter Zijlstra @ 2017-10-30 15:22 UTC
To: Byungchul Park
Cc: Michal Hocko, Dmitry Vyukov, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, jglisse, LKML, linux-mm, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team

On Mon, Oct 30, 2017 at 04:10:09PM +0100, Peter Zijlstra wrote:
> I can indeed confirm it's running old code; cpuhp_state is no more.
>
> However, that splat translates like:
>
>       __cpuhp_setup_state()
> #0      cpus_read_lock()
>         __cpuhp_setup_state_cpuslocked()
> #1        mutex_lock(&cpuhp_state_mutex)
>
>
>       __cpuhp_state_add_instance()
> #2      mutex_lock(&cpuhp_state_mutex)
>         cpuhp_issue_call()
>           cpuhp_invoke_ap_callback()
> #3          wait_for_completion()
>
>                                         msr_device_create()
>                                           ...
> #4                                          filename_create()
> #3                                      complete()
>

So all this you can get in a single callchain when you do something
shiny like:

  modprobe msr

>       do_splice()
> #4      file_start_write()
>         do_splice_from()
>           iter_file_splice_write()
> #5          pipe_lock()
>             vfs_iter_write()
>               ...
> #6              inode_lock()
>

This is a splice into a devtmpfs file.

>       sys_fcntl()
>         do_fcntl()
>           shmem_fcntl()
> #5          inode_lock()

 #6 (obviously)

>             shmem_wait_for_pins()
>               if (!scan)
>                 lru_add_drain_all()
> #0                cpus_read_lock()
>

Is the right fcntl().

So 3 different callchains, and *splat*..

* Re: possible deadlock in lru_add_drain_all
From: Michal Hocko @ 2017-10-31 13:13 UTC
To: Peter Zijlstra
Cc: Byungchul Park, Dmitry Vyukov, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, jglisse, LKML, linux-mm, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team

On Mon 30-10-17 16:10:09, Peter Zijlstra wrote:
> On Mon, Oct 30, 2017 at 07:09:21PM +0900, Byungchul Park wrote:
> > On Mon, Oct 30, 2017 at 09:22:03AM +0100, Michal Hocko wrote:
> > > [Cc Byungchul. The original full report is
> > > http://lkml.kernel.org/r/089e0825eec8955c1f055c83d476@google.com]
> > >
> > > Could you have a look please? This smells like a false positive to me.
> >
> > +cc peterz@infradead.org
> >
> > Hello,
> >
> > IMHO, the false positive was caused by the lockdep_map of 'cpuhp_state',
> > which couldn't distinguish between cpu-up and cpu-down.
> >
> > And it was solved with the following commit by Peter and Thomas:
> >
> >   5f4b55e10645b7371322c800a5ec745cab487a6c
> >   smp/hotplug: Differentiate the AP-work lockdep class between up and down
> >
> > Therefore, we can avoid the false positive on kernels later than that
> > commit.
> >
> > Peter and Thomas, could you confirm it?
>
> I can indeed confirm it's running old code; cpuhp_state is no more.

Does this mean the below chain is no longer possible with the current
linux-next (tip)?

> However, that splat translates like:
>
>       __cpuhp_setup_state()
> #0      cpus_read_lock()
>         __cpuhp_setup_state_cpuslocked()
> #1        mutex_lock(&cpuhp_state_mutex)
>
>
>       __cpuhp_state_add_instance()
> #2      mutex_lock(&cpuhp_state_mutex)

this should be #1, right?

>         cpuhp_issue_call()
>           cpuhp_invoke_ap_callback()
> #3          wait_for_completion()
>
>                                         msr_device_create()
>                                           ...
> #4                                          filename_create()
> #3                                      complete()
>
>
>       do_splice()
> #4      file_start_write()
>         do_splice_from()
>           iter_file_splice_write()
> #5          pipe_lock()
>             vfs_iter_write()
>               ...
> #6              inode_lock()
>
>
>       sys_fcntl()
>         do_fcntl()
>           shmem_fcntl()
> #5          inode_lock()
>             shmem_wait_for_pins()
>               if (!scan)
>                 lru_add_drain_all()
> #0                cpus_read_lock()
>
>
> Which is an actual real deadlock, there is no mixing of up and down.

Thanks a lot, this made it much clearer to me. It took a while to
actually see the 0 -> 1 -> 3 -> 4 -> 5 -> 0 cycle. I had only focused
on lru_add_drain_all while it was holding the cpus lock.
--
Michal Hocko
SUSE Labs

* Re: possible deadlock in lru_add_drain_all
From: Peter Zijlstra @ 2017-10-31 13:51 UTC
To: Michal Hocko
Cc: Byungchul Park, Dmitry Vyukov, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, jglisse, LKML, linux-mm, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team

On Tue, Oct 31, 2017 at 02:13:33PM +0100, Michal Hocko wrote:
> On Mon 30-10-17 16:10:09, Peter Zijlstra wrote:
> > However, that splat translates like:
> >
> >       __cpuhp_setup_state()
> > #0      cpus_read_lock()
> >         __cpuhp_setup_state_cpuslocked()
> > #1        mutex_lock(&cpuhp_state_mutex)
> >
> >
> >       __cpuhp_state_add_instance()
> > #2      mutex_lock(&cpuhp_state_mutex)
>
> this should be #1, right?

Yes.

> >         cpuhp_issue_call()
> >           cpuhp_invoke_ap_callback()
> > #3          wait_for_completion()
> >
> >                                         msr_device_create()
> >                                           ...
> > #4                                          filename_create()
> > #3                                      complete()
> >
> >
> >       do_splice()
> > #4      file_start_write()
> >         do_splice_from()
> >           iter_file_splice_write()
> > #5          pipe_lock()
> >             vfs_iter_write()
> >               ...
> > #6              inode_lock()
> >
> >
> >       sys_fcntl()
> >         do_fcntl()
> >           shmem_fcntl()
> > #5          inode_lock()

And that's #6.

> >             shmem_wait_for_pins()
> >               if (!scan)
> >                 lru_add_drain_all()
> > #0                cpus_read_lock()
> >
> >
> > Which is an actual real deadlock, there is no mixing of up and down.
>
> Thanks a lot, this made it much clearer to me. It took a while to
> actually see the 0 -> 1 -> 3 -> 4 -> 5 -> 0 cycle. I had only focused
> on lru_add_drain_all while it was holding the cpus lock.

Yeah, these things are a pain to read, which is why I always construct
something like the above first.

* Re: possible deadlock in lru_add_drain_all
From: Dmitry Vyukov @ 2017-10-31 13:55 UTC
To: Peter Zijlstra
Cc: Michal Hocko, Byungchul Park, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, jglisse, LKML, linux-mm, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team

On Tue, Oct 31, 2017 at 4:51 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Oct 31, 2017 at 02:13:33PM +0100, Michal Hocko wrote:
>> On Mon 30-10-17 16:10:09, Peter Zijlstra wrote:
>> > However, that splat translates like:
>> >
>> >       __cpuhp_setup_state()
>> > #0      cpus_read_lock()
>> >         __cpuhp_setup_state_cpuslocked()
>> > #1        mutex_lock(&cpuhp_state_mutex)
>> >
>> >
>> >       __cpuhp_state_add_instance()
>> > #2      mutex_lock(&cpuhp_state_mutex)
>>
>> this should be #1, right?
>
> Yes.
>
>> >         cpuhp_issue_call()
>> >           cpuhp_invoke_ap_callback()
>> > #3          wait_for_completion()
>> >
>> >                                         msr_device_create()
>> >                                           ...
>> > #4                                          filename_create()
>> > #3                                      complete()
>> >
>> >
>> >       do_splice()
>> > #4      file_start_write()
>> >         do_splice_from()
>> >           iter_file_splice_write()
>> > #5          pipe_lock()
>> >             vfs_iter_write()
>> >               ...
>> > #6              inode_lock()
>> >
>> >
>> >       sys_fcntl()
>> >         do_fcntl()
>> >           shmem_fcntl()
>> > #5          inode_lock()
>
> And that's #6.
>
>> >             shmem_wait_for_pins()
>> >               if (!scan)
>> >                 lru_add_drain_all()
>> > #0                cpus_read_lock()
>> >
>> >
>> > Which is an actual real deadlock, there is no mixing of up and down.
>>
>> Thanks a lot, this made it much clearer to me. It took a while to
>> actually see the 0 -> 1 -> 3 -> 4 -> 5 -> 0 cycle. I had only focused
>> on lru_add_drain_all while it was holding the cpus lock.
>
> Yeah, these things are a pain to read, which is why I always construct
> something like the above first.

I noticed that for a simple 2-lock deadlock lockdep prints only 2
stacks. FWIW in user-space TSAN we print 4 stacks for such deadlocks,
namely where A was locked, where B was locked under A, where B was
locked, and where A was locked under B. It makes it easier to figure
out what happens. However, for this report it would be 8 stacks this
way. So it's probably hard either way.

* Re: possible deadlock in lru_add_drain_all
From: Peter Zijlstra @ 2017-10-31 14:52 UTC
To: Dmitry Vyukov
Cc: Michal Hocko, Byungchul Park, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, jglisse, LKML, linux-mm, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team

On Tue, Oct 31, 2017 at 04:55:32PM +0300, Dmitry Vyukov wrote:
> I noticed that for a simple 2-lock deadlock lockdep prints only 2
> stacks.

3, one of which is useless :-)

For the AB-BA case we print:

  - where we acquire A (#0),
  - where we acquire B while holding A (#1),

and then where we acquire A while holding B -- the unwind at the point
of the splat. The #0 trace is useless.

> FWIW in user-space TSAN we print 4 stacks for such deadlocks,
> namely where A was locked, where B was locked under A, where B was
> locked, and where A was locked under B. It makes it easier to figure
> out what happens. However, for this report it would be 8 stacks this
> way. So it's probably hard either way.

Right, it's a question of performance and overhead I suppose.

Lockdep typically only saves a stack trace when it finds a new link. So
only when we find the A->B relation do we save the stacktrace, which
reflects the location where we acquire B. But by that time we've lost
where it was we acquired A.

If we want to save those stacks, we have to save a stacktrace on _every_
lock acquire, simply because we never know ahead of time if there will
be a new link. Doing this is _expensive_.

Furthermore, the space into which we store stacktraces is limited;
since memory allocators use locks we can't very well use dynamic memory
for lockdep -- that would give recursive and robustness issues.

Also, it's usually not too hard to find the site where we took A if we
know the site of A->B.

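The "save a trace only on a new link" scheme Peter describes can be
sketched in miniature. The structures below are a hypothetical
simplification, nothing like lockdep's real ones; the point is only
that the single unwind happens when the edge is first recorded, so the
site that acquired A itself is never captured anywhere.

	/* Toy model of recording a dependency edge A->B exactly once.
	 * Hypothetical simplification; glibc's backtrace() stands in
	 * for the kernel's stack unwinder. */
	#include <execinfo.h>
	#include <stdio.h>

	#define MAX_EDGES 64
	#define TRACE_DEPTH 8

	struct lock_edge {
		int from, to;             /* lock class ids */
		void *trace[TRACE_DEPTH]; /* where the edge was first seen */
		int depth;
	};

	static struct lock_edge edges[MAX_EDGES];
	static int nr_edges;

	/* Called when lock 'to' is acquired while 'from' is held. */
	static void add_edge(int from, int to)
	{
		for (int i = 0; i < nr_edges; i++)
			if (edges[i].from == from && edges[i].to == to)
				return;   /* known link: no unwind, no storage */

		if (nr_edges == MAX_EDGES)
			return;           /* table full */

		struct lock_edge *e = &edges[nr_edges++];
		e->from = from;
		e->to = to;
		/* The one and only unwind: records where B was taken while
		 * holding A.  Where A itself was taken is already lost. */
		e->depth = backtrace(e->trace, TRACE_DEPTH);
	}

	int main(void)
	{
		add_edge(1, 2);
		add_edge(1, 2);   /* second sighting: no unwind, no storage */
		printf("edges recorded: %d\n", nr_edges);   /* -> 1 */
		return 0;
	}
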
* Re: possible deadlock in lru_add_drain_all
From: Michal Hocko @ 2017-10-31 14:58 UTC
To: Peter Zijlstra
Cc: Dmitry Vyukov, Byungchul Park, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, jglisse, LKML, linux-mm, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team

On Tue 31-10-17 15:52:47, Peter Zijlstra wrote:
[...]
> If we want to save those stacks, we have to save a stacktrace on _every_
> lock acquire, simply because we never know ahead of time if there will
> be a new link. Doing this is _expensive_.
>
> Furthermore, the space into which we store stacktraces is limited;
> since memory allocators use locks we can't very well use dynamic memory
> for lockdep -- that would give recursive and robustness issues.

Wouldn't stackdepot help here? Sure, the first stack unwind will be
costly, but then you amortize that over time. It is quite likely that
locks are held from the same addresses.
--
Michal Hocko
SUSE Labs

* Re: possible deadlock in lru_add_drain_all
From: Peter Zijlstra @ 2017-10-31 15:10 UTC
To: Michal Hocko
Cc: Dmitry Vyukov, Byungchul Park, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, jglisse, LKML, linux-mm, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team

On Tue, Oct 31, 2017 at 03:58:04PM +0100, Michal Hocko wrote:
> On Tue 31-10-17 15:52:47, Peter Zijlstra wrote:
> [...]
> > If we want to save those stacks, we have to save a stacktrace on _every_
> > lock acquire, simply because we never know ahead of time if there will
> > be a new link. Doing this is _expensive_.
> >
> > Furthermore, the space into which we store stacktraces is limited;
> > since memory allocators use locks we can't very well use dynamic memory
> > for lockdep -- that would give recursive and robustness issues.
>
> Wouldn't stackdepot help here? Sure, the first stack unwind will be
> costly, but then you amortize that over time. It is quite likely that
> locks are held from the same addresses.

I'm not familiar with that; but looking at it, no. It uses
alloc_pages(), which has locks in it, and it has a lock itself.

Also, it seems to index the stack based on the entire stacktrace, which
means you actually have to have the stacktrace first. And doing
stacktraces on every single acquire is horrendously expensive.

The idea just saves on storage; it doesn't help with having to do a
gazillion unwinds in the first place.

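For reference, the depot idea amounts to hashing the complete trace and
storing each distinct trace once, handing back a small handle. A
stripped-down sketch (hypothetical, far simpler than the real
lib/stackdepot.c) makes Peter's objection concrete: you must already
have paid for the unwind before you can look anything up.

	/* Stripped-down sketch of trace deduplication by content hash.
	 * Hypothetical; the real stackdepot allocates its pools
	 * dynamically, which is exactly the problem in lockdep context. */
	#include <stdint.h>
	#include <string.h>

	#define DEPOT_SLOTS 1024
	#define TRACE_DEPTH 16

	struct depot_entry {
		uintptr_t trace[TRACE_DEPTH];
		int depth;
		int used;
	};

	static struct depot_entry depot[DEPOT_SLOTS];

	static uint32_t hash_trace(const uintptr_t *trace, int depth)
	{
		uint32_t h = 2166136261u;               /* FNV-1a */
		for (int i = 0; i < depth; i++)
			for (unsigned b = 0; b < sizeof(trace[i]); b++)
				h = (h ^ ((trace[i] >> (8 * b)) & 0xff)) * 16777619u;
		return h;
	}

	/* Returns a stable handle for the trace, storing it on first sight.
	 * Note: the caller must already *have* the trace, i.e. the unwind
	 * has already happened before deduplication can save anything. */
	static int depot_save(const uintptr_t *trace, int depth)
	{
		uint32_t slot = hash_trace(trace, depth) % DEPOT_SLOTS;

		for (uint32_t i = 0; i < DEPOT_SLOTS; i++) {
			struct depot_entry *e = &depot[(slot + i) % DEPOT_SLOTS];
			if (!e->used) {
				memcpy(e->trace, trace, depth * sizeof(*trace));
				e->depth = depth;
				e->used = 1;
				return (slot + i) % DEPOT_SLOTS;
			}
			if (e->depth == depth &&
			    !memcmp(e->trace, trace, depth * sizeof(*trace)))
				return (slot + i) % DEPOT_SLOTS;  /* seen before */
		}
		return -1;                              /* depot full */
	}

	int main(void)
	{
		uintptr_t t[3] = { 0x1, 0x2, 0x3 };
		int h1 = depot_save(t, 3);
		int h2 = depot_save(t, 3);   /* same trace -> same handle */
		return h1 == h2 ? 0 : 1;
	}
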
* Re: possible deadlock in lru_add_drain_all
From: Byungchul Park @ 2017-11-01  8:59 UTC
To: Peter Zijlstra
Cc: Michal Hocko, Dmitry Vyukov, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, jglisse, LKML, linux-mm, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team

On Tue, Oct 31, 2017 at 04:10:24PM +0100, Peter Zijlstra wrote:
> On Tue, Oct 31, 2017 at 03:58:04PM +0100, Michal Hocko wrote:
> > On Tue 31-10-17 15:52:47, Peter Zijlstra wrote:
> > [...]
> > > If we want to save those stacks, we have to save a stacktrace on _every_
> > > lock acquire, simply because we never know ahead of time if there will
> > > be a new link. Doing this is _expensive_.
> > >
> > > Furthermore, the space into which we store stacktraces is limited;
> > > since memory allocators use locks we can't very well use dynamic memory
> > > for lockdep -- that would give recursive and robustness issues.

I agree with all you said.

But I have a better idea, that is, to save only the caller's ip of each
acquisition as additional information. Of course, it's not enough in
some cases, but it's cheap and better than doing nothing.

For example, when building A->B, let's save not only the full stack of
B, but also the caller's ip of A together, then use them on a warning
like:

   -> #3 aa_mutex:
      a()
      b()
      c()
      d()
      --- while holding bb_mutex at $IP <- additional information I said

   -> #2 bb_mutex:
      e()
      f()
      g()
      h()
      --- while holding cc_mutex at $IP <- additional information I said

   -> #1 cc_mutex:
      i()
      j()
      k()
      l()
      --- while holding xxx at $IP <- additional information I said

and so on.

Don't you think this is worth working on?

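Byungchul's proposal costs one extra word per held lock. A hedged
userspace sketch of the capture side is below; the helper names are
hypothetical and not from any real lockdep patch, and
__builtin_return_address(0) plays the role of the kernel's _RET_IP_.

	/* Sketch of the "one caller IP per acquisition" idea: alongside
	 * each held lock, record only the return address of the acquiring
	 * call site.  Hypothetical helpers, not a real lockdep patch. */
	#include <stdio.h>

	#define MAX_HELD 48

	static struct {
		const char *name;
		void *acquire_ip;       /* the single extra word */
	} held[MAX_HELD];
	static int nr_held;

	__attribute__((noinline))       /* keep the return address meaningful */
	static void lock_acquire(const char *name)
	{
		void *ip = __builtin_return_address(0); /* the locking site */

		if (nr_held < MAX_HELD) {
			held[nr_held].name = name;
			held[nr_held].acquire_ip = ip;
			nr_held++;
		}
		/* ... actually take the lock here ... */
	}

	/* On a splat, each report block could then append the proposed
	 * "--- while holding <lock> at $IP" annotation: */
	static void dump_held(void)
	{
		for (int i = 0; i < nr_held; i++)
			printf("--- while holding %s at %p\n",
			       held[i].name, held[i].acquire_ip);
	}

	int main(void)
	{
		lock_acquire("aa_mutex");
		lock_acquire("bb_mutex");
		dump_held();
		return 0;
	}
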
* Re: possible deadlock in lru_add_drain_all
From: Peter Zijlstra @ 2017-11-01 12:01 UTC
To: Byungchul Park
Cc: Michal Hocko, Dmitry Vyukov, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, jglisse, LKML, linux-mm, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team

On Wed, Nov 01, 2017 at 05:59:27PM +0900, Byungchul Park wrote:
> On Tue, Oct 31, 2017 at 04:10:24PM +0100, Peter Zijlstra wrote:
> > On Tue, Oct 31, 2017 at 03:58:04PM +0100, Michal Hocko wrote:
> > > On Tue 31-10-17 15:52:47, Peter Zijlstra wrote:
> > > [...]
> > > > If we want to save those stacks, we have to save a stacktrace on _every_
> > > > lock acquire, simply because we never know ahead of time if there will
> > > > be a new link. Doing this is _expensive_.
> > > >
> > > > Furthermore, the space into which we store stacktraces is limited;
> > > > since memory allocators use locks we can't very well use dynamic memory
> > > > for lockdep -- that would give recursive and robustness issues.
>
> I agree with all you said.
>
> But I have a better idea, that is, to save only the caller's ip of each
> acquisition as additional information. Of course, it's not enough in
> some cases, but it's cheap and better than doing nothing.
>
> For example, when building A->B, let's save not only the full stack of
> B, but also the caller's ip of A together, then use them on a warning
> like:

Like I said; I've never really had trouble finding where we take A. And
for the most difficult cases, just the IP isn't too useful either.

So that would solve a non-problem while leaving the real problem.

* Re: possible deadlock in lru_add_drain_all
From: Byungchul Park @ 2017-11-01 23:54 UTC
To: Peter Zijlstra
Cc: Michal Hocko, Dmitry Vyukov, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, jglisse, LKML, linux-mm, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team

On Wed, Nov 01, 2017 at 01:01:01PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 01, 2017 at 05:59:27PM +0900, Byungchul Park wrote:
> > On Tue, Oct 31, 2017 at 04:10:24PM +0100, Peter Zijlstra wrote:
> > > On Tue, Oct 31, 2017 at 03:58:04PM +0100, Michal Hocko wrote:
> > > > On Tue 31-10-17 15:52:47, Peter Zijlstra wrote:
> > > > [...]
> > > > > If we want to save those stacks, we have to save a stacktrace on _every_
> > > > > lock acquire, simply because we never know ahead of time if there will
> > > > > be a new link. Doing this is _expensive_.
> > > > >
> > > > > Furthermore, the space into which we store stacktraces is limited;
> > > > > since memory allocators use locks we can't very well use dynamic memory
> > > > > for lockdep -- that would give recursive and robustness issues.
> >
> > I agree with all you said.
> >
> > But I have a better idea, that is, to save only the caller's ip of each
> > acquisition as additional information. Of course, it's not enough in
> > some cases, but it's cheap and better than doing nothing.
> >
> > For example, when building A->B, let's save not only the full stack of
> > B, but also the caller's ip of A together, then use them on a warning
> > like:
>
> Like I said; I've never really had trouble finding where we take A. And

Me neither, since I know the way. But I've seen many guys who got
confused by it, which is why I suggested it.

But leave it if you don't think so.

> for the most difficult cases, just the IP isn't too useful either.
>
> So that would solve a non-problem while leaving the real problem.

* Re: possible deadlock in lru_add_drain_all
From: Dmitry Vyukov @ 2018-02-14 14:01 UTC
To: Byungchul Park
Cc: Peter Zijlstra, Michal Hocko, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, Jerome Glisse, LKML, Linux-MM, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team

On Thu, Nov 2, 2017 at 12:54 AM, Byungchul Park <byungchul.park@lge.com> wrote:
> On Wed, Nov 01, 2017 at 01:01:01PM +0100, Peter Zijlstra wrote:
>> On Wed, Nov 01, 2017 at 05:59:27PM +0900, Byungchul Park wrote:
>> > On Tue, Oct 31, 2017 at 04:10:24PM +0100, Peter Zijlstra wrote:
>> > > On Tue, Oct 31, 2017 at 03:58:04PM +0100, Michal Hocko wrote:
>> > > > On Tue 31-10-17 15:52:47, Peter Zijlstra wrote:
>> > > > [...]
>> > > > > If we want to save those stacks, we have to save a stacktrace on _every_
>> > > > > lock acquire, simply because we never know ahead of time if there will
>> > > > > be a new link. Doing this is _expensive_.
>> > > > >
>> > > > > Furthermore, the space into which we store stacktraces is limited;
>> > > > > since memory allocators use locks we can't very well use dynamic memory
>> > > > > for lockdep -- that would give recursive and robustness issues.
>> >
>> > I agree with all you said.
>> >
>> > But I have a better idea, that is, to save only the caller's ip of each
>> > acquisition as additional information. Of course, it's not enough in
>> > some cases, but it's cheap and better than doing nothing.
>> >
>> > For example, when building A->B, let's save not only the full stack of
>> > B, but also the caller's ip of A together, then use them on a warning
>> > like:
>>
>> Like I said; I've never really had trouble finding where we take A. And
>
> Me neither, since I know the way. But I've seen many guys who got
> confused by it, which is why I suggested it.
>
> But leave it if you don't think so.
>
>> for the most difficult cases, just the IP isn't too useful either.
>>
>> So that would solve a non-problem while leaving the real problem.

Hi,

What's the status of this? Was any patch submitted for this?

* Re: possible deadlock in lru_add_drain_all
From: Michal Hocko @ 2018-02-14 15:44 UTC
To: Dmitry Vyukov
Cc: Byungchul Park, Peter Zijlstra, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, Jerome Glisse, LKML, Linux-MM, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team

On Wed 14-02-18 15:01:38, Dmitry Vyukov wrote:
> On Thu, Nov 2, 2017 at 12:54 AM, Byungchul Park <byungchul.park@lge.com> wrote:
> > On Wed, Nov 01, 2017 at 01:01:01PM +0100, Peter Zijlstra wrote:
> >> On Wed, Nov 01, 2017 at 05:59:27PM +0900, Byungchul Park wrote:
> >> > On Tue, Oct 31, 2017 at 04:10:24PM +0100, Peter Zijlstra wrote:
> >> > > On Tue, Oct 31, 2017 at 03:58:04PM +0100, Michal Hocko wrote:
> >> > > > On Tue 31-10-17 15:52:47, Peter Zijlstra wrote:
> >> > > > [...]
> >> > > > > If we want to save those stacks, we have to save a stacktrace on _every_
> >> > > > > lock acquire, simply because we never know ahead of time if there will
> >> > > > > be a new link. Doing this is _expensive_.
> >> > > > >
> >> > > > > Furthermore, the space into which we store stacktraces is limited;
> >> > > > > since memory allocators use locks we can't very well use dynamic memory
> >> > > > > for lockdep -- that would give recursive and robustness issues.
> >> >
> >> > I agree with all you said.
> >> >
> >> > But I have a better idea, that is, to save only the caller's ip of each
> >> > acquisition as additional information. Of course, it's not enough in
> >> > some cases, but it's cheap and better than doing nothing.
> >> >
> >> > For example, when building A->B, let's save not only the full stack of
> >> > B, but also the caller's ip of A together, then use them on a warning
> >> > like:
> >>
> >> Like I said; I've never really had trouble finding where we take A. And
> >
> > Me neither, since I know the way. But I've seen many guys who got
> > confused by it, which is why I suggested it.
> >
> > But leave it if you don't think so.
> >
> >> for the most difficult cases, just the IP isn't too useful either.
> >>
> >> So that would solve a non-problem while leaving the real problem.
>
> Hi,
>
> What's the status of this? Was any patch submitted for this?

This: http://lkml.kernel.org/r/20171116120535.23765-1-mhocko@kernel.org?
--
Michal Hocko
SUSE Labs

* Re: possible deadlock in lru_add_drain_all
From: Dmitry Vyukov @ 2018-02-14 15:57 UTC
To: Michal Hocko
Cc: Byungchul Park, Peter Zijlstra, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, Jerome Glisse, LKML, Linux-MM, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team

On Wed, Feb 14, 2018 at 4:44 PM, Michal Hocko <mhocko@kernel.org> wrote:
>> >> > > > [...]
>> >> > > > > If we want to save those stacks, we have to save a stacktrace on _every_
>> >> > > > > lock acquire, simply because we never know ahead of time if there will
>> >> > > > > be a new link. Doing this is _expensive_.
>> >> > > > >
>> >> > > > > Furthermore, the space into which we store stacktraces is limited;
>> >> > > > > since memory allocators use locks we can't very well use dynamic memory
>> >> > > > > for lockdep -- that would give recursive and robustness issues.
>> >> >
>> >> > I agree with all you said.
>> >> >
>> >> > But I have a better idea, that is, to save only the caller's ip of each
>> >> > acquisition as additional information. Of course, it's not enough in
>> >> > some cases, but it's cheap and better than doing nothing.
>> >> >
>> >> > For example, when building A->B, let's save not only the full stack of
>> >> > B, but also the caller's ip of A together, then use them on a warning
>> >> > like:
>> >>
>> >> Like I said; I've never really had trouble finding where we take A. And
>> >
>> > Me neither, since I know the way. But I've seen many guys who got
>> > confused by it, which is why I suggested it.
>> >
>> > But leave it if you don't think so.
>> >
>> >> for the most difficult cases, just the IP isn't too useful either.
>> >>
>> >> So that would solve a non-problem while leaving the real problem.
>>
>> Hi,
>>
>> What's the status of this? Was any patch submitted for this?
>
> This: http://lkml.kernel.org/r/20171116120535.23765-1-mhocko@kernel.org?

Thanks!

Let's tell syzbot:

#syz fix: mm: drop hotplug lock from lru_add_drain_all()

* Re: possible deadlock in lru_add_drain_all
From: Peter Zijlstra @ 2017-10-31 15:25 UTC
To: Michal Hocko
Cc: Byungchul Park, Dmitry Vyukov, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, jglisse, LKML, linux-mm, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team

On Tue, Oct 31, 2017 at 02:13:33PM +0100, Michal Hocko wrote:
> > I can indeed confirm it's running old code; cpuhp_state is no more.
>
> Does this mean the below chain is no longer possible with the current
> linux-next (tip)?

I see I failed to answer this; no, it will still happen, but it now
reads like:

  s/cpuhp_state/&_up/

Where we used to have a single lock protecting the hotplug stuff, we
now have 2: one for bringing stuff up and one for tearing it down.

This got rid of lock cycles that included cpu-up and cpu-down parts;
those are false positives because we cannot do cpu-up and cpu-down
concurrently.

But this report only includes a single (cpu-up) part and therefore is
not affected by that change, other than a lock name changing.

* Re: possible deadlock in lru_add_drain_all
From: Michal Hocko @ 2017-10-31 15:45 UTC
To: Peter Zijlstra
Cc: Byungchul Park, Dmitry Vyukov, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, jglisse, LKML, linux-mm, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team, David Herrmann

[CC David Herrmann for shmem_wait_for_pins. The thread starts at
http://lkml.kernel.org/r/089e0825eec8955c1f055c83d476@google.com,
with the callchains explained at
http://lkml.kernel.org/r/20171030151009.ip4k7nwan7muouca@hirez.programming.kicks-ass.net;
for the shmem_wait_for_pins involvement see below.]

On Tue 31-10-17 16:25:32, Peter Zijlstra wrote:
> On Tue, Oct 31, 2017 at 02:13:33PM +0100, Michal Hocko wrote:
> > > I can indeed confirm it's running old code; cpuhp_state is no more.
> >
> > Does this mean the below chain is no longer possible with the current
> > linux-next (tip)?
>
> I see I failed to answer this; no, it will still happen, but it now
> reads like:
>
>   s/cpuhp_state/&_up/
>
> Where we used to have a single lock protecting the hotplug stuff, we
> now have 2: one for bringing stuff up and one for tearing it down.
>
> This got rid of lock cycles that included cpu-up and cpu-down parts;
> those are false positives because we cannot do cpu-up and cpu-down
> concurrently.
>
> But this report only includes a single (cpu-up) part and therefore is
> not affected by that change, other than a lock name changing.

Hmm, OK. I have quickly glanced through shmem_wait_for_pins and I fail
to see why it needs lru_add_drain_all at all. All we should care about
is the radix tree, and the lru cache only cares about proper placement
on the LRU list, which is not checked here. I might be missing
something subtle, though. David?

We've had some MM vs. hotplug issues. See e.g. a459eeb7b852 ("mm,
page_alloc: do not depend on cpu hotplug locks inside the allocator"),
so I suspect we might want/need to do something similar for
lru_add_drain_all. It feels like I've already worked on that, but for
the life of me I cannot remember.

Anyway, this lock dependency is subtle as hell and I am worried that we
might have way too many of those. We have so many callers of
get_online_cpus that dependencies like this are just waiting to blow
up.
--
Michal Hocko
SUSE Labs

* Re: possible deadlock in lru_add_drain_all
From: Peter Zijlstra @ 2017-10-31 16:30 UTC
To: Michal Hocko
Cc: Byungchul Park, Dmitry Vyukov, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, jglisse, LKML, linux-mm, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team, David Herrmann

On Tue, Oct 31, 2017 at 04:45:46PM +0100, Michal Hocko wrote:
> Anyway, this lock dependency is subtle as hell and I am worried that we
> might have way too many of those. We have so many callers of
> get_online_cpus that dependencies like this are just waiting to blow
> up.

Yes, the filesystem-locks-inside-hotplug thing is totally annoying.
I've got a few other splats that contain a similar theme and I've no
real clue what to do about them. See for instance this one:

  https://lkml.kernel.org/r/20171027151137.GC3165@worktop.lehotels.local

Splice from devtmpfs is another common theme and links in the
pipe->mutex, which then makes any other splice op invert against that
hotplug crap :/

I'm sure I've suggested simply creating possible_cpus devtmpfs files up
front to get around this... maybe we should just do that.

* Re: possible deadlock in lru_add_drain_all
From: Byungchul Park @ 2017-11-01  8:31 UTC
To: Peter Zijlstra
Cc: Michal Hocko, Dmitry Vyukov, syzbot, Andrew Morton, Dan Williams,
    Johannes Weiner, Jan Kara, jglisse, LKML, linux-mm, shli,
    syzkaller-bugs, Thomas Gleixner, Vlastimil Babka, ying.huang,
    kernel-team

On Tue, Oct 31, 2017 at 04:25:32PM +0100, Peter Zijlstra wrote:
> But this report only includes a single (cpu-up) part and therefore is

Thanks for correcting me, Peter.

I thought '#1 -> #2' and '#2 -> #3', where #2 is 'cpuhp_state', should
have been built with two different classes of #2 in the latest code.

Sorry for confusing Michal.

> not affected by that change, other than a lock name changing.

* Re: possible deadlock in lru_add_drain_all
From: Byungchul Park @ 2017-10-30 10:26 UTC
To: Michal Hocko
Cc: Dmitry Vyukov, syzbot, Andrew Morton, Dan Williams, Johannes Weiner,
    Jan Kara, jglisse, LKML, linux-mm, shli, syzkaller-bugs,
    Thomas Gleixner, Vlastimil Babka, ying.huang, kernel-team

On Mon, Oct 30, 2017 at 09:22:03AM +0100, Michal Hocko wrote:
> [Cc Byungchul. The original full report is
> http://lkml.kernel.org/r/089e0825eec8955c1f055c83d476@google.com]
>
> Could you have a look please? This smells like a false positive to me.
>
> On Fri 27-10-17 15:42:34, Michal Hocko wrote:
> > On Fri 27-10-17 11:44:58, Dmitry Vyukov wrote:
> > > On Fri, Oct 27, 2017 at 11:34 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > > > On Fri 27-10-17 02:22:40, syzbot wrote:
> > > >> Hello,
> > > >>
> > > >> syzkaller hit the following crash on
> > > >> a31cc455c512f3f1dd5f79cac8e29a7c8a617af8
> > > >> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> > > >> compiler: gcc (GCC) 7.1.1 20170620
> > > >> .config is attached
> > > >> Raw console output is attached.
> > > >
> > > > I do not see such a commit. My linux-next top is next-20171018.
> > > >
> > > > [...]
> > > >> Chain exists of:
> > > >>   cpu_hotplug_lock.rw_sem --> &pipe->mutex/1 --> &sb->s_type->i_mutex_key#9
> > > >>
> > > >>  Possible unsafe locking scenario:
> > > >>
> > > >>        CPU0                    CPU1
> > > >>        ----                    ----
> > > >>   lock(&sb->s_type->i_mutex_key#9);
> > > >>                                lock(&pipe->mutex/1);
> > > >>                                lock(&sb->s_type->i_mutex_key#9);
> > > >>   lock(cpu_hotplug_lock.rw_sem);
> > > >
> > > > I am quite confused about this report. Where exactly is the deadlock?
> > > > I do not see where we would get the pipe mutex from inside of the
> > > > hotplug lock. Is it possible this is just a false positive due to the
> > > > crossrelease feature?
> > >
> > > As far as I understand, this CPU0/CPU1 scheme works only for simple
> > > cases with 2 mutexes. This one seems to have a larger cycle, as denoted
> > > by the "the existing dependency chain (in reverse order) is:" section.
> >
> > My point was that lru_add_drain_all doesn't take any external locks
> > other than lru_lock and that one is not anywhere in the chain AFAICS.

I think lru_add_drain_all() takes cpu_hotplug_lock.rw_sem implicitly in
get_online_cpus(), which appears in the chain.

Thanks,
Byungchul

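For context, lru_add_drain_all() in kernels of that era had roughly the
shape below (abridged from memory; check mm/swap.c in the affected tree
for the exact code). The hotplug read lock enters the picture through
get_online_cpus(), not through any lock named in mm/ itself, which is
why it appears in the chain even though the function "takes no external
locks":

	/* Abridged, not a verbatim quote of mm/swap.c of that vintage. */
	void lru_add_drain_all(void)
	{
		get_online_cpus();               /* takes cpu_hotplug_lock.rw_sem */
		lru_add_drain_all_cpuslocked();  /* drains the per-cpu pagevecs */
		put_online_cpus();
	}

The eventual fix referenced at the end of this thread ("mm: drop
hotplug lock from lru_add_drain_all()") removed exactly that
get_online_cpus()/put_online_cpus() pair.
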
* Re: possible deadlock in lru_add_drain_all
From: Michal Hocko @ 2017-10-30 11:48 UTC
To: Byungchul Park
Cc: Dmitry Vyukov, syzbot, Andrew Morton, Dan Williams, Johannes Weiner,
    Jan Kara, jglisse, LKML, linux-mm, shli, syzkaller-bugs,
    Thomas Gleixner, Vlastimil Babka, ying.huang, kernel-team

On Mon 30-10-17 19:26:19, Byungchul Park wrote:
> On Mon, Oct 30, 2017 at 09:22:03AM +0100, Michal Hocko wrote:
> > [Cc Byungchul. The original full report is
> > http://lkml.kernel.org/r/089e0825eec8955c1f055c83d476@google.com]
> >
> > Could you have a look please? This smells like a false positive to me.
> >
> > On Fri 27-10-17 15:42:34, Michal Hocko wrote:
> > > On Fri 27-10-17 11:44:58, Dmitry Vyukov wrote:
> > > > On Fri, Oct 27, 2017 at 11:34 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > > > > On Fri 27-10-17 02:22:40, syzbot wrote:
> > > > >> Hello,
> > > > >>
> > > > >> syzkaller hit the following crash on
> > > > >> a31cc455c512f3f1dd5f79cac8e29a7c8a617af8
> > > > >> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> > > > >> compiler: gcc (GCC) 7.1.1 20170620
> > > > >> .config is attached
> > > > >> Raw console output is attached.
> > > > >
> > > > > I do not see such a commit. My linux-next top is next-20171018.
> > > > >
> > > > > [...]
> > > > >> Chain exists of:
> > > > >>   cpu_hotplug_lock.rw_sem --> &pipe->mutex/1 --> &sb->s_type->i_mutex_key#9
> > > > >>
> > > > >>  Possible unsafe locking scenario:
> > > > >>
> > > > >>        CPU0                    CPU1
> > > > >>        ----                    ----
> > > > >>   lock(&sb->s_type->i_mutex_key#9);
> > > > >>                                lock(&pipe->mutex/1);
> > > > >>                                lock(&sb->s_type->i_mutex_key#9);
> > > > >>   lock(cpu_hotplug_lock.rw_sem);
> > > > >
> > > > > I am quite confused about this report. Where exactly is the deadlock?
> > > > > I do not see where we would get the pipe mutex from inside of the
> > > > > hotplug lock. Is it possible this is just a false positive due to the
> > > > > crossrelease feature?
> > > >
> > > > As far as I understand, this CPU0/CPU1 scheme works only for simple
> > > > cases with 2 mutexes. This one seems to have a larger cycle, as denoted
> > > > by the "the existing dependency chain (in reverse order) is:" section.
> > >
> > > My point was that lru_add_drain_all doesn't take any external locks
> > > other than lru_lock and that one is not anywhere in the chain AFAICS.
>
> I think lru_add_drain_all() takes cpu_hotplug_lock.rw_sem implicitly in
> get_online_cpus(), which appears in the chain.

Yes, but it doesn't take any _other_ locks which are externally visible
except for lru_lock, and that itself doesn't provide any further
dependency AFAICS. So what exactly is the deadlock scenario?
--
Michal Hocko
SUSE Labs

* Re: possible deadlock in lru_add_drain_all
From: Vlastimil Babka @ 2017-10-27 11:27 UTC
To: Michal Hocko, syzbot
Cc: akpm, dan.j.williams, hannes, jack, jglisse, linux-kernel, linux-mm,
    shli, syzkaller-bugs, tglx, ying.huang

On 10/27/2017 11:34 AM, Michal Hocko wrote:
> On Fri 27-10-17 02:22:40, syzbot wrote:
>> Hello,
>>
>> syzkaller hit the following crash on
>> a31cc455c512f3f1dd5f79cac8e29a7c8a617af8
>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>
> I do not see such a commit. My linux-next top is next-20171018.

It's the next-20170911 tag. Try git fetch --tags, but I'm not sure how
many are archived...
