* Dead lock condition occured ipanic during register_netdevice_notifier call in 4.9.102 <Draft>
From: Mansoor, Illyas @ 2018-05-31  8:43 UTC
  To: ktkhai, davem, netdev
  Cc: Laxminarayan Bharadiya, Pankaj, Feng, Fleming, Li, Lili, Zhang,
	Baoli, Pan, Kris, Xia, Hui, Mei, Paul

Hi Tkhai/David,

We are facing a mutex deadlock that we think might be related to the fix you provided in:
Merge branch 'Close-race-between-un-register_netdevice_notifier-and-pernet_operations' commit b9a12601541eb55d07e00261a5112a4bc36fe7be

We tried to backport the patch series, but got stuck because its dependencies are not met in the 4.9.102 kernel.
Could you please provide some pointers so that we can fix this in the 4.9.y kernel?

We would appreciate any help or pointers on this one.

The ipanic logs are pasted below:

<3>[ 6513.681473] INFO: task sensors@1.0-ser:2744 blocked for more than 120 seconds.
<3>[ 6513.689723]       Tainted: P     U  W  O    4.9.102-quilt-2e5dc0ac-07850-g222b9655589b #1
<3>[ 6513.699108] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<6>[ 6513.707997] sensors@1.0-ser D    0  2744      1 0x00000000
<4>[ 6513.708007]  ffff880223f38040 ffff88027fc980c0 0000000000000000 ffff880271987000
<4>[ 6513.708024]  ffff88026f9ae040 ffffc90000d57d40 ffffffff81b363d1 ffffffff81396e0b
<4>[ 6513.708032]  00ffc90000d57d20 ffff88027fc980c0 ffffc90000d57d90 ffff88026f9ae040
<4>[ 6513.708040] Call Trace:
<4>[ 6513.708056]  [<ffffffff81b363d1>] ? __schedule+0x221/0x6e0
<4>[ 6513.708063]  [<ffffffff81396e0b>] ? sidtab_context_to_sid+0x39b/0x410
<4>[ 6513.708068]  [<ffffffff81b368c6>] schedule+0x36/0x90
<4>[ 6513.708072]  [<ffffffff81b36d18>] schedule_preempt_disabled+0x18/0x30
<4>[ 6513.708078]  [<ffffffff81b39a25>] __mutex_lock_slowpath+0x185/0x3f0
<4>[ 6513.708083]  [<ffffffff81b39cb5>] mutex_lock+0x25/0x30
<4>[ 6513.708089]  [<ffffffff81993fa5>] rtnl_lock+0x15/0x20
<4>[ 6513.708095]  [<ffffffff8197d29d>] register_netdevice_notifier+0x2d/0x200
<4>[ 6513.708107]  [<ffffffff81ad64db>] raw_init+0x8b/0x90
<4>[ 6513.708118]  [<ffffffff81ad52e1>] can_create+0xe1/0x1c0
<4>[ 6513.708129]  [<ffffffff819645fe>] __sock_create+0x12e/0x210
<4>[ 6513.708141]  [<ffffffff81965fe5>] SyS_socket+0x55/0xb0
<4>[ 6513.708156]  [<ffffffff81001fca>] do_syscall_64+0x6a/0xe0
<4>[ 6513.708166]  [<ffffffff81b3dd20>] entry_SYSCALL_64_after_swapgs+0x5d/0xd7
<4>[ 6513.708171] NMI backtrace for cpu 2
<4>[ 6513.708178] CPU: 2 PID: 482 Comm: khungtaskd Tainted: P     U  W  O    4.9.102-quilt-2e5dc0ac-07850-g222b9655589b #1
<4>[ 6513.708180]  ffffc90000eafdd0 ffffffff813f56bc 0000000000000000 0000000000000000
<4>[ 6513.708188]  ffffc90000eafe00 ffffffff813f9fe1 0000000000000002 0000000000000000
<4>[ 6513.708195]  ffffffff81042d80 ffffffff826120f8 ffffc90000eafe30 ffffffff813fa0a3

Thanks & Regards,
Illyas


* Re: Dead lock condition occured ipanic during register_netdevice_notifier call in 4.9.102 <Draft>
From: Kirill Tkhai @ 2018-05-31  9:21 UTC
  To: Mansoor, Illyas, davem, netdev
  Cc: Laxminarayan Bharadiya, Pankaj, Feng, Fleming, Li, Lili, Zhang,
	Baoli, Pan, Kris, Xia, Hui, Mei, Paul

Hi, Illyas,

On 31.05.2018 11:43, Mansoor, Illyas wrote:
> We are facing a mutex deadlock that we think might be related to the fix you provided in:
> Merge branch 'Close-race-between-un-register_netdevice_notifier-and-pernet_operations' commit b9a12601541eb55d07e00261a5112a4bc36fe7be
> 
> We tried to backport the patch series, but got stuck because its dependencies are not met in the 4.9.102 kernel.
> Could you please provide some pointers so that we can fix this in the 4.9.y kernel?
> 
> We would appreciate any help or pointers on this one.
> 
> The ipanic logs are pasted below:
> 
> <3>[ 6513.681473] INFO: task sensors@1.0-ser:2744 blocked for more than 120 seconds.
> <3>[ 6513.689723]       Tainted: P     U  W  O    4.9.102-quilt-2e5dc0ac-07850-g222b9655589b #1
> <3>[ 6513.699108] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> <6>[ 6513.707997] sensors@1.0-ser D    0  2744      1 0x00000000
> <4>[ 6513.708007]  ffff880223f38040 ffff88027fc980c0 0000000000000000 ffff880271987000
> <4>[ 6513.708024]  ffff88026f9ae040 ffffc90000d57d40 ffffffff81b363d1 ffffffff81396e0b
> <4>[ 6513.708032]  00ffc90000d57d20 ffff88027fc980c0 ffffc90000d57d90 ffff88026f9ae040
> <4>[ 6513.708040] Call Trace:
> <4>[ 6513.708056]  [<ffffffff81b363d1>] ? __schedule+0x221/0x6e0
> <4>[ 6513.708063]  [<ffffffff81396e0b>] ? sidtab_context_to_sid+0x39b/0x410
> <4>[ 6513.708068]  [<ffffffff81b368c6>] schedule+0x36/0x90
> <4>[ 6513.708072]  [<ffffffff81b36d18>] schedule_preempt_disabled+0x18/0x30
> <4>[ 6513.708078]  [<ffffffff81b39a25>] __mutex_lock_slowpath+0x185/0x3f0
> <4>[ 6513.708083]  [<ffffffff81b39cb5>] mutex_lock+0x25/0x30
> <4>[ 6513.708089]  [<ffffffff81993fa5>] rtnl_lock+0x15/0x20
> <4>[ 6513.708095]  [<ffffffff8197d29d>] register_netdevice_notifier+0x2d/0x200
> <4>[ 6513.708107]  [<ffffffff81ad64db>] raw_init+0x8b/0x90
> <4>[ 6513.708118]  [<ffffffff81ad52e1>] can_create+0xe1/0x1c0
> <4>[ 6513.708129]  [<ffffffff819645fe>] __sock_create+0x12e/0x210
> <4>[ 6513.708141]  [<ffffffff81965fe5>] SyS_socket+0x55/0xb0
> <4>[ 6513.708156]  [<ffffffff81001fca>] do_syscall_64+0x6a/0xe0
> <4>[ 6513.708166]  [<ffffffff81b3dd20>] entry_SYSCALL_64_after_swapgs+0x5d/0xd7
> <4>[ 6513.708171] NMI backtrace for cpu 2
> <4>[ 6513.708178] CPU: 2 PID: 482 Comm: khungtaskd Tainted: P     U  W  O    4.9.102-quilt-2e5dc0ac-07850-g222b9655589b #1
> <4>[ 6513.708180]  ffffc90000eafdd0 ffffffff813f56bc 0000000000000000 0000000000000000
> <4>[ 6513.708188]  ffffc90000eafe00 ffffffff813f9fe1 0000000000000002 0000000000000000
> <4>[ 6513.708195]  ffffffff81042d80 ffffffff826120f8 ffffc90000eafe30 ffffffff813fa0a3

1) I'm not sure commit b9a12601541eb55d07e00261a5112a4bc36fe7be will help here, because this
stack looks to me like someone simply is not releasing the mutex. As a first step, you could
try to find out who actually owns it.

2) Also, note that rtnl_is_locked() is used incorrectly in one driver there
(see WILC_WFI_deinit_mon_interface()), so it may also introduce an imbalance
(if you use that driver).
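
To illustrate both points, here is a minimal user-space sketch in C. It is
not kernel code and not the driver's actual source. It only models the
general hazard of treating an "is the mutex locked?" check as "do I hold the
mutex?", since such a check is true whenever any task holds the lock. All
names (model_mutex, buggy_enter, buggy_exit, simulate_imbalance, the task
ids) are hypothetical, and the interleaving shown is just one possible
ordering, assumed for illustration.

```c
/* User-space model only: NOT kernel code. All names here are hypothetical. */
#include <stdbool.h>

#define NO_OWNER (-1)

/* Stand-in for rtnl_mutex that also remembers which task took it. */
struct model_mutex {
	bool locked;
	int owner;        /* task id of the holder, or NO_OWNER */
	int bad_unlocks;  /* unlocks that hit an already-free mutex */
};

/* Like an "is locked" query: true if ANY task holds the mutex. */
static bool model_is_locked(const struct model_mutex *m)
{
	return m->locked;
}

/* Non-blocking lock; returns false where a real mutex_lock() would sleep. */
static bool model_trylock(struct model_mutex *m, int task)
{
	if (m->locked)
		return false;
	m->locked = true;
	m->owner = task;
	return true;
}

/* Unlock with no ownership check (assumed behaviour of this model). */
static void model_unlock(struct model_mutex *m)
{
	if (!m->locked)
		m->bad_unlocks++;
	m->locked = false;
	m->owner = NO_OWNER;
}

/* Buggy pattern, first half: "if it is locked, it must be mine". */
static bool buggy_enter(struct model_mutex *m)
{
	if (model_is_locked(m)) {
		model_unlock(m);  /* may drop a lock held by another task */
		return true;      /* remember to "restore" it later */
	}
	return false;
}

/* Buggy pattern, second half: re-take the lock we assumed we had. */
static void buggy_exit(struct model_mutex *m, int task, bool rollback)
{
	if (rollback)
		model_trylock(m, task);
}

/*
 * One possible interleaving. Task 1 legitimately holds the lock; task 2
 * runs the buggy path although it never took the lock. Returns the id of
 * the task left owning the mutex (NO_OWNER if it ends up free).
 */
static int simulate_imbalance(struct model_mutex *m)
{
	bool rollback;

	model_trylock(m, 1);         /* task 1: lock                     */
	rollback = buggy_enter(m);   /* task 2: drops task 1's lock      */
	model_unlock(m);             /* task 1: its normal unlock        */
	buggy_exit(m, 2, rollback);  /* task 2: takes the lock           */
	/* task 2's caller never locked, so nothing will unlock again */
	return m->owner;
}
```

In this model the mutex ends up owned by task 2 with nobody left to release
it, so a later model_trylock() from any other task fails. That corresponds
to a waiter sleeping in rtnl_lock() indefinitely, as in the hung-task report.
The owner field is what makes the culprit visible here; whether a real
kernel mutex records its owner depends on the kernel version and
configuration, so treat that part as an assumption to verify on your tree.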

Kirill


* Re: Dead lock condition occured ipanic during register_netdevice_notifier call in 4.9.102 <Draft>
From: Bharadiya,Pankaj @ 2018-05-31  9:26 UTC
  To: Kirill Tkhai
  Cc: Mansoor, Illyas, davem, netdev, Feng, Fleming, Li, Lili, Zhang,
	Baoli, Pan, Kris, Xia, Hui, Mei, Paul

On Thu, May 31, 2018 at 12:21:31PM +0300, Kirill Tkhai wrote:
> Hi, Illyas,
> 
> On 31.05.2018 11:43, Mansoor, Illyas wrote:
> > We are facing a mutex deadlock that we think might be related to the fix you provided in:
> > Merge branch 'Close-race-between-un-register_netdevice_notifier-and-pernet_operations' commit b9a12601541eb55d07e00261a5112a4bc36fe7be
> > 
> > We tried to backport the patch series, but got stuck because its dependencies are not met in the 4.9.102 kernel.
> > Could you please provide some pointers so that we can fix this in the 4.9.y kernel?
> > 
> > We would appreciate any help or pointers on this one.
> > 
> > The ipanic logs are pasted below:
> > 
> > <3>[ 6513.681473] INFO: task sensors@1.0-ser:2744 blocked for more than 120 seconds.
> > <3>[ 6513.689723]       Tainted: P     U  W  O    4.9.102-quilt-2e5dc0ac-07850-g222b9655589b #1
> > <3>[ 6513.699108] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > <6>[ 6513.707997] sensors@1.0-ser D    0  2744      1 0x00000000
> > <4>[ 6513.708007]  ffff880223f38040 ffff88027fc980c0 0000000000000000 ffff880271987000
> > <4>[ 6513.708024]  ffff88026f9ae040 ffffc90000d57d40 ffffffff81b363d1 ffffffff81396e0b
> > <4>[ 6513.708032]  00ffc90000d57d20 ffff88027fc980c0 ffffc90000d57d90 ffff88026f9ae040
> > <4>[ 6513.708040] Call Trace:
> > <4>[ 6513.708056]  [<ffffffff81b363d1>] ? __schedule+0x221/0x6e0
> > <4>[ 6513.708063]  [<ffffffff81396e0b>] ? sidtab_context_to_sid+0x39b/0x410
> > <4>[ 6513.708068]  [<ffffffff81b368c6>] schedule+0x36/0x90
> > <4>[ 6513.708072]  [<ffffffff81b36d18>] schedule_preempt_disabled+0x18/0x30
> > <4>[ 6513.708078]  [<ffffffff81b39a25>] __mutex_lock_slowpath+0x185/0x3f0
> > <4>[ 6513.708083]  [<ffffffff81b39cb5>] mutex_lock+0x25/0x30
> > <4>[ 6513.708089]  [<ffffffff81993fa5>] rtnl_lock+0x15/0x20
> > <4>[ 6513.708095]  [<ffffffff8197d29d>] register_netdevice_notifier+0x2d/0x200
> > <4>[ 6513.708107]  [<ffffffff81ad64db>] raw_init+0x8b/0x90
> > <4>[ 6513.708118]  [<ffffffff81ad52e1>] can_create+0xe1/0x1c0
> > <4>[ 6513.708129]  [<ffffffff819645fe>] __sock_create+0x12e/0x210
> > <4>[ 6513.708141]  [<ffffffff81965fe5>] SyS_socket+0x55/0xb0
> > <4>[ 6513.708156]  [<ffffffff81001fca>] do_syscall_64+0x6a/0xe0
> > <4>[ 6513.708166]  [<ffffffff81b3dd20>] entry_SYSCALL_64_after_swapgs+0x5d/0xd7
> > <4>[ 6513.708171] NMI backtrace for cpu 2
> > <4>[ 6513.708178] CPU: 2 PID: 482 Comm: khungtaskd Tainted: P     U  W  O    4.9.102-quilt-2e5dc0ac-07850-g222b9655589b #1
> > <4>[ 6513.708180]  ffffc90000eafdd0 ffffffff813f56bc 0000000000000000 0000000000000000
> > <4>[ 6513.708188]  ffffc90000eafe00 ffffffff813f9fe1 0000000000000002 0000000000000000
> > <4>[ 6513.708195]  ffffffff81042d80 ffffffff826120f8 ffffc90000eafe30 ffffffff813fa0a3
> 
> 1) I'm not sure commit b9a12601541eb55d07e00261a5112a4bc36fe7be will help here, because this
> stack looks to me like someone simply is not releasing the mutex. As a first step, you could
> try to find out who actually owns it.
> 
> 2) Also, note that rtnl_is_locked() is used incorrectly in one driver there
> (see WILC_WFI_deinit_mon_interface()), so it may also introduce an imbalance
> (if you use that driver).
>

Thank you for your quick response. We will look into your suggestions and get back.

Thanks,
Pankaj
 
> Kirill
