* lockdep warning in lpfc v5.9-rc4
@ 2020-09-11  8:34 Daniel Wagner
  2020-09-11 22:49 ` James Smart
  0 siblings, 1 reply; 3+ messages in thread
From: Daniel Wagner @ 2020-09-11  8:34 UTC (permalink / raw)
  To: James Smart; +Cc: linux-scsi

Hi,

I just hit a lockdep warning in lpfc. Not sure if it is a valid complaint
or not:

 ================================
 WARNING: inconsistent lock state
 5.9.0-rc4-default #80 Tainted: G            E    
 --------------------------------
 inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
 kworker/2:2/264 [HC0[0]:SC0[0]:HE1:SE1] takes:
 ffff9a726e7cd668 (&lpfc_ncmd->buf_lock){+.?.}-{2:2}, at: lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
 {IN-SOFTIRQ-W} state was registered at:
   lock_acquire+0xb2/0x3a0
   _raw_spin_lock+0x30/0x70
   lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
   lpfc_sli4_fp_handle_fcp_wcqe.isra.29+0xfb/0x390 [lpfc]
   lpfc_sli4_fp_handle_cqe+0x172/0x490 [lpfc]
   __lpfc_sli4_process_cq+0xfd/0x270 [lpfc]
   __lpfc_sli4_hba_process_cq+0x3c/0x110 [lpfc]
   lpfc_cq_poll_hdler+0x16/0x20 [lpfc]
   irq_poll_softirq+0x96/0x150
   __do_softirq+0xd3/0x47b
   asm_call_on_stack+0x12/0x20
   do_softirq_own_stack+0x52/0x60
   irq_exit_rcu+0xea/0xf0
   common_interrupt+0xa9/0x1a0
   asm_common_interrupt+0x1e/0x40
   refresh_cpu_vm_stats+0x20c/0x2a0
   vmstat_update+0xf/0x50
   process_one_work+0x2b7/0x640
   worker_thread+0x39/0x3f0
   kthread+0x139/0x150
   ret_from_fork+0x22/0x30
 irq event stamp: 2621
 hardirqs last  enabled at (2621): [<ffffffff91ff525d>] _raw_spin_unlock_irqrestore+0x2d/0x50
 hardirqs last disabled at (2620): [<ffffffff91ff5a38>] _raw_spin_lock_irqsave+0x88/0x8a
 softirqs last  enabled at (1420): [<ffffffff92200351>] __do_softirq+0x351/0x47b
 softirqs last disabled at (1399): [<ffffffff92001032>] asm_call_on_stack+0x12/0x20
 
 other info that might help us debug this:
  Possible unsafe locking scenario:
 
        CPU0
        ----
   lock(&lpfc_ncmd->buf_lock);
   <Interrupt>
     lock(&lpfc_ncmd->buf_lock);
 
  *** DEADLOCK ***
 
 2 locks held by kworker/2:2/264:
  #0: ffff9a727ccd2d48 ((wq_completion)lpfc_wq#4){+.+.}-{0:0}, at: process_one_work+0x237/0x640
  #1: ffffb73dc0d37e68 ((work_completion)(&queue->irqwork)){+.+.}-{0:0}, at: process_one_work+0x237/0x640
 
 stack backtrace:
 CPU: 2 PID: 264 Comm: kworker/2:2 Tainted: G            E     5.9.0-rc4-default #80
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c89-rebuilt.suse.com 04/01/2014
 Workqueue: lpfc_wq lpfc_sli4_hba_process_cq [lpfc]
 Call Trace:
  dump_stack+0x8d/0xbb
  mark_lock+0x5e5/0x690
  ? print_shortest_lock_dependencies+0x180/0x180
  __lock_acquire+0x2d5/0xbf0
  lock_acquire+0xb2/0x3a0
  ? lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
  ? lock_acquire+0xb2/0x3a0
  _raw_spin_lock+0x30/0x70
  ? lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
  lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
  lpfc_sli4_fp_handle_fcp_wcqe.isra.29+0xfb/0x390 [lpfc]
  ? ret_from_fork+0x22/0x30
  ? unwind_next_frame+0x1fc/0x640
  ? create_prof_cpu_mask+0x20/0x20
  ? arch_stack_walk+0x8f/0xf0
  ? ret_from_fork+0x22/0x30
  ? lpfc_handle_fcp_err+0xb00/0xb00 [lpfc]
  ? lpfc_sli4_fp_handle_cqe+0x172/0x490 [lpfc]
  lpfc_sli4_fp_handle_cqe+0x172/0x490 [lpfc]
  __lpfc_sli4_process_cq+0xfd/0x270 [lpfc]
  ? lpfc_sli4_sp_handle_abort_xri_wcqe.isra.54+0x170/0x170 [lpfc]
  __lpfc_sli4_hba_process_cq+0x3c/0x110 [lpfc]
  process_one_work+0x2b7/0x640
  ? find_held_lock+0x34/0xa0
  ? process_one_work+0x640/0x640
  worker_thread+0x39/0x3f0
  ? process_one_work+0x640/0x640
  kthread+0x139/0x150
  ? kthread_park+0x90/0x90
  ret_from_fork+0x22/0x30
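
For reference, a minimal sketch of the locking pattern lockdep appears to
be flagging (hypothetical names, not the actual lpfc code): the same
spinlock is taken from irq_poll softirq context and, in the workqueue
path, with a plain spin_lock() while softirqs are still enabled on that
CPU. Taking it with spin_lock_bh() (or an _irqsave variant) in the worker
path would avoid the inconsistency.

 /* sketch only - hypothetical names, not the lpfc code itself */
 #include <linux/spinlock.h>

 static DEFINE_SPINLOCK(buf_lock);

 /* softirq path (irq_poll): registers the lock as IN-SOFTIRQ-W */
 static void complete_cmd_from_softirq(void)
 {
         spin_lock(&buf_lock);
         /* ... complete the command ... */
         spin_unlock(&buf_lock);
 }

 /* worker path: the SOFTIRQ-ON-W usage in the report above */
 static void complete_cmd_from_worker(void)
 {
         spin_lock(&buf_lock);   /* softirqs still enabled here */
         /* if the softirq fires now on this CPU it spins on
          * buf_lock forever - the scenario lockdep prints */
         /* ... complete the command ... */
         spin_unlock(&buf_lock);
 }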


Thanks,
Daniel


* Re: lockdep warning in lpfc v5.9-rc4
  2020-09-11  8:34 lockdep warning in lpfc v5.9-rc4 Daniel Wagner
@ 2020-09-11 22:49 ` James Smart
  2020-09-11 23:54   ` James Smart
  0 siblings, 1 reply; 3+ messages in thread
From: James Smart @ 2020-09-11 22:49 UTC (permalink / raw)
  To: Daniel Wagner; +Cc: linux-scsi

On 9/11/2020 1:34 AM, Daniel Wagner wrote:
> Hi,
> 
> I just hit a lockdep warning in lpfc. Not sure if it is a valid complaint
> or not:
> 
>   ================================
>   WARNING: inconsistent lock state
>   5.9.0-rc4-default #80 Tainted: G            E
>   --------------------------------
>   inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
>   kworker/2:2/264 [HC0[0]:SC0[0]:HE1:SE1] takes:
>   ffff9a726e7cd668 (&lpfc_ncmd->buf_lock){+.?.}-{2:2}, at: lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
>   {IN-SOFTIRQ-W} state was registered at:
>     lock_acquire+0xb2/0x3a0
>     _raw_spin_lock+0x30/0x70
>     lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
>     lpfc_sli4_fp_handle_fcp_wcqe.isra.29+0xfb/0x390 [lpfc]
>     lpfc_sli4_fp_handle_cqe+0x172/0x490 [lpfc]
>     __lpfc_sli4_process_cq+0xfd/0x270 [lpfc]
>     __lpfc_sli4_hba_process_cq+0x3c/0x110 [lpfc]
>     lpfc_cq_poll_hdler+0x16/0x20 [lpfc]
>     irq_poll_softirq+0x96/0x150
>     __do_softirq+0xd3/0x47b
>     asm_call_on_stack+0x12/0x20
>     do_softirq_own_stack+0x52/0x60
>     irq_exit_rcu+0xea/0xf0
>     common_interrupt+0xa9/0x1a0
>     asm_common_interrupt+0x1e/0x40
>     refresh_cpu_vm_stats+0x20c/0x2a0
>     vmstat_update+0xf/0x50
>     process_one_work+0x2b7/0x640
>     worker_thread+0x39/0x3f0
>     kthread+0x139/0x150
>     ret_from_fork+0x22/0x30
>   irq event stamp: 2621
>   hardirqs last  enabled at (2621): [<ffffffff91ff525d>] _raw_spin_unlock_irqrestore+0x2d/0x50
>   hardirqs last disabled at (2620): [<ffffffff91ff5a38>] _raw_spin_lock_irqsave+0x88/0x8a
>   softirqs last  enabled at (1420): [<ffffffff92200351>] __do_softirq+0x351/0x47b
>   softirqs last disabled at (1399): [<ffffffff92001032>] asm_call_on_stack+0x12/0x20
>   
>   other info that might help us debug this:
>    Possible unsafe locking scenario:
>   
>          CPU0
>          ----
>     lock(&lpfc_ncmd->buf_lock);
>     <Interrupt>
>       lock(&lpfc_ncmd->buf_lock);
>   
>    *** DEADLOCK ***
>   
>   2 locks held by kworker/2:2/264:
>    #0: ffff9a727ccd2d48 ((wq_completion)lpfc_wq#4){+.+.}-{0:0}, at: process_one_work+0x237/0x640
>    #1: ffffb73dc0d37e68 ((work_completion)(&queue->irqwork)){+.+.}-{0:0}, at: process_one_work+0x237/0x640
>   
>   stack backtrace:
>   CPU: 2 PID: 264 Comm: kworker/2:2 Tainted: G            E     5.9.0-rc4-default #80
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c89-rebuilt.suse.com 04/01/2014
>   Workqueue: lpfc_wq lpfc_sli4_hba_process_cq [lpfc]
>   Call Trace:
>    dump_stack+0x8d/0xbb
>    mark_lock+0x5e5/0x690
>    ? print_shortest_lock_dependencies+0x180/0x180
>    __lock_acquire+0x2d5/0xbf0
>    lock_acquire+0xb2/0x3a0
>    ? lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
>    ? lock_acquire+0xb2/0x3a0
>    _raw_spin_lock+0x30/0x70
>    ? lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
>    lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
>    lpfc_sli4_fp_handle_fcp_wcqe.isra.29+0xfb/0x390 [lpfc]
>    ? ret_from_fork+0x22/0x30
>    ? unwind_next_frame+0x1fc/0x640
>    ? create_prof_cpu_mask+0x20/0x20
>    ? arch_stack_walk+0x8f/0xf0
>    ? ret_from_fork+0x22/0x30
>    ? lpfc_handle_fcp_err+0xb00/0xb00 [lpfc]
>    ? lpfc_sli4_fp_handle_cqe+0x172/0x490 [lpfc]
>    lpfc_sli4_fp_handle_cqe+0x172/0x490 [lpfc]
>    __lpfc_sli4_process_cq+0xfd/0x270 [lpfc]
>    ? lpfc_sli4_sp_handle_abort_xri_wcqe.isra.54+0x170/0x170 [lpfc]
>    __lpfc_sli4_hba_process_cq+0x3c/0x110 [lpfc]
>    process_one_work+0x2b7/0x640
>    ? find_held_lock+0x34/0xa0
>    ? process_one_work+0x640/0x640
>    worker_thread+0x39/0x3f0
>    ? process_one_work+0x640/0x640
>    kthread+0x139/0x150
>    ? kthread_park+0x90/0x90
>    ret_from_fork+0x22/0x30
> 
> 
> Thanks,
> Daniel
> 

It's likely valid, but rare. Most of the time, only one of these two flows
is occurring. We'll look into it.
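
Roughly, the two flows here are the following (hypothetical names, just
following the call chains in the report): the CQ can be processed either
from irq_poll softirq context or from a work item queued on lpfc_wq, and
both paths end up in the same command completion routine.

 /* sketch only - hypothetical names, not the driver's actual code */
 #include <linux/irq_poll.h>
 #include <linux/kernel.h>
 #include <linux/workqueue.h>

 struct cq {
         struct irq_poll         iop;
         struct work_struct      irqwork;
 };

 static void process_cq(struct cq *cq)
 {
         /* walk the completion queue and call the command completion
          * handlers, which take lpfc_ncmd->buf_lock */
 }

 /* flow 1: softirq context, via the irq_poll cq poll handler */
 static int cq_poll(struct irq_poll *iop, int budget)
 {
         struct cq *cq = container_of(iop, struct cq, iop);

         process_cq(cq);
         irq_poll_complete(iop);
         return 0;
 }

 /* flow 2: process context, via the work item on lpfc_wq */
 static void cq_irqwork(struct work_struct *work)
 {
         struct cq *cq = container_of(work, struct cq, irqwork);

         process_cq(cq);
 }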

-- james


* Re: lockdep warning in lpfc v5.9-rc4
  2020-09-11 22:49 ` James Smart
@ 2020-09-11 23:54   ` James Smart
  0 siblings, 0 replies; 3+ messages in thread
From: James Smart @ 2020-09-11 23:54 UTC (permalink / raw)
  To: Daniel Wagner; +Cc: linux-scsi

On 9/11/2020 3:49 PM, James Smart wrote:
> On 9/11/2020 1:34 AM, Daniel Wagner wrote:
>> Hi,
>>
>> I just hit a lockdep warning in lpfc. Not sure if it is a valid complaint
>> or not:
>>
>>   ================================
>>   WARNING: inconsistent lock state
>>   5.9.0-rc4-default #80 Tainted: G            E
>>   --------------------------------
>>   inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
>>   kworker/2:2/264 [HC0[0]:SC0[0]:HE1:SE1] takes:
>>   ffff9a726e7cd668 (&lpfc_ncmd->buf_lock){+.?.}-{2:2}, at: lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
>>   {IN-SOFTIRQ-W} state was registered at:
>>     lock_acquire+0xb2/0x3a0
>>     _raw_spin_lock+0x30/0x70
>>     lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
>>     lpfc_sli4_fp_handle_fcp_wcqe.isra.29+0xfb/0x390 [lpfc]
>>     lpfc_sli4_fp_handle_cqe+0x172/0x490 [lpfc]
>>     __lpfc_sli4_process_cq+0xfd/0x270 [lpfc]
>>     __lpfc_sli4_hba_process_cq+0x3c/0x110 [lpfc]
>>     lpfc_cq_poll_hdler+0x16/0x20 [lpfc]
>>     irq_poll_softirq+0x96/0x150
>>     __do_softirq+0xd3/0x47b
>>     asm_call_on_stack+0x12/0x20
>>     do_softirq_own_stack+0x52/0x60
>>     irq_exit_rcu+0xea/0xf0
>>     common_interrupt+0xa9/0x1a0
>>     asm_common_interrupt+0x1e/0x40
>>     refresh_cpu_vm_stats+0x20c/0x2a0
>>     vmstat_update+0xf/0x50
>>     process_one_work+0x2b7/0x640
>>     worker_thread+0x39/0x3f0
>>     kthread+0x139/0x150
>>     ret_from_fork+0x22/0x30
>>   irq event stamp: 2621
>>   hardirqs last  enabled at (2621): [<ffffffff91ff525d>] _raw_spin_unlock_irqrestore+0x2d/0x50
>>   hardirqs last disabled at (2620): [<ffffffff91ff5a38>] _raw_spin_lock_irqsave+0x88/0x8a
>>   softirqs last  enabled at (1420): [<ffffffff92200351>] __do_softirq+0x351/0x47b
>>   softirqs last disabled at (1399): [<ffffffff92001032>] asm_call_on_stack+0x12/0x20
>>
>>   other info that might help us debug this:
>>    Possible unsafe locking scenario:
>>
>>          CPU0
>>          ----
>>     lock(&lpfc_ncmd->buf_lock);
>>     <Interrupt>
>>       lock(&lpfc_ncmd->buf_lock);
>>
>>    *** DEADLOCK ***
>>
>>   2 locks held by kworker/2:2/264:
>>    #0: ffff9a727ccd2d48 ((wq_completion)lpfc_wq#4){+.+.}-{0:0}, at: process_one_work+0x237/0x640
>>    #1: ffffb73dc0d37e68 ((work_completion)(&queue->irqwork)){+.+.}-{0:0}, at: process_one_work+0x237/0x640
>>
>>   stack backtrace:
>>   CPU: 2 PID: 264 Comm: kworker/2:2 Tainted: G            E     5.9.0-rc4-default #80
>>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c89-rebuilt.suse.com 04/01/2014
>>   Workqueue: lpfc_wq lpfc_sli4_hba_process_cq [lpfc]
>>   Call Trace:
>>    dump_stack+0x8d/0xbb
>>    mark_lock+0x5e5/0x690
>>    ? print_shortest_lock_dependencies+0x180/0x180
>>    __lock_acquire+0x2d5/0xbf0
>>    lock_acquire+0xb2/0x3a0
>>    ? lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
>>    ? lock_acquire+0xb2/0x3a0
>>    _raw_spin_lock+0x30/0x70
>>    ? lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
>>    lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
>>    lpfc_sli4_fp_handle_fcp_wcqe.isra.29+0xfb/0x390 [lpfc]
>>    ? ret_from_fork+0x22/0x30
>>    ? unwind_next_frame+0x1fc/0x640
>>    ? create_prof_cpu_mask+0x20/0x20
>>    ? arch_stack_walk+0x8f/0xf0
>>    ? ret_from_fork+0x22/0x30
>>    ? lpfc_handle_fcp_err+0xb00/0xb00 [lpfc]
>>    ? lpfc_sli4_fp_handle_cqe+0x172/0x490 [lpfc]
>>    lpfc_sli4_fp_handle_cqe+0x172/0x490 [lpfc]
>>    __lpfc_sli4_process_cq+0xfd/0x270 [lpfc]
>>    ? lpfc_sli4_sp_handle_abort_xri_wcqe.isra.54+0x170/0x170 [lpfc]
>>    __lpfc_sli4_hba_process_cq+0x3c/0x110 [lpfc]
>>    process_one_work+0x2b7/0x640
>>    ? find_held_lock+0x34/0xa0
>>    ? process_one_work+0x640/0x640
>>    worker_thread+0x39/0x3f0
>>    ? process_one_work+0x640/0x640
>>    kthread+0x139/0x150
>>    ? kthread_park+0x90/0x90
>>    ret_from_fork+0x22/0x30
>>
>>
>> Thanks,
>> Daniel
>>
>
> It's likely valid, but rare. Most of the time, only one of these two
> flows is occurring. We'll look into it.
>
> -- james

Looking a little further - it doesn't look like this can be hit. Although
both the softirq and the work element could be running, both call the same
routine and only one caller will succeed in the cmpxchg() for the queue. As
such, only one entity will process the queue, so there is no way for both
to be looking at the same I/O completion.
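
In other words, the guard is along these lines (hypothetical field names
and states, not the actual lpfc code): whichever caller wins the cmpxchg()
owns the queue for that pass and the other simply returns, so the softirq
and the work item never walk the same completions.

 /* sketch of the cmpxchg() claim described above - hypothetical names */
 #include <linux/atomic.h>

 enum cq_state { CQ_IDLE, CQ_BUSY };

 struct cq {
         unsigned long state;
         /* ... */
 };

 static void hba_process_cq(struct cq *cq)
 {
         /* only the caller that flips IDLE -> BUSY processes the queue */
         if (cmpxchg(&cq->state, CQ_IDLE, CQ_BUSY) != CQ_IDLE)
                 return;

         /* ... walk the CQ and run the completion handlers ... */

         /* drop the claim (the real code would use xchg()/barriers) */
         cq->state = CQ_IDLE;
 }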

-- james


