All of lore.kernel.org
 help / color / mirror / Atom feed
* Inconsistent lock state warning for rxe_poll_cq()
@ 2022-02-14 18:01 Bart Van Assche
  2022-02-14 19:51 ` Bob Pearson
  2022-02-15  4:50 ` Guoqing Jiang
  0 siblings, 2 replies; 14+ messages in thread
From: Bart Van Assche @ 2022-02-14 18:01 UTC (permalink / raw)
  To: Bob Pearson; +Cc: linux-rdma, Jason Gunthorpe

Hi Bob,

If I run the SRP tests against Jason's rdma/for-next branch then these
tests pass if I use the siw driver but not if I use the rdma_rxe driver.
Can you please take a look at the output triggered by running blktests?
If I run blktests as follows: ./check -q srp, the following output
appears:

WARNING: CPU: 1 PID: 1052 at kernel/softirq.c:363 __local_bh_enable_ip+0xa4/0xf0
  _raw_write_unlock_bh+0x31/0x40
  __rxe_add_index+0x38/0x50 [rdma_rxe]
  rxe_create_ah+0xce/0x1b0 [rdma_rxe]
  _rdma_create_ah+0x2c8/0x2f0 [ib_core]
  rdma_create_ah+0xfd/0x1c0 [ib_core]
  cm_alloc_msg+0xbc/0x280 [ib_cm]
  cm_alloc_priv_msg+0x2d/0x70 [ib_cm]
  ib_send_cm_req+0x4fe/0x830 [ib_cm]
  cma_connect_ib+0x3c4/0x600 [rdma_cm]
  rdma_connect_locked+0x145/0x490 [rdma_cm]
  rdma_connect+0x31/0x50 [rdma_cm]
  srp_send_req+0x58a/0x830 [ib_srp]
  srp_connect_ch+0x9f/0x1d0 [ib_srp]
  add_target_store+0xa6b/0xf20 [ib_srp]
  dev_attr_store+0x3e/0x60
  sysfs_kf_write+0x87/0xa0
  kernfs_fop_write_iter+0x1c7/0x270
  new_sync_write+0x296/0x3c0
  vfs_write+0x43c/0x580
  ksys_write+0xd9/0x180
  __x64_sys_write+0x42/0x50
  do_syscall_64+0x35/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

[ ... ]

raw_local_irq_restore() called with IRQs enabled
WARNING: CPU: 1 PID: 1052 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x2f/0x50
Call Trace:
  <TASK>
  _raw_spin_unlock_irqrestore+0x6c/0x70
  ib_send_mad+0x4ca/0xa40 [ib_core]
  ib_post_send_mad+0x244/0x4b0 [ib_core]
  ib_send_cm_req+0x61b/0x830 [ib_cm]
  cma_connect_ib+0x3c4/0x600 [rdma_cm]
  rdma_connect_locked+0x145/0x490 [rdma_cm]
  rdma_connect+0x31/0x50 [rdma_cm]
  srp_send_req+0x58a/0x830 [ib_srp]
  srp_connect_ch+0x9f/0x1d0 [ib_srp]
  add_target_store+0xa6b/0xf20 [ib_srp]
  dev_attr_store+0x3e/0x60
  sysfs_kf_write+0x87/0xa0
  kernfs_fop_write_iter+0x1c7/0x270
  new_sync_write+0x296/0x3c0
  vfs_write+0x43c/0x580
  ksys_write+0xd9/0x180
  __x64_sys_write+0x42/0x50
  do_syscall_64+0x35/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

[ ... ]

================================
WARNING: inconsistent lock state
5.17.0-rc1-dbg+ #3 Tainted: G        W
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
ksoftirqd/1/19 [HC0[0]:SC1[1]:HE0:SE0] takes:
ffff88811a85a228 (&ch->lock){+.?.}-{2:2}, at: srp_process_rsp+0x175/0x400 [ib_srp]
{SOFTIRQ-ON-W} state was registered at:
   lockdep_hardirqs_on_prepare.part.0+0x11b/0x1f0
   lockdep_hardirqs_on_prepare+0x43/0x50
   trace_hardirqs_on+0x22/0x120
   __local_bh_enable_ip+0x88/0xf0
   _raw_spin_unlock_bh+0x31/0x40
   rxe_poll_cq+0x164/0x180 [rdma_rxe]
   __ib_process_cq+0xab/0x3c0 [ib_core]
   ib_process_cq_direct+0x8c/0xc0 [ib_core]
   __srp_get_tx_iu+0x5d/0x160 [ib_srp]
   srp_queuecommand+0xf5/0x40c [ib_srp]
   scsi_dispatch_cmd+0x16a/0x530
   scsi_queue_rq+0x383/0x780
   blk_mq_dispatch_rq_list+0x344/0xc00
   __blk_mq_sched_dispatch_requests+0x19b/0x280
   blk_mq_sched_dispatch_requests+0x8a/0xc0
   __blk_mq_run_hw_queue+0x99/0x220
   __blk_mq_delay_run_hw_queue+0x372/0x3a0
   blk_mq_run_hw_queue+0x1d6/0x2b0
   blk_mq_sched_insert_request+0x208/0x290
   blk_execute_rq_nowait+0x9c/0xb0
   blk_execute_rq+0xcf/0x200
   __scsi_execute+0x220/0x340
   scsi_probe_lun.constprop.0+0x17c/0x670
   scsi_probe_and_add_lun+0x178/0x710
   __scsi_scan_target+0x17c/0x300
   scsi_scan_target+0xf1/0x110
   srp_add_target+0x2a5/0x490 [ib_srp]
   add_target_store+0xe30/0xf20 [ib_srp]
   dev_attr_store+0x3e/0x60
   sysfs_kf_write+0x87/0xa0
   kernfs_fop_write_iter+0x1c7/0x270
   new_sync_write+0x296/0x3c0
   vfs_write+0x43c/0x580
   ksys_write+0xd9/0x180
   __x64_sys_write+0x42/0x50
   do_syscall_64+0x35/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xae
irq event stamp: 76629
hardirqs last  enabled at (76628): [<ffffffff810ce588>] __local_bh_enable_ip+0x88/0xf0
hardirqs last disabled at (76629): [<ffffffff81e669bd>] _raw_spin_lock_irqsave+0x5d/0x60
softirqs last  enabled at (76618): [<ffffffff82200467>] __do_softirq+0x467/0x6e1
softirqs last disabled at (76623): [<ffffffff810ce3a7>] run_ksoftirqd+0x37/0x60

other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&ch->lock);
   <Interrupt>
     lock(&ch->lock);

  *** DEADLOCK ***

no locks held by ksoftirqd/1/19.

stack backtrace:
CPU: 1 PID: 19 Comm: ksoftirqd/1 Tainted: G        W         5.17.0-rc1-dbg+ #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014
Call Trace:
  <TASK>
  show_stack+0x52/0x58
  dump_stack_lvl+0x5b/0x82
  dump_stack+0x10/0x12
  print_usage_bug.part.0+0x29c/0x2ab
  mark_lock_irq.cold+0x54/0xbf
  ? lock_chain_count+0x20/0x20
  ? stack_trace_save+0x94/0xc0
  ? filter_irq_stacks+0x70/0x70
  ? __asan_loadN+0xf/0x20
  ? jhash.constprop.0+0x1bc/0x220
  ? save_trace+0x174/0x2d0
  mark_lock+0x414/0xac0
  ? mark_lock_irq+0xf70/0xf70
  mark_usage+0x74/0x1a0
  __lock_acquire+0x45b/0xce0
  lock_acquire.part.0+0x126/0x2f0
  ? srp_process_rsp+0x175/0x400 [ib_srp]
  ? rcu_read_unlock+0x50/0x50
  ? __this_cpu_preempt_check+0x13/0x20
  lock_acquire+0x9b/0x1a0
  ? srp_process_rsp+0x175/0x400 [ib_srp]
  _raw_spin_lock_irqsave+0x43/0x60
  ? srp_process_rsp+0x175/0x400 [ib_srp]
  srp_process_rsp+0x175/0x400 [ib_srp]
  srp_recv_done+0x184/0x360 [ib_srp]
  ? rxe_poll_cq+0x164/0x180 [rdma_rxe]
  ? srp_process_rsp+0x400/0x400 [ib_srp]
  ? __this_cpu_preempt_check+0x13/0x20
  __ib_process_cq+0x11b/0x3c0 [ib_core]
  ib_poll_handler+0x47/0x1f0 [ib_core]
  irq_poll_softirq+0x12f/0x2e0
  __do_softirq+0x1d8/0x6e1
  run_ksoftirqd+0x37/0x60
  smpboot_thread_fn+0x302/0x410
  ? __irq_exit_rcu+0x140/0x140
  ? __smpboot_create_thread.part.0+0x1c0/0x1c0
  kthread+0x15f/0x190
  ? kthread_complete_and_exit+0x30/0x30
  ret_from_fork+0x1f/0x30
  </TASK>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Inconsistent lock state warning for rxe_poll_cq()
  2022-02-14 18:01 Inconsistent lock state warning for rxe_poll_cq() Bart Van Assche
@ 2022-02-14 19:51 ` Bob Pearson
  2022-02-14 19:55   ` Bart Van Assche
  2022-02-15  4:50 ` Guoqing Jiang
  1 sibling, 1 reply; 14+ messages in thread
From: Bob Pearson @ 2022-02-14 19:51 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: linux-rdma, Jason Gunthorpe

On 2/14/22 12:01, Bart Van Assche wrote:
> Hi Bob,
> 
> If I run the SRP tests against Jason's rdma/for-next branch then these
> tests pass if I use the siw driver but not if I use the rdma_rxe driver.
> Can you please take a look at the output triggered by running blktests?
> If I run blktests as follows: ./check -q srp, the following output
> appears:
> 5.17.0-rc1+
You are running kernel 5.17.0-rc1-dbg+ I have 5.17.0-rc1+. I assume they are different. How do I make the same kernel
you are running?

Bob

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Inconsistent lock state warning for rxe_poll_cq()
  2022-02-14 19:51 ` Bob Pearson
@ 2022-02-14 19:55   ` Bart Van Assche
  2022-02-14 20:25     ` Bob Pearson
  0 siblings, 1 reply; 14+ messages in thread
From: Bart Van Assche @ 2022-02-14 19:55 UTC (permalink / raw)
  To: Bob Pearson; +Cc: linux-rdma, Jason Gunthorpe

On 2/14/22 11:51, Bob Pearson wrote:
> On 2/14/22 12:01, Bart Van Assche wrote:
>> If I run the SRP tests against Jason's rdma/for-next branch then
>> these tests pass if I use the siw driver but not if I use the
>> rdma_rxe driver. Can you please take a look at the output triggered
>> by running blktests? If I run blktests as follows: ./check -q srp,
>> the following output appears: 5.17.0-rc1+
> 
> You are running kernel 5.17.0-rc1-dbg+ I have 5.17.0-rc1+. I assume
> they are different. How do I make the same kernel you are running?

The -dbg suffix comes from the following kernel configuration option:
CONFIG_LOCALVERSION="-dbg"

The rdma/for-next commit I used in my tests is as follows: 2f1b2820b546
("Merge branch 'irdma_dscp' into rdma.git for-next").

Does this answer your question?

Bart.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Inconsistent lock state warning for rxe_poll_cq()
  2022-02-14 19:55   ` Bart Van Assche
@ 2022-02-14 20:25     ` Bob Pearson
  2022-02-14 20:48       ` Bart Van Assche
  0 siblings, 1 reply; 14+ messages in thread
From: Bob Pearson @ 2022-02-14 20:25 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: linux-rdma, Jason Gunthorpe

On 2/14/22 13:55, Bart Van Assche wrote:
> On 2/14/22 11:51, Bob Pearson wrote:
>> On 2/14/22 12:01, Bart Van Assche wrote:
>>> If I run the SRP tests against Jason's rdma/for-next branch then
>>> these tests pass if I use the siw driver but not if I use the
>>> rdma_rxe driver. Can you please take a look at the output triggered
>>> by running blktests? If I run blktests as follows: ./check -q srp,
>>> the following output appears: 5.17.0-rc1+
>>
>> You are running kernel 5.17.0-rc1-dbg+ I have 5.17.0-rc1+. I assume
>> they are different. How do I make the same kernel you are running?
> 
> The -dbg suffix comes from the following kernel configuration option:
> CONFIG_LOCALVERSION="-dbg"
> 
> The rdma/for-next commit I used in my tests is as follows: 2f1b2820b546
> ("Merge branch 'irdma_dscp' into rdma.git for-next").
> 
> Does this answer your question?
> 
> Bart.

It helps. I am trying to run blktests -q srp but I need to install xfs first it seems. Do I need two nodes or can I
run it with just one? 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Inconsistent lock state warning for rxe_poll_cq()
  2022-02-14 20:25     ` Bob Pearson
@ 2022-02-14 20:48       ` Bart Van Assche
  2022-02-15  5:43         ` Bob Pearson
  0 siblings, 1 reply; 14+ messages in thread
From: Bart Van Assche @ 2022-02-14 20:48 UTC (permalink / raw)
  To: Bob Pearson; +Cc: linux-rdma, Jason Gunthorpe

On 2/14/22 12:25, Bob Pearson wrote:
> It helps. I am trying to run blktests -q srp but I need to install
> xfs first it seems. Do I need two nodes or can I run it with just
> one?

XFS? All SRP tests use the null_blk driver if I remember correctly and 
hence don't need any physical block device. Some tests outside the SRP 
directory require xfstools but the SRP tests do not. If blktests are run 
as follows, XFS should not be required:

./check -q srp

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Inconsistent lock state warning for rxe_poll_cq()
  2022-02-14 18:01 Inconsistent lock state warning for rxe_poll_cq() Bart Van Assche
  2022-02-14 19:51 ` Bob Pearson
@ 2022-02-15  4:50 ` Guoqing Jiang
  1 sibling, 0 replies; 14+ messages in thread
From: Guoqing Jiang @ 2022-02-15  4:50 UTC (permalink / raw)
  To: Bart Van Assche, Bob Pearson; +Cc: linux-rdma, Jason Gunthorpe



On 2/15/22 2:01 AM, Bart Van Assche wrote:
> Hi Bob,
>
> If I run the SRP tests against Jason's rdma/for-next branch then these
> tests pass if I use the siw driver but not if I use the rdma_rxe driver.
> Can you please take a look at the output triggered by running blktests?
> If I run blktests as follows: ./check -q srp, the following output
> appears:
>
> WARNING: CPU: 1 PID: 1052 at kernel/softirq.c:363 
> __local_bh_enable_ip+0xa4/0xf0
>  _raw_write_unlock_bh+0x31/0x40
>  __rxe_add_index+0x38/0x50 [rdma_rxe]
>  rxe_create_ah+0xce/0x1b0 [rdma_rxe]
>  _rdma_create_ah+0x2c8/0x2f0 [ib_core]
>  rdma_create_ah+0xfd/0x1c0 [ib_core]
>  cm_alloc_msg+0xbc/0x280 [ib_cm]
>  cm_alloc_priv_msg+0x2d/0x70 [ib_cm]
>  ib_send_cm_req+0x4fe/0x830 [ib_cm]
>  cma_connect_ib+0x3c4/0x600 [rdma_cm]
>  rdma_connect_locked+0x145/0x490 [rdma_cm]
>  rdma_connect+0x31/0x50 [rdma_cm]
>  srp_send_req+0x58a/0x830 [ib_srp]
>  srp_connect_ch+0x9f/0x1d0 [ib_srp]
>  add_target_store+0xa6b/0xf20 [ib_srp]
>  dev_attr_store+0x3e/0x60
>  sysfs_kf_write+0x87/0xa0
>  kernfs_fop_write_iter+0x1c7/0x270
>  new_sync_write+0x296/0x3c0
>  vfs_write+0x43c/0x580
>  ksys_write+0xd9/0x180
>  __x64_sys_write+0x42/0x50
>  do_syscall_64+0x35/0x80
>  entry_SYSCALL_64_after_hwframe+0x44/0xae

I think it is exactly the same which I encountered with 5.17-rc3.

https://lore.kernel.org/linux-rdma/20220210073655.42281-1-guoqing.jiang@linux.dev/T/#m0c0ce0745078095ea931f61d2b1c6ce0fdd5403b

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Inconsistent lock state warning for rxe_poll_cq()
  2022-02-14 20:48       ` Bart Van Assche
@ 2022-02-15  5:43         ` Bob Pearson
  2022-02-15 12:16           ` Guoqing Jiang
  2022-02-15 14:41           ` Jason Gunthorpe
  0 siblings, 2 replies; 14+ messages in thread
From: Bob Pearson @ 2022-02-15  5:43 UTC (permalink / raw)
  To: Bart Van Assche, Guoqing Jiang; +Cc: linux-rdma, Jason Gunthorpe

On 2/14/22 14:48, Bart Van Assche wrote:
> On 2/14/22 12:25, Bob Pearson wrote:
>> It helps. I am trying to run blktests -q srp but I need to install
>> xfs first it seems. Do I need two nodes or can I run it with just
>> one?
> 
> XFS? All SRP tests use the null_blk driver if I remember correctly and hence don't need any physical block device. Some tests outside the SRP directory require xfstools but the SRP tests do not. If blktests are run as follows, XFS should not be required:
> 
> ./check -q srp
> 
> Thanks,
> 
> Bart.

I am now able to reproduce what I think is the same trace you are seeing.

The first error is the warning:

	[ 1808.574513] WARNING: CPU: 7 PID: 3887 at kernel/softirq.c:363 __local_bh_enable_ip+0xac/0x100

which is called from __local_bh_enable_ip()

	void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)

	{

	        WARN_ON_ONCE(in_irq());

	        lockdep_assert_irqs_enabled();

	#ifdef CONFIG_TRACE_IRQFLAGS

	        local_irq_disable();

	#endif


in lockdep_assert_irqs_enabled()

and this is in turn called from __rxe_add_index() which looks like

	int __rxe_add_index(struct rxe_pool_elem *elem)

	{

	        struct rxe_pool *pool = elem->pool;

	        int err;


	
	        write_lock_bh(&pool->pool_lock);

	        err = __rxe_add_index_locked(elem);

	        write_unlock_bh(&pool->pool_lock);


	
	        return err;

	}


in the write_unlock_bh() call. This appears to complain if hardirqs are not enabled on the current cpu. This only happens if CONFIG_PROVE_LOCKING=y. The problem with all this is that the pool->pool_lock is never held by anyone
else except __rxe_add_index when the first error occurs. Perhaps someone else has disabled hard irqs and lets us gain
control of this cpu. If instead of _bh locks we use _irqsave locks in rxe_pool.c, which was the case a while ago
the test is different and passes. If you don't set CONFIG_PROVE_LOCKING this error does not happen.

Somehow just using _irqsave locks because it makes this error vanish doesn't seem right. There should be a root
cause that makes sense.

Any ideas.

Bob

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Inconsistent lock state warning for rxe_poll_cq()
  2022-02-15  5:43         ` Bob Pearson
@ 2022-02-15 12:16           ` Guoqing Jiang
  2022-02-15 14:41           ` Jason Gunthorpe
  1 sibling, 0 replies; 14+ messages in thread
From: Guoqing Jiang @ 2022-02-15 12:16 UTC (permalink / raw)
  To: Bob Pearson, Bart Van Assche; +Cc: linux-rdma, Jason Gunthorpe



On 2/15/22 1:43 PM, Bob Pearson wrote:
> On 2/14/22 14:48, Bart Van Assche wrote:
>> On 2/14/22 12:25, Bob Pearson wrote:
>>> It helps. I am trying to run blktests -q srp but I need to install
>>> xfs first it seems. Do I need two nodes or can I run it with just
>>> one?
>> XFS? All SRP tests use the null_blk driver if I remember correctly and hence don't need any physical block device. Some tests outside the SRP directory require xfstools but the SRP tests do not. If blktests are run as follows, XFS should not be required:
>>
>> ./check -q srp
>>
>> Thanks,
>>
>> Bart.
> I am now able to reproduce what I think is the same trace you are seeing.
>
> The first error is the warning:
>
> 	[ 1808.574513] WARNING: CPU: 7 PID: 3887 at kernel/softirq.c:363 __local_bh_enable_ip+0xac/0x100
>
> which is called from __local_bh_enable_ip()
>
> 	void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
>
> 	{
>
> 	        WARN_ON_ONCE(in_irq());
>
> 	        lockdep_assert_irqs_enabled();
>
> 	#ifdef CONFIG_TRACE_IRQFLAGS
>
> 	        local_irq_disable();
>
> 	#endif
>
>
> in lockdep_assert_irqs_enabled()
>
> and this is in turn called from __rxe_add_index() which looks like
>
> 	int __rxe_add_index(struct rxe_pool_elem *elem)
>
> 	{
>
> 	        struct rxe_pool *pool = elem->pool;
>
> 	        int err;
>
>
> 	
> 	        write_lock_bh(&pool->pool_lock);
>
> 	        err = __rxe_add_index_locked(elem);
>
> 	        write_unlock_bh(&pool->pool_lock);
>
>
> 	
> 	        return err;
>
> 	}
>
>
> in the write_unlock_bh() call. This appears to complain if hardirqs are not enabled on the current cpu.

Let's suppose only NIC is involved at the moment, once NIC driver has
switched to NAPI which means no hard irq is enabled, is it possible?

> This only happens if CONFIG_PROVE_LOCKING=y. The problem with all this is that the pool->pool_lock is never held by anyone
> else except __rxe_add_index when the first error occurs. Perhaps someone else has disabled hard irqs and lets us gain
> control of this cpu. If instead of _bh locks we use _irqsave locks in rxe_pool.c, which was the case a while ago
> the test is different and passes. If you don't set CONFIG_PROVE_LOCKING this error does not happen.
>
> Somehow just using _irqsave locks because it makes this error vanish doesn't seem right. There should be a root
> cause that makes sense.

At least I can find two similar fixes, just FYI.

4956b9eaad45 io_uring: rsrc ref lock needs to be IRQ safe
2800aadc18a6 iwlwifi: Fix softirq/hardirq disabling in 
iwl_pcie_enqueue_hcmd()

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Inconsistent lock state warning for rxe_poll_cq()
  2022-02-15  5:43         ` Bob Pearson
  2022-02-15 12:16           ` Guoqing Jiang
@ 2022-02-15 14:41           ` Jason Gunthorpe
  2022-02-15 16:45             ` Pearson, Robert B
  1 sibling, 1 reply; 14+ messages in thread
From: Jason Gunthorpe @ 2022-02-15 14:41 UTC (permalink / raw)
  To: Bob Pearson; +Cc: Bart Van Assche, Guoqing Jiang, linux-rdma

On Mon, Feb 14, 2022 at 11:43:40PM -0600, Bob Pearson wrote:

> in the write_unlock_bh() call. This appears to complain if hardirqs
> are not enabled on the current cpu. This only happens if
> CONFIG_PROVE_LOCKING=y.

The trace shows this context is called within a irq disabled spinlock
region. Ie this is trying to do

  spinlock_irqsave()
  write_lock_bh()
  write_unlock_bh()
  spinunlock_irqrestore()

Which is illegal locking.

Jason

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: Inconsistent lock state warning for rxe_poll_cq()
  2022-02-15 14:41           ` Jason Gunthorpe
@ 2022-02-15 16:45             ` Pearson, Robert B
  2022-02-15 16:48               ` Jason Gunthorpe
  0 siblings, 1 reply; 14+ messages in thread
From: Pearson, Robert B @ 2022-02-15 16:45 UTC (permalink / raw)
  To: Jason Gunthorpe, Bob Pearson; +Cc: Bart Van Assche, Guoqing Jiang, linux-rdma



-----Original Message-----
From: Jason Gunthorpe <jgg@nvidia.com> 
Sent: Tuesday, February 15, 2022 8:42 AM
To: Bob Pearson <rpearsonhpe@gmail.com>
Cc: Bart Van Assche <bvanassche@acm.org>; Guoqing Jiang <guoqing.jiang@linux.dev>; linux-rdma@vger.kernel.org
Subject: Re: Inconsistent lock state warning for rxe_poll_cq()

On Mon, Feb 14, 2022 at 11:43:40PM -0600, Bob Pearson wrote:

>> in the write_unlock_bh() call. This appears to complain if hardirqs 
>> are not enabled on the current cpu. This only happens if 
>> CONFIG_PROVE_LOCKING=y.

> The trace shows this context is called within a irq disabled spinlock region. Ie this is trying to do

>   spinlock_irqsave()
>   write_lock_bh()
>   write_unlock_bh()
>   spinunlock_irqrestore()

> Which is illegal locking.

> Jason

Yes. I figured that. But I don't know why, or why

spin_lock_irqsave(lock_a)
spin_lock_irqsave(lock_b)
spin_unlock_irqrestore(lock_b)
spin_unlock_irqrestore(lock_a)

Is clearly better which has been suggested as a fix.

Apparently someone above me is taking an irqsave lock and then calling
Down to rxe_create_ah(). Calling the verbs APIs while holding a spinlock seems likely to cause problems.
This looks to be in response to receiving a MAD so possibly is a call to ibv_create_ah_from_wc().

Bob

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Inconsistent lock state warning for rxe_poll_cq()
  2022-02-15 16:45             ` Pearson, Robert B
@ 2022-02-15 16:48               ` Jason Gunthorpe
  2022-02-15 16:53                 ` Pearson, Robert B
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Gunthorpe @ 2022-02-15 16:48 UTC (permalink / raw)
  To: Pearson, Robert B; +Cc: Bob Pearson, Bart Van Assche, Guoqing Jiang, linux-rdma

On Tue, Feb 15, 2022 at 04:45:01PM +0000, Pearson, Robert B wrote:
> 
> 
> From: Jason Gunthorpe <jgg@nvidia.com> 
> Sent: Tuesday, February 15, 2022 8:42 AM
> To: Bob Pearson <rpearsonhpe@gmail.com>
> Cc: Bart Van Assche <bvanassche@acm.org>; Guoqing Jiang <guoqing.jiang@linux.dev>; linux-rdma@vger.kernel.org
> Subject: Re: Inconsistent lock state warning for rxe_poll_cq()
> 
> On Mon, Feb 14, 2022 at 11:43:40PM -0600, Bob Pearson wrote:
> 
> >> in the write_unlock_bh() call. This appears to complain if hardirqs 
> >> are not enabled on the current cpu. This only happens if 
> >> CONFIG_PROVE_LOCKING=y.
> 
> > The trace shows this context is called within a irq disabled spinlock region. Ie this is trying to do
> 
> >   spinlock_irqsave()
> >   write_lock_bh()
> >   write_unlock_bh()
> >   spinunlock_irqrestore()
> 
> > Which is illegal locking.
> 
> > Jason
> 
> Yes. I figured that. But I don't know why, or why
> 
> spin_lock_irqsave(lock_a)
> spin_lock_irqsave(lock_b)
> spin_unlock_irqrestore(lock_b)
> spin_unlock_irqrestore(lock_a)
> 
> Is clearly better which has been suggested as a fix.

Right

> Apparently someone above me is taking an irqsave lock and then
> calling

Yes, the CM code

> Down to rxe_create_ah().

Yep

> Calling the verbs APIs while holding a spinlock seems likely to
> cause problems.

Indeed, I hate it, but it is hard to fix

> This looks to be in response to receiving a MAD so possibly is a
> call to ibv_create_ah_from_wc().

Yes something like that

Jason

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: Inconsistent lock state warning for rxe_poll_cq()
  2022-02-15 16:48               ` Jason Gunthorpe
@ 2022-02-15 16:53                 ` Pearson, Robert B
  2022-02-15 17:01                   ` Jason Gunthorpe
  0 siblings, 1 reply; 14+ messages in thread
From: Pearson, Robert B @ 2022-02-15 16:53 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Bob Pearson, Bart Van Assche, Guoqing Jiang, linux-rdma



-----Original Message-----
From: Jason Gunthorpe <jgg@nvidia.com> 
Sent: Tuesday, February 15, 2022 10:48 AM
To: Pearson, Robert B <robert.pearson2@hpe.com>
Cc: Bob Pearson <rpearsonhpe@gmail.com>; Bart Van Assche <bvanassche@acm.org>; Guoqing Jiang <guoqing.jiang@linux.dev>; linux-rdma@vger.kernel.org
Subject: Re: Inconsistent lock state warning for rxe_poll_cq()

> 
> >   spinlock_irqsave()
> >   write_lock_bh()
> >   write_unlock_bh()
> >   spinunlock_irqrestore()
> 
> > Which is illegal locking.
> 
> > Jason
> 
Jason, can you tell me what the problem is with this. I'm not fighting it just want to know.

Bob

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Inconsistent lock state warning for rxe_poll_cq()
  2022-02-15 16:53                 ` Pearson, Robert B
@ 2022-02-15 17:01                   ` Jason Gunthorpe
  2022-02-15 17:18                     ` Haakon Bugge
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Gunthorpe @ 2022-02-15 17:01 UTC (permalink / raw)
  To: Pearson, Robert B; +Cc: Bob Pearson, Bart Van Assche, Guoqing Jiang, linux-rdma

On Tue, Feb 15, 2022 at 04:53:25PM +0000, Pearson, Robert B wrote:
> > >   spinlock_irqsave()
> > >   write_lock_bh()
> > >   write_unlock_bh()
> > >   spinunlock_irqrestore()
> > 
> > > Which is illegal locking.
> > 
> > > Jason
> > 
> Jason, can you tell me what the problem is with this. I'm not fighting it just want to know.

I don't know the exact reason, judging from the code it looks like the
implementation of unlock_bh is allowed to assume hard irqs were not
blocked and unblock them - probably some micro-optimization.

I don't think there is a deadlock issue with the above because irqsave
is a superset of the _bh varient.

Jason

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Inconsistent lock state warning for rxe_poll_cq()
  2022-02-15 17:01                   ` Jason Gunthorpe
@ 2022-02-15 17:18                     ` Haakon Bugge
  0 siblings, 0 replies; 14+ messages in thread
From: Haakon Bugge @ 2022-02-15 17:18 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Pearson, Robert B, Bob Pearson, Bart Van Assche, Guoqing Jiang,
	OFED mailing list



> On 15 Feb 2022, at 18:01, Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> On Tue, Feb 15, 2022 at 04:53:25PM +0000, Pearson, Robert B wrote:
>>>>  spinlock_irqsave()
>>>>  write_lock_bh()
>>>>  write_unlock_bh()
>>>>  spinunlock_irqrestore()
>>> 
>>>> Which is illegal locking.
>>> 
>>>> Jason
>>> 
>> Jason, can you tell me what the problem is with this. I'm not fighting it just want to know.
> 
> I don't know the exact reason, judging from the code it looks like the
> implementation of unlock_bh is allowed to assume hard irqs were not
> blocked and unblock them - probably some micro-optimization.
> 
> I don't think there is a deadlock issue with the above because irqsave
> is a superset of the _bh varient.

I worked on an issue in peernet2id(), which called _unlock_bh() whilst IRQs were off. Simply, you cannot unlock a _bh lock while IRQs are off.

I did actually find archaic bugs reported due to this, see https://lore.kernel.org/netdev/7F058034-8A2B-4C19-A39E-12B0DB117328@oracle.com/


Thxs, Håkon



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2022-02-15 17:19 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-14 18:01 Inconsistent lock state warning for rxe_poll_cq() Bart Van Assche
2022-02-14 19:51 ` Bob Pearson
2022-02-14 19:55   ` Bart Van Assche
2022-02-14 20:25     ` Bob Pearson
2022-02-14 20:48       ` Bart Van Assche
2022-02-15  5:43         ` Bob Pearson
2022-02-15 12:16           ` Guoqing Jiang
2022-02-15 14:41           ` Jason Gunthorpe
2022-02-15 16:45             ` Pearson, Robert B
2022-02-15 16:48               ` Jason Gunthorpe
2022-02-15 16:53                 ` Pearson, Robert B
2022-02-15 17:01                   ` Jason Gunthorpe
2022-02-15 17:18                     ` Haakon Bugge
2022-02-15  4:50 ` Guoqing Jiang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.