* Linux kernel v4.15-rc4 and rdma_rxe
@ 2017-12-20  1:10 Bart Van Assche
       [not found] ` <1513732236.2535.25.camel-Sjgp3cTcYWE@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2017-12-20  1:10 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA


Hello,

Can anyone who is reading this list tell me whether or not the rdma_rxe
driver undergoes regular testing? It was a few months ago that I tried to
run the SRP protocol over that driver. When I tried again today I ran into
the call trace shown below. I can share the details of the test I ran in
case anyone would be interested.

watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:0:2110]
Modules linked in: ib_srp libcrc32c scsi_transport_srp crc32c_generic target_core_pscsi target_core_file ib_srpt target_core_iblock target_core_mod rdma_rxe ip6_udp_tunnel udp_tunnel ib_umad
ib_uverbs scsi_debug brd mq_deadline deadline_iosched cfq_iosched crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper sg serio_raw
virtio_balloon virtio_console button i2c_piix4 af_packet dm_multipath dm_mod dax scsi_dh_rdac scsi_dh_emc scsi_dh_alua ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi
scsi_transport_iscsi ip_tables x_tables ipv6 autofs4 ext4 crc16 mbcache jbd2 sd_mod virtio_blk virtio_scsi sr_mod cdrom virtio_net ata_generic pata_acpi ata_piix uhci_hcd psmouse libata i2c_core
crc32c_intel atkbd virtio_pci
ehci_hcd virtio_ring virtio intel_agp usbcore intel_gtt scsi_mod usb_common agpgart unix [last unloaded: scsi_transport_srp]
irq event stamp: 57031930
hardirqs last  enabled at (57031929): [<000000002d4feb01>] _raw_spin_unlock_irqrestore+0x5a/0x70
hardirqs last disabled at (57031930): [<00000000ec2d6096>] apic_timer_interrupt+0x98/0xb0
softirqs last  enabled at (165802): [<0000000034ea4c57>] __do_softirq+0x3af/0x52c
softirqs last disabled at (165701): [<000000008110c584>] irq_exit+0xb6/0xc0
CPU: 1 PID: 2110 Comm: kworker/1:0 Not tainted 4.15.0-rc4-dbg+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Workqueue: events_freezable_power_ disk_events_workfn
RIP: 0010:_raw_spin_unlock_irqrestore+0x5f/0x70
RSP: 0018:ffffa6a7cf9933f8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
RAX: ffff8dea7c1ed0c0 RBX: 0000000000000246 RCX: 0000000000000006
RDX: 0000000000000007 RSI: ffff8dea7c1ed9d0 RDI: 0000000000000246
RBP: ffffa6a7cf993408 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8dea7fc91080
R13: 0000000000000001 R14: 000000000000008f R15: ffff8dea680ca170
FS:  0000000000000000(0000) GS:ffff8dea7fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffc75667cd0 CR3: 0000000058a0f002 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
mod_timer+0x1b9/0x4b0
? _raw_spin_unlock_irqrestore+0x45/0x70
rxe_completer+0xa07/0xbf0 [rdma_rxe]
rxe_do_task+0x7c/0x100 [rdma_rxe]
rxe_run_task+0x16/0x30 [rdma_rxe]
rxe_comp_queue_pkt+0x42/0x50 [rdma_rxe]
rxe_rcv+0x2f1/0x8b0 [rdma_rxe]
rxe_loopback+0x9/0x10 [rdma_rxe]
? rxe_loopback+0x9/0x10 [rdma_rxe]
send_ack+0x9a/0x190 [rdma_rxe]
rxe_responder+0x1637/0x1db0 [rdma_rxe]
? _raw_spin_unlock_irqrestore+0x5a/0x70
? trace_hardirqs_on_caller+0xf4/0x190
rxe_do_task+0x7c/0x100 [rdma_rxe]
rxe_run_task+0x16/0x30 [rdma_rxe]
rxe_resp_queue_pkt+0x42/0x50 [rdma_rxe]
rxe_rcv+0x364/0x8b0 [rdma_rxe]
rxe_loopback+0x9/0x10 [rdma_rxe]
? rxe_loopback+0x9/0x10 [rdma_rxe]
rxe_requester+0x6ea/0x1160 [rdma_rxe]
rxe_do_task+0x7c/0x100 [rdma_rxe]
rxe_run_task+0x16/0x30 [rdma_rxe]
rxe_post_send+0x2f0/0x550 [rdma_rxe]
srp_queuecommand+0x29f/0x430 [ib_srp]
? __lock_is_held+0x5a/0xa0
scsi_dispatch_cmd+0xfd/0x400 [scsi_mod]
scsi_queue_rq+0x4ed/0x640 [scsi_mod]
blk_mq_dispatch_rq_list+0x71/0x420
? blk_mq_dequeue_from_ctx+0x128/0x1b0
blk_mq_do_dispatch_ctx+0xd2/0x120
blk_mq_sched_dispatch_requests+0x14e/0x1a0
__blk_mq_run_hw_queue+0x12a/0x1d0
__blk_mq_delay_run_hw_queue+0x9f/0xf0
blk_mq_run_hw_queue+0xb9/0x1e0
blk_mq_sched_insert_request+0x9d/0x190
? __lockdep_init_map+0x52/0x1d0
blk_execute_rq_nowait+0x7c/0x100
blk_execute_rq+0x5f/0xb0
scsi_execute+0xd6/0x1f0 [scsi_mod]
scsi_test_unit_ready+0x4c/0xc0 [scsi_mod]
sd_check_events+0xc9/0x170 [sd_mod]
disk_check_events+0x4e/0x120
disk_events_workfn+0x17/0x20
process_one_work+0x211/0x6a0
worker_thread+0x38/0x3b0
kthread+0x124/0x140
? process_one_work+0x6a0/0x6a0
? kthread_create_worker_on_cpu+0x40/0x40
ret_from_fork+0x24/0x30
Code: 00 00 e8 25 c7 b3 ff bf 01 00 00 00 e8 fb f8 b0 ff 65 8b 05 f4 f3 a8 4c 85 c0 74 16 5b 41 5c 5d c3 e8 f6 f2 b3 ff 48 89 df 57 9d <0f> 1f 44 00 00 eb d5 e8 7d 40 a8 ff 5b 41 5c 5d c3 55 48 89
e5


* Re: Linux kernel v4.15-rc4 and rdma_rxe
       [not found] ` <1513732236.2535.25.camel-Sjgp3cTcYWE@public.gmane.org>
@ 2017-12-21  8:23   ` Moni Shoua
       [not found]     ` <CAG9sBKP=i9ukbkj7x11oW=yLAz8u2RYHTMbf_Z1h3=nQ3pjfLw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Moni Shoua @ 2017-12-21  8:23 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Wed, Dec 20, 2017 at 3:10 AM, Bart Van Assche <Bart.VanAssche-Sjgp3cTcYWE@public.gmane.org> wrote:
> Hello,
>
> Can anyone who is reading this list tell me whether or not the rdma_rxe
> driver undergoes regular testing? It was a few months ago that I tried to
> run the SRP protocol over that driver. When I tried again today I ran into
> the call trace shown below. I can share the details of the test I ran in
> case anyone would be interested.

Hi Bart
Thanks for the report.
Please share the test details and we'll look into that.

thanks
Moni
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Linux kernel v4.15-rc4 and rdma_rxe
       [not found]     ` <CAG9sBKP=i9ukbkj7x11oW=yLAz8u2RYHTMbf_Z1h3=nQ3pjfLw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-12-22 23:01       ` Bart Van Assche
       [not found]         ` <1513983674.2579.27.camel-Sjgp3cTcYWE@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2017-12-22 23:01 UTC (permalink / raw)
  To: monis-VPRAkNaXOzVWk0Htik3J/w; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Thu, 2017-12-21 at 10:23 +0200, Moni Shoua wrote:
> On Wed, Dec 20, 2017 at 3:10 AM, Bart Van Assche <Bart.VanAssche@wdc.com> wrote:
> > Can anyone who is reading this list tell me whether or not the rdma_rxe
> > driver undergoes regular testing? It was a few months ago that I tried to
> > run the SRP protocol over that driver. When I tried again today I ran into
> > the call trace shown below. I can share the details of the test I ran in
> > case anyone would be interested.
> 
> Please share the test details and we'll look into that.

Thanks Moni!

The call trace in my previous e-mail was caused by a bug in the SRP initiator
driver. I will post the patches that fix that bug after the holidays. But even
after having fixed that bug I noticed a remarkable behavior difference between
the mlx4_ib and rxe drivers. ib_srpt channels get closed properly when using
the mlx4 driver but not when using the rxe driver. The test I ran is as follows:
* Clone, build and install the kernel from branch block-scsi-for-next of
  repository https://github.com/bvanassche/linux. Make sure that the SRP
  initiator and target drivers are enabled in the kernel config. I plan to post
  all patches that are in that repository and that are not yet upstream after
  the holidays.
* Clone https://github.com/bvanassche/srp-test.
* Edit /etc/multipath.conf as indicated in the README.md document in the
  srp-test repository.
* Start multipathd.
* If I run the following command on a system with a ConnectX-3 adapter:
    srp-test/run_tests -d -r 10 -t 02-mq
  then the test finishes after about 11 seconds.
  But if I run the following command on a system without any RDMA adapters:
    srp-test/run_tests -c -d -r 10 -t 02-mq
  then the following output appears:

Unloaded the ib_srpt kernel module
Unloaded the rdma_rxe kernel module
SoftRoCE network interfaces: rxe0
Zero-initializing /dev/ram0 ... done
Zero-initializing /dev/ram1 ... done
Zero-initializing /dev/sdb ... done
Configured SRP target driver
Running test /home/bart/software/infiniband/srp-test/tests/02-mq ...
Test file I/O on top of multipath concurrently with logout and login (0 min; mq)
Using /dev/disk/by-id/dm-uuid-mpath-3600140572616d6469736b31000000000 -> ../../dm-2
Unmounting /root/mnt1 from /dev/mapper/mpathb
SRP LUN /sys/class/scsi_device/5:0:0:0 / sdc: removing /dev/dm-2: done
SRP LUN /sys/class/scsi_device/5:0:0:1 / sde: removing /dev/dm-1: done
SRP LUN /sys/class/scsi_device/5:0:0:2 / sdd: removing /dev/dm-0: done
Unloaded the ib_srp kernel module
Test /home/bart/software/infiniband/srp-test/tests/02-mq succeeded
1 tests succeeded and 0 tests failed

[ test script hangs ]

While the test script hangs the following appears in the system log (please note
that the ib_srpt:srpt_zerolength_write_done: ib_srpt wc->status message is missing):

ib_srpt:srpt_close_ch: ib_srpt 192.168.122.76-32: queued zerolength write
[ ... ]
ib_srpt srpt_disconnect_ch_sync(192.168.122.76-18 state 3): still waiting ...
[ ... ]
INFO: task rmdir:3215 blocked for more than 120 seconds.
      Not tainted 4.15.0-rc4-dbg+ #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rmdir           D13912  3215   3208 0x00000000
Call Trace:
 __schedule+0x2ad/0xb90
 schedule+0x31/0x90
 schedule_timeout+0x1fb/0x590
 wait_for_completion_timeout+0x11a/0x180
 srpt_close_session+0xba/0x180 [ib_srpt]
 target_shutdown_sessions+0xc8/0xd0 [target_core_mod]
 core_tpg_del_initiator_node_acl+0x7c/0x130 [target_core_mod]
 target_fabric_nacl_base_release+0x20/0x30 [target_core_mod]
 config_item_release+0x5a/0xc0 [configfs]
 config_item_put+0x21/0x24 [configfs]
 configfs_rmdir+0x1ef/0x2f0 [configfs]
 vfs_rmdir+0x6e/0x150
 do_rmdir+0x168/0x1c0
 SyS_rmdir+0x11/0x20

Thanks,

Bart.


* Re: Linux kernel v4.15-rc4 and rdma_rxe
       [not found]         ` <1513983674.2579.27.camel-Sjgp3cTcYWE@public.gmane.org>
@ 2017-12-25 13:02           ` Moni Shoua
       [not found]             ` <CAG9sBKOOvvwG6yaQfWG=c04-LQ-Yvw8QbD-jzTm-zJYOFJeEVQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Moni Shoua @ 2017-12-25 13:02 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

> The call trace in my previous e-mail was caused by a bug in the SRP initiator
> driver. I will post the patches that fix that bug after the holidays. But even
> after having fixed that bug I noticed a remarkable behavior difference between
> the mlx4_ib and rxe drivers. ib_srpt channels get closed properly when using
> the mlx4 driver but not when using the rxe driver. The test I ran is as follows:
> * Clone, build and install the kernel from branch block-scsi-for-next of
>   repository https://github.com/bvanassche/linux. Make sure that the SRP
>   initiator and target drivers are enabled in the kernel config. I plan to post
>   all patches that are in that repository and that are not yet upstream after
>   the holidays.
> * Clone https://github.com/bvanassche/srp-test.
> * Edit /etc/multipath.conf as indicated in the README.md document in the
>   srp-test repository.
> * Start multipathd.
> * If I run the following command on a system with a ConnectX-3 adapter:
>     srp-test/run_tests -d -r 10 -t 02-mq
>   then the test finishes after about 11 seconds.
>   But if I run the following command on a system without any RDMA adapters:
>     srp-test/run_tests -c -d -r 10 -t 02-mq
>   then the following output appears:
>
> Unloaded the ib_srpt kernel module
> Unloaded the rdma_rxe kernel module
> SoftRoCE network interfaces: rxe0
> Zero-initializing /dev/ram0 ... done
> Zero-initializing /dev/ram1 ... done
> Zero-initializing /dev/sdb ... done
> Configured SRP target driver
> Running test /home/bart/software/infiniband/srp-test/tests/02-mq ...
> Test file I/O on top of multipath concurrently with logout and login (0 min; mq)
> Using /dev/disk/by-id/dm-uuid-mpath-3600140572616d6469736b31000000000 -> ../../dm-2
> Unmounting /root/mnt1 from /dev/mapper/mpathb
> SRP LUN /sys/class/scsi_device/5:0:0:0 / sdc: removing /dev/dm-2: done
> SRP LUN /sys/class/scsi_device/5:0:0:1 / sde: removing /dev/dm-1: done
> SRP LUN /sys/class/scsi_device/5:0:0:2 / sdd: removing /dev/dm-0: done
> Unloaded the ib_srp kernel module
> Test /home/bart/software/infiniband/srp-test/tests/02-mq succeeded
> 1 tests succeeded and 0 tests failed
>
> [ test script hangs ]
>
> While the test script hangs the following appears in the system log (please note
> that the ib_srpt:srpt_zerolength_write_done: ib_srpt wc->status message is missing):
>
> ib_srpt:srpt_close_ch: ib_srpt 192.168.122.76-32: queued zerolength write
> [ ... ]
> ib_srpt srpt_disconnect_ch_sync(192.168.122.76-18 state 3): still waiting ...
> [ ... ]
> INFO: task rmdir:3215 blocked for more than 120 seconds.
>       Not tainted 4.15.0-rc4-dbg+ #2
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> rmdir           D13912  3215   3208 0x00000000
> Call Trace:
>  __schedule+0x2ad/0xb90
>  schedule+0x31/0x90
>  schedule_timeout+0x1fb/0x590
>  wait_for_completion_timeout+0x11a/0x180
>  srpt_close_session+0xba/0x180 [ib_srpt]
>  target_shutdown_sessions+0xc8/0xd0 [target_core_mod]
>  core_tpg_del_initiator_node_acl+0x7c/0x130 [target_core_mod]
>  target_fabric_nacl_base_release+0x20/0x30 [target_core_mod]
>  config_item_release+0x5a/0xc0 [configfs]
>  config_item_put+0x21/0x24 [configfs]
>  configfs_rmdir+0x1ef/0x2f0 [configfs]
>  vfs_rmdir+0x6e/0x150
>  do_rmdir+0x168/0x1c0
>  SyS_rmdir+0x11/0x20
>

Hi Bart
Thanks for the detailed answer.
1. I will do my best to add more tests to the RXE regression suite. However,
it may take a while.
2. Differences in behavior don't necessarily mean that at least one
implementation is wrong. From what you describe it is hard to understand
what you think is wrong with RXE. If I understand it right, the script
tried to delete a directory that ib_srpt owns (configfs or such?) and
this operation waits for a completion. If this is right, do you know
who is expected to call complete()? It sounds unlikely that rxe is the
one.
3. Despite that, let's try this: when the script hangs, can you run echo t
> /proc/sysrq-trigger and see if there is something in dmesg that can
explain the hang? Maybe a trace that rdma_rxe is part of?

thanks
Moni

* Re: Linux kernel v4.15-rc4 and rdma_rxe
       [not found]             ` <CAG9sBKOOvvwG6yaQfWG=c04-LQ-Yvw8QbD-jzTm-zJYOFJeEVQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-01-02 16:29               ` Bart Van Assche
       [not found]                 ` <dd87887d-8641-c8fc-48f1-234e7cf67cb1-Sjgp3cTcYWE@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2018-01-02 16:29 UTC (permalink / raw)
  To: Moni Shoua; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 12/25/17 05:02, Moni Shoua wrote:
> 1. I will do my best to add more tests to the RXE regression suite. However,
> it may take a while.
> 2. Differences in behavior don't necessarily mean that at least one
> implementation is wrong. From what you describe it is hard to understand
> what you think is wrong with RXE. If I understand it right, the script
> tried to delete a directory that ib_srpt owns (configfs or such?) and
> this operation waits for a completion. If this is right, do you know
> who is expected to call complete()? It sounds unlikely that rxe is the
> one.
> 3. Despite that, let's try this: when the script hangs, can you run echo t
> > /proc/sysrq-trigger and see if there is something in dmesg that can
> explain the hang? Maybe a trace that rdma_rxe is part of?

Hello Moni,

The ib_srpt driver uses zero-length writes to trigger the completion 
handler if either an RTU event is received or an RDMA channel is being 
closed. In the log I saw the message "queued zerolength write" appear 
but not "srpt_zerolength_write_done: wc->status = ..." when the hang was 
observed. That made me wonder whether the rxe driver perhaps suppresses 
completions for zero-length writes if the queue pair state is changed 
to IB_QPS_ERR. I think the IB spec requires queuing an error completion 
for pending work requests upon the transition to IB_QPS_ERR.

Thanks,

Bart.

* Re: Linux kernel v4.15-rc4 and rdma_rxe
       [not found]                 ` <dd87887d-8641-c8fc-48f1-234e7cf67cb1-Sjgp3cTcYWE@public.gmane.org>
@ 2018-01-02 17:44                   ` Moni Shoua
       [not found]                     ` <CAG9sBKOtxoG-HLU5A=WEZqUuADzje74_AjZgtG21Ueeb2YCAEQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Moni Shoua @ 2018-01-02 17:44 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

>
> Hello Moni,
>
> The ib_srpt driver uses zero-length writes to trigger the completion handler
> if either an RTU event is received or an RDMA channel is being closed. In
> the log I saw the message "queued zerolength write" appear but not
> "srpt_zerolength_write_done: wc->status = ..." when the hang was observed.
> That made me wonder whether the rxe driver perhaps suppresses completions
> for zero-length writes if the queue pair state is changed to IB_QPS_ERR.
> I think the IB spec requires queuing an error completion for pending
> work requests upon the transition to IB_QPS_ERR.
>

Hi Bart
This is great input for the debugger (whoever that may be). From a brief
look at the code I see that an error-state QP is checked during the
validation of an RDMA_WRITE request. In that case a completion is
generated and the size of the buffer to write is irrelevant.
However, to verify that I am not wrong you can add some printk() calls in
the path that starts with rxe_responder(). When the flow reaches
check_resource() and the QP is in the ERROR state, the function returns
RESPST_COMPLETE. The next step in the state machine would be to call
the do_complete() function.

thanks

Moni

* Re: Linux kernel v4.15-rc4 and rdma_rxe
       [not found]                     ` <CAG9sBKOtxoG-HLU5A=WEZqUuADzje74_AjZgtG21Ueeb2YCAEQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-01-03  0:53                       ` Bart Van Assche
       [not found]                         ` <1514940799.14857.20.camel-Sjgp3cTcYWE@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2018-01-03  0:53 UTC (permalink / raw)
  To: monis-VPRAkNaXOzVWk0Htik3J/w; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Tue, 2018-01-02 at 19:44 +0200, Moni Shoua wrote:
> This is great input for the debugger (whoever that may be). From a brief
> look at the code I see that an error-state QP is checked during the
> validation of an RDMA_WRITE request. In that case a completion is
> generated and the size of the buffer to write is irrelevant.
> However, to verify that I am not wrong you can add some printk() calls in
> the path that starts with rxe_responder(). When the flow reaches
> check_resource() and the QP is in the ERROR state, the function returns
> RESPST_COMPLETE. The next step in the state machine would be to call
> the do_complete() function.

Hello Moni,

Thanks for the feedback and the suggestion. I will check the ib_srpt code
further for possible race conditions. But after I had enabled the dynamic
debugging statements in the rdma_rxe driver I ran into something that I
don't think is caused by the ib_srpt driver (with memory poisoning
enabled):

rdma_rxe:rxe_responder: rdma_rxe: qp#19 state = CLEANUP
rdma_rxe:rxe_responder: rdma_rxe: qp#19 state = DONE
general protection fault: 0000 [#1] PREEMPT SMP
CPU: 1 PID: 1385 Comm: kworker/1:26 Not tainted 4.15.0-rc4-dbg+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Workqueue: target_completion target_complete_ok_work [target_core_mod]
RIP: 0010:__lock_acquire+0xe4/0x13b0
RSP: 0018:ffff944ec40df9b0 EFLAGS: 00010002
RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8acf4b076748
RBP: ffff944ec40dfa80 R08: 0000000000000001 R09: ffffffffc061d1b7
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8acf5e8f2880 R14: 0000000000000001 R15: ffff8acf4b076748
FS:  0000000000000000(0000) GS:ffff8acf7fc8000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffab80d000 CR3: 000000005fa0f005 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
lock_acquire+0xac/0x230
_raw_spin_lock_irqsave+0x45/0x60
rxe_do_task+0x87/0x100 [rdma_rxe]
rxe_run_task+0x16/0x30 [rdma_rxe]
rxe_resp_queue_pkt+0x42/0x50 [rdma_rxe]
rxe_rcv+0x363/0x8b0 [rdma_rxe]
rxe_loopback+0x9/0x10 [rdma_rxe]
rxe_requester+0x6ea/0x1160 [rdma_rxe]
rxe_do_task+0x7c/0x100 [rdma_rxe]
rxe_run_task+0x16/0x30 [rdma_rxe]
rxe_post_send+0x2f0/0x550 [rdma_rxe]
srpt_queue_response+0x20c/0x400 [ib_srpt]
srpt_queue_status+0x28/0x40 [ib_srpt]
target_complete_ok_work+0x1ea/0x520 [target_core_mod]
process_one_work+0x211/0x6a0
worker_thread+0x38/0x3b0
kthread+0x124/0x140

(gdb) list *(rxe_do_task+0x87)
0xc1e7 is in rxe_do_task (drivers/infiniband/sw/rxe/rxe_task.c:90).
85              do {
86                      cont = 0;
87                      ret = task->func(task->arg);
88
89                      spin_lock_irqsave(&task->state_lock, flags);
90                      switch (task->state) {
91                      case TASK_STATE_BUSY:
92                              if (ret)
93                                      task->state = TASK_STATE_START;
94                              else

From the disas rxe_do_task output:
   0x000000000000c1d9 <+121>:   callq  *0x78(%rbx)
   0x000000000000c1dc <+124>:   mov    %r12,%rdi
   0x000000000000c1df <+127>:   mov    %eax,%r14d
   0x000000000000c1e2 <+130>:   callq  0xc1e7 <rxe_do_task+135>

Does this perhaps mean that the rxe_qp structure can be freed while rxe_do_task()
is in progress? Please note that the ib_srpt driver only destroys a QP
(srpt_destroy_ch_ib() call in srpt_release_channel_work()) after all SCSI command
processing has finished (transport_deregister_session()).

Thanks,

Bart.


* Re: Linux kernel v4.15-rc4 and rdma_rxe
       [not found]                         ` <1514940799.14857.20.camel-Sjgp3cTcYWE@public.gmane.org>
@ 2018-01-03  5:13                           ` Moni Shoua
       [not found]                             ` <CAG9sBKPhnknQPYX=s-y9-tonxTRrE05gU5W75x0FhKmTdggFmQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Moni Shoua @ 2018-01-03  5:13 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

> Does this perhaps mean that the rxe_qp structure can be freed while rxe_do_task()
> is in progress? Please note that the ib_srpt driver only destroys a QP
> (srpt_destroy_ch_ib() call in srpt_release_channel_work()) after all SCSI command
> processing has finished (transport_deregister_session()).
>
> Thanks,
>
> Bart.
Hi Bart
If I understand right, you are saying that the system hangs when trying to
take a lock in rxe_do_task() (line 89). Is that right?
Anyway, it's possible that you hit a bug related to destroying a QP.

thanks

* Re: Linux kernel v4.15-rc4 and rdma_rxe
       [not found]                             ` <CAG9sBKPhnknQPYX=s-y9-tonxTRrE05gU5W75x0FhKmTdggFmQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-01-03 21:25                               ` Bart Van Assche
  2018-01-05 21:18                               ` Bart Van Assche
  1 sibling, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2018-01-03 21:25 UTC (permalink / raw)
  To: monis-VPRAkNaXOzVWk0Htik3J/w; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Wed, 2018-01-03 at 07:13 +0200, Moni Shoua wrote:
> > Does this perhaps mean that the rxe_qp structure can be freed while rxe_do_task()
> > is in progress? Please note that the ib_srpt driver only destroys a QP
> > (srpt_destroy_ch_ib() call in srpt_release_channel_work()) after all SCSI command
> > processing has finished (transport_deregister_session()).
> 
> If I understand right, you are saying that the system hangs when trying to
> take a lock in rxe_do_task() (line 89). Is that right?
> Anyway, it's possible that you hit a bug related to destroying a QP.

Hello Moni,

The issues I had reported may be unrelated. BTW, this is what I saw appearing
in the system log a few minutes ago:

Jan  3 13:03:56 ubuntu-vm kernel: ib_srpt:srpt_close_ch: ib_srpt 192.168.122.76-18: queued zerolength write
Jan  3 13:03:56 ubuntu-vm kernel: rdma_rxe:rxe_completer: rdma_rxe: rxe_completer(): qp valid 1, state ERROR
[ ... ]
Jan  3 13:04:09 ubuntu-vm kernel: ib_srpt:srpt_disconnect_ch_sync: ib_srpt ch 192.168.122.76-18 state 3
[ ... ]
Jan  3 13:04:14 ubuntu-vm kernel: ib_srpt srpt_disconnect_ch_sync(192.168.122.76-18 state 3): still waiting ...

In other words, the ib_srpt driver had queued a zero-length write and changed
the QP state into ERROR but no completion was queued for that zero-length write.
The rdma_rxe log message was generated by the following code:

diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index 6cdc40ed8a9f..f6c40edbddc6 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -550,6 +550,9 @@ int rxe_completer(void *arg)
 
 	if (!qp->valid || qp->req.state == QP_STATE_ERROR ||
 	    qp->req.state == QP_STATE_RESET) {
+		pr_debug("rxe_completer(): qp valid %d, state %s\n",
+			 qp->valid, qp->req.state == QP_STATE_ERROR ? "ERROR" :
+			 qp->req.state == QP_STATE_RESET ? "RESET" : "(?)");
 		rxe_drain_resp_pkts(qp, qp->valid &&
 				    qp->req.state == QP_STATE_ERROR);
 		goto exit;

Bart.


* Re: Linux kernel v4.15-rc4 and rdma_rxe
       [not found]                             ` <CAG9sBKPhnknQPYX=s-y9-tonxTRrE05gU5W75x0FhKmTdggFmQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2018-01-03 21:25                               ` Bart Van Assche
@ 2018-01-05 21:18                               ` Bart Van Assche
  1 sibling, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2018-01-05 21:18 UTC (permalink / raw)
  To: monis-VPRAkNaXOzVWk0Htik3J/w; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA


On Wed, 2018-01-03 at 07:13 +0200, Moni Shoua wrote:
> Anyway, It's possible that you hit a bug related to destroying a QP.

The patch below is not a full solution but seems to help. It would be
appreciated if you could have a look at this patch.

Thanks,

Bart.

diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index 6a6995159ec6..b6aa5c67e0eb 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -601,12 +601,8 @@ int rxe_requester(void *arg)
 	rxe_add_ref(qp);
 
 next_wqe:
-	if (unlikely(!qp->valid)) {
-		rxe_drain_req_pkts(qp, true);
-		goto exit;
-	}
-
-	if (unlikely(qp->req.state == QP_STATE_ERROR)) {
+	if (unlikely(!qp->valid || qp->req.state == QP_STATE_ERROR)) {
+		rxe_run_task(&qp->resp.task, 1);
 		rxe_drain_req_pkts(qp, true);
 		goto exit;
 	}

