* rxe panic
@ 2019-12-25 4:55 Frank Huang
2019-12-25 5:27 ` Zhu Yanjun
2019-12-25 6:32 ` Leon Romanovsky
0 siblings, 2 replies; 12+ messages in thread
From: Frank Huang @ 2019-12-25 4:55 UTC
To: linux-rdma
Hi, there is a panic in the rdma_rxe module when I restart
network.service or shut down the switch.
It looks like a use-after-free error.
Every time it happens, the log shows "rdma_rxe: Unknown layer 3 protocol: 0".
Is this a known error?
My kernel version is 4.14.97.
[448840.314544] rdma_rxe: Unknown layer 3 protocol: 0
[448840.314626] general protection fault: 0000 [#1] SMP PTI
[448840.314627] Modules linked in: binfmt_misc ib_isert
iscsi_target_mod ib_srpt target_core_mod rpcrdma ib_iser ib_srp
scsi_transport_srp rdma_rxe(OE) ib_ipoib ib_umad ip6_udp_tunnel
udp_tunnel rdma_ucm rdma_cm iw_cm ib_cm ib_uverbs ib_core
ebtable_filter ebtables devlink ip6table_filter ip6_tables
ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink iptable_nat
xt_addrtype xt_conntrack br_netfilter bridge stp llc overlay
ip_set_hash_ip ip_set nfnetlink iscsi_tcp libiscsi_tcp libiscsi
scsi_transport_iscsi sch_ingress openvswitch nf_conntrack_ipv6
nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
nf_defrag_ipv6 nf_nat nf_conntrack libcrc32c sunrpc intel_rapl
x86_pkg_temp_thermal intel_powerclamp coretemp vfat fat kvm_intel kvm
irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
intel_cstate
[448840.314677] intel_uncore intel_rapl_perf mxm_wmi iTCO_wdt
iTCO_vendor_support ipmi_ssif pcspkr i2c_i801 lpc_ich ipmi_si
ipmi_devintf ipmi_msghandler pcc_cpufreq shpchp wmi ast drm_kms_helper
ttm crc32c_intel drm ixgbe igb mdio ptp pps_core dca i2c_algo_bit
[448840.314700] CPU: 1 PID: 17 Comm: ksoftirqd/1 Tainted: G OE 4.14.97-el7.centos.x86_64 #1
[448840.314701] Hardware name: /80010211 , BIOS 3.12 11/27/2018
[448840.314703] task: ffff9ce768af8000 task.stack: ffffbd7c4c6c4000
[448840.314710] RIP: 0010:rxe_elem_release+0xf/0x60 [rdma_rxe]
[448840.314711] RSP: 0018:ffffbd7c4c6c7d28 EFLAGS: 00010246
[448840.314713] RAX: 0000000000000000 RBX: 2917351aae258b92 RCX: 0000000000000000
[448840.314714] RDX: ffff9cfb3f64ba40 RSI: 000000000000026c RDI: ffff9cfb3f678008
[448840.314715] RBP: ffff9cfb3f678000 R08: 0000000000000201 R09: ffffbd7c4df35000
[448840.314716] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[448840.314717] R13: 000000000000001d R14: 0000000000000006 R15: ffff9cfb3f678000
[448840.314719] FS: 0000000000000000(0000) GS:ffff9ce76f840000(0000) knlGS:0000000000000000
[448840.314720] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[448840.314721] CR2: 00007f4fc400f000 CR3: 000000260420a005 CR4: 00000000001626e0
[448840.314723] Call Trace:
[448840.314730] rxe_responder+0xcf0/0x1fe0 [rdma_rxe]
[448840.314738] ? check_preempt_wakeup+0x125/0x240
[448840.314742] ? check_preempt_curr+0x84/0x90
[448840.314745] ? ttwu_do_wakeup+0x19/0x140
[448840.314747] ? try_to_wake_up+0x54/0x450
[448840.314751] rxe_do_task+0x8b/0x100 [rdma_rxe]
[448840.314754] tasklet_action+0xfe/0x110
[448840.314758] __do_softirq+0xd9/0x2a2
[448840.314761] run_ksoftirqd+0x1e/0x70
[448840.314763] smpboot_thread_fn+0x10e/0x160
[448840.314766] kthread+0xff/0x140
[448840.314768] ? sort_range+0x20/0x20
[448840.314770] ? __kthread_parkme+0x90/0x90
[448840.314771] ret_from_fork+0x35/0x40
[448840.314773] Code: 7a 00 00 74 04 31 c0 eb c3 4c 89 e7 e8 bb f9 ff ff 31 c0 eb b7 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 8d 6f f8 53 48 8b 5f f8 <48> 8b 43 20 48 85 c0 74 08 48 89 ef e8 60 1c 53 fb 8b 43 30 48
[448840.314817] RIP: rxe_elem_release+0xf/0x60 [rdma_rxe] RSP: ffffbd7c4c6c7d28
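A note on reading the oops above: the marked bytes in the Code: line, <48> 8b 43 20, decode to mov 0x20(%rbx),%rax, and RBX holds 2917351aae258b92. That value does not look like a canonical x86-64 address, which would explain why the CPU raised a general protection fault rather than a page fault. Below is a minimal sketch of that check, assuming 48-bit virtual addresses (4-level paging); is_canonical48 is a made-up helper name for illustration, not a kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: 48-bit virtual addresses (4-level paging).
 * An address is canonical when bits 63..47 are all copies of bit 47,
 * i.e. the top 17 bits are either all zero or all one.
 */
static bool is_canonical48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* bits 63..47 */

	return top == 0 || top == 0x1ffff;
}
```

Under that assumption, is_canonical48(0x2917351aae258b92ULL) is false, while the RDI value ffff9cfb3f678008 from the same register dump passes. This is consistent with RBX having been loaded from memory that was already freed or reused, though it does not prove it.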
* Re: rxe panic
2019-12-25 4:55 rxe panic Frank Huang
@ 2019-12-25 5:27 ` Zhu Yanjun
2019-12-25 6:01 ` Frank Huang
2019-12-25 6:32 ` Leon Romanovsky
1 sibling, 1 reply; 12+ messages in thread
From: Zhu Yanjun @ 2019-12-25 5:27 UTC
To: Frank Huang; +Cc: linux-rdma
Is there any vmcore about this problem?
* Re: rxe panic
2019-12-25 5:27 ` Zhu Yanjun
@ 2019-12-25 6:01 ` Frank Huang
2019-12-25 6:34 ` Zhu Yanjun
0 siblings, 1 reply; 12+ messages in thread
From: Frank Huang @ 2019-12-25 6:01 UTC
To: Zhu Yanjun; +Cc: linux-rdma
Yes.
What information should I post?
crash> bt
PID: 108 TASK: ffff978e28548000 CPU: 16 COMMAND: "ksoftirqd/16"
#0 [ffffa2f14c9a7b18] machine_kexec at ffffffff8f059992
#1 [ffffa2f14c9a7b70] __crash_kexec at ffffffff8f13cf7d
#2 [ffffa2f14c9a7c38] crash_kexec at ffffffff8f13e089
#3 [ffffa2f14c9a7c50] oops_end at ffffffff8f027a77
#4 [ffffa2f14c9a7c70] general_protection at ffffffff8fa01635
[exception RIP: rxe_elem_release+15]
RIP: ffffffffc08da38f RSP: ffffa2f14c9a7d28 RFLAGS: 00010246
RAX: 0000000000000000 RBX: 860e42124013b0aa RCX: 0000000000000000
RDX: ffff978e03ba8900 RSI: 0000000000000281 RDI: ffff978e02e746e8
RBP: ffff978e02e746e0 R8: 0000000000000201 R9: ffffa2f14dcb9000
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: 000000000000001d R14: 0000000000000006 R15: ffff978e02e746e0
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#5 [ffffa2f14c9a7d38] rxe_responder at ffffffffc08d7d10 [rdma_rxe]
#6 [ffffa2f14c9a7e48] rxe_do_task at ffffffffc08e060b [rdma_rxe]
#7 [ffffa2f14c9a7e70] tasklet_action at ffffffff8f0afa1e
#8 [ffffa2f14c9a7e88] __softirqentry_text_start at ffffffff8fc000d9
#9 [ffffa2f14c9a7ee0] run_ksoftirqd at ffffffff8f0afa4e
#10 [ffffa2f14c9a7ee8] smpboot_thread_fn at ffffffff8f0cca5e
#11 [ffffa2f14c9a7f10] kthread at ffffffff8f0c8c9f
#12 [ffffa2f14c9a7f50] ret_from_fork at ffffffff8fa00205
crash> dis -l ffffffffc08d7d10
0xffffffffc08d7d10 <rxe_responder+3312>: jmpq 0xffffffffc08d7c6c <rxe_responder+3148>
crash>
0xffffffffc08d7c97 <rxe_responder+3191>: mov 0xec(%r15),%eax
0xffffffffc08d7c9e <rxe_responder+3198>: cmp $0x2,%eax
0xffffffffc08d7ca1 <rxe_responder+3201>: je 0xffffffffc08d8213 <rxe_responder+4595>
0xffffffffc08d7ca7 <rxe_responder+3207>: cmp $0x3,%eax
0xffffffffc08d7caa <rxe_responder+3210>: jne 0xffffffffc08d7ecc <rxe_responder+3756>
0xffffffffc08d7cb0 <rxe_responder+3216>: mov 0x450(%r15),%eax
0xffffffffc08d7cb7 <rxe_responder+3223>: cmp $0x20,%eax
0xffffffffc08d7cba <rxe_responder+3226>: jl 0xffffffffc08d873e <rxe_responder+5918>
0xffffffffc08d7cc0 <rxe_responder+3232>: cmp $0x21,%eax
0xffffffffc08d7cc3 <rxe_responder+3235>: jle 0xffffffffc08d8725 <rxe_responder+5893>
0xffffffffc08d7cc9 <rxe_responder+3241>: sub $0x26,%eax
0xffffffffc08d7ccc <rxe_responder+3244>: cmp $0x1,%eax
0xffffffffc08d7ccf <rxe_responder+3247>: ja 0xffffffffc08d873e <rxe_responder+5918>
0xffffffffc08d7cd5 <rxe_responder+3253>: movzbl 0x2d(%rbx),%eax
0xffffffffc08d7cd9 <rxe_responder+3257>: sub $0x27,%eax
0xffffffffc08d7cdc <rxe_responder+3260>: cmp $0x3,%al
0xffffffffc08d7cde <rxe_responder+3262>: sbb %r13d,%r13d
0xffffffffc08d7ce1 <rxe_responder+3265>: and $0xfffffff0,%r13d
0xffffffffc08d7ce5 <rxe_responder+3269>: add $0x14,%r13d
0xffffffffc08d7ce9 <rxe_responder+3273>: jmpq 0xffffffffc08d70a2 <rxe_responder+130>
0xffffffffc08d7cee <rxe_responder+3278>: mov %rbp,%rdi
0xffffffffc08d7cf1 <rxe_responder+3281>: callq 0xffffffffc08da380 <rxe_elem_release>
0xffffffffc08d7cf6 <rxe_responder+3286>: jmpq 0xffffffffc08d7b66 <rxe_responder+2886>
0xffffffffc08d7cfb <rxe_responder+3291>: mov %rbp,%rdi
0xffffffffc08d7cfe <rxe_responder+3294>: callq 0xffffffffc08da380 <rxe_elem_release>
0xffffffffc08d7d03 <rxe_responder+3299>: jmpq 0xffffffffc08d7b14 <rxe_responder+2804>
0xffffffffc08d7d08 <rxe_responder+3304>: mov %rbp,%rdi
0xffffffffc08d7d0b <rxe_responder+3307>: callq 0xffffffffc08da380 <rxe_elem_release>
0xffffffffc08d7d10 <rxe_responder+3312>: jmpq 0xffffffffc08d7c6c <rxe_responder+3148>
0xffffffffc08d7d15 <rxe_responder+3317>: test $0x10000,%eax
0xffffffffc08d7d1a <rxe_responder+3322>: je 0xffffffffc08d804f <rxe_responder+4143>
0xffffffffc08d7d20 <rxe_responder+3328>: mov 0x24(%rbx),%r12d
0xffffffffc08d7d24 <rxe_responder+3332>: movzbl 0x19f(%r15),%edi
0xffffffffc08d7d2c <rxe_responder+3340>: lea 0x6c0(%r15),%rsi
0xffffffffc08d7d33 <rxe_responder+3347>: mov %r12d,%edx
0xffffffffc08d7d36 <rxe_responder+3350>: callq 0xffffffffc08d6af0 <find_resource>
0xffffffffc08d7d3b <rxe_responder+3355>: test %rax,%rax
0xffffffffc08d7d3e <rxe_responder+3358>: je 0xffffffffc08d8c40 <rxe_responder+7200>
0xffffffffc08d7d44 <rxe_responder+3364>: movzbl 0x2d(%rbx),%edx
0xffffffffc08d7d48 <rxe_responder+3368>: movzbl 0x2e(%rbx),%ecx
0xffffffffc08d7d4c <rxe_responder+3372>: mov $0xc,%r13d
0xffffffffc08d7d52 <rxe_responder+3378>: mov 0x20(%rax),%rdi
0xffffffffc08d7d56 <rxe_responder+3382>: shl $0x6,%rdx
0xffffffffc08d7d5a <rxe_responder+3386>: movslq -0x3f715564(%rdx),%rdx
0xffffffffc08d7d61 <rxe_responder+3393>: add %rdx,%rcx
0xffffffffc08d7d64 <rxe_responder+3396>: add 0x18(%rbx),%rcx
0xffffffffc08d7d68 <rxe_responder+3400>: mov (%rcx),%rdx
0xffffffffc08d7d6b <rxe_responder+3403>: mov 0xc(%rcx),%esi
0xffffffffc08d7d6e <rxe_responder+3406>: bswap %rdx
0xffffffffc08d7d71 <rxe_responder+3409>: bswap %esi
0xffffffffc08d7d73 <rxe_responder+3411>: cmp %rdi,%rdx
0xffffffffc08d7d76 <rxe_responder+3414>: jb 0xffffffffc08d70a2 <rxe_responder+130>
0xffffffffc08d7d7c <rxe_responder+3420>: mov 0x2c(%rax),%r8d
0xffffffffc08d7d80 <rxe_responder+3424>: cmp %r8d,%esi
0xffffffffc08d7d83 <rxe_responder+3427>: ja 0xffffffffc08d70a2 <rxe_responder+130>
0xffffffffc08d7d89 <rxe_responder+3433>: mov %esi,%r9d
0xffffffffc08d7d8c <rxe_responder+3436>: add %r8,%rdi
0xffffffffc08d7d8f <rxe_responder+3439>: add %rdx,%r9
0xffffffffc08d7d92 <rxe_responder+3442>: cmp %rdi,%r9
0xffffffffc08d7d95 <rxe_responder+3445>: ja 0xffffffffc08d70a2 <rxe_responder+130>
0xffffffffc08d7d9b <rxe_responder+3451>: mov 0x8(%rcx),%ecx
0xffffffffc08d7d9e <rxe_responder+3454>: bswap %ecx
0xffffffffc08d7da0 <rxe_responder+3456>: cmp 0x28(%rax),%ecx
* Re: rxe panic
2019-12-25 4:55 rxe panic Frank Huang
2019-12-25 5:27 ` Zhu Yanjun
@ 2019-12-25 6:32 ` Leon Romanovsky
2019-12-25 7:23 ` Frank Huang
1 sibling, 1 reply; 12+ messages in thread
From: Leon Romanovsky @ 2019-12-25 6:32 UTC
To: Frank Huang; +Cc: linux-rdma
On Wed, Dec 25, 2019 at 12:55:35PM +0800, Frank Huang wrote:
> hi, there is a panic on rdma_rxe module when the restart
> network.service or shutdown the switch.
>
> it looks like a use-after-free error.
>
> everytime it happens, there is the log "rdma_rxe: Unknown layer 3 protocol: 0"
The error print itself is harmless.
>
> is it a known error?
>
> my kernel version is 4.14.97
Your kernel is old and doesn't include refcount,
so I can't say for sure that this is the case, but the
following code is not correct, and with refcount debugging
enabled the problem would be seen immediately.
int rxe_responder(void *arg)
{
        struct rxe_qp *qp = (struct rxe_qp *)arg;
        struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
        enum resp_states state;
        struct rxe_pkt_info *pkt = NULL;
        int ret = 0;

        rxe_add_ref(qp); /* <------ USE-AFTER-FREE */

        qp->resp.aeth_syndrome = AETH_ACK_UNLIMITED;

        if (!qp->valid) {
                ret = -EINVAL;
                goto done;
        }
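To illustrate the concern with taking a reference at the top of the responder: if the last reference was already dropped before the tasklet runs, incrementing the counter does not bring the object back. The following is a rough userspace sketch using C11 atomics, not the rxe code. The names struct obj, obj_get_unless_zero and obj_put are made up for illustration, and the assert stands in for the kind of check a refcount debug option would perform.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; this is not the rxe_qp layout. */
struct obj {
	atomic_int refs;
	bool released;		/* set once the last reference is dropped */
};

static void obj_init(struct obj *o)
{
	atomic_init(&o->refs, 1);
	o->released = false;
}

/* Take a reference only while the count is still non-zero. */
static bool obj_get_unless_zero(struct obj *o)
{
	int old = atomic_load(&o->refs);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->refs, &old, old + 1))
			return true;
	}
	return false;	/* already released: the caller must not touch it */
}

/* Drop a reference; returns true if this was the last one. */
static bool obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refs, 1);

	assert(old > 0);	/* debug check: put on a dead object is a bug */
	if (old == 1) {
		o->released = true;	/* real code would free the object here */
		return true;
	}
	return false;
}
```

In this sketch a late caller gets false from obj_get_unless_zero() once the count has reached zero, instead of silently bumping the counter of freed memory. Whether the rxe tasklet path can be restructured this way is a separate question that the thread does not settle.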
Thanks
* Re: rxe panic
2019-12-25 6:01 ` Frank Huang
@ 2019-12-25 6:34 ` Zhu Yanjun
2019-12-25 7:10 ` Frank Huang
0 siblings, 1 reply; 12+ messages in thread
From: Zhu Yanjun @ 2019-12-25 6:34 UTC
To: Frank Huang; +Cc: linux-rdma
Please install the kernel-dbg package, then run "mod -S
directory-of-kernel-ko" followed by "dis -lr rxe_elem_release+15",
and show us the result.
* Re: rxe panic
2019-12-25 6:34 ` Zhu Yanjun
@ 2019-12-25 7:10 ` Frank Huang
0 siblings, 0 replies; 12+ messages in thread
From: Frank Huang @ 2019-12-25 7:10 UTC
To: Zhu Yanjun; +Cc: linux-rdma
Hi Zhu,
Here are the details. I hope this is what you wanted.
[root@test 127.0.0.1-2019-11-11-18:59:57]# crash /usr/lib/debug/lib/modules/$(uname -r)/vmlinux vmcore
crash 7.2.3-10.el7
Copyright (C) 2002-2017 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
WARNING: kernel relocated [224MB]: patching 92417 gdb minimal_symbol values
KERNEL: /usr/lib/debug/lib/modules/4.14.97-.el7.centos.x86_64/vmlinux
DUMPFILE: vmcore [PARTIAL DUMP]
CPUS: 48
DATE: Mon Nov 11 18:59:49 2019
UPTIME: 11 days, 04:57:53
LOAD AVERAGE: 3.14, 3.11, 3.09
TASKS: 1103
NODENAME: test
RELEASE: 4.14.97-.el7.centos.x86_64
VERSION: #1 SMP Mon Apr 29 14:32:59 CST 2019
MACHINE: x86_64 (2494 Mhz)
MEMORY: 159.9 GB
PANIC: "general protection fault: 0000 [#1] SMP PTI"
PID: 108
COMMAND: "ksoftirqd/16"
TASK: ffff978e28548000 [THREAD_INFO: ffff978e28548000]
CPU: 16
STATE: TASK_RUNNING (PANIC)
crash> bt
PID: 108 TASK: ffff978e28548000 CPU: 16 COMMAND: "ksoftirqd/16"
#0 [ffffa2f14c9a7b18] machine_kexec at ffffffff8f059992
#1 [ffffa2f14c9a7b70] __crash_kexec at ffffffff8f13cf7d
#2 [ffffa2f14c9a7c38] crash_kexec at ffffffff8f13e089
#3 [ffffa2f14c9a7c50] oops_end at ffffffff8f027a77
#4 [ffffa2f14c9a7c70] general_protection at ffffffff8fa01635
[exception RIP: rxe_elem_release+15]
RIP: ffffffffc08da38f RSP: ffffa2f14c9a7d28 RFLAGS: 00010246
RAX: 0000000000000000 RBX: 860e42124013b0aa RCX: 0000000000000000
RDX: ffff978e03ba8900 RSI: 0000000000000281 RDI: ffff978e02e746e8
RBP: ffff978e02e746e0 R8: 0000000000000201 R9: ffffa2f14dcb9000
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: 000000000000001d R14: 0000000000000006 R15: ffff978e02e746e0
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#5 [ffffa2f14c9a7d38] rxe_responder at ffffffffc08d7d10 [rdma_rxe]
#6 [ffffa2f14c9a7e48] rxe_do_task at ffffffffc08e060b [rdma_rxe]
#7 [ffffa2f14c9a7e70] tasklet_action at ffffffff8f0afa1e
#8 [ffffa2f14c9a7e88] __softirqentry_text_start at ffffffff8fc000d9
#9 [ffffa2f14c9a7ee0] run_ksoftirqd at ffffffff8f0afa4e
#10 [ffffa2f14c9a7ee8] smpboot_thread_fn at ffffffff8f0cca5e
#11 [ffffa2f14c9a7f10] kthread at ffffffff8f0c8c9f
#12 [ffffa2f14c9a7f50] ret_from_fork at ffffffff8fa00205
crash> mod -s rdma_rxe
MODULE NAME SIZE OBJECT FILE
ffffffffc08ef240 rdma_rxe 126976
/usr/lib/debug/usr/lib/modules/4.14.97-.el7.centos.x86_64/kernel/drivers/infiniband/sw/rxe/rdma_rxe.ko.debug
crash> dis -lr rxe_elem_release+15
/usr/src/debug/kernel--4.14.97-1.el7/linux-4.14.97-.el7.centos.x86_64/drivers/infiniband/sw/rxe/rxe_pool.c:
452
0xffffffffc08da380 <rxe_elem_release>: nopl 0x0(%rax,%rax,1) [FTRACE NOP]
0xffffffffc08da385 <rxe_elem_release+5>: push %rbp
/usr/src/debug/kernel--4.14.97-1.el7/linux-4.14.97-.el7.centos.x86_64/drivers/infiniband/sw/rxe/rxe_pool.c:
447
0xffffffffc08da386 <rxe_elem_release+6>: lea -0x8(%rdi),%rbp
/usr/src/debug/kernel--4.14.97-1.el7/linux-4.14.97-.el7.centos.x86_64/arch/x86/include/asm/refcount.h:
52
0xffffffffc08da38a <rxe_elem_release+10>: push %rbx
0xffffffffc08da38b <rxe_elem_release+11>: mov -0x8(%rdi),%rbx
0xffffffffc08da38f <rxe_elem_release+15>: mov 0x20(%rbx),%rax
crash> quit
[root@test 127.0.0.1-2019-11-11-18:59:57]#
433 void *rxe_pool_get_index(struct rxe_pool *pool, u32 index)
434 {
435 struct rb_node *node = NULL;
436 struct rxe_pool_entry *elem = NULL;
437 unsigned long flags;
438
439 spin_lock_irqsave(&pool->pool_lock, flags);
440
441 if (pool->state != rxe_pool_valid)
442 goto out;
443
444 node = pool->tree.rb_node;
445
446 while (node) {
447 elem = rb_entry(node, struct rxe_pool_entry, node);
448
449 if (elem->index > index)
450 node = node->rb_left;
451 else if (elem->index < index)
452 node = node->rb_right;
453 else
454 break;
455 }
456
457 if (node)
458 kref_get(&elem->ref_cnt);
459
460 out:
461 spin_unlock_irqrestore(&pool->pool_lock, flags);
462 return node ? elem : NULL;
463 }
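Reading the register dump against the disassembly above: `lea -0x8(%rdi),%rbp` recovers the pool entry from the kref pointer in RDI, `mov -0x8(%rdi),%rbx` loads the entry's first field into RBX, and the faulting `mov 0x20(%rbx),%rax` dereferences RBX. RBX holds 860e42124013b0aa, which is not a canonical x86_64 address, so the load raises a general protection fault instead of a page fault. That is consistent with the entry having been freed and overwritten. A minimal userspace sketch of the same arithmetic follows. It assumes a layout in which the pool pointer is the first member of the entry and the kref follows it; the struct and helper names are illustrative stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for the kernel types (assumed layout, not copied). */
struct fake_kref {
	int refcount;
};

struct fake_pool_entry {
	void *pool;               /* first member: what mov -0x8(%rdi),%rbx loads */
	struct fake_kref ref_cnt; /* the address of this member arrives in RDI */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the enclosing entry from a pointer to its embedded kref. */
static struct fake_pool_entry *entry_from_kref(struct fake_kref *kref)
{
	return container_of(kref, struct fake_pool_entry, ref_cnt);
}

/* With 48-bit virtual addresses, bits 63..47 must all be equal. */
static int is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47; /* the upper 17 bits */

	return top == 0 || top == 0x1ffff;
}
```

Passing the RBX value from the dump to `is_canonical_48()` returns 0, while the RDI value passes the check, which matches a stale `pool` pointer read out of an otherwise valid-looking entry address.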
* Re: rxe panic
2019-12-25 6:32 ` Leon Romanovsky
@ 2019-12-25 7:23 ` Frank Huang
2019-12-25 7:43 ` Frank Huang
2019-12-25 9:23 ` Leon Romanovsky
0 siblings, 2 replies; 12+ messages in thread
From: Frank Huang @ 2019-12-25 7:23 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: linux-rdma
Hi Leon,
I don't understand what you mean. Are you saying that rxe_add_ref(qp) is not needed?
My kernel is old, and I found some rxe bugs on 4.14.97, especially
the RNR errors.
I cannot upgrade the whole kernel because there are many dependencies.
In the end, I synced the fixes from the newest kernel version to 4.14.97.
When I compared my rxe_resp.c with kernel 5.2.9, I found that the
duplicate_request snippet had changed.
Also, rxe_xmit_packet calls rxe_send, which is where the log "rdma_rxe:
Unknown layer 3 protocol: 0" comes from.
	} else {
		struct resp_res *res;

		/* Find the operation in our list of responder resources. */
		res = find_resource(qp, pkt->psn);
		if (res) {
			struct sk_buff *skb_copy;

			skb_copy = skb_clone(res->atomic.skb, GFP_ATOMIC);
			if (skb_copy) {
				rxe_add_ref(qp); /* for the new SKB */
			} else {
				pr_warn("Couldn't clone atomic resp\n");
				rc = RESPST_CLEANUP;
				goto out;
			}

			/* Resend the result. */
			rc = rxe_xmit_packet(to_rdev(qp->ibqp.device), qp,
					     pkt, skb_copy);
			if (rc) {
				pr_err("Failed resending result. This flow is not handled - skb ignored\n");
				rxe_drop_ref(qp);
				rc = RESPST_CLEANUP;
				goto out;
			}
		}

		/* Resource not found. Class D error. Drop the request. */
		rc = RESPST_CLEANUP;
		goto out;
	}
out:
	return rc;
}
On Wed, Dec 25, 2019 at 2:33 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Wed, Dec 25, 2019 at 12:55:35PM +0800, Frank Huang wrote:
> > hi, there is a panic on rdma_rxe module when the restart
> > network.service or shutdown the switch.
> >
> > it looks like a use-after-free error.
> >
> > everytime it happens, there is the log "rdma_rxe: Unknown layer 3 protocol: 0"
>
> The error print itself is harmless.
> >
> > is it a known error?
> >
> > my kernel version is 4.14.97
>
> Your kernel is old enough and doesn't include refcount,
> so I can't say for sure that it is the case, but the
> following code is not correct and with refcount debug
> it will be seen immediately.
>
> 1213 int rxe_responder(void *arg)
> 1214 {
> 1215 struct rxe_qp *qp = (struct rxe_qp *)arg;
> 1216 struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
> 1217 enum resp_states state;
> 1218 struct rxe_pkt_info *pkt = NULL;
> 1219 int ret = 0;
> 1220
> 1221 rxe_add_ref(qp); <------ USE-AFTER-FREE
> 1222
> 1223 qp->resp.aeth_syndrome = AETH_ACK_UNLIMITED;
> 1224
> 1225 if (!qp->valid) {
> 1226 ret = -EINVAL;
> 1227 goto done;
> 1228 }
>
> Thanks
* Re: rxe panic
2019-12-25 7:23 ` Frank Huang
@ 2019-12-25 7:43 ` Frank Huang
2019-12-25 9:23 ` Leon Romanovsky
1 sibling, 0 replies; 12+ messages in thread
From: Frank Huang @ 2019-12-25 7:43 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: linux-rdma
Here is the patch I used. :)
rdma_rxe (4.14.97) has problems dealing with out-of-order messages.
This patch backports the rdma_rxe module from linux-5.2.9 to fix these problems.
The changes are confined to linux-4.14.97/drivers/infiniband/sw/rxe. So far,
no impact on other modules has been found.
diff -ur linux-4.14.97/drivers/infiniband/sw/rxe/rxe_comp.c linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_comp.c
--- linux-4.14.97/drivers/infiniband/sw/rxe/rxe_comp.c	2019-01-31 15:13:48.000000000 +0800
+++ linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_comp.c	2019-09-17 16:00:39.168896560 +0800
@@ -191,6 +191,7 @@
{
qp->comp.retry_cnt = qp->attr.retry_cnt;
qp->comp.rnr_retry = qp->attr.rnr_retry;
+ qp->comp.started_retry = 0;
}
static inline enum comp_state check_psn(struct rxe_qp *qp,
@@ -253,6 +254,17 @@
case IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE:
if (pkt->opcode != IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE &&
pkt->opcode != IB_OPCODE_RC_RDMA_READ_RESPONSE_LAST) {
+ /* read retries of partial data may restart from
+ * read response first or response only.
+ */
+ if ((pkt->psn == wqe->first_psn &&
+ pkt->opcode ==
+ IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST) ||
+ (wqe->first_psn == wqe->last_psn &&
+ pkt->opcode ==
+ IB_OPCODE_RC_RDMA_READ_RESPONSE_ONLY))
+ break;
+
return COMPST_ERROR;
}
break;
@@ -270,8 +282,8 @@
if ((syn & AETH_TYPE_MASK) != AETH_ACK)
return COMPST_ERROR;
- /* Fall through (IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE
- * doesn't have an AETH)
+ /* fall through */
+ /* (IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE doesn't have an AETH)
*/
case IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE:
if (wqe->wr.opcode != IB_WR_RDMA_READ &&
@@ -501,11 +513,11 @@
struct rxe_pkt_info *pkt,
struct rxe_send_wqe *wqe)
{
- qp->comp.opcode = -1;
-
- if (pkt) {
- if (psn_compare(pkt->psn, qp->comp.psn) >= 0)
- qp->comp.psn = (pkt->psn + 1) & BTH_PSN_MASK;
+ if (pkt && wqe->state == wqe_state_pending) {
+ if (psn_compare(wqe->last_psn, qp->comp.psn) >= 0) {
+ qp->comp.psn = (wqe->last_psn + 1) & BTH_PSN_MASK;
+ qp->comp.opcode = -1;
+ }
if (qp->req.wait_psn) {
qp->req.wait_psn = 0;
@@ -662,7 +674,6 @@
qp->qp_timeout_jiffies)
mod_timer(&qp->retrans_timer,
jiffies + qp->qp_timeout_jiffies);
- WARN_ON_ONCE(skb);
goto exit;
case COMPST_ERROR_RETRY:
@@ -676,10 +687,23 @@
/* there is nothing to retry in this case */
if (!wqe || (wqe->state == wqe_state_posted)) {
- WARN_ON_ONCE(skb);
goto exit;
}
+ /* if we've started a retry, don't start another
+ * retry sequence, unless this is a timeout.
+ */
+ if (qp->comp.started_retry &&
+ !qp->comp.timeout_retry) {
+ if (pkt) {
+ rxe_drop_ref(pkt->qp);
+ kfree_skb(skb);
+ skb = NULL;
+ }
+
+ goto done;
+ }
+
if (qp->comp.retry_cnt > 0) {
if (qp->comp.retry_cnt != 7)
qp->comp.retry_cnt--;
@@ -696,6 +720,7 @@
rxe_counter_inc(rxe,
RXE_CNT_COMP_RETRY);
qp->req.need_retry = 1;
+ qp->comp.started_retry = 1;
rxe_run_task(&qp->req.task, 1);
}
@@ -705,8 +730,7 @@
skb = NULL;
}
- WARN_ON_ONCE(skb);
- goto exit;
+ goto done;
} else {
rxe_counter_inc(rxe, RXE_CNT_RETRY_EXCEEDED);
@@ -749,7 +773,6 @@
skb = NULL;
}
- WARN_ON_ONCE(skb);
goto exit;
}
}
diff -ur linux-4.14.97/drivers/infiniband/sw/rxe/rxe.h linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe.h
--- linux-4.14.97/drivers/infiniband/sw/rxe/rxe.h	2019-01-31 15:13:48.000000000 +0800
+++ linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe.h	2019-09-17 16:00:39.169896565 +0800
@@ -74,7 +74,6 @@
SHASH_DESC_ON_STACK(shash, rxe->tfm);
shash->tfm = rxe->tfm;
- shash->flags = 0;
*(u32 *)shash_desc_ctx(shash) = crc;
err = crypto_shash_update(shash, next, len);
if (unlikely(err)) {
diff -ur linux-4.14.97/drivers/infiniband/sw/rxe/rxe_hdr.h linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_hdr.h
--- linux-4.14.97/drivers/infiniband/sw/rxe/rxe_hdr.h	2019-01-31 15:13:48.000000000 +0800
+++ linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_hdr.h	2019-09-17 16:00:39.169896565 +0800
@@ -643,7 +643,7 @@
__be32 rkey;
__be64 swap_add;
__be64 comp;
-} __attribute__((__packed__));
+} __packed;
static inline u64 __atmeth_va(void *arg)
{
diff -ur linux-4.14.97/drivers/infiniband/sw/rxe/rxe_hw_counters.c linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_hw_counters.c
--- linux-4.14.97/drivers/infiniband/sw/rxe/rxe_hw_counters.c	2019-01-31 15:13:48.000000000 +0800
+++ linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_hw_counters.c	2019-09-17 16:00:39.169896565 +0800
@@ -37,11 +37,11 @@
[RXE_CNT_SENT_PKTS] = "sent_pkts",
[RXE_CNT_RCVD_PKTS] = "rcvd_pkts",
[RXE_CNT_DUP_REQ] = "duplicate_request",
- [RXE_CNT_OUT_OF_SEQ_REQ] = "out_of_sequence",
+ [RXE_CNT_OUT_OF_SEQ_REQ] = "out_of_seq_request",
[RXE_CNT_RCV_RNR] = "rcvd_rnr_err",
[RXE_CNT_SND_RNR] = "send_rnr_err",
[RXE_CNT_RCV_SEQ_ERR] = "rcvd_seq_err",
- [RXE_CNT_COMPLETER_SCHED] = "ack_deffered",
+ [RXE_CNT_COMPLETER_SCHED] = "ack_deferred",
[RXE_CNT_RETRY_EXCEEDED] = "retry_exceeded_err",
[RXE_CNT_RNR_RETRY_EXCEEDED] = "retry_rnr_exceeded_err",
[RXE_CNT_COMP_RETRY] = "completer_retry_err",
diff -ur linux-4.14.97/drivers/infiniband/sw/rxe/rxe_loc.h linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_loc.h
--- linux-4.14.97/drivers/infiniband/sw/rxe/rxe_loc.h	2019-01-31 15:13:48.000000000 +0800
+++ linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_loc.h	2019-09-17 16:00:39.170896570 +0800
@@ -268,7 +268,8 @@
if (pkt->mask & RXE_LOOPBACK_MASK) {
memcpy(SKB_TO_PKT(skb), pkt, sizeof(*pkt));
- err = rxe_loopback(skb);
+ rxe_loopback(skb);
+ err = 0;
} else {
err = rxe_send(rxe, pkt, skb);
}
diff -ur linux-4.14.97/drivers/infiniband/sw/rxe/rxe_mmap.c linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_mmap.c
--- linux-4.14.97/drivers/infiniband/sw/rxe/rxe_mmap.c	2019-01-31 15:13:48.000000000 +0800
+++ linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_mmap.c	2019-09-17 16:00:39.170896570 +0800
@@ -146,6 +146,8 @@
void *obj)
{
struct rxe_mmap_info *ip;
+ if (!context)
+ return ERR_PTR(-EINVAL);
ip = kmalloc(sizeof(*ip), GFP_KERNEL);
if (!ip)
diff -ur linux-4.14.97/drivers/infiniband/sw/rxe/rxe_pool.c linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_pool.c
--- linux-4.14.97/drivers/infiniband/sw/rxe/rxe_pool.c	2019-01-31 15:13:48.000000000 +0800
+++ linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_pool.c	2019-09-17 16:00:39.171896575 +0800
@@ -112,6 +112,20 @@
return rxe_type_info[pool->type].cache;
}
+static void rxe_cache_clean(size_t cnt)
+{
+ int i;
+ struct rxe_type_info *type;
+
+ for (i = 0; i < cnt; i++) {
+ type = &rxe_type_info[i];
+ if (!(type->flags & RXE_POOL_NO_ALLOC)) {
+ kmem_cache_destroy(type->cache);
+ type->cache = NULL;
+ }
+ }
+}
+
int rxe_cache_init(void)
{
int err;
@@ -136,24 +150,14 @@
return 0;
err1:
- while (--i >= 0) {
- kmem_cache_destroy(type->cache);
- type->cache = NULL;
- }
+ rxe_cache_clean(i);
return err;
}
void rxe_cache_exit(void)
{
- int i;
- struct rxe_type_info *type;
-
- for (i = 0; i < RXE_NUM_TYPES; i++) {
- type = &rxe_type_info[i];
- kmem_cache_destroy(type->cache);
- type->cache = NULL;
- }
+ rxe_cache_clean(RXE_NUM_TYPES);
}
static int rxe_pool_init_index(struct rxe_pool *pool, u32 max, u32 min)
@@ -207,7 +211,7 @@
kref_init(&pool->ref_cnt);
- spin_lock_init(&pool->pool_lock);
+ rwlock_init(&pool->pool_lock);
if (rxe_type_info[type].flags & RXE_POOL_INDEX) {
err = rxe_pool_init_index(pool,
@@ -222,7 +226,7 @@
pool->key_size = rxe_type_info[type].key_size;
}
- pool->state = rxe_pool_valid;
+ pool->state = RXE_POOL_STATE_VALID;
out:
return err;
@@ -232,7 +236,7 @@
{
struct rxe_pool *pool = container_of(kref, struct rxe_pool, ref_cnt);
- pool->state = rxe_pool_invalid;
+ pool->state = RXE_POOL_STATE_INVALID;
kfree(pool->table);
}
@@ -245,12 +249,12 @@
{
unsigned long flags;
- spin_lock_irqsave(&pool->pool_lock, flags);
- pool->state = rxe_pool_invalid;
+ write_lock_irqsave(&pool->pool_lock, flags);
+ pool->state = RXE_POOL_STATE_INVALID;
if (atomic_read(&pool->num_elem) > 0)
pr_warn("%s pool destroyed with unfree'd elem\n",
pool_name(pool));
- spin_unlock_irqrestore(&pool->pool_lock, flags);
+ write_unlock_irqrestore(&pool->pool_lock, flags);
rxe_pool_put(pool);
@@ -336,10 +340,10 @@
struct rxe_pool *pool = elem->pool;
unsigned long flags;
- spin_lock_irqsave(&pool->pool_lock, flags);
+ write_lock_irqsave(&pool->pool_lock, flags);
memcpy((u8 *)elem + pool->key_offset, key, pool->key_size);
insert_key(pool, elem);
- spin_unlock_irqrestore(&pool->pool_lock, flags);
+ write_unlock_irqrestore(&pool->pool_lock, flags);
}
void rxe_drop_key(void *arg)
@@ -348,9 +352,9 @@
struct rxe_pool *pool = elem->pool;
unsigned long flags;
- spin_lock_irqsave(&pool->pool_lock, flags);
+ write_lock_irqsave(&pool->pool_lock, flags);
rb_erase(&elem->node, &pool->tree);
- spin_unlock_irqrestore(&pool->pool_lock, flags);
+ write_unlock_irqrestore(&pool->pool_lock, flags);
}
void rxe_add_index(void *arg)
@@ -359,10 +363,10 @@
struct rxe_pool *pool = elem->pool;
unsigned long flags;
- spin_lock_irqsave(&pool->pool_lock, flags);
+ write_lock_irqsave(&pool->pool_lock, flags);
elem->index = alloc_index(pool);
insert_index(pool, elem);
- spin_unlock_irqrestore(&pool->pool_lock, flags);
+ write_unlock_irqrestore(&pool->pool_lock, flags);
}
void rxe_drop_index(void *arg)
@@ -371,10 +375,10 @@
struct rxe_pool *pool = elem->pool;
unsigned long flags;
- spin_lock_irqsave(&pool->pool_lock, flags);
+ write_lock_irqsave(&pool->pool_lock, flags);
clear_bit(elem->index - pool->min_index, pool->table);
rb_erase(&elem->node, &pool->tree);
- spin_unlock_irqrestore(&pool->pool_lock, flags);
+ write_unlock_irqrestore(&pool->pool_lock, flags);
}
void *rxe_alloc(struct rxe_pool *pool)
@@ -384,13 +388,13 @@
might_sleep_if(!(pool->flags & RXE_POOL_ATOMIC));
- spin_lock_irqsave(&pool->pool_lock, flags);
- if (pool->state != rxe_pool_valid) {
- spin_unlock_irqrestore(&pool->pool_lock, flags);
+ read_lock_irqsave(&pool->pool_lock, flags);
+ if (pool->state != RXE_POOL_STATE_VALID) {
+ read_unlock_irqrestore(&pool->pool_lock, flags);
return NULL;
}
kref_get(&pool->ref_cnt);
- spin_unlock_irqrestore(&pool->pool_lock, flags);
+ read_unlock_irqrestore(&pool->pool_lock, flags);
kref_get(&pool->rxe->ref_cnt);
@@ -436,9 +440,9 @@
struct rxe_pool_entry *elem = NULL;
unsigned long flags;
- spin_lock_irqsave(&pool->pool_lock, flags);
+ read_lock_irqsave(&pool->pool_lock, flags);
- if (pool->state != rxe_pool_valid)
+ if (pool->state != RXE_POOL_STATE_VALID)
goto out;
node = pool->tree.rb_node;
@@ -450,15 +454,14 @@
node = node->rb_left;
else if (elem->index < index)
node = node->rb_right;
- else
+ else {
+ kref_get(&elem->ref_cnt);
break;
+ }
}
- if (node)
- kref_get(&elem->ref_cnt);
-
out:
- spin_unlock_irqrestore(&pool->pool_lock, flags);
+ read_unlock_irqrestore(&pool->pool_lock, flags);
return node ? elem : NULL;
}
@@ -469,9 +472,9 @@
int cmp;
unsigned long flags;
- spin_lock_irqsave(&pool->pool_lock, flags);
+ read_lock_irqsave(&pool->pool_lock, flags);
- if (pool->state != rxe_pool_valid)
+ if (pool->state != RXE_POOL_STATE_VALID)
goto out;
node = pool->tree.rb_node;
@@ -494,6 +497,6 @@
kref_get(&elem->ref_cnt);
out:
- spin_unlock_irqrestore(&pool->pool_lock, flags);
+ read_unlock_irqrestore(&pool->pool_lock, flags);
return node ? elem : NULL;
}
diff -ur linux-4.14.97/drivers/infiniband/sw/rxe/rxe_pool.h linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_pool.h
--- linux-4.14.97/drivers/infiniband/sw/rxe/rxe_pool.h	2019-01-31 15:13:48.000000000 +0800
+++ linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_pool.h	2019-09-17 16:00:39.171896575 +0800
@@ -41,6 +41,7 @@
RXE_POOL_ATOMIC = BIT(0),
RXE_POOL_INDEX = BIT(1),
RXE_POOL_KEY = BIT(2),
+ RXE_POOL_NO_ALLOC = BIT(4),
};
enum rxe_elem_type {
@@ -74,8 +75,8 @@
extern struct rxe_type_info rxe_type_info[];
enum rxe_pool_state {
- rxe_pool_invalid,
- rxe_pool_valid,
+ RXE_POOL_STATE_INVALID,
+ RXE_POOL_STATE_VALID,
};
struct rxe_pool_entry {
@@ -90,7 +91,7 @@
struct rxe_pool {
struct rxe_dev *rxe;
- spinlock_t pool_lock; /* pool spinlock */
+ rwlock_t pool_lock; /* protects pool add/del/search */
size_t elem_size;
struct kref ref_cnt;
void (*cleanup)(struct rxe_pool_entry *obj);
diff -ur linux-4.14.97/drivers/infiniband/sw/rxe/rxe_qp.c linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_qp.c
--- linux-4.14.97/drivers/infiniband/sw/rxe/rxe_qp.c	2019-01-31 15:13:48.000000000 +0800
+++ linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_qp.c	2019-09-17 16:00:39.172896580 +0800
@@ -235,6 +235,16 @@
return err;
qp->sk->sk->sk_user_data = qp;
+ /* pick a source UDP port number for this QP based on
+ * the source QPN. this spreads traffic for different QPs
+ * across different NIC RX queues (while using a single
+ * flow for a given QP to maintain packet order).
+ * the port number must be in the Dynamic Ports range
+ * (0xc000 - 0xffff).
+ */
+ qp->src_port = RXE_ROCE_V2_SPORT +
+ (hash_32_generic(qp_num(qp), 14) & 0x3fff);
+
qp->sq.max_wr = init->cap.max_send_wr;
qp->sq.max_sge = init->cap.max_send_sge;
qp->sq.max_inline = init->cap.max_inline_data;
diff -ur linux-4.14.97/drivers/infiniband/sw/rxe/rxe_req.c linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_req.c
--- linux-4.14.97/drivers/infiniband/sw/rxe/rxe_req.c	2019-01-31 15:13:48.000000000 +0800
+++ linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_req.c	2019-09-17 16:00:39.172896580 +0800
@@ -73,9 +73,6 @@
int npsn;
int first = 1;
- wqe = queue_head(qp->sq.queue);
- npsn = (qp->comp.psn - wqe->first_psn) & BTH_PSN_MASK;
-
qp->req.wqe_index = consumer_index(qp->sq.queue);
qp->req.psn = qp->comp.psn;
qp->req.opcode = -1;
@@ -107,11 +104,17 @@
if (first) {
first = 0;
- if (mask & WR_WRITE_OR_SEND_MASK)
+ if (mask & WR_WRITE_OR_SEND_MASK) {
+ npsn = (qp->comp.psn - wqe->first_psn) &
+ BTH_PSN_MASK;
retry_first_write_send(qp, wqe, mask, npsn);
+ }
- if (mask & WR_READ_MASK)
+ if (mask & WR_READ_MASK) {
+ npsn = (wqe->dma.length - wqe->dma.resid) /
+ qp->mtu;
wqe->iova += npsn * qp->mtu;
+ }
}
wqe->state = wqe_state_posted;
@@ -435,7 +438,7 @@
if (pkt->mask & RXE_RETH_MASK) {
reth_set_rkey(pkt, ibwr->wr.rdma.rkey);
reth_set_va(pkt, wqe->iova);
- reth_set_len(pkt, wqe->dma.length);
+ reth_set_len(pkt, wqe->dma.resid);
}
if (pkt->mask & RXE_IMMDT_MASK)
@@ -713,6 +716,7 @@
if (fill_packet(qp, wqe, &pkt, skb, payload)) {
pr_debug("qp#%d Error during fill packet\n", qp_num(qp));
+ kfree_skb(skb);
goto err;
}
@@ -744,7 +748,6 @@
goto next_wqe;
err:
- kfree_skb(skb);
wqe->status = IB_WC_LOC_PROT_ERR;
wqe->state = wqe_state_error;
__rxe_do_task(&qp->comp.task);
diff -ur linux-4.14.97/drivers/infiniband/sw/rxe/rxe_resp.c linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_resp.c
--- linux-4.14.97/drivers/infiniband/sw/rxe/rxe_resp.c	2019-01-31 15:13:48.000000000 +0800
+++ linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_resp.c	2019-09-17 16:00:39.173896585 +0800
@@ -124,12 +124,9 @@
struct sk_buff *skb;
if (qp->resp.state == QP_STATE_ERROR) {
- skb = skb_dequeue(&qp->req_pkts);
- if (skb) {
- /* drain request packet queue */
+ while ((skb = skb_dequeue(&qp->req_pkts))) {
rxe_drop_ref(qp);
kfree_skb(skb);
- return RESPST_GET_REQ;
}
/* go drain recv wr queue */
@@ -435,6 +432,7 @@
qp->resp.va = reth_va(pkt);
qp->resp.rkey = reth_rkey(pkt);
qp->resp.resid = reth_len(pkt);
+ qp->resp.length = reth_len(pkt);
}
access = (pkt->mask & RXE_READ_MASK) ? IB_ACCESS_REMOTE_READ
: IB_ACCESS_REMOTE_WRITE;
@@ -860,7 +858,9 @@
pkt->mask & RXE_WRITE_MASK) ?
IB_WC_RECV_RDMA_WITH_IMM : IB_WC_RECV;
wc->vendor_err = 0;
- wc->byte_len = wqe->dma.length - wqe->dma.resid;
+ wc->byte_len = (pkt->mask & RXE_IMMDT_MASK &&
+ pkt->mask & RXE_WRITE_MASK) ?
+ qp->resp.length : wqe->dma.length - wqe->dma.resid;
/* fields after byte_len are different between kernel and user
* space
diff -ur linux-4.14.97/drivers/infiniband/sw/rxe/rxe_verbs.c linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_verbs.c
--- linux-4.14.97/drivers/infiniband/sw/rxe/rxe_verbs.c	2019-01-31 15:13:48.000000000 +0800
+++ linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_verbs.c	2019-09-17 16:00:39.174896590 +0800
@@ -644,6 +644,7 @@
switch (wr->opcode) {
case IB_WR_RDMA_WRITE_WITH_IMM:
wr->ex.imm_data = ibwr->ex.imm_data;
+ /* fall through */
case IB_WR_RDMA_READ:
case IB_WR_RDMA_WRITE:
wr->wr.rdma.remote_addr = rdma_wr(ibwr)->remote_addr;
@@ -774,7 +775,6 @@
unsigned int mask;
unsigned int length = 0;
int i;
- int must_sched;
while (wr) {
mask = wr_opcode_mask(wr->opcode, qp);
@@ -804,14 +804,7 @@
wr = wr->next;
}
- /*
- * Must sched in case of GSI QP because ib_send_mad() hold irq lock,
- * and the requester call ip_local_out_sk() that takes spin_lock_bh.
- */
- must_sched = (qp_type(qp) == IB_QPT_GSI) ||
- (queue_count(qp->sq.queue) > 1);
-
- rxe_run_task(&qp->req.task, must_sched);
+ rxe_run_task(&qp->req.task, 1);
if (unlikely(qp->req.state == QP_STATE_ERROR))
rxe_run_task(&qp->comp.task, 1);
diff -ur linux-4.14.97/drivers/infiniband/sw/rxe/rxe_verbs.h linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_verbs.h
--- linux-4.14.97/drivers/infiniband/sw/rxe/rxe_verbs.h	2019-01-31 15:13:48.000000000 +0800
+++ linux-4.14.97-rxe/drivers/infiniband/sw/rxe/rxe_verbs.h	2019-09-17 16:00:39.174896590 +0800
@@ -160,6 +160,7 @@
int opcode;
int timeout;
int timeout_retry;
+ int started_retry;
u32 retry_cnt;
u32 rnr_retry;
struct rxe_task task;
@@ -214,6 +215,7 @@
struct rxe_mem *mr;
u32 resid;
u32 rkey;
+ u32 length;
u64 atomic_orig;
/* SRQ only */
@@ -252,6 +254,7 @@
struct socket *sk;
u32 dst_cookie;
+ u16 src_port;
struct rxe_av pri_av;
struct rxe_av alt_av;
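
The source-port hunk for rxe_qp.c in the patch above can be tried out in userspace. The sketch below assumes the multiplicative hash used by the kernel's generic 32-bit hash (multiply by GOLDEN_RATIO_32, keep the top bits) and assumes RXE_ROCE_V2_SPORT is 0xc000, as the port range in the patch comment suggests. `qp_src_port` is a hypothetical helper name used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define GOLDEN_RATIO_32   0x61C88647u
#define RXE_ROCE_V2_SPORT 0xc000u /* assumed base of the dynamic port range */

/* Multiplicative hash that keeps the top 'bits' bits (1 <= bits <= 32). */
static uint32_t hash_32_generic(uint32_t val, unsigned int bits)
{
	return (uint32_t)(val * GOLDEN_RATIO_32) >> (32 - bits);
}

/* Map a QP number to a UDP source port in the range 0xc000..0xffff. */
static uint16_t qp_src_port(uint32_t qpn)
{
	return (uint16_t)(RXE_ROCE_V2_SPORT +
			  (hash_32_generic(qpn, 14) & 0x3fff));
}
```

Because the port depends only on the QPN, every packet of a given QP stays on one flow and keeps its ordering, while different QPs tend to be spread over different NIC RX queues.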
* Re: rxe panic
2019-12-25 7:23 ` Frank Huang
2019-12-25 7:43 ` Frank Huang
@ 2019-12-25 9:23 ` Leon Romanovsky
2019-12-26 1:08 ` Zhu Yanjun
1 sibling, 1 reply; 12+ messages in thread
From: Leon Romanovsky @ 2019-12-25 9:23 UTC (permalink / raw)
To: Frank Huang; +Cc: linux-rdma
On Wed, Dec 25, 2019 at 03:23:53PM +0800, Frank Huang wrote:
> hi leon
>
> I can not get what you means, do you say the rxe_add_ref(qp) is not needed?
It is not what I'm saying.
The use of rxe_add_ref(qp) assumes that QP can't disappear while it is
called. From what I see in the code, rxe_responder() doesn't guarantee
that.
Thanks
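
Leon's point can be illustrated with a small userspace sketch. An unconditional increment such as rxe_add_ref(qp) is only safe when the caller already holds a reference. A path that can race with teardown has to refuse objects whose count has already dropped to zero, which is the pattern the kernel provides as kref_get_unless_zero(). The code below models that pattern with C11 atomics. `struct obj`, `obj_get_unless_zero` and `obj_put` are hypothetical names, and this is not the upstream rxe fix.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct obj {
	atomic_int refs; /* 0 means the last reference is gone */
};

/* Take a reference only if the object is still live. */
static bool obj_get_unless_zero(struct obj *o)
{
	int old = atomic_load(&o->refs);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool obj_put(struct obj *o)
{
	return atomic_fetch_sub(&o->refs, 1) == 1;
}
```

This only helps if the memory itself stays valid while the check runs, for example because the lookup and the final free are serialized by the same lock.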
* Re: rxe panic
2019-12-25 9:23 ` Leon Romanovsky
@ 2019-12-26 1:08 ` Zhu Yanjun
2019-12-26 1:39 ` Frank Huang
0 siblings, 1 reply; 12+ messages in thread
From: Zhu Yanjun @ 2019-12-26 1:08 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: Frank Huang, linux-rdma
I agree with you, Leon. I have fixed several similar problems in
upstream Linux. I am not sure whether this particular problem has been
fixed upstream yet.
Zhu Yanjun
* Re: rxe panic
2019-12-26 1:08 ` Zhu Yanjun
@ 2019-12-26 1:39 ` Frank Huang
2019-12-26 2:35 ` Zhu Yanjun
0 siblings, 1 reply; 12+ messages in thread
From: Frank Huang @ 2019-12-26 1:39 UTC (permalink / raw)
To: Zhu Yanjun; +Cc: Leon Romanovsky, linux-rdma
Hi Zhu,
Can you point me to some of the patches for these problems?
* Re: rxe panic
2019-12-26 1:39 ` Frank Huang
@ 2019-12-26 2:35 ` Zhu Yanjun
0 siblings, 0 replies; 12+ messages in thread
From: Zhu Yanjun @ 2019-12-26 2:35 UTC (permalink / raw)
To: Frank Huang; +Cc: Leon Romanovsky, linux-rdma
Please test with the upstream Linux kernel.
Thanks.
Zhu Yanjun
end of thread, other threads:[~2019-12-26 2:35 UTC | newest]
Thread overview: 12+ messages
2019-12-25 4:55 rxe panic Frank Huang
2019-12-25 5:27 ` Zhu Yanjun
2019-12-25 6:01 ` Frank Huang
2019-12-25 6:34 ` Zhu Yanjun
2019-12-25 7:10 ` Frank Huang
2019-12-25 6:32 ` Leon Romanovsky
2019-12-25 7:23 ` Frank Huang
2019-12-25 7:43 ` Frank Huang
2019-12-25 9:23 ` Leon Romanovsky
2019-12-26 1:08 ` Zhu Yanjun
2019-12-26 1:39 ` Frank Huang
2019-12-26 2:35 ` Zhu Yanjun