From: Zhu Yanjun <zyjzyj2000@gmail.com>
To: Olga Kornievskaia <aglo@umich.edu>
Cc: "Pearson, Robert B" <robert.pearson2@hpe.com>,
Bob Pearson <rpearsonhpe@gmail.com>,
Jason Gunthorpe <jgg@nvidia.com>,
linux-rdma <linux-rdma@vger.kernel.org>
Subject: Re: [PATCH for-next] RDMA/rxe: Fix bug in rxe_net.c
Date: Mon, 26 Jul 2021 15:42:35 +0800
Message-ID: <CAD=hENdqHx7FANVNFG4u-_WFmgsMBa=Mv67V3emqcO+wgwZaCQ@mail.gmail.com>
In-Reply-To: <CAN-5tyEVZRUyFf4bGRvL-DkoMmAXB10zQhZFB7K_UzNJ2uNWVQ@mail.gmail.com>
On Thu, Jul 22, 2021 at 11:37 PM Olga Kornievskaia <aglo@umich.edu> wrote:
>
> I'm RHEL based in terms of my userland software. I work on upstream
> kernels using that.
Simply put, one host runs kernel 5.14-rc1 and the other host runs
kernel 5.14-rc2.
Then the following errors appear:
"
...
[13873.255148] rdma_rxe: bad ICRC from 192.168.1.92
[13877.567475] rdma_rxe: bad ICRC from 192.168.1.92
[13882.175544] rdma_rxe: bad ICRC from 192.168.1.92
...
"
Correct?
Zhu Yanjun
>
> The client is running kernel version 5.14-rc1 (when I started, now rc2) on
> a RHEL8.4 (beta, when I started) VM; the server is a RHEL8.2 VM with a
> 5.14-rc1 kernel. The RHEL8.4 beta came with these userland versions:
> [aglo@localhost linux-nfs]$ rpm -qa | grep rdma
> rdma-core-devel-32.0-1.el8.x86_64
> librdmacm-utils-32.0-1.el8.x86_64
> rdma-core-32.0-1.el8.x86_64
> librdmacm-32.0-1.el8.x86_64
>
> I upgraded to RHEL8.4GA to make sure it's on an official release of
> the userspace. The results are the same (at the end of the mail).
> [root@localhost yum.repos.d]# rpm -qa | grep rdma
> rdma-core-32.0-4.el8.x86_64
> rdma-core-devel-32.0-4.el8.x86_64
> librdmacm-utils-32.0-4.el8.x86_64
> librdmacm-32.0-4.el8.x86_64
>
> Now, let's go back to NFSoRDMA so that we remove the variable of which
> version the userland libraries are (and whether there are any
> interoperability issues between kernel changes and userland). Doing an
> NFS mount leads to the client continuously logging "bad ICRC"
> until the mount fails with connection refused.
>
> Network trace has "ConnectRequest" which gets back ConnectReject
> (reason 0x001c) which I'm assuming is bad ICRC?
>
> NFS oops (which doesn't actually crash the machine, which is nice); this
> is a snippet and doesn't reflect the total number of bad ICRC messages:
> [ 342.290895] rdma_rxe: bad ICRC from 192.168.1.92
> [ 348.947562] rdma_rxe: bad ICRC from 192.168.1.92
> [ 355.602913] rdma_rxe: invalid mask or state for qp
> [ 355.606411] rdma_rxe: invalid mask or state for qp
> [ 355.608928] ------------[ cut here ]------------
> [ 355.610831] failed to drain recv queue: -22
> [ 355.612549] WARNING: CPU: 1 PID: 516 at
> drivers/infiniband/core/verbs.c:2738 __ib_drain_rq+0x258/0x290
> [ib_core]
> [ 355.616200] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver
> nfs lockd grace rpcrdma rdma_ucm rdma_cm iw_cm ib_cm rdma_rxe
> ip6_udp_tunnel udp_tunnel ib_uverbs ib_core uinput nls_utf8 isofs
> rfcomm xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter
> nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
> nf_tables nfnetlink tun bridge stp llc bnep vsock_loopback
> vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
> intel_rapl_msr snd_seq_midi snd_seq_midi_event intel_rapl_common
> crct10dif_pclmul crc32_pclmul vmw_balloon ghash_clmulni_intel rapl
> snd_ens1371 joydev pcspkr snd_ac97_codec ac97_bus snd_seq uvcvideo
> btusb snd_pcm videobuf2_vmalloc btrtl videobuf2_memops btbcm
> videobuf2_v4l2 btintel videobuf2_common bluetooth videodev snd_timer
> rfkill snd_rawmidi snd_seq_device mc ecdh_generic snd ecc soundcore
> vmw_vmci i2c_piix4 auth_rpcgss sunrpc ip_tables xfs libcrc32c sr_mod
> cdrom sg ata_generic crc32c_intel vmwgfx ttm drm_kms_helper nvme ahci
> syscopyarea
> [ 355.616399] sysfillrect libahci sysimgblt ata_piix serio_raw
> fb_sys_fops drm nvme_core libata vmxnet3 t10_pi dm_mirror
> dm_region_hash dm_log dm_mod fuse
> [ 355.648889] CPU: 1 PID: 516 Comm: kworker/u256:28 Tainted: G
> W 5.14.0-rc2+ #199
> [ 355.651852] Hardware name: VMware, Inc. VMware Virtual
> Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020
> [ 355.655245] Workqueue: xprtiod xprt_autoclose [sunrpc]
> [ 355.657033] RIP: 0010:__ib_drain_rq+0x258/0x290 [ib_core]
> [ 355.658808] Code: 00 00 00 48 89 ef e8 f7 a9 cc de 48 85 c0 74 e1
> e9 f6 fe ff ff 89 c6 48 c7 c7 40 09 4d c1 c6 05 0a 60 08 00 01 e8 da
> 29 c6 de <0f> 0b e9 da fe ff ff 80 3d f6 5f 08 00 00 0f 85 cd fe ff ff
> 89 c6
> [ 355.665601] RSP: 0018:ffff888008cc7b48 EFLAGS: 00010286
> [ 355.667435] RAX: 0000000000000000 RBX: 1ffff11001198f69 RCX: ffffffff9f427a3e
> [ 355.669758] RDX: 1ffff1100b98cd35 RSI: 0000000000000008 RDI: ffff88805cc669ac
> [ 355.672397] RBP: ffff88801c83c058 R08: ffffed100b98df31 R09: ffffed100b98df31
> [ 355.675018] R10: ffff88805cc6f987 R11: ffffed100b98df30 R12: ffff88801312f000
> [ 355.677844] R13: ffff8880183ef810 R14: ffffffffc18173c0 R15: ffff888001119000
> [ 355.680599] FS: 0000000000000000(0000) GS:ffff88805cc40000(0000)
> knlGS:0000000000000000
> [ 355.683904] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 355.686447] CR2: 00003bafb8832fd0 CR3: 00000000142e6002 CR4: 00000000001706e0
> [ 355.689154] Call Trace:
> [ 355.690157] ? __ib_drain_sq+0x280/0x280 [ib_core]
> [ 355.692013] ? autoremove_wake_function+0x82/0xa0
> [ 355.694000] ? mutex_lock+0x8e/0xe0
> [ 355.695683] ? mutex_unlock+0x1d/0x40
> [ 355.697821] ? cma_modify_qp_err+0xa5/0xf0 [rdma_cm]
> [ 355.700409] ? rdma_unlock_handler+0x20/0x20 [rdma_cm]
> [ 355.702602] ? __update_load_avg_cfs_rq+0x5a/0x550
> [ 355.704558] ib_drain_rq+0x9f/0xb0 [ib_core]
> [ 355.706253] rpcrdma_xprt_disconnect+0xbe/0x4b0 [rpcrdma]
> [ 355.708215] xprt_rdma_close+0xe/0x50 [rpcrdma]
> [ 355.709785] xprt_autoclose+0x8b/0x160 [sunrpc]
> [ 355.711810] process_one_work+0x3ab/0x6b0
> [ 355.713303] worker_thread+0x57/0x5c0
> [ 355.714477] ? process_one_work+0x6b0/0x6b0
> [ 355.715806] kthread+0x1bf/0x1f0
> [ 355.716901] ? set_kthread_struct+0x80/0x80
> [ 355.718333] ret_from_fork+0x22/0x30
> [ 355.719577] ---[ end trace dc0181bd9d91f55b ]---
> [ 355.721135] rdma_rxe: invalid mask or state for qp
> [ 355.723117] ------------[ cut here ]------------
>
> rping oops.
>
> [13873.255148] rdma_rxe: bad ICRC from 192.168.1.92
> [13877.567475] rdma_rxe: bad ICRC from 192.168.1.92
> [13882.175544] rdma_rxe: bad ICRC from 192.168.1.92
> [13886.784329] rdma_rxe: bad ICRC from 192.168.1.92
> [13891.391534] rdma_rxe: bad ICRC from 192.168.1.92
> [13896.000084] rdma_rxe: bad ICRC from 192.168.1.92
> [13900.608291] rdma_rxe: bad ICRC from 192.168.1.92
> [13905.219925] rdma_rxe: bad ICRC from 192.168.1.92
> [13905.222298] rdma_rxe: bad ICRC from 192.168.1.92
> [13907.392305] rdma_rxe: bad ICRC from 192.168.1.92
> [13909.569156] rdma_rxe: bad ICRC from 192.168.1.92
> [13911.744391] rdma_rxe: bad ICRC from 192.168.1.92
> [13913.921244] rdma_rxe: bad ICRC from 192.168.1.92
> [13916.097423] rdma_rxe: bad ICRC from 192.168.1.92
> [13918.272800] rdma_rxe: bad ICRC from 192.168.1.92
> [13920.449837] BUG: unable to handle page fault for address: ffffc90103782194
> [13920.453440] #PF: supervisor read access in kernel mode
> [13920.455627] #PF: error_code(0x0000) - not-present page
> [13920.457585] PGD 1000067 P4D 1000067 PUD 0
> [13920.459103] Oops: 0000 [#1] SMP KASAN PTI
> [13920.460659] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W
> 5.14.0-rc2+ #199
> [13920.463284] Hardware name: VMware, Inc. VMware Virtual
> Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020
> [13920.466820] RIP: 0010:copy_data+0x45/0x3a0 [rdma_rxe]
> [13920.468732] Code: 20 48 89 0c 24 44 89 4c 24 10 45 85 c0 0f 84 6c
> 01 00 00 48 8d 42 04 49 89 d4 45 89 c5 48 89 c7 48 89 44 24 30 e8 fb
> c0 6e ec <45> 8b 7c 24 04 44 89 7c 24 14 45 39 ef 0f 8c 08 03 00 00 49
> 8d 44
> [13920.474397] RSP: 0018:ffff88805cc092e0 EFLAGS: 00010246
> [13920.476010] RAX: 0000000000000000 RBX: ffff88800ed36520 RCX: ffffffffc15b4555
> [13920.478234] RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffffc90103782194
> [13920.480721] RBP: ffff888042fc9a48 R08: 0000000000000010 R09: 0000000000000000
> [13920.483235] R10: ffff888042fc9a55 R11: ffffed1001da6c01 R12: ffffc90103782190
> [13920.485626] R13: 0000000000000010 R14: ffff8880172a536a R15: ffff88800ed36000
> [13920.488373] FS: 0000000000000000(0000) GS:ffff88805cc00000(0000)
> knlGS:0000000000000000
> [13920.491327] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [13920.493344] CR2: ffffc90103782194 CR3: 0000000044dac005 CR4: 00000000001706f0
> [13920.495705] Call Trace:
> [13920.496608] <IRQ>
> [13920.497554] send_data_in.isra.30+0x21/0x40 [rdma_rxe]
> [13920.499371] rxe_responder+0x1a06/0x3e50 [rdma_rxe]
> [13920.500970] ? fib_info_nh_uses_dev+0x6d/0x320
> [13920.502530] ? rxe_resp_queue_pkt+0x60/0x60 [rdma_rxe]
> [13920.504183] ? crc32_pclmul_update+0x36/0x42 [crc32_pclmul]
> [13920.505895] ? rxe_crc32.isra.14+0x7d/0x100 [rdma_rxe]
> [13920.507485] ? check_type_state.isra.8+0x150/0x150 [rdma_rxe]
> [13920.509248] ? find_gid+0x166/0x210 [ib_core]
> [13920.510978] ? _raw_spin_lock_irqsave+0x80/0xe0
> [13920.512449] ? _raw_write_lock_irqsave+0xe0/0xe0
> [13920.513877] ? rxe_resp_queue_pkt+0x60/0x60 [rdma_rxe]
> [13920.515506] rxe_do_task+0xd2/0x160 [rdma_rxe]
> [13920.516881] rxe_rcv+0x5a5/0xe30 [rdma_rxe]
> [13920.518510] ? rxe_crc32.isra.14+0x100/0x100 [rdma_rxe]
> [13920.520297] ? __udp4_lib_lookup+0x3fa/0x5b0
> [13920.521617] ? ib_device_get_by_netdev+0x165/0x1b0 [ib_core]
> [13920.523403] ? ib_unregister_driver+0x170/0x170 [ib_core]
> [13920.525327] ? stack_access_ok+0x35/0x80
> [13920.526808] rxe_udp_encap_recv+0xd0/0x120 [rdma_rxe]
> [13920.528541] ? rxe_enable_task+0x20/0x20 [rdma_rxe]
> [13920.530252] udp_queue_rcv_one_skb+0x36d/0x8a0
> [13920.531985] udp_unicast_rcv_skb.isra.65+0x126/0x140
> [13920.533800] __udp4_lib_rcv+0x924/0x1310
> [13920.535186] ? udp_err+0x20/0x20
> [13920.536190] ? is_bpf_text_address+0x13/0x20
> [13920.537554] ? kernel_text_address+0x100/0x110
> [13920.538944] ? __unwind_start+0x2e8/0x370
> [13920.540193] ? raw_rcv+0x1a0/0x1a0
> [13920.541253] ? nft_do_chain_arp+0xa0/0xa0 [nf_tables]
> [13920.542913] ? nft_do_chain_ipv4+0xe4/0x110 [nf_tables]
> [13920.544569] ? nf_nat_ipv4_fn+0x21/0xc0 [nf_nat]
> [13920.546109] ip_protocol_deliver_rcu+0x170/0x2c0
> [13920.547907] ip_local_deliver_finish+0xae/0xc0
> [13920.549598] ip_local_deliver+0x1ae/0x1c0
> [13920.551031] ? ip_local_deliver_finish+0xc0/0xc0
> [13920.552586] ? ip_route_input_rcu+0x421/0x4b0
> [13920.554071] ? ip_protocol_deliver_rcu+0x2c0/0x2c0
> [13920.555662] ? ip_sublist_rcv+0x3c0/0x3c0
> [13920.556962] ? ip_sublist_rcv+0x3c0/0x3c0
> [13920.558439] ip_rcv+0x159/0x160
> [13920.559549] ? ip_sublist_rcv+0x3c0/0x3c0
> [13920.560782] ? secondary_startup_64_no_verify+0xc2/0xcb
> [13920.562683] ? remove_all_stable_nodes+0x40/0x190
> [13920.564674] ? ip_local_deliver+0x1c0/0x1c0
> [13920.566054] ? __napi_poll+0x5d/0x1f0
> [13920.567310] ? net_rx_action+0x21c/0x4a0
> [13920.568616] ? __do_softirq+0xf9/0x376
> [13920.569809] __netif_receive_skb_one_core+0x133/0x150
> [13920.571350] ? __netif_receive_skb_core+0x1760/0x1760
> [13920.572889] ? ip_finish_output+0xc0/0xc0
> [13920.574123] ? _raw_spin_lock_irqsave+0x80/0xe0
> [13920.575505] ? _raw_write_lock_irqsave+0xe0/0xe0
> [13920.576910] ? kasan_set_track+0x1c/0x30
> [13920.578205] netif_receive_skb+0x94/0x240
> [13920.579667] ? __netif_receive_skb+0xa0/0xa0
> [13920.581132] ? eth_type_trans+0x134/0x270
> [13920.582422] ? eth_gro_receive+0x310/0x310
> [13920.583679] ? __build_skb_around+0x10e/0x130
> [13920.585023] ? dma_unmap_page_attrs+0x1c6/0x2d0
> [13920.586439] vmxnet3_rq_rx_complete+0xa76/0x17b0 [vmxnet3]
> [13920.588146] vmxnet3_poll_rx_only+0x47/0xd0 [vmxnet3]
> [13920.589693] __napi_poll+0x5d/0x1f0
> [13920.590766] net_rx_action+0x21c/0x4a0
> [13920.591918] ? napi_threaded_poll+0x1c0/0x1c0
> [13920.593253] ? vmxnet3_msix_tx+0x100/0x100 [vmxnet3]
> [13920.594792] ? note_interrupt+0xf0/0x3a0
> [13920.596042] ? add_interrupt_randomness+0x15f/0x2a0
> [13920.597677] ? _raw_spin_lock+0x7a/0xd0
> [13920.598853] ? _raw_write_lock_bh+0xe0/0xe0
> [13920.600144] __do_softirq+0xf9/0x376
> [13920.601247] irq_exit_rcu+0x118/0x130
> [13920.602435] common_interrupt+0x77/0x90
> [13920.603712] </IRQ>
> [13920.604411] asm_common_interrupt+0x1e/0x40
> [13920.606075] RIP: 0010:acpi_idle_do_entry+0x61/0x70
> [13920.607750] Code: ef 01 00 be 08 00 00 00 48 89 df e8 89 10 54 ff
> 48 89 df e8 41 06 54 ff 48 8b 03 a8 08 75 0c eb 07 0f 00 2d 01 b1 73
> 00 fb f4 <fa> 5b c3 48 89 df 5b e9 93 f9 ff ff cc cc cc 0f 1f 44 00 00
> 41 57
> [13920.613840] RSP: 0018:ffffffffaf407d98 EFLAGS: 00000246
> [13920.615463] RAX: 0000000000004000 RBX: ffffffffaf41a400 RCX: ffffffffae76014f
> [13920.617651] RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffffffffaf41a400
> [13920.619820] RBP: 0000000000000001 R08: fffffbfff5e83481 R09: fffffbfff5e83481
> [13920.622005] R10: ffffffffaf41a407 R11: fffffbfff5e83480 R12: ffff88800953a000
> [13920.624162] R13: 0000000000000001 R14: ffff88800953a004 R15: ffff8880046c1800
> [13920.626335] ? acpi_idle_do_entry+0x4f/0x70
> [13920.627740] ? acpi_idle_do_entry+0x4f/0x70
> [13920.629020] acpi_idle_enter+0x14d/0x1c0
> [13920.630295] cpuidle_enter_state+0xb2/0x590
> [13920.631603] ? tick_nohz_stop_tick+0x1f0/0x2d0
> [13920.632987] cpuidle_enter+0x3c/0x60
> [13920.634136] do_idle+0x399/0x400
> [13920.635192] ? arch_cpu_idle_exit+0x40/0x40
> [13920.636471] ? do_idle+0x26d/0x400
> [13920.637517] cpu_startup_entry+0x19/0x20
> [13920.638716] start_kernel+0x378/0x396
> [13920.639925] secondary_startup_64_no_verify+0xc2/0xcb
> [13920.641564] Modules linked in: rpcrdma rdma_ucm rdma_cm iw_cm ib_cm
> rdma_rxe ip6_udp_tunnel udp_tunnel ib_uverbs ib_core uinput nls_utf8
> isofs rfcomm xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat
> nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6
> nf_defrag_ipv4 nf_tables nfnetlink tun bridge stp llc bnep
> vsock_loopback vmw_vsock_virtio_transport_common
> vmw_vsock_vmci_transport vsock intel_rapl_msr snd_seq_midi
> snd_seq_midi_event intel_rapl_common crct10dif_pclmul crc32_pclmul
> ghash_clmulni_intel vmw_balloon rapl pcspkr joydev snd_ens1371
> snd_ac97_codec ac97_bus btusb snd_seq uvcvideo btrtl btbcm btintel
> videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common
> bluetooth snd_pcm videodev mc snd_timer rfkill snd_rawmidi
> snd_seq_device ecdh_generic snd ecc soundcore vmw_vmci i2c_piix4
> auth_rpcgss sunrpc ip_tables xfs libcrc32c sr_mod cdrom sg ata_generic
> crc32c_intel nvme vmwgfx ata_piix serio_raw nvme_core ttm ahci libahci
> drm_kms_helper libata syscopyarea
> [13920.641737] sysfillrect sysimgblt fb_sys_fops vmxnet3 t10_pi drm
> dm_mirror dm_region_hash dm_log dm_mod fuse
> [13920.671747] CR2: ffffc90103782194
> [13920.674506] ---[ end trace 6ae70b2fba32e277 ]---
>
>
> On Wed, Jul 21, 2021 at 9:56 PM Pearson, Robert B
> <robert.pearson2@hpe.com> wrote:
> >
> > OK. For tomorrow. I need to know more about your setup. Which versions of kernel, rdma-core and what application SW you are running so I can try to reproduce your results.
> >
> > Regards,
> >
> > Bob Pearson
> >
> > -----Original Message-----
> > From: Olga Kornievskaia <aglo@umich.edu>
> > Sent: Wednesday, July 21, 2021 7:31 PM
> > To: Bob Pearson <rpearsonhpe@gmail.com>
> > Cc: Jason Gunthorpe <jgg@nvidia.com>; Zhu Yanjun <zyjzyj2000@gmail.com>; linux-rdma <linux-rdma@vger.kernel.org>
> > Subject: Re: [PATCH for-next] RDMA/rxe: Fix bug in rxe_net.c
> >
> > On Wed, Jul 21, 2021 at 5:42 PM Bob Pearson <rpearsonhpe@gmail.com> wrote:
> > >
> > > An earlier patch removed setting of tot_len in IPV4 headers because it
> > > was also set in ip_local_out. However, this change resulted in an
> > > incorrect ICRC being computed because the tot_len field is not masked
> > > out. This patch restores that line. This fixes the bug reported by Zhu Yanjun.
> > > This bug would have also affected anyone using rxe.
> > >
> > > Fixes: 230bb836ee88 ("RDMA/rxe: Fix redundant call to ip_send_check")
> > > > Reported-by: Zhu Yanjun <zyjzyj2000@gmail.com>
> > > Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> > > ---
> > > drivers/infiniband/sw/rxe/rxe_net.c | 1 +
> > > 1 file changed, 1 insertion(+)
> > >
> > > diff --git a/drivers/infiniband/sw/rxe/rxe_net.c
> > > b/drivers/infiniband/sw/rxe/rxe_net.c
> > > index dec92928a1cd..5ac27f28ace1 100644
> > > --- a/drivers/infiniband/sw/rxe/rxe_net.c
> > > +++ b/drivers/infiniband/sw/rxe/rxe_net.c
> > > @@ -259,6 +259,7 @@ static void prepare_ipv4_hdr(struct dst_entry
> > > *dst, struct sk_buff *skb,
> > >
> > > iph->version = IPVERSION;
> > > iph->ihl = sizeof(struct iphdr) >> 2;
> > > + iph->tot_len = htons(skb->len);
> > > iph->frag_off = df;
> > > iph->protocol = proto;
> > > iph->tos = tos;
> > > --
> >
> > This patch made the server crash (just like one of the other crashes I've seen and posted to the list).
> >
> > The client logs:
> >
> > [ 206.437839] rdma_rxe: bad ICRC from 192.168.1.92
> > [ 211.043978] rdma_rxe: bad ICRC from 192.168.1.92
> > [ 215.652973] rdma_rxe: bad ICRC from 192.168.1.92
> >
> >
> > Server crash:
> >
> > [11568.440098] BUG: unable to handle page fault for address: ffffaddb21f61180
> > [11568.442923] #PF: supervisor write access in kernel mode
> > [11568.444452] #PF: error_code(0x0002) - not-present page
> > [11568.445996] PGD 1000067 P4D 1000067 PUD 11b9067 PMD 0
> > [11568.447527] Oops: 0002 [#1] SMP PTI
> > [11568.448606] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W
> > 5.14.0-rc1+ #42
> > [11568.450911] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020
> > [11568.454072] RIP: 0010:rxe_cq_post+0x98/0x210 [rdma_rxe]
> > [11568.455613] Code: 8b b3 48 01 00 00 4d 8b 48 08 41 8b 48 28 49 8d
> > b9 80 01 00 00 85 f6 0f 84 78 01 00 00 41 8b 50 34 d3 e2 48 01 fa 48 8b 4d 00 <48> 89 0a 48 8b 4d 08 48 89 4a 08 48 8b 4d 10 48 89 4a 10 48 8b 4d
> > [11568.461093] RSP: 0018:ffffaddb004c0988 EFLAGS: 00010082
> > [11568.462621] RAX: 0000000000000246 RBX: ffff9c9137df1a00 RCX: 0000000000000000
> > [11568.464695] RDX: ffffaddb21f61180 RSI: 0000000000000001 RDI: ffffaddb05f5f180
> > [11568.466779] RBP: ffffaddb004c0a30 R08: ffff9c9123186c00 R09: ffffaddb05f5f000
> > [11568.468902] R10: 80139a1c70550000 R11: 400000005d050000 R12: 0000000000000000
> > [11568.470977] R13: ffff9c9137df1b40 R14: ffff9c9137d50008 R15: 000000000000000a
> > [11568.473050] FS: 0000000000000000(0000) GS:ffff9c917be40000(0000)
> > knlGS:0000000000000000
> > [11568.475395] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [11568.477090] CR2: ffffaddb21f61180 CR3: 0000000043476005 CR4: 00000000001706e0
> > [11568.479170] Call Trace:
> > [11568.479966] <IRQ>
> > [11568.480683] rxe_responder+0x612/0x2470 [rdma_rxe]
> > [11568.482122] rxe_do_task+0x89/0x100 [rdma_rxe]
> > [11568.483427] rxe_rcv+0x2eb/0x900 [rdma_rxe]
> > [11568.484655] ? __udp4_lib_lookup+0x2c8/0x440
> > [11568.486159] rxe_udp_encap_recv+0x68/0xa0 [rdma_rxe]
> > [11568.487721] ? rxe_enable_task+0x10/0x10 [rdma_rxe]
> > [11568.489223] udp_queue_rcv_one_skb+0x1df/0x4e0
> > [11568.490528] udp_unicast_rcv_skb.isra.67+0x74/0x90
> > [11568.491926] __udp4_lib_rcv+0x555/0xb90
> > [11568.493053] ip_protocol_deliver_rcu+0xe8/0x1b0
> > [11568.494479] ip_local_deliver_finish+0x44/0x50
> > [11568.496204] ip_local_deliver+0xf1/0x100
> > [11568.497621] ? ip_protocol_deliver_rcu+0x1b0/0x1b0
> > [11568.499147] ip_rcv+0xcb/0xe0
> > [11568.500032] __netif_receive_skb_core+0x3a2/0x1010
> > [11568.501491] ? packet_rcv+0x40/0x4b0
> > [11568.502661] ? select_idle_sibling+0x29/0x970
> > [11568.504019] __netif_receive_skb_one_core+0x3c/0xa0
> > [11568.505455] netif_receive_skb+0x3d/0x130
> > [11568.506650] vmxnet3_rq_rx_complete+0x5f0/0xdc0 [vmxnet3]
> > [11568.508808] vmxnet3_poll_rx_only+0x31/0xa0 [vmxnet3]
> > [11568.510526] __napi_poll+0x2b/0x120
> > [11568.511596] net_rx_action+0xe2/0x240
> > [11568.512678] ? vmxnet3_msix_rx+0x4a/0x60 [vmxnet3]
> > [11568.514084] __do_softirq+0xd9/0x2a1
> > [11568.515218] irq_exit_rcu+0xba/0xd0
> > [11568.516272] common_interrupt+0x77/0x90
> > [11568.517438] </IRQ>
> > [11568.518059] asm_common_interrupt+0x1e/0x40
> > [11568.519291] RIP: 0010:acpi_idle_do_entry+0x4c/0x50
> > [11568.520680] Code: 08 48 8b 15 3a e3 94 01 ed c3 e9 5f fc ff ff 65
> > 48 8b 04 25 00 6f 01 00 48 8b 00 a8 08 75 ea eb 07 0f 00 2d 40 41 50
> > 00 fb f4 <fa> c3 cc cc 0f 1f 44 00 00 41 55 41 89 d5 41 54 49 89 f4 55
> > 53 48
> > [11568.526026] RSP: 0018:ffffaddb0009be68 EFLAGS: 00000246
> > [11568.527569] RAX: 0000000000004000 RBX: 0000000000000001 RCX: ffff9c917be40000
> > [11568.529627] RDX: 0000000000000001 RSI: ffffffff9dcc99c0 RDI: ffff9c917c03b464
> > [11568.531723] RBP: ffff9c9105f63400 R08: ffff9c917c03b400 R09: 000000000000b0e0
> > [11568.533772] R10: 0000000000001e99 R11: ffff9c917be6a984 R12: ffffffff9dcc9a40
> > [11568.535918] R13: ffffffff9dcc99c0 R14: 0000000000000001 R15: 0000000000000000
> > [11568.538558] ? sched_clock_cpu+0x9/0xa0
> > [11568.539706] acpi_idle_enter+0x4d/0xb0
> > [11568.540912] cpuidle_enter_state+0x8c/0x350
> > [11568.542164] cpuidle_enter+0x29/0x40
> > [11568.543211] do_idle+0x257/0x2a0
> > [11568.544303] cpu_startup_entry+0x19/0x20
> > [11568.545455] start_secondary+0x116/0x150
> > [11568.546928] secondary_startup_64_no_verify+0xc2/0xcb
> > [11568.548479] Modules linked in: rpcrdma rdma_ucm rdma_cm iw_cm ib_cm rdma_rxe ip6_udp_tunnel udp_tunnel ib_uverbs ib_core fuse rfcomm xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipt_REJECT
> > nf_reject_ipv4 nft_counter nft_compat nf_tables nfnetlink tun bridge stp llc vmw_vsock_vmci_transport vsock bnep snd_seq_midi snd_seq_midi_event intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul vmw_balloon ghash_clmulni_intel joydev pcspkr btusb btrtl btbcm btintel bluetooth uvcvideo rfkill videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 snd_ens1371 snd_ac97_codec ac97_bus snd_seq videobuf2_common snd_pcm videodev mc ecdh_generic ecc snd_timer snd_rawmidi snd_seq_device snd soundcore vmw_vmci i2c_piix4 auth_rpcgss sunrpc ip_tables xfs libcrc32c sr_mod cdrom sg crc32c_intel ata_generic vmwgfx ttm serio_raw nvme drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops nvme_core t10_pi cec ata_piix ahci vmxnet3 libahci drm libata
> > [11568.575542] CR2: ffffaddb21f61180
> > [11568.577210] ---[ end trace 8afcc89bb91d9b85 ]---
> > [11568.578573] RIP: 0010:rxe_cq_post+0x98/0x210 [rdma_rxe]
> >
> >
> >
> > > 2.30.2
> > >
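
For readers following the patch description quoted above: the claim is
that tot_len is covered by the ICRC because it is not masked out, so a
sender that computes the ICRC before tot_len is filled in will disagree
with the receiver. The userspace sketch below illustrates only that
point. It is not the rxe code. It assumes, from my reading of the RoCEv2
ICRC rules rather than from anything in this thread, that TOS, TTL and
the header checksum are replaced by all-ones before the CRC, and it
covers only the IPv4 header rather than the whole packet. All names
(struct ipv4_hdr, crc32_bitwise, masked_hdr_crc, make_hdr) and the
destination address are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Minimal 20-byte IPv4 header (no options); illustrative layout only. */
struct ipv4_hdr {
	uint8_t ver_ihl;
	uint8_t tos;
	uint8_t tot_len[2];	/* big-endian, as on the wire */
	uint8_t id[2];
	uint8_t frag_off[2];
	uint8_t ttl;
	uint8_t protocol;
	uint8_t check[2];
	uint8_t saddr[4];
	uint8_t daddr[4];
};

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static uint32_t crc32_bitwise(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/*
 * Hypothetical helper: CRC over a copy of the header in which the
 * fields assumed to be variant in flight are forced to all-ones.
 * tot_len is deliberately left untouched, so it feeds into the CRC.
 */
static uint32_t masked_hdr_crc(const struct ipv4_hdr *in)
{
	struct ipv4_hdr h = *in;

	h.tos = 0xff;
	h.ttl = 0xff;
	h.check[0] = 0xff;
	h.check[1] = 0xff;
	return crc32_bitwise((const uint8_t *)&h, sizeof(h));
}

/* Hypothetical helper: build a UDP/IPv4 header with the given length and TTL. */
static struct ipv4_hdr make_hdr(uint16_t tot_len, uint8_t ttl)
{
	static const uint8_t src[4] = {192, 168, 1, 92};
	static const uint8_t dst[4] = {192, 168, 1, 93}; /* made-up peer */
	struct ipv4_hdr h;

	memset(&h, 0, sizeof(h));
	h.ver_ihl = 0x45;			/* version 4, ihl 5 */
	h.tot_len[0] = (uint8_t)(tot_len >> 8);
	h.tot_len[1] = (uint8_t)(tot_len & 0xff);
	h.ttl = ttl;
	h.protocol = 17;			/* UDP */
	memcpy(h.saddr, src, sizeof(src));
	memcpy(h.daddr, dst, sizeof(dst));
	return h;
}
```

Comparing masked_hdr_crc() for make_hdr(0, 64) and make_hdr(1052, 64)
gives two different values, while make_hdr(1052, 64) and
make_hdr(1052, 63) give the same value, because TTL is masked and
tot_len is not. That matches the failure mode the patch description
gives for the "bad ICRC" messages.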
2021-07-21 21:41 [PATCH for-next] RDMA/rxe: Fix bug in rxe_net.c Bob Pearson
2021-07-22 0:31 ` Olga Kornievskaia
2021-07-22 1:55 ` Pearson, Robert B
2021-07-22 15:37 ` Olga Kornievskaia
2021-07-26 7:42 ` Zhu Yanjun [this message]
2021-07-26 13:15 ` Pearson, Robert B