From mboxrd@z Thu Jan 1 00:00:00 1970
From: Robert LeBlanc
Subject: Re: iSER with policy based routing error
Date: Mon, 15 May 2017 21:33:09 -0600
Message-ID:
References: <71232433-a3d1-1bc0-a995-ae32fc05913f@grimberg.me>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Return-path:
In-Reply-To: <71232433-a3d1-1bc0-a995-ae32fc05913f-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Sagi Grimberg
Cc: linux-rdma
List-Id: linux-rdma@vger.kernel.org

We are getting errors with rping too, even on the same IPv4 subnet.
All userland RDMA programs seem to be failing. It is like there is a
missing library or something. We installed "Infiniband Support" group
install, so I'm not sure what could be missing.

# rping -s -a 0.0.0.0
Segmentation fault

From dmesg:
[Mon May 15 14:28:36 2017] rping[3289]: segfault at 18 ip 00007f0142ca9a34 sp 00007ffd7f0a8cc0 error 4 in libibverbs.so.1.0.0[7f0142c9e000+11000]

# gdb rping core.3289
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-80.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /usr/bin/rping...Reading symbols from /usr/lib/debug/usr/bin/rping.debug...done.
done.
[New LWP 3289]
[New LWP 3292]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `rping -s -a 0.0.0.0'.
Program terminated with signal 11, Segmentation fault.
#0  __ibv_alloc_pd (context=0x0) at src/verbs.c:196
196         pd = context->ops.alloc_pd(context);
Missing separate debuginfos, use: debuginfo-install libcxgb3-1.3.1-8.el7.x86_64 libcxgb4-1.3.5-3.el7.x86_64 libipathverbs-1.3-2.el7.x86_64 libmlx4-1.0.6-5.el7.x86_64 libmlx5-1.0.2-1.el7.x86_64 libmthca-1.0.6-13.el7.x86_64 libnes-1.1.4-2.el7.x86_64 libnl3-3.2.21-10.el7.x86_64
(gdb) bt
#0  __ibv_alloc_pd (context=0x0) at src/verbs.c:196
#1  0x0000563bc5c1f5c6 in rping_setup_qp (cb=cb@entry=0x563bc61f9780, cm_id=) at examples/rping.c:519
#2  0x0000563bc5c1de5a in rping_run_server (cb=0x563bc61f9780) at examples/rping.c:890
#3  main (argc=4, argv=0x7ffd7f0a8ee8) at examples/rping.c:1268
(gdb) f 0
#0  __ibv_alloc_pd (context=0x0) at src/verbs.c:196
196         pd = context->ops.alloc_pd(context);
(gdb) list
191
192     struct ibv_pd *__ibv_alloc_pd(struct ibv_context *context)
193     {
194         struct ibv_pd *pd;
195
196         pd = context->ops.alloc_pd(context);
197         if (pd)
198             pd->context = context;
199
200         return pd;
(gdb) p context
$1 = (struct ibv_context *) 0x0
(gdb)

# rping -c -a 192.168.0.13
cma event RDMA_CM_EVENT_REJECTED, error 28
wait for CONNECTED state 4
connect error -1
Segmentation fault

From dmesg
[Mon May 15 14:27:24 2017] rping[3075]: segfault at 7f2386c800a8 ip 00007f2386672adf sp 00007ffd40e5df60 error 4 in libibverbs.so.1.0.0[7f2386667000+11000]

# gdb rping core.3075
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-80.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /usr/bin/rping...Reading symbols from /usr/lib/debug/usr/bin/rping.debug...done.
done.
[New LWP 3075]
[New LWP 3078]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `rping -c -a 192.168.0.13'.
Program terminated with signal 11, Segmentation fault.
#0  __ibv_dereg_mr (mr=0x5624b478c740) at src/verbs.c:237
237         ret = mr->context->ops.dereg_mr(mr);
Missing separate debuginfos, use: debuginfo-install libcxgb3-1.3.1-8.el7.x86_64 libcxgb4-1.3.5-3.el7.x86_64 libgcc-4.8.5-4.el7.x86_64 libipathverbs-1.3-2.el7.x86_64 libmlx4-1.0.6-5.el7.x86_64 libmlx5-1.0.2-1.el7.x86_64 libmthca-1.0.6-13.el7.x86_64 libnes-1.1.4-2.el7.x86_64 libnl3-3.2.21-10.el7.x86_64
(gdb) bt
#0  __ibv_dereg_mr (mr=0x5624b478c740) at src/verbs.c:237
#1  0x00005624b2b618a7 in rping_free_buffers (cb=0x5624b4786780) at examples/rping.c:470
#2  0x00005624b2b5fef3 in rping_run_client (cb=) at examples/rping.c:1111
#3  main (argc=, argv=) at examples/rping.c:1270
(gdb) f 0
#0  __ibv_dereg_mr (mr=0x5624b478c740) at src/verbs.c:237
237         ret = mr->context->ops.dereg_mr(mr);
(gdb) list
232     {
233         int ret;
234         void *addr = mr->addr;
235         size_t length = mr->length;
236
237         ret = mr->context->ops.dereg_mr(mr);
238         if (!ret)
239             ibv_dofork_range(addr, length);
240
241         return ret;
(gdb) p *mr
$1 = {context = 0x7f2386c80070, pd = 0x5624b4789f30, addr = 0x5624b47867e8, length = 16, handle = 0, lkey = 162608, rkey = 162608}
(gdb) p *mr->context
Cannot access memory at address 0x7f2386c80070

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

On Mon, May 15, 2017 at 4:39 AM, Sagi Grimberg wrote:
> Hi Robert,
>
>> We are trying to leverage multiple cards/ports for iSER for
>> performance and resiliency reasons. The ports are configured with only
>> IPv6 addresses and each port is on a separate VLAN/subnet that is
>> routable to each other subnet. We are using rules with tables to set a
>> default gateway for each adapter/subnet based on the source IPv6
>> address (policy based routing).
>> Using TCP for iSCSI, everything works
>> fine and traffic ingresses/egresses the right ports. However, when we
>> try using iSER, we get connection errors.
>>
>> May 12 13:39:27 prv-0-14-roberttest kernel: iser: iser_connect:
>> rdma_resolve_addr failed: -101
>> May 12 13:39:27 prv-0-14-roberttest iscsid: Received iferror -101:
>> Network is unreachable.
>> May 12 13:39:27 prv-0-14-roberttest iscsid: cannot make a connection
>> to 2604:3140:40:300:0:580:d0:0:3260 (-101,0)
>
>
> This looks 100% rdma_cm to me. iser is completely agnostic to address
> families and routes.
>
>> If we put a default gateway for IPv6 in the 'default' table, then iSER
>> is able to make a connection, but we can only use one port. It looks
>> as if iSER is not following the rules in the default routing table to
>> find the appropriate default gateway in a different table.
>
>
> As I said, iser relies on rdma_cm for routing decisions.
> I would suspect that all rdma_cm based protocols to be
> affected as well (nfs, nvmf).
>
> Did you check plain rping like Or suggested?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html