linux-rdma.vger.kernel.org archive mirror
* [linux-rdma and rdma-core]: Unable to perform rdma_connect in loopbacked configuration
@ 2017-09-18 18:18 Stephen Bates
From: Stephen Bates @ 2017-09-18 18:18 UTC (permalink / raw)
  To: linux-rdma@vger.kernel.org
  Cc: Logan Gunthorpe, Sagi Grimberg, Max Gurtovoy

Hi All

I am seeing an issue that I think is a problem with rdma_cm and wanted to report it here to see if anyone has any advice. Basically, I have two HCAs in a single server, connected to each other via a network cable. I can run ping, iperf and other IP-based applications, and I see traffic flow out of one NIC, over the cable and into the other NIC. However, any rdma_cm-based application fails at the rdma_connect step. [BTW I have confirmed that things work fine in a more traditional setup using two servers.]

The Details

1. 4.12.3 stable kernel.
2. rdma-core v14.
3. Mellanox CX5 100G HCAs configured for Ethernet (RoCE) mode.
4. Intel x86_64 CPU.

Using the NAT approach discussed in [1], I can set up IPv4 addresses on both HCAs such that I avoid a local loopback (the addresses I use are a little different to the ones in that reference, but the approach is identical). This allows ping, iperf and other IP-based applications to work just fine. For example:

<server>
iperf -B 172.18.1.1
</server>
<client>
iperf 172.18.11.1
</client>

works great and I can use packet counters to confirm the traffic is hitting the network cable.
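
For readers who have not seen [1], the setup can be sketched roughly as below. This is only a minimal Python sketch that prints the kind of commands that approach uses; it does not run them. The only values taken from this mail are 172.18.1.1 and 172.18.11.1, and treating 172.18.11.1 as the NAT alias of 172.18.1.1 is an assumption based on the iperf example. The interface names, the second address pair and the MAC addresses are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Port:
    iface: str  # interface name (hypothetical placeholder)
    real: str   # address actually configured on the interface
    fake: str   # alias address the other port uses to reach this one
    mac: str    # MAC address of this port (hypothetical placeholder)


def nat_commands(local: Port, peer: Port) -> List[str]:
    """Commands for one direction of the cross-cable NAT setup."""
    return [
        # Packets leaving for the peer's alias appear to come from our alias.
        f"iptables -t nat -A POSTROUTING -s {local.real} -d {peer.fake} "
        f"-j SNAT --to-source {local.fake}",
        # Packets arriving for our alias are rewritten to our real address.
        f"iptables -t nat -A PREROUTING -d {local.fake} "
        f"-j DNAT --to-destination {local.real}",
        # Force traffic for the peer's alias out of the local interface.
        f"ip route add {peer.fake} dev {local.iface}",
        # Static neighbour entry so no ARP is needed for the alias.
        f"arp -i {local.iface} -s {peer.fake} {peer.mac}",
    ]


if __name__ == "__main__":
    # Assumed layout: only 172.18.1.1 and 172.18.11.1 appear in the mail.
    port_a = Port("ethA", "172.18.1.1", "172.18.11.1", "02:00:00:00:00:01")
    port_b = Port("ethB", "172.18.2.1", "172.18.12.1", "02:00:00:00:00:02")
    for cmd in nat_commands(port_a, port_b) + nat_commands(port_b, port_a):
        print(cmd)
```

Printing the commands rather than executing them keeps the sketch safe to run without root and makes it easy to compare against an existing configuration.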

However, if I try:

<server>
rping -s -a 172.18.1.1 -vVd
</server>
<client>
rping -c -a 172.18.11.1 -vVd
</client>

I see the following:

<server>
created cm_id 0xceded03170
rdma_bind_addr successful
rdma_listen
</server>
<client>
created cm_id 0x138702d110
cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0x138702d110 (parent)
cma_event type RDMA_CM_EVENT_ROUTE_RESOLVED cma_id 0x138702d110 (parent)
rdma_resolve_addr - rdma_resolve_route successful
created pd 0x138702cb80
created channel 0x138702cba0
created cq 0x138702cbc0
created qp 0x138702faf8
rping_setup_buffers called on cb 0x13870253c0
allocated & registered buffers...
cq_thread started.
cma_event type RDMA_CM_EVENT_UNREACHABLE cma_id 0x138702d110 (parent)
cma event RDMA_CM_EVENT_UNREACHABLE, error -110
wait for CONNECTED state 4
connect error -1
</client>
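
As a reading aid for the client log above: assuming the "error -110" value is a negated Linux errno (which is how rdma_cm event status values are usually reported), it can be decoded with a few lines of Python. On Linux x86_64, errno 110 is ETIMEDOUT. The helper name below is hypothetical and not part of rping or rdma-core.

```python
import errno
import os
from typing import Tuple


def describe_cm_status(status: int) -> Tuple[str, str]:
    """Map a negated-errno status to (symbolic name, message).

    Assumes the status follows the kernel convention of -errno.
    """
    code = abs(status)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    name, text = describe_cm_status(-110)
    print(f"status -110 -> {name}: {text}")
```

Read this way, the UNREACHABLE event looks like a timeout waiting for a reply to the connect request, rather than an explicit rejection by the server side.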

I’ve tried using configfs to switch the preferred RoCE mode, but that had no effect. I’d appreciate any ideas or input from anyone who has got this working on their systems. I know there are other ways to solve this (e.g. (para)virtualization of the client), but I’d like to get this approach up and running if I can. BTW, as an extra data point, I also tried the in-kernel rdma_cm (using NVMe over Fabrics) and got a similar error message.

Cheers
 
Stephen

[1] https://serverfault.com/questions/127636/force-local-ip-traffic-to-an-external-interface



