linux-rdma.vger.kernel.org archive mirror
* [linux-rdma and rdma-core]: Unable to perform rdma_connect in loopbacked configuration
@ 2017-09-18 18:18 Stephen  Bates
       [not found] ` <59792242-F2C8-431A-BEDA-996844EDE4C5-pv7U853sEMVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Stephen  Bates @ 2017-09-18 18:18 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Logan Gunthorpe, Sagi Grimberg, Max Gurtovoy

Hi All

I am seeing an issue that I think is a problem with rdma_cm and wanted to report it here to see if anyone has any advice. Basically, I have two HCAs in a single server connected via a network cable. I can run ping, iperf and other IP-based applications and I see traffic flow out of one NIC, over the cable and into the other NIC. However, any rdma_cm-based application fails at the rdma_connect step. [BTW I have confirmed that things work fine in a more traditional setup using two servers.]

The Details

1. 4.12.3 stable kernel.
2. rdma-core v14.
3. Mellanox CX5 100G HCAs configured for Ethernet (RoCE) mode.
4. Intel x86_64 CPU.

Using a NAT approach discussed in [1] I can set up IPv4 addresses on both HCAs such that I avoid a local loopback (the addresses I use are a little different from the ones in that reference but the approach is identical). This allows ping, iperf and other IP-based applications to work just fine. For example:

<server>
iperf -B 172.18.1.1
</server>
<client>
iperf 172.18.11.1
</client>

works great and I can use packet counters to confirm the traffic is hitting the network cable.
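
As an illustration of the packet-counter check, here is a minimal Python sketch. It assumes the standard per-netdev counters under /sys/class/net/<iface>/statistics; the interface name used below is hypothetical, and this is not necessarily the exact counter set that was used for the measurement above.

```python
from pathlib import Path


def read_counters(iface, root="/sys/class/net"):
    """Read the rx/tx packet counters of one netdev from sysfs."""
    stats = Path(root) / iface / "statistics"
    return {
        name: int((stats / name).read_text())
        for name in ("rx_packets", "tx_packets")
    }


def counter_delta(before, after):
    """Per-counter difference between two snapshots of the same netdev."""
    return {name: after[name] - before[name] for name in before}
```

Taking a snapshot on each interface before and after the iperf run (for example read_counters("enp1s0f0"), a made-up name) and comparing the deltas shows whether the packets left one port and arrived on the other.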

However, if I try:

<server>
rping -s -a 172.18.1.1 -vVd
</server>
<client>
rping -c -a 172.18.11.1 -vVd
</client>

I see the following:

<server>
created cm_id 0xceded03170
rdma_bind_addr successful
rdma_listen
</server>
<client>
created cm_id 0x138702d110
cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0x138702d110 (parent)
cma_event type RDMA_CM_EVENT_ROUTE_RESOLVED cma_id 0x138702d110 (parent)
rdma_resolve_addr - rdma_resolve_route successful
created pd 0x138702cb80
created channel 0x138702cba0
created cq 0x138702cbc0
created qp 0x138702faf8
rping_setup_buffers called on cb 0x13870253c0
allocated & registered buffers...
cq_thread started.
cma_event type RDMA_CM_EVENT_UNREACHABLE cma_id 0x138702d110 (parent)
cma event RDMA_CM_EVENT_UNREACHABLE, error -110
wait for CONNECTED state 4
connect error -1
</client>
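
For reference, the "error -110" above can be read as a negated Linux errno value. That reading is an assumption about how rping reports the event status. The small Python sketch below decodes such a value; the helper name is made up for illustration.

```python
import errno
import os


def describe_cm_error(status):
    """Map a negated errno-style status to (symbolic name, message)."""
    code = abs(status)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # On Linux, 110 is ETIMEDOUT; other platforms number errnos differently.
    print(describe_cm_error(-110))
```

On Linux this identifies the status as ETIMEDOUT, which would be consistent with the connection request never being answered rather than being actively rejected.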

I’ve tried using configfs to switch the preferred RoCE mode but that had no effect. I’d appreciate any ideas or input from anyone who has this working on their systems. I know there are other ways to solve this (e.g. (para)virtualization of the client), but I’d like to get this approach up and running if I can. BTW, as an extra data point, I also tried the in-kernel rdma_cm (using NVMe over Fabrics) and got a similar error message.

Cheers
 
Stephen

[1] https://serverfault.com/questions/127636/force-local-ip-traffic-to-an-external-interface



^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [linux-rdma and rdma-core]: Unable to perform rdma_connect in loopbacked configuration
       [not found] ` <59792242-F2C8-431A-BEDA-996844EDE4C5-pv7U853sEMVWk0Htik3J/w@public.gmane.org>
@ 2017-09-18 19:09   ` Parav Pandit
       [not found]     ` <VI1PR0502MB3008FAC205874147558EA241D1630-o1MPJYiShExKsLr+rGaxW8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  2017-09-18 20:59   ` Jason Gunthorpe
  1 sibling, 1 reply; 9+ messages in thread
From: Parav Pandit @ 2017-09-18 19:09 UTC (permalink / raw)
  To: Stephen Bates, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Logan Gunthorpe, Sagi Grimberg, Max Gurtovoy

Hi Stephen,

If your only intention is to force traffic out of the outgoing port and loop it back externally, I can suggest a different route table configuration without NAT.
Let me know.

Parav

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [linux-rdma and rdma-core]: Unable to perform rdma_connect in loopbacked configuration
       [not found]     ` <VI1PR0502MB3008FAC205874147558EA241D1630-o1MPJYiShExKsLr+rGaxW8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2017-09-18 19:41       ` Stephen  Bates
  2017-09-19  3:51       ` Leon Romanovsky
  1 sibling, 0 replies; 9+ messages in thread
From: Stephen  Bates @ 2017-09-18 19:41 UTC (permalink / raw)
  To: Parav Pandit, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Logan Gunthorpe, Sagi Grimberg, Max Gurtovoy


> If your only intention is to force traffic out of the outgoing port and loop it back externally, I can suggest a different route table configuration without NAT.
> Let me know.

Thanks Parav. I can try that and see if it helps. Can you send the configuration to me? 

Stephen



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [linux-rdma and rdma-core]: Unable to perform rdma_connect in loopbacked configuration
       [not found] ` <59792242-F2C8-431A-BEDA-996844EDE4C5-pv7U853sEMVWk0Htik3J/w@public.gmane.org>
  2017-09-18 19:09   ` Parav Pandit
@ 2017-09-18 20:59   ` Jason Gunthorpe
       [not found]     ` <20170918205959.GC7059-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  1 sibling, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2017-09-18 20:59 UTC (permalink / raw)
  To: Stephen Bates
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Logan Gunthorpe,
	Sagi Grimberg, Max Gurtovoy

On Mon, Sep 18, 2017 at 06:18:25PM +0000, Stephen  Bates wrote:

> Using a NAT approach discussed in [1] I can setup IPv4 addresses on
> both HCAs such that I avoid a local loopback (the addresses I use
> are a little different to the ones in that reference but the
> approach is identical). This allows ping, iperf and other IP based
> applications to work just fine. For example:

The RDMA stuff ignores everything in iptables.

IIRC, you need to make 'ip route get blah' not return lo. This is done
with some combination of policy routing and sysfs tweaking.
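
A minimal Python sketch of that check: it parses the output of 'ip route get <addr>' and reports whether the kernel picked lo. The helper names and the sample interface name in the comments are hypothetical, and the sketch only diagnoses the routing decision; it does not set up the policy routing itself.

```python
import subprocess


def route_device(route_output):
    """Return the interface named after 'dev' in `ip route get` output."""
    tokens = route_output.split()
    for i, tok in enumerate(tokens[:-1]):
        if tok == "dev":
            return tokens[i + 1]
    return None


def resolves_to_loopback(route_output):
    """True if the route lookup chose the loopback device."""
    return route_device(route_output) == "lo"


def ip_route_get(addr):
    """Run `ip route get <addr>` (iproute2) and return its stdout."""
    result = subprocess.run(
        ["ip", "route", "get", addr],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

For example, resolves_to_loopback(ip_route_get("172.18.11.1")) would be expected to return True while the address is still resolved through the local table, and False once the lookup names a physical interface instead.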

IMHO, it is also a bug if loopback roce doesn't just work out of the
box, fully internally to the NIC. :\

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [linux-rdma and rdma-core]: Unable to perform rdma_connect in loopbacked configuration
       [not found]     ` <20170918205959.GC7059-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2017-09-19  1:11       ` Stephen  Bates
       [not found]         ` <AB27E7A4-FD5A-438E-A0CF-E593882F5EAE-pv7U853sEMVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Stephen  Bates @ 2017-09-19  1:11 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Logan Gunthorpe,
	Sagi Grimberg, Max Gurtovoy

>> Using a NAT approach discussed in [1] I can setup IPv4 addresses on
>> both HCAs such that I avoid a local loopback (the addresses I use
>> are a little different to the ones in that reference but the
>> approach is identical). This allows ping, iperf and other IP based
>> applications to work just fine. For example:

Thanks for the reply Jason

> The RDMA stuff ignores everything in iptables.

Can I ask why this is?

> IIRC, you need to make 'ip route get blah' not return lo. This is done
> with some combination of policy routing and sysfs tweaking.

OK I think what Parav sent me aligns to that. I will test it and see if it works.

> IMHO, it is also a bug if loopback roce doesn't just work out of the
> box, fully internally to the NIC. :\

OK. Note what I am trying to get working is a bit different. It’s not internal to one
NIC but across two NICs in the same server. Loopback inside one NIC works fine.




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [linux-rdma and rdma-core]: Unable to perform rdma_connect in loopbacked configuration
       [not found]     ` <VI1PR0502MB3008FAC205874147558EA241D1630-o1MPJYiShExKsLr+rGaxW8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  2017-09-18 19:41       ` Stephen  Bates
@ 2017-09-19  3:51       ` Leon Romanovsky
       [not found]         ` <20170919035134.GH5788-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  1 sibling, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2017-09-19  3:51 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Stephen Bates, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Logan Gunthorpe, Sagi Grimberg, Max Gurtovoy

On Mon, Sep 18, 2017 at 07:09:14PM +0000, Parav Pandit wrote:
> Hi Stephen,
>
> If your only intention is to force traffic out of the outgoing port and loop it back externally, I can suggest a different route table configuration without NAT.
> Let me know.

Hi Parav,

Countless times now (the count is already in double digits), I have asked
you in private and in public NOT to top-post on the public mailing lists,
but it seems you don't care and prefer not to respect the participants
of those lists.

What can we do to stop it? Premoderation of your posts? VP approval of
EVERY post? Anything else?

>
> Parav
>
> > -----Original Message-----
> > From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-
> > owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Stephen Bates
> > Sent: Monday, September 18, 2017 1:18 PM
> > To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > Cc: Logan Gunthorpe <logang-OTvnGxWRz7hWk0Htik3J/w@public.gmane.org>; Sagi Grimberg
> > <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>; Max Gurtovoy <maxg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> > Subject: [linux-rdma and rdma-core]: Unable to perform rdma_connect in
> > loopbacked configuration
> >

<...>

> > [1]
> > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fserverf
> > ault.com%2Fquestions%2F127636%2Fforce-local-ip-traffic-to-an-external-
> > interface&data=02%7C01%7Cparav%40mellanox.com%7Ce2f8f804df59419cc25
> > a08d4fec1b338%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C63641
> > 3555309452199&sdata=E2MQTCbSTHiUxmIAVZITBCPgKTEWBTY2%2BWVqHuSJ
> > KlY%3D&reserved=0

And while we are talking about your emails, please work with IT to
configure proper replies without link rewriting.

In the meantime, please STOP using public mailing lists until you fix these two issues.

Thanks

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [linux-rdma and rdma-core]: Unable to perform rdma_connect in loopbacked configuration
       [not found]         ` <20170919035134.GH5788-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-09-19  6:13           ` Parav Pandit
  0 siblings, 0 replies; 9+ messages in thread
From: Parav Pandit @ 2017-09-19  6:13 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Stephen Bates, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Logan Gunthorpe, Sagi Grimberg, Max Gurtovoy

Hi Leon,

> Countless times now (the count is already in double digits), I have asked you in
> private and in public NOT to top-post on the public mailing lists, but it seems
> you don't care and prefer not to respect the participants of
> those lists.
> 
> What can we do to stop it? Premoderation of your posts? VP approval of EVERY
> post? Anything else?

There was no particular point in Stephen's email that I could answer with an interspersed response.
Answering at the bottom or at the top appeared the same to me in cases where the reply is not interspersed.
It seems that top-posting, even in such a case, is a more serious concern than communicating the point.
I apologize again for the violation.
I will now review my email responses twice before sending to make sure they are not top-posted.

I understood from [1] why bottom-posting is efficient.
I didn't realize that top-posting is disrespectful to participants, as you mentioned above.
I didn't mean to disrespect the participants.

Again, I will follow bottom-posting.

[1] http://www.caliburn.nl/topposting.html


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [linux-rdma and rdma-core]: Unable to perform rdma_connect in loopbacked configuration
       [not found]         ` <AB27E7A4-FD5A-438E-A0CF-E593882F5EAE-pv7U853sEMVWk0Htik3J/w@public.gmane.org>
@ 2017-09-20 16:28           ` Jason Gunthorpe
       [not found]             ` <20170920162828.GC536-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2017-09-20 16:28 UTC (permalink / raw)
  To: Stephen Bates
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Logan Gunthorpe,
	Sagi Grimberg, Max Gurtovoy

On Tue, Sep 19, 2017 at 01:11:49AM +0000, Stephen  Bates wrote:
> >> Using a NAT approach discussed in [1] I can setup IPv4 addresses on
> >> both HCAs such that I avoid a local loopback (the addresses I use
> >> are a little different to the ones in that reference but the
> >> approach is identical). This allows ping, iperf and other IP based
> >> applications to work just fine. For example:
> 
> Thanks for the reply Jason
> 
> > The RDMA stuff ignores everything in iptables.
> 
> Can I ask why this is?

RoCE is an incomplete emulation of the netstack with hardware
offload. Everything iptables does is part of the incompleteness.

> > IMHO, it is also a bug if loopback roce doesn't just work out of the
> > box, fully internally to the NIC. :\
> 
> OK. Note what I am trying to get working is a bit different. It's
> not internal to one NIC but across two NICs in the same
> server. Loopback inside one NIC works fine.

I got that, but it makes no difference to my statement. Out of the box
the kernel should select one of the two NICs and do internal loopback.

That is consistent with the netstack semantic that the IP address is a
property of the host not the netdev.

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [linux-rdma and rdma-core]: Unable to perform rdma_connect in loopbacked configuration
       [not found]             ` <20170920162828.GC536-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2017-09-26 20:52               ` Stephen  Bates
  0 siblings, 0 replies; 9+ messages in thread
From: Stephen  Bates @ 2017-09-26 20:52 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Logan Gunthorpe,
	Sagi Grimberg, Max Gurtovoy

> 
> Thanks for the reply Jason
> 
> > > The RDMA stuff ignores everything in iptables.
> > 
> > Can I ask why this is?
>
> RoCE is an incomplete emulation of the netstack with hardware
> offload. Everything iptables does is part of the incompleteness.

OK thanks

>> > IMHO, it is also a bug if loopback roce doesn't just work out of the
>> > box, fully internally to the NIC. :\
>> 
>> OK. Note what I am trying to get working is a bit different. It's
>> not internal to one NIC but across two NICs in the same
>> server. Loopback inside one NIC works fine.
>
> I got that, but it makes no difference to my statement. Out of the box
> the kernel should select one of the two NICs and do internal loopback.
>
> That is consistent with the netstack semantic that the IP address is a
> property of the host not the netdev.

OK, I see what you are saying. I do see this working on my setup, in the sense that if I don’t use Parav’s loopback script, rdma_cm-based connections loop back internally through one of the HCAs.

Stephen




^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-09-26 20:52 UTC | newest]

Thread overview: 9+ messages
2017-09-18 18:18 [linux-rdma and rdma-core]: Unable to perform rdma_connect in loopbacked configuration Stephen  Bates
     [not found] ` <59792242-F2C8-431A-BEDA-996844EDE4C5-pv7U853sEMVWk0Htik3J/w@public.gmane.org>
2017-09-18 19:09   ` Parav Pandit
     [not found]     ` <VI1PR0502MB3008FAC205874147558EA241D1630-o1MPJYiShExKsLr+rGaxW8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2017-09-18 19:41       ` Stephen  Bates
2017-09-19  3:51       ` Leon Romanovsky
     [not found]         ` <20170919035134.GH5788-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-09-19  6:13           ` Parav Pandit
2017-09-18 20:59   ` Jason Gunthorpe
     [not found]     ` <20170918205959.GC7059-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2017-09-19  1:11       ` Stephen  Bates
     [not found]         ` <AB27E7A4-FD5A-438E-A0CF-E593882F5EAE-pv7U853sEMVWk0Htik3J/w@public.gmane.org>
2017-09-20 16:28           ` Jason Gunthorpe
     [not found]             ` <20170920162828.GC536-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2017-09-26 20:52               ` Stephen  Bates
