* rdma_rxe usage problem
From: Alexander Kalentyev @ 2022-01-19 11:42 UTC
  To: linux-rdma

I am trying to install and use Soft-RoCE for development purposes
(for now on localhost only).
I installed rdma-core on a Manjaro system from the AUR.
Then I did:

sudo modprobe rdma_rxe
sudo rdma link add rxe0 type rxe netdev wlp60s0

then ibstat shows:
CA 'rxe0'
        CA type:
        Number of ports: 1
        Firmware version:
        Hardware version:
        Node GUID: 0x4a51c5fffef6e159
        System image GUID: 0x4a51c5fffef6e159
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 2.5
                Base lid: 0
                LMC: 0
                SM lid: 0
                Capability mask: 0x00010000
                Port GUID: 0x4a51c5fffef6e159
                Link layer: Ethernet

But whenever I launch any example (ibv_rc_pingpong, for instance),
the call to ibv_modify_qp that tries to move the QP to the RTR state
fails.
The error is EINVAL.
I believe I am missing something very obvious.


* Re: rdma_rxe usage problem
From: Yanjun Zhu @ 2022-01-19 14:12 UTC
  To: Alexander Kalentyev, linux-rdma

On 2022/1/19 19:42, Alexander Kalentyev wrote:
> I am trying to install and use soft RoCE for development purposes
> (right now on a localhost).
> I installed rdma-core on a MANJARO system from AUR.
> Then I did:
> 
> sudo modprobe rdma_rxe
> sudo rdma link add rxe0 type rxe netdev wlp60s0
> 
> then ibstat shows:
> CA 'rxe0'
>          CA type:
>          Number of ports: 1
>          Firmware version:
>          Hardware version:
>          Node GUID: 0x4a51c5fffef6e159
>          System image GUID: 0x4a51c5fffef6e159
>          Port 1:
>                  State: Active
>                  Physical state: LinkUp
>                  Rate: 2.5
>                  Base lid: 0
>                  LMC: 0
>                  SM lid: 0
>                  Capability mask: 0x00010000
>                  Port GUID: 0x4a51c5fffef6e159
>                  Link layer: Ethernet

Can rping work after you configured this test environment?

Zhu Yanjun


* Re: rdma_rxe usage problem
From: Alexander Kalentyev @ 2022-01-19 17:53 UTC
  To: Yanjun Zhu; +Cc: linux-rdma

With rping everything was fine, but I had to use a real IP address.
 >rping -s -C 10 -v
server ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr
server ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrs
server ping data: rdma-ping-2: CDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrst
server ping data: rdma-ping-3: DEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu
server ping data: rdma-ping-4: EFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuv
server ping data: rdma-ping-5: FGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvw
server ping data: rdma-ping-6: GHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwx
server ping data: rdma-ping-7: HIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxy
server ping data: rdma-ping-8: IJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz
server ping data: rdma-ping-9: JKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyzA
server DISCONNECT EVENT...
wait for RDMA_READ_ADV state 10

>rping -c -a 192.168.0.176 -C 10 -v
ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr
ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrs
ping data: rdma-ping-2: CDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrst
ping data: rdma-ping-3: DEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu
ping data: rdma-ping-4: EFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuv
ping data: rdma-ping-5: FGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvw
ping data: rdma-ping-6: GHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwx
ping data: rdma-ping-7: HIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxy
ping data: rdma-ping-8: IJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz
ping data: rdma-ping-9: JKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyzA
client DISCONNECT EVENT...

ibv_rc_pingpong, however, still fails:

>ibv_rc_pingpong -d rxe0 -g 0
  local address:  LID 0x0000, QPN 0x000015, PSN 0x015dd8, GID fe80::4a51:c5ff:fef6:e159
Failed to modify QP to RTR
Couldn't connect to remote QP

>ibv_rc_pingpong -d rxe0 -g 0 192.168.0.176
  local address:  LID 0x0000, QPN 0x000016, PSN 0x007fa7, GID fe80::4a51:c5ff:fef6:e159
client read/write: No space left on device
Couldn't read/write remote address


* RE: rdma_rxe usage problem
From: Pearson, Robert B @ 2022-01-19 19:54 UTC
  To: Alexander Kalentyev, Yanjun Zhu; +Cc: linux-rdma



-----Original Message-----
From: Alexander Kalentyev <comrad.karlovich@gmail.com> 
Sent: Wednesday, January 19, 2022 11:53 AM


Anyway the ibv_rc_pingpong shows an error:

>ibv_rc_pingpong -d rxe0 -g 0
  local address:  LID 0x0000, QPN 0x000015, PSN 0x015dd8, GID fe80::4a51:c5ff:fef6:e159
Failed to modify QP to RTR
Couldn't connect to remote QP

>
Alexander,

I use a script to restart rxe after changing anything. It looks like this:

	#!/bin/bash

	export LD_LIBRARY_PATH=<path to rdma-core>/rdma-core/build/lib:/usr/lib

	sudo rmmod rdma_rxe
	sudo modprobe rdma_rxe

	sudo ip link set dev enp0s3 mtu 4500
	sudo ip addr add dev enp0s3 fe80::0a00:27ff:fe35:5ea7/64
	sudo rdma link add rxe0 type rxe netdev enp0s3

The important line is the one that adds the IPv6 link-local address corresponding to the
MAC address of the Ethernet NIC, which is

	08:00:27:35:5e:a7

Some OSes (like mine) do not create this address automatically but mangle the address.
But the rdma core driver seems to expect all RoCE providers to have it.
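
For readers following along: the address added by hand in the script above is the MAC-derived (modified EUI-64) link-local address. The rule is to flip the 0x02 bit of the first MAC octet, insert ff:fe between the third and fourth octets, and prefix the result with fe80::. The sketch below only illustrates that rule; the function name mac_to_link_local is hypothetical, and it assumes a colon-separated MAC with two hex digits per octet.

```shell
#!/bin/sh
# Hypothetical helper: derive the modified EUI-64 link-local address from a MAC.
# Assumes aa:bb:cc:dd:ee:ff notation with two hex digits per octet.
mac_to_link_local() (
    IFS=:
    set -- $1                       # split the MAC into six octets
    [ "$#" -eq 6 ] || { echo "expected a MAC like aa:bb:cc:dd:ee:ff" >&2; exit 1; }
    # Flip the universal/local bit (0x02) of the first octet.
    first=$(printf '%02x' "$(( 0x$1 ^ 0x02 ))")
    # Insert ff:fe between the third and fourth octets; print four 16-bit groups.
    printf 'fe80::%x:%x:%x:%x\n' "0x${first}${2}" "0x${3}ff" "0xfe${4}" "0x${5}${6}"
)

mac_to_link_local 08:00:27:35:5e:a7   # prints fe80::a00:27ff:fe35:5ea7
```

For the MAC used in the script this yields fe80::a00:27ff:fe35:5ea7, which is the same address as fe80::0a00:27ff:fe35:5ea7 written without the leading zero. Groups that happen to be zero are printed as 0 rather than being compressed, which is still a valid address for ip addr add.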

Hope this helps.

Bob Pearson
rpearson@hpe.com


* Re: rdma_rxe usage problem
From: Alexander Kalentyev @ 2022-01-20 11:32 UTC
  To: Pearson, Robert B; +Cc: Yanjun Zhu, linux-rdma

Dear Robert,

Thank you for your help!
What I did was:
ipv6calc --in prefix+mac fe80:: XX:XX:XX:XX:XX:XX
(where XX:XX:XX:XX:XX:XX is the MAC address of the NIC)
And then:
sudo ip addr add dev <dev> fe80::ZZZZ:ZZZZ:ZZZZ:ZZZZ/64
(where fe80::ZZZZ:ZZZZ:ZZZZ:ZZZZ is the address generated by ipv6calc)
sudo rdma link add rxe0 type rxe netdev <dev>
After that everything works as expected:
>ibv_rc_pingpong -d rxe0 -g 0 192.168.0.176
  local address:  LID 0x0000, QPN 0x000012, PSN 0xf85048, GID ZZZZ::ZZZZ:ZZZZ:ZZZZ:ZZZZ
  remote address: LID 0x0000, QPN 0x000011, PSN 0xf7c7b7, GID ZZZZ::ZZZZ:ZZZZ:ZZZZ:ZZZZ
8192000 bytes in 0.01 seconds = 4743.49 Mbit/sec
1000 iters in 0.01 seconds = 13.82 usec/iter

So I think it makes sense to include this information in a README on
GitHub. Otherwise somebody like me can spend a week trying to figure
out what is going on!

Thanks once again!
Regards,
Alexander Kalentev


* Re: rdma_rxe usage problem
From: Leon Romanovsky @ 2022-01-20 12:23 UTC
  To: Alexander Kalentyev; +Cc: Pearson, Robert B, Yanjun Zhu, linux-rdma

On Thu, Jan 20, 2022 at 12:32:12PM +0100, Alexander Kalentyev wrote:
> So, I think it make sense to include this information in a README on
> github. Otherwise somebody like me can spend a week trying to figure
> out what is going on!

A PR is more than welcome.

Thanks


* Re: rdma_rxe usage problem
From: Yanjun Zhu @ 2022-01-20 13:21 UTC
  To: Alexander Kalentyev; +Cc: linux-rdma

If I remember correctly, you can use these commands to run the test:

server:

ibv_rc_pingpong -d rxe0 -g 1

Client:

ibv_rc_pingpong -d rxe0 -g 1 server_ip_addr
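
The -g option selects an index into the port's GID table, so before choosing between 0 and 1 it can help to see which index holds which address. A minimal sketch, assuming the usual sysfs layout /sys/class/infiniband/<device>/ports/<port>/gids/<index>; the function name list_gids is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print "index<TAB>gid" for every populated GID table slot.
# $1 is a gids directory, e.g. /sys/class/infiniband/rxe0/ports/1/gids
list_gids() {
    for f in "$1"/*; do
        [ -r "$f" ] || continue
        gid=$(cat "$f" 2>/dev/null) || continue   # unreadable slots are skipped
        # All-zero entries are empty slots.
        [ "$gid" = "0000:0000:0000:0000:0000:0000:0000:0000" ] && continue
        printf '%s\t%s\n' "${f##*/}" "$gid"
    done
}

list_gids /sys/class/infiniband/rxe0/ports/1/gids
```

On a machine without an rxe0 device the call simply prints nothing. Entries are listed in glob order, so index 10 sorts before index 2.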

Zhu Yanjun
