All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1] RDMA/core: Handle ARPHRD_NONE devices for iWARP
@ 2023-06-07 19:43 Chuck Lever
  2023-06-08 15:30 ` Chuck Lever III
  2023-06-09 20:49 ` Tom Talpey
  0 siblings, 2 replies; 13+ messages in thread
From: Chuck Lever @ 2023-06-07 19:43 UTC (permalink / raw)
  To: jgg, BMT; +Cc: Chuck Lever, tom, linux-rdma

From: Chuck Lever <chuck.lever@oracle.com>

We would like to enable the use of siw on top of a VPN that is
constructed and managed via a tun device. That hasn't worked up
until now because ARPHRD_NONE devices (such as tun devices) have
no GID for the RDMA/core to look up.

But it turns out that the egress device has already been picked for
us. addr_handler() just has to do the right thing with it.

Tested with siw and qedr, both initiator and target.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 drivers/infiniband/core/cma.c |    3 +++
 1 file changed, 3 insertions(+)

This of course needs broader testing, but it seems to work, and it's
a little nicer than "if (dev_type == ARPHRD_NONE)".

One thing I discovered is that the NFS/RDMA server implementation
does not deal at all with more than one RDMA device on the system.
I will address that with an ib_client; SunRPC patches forthcoming.

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 56e568fcd32b..c9a2bdb49e3c 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -694,6 +694,9 @@ cma_validate_port(struct ib_device *device, u32 port,
 	if (!rdma_dev_access_netns(device, id_priv->id.route.addr.dev_addr.net))
 		return ERR_PTR(-ENODEV);
 
+	if (rdma_protocol_iwarp(device, port))
+		return rdma_get_gid_attr(device, port, 0);
+
 	if ((dev_type == ARPHRD_INFINIBAND) && !rdma_protocol_ib(device, port))
 		return ERR_PTR(-ENODEV);
 



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] RDMA/core: Handle ARPHRD_NONE devices for iWARP
  2023-06-07 19:43 [PATCH v1] RDMA/core: Handle ARPHRD_NONE devices for iWARP Chuck Lever
@ 2023-06-08 15:30 ` Chuck Lever III
  2023-06-09 20:49 ` Tom Talpey
  1 sibling, 0 replies; 13+ messages in thread
From: Chuck Lever III @ 2023-06-08 15:30 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Bernard Metzler, Tom Talpey, linux-rdma, Chuck Lever



> On Jun 7, 2023, at 3:43 PM, Chuck Lever <cel@kernel.org> wrote:
> 
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> We would like to enable the use of siw on top of a VPN that is
> constructed and managed via a tun device. That hasn't worked up
> until now because ARPHRD_NONE devices (such as tun devices) have
> no GID for the RDMA/core to look up.
> 
> But it turns out that the egress device has already been picked for
> us. addr_handler() just has to do the right thing with it.
> 
> Tested with siw and qedr, both initiator and target.
> 
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> drivers/infiniband/core/cma.c |    3 +++
> 1 file changed, 3 insertions(+)
> 
> This of course needs broader testing, but it seems to work, and it's
> a little nicer than "if (dev_type == ARPHRD_NONE)".
> 
> One thing I discovered is that the NFS/RDMA server implementation
> does not deal at all with more than one RDMA device on the system.
> I will address that with an ib_client; SunRPC patches forthcoming.

Or maybe not.

I'm looking at cma_iw_acquire_dev(). Where is the
listen_id_priv->id.port_num field supposed to be initialized?


> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 56e568fcd32b..c9a2bdb49e3c 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -694,6 +694,9 @@ cma_validate_port(struct ib_device *device, u32 port,
> if (!rdma_dev_access_netns(device, id_priv->id.route.addr.dev_addr.net))
> return ERR_PTR(-ENODEV);
> 
> + if (rdma_protocol_iwarp(device, port))
> + return rdma_get_gid_attr(device, port, 0);
> +
> if ((dev_type == ARPHRD_INFINIBAND) && !rdma_protocol_ib(device, port))
> return ERR_PTR(-ENODEV);
> 
> 
> 

--
Chuck Lever



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] RDMA/core: Handle ARPHRD_NONE devices for iWARP
  2023-06-07 19:43 [PATCH v1] RDMA/core: Handle ARPHRD_NONE devices for iWARP Chuck Lever
  2023-06-08 15:30 ` Chuck Lever III
@ 2023-06-09 20:49 ` Tom Talpey
  2023-06-10  0:04   ` Zhu Yanjun
  2023-06-10 12:05   ` Jason Gunthorpe
  1 sibling, 2 replies; 13+ messages in thread
From: Tom Talpey @ 2023-06-09 20:49 UTC (permalink / raw)
  To: Chuck Lever, jgg, BMT; +Cc: Chuck Lever, linux-rdma

On 6/7/2023 3:43 PM, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> We would like to enable the use of siw on top of a VPN that is
> constructed and managed via a tun device. That hasn't worked up
> until now because ARPHRD_NONE devices (such as tun devices) have
> no GID for the RDMA/core to look up.
> 
> But it turns out that the egress device has already been picked for
> us. addr_handler() just has to do the right thing with it.
> 
> Tested with siw and qedr, both initiator and target.
> 
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>   drivers/infiniband/core/cma.c |    3 +++
>   1 file changed, 3 insertions(+)
> 
> This of course needs broader testing, but it seems to work, and it's
> a little nicer than "if (dev_type == ARPHRD_NONE)".
> 
> One thing I discovered is that the NFS/RDMA server implementation
> does not deal at all with more than one RDMA device on the system.
> I will address that with an ib_client; SunRPC patches forthcoming.
> 
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 56e568fcd32b..c9a2bdb49e3c 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -694,6 +694,9 @@ cma_validate_port(struct ib_device *device, u32 port,
>   	if (!rdma_dev_access_netns(device, id_priv->id.route.addr.dev_addr.net))
>   		return ERR_PTR(-ENODEV);
>   
> +	if (rdma_protocol_iwarp(device, port))
> +		return rdma_get_gid_attr(device, port, 0);

This might work, but I'm skeptical of the conditional. There's nothing
special about iWARP that says a GID should come from the outgoing device
mac. And, other protocols without IB GID addressing won't benefit.

Wouldn't it be better to detect a missing GID, or infer the need from
some other transport attribute?

Tom.

> +
>   	if ((dev_type == ARPHRD_INFINIBAND) && !rdma_protocol_ib(device, port))
>   		return ERR_PTR(-ENODEV);
>   
> 
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] RDMA/core: Handle ARPHRD_NONE devices for iWARP
  2023-06-09 20:49 ` Tom Talpey
@ 2023-06-10  0:04   ` Zhu Yanjun
  2023-06-10 16:43     ` Chuck Lever III
  2023-06-10 12:05   ` Jason Gunthorpe
  1 sibling, 1 reply; 13+ messages in thread
From: Zhu Yanjun @ 2023-06-10  0:04 UTC (permalink / raw)
  To: Tom Talpey, Chuck Lever, jgg, BMT; +Cc: Chuck Lever, linux-rdma

在 2023/6/10 4:49, Tom Talpey 写道:
> On 6/7/2023 3:43 PM, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> We would like to enable the use of siw on top of a VPN that is
>> constructed and managed via a tun device. That hasn't worked up
>> until now because ARPHRD_NONE devices (such as tun devices) have
>> no GID for the RDMA/core to look up.
>>
>> But it turns out that the egress device has already been picked for
>> us. addr_handler() just has to do the right thing with it.
>>
>> Tested with siw and qedr, both initiator and target.
>>
>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>>   drivers/infiniband/core/cma.c |    3 +++
>>   1 file changed, 3 insertions(+)
>>
>> This of course needs broader testing, but it seems to work, and it's
>> a little nicer than "if (dev_type == ARPHRD_NONE)".
>>
>> One thing I discovered is that the NFS/RDMA server implementation
>> does not deal at all with more than one RDMA device on the system.
>> I will address that with an ib_client; SunRPC patches forthcoming.
>>
>> diff --git a/drivers/infiniband/core/cma.c 
>> b/drivers/infiniband/core/cma.c
>> index 56e568fcd32b..c9a2bdb49e3c 100644
>> --- a/drivers/infiniband/core/cma.c
>> +++ b/drivers/infiniband/core/cma.c
>> @@ -694,6 +694,9 @@ cma_validate_port(struct ib_device *device, u32 port,
>>       if (!rdma_dev_access_netns(device, 
>> id_priv->id.route.addr.dev_addr.net))
>>           return ERR_PTR(-ENODEV);
>> +    if (rdma_protocol_iwarp(device, port))
>> +        return rdma_get_gid_attr(device, port, 0);
> 
> This might work, but I'm skeptical of the conditional. There's nothing
> special about iWARP that says a GID should come from the outgoing device
> mac. And, other protocols without IB GID addressing won't benefit.

Agree with you. Other protocols, such as RXE, also need be handled.
So a better solution than this needs.

Zhu Yanjun

> 
> Wouldn't it be better to detect a missing GID, or infer the need from
> some other transport attribute?
> 
> Tom.
> 
>> +
>>       if ((dev_type == ARPHRD_INFINIBAND) && !rdma_protocol_ib(device, 
>> port))
>>           return ERR_PTR(-ENODEV);
>>
>>
>>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] RDMA/core: Handle ARPHRD_NONE devices for iWARP
  2023-06-09 20:49 ` Tom Talpey
  2023-06-10  0:04   ` Zhu Yanjun
@ 2023-06-10 12:05   ` Jason Gunthorpe
  2023-06-10 16:38     ` Tom Talpey
                       ` (2 more replies)
  1 sibling, 3 replies; 13+ messages in thread
From: Jason Gunthorpe @ 2023-06-10 12:05 UTC (permalink / raw)
  To: Tom Talpey; +Cc: Chuck Lever, BMT, Chuck Lever, linux-rdma

On Fri, Jun 09, 2023 at 04:49:49PM -0400, Tom Talpey wrote:
> On 6/7/2023 3:43 PM, Chuck Lever wrote:
> > From: Chuck Lever <chuck.lever@oracle.com>
> > 
> > We would like to enable the use of siw on top of a VPN that is
> > constructed and managed via a tun device. That hasn't worked up
> > until now because ARPHRD_NONE devices (such as tun devices) have
> > no GID for the RDMA/core to look up.
> > 
> > But it turns out that the egress device has already been picked for
> > us. addr_handler() just has to do the right thing with it.
> > 
> > Tested with siw and qedr, both initiator and target.
> > 
> > Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > ---
> >   drivers/infiniband/core/cma.c |    3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > This of course needs broader testing, but it seems to work, and it's
> > a little nicer than "if (dev_type == ARPHRD_NONE)".
> > 
> > One thing I discovered is that the NFS/RDMA server implementation
> > does not deal at all with more than one RDMA device on the system.
> > I will address that with an ib_client; SunRPC patches forthcoming.
> > 
> > diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> > index 56e568fcd32b..c9a2bdb49e3c 100644
> > --- a/drivers/infiniband/core/cma.c
> > +++ b/drivers/infiniband/core/cma.c
> > @@ -694,6 +694,9 @@ cma_validate_port(struct ib_device *device, u32 port,
> >   	if (!rdma_dev_access_netns(device, id_priv->id.route.addr.dev_addr.net))
> >   		return ERR_PTR(-ENODEV);
> > +	if (rdma_protocol_iwarp(device, port))
> > +		return rdma_get_gid_attr(device, port, 0);
> 
> This might work, but I'm skeptical of the conditional. There's nothing
> special about iWARP that says a GID should come from the outgoing device
> mac. And, other protocols without IB GID addressing won't benefit.

The GID represents the source address, so it better come from the
outgoing device or something is really wrong.

iWARP gets a conditional because iwarp always has a single GID, other
devices do not work that way.

Jason

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] RDMA/core: Handle ARPHRD_NONE devices for iWARP
  2023-06-10 12:05   ` Jason Gunthorpe
@ 2023-06-10 16:38     ` Tom Talpey
  2023-06-10 17:11       ` Chuck Lever III
  2023-06-11  1:08       ` Jason Gunthorpe
  2023-06-11  0:33     ` Zhu Yanjun
  2023-06-12  0:19     ` Chuck Lever III
  2 siblings, 2 replies; 13+ messages in thread
From: Tom Talpey @ 2023-06-10 16:38 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Chuck Lever, BMT, Chuck Lever, linux-rdma

On 6/10/2023 8:05 AM, Jason Gunthorpe wrote:
> On Fri, Jun 09, 2023 at 04:49:49PM -0400, Tom Talpey wrote:
>> On 6/7/2023 3:43 PM, Chuck Lever wrote:
>>> From: Chuck Lever <chuck.lever@oracle.com>
>>>
>>> We would like to enable the use of siw on top of a VPN that is
>>> constructed and managed via a tun device. That hasn't worked up
>>> until now because ARPHRD_NONE devices (such as tun devices) have
>>> no GID for the RDMA/core to look up.
>>>
>>> But it turns out that the egress device has already been picked for
>>> us. addr_handler() just has to do the right thing with it.
>>>
>>> Tested with siw and qedr, both initiator and target.
>>>
>>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>> ---
>>>    drivers/infiniband/core/cma.c |    3 +++
>>>    1 file changed, 3 insertions(+)
>>>
>>> This of course needs broader testing, but it seems to work, and it's
>>> a little nicer than "if (dev_type == ARPHRD_NONE)".
>>>
>>> One thing I discovered is that the NFS/RDMA server implementation
>>> does not deal at all with more than one RDMA device on the system.
>>> I will address that with an ib_client; SunRPC patches forthcoming.
>>>
>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>>> index 56e568fcd32b..c9a2bdb49e3c 100644
>>> --- a/drivers/infiniband/core/cma.c
>>> +++ b/drivers/infiniband/core/cma.c
>>> @@ -694,6 +694,9 @@ cma_validate_port(struct ib_device *device, u32 port,
>>>    	if (!rdma_dev_access_netns(device, id_priv->id.route.addr.dev_addr.net))
>>>    		return ERR_PTR(-ENODEV);
>>> +	if (rdma_protocol_iwarp(device, port))
>>> +		return rdma_get_gid_attr(device, port, 0);
>>
>> This might work, but I'm skeptical of the conditional. There's nothing
>> special about iWARP that says a GID should come from the outgoing device
>> mac. And, other protocols without IB GID addressing won't benefit.
> 
> The GID represents the source address, so it better come from the
> outgoing device or something is really wrong.
> 
> iWARP gets a conditional because iwarp always has a single GID, other
> devices do not work that way.

Not sure I follow. TCP is routable so it can use multiple egress ports.
That same routing means an incoming packet bearing an appropriate local
address will be accepted on any port.

So still, I don't think this an iWARP property per se. It's coming from
the transport and its addressing. I'd hope that this can be gleaned from
something other than "it's iWARP, cma needs to do ...".

Tom.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] RDMA/core: Handle ARPHRD_NONE devices for iWARP
  2023-06-10  0:04   ` Zhu Yanjun
@ 2023-06-10 16:43     ` Chuck Lever III
  2023-06-11  0:27       ` Zhu Yanjun
  0 siblings, 1 reply; 13+ messages in thread
From: Chuck Lever III @ 2023-06-10 16:43 UTC (permalink / raw)
  To: Zhu Yanjun
  Cc: Tom Talpey, Chuck Lever, Jason Gunthorpe, Bernard Metzler, linux-rdma



> On Jun 9, 2023, at 8:04 PM, Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
> 
> 在 2023/6/10 4:49, Tom Talpey 写道:
>> On 6/7/2023 3:43 PM, Chuck Lever wrote:
>>> From: Chuck Lever <chuck.lever@oracle.com>
>>> 
>>> We would like to enable the use of siw on top of a VPN that is
>>> constructed and managed via a tun device. That hasn't worked up
>>> until now because ARPHRD_NONE devices (such as tun devices) have
>>> no GID for the RDMA/core to look up.
>>> 
>>> But it turns out that the egress device has already been picked for
>>> us. addr_handler() just has to do the right thing with it.
>>> 
>>> Tested with siw and qedr, both initiator and target.
>>> 
>>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>> ---
>>>   drivers/infiniband/core/cma.c |    3 +++
>>>   1 file changed, 3 insertions(+)
>>> 
>>> This of course needs broader testing, but it seems to work, and it's
>>> a little nicer than "if (dev_type == ARPHRD_NONE)".
>>> 
>>> One thing I discovered is that the NFS/RDMA server implementation
>>> does not deal at all with more than one RDMA device on the system.
>>> I will address that with an ib_client; SunRPC patches forthcoming.
>>> 
>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>>> index 56e568fcd32b..c9a2bdb49e3c 100644
>>> --- a/drivers/infiniband/core/cma.c
>>> +++ b/drivers/infiniband/core/cma.c
>>> @@ -694,6 +694,9 @@ cma_validate_port(struct ib_device *device, u32 port,
>>>       if (!rdma_dev_access_netns(device, id_priv->id.route.addr.dev_addr.net))
>>>           return ERR_PTR(-ENODEV);
>>> +    if (rdma_protocol_iwarp(device, port))
>>> +        return rdma_get_gid_attr(device, port, 0);
>> This might work, but I'm skeptical of the conditional. There's nothing
>> special about iWARP that says a GID should come from the outgoing device
>> mac. And, other protocols without IB GID addressing won't benefit.

Hi Zhu, thanks for having a look.


> Agree with you. Other protocols, such as RXE, also need be handled.
> So a better solution than this needs.

I assume you mean, in particular, RoCEv2 with rxe on a non-Ethernet
device? Tom and I discussed that possibility privately while I
developed this patch.

I do not object to that idea. But I would feel more comfortable if
that was justified and tested via a separate patch. I expect that
the logic for those cases will look different to what I'm adding here.


> Zhu Yanjun
> 
>> Wouldn't it be better to detect a missing GID, or infer the need from
>> some other transport attribute?
>> Tom.
>>> +
>>>       if ((dev_type == ARPHRD_INFINIBAND) && !rdma_protocol_ib(device, port))
>>>           return ERR_PTR(-ENODEV);


--
Chuck Lever



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] RDMA/core: Handle ARPHRD_NONE devices for iWARP
  2023-06-10 16:38     ` Tom Talpey
@ 2023-06-10 17:11       ` Chuck Lever III
  2023-06-11  1:08       ` Jason Gunthorpe
  1 sibling, 0 replies; 13+ messages in thread
From: Chuck Lever III @ 2023-06-10 17:11 UTC (permalink / raw)
  To: Tom Talpey; +Cc: Jason Gunthorpe, Chuck Lever, BMT, linux-rdma



> On Jun 10, 2023, at 12:38 PM, Tom Talpey <tom@talpey.com> wrote:
> 
> On 6/10/2023 8:05 AM, Jason Gunthorpe wrote:
>> On Fri, Jun 09, 2023 at 04:49:49PM -0400, Tom Talpey wrote:
>>> On 6/7/2023 3:43 PM, Chuck Lever wrote:
>>>> From: Chuck Lever <chuck.lever@oracle.com>
>>>> 
>>>> We would like to enable the use of siw on top of a VPN that is
>>>> constructed and managed via a tun device. That hasn't worked up
>>>> until now because ARPHRD_NONE devices (such as tun devices) have
>>>> no GID for the RDMA/core to look up.
>>>> 
>>>> But it turns out that the egress device has already been picked for
>>>> us. addr_handler() just has to do the right thing with it.
>>>> 
>>>> Tested with siw and qedr, both initiator and target.
>>>> 
>>>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>>> ---
>>>>   drivers/infiniband/core/cma.c |    3 +++
>>>>   1 file changed, 3 insertions(+)
>>>> 
>>>> This of course needs broader testing, but it seems to work, and it's
>>>> a little nicer than "if (dev_type == ARPHRD_NONE)".
>>>> 
>>>> One thing I discovered is that the NFS/RDMA server implementation
>>>> does not deal at all with more than one RDMA device on the system.
>>>> I will address that with an ib_client; SunRPC patches forthcoming.
>>>> 
>>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>>>> index 56e568fcd32b..c9a2bdb49e3c 100644
>>>> --- a/drivers/infiniband/core/cma.c
>>>> +++ b/drivers/infiniband/core/cma.c
>>>> @@ -694,6 +694,9 @@ cma_validate_port(struct ib_device *device, u32 port,
>>>>    if (!rdma_dev_access_netns(device, id_priv->id.route.addr.dev_addr.net))
>>>>    return ERR_PTR(-ENODEV);
>>>> + if (rdma_protocol_iwarp(device, port))
>>>> + return rdma_get_gid_attr(device, port, 0);
>>> 
>>> This might work, but I'm skeptical of the conditional. There's nothing
>>> special about iWARP that says a GID should come from the outgoing device
>>> mac. And, other protocols without IB GID addressing won't benefit.
>> The GID represents the source address, so it better come from the
>> outgoing device or something is really wrong.
>> iWARP gets a conditional because iwarp always has a single GID, other
>> devices do not work that way.
> 
> Not sure I follow. TCP is routable so it can use multiple egress ports.
> That same routing means an incoming packet bearing an appropriate local
> address will be accepted on any port.

An siw virtual device, for example, is paired with one particular network
i/f. I'm not sure that "multiple egress ports" is applicable. I would
think this is also the case with a device-offloaded iWARP implementation:
the egress device is always the i/f on the offload card.


> So still, I don't think this an iWARP property per se.

Agreed, it's not particular to the iWARP protocol. But it does seem to
be particular to the entire class of iWARP protocol implementations on
Linux.

Adding a comment about this distinction would certainly be apropos.


> It's coming from
> the transport and its addressing. I'd hope that this can be gleaned from
> something other than "it's iWARP, cma needs to do ...".

I could add another flag somewhere to indicate this property of a
device. I expect only Linux iWARP devices would set it, though, so
would it really add much value?

Do you have a recommendation for where to home such a flag? Would it
be a fixed device property, or something that would be set dynamically
during earlier steps of address resolution? Again, it feels like
something that would be set iff rdma_protocol_iwarp() is true...

--
Chuck Lever



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] RDMA/core: Handle ARPHRD_NONE devices for iWARP
  2023-06-10 16:43     ` Chuck Lever III
@ 2023-06-11  0:27       ` Zhu Yanjun
  0 siblings, 0 replies; 13+ messages in thread
From: Zhu Yanjun @ 2023-06-11  0:27 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: Tom Talpey, Chuck Lever, Jason Gunthorpe, Bernard Metzler, linux-rdma


在 2023/6/11 0:43, Chuck Lever III 写道:
>
>> On Jun 9, 2023, at 8:04 PM, Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
>>
>> 在 2023/6/10 4:49, Tom Talpey 写道:
>>> On 6/7/2023 3:43 PM, Chuck Lever wrote:
>>>> From: Chuck Lever <chuck.lever@oracle.com>
>>>>
>>>> We would like to enable the use of siw on top of a VPN that is
>>>> constructed and managed via a tun device. That hasn't worked up
>>>> until now because ARPHRD_NONE devices (such as tun devices) have
>>>> no GID for the RDMA/core to look up.
>>>>
>>>> But it turns out that the egress device has already been picked for
>>>> us. addr_handler() just has to do the right thing with it.
>>>>
>>>> Tested with siw and qedr, both initiator and target.
>>>>
>>>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>>> ---
>>>>    drivers/infiniband/core/cma.c |    3 +++
>>>>    1 file changed, 3 insertions(+)
>>>>
>>>> This of course needs broader testing, but it seems to work, and it's
>>>> a little nicer than "if (dev_type == ARPHRD_NONE)".
>>>>
>>>> One thing I discovered is that the NFS/RDMA server implementation
>>>> does not deal at all with more than one RDMA device on the system.
>>>> I will address that with an ib_client; SunRPC patches forthcoming.
>>>>
>>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>>>> index 56e568fcd32b..c9a2bdb49e3c 100644
>>>> --- a/drivers/infiniband/core/cma.c
>>>> +++ b/drivers/infiniband/core/cma.c
>>>> @@ -694,6 +694,9 @@ cma_validate_port(struct ib_device *device, u32 port,
>>>>        if (!rdma_dev_access_netns(device, id_priv->id.route.addr.dev_addr.net))
>>>>            return ERR_PTR(-ENODEV);
>>>> +    if (rdma_protocol_iwarp(device, port))
>>>> +        return rdma_get_gid_attr(device, port, 0);
>>> This might work, but I'm skeptical of the conditional. There's nothing
>>> special about iWARP that says a GID should come from the outgoing device
>>> mac. And, other protocols without IB GID addressing won't benefit.
> Hi Zhu, thanks for having a look.
>
>
>> Agree with you. Other protocols, such as RXE, also need be handled.
>> So a better solution than this needs.
> I assume you mean, in particular, RoCEv2 with rxe on a non-Ethernet
> device? Tom and I discussed that possibility privately while I
> developed this patch.

I am not sure if I can get you correctly or not. I mean, if RXE is 
attached to a loopback device,

the symptopms are similar to iWARP.

So if you want to fix this problem on iWARP, it is better to also fix 
the similar problem on RXE.

That is , your patch should fix the similar problems on all the rdma 
devices, not just for iWARP.

I am not sure if this request is appropriate to you or not.

Zhu Yanjun

>
> I do not object to that idea. But I would feel more comfortable if
> that was justified and tested via a separate patch. I expect that
> the logic for those cases will look different to what I'm adding here.
>
>
>> Zhu Yanjun
>>
>>> Wouldn't it be better to detect a missing GID, or infer the need from
>>> some other transport attribute?
>>> Tom.
>>>> +
>>>>        if ((dev_type == ARPHRD_INFINIBAND) && !rdma_protocol_ib(device, port))
>>>>            return ERR_PTR(-ENODEV);
>
> --
> Chuck Lever
>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] RDMA/core: Handle ARPHRD_NONE devices for iWARP
  2023-06-10 12:05   ` Jason Gunthorpe
  2023-06-10 16:38     ` Tom Talpey
@ 2023-06-11  0:33     ` Zhu Yanjun
  2023-06-12  0:19     ` Chuck Lever III
  2 siblings, 0 replies; 13+ messages in thread
From: Zhu Yanjun @ 2023-06-11  0:33 UTC (permalink / raw)
  To: Jason Gunthorpe, Tom Talpey; +Cc: Chuck Lever, BMT, Chuck Lever, linux-rdma

在 2023/6/10 20:05, Jason Gunthorpe 写道:
> On Fri, Jun 09, 2023 at 04:49:49PM -0400, Tom Talpey wrote:
>> On 6/7/2023 3:43 PM, Chuck Lever wrote:
>>> From: Chuck Lever <chuck.lever@oracle.com>
>>>
>>> We would like to enable the use of siw on top of a VPN that is
>>> constructed and managed via a tun device. That hasn't worked up
>>> until now because ARPHRD_NONE devices (such as tun devices) have
>>> no GID for the RDMA/core to look up.
>>>
>>> But it turns out that the egress device has already been picked for
>>> us. addr_handler() just has to do the right thing with it.
>>>
>>> Tested with siw and qedr, both initiator and target.
>>>
>>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>> ---
>>>    drivers/infiniband/core/cma.c |    3 +++
>>>    1 file changed, 3 insertions(+)
>>>
>>> This of course needs broader testing, but it seems to work, and it's
>>> a little nicer than "if (dev_type == ARPHRD_NONE)".
>>>
>>> One thing I discovered is that the NFS/RDMA server implementation
>>> does not deal at all with more than one RDMA device on the system.
>>> I will address that with an ib_client; SunRPC patches forthcoming.
>>>
>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>>> index 56e568fcd32b..c9a2bdb49e3c 100644
>>> --- a/drivers/infiniband/core/cma.c
>>> +++ b/drivers/infiniband/core/cma.c
>>> @@ -694,6 +694,9 @@ cma_validate_port(struct ib_device *device, u32 port,
>>>    	if (!rdma_dev_access_netns(device, id_priv->id.route.addr.dev_addr.net))
>>>    		return ERR_PTR(-ENODEV);
>>> +	if (rdma_protocol_iwarp(device, port))
>>> +		return rdma_get_gid_attr(device, port, 0);
>>
>> This might work, but I'm skeptical of the conditional. There's nothing
>> special about iWARP that says a GID should come from the outgoing device
>> mac. And, other protocols without IB GID addressing won't benefit.
> 
> The GID represents the source address, so it better come from the
> outgoing device or something is really wrong.
> 
> iWARP gets a conditional because iwarp always has a single GID, other
> devices do not work that way.


Thanks. If I get this problem correctly, this problem is about "GID can 
not be generated correctly if a iWARP device is attached to a loopback 
device". This patch will fix this problem. So iWARP device can be 
managed via this unique GID.

Do you mean that RXE does not have the similar problem?
But if a RXE device is attached to a loopback device, the similar 
problem occurs.

Zhu Yanjun

> 
> Jason


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] RDMA/core: Handle ARPHRD_NONE devices for iWARP
  2023-06-10 16:38     ` Tom Talpey
  2023-06-10 17:11       ` Chuck Lever III
@ 2023-06-11  1:08       ` Jason Gunthorpe
  1 sibling, 0 replies; 13+ messages in thread
From: Jason Gunthorpe @ 2023-06-11  1:08 UTC (permalink / raw)
  To: Tom Talpey; +Cc: Chuck Lever, BMT, Chuck Lever, linux-rdma

On Sat, Jun 10, 2023 at 12:38:53PM -0400, Tom Talpey wrote:
> On 6/10/2023 8:05 AM, Jason Gunthorpe wrote:
> > On Fri, Jun 09, 2023 at 04:49:49PM -0400, Tom Talpey wrote:
> > > On 6/7/2023 3:43 PM, Chuck Lever wrote:
> > > > From: Chuck Lever <chuck.lever@oracle.com>
> > > > 
> > > > We would like to enable the use of siw on top of a VPN that is
> > > > constructed and managed via a tun device. That hasn't worked up
> > > > until now because ARPHRD_NONE devices (such as tun devices) have
> > > > no GID for the RDMA/core to look up.
> > > > 
> > > > But it turns out that the egress device has already been picked for
> > > > us. addr_handler() just has to do the right thing with it.
> > > > 
> > > > Tested with siw and qedr, both initiator and target.
> > > > 
> > > > Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> > > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > > > ---
> > > >    drivers/infiniband/core/cma.c |    3 +++
> > > >    1 file changed, 3 insertions(+)
> > > > 
> > > > This of course needs broader testing, but it seems to work, and it's
> > > > a little nicer than "if (dev_type == ARPHRD_NONE)".
> > > > 
> > > > One thing I discovered is that the NFS/RDMA server implementation
> > > > does not deal at all with more than one RDMA device on the system.
> > > > I will address that with an ib_client; SunRPC patches forthcoming.
> > > > 
> > > > diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> > > > index 56e568fcd32b..c9a2bdb49e3c 100644
> > > > --- a/drivers/infiniband/core/cma.c
> > > > +++ b/drivers/infiniband/core/cma.c
> > > > @@ -694,6 +694,9 @@ cma_validate_port(struct ib_device *device, u32 port,
> > > >    	if (!rdma_dev_access_netns(device, id_priv->id.route.addr.dev_addr.net))
> > > >    		return ERR_PTR(-ENODEV);
> > > > +	if (rdma_protocol_iwarp(device, port))
> > > > +		return rdma_get_gid_attr(device, port, 0);
> > > 
> > > This might work, but I'm skeptical of the conditional. There's nothing
> > > special about iWARP that says a GID should come from the outgoing device
> > > mac. And, other protocols without IB GID addressing won't benefit.
> > 
> > The GID represents the source address, so it better come from the
> > outgoing device or something is really wrong.
> > 
> > iWARP gets a conditional because iwarp always has a single GID, other
> > devices do not work that way.
> 
> Not sure I follow. TCP is routable so it can use multiple egress ports.
> That same routing means an incoming packet bearing an appropriate local
> address will be accepted on any port.

sgid is about the outgoing device (and usually IP, but not for iwarp),
and RDMA is always bound to a single outgoing netdevice

> So still, I don't think this an iWARP property per se. It's coming from
> the transport and its addressing. I'd hope that this can be gleaned from
> something other than "it's iWARP, cma needs to do ...".

Well, we are supposed to match the sgid based  on the outgoing route
selected, but the iwarp drivers don't setup their sgid's properly..

You are right it should be fixed better, but I'm not sure anyone is
left familiar enough with all the iwarp drivers to do the change.

Jason

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] RDMA/core: Handle ARPHRD_NONE devices for iWARP
  2023-06-10 12:05   ` Jason Gunthorpe
  2023-06-10 16:38     ` Tom Talpey
  2023-06-11  0:33     ` Zhu Yanjun
@ 2023-06-12  0:19     ` Chuck Lever III
  2023-06-12  2:33       ` Zhu Yanjun
  2 siblings, 1 reply; 13+ messages in thread
From: Chuck Lever III @ 2023-06-12  0:19 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Tom Talpey, Chuck Lever, Bernard Metzler, linux-rdma



> On Jun 10, 2023, at 8:05 AM, Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> On Fri, Jun 09, 2023 at 04:49:49PM -0400, Tom Talpey wrote:
>> On 6/7/2023 3:43 PM, Chuck Lever wrote:
>>> From: Chuck Lever <chuck.lever@oracle.com>
>>> 
>>> We would like to enable the use of siw on top of a VPN that is
>>> constructed and managed via a tun device. That hasn't worked up
>>> until now because ARPHRD_NONE devices (such as tun devices) have
>>> no GID for the RDMA/core to look up.
>>> 
>>> But it turns out that the egress device has already been picked for
>>> us. addr_handler() just has to do the right thing with it.
>>> 
>>> Tested with siw and qedr, both initiator and target.
>>> 
>>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>> ---
>>>  drivers/infiniband/core/cma.c |    3 +++
>>>  1 file changed, 3 insertions(+)
>>> 
>>> This of course needs broader testing, but it seems to work, and it's
>>> a little nicer than "if (dev_type == ARPHRD_NONE)".
>>> 
>>> One thing I discovered is that the NFS/RDMA server implementation
>>> does not deal at all with more than one RDMA device on the system.
>>> I will address that with an ib_client; SunRPC patches forthcoming.
>>> 
>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>>> index 56e568fcd32b..c9a2bdb49e3c 100644
>>> --- a/drivers/infiniband/core/cma.c
>>> +++ b/drivers/infiniband/core/cma.c
>>> @@ -694,6 +694,9 @@ cma_validate_port(struct ib_device *device, u32 port,
>>>   if (!rdma_dev_access_netns(device, id_priv->id.route.addr.dev_addr.net))
>>>   return ERR_PTR(-ENODEV);
>>> + 	if (rdma_protocol_iwarp(device, port))
>>> + 		return rdma_get_gid_attr(device, port, 0);
>> 
>> This might work, but I'm skeptical of the conditional. There's nothing
>> special about iWARP that says a GID should come from the outgoing device
>> mac. And, other protocols without IB GID addressing won't benefit.
> 
> The GID represents the source address, so it better come from the
> outgoing device or something is really wrong.
> 
> iWARP gets a conditional because iwarp always has a single GID, other
> devices do not work that way.

If rdma_get_gid_attr() works for rxe, perhaps we could change
the conditional to match on only software-emulated devices.


--
Chuck Lever



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] RDMA/core: Handle ARPHRD_NONE devices for iWARP
  2023-06-12  0:19     ` Chuck Lever III
@ 2023-06-12  2:33       ` Zhu Yanjun
  0 siblings, 0 replies; 13+ messages in thread
From: Zhu Yanjun @ 2023-06-12  2:33 UTC (permalink / raw)
  To: Chuck Lever III, Jason Gunthorpe
  Cc: Tom Talpey, Chuck Lever, Bernard Metzler, linux-rdma

在 2023/6/12 8:19, Chuck Lever III 写道:
> 
> 
>> On Jun 10, 2023, at 8:05 AM, Jason Gunthorpe <jgg@nvidia.com> wrote:
>>
>> On Fri, Jun 09, 2023 at 04:49:49PM -0400, Tom Talpey wrote:
>>> On 6/7/2023 3:43 PM, Chuck Lever wrote:
>>>> From: Chuck Lever <chuck.lever@oracle.com>
>>>>
>>>> We would like to enable the use of siw on top of a VPN that is
>>>> constructed and managed via a tun device. That hasn't worked up
>>>> until now because ARPHRD_NONE devices (such as tun devices) have
>>>> no GID for the RDMA/core to look up.
>>>>
>>>> But it turns out that the egress device has already been picked for
>>>> us. addr_handler() just has to do the right thing with it.
>>>>
>>>> Tested with siw and qedr, both initiator and target.
>>>>
>>>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>>> ---
>>>>   drivers/infiniband/core/cma.c |    3 +++
>>>>   1 file changed, 3 insertions(+)
>>>>
>>>> This of course needs broader testing, but it seems to work, and it's
>>>> a little nicer than "if (dev_type == ARPHRD_NONE)".
>>>>
>>>> One thing I discovered is that the NFS/RDMA server implementation
>>>> does not deal at all with more than one RDMA device on the system.
>>>> I will address that with an ib_client; SunRPC patches forthcoming.
>>>>
>>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>>>> index 56e568fcd32b..c9a2bdb49e3c 100644
>>>> --- a/drivers/infiniband/core/cma.c
>>>> +++ b/drivers/infiniband/core/cma.c
>>>> @@ -694,6 +694,9 @@ cma_validate_port(struct ib_device *device, u32 port,
>>>>    if (!rdma_dev_access_netns(device, id_priv->id.route.addr.dev_addr.net))
>>>>    return ERR_PTR(-ENODEV);
>>>> + 	if (rdma_protocol_iwarp(device, port))
>>>> + 		return rdma_get_gid_attr(device, port, 0);
>>>
>>> This might work, but I'm skeptical of the conditional. There's nothing
>>> special about iWARP that says a GID should come from the outgoing device
>>> mac. And, other protocols without IB GID addressing won't benefit.
>>
>> The GID represents the source address, so it better come from the
>> outgoing device or something is really wrong.
>>
>> iWARP gets a conditional because iwarp always has a single GID, other
>> devices do not work that way.
> 
> If rdma_get_gid_attr() works for rxe, perhaps we could change
> the conditional to match on only software-emulated devices.


drivers/infiniband/sw/rxe/rxe_net.c:

453 struct sk_buff *rxe_init_packet(struct rxe_dev *rxe, struct rxe_av *av,
454                                 int paylen, struct rxe_pkt_info *pkt)
455 {
456         unsigned int hdr_len;
457         struct sk_buff *skb = NULL;
458         struct net_device *ndev;
459         const struct ib_gid_attr *attr;
460         const int port_num = 1;
461
462         attr = rdma_get_gid_attr(&rxe->ib_dev, port_num, 
av->grh.sgid_index);
463         if (IS_ERR(attr))
464                 return NULL;
465
...

Zhu Yanjun

> 
> 
> --
> Chuck Lever
> 
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-06-12  2:33 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-07 19:43 [PATCH v1] RDMA/core: Handle ARPHRD_NONE devices for iWARP Chuck Lever
2023-06-08 15:30 ` Chuck Lever III
2023-06-09 20:49 ` Tom Talpey
2023-06-10  0:04   ` Zhu Yanjun
2023-06-10 16:43     ` Chuck Lever III
2023-06-11  0:27       ` Zhu Yanjun
2023-06-10 12:05   ` Jason Gunthorpe
2023-06-10 16:38     ` Tom Talpey
2023-06-10 17:11       ` Chuck Lever III
2023-06-11  1:08       ` Jason Gunthorpe
2023-06-11  0:33     ` Zhu Yanjun
2023-06-12  0:19     ` Chuck Lever III
2023-06-12  2:33       ` Zhu Yanjun

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.