* Re: [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX
@ 2013-09-17 20:49 Or Gerlitz
       [not found] ` <CAJZOPZJ_F06xORoQyt-6_SK5P5Y7LXekQuNKHHYSt+oJ8sV1GA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Or Gerlitz @ 2013-09-17 20:49 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Jason Gunthorpe, Or Gerlitz, Devesh Sharma,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, monis, matanb

On Tue, Sep 17, 2013 at 8:50 PM, Roland Dreier wrote:
> On Thu, Sep 12, 2013 at 10:22 AM, Jason Gunthorpe wrote:
>> On Thu, Sep 12, 2013 at 03:24:46PM +0300, Or Gerlitz wrote:

>>> Let me clarify this. The idea is that current RoCE applications will
>>> run as is after they update "their" librdmacm, since its this
>>> library that works with the new uverbs entries.

>> Or, we are not supposed to break userspace. You can't insist that a
>> user space library be updated in-sync with the kernel.

> Agree.  This "IP based addressing" for RoCE looks like a big problem
> at the moment.  Let me reiterate my understanding, and you guys can
> correct me if I get something wrong:
>
>  - current addressing scheme is broken for virtualization use cases,
> because VMs may not know about what VLANs are in use.  (also there are
> issues around bonding modes that use different Ethernet addresses)

The current addressing is actually broken for VLAN use cases, both
native and virtualized. For the virtualized case, the reason is the
one you mentioned. For the native case, consider one node connected to
an Ethernet edge switch acting in access mode (that is, the switch
does the VLAN insertion/stripping) while the other node handles VLANs
by itself. Each node will form a different GID for the other party.
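
The mismatch can be illustrated with a minimal sketch. It assumes the
legacy MAC based GID layout (fe80::/64 prefix, MAC with the
universal/local bit flipped, and the VLAN ID placed where the EUI-64
ff:fe filler would otherwise go). The function name, the sample MAC,
and the VLAN ID are hypothetical and not taken from the patches.

```python
import ipaddress
from typing import Optional


def legacy_mac_vlan_gid(mac: str, vlan_id: Optional[int] = None) -> ipaddress.IPv6Address:
    """Build a legacy MAC based RoCE GID (assumed layout, see the lead-in)."""
    m = bytes.fromhex(mac.replace(":", ""))
    if vlan_id is None:
        filler = bytes([0xFF, 0xFE])         # no VLAN: plain EUI-64 filler
    else:
        filler = vlan_id.to_bytes(2, "big")  # VLAN ID replaces the filler
    # Flip the universal/local bit of the first MAC byte, then interleave.
    iid = bytes([m[0] ^ 0x02]) + m[1:3] + filler + m[3:6]
    return ipaddress.IPv6Address(bytes.fromhex("fe80000000000000") + iid)


mac_b = "00:02:c9:12:34:56"  # hypothetical MAC of node B

# Node A sits behind an access-mode switch and never sees the tag,
# so it derives B's GID without a VLAN ID.
gid_seen_by_a = legacy_mac_vlan_gid(mac_b)

# Node B handles VLAN 100 itself, so its own GID embeds the VLAN ID.
gid_used_by_b = legacy_mac_vlan_gid(mac_b, vlan_id=100)

print(gid_seen_by_a)
print(gid_used_by_b)
```

The two values differ only in the two bytes that carry the VLAN ID,
which is enough for the GID lookup on the other side to fail.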

>  - proposed change requires:
>    * all systems must update kernel at the same time, because old and
> new kernels cannot talk to each other
>    * all systems must update librdmacm when they update the kernel,
> because old librdmacm does not work with new kernel

> I understand that we want to fix the issue around VLAN tagged traffic
> from VMs, but I don't see how we can break the whole stack to
> accomplish that.  Isn't there some incremental way forward?

To begin with, we don't break the whole stack -- with the current
patch set, for ports whose link layer is IB, it's business as usual,
and this holds at per-port resolution: if a given device has one IB
port and one Eth port, the existing librdmacm keeps working on the IB
port.

Another fact to add to the discussion is that SRIOV VMs don't have
RoCE now (it is not supported upstream). Actually, we're holding off
on submitting the SRIOV RoCE patches because of the breakage with the
current scheme, so there is no need for backward compatibility here
either. The vast majority, if not all, of the cloud use cases we are
aware of that would use RoCE need VST and need it to work right.

With VLANs being broken already, I would say we need first and
foremost to fix that, and only later (maybe) worry about backward
compatibility for the few native mode use cases that somehow manage to
work around the buggy GID format when they use VLANs.

As for those who don't use VLANs, which is also rare, since RoCE works
best over a lossless channel, which is typically achieved using PFC
over a VLAN... we can use the fact that the IP based addressing
patches configure both the interface's IPv4 and IPv6 addresses into
the GID table.

Now, the IPv6 link-local address is actually also plugged into the GID
table by nodes running the old code, since this is how the non-VLAN
MAC based GID is constructed. Using this fact, we can allow

1. the patched kernel to work with non-updated user space, as long as
it uses the GID which relates to an IPv6 link-local address

2. a node running the "old" code to talk with a "new" node over what
the old node sees as a non-VLAN MAC based GID and the new node sees as
an IPv6 link-local GID.
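
Why this can work is sketched below, under stated assumptions: the
patched node stores the interface's IPv6 link-local address as-is and
its IPv4 address in IPv4-mapped form. The mapped form, the helper
names, and the sample addresses are assumptions for illustration only.
The link-local address formed from the MAC by modified EUI-64 (RFC
4291) is the same 16 bytes that an old node uses as the non-VLAN MAC
based GID.

```python
import ipaddress


def eui64_link_local(mac: str) -> ipaddress.IPv6Address:
    """IPv6 link-local address from a MAC via modified EUI-64 (RFC 4291)."""
    m = bytes.fromhex(mac.replace(":", ""))
    iid = bytes([m[0] ^ 0x02]) + m[1:3] + b"\xff\xfe" + m[3:6]
    return ipaddress.IPv6Address(bytes.fromhex("fe80000000000000") + iid)


def new_gid_table(mac: str, ipv4: str) -> list:
    """Hypothetical GID table of a patched node: one entry per IP address."""
    return [eui64_link_local(mac), ipaddress.IPv6Address("::ffff:" + ipv4)]


# GID an old node would use for a peer with this MAC and no VLAN.
old_node_gid = ipaddress.IPv6Address("fe80::202:c9ff:fe12:3456")

table = new_gid_table("00:02:c9:12:34:56", "10.0.0.5")

# The old-style GID matches the link-local entry of the new table.
matching_index = table.index(old_node_gid)
print(matching_index)
```

This only holds when the interface's link-local address is the
EUI-64-derived one; an interface configured with a different
link-local address would not match the old node's GID.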

Sounds better?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX
       [not found] ` <CAJZOPZJ_F06xORoQyt-6_SK5P5Y7LXekQuNKHHYSt+oJ8sV1GA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-09-17 23:10   ` Roland Dreier
       [not found]     ` <CAG4TOxOtsy+vtmtYciREk0bOC=o9-ME1T=cqvt46CNssCU57zA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2013-09-29 10:48   ` Or Gerlitz
  1 sibling, 1 reply; 15+ messages in thread
From: Roland Dreier @ 2013-09-17 23:10 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Jason Gunthorpe, Or Gerlitz, Devesh Sharma,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, monis, matanb

On Tue, Sep 17, 2013 at 1:49 PM, Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> To begin with, we don't break the whole stack -- using the current
> patch set, for ports whose link is IB, all biz as usual, and this is
> the in the port resolution, that is if for a given device one port is
> IB and one port Eth, existing librdmacm keep working on the IB por.

Sure, and people using USB webcams and wifi are also unaffected by
changes to the RoCE stack.   For anyone using RoCE the impact is
pretty big.

 - R.

* Re: [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX
       [not found]     ` <CAG4TOxOtsy+vtmtYciREk0bOC=o9-ME1T=cqvt46CNssCU57zA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-09-18  4:31       ` Or Gerlitz
  0 siblings, 0 replies; 15+ messages in thread
From: Or Gerlitz @ 2013-09-18  4:31 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Jason Gunthorpe, Or Gerlitz, Devesh Sharma,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, monis, matanb

On Wed, Sep 18, 2013 at 2:10 AM, Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Tue, Sep 17, 2013 at 1:49 PM, Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> To begin with, we don't break the whole stack -- using the current
>> patch set, for ports whose link is IB, all biz as usual, and this is
>> the in the port resolution, that is if for a given device one port is
>> IB and one port Eth, existing librdmacm keep working on the IB por.
>
> Sure, and people using USB webcams and wifi are also unaffected by
> changes to the RoCE stack.   For anyone using RoCE the impact is pretty big.

I see your point and this is why I haven't stopped after the "to begin
with" paragraph... any comment on what I wrote following that?

Or.

* Re: [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX
       [not found] ` <CAJZOPZJ_F06xORoQyt-6_SK5P5Y7LXekQuNKHHYSt+oJ8sV1GA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2013-09-17 23:10   ` Roland Dreier
@ 2013-09-29 10:48   ` Or Gerlitz
       [not found]     ` <52480568.8000801-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  1 sibling, 1 reply; 15+ messages in thread
From: Or Gerlitz @ 2013-09-29 10:48 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Jason Gunthorpe, Devesh Sharma,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Moni Shoua, Matan Barak

On 17/09/2013 23:49, Or Gerlitz wrote:
> On Tue, Sep 17, 2013 at 8:50 PM, Roland Dreier wrote:
>> On Thu, Sep 12, 2013 at 10:22 AM, Jason Gunthorpe wrote:
>>> On Thu, Sep 12, 2013 at 03:24:46PM +0300, Or Gerlitz wrote:
>>>> Let me clarify this. The idea is that current RoCE applications will
>>>> run as is after they update "their" librdmacm, since its this
>>>> library that works with the new uverbs entries.
>>> Or, we are not supposed to break userspace. You can't insist that a
>>> user space library be updated in-sync with the kernel.
>> Agree.  This "IP based addressing" for RoCE looks like a big problem
>> at the moment.  Let me reiterate my understanding, and you guys can
>> correct me if I get something wrong:
>>
>>   - current addressing scheme is broken for virtualization use cases,
>> because VMs may not know about what VLANs are in use.  (also there are
>> issues around bonding modes that use different Ethernet addresses)
> The current addressing is actually broken for vlan use cases, both
> native and virtualized, for the virt as of the argument you mentioned,
> for native as of one node connected to Ethernet edge switch acting in
> access mode (that is the switch does vlan insertion/stripping) and the
> other node handling vlans by itself. Each one will form different GID
> for the other party.
>
>>   - proposed change requires:
>>     * all systems must update kernel at the same time, because old and
>> new kernels cannot talk to each other
>>     * all systems must update librdmacm when they update the kernel,
>> because old librdmacm does not work with new kernel
>> I understand that we want to fix the issue around VLAN tagged traffic
>> from VMs, but I don't see how we can break the whole stack to
>> accomplish that.  Isn't there some incremental way forward?
> To begin with, we don't break the whole stack -- using the current
> patch set, for ports whose link is IB, all biz as usual, and this is
> the in the port resolution, that is if for a given device one port is
> IB and one port Eth, existing librdmacm keep working on the IB por.
>
> Another fact to put in the fire is that SRIOV VMs don't have RoCE now
> (not supported by upstream). Actually we're holding off with the SRIOV
> RoCE patches submission b/c of the breakage with the current scheme
> --> no need for backward compatibility here either. The vast majority
> if not all the Cloud use cases we are aware to which would use RoCE
> need VST and need it to work right.
>
> With vlans being broken already, I would say we need 1st and most fix
> that and only/maybe later worry on backward compatibility for the few
> native mode use cases that somehow manage to workaround the buggish
> gid format when they use vlans.
>
> As for those who don't use vlans, which is also rare, as RoCE is
> working best over some lossless channel which is typically achieved
> using PFC over a vlan... we can use the fact that the IP bases
> addressing patches configure both interface IPv4 and IPv6 addresses
> into the gid table.
>
> Now,  the IPv6 link address is actually also plugged into the gid
> table by nodes running the old code since this is how the non-vlan MAC
> based GID is constructed. Using this fact, we can allow
>
> 1. the patched kernel to work with non updated user space, as long as
> they use the GID which relates to an IPv6 link local address
>
> 2. node running the "old" code to talk with "new" node over what the
> old node sees as a non-vlan MAC based GID and the new node sees as
> IPv6 link local gid.
>
> Sounds better?
>
>

Hi Roland, ping. I wrote a detailed reply to your concerns and have
had no word from you except on the "to begin with" part. Can you
comment on the rest?

Or.



* Re: [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX
       [not found]     ` <52480568.8000801-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-10-02 15:09       ` Devesh Sharma
       [not found]         ` <CAGgPuS2791OXo9JrZ030qSn_4Yi777Vw5f8LP1-u2npNKppoKA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2013-10-10 21:26       ` Or Gerlitz
  1 sibling, 1 reply; 15+ messages in thread
From: Devesh Sharma @ 2013-10-02 15:09 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Roland Dreier, Jason Gunthorpe,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Moni Shoua, Matan Barak

Hi Or,

One more point: since current applications like
perftest/qperf/rping/krping do not have code to accept an IPv6
address, do you have plans to modify these?

On Sun, Sep 29, 2013 at 4:18 PM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> On 17/09/2013 23:49, Or Gerlitz wrote:
>>
>> On Tue, Sep 17, 2013 at 8:50 PM, Roland Dreier wrote:
>>>
>>> On Thu, Sep 12, 2013 at 10:22 AM, Jason Gunthorpe wrote:
>>>>
>>>> On Thu, Sep 12, 2013 at 03:24:46PM +0300, Or Gerlitz wrote:
>>>>>
>>>>> Let me clarify this. The idea is that current RoCE applications will
>>>>> run as is after they update "their" librdmacm, since its this
>>>>> library that works with the new uverbs entries.
>>>>
>>>> Or, we are not supposed to break userspace. You can't insist that a
>>>> user space library be updated in-sync with the kernel.
>>>
>>> Agree.  This "IP based addressing" for RoCE looks like a big problem
>>> at the moment.  Let me reiterate my understanding, and you guys can
>>> correct me if I get something wrong:
>>>
>>>   - current addressing scheme is broken for virtualization use cases,
>>> because VMs may not know about what VLANs are in use.  (also there are
>>> issues around bonding modes that use different Ethernet addresses)
>>
>> The current addressing is actually broken for vlan use cases, both
>> native and virtualized, for the virt as of the argument you mentioned,
>> for native as of one node connected to Ethernet edge switch acting in
>> access mode (that is the switch does vlan insertion/stripping) and the
>> other node handling vlans by itself. Each one will form different GID
>> for the other party.
>>
>>>   - proposed change requires:
>>>     * all systems must update kernel at the same time, because old and
>>> new kernels cannot talk to each other
>>>     * all systems must update librdmacm when they update the kernel,
>>> because old librdmacm does not work with new kernel
>>> I understand that we want to fix the issue around VLAN tagged traffic
>>> from VMs, but I don't see how we can break the whole stack to
>>> accomplish that.  Isn't there some incremental way forward?
>>
>> To begin with, we don't break the whole stack -- using the current
>> patch set, for ports whose link is IB, all biz as usual, and this is
>> the in the port resolution, that is if for a given device one port is
>> IB and one port Eth, existing librdmacm keep working on the IB por.
>>
>> Another fact to put in the fire is that SRIOV VMs don't have RoCE now
>> (not supported by upstream). Actually we're holding off with the SRIOV
>> RoCE patches submission b/c of the breakage with the current scheme
>> --> no need for backward compatibility here either. The vast majority
>> if not all the Cloud use cases we are aware to which would use RoCE
>> need VST and need it to work right.
>>
>> With vlans being broken already, I would say we need 1st and most fix
>> that and only/maybe later worry on backward compatibility for the few
>> native mode use cases that somehow manage to workaround the buggish
>> gid format when they use vlans.
>>
>> As for those who don't use vlans, which is also rare, as RoCE is
>> working best over some lossless channel which is typically achieved
>> using PFC over a vlan... we can use the fact that the IP bases
>> addressing patches configure both interface IPv4 and IPv6 addresses
>> into the gid table.
>>
>> Now,  the IPv6 link address is actually also plugged into the gid
>> table by nodes running the old code since this is how the non-vlan MAC
>> based GID is constructed. Using this fact, we can allow
>>
>> 1. the patched kernel to work with non updated user space, as long as
>> they use the GID which relates to an IPv6 link local address
>>
>> 2. node running the "old" code to talk with "new" node over what the
>> old node sees as a non-vlan MAC based GID and the new node sees as
>> IPv6 link local gid.
>>
>> Sounds better?
>>
>>
>
> Hi Roland, ping, I have wrote a detailed reply to your concerns and no word
> from you except on the
> "begin with" part, can you? Or.
>
>

* Re: [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX
       [not found]         ` <CAGgPuS2791OXo9JrZ030qSn_4Yi777Vw5f8LP1-u2npNKppoKA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-10-02 20:01           ` Or Gerlitz
  0 siblings, 0 replies; 15+ messages in thread
From: Or Gerlitz @ 2013-10-02 20:01 UTC (permalink / raw)
  To: Devesh Sharma
  Cc: Or Gerlitz, Roland Dreier, Jason Gunthorpe,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Moni Shoua, Matan Barak

On Wed, Oct 2, 2013 at 6:09 PM, Devesh Sharma <desh.t2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> One more point I have is, since current application like
> perftest/qperf/rping/krping do not have code to receive ipv6 address,
> do you have plans to modify these?

rping supports IPv6. As for krping, yes, it needs to be enhanced to
support that too.

* Re: [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX
       [not found]     ` <52480568.8000801-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2013-10-02 15:09       ` Devesh Sharma
@ 2013-10-10 21:26       ` Or Gerlitz
  1 sibling, 0 replies; 15+ messages in thread
From: Or Gerlitz @ 2013-10-10 21:26 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Jason Gunthorpe, Devesh Sharma,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Moni Shoua, Matan Barak

On Sun, Sep 29, 2013 at 1:48 PM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> On 17/09/2013 23:49, Or Gerlitz wrote:
>> On Tue, Sep 17, 2013 at 8:50 PM, Roland Dreier wrote:
>>> On Thu, Sep 12, 2013 at 10:22 AM, Jason Gunthorpe wrote:
>>>> On Thu, Sep 12, 2013 at 03:24:46PM +0300, Or Gerlitz wrote:

>>>>> Let me clarify this. The idea is that current RoCE applications will
>>>>> run as is after they update "their" librdmacm, since its this
>>>>> library that works with the new uverbs entries.

>>>> Or, we are not supposed to break userspace. You can't insist that a
>>>> user space library be updated in-sync with the kernel.

>>> Agree.  This "IP based addressing" for RoCE looks like a big problem
>>> at the moment.  Let me reiterate my understanding, and you guys can
>>> correct me if I get something wrong:

>>>   - current addressing scheme is broken for virtualization use cases,
>>> because VMs may not know about what VLANs are in use.  (also there are
>>> issues around bonding modes that use different Ethernet addresses)
>>
>> The current addressing is actually broken for vlan use cases, both
>> native and virtualized, for the virt as of the argument you mentioned,
>> for native as of one node connected to Ethernet edge switch acting in
>> access mode (that is the switch does vlan insertion/stripping) and the
>> other node handling vlans by itself. Each one will form different GID
>> for the other party.
>>
>>>   - proposed change requires:
>>>     * all systems must update kernel at the same time, because old and
>>> new kernels cannot talk to each other
>>>     * all systems must update librdmacm when they update the kernel,
>>> because old librdmacm does not work with new kernel
>>> I understand that we want to fix the issue around VLAN tagged traffic
>>> from VMs, but I don't see how we can break the whole stack to
>>> accomplish that.  Isn't there some incremental way forward?
>>
>> To begin with, we don't break the whole stack -- using the current
>> patch set, for ports whose link is IB, all biz as usual, and this is
>> the in the port resolution, that is if for a given device one port is
>> IB and one port Eth, existing librdmacm keep working on the IB por.
>>
>> Another fact to put in the fire is that SRIOV VMs don't have RoCE now
>> (not supported by upstream). Actually we're holding off with the SRIOV
>> RoCE patches submission b/c of the breakage with the current scheme
>> --> no need for backward compatibility here either. The vast majority
>> if not all the Cloud use cases we are aware to which would use RoCE
>> need VST and need it to work right.
>>
>> With vlans being broken already, I would say we need 1st and most fix
>> that and only/maybe later worry on backward compatibility for the few
>> native mode use cases that somehow manage to workaround the buggish
>> gid format when they use vlans.
>>
>> As for those who don't use vlans, which is also rare, as RoCE is
>> working best over some lossless channel which is typically achieved
>> using PFC over a vlan... we can use the fact that the IP bases
>> addressing patches configure both interface IPv4 and IPv6 addresses
>> into the gid table.
>>
>> Now,  the IPv6 link address is actually also plugged into the gid
>> table by nodes running the old code since this is how the non-vlan MAC
>> based GID is constructed. Using this fact, we can allow
>>
>> 1. the patched kernel to work with non updated user space, as long as
>> they use the GID which relates to an IPv6 link local address
>>
>> 2. node running the "old" code to talk with "new" node over what the
>> old node sees as a non-vlan MAC based GID and the new node sees as
>> IPv6 link local gid.
>>
>> Sounds better?
>>
>>
>
> Hi Roland, ping, I have wrote a detailed reply to your concerns and no word
> from you except on the "begin with" part, can you? Or.

Roland, it's been almost a month since I replied to your concerns and
so far there has not been a word from you on my core arguments. 3.12
is almost at rc5, and another kernel cycle can easily be lost here.

* Re: [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX
       [not found]                       ` <20130912172252.GA4611-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2013-09-17 17:50                         ` Roland Dreier
  0 siblings, 0 replies; 15+ messages in thread
From: Roland Dreier @ 2013-09-17 17:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Or Gerlitz, Devesh Sharma, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	monis, matanb

On Thu, Sep 12, 2013 at 10:22 AM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:

> On Thu, Sep 12, 2013 at 03:24:46PM +0300, Or Gerlitz wrote:
> > Let me clarify this. The idea is that current RoCE applications will
> > run as is after they update "their" librdmacm, since its this
> > library that works with the new uverbs entries.

> Or, we are not supposed to break userspace. You can't insist that a
> user space library be updated in-sync with the kernel.

Agree.  This "IP based addressing" for RoCE looks like a big problem
at the moment.  Let me reiterate my understanding, and you guys can
correct me if I get something wrong:

 - current addressing scheme is broken for virtualization use cases,
because VMs may not know about what VLANs are in use.  (also there are
issues around bonding modes that use different Ethernet addresses)

 - proposed change requires:

   * all systems must update kernel at the same time, because old and
new kernels cannot talk to each other

   * all systems must update librdmacm when they update the kernel,
because old librdmacm does not work with new kernel

I understand that we want to fix the issue around VLAN tagged traffic
from VMs, but I don't see how we can break the whole stack to
accomplish that.  Isn't there some incremental way forward?

 - R.

* Re: [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX
       [not found]                   ` <5231B28E.4090605-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-09-12 17:22                     ` Jason Gunthorpe
       [not found]                       ` <20130912172252.GA4611-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Gunthorpe @ 2013-09-12 17:22 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Devesh Sharma, roland-DgEjT+Ai2ygdnm+yROfE0A,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, monis-VPRAkNaXOzVWk0Htik3J/w,
	matanb-VPRAkNaXOzVWk0Htik3J/w

On Thu, Sep 12, 2013 at 03:24:46PM +0300, Or Gerlitz wrote:

> Let me clarify this. The idea is that current RoCE applications will
> run as is after they update "their" librdmacm, since its this
> library that works with the new uverbs entries.

Or, we are not supposed to break userspace. You can't insist that a
user space library be updated in-sync with the kernel.

Can you please think of a way to retain compatibility?

Jason

* Re: [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX
  2013-09-12 11:31               ` Devesh Sharma
@ 2013-09-12 12:24                 ` Or Gerlitz
       [not found]                   ` <5231B28E.4090605-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Or Gerlitz @ 2013-09-12 12:24 UTC (permalink / raw)
  To: Devesh Sharma
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	monis-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w

On 12/09/2013 14:31, Devesh Sharma wrote:
> On Thu, Sep 12, 2013 at 4:15 PM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>> We've posted the kernel patches, that should be enough for the review. If you have any specific questions re user space aspects of this series, feel free to send them now.
> Yes! for kenel space I see the above set of patches will work fine
> without any issues. On the other hand, if from user space some
> application tries to establish a connection using RDMACM, the Driver
> will receive dmac and vlanid fields as Zeros because
> libibverbs/librdmacm still does not call _EX versions of UVERBS/UCM
> commands, which are introduced in these set of patches (7/9, 8/9). So,
> for example if I try to run ib_send_bw with -R, traffice will not run!!
>
> So what are the plans to add these changes in libibverbs/librdmacm libraries.
>                   OR
> there is some flaw in my understanding that librdmacm/libibverbs needs
> changes in order to uses newly proposed scheme. Please clarify.
>

Let me clarify this. The idea is that current RoCE applications will
run as is after they update "their" librdmacm, since it's this library
that works with the new uverbs entries.

Note that the RoCE stack assumes the existence of an Ethernet device
for the specific vendor along with their IB driver. This series takes
another tiny step by assuming this device has an IP address
configured, and hence RoCE applications are expected to use librdmacm.

If the underlying device/port runs IB, it's business as usual: the
patches / extended commands need not come into play, and the related
non-extended uverbs commands all work as they did before the patches.
This is per port; that is, if port1 is IB and port2 is Eth, QPs on
port1 can be handled with the older uverbs commands.

As for your question, ib_send_bw -R uses the rdma-cm and hence will
not need to change, since the rdma-cm does the QP modifications.

Or.

* Re: [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX
       [not found]             ` <52319B38.5070807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2013-09-12 11:31               ` Devesh Sharma
@ 2013-09-12 11:46               ` Devesh Sharma
  1 sibling, 0 replies; 15+ messages in thread
From: Devesh Sharma @ 2013-09-12 11:46 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	monis-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w

Inline Below on Second question:

On Thu, Sep 12, 2013 at 4:15 PM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> On 12/09/2013 08:26, Devesh Sharma wrote:
>>
>> I don't see any patches to librdmacm/libibverbs git to call _EX version of
>> uverbs commands.
>
>
> We've posted the kernel patches, that should be enough for the review. If
> you have any specific questions re user
> space aspects of this series, feel free to send them now.
>
>
>> The patch you have pointed in v4-0 patch still seems to be incomplete. Are
>> these broken?
>
>
> I don't understand the question. During the review of V4 we were pointed to
> a part missing in patch 5/9
> and this will be fixed in V5, sure.
Yes, 5/9 is missing a part related to populating the GID table at load
time. My main concern was the user space apps: with the current git of
libibverbs/librdmacm, a user app will fail to perform data transfer
operations.
After digging more into the linux-rdma git, I found the completed set
of patches for flow steering, which introduces extension commands in
kernel space. I am still looking for the corresponding patches for
libibverbs/librdmacm.

-Regards
 Devesh Sharma

>
> Or.
>

* Re: [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX
       [not found]             ` <52319B38.5070807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-09-12 11:31               ` Devesh Sharma
  2013-09-12 12:24                 ` Or Gerlitz
  2013-09-12 11:46               ` Devesh Sharma
  1 sibling, 1 reply; 15+ messages in thread
From: Devesh Sharma @ 2013-09-12 11:31 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	monis-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w

Inline Below:

On Thu, Sep 12, 2013 at 4:15 PM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> On 12/09/2013 08:26, Devesh Sharma wrote:
>>
>> I don't see any patches to librdmacm/libibverbs git to call _EX version of
>> uverbs commands.
>
>
> We've posted the kernel patches, that should be enough for the review. If
> you have any specific questions re user
> space aspects of this series, feel free to send them now.

Yes! For kernel space I see that the above set of patches will work fine
without any issues. On the other hand, if some user space
application tries to establish a connection using RDMACM, the driver
will receive the dmac and vlanid fields as zeros, because
libibverbs/librdmacm still do not call the _EX versions of the UVERBS/UCM
commands that are introduced in this set of patches (7/9, 8/9). So,
for example, if I try to run ib_send_bw with -R, traffic will not
run!!
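
To make the failure mode concrete, here is a minimal sketch. It is not the
actual uverbs ABI or driver code: every struct and function name below is
hypothetical, and the only thing taken from this thread is the idea that the
legacy command carries no dmac/vlanid, so the driver sees zeros unless the
library issues the _EX variant.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; NOT the real uverbs command layouts. */
struct legacy_modify_qp {
	uint32_t qp_num;
	uint8_t  dgid[16];
};

/* Assumed shape of an extended command: legacy fields plus L2 addressing. */
struct ex_modify_qp {
	struct legacy_modify_qp base;
	uint8_t  dmac[6];
	uint16_t vlan_id;
};

/* What the (hypothetical) driver ends up with in either case. */
struct driver_attr {
	uint32_t qp_num;
	uint8_t  dmac[6];
	uint16_t vlan_id;
};

/* Legacy path: nothing to copy dmac/vlan_id from, so they stay zero. */
static struct driver_attr from_legacy(const struct legacy_modify_qp *cmd)
{
	struct driver_attr a;

	assert(cmd != NULL);
	memset(&a, 0, sizeof(a));
	a.qp_num = cmd->qp_num;
	return a;
}

/* Extended path: the L2 fields are carried through to the driver. */
static struct driver_attr from_ex(const struct ex_modify_qp *cmd)
{
	struct driver_attr a;

	assert(cmd != NULL);
	memset(&a, 0, sizeof(a));
	a.qp_num = cmd->base.qp_num;
	memcpy(a.dmac, cmd->dmac, sizeof(a.dmac));
	a.vlan_id = cmd->vlan_id;
	return a;
}

/* Returns 1 when the driver was handed an all-zero destination MAC. */
static int dmac_is_zero(const struct driver_attr *a)
{
	static const uint8_t zero[6];

	return memcmp(a->dmac, zero, sizeof(zero)) == 0;
}
```

With an unmodified library only from_legacy() is ever reached, which is the
case where I expect ib_send_bw -R to stop passing traffic.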

So what are the plans for adding these changes to the libibverbs/librdmacm libraries?
                 OR
is there some flaw in my understanding that librdmacm/libibverbs need
changes in order to use the newly proposed scheme? Please clarify.

-Regards
 Devesh
>
>
>> The patch you have pointed in v4-0 patch still seems to be incomplete. Are
>> these broken?
>
>
> I don't understand the question. During the review of V4 we were pointed to
> a part missing in patch 5/9
> and this will be fixed in V5, sure.
>
> Or.
>

* Re: [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX
       [not found]         ` <CAGgPuS1tAiyA3TZ5_fpua3ue6JrZ9ruS+O+QU-7t28i0dZ7cUw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-09-12 10:45           ` Or Gerlitz
       [not found]             ` <52319B38.5070807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Or Gerlitz @ 2013-09-12 10:45 UTC (permalink / raw)
  To: Devesh Sharma
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	monis-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w

On 12/09/2013 08:26, Devesh Sharma wrote:
> I don't see any patches to librdmacm/libibverbs git to call _EX version of uverbs commands.

We've posted the kernel patches, that should be enough for the review. 
If you have any specific questions re user
space aspects of this series, feel free to send them now.

> The patch you have pointed in v4-0 patch still seems to be incomplete. Are these broken?

I don't understand the question. During the review of V4 we were pointed 
to a part missing in patch 5/9
and this will be fixed in V5, sure.

Or.

* Re: [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX
       [not found]     ` <1378824099-22150-10-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-09-12  5:26       ` Devesh Sharma
       [not found]         ` <CAGgPuS1tAiyA3TZ5_fpua3ue6JrZ9ruS+O+QU-7t28i0dZ7cUw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Devesh Sharma @ 2013-09-12  5:26 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	monis-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w

Hi Or,

I don't see any patches in the librdmacm/libibverbs git trees that call
the _EX versions of the uverbs commands. The patch you pointed to in the
V4 0/9 cover letter still seems to be incomplete. Are these broken?

On Tue, Sep 10, 2013 at 8:11 PM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> From: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>
> mlx4_ib driver should indicate that it supports
> MODIFY_QP_EX user verbs extended command.
>
> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/hw/mlx4/main.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
> index 7a29ad5..77c87d0 100644
> --- a/drivers/infiniband/hw/mlx4/main.c
> +++ b/drivers/infiniband/hw/mlx4/main.c
> @@ -1755,7 +1755,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
>                 (1ull << IB_USER_VERBS_CMD_QUERY_SRQ)           |
>                 (1ull << IB_USER_VERBS_CMD_DESTROY_SRQ)         |
>                 (1ull << IB_USER_VERBS_CMD_CREATE_XSRQ)         |
> -               (1ull << IB_USER_VERBS_CMD_OPEN_QP);
> +               (1ull << IB_USER_VERBS_CMD_OPEN_QP)             |
> +               (1ull << IB_USER_VERBS_CMD_MODIFY_QP_EX);
>
>         ibdev->ib_dev.query_device      = mlx4_ib_query_device;
>         ibdev->ib_dev.query_port        = mlx4_ib_query_port;
> --
> 1.7.1
>

* [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX
       [not found] ` <1378824099-22150-1-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-09-10 14:41   ` Or Gerlitz
       [not found]     ` <1378824099-22150-10-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Or Gerlitz @ 2013-09-10 14:41 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, monis-VPRAkNaXOzVWk0Htik3J/w,
	matanb-VPRAkNaXOzVWk0Htik3J/w, Or Gerlitz

From: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

mlx4_ib driver should indicate that it supports
MODIFY_QP_EX user verbs extended command.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/mlx4/main.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 7a29ad5..77c87d0 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -1755,7 +1755,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
 		(1ull << IB_USER_VERBS_CMD_QUERY_SRQ)		|
 		(1ull << IB_USER_VERBS_CMD_DESTROY_SRQ)		|
 		(1ull << IB_USER_VERBS_CMD_CREATE_XSRQ)		|
-		(1ull << IB_USER_VERBS_CMD_OPEN_QP);
+		(1ull << IB_USER_VERBS_CMD_OPEN_QP)		|
+		(1ull << IB_USER_VERBS_CMD_MODIFY_QP_EX);
 
 	ibdev->ib_dev.query_device	= mlx4_ib_query_device;
 	ibdev->ib_dev.query_port	= mlx4_ib_query_port;
-- 
1.7.1
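
For readers less familiar with the pattern in the hunk above, the following
is a minimal sketch of a 64-bit "supported commands" mask. The enum values
and helper names are hypothetical and chosen only for illustration; the real
command numbers are defined in the kernel headers and are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical command numbers; the real ones live in the kernel headers. */
enum demo_cmd {
	DEMO_CMD_OPEN_QP      = 5,
	DEMO_CMD_MODIFY_QP_EX = 40,
};

/* Advertise one more command by setting its bit in the mask. */
static uint64_t demo_mask_add(uint64_t mask, enum demo_cmd cmd)
{
	assert((unsigned)cmd < 64);
	/* 1ull keeps the shift 64 bits wide; a plain int shift by 40 is undefined. */
	return mask | (1ull << cmd);
}

/* A dispatcher would reject any command whose bit is not set. */
static int demo_cmd_supported(uint64_t mask, enum demo_cmd cmd)
{
	return (unsigned)cmd < 64 && (mask & (1ull << cmd)) != 0;
}
```

Under these assumptions, a driver that does not OR in the MODIFY_QP_EX bit
reports the command as unsupported even if the core implements it, which is
why a one-line mask change is needed per driver.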

end of thread, other threads:[~2013-10-10 21:26 UTC | newest]

Thread overview: 15+ messages
2013-09-17 20:49 [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX Or Gerlitz
     [not found] ` <CAJZOPZJ_F06xORoQyt-6_SK5P5Y7LXekQuNKHHYSt+oJ8sV1GA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-09-17 23:10   ` Roland Dreier
     [not found]     ` <CAG4TOxOtsy+vtmtYciREk0bOC=o9-ME1T=cqvt46CNssCU57zA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-09-18  4:31       ` Or Gerlitz
2013-09-29 10:48   ` Or Gerlitz
     [not found]     ` <52480568.8000801-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-10-02 15:09       ` Devesh Sharma
     [not found]         ` <CAGgPuS2791OXo9JrZ030qSn_4Yi777Vw5f8LP1-u2npNKppoKA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-10-02 20:01           ` Or Gerlitz
2013-10-10 21:26       ` Or Gerlitz
  -- strict thread matches above, loose matches on Subject: below --
2013-09-10 14:41 [PATCH V4 0/9] IP based RoCE GID Addressing Or Gerlitz
     [not found] ` <1378824099-22150-1-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-09-10 14:41   ` [PATCH V4 9/9] IB/mlx4: Enable mlx4_ib support for MODIFY_QP_EX Or Gerlitz
     [not found]     ` <1378824099-22150-10-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-09-12  5:26       ` Devesh Sharma
     [not found]         ` <CAGgPuS1tAiyA3TZ5_fpua3ue6JrZ9ruS+O+QU-7t28i0dZ7cUw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-09-12 10:45           ` Or Gerlitz
     [not found]             ` <52319B38.5070807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-09-12 11:31               ` Devesh Sharma
2013-09-12 12:24                 ` Or Gerlitz
     [not found]                   ` <5231B28E.4090605-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-09-12 17:22                     ` Jason Gunthorpe
     [not found]                       ` <20130912172252.GA4611-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-09-17 17:50                         ` Roland Dreier
2013-09-12 11:46               ` Devesh Sharma
