All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
@ 2015-03-04  0:15 roopa
  2015-03-04  4:15 ` John Fastabend
                   ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: roopa @ 2015-03-04  0:15 UTC (permalink / raw)
  To: sfeldma, jiri; +Cc: netdev, davem

From: Roopa Prabhu <roopa@cumulusnetworks.com>

With the recent addition of the NETIF_F_HW_SWITCH_OFFLOAD flag
on rocker ports, the second command (bridge link set) below will turn off
learning in the rocker hw (Scott/Jiri, need some confirmation from
you that this is indeed a problem and if the below patch is ok).

ip link set dev swp1 master br0
bridge link set dev swp1 learning off master
bridge link set dev swp1 learning_sync on self

This patch fixes rocker to ignore learning setting when 'master'
is set. This makes it possible to set/unset learning in kernel and bridge
driver independently.

The below command will continue to set learning on in both kernel and rocker
hw:
bridge link set dev swp1 learning on

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
 drivers/net/ethernet/rocker/rocker.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
index e5a15a4..d7c31d2 100644
--- a/drivers/net/ethernet/rocker/rocker.c
+++ b/drivers/net/ethernet/rocker/rocker.c
@@ -3769,6 +3769,9 @@ static int rocker_port_bridge_setlink(struct net_device *dev,
 	struct nlattr *attr;
 	int err;
 
+	if (flags && !(flags & BRIDGE_FLAGS_SELF))
+		return 0;
+
 	protinfo = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg),
 				   IFLA_PROTINFO);
 	if (protinfo) {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-04  0:15 [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler roopa
@ 2015-03-04  4:15 ` John Fastabend
  2015-03-04  7:02 ` Scott Feldman
  2015-03-05  8:36 ` Jiri Pirko
  2 siblings, 0 replies; 38+ messages in thread
From: John Fastabend @ 2015-03-04  4:15 UTC (permalink / raw)
  To: roopa; +Cc: sfeldma, jiri, netdev, davem

On 03/03/2015 04:15 PM, roopa@cumulusnetworks.com wrote:
> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
> With the recent addition of the NETIF_F_HW_SWITCH_OFFLOAD flag
> on rocker ports, the second command (bridge link set) below will turn off
> learning in the rocker hw (Scott/Jiri, need some confirmation from
> you that this is indeed a problem and if the below patch is ok).
>
> ip link set dev swp1 master br0
> bridge link set dev swp1 learning off master
> bridge link set dev swp1 learning_sync on self
>
> This patch fixes rocker to ignore learning setting when 'master'
> is set. This makes it possible to set/unset learning in kernel and bridge
> driver independently.
>
> The below command will continue to set learning on in both kernel and rocker
> hw:
> bridge link set dev swp1 learning on
>
> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
> ---

We will at least want this behaviour in the NIC case where the NIC
can support l2 learning and forwarding to VF's or even other ports.
(remind again when a NIC becomes a switch and a switch becomes a NIC?).

In this case you may want to learn in the hardware but in the
hypervisor only support assigned MAC address because this is
controlled by a libvirt/qemu for example.

.John

-- 
John Fastabend         Intel Corporation

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-04  0:15 [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler roopa
  2015-03-04  4:15 ` John Fastabend
@ 2015-03-04  7:02 ` Scott Feldman
  2015-03-04  8:51   ` roopa
  2015-03-05  8:36 ` Jiri Pirko
  2 siblings, 1 reply; 38+ messages in thread
From: Scott Feldman @ 2015-03-04  7:02 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: Jiří Pírko, Netdev, David S. Miller

On Tue, Mar 3, 2015 at 4:15 PM,  <roopa@cumulusnetworks.com> wrote:
> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
> With the recent addition of the NETIF_F_HW_SWITCH_OFFLOAD flag
> on rocker ports, the second command (bridge link set) below will turn off
> learning in the rocker hw (Scott/Jiri, need some confirmation from
> you that this is indeed a problem and if the below patch is ok).
>
> ip link set dev swp1 master br0
> bridge link set dev swp1 learning off master
> bridge link set dev swp1 learning_sync on self
>
> This patch fixes rocker to ignore learning setting when 'master'
> is set. This makes it possible to set/unset learning in kernel and bridge
> driver independently.
>
> The below command will continue to set learning on in both kernel and rocker
> hw:
> bridge link set dev swp1 learning on
>
> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
> ---
>  drivers/net/ethernet/rocker/rocker.c |    3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
> index e5a15a4..d7c31d2 100644
> --- a/drivers/net/ethernet/rocker/rocker.c
> +++ b/drivers/net/ethernet/rocker/rocker.c
> @@ -3769,6 +3769,9 @@ static int rocker_port_bridge_setlink(struct net_device *dev,
>         struct nlattr *attr;
>         int err;
>
> +       if (flags && !(flags & BRIDGE_FLAGS_SELF))
> +               return 0;
> +
>         protinfo = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg),
>                                    IFLA_PROTINFO);
>         if (protinfo) {


NACK on this patch.  This is the problem with netlink creeping into
ndo ops: it's too tempting to push work-arounds down to driver.

In this case, you're making the driver check to see if it's SELF when
it's already SELF by definition.

Rocker setlink wasn't broken prior to the NETIF_F_HW_SWITCH_OFFLOAD
patches.  Now it is, but this isn't the right fix.

Can we revisit this so these two commands only hit MASTER:

   bridge link set dev swp1 learning on
   bridge link set dev swp1 learning on master

And this one hits SELF:

    bridge link set dev swp1 learning on self

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-04  7:02 ` Scott Feldman
@ 2015-03-04  8:51   ` roopa
  2015-03-04 16:24     ` Scott Feldman
  2015-03-05  8:02     ` Jiri Pirko
  0 siblings, 2 replies; 38+ messages in thread
From: roopa @ 2015-03-04  8:51 UTC (permalink / raw)
  To: Scott Feldman; +Cc: Jiří Pírko, Netdev, David S. Miller

On 3/3/15, 11:02 PM, Scott Feldman wrote:
> On Tue, Mar 3, 2015 at 4:15 PM,  <roopa@cumulusnetworks.com> wrote:
>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>
>> With the recent addition of the NETIF_F_HW_SWITCH_OFFLOAD flag
>> on rocker ports, the second command (bridge link set) below will turn off
>> learning in the rocker hw (Scott/Jiri, need some confirmation from
>> you that this is indeed a problem and if the below patch is ok).
>>
>> ip link set dev swp1 master br0
>> bridge link set dev swp1 learning off master
>> bridge link set dev swp1 learning_sync on self
>>
>> This patch fixes rocker to ignore learning setting when 'master'
>> is set. This makes it possible to set/unset learning in kernel and bridge
>> driver independently.
>>
>> The below command will continue to set learning on in both kernel and rocker
>> hw:
>> bridge link set dev swp1 learning on
>>
>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>> ---
>>   drivers/net/ethernet/rocker/rocker.c |    3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
>> index e5a15a4..d7c31d2 100644
>> --- a/drivers/net/ethernet/rocker/rocker.c
>> +++ b/drivers/net/ethernet/rocker/rocker.c
>> @@ -3769,6 +3769,9 @@ static int rocker_port_bridge_setlink(struct net_device *dev,
>>          struct nlattr *attr;
>>          int err;
>>
>> +       if (flags && !(flags & BRIDGE_FLAGS_SELF))
>> +               return 0;
>> +
>>          protinfo = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg),
>>                                     IFLA_PROTINFO);
>>          if (protinfo) {
>
> NACK on this patch.  This is the problem with netlink creeping into
> ndo ops: it's too tempting to push work-arounds down to driver.
netlink has already crept into the driver. You do parse the netlink 
message right below.
The patch actually does not touch the message. It is just using one of 
the args already passed to the handler.

> In this case, you're making the driver check to see if it's SELF when
> it's already SELF by definition.
>
> Rocker setlink wasn't broken prior to the NETIF_F_HW_SWITCH_OFFLOAD
> patches.
>   Now it is,
sure, I can submit a patch to remove the flag on rocker ports if thats 
what you prefer.

> but this isn't the right fix.
>
> Can we revisit this so these two commands only hit MASTER:
>
>     bridge link set dev swp1 learning on
>     bridge link set dev swp1 learning on master
>
> And this one hits SELF:
>
>      bridge link set dev swp1 learning on self

For the above to work you just need to remove the feature flag in the driver (i can submit a patch).

The reason why it is useful to have it the way it currently is:

the setlink request does not always only contain the 'learning' flag.
It handles vlans too. I dont see rocker parsing vlans yet ie 
IFLA_AF_SPEC (or did i miss it ?)
and in those cases I thought it would be nice to pass the vlan info to 
the rocker ports when it
is added to the bridge driver.

For example the below command will hit the kernel bridge driver and 
rocker today.

bridge vlan add vid 10-20 dev swp1

If you remove the feature flag, you will have to run both

bridge vlan add vid 10-20 dev swp1
bridge vlan add vid 10-20 dev swp1 self


Let me know,
Thanks,
Roopa


   

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-04  8:51   ` roopa
@ 2015-03-04 16:24     ` Scott Feldman
  2015-03-05  0:31       ` roopa
  2015-03-05  8:02     ` Jiri Pirko
  1 sibling, 1 reply; 38+ messages in thread
From: Scott Feldman @ 2015-03-04 16:24 UTC (permalink / raw)
  To: roopa; +Cc: Jiří Pírko, Netdev, David S. Miller

On Wed, Mar 4, 2015 at 12:51 AM, roopa <roopa@cumulusnetworks.com> wrote:
> On 3/3/15, 11:02 PM, Scott Feldman wrote:
>>
>> On Tue, Mar 3, 2015 at 4:15 PM,  <roopa@cumulusnetworks.com> wrote:
>>>
>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>
>>> With the recent addition of the NETIF_F_HW_SWITCH_OFFLOAD flag
>>> on rocker ports, the second command (bridge link set) below will turn off
>>> learning in the rocker hw (Scott/Jiri, need some confirmation from
>>> you that this is indeed a problem and if the below patch is ok).
>>>
>>> ip link set dev swp1 master br0
>>> bridge link set dev swp1 learning off master
>>> bridge link set dev swp1 learning_sync on self
>>>
>>> This patch fixes rocker to ignore learning setting when 'master'
>>> is set. This makes it possible to set/unset learning in kernel and bridge
>>> driver independently.
>>>
>>> The below command will continue to set learning on in both kernel and
>>> rocker
>>> hw:
>>> bridge link set dev swp1 learning on
>>>
>>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>>> ---
>>>   drivers/net/ethernet/rocker/rocker.c |    3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/rocker/rocker.c
>>> b/drivers/net/ethernet/rocker/rocker.c
>>> index e5a15a4..d7c31d2 100644
>>> --- a/drivers/net/ethernet/rocker/rocker.c
>>> +++ b/drivers/net/ethernet/rocker/rocker.c
>>> @@ -3769,6 +3769,9 @@ static int rocker_port_bridge_setlink(struct
>>> net_device *dev,
>>>          struct nlattr *attr;
>>>          int err;
>>>
>>> +       if (flags && !(flags & BRIDGE_FLAGS_SELF))
>>> +               return 0;
>>> +
>>>          protinfo = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg),
>>>                                     IFLA_PROTINFO);
>>>          if (protinfo) {
>>
>>
>> NACK on this patch.  This is the problem with netlink creeping into
>> ndo ops: it's too tempting to push work-arounds down to driver.
>
> netlink has already crept into the driver. You do parse the netlink message
> right below.
> The patch actually does not touch the message. It is just using one of the
> args already passed to the handler.
>
>> In this case, you're making the driver check to see if it's SELF when
>> it's already SELF by definition.
>>
>> Rocker setlink wasn't broken prior to the NETIF_F_HW_SWITCH_OFFLOAD
>> patches.
>>   Now it is,
>
> sure, I can submit a patch to remove the flag on rocker ports if thats what
> you prefer.

I don't want that either.  I haven't figured out why I need
NETIF_F_HW_SWITCH_OFFLOAD at the moment, but if this is supposed to be
a flag that's set on switchdev driver ports, then it should be set on
rocker ports.


>> but this isn't the right fix.
>>
>> Can we revisit this so these two commands only hit MASTER:
>>
>>     bridge link set dev swp1 learning on
>>     bridge link set dev swp1 learning on master
>>
>> And this one hits SELF:
>>
>>      bridge link set dev swp1 learning on self
>
>
> For the above to work you just need to remove the feature flag in the driver
> (i can submit a patch).
>
> The reason why it is useful to have it the way it currently is:
>
> the setlink request does not always only contain the 'learning' flag.
> It handles vlans too. I dont see rocker parsing vlans yet ie IFLA_AF_SPEC
> (or did i miss it ?)

Yes, you missed it.  See
br_setlink->br_afspec->br_vlan_info->nbp_vlan_add->__vlan_add->__vlan_vid_add

The bridge is already calling into the driver to tell it which vlans
are on which port.

This works when doing vlans outside of bridge driver also.  One
interface for both bridge and stand-alone vlans.

I see no reason to parse netlink in the driver's setlink/dellink other
than getting the SELF flags.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-04 16:24     ` Scott Feldman
@ 2015-03-05  0:31       ` roopa
  0 siblings, 0 replies; 38+ messages in thread
From: roopa @ 2015-03-05  0:31 UTC (permalink / raw)
  To: Scott Feldman; +Cc: Jiří Pírko, Netdev, David S. Miller

On 3/4/15, 8:24 AM, Scott Feldman wrote:
> On Wed, Mar 4, 2015 at 12:51 AM, roopa <roopa@cumulusnetworks.com> wrote:
>> On 3/3/15, 11:02 PM, Scott Feldman wrote:
>>> On Tue, Mar 3, 2015 at 4:15 PM,  <roopa@cumulusnetworks.com> wrote:
>>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>
>>>> With the recent addition of the NETIF_F_HW_SWITCH_OFFLOAD flag
>>>> on rocker ports, the second command (bridge link set) below will turn off
>>>> learning in the rocker hw (Scott/Jiri, need some confirmation from
>>>> you that this is indeed a problem and if the below patch is ok).
>>>>
>>>> ip link set dev swp1 master br0
>>>> bridge link set dev swp1 learning off master
>>>> bridge link set dev swp1 learning_sync on self
>>>>
>>>> This patch fixes rocker to ignore learning setting when 'master'
>>>> is set. This makes it possible to set/unset learning in kernel and bridge
>>>> driver independently.
>>>>
>>>> The below command will continue to set learning on in both kernel and
>>>> rocker
>>>> hw:
>>>> bridge link set dev swp1 learning on
>>>>
>>>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>> ---
>>>>    drivers/net/ethernet/rocker/rocker.c |    3 +++
>>>>    1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/net/ethernet/rocker/rocker.c
>>>> b/drivers/net/ethernet/rocker/rocker.c
>>>> index e5a15a4..d7c31d2 100644
>>>> --- a/drivers/net/ethernet/rocker/rocker.c
>>>> +++ b/drivers/net/ethernet/rocker/rocker.c
>>>> @@ -3769,6 +3769,9 @@ static int rocker_port_bridge_setlink(struct
>>>> net_device *dev,
>>>>           struct nlattr *attr;
>>>>           int err;
>>>>
>>>> +       if (flags && !(flags & BRIDGE_FLAGS_SELF))
>>>> +               return 0;
>>>> +
>>>>           protinfo = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg),
>>>>                                      IFLA_PROTINFO);
>>>>           if (protinfo) {
>>>
>>> NACK on this patch.  This is the problem with netlink creeping into
>>> ndo ops: it's too tempting to push work-arounds down to driver.
>> netlink has already crept into the driver. You do parse the netlink message
>> right below.
>> The patch actually does not touch the message. It is just using one of the
>> args already passed to the handler.
>>
>>> In this case, you're making the driver check to see if it's SELF when
>>> it's already SELF by definition.
>>>
>>> Rocker setlink wasn't broken prior to the NETIF_F_HW_SWITCH_OFFLOAD
>>> patches.
>>>    Now it is,
>> sure, I can submit a patch to remove the flag on rocker ports if thats what
>> you prefer.
> I don't want that either.  I haven't figured out why I need
> NETIF_F_HW_SWITCH_OFFLOAD at the moment, but if this is supposed to be
> a flag that's set on switchdev driver ports, then it should be set on
> rocker ports.
Its not mandatory. Only If you need it. If you want the user to drive 
rocker explicitly using 'self',
then you dont need the OFFLOAD flag.

>
>
>>> but this isn't the right fix.
>>>
>>> Can we revisit this so these two commands only hit MASTER:
>>>
>>>      bridge link set dev swp1 learning on
>>>      bridge link set dev swp1 learning on master
>>>
>>> And this one hits SELF:
>>>
>>>       bridge link set dev swp1 learning on self
>>
>> For the above to work you just need to remove the feature flag in the driver
>> (i can submit a patch).
>>
>> The reason why it is useful to have it the way it currently is:
>>
>> the setlink request does not always only contain the 'learning' flag.
>> It handles vlans too. I dont see rocker parsing vlans yet ie IFLA_AF_SPEC
>> (or did i miss it ?)
> Yes, you missed it.  See
> br_setlink->br_afspec->br_vlan_info->nbp_vlan_add->__vlan_add->__vlan_vid_add
ok thanks for the clarification. I thought you were using 
ndo_bridge_setlink to pass the vlan info to the driver (because that is 
one way to pass vlans for the vlan filtering bridge). And the switch 
port driver already implements ndo_bridge_setlink.

>
> The bridge is already calling into the driver to tell it which vlans
> are on which port.
>
> This works when doing vlans outside of bridge driver also.  One
> interface for both bridge and stand-alone vlans.
agreed, however, i would have thought you also need the vinfo flags 
(struct bridge_vlan_info).
>
> I see no reason to parse netlink in the driver's setlink/dellink other
> than getting the SELF flags.
hmm.., the driver is already doing this to parse learning and learning 
sync flags. And bridge setlink flags is passed in the ndo op.

Let me know if you have other ideas to fix this (one approach is to not 
advertise the OFFLOAD flag on rocker ports).

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-04  8:51   ` roopa
  2015-03-04 16:24     ` Scott Feldman
@ 2015-03-05  8:02     ` Jiri Pirko
  2015-03-05 14:55       ` roopa
  1 sibling, 1 reply; 38+ messages in thread
From: Jiri Pirko @ 2015-03-05  8:02 UTC (permalink / raw)
  To: roopa; +Cc: Scott Feldman, Netdev, David S. Miller

Wed, Mar 04, 2015 at 09:51:20AM CET, roopa@cumulusnetworks.com wrote:
>On 3/3/15, 11:02 PM, Scott Feldman wrote:
>>On Tue, Mar 3, 2015 at 4:15 PM,  <roopa@cumulusnetworks.com> wrote:
>>>From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>
>>>With the recent addition of the NETIF_F_HW_SWITCH_OFFLOAD flag
>>>on rocker ports, the second command (bridge link set) below will turn off
>>>learning in the rocker hw (Scott/Jiri, need some confirmation from
>>>you that this is indeed a problem and if the below patch is ok).
>>>
>>>ip link set dev swp1 master br0
>>>bridge link set dev swp1 learning off master
>>>bridge link set dev swp1 learning_sync on self
>>>
>>>This patch fixes rocker to ignore learning setting when 'master'
>>>is set. This makes it possible to set/unset learning in kernel and bridge
>>>driver independently.
>>>
>>>The below command will continue to set learning on in both kernel and rocker
>>>hw:
>>>bridge link set dev swp1 learning on
>>>
>>>Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>---
>>>  drivers/net/ethernet/rocker/rocker.c |    3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>>diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
>>>index e5a15a4..d7c31d2 100644
>>>--- a/drivers/net/ethernet/rocker/rocker.c
>>>+++ b/drivers/net/ethernet/rocker/rocker.c
>>>@@ -3769,6 +3769,9 @@ static int rocker_port_bridge_setlink(struct net_device *dev,
>>>         struct nlattr *attr;
>>>         int err;
>>>
>>>+       if (flags && !(flags & BRIDGE_FLAGS_SELF))
>>>+               return 0;
>>>+
>>>         protinfo = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg),
>>>                                    IFLA_PROTINFO);
>>>         if (protinfo) {
>>
>>NACK on this patch.  This is the problem with netlink creeping into
>>ndo ops: it's too tempting to push work-arounds down to driver.
>netlink has already crept into the driver. You do parse the netlink message
>right below.
>The patch actually does not touch the message. It is just using one of the
>args already passed to the handler.
>
>>In this case, you're making the driver check to see if it's SELF when
>>it's already SELF by definition.
>>
>>Rocker setlink wasn't broken prior to the NETIF_F_HW_SWITCH_OFFLOAD
>>patches.
>>  Now it is,
>sure, I can submit a patch to remove the flag on rocker ports if thats what
>you prefer.

I don't believe that Scott prefers that. The offload flag is there not
only for this case, but also for many future switch offloading cases.
Rocker port have to have that port set by default. I do not really
understand why you suggest removing it...


>
>>but this isn't the right fix.
>>
>>Can we revisit this so these two commands only hit MASTER:
>>
>>    bridge link set dev swp1 learning on
>>    bridge link set dev swp1 learning on master
>>
>>And this one hits SELF:
>>
>>     bridge link set dev swp1 learning on self
>
>For the above to work you just need to remove the feature flag in the driver (i can submit a patch).
>
>The reason why it is useful to have it the way it currently is:
>
>the setlink request does not always only contain the 'learning' flag.
>It handles vlans too. I dont see rocker parsing vlans yet ie IFLA_AF_SPEC (or
>did i miss it ?)
>and in those cases I thought it would be nice to pass the vlan info to the
>rocker ports when it
>is added to the bridge driver.
>
>For example the below command will hit the kernel bridge driver and rocker
>today.
>
>bridge vlan add vid 10-20 dev swp1
>
>If you remove the feature flag, you will have to run both
>
>bridge vlan add vid 10-20 dev swp1
>bridge vlan add vid 10-20 dev swp1 self
>
>
>Let me know,
>Thanks,
>Roopa
>
>
>
>
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-04  0:15 [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler roopa
  2015-03-04  4:15 ` John Fastabend
  2015-03-04  7:02 ` Scott Feldman
@ 2015-03-05  8:36 ` Jiri Pirko
  2015-03-05 15:01   ` roopa
  2 siblings, 1 reply; 38+ messages in thread
From: Jiri Pirko @ 2015-03-05  8:36 UTC (permalink / raw)
  To: roopa; +Cc: sfeldma, netdev, davem

Wed, Mar 04, 2015 at 01:15:29AM CET, roopa@cumulusnetworks.com wrote:
>From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
>With the recent addition of the NETIF_F_HW_SWITCH_OFFLOAD flag
>on rocker ports, the second command (bridge link set) below will turn off
>learning in the rocker hw (Scott/Jiri, need some confirmation from
>you that this is indeed a problem and if the below patch is ok).
>
>ip link set dev swp1 master br0
>bridge link set dev swp1 learning off master
>bridge link set dev swp1 learning_sync on self
>
>This patch fixes rocker to ignore learning setting when 'master'
>is set. This makes it possible to set/unset learning in kernel and bridge
>driver independently.
>
>The below command will continue to set learning on in both kernel and rocker
>hw:
>bridge link set dev swp1 learning on
>
>Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>---
> drivers/net/ethernet/rocker/rocker.c |    3 +++
> 1 file changed, 3 insertions(+)
>
>diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
>index e5a15a4..d7c31d2 100644
>--- a/drivers/net/ethernet/rocker/rocker.c
>+++ b/drivers/net/ethernet/rocker/rocker.c
>@@ -3769,6 +3769,9 @@ static int rocker_port_bridge_setlink(struct net_device *dev,
> 	struct nlattr *attr;
> 	int err;
> 
>+	if (flags && !(flags & BRIDGE_FLAGS_SELF))
>+		return 0;
>+

I think that the called should handle this case, not every particular
driver. How about to put this into rtnl_bridge_setlink?


> 	protinfo = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg),
> 				   IFLA_PROTINFO);
> 	if (protinfo) {
>-- 
>1.7.10.4
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-05  8:02     ` Jiri Pirko
@ 2015-03-05 14:55       ` roopa
  2015-03-05 20:06         ` Scott Feldman
  0 siblings, 1 reply; 38+ messages in thread
From: roopa @ 2015-03-05 14:55 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Scott Feldman, Netdev, David S. Miller

On 3/5/15, 12:02 AM, Jiri Pirko wrote:
> Wed, Mar 04, 2015 at 09:51:20AM CET, roopa@cumulusnetworks.com wrote:
>> On 3/3/15, 11:02 PM, Scott Feldman wrote:
>>> On Tue, Mar 3, 2015 at 4:15 PM,  <roopa@cumulusnetworks.com> wrote:
>>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>
>>>> With the recent addition of the NETIF_F_HW_SWITCH_OFFLOAD flag
>>>> on rocker ports, the second command (bridge link set) below will turn off
>>>> learning in the rocker hw (Scott/Jiri, need some confirmation from
>>>> you that this is indeed a problem and if the below patch is ok).
>>>>
>>>> ip link set dev swp1 master br0
>>>> bridge link set dev swp1 learning off master
>>>> bridge link set dev swp1 learning_sync on self
>>>>
>>>> This patch fixes rocker to ignore learning setting when 'master'
>>>> is set. This makes it possible to set/unset learning in kernel and bridge
>>>> driver independently.
>>>>
>>>> The below command will continue to set learning on in both kernel and rocker
>>>> hw:
>>>> bridge link set dev swp1 learning on
>>>>
>>>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>> ---
>>>>   drivers/net/ethernet/rocker/rocker.c |    3 +++
>>>>   1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
>>>> index e5a15a4..d7c31d2 100644
>>>> --- a/drivers/net/ethernet/rocker/rocker.c
>>>> +++ b/drivers/net/ethernet/rocker/rocker.c
>>>> @@ -3769,6 +3769,9 @@ static int rocker_port_bridge_setlink(struct net_device *dev,
>>>>          struct nlattr *attr;
>>>>          int err;
>>>>
>>>> +       if (flags && !(flags & BRIDGE_FLAGS_SELF))
>>>> +               return 0;
>>>> +
>>>>          protinfo = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg),
>>>>                                     IFLA_PROTINFO);
>>>>          if (protinfo) {
>>> NACK on this patch.  This is the problem with netlink creeping into
>>> ndo ops: it's too tempting to push work-arounds down to driver.
>> netlink has already crept into the driver. You do parse the netlink message
>> right below.
>> The patch actually does not touch the message. It is just using one of the
>> args already passed to the handler.
>>
>>> In this case, you're making the driver check to see if it's SELF when
>>> it's already SELF by definition.
>>>
>>> Rocker setlink wasn't broken prior to the NETIF_F_HW_SWITCH_OFFLOAD
>>> patches.
>>>   Now it is,
>> sure, I can submit a patch to remove the flag on rocker ports if thats what
>> you prefer.
> I don't believe that Scott prefers that. The offload flag is there not
> only for this case, but also for many future switch offloading cases.
I very well understand that.
> Rocker port have to have that port set by default. I do not really
> understand why you suggest removing it...
>
I was just making sure scott is on board with the use of the flag.
I added it on rocker ports for l2 and learning might be broken because 
of that. Which i am trying to fix.
also, scotts l3 patches dont seem to use the flag.
So, if the patch in this thread is not the right way to fix it..,.i was 
just trying to give scott an option to remove the flag to keep l2 
working and for him to bring back the flag when he is ready.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-05  8:36 ` Jiri Pirko
@ 2015-03-05 15:01   ` roopa
  2015-03-05 15:09     ` roopa
  0 siblings, 1 reply; 38+ messages in thread
From: roopa @ 2015-03-05 15:01 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: sfeldma, netdev, davem

On 3/5/15, 12:36 AM, Jiri Pirko wrote:
> Wed, Mar 04, 2015 at 01:15:29AM CET, roopa@cumulusnetworks.com wrote:
>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>
>> With the recent addition of the NETIF_F_HW_SWITCH_OFFLOAD flag
>> on rocker ports, the second command (bridge link set) below will turn off
>> learning in the rocker hw (Scott/Jiri, need some confirmation from
>> you that this is indeed a problem and if the below patch is ok).
>>
>> ip link set dev swp1 master br0
>> bridge link set dev swp1 learning off master
>> bridge link set dev swp1 learning_sync on self
>>
>> This patch fixes rocker to ignore learning setting when 'master'
>> is set. This makes it possible to set/unset learning in kernel and bridge
>> driver independently.
>>
>> The below command will continue to set learning on in both kernel and rocker
>> hw:
>> bridge link set dev swp1 learning on
>>
>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>> ---
>> drivers/net/ethernet/rocker/rocker.c |    3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
>> index e5a15a4..d7c31d2 100644
>> --- a/drivers/net/ethernet/rocker/rocker.c
>> +++ b/drivers/net/ethernet/rocker/rocker.c
>> @@ -3769,6 +3769,9 @@ static int rocker_port_bridge_setlink(struct net_device *dev,
>> 	struct nlattr *attr;
>> 	int err;
>>
>> +	if (flags && !(flags & BRIDGE_FLAGS_SELF))
>> +		return 0;
>> +
> I think that the called should handle this case, not every particular
> driver. How about to put this into rtnl_bridge_setlink?

The problem is setlink can contain many more attributes than just learning.
And for most attributes you want to mirror them in hardware.
Only a few attributes like learning, you want to be able to set them in 
the kernel driver
differently than in hardware.

We have previously discussed encoding hw/sw in the upper bits of bridge 
port attributes.
I was trying to avoid that and use the existing master/self to achieve 
the same.

I will repost alternatives if I can think of something.

Thanks,
Roopa

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-05 15:01   ` roopa
@ 2015-03-05 15:09     ` roopa
  0 siblings, 0 replies; 38+ messages in thread
From: roopa @ 2015-03-05 15:09 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: sfeldma, netdev, davem

On 3/5/15, 7:01 AM, roopa wrote:
> On 3/5/15, 12:36 AM, Jiri Pirko wrote:
>> Wed, Mar 04, 2015 at 01:15:29AM CET, roopa@cumulusnetworks.com wrote:
>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>
>>> With the recent addition of the NETIF_F_HW_SWITCH_OFFLOAD flag
>>> on rocker ports, the second command (bridge link set) below will 
>>> turn off
>>> learning in the rocker hw (Scott/Jiri, need some confirmation from
>>> you that this is indeed a problem and if the below patch is ok).
>>>
>>> ip link set dev swp1 master br0
>>> bridge link set dev swp1 learning off master
>>> bridge link set dev swp1 learning_sync on self
>>>
>>> This patch fixes rocker to ignore learning setting when 'master'
>>> is set. This makes it possible to set/unset learning in kernel and 
>>> bridge
>>> driver independently.
>>>
>>> The below command will continue to set learning on in both kernel 
>>> and rocker
>>> hw:
>>> bridge link set dev swp1 learning on
>>>
>>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>>> ---
>>> drivers/net/ethernet/rocker/rocker.c |    3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/rocker/rocker.c 
>>> b/drivers/net/ethernet/rocker/rocker.c
>>> index e5a15a4..d7c31d2 100644
>>> --- a/drivers/net/ethernet/rocker/rocker.c
>>> +++ b/drivers/net/ethernet/rocker/rocker.c
>>> @@ -3769,6 +3769,9 @@ static int rocker_port_bridge_setlink(struct 
>>> net_device *dev,
>>>     struct nlattr *attr;
>>>     int err;
>>>
>>> +    if (flags && !(flags & BRIDGE_FLAGS_SELF))
>>> +        return 0;
>>> +
>> I think that the called should handle this case, not every particular
>> driver. How about to put this into rtnl_bridge_setlink?
>
> The problem is setlink can contain many more attributes than just 
> learning.
> And for most attributes you want to mirror them in hardware.
> Only a few attributes like learning, you want to be able to set them 
> in the kernel driver
> differently than in hardware.
>
> We have previously discussed encoding hw/sw in the upper bits of 
> bridge port attributes.
> I was trying to avoid that and use the existing master/self to achieve 
> the same.
>
> I will repost alternatives if I can think of something.
Just adding more clarification to my previous comment: I have not 
confirmed if learning in rocker is really broken.
My patch comment specifically says that the sequence of commands listed 
may turn off learning in rocker hardware unintentionally.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-05 14:55       ` roopa
@ 2015-03-05 20:06         ` Scott Feldman
  2015-03-05 20:43           ` roopa
  0 siblings, 1 reply; 38+ messages in thread
From: Scott Feldman @ 2015-03-05 20:06 UTC (permalink / raw)
  To: roopa; +Cc: Jiri Pirko, Netdev, David S. Miller

On Thu, Mar 5, 2015 at 6:55 AM, roopa <roopa@cumulusnetworks.com> wrote:
> On 3/5/15, 12:02 AM, Jiri Pirko wrote:
>>
>> Wed, Mar 04, 2015 at 09:51:20AM CET, roopa@cumulusnetworks.com wrote:
>>>
>>> On 3/3/15, 11:02 PM, Scott Feldman wrote:

[cut]

>>>> Rocker setlink wasn't broken prior to the NETIF_F_HW_SWITCH_OFFLOAD
>>>> patches.
>>>>   Now it is,
>>>
>>> sure, I can submit a patch to remove the flag on rocker ports if thats
>>> what
>>> you prefer.
>>
>> I don't believe that Scott prefers that. The offload flag is there not
>> only for this case, but also for many future switch offloading cases.
>
> I very well understand that.
>>
>> Rocker port have to have that port set by default. I do not really
>> understand why you suggest removing it...
>>
> I was just making sure scott is on board with the use of the flag.
> I added it on rocker ports for l2 and learning might be broken because of
> that. Which i am trying to fix.
> also, scotts l3 patches dont seem to use the flag.

That's an oversight due to work starting on L3 patches way before
NETIF_F_HW_SWITCH_OFFLOAD was introduced ; I'll add checks for that
flag to the L3 patches.

> So, if the patch in this thread is not the right way to fix it..,.i was just
> trying to give scott an option to remove the flag to keep l2 working and for
> him to bring back the flag when he is ready.

When I'm ready for what?  I ready for it to work like it did before
NETIF_F_HW_SWITCH_OFFLOAD and your setlink changes went in.  ;-)

We want two sets of IFLA_PROTINFO attrs for a bridge port.   The first
set is the bridge side of the port, so these attrs like learning or
flooding control the how the bridge (software) manages the port.  The
second set is the device side of the port, so attrs here control how
the (offload) device manages the port.  The two sets are independent.
For example, you could have learning turned OFF on the bridge side and
learning turned ON on the device side.  The device side will
initialize its set.  The bridge will initialize its set.   The user
can see both sets with cmd

    bridge -d link show.

The user can set the bridge's attr with one of cmds:

    bridge link set <attr> dev <port>
    bridge link set <attr> dev <port> master

The user can set the device's attr with cmd:

    bridge link set <attr> dev <port> self

The driver setlink/getlink ops should only be called in the SELF
context because the driver is the SELF side of the port.  Somehow we
got away from that.   We talked about IFLA_AF_SPEC being handled
differently for VLANs.  For IFLA_PROTINFO, what do we need to get back
to what I have above?

-scott

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-05 20:06         ` Scott Feldman
@ 2015-03-05 20:43           ` roopa
  2015-03-05 21:40             ` roopa
  2015-03-06  9:52             ` Scott Feldman
  0 siblings, 2 replies; 38+ messages in thread
From: roopa @ 2015-03-05 20:43 UTC (permalink / raw)
  To: Scott Feldman; +Cc: Jiri Pirko, Netdev, David S. Miller

On 3/5/15, 12:06 PM, Scott Feldman wrote:
> On Thu, Mar 5, 2015 at 6:55 AM, roopa <roopa@cumulusnetworks.com> wrote:
>> On 3/5/15, 12:02 AM, Jiri Pirko wrote:
>>> Wed, Mar 04, 2015 at 09:51:20AM CET, roopa@cumulusnetworks.com wrote:
>>>> On 3/3/15, 11:02 PM, Scott Feldman wrote:
> [cut]
>
>>>>> Rocker setlink wasn't broken prior to the NETIF_F_HW_SWITCH_OFFLOAD
>>>>> patches.
>>>>>    Now it is,
>>>> sure, I can submit a patch to remove the flag on rocker ports if thats
>>>> what
>>>> you prefer.
>>> I don't believe that Scott prefers that. The offload flag is there not
>>> only for this case, but also for many future switch offloading cases.
>> I very well understand that.
>>> Rocker port have to have that port set by default. I do not really
>>> understand why you suggest removing it...
>>>
>> I was just making sure scott is on board with the use of the flag.
>> I added it on rocker ports for l2 and learning might be broken because of
>> that. Which i am trying to fix.
>> also, scotts l3 patches dont seem to use the flag.
> That's an oversight due to work starting on L3 patches way before
> NETIF_F_HW_SWITCH_OFFLOAD was introduced ; I'll add checks for that
> flag to the L3 patches.
ok, i was not sure if you were planning to. I did post a comment on v2.
>> So, if the patch in this thread is not the right way to fix it..,.i was just
>> trying to give scott an option to remove the flag to keep l2 working and for
>> him to bring back the flag when he is ready.
> When I'm ready for what?  I ready for it to work like it did before
> NETIF_F_HW_SWITCH_OFFLOAD and your setlink changes went in.  ;-)
well, to be fair scott, my current patch in this thread is trying to get 
to it, if you let me :)

>
> We want two sets of IFLA_PROTINFO attrs for a bridge port.   The first
> set is the bridge side of the port, so these attrs like learning or
> flooding control the how the bridge (software) manages the port.  The
> second set is the device side of the port, so attrs here control how
> the (offload) device manages the port.  The two sets are independent.
> For example, you could have learning turned OFF on the bridge side and
> learning turned ON on the device side.  The device side will
> initialize its set.  The bridge will initialize its set.   The user
> can see both sets with cmd
>
>      bridge -d link show.
>
> The user can set the bridge's attr with one of cmds:
>
>      bridge link set <attr> dev <port>
>      bridge link set <attr> dev <port> master
>
> The user can set the device's attr with cmd:
>
>      bridge link set <attr> dev <port> self
>
> The driver setlink/getlink ops should only be called in the SELF
> context because the driver is the SELF side of the port.  Somehow we
> got away from that.
True, If all attributes that the bridge setlink sets need to go 
separately to kernel and hardware and
  we will never mirror them. That is not the case however. vlans is an 
example and there will be more bridge port attributes in the future.

>   We talked about IFLA_AF_SPEC being handled
> differently for VLANs.  For IFLA_PROTINFO, what do we need to get back
> to what I have above?

If I understand you correctly, you are saying that ndo_bridge_setlink 
when it comes to offloads
should only be used for IFLA_PROTINFO ?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-05 20:43           ` roopa
@ 2015-03-05 21:40             ` roopa
  2015-03-06  9:52             ` Scott Feldman
  1 sibling, 0 replies; 38+ messages in thread
From: roopa @ 2015-03-05 21:40 UTC (permalink / raw)
  To: Scott Feldman; +Cc: Jiri Pirko, Netdev, David S. Miller

On 3/5/15, 12:43 PM, roopa wrote:
> On 3/5/15, 12:06 PM, Scott Feldman wrote:
>> On Thu, Mar 5, 2015 at 6:55 AM, roopa <roopa@cumulusnetworks.com> wrote:
>>> On 3/5/15, 12:02 AM, Jiri Pirko wrote:
>>>> Wed, Mar 04, 2015 at 09:51:20AM CET, roopa@cumulusnetworks.com wrote:
>>>>> On 3/3/15, 11:02 PM, Scott Feldman wrote:
>> [cut]
>>
>>>>>> Rocker setlink wasn't broken prior to the NETIF_F_HW_SWITCH_OFFLOAD
>>>>>> patches.
>>>>>>    Now it is,
>>>>> sure, I can submit a patch to remove the flag on rocker ports if 
>>>>> thats
>>>>> what
>>>>> you prefer.
>>>> I don't believe that Scott prefers that. The offload flag is there not
>>>> only for this case, but also for many future switch offloading cases.
>>> I very well understand that.
>>>> Rocker port have to have that port set by default. I do not really
>>>> understand why you suggest removing it...
>>>>
>>> I was just making sure scott is on board with the use of the flag.
>>> I added it on rocker ports for l2 and learning might be broken 
>>> because of
>>> that. Which i am trying to fix.
>>> also, scotts l3 patches dont seem to use the flag.
>> That's an oversight due to work starting on L3 patches way before
>> NETIF_F_HW_SWITCH_OFFLOAD was introduced ; I'll add checks for that
>> flag to the L3 patches.
> ok, i was not sure if you were planning to. I did post a comment on v2.
>>> So, if the patch in this thread is not the right way to fix it..,.i 
>>> was just
>>> trying to give scott an option to remove the flag to keep l2 working 
>>> and for
>>> him to bring back the flag when he is ready.
>> When I'm ready for what?  I ready for it to work like it did before
>> NETIF_F_HW_SWITCH_OFFLOAD and your setlink changes went in.  ;-)
> well, to be fair scott, my current patch in this thread is trying to 
> get to it, if you let me :)
>
>>
>> We want two sets of IFLA_PROTINFO attrs for a bridge port.   The first
>> set is the bridge side of the port, so these attrs like learning or
>> flooding control the how the bridge (software) manages the port.  The
>> second set is the device side of the port, so attrs here control how
>> the (offload) device manages the port.  The two sets are independent.
>> For example, you could have learning turned OFF on the bridge side and
>> learning turned ON on the device side.  The device side will
>> initialize its set.  The bridge will initialize its set.   The user
>> can see both sets with cmd
>>
>>      bridge -d link show.
>>
>> The user can set the bridge's attr with one of cmds:
>>
>>      bridge link set <attr> dev <port>
>>      bridge link set <attr> dev <port> master
>>
>> The user can set the device's attr with cmd:
>>
>>      bridge link set <attr> dev <port> self
>>
>> The driver setlink/getlink ops should only be called in the SELF
>> context because the driver is the SELF side of the port. Somehow we
>> got away from that.
> True, If all attributes that the bridge setlink sets need to go 
> separately to kernel and hardware and
>  we will never mirror them. That is not the case however. vlans is an 
> example and there will be more bridge port attributes in the future.
>
>>   We talked about IFLA_AF_SPEC being handled
>> differently for VLANs.  For IFLA_PROTINFO, what do we need to get back
>> to what I have above?
>
> If I understand you correctly, you are saying that ndo_bridge_setlink 
> when it comes to offloads
> should only be used for IFLA_PROTINFO ?
>
and...since you brought up IFLA_PROTINFO...i would like to also add 
that, .....my rocker patch in this current thread is nothing but rocker 
driver telling that "i accept IFLA_PROTINFO attributes only if the 
BRIDGE_FLAGS_SELF is set". I had the check around IFLA_PROTINFO 
initially...but since you only seem to be looking at IFLA_PROTINFO in 
the function, I put the check at the beginning of the function.

Thanks,
Roopa

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-05 20:43           ` roopa
  2015-03-05 21:40             ` roopa
@ 2015-03-06  9:52             ` Scott Feldman
  2015-03-08 14:19               ` roopa
  1 sibling, 1 reply; 38+ messages in thread
From: Scott Feldman @ 2015-03-06  9:52 UTC (permalink / raw)
  To: roopa; +Cc: Jiri Pirko, Netdev, David S. Miller

On Thu, Mar 5, 2015 at 12:43 PM, roopa <roopa@cumulusnetworks.com> wrote:
> On 3/5/15, 12:06 PM, Scott Feldman wrote:
>>
>> On Thu, Mar 5, 2015 at 6:55 AM, roopa <roopa@cumulusnetworks.com> wrote:
>>>
>>> On 3/5/15, 12:02 AM, Jiri Pirko wrote:
>>>>
>>>> Wed, Mar 04, 2015 at 09:51:20AM CET, roopa@cumulusnetworks.com wrote:
>>>>>
>>>>> On 3/3/15, 11:02 PM, Scott Feldman wrote:
>>
>> [cut]
>>
>>>>>> Rocker setlink wasn't broken prior to the NETIF_F_HW_SWITCH_OFFLOAD
>>>>>> patches.
>>>>>>    Now it is,
>>>>>
>>>>> sure, I can submit a patch to remove the flag on rocker ports if thats
>>>>> what
>>>>> you prefer.
>>>>
>>>> I don't believe that Scott prefers that. The offload flag is there not
>>>> only for this case, but also for many future switch offloading cases.
>>>
>>> I very well understand that.
>>>>
>>>> Rocker port have to have that port set by default. I do not really
>>>> understand why you suggest removing it...
>>>>
>>> I was just making sure scott is on board with the use of the flag.
>>> I added it on rocker ports for l2 and learning might be broken because of
>>> that. Which i am trying to fix.
>>> also, scotts l3 patches dont seem to use the flag.
>>
>> That's an oversight due to work starting on L3 patches way before
>> NETIF_F_HW_SWITCH_OFFLOAD was introduced ; I'll add checks for that
>> flag to the L3 patches.
>
> ok, i was not sure if you were planning to. I did post a comment on v2.
>>>
>>> So, if the patch in this thread is not the right way to fix it..,.i was
>>> just
>>> trying to give scott an option to remove the flag to keep l2 working and
>>> for
>>> him to bring back the flag when he is ready.
>>
>> When I'm ready for what?  I ready for it to work like it did before
>> NETIF_F_HW_SWITCH_OFFLOAD and your setlink changes went in.  ;-)
>
> well, to be fair scott, my current patch in this thread is trying to get to
> it, if you let me :)
>
>>
>> We want two sets of IFLA_PROTINFO attrs for a bridge port.   The first
>> set is the bridge side of the port, so these attrs like learning or
>> flooding control the how the bridge (software) manages the port.  The
>> second set is the device side of the port, so attrs here control how
>> the (offload) device manages the port.  The two sets are independent.
>> For example, you could have learning turned OFF on the bridge side and
>> learning turned ON on the device side.  The device side will
>> initialize its set.  The bridge will initialize its set.   The user
>> can see both sets with cmd
>>
>>      bridge -d link show.
>>
>> The user can set the bridge's attr with one of cmds:
>>
>>      bridge link set <attr> dev <port>
>>      bridge link set <attr> dev <port> master
>>
>> The user can set the device's attr with cmd:
>>
>>      bridge link set <attr> dev <port> self
>>
>> The driver setlink/getlink ops should only be called in the SELF
>> context because the driver is the SELF side of the port.  Somehow we
>> got away from that.
>
> True, If all attributes that the bridge setlink sets need to go separately
> to kernel and hardware and
>  we will never mirror them. That is not the case however. vlans is an
> example and there will be more bridge port attributes in the future.
>
>>   We talked about IFLA_AF_SPEC being handled
>> differently for VLANs.  For IFLA_PROTINFO, what do we need to get back
>> to what I have above?
>
>
> If I understand you correctly, you are saying that ndo_bridge_setlink when
> it comes to offloads
> should only be used for IFLA_PROTINFO ?

I'm saying port driver's ndo_bridge_setlink shouldn't be called unless
SELF is set.  The port driver shouldn't have to check if it's SELF,
that's my main beef with this patch.  If the app (iproute2) wants to
mirror an attribute, then it can set both MASTER and SELF.  This is
consistent with ndo_fdb ops, where MASTER targets the bridge's FDB and
SELF targets the device's FDB.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-06  9:52             ` Scott Feldman
@ 2015-03-08 14:19               ` roopa
  2015-03-08 23:17                 ` Scott Feldman
  0 siblings, 1 reply; 38+ messages in thread
From: roopa @ 2015-03-08 14:19 UTC (permalink / raw)
  To: Scott Feldman; +Cc: Jiri Pirko, Netdev, David S. Miller

On 3/6/15, 1:52 AM, Scott Feldman wrote:
> On Thu, Mar 5, 2015 at 12:43 PM, roopa <roopa@cumulusnetworks.com> wrote:
>> On 3/5/15, 12:06 PM, Scott Feldman wrote:
>>> On Thu, Mar 5, 2015 at 6:55 AM, roopa <roopa@cumulusnetworks.com> wrote:
>>>> On 3/5/15, 12:02 AM, Jiri Pirko wrote:
>>>>> Wed, Mar 04, 2015 at 09:51:20AM CET, roopa@cumulusnetworks.com wrote:
>>>>>> On 3/3/15, 11:02 PM, Scott Feldman wrote:
>>> [cut]
>>>
>>>>>>> Rocker setlink wasn't broken prior to the NETIF_F_HW_SWITCH_OFFLOAD
>>>>>>> patches.
>>>>>>>     Now it is,
>>>>>> sure, I can submit a patch to remove the flag on rocker ports if thats
>>>>>> what
>>>>>> you prefer.
>>>>> I don't believe that Scott prefers that. The offload flag is there not
>>>>> only for this case, but also for many future switch offloading cases.
>>>> I very well understand that.
>>>>> Rocker port have to have that port set by default. I do not really
>>>>> understand why you suggest removing it...
>>>>>
>>>> I was just making sure scott is on board with the use of the flag.
>>>> I added it on rocker ports for l2 and learning might be broken because of
>>>> that. Which i am trying to fix.
>>>> also, scotts l3 patches dont seem to use the flag.
>>> That's an oversight due to work starting on L3 patches way before
>>> NETIF_F_HW_SWITCH_OFFLOAD was introduced ; I'll add checks for that
>>> flag to the L3 patches.
>> ok, i was not sure if you were planning to. I did post a comment on v2.
>>>> So, if the patch in this thread is not the right way to fix it..,.i was
>>>> just
>>>> trying to give scott an option to remove the flag to keep l2 working and
>>>> for
>>>> him to bring back the flag when he is ready.
>>> When I'm ready for what?  I ready for it to work like it did before
>>> NETIF_F_HW_SWITCH_OFFLOAD and your setlink changes went in.  ;-)
>> well, to be fair scott, my current patch in this thread is trying to get to
>> it, if you let me :)
>>
>>> We want two sets of IFLA_PROTINFO attrs for a bridge port.   The first
>>> set is the bridge side of the port, so these attrs like learning or
>>> flooding control the how the bridge (software) manages the port.  The
>>> second set is the device side of the port, so attrs here control how
>>> the (offload) device manages the port.  The two sets are independent.
>>> For example, you could have learning turned OFF on the bridge side and
>>> learning turned ON on the device side.  The device side will
>>> initialize its set.  The bridge will initialize its set.   The user
>>> can see both sets with cmd
>>>
>>>       bridge -d link show.
>>>
>>> The user can set the bridge's attr with one of cmds:
>>>
>>>       bridge link set <attr> dev <port>
>>>       bridge link set <attr> dev <port> master
>>>
>>> The user can set the device's attr with cmd:
>>>
>>>       bridge link set <attr> dev <port> self
>>>
>>> The driver setlink/getlink ops should only be called in the SELF
>>> context because the driver is the SELF side of the port.  Somehow we
>>> got away from that.
>> True, If all attributes that the bridge setlink sets need to go separately
>> to kernel and hardware and
>>   we will never mirror them. That is not the case however. vlans is an
>> example and there will be more bridge port attributes in the future.
>>
>>>    We talked about IFLA_AF_SPEC being handled
>>> differently for VLANs.  For IFLA_PROTINFO, what do we need to get back
>>> to what I have above?
>>
>> If I understand you correctly, you are saying that ndo_bridge_setlink when
>> it comes to offloads
>> should only be used for IFLA_PROTINFO ?
> I'm saying port driver's ndo_bridge_setlink shouldn't be called unless
> SELF is set.  The port driver shouldn't have to check if it's SELF,
> that's my main beef with this patch.  If the app (iproute2) wants to
> mirror an attribute, then it can set both MASTER and SELF.  This is
> consistent with ndo_fdb ops, where MASTER targets the bridge's FDB and
> SELF targets the device's FDB.

bridge setlink/dellink ndo sets/clears bridge port flags, vlans.

4 ways to the ndo op gets called (value in brackets is the bridge flags):

1. ndo_bridge_setlink ()
2. ndo_bridge_setlink (master)
3. ndo_bridge_setlink (self)
4. ndo_bridge_setlink (master | self)


in case of 1) and 2) today, bridge driver also calls into the port 
driver (as part of NETIF_F_HW_SWITCH_OFFLOAD) because vlans and other 
port flags need to be offloaded if they are set on master (I understand 
that rocker does not use bridge setlink ndo op for vlans. But it is 
completely valid for a switch driver to use the ndo setlink op for 
offloading vlan config)

One thing i can do is the NETIF_F_HW_SWITCH_OFFLOAD case only applies in 
case 1 above. ie when bridge flags is zero.

which leads to the following commands to disable learning in the bridge 
driver and enable it in hw,

1. bridge link learning on - enable learning in both rocker and bridge 
driver due to the NETIF_F_HW_SWITCH_OFFLOAD flag
2. bridge link learning off master - learning disable only in the bridge 
driver

With this i think we hit a compromise, rocker setlink can be called when 
flags is zero, but we still get a valid sequence of commands to 
enable/disable certain port flags only in the bridge driver or port 
driver without having an explicit self check in rocker.

(my work laptop is broken..., I will re-post the patch tomorrow if you 
are ok with the plan)

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-08 14:19               ` roopa
@ 2015-03-08 23:17                 ` Scott Feldman
  2015-03-09  0:20                   ` roopa
       [not found]                   ` <CAJieiUhHdXOZjWkb4s_GviLwzq5Gct-1o8xv8b-JeM46S4e-dg@mail.gmail.com>
  0 siblings, 2 replies; 38+ messages in thread
From: Scott Feldman @ 2015-03-08 23:17 UTC (permalink / raw)
  To: roopa; +Cc: Jiri Pirko, Netdev, David S. Miller

On Sun, Mar 8, 2015 at 7:19 AM, roopa <roopa@cumulusnetworks.com> wrote:
> On 3/6/15, 1:52 AM, Scott Feldman wrote:

[cut]

>> I'm saying port driver's ndo_bridge_setlink shouldn't be called unless
>> SELF is set.  The port driver shouldn't have to check if it's SELF,
>> that's my main beef with this patch.  If the app (iproute2) wants to
>> mirror an attribute, then it can set both MASTER and SELF.  This is
>> consistent with ndo_fdb ops, where MASTER targets the bridge's FDB and
>> SELF targets the device's FDB.
>
>
> bridge setlink/dellink ndo sets/clears bridge port flags, vlans.
>
> 4 ways to the ndo op gets called (value in brackets is the bridge flags):
>
> 1. ndo_bridge_setlink ()
> 2. ndo_bridge_setlink (master)
> 3. ndo_bridge_setlink (self)
> 4. ndo_bridge_setlink (master | self)
>
>
> in case of 1) and 2) today, bridge driver also calls into the port driver
> (as part of NETIF_F_HW_SWITCH_OFFLOAD) because vlans and other port flags
> need to be offloaded if they are set on master (I understand that rocker
> does not use bridge setlink ndo op for vlans. But it is completely valid for
> a switch driver to use the ndo setlink op for offloading vlan config)
>
> One thing i can do is the NETIF_F_HW_SWITCH_OFFLOAD case only applies in
> case 1 above. ie when bridge flags is zero.
>
> which leads to the following commands to disable learning in the bridge
> driver and enable it in hw,
>
> 1. bridge link learning on - enable learning in both rocker and bridge
> driver due to the NETIF_F_HW_SWITCH_OFFLOAD flag
> 2. bridge link learning off master - learning disable only in the bridge
> driver
>
> With this i think we hit a compromise, rocker setlink can be called when
> flags is zero, but we still get a valid sequence of commands to
> enable/disable certain port flags only in the bridge driver or port driver
> without having an explicit self check in rocker.
>
> (my work laptop is broken..., I will re-post the patch tomorrow if you are
> ok with the plan)

I'm not OK with the plan; it doesn't address the issue.  I'm saying
port driver's ndo_bridge_setlink shouldn't be called unless SELF is
set.

We have this currently, looking from the user's cmd:

bridge link set                               calls into bridge and port driver
bridge link set master                    calls into bridge and port driver
bridge link set self                         calls into port driver
bridge link set self master              calls into bridge and port driver

Your proposal above changes it to:

bridge link set                               calls into bridge and port driver
bridge link set master                    calls into bridge
bridge link set self                         calls into port driver
bridge link set self master              calls into bridge and port driver

Which is still no good because the default case (neither self or
master explicitly set) calls into the port driver.  Before all of
these changes, we had:

bridge link set                               calls into bridge
bridge link set master                    calls into bridge
bridge link set hwmode veb|vepa     calls into port driver

hwmode was a way to set SELF.  We wanted to add "self" flag to command
to be more like bridge fdb cmd.  I assumed the default case (bridge
set link) would continue to work as before.  So we'd have:

bridge link set                               calls into bridge
bridge link set master                    calls into bridge
bridge link set self                         calls into port driver
bridge link set self master              calls into bridge and port driver

But that's not what we got.  Your changes changed the default case
(and the bridge link set master case) to call into both bridge and
port driver even though user didn't specify self.

How hard would it be to get back to original default behavior?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-08 23:17                 ` Scott Feldman
@ 2015-03-09  0:20                   ` roopa
       [not found]                   ` <CAJieiUhHdXOZjWkb4s_GviLwzq5Gct-1o8xv8b-JeM46S4e-dg@mail.gmail.com>
  1 sibling, 0 replies; 38+ messages in thread
From: roopa @ 2015-03-09  0:20 UTC (permalink / raw)
  To: Scott Feldman; +Cc: Jiri Pirko, Netdev, David S. Miller

On 3/8/15, 4:17 PM, Scott Feldman wrote:
> On Sun, Mar 8, 2015 at 7:19 AM, roopa <roopa@cumulusnetworks.com> wrote:
>> On 3/6/15, 1:52 AM, Scott Feldman wrote:
> [cut]
>
>>> I'm saying port driver's ndo_bridge_setlink shouldn't be called unless
>>> SELF is set.  The port driver shouldn't have to check if it's SELF,
>>> that's my main beef with this patch.  If the app (iproute2) wants to
>>> mirror an attribute, then it can set both MASTER and SELF.  This is
>>> consistent with ndo_fdb ops, where MASTER targets the bridge's FDB and
>>> SELF targets the device's FDB.
>>
>> bridge setlink/dellink ndo sets/clears bridge port flags, vlans.
>>
>> 4 ways to the ndo op gets called (value in brackets is the bridge flags):
>>
>> 1. ndo_bridge_setlink ()
>> 2. ndo_bridge_setlink (master)
>> 3. ndo_bridge_setlink (self)
>> 4. ndo_bridge_setlink (master | self)
>>
>>
>> in case of 1) and 2) today, bridge driver also calls into the port driver
>> (as part of NETIF_F_HW_SWITCH_OFFLOAD) because vlans and other port flags
>> need to be offloaded if they are set on master (I understand that rocker
>> does not use bridge setlink ndo op for vlans. But it is completely valid for
>> a switch driver to use the ndo setlink op for offloading vlan config)
>>
>> One thing i can do is the NETIF_F_HW_SWITCH_OFFLOAD case only applies in
>> case 1 above. ie when bridge flags is zero.
>>
>> which leads to the following commands to disable learning in the bridge
>> driver and enable it in hw,
>>
>> 1. bridge link learning on - enable learning in both rocker and bridge
>> driver due to the NETIF_F_HW_SWITCH_OFFLOAD flag
>> 2. bridge link learning off master - learning disable only in the bridge
>> driver
>>
>> With this i think we hit a compromise, rocker setlink can be called when
>> flags is zero, but we still get a valid sequence of commands to
>> enable/disable certain port flags only in the bridge driver or port driver
>> without having an explicit self check in rocker.
>>
>> (my work laptop is broken..., I will re-post the patch tomorrow if you are
>> ok with the plan)
> I'm not OK with the plan; it doesn't address the issue.  I'm saying
> port driver's ndo_bridge_setlink shouldn't be called unless SELF is
> set.

but, why scott ?. With l2, l3 and all other offloads, we try not to 
change user commands.
Before switchdev 'self' was always used to go to nic drivers directly to 
program fdb.
l2 never used the bridge driver.

with switchdev l2 offloads, we are using the bridge driver.
But there is a need to turn off/on some attributes in hw independently of
attributes in the kernel bridge driver. To that i had suggested we 
continue to use 'self'
for those attributes without introducing a new hw mode.
>
> We have this currently, looking from the user's cmd:
>
> bridge link set                               calls into bridge and port driver
> bridge link set master                    calls into bridge and port driver
> bridge link set self                         calls into port driver
> bridge link set self master              calls into bridge and port driver
>
> Your proposal above changes it to:
>
> bridge link set                               calls into bridge and port driver
> bridge link set master                    calls into bridge
> bridge link set self                         calls into port driver
> bridge link set self master              calls into bridge and port driver
>
> Which is still no good because the default case (neither self or
> master explicitly set) calls into the port driver.  Before all of
> these changes, we had:
>
> bridge link set                               calls into bridge
> bridge link set master                    calls into bridge
> bridge link set hwmode veb|vepa     calls into port driver
>
> hwmode was a way to set SELF.  We wanted to add "self" flag to command
> to be more like bridge fdb cmd.  I assumed the default case (bridge
> set link) would continue to work as before.  So we'd have:
>
> bridge link set                               calls into bridge
> bridge link set master                    calls into bridge
> bridge link set self                         calls into port driver
> bridge link set self master              calls into bridge and port driver
>
> But that's not what we got.  Your changes changed the default case
> (and the bridge link set master case) to call into both bridge and
> port driver even though user didn't specify self.

I have made sure that no default case has changed. It works this way 
only when
the driver advertises offload using NETIF_F_HW_SWITCH_OFFLOAD.
Which means the driver is interested in involving the bridge driver in 
all l2 offloads
(And this includes all bridge ndo ops setlink/dellink/stp/fdb).

http://www.spinics.net/lists/netdev/msg314407.html
>
> How hard would it be to get back to original default behavior?

Am not sure which default behavior you are talking about. If you are 
talking about default behavior before switchdev,
that has not changed. For switch devices which want the bridge driver 
involved and who advertise the NETIF_F_HW_SWITCH_OFFLOAD flag, this is 
new behavior. And rocker does advertise this flag today.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
       [not found]                   ` <CAJieiUhHdXOZjWkb4s_GviLwzq5Gct-1o8xv8b-JeM46S4e-dg@mail.gmail.com>
@ 2015-03-09  6:40                     ` Jiri Pirko
  2015-03-09 15:59                       ` Arad, Ronen
  0 siblings, 1 reply; 38+ messages in thread
From: Jiri Pirko @ 2015-03-09  6:40 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: Scott Feldman, Netdev, David S. Miller

Mon, Mar 09, 2015 at 01:12:47AM CET, roopa@cumulusnetworks.com wrote:
>On Sun, Mar 8, 2015 at 4:17 PM, Scott Feldman <sfeldma@gmail.com> wrote:
>
>> On Sun, Mar 8, 2015 at 7:19 AM, roopa <roopa@cumulusnetworks.com> wrote:
>> > On 3/6/15, 1:52 AM, Scott Feldman wrote:
>>
>> [cut]
>>
>> >> I'm saying port driver's ndo_bridge_setlink shouldn't be called unless
>> >> SELF is set.  The port driver shouldn't have to check if it's SELF,
>> >> that's my main beef with this patch.  If the app (iproute2) wants to
>> >> mirror an attribute, then it can set both MASTER and SELF.  This is
>> >> consistent with ndo_fdb ops, where MASTER targets the bridge's FDB and
>> >> SELF targets the device's FDB.
>> >
>> >
>> > bridge setlink/dellink ndo sets/clears bridge port flags, vlans.
>> >
>> > 4 ways to the ndo op gets called (value in brackets is the bridge flags):
>> >
>> > 1. ndo_bridge_setlink ()
>> > 2. ndo_bridge_setlink (master)
>> > 3. ndo_bridge_setlink (self)
>> > 4. ndo_bridge_setlink (master | self)
>> >
>> >
>> > in case of 1) and 2) today, bridge driver also calls into the port driver
>> > (as part of NETIF_F_HW_SWITCH_OFFLOAD) because vlans and other port flags
>> > need to be offloaded if they are set on master (I understand that rocker
>> > does not use bridge setlink ndo op for vlans. But it is completely valid
>> for
>> > a switch driver to use the ndo setlink op for offloading vlan config)
>> >
>> > One thing i can do is the NETIF_F_HW_SWITCH_OFFLOAD case only applies in
>> > case 1 above. ie when bridge flags is zero.
>> >
>> > which leads to the following commands to disable learning in the bridge
>> > driver and enable it in hw,
>> >
>> > 1. bridge link learning on - enable learning in both rocker and bridge
>> > driver due to the NETIF_F_HW_SWITCH_OFFLOAD flag
>> > 2. bridge link learning off master - learning disable only in the bridge
>> > driver
>> >
>> > With this i think we hit a compromise, rocker setlink can be called when
>> > flags is zero, but we still get a valid sequence of commands to
>> > enable/disable certain port flags only in the bridge driver or port
>> driver
>> > without having an explicit self check in rocker.
>> >
>> > (my work laptop is broken..., I will re-post the patch tomorrow if you
>> are
>> > ok with the plan)
>>
>> I'm not OK with the plan; it doesn't address the issue.  I'm saying
>> port driver's ndo_bridge_setlink shouldn't be called unless SELF is
>> set.
>>
>
>but, why scott ?. With l2, l3 and all other offloads, we try not to change
>user commands.
>Before switchdev 'self' was always used to go to nic drivers directly to
>program fdb.
>l2 never used the bridge driver.


I also think that is is fine that in case either master or self are not set,
driver ndo setlink is called as well (in case the offload bit is on). It
makes it transparent for user in case he does not care what in under. In
case he cares, he can specify master or self to achieve exactly what he
needs. I like it that way.



>
>with switchdev l2 offloads, we are using the bridge driver.
>But there is a need to turn off/on some attributes in hw independently of
>attributes in the kernel bridge driver. To that i had suggested we continue
>to use 'self'
>for those attributes without introducing a new hw mode.
>
>
>>
>> We have this currently, looking from the user's cmd:
>>
>> bridge link set                               calls into bridge and port
>> driver
>> bridge link set master                    calls into bridge and port driver
>> bridge link set self                         calls into port driver
>> bridge link set self master              calls into bridge and port driver
>>
>> Your proposal above changes it to:
>>
>> bridge link set                               calls into bridge and port
>> driver
>> bridge link set master                    calls into bridge
>> bridge link set self                         calls into port driver
>> bridge link set self master              calls into bridge and port driver
>>
>> Which is still no good because the default case (neither self or
>> master explicitly set) calls into the port driver.  Before all of
>> these changes, we had:
>>
>> bridge link set                               calls into bridge
>> bridge link set master                    calls into bridge
>> bridge link set hwmode veb|vepa     calls into port driver
>>
>> hwmode was a way to set SELF.  We wanted to add "self" flag to command
>> to be more like bridge fdb cmd.  I assumed the default case (bridge
>> set link) would continue to work as before.  So we'd have:
>>
>> bridge link set                               calls into bridge
>> bridge link set master                    calls into bridge
>> bridge link set self                         calls into port driver
>> bridge link set self master              calls into bridge and port driver
>>
>> But that's not what we got.  Your changes changed the default case
>> (and the bridge link set master case) to call into both bridge and
>> port driver even though user didn't specify self.
>>
>
>I have made sure that no default case has changed. It works this way only
>when
>the driver advertises offload using NETIF_F_HW_SWITCH_OFFLOAD.
>Which means the driver is interested in involving the bridge driver in all
>l2 offloads
>(And this includes all bridge ndo ops setlink/dellink/stp/fdb).
>
>http://www.spinics.net/lists/netdev/msg314407.html
>
>
>
>
>>
>> How hard would it be to get back to original default behavior?
>>
>
>Am not sure which default behavior you are talking about. If you are
>talking about default behavior before switchdev,
>that has not changed. For switch devices which want the bridge driver
>involved and who advertise the NETIF_F_HW_SWITCH_OFFLOAD flag, this is new
>behavior. And rocker does advertise this flag today.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-09  6:40                     ` Jiri Pirko
@ 2015-03-09 15:59                       ` Arad, Ronen
  2015-03-09 16:07                         ` Jiri Pirko
       [not found]                         ` <CAJieiUhcdfGitY7rbG11Vt_Beemz8dy3=gKtvbyVLS8O0DkgNw@mail.gmail.com>
  0 siblings, 2 replies; 38+ messages in thread
From: Arad, Ronen @ 2015-03-09 15:59 UTC (permalink / raw)
  To: Jiri Pirko, Roopa Prabhu; +Cc: Scott Feldman, Netdev, David S. Miller



>-----Original Message-----
>From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
>Behalf Of Jiri Pirko
>Sent: Monday, March 09, 2015 6:41 AM
>To: Roopa Prabhu
>Cc: Scott Feldman; Netdev; David S. Miller
>Subject: Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge
>setlink handler
>
>Mon, Mar 09, 2015 at 01:12:47AM CET, roopa@cumulusnetworks.com wrote:
>>On Sun, Mar 8, 2015 at 4:17 PM, Scott Feldman <sfeldma@gmail.com> wrote:
>>
>>> On Sun, Mar 8, 2015 at 7:19 AM, roopa <roopa@cumulusnetworks.com> wrote:
>>> > On 3/6/15, 1:52 AM, Scott Feldman wrote:
>>>
>>> [cut]
>>>
>>> >> I'm saying port driver's ndo_bridge_setlink shouldn't be called unless
>>> >> SELF is set.  The port driver shouldn't have to check if it's SELF,
>>> >> that's my main beef with this patch.  If the app (iproute2) wants to
>>> >> mirror an attribute, then it can set both MASTER and SELF.  This is
>>> >> consistent with ndo_fdb ops, where MASTER targets the bridge's FDB and
>>> >> SELF targets the device's FDB.
>>> >
>>> >
>>> > bridge setlink/dellink ndo sets/clears bridge port flags, vlans.
>>> >
>>> > 4 ways to the ndo op gets called (value in brackets is the bridge flags):
>>> >
>>> > 1. ndo_bridge_setlink ()
>>> > 2. ndo_bridge_setlink (master)
>>> > 3. ndo_bridge_setlink (self)
>>> > 4. ndo_bridge_setlink (master | self)
>>> >
>>> >
>>> > in case of 1) and 2) today, bridge driver also calls into the port driver
>>> > (as part of NETIF_F_HW_SWITCH_OFFLOAD) because vlans and other port flags
>>> > need to be offloaded if they are set on master (I understand that rocker
>>> > does not use bridge setlink ndo op for vlans. But it is completely valid
>>> for
>>> > a switch driver to use the ndo setlink op for offloading vlan config)
>>> >
>>> > One thing i can do is the NETIF_F_HW_SWITCH_OFFLOAD case only applies in
>>> > case 1 above. ie when bridge flags is zero.
>>> >
>>> > which leads to the following commands to disable learning in the bridge
>>> > driver and enable it in hw,
>>> >
>>> > 1. bridge link learning on - enable learning in both rocker and bridge
>>> > driver due to the NETIF_F_HW_SWITCH_OFFLOAD flag
>>> > 2. bridge link learning off master - learning disable only in the bridge
>>> > driver
>>> >
>>> > With this i think we hit a compromise, rocker setlink can be called when
>>> > flags is zero, but we still get a valid sequence of commands to
>>> > enable/disable certain port flags only in the bridge driver or port
>>> driver
>>> > without having an explicit self check in rocker.
>>> >
>>> > (my work laptop is broken..., I will re-post the patch tomorrow if you
>>> are
>>> > ok with the plan)
>>>
>>> I'm not OK with the plan; it doesn't address the issue.  I'm saying
>>> port driver's ndo_bridge_setlink shouldn't be called unless SELF is
>>> set.
>>>
>>
>>but, why scott ?. With l2, l3 and all other offloads, we try not to change
>>user commands.
>>Before switchdev 'self' was always used to go to nic drivers directly to
>>program fdb.
>>l2 never used the bridge driver.
>
>
>I also think that is is fine that in case either master or self are not set,
>driver ndo setlink is called as well (in case the offload bit is on). It
>makes it transparent for user in case he does not care what in under. In
>case he cares, he can specify master or self to achieve exactly what he
>needs. I like it that way.
>
>
It would be nice to have switch driver implementation that could work
Consistently with and without the bridge driver. I expect use case where a bridge is only used when L3 is needed or to leverage L2 protocols that are implemented by the bridge driver (e.g. STP, IGMP snooping).

If my driver can process setlink requests using SELF flag today, I'd like it
to work the same when setlink is propagated down from a bridge master or
from a team/bond master.

There is current issue with notification.
When SELF flag is set, the bridge driver does not offload to the port driver
and notification is issued by rtnetlink.c by calling the port getlink ndo.
This call, however, is done with zero filter_mask such that VLAN information
is not included in the notification.
When SELF flag is not set, bridge driver offloads to the port and issues the 
notification. In that case it sets the filter to VLAN_COMPRESSED.
I don't see how I can get my driver to behave consistently with and without
a bridge. 
The closest I can get notification with and without a bridge is for the
driver to examine the flags. If SELF is set, the driver knows it got invoked
directly from rtnetlink and it should notify VLAN setting as the subsequent
notification triggered by rtnetlink won't.
When the switch driver does not see SELF set it knows that it was invoked by
The bridge driver which already takes care of notification including VLAN
information. 

>
>>
>>with switchdev l2 offloads, we are using the bridge driver.
>>But there is a need to turn off/on some attributes in hw independently of
>>attributes in the kernel bridge driver. To that i had suggested we continue
>>to use 'self'
>>for those attributes without introducing a new hw mode.
>>
>>
>>>
>>> We have this currently, looking from the user's cmd:
>>>
>>> bridge link set                               calls into bridge and port
>>> driver
>>> bridge link set master                    calls into bridge and port driver
>>> bridge link set self                         calls into port driver
>>> bridge link set self master              calls into bridge and port driver
>>>
>>> Your proposal above changes it to:
>>>
>>> bridge link set                               calls into bridge and port
>>> driver
>>> bridge link set master                    calls into bridge
>>> bridge link set self                         calls into port driver
>>> bridge link set self master              calls into bridge and port driver
>>>
>>> Which is still no good because the default case (neither self or
>>> master explicitly set) calls into the port driver.  Before all of
>>> these changes, we had:
>>>
>>> bridge link set                               calls into bridge
>>> bridge link set master                    calls into bridge
>>> bridge link set hwmode veb|vepa     calls into port driver
>>>
>>> hwmode was a way to set SELF.  We wanted to add "self" flag to command
>>> to be more like bridge fdb cmd.  I assumed the default case (bridge
>>> set link) would continue to work as before.  So we'd have:
>>>
>>> bridge link set                               calls into bridge
>>> bridge link set master                    calls into bridge
>>> bridge link set self                         calls into port driver
>>> bridge link set self master              calls into bridge and port driver
>>>
>>> But that's not what we got.  Your changes changed the default case
>>> (and the bridge link set master case) to call into both bridge and
>>> port driver even though user didn't specify self.
>>>
>>
>>I have made sure that no default case has changed. It works this way only
>>when
>>the driver advertises offload using NETIF_F_HW_SWITCH_OFFLOAD.
>>Which means the driver is interested in involving the bridge driver in all
>>l2 offloads
>>(And this includes all bridge ndo ops setlink/dellink/stp/fdb).
>>
>>http://www.spinics.net/lists/netdev/msg314407.html
>>
>>
>>
>>
>>>
>>> How hard would it be to get back to original default behavior?
>>>
>>
>>Am not sure which default behavior you are talking about. If you are
>>talking about default behavior before switchdev,
>>that has not changed. For switch devices which want the bridge driver
>>involved and who advertise the NETIF_F_HW_SWITCH_OFFLOAD flag, this is new
>>behavior. And rocker does advertise this flag today.
>--
>To unsubscribe from this list: send the line "unsubscribe netdev" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-09 15:59                       ` Arad, Ronen
@ 2015-03-09 16:07                         ` Jiri Pirko
  2015-03-10  0:51                           ` Arad, Ronen
       [not found]                         ` <CAJieiUhcdfGitY7rbG11Vt_Beemz8dy3=gKtvbyVLS8O0DkgNw@mail.gmail.com>
  1 sibling, 1 reply; 38+ messages in thread
From: Jiri Pirko @ 2015-03-09 16:07 UTC (permalink / raw)
  To: Arad, Ronen; +Cc: Roopa Prabhu, Scott Feldman, Netdev, David S. Miller

Mon, Mar 09, 2015 at 04:59:13PM CET, ronen.arad@intel.com wrote:
>
>
>>-----Original Message-----
>>From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
>>Behalf Of Jiri Pirko
>>Sent: Monday, March 09, 2015 6:41 AM
>>To: Roopa Prabhu
>>Cc: Scott Feldman; Netdev; David S. Miller
>>Subject: Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge
>>setlink handler
>>
>>Mon, Mar 09, 2015 at 01:12:47AM CET, roopa@cumulusnetworks.com wrote:
>>>On Sun, Mar 8, 2015 at 4:17 PM, Scott Feldman <sfeldma@gmail.com> wrote:
>>>
>>>> On Sun, Mar 8, 2015 at 7:19 AM, roopa <roopa@cumulusnetworks.com> wrote:
>>>> > On 3/6/15, 1:52 AM, Scott Feldman wrote:
>>>>
>>>> [cut]
>>>>
>>>> >> I'm saying port driver's ndo_bridge_setlink shouldn't be called unless
>>>> >> SELF is set.  The port driver shouldn't have to check if it's SELF,
>>>> >> that's my main beef with this patch.  If the app (iproute2) wants to
>>>> >> mirror an attribute, then it can set both MASTER and SELF.  This is
>>>> >> consistent with ndo_fdb ops, where MASTER targets the bridge's FDB and
>>>> >> SELF targets the device's FDB.
>>>> >
>>>> >
>>>> > bridge setlink/dellink ndo sets/clears bridge port flags, vlans.
>>>> >
>>>> > 4 ways to the ndo op gets called (value in brackets is the bridge flags):
>>>> >
>>>> > 1. ndo_bridge_setlink ()
>>>> > 2. ndo_bridge_setlink (master)
>>>> > 3. ndo_bridge_setlink (self)
>>>> > 4. ndo_bridge_setlink (master | self)
>>>> >
>>>> >
>>>> > in case of 1) and 2) today, bridge driver also calls into the port driver
>>>> > (as part of NETIF_F_HW_SWITCH_OFFLOAD) because vlans and other port flags
>>>> > need to be offloaded if they are set on master (I understand that rocker
>>>> > does not use bridge setlink ndo op for vlans. But it is completely valid
>>>> for
>>>> > a switch driver to use the ndo setlink op for offloading vlan config)
>>>> >
>>>> > One thing i can do is the NETIF_F_HW_SWITCH_OFFLOAD case only applies in
>>>> > case 1 above. ie when bridge flags is zero.
>>>> >
>>>> > which leads to the following commands to disable learning in the bridge
>>>> > driver and enable it in hw,
>>>> >
>>>> > 1. bridge link learning on - enable learning in both rocker and bridge
>>>> > driver due to the NETIF_F_HW_SWITCH_OFFLOAD flag
>>>> > 2. bridge link learning off master - learning disable only in the bridge
>>>> > driver
>>>> >
>>>> > With this i think we hit a compromise, rocker setlink can be called when
>>>> > flags is zero, but we still get a valid sequence of commands to
>>>> > enable/disable certain port flags only in the bridge driver or port
>>>> driver
>>>> > without having an explicit self check in rocker.
>>>> >
>>>> > (my work laptop is broken..., I will re-post the patch tomorrow if you
>>>> are
>>>> > ok with the plan)
>>>>
>>>> I'm not OK with the plan; it doesn't address the issue.  I'm saying
>>>> port driver's ndo_bridge_setlink shouldn't be called unless SELF is
>>>> set.
>>>>
>>>
>>>but, why scott ?. With l2, l3 and all other offloads, we try not to change
>>>user commands.
>>>Before switchdev 'self' was always used to go to nic drivers directly to
>>>program fdb.
>>>l2 never used the bridge driver.
>>
>>
>>I also think that is is fine that in case either master or self are not set,
>>driver ndo setlink is called as well (in case the offload bit is on). It
>>makes it transparent for user in case he does not care what in under. In
>>case he cares, he can specify master or self to achieve exactly what he
>>needs. I like it that way.
>>
>>
>It would be nice to have switch driver implementation that could work
>Consistently with and without the bridge driver. I expect use case where a bridge is only used when L3 is needed or to leverage L2 protocols that are implemented by the bridge driver (e.g. STP, IGMP snooping).
>
>If my driver can process setlink requests using SELF flag today, I'd like it
>to work the same when setlink is propagated down from a bridge master or
>from a team/bond master.

I just want to note that the name of the op is "ndo_bridge_setlink" and
it is specific for in-bridge use. Using it for team/bond seems very
wrong to me.


>
>There is current issue with notification.
>When SELF flag is set, the bridge driver does not offload to the port driver
>and notification is issued by rtnetlink.c by calling the port getlink ndo.
>This call, however, is done with zero filter_mask such that VLAN information
>is not included in the notification.
>When SELF flag is not set, bridge driver offloads to the port and issues the 
>notification. In that case it sets the filter to VLAN_COMPRESSED.
>I don't see how I can get my driver to behave consistently with and without
>a bridge. 
>The closest I can get notification with and without a bridge is for the
>driver to examine the flags. If SELF is set, the driver knows it got invoked
>directly from rtnetlink and it should notify VLAN setting as the subsequent
>notification triggered by rtnetlink won't.
>When the switch driver does not see SELF set it knows that it was invoked by
>The bridge driver which already takes care of notification including VLAN
>information. 
>
>>
>>>
>>>with switchdev l2 offloads, we are using the bridge driver.
>>>But there is a need to turn off/on some attributes in hw independently of
>>>attributes in the kernel bridge driver. To that i had suggested we continue
>>>to use 'self'
>>>for those attributes without introducing a new hw mode.
>>>
>>>
>>>>
>>>> We have this currently, looking from the user's cmd:
>>>>
>>>> bridge link set                               calls into bridge and port
>>>> driver
>>>> bridge link set master                    calls into bridge and port driver
>>>> bridge link set self                         calls into port driver
>>>> bridge link set self master              calls into bridge and port driver
>>>>
>>>> Your proposal above changes it to:
>>>>
>>>> bridge link set                               calls into bridge and port
>>>> driver
>>>> bridge link set master                    calls into bridge
>>>> bridge link set self                         calls into port driver
>>>> bridge link set self master              calls into bridge and port driver
>>>>
>>>> Which is still no good because the default case (neither self or
>>>> master explicitly set) calls into the port driver.  Before all of
>>>> these changes, we had:
>>>>
>>>> bridge link set                               calls into bridge
>>>> bridge link set master                    calls into bridge
>>>> bridge link set hwmode veb|vepa     calls into port driver
>>>>
>>>> hwmode was a way to set SELF.  We wanted to add "self" flag to command
>>>> to be more like bridge fdb cmd.  I assumed the default case (bridge
>>>> set link) would continue to work as before.  So we'd have:
>>>>
>>>> bridge link set                               calls into bridge
>>>> bridge link set master                    calls into bridge
>>>> bridge link set self                         calls into port driver
>>>> bridge link set self master              calls into bridge and port driver
>>>>
>>>> But that's not what we got.  Your changes changed the default case
>>>> (and the bridge link set master case) to call into both bridge and
>>>> port driver even though user didn't specify self.
>>>>
>>>
>>>I have made sure that no default case has changed. It works this way only
>>>when
>>>the driver advertises offload using NETIF_F_HW_SWITCH_OFFLOAD.
>>>Which means the driver is interested in involving the bridge driver in all
>>>l2 offloads
>>>(And this includes all bridge ndo ops setlink/dellink/stp/fdb).
>>>
>>>http://www.spinics.net/lists/netdev/msg314407.html
>>>
>>>
>>>
>>>
>>>>
>>>> How hard would it be to get back to original default behavior?
>>>>
>>>
>>>Am not sure which default behavior you are talking about. If you are
>>>talking about default behavior before switchdev,
>>>that has not changed. For switch devices which want the bridge driver
>>>involved and who advertise the NETIF_F_HW_SWITCH_OFFLOAD flag, this is new
>>>behavior. And rocker does advertise this flag today.
>>--
>>To unsubscribe from this list: send the line "unsubscribe netdev" in
>>the body of a message to majordomo@vger.kernel.org
>>More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
       [not found]                         ` <CAJieiUhcdfGitY7rbG11Vt_Beemz8dy3=gKtvbyVLS8O0DkgNw@mail.gmail.com>
@ 2015-03-09 23:23                           ` Roopa Prabhu
  0 siblings, 0 replies; 38+ messages in thread
From: Roopa Prabhu @ 2015-03-09 23:23 UTC (permalink / raw)
  To: Arad, Ronen; +Cc: Jiri Pirko, Scott Feldman, Netdev, David S. Miller

sorry, this email might have gone with non-text (though i selected
'plain text' in my google mail web client). Retrying...apologize for
multiple emails.

On Mon, Mar 9, 2015 at 3:56 PM, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>
>
> On Mon, Mar 9, 2015 at 8:59 AM, Arad, Ronen <ronen.arad@intel.com> wrote:
>>
>>
>>
>> >-----Original Message-----
>> >From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
>> > On
>> >Behalf Of Jiri Pirko
>> >
>> >
>>
>> >I also think that is is fine that in case either master or self are not
>> > set,
>> >driver ndo setlink is called as well (in case the offload bit is on). It
>> >makes it transparent for user in case he does not care what in under. In
>> >case he cares, he can specify master or self to achieve exactly what he
>> >needs. I like it that way.
>> >
>> >
>> It would be nice to have switch driver implementation that could work
>> Consistently with and without the bridge driver. I expect use case where a
>> bridge is only used when L3 is needed or to leverage L2 protocols that are
>> implemented by the bridge driver (e.g. STP, IGMP snooping).
>
>
>>
>> If my driver can process setlink requests using SELF flag today, I'd like
>> it
>> to work the same when setlink is propagated down from a bridge master or
>> from a team/bond master.
>
>
>>
>>
>> There is current issue with notification.
>> When SELF flag is set, the bridge driver does not offload to the port
>> driver
>> and notification is issued by rtnetlink.c by calling the port getlink ndo.
>
>
>
>>
>> This call, however, is done with zero filter_mask such that VLAN
>> information
>> is not included in the notification.
>
>
>
>>
>> When SELF flag is not set, bridge driver offloads to the port and issues
>> the
>> notification. In that case it sets the filter to VLAN_COMPRESSED.
>> I don't see how I can get my driver to behave consistently with and
>> without
>> a bridge.
>> The closest I can get notification with and without a bridge is for the
>> driver to examine the flags. If SELF is set, the driver knows it got
>> invoked
>> directly from rtnetlink and it should notify VLAN setting as the
>> subsequent
>> notification triggered by rtnetlink won't.
>> When the switch driver does not see SELF set it knows that it was invoked
>> by
>> The bridge driver which already takes care of notification including VLAN
>> information.
>>
>
> You are right about self being handled from rtnetlink.c and the notification
> triggered from rtnetlink.c
> can be different from the one sent from the bridge driver.
> In the specific case of vlan information not being present in the self
> notifications generated from rtnetlink.c, that is
> something that will need to be fixed eventually.
> Since rtnetlink.c generates these notifications and not the port driver,
> like you say, port driver could generate the appropriate notification or
> rtnetlink.c will have to get the size of the skb from the port driver before
> doing a bridge_getlink call into the port driver.
>
> This is another good example, where the port driver might need to look at
> the self flag after-all.
>
> Thanks,
> Roopa
>
>
>
>
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-09 16:07                         ` Jiri Pirko
@ 2015-03-10  0:51                           ` Arad, Ronen
  2015-03-10  6:39                             ` Jiri Pirko
  0 siblings, 1 reply; 38+ messages in thread
From: Arad, Ronen @ 2015-03-10  0:51 UTC (permalink / raw)
  To: Jiri Pirko, Netdev; +Cc: Roopa Prabhu, Scott Feldman, David S. Miller



>-----Original Message-----
>From: Jiri Pirko [mailto:jiri@resnulli.us]
>Sent: Monday, March 09, 2015 4:08 PM
>To: Arad, Ronen
>Cc: Roopa Prabhu; Scott Feldman; Netdev; David S. Miller
>Subject: Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge
>setlink handler
>
>Mon, Mar 09, 2015 at 04:59:13PM CET, ronen.arad@intel.com wrote:
>>
>>
>>>-----Original Message-----
>>>From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
>>>Behalf Of Jiri Pirko
>>>Sent: Monday, March 09, 2015 6:41 AM
>>>To: Roopa Prabhu
>>>Cc: Scott Feldman; Netdev; David S. Miller
>>>Subject: Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge
>>>setlink handler
>>>
>>>Mon, Mar 09, 2015 at 01:12:47AM CET, roopa@cumulusnetworks.com wrote:
>>>>On Sun, Mar 8, 2015 at 4:17 PM, Scott Feldman <sfeldma@gmail.com> wrote:
>>>>
>>>>> On Sun, Mar 8, 2015 at 7:19 AM, roopa <roopa@cumulusnetworks.com> wrote:
>>>>> > On 3/6/15, 1:52 AM, Scott Feldman wrote:
>>>>>
[cut]
>>>
>>It would be nice to have switch driver implementation that could work
>>Consistently with and without the bridge driver. I expect use case where a
>bridge is only used when L3 is needed or to leverage L2 protocols that are
>implemented by the bridge driver (e.g. STP, IGMP snooping).
>>
>>If my driver can process setlink requests using SELF flag today, I'd like it
>>to work the same when setlink is propagated down from a bridge master or
>>from a team/bond master.
>
>I just want to note that the name of the op is "ndo_bridge_setlink" and
>it is specific for in-bridge use. Using it for team/bond seems very
>wrong to me.
>

This propagation from master to its slaves is in a recently applied patch.
Are you suggesting reverting the patch which added 
ndo_dflt_netdev_switch_port_bridge_setlink function which is set as the
ndo_bridge_setlink of both team and bond drivers?

Would renaming this NDO to "ndo_master_setlink" address your concern?
It could clarify that this ndo deals with master devices and is not limited
To just bridge driver.

My interest here is the VLAN filtering policy which is provisioned using the 
IFLA_AF_SPEC nested attribute of RTM_SETLINK command in PF_BRIDGE family.
VLAN policy is something that is common to Ethernet ports (physical or logical),
VLAN-aware bridge, or team/bond ports.

It is a reasonable expectation of users to use the team/bond as target for
any configuration that has to be mirrored to all of their enslaved ports.

[cut]
>>>--
>>>To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>the body of a message to majordomo@vger.kernel.org
>>>More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-10  0:51                           ` Arad, Ronen
@ 2015-03-10  6:39                             ` Jiri Pirko
  2015-03-10  8:02                               ` Arad, Ronen
  0 siblings, 1 reply; 38+ messages in thread
From: Jiri Pirko @ 2015-03-10  6:39 UTC (permalink / raw)
  To: Arad, Ronen; +Cc: Netdev, Roopa Prabhu, Scott Feldman, David S. Miller

Tue, Mar 10, 2015 at 01:51:51AM CET, ronen.arad@intel.com wrote:
>
>
>>-----Original Message-----
>>From: Jiri Pirko [mailto:jiri@resnulli.us]
>>Sent: Monday, March 09, 2015 4:08 PM
>>To: Arad, Ronen
>>Cc: Roopa Prabhu; Scott Feldman; Netdev; David S. Miller
>>Subject: Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge
>>setlink handler
>>
>>Mon, Mar 09, 2015 at 04:59:13PM CET, ronen.arad@intel.com wrote:
>>>
>>>
>>>>-----Original Message-----
>>>>From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
>>>>Behalf Of Jiri Pirko
>>>>Sent: Monday, March 09, 2015 6:41 AM
>>>>To: Roopa Prabhu
>>>>Cc: Scott Feldman; Netdev; David S. Miller
>>>>Subject: Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge
>>>>setlink handler
>>>>
>>>>Mon, Mar 09, 2015 at 01:12:47AM CET, roopa@cumulusnetworks.com wrote:
>>>>>On Sun, Mar 8, 2015 at 4:17 PM, Scott Feldman <sfeldma@gmail.com> wrote:
>>>>>
>>>>>> On Sun, Mar 8, 2015 at 7:19 AM, roopa <roopa@cumulusnetworks.com> wrote:
>>>>>> > On 3/6/15, 1:52 AM, Scott Feldman wrote:
>>>>>>
>[cut]
>>>>
>>>It would be nice to have switch driver implementation that could work
>>>Consistently with and without the bridge driver. I expect use case where a
>>bridge is only used when L3 is needed or to leverage L2 protocols that are
>>implemented by the bridge driver (e.g. STP, IGMP snooping).
>>>
>>>If my driver can process setlink requests using SELF flag today, I'd like it
>>>to work the same when setlink is propagated down from a bridge master or
>>>from a team/bond master.
>>
>>I just want to note that the name of the op is "ndo_bridge_setlink" and
>>it is specific for in-bridge use. Using it for team/bond seems very
>>wrong to me.
>>
>
>This propagation from master to its slaves is in a recently applied patch.
>Are you suggesting reverting the patch which added 
>ndo_dflt_netdev_switch_port_bridge_setlink function which is set as the
>ndo_bridge_setlink of both team and bond drivers?

I'm not suggesting such thing. Proparation is correct. Origination is
not. That was my point.

>
>Would renaming this NDO to "ndo_master_setlink" address your concern?
>It could clarify that this ndo deals with master devices and is not limited
>To just bridge driver.

Renaming ndo is not enough.... You would have to do much more.

>
>My interest here is the VLAN filtering policy which is provisioned using the 
>IFLA_AF_SPEC nested attribute of RTM_SETLINK command in PF_BRIDGE family.
>VLAN policy is something that is common to Ethernet ports (physical or logical),
>VLAN-aware bridge, or team/bond ports.
>
>It is a reasonable expectation of users to use the team/bond as target for
>any configuration that has to be mirrored to all of their enslaved ports.
>
>[cut]
>>>>--
>>>>To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>>the body of a message to majordomo@vger.kernel.org
>>>>More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-10  6:39                             ` Jiri Pirko
@ 2015-03-10  8:02                               ` Arad, Ronen
  2015-03-10  8:28                                 ` Jiri Pirko
  0 siblings, 1 reply; 38+ messages in thread
From: Arad, Ronen @ 2015-03-10  8:02 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Netdev, Roopa Prabhu, Scott Feldman, David S. Miller



>-----Original Message-----
>From: Jiri Pirko [mailto:jiri@resnulli.us]
>Sent: Tuesday, March 10, 2015 6:39 AM
>To: Arad, Ronen
>Cc: Netdev; Roopa Prabhu; Scott Feldman; David S. Miller
>Subject: Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge
>setlink handler
>
>Tue, Mar 10, 2015 at 01:51:51AM CET, ronen.arad@intel.com wrote:
>>
>>
>>>-----Original Message-----
>>>From: Jiri Pirko [mailto:jiri@resnulli.us]
>>>Sent: Monday, March 09, 2015 4:08 PM
>>>To: Arad, Ronen
>>>Cc: Roopa Prabhu; Scott Feldman; Netdev; David S. Miller
>>>Subject: Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge
>>>setlink handler
>>>
>>>Mon, Mar 09, 2015 at 04:59:13PM CET, ronen.arad@intel.com wrote:
>>>>
>>>>
>>>>>-----Original Message-----
>>>>>From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
>On
>>>>>Behalf Of Jiri Pirko
>>>>>Sent: Monday, March 09, 2015 6:41 AM
>>>>>To: Roopa Prabhu
>>>>>Cc: Scott Feldman; Netdev; David S. Miller
>>>>>Subject: Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in
>bridge
>>>>>setlink handler
>>>>>
>>>>>Mon, Mar 09, 2015 at 01:12:47AM CET, roopa@cumulusnetworks.com wrote:
>>>>>>On Sun, Mar 8, 2015 at 4:17 PM, Scott Feldman <sfeldma@gmail.com> wrote:
>>>>>>
>>>>>>> On Sun, Mar 8, 2015 at 7:19 AM, roopa <roopa@cumulusnetworks.com>
>wrote:
>>>>>>> > On 3/6/15, 1:52 AM, Scott Feldman wrote:
>>>>>>>
>>[cut]
>>>>>
>>>>It would be nice to have switch driver implementation that could work
>>>>Consistently with and without the bridge driver. I expect use case where a
>>>bridge is only used when L3 is needed or to leverage L2 protocols that are
>>>implemented by the bridge driver (e.g. STP, IGMP snooping).
>>>>
>>>>If my driver can process setlink requests using SELF flag today, I'd like
>it
>>>>to work the same when setlink is propagated down from a bridge master or
>>>>from a team/bond master.
>>>
>>>I just want to note that the name of the op is "ndo_bridge_setlink" and
>>>it is specific for in-bridge use. Using it for team/bond seems very
>>>wrong to me.
>>>
>>
>>This propagation from master to its slaves is in a recently applied patch.
>>Are you suggesting reverting the patch which added
>>ndo_dflt_netdev_switch_port_bridge_setlink function which is set as the
>>ndo_bridge_setlink of both team and bond drivers?
>
>I'm not suggesting such thing. Proparation is correct. Origination is
>not. That was my point.
>
I'd like to make sure I understand your position. If only propagation is
correct, it would mean that a bridge device is always required for VLAN
filtering configuration.
This also mean that targeting "bridge vlan add vid VLAN-ID dev DEVNAME self"
to DEVNAME of a switch port type or team/bond is incorrect.
If this position is accepted, it would be best to enforce it, possibly in
rtnl_bridge_setlink().
My recollection is that others asked to preserve use-cases where SELF flag
is used for targeting port devices directly without using a bridge device.
>>
>>Would renaming this NDO to "ndo_master_setlink" address your concern?
>>It could clarify that this ndo deals with master devices and is not limited
>>To just bridge driver.
>
>Renaming ndo is not enough.... You would have to do much more.
>
>>
>>My interest here is the VLAN filtering policy which is provisioned using the
>>IFLA_AF_SPEC nested attribute of RTM_SETLINK command in PF_BRIDGE family.
>>VLAN policy is something that is common to Ethernet ports (physical or
>logical),
>>VLAN-aware bridge, or team/bond ports.
>>
>>It is a reasonable expectation of users to use the team/bond as target for
>>any configuration that has to be mirrored to all of their enslaved ports.
>>
>>[cut]
>>>>>--
>>>>>To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>>>the body of a message to majordomo@vger.kernel.org
>>>>>More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-10  8:02                               ` Arad, Ronen
@ 2015-03-10  8:28                                 ` Jiri Pirko
  2015-03-16 22:01                                   ` John Fastabend
  0 siblings, 1 reply; 38+ messages in thread
From: Jiri Pirko @ 2015-03-10  8:28 UTC (permalink / raw)
  To: Arad, Ronen; +Cc: Netdev, Roopa Prabhu, Scott Feldman, David S. Miller

Tue, Mar 10, 2015 at 09:02:00AM CET, ronen.arad@intel.com wrote:
>
>
>>-----Original Message-----
>>From: Jiri Pirko [mailto:jiri@resnulli.us]
>>Sent: Tuesday, March 10, 2015 6:39 AM
>>To: Arad, Ronen
>>Cc: Netdev; Roopa Prabhu; Scott Feldman; David S. Miller
>>Subject: Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge
>>setlink handler
>>
>>Tue, Mar 10, 2015 at 01:51:51AM CET, ronen.arad@intel.com wrote:
>>>
>>>
>>>>-----Original Message-----
>>>>From: Jiri Pirko [mailto:jiri@resnulli.us]
>>>>Sent: Monday, March 09, 2015 4:08 PM
>>>>To: Arad, Ronen
>>>>Cc: Roopa Prabhu; Scott Feldman; Netdev; David S. Miller
>>>>Subject: Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge
>>>>setlink handler
>>>>
>>>>Mon, Mar 09, 2015 at 04:59:13PM CET, ronen.arad@intel.com wrote:
>>>>>
>>>>>
>>>>>>-----Original Message-----
>>>>>>From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
>>On
>>>>>>Behalf Of Jiri Pirko
>>>>>>Sent: Monday, March 09, 2015 6:41 AM
>>>>>>To: Roopa Prabhu
>>>>>>Cc: Scott Feldman; Netdev; David S. Miller
>>>>>>Subject: Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in
>>bridge
>>>>>>setlink handler
>>>>>>
>>>>>>Mon, Mar 09, 2015 at 01:12:47AM CET, roopa@cumulusnetworks.com wrote:
>>>>>>>On Sun, Mar 8, 2015 at 4:17 PM, Scott Feldman <sfeldma@gmail.com> wrote:
>>>>>>>
>>>>>>>> On Sun, Mar 8, 2015 at 7:19 AM, roopa <roopa@cumulusnetworks.com>
>>wrote:
>>>>>>>> > On 3/6/15, 1:52 AM, Scott Feldman wrote:
>>>>>>>>
>>>[cut]
>>>>>>
>>>>>It would be nice to have switch driver implementation that could work
>>>>>Consistently with and without the bridge driver. I expect use case where a
>>>>bridge is only used when L3 is needed or to leverage L2 protocols that are
>>>>implemented by the bridge driver (e.g. STP, IGMP snooping).
>>>>>
>>>>>If my driver can process setlink requests using SELF flag today, I'd like
>>it
>>>>>to work the same when setlink is propagated down from a bridge master or
>>>>>from a team/bond master.
>>>>
>>>>I just want to note that the name of the op is "ndo_bridge_setlink" and
>>>>it is specific for in-bridge use. Using it for team/bond seems very
>>>>wrong to me.
>>>>
>>>
>>>This propagation from master to its slaves is in a recently applied patch.
>>>Are you suggesting reverting the patch which added
>>>ndo_dflt_netdev_switch_port_bridge_setlink function which is set as the
>>>ndo_bridge_setlink of both team and bond drivers?
>>
>>I'm not suggesting such thing. Proparation is correct. Origination is
>>not. That was my point.
>>
>I'd like to make sure I understand your position. If only propagation is
>correct, it would mean that a bridge device is always required for VLAN
>filtering configuration.

Not accurate. Setting this via ndo_bridge_setlink should be done only in
case netdev is a bridge port. There are other ndos to do this when vlan
dev in on top (ndo_vlan_rx_add_vid/ndo_vlan_rx_kill_vid)


>This also mean that targeting "bridge vlan add vid VLAN-ID dev DEVNAME self"
>to DEVNAME of a switch port type or team/bond is incorrect.

Definitelly. PF_BRIDGE should be used only for bridge and its ports.
(Leaving aside the fact that PF_BRIDGE should have never been
introduced, should be replaced and let to rot)


>If this position is accepted, it would be best to enforce it, possibly in
>rtnl_bridge_setlink().
>My recollection is that others asked to preserve use-cases where SELF flag
>is used for targeting port devices directly without using a bridge device.

I know it is possible, and it is incorrect and hacky. But it is part of
user api :/ I think we should not abuse this more in the future and
rather make the api correct and use that.


>>>
>>>Would renaming this NDO to "ndo_master_setlink" address your concern?
>>>It could clarify that this ndo deals with master devices and is not limited
>>>To just bridge driver.
>>
>>Renaming ndo is not enough.... You would have to do much more.
>>
>>>
>>>My interest here is the VLAN filtering policy which is provisioned using the
>>>IFLA_AF_SPEC nested attribute of RTM_SETLINK command in PF_BRIDGE family.
>>>VLAN policy is something that is common to Ethernet ports (physical or
>>logical),
>>>VLAN-aware bridge, or team/bond ports.
>>>
>>>It is a reasonable expectation of users to use the team/bond as target for
>>>any configuration that has to be mirrored to all of their enslaved ports.
>>>
>>>[cut]
>>>>>>--
>>>>>>To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>>>>the body of a message to majordomo@vger.kernel.org
>>>>>>More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-10  8:28                                 ` Jiri Pirko
@ 2015-03-16 22:01                                   ` John Fastabend
  2015-03-17  7:00                                     ` Jiri Pirko
  0 siblings, 1 reply; 38+ messages in thread
From: John Fastabend @ 2015-03-16 22:01 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Arad, Ronen, Netdev, Roopa Prabhu, Scott Feldman, David S. Miller

[...]

>
>> If this position is accepted, it would be best to enforce it, possibly in
>> rtnl_bridge_setlink().
>> My recollection is that others asked to preserve use-cases where SELF flag
>> is used for targeting port devices directly without using a bridge device.
>
> I know it is possible, and it is incorrect and hacky. But it is part of
> user api :/ I think we should not abuse this more in the future and
> rather make the api correct and use that.
>

Working my way through my backlog of email sorry for the days delay.

Jiri, are you suggesting it is incorrect to configure the hardware L2
independent of bridge device? There is absolutely use cases for this.

The case being we want the hardware to do L2 learning via fdb and then
when flows get 'trapped' into software we want to handle them
differently. Possibly send them onto a specific application for logging.

I'm at a loss around what use "really" running the bridge in software
is. There shouldn't be packets traversing the software path and if they
are being sent onto software I really can't think of any use case I
would want to run them through the software bridge. More likely I want
to run OVS or some equivalent controller-based software to forward them
"specially" to the correct software/controller/port.

My current use case is L2/L3 on the hardware, OVS and applications in
software. I don't think this is broken or hacky. This means no bridge
in software but programming the L2 bridge in hardware.

.John


-- 
John Fastabend         Intel Corporation

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-16 22:01                                   ` John Fastabend
@ 2015-03-17  7:00                                     ` Jiri Pirko
  2015-03-17 14:31                                       ` John Fastabend
  0 siblings, 1 reply; 38+ messages in thread
From: Jiri Pirko @ 2015-03-17  7:00 UTC (permalink / raw)
  To: John Fastabend
  Cc: Arad, Ronen, Netdev, Roopa Prabhu, Scott Feldman, David S. Miller

Mon, Mar 16, 2015 at 11:01:30PM CET, john.fastabend@gmail.com wrote:
>[...]
>
>>
>>>If this position is accepted, it would be best to enforce it, possibly in
>>>rtnl_bridge_setlink().
>>>My recollection is that others asked to preserve use-cases where SELF flag
>>>is used for targeting port devices directly without using a bridge device.
>>
>>I know it is possible, and it is incorrect and hacky. But it is part of
>>user api :/ I think we should not abuse this more in the future and
>>rather make the api correct and use that.
>>
>
>Working my way through my backlog of email sorry for the days delay.
>
>Jiri, are you suggesting it is incorrect to configure the hardware L2
>independent of bridge device? There is absolutely use cases for this.
>
>The case being we want the hardware to do L2 learning via fdb and then
>when flows get 'trapped' into software we want to handle them
>differently. Possibly send them onto a specific application for logging.

Yes, but that can be done in transparent way, exposing hw ports, having
them in bridge/ovs/whatever. Same as we do with rocker.

>
>I'm at a loss around what use "really" running the bridge in software
>is. There shouldn't be packets traversing the software path and if they
>are being sent onto software I really can't think of any use case I
>would want to run them through the software bridge. More likely I want
>to run OVS or some equivalent controller-based software to forward them
>"specially" to the correct software/controller/port.
>
>My current use case is L2/L3 on the hardware, OVS and applications in
>software. I don't think this is broken or hacky. This means no bridge
>in software but programming the L2 bridge in hardware.
>
>.John
>
>
>-- 
>John Fastabend         Intel Corporation

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-17  7:00                                     ` Jiri Pirko
@ 2015-03-17 14:31                                       ` John Fastabend
  2015-03-17 20:27                                         ` roopa
  0 siblings, 1 reply; 38+ messages in thread
From: John Fastabend @ 2015-03-17 14:31 UTC (permalink / raw)
  To: Jiri Pirko, John Fastabend
  Cc: Arad, Ronen, Netdev, Roopa Prabhu, Scott Feldman, David S. Miller

On 03/17/2015 12:00 AM, Jiri Pirko wrote:
> Mon, Mar 16, 2015 at 11:01:30PM CET, john.fastabend@gmail.com wrote:
>> [...]
>>
>>>
>>>> If this position is accepted, it would be best to enforce it, possibly in
>>>> rtnl_bridge_setlink().
>>>> My recollection is that others asked to preserve use-cases where SELF flag
>>>> is used for targeting port devices directly without using a bridge device.
>>>
>>> I know it is possible, and it is incorrect and hacky. But it is part of
>>> user api :/ I think we should not abuse this more in the future and
>>> rather make the api correct and use that.
>>>
>>
>> Working my way through my backlog of email sorry for the days delay.
>>
>> Jiri, are you suggesting it is incorrect to configure the hardware L2
>> independent of bridge device? There is absolutely use cases for this.
>>
>> The case being we want the hardware to do L2 learning via fdb and then
>> when flows get 'trapped' into software we want to handle them
>> differently. Possibly send them onto a specific application for logging.
> 
> Yes, but that can be done in transparent way, exposing hw ports, having
> them in bridge/ovs/whatever. Same as we do with rocker.
> 

My point is you don't want a bridge in software at all. So I don't
understand the "transparent" way. In this case you want to configure
the hardware to do l2 bridge and put the ports in some other objects
for simplicity consider a OVS instance in software. In this model
the ports are attached to the software OVS and we do not want to
"transparently" offload any of OVS to hardware.

Also ports can not be in both an OVS instance and bridge instance.

                  +----+----+
                  |   OVS   |      <- netdevs mapped to sw ovs, not offoaded
                  +----+----+
                     |    |
                   sw0p1 sw0p2     <- netdev representing hardware ports
                     |    |
             +----+----+----+---+
             |    L2 hw bridge  |  <- l2 hardware bridge managed via netlink
             +----+----+----+---+

In many cases it doesn't make any sense to fall back to software.
You can't have a 100Gbps links "falling" back onto the kernel datapath.
And in these environments having ports attached to a "transparent" bridge
breaks. Worse the management CPU is usually something light, its not
typically a quad socket top of line CPU where you might have a chance.

Nothing is broke at the moment because we have the "self" flag. I'm
responding to you "incorrect and hacky" comment. Similarly we are
going to need a flag for L3 that puts the rule in hardware or fails.
Just like L2 we can't have L3 being sent into software its not a
viable fallback path in many use cases. And doing it "transparently"
so that the controlling agent is unaware it is offloaded makes it
difficult to manage the system. I think the "transparent" model only
works for smallish devices, home routers and the likes.

Thanks,
.John

>>
>> I'm at a loss around what use "really" running the bridge in software
>> is. There shouldn't be packets traversing the software path and if they
>> are being sent onto software I really can't think of any use case I
>> would want to run them through the software bridge. More likely I want
>> to run OVS or some equivalent controller-based software to forward them
>> "specially" to the correct software/controller/port.
>>
>> My current use case is L2/L3 on the hardware, OVS and applications in
>> software. I don't think this is broken or hacky. This means no bridge
>> in software but programming the L2 bridge in hardware.
>>
>> .John
>>
>>
>> -- 
>> John Fastabend         Intel Corporation
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-17 14:31                                       ` John Fastabend
@ 2015-03-17 20:27                                         ` roopa
  2015-03-18  0:16                                           ` John Fastabend
  0 siblings, 1 reply; 38+ messages in thread
From: roopa @ 2015-03-17 20:27 UTC (permalink / raw)
  To: John Fastabend
  Cc: Jiri Pirko, John Fastabend, Arad, Ronen, Netdev, Scott Feldman,
	David S. Miller

On 3/17/15, 7:31 AM, John Fastabend wrote:
> On 03/17/2015 12:00 AM, Jiri Pirko wrote:
>> Mon, Mar 16, 2015 at 11:01:30PM CET, john.fastabend@gmail.com wrote:
>>> [...]
>>>
>>>>> If this position is accepted, it would be best to enforce it, possibly in
>>>>> rtnl_bridge_setlink().
>>>>> My recollection is that others asked to preserve use-cases where SELF flag
>>>>> is used for targeting port devices directly without using a bridge device.
>>>> I know it is possible, and it is incorrect and hacky. But it is part of
>>>> user api :/ I think we should not abuse this more in the future and
>>>> rather make the api correct and use that.
>>>>
>>> Working my way through my backlog of email sorry for the days delay.
>>>
>>> Jiri, are you suggesting it is incorrect to configure the hardware L2
>>> independent of bridge device? There is absolutely use cases for this.
>>>
>>> The case being we want the hardware to do L2 learning via fdb and then
>>> when flows get 'trapped' into software we want to handle them
>>> differently. Possibly send them onto a specific application for logging.
>> Yes, but that can be done in transparent way, exposing hw ports, having
>> them in bridge/ovs/whatever. Same as we do with rocker.
>>
> My point is you don't want a bridge in software at all. So I don't
> understand the "transparent" way. In this case you want to configure
> the hardware to do l2 bridge and put the ports in some other objects
> for simplicity consider a OVS instance in software. In this model
> the ports are attached to the software OVS and we do not want to
> "transparently" offload any of OVS to hardware.
>
> Also ports can not be in both an OVS instance and bridge instance.
>
>                    +----+----+
>                    |   OVS   |      <- netdevs mapped to sw ovs, not offoaded
>                    +----+----+
>                       |    |
>                     sw0p1 sw0p2     <- netdev representing hardware ports
>                       |    |
>               +----+----+----+---+
>               |    L2 hw bridge  |  <- l2 hardware bridge managed via netlink
>               +----+----+----+---+
>
> In many cases it doesn't make any sense to fall back to software.
> You can't have a 100Gbps links "falling" back onto the kernel datapath.
> And in these environments having ports attached to a "transparent" bridge
> breaks. Worse the management CPU is usually something light, its not
> typically a quad socket top of line CPU where you might have a chance.
>
> Nothing is broke at the moment because we have the "self" flag. I'm
> responding to you "incorrect and hacky" comment. Similarly we are
> going to need a flag for L3 that puts the rule in hardware or fails.
> Just like L2 we can't have L3 being sent into software its not a
> viable fallback path in many use cases. And doing it "transparently"
> so that the controlling agent is unaware it is offloaded makes it
> difficult to manage the system. I think the "transparent" model only
> works for smallish devices, home routers and the likes.
>

I have not followed Jiri's and your exchange of comments fully yet.
If it helps I just wanted to clarify the part where the word 
'transparency' was introduced in this thread:

This is in the context of traversing lower devices to get to the switch 
port (example, a bond with switch ports as slaves and you want to reach 
the slaves via the bond).

Its was not in the context of whether the kernel bridge driver is used 
or not for l2 offload.
Understand that there are l2 nics which are programmed today by directly 
going to the driver
bypassing the bridge driver. and these are programmed with 'self' today.

Even for offloads that use the in kernel bridge driver (switch devices 
eg rocker),
user can use  'self' to go directly to the switch driver. And this is 
required in some cases
where you want a bridge port attribute to be different than the 
in-kernel bridge port attribute.
eg learning.

bridge link set dev swp1 learning off   (sets learning off in both 
in-kernel bridge and rocker)
bridge link set dev swp1 learning on self   (sets learning on in rocker)

To describe the stacked netdev/bridge port case which is the context of 
this thread,
a rocker port can be a slave of a bond and the bond can be a bridge 
port. In such cases you want
to traverse the bond lowerdevs to get to the rocker port to call into 
the switch driver.

bridge link set dev bond0 learning off   (sets learning off in both 
in-kernel bridge and rocker)
bridge link set dev bond0 learning on self   (sets learning on in rocker)

For the above to work, since rtnetlink.c calls the op on the port driver 
directly , bonding driver should implement
the required op.

But rtnetlink.c:rtnl_setlink(), can be changed to use the switchdev op 
below in the 'self' case (same api that jiri is trying to restore in 
this thread). This makes sure your l2 devices don't break and switchdev 
api is propagated transparently to the
switch port via the lowerdev list.

int netdev_switch_port_bridge_setlink(struct net_device *dev,
				      struct nlmsghdr *nlh, u16 flags)
{
	const struct net_device_ops *ops = dev->netdev_ops;
	struct net_device *lower_dev;
	struct list_head *iter;
	int ret = 0, err = 0;


	if (ops->ndo_bridge_setlink)
		return ops->ndo_bridge_setlink(dev, nlh, flags);

	netdev_for_each_lower_dev(dev, lower_dev, iter) {
		err = netdev_switch_port_bridge_setlink(lower_dev, nlh, flags);
		if (err)
			ret = err;
	}

	return ret;
}

Thanks,
Roopa

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-17 20:27                                         ` roopa
@ 2015-03-18  0:16                                           ` John Fastabend
  2015-03-18  6:29                                             ` roopa
  0 siblings, 1 reply; 38+ messages in thread
From: John Fastabend @ 2015-03-18  0:16 UTC (permalink / raw)
  To: roopa
  Cc: Jiri Pirko, John Fastabend, Arad, Ronen, Netdev, Scott Feldman,
	David S. Miller

On 03/17/2015 01:27 PM, roopa wrote:
> On 3/17/15, 7:31 AM, John Fastabend wrote:
>> On 03/17/2015 12:00 AM, Jiri Pirko wrote:
>>> Mon, Mar 16, 2015 at 11:01:30PM CET, john.fastabend@gmail.com wrote:
>>>> [...]
>>>>
>>>>>> If this position is accepted, it would be best to enforce it, possibly in
>>>>>> rtnl_bridge_setlink().
>>>>>> My recollection is that others asked to preserve use-cases where SELF flag
>>>>>> is used for targeting port devices directly without using a bridge device.
>>>>> I know it is possible, and it is incorrect and hacky. But it is part of
>>>>> user api :/ I think we should not abuse this more in the future and
>>>>> rather make the api correct and use that.
>>>>>
>>>> Working my way through my backlog of email sorry for the days delay.
>>>>
>>>> Jiri, are you suggesting it is incorrect to configure the hardware L2
>>>> independent of bridge device? There is absolutely use cases for this.
>>>>
>>>> The case being we want the hardware to do L2 learning via fdb and then
>>>> when flows get 'trapped' into software we want to handle them
>>>> differently. Possibly send them onto a specific application for logging.
>>> Yes, but that can be done in transparent way, exposing hw ports, having
>>> them in bridge/ovs/whatever. Same as we do with rocker.
>>>
>> My point is you don't want a bridge in software at all. So I don't
>> understand the "transparent" way. In this case you want to configure
>> the hardware to do l2 bridge and put the ports in some other objects
>> for simplicity consider a OVS instance in software. In this model
>> the ports are attached to the software OVS and we do not want to
>> "transparently" offload any of OVS to hardware.
>>
>> Also ports can not be in both an OVS instance and bridge instance.
>>
>>                    +----+----+
>>                    |   OVS   |      <- netdevs mapped to sw ovs, not offoaded
>>                    +----+----+
>>                       |    |
>>                     sw0p1 sw0p2     <- netdev representing hardware ports
>>                       |    |
>>               +----+----+----+---+
>>               |    L2 hw bridge  |  <- l2 hardware bridge managed via netlink
>>               +----+----+----+---+
>>
>> In many cases it doesn't make any sense to fall back to software.
>> You can't have a 100Gbps links "falling" back onto the kernel datapath.
>> And in these environments having ports attached to a "transparent" bridge
>> breaks. Worse the management CPU is usually something light, its not
>> typically a quad socket top of line CPU where you might have a chance.
>>
>> Nothing is broke at the moment because we have the "self" flag. I'm
>> responding to you "incorrect and hacky" comment. Similarly we are
>> going to need a flag for L3 that puts the rule in hardware or fails.
>> Just like L2 we can't have L3 being sent into software its not a
>> viable fallback path in many use cases. And doing it "transparently"
>> so that the controlling agent is unaware it is offloaded makes it
>> difficult to manage the system. I think the "transparent" model only
>> works for smallish devices, home routers and the likes.
>>
> 
> I have not followed Jiri's and your exchange of comments fully yet.
> If it helps I just wanted to clarify the part where the word 'transparency' was introduced in this thread:
> 
> This is in the context of traversing lower devices to get to the switch port (example, a bond with switch ports as slaves and you want to reach the slaves via the bond).
> 
> Its was not in the context of whether the kernel bridge driver is used or not for l2 offload.
> Understand that there are l2 nics which are programmed today by directly going to the driver
> bypassing the bridge driver. and these are programmed with 'self' today.

Actually not just NICs but also switches will use this.

> 
> Even for offloads that use the in kernel bridge driver (switch devices eg rocker),
> user can use  'self' to go directly to the switch driver. And this is required in some cases
> where you want a bridge port attribute to be different than the in-kernel bridge port attribute.
> eg learning.
> 
> bridge link set dev swp1 learning off   (sets learning off in both in-kernel bridge and rocker)
> bridge link set dev swp1 learning on self   (sets learning on in rocker)

yep :) this is my use case. And I will need to add similar policy to l3 which I will
hopefully get to soon. I was just keying off the 

> 
> To describe the stacked netdev/bridge port case which is the context of this thread,
> a rocker port can be a slave of a bond and the bond can be a bridge port. In such cases you want
> to traverse the bond lowerdevs to get to the rocker port to call into the switch driver.
> 
> bridge link set dev bond0 learning off   (sets learning off in both in-kernel bridge and rocker)
> bridge link set dev bond0 learning on self   (sets learning on in rocker)
> 
> For the above to work, since rtnetlink.c calls the op on the port driver directly , bonding driver should implement
> the required op.

OK, but I'm not entirely sure this is correct. I'm trying to wrap my head around it. In this
case it can be _any_ type of stacked device correct?

So what about a vlan device? In this case the software viewpoint is different then the hardware
viewpoint so is it correct to pass the configuration down like this? Also what if the bond device
is a LAG, is it correct to passthrough like this?

Thanks for the clarification I guess I need to work through some examples to convince myself
this works. I'm guessing you (or someone) already did this and I'm just late to the game.

Thanks,
.John

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-18  0:16                                           ` John Fastabend
@ 2015-03-18  6:29                                             ` roopa
  2015-03-18 15:24                                               ` John Fastabend
  0 siblings, 1 reply; 38+ messages in thread
From: roopa @ 2015-03-18  6:29 UTC (permalink / raw)
  To: John Fastabend
  Cc: Jiri Pirko, John Fastabend, Arad, Ronen, Netdev, Scott Feldman,
	David S. Miller

On 3/17/15, 5:16 PM, John Fastabend wrote:
> On 03/17/2015 01:27 PM, roopa wrote:
>> On 3/17/15, 7:31 AM, John Fastabend wrote:
>>> On 03/17/2015 12:00 AM, Jiri Pirko wrote:
>>>> Mon, Mar 16, 2015 at 11:01:30PM CET, john.fastabend@gmail.com wrote:
>>>>> [...]
>>>>>
>>>>>>> If this position is accepted, it would be best to enforce it, possibly in
>>>>>>> rtnl_bridge_setlink().
>>>>>>> My recollection is that others asked to preserve use-cases where SELF flag
>>>>>>> is used for targeting port devices directly without using a bridge device.
>>>>>> I know it is possible, and it is incorrect and hacky. But it is part of
>>>>>> user api :/ I think we should not abuse this more in the future and
>>>>>> rather make the api correct and use that.
>>>>>>
>>>>> Working my way through my backlog of email sorry for the days delay.
>>>>>
>>>>> Jiri, are you suggesting it is incorrect to configure the hardware L2
>>>>> independent of bridge device? There is absolutely use cases for this.
>>>>>
>>>>> The case being we want the hardware to do L2 learning via fdb and then
>>>>> when flows get 'trapped' into software we want to handle them
>>>>> differently. Possibly send them onto a specific application for logging.
>>>> Yes, but that can be done in transparent way, exposing hw ports, having
>>>> them in bridge/ovs/whatever. Same as we do with rocker.
>>>>
>>> My point is you don't want a bridge in software at all. So I don't
>>> understand the "transparent" way. In this case you want to configure
>>> the hardware to do l2 bridge and put the ports in some other objects
>>> for simplicity consider a OVS instance in software. In this model
>>> the ports are attached to the software OVS and we do not want to
>>> "transparently" offload any of OVS to hardware.
>>>
>>> Also ports can not be in both an OVS instance and bridge instance.
>>>
>>>                     +----+----+
>>>                     |   OVS   |      <- netdevs mapped to sw ovs, not offoaded
>>>                     +----+----+
>>>                        |    |
>>>                      sw0p1 sw0p2     <- netdev representing hardware ports
>>>                        |    |
>>>                +----+----+----+---+
>>>                |    L2 hw bridge  |  <- l2 hardware bridge managed via netlink
>>>                +----+----+----+---+
>>>
>>> In many cases it doesn't make any sense to fall back to software.
>>> You can't have a 100Gbps links "falling" back onto the kernel datapath.
>>> And in these environments having ports attached to a "transparent" bridge
>>> breaks. Worse the management CPU is usually something light, its not
>>> typically a quad socket top of line CPU where you might have a chance.
>>>
>>> Nothing is broke at the moment because we have the "self" flag. I'm
>>> responding to you "incorrect and hacky" comment. Similarly we are
>>> going to need a flag for L3 that puts the rule in hardware or fails.
>>> Just like L2 we can't have L3 being sent into software its not a
>>> viable fallback path in many use cases. And doing it "transparently"
>>> so that the controlling agent is unaware it is offloaded makes it
>>> difficult to manage the system. I think the "transparent" model only
>>> works for smallish devices, home routers and the likes.
>>>
>> I have not followed Jiri's and your exchange of comments fully yet.
>> If it helps I just wanted to clarify the part where the word 'transparency' was introduced in this thread:
>>
>> This is in the context of traversing lower devices to get to the switch port (example, a bond with switch ports as slaves and you want to reach the slaves via the bond).
>>
>> Its was not in the context of whether the kernel bridge driver is used or not for l2 offload.
>> Understand that there are l2 nics which are programmed today by directly going to the driver
>> bypassing the bridge driver. and these are programmed with 'self' today.
> Actually not just NICs but also switches will use this.
>
>> Even for offloads that use the in kernel bridge driver (switch devices eg rocker),
>> user can use  'self' to go directly to the switch driver. And this is required in some cases
>> where you want a bridge port attribute to be different than the in-kernel bridge port attribute.
>> eg learning.
>>
>> bridge link set dev swp1 learning off   (sets learning off in both in-kernel bridge and rocker)
>> bridge link set dev swp1 learning on self   (sets learning on in rocker)
> yep :) this is my use case. And I will need to add similar policy to l3 which I will
> hopefully get to soon. I was just keying off the
>
>> To describe the stacked netdev/bridge port case which is the context of this thread,
>> a rocker port can be a slave of a bond and the bond can be a bridge port. In such cases you want
>> to traverse the bond lowerdevs to get to the rocker port to call into the switch driver.
>>
>> bridge link set dev bond0 learning off   (sets learning off in both in-kernel bridge and rocker)
>> bridge link set dev bond0 learning on self   (sets learning on in rocker)
>>
>> For the above to work, since rtnetlink.c calls the op on the port driver directly , bonding driver should implement
>> the required op.
> OK, but I'm not entirely sure this is correct. I'm trying to wrap my head around it. In this
> case it can be _any_ type of stacked device correct?
>
> So what about a vlan device?
Our main focus has always been devices which use the in-kernel bridge 
driver. We have been testing this with mainly vlan
filtering bridge. But yes, vlan and vxlan devices will need to be 
supported in the stacked netdevice case.
And that's why the initial proposal was to transparently traverse the 
stacked netdevs and we are trying to bring that back in this thread.

> In this case the software viewpoint is different then the hardware
> viewpoint so is it correct to pass the configuration down like this?

We just want bridge port config passed down to the switch driver.

> Also what if the bond device
> is a LAG, is it correct to passthrough like this?
hmm...I don't think it matters. We are just trying to get to the switch 
driver.
>
> Thanks for the clarification I guess I need to work through some examples to convince myself
> this works. I'm guessing you (or someone) already did this and I'm just late to the game.
>
For cases where we use the in-kernel bridge driver, yes it is tested for 
passing down bridge port attributes that is
different than the in-kernel bridge attributes (example learning).

I am not sure how this would be and what other issues you will hit if 
you are planning to bypass the kernel and directly go to the switch 
driver for all l2 and l3 in the stacked netdevice case. For l3, its 
better to use the in-kernel route fib offload mechanism which was 
recently submitted by scott feldman.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-18  6:29                                             ` roopa
@ 2015-03-18 15:24                                               ` John Fastabend
  2015-03-18 16:55                                                 ` John Fastabend
                                                                   ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: John Fastabend @ 2015-03-18 15:24 UTC (permalink / raw)
  To: roopa
  Cc: John Fastabend, Jiri Pirko, Arad, Ronen, Netdev, Scott Feldman,
	David S. Miller

[...]

>> So what about a vlan device?
> Our main focus has always been devices which use the in-kernel bridge
> driver. We have been testing this with mainly vlan
> filtering bridge. But yes, vlan and vxlan devices will need to be
> supported in the stacked netdevice case.
> And that's why the initial proposal was to transparently traverse the
> stacked netdevs and we are trying to bring that back in this thread.
>
>> In this case the software viewpoint is different then the hardware
>> viewpoint so is it correct to pass the configuration down like this?
>
> We just want bridge port config passed down to the switch driver.
>

Sure thought about it some more and I can't see any cases that break.
But it is a change in the model from the normal software case.

>> Also what if the bond device
>> is a LAG, is it correct to passthrough like this?
> hmm...I don't think it matters. We are just trying to get to the switch
> driver.

Came to the same conclusion, it doesn't seem to matter it is different
though.

>>
>> Thanks for the clarification I guess I need to work through some
>> examples to convince myself
>> this works. I'm guessing you (or someone) already did this and I'm
>> just late to the game.
>>
> For cases where we use the in-kernel bridge driver, yes it is tested for
> passing down bridge port attributes that is
> different than the in-kernel bridge attributes (example learning).

Yep, I've tested it here as well this is good.

>
> I am not sure how this would be and what other issues you will hit if
> you are planning to bypass the kernel and directly go to the switch
> driver for all l2 and l3 in the stacked netdevice case. For l3, its
> better to use the in-kernel route fib offload mechanism which was
> recently submitted by scott feldman.
>

Why? I saw the patched and liked it but noted that the existing policy
wont actually work for real networks. Its a good start. My proposal
is to add a flag to l3 to similarly fail to load a rule if it can't
be pushed at hardware same as l2.

I'm getting off the topic of this thread I guess but I'm not
bypassing anything IMO. I want to configure the hardware datapath and I
want to configure the software datapath. For devices with 10, 40,
100Gbps links dropping traffic into the software datapath is not a
viable option in many cases. Traffic will degrade, packets will be
dropped and with 100's or 1000's of these switches managing a network
that some times jumps into software or worse on a single path through
the network might be in software on one hop and in hardware in the next
is not manageable.

When a packet hits the software datapath it is the exception case I want
to handle it as an exception. It also got into the software datapath
because I had a "trap" action in hardware to send it up to software. So
having the software/hardware datapaths mirror each other isn't really
useful at least on the devices I work on. For small home routers and
other types of systems it makes some sense. Perhaps you can even manage
10Gpbs ports like this if you are careful but I really don't see how you
throw a set of 100Gbps links up to kernel datapath running on a
smallish CPU.

.John

-- 
John Fastabend         Intel Corporation

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-18 15:24                                               ` John Fastabend
@ 2015-03-18 16:55                                                 ` John Fastabend
  2015-03-19  5:03                                                 ` roopa
  2015-03-19  5:49                                                 ` Scott Feldman
  2 siblings, 0 replies; 38+ messages in thread
From: John Fastabend @ 2015-03-18 16:55 UTC (permalink / raw)
  To: roopa
  Cc: John Fastabend, Jiri Pirko, Arad, Ronen, Netdev, Scott Feldman,
	David S. Miller

On 03/18/2015 08:24 AM, John Fastabend wrote:
> [...]
>
>>> So what about a vlan device?
>> Our main focus has always been devices which use the in-kernel bridge
>> driver. We have been testing this with mainly vlan
>> filtering bridge. But yes, vlan and vxlan devices will need to be
>> supported in the stacked netdevice case.
>> And that's why the initial proposal was to transparently traverse the
>> stacked netdevs and we are trying to bring that back in this thread.
>>
>>> In this case the software viewpoint is different then the hardware
>>> viewpoint so is it correct to pass the configuration down like this?
>>
>> We just want bridge port config passed down to the switch driver.
>>
>
> Sure thought about it some more and I can't see any cases that break.
> But it is a change in the model from the normal software case.
>
>>> Also what if the bond device
>>> is a LAG, is it correct to passthrough like this?
>> hmm...I don't think it matters. We are just trying to get to the switch
>> driver.
>
> Came to the same conclusion, it doesn't seem to matter it is different
> though.
>
>>>
>>> Thanks for the clarification I guess I need to work through some
>>> examples to convince myself
>>> this works. I'm guessing you (or someone) already did this and I'm
>>> just late to the game.
>>>
>> For cases where we use the in-kernel bridge driver, yes it is tested for
>> passing down bridge port attributes that is
>> different than the in-kernel bridge attributes (example learning).
>
> Yep, I've tested it here as well this is good.
>
>>
>> I am not sure how this would be and what other issues you will hit if
>> you are planning to bypass the kernel and directly go to the switch
>> driver for all l2 and l3 in the stacked netdevice case. For l3, its
>> better to use the in-kernel route fib offload mechanism which was
>> recently submitted by scott feldman.
>>
>
> Why? I saw the patched and liked it but noted that the existing policy
> wont actually work for real networks. Its a good start. My proposal
> is to add a flag to l3 to similarly fail to load a rule if it can't
> be pushed at hardware same as l2.
>

Or minimally don't flush the l3 table on an overrun and generate
a notification that the flow has _only_ been added to software. Then
my software agent can handle the exception case in some more intelligent
way if it wants to and I haven't dropped everything into software.

The best way to proceed is probably to write up a patch with a proposal
and get feedback.

> I'm getting off the topic of this thread I guess but I'm not
> bypassing anything IMO. I want to configure the hardware datapath and I
> want to configure the software datapath. For devices with 10, 40,
> 100Gbps links dropping traffic into the software datapath is not a
> viable option in many cases. Traffic will degrade, packets will be
> dropped and with 100's or 1000's of these switches managing a network
> that some times jumps into software or worse on a single path through
> the network might be in software on one hop and in hardware in the next
> is not manageable.
>
> When a packet hits the software datapath it is the exception case I want
> to handle it as an exception. It also got into the software datapath
> because I had a "trap" action in hardware to send it up to software. So
> having the software/hardware datapaths mirror each other isn't really
> useful at least on the devices I work on. For small home routers and
> other types of systems it makes some sense. Perhaps you can even manage
> 10Gpbs ports like this if you are careful but I really don't see how you
> throw a set of 100Gbps links up to kernel datapath running on a
> smallish CPU.
>
> .John
>


-- 
John Fastabend         Intel Corporation

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-18 15:24                                               ` John Fastabend
  2015-03-18 16:55                                                 ` John Fastabend
@ 2015-03-19  5:03                                                 ` roopa
  2015-03-19  5:49                                                 ` Scott Feldman
  2 siblings, 0 replies; 38+ messages in thread
From: roopa @ 2015-03-19  5:03 UTC (permalink / raw)
  To: John Fastabend
  Cc: John Fastabend, Jiri Pirko, Arad, Ronen, Netdev, Scott Feldman,
	David S. Miller

On 3/18/15, 8:24 AM, John Fastabend wrote:
> [...]
>
>>> So what about a vlan device?
>> Our main focus has always been devices which use the in-kernel bridge
>> driver. We have been testing this with mainly vlan
>> filtering bridge. But yes, vlan and vxlan devices will need to be
>> supported in the stacked netdevice case.
>> And that's why the initial proposal was to transparently traverse the
>> stacked netdevs and we are trying to bring that back in this thread.
>>
>>> In this case the software viewpoint is different then the hardware
>>> viewpoint so is it correct to pass the configuration down like this?
>>
>> We just want bridge port config passed down to the switch driver.
>>
>
> Sure thought about it some more and I can't see any cases that break.
> But it is a change in the model from the normal software case.
>
>>> Also what if the bond device
>>> is a LAG, is it correct to passthrough like this?
>> hmm...I don't think it matters. We are just trying to get to the switch
>> driver.
>
> Came to the same conclusion, it doesn't seem to matter it is different
> though.
>
>>>
>>> Thanks for the clarification I guess I need to work through some
>>> examples to convince myself
>>> this works. I'm guessing you (or someone) already did this and I'm
>>> just late to the game.
>>>
>> For cases where we use the in-kernel bridge driver, yes it is tested for
>> passing down bridge port attributes that is
>> different than the in-kernel bridge attributes (example learning).
>
> Yep, I've tested it here as well this is good.
thanks for confirming. I will test this again on my side and jiri,..i 
can resubmit this (unless you prefer to).
>
>>
>> I am not sure how this would be and what other issues you will hit if
>> you are planning to bypass the kernel and directly go to the switch
>> driver for all l2 and l3 in the stacked netdevice case. For l3, its
>> better to use the in-kernel route fib offload mechanism which was
>> recently submitted by scott feldman.
>>
>
> Why? I saw the patched and liked it but noted that the existing policy
> wont actually work for real networks. Its a good start. My proposal
> is to add a flag to l3 to similarly fail to load a rule if it can't
> be pushed at hardware same as l2.

agree and i had raised that concern.
the current policy will not work for us too. But the first attempt was 
trying to keep it simple.
we should be able to change/refine the policy in subsequent patches.
>
> I'm getting off the topic of this thread I guess but I'm not
> bypassing anything IMO. I want to configure the hardware datapath and I
> want to configure the software datapath. For devices with 10, 40,
> 100Gbps links dropping traffic into the software datapath is not a
> viable option in many cases. Traffic will degrade, packets will be
> dropped and with 100's or 1000's of these switches managing a network
> that some times jumps into software or worse on a single path through
> the network might be in software on one hop and in hardware in the next
> is not manageable.

agreed.
>
> When a packet hits the software datapath it is the exception case I want
> to handle it as an exception. It also got into the software datapath
> because I had a "trap" action in hardware to send it up to software. So
> having the software/hardware datapaths mirror each other isn't really
> useful at least on the devices I work on. For small home routers and
> other types of systems it makes some sense. Perhaps you can even manage
> 10Gpbs ports like this if you are careful but I really don't see how you
> throw a set of 100Gbps links up to kernel datapath running on a
> smallish CPU.
ack. For us,  we want the kernel FIB to match FIB in HW, but any failure 
to add in HW should fail in kernel too telling Quagga
or any other routing daemon to not add any more routes. With the CPU's 
on the boxes we support, there is no way we can handle fallback to  in 
kernel/software.

Thanks,
Roopa

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-18 15:24                                               ` John Fastabend
  2015-03-18 16:55                                                 ` John Fastabend
  2015-03-19  5:03                                                 ` roopa
@ 2015-03-19  5:49                                                 ` Scott Feldman
  2015-03-19 13:29                                                   ` roopa
  2 siblings, 1 reply; 38+ messages in thread
From: Scott Feldman @ 2015-03-19  5:49 UTC (permalink / raw)
  To: John Fastabend
  Cc: roopa, John Fastabend, Jiri Pirko, Arad, Ronen, Netdev, David S. Miller

On Wed, Mar 18, 2015 at 8:24 AM, John Fastabend
<john.fastabend@gmail.com> wrote:
> [...]

>> I am not sure how this would be and what other issues you will hit if
>> you are planning to bypass the kernel and directly go to the switch
>> driver for all l2 and l3 in the stacked netdevice case. For l3, its
>> better to use the in-kernel route fib offload mechanism which was
>> recently submitted by scott feldman.
>>
>
> Why? I saw the patched and liked it but noted that the existing policy
> wont actually work for real networks. Its a good start. My proposal
> is to add a flag to l3 to similarly fail to load a rule if it can't
> be pushed at hardware same as l2.

RIght, what we have is a start to get the basic plumbing in place.
Agreed, the current would be inadequate for a real switch that can't
handle a software fallback.

Maybe the next step is to not flush hw of all routes on failure to add
the Nth one, but rather just fail the Nth completely (don't install in
hw or sw and return err to user).  This would keep the switch alive,
but now moves a decision to the user.  The user must decide what to do
with the failed Nth route.

We also added the netlink flag RTNH_F_EXTERNAL to mark routes
offloaded to hardware, but the marking is only done internally now, by
the kernel.  What I'm hoping is we can use that same flag in the
user's netlink msg to work like you describe: if user requests
RTNH_F_EXTERNAL, and it can't be loaded into hw, don't load into sw.
Or something like that.  Again, punting the decision on what to do
next to the user.

This part of the discussion should probably move to a new thread;
maybe someone brave can propose a patch to move us to the next level?

-scott

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-19  5:49                                                 ` Scott Feldman
@ 2015-03-19 13:29                                                   ` roopa
  2015-03-19 13:59                                                     ` John Fastabend
  0 siblings, 1 reply; 38+ messages in thread
From: roopa @ 2015-03-19 13:29 UTC (permalink / raw)
  To: Scott Feldman
  Cc: John Fastabend, John Fastabend, Jiri Pirko, Arad, Ronen, Netdev,
	David S. Miller

On 3/18/15, 10:49 PM, Scott Feldman wrote:
> On Wed, Mar 18, 2015 at 8:24 AM, John Fastabend
> <john.fastabend@gmail.com> wrote:
>> [...]
>>> I am not sure how this would be and what other issues you will hit if
>>> you are planning to bypass the kernel and directly go to the switch
>>> driver for all l2 and l3 in the stacked netdevice case. For l3, its
>>> better to use the in-kernel route fib offload mechanism which was
>>> recently submitted by scott feldman.
>>>
>> Why? I saw the patched and liked it but noted that the existing policy
>> wont actually work for real networks. Its a good start. My proposal
>> is to add a flag to l3 to similarly fail to load a rule if it can't
>> be pushed at hardware same as l2.
> RIght, what we have is a start to get the basic plumbing in place.
> Agreed, the current would be inadequate for a real switch that can't
> handle a software fallback.
>
> Maybe the next step is to not flush hw of all routes on failure to add
> the Nth one, but rather just fail the Nth completely (don't install in
> hw or sw and return err to user).  This would keep the switch alive,
> but now moves a decision to the user.  The user must decide what to do
> with the failed Nth route.
I would prefer this. The routing daemon probably already has policies to 
handle routes
  that don't get installed in the FIB (It should not really care if the 
FIB is hardware accelerated or not).

>
> We also added the netlink flag RTNH_F_EXTERNAL to mark routes
> offloaded to hardware, but the marking is only done internally now, by
> the kernel.  What I'm hoping is we can use that same flag in the
> user's netlink msg to work like you describe: if user requests
> RTNH_F_EXTERNAL, and it can't be loaded into hw, don't load into sw.
> Or something like that.  Again, punting the decision on what to do
> next to the user.
yes, however this requires change in userspace (routing daemon) to 
explicitly set this flag.
It definitely can be optional IMO for people who need it (maybe JohnF)
>
> This part of the discussion should probably move to a new thread;
> maybe someone brave can propose a patch to move us to the next level?
>
ack, I will try and get to it this week, unless somebody beats me to it.

Thanks,
Roopa

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler
  2015-03-19 13:29                                                   ` roopa
@ 2015-03-19 13:59                                                     ` John Fastabend
  0 siblings, 0 replies; 38+ messages in thread
From: John Fastabend @ 2015-03-19 13:59 UTC (permalink / raw)
  To: roopa, Scott Feldman
  Cc: John Fastabend, Jiri Pirko, Arad, Ronen, Netdev, David S. Miller

On 03/19/2015 06:29 AM, roopa wrote:
> On 3/18/15, 10:49 PM, Scott Feldman wrote:
>> On Wed, Mar 18, 2015 at 8:24 AM, John Fastabend
>> <john.fastabend@gmail.com> wrote:
>>> [...]
>>>> I am not sure how this would be and what other issues you will hit if
>>>> you are planning to bypass the kernel and directly go to the switch
>>>> driver for all l2 and l3 in the stacked netdevice case. For l3, its
>>>> better to use the in-kernel route fib offload mechanism which was
>>>> recently submitted by scott feldman.
>>>>
>>> Why? I saw the patched and liked it but noted that the existing policy
>>> wont actually work for real networks. Its a good start. My proposal
>>> is to add a flag to l3 to similarly fail to load a rule if it can't
>>> be pushed at hardware same as l2.
>> RIght, what we have is a start to get the basic plumbing in place.
>> Agreed, the current would be inadequate for a real switch that can't
>> handle a software fallback.
>>
>> Maybe the next step is to not flush hw of all routes on failure to add
>> the Nth one, but rather just fail the Nth completely (don't install in
>> hw or sw and return err to user).  This would keep the switch alive,
>> but now moves a decision to the user.  The user must decide what to do
>> with the failed Nth route.
> I would prefer this. The routing daemon probably already has policies to handle routes
>  that don't get installed in the FIB (It should not really care if the FIB is hardware accelerated or not).
> 

+1 this works for me as well.

>>
>> We also added the netlink flag RTNH_F_EXTERNAL to mark routes
>> offloaded to hardware, but the marking is only done internally now, by
>> the kernel.  What I'm hoping is we can use that same flag in the
>> user's netlink msg to work like you describe: if user requests
>> RTNH_F_EXTERNAL, and it can't be loaded into hw, don't load into sw.
>> Or something like that.  Again, punting the decision on what to do
>> next to the user.
> yes, however this requires change in userspace (routing daemon) to explicitly set this flag.
> It definitely can be optional IMO for people who need it (maybe JohnF)

Yes it would be helpful for some software but I think getting the above
case working first seems to be the right approach to me.

>>
>> This part of the discussion should probably move to a new thread;
>> maybe someone brave can propose a patch to move us to the next level?
>>
> ack, I will try and get to it this week, unless somebody beats me to it.
> 

Thanks.

> Thanks,
> Roopa
> 
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2015-03-19 13:59 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-04  0:15 [PATCH net-next] rocker: check for BRIDGE_FLAGS_SELF in bridge setlink handler roopa
2015-03-04  4:15 ` John Fastabend
2015-03-04  7:02 ` Scott Feldman
2015-03-04  8:51   ` roopa
2015-03-04 16:24     ` Scott Feldman
2015-03-05  0:31       ` roopa
2015-03-05  8:02     ` Jiri Pirko
2015-03-05 14:55       ` roopa
2015-03-05 20:06         ` Scott Feldman
2015-03-05 20:43           ` roopa
2015-03-05 21:40             ` roopa
2015-03-06  9:52             ` Scott Feldman
2015-03-08 14:19               ` roopa
2015-03-08 23:17                 ` Scott Feldman
2015-03-09  0:20                   ` roopa
     [not found]                   ` <CAJieiUhHdXOZjWkb4s_GviLwzq5Gct-1o8xv8b-JeM46S4e-dg@mail.gmail.com>
2015-03-09  6:40                     ` Jiri Pirko
2015-03-09 15:59                       ` Arad, Ronen
2015-03-09 16:07                         ` Jiri Pirko
2015-03-10  0:51                           ` Arad, Ronen
2015-03-10  6:39                             ` Jiri Pirko
2015-03-10  8:02                               ` Arad, Ronen
2015-03-10  8:28                                 ` Jiri Pirko
2015-03-16 22:01                                   ` John Fastabend
2015-03-17  7:00                                     ` Jiri Pirko
2015-03-17 14:31                                       ` John Fastabend
2015-03-17 20:27                                         ` roopa
2015-03-18  0:16                                           ` John Fastabend
2015-03-18  6:29                                             ` roopa
2015-03-18 15:24                                               ` John Fastabend
2015-03-18 16:55                                                 ` John Fastabend
2015-03-19  5:03                                                 ` roopa
2015-03-19  5:49                                                 ` Scott Feldman
2015-03-19 13:29                                                   ` roopa
2015-03-19 13:59                                                     ` John Fastabend
     [not found]                         ` <CAJieiUhcdfGitY7rbG11Vt_Beemz8dy3=gKtvbyVLS8O0DkgNw@mail.gmail.com>
2015-03-09 23:23                           ` Roopa Prabhu
2015-03-05  8:36 ` Jiri Pirko
2015-03-05 15:01   ` roopa
2015-03-05 15:09     ` roopa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.