* [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
@ 2015-02-20 7:09 sfeldma
2015-02-20 7:09 ` [PATCH net-next RFC 1/5] neighbour: add external aged flag sfeldma
` (6 more replies)
0 siblings, 7 replies; 30+ messages in thread
From: sfeldma @ 2015-02-20 7:09 UTC (permalink / raw)
To: netdev, jiri, roopa, linux, f.fainelli, andrew, gospo, vbandaru,
siva.mannem.lnx
From: Scott Feldman <sfeldma@gmail.com>
Add a new NTF_EXT_FLAG to mark an FDB as externally aged, for example by
offload hardware. Switchdev driver/devices can set this flag when learning a
new FDB entry and SW (the bridge driver) will skip this entry when running its
ageing task. If flag is set, the driver/device is responsible for calling
call_netdev_switch_notifiers(NETDEV_SWITCH_FDB_DEL, ...) when entry expires.
This give the flexibility for driver/device to decide ageing policy based on
its capabilities. For devices managing many FDB entries, it is desireable for
the device to aged out its own entries. Devices not capable of aged entries
can rely of SW to age out the entries.
Scott Feldman (5):
neighbour: add external aged flag
switchdev: add ntf_flags to FDB notifier
bridge: call external learn add if adding FDB entry with
NTF_EXT_LEARNED set
bridge: let HW control FDB ageing by setting NTF_EXT_AGED
rocker: explicitly set SW ageing for rocker
drivers/net/ethernet/rocker/rocker.c | 9 +++++----
include/net/switchdev.h | 1 +
include/uapi/linux/neighbour.h | 1 +
net/bridge/br.c | 18 ++++++++++--------
net/bridge/br_fdb.c | 16 +++++++++++++---
net/bridge/br_private.h | 8 +++++---
6 files changed, 35 insertions(+), 18 deletions(-)
--
1.7.10.4
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH net-next RFC 1/5] neighbour: add external aged flag
2015-02-20 7:09 [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW sfeldma
@ 2015-02-20 7:09 ` sfeldma
2015-02-20 7:09 ` [PATCH net-next RFC 2/5] switchdev: add ntf_flags to FDB notifier sfeldma
` (5 subsequent siblings)
6 siblings, 0 replies; 30+ messages in thread
From: sfeldma @ 2015-02-20 7:09 UTC (permalink / raw)
To: netdev, jiri, roopa, linux, f.fainelli, andrew, gospo, vbandaru,
siva.mannem.lnx
From: Scott Feldman <sfeldma@gmail.com>
NFT_EXT_AGED marks neighbor (FDB) entries as externally aged, typically in
hardware. Software can skip these entries when aging entries.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
---
include/uapi/linux/neighbour.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h
index 3873a35..e9866f9 100644
--- a/include/uapi/linux/neighbour.h
+++ b/include/uapi/linux/neighbour.h
@@ -40,6 +40,7 @@ enum {
#define NTF_MASTER 0x04
#define NTF_PROXY 0x08 /* == ATF_PUBL */
#define NTF_EXT_LEARNED 0x10
+#define NTF_EXT_AGED 0x20
#define NTF_ROUTER 0x80
/*
--
1.7.10.4
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH net-next RFC 2/5] switchdev: add ntf_flags to FDB notifier
2015-02-20 7:09 [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW sfeldma
2015-02-20 7:09 ` [PATCH net-next RFC 1/5] neighbour: add external aged flag sfeldma
@ 2015-02-20 7:09 ` sfeldma
2015-02-20 7:09 ` [PATCH net-next RFC 3/5] bridge: call external learn add if adding FDB entry with NTF_EXT_LEARNED set sfeldma
` (4 subsequent siblings)
6 siblings, 0 replies; 30+ messages in thread
From: sfeldma @ 2015-02-20 7:09 UTC (permalink / raw)
To: netdev, jiri, roopa, linux, f.fainelli, andrew, gospo, vbandaru,
siva.mannem.lnx
From: Scott Feldman <sfeldma@gmail.com>
Let switchdev driver set NTF_* flags per FDB entry. Driver can set NTF_EXT_AGED
to indicate driver/device will aged out entry. Driver can set NTF_EXT_LEARNED
to indicate driver/device learned this entry.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
---
include/net/switchdev.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index cfcdac2..a523745 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -26,6 +26,7 @@ struct netdev_switch_notifier_fdb_info {
struct netdev_switch_notifier_info info; /* must be first */
const unsigned char *addr;
u16 vid;
+ u8 ntf_flags;
};
static inline struct net_device *
--
1.7.10.4
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH net-next RFC 3/5] bridge: call external learn add if adding FDB entry with NTF_EXT_LEARNED set
2015-02-20 7:09 [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW sfeldma
2015-02-20 7:09 ` [PATCH net-next RFC 1/5] neighbour: add external aged flag sfeldma
2015-02-20 7:09 ` [PATCH net-next RFC 2/5] switchdev: add ntf_flags to FDB notifier sfeldma
@ 2015-02-20 7:09 ` sfeldma
2015-02-20 7:09 ` [PATCH net-next RFC 4/5] bridge: let HW control FDB ageing by setting NTF_EXT_AGED sfeldma
` (3 subsequent siblings)
6 siblings, 0 replies; 30+ messages in thread
From: sfeldma @ 2015-02-20 7:09 UTC (permalink / raw)
To: netdev, jiri, roopa, linux, f.fainelli, andrew, gospo, vbandaru,
siva.mannem.lnx
From: Scott Feldman <sfeldma@gmail.com>
Only call into bridge driver to add externally learned FDB entry if driver/
device indicates entry with NTF_EXT_LEARNED.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
---
net/bridge/br.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/net/bridge/br.c b/net/bridge/br.c
index fb57ab6..3dc5463 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -140,17 +140,19 @@ static int br_netdev_switch_event(struct notifier_block *unused,
switch (event) {
case NETDEV_SWITCH_FDB_ADD:
fdb_info = ptr;
- err = br_fdb_external_learn_add(br, p, fdb_info->addr,
- fdb_info->vid);
- if (err)
- err = notifier_from_errno(err);
+ if (fdb_info->ntf_flags & NTF_EXT_LEARNED) {
+ err = br_fdb_external_learn_add(br, p, fdb_info);
+ if (err)
+ err = notifier_from_errno(err);
+ }
break;
case NETDEV_SWITCH_FDB_DEL:
fdb_info = ptr;
- err = br_fdb_external_learn_del(br, p, fdb_info->addr,
- fdb_info->vid);
- if (err)
- err = notifier_from_errno(err);
+ if (fdb_info->ntf_flags & NTF_EXT_LEARNED) {
+ err = br_fdb_external_learn_del(br, p, fdb_info);
+ if (err)
+ err = notifier_from_errno(err);
+ }
break;
}
--
1.7.10.4
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH net-next RFC 4/5] bridge: let HW control FDB ageing by setting NTF_EXT_AGED
2015-02-20 7:09 [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW sfeldma
` (2 preceding siblings ...)
2015-02-20 7:09 ` [PATCH net-next RFC 3/5] bridge: call external learn add if adding FDB entry with NTF_EXT_LEARNED set sfeldma
@ 2015-02-20 7:09 ` sfeldma
2015-02-20 17:31 ` David Miller
2015-02-20 7:09 ` [PATCH net-next RFC 5/5] rocker: explicitly set SW ageing for rocker sfeldma
` (2 subsequent siblings)
6 siblings, 1 reply; 30+ messages in thread
From: sfeldma @ 2015-02-20 7:09 UTC (permalink / raw)
To: netdev, jiri, roopa, linux, f.fainelli, andrew, gospo, vbandaru,
siva.mannem.lnx
From: Scott Feldman <sfeldma@gmail.com>
During bridge's ageing function, exclude FDB entries marked with NTF_EXT_AGED.
The external driver/device which set NTF_EXT_AGED flag will age out these
entries.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
---
net/bridge/br_fdb.c | 16 +++++++++++++---
net/bridge/br_private.h | 8 +++++---
2 files changed, 18 insertions(+), 6 deletions(-)
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index e0670d7..9311858 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -24,6 +24,7 @@
#include <linux/atomic.h>
#include <asm/unaligned.h>
#include <linux/if_vlan.h>
+#include <net/switchdev.h>
#include "br_private.h"
static struct kmem_cache *br_fdb_cache __read_mostly;
@@ -280,7 +281,7 @@ void br_fdb_cleanup(unsigned long _data)
hlist_for_each_entry_safe(f, n, &br->hash[i], hlist) {
unsigned long this_timer;
- if (f->is_static)
+ if (f->is_static || f->is_external_aged)
continue;
this_timer = f->updated + delay;
if (time_before_eq(this_timer, jiffies))
@@ -482,6 +483,7 @@ static struct net_bridge_fdb_entry *fdb_create(struct hlist_head *head,
fdb->is_static = 0;
fdb->added_by_user = 0;
fdb->added_by_external_learn = 0;
+ fdb->is_external_aged = 0;
fdb->updated = fdb->used = jiffies;
hlist_add_head_rcu(&fdb->hlist, head);
}
@@ -615,6 +617,7 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br,
ndm->ndm_pad1 = 0;
ndm->ndm_pad2 = 0;
ndm->ndm_flags = fdb->added_by_external_learn ? NTF_EXT_LEARNED : 0;
+ ndm->ndm_flags |= fdb->is_external_aged ? NTF_EXT_AGED : 0;
ndm->ndm_type = 0;
ndm->ndm_ifindex = fdb->dst ? fdb->dst->dev->ifindex : br->dev->ifindex;
ndm->ndm_state = fdb_to_nud(fdb);
@@ -990,10 +993,13 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
}
int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
- const unsigned char *addr, u16 vid)
+ struct netdev_switch_notifier_fdb_info *fdb_info)
{
struct hlist_head *head;
struct net_bridge_fdb_entry *fdb;
+ const unsigned char *addr = fdb_info->addr;
+ u16 vid = fdb_info->vid;
+ bool is_external_aged = !!(fdb_info->ntf_flags & NTF_EXT_AGED);
int err = 0;
ASSERT_RTNL();
@@ -1008,6 +1014,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
goto err_unlock;
}
fdb->added_by_external_learn = 1;
+ fdb->is_external_aged = is_external_aged;
fdb_notify(br, fdb, RTM_NEWNEIGH);
} else if (fdb->added_by_external_learn) {
/* Refresh entry */
@@ -1015,6 +1022,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
} else if (!fdb->added_by_user) {
/* Take over SW learned entry */
fdb->added_by_external_learn = 1;
+ fdb->is_external_aged = is_external_aged;
fdb->updated = jiffies;
fdb_notify(br, fdb, RTM_NEWNEIGH);
}
@@ -1026,10 +1034,12 @@ err_unlock:
}
int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
- const unsigned char *addr, u16 vid)
+ struct netdev_switch_notifier_fdb_info *fdb_info)
{
struct hlist_head *head;
struct net_bridge_fdb_entry *fdb;
+ const unsigned char *addr = fdb_info->addr;
+ u16 vid = fdb_info->vid;
int err = 0;
ASSERT_RTNL();
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index de09199..8fb822e 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -18,6 +18,7 @@
#include <linux/netpoll.h>
#include <linux/u64_stats_sync.h>
#include <net/route.h>
+#include <net/switchdev.h>
#include <linux/if_vlan.h>
#define BR_HASH_BITS 8
@@ -101,7 +102,8 @@ struct net_bridge_fdb_entry
unsigned char is_local:1,
is_static:1,
added_by_user:1,
- added_by_external_learn:1;
+ added_by_external_learn:1,
+ is_external_aged:1;
__u16 vlan_id;
};
@@ -403,9 +405,9 @@ int br_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb,
int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);
void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);
int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
- const unsigned char *addr, u16 vid);
+ struct netdev_switch_notifier_fdb_info *fdb_info);
int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
- const unsigned char *addr, u16 vid);
+ struct netdev_switch_notifier_fdb_info *fdb_info);
/* br_forward.c */
void br_deliver(const struct net_bridge_port *to, struct sk_buff *skb);
--
1.7.10.4
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH net-next RFC 5/5] rocker: explicitly set SW ageing for rocker
2015-02-20 7:09 [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW sfeldma
` (3 preceding siblings ...)
2015-02-20 7:09 ` [PATCH net-next RFC 4/5] bridge: let HW control FDB ageing by setting NTF_EXT_AGED sfeldma
@ 2015-02-20 7:09 ` sfeldma
2015-02-20 9:46 ` Jiri Pirko
2015-02-20 17:29 ` [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW roopa
2015-02-21 1:23 ` Siva Mannem
6 siblings, 1 reply; 30+ messages in thread
From: sfeldma @ 2015-02-20 7:09 UTC (permalink / raw)
To: netdev, jiri, roopa, linux, f.fainelli, andrew, gospo, vbandaru,
siva.mannem.lnx
From: Scott Feldman <sfeldma@gmail.com>
Rocker (for now) will rely on SW (bridge) ageing. Other drivers/devices may
wish to set NTF_EXT_AGED to turn off SW (bridge) ageing.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
---
drivers/net/ethernet/rocker/rocker.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
index 34389b6a..7b55934 100644
--- a/drivers/net/ethernet/rocker/rocker.c
+++ b/drivers/net/ethernet/rocker/rocker.c
@@ -3026,10 +3026,11 @@ static void rocker_port_fdb_learn_work(struct work_struct *work)
container_of(work, struct rocker_fdb_learn_work, work);
bool removing = (lw->flags & ROCKER_OP_FLAG_REMOVE);
bool learned = (lw->flags & ROCKER_OP_FLAG_LEARNED);
- struct netdev_switch_notifier_fdb_info info;
-
- info.addr = lw->addr;
- info.vid = lw->vid;
+ struct netdev_switch_notifier_fdb_info info = {
+ .addr = lw->addr,
+ .vid = lw->vid,
+ .ntf_flags = NTF_EXT_LEARNED,
+ };
if (learned && removing)
call_netdev_switch_notifiers(NETDEV_SWITCH_FDB_DEL,
--
1.7.10.4
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 5/5] rocker: explicitly set SW ageing for rocker
2015-02-20 7:09 ` [PATCH net-next RFC 5/5] rocker: explicitly set SW ageing for rocker sfeldma
@ 2015-02-20 9:46 ` Jiri Pirko
2015-02-20 14:56 ` Scott Feldman
0 siblings, 1 reply; 30+ messages in thread
From: Jiri Pirko @ 2015-02-20 9:46 UTC (permalink / raw)
To: sfeldma
Cc: netdev, roopa, linux, f.fainelli, andrew, gospo, vbandaru,
siva.mannem.lnx
Fri, Feb 20, 2015 at 08:09:55AM CET, sfeldma@gmail.com wrote:
>From: Scott Feldman <sfeldma@gmail.com>
>
>Rocker (for now) will rely on SW (bridge) ageing. Other drivers/devices may
>wish to set NTF_EXT_AGED to turn off SW (bridge) ageing.
>
>Signed-off-by: Scott Feldman <sfeldma@gmail.com>
>---
> drivers/net/ethernet/rocker/rocker.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
>diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
>index 34389b6a..7b55934 100644
>--- a/drivers/net/ethernet/rocker/rocker.c
>+++ b/drivers/net/ethernet/rocker/rocker.c
>@@ -3026,10 +3026,11 @@ static void rocker_port_fdb_learn_work(struct work_struct *work)
> container_of(work, struct rocker_fdb_learn_work, work);
> bool removing = (lw->flags & ROCKER_OP_FLAG_REMOVE);
> bool learned = (lw->flags & ROCKER_OP_FLAG_LEARNED);
>- struct netdev_switch_notifier_fdb_info info;
>-
>- info.addr = lw->addr;
>- info.vid = lw->vid;
>+ struct netdev_switch_notifier_fdb_info info = {
>+ .addr = lw->addr,
>+ .vid = lw->vid,
>+ .ntf_flags = NTF_EXT_LEARNED,
>+ };
I think that this patch should be before "bridge: call external learn add
if adding FDB entry with NTF_EXT_LEARNED set" in order to not break
functionality during bisect.
>
> if (learned && removing)
> call_netdev_switch_notifiers(NETDEV_SWITCH_FDB_DEL,
>--
>1.7.10.4
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 5/5] rocker: explicitly set SW ageing for rocker
2015-02-20 9:46 ` Jiri Pirko
@ 2015-02-20 14:56 ` Scott Feldman
0 siblings, 0 replies; 30+ messages in thread
From: Scott Feldman @ 2015-02-20 14:56 UTC (permalink / raw)
To: Jiri Pirko
Cc: Netdev, Roopa Prabhu, Guenter Roeck, Florian Fainelli, andrew,
Andy Gospodarek, vbandaru, Siva Mannem
On Fri, Feb 20, 2015 at 1:46 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> Fri, Feb 20, 2015 at 08:09:55AM CET, sfeldma@gmail.com wrote:
>>From: Scott Feldman <sfeldma@gmail.com>
>>
>>Rocker (for now) will rely on SW (bridge) ageing. Other drivers/devices may
>>wish to set NTF_EXT_AGED to turn off SW (bridge) ageing.
>>
>>Signed-off-by: Scott Feldman <sfeldma@gmail.com>
>>---
>> drivers/net/ethernet/rocker/rocker.c | 9 +++++----
>> 1 file changed, 5 insertions(+), 4 deletions(-)
>>
>>diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
>>index 34389b6a..7b55934 100644
>>--- a/drivers/net/ethernet/rocker/rocker.c
>>+++ b/drivers/net/ethernet/rocker/rocker.c
>>@@ -3026,10 +3026,11 @@ static void rocker_port_fdb_learn_work(struct work_struct *work)
>> container_of(work, struct rocker_fdb_learn_work, work);
>> bool removing = (lw->flags & ROCKER_OP_FLAG_REMOVE);
>> bool learned = (lw->flags & ROCKER_OP_FLAG_LEARNED);
>>- struct netdev_switch_notifier_fdb_info info;
>>-
>>- info.addr = lw->addr;
>>- info.vid = lw->vid;
>>+ struct netdev_switch_notifier_fdb_info info = {
>>+ .addr = lw->addr,
>>+ .vid = lw->vid,
>>+ .ntf_flags = NTF_EXT_LEARNED,
>>+ };
>
> I think that this patch should be before "bridge: call external learn add
> if adding FDB entry with NTF_EXT_LEARNED set" in order to not break
> functionality during bisect.
Ok, I reordered 'em. Thanks.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-20 7:09 [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW sfeldma
` (4 preceding siblings ...)
2015-02-20 7:09 ` [PATCH net-next RFC 5/5] rocker: explicitly set SW ageing for rocker sfeldma
@ 2015-02-20 17:29 ` roopa
2015-02-20 19:13 ` Florian Fainelli
2015-02-21 1:23 ` Siva Mannem
6 siblings, 1 reply; 30+ messages in thread
From: roopa @ 2015-02-20 17:29 UTC (permalink / raw)
To: sfeldma
Cc: netdev, jiri, linux, f.fainelli, andrew, gospo, vbandaru,
siva.mannem.lnx
On 2/19/15, 11:09 PM, sfeldma@gmail.com wrote:
> From: Scott Feldman <sfeldma@gmail.com>
>
> Add a new NTF_EXT_FLAG to mark an FDB as externally aged, for example by
> offload hardware. Switchdev driver/devices can set this flag when learning a
> new FDB entry and SW (the bridge driver) will skip this entry when running its
> ageing task. If flag is set, the driver/device is responsible for calling
> call_netdev_switch_notifiers(NETDEV_SWITCH_FDB_DEL, ...) when entry expires.
>
> This give the flexibility for driver/device to decide ageing policy based on
> its capabilities. For devices managing many FDB entries, it is desireable for
> the device to aged out its own entries. Devices not capable of aged entries
> can rely of SW to age out the entries.
>
scott, patches look good. However, I am not sure yet if there is a need
to make it a per fdb entry flag.
At some point we will also need the hw ageing parameter to be configurable.
So other approach could be,
- ageing parameter on bridge gets offloaded the hw
- so, by default hw and kernel age their own entries using the same
bridge device default timer
- User can explicitly disable HW ageing by using self (this needs some
more thought because now the call is on the bridge device)
Thanks,
Roopa
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 4/5] bridge: let HW control FDB ageing by setting NTF_EXT_AGED
2015-02-20 7:09 ` [PATCH net-next RFC 4/5] bridge: let HW control FDB ageing by setting NTF_EXT_AGED sfeldma
@ 2015-02-20 17:31 ` David Miller
0 siblings, 0 replies; 30+ messages in thread
From: David Miller @ 2015-02-20 17:31 UTC (permalink / raw)
To: sfeldma
Cc: netdev, jiri, roopa, linux, f.fainelli, andrew, gospo, vbandaru,
siva.mannem.lnx
From: sfeldma@gmail.com
Date: Thu, 19 Feb 2015 23:09:54 -0800
> @@ -990,10 +993,13 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
> }
>
> int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
> - const unsigned char *addr, u16 vid)
> + struct netdev_switch_notifier_fdb_info *fdb_info)
> {
...
> @@ -1026,10 +1034,12 @@ err_unlock:
> }
>
> int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
> - const unsigned char *addr, u16 vid)
> + struct netdev_switch_notifier_fdb_info *fdb_info)
> {
...
> @@ -403,9 +405,9 @@ int br_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb,
> int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);
> void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);
> int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
> - const unsigned char *addr, u16 vid);
> + struct netdev_switch_notifier_fdb_info *fdb_info);
> int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
> - const unsigned char *addr, u16 vid);
> + struct netdev_switch_notifier_fdb_info *fdb_info);
You probably want to mark the fdb_info argument as 'const'.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-20 17:29 ` [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW roopa
@ 2015-02-20 19:13 ` Florian Fainelli
2015-02-20 19:45 ` Guenter Roeck
` (2 more replies)
0 siblings, 3 replies; 30+ messages in thread
From: Florian Fainelli @ 2015-02-20 19:13 UTC (permalink / raw)
To: roopa, sfeldma
Cc: netdev, jiri, linux, andrew, gospo, vbandaru, siva.mannem.lnx
On 20/02/15 09:29, roopa wrote:
> On 2/19/15, 11:09 PM, sfeldma@gmail.com wrote:
>> From: Scott Feldman <sfeldma@gmail.com>
>>
>> Add a new NTF_EXT_FLAG to mark an FDB as externally aged, for example by
>> offload hardware. Switchdev driver/devices can set this flag when
>> learning a
>> new FDB entry and SW (the bridge driver) will skip this entry when
>> running its
>> ageing task. If flag is set, the driver/device is responsible for
>> calling
>> call_netdev_switch_notifiers(NETDEV_SWITCH_FDB_DEL, ...) when entry
>> expires.
>>
>> This give the flexibility for driver/device to decide ageing policy
>> based on
>> its capabilities. For devices managing many FDB entries, it is
>> desireable for
>> the device to aged out its own entries. Devices not capable of aged
>> entries
>> can rely of SW to age out the entries.
>>
> scott, patches look good. However, I am not sure yet if there is a need
> to make it a per fdb entry flag.
I agree, in fact, most of the HW I have access to only has a global age
timer configuration knob. Is this configurable on a per-port basis for
higher end switches, or even maybe per-FDB entry?
>
> At some point we will also need the hw ageing parameter to be configurable.
> So other approach could be,
> - ageing parameter on bridge gets offloaded the hw
> - so, by default hw and kernel age their own entries using the same
> bridge device default timer
> - User can explicitly disable HW ageing by using self (this needs some
> more thought because now the call is on the bridge device)
I think we want to be careful with both SW and HW aging entries since a
host CPU's clock might be suspended/drifting etc.. at least, we probably
want one or the other to take precedence other the other one?
--
Florian
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-20 19:13 ` Florian Fainelli
@ 2015-02-20 19:45 ` Guenter Roeck
2015-02-21 0:20 ` Viswanath Bandaru
2015-02-21 11:03 ` Jiri Pirko
2 siblings, 0 replies; 30+ messages in thread
From: Guenter Roeck @ 2015-02-20 19:45 UTC (permalink / raw)
To: Florian Fainelli
Cc: roopa, sfeldma, netdev, jiri, andrew, gospo, vbandaru, siva.mannem.lnx
On Fri, Feb 20, 2015 at 11:13:46AM -0800, Florian Fainelli wrote:
> On 20/02/15 09:29, roopa wrote:
> > On 2/19/15, 11:09 PM, sfeldma@gmail.com wrote:
> >> From: Scott Feldman <sfeldma@gmail.com>
> >>
> >> Add a new NTF_EXT_FLAG to mark an FDB as externally aged, for example by
> >> offload hardware. Switchdev driver/devices can set this flag when
> >> learning a
> >> new FDB entry and SW (the bridge driver) will skip this entry when
> >> running its
> >> ageing task. If flag is set, the driver/device is responsible for
> >> calling
> >> call_netdev_switch_notifiers(NETDEV_SWITCH_FDB_DEL, ...) when entry
> >> expires.
> >>
> >> This give the flexibility for driver/device to decide ageing policy
> >> based on
> >> its capabilities. For devices managing many FDB entries, it is
> >> desireable for
> >> the device to aged out its own entries. Devices not capable of aged
> >> entries
> >> can rely of SW to age out the entries.
> >>
> > scott, patches look good. However, I am not sure yet if there is a need
> > to make it a per fdb entry flag.
>
> I agree, in fact, most of the HW I have access to only has a global age
> timer configuration knob. Is this configurable on a per-port basis for
Same here (for all the Marvell switches I am aware of). There is only a single
timer with a granularity of 15 seconds (or maybe 16 seconds on some chips).
> higher end switches, or even maybe per-FDB entry?
>
For the Marvell switches, the only per-fdb entry is a flag to indicate
static vs. dynamic.
> >
> > At some point we will also need the hw ageing parameter to be configurable.
> > So other approach could be,
> > - ageing parameter on bridge gets offloaded the hw
> > - so, by default hw and kernel age their own entries using the same
> > bridge device default timer
> > - User can explicitly disable HW ageing by using self (this needs some
> > more thought because now the call is on the bridge device)
>
> I think we want to be careful with both SW and HW aging entries since a
> host CPU's clock might be suspended/drifting etc.. at least, we probably
> want one or the other to take precedence other the other one?
I may be missing something, but I don't immediately see how this patch set
helps me solve any of the problems I am seeing when integrating the Marvell
switch code. Even if the switch internally does hardware aging, it still
seems to me that we'll need software aging on top of that, even if the
bridge code in the kernel has the same addresses in its fdb as the switch.
I see no feasible means to keep the fdb in the switch synchronized
with the fdb in the kernel.
How would this work with Broadcom switch chips ?
Thanks,
Guenter
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-20 19:13 ` Florian Fainelli
2015-02-20 19:45 ` Guenter Roeck
@ 2015-02-21 0:20 ` Viswanath Bandaru
2015-02-21 0:39 ` Guenter Roeck
2015-02-21 18:29 ` roopa
2015-02-21 11:03 ` Jiri Pirko
2 siblings, 2 replies; 30+ messages in thread
From: Viswanath Bandaru @ 2015-02-21 0:20 UTC (permalink / raw)
To: Florian Fainelli, roopa, sfeldma
Cc: netdev, jiri, linux, andrew, gospo, siva.mannem.lnx
> -----Original Message-----
> From: Florian Fainelli [mailto:f.fainelli@gmail.com]
> Sent: Saturday, February 21, 2015 12:44 AM
> To: roopa; sfeldma@gmail.com
> Cc: netdev@vger.kernel.org; jiri@resnulli.us; linux@roeck-us.net;
> andrew@lunn.ch; gospo@cumulusnetworks.com; Viswanath Bandaru;
> siva.mannem.lnx@gmail.com
> Subject: Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB
> ageing in SW or HW
>
> On 20/02/15 09:29, roopa wrote:
> > On 2/19/15, 11:09 PM, sfeldma@gmail.com wrote:
> >> From: Scott Feldman <sfeldma@gmail.com>
> >>
> >> Add a new NTF_EXT_FLAG to mark an FDB as externally aged, for example
> by
> >> offload hardware. Switchdev driver/devices can set this flag when
> >> learning a
> >> new FDB entry and SW (the bridge driver) will skip this entry when
> >> running its ageing task. If flag is set, the driver/device is
> >> responsible for calling
> >> call_netdev_switch_notifiers(NETDEV_SWITCH_FDB_DEL, ...) when entry
> >> expires.
> >>
> >> This give the flexibility for driver/device to decide ageing policy
> >> based on its capabilities. For devices managing many FDB entries, it
> >> is desireable for the device to aged out its own entries. Devices
> >> not capable of aged entries can rely of SW to age out the entries.
> >>
> > scott, patches look good. However, I am not sure yet if there is a
> > need to make it a per fdb entry flag.
>
> I agree, in fact, most of the HW I have access to only has a global age timer
> configuration knob. Is this configurable on a per-port basis for higher end
> switches, or even maybe per-FDB entry?
>
The decision to age an entry is on a per-FDB entry, So if the flag is not available in the fdb-entry, then it has to be obtained from associated netdev. I think the per-FDB may be more convenient than keeping it on netdev.
> >
> > At some point we will also need the hw ageing parameter to be
> configurable.
> > So other approach could be,
> > - ageing parameter on bridge gets offloaded the hw
Like everybody said, the ageing is really the switch's property. If there are two bridges with ports from same switching device, it becomes tricky to configure ageing time on the switch. ( I think this is what you may be referring to be saying it needs more thought).
> > - so, by default hw and kernel age their own entries using the same
> > bridge device default timer
> > - User can explicitly disable HW ageing by using self (this needs some
> > more thought because now the call is on the bridge device)
>
> I think we want to be careful with both SW and HW aging entries since a host
> CPU's clock might be suspended/drifting etc.. at least, we probably want one
> or the other to take precedence other the other one?
> --
> Florian
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-21 0:20 ` Viswanath Bandaru
@ 2015-02-21 0:39 ` Guenter Roeck
2015-02-21 18:29 ` roopa
1 sibling, 0 replies; 30+ messages in thread
From: Guenter Roeck @ 2015-02-21 0:39 UTC (permalink / raw)
To: Viswanath Bandaru, Florian Fainelli, roopa, sfeldma
Cc: netdev, jiri, andrew, gospo, siva.mannem.lnx
On 02/20/2015 04:20 PM, Viswanath Bandaru wrote:
>
> Like everybody said, the ageing is really the switch's property. If there are two bridges with ports from same switching device, it becomes tricky to configure ageing time on the switch. ( I think this is what you may be referring to be saying it needs more thought).
>
There is only so much we can do. One thing we can not do is to change
the existing switch hardware. Maybe we can ask switch vendors to consider
adding the ability to provide aging time per bridge group or per port,
but even if vendors agree to do that, it won't happen quickly, and it
won't fix existing chips.
Asking the switch to set its aging time to min(aging for all ports)
is better than to keep the aging time at the default of 5 minutes
for all ports, no matter what.
Guenter
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-20 7:09 [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW sfeldma
` (5 preceding siblings ...)
2015-02-20 17:29 ` [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW roopa
@ 2015-02-21 1:23 ` Siva Mannem
6 siblings, 0 replies; 30+ messages in thread
From: Siva Mannem @ 2015-02-21 1:23 UTC (permalink / raw)
To: Scott Feldman
Cc: Netdev, Jiří Pírko, Roopa Prabhu, linux,
f.fainelli, andrew, gospo, vbandaru
On Fri, Feb 20, 2015 at 12:39 PM, <sfeldma@gmail.com> wrote:
> From: Scott Feldman <sfeldma@gmail.com>
>
> Add a new NTF_EXT_FLAG to mark an FDB as externally aged, for example by
> offload hardware. Switchdev driver/devices can set this flag when learning a
> new FDB entry and SW (the bridge driver) will skip this entry when running its
> ageing task. If flag is set, the driver/device is responsible for calling
> call_netdev_switch_notifiers(NETDEV_SWITCH_FDB_DEL, ...) when entry expires.
>
> This give the flexibility for driver/device to decide ageing policy based on
> its capabilities. For devices managing many FDB entries, it is desireable for
> the device to aged out its own entries. Devices not capable of aged entries
> can rely of SW to age out the entries.
>
> Scott Feldman (5):
> neighbour: add external aged flag
> switchdev: add ntf_flags to FDB notifier
> bridge: call external learn add if adding FDB entry with
> NTF_EXT_LEARNED set
> bridge: let HW control FDB ageing by setting NTF_EXT_AGED
> rocker: explicitly set SW ageing for rocker
>
> drivers/net/ethernet/rocker/rocker.c | 9 +++++----
> include/net/switchdev.h | 1 +
> include/uapi/linux/neighbour.h | 1 +
> net/bridge/br.c | 18 ++++++++++--------
> net/bridge/br_fdb.c | 16 +++++++++++++---
> net/bridge/br_private.h | 8 +++++---
> 6 files changed, 35 insertions(+), 18 deletions(-)
>
> --
> 1.7.10.4
>
Patches look good. With these, my earlier patch
(http://patchwork.ozlabs.org/patch/435597/) is no longer required.
--
Regards,
Siva Mannem.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-20 19:13 ` Florian Fainelli
2015-02-20 19:45 ` Guenter Roeck
2015-02-21 0:20 ` Viswanath Bandaru
@ 2015-02-21 11:03 ` Jiri Pirko
2015-02-21 11:29 ` Viswanath Bandaru
2 siblings, 1 reply; 30+ messages in thread
From: Jiri Pirko @ 2015-02-21 11:03 UTC (permalink / raw)
To: Florian Fainelli
Cc: roopa, sfeldma, netdev, linux, andrew, gospo, vbandaru, siva.mannem.lnx
Fri, Feb 20, 2015 at 08:13:46PM CET, f.fainelli@gmail.com wrote:
>On 20/02/15 09:29, roopa wrote:
>> On 2/19/15, 11:09 PM, sfeldma@gmail.com wrote:
>>> From: Scott Feldman <sfeldma@gmail.com>
>>>
>>> Add a new NTF_EXT_FLAG to mark an FDB as externally aged, for example by
>>> offload hardware. Switchdev driver/devices can set this flag when
>>> learning a
>>> new FDB entry and SW (the bridge driver) will skip this entry when
>>> running its
>>> ageing task. If flag is set, the driver/device is responsible for
>>> calling
>>> call_netdev_switch_notifiers(NETDEV_SWITCH_FDB_DEL, ...) when entry
>>> expires.
>>>
>>> This give the flexibility for driver/device to decide ageing policy
>>> based on
>>> its capabilities. For devices managing many FDB entries, it is
>>> desireable for
>>> the device to aged out its own entries. Devices not capable of aged
>>> entries
>>> can rely of SW to age out the entries.
>>>
>> scott, patches look good. However, I am not sure yet if there is a need
>> to make it a per fdb entry flag.
>
>I agree, in fact, most of the HW I have access to only has a global age
>timer configuration knob. Is this configurable on a per-port basis for
>higher end switches, or even maybe per-FDB entry?
I'm currently not aware of any hw which does not have global age timer.
But I believe that they will appear. The model that we have now, to
propagate aging setting of bridge down is more general and should be ok.
Drivers should probably take care of multi bridge setup with different
aging setup. Maybe to find minimal time and print a warning?
>
>>
>> At some point we will also need the hw ageing parameter to be configurable.
>> So other approach could be,
>> - ageing parameter on bridge gets offloaded the hw
>> - so, by default hw and kernel age their own entries using the same
>> bridge device default timer
>> - User can explicitly disable HW ageing by using self (this needs some
>> more thought because now the call is on the bridge device)
>
>I think we want to be careful with both SW and HW aging entries since a
>host CPU's clock might be suspended/drifting etc.. at least, we probably
>want one or the other to take precedence other the other one?
>--
>Florian
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-21 11:03 ` Jiri Pirko
@ 2015-02-21 11:29 ` Viswanath Bandaru
2015-02-21 15:50 ` Jiri Pirko
0 siblings, 1 reply; 30+ messages in thread
From: Viswanath Bandaru @ 2015-02-21 11:29 UTC (permalink / raw)
To: Jiri Pirko, Florian Fainelli
Cc: roopa, sfeldma, netdev, linux, andrew, gospo, siva.mannem.lnx
> -----Original Message-----
> >
> >I agree, in fact, most of the HW I have access to only has a global age
> >timer configuration knob. Is this configurable on a per-port basis for
> >higher end switches, or even maybe per-FDB entry?
>
> I'm currently not aware of any hw which does not have global age timer.
> But I believe that they will appear. The model that we have now, to
> propagate aging setting of bridge down is more general and should be ok.
>
> Drivers should probably take care of multi bridge setup with different aging
> setup. Maybe to find minimal time and print a warning?
>
Setting up the minimal time in such a scenario is good.
Should we also consider the possibility bridges containing ports from different devices (and therefore different drivers) ? If that is a possibility, I think the bridge module should take responsibility of finding out the minimal time to pushing to all involved drivers.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-21 11:29 ` Viswanath Bandaru
@ 2015-02-21 15:50 ` Jiri Pirko
2015-02-21 17:20 ` Viswanath Bandaru
2015-02-21 18:31 ` roopa
0 siblings, 2 replies; 30+ messages in thread
From: Jiri Pirko @ 2015-02-21 15:50 UTC (permalink / raw)
To: Viswanath Bandaru
Cc: Florian Fainelli, roopa, sfeldma, netdev, linux, andrew, gospo,
siva.mannem.lnx
Sat, Feb 21, 2015 at 12:29:38PM CET, vbandaru@broadcom.com wrote:
>> -----Original Message-----
>> >
>> >I agree, in fact, most of the HW I have access to only has a global age
>> >timer configuration knob. Is this configurable on a per-port basis for
>> >higher end switches, or even maybe per-FDB entry?
>>
>> I'm currently not aware of any hw which does not have global age timer.
>> But I believe that they will appear. The model that we have now, to
>> propagate aging setting of bridge down is more general and should be ok.
>>
>> Drivers should probably take care of multi bridge setup with different aging
>> setup. Maybe to find minimal time and print a warning?
>>
>
>Setting up the minimal time in such a scenario is good.
>
>Should we also consider the possibility bridges containing ports from different devices (and therefore different drivers) ? If that is a possibility, I think the bridge module should take responsibility of finding out the minimal time to pushing to all involved drivers.
It is certainly possible to bridge ports from multiple switch devices.
But that should not be a problem, because 1 bridge has 1 aging setup
which will be passed to all port drivers.
I believe that the only case which need to be resolved is multiple
bridges over single switch device. And I believe that it should be
handled in drivers because only drivers know what the hw is capable of
(if it supports aging setup per port/bridge/global).
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-21 15:50 ` Jiri Pirko
@ 2015-02-21 17:20 ` Viswanath Bandaru
2015-02-21 18:31 ` roopa
1 sibling, 0 replies; 30+ messages in thread
From: Viswanath Bandaru @ 2015-02-21 17:20 UTC (permalink / raw)
To: Jiri Pirko
Cc: Florian Fainelli, roopa, sfeldma, netdev, linux, andrew, gospo,
siva.mannem.lnx
> -----Original Message-----
> From: Jiri Pirko [mailto:jiri@resnulli.us]
> Sent: Saturday, February 21, 2015 9:20 PM
> To: Viswanath Bandaru
> Cc: Florian Fainelli; roopa; sfeldma@gmail.com; netdev@vger.kernel.org;
> linux@roeck-us.net; andrew@lunn.ch; gospo@cumulusnetworks.com;
> siva.mannem.lnx@gmail.com
> Subject: Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB
> ageing in SW or HW
>
> Sat, Feb 21, 2015 at 12:29:38PM CET, vbandaru@broadcom.com wrote:
> >> -----Original Message-----
> >> >
> >> >I agree, in fact, most of the HW I have access to only has a global
> >> >age timer configuration knob. Is this configurable on a per-port
> >> >basis for higher end switches, or even maybe per-FDB entry?
> >>
> >> I'm currently not aware of any hw which does not have global age timer.
> >> But I believe that they will appear. The model that we have now, to
> >> propagate aging setting of bridge down is more general and should be ok.
> >>
> >> Drivers should probably take care of multi bridge setup with
> >> different aging setup. Maybe to find minimal time and print a warning?
> >>
> >
> >Setting up the minimal time in such a scenario is good.
> >
> >Should we also consider the possibility bridges containing ports from
> different devices (and therefore different drivers) ? If that is a possibility, I
> think the bridge module should take responsibility of finding out the minimal
> time to pushing to all involved drivers.
>
> It is certainly possible to bridge ports from multiple switch devices.
> But that should not be a problem, because 1 bridge has 1 aging setup which
> will be passed to all port drivers.
I apologize for confusion, I actually meant multiple bridge devices (which was the context) each having ports from different switch devices. Is this expected to be expected in reality ?
>
> I believe that the only case which need to be resolved is multiple bridges
> over single switch device. And I believe that it should be handled in drivers
> because only drivers know what the hw is capable of (if it supports aging
> setup per port/bridge/global).
>
> >
> >
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-21 0:20 ` Viswanath Bandaru
2015-02-21 0:39 ` Guenter Roeck
@ 2015-02-21 18:29 ` roopa
[not found] ` <CAE4R7bAPS1GZKaC4M6x9cqfTOkVju1+Po4KzanfSniEFX9oi1w@mail.gmail.com>
1 sibling, 1 reply; 30+ messages in thread
From: roopa @ 2015-02-21 18:29 UTC (permalink / raw)
To: Viswanath Bandaru
Cc: Florian Fainelli, sfeldma, netdev, jiri, linux, andrew, gospo,
siva.mannem.lnx
On 2/20/15, 4:20 PM, Viswanath Bandaru wrote:
>> -----Original Message-----
>> From: Florian Fainelli [mailto:f.fainelli@gmail.com]
>> Sent: Saturday, February 21, 2015 12:44 AM
>> To: roopa; sfeldma@gmail.com
>> Cc: netdev@vger.kernel.org; jiri@resnulli.us; linux@roeck-us.net;
>> andrew@lunn.ch; gospo@cumulusnetworks.com; Viswanath Bandaru;
>> siva.mannem.lnx@gmail.com
>> Subject: Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB
>> ageing in SW or HW
>>
>> On 20/02/15 09:29, roopa wrote:
>>> On 2/19/15, 11:09 PM, sfeldma@gmail.com wrote:
>>>> From: Scott Feldman <sfeldma@gmail.com>
>>>>
>>>> Add a new NTF_EXT_FLAG to mark an FDB as externally aged, for example
>> by
>>>> offload hardware. Switchdev driver/devices can set this flag when
>>>> learning a
>>>> new FDB entry and SW (the bridge driver) will skip this entry when
>>>> running its ageing task. If flag is set, the driver/device is
>>>> responsible for calling
>>>> call_netdev_switch_notifiers(NETDEV_SWITCH_FDB_DEL, ...) when entry
>>>> expires.
>>>>
>>>> This give the flexibility for driver/device to decide ageing policy
>>>> based on its capabilities. For devices managing many FDB entries, it
>>>> is desireable for the device to aged out its own entries. Devices
>>>> not capable of aged entries can rely of SW to age out the entries.
>>>>
>>> scott, patches look good. However, I am not sure yet if there is a
>>> need to make it a per fdb entry flag.
>> I agree, in fact, most of the HW I have access to only has a global age timer
>> configuration knob. Is this configurable on a per-port basis for higher end
>> switches, or even maybe per-FDB entry?
>>
> The decision to age an entry is on a per-FDB entry, So if the flag is not available in the fdb-entry, then it has to be obtained from associated netdev. I think the per-FDB may be more convenient than keeping it on netdev.
I was not saying netdev, but the bridge could have it...something like
the below:
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index e0670d7..1815924 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -280,7 +280,8 @@ void br_fdb_cleanup(unsigned long _data)
hlist_for_each_entry_safe(f, n, &br->hash[i], hlist) {
unsigned long this_timer;
- if (f->is_static)
+ if (f->is_static || (f->added_by_external_learn &&
+ br->hw_aging_enabled))
continue;
this_timer = f->updated + delay;
if (time_before_eq(this_timer, jiffies))
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index de09199..68bfb4b 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -246,6 +246,7 @@ struct net_bridge
unsigned char topology_change;
unsigned char topology_change_detected;
+ unsigned char hw_aging_enabled;
#ifdef CONFIG_BRIDGE_IGMP_SNOOPING
unsigned char multicast_router;
I have no objections to the current patchset (they clearly fix the
problem). I was only wondering if the notion of aging can continue to
stay in the bridge
and not be carried into every fdb entry. which seems cleaner to me.
>
>>> At some point we will also need the hw ageing parameter to be
>> configurable.
>>> So other approach could be,
>>> - ageing parameter on bridge gets offloaded the hw
> Like everybody said, the ageing is really the switch's property. If there are two bridges with ports from same switching device, it becomes tricky to configure ageing time on the switch. ( I think this is what you may be referring to be saying it needs more thought).
Regarding the two bridges with ports from the same switching device, I
think the switch driver should handle that. Maybe min value etc as
discussed further in this thread.
Today given that most people are going to use the vlan filtering bridge
and most likely prefer having a single vlan filtering bridge represent
their swicth (we do in most cases), I see this being a non critical
problem to solve.
Thanks,
Roopa
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-21 15:50 ` Jiri Pirko
2015-02-21 17:20 ` Viswanath Bandaru
@ 2015-02-21 18:31 ` roopa
1 sibling, 0 replies; 30+ messages in thread
From: roopa @ 2015-02-21 18:31 UTC (permalink / raw)
To: Jiri Pirko
Cc: Viswanath Bandaru, Florian Fainelli, sfeldma, netdev, linux,
andrew, gospo, siva.mannem.lnx
On 2/21/15, 7:50 AM, Jiri Pirko wrote:
> Sat, Feb 21, 2015 at 12:29:38PM CET, vbandaru@broadcom.com wrote:
>>> -----Original Message-----
>>>> I agree, in fact, most of the HW I have access to only has a global age
>>>> timer configuration knob. Is this configurable on a per-port basis for
>>>> higher end switches, or even maybe per-FDB entry?
>>> I'm currently not aware of any hw which does not have global age timer.
>>> But I believe that they will appear. The model that we have now, to
>>> propagate aging setting of bridge down is more general and should be ok.
>>>
>>> Drivers should probably take care of multi bridge setup with different aging
>>> setup. Maybe to find minimal time and print a warning?
>>>
>> Setting up the minimal time in such a scenario is good.
>>
>> Should we also consider the possibility bridges containing ports from different devices (and therefore different drivers) ? If that is a possibility, I think the bridge module should take responsibility of finding out the minimal time to pushing to all involved drivers.
> It is certainly possible to bridge ports from multiple switch devices.
> But that should not be a problem, because 1 bridge has 1 aging setup
> which will be passed to all port drivers.
>
> I believe that the only case which need to be resolved is multiple
> bridges over single switch device. And I believe that it should be
> handled in drivers because only drivers know what the hw is capable of
> (if it supports aging setup per port/bridge/global).
>
>>
I agree.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
[not found] ` <CAE4R7bAPS1GZKaC4M6x9cqfTOkVju1+Po4KzanfSniEFX9oi1w@mail.gmail.com>
@ 2015-02-25 14:31 ` Andrew Lunn
2015-02-25 16:43 ` Guenter Roeck
2015-02-25 17:29 ` David Miller
0 siblings, 2 replies; 30+ messages in thread
From: Andrew Lunn @ 2015-02-25 14:31 UTC (permalink / raw)
To: Scott Feldman
Cc: roopa, Viswanath Bandaru, Florian Fainelli, netdev, jiri, linux,
gospo, siva.mannem.lnx
> Geunter had this question in the thread:
>
> "I may be missing something, but I don't immediately see how this patch set
> helps me solve any of the problems I am seeing when integrating the Marvell
> switch code. Even if the switch internally does hardware aging, it still
> seems to me that we'll need software aging on top of that, even if the
> bridge code in the kernel has the same addresses in its fdb as the switch.
> I see no feasible means to keep the fdb in the switch synchronized
> with the fdb in the kernel".
>
> You'll want to turn learning off on the bridge, and enable learning (and
> learning_sync) in hardware. The hw driver will install an FDB entry in the
> bridge's FDB and mark it "external". The entry will also appear in the
> device's FDB.
I don't think this is going to work. There is no efficient way to get
the hardware tables out of the hardware. We don't get notification of
additions or removals. We can only read the whole table. And it can be
expensive to read the whole table, since it can be 1K or more entries,
going over an MDIO bus, which in the worst case can be bit banging on
gpio lines.
We probably need a design for devices where we can efficiently get
access to the hardware table, and use it in the software bridge. But
we also need a design where the SW and HW bridges have independently
tables.
Andrew
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-25 14:31 ` Andrew Lunn
@ 2015-02-25 16:43 ` Guenter Roeck
2015-02-25 17:31 ` B Viswanath
2015-02-25 17:29 ` David Miller
1 sibling, 1 reply; 30+ messages in thread
From: Guenter Roeck @ 2015-02-25 16:43 UTC (permalink / raw)
To: Andrew Lunn
Cc: Scott Feldman, roopa, Viswanath Bandaru, Florian Fainelli,
netdev, jiri, gospo, siva.mannem.lnx
On Wed, Feb 25, 2015 at 03:31:13PM +0100, Andrew Lunn wrote:
> > Geunter had this question in the thread:
> >
> > "I may be missing something, but I don't immediately see how this patch set
> > helps me solve any of the problems I am seeing when integrating the Marvell
> > switch code. Even if the switch internally does hardware aging, it still
> > seems to me that we'll need software aging on top of that, even if the
> > bridge code in the kernel has the same addresses in its fdb as the switch.
> > I see no feasible means to keep the fdb in the switch synchronized
> > with the fdb in the kernel".
> >
> > You'll want to turn learning off on the bridge, and enable learning (and
> > learning_sync) in hardware. The hw driver will install an FDB entry in the
> > bridge's FDB and mark it "external". The entry will also appear in the
> > device's FDB.
>
> I don't think this is going to work. There is no efficient way to get
> the hardware tables out of the hardware. We don't get notification of
> additions or removals. We can only read the whole table. And it can be
> expensive to read the whole table, since it can be 1K or more entries,
> going over an MDIO bus, which in the worst case can be bit banging on
> gpio lines.
>
Which, coincidentially, is the case in my application. The newer
Marvell switches support up to 8k forwarding table entries, so that
would be really awkward.
> We probably need a design for devices where we can efficiently get
> access to the hardware table, and use it in the software bridge. But
> we also need a design where the SW and HW bridges have independently
> tables.
>
Agreed.
Some of the Marvell chip support accessing its registers through Ethernet,
so that may be an option. That is not supported on all chips, so it would
not be a generic solution, but it may be worthwhile looking into.
Guenter
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-25 14:31 ` Andrew Lunn
2015-02-25 16:43 ` Guenter Roeck
@ 2015-02-25 17:29 ` David Miller
1 sibling, 0 replies; 30+ messages in thread
From: David Miller @ 2015-02-25 17:29 UTC (permalink / raw)
To: andrew
Cc: sfeldma, roopa, vbandaru, f.fainelli, netdev, jiri, linux, gospo,
siva.mannem.lnx
From: Andrew Lunn <andrew@lunn.ch>
Date: Wed, 25 Feb 2015 15:31:13 +0100
> I don't think this is going to work. There is no efficient way to get
> the hardware tables out of the hardware. We don't get notification of
> additions or removals. We can only read the whole table. And it can be
> expensive to read the whole table, since it can be 1K or more entries,
> going over an MDIO bus, which in the worst case can be bit banging on
> gpio lines.
Agreed, for most chips, this is how things seem to be structured.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-25 16:43 ` Guenter Roeck
@ 2015-02-25 17:31 ` B Viswanath
2015-02-25 18:39 ` Guenter Roeck
0 siblings, 1 reply; 30+ messages in thread
From: B Viswanath @ 2015-02-25 17:31 UTC (permalink / raw)
To: Guenter Roeck
Cc: Andrew Lunn, Scott Feldman, roopa, Viswanath Bandaru,
Florian Fainelli, netdev, jiri, gospo, siva.mannem.lnx
On 25 February 2015 at 22:13, Guenter Roeck <linux@roeck-us.net> wrote:
>
<snip>
> > >
> > > You'll want to turn learning off on the bridge, and enable learning (and
> > > learning_sync) in hardware. The hw driver will install an FDB entry in the
> > > bridge's FDB and mark it "external". The entry will also appear in the
> > > device's FDB.
> >
> > I don't think this is going to work. There is no efficient way to get
> > the hardware tables out of the hardware. We don't get notification of
> > additions or removals. We can only read the whole table. And it can be
> > expensive to read the whole table, since it can be 1K or more entries,
> > going over an MDIO bus, which in the worst case can be bit banging on
> > gpio lines.
> >
> Which, coincidentially, is the case in my application. The newer
> Marvell switches support up to 8k forwarding table entries, so that
> would be really awkward.
>
> > We probably need a design for devices where we can efficiently get
> > access to the hardware table, and use it in the software bridge. But
> > we also need a design where the SW and HW bridges have independently
> > tables.
> >
I do agree that reading all of FDB into CPU is a pain. Given the table
size of 1K or 8K, I am (probably incorrectly) speculating that the
device may be a router primarily. Also, not having means of
learn-notifications and/or a quick (hw) interface to get all the
table, I (again probably incorrectly) speculate that there are not
many use cases associated with FDB and end-user/CPU for this silicon.
So I am thinking why would you want to read FDB to CPU ? You can
disable learning on the bridge and have the driver not send any
learning notifications to kernel, while the silicon continues to learn
and forward. The end user may not be able see the FDB on a command,
but is this a requirement for you ?
I may be missing some use cases here, so would you mind mentioning ?
> Agreed.
>
> Some of the Marvell chip support accessing its registers through Ethernet,
> so that may be an option. That is not supported on all chips, so it would
> not be a generic solution, but it may be worthwhile looking into.
>
> Guenter
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-25 17:31 ` B Viswanath
@ 2015-02-25 18:39 ` Guenter Roeck
2015-02-25 18:51 ` B Viswanath
0 siblings, 1 reply; 30+ messages in thread
From: Guenter Roeck @ 2015-02-25 18:39 UTC (permalink / raw)
To: B Viswanath
Cc: Andrew Lunn, Scott Feldman, roopa, Viswanath Bandaru,
Florian Fainelli, netdev, jiri, gospo, siva.mannem.lnx
On Wed, Feb 25, 2015 at 11:01:00PM +0530, B Viswanath wrote:
> On 25 February 2015 at 22:13, Guenter Roeck <linux@roeck-us.net> wrote:
> >
> <snip>
>
> > > >
> > > > You'll want to turn learning off on the bridge, and enable learning (and
> > > > learning_sync) in hardware. The hw driver will install an FDB entry in the
> > > > bridge's FDB and mark it "external". The entry will also appear in the
> > > > device's FDB.
> > >
> > > I don't think this is going to work. There is no efficient way to get
> > > the hardware tables out of the hardware. We don't get notification of
> > > additions or removals. We can only read the whole table. And it can be
> > > expensive to read the whole table, since it can be 1K or more entries,
> > > going over an MDIO bus, which in the worst case can be bit banging on
> > > gpio lines.
> > >
> > Which, coincidentially, is the case in my application. The newer
> > Marvell switches support up to 8k forwarding table entries, so that
> > would be really awkward.
> >
> > > We probably need a design for devices where we can efficiently get
> > > access to the hardware table, and use it in the software bridge. But
> > > we also need a design where the SW and HW bridges have independently
> > > tables.
> > >
>
> I do agree that reading all of FDB into CPU is a pain. Given the table
> size of 1K or 8K, I am (probably incorrectly) speculating that the
> device may be a router primarily. Also, not having means of
No. I don't think any of the Marvell or Broadcom entry level
switch chips supports L3.
See http://www.marvell.com/switching/link-street/ and
http://community.broadcom.com/docs/DOC-1724.
Forwarding table size in those chips is from 1k all the way up to 16k.
It is correct to assume that some of those chips are _used_ in
routers/L3 switches, but that would not be used for L3 data but
to interrconnect the various cards in the system, primarily for
internal management traffic.
> learn-notifications and/or a quick (hw) interface to get all the
> table, I (again probably incorrectly) speculate that there are not
> many use cases associated with FDB and end-user/CPU for this silicon.
>
> So I am thinking why would you want to read FDB to CPU ? You can
> disable learning on the bridge and have the driver not send any
> learning notifications to kernel, while the silicon continues to learn
> and forward. The end user may not be able see the FDB on a command,
> but is this a requirement for you ?
>
> I may be missing some use cases here, so would you mind mentioning ?
>
A bridge can span multiple switch chips as well as some local interfaces.
In that case, it would be beneficial if the switch would be able to share
its fdb with the CPU, but I don't think it is mandatory. I may be missing
something, though.
Guenter
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-25 18:39 ` Guenter Roeck
@ 2015-02-25 18:51 ` B Viswanath
2015-02-25 19:15 ` Guenter Roeck
0 siblings, 1 reply; 30+ messages in thread
From: B Viswanath @ 2015-02-25 18:51 UTC (permalink / raw)
To: Guenter Roeck
Cc: Andrew Lunn, Scott Feldman, roopa, Viswanath Bandaru,
Florian Fainelli, netdev, jiri, gospo, siva.mannem.lnx
On 26 February 2015 at 00:09, Guenter Roeck <linux@roeck-us.net> wrote:
> On Wed, Feb 25, 2015 at 11:01:00PM +0530, B Viswanath wrote:
>> On 25 February 2015 at 22:13, Guenter Roeck <linux@roeck-us.net> wrote:
>> >
>> <snip>
>>
>> > > >
>> > > > You'll want to turn learning off on the bridge, and enable learning (and
>> > > > learning_sync) in hardware. The hw driver will install an FDB entry in the
>> > > > bridge's FDB and mark it "external". The entry will also appear in the
>> > > > device's FDB.
>> > >
>> > > I don't think this is going to work. There is no efficient way to get
>> > > the hardware tables out of the hardware. We don't get notification of
>> > > additions or removals. We can only read the whole table. And it can be
>> > > expensive to read the whole table, since it can be 1K or more entries,
>> > > going over an MDIO bus, which in the worst case can be bit banging on
>> > > gpio lines.
>> > >
>> > Which, coincidentially, is the case in my application. The newer
>> > Marvell switches support up to 8k forwarding table entries, so that
>> > would be really awkward.
>> >
>> > > We probably need a design for devices where we can efficiently get
>> > > access to the hardware table, and use it in the software bridge. But
>> > > we also need a design where the SW and HW bridges have independently
>> > > tables.
>> > >
>>
>> I do agree that reading all of FDB into CPU is a pain. Given the table
>> size of 1K or 8K, I am (probably incorrectly) speculating that the
>> device may be a router primarily. Also, not having means of
>
> No. I don't think any of the Marvell or Broadcom entry level
> switch chips supports L3.
>
> See http://www.marvell.com/switching/link-street/ and
> http://community.broadcom.com/docs/DOC-1724.
>
> Forwarding table size in those chips is from 1k all the way up to 16k.
>
> It is correct to assume that some of those chips are _used_ in
> routers/L3 switches, but that would not be used for L3 data but
> to interrconnect the various cards in the system, primarily for
> internal management traffic.
I didn't mean that the chips support L3 in hardware, as you said they
don't. I meant that these chips are used in routers which have routing
done by CPU. Typically the OPENWRT class devices, SOHO or similar
devices.
For these routers, exposing FDB to CPU is not really a requirement. So
the hardware is not built to have learn/age-notifications or efficient
access to FDB. This is the reason I asked you whether the driver you
are developing really needs to expose FDB to CPU.
>
>> learn-notifications and/or a quick (hw) interface to get all the
>> table, I (again probably incorrectly) speculate that there are not
>> many use cases associated with FDB and end-user/CPU for this silicon.
>>
>> So I am thinking why would you want to read FDB to CPU ? You can
>> disable learning on the bridge and have the driver not send any
>> learning notifications to kernel, while the silicon continues to learn
>> and forward. The end user may not be able see the FDB on a command,
>> but is this a requirement for you ?
>>
>> I may be missing some use cases here, so would you mind mentioning ?
>>
> A bridge can span multiple switch chips as well as some local interfaces.
> In that case, it would be beneficial if the switch would be able to share
> its fdb with the CPU, but I don't think it is mandatory. I may be missing
> something, though.
This is a general usecase and is usually with chips that can support
learning notifications and other CPU controls on FDB. It can be
implemented via bridge controlling the FDB. But I suspect this usecase
won't be applicable for the device you are attempting to port the
driver to.
>
> Guenter
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-25 18:51 ` B Viswanath
@ 2015-02-25 19:15 ` Guenter Roeck
2015-02-25 19:33 ` B Viswanath
0 siblings, 1 reply; 30+ messages in thread
From: Guenter Roeck @ 2015-02-25 19:15 UTC (permalink / raw)
To: B Viswanath
Cc: Andrew Lunn, Scott Feldman, roopa, Viswanath Bandaru,
Florian Fainelli, netdev, jiri, gospo, siva.mannem.lnx
On Thu, Feb 26, 2015 at 12:21:47AM +0530, B Viswanath wrote:
> On 26 February 2015 at 00:09, Guenter Roeck <linux@roeck-us.net> wrote:
> > On Wed, Feb 25, 2015 at 11:01:00PM +0530, B Viswanath wrote:
> >> On 25 February 2015 at 22:13, Guenter Roeck <linux@roeck-us.net> wrote:
> >> >
> >> <snip>
> >>
> >> > > >
> >> > > > You'll want to turn learning off on the bridge, and enable learning (and
> >> > > > learning_sync) in hardware. The hw driver will install an FDB entry in the
> >> > > > bridge's FDB and mark it "external". The entry will also appear in the
> >> > > > device's FDB.
> >> > >
> >> > > I don't think this is going to work. There is no efficient way to get
> >> > > the hardware tables out of the hardware. We don't get notification of
> >> > > additions or removals. We can only read the whole table. And it can be
> >> > > expensive to read the whole table, since it can be 1K or more entries,
> >> > > going over an MDIO bus, which in the worst case can be bit banging on
> >> > > gpio lines.
> >> > >
> >> > Which, coincidentially, is the case in my application. The newer
> >> > Marvell switches support up to 8k forwarding table entries, so that
> >> > would be really awkward.
> >> >
> >> > > We probably need a design for devices where we can efficiently get
> >> > > access to the hardware table, and use it in the software bridge. But
> >> > > we also need a design where the SW and HW bridges have independently
> >> > > tables.
> >> > >
> >>
> >> I do agree that reading all of FDB into CPU is a pain. Given the table
> >> size of 1K or 8K, I am (probably incorrectly) speculating that the
> >> device may be a router primarily. Also, not having means of
> >
> > No. I don't think any of the Marvell or Broadcom entry level
> > switch chips supports L3.
> >
> > See http://www.marvell.com/switching/link-street/ and
> > http://community.broadcom.com/docs/DOC-1724.
> >
> > Forwarding table size in those chips is from 1k all the way up to 16k.
> >
> > It is correct to assume that some of those chips are _used_ in
> > routers/L3 switches, but that would not be used for L3 data but
> > to interrconnect the various cards in the system, primarily for
> > internal management traffic.
>
> I didn't mean that the chips support L3 in hardware, as you said they
> don't. I meant that these chips are used in routers which have routing
> done by CPU. Typically the OPENWRT class devices, SOHO or similar
> devices.
>
No, that is not the use case in any of the use cases I am aware of
and have been involved with.
> For these routers, exposing FDB to CPU is not really a requirement. So
> the hardware is not built to have learn/age-notifications or efficient
> access to FDB. This is the reason I asked you whether the driver you
> are developing really needs to expose FDB to CPU.
>
> >
> >> learn-notifications and/or a quick (hw) interface to get all the
> >> table, I (again probably incorrectly) speculate that there are not
> >> many use cases associated with FDB and end-user/CPU for this silicon.
> >>
> >> So I am thinking why would you want to read FDB to CPU ? You can
> >> disable learning on the bridge and have the driver not send any
> >> learning notifications to kernel, while the silicon continues to learn
> >> and forward. The end user may not be able see the FDB on a command,
> >> but is this a requirement for you ?
> >>
> >> I may be missing some use cases here, so would you mind mentioning ?
> >>
> > A bridge can span multiple switch chips as well as some local interfaces.
> > In that case, it would be beneficial if the switch would be able to share
> > its fdb with the CPU, but I don't think it is mandatory. I may be missing
> > something, though.
>
> This is a general usecase and is usually with chips that can support
> learning notifications and other CPU controls on FDB. It can be
> implemented via bridge controlling the FDB. But I suspect this usecase
> won't be applicable for the device you are attempting to port the
> driver to.
>
What do you mean with "bridge controlling the FDB" ? You mean by software ?
That would not scale for the reasons mentioned earlier; the switch chips
need to be able to populate their own FDBs.
Note that I am not that much concerned about "my" use cases, at least not
for now. For those cases, there will in general only be one port connected
to the CPU, and a bridge group is not shared across multiple ethernet ports
conneted to the CPU and some of the switch ports. But that doesn't mean
that we should ignore that possibility.
Thanks,
Guenter
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-25 19:15 ` Guenter Roeck
@ 2015-02-25 19:33 ` B Viswanath
2015-02-25 20:03 ` Guenter Roeck
0 siblings, 1 reply; 30+ messages in thread
From: B Viswanath @ 2015-02-25 19:33 UTC (permalink / raw)
To: Guenter Roeck
Cc: Andrew Lunn, Scott Feldman, roopa, Viswanath Bandaru,
Florian Fainelli, netdev, jiri, gospo, siva.mannem.lnx
On 26 February 2015 at 00:45, Guenter Roeck <linux@roeck-us.net> wrote:
> On Thu, Feb 26, 2015 at 12:21:47AM +0530, B Viswanath wrote:
>> On 26 February 2015 at 00:09, Guenter Roeck <linux@roeck-us.net> wrote:
>> > On Wed, Feb 25, 2015 at 11:01:00PM +0530, B Viswanath wrote:
>> >> On 25 February 2015 at 22:13, Guenter Roeck <linux@roeck-us.net> wrote:
>> >> >
>> >> <snip>
>> >>
>> >> > > >
>> >> > > > You'll want to turn learning off on the bridge, and enable learning (and
>> >> > > > learning_sync) in hardware. The hw driver will install an FDB entry in the
>> >> > > > bridge's FDB and mark it "external". The entry will also appear in the
>> >> > > > device's FDB.
>> >> > >
>> >> > > I don't think this is going to work. There is no efficient way to get
>> >> > > the hardware tables out of the hardware. We don't get notification of
>> >> > > additions or removals. We can only read the whole table. And it can be
>> >> > > expensive to read the whole table, since it can be 1K or more entries,
>> >> > > going over an MDIO bus, which in the worst case can be bit banging on
>> >> > > gpio lines.
>> >> > >
>> >> > Which, coincidentially, is the case in my application. The newer
>> >> > Marvell switches support up to 8k forwarding table entries, so that
>> >> > would be really awkward.
>> >> >
>> >> > > We probably need a design for devices where we can efficiently get
>> >> > > access to the hardware table, and use it in the software bridge. But
>> >> > > we also need a design where the SW and HW bridges have independently
>> >> > > tables.
>> >> > >
>> >>
>> >> I do agree that reading all of FDB into CPU is a pain. Given the table
>> >> size of 1K or 8K, I am (probably incorrectly) speculating that the
>> >> device may be a router primarily. Also, not having means of
>> >
>> > No. I don't think any of the Marvell or Broadcom entry level
>> > switch chips supports L3.
>> >
>> > See http://www.marvell.com/switching/link-street/ and
>> > http://community.broadcom.com/docs/DOC-1724.
>> >
>> > Forwarding table size in those chips is from 1k all the way up to 16k.
>> >
>> > It is correct to assume that some of those chips are _used_ in
>> > routers/L3 switches, but that would not be used for L3 data but
>> > to interrconnect the various cards in the system, primarily for
>> > internal management traffic.
>>
>> I didn't mean that the chips support L3 in hardware, as you said they
>> don't. I meant that these chips are used in routers which have routing
>> done by CPU. Typically the OPENWRT class devices, SOHO or similar
>> devices.
>>
> No, that is not the use case in any of the use cases I am aware of
> and have been involved with.
Ok.
>
>> For these routers, exposing FDB to CPU is not really a requirement. So
>> the hardware is not built to have learn/age-notifications or efficient
>> access to FDB. This is the reason I asked you whether the driver you
>> are developing really needs to expose FDB to CPU.
>>
>> >
>> >> learn-notifications and/or a quick (hw) interface to get all the
>> >> table, I (again probably incorrectly) speculate that there are not
>> >> many use cases associated with FDB and end-user/CPU for this silicon.
>> >>
>> >> So I am thinking why would you want to read FDB to CPU ? You can
>> >> disable learning on the bridge and have the driver not send any
>> >> learning notifications to kernel, while the silicon continues to learn
>> >> and forward. The end user may not be able see the FDB on a command,
>> >> but is this a requirement for you ?
>> >>
>> >> I may be missing some use cases here, so would you mind mentioning ?
>> >>
>> > A bridge can span multiple switch chips as well as some local interfaces.
>> > In that case, it would be beneficial if the switch would be able to share
>> > its fdb with the CPU, but I don't think it is mandatory. I may be missing
>> > something, though.
>>
>> This is a general usecase and is usually with chips that can support
>> learning notifications and other CPU controls on FDB. It can be
>> implemented via bridge controlling the FDB. But I suspect this usecase
>> won't be applicable for the device you are attempting to port the
>> driver to.
>>
>
> What do you mean with "bridge controlling the FDB" ? You mean by software ?
> That would not scale for the reasons mentioned earlier; the switch chips
> need to be able to populate their own FDBs.
Yes, I mean software. I think the scalability issue is not applicable
for chips (in the end-devices) which don't have the need to have CPU
looking at FDB. Those chips can maintain their own FDB but don't need
to inform CPU, therefore bypassing the scalability issue. Please also
see my note below.
>
> Note that I am not that much concerned about "my" use cases, at least not
> for now. For those cases, there will in general only be one port connected
> to the CPU, and a bridge group is not shared across multiple ethernet ports
> conneted to the CPU and some of the switch ports. But that doesn't mean
> that we should ignore that possibility.
The general usecases are those that deal with CPU looking at the FDB
and/or somehow trying to manipulate. I guess I am trying to say that
these usecases are really applicable to devices/silicon where FDB
addition/ageing notifications are supported by the silicon and where
FDB can be read quickly (than MDIO/SPI/I2C) by CPU. Therefore the
drivers can afford to learn the FDB additions/deletions and update the
bridge, thus the bridge and the hardware FDB can be in sync.
For those devices/silicon which don't offer these capabilities, the
use cases themselves are not applicable. The driver can afford to not
read FDB from silicon. Therefore, the scalability issues (reading huge
data from MDIO/SPI/I2C) don't really come into play.
>
> Thanks,
> Guenter
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW
2015-02-25 19:33 ` B Viswanath
@ 2015-02-25 20:03 ` Guenter Roeck
0 siblings, 0 replies; 30+ messages in thread
From: Guenter Roeck @ 2015-02-25 20:03 UTC (permalink / raw)
To: B Viswanath
Cc: Andrew Lunn, Scott Feldman, roopa, Viswanath Bandaru,
Florian Fainelli, netdev, jiri, gospo, siva.mannem.lnx
On Thu, Feb 26, 2015 at 01:03:21AM +0530, B Viswanath wrote:
> On 26 February 2015 at 00:45, Guenter Roeck <linux@roeck-us.net> wrote:
> > On Thu, Feb 26, 2015 at 12:21:47AM +0530, B Viswanath wrote:
> >> On 26 February 2015 at 00:09, Guenter Roeck <linux@roeck-us.net> wrote:
> >> > On Wed, Feb 25, 2015 at 11:01:00PM +0530, B Viswanath wrote:
> >> >> On 25 February 2015 at 22:13, Guenter Roeck <linux@roeck-us.net> wrote:
> >> >> >
> >> >> <snip>
> >> >>
> >> >> > > >
> >> >> > > > You'll want to turn learning off on the bridge, and enable learning (and
> >> >> > > > learning_sync) in hardware. The hw driver will install an FDB entry in the
> >> >> > > > bridge's FDB and mark it "external". The entry will also appear in the
> >> >> > > > device's FDB.
> >> >> > >
> >> >> > > I don't think this is going to work. There is no efficient way to get
> >> >> > > the hardware tables out of the hardware. We don't get notification of
> >> >> > > additions or removals. We can only read the whole table. And it can be
> >> >> > > expensive to read the whole table, since it can be 1K or more entries,
> >> >> > > going over an MDIO bus, which in the worst case can be bit banging on
> >> >> > > gpio lines.
> >> >> > >
> >> >> > Which, coincidentially, is the case in my application. The newer
> >> >> > Marvell switches support up to 8k forwarding table entries, so that
> >> >> > would be really awkward.
> >> >> >
> >> >> > > We probably need a design for devices where we can efficiently get
> >> >> > > access to the hardware table, and use it in the software bridge. But
> >> >> > > we also need a design where the SW and HW bridges have independently
> >> >> > > tables.
> >> >> > >
> >> >>
> >> >> I do agree that reading all of FDB into CPU is a pain. Given the table
> >> >> size of 1K or 8K, I am (probably incorrectly) speculating that the
> >> >> device may be a router primarily. Also, not having means of
> >> >
> >> > No. I don't think any of the Marvell or Broadcom entry level
> >> > switch chips supports L3.
> >> >
> >> > See http://www.marvell.com/switching/link-street/ and
> >> > http://community.broadcom.com/docs/DOC-1724.
> >> >
> >> > Forwarding table size in those chips is from 1k all the way up to 16k.
> >> >
> >> > It is correct to assume that some of those chips are _used_ in
> >> > routers/L3 switches, but that would not be used for L3 data but
> >> > to interrconnect the various cards in the system, primarily for
> >> > internal management traffic.
> >>
> >> I didn't mean that the chips support L3 in hardware, as you said they
> >> don't. I meant that these chips are used in routers which have routing
> >> done by CPU. Typically the OPENWRT class devices, SOHO or similar
> >> devices.
> >>
> > No, that is not the use case in any of the use cases I am aware of
> > and have been involved with.
>
> Ok.
> >
> >> For these routers, exposing FDB to CPU is not really a requirement. So
> >> the hardware is not built to have learn/age-notifications or efficient
> >> access to FDB. This is the reason I asked you whether the driver you
> >> are developing really needs to expose FDB to CPU.
> >>
> >> >
> >> >> learn-notifications and/or a quick (hw) interface to get all the
> >> >> table, I (again probably incorrectly) speculate that there are not
> >> >> many use cases associated with FDB and end-user/CPU for this silicon.
> >> >>
> >> >> So I am thinking why would you want to read FDB to CPU ? You can
> >> >> disable learning on the bridge and have the driver not send any
> >> >> learning notifications to kernel, while the silicon continues to learn
> >> >> and forward. The end user may not be able see the FDB on a command,
> >> >> but is this a requirement for you ?
> >> >>
> >> >> I may be missing some use cases here, so would you mind mentioning ?
> >> >>
> >> > A bridge can span multiple switch chips as well as some local interfaces.
> >> > In that case, it would be beneficial if the switch would be able to share
> >> > its fdb with the CPU, but I don't think it is mandatory. I may be missing
> >> > something, though.
> >>
> >> This is a general usecase and is usually with chips that can support
> >> learning notifications and other CPU controls on FDB. It can be
> >> implemented via bridge controlling the FDB. But I suspect this usecase
> >> won't be applicable for the device you are attempting to port the
> >> driver to.
> >>
> >
> > What do you mean with "bridge controlling the FDB" ? You mean by software ?
> > That would not scale for the reasons mentioned earlier; the switch chips
> > need to be able to populate their own FDBs.
>
> Yes, I mean software. I think the scalability issue is not applicable
> for chips (in the end-devices) which don't have the need to have CPU
> looking at FDB. Those chips can maintain their own FDB but don't need
> to inform CPU, therefore bypassing the scalability issue. Please also
> see my note below.
>
> >
> > Note that I am not that much concerned about "my" use cases, at least not
> > for now. For those cases, there will in general only be one port connected
> > to the CPU, and a bridge group is not shared across multiple ethernet ports
> > conneted to the CPU and some of the switch ports. But that doesn't mean
> > that we should ignore that possibility.
>
> The general usecases are those that deal with CPU looking at the FDB
> and/or somehow trying to manipulate. I guess I am trying to say that
> these usecases are really applicable to devices/silicon where FDB
> addition/ageing notifications are supported by the silicon and where
> FDB can be read quickly (than MDIO/SPI/I2C) by CPU. Therefore the
> drivers can afford to learn the FDB additions/deletions and update the
> bridge, thus the bridge and the hardware FDB can be in sync.
>
> For those devices/silicon which don't offer these capabilities, the
> use cases themselves are not applicable. The driver can afford to not
> read FDB from silicon. Therefore, the scalability issues (reading huge
> data from MDIO/SPI/I2C) don't really come into play.
>
Looks like we are in violent agreement.
Thanks,
Guenter
^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2015-02-25 20:03 UTC | newest]
Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-20 7:09 [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW sfeldma
2015-02-20 7:09 ` [PATCH net-next RFC 1/5] neighbour: add external aged flag sfeldma
2015-02-20 7:09 ` [PATCH net-next RFC 2/5] switchdev: add ntf_flags to FDB notifier sfeldma
2015-02-20 7:09 ` [PATCH net-next RFC 3/5] bridge: call external learn add if adding FDB entry with NTF_EXT_LEARNED set sfeldma
2015-02-20 7:09 ` [PATCH net-next RFC 4/5] bridge: let HW control FDB ageing by setting NTF_EXT_AGED sfeldma
2015-02-20 17:31 ` David Miller
2015-02-20 7:09 ` [PATCH net-next RFC 5/5] rocker: explicitly set SW ageing for rocker sfeldma
2015-02-20 9:46 ` Jiri Pirko
2015-02-20 14:56 ` Scott Feldman
2015-02-20 17:29 ` [PATCH net-next RFC 0/5] Add NTF_EXT_AGED to control FDB ageing in SW or HW roopa
2015-02-20 19:13 ` Florian Fainelli
2015-02-20 19:45 ` Guenter Roeck
2015-02-21 0:20 ` Viswanath Bandaru
2015-02-21 0:39 ` Guenter Roeck
2015-02-21 18:29 ` roopa
[not found] ` <CAE4R7bAPS1GZKaC4M6x9cqfTOkVju1+Po4KzanfSniEFX9oi1w@mail.gmail.com>
2015-02-25 14:31 ` Andrew Lunn
2015-02-25 16:43 ` Guenter Roeck
2015-02-25 17:31 ` B Viswanath
2015-02-25 18:39 ` Guenter Roeck
2015-02-25 18:51 ` B Viswanath
2015-02-25 19:15 ` Guenter Roeck
2015-02-25 19:33 ` B Viswanath
2015-02-25 20:03 ` Guenter Roeck
2015-02-25 17:29 ` David Miller
2015-02-21 11:03 ` Jiri Pirko
2015-02-21 11:29 ` Viswanath Bandaru
2015-02-21 15:50 ` Jiri Pirko
2015-02-21 17:20 ` Viswanath Bandaru
2015-02-21 18:31 ` roopa
2015-02-21 1:23 ` Siva Mannem
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.