* [PATCH 0/4] RFC CPSW switchdev mode @ 2018-05-24 6:56 Ilias Apalodimas 2018-05-24 6:56 ` [PATCH 1/4] cpsw: move common headers definitions to cpsw_priv.h Ilias Apalodimas ` (5 more replies) 0 siblings, 6 replies; 53+ messages in thread From: Ilias Apalodimas @ 2018-05-24 6:56 UTC (permalink / raw) To: netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, jiri, ivecera Cc: francois.ozog, yogeshs, spatton, Ilias Apalodimas Hello, This is adding a new mode on the cpsw driver based around switchdev. In order to enable this you need to enable CONFIG_NET_SWITCHDEV, CONFIG_BRIDGE_VLAN_FILTERING, CONFIG_TI_CPSW_SWITCHDEV and add to udev config: SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="0f011900", \ ATTR{phys_port_name}!="", NAME="sw0$attr{phys_port_name}" Since the phys_switch_id is based on cpsw version, users with different version will need to do 'ip -d link show dev sw0p0 | grep switchid' and replace with the correct value. This patch creates 3 ports, sw0p0, sw0p1 and sw0p2. sw0p1 and sw0p2 are the netdev interfaces connected to PHY devices while sw0p0 is the switch 'cpu facing port'. sw0p0 will be unable to receive and transmit traffic and it's not 100% within switchdev scope but, it's used to configure switch cpu port individually as this is needed for various switch features and configuration scenarios. Bridge setup: ip link add name br0 type bridge ip link set dev br0 type bridge ageing_time 1000 ip link set dev br0 type bridge vlan_filtering 1 ip link set dev sw0p1 up ip link set dev sw0p2 up ip link set dev sw0p0 up ip link set dev sw0p0 master br0 ip link set dev sw0p2 master br0 ip link set dev sw0p1 master br0 ip link set br0 address $(cat /sys/class/net/sw0p1/address) ifconfig br0 up VLAN config: untagged: bridge vlan add dev sw0p1 vid 100 pvid untagged master bridge vlan add dev sw0p2 vid 100 pvid untagged master tagged: bridge vlan add dev sw0p1 vid 100 master bridge vlan add dev sw0p2 vid 100 master IP address on br0: bridge vlan add dev br0 vid 100 pvid untagged self bridge vlan add dev sw0p0 vid 100 pvid untagged master udhcpc -i br0 FDBs: bridge fdb add aa:bb:cc:dd:ee:ff dev sw0p1 master vlan 100 bridge fdb add aa:bb:cc:dd:ee:fe dev sw0p2 master MDBs: single vlan: bridge mdb add dev br0 port sw0p1 grp 239.1.1.1 permanent vid 100 all vlans: bridge mdb add dev br0 port sw0p2 grp 239.1.1.1 permanent bridge mdb add dev br0 port sw0p0 grp 239.1.1.1 permanent Multicast: setting multicast on and off will affect registered multicast setting allmulti on and off will affect unregistered multicast This muct occur before adding VLANs on the interfaces. If you change the flag after the VLAN configuration you need to re-issue the VLAN config commands. Promiscuous mode: Adding/removing sw0p0 on the bridge will enable/disable ALE_P0_UNI_FLOOD NFS: The only way for NFS to work is by chrooting to a minimal environment when switch configuration that will affect connectivity is needed. Assuming you are booting NFS with eth1 interface(the script is hacky and it's just there to prove NFS is doable). setup.sh: #!/bin/sh mkdir proc mount -t proc none /proc ifconfig br0 > /dev/null if [ $? -ne 0 ]; then echo "Setting up bridge" ip link add name br0 type bridge ip link set dev br0 type bridge ageing_time 1000 ip link set dev br0 type bridge vlan_filtering 1 ip link set eth1 down ip link set eth1 name sw0p1 ip link set dev sw0p1 up ip link set dev sw0p2 up ip link set dev sw0p0 up ip link set dev sw0p0 master br0 ip link set dev sw0p2 master br0 ip link set dev sw0p1 master br0 ifconfig sw0p1 0.0.0.0 udhchc -i br0 fi umount /proc run_nfs.sh: #!/bin/sh mkdir /tmp/root/bin -p mkdir /tmp/root/lib -p cp -r /lib/ /tmp/root/ cp -r /bin/ /tmp/root/ cp /sbin/ip /tmp/root/bin cp /sbin/bridge /tmp/root/bin cp /sbin/ifconfig /tmp/root/bin cp /sbin/udhcpc /tmp/root/bin cp /path/to/setup.sh /tmp/root/bin chroot /tmp/root/ busybox sh /bin/run_nfs.sh run ./run_nfs.sh This is on top of 4.17-rc2 tree. P.S: I am not 100% sure that the promiscuity handling is correct. Please let me know if i should change anything on that Ilias Apalodimas (4): cpsw: move common headers definitions to cpsw_priv.h cpsw_ale: add support functions for switchdev cpsw_switchdev: add switchdev support files cpsw: add switchdev support drivers/net/ethernet/ti/Kconfig | 9 + drivers/net/ethernet/ti/Makefile | 1 + drivers/net/ethernet/ti/cpsw.c | 610 ++++++++++++++++++++++--------- drivers/net/ethernet/ti/cpsw_ale.c | 129 +++++++ drivers/net/ethernet/ti/cpsw_ale.h | 8 + drivers/net/ethernet/ti/cpsw_priv.h | 148 ++++++++ drivers/net/ethernet/ti/cpsw_switchdev.c | 299 +++++++++++++++ drivers/net/ethernet/ti/cpsw_switchdev.h | 4 + 8 files changed, 1039 insertions(+), 169 deletions(-) create mode 100644 drivers/net/ethernet/ti/cpsw_priv.h create mode 100644 drivers/net/ethernet/ti/cpsw_switchdev.c create mode 100644 drivers/net/ethernet/ti/cpsw_switchdev.h -- 2.7.4 ^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH 1/4] cpsw: move common headers definitions to cpsw_priv.h 2018-05-24 6:56 [PATCH 0/4] RFC CPSW switchdev mode Ilias Apalodimas @ 2018-05-24 6:56 ` Ilias Apalodimas 2018-05-24 6:56 ` [PATCH 2/4] cpsw_ale: add support functions for switchdev Ilias Apalodimas ` (4 subsequent siblings) 5 siblings, 0 replies; 53+ messages in thread From: Ilias Apalodimas @ 2018-05-24 6:56 UTC (permalink / raw) To: netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, jiri, ivecera Cc: francois.ozog, yogeshs, spatton, Ilias Apalodimas Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> --- drivers/net/ethernet/ti/cpsw.c | 111 +--------------------------- drivers/net/ethernet/ti/cpsw_priv.h | 141 ++++++++++++++++++++++++++++++++++++ 2 files changed, 142 insertions(+), 110 deletions(-) create mode 100644 drivers/net/ethernet/ti/cpsw_priv.h diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c index 3037127..b16e7cf 100644 --- a/drivers/net/ethernet/ti/cpsw.c +++ b/drivers/net/ethernet/ti/cpsw.c @@ -41,6 +41,7 @@ #include "cpsw.h" #include "cpsw_ale.h" +#include "cpsw_priv.h" #include "cpts.h" #include "davinci_cpdma.h" @@ -88,7 +89,6 @@ do { \ #define CPSW_VERSION_3 0x19010f #define CPSW_VERSION_4 0x190112 -#define HOST_PORT_NUM 0 #define CPSW_ALE_PORTS_NUM 3 #define SLIVER_SIZE 0x40 @@ -309,16 +309,6 @@ struct cpsw_ss_regs { #define CPSW_MAX_BLKS_TX_SHIFT 4 #define CPSW_MAX_BLKS_RX 5 -struct cpsw_host_regs { - u32 max_blks; - u32 blk_cnt; - u32 tx_in_ctl; - u32 port_vlan; - u32 tx_pri_map; - u32 cpdma_tx_pri_map; - u32 cpdma_rx_chan_map; -}; - struct cpsw_sliver_regs { u32 id_ver; u32 mac_control; @@ -370,105 +360,6 @@ struct cpsw_hw_stats { u32 rxdmaoverruns; }; -struct cpsw_slave_data { - struct device_node *phy_node; - char phy_id[MII_BUS_ID_SIZE]; - int phy_if; - u8 mac_addr[ETH_ALEN]; - u16 dual_emac_res_vlan; /* Reserved VLAN for DualEMAC */ -}; - -struct cpsw_platform_data { - struct cpsw_slave_data *slave_data; - u32 ss_reg_ofs; /* Subsystem control register offset */ - u32 channels; /* number of cpdma channels (symmetric) */ - u32 slaves; /* number of slave cpgmac ports */ - u32 active_slave; /* time stamping, ethtool and SIOCGMIIPHY slave */ - u32 ale_entries; /* ale table size */ - u32 bd_ram_size; /*buffer descriptor ram size */ - u32 mac_control; /* Mac control register */ - u16 default_vlan; /* Def VLAN for ALE lookup in VLAN aware mode*/ - bool dual_emac; /* Enable Dual EMAC mode */ -}; - -struct cpsw_slave { - void __iomem *regs; - struct cpsw_sliver_regs __iomem *sliver; - int slave_num; - u32 mac_control; - struct cpsw_slave_data *data; - struct phy_device *phy; - struct net_device *ndev; - u32 port_vlan; -}; - -static inline u32 slave_read(struct cpsw_slave *slave, u32 offset) -{ - return readl_relaxed(slave->regs + offset); -} - -static inline void slave_write(struct cpsw_slave *slave, u32 val, u32 offset) -{ - writel_relaxed(val, slave->regs + offset); -} - -struct cpsw_vector { - struct cpdma_chan *ch; - int budget; -}; - -struct cpsw_common { - struct device *dev; - struct cpsw_platform_data data; - struct napi_struct napi_rx; - struct napi_struct napi_tx; - struct cpsw_ss_regs __iomem *regs; - struct cpsw_wr_regs __iomem *wr_regs; - u8 __iomem *hw_stats; - struct cpsw_host_regs __iomem *host_port_regs; - u32 version; - u32 coal_intvl; - u32 bus_freq_mhz; - int rx_packet_max; - struct cpsw_slave *slaves; - struct cpdma_ctlr *dma; - struct cpsw_vector txv[CPSW_MAX_QUEUES]; - struct cpsw_vector rxv[CPSW_MAX_QUEUES]; - struct cpsw_ale *ale; - bool quirk_irq; - bool rx_irq_disabled; - bool tx_irq_disabled; - u32 irqs_table[IRQ_NUM]; - struct cpts *cpts; - int rx_ch_num, tx_ch_num; - int speed; - int usage_count; -}; - -struct cpsw_priv { - struct net_device *ndev; - struct device *dev; - u32 msg_enable; - u8 mac_addr[ETH_ALEN]; - bool rx_pause; - bool tx_pause; - u32 emac_port; - struct cpsw_common *cpsw; -}; - -struct cpsw_stats { - char stat_string[ETH_GSTRING_LEN]; - int type; - int sizeof_stat; - int stat_offset; -}; - -enum { - CPSW_STATS, - CPDMA_RX_STATS, - CPDMA_TX_STATS, -}; - #define CPSW_STAT(m) CPSW_STATS, \ sizeof(((struct cpsw_hw_stats *)0)->m), \ offsetof(struct cpsw_hw_stats, m) diff --git a/drivers/net/ethernet/ti/cpsw_priv.h b/drivers/net/ethernet/ti/cpsw_priv.h new file mode 100644 index 0000000..3b02a83 --- /dev/null +++ b/drivers/net/ethernet/ti/cpsw_priv.h @@ -0,0 +1,141 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#include <linux/netdevice.h> +#include <linux/platform_device.h> + +#define HOST_PORT_NUM 0 +#define IRQ_NUM 2 +#define CPSW_MAX_QUEUES 8 + +#define CPSW_VERSION_1 0x19010a +#define CPSW_VERSION_2 0x19010c +#define CPSW_VERSION_3 0x19010f +#define CPSW_VERSION_4 0x190112 + +/* CPSW_PORT_V1 */ +#define CPSW1_MAX_BLKS 0x00 /* Maximum FIFO Blocks */ +#define CPSW1_BLK_CNT 0x04 /* FIFO Block Usage Count (Read Only) */ +#define CPSW1_TX_IN_CTL 0x08 /* Transmit FIFO Control */ +#define CPSW1_PORT_VLAN 0x0c /* VLAN Register */ +#define CPSW1_TX_PRI_MAP 0x10 /* Tx Header Priority to Switch Pri Mapping */ +#define CPSW1_TS_CTL 0x14 /* Time Sync Control */ +#define CPSW1_TS_SEQ_LTYPE 0x18 /* Time Sync Sequence ID Offset and Msg Type */ +#define CPSW1_TS_VLAN 0x1c /* Time Sync VLAN1 and VLAN2 */ + +/* CPSW_PORT_V2 */ +#define CPSW2_CONTROL 0x00 /* Control Register */ +#define CPSW2_MAX_BLKS 0x08 /* Maximum FIFO Blocks */ +#define CPSW2_BLK_CNT 0x0c /* FIFO Block Usage Count (Read Only) */ +#define CPSW2_TX_IN_CTL 0x10 /* Transmit FIFO Control */ +#define CPSW2_PORT_VLAN 0x14 /* VLAN Register */ +#define CPSW2_TX_PRI_MAP 0x18 /* Tx Header Priority to Switch Pri Mapping */ +#define CPSW2_TS_SEQ_MTYPE 0x1c /* Time Sync Sequence ID Offset and Msg Type */ + +struct cpsw_slave_data { + struct device_node *phy_node; + char phy_id[MII_BUS_ID_SIZE]; + int phy_if; + u8 mac_addr[ETH_ALEN]; + u16 dual_emac_res_vlan; /* Reserved VLAN for DualEMAC */ +}; + +struct cpsw_platform_data { + struct cpsw_slave_data *slave_data; + u32 ss_reg_ofs; /* Subsystem control register offset */ + u32 channels; /* number of cpdma channels (symmetric) */ + u32 slaves; /* number of slave cpgmac ports */ + u32 active_slave; /* time stamping, ethtool and SIOCGMIIPHY slave */ + u32 ale_entries; /* ale table size */ + u32 bd_ram_size; /*buffer descriptor ram size */ + u32 mac_control; /* Mac control register */ + u16 default_vlan; /* Def VLAN for ALE lookup in VLAN aware mode*/ + bool dual_emac; /* Enable Dual EMAC mode */ +}; + +struct cpsw_slave { + void __iomem *regs; + struct cpsw_sliver_regs __iomem *sliver; + int slave_num; + u32 mac_control; + struct cpsw_slave_data *data; + struct phy_device *phy; + struct net_device *ndev; + u32 port_vlan; +}; + +struct cpsw_vector { + struct cpdma_chan *ch; + int budget; +}; + +struct cpsw_common { + struct device *dev; + struct cpsw_platform_data data; + struct napi_struct napi_rx; + struct napi_struct napi_tx; + struct cpsw_ss_regs __iomem *regs; + struct cpsw_wr_regs __iomem *wr_regs; + u8 __iomem *hw_stats; + struct cpsw_host_regs __iomem *host_port_regs; + u32 version; + u32 coal_intvl; + u32 bus_freq_mhz; + int rx_packet_max; + struct cpsw_slave *slaves; + struct cpdma_ctlr *dma; + struct cpsw_vector txv[CPSW_MAX_QUEUES]; + struct cpsw_vector rxv[CPSW_MAX_QUEUES]; + struct cpsw_ale *ale; + bool quirk_irq; + bool rx_irq_disabled; + bool tx_irq_disabled; + u32 irqs_table[IRQ_NUM]; + struct cpts *cpts; + int rx_ch_num, tx_ch_num; + int speed; + int usage_count; +}; + +struct cpsw_priv { + struct net_device *ndev; + struct device *dev; + u32 msg_enable; + u8 mac_addr[ETH_ALEN]; + bool rx_pause; + bool tx_pause; + u8 port_state[3]; + u32 emac_port; + struct cpsw_common *cpsw; +}; + +struct cpsw_stats { + char stat_string[ETH_GSTRING_LEN]; + int type; + int sizeof_stat; + int stat_offset; +}; + +enum { + CPSW_STATS, + CPDMA_RX_STATS, + CPDMA_TX_STATS, +}; + +struct cpsw_host_regs { + u32 max_blks; + u32 blk_cnt; + u32 tx_in_ctl; + u32 port_vlan; + u32 tx_pri_map; + u32 cpdma_tx_pri_map; + u32 cpdma_rx_chan_map; +}; + +static inline u32 slave_read(struct cpsw_slave *slave, u32 offset) +{ + return readl_relaxed(slave->regs + offset); +} + +static inline void slave_write(struct cpsw_slave *slave, u32 val, u32 offset) +{ + writel_relaxed(val, slave->regs + offset); +} -- 2.7.4 ^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 2/4] cpsw_ale: add support functions for switchdev 2018-05-24 6:56 [PATCH 0/4] RFC CPSW switchdev mode Ilias Apalodimas 2018-05-24 6:56 ` [PATCH 1/4] cpsw: move common headers definitions to cpsw_priv.h Ilias Apalodimas @ 2018-05-24 6:56 ` Ilias Apalodimas 2018-05-24 6:56 ` [PATCH 3/4] cpsw_switchdev: add switchdev support files Ilias Apalodimas ` (3 subsequent siblings) 5 siblings, 0 replies; 53+ messages in thread From: Ilias Apalodimas @ 2018-05-24 6:56 UTC (permalink / raw) To: netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, jiri, ivecera Cc: francois.ozog, yogeshs, spatton, Ilias Apalodimas Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> --- drivers/net/ethernet/ti/cpsw_ale.c | 129 +++++++++++++++++++++++++++++++++++++ drivers/net/ethernet/ti/cpsw_ale.h | 8 +++ 2 files changed, 137 insertions(+) diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c index 93dc05c..0b7383f 100644 --- a/drivers/net/ethernet/ti/cpsw_ale.c +++ b/drivers/net/ethernet/ti/cpsw_ale.c @@ -409,6 +409,41 @@ int cpsw_ale_del_mcast(struct cpsw_ale *ale, u8 *addr, int port_mask, } EXPORT_SYMBOL_GPL(cpsw_ale_del_mcast); +static int cpsw_ale_read_mc(struct cpsw_ale *ale, u8 *addr, int flags, u16 vid) +{ + u32 ale_entry[ALE_ENTRY_WORDS] = {0, 0, 0}; + int idx; + + idx = cpsw_ale_match_addr(ale, addr, (flags & ALE_VLAN) ? vid : 0); + if (idx >= 0) + cpsw_ale_read(ale, idx, ale_entry); + + return cpsw_ale_get_port_mask(ale_entry, ale->port_mask_bits); +} + +int cpsw_ale_mcast_add_modify(struct cpsw_ale *ale, u8 *addr, int port_mask, + int flags, u16 vid, int mcast_state) +{ + int mcast_members, ret; + + mcast_members = cpsw_ale_read_mc(ale, addr, flags, vid) | port_mask; + ret = cpsw_ale_add_mcast(ale, addr, mcast_members, flags, vid, 0); + + return ret; +} + +int cpsw_ale_mcast_del_modify(struct cpsw_ale *ale, u8 *addr, int port_mask, + int flags, u16 vid) +{ + int mcast_members, ret; + + mcast_members = cpsw_ale_read_mc(ale, addr, flags, vid) & ~port_mask; + ret = cpsw_ale_del_mcast(ale, addr, mcast_members, flags, vid); + + return ret; +} +EXPORT_SYMBOL_GPL(cpsw_ale_mcast_del_modify); + /* ALE NetCP NU switch specific vlan functions */ static void cpsw_ale_set_vlan_mcast(struct cpsw_ale *ale, u32 *ale_entry, int reg_mcast, int unreg_mcast) @@ -424,6 +459,52 @@ static void cpsw_ale_set_vlan_mcast(struct cpsw_ale *ale, u32 *ale_entry, writel(unreg_mcast, ale->params.ale_regs + ALE_VLAN_MASK_MUX(idx)); } +static int cpsw_ale_read_untagged(struct cpsw_ale *ale, u16 vid) +{ + u32 ale_entry[ALE_ENTRY_WORDS] = {0, 0, 0}; + int idx; + + idx = cpsw_ale_match_vlan(ale, vid); + if (idx >= 0) + cpsw_ale_read(ale, idx, ale_entry); + + return cpsw_ale_get_vlan_untag_force(ale_entry, ale->vlan_field_bits); +} + +/* returns mask of current members for specificed vlan */ +static int cpsw_ale_read_vlan_members(struct cpsw_ale *ale, u16 vid) +{ + u32 ale_entry[ALE_ENTRY_WORDS] = {0, 0, 0}; + int idx; + + idx = cpsw_ale_match_vlan(ale, vid); + if (idx >= 0) + cpsw_ale_read(ale, idx, ale_entry); + + return cpsw_ale_get_vlan_member_list(ale_entry, ale->vlan_field_bits); +} + +/* returns mask of registered/unregistered multicast registration */ +static int cpsw_ale_read_reg_unreg_mc(struct cpsw_ale *ale, u16 vid, bool unreg) +{ + u32 ale_entry[ALE_ENTRY_WORDS] = {0, 0, 0}; + int idx; + int ret; + + idx = cpsw_ale_match_vlan(ale, vid); + if (idx >= 0) + cpsw_ale_read(ale, idx, ale_entry); + + if (unreg) + ret = cpsw_ale_get_vlan_unreg_mcast(ale_entry, + ale->vlan_field_bits); + else + ret = cpsw_ale_get_vlan_reg_mcast(ale_entry, + ale->vlan_field_bits); + + return ret; +} + int cpsw_ale_add_vlan(struct cpsw_ale *ale, u16 vid, int port, int untag, int reg_mcast, int unreg_mcast) { @@ -482,6 +563,54 @@ int cpsw_ale_del_vlan(struct cpsw_ale *ale, u16 vid, int port_mask) } EXPORT_SYMBOL_GPL(cpsw_ale_del_vlan); +int cpsw_ale_vlan_add_modify(struct cpsw_ale *ale, u16 vid, int port_mask, + int untag_mask, int reg_mask, int unreg_mask) +{ + int ret = 0; + int vlan_members = cpsw_ale_read_vlan_members(ale, vid) & ~port_mask; + int reg_mcast_members = + cpsw_ale_read_reg_unreg_mc(ale, vid, 0) & ~port_mask; + int unreg_mcast_members = + cpsw_ale_read_reg_unreg_mc(ale, vid, 1) & ~port_mask; + int untag_members = cpsw_ale_read_untagged(ale, vid) & ~port_mask; + + vlan_members |= port_mask; + untag_members |= untag_mask; + reg_mcast_members |= reg_mask; + unreg_mcast_members |= unreg_mask; + + ret = cpsw_ale_add_vlan(ale, vid, vlan_members, untag_members, + reg_mcast_members, unreg_mcast_members); + if (ret) { + dev_err(ale->params.dev, "Unable to add vlan\n"); + return ret; + } + dev_dbg(ale->params.dev, "port mask 0x%x untag 0x%x\n", vlan_members, + untag_mask); + + return ret; +} +EXPORT_SYMBOL_GPL(cpsw_ale_vlan_add_modify); + +int cpsw_ale_vlan_del_modify(struct cpsw_ale *ale, u16 vid, int port_mask) +{ + int ret = 0; + int vlan_members; + + vlan_members = cpsw_ale_read_vlan_members(ale, vid); + vlan_members &= ~port_mask; + + ret = cpsw_ale_del_vlan(ale, vid, vlan_members); + if (ret) { + dev_err(ale->params.dev, "Unable to del vlan\n"); + return ret; + } + dev_dbg(ale->params.dev, "port mask 0x%x\n", port_mask); + + return ret; +} +EXPORT_SYMBOL_GPL(cpsw_ale_vlan_del_modify); + void cpsw_ale_set_allmulti(struct cpsw_ale *ale, int allmulti) { u32 ale_entry[ALE_ENTRY_WORDS]; diff --git a/drivers/net/ethernet/ti/cpsw_ale.h b/drivers/net/ethernet/ti/cpsw_ale.h index d4fe901..bc29616 100644 --- a/drivers/net/ethernet/ti/cpsw_ale.h +++ b/drivers/net/ethernet/ti/cpsw_ale.h @@ -123,4 +123,12 @@ int cpsw_ale_control_set(struct cpsw_ale *ale, int port, int control, int value); void cpsw_ale_dump(struct cpsw_ale *ale, u32 *data); +int cpsw_ale_vlan_add_modify(struct cpsw_ale *ale, u16 vid, int port_mask, + int untag_mask, int reg_mcast, int unreg_mcast); +int cpsw_ale_vlan_del_modify(struct cpsw_ale *ale, u16 vid, int port_mask); +int cpsw_ale_mcast_add_modify(struct cpsw_ale *ale, u8 *addr, int port_mask, + int flags, u16 vid, int mcast_state); +int cpsw_ale_mcast_del_modify(struct cpsw_ale *ale, u8 *addr, int port, + int flags, u16 vid); + #endif -- 2.7.4 ^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 3/4] cpsw_switchdev: add switchdev support files 2018-05-24 6:56 [PATCH 0/4] RFC CPSW switchdev mode Ilias Apalodimas 2018-05-24 6:56 ` [PATCH 1/4] cpsw: move common headers definitions to cpsw_priv.h Ilias Apalodimas 2018-05-24 6:56 ` [PATCH 2/4] cpsw_ale: add support functions for switchdev Ilias Apalodimas @ 2018-05-24 6:56 ` Ilias Apalodimas 2018-05-24 10:00 ` Maxim Uvarov 2018-05-27 4:39 ` kbuild test robot 2018-05-24 6:56 ` [PATCH 4/4] cpsw: add switchdev support Ilias Apalodimas ` (2 subsequent siblings) 5 siblings, 2 replies; 53+ messages in thread From: Ilias Apalodimas @ 2018-05-24 6:56 UTC (permalink / raw) To: netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, jiri, ivecera Cc: francois.ozog, yogeshs, spatton, Ilias Apalodimas Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> --- drivers/net/ethernet/ti/Kconfig | 9 + drivers/net/ethernet/ti/Makefile | 1 + drivers/net/ethernet/ti/cpsw_switchdev.c | 299 +++++++++++++++++++++++++++++++ drivers/net/ethernet/ti/cpsw_switchdev.h | 4 + 4 files changed, 313 insertions(+) create mode 100644 drivers/net/ethernet/ti/cpsw_switchdev.c create mode 100644 drivers/net/ethernet/ti/cpsw_switchdev.h diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig index 48a541e..b22ae7d 100644 --- a/drivers/net/ethernet/ti/Kconfig +++ b/drivers/net/ethernet/ti/Kconfig @@ -73,6 +73,15 @@ config TI_CPSW To compile this driver as a module, choose M here: the module will be called cpsw. +config TI_CPSW_SWITCHDEV + bool "TI CPSW switchdev support" + depends on TI_CPSW + depends on NET_SWITCHDEV + help + Enable switchdev support on TI's CPSW Ethernet Switch. + + This will allow you to configure the switch using standard tools. + config TI_CPTS bool "TI Common Platform Time Sync (CPTS) Support" depends on TI_CPSW || TI_KEYSTONE_NETCP diff --git a/drivers/net/ethernet/ti/Makefile b/drivers/net/ethernet/ti/Makefile index 0be551d..3926c6a 100644 --- a/drivers/net/ethernet/ti/Makefile +++ b/drivers/net/ethernet/ti/Makefile @@ -15,6 +15,7 @@ obj-$(CONFIG_TI_CPSW_PHY_SEL) += cpsw-phy-sel.o obj-$(CONFIG_TI_CPSW_ALE) += cpsw_ale.o obj-$(CONFIG_TI_CPTS_MOD) += cpts.o obj-$(CONFIG_TI_CPSW) += ti_cpsw.o +ti_cpsw-objs:= cpsw_switchdev.o ti_cpsw-y := cpsw.o obj-$(CONFIG_TI_KEYSTONE_NETCP) += keystone_netcp.o diff --git a/drivers/net/ethernet/ti/cpsw_switchdev.c b/drivers/net/ethernet/ti/cpsw_switchdev.c new file mode 100644 index 0000000..bf8c1bf --- /dev/null +++ b/drivers/net/ethernet/ti/cpsw_switchdev.c @@ -0,0 +1,299 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Texas Instruments switchdev Driver + * + * Copyright (C) 2018 Texas Instruments + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation version 2. + * + * This program is distributed "as is" WITHOUT ANY WARRANTY of any + * kind, whether express or implied; without even the implied warranty + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include <linux/etherdevice.h> +#include <linux/if_bridge.h> +#include <net/switchdev.h> +#include "cpsw.h" +#include "cpsw_priv.h" +#include "cpsw_ale.h" + +static u32 cpsw_switchdev_get_ver(struct net_device *ndev) +{ + struct cpsw_priv *priv = netdev_priv(ndev); + struct cpsw_common *cpsw = priv->cpsw; + + return cpsw->version; +} + +static int cpsw_port_attr_set(struct net_device *dev, + const struct switchdev_attr *attr, + struct switchdev_trans *trans) +{ + return -EOPNOTSUPP; +} + +static int cpsw_port_attr_get(struct net_device *dev, + struct switchdev_attr *attr) +{ + u32 cpsw_ver; + int err = 0; + + switch (attr->id) { + case SWITCHDEV_ATTR_ID_PORT_PARENT_ID: + cpsw_ver = cpsw_switchdev_get_ver(dev); + attr->u.ppid.id_len = sizeof(cpsw_ver); + memcpy(&attr->u.ppid.id, &cpsw_ver, attr->u.ppid.id_len); + break; + default: + return -EOPNOTSUPP; + } + + return err; +} + +static u16 cpsw_get_pvid(struct cpsw_priv *priv) +{ + struct cpsw_common *cpsw = priv->cpsw; + u32 __iomem *port_vlan_reg; + u32 pvid; + + if (priv->emac_port) { + int reg = CPSW2_PORT_VLAN; + + if (cpsw->version == CPSW_VERSION_1) + reg = CPSW1_PORT_VLAN; + pvid = slave_read(cpsw->slaves + (priv->emac_port - 1), reg); + } else { + port_vlan_reg = &cpsw->host_port_regs->port_vlan; + pvid = readl(port_vlan_reg); + } + + pvid = pvid & 0xfff; + + return pvid; +} + +static void cpsw_set_pvid(struct cpsw_priv *priv, u16 vid, bool cfi, u32 cos) +{ + struct cpsw_common *cpsw = priv->cpsw; + void __iomem *port_vlan_reg; + u32 pvid; + + pvid = vid; + pvid |= cfi ? BIT(12) : 0; + pvid |= (cos & 0x7) << 13; + + if (priv->emac_port) { + int reg = CPSW2_PORT_VLAN; + + if (cpsw->version == CPSW_VERSION_1) + reg = CPSW1_PORT_VLAN; + /* no barrier */ + slave_write(cpsw->slaves + (priv->emac_port - 1), pvid, reg); + } else { + /* CPU port */ + port_vlan_reg = &cpsw->host_port_regs->port_vlan; + writel(pvid, port_vlan_reg); + } +} + +static int cpsw_port_vlan_add(struct cpsw_priv *priv, bool untag, bool pvid, + u16 vid) +{ + struct cpsw_common *cpsw = priv->cpsw; + int port_mask = BIT(priv->emac_port); + int unreg_mcast_mask = 0; + int reg_mcast_mask = 0; + int untag_mask = 0; + int ret = 0; + + if (priv->ndev->flags & IFF_ALLMULTI) + unreg_mcast_mask = port_mask; + + if (priv->ndev->flags & IFF_MULTICAST) + reg_mcast_mask = port_mask; + + if (untag) + untag_mask = port_mask; + + ret = cpsw_ale_vlan_add_modify(cpsw->ale, vid, port_mask, untag_mask, + reg_mcast_mask, unreg_mcast_mask); + if (ret) { + dev_err(priv->dev, "Unable to add vlan\n"); + return ret; + } + + if (!pvid) + return ret; + + cpsw_set_pvid(priv, vid, 0, 0); + + dev_dbg(priv->dev, "VID: %u dev: %s port: %u\n", vid, + priv->ndev->name, priv->emac_port); + + return ret; +} + +static int cpsw_port_vlan_del(struct cpsw_priv *priv, u16 vid) +{ + struct cpsw_common *cpsw = priv->cpsw; + int port_mask = BIT(priv->emac_port); + int ret = 0; + + ret = cpsw_ale_vlan_del_modify(cpsw->ale, vid, port_mask); + if (ret != 0) + return ret; + + ret = cpsw_ale_del_ucast(cpsw->ale, priv->mac_addr, + HOST_PORT_NUM, ALE_VLAN, vid); + + if (vid == cpsw_get_pvid(priv)) + cpsw_set_pvid(priv, 0, 0, 0); + + if (ret != 0) { + dev_dbg(priv->dev, "Failed to delete unicast entry\n"); + ret = 0; + } + + ret = cpsw_ale_del_mcast(cpsw->ale, priv->ndev->broadcast, + 0, ALE_VLAN, vid); + if (ret != 0) { + dev_dbg(priv->dev, "Failed to delete multicast entry\n"); + ret = 0; + } + + return ret; +} + +static int cpsw_port_vlans_add(struct cpsw_priv *priv, + const struct switchdev_obj_port_vlan *vlan, + struct switchdev_trans *trans) +{ + bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED; + bool pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID; + u16 vid; + + if (switchdev_trans_ph_prepare(trans)) + return 0; + + for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) { + int err; + + err = cpsw_port_vlan_add(priv, untagged, pvid, vid); + if (err) + return err; + } + + return 0; +} + +static int cpsw_port_vlans_del(struct cpsw_priv *priv, + const struct switchdev_obj_port_vlan *vlan) + +{ + u16 vid; + + for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) { + int err; + + err = cpsw_port_vlan_del(priv, vid); + if (err) + return err; + } + + return 0; +} + +static int cpsw_port_mdb_add(struct cpsw_priv *priv, + struct switchdev_obj_port_mdb *mdb, + struct switchdev_trans *trans) +{ + struct cpsw_common *cpsw = priv->cpsw; + int port_mask; + int err; + + if (switchdev_trans_ph_prepare(trans)) + return 0; + + port_mask = BIT(priv->emac_port); + err = cpsw_ale_mcast_add_modify(cpsw->ale, mdb->addr, port_mask, + ALE_VLAN, mdb->vid, 0); + + return err; +} + +static int cpsw_port_mdb_del(struct cpsw_priv *priv, + struct switchdev_obj_port_mdb *mdb) + +{ + struct cpsw_common *cpsw = priv->cpsw; + int del_mask; + int err; + + del_mask = BIT(priv->emac_port); + err = cpsw_ale_mcast_del_modify(cpsw->ale, mdb->addr, del_mask, + ALE_VLAN, mdb->vid); + + return err; +} + +static int cpsw_port_obj_add(struct net_device *ndev, + const struct switchdev_obj *obj, + struct switchdev_trans *trans) +{ + struct switchdev_obj_port_vlan *vlan = SWITCHDEV_OBJ_PORT_VLAN(obj); + struct switchdev_obj_port_mdb *mdb = SWITCHDEV_OBJ_PORT_MDB(obj); + struct cpsw_priv *priv = netdev_priv(ndev); + int err = 0; + + switch (obj->id) { + case SWITCHDEV_OBJ_ID_PORT_VLAN: + err = cpsw_port_vlans_add(priv, vlan, trans); + break; + case SWITCHDEV_OBJ_ID_PORT_MDB: + err = cpsw_port_mdb_add(priv, mdb, trans); + break; + default: + err = -EOPNOTSUPP; + break; + } + + return err; +} + +static int cpsw_port_obj_del(struct net_device *ndev, + const struct switchdev_obj *obj) +{ + struct switchdev_obj_port_vlan *vlan = SWITCHDEV_OBJ_PORT_VLAN(obj); + struct cpsw_priv *priv = netdev_priv(ndev); + int err = 0; + + switch (obj->id) { + case SWITCHDEV_OBJ_ID_PORT_VLAN: + err = cpsw_port_vlans_del(priv, vlan); + break; + case SWITCHDEV_OBJ_ID_PORT_MDB: + err = cpsw_port_mdb_del(priv, SWITCHDEV_OBJ_PORT_MDB(obj)); + break; + default: + err = -EOPNOTSUPP; + break; + } + + return err; +} + +static const struct switchdev_ops cpsw_port_switchdev_ops = { + .switchdev_port_attr_set = cpsw_port_attr_set, + .switchdev_port_attr_get = cpsw_port_attr_get, + .switchdev_port_obj_add = cpsw_port_obj_add, + .switchdev_port_obj_del = cpsw_port_obj_del, +}; + +void cpsw_port_switchdev_init(struct net_device *ndev) +{ + ndev->switchdev_ops = &cpsw_port_switchdev_ops; +} diff --git a/drivers/net/ethernet/ti/cpsw_switchdev.h b/drivers/net/ethernet/ti/cpsw_switchdev.h new file mode 100644 index 0000000..4940462 --- /dev/null +++ b/drivers/net/ethernet/ti/cpsw_switchdev.h @@ -0,0 +1,4 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#include <net/switchdev.h> + +void cpsw_port_switchdev_init(struct net_device *ndev); -- 2.7.4 ^ permalink raw reply related [flat|nested] 53+ messages in thread
* Re: [PATCH 3/4] cpsw_switchdev: add switchdev support files 2018-05-24 6:56 ` [PATCH 3/4] cpsw_switchdev: add switchdev support files Ilias Apalodimas @ 2018-05-24 10:00 ` Maxim Uvarov 2018-05-27 4:39 ` kbuild test robot 1 sibling, 0 replies; 53+ messages in thread From: Maxim Uvarov @ 2018-05-24 10:00 UTC (permalink / raw) To: Ilias Apalodimas Cc: netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, jiri, ivecera, francois.ozog, yogeshs, spatton 2018-05-24 9:56 GMT+03:00 Ilias Apalodimas <ilias.apalodimas@linaro.org>: > Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> > --- > drivers/net/ethernet/ti/Kconfig | 9 + > drivers/net/ethernet/ti/Makefile | 1 + > drivers/net/ethernet/ti/cpsw_switchdev.c | 299 +++++++++++++++++++++++++++++++ > drivers/net/ethernet/ti/cpsw_switchdev.h | 4 + > 4 files changed, 313 insertions(+) > create mode 100644 drivers/net/ethernet/ti/cpsw_switchdev.c > create mode 100644 drivers/net/ethernet/ti/cpsw_switchdev.h > > diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig > index 48a541e..b22ae7d 100644 > --- a/drivers/net/ethernet/ti/Kconfig > +++ b/drivers/net/ethernet/ti/Kconfig > @@ -73,6 +73,15 @@ config TI_CPSW > To compile this driver as a module, choose M here: the module > will be called cpsw. > > +config TI_CPSW_SWITCHDEV > + bool "TI CPSW switchdev support" > + depends on TI_CPSW > + depends on NET_SWITCHDEV > + help > + Enable switchdev support on TI's CPSW Ethernet Switch. > + > + This will allow you to configure the switch using standard tools. > + > config TI_CPTS > bool "TI Common Platform Time Sync (CPTS) Support" > depends on TI_CPSW || TI_KEYSTONE_NETCP > diff --git a/drivers/net/ethernet/ti/Makefile b/drivers/net/ethernet/ti/Makefile > index 0be551d..3926c6a 100644 > --- a/drivers/net/ethernet/ti/Makefile > +++ b/drivers/net/ethernet/ti/Makefile > @@ -15,6 +15,7 @@ obj-$(CONFIG_TI_CPSW_PHY_SEL) += cpsw-phy-sel.o > obj-$(CONFIG_TI_CPSW_ALE) += cpsw_ale.o > obj-$(CONFIG_TI_CPTS_MOD) += cpts.o > obj-$(CONFIG_TI_CPSW) += ti_cpsw.o > +ti_cpsw-objs:= cpsw_switchdev.o > ti_cpsw-y := cpsw.o > > obj-$(CONFIG_TI_KEYSTONE_NETCP) += keystone_netcp.o > diff --git a/drivers/net/ethernet/ti/cpsw_switchdev.c b/drivers/net/ethernet/ti/cpsw_switchdev.c > new file mode 100644 > index 0000000..bf8c1bf > --- /dev/null > +++ b/drivers/net/ethernet/ti/cpsw_switchdev.c > @@ -0,0 +1,299 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* Texas Instruments switchdev Driver > + * > + * Copyright (C) 2018 Texas Instruments > + * > + * This program is free software; you can redistribute it and/or > + * modify it under the terms of the GNU General Public License as > + * published by the Free Software Foundation version 2. > + * > + * This program is distributed "as is" WITHOUT ANY WARRANTY of any > + * kind, whether express or implied; without even the implied warranty > + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + */ > + > +#include <linux/etherdevice.h> > +#include <linux/if_bridge.h> > +#include <net/switchdev.h> > +#include "cpsw.h" > +#include "cpsw_priv.h" > +#include "cpsw_ale.h" > + > +static u32 cpsw_switchdev_get_ver(struct net_device *ndev) > +{ > + struct cpsw_priv *priv = netdev_priv(ndev); > + struct cpsw_common *cpsw = priv->cpsw; > + > + return cpsw->version; > +} > + > +static int cpsw_port_attr_set(struct net_device *dev, > + const struct switchdev_attr *attr, > + struct switchdev_trans *trans) > +{ > + return -EOPNOTSUPP; > +} > + > +static int cpsw_port_attr_get(struct net_device *dev, > + struct switchdev_attr *attr) > +{ > + u32 cpsw_ver; > + int err = 0; > + > + switch (attr->id) { > + case SWITCHDEV_ATTR_ID_PORT_PARENT_ID: > + cpsw_ver = cpsw_switchdev_get_ver(dev); > + attr->u.ppid.id_len = sizeof(cpsw_ver); > + memcpy(&attr->u.ppid.id, &cpsw_ver, attr->u.ppid.id_len); > + break; > + default: > + return -EOPNOTSUPP; > + } > + > + return err; err is always 0 here. > +} > + > +static u16 cpsw_get_pvid(struct cpsw_priv *priv) > +{ > + struct cpsw_common *cpsw = priv->cpsw; > + u32 __iomem *port_vlan_reg; > + u32 pvid; > + > + if (priv->emac_port) { > + int reg = CPSW2_PORT_VLAN; > + > + if (cpsw->version == CPSW_VERSION_1) > + reg = CPSW1_PORT_VLAN; > + pvid = slave_read(cpsw->slaves + (priv->emac_port - 1), reg); > + } else { > + port_vlan_reg = &cpsw->host_port_regs->port_vlan; > + pvid = readl(port_vlan_reg); > + } > + > + pvid = pvid & 0xfff; > + > + return pvid; > +} > + > +static void cpsw_set_pvid(struct cpsw_priv *priv, u16 vid, bool cfi, u32 cos) > +{ > + struct cpsw_common *cpsw = priv->cpsw; > + void __iomem *port_vlan_reg; > + u32 pvid; > + > + pvid = vid; > + pvid |= cfi ? BIT(12) : 0; > + pvid |= (cos & 0x7) << 13; > + > + if (priv->emac_port) { > + int reg = CPSW2_PORT_VLAN; > + > + if (cpsw->version == CPSW_VERSION_1) > + reg = CPSW1_PORT_VLAN; > + /* no barrier */ > + slave_write(cpsw->slaves + (priv->emac_port - 1), pvid, reg); > + } else { > + /* CPU port */ > + port_vlan_reg = &cpsw->host_port_regs->port_vlan; > + writel(pvid, port_vlan_reg); > + } > +} > + > +static int cpsw_port_vlan_add(struct cpsw_priv *priv, bool untag, bool pvid, > + u16 vid) > +{ > + struct cpsw_common *cpsw = priv->cpsw; > + int port_mask = BIT(priv->emac_port); > + int unreg_mcast_mask = 0; > + int reg_mcast_mask = 0; > + int untag_mask = 0; > + int ret = 0; > + > + if (priv->ndev->flags & IFF_ALLMULTI) > + unreg_mcast_mask = port_mask; > + > + if (priv->ndev->flags & IFF_MULTICAST) > + reg_mcast_mask = port_mask; > + > + if (untag) > + untag_mask = port_mask; > + > + ret = cpsw_ale_vlan_add_modify(cpsw->ale, vid, port_mask, untag_mask, > + reg_mcast_mask, unreg_mcast_mask); > + if (ret) { > + dev_err(priv->dev, "Unable to add vlan\n"); > + return ret; > + } > + > + if (!pvid) > + return ret; > + > + cpsw_set_pvid(priv, vid, 0, 0); > + > + dev_dbg(priv->dev, "VID: %u dev: %s port: %u\n", vid, > + priv->ndev->name, priv->emac_port); > + > + return ret; > +} > + > +static int cpsw_port_vlan_del(struct cpsw_priv *priv, u16 vid) > +{ > + struct cpsw_common *cpsw = priv->cpsw; > + int port_mask = BIT(priv->emac_port); > + int ret = 0; no need to set it to 0 here. > + > + ret = cpsw_ale_vlan_del_modify(cpsw->ale, vid, port_mask); > + if (ret != 0) > + return ret; > + > + ret = cpsw_ale_del_ucast(cpsw->ale, priv->mac_addr, > + HOST_PORT_NUM, ALE_VLAN, vid); > + > + if (vid == cpsw_get_pvid(priv)) > + cpsw_set_pvid(priv, 0, 0, 0); > + > + if (ret != 0) { > + dev_dbg(priv->dev, "Failed to delete unicast entry\n"); > + ret = 0; no need to set it to 0. > + } > + > + ret = cpsw_ale_del_mcast(cpsw->ale, priv->ndev->broadcast, > + 0, ALE_VLAN, vid); > + if (ret != 0) { > + dev_dbg(priv->dev, "Failed to delete multicast entry\n"); > + ret = 0; > + } > + just return 0 as it always returned. > + return ret; > +} > + > +static int cpsw_port_vlans_add(struct cpsw_priv *priv, > + const struct switchdev_obj_port_vlan *vlan, > + struct switchdev_trans *trans) > +{ > + bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED; > + bool pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID; > + u16 vid; > + > + if (switchdev_trans_ph_prepare(trans)) > + return 0; > + > + for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) { > + int err; > + > + err = cpsw_port_vlan_add(priv, untagged, pvid, vid); > + if (err) > + return err; > + } > + > + return 0; > +} > + > +static int cpsw_port_vlans_del(struct cpsw_priv *priv, > + const struct switchdev_obj_port_vlan *vlan) > + > +{ > + u16 vid; > + > + for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) { > + int err; > + > + err = cpsw_port_vlan_del(priv, vid); > + if (err) > + return err; > + } > + > + return 0; > +} > + > +static int cpsw_port_mdb_add(struct cpsw_priv *priv, > + struct switchdev_obj_port_mdb *mdb, > + struct switchdev_trans *trans) > +{ > + struct cpsw_common *cpsw = priv->cpsw; > + int port_mask; > + int err; > + > + if (switchdev_trans_ph_prepare(trans)) > + return 0; > + > + port_mask = BIT(priv->emac_port); > + err = cpsw_ale_mcast_add_modify(cpsw->ale, mdb->addr, port_mask, > + ALE_VLAN, mdb->vid, 0); > + > + return err; > +} > + > +static int cpsw_port_mdb_del(struct cpsw_priv *priv, > + struct switchdev_obj_port_mdb *mdb) > + > +{ > + struct cpsw_common *cpsw = priv->cpsw; > + int del_mask; > + int err; > + > + del_mask = BIT(priv->emac_port); > + err = cpsw_ale_mcast_del_modify(cpsw->ale, mdb->addr, del_mask, > + ALE_VLAN, mdb->vid); > + > + return err; > +} > + > +static int cpsw_port_obj_add(struct net_device *ndev, > + const struct switchdev_obj *obj, > + struct switchdev_trans *trans) > +{ > + struct switchdev_obj_port_vlan *vlan = SWITCHDEV_OBJ_PORT_VLAN(obj); > + struct switchdev_obj_port_mdb *mdb = SWITCHDEV_OBJ_PORT_MDB(obj); > + struct cpsw_priv *priv = netdev_priv(ndev); > + int err = 0; > + > + switch (obj->id) { > + case SWITCHDEV_OBJ_ID_PORT_VLAN: > + err = cpsw_port_vlans_add(priv, vlan, trans); > + break; > + case SWITCHDEV_OBJ_ID_PORT_MDB: > + err = cpsw_port_mdb_add(priv, mdb, trans); > + break; > + default: > + err = -EOPNOTSUPP; > + break; > + } > + > + return err; > +} > + > +static int cpsw_port_obj_del(struct net_device *ndev, > + const struct switchdev_obj *obj) > +{ > + struct switchdev_obj_port_vlan *vlan = SWITCHDEV_OBJ_PORT_VLAN(obj); > + struct cpsw_priv *priv = netdev_priv(ndev); > + int err = 0; > + > + switch (obj->id) { > + case SWITCHDEV_OBJ_ID_PORT_VLAN: > + err = cpsw_port_vlans_del(priv, vlan); > + break; > + case SWITCHDEV_OBJ_ID_PORT_MDB: > + err = cpsw_port_mdb_del(priv, SWITCHDEV_OBJ_PORT_MDB(obj)); > + break; > + default: > + err = -EOPNOTSUPP; > + break; > + } > + > + return err; > +} > + > +static const struct switchdev_ops cpsw_port_switchdev_ops = { > + .switchdev_port_attr_set = cpsw_port_attr_set, > + .switchdev_port_attr_get = cpsw_port_attr_get, > + .switchdev_port_obj_add = cpsw_port_obj_add, > + .switchdev_port_obj_del = cpsw_port_obj_del, > +}; > + > +void cpsw_port_switchdev_init(struct net_device *ndev) > +{ > + ndev->switchdev_ops = &cpsw_port_switchdev_ops; > +} > diff --git a/drivers/net/ethernet/ti/cpsw_switchdev.h b/drivers/net/ethernet/ti/cpsw_switchdev.h > new file mode 100644 > index 0000000..4940462 > --- /dev/null > +++ b/drivers/net/ethernet/ti/cpsw_switchdev.h > @@ -0,0 +1,4 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +#include <net/switchdev.h> > + > +void cpsw_port_switchdev_init(struct net_device *ndev); > -- > 2.7.4 > -- Best regards, Maxim Uvarov ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 3/4] cpsw_switchdev: add switchdev support files 2018-05-24 6:56 ` [PATCH 3/4] cpsw_switchdev: add switchdev support files Ilias Apalodimas 2018-05-24 10:00 ` Maxim Uvarov @ 2018-05-27 4:39 ` kbuild test robot 1 sibling, 0 replies; 53+ messages in thread From: kbuild test robot @ 2018-05-27 4:39 UTC (permalink / raw) To: Ilias Apalodimas Cc: kbuild-all, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, jiri, ivecera, francois.ozog, yogeshs, spatton, Ilias Apalodimas [-- Attachment #1: Type: text/plain, Size: 1439 bytes --] Hi Ilias, Thank you for the patch! Yet something to improve: [auto build test ERROR on net/master] [also build test ERROR on v4.17-rc6] [cannot apply to net-next/master] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Ilias-Apalodimas/RFC-CPSW-switchdev-mode/20180527-102334 config: arm-omap2plus_defconfig (attached as .config) compiler: arm-linux-gnueabi-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=arm All errors (new ones prefixed by >>): drivers/net/ethernet/ti/cpsw_switchdev.c: In function 'cpsw_port_switchdev_init': >> drivers/net/ethernet/ti/cpsw_switchdev.c:298:8: error: 'struct net_device' has no member named 'switchdev_ops'; did you mean 'netdev_ops'? ndev->switchdev_ops = &cpsw_port_switchdev_ops; ^~~~~~~~~~~~~ netdev_ops vim +298 drivers/net/ethernet/ti/cpsw_switchdev.c 295 296 void cpsw_port_switchdev_init(struct net_device *ndev) 297 { > 298 ndev->switchdev_ops = &cpsw_port_switchdev_ops; --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 33598 bytes --] ^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH 4/4] cpsw: add switchdev support 2018-05-24 6:56 [PATCH 0/4] RFC CPSW switchdev mode Ilias Apalodimas ` (2 preceding siblings ...) 2018-05-24 6:56 ` [PATCH 3/4] cpsw_switchdev: add switchdev support files Ilias Apalodimas @ 2018-05-24 6:56 ` Ilias Apalodimas 2018-05-24 13:12 ` Andrew Lunn 2018-05-24 8:05 ` [PATCH 0/4] RFC CPSW switchdev mode Jiri Pirko 2018-06-01 21:29 ` Grygorii Strashko 5 siblings, 1 reply; 53+ messages in thread From: Ilias Apalodimas @ 2018-05-24 6:56 UTC (permalink / raw) To: netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, jiri, ivecera Cc: francois.ozog, yogeshs, spatton, Ilias Apalodimas Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> --- drivers/net/ethernet/ti/cpsw.c | 503 +++++++++++++++++++++++++++++++----- drivers/net/ethernet/ti/cpsw_priv.h | 9 +- 2 files changed, 450 insertions(+), 62 deletions(-) diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c index b16e7cf..8f8ebd8 100644 --- a/drivers/net/ethernet/ti/cpsw.c +++ b/drivers/net/ethernet/ti/cpsw.c @@ -18,12 +18,10 @@ #include <linux/clk.h> #include <linux/timer.h> #include <linux/module.h> -#include <linux/platform_device.h> #include <linux/irqreturn.h> #include <linux/interrupt.h> #include <linux/if_ether.h> #include <linux/etherdevice.h> -#include <linux/netdevice.h> #include <linux/net_tstamp.h> #include <linux/phy.h> #include <linux/workqueue.h> @@ -42,6 +40,7 @@ #include "cpsw.h" #include "cpsw_ale.h" #include "cpsw_priv.h" +#include "cpsw_switchdev.h" #include "cpts.h" #include "davinci_cpdma.h" @@ -146,9 +145,6 @@ do { \ #define CPSW_CMINTMAX_INTVL (1000 / CPSW_CMINTMIN_CNT) #define CPSW_CMINTMIN_INTVL ((1000 / CPSW_CMINTMAX_CNT) + 1) -#define cpsw_slave_index(cpsw, priv) \ - ((cpsw->data.dual_emac) ? priv->emac_port : \ - cpsw->data.active_slave) #define IRQ_NUM 2 #define CPSW_MAX_QUEUES 8 #define CPSW_CPDMA_DESCS_POOL_SIZE_DEFAULT 256 @@ -360,6 +356,13 @@ struct cpsw_hw_stats { u32 rxdmaoverruns; }; +struct cpsw_switchdev_event_work { + struct work_struct work; + struct switchdev_notifier_fdb_info fdb_info; + struct cpsw_priv *priv; + unsigned long event; +}; + #define CPSW_STAT(m) CPSW_STATS, \ sizeof(((struct cpsw_hw_stats *)0)->m), \ offsetof(struct cpsw_hw_stats, m) @@ -433,18 +436,22 @@ static const struct cpsw_stats cpsw_gstrings_ch_stats[] = { struct cpsw_slave *slave; \ struct cpsw_common *cpsw = (priv)->cpsw; \ int n; \ - if (cpsw->data.dual_emac) \ - (func)((cpsw)->slaves + priv->emac_port, ##arg);\ - else \ + if (cpsw->data.switch_mode) { \ + if (priv->emac_port == HOST_PORT_NUM) \ + break; \ + (func)((cpsw)->slaves + (priv->emac_port - 1), \ + ##arg);\ + } else { \ for (n = cpsw->data.slaves, \ slave = cpsw->slaves; \ n; n--) \ (func)(slave++, ##arg); \ + } \ } while (0) #define cpsw_dual_emac_src_port_detect(cpsw, status, ndev, skb) \ do { \ - if (!cpsw->data.dual_emac) \ + if (!cpsw->data.switch_mode) \ break; \ if (CPDMA_RX_SOURCE_PORT(status) == 1) { \ ndev = cpsw->slaves[0].ndev; \ @@ -456,11 +463,13 @@ static const struct cpsw_stats cpsw_gstrings_ch_stats[] = { } while (0) #define cpsw_add_mcast(cpsw, priv, addr) \ do { \ - if (cpsw->data.dual_emac) { \ + if (cpsw->data.switch_mode) { \ struct cpsw_slave *slave = cpsw->slaves + \ - priv->emac_port; \ + (priv->emac_port - 1); \ int slave_port = cpsw_get_slave_port( \ slave->slave_num); \ + if (priv->emac_port == HOST_PORT_NUM) \ + break; \ cpsw_ale_add_mcast(cpsw->ale, addr, \ 1 << slave_port | ALE_PORT_HOST, \ ALE_VLAN, slave->port_vlan, 0); \ @@ -476,13 +485,41 @@ static inline int cpsw_get_slave_port(u32 slave_num) return slave_num + 1; } +static int cpsw_is_dual_mac(u8 switch_mode) +{ + return switch_mode == CPSW_DUAL_EMAC; +} + +static int cpsw_is_switchdev(u8 switch_mode) +{ + return switch_mode == CPSW_SWITCHDEV; +} + +static int cpsw_is_switch(u8 switch_mode) +{ + return switch_mode == CPSW_TI_SWITCH; +} + +static int cpsw_slave_index(struct cpsw_priv *priv) +{ + struct cpsw_common *cpsw = priv->cpsw; + +#if IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV) + if (priv->emac_port == HOST_PORT_NUM) + return -1; +#endif + + return cpsw->data.switch_mode ? priv->emac_port - 1 : + cpsw->data.active_slave; +} + static void cpsw_set_promiscious(struct net_device *ndev, bool enable) { struct cpsw_common *cpsw = ndev_to_cpsw(ndev); struct cpsw_ale *ale = cpsw->ale; int i; - if (cpsw->data.dual_emac) { + if (cpsw_is_dual_mac(cpsw->data.switch_mode)) { bool flag = false; /* Enabling promiscuous mode for one interface will be @@ -508,7 +545,7 @@ static void cpsw_set_promiscious(struct net_device *ndev, bool enable) cpsw_ale_control_set(ale, 0, ALE_BYPASS, 0); dev_dbg(&ndev->dev, "promiscuity disabled\n"); } - } else { + } else if (cpsw_is_switch(cpsw->data.switch_mode)) { if (enable) { unsigned long timeout = jiffies + HZ; @@ -548,6 +585,18 @@ static void cpsw_set_promiscious(struct net_device *ndev, bool enable) } dev_dbg(&ndev->dev, "promiscuity disabled\n"); } + } else if (cpsw_is_switchdev(cpsw->data.switch_mode)) { + /* When interfaces are placed into a bridge they'll switch to + * promiscuous mode. In switchdev case ALE_P0_UNI_FLOOD is + * changed whether the cpu port participates in the bridge + */ + struct cpsw_priv *priv = netdev_priv(ndev); + int slave_idx = cpsw_slave_index(priv); + int slave_num; + + slave_num = cpsw_get_slave_port(slave_idx); + cpsw_ale_control_set(ale, slave_num, ALE_PORT_NOLEARN, 0); + cpsw_ale_control_set(ale, slave_num, ALE_PORT_NO_SA_UPDATE, 0); } } @@ -555,10 +604,11 @@ static void cpsw_ndo_set_rx_mode(struct net_device *ndev) { struct cpsw_priv *priv = netdev_priv(ndev); struct cpsw_common *cpsw = priv->cpsw; + int slave_no = cpsw_slave_index(priv); int vid; - if (cpsw->data.dual_emac) - vid = cpsw->slaves[priv->emac_port].port_vlan; + if (cpsw_is_dual_mac(cpsw->data.switch_mode)) + vid = cpsw->slaves[slave_no].port_vlan; else vid = cpsw->data.default_vlan; @@ -629,8 +679,9 @@ static void cpsw_tx_handler(void *token, int len, int status) static void cpsw_rx_vlan_encap(struct sk_buff *skb) { struct cpsw_priv *priv = netdev_priv(skb->dev); - struct cpsw_common *cpsw = priv->cpsw; u32 rx_vlan_encap_hdr = *((u32 *)skb->data); + struct cpsw_common *cpsw = priv->cpsw; + int slave_no = cpsw_slave_index(priv); u16 vtag, vid, prio, pkt_type; /* Remove VLAN header encapsulation word */ @@ -651,8 +702,8 @@ static void cpsw_rx_vlan_encap(struct sk_buff *skb) if (!vid) return; /* Ignore default vlans in dual mac mode */ - if (cpsw->data.dual_emac && - vid == cpsw->slaves[priv->emac_port].port_vlan) + if (cpsw_is_dual_mac(cpsw->data.switch_mode) && + vid == cpsw->slaves[slave_no].port_vlan) return; prio = (rx_vlan_encap_hdr >> @@ -681,9 +732,9 @@ static void cpsw_rx_handler(void *token, int len, int status) cpsw_dual_emac_src_port_detect(cpsw, status, ndev, skb); if (unlikely(status < 0) || unlikely(!netif_running(ndev))) { - /* In dual emac mode check for all interfaces */ - if (cpsw->data.dual_emac && cpsw->usage_count && - (status >= 0)) { + /* In any other that switch mode check for all interfaces */ + if (!cpsw_is_switch(cpsw->data.switch_mode) && + cpsw->usage_count && status >= 0) { /* The packet received is for the interface which * is already down and the other interface is up * and running, instead of freeing which results @@ -699,6 +750,11 @@ static void cpsw_rx_handler(void *token, int len, int status) return; } +#if IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV) + if (cpsw_is_switchdev(cpsw->data.switch_mode)) + skb->offload_fwd_mark = 1; +#endif + new_skb = netdev_alloc_skb_ip_align(ndev, cpsw->rx_packet_max); if (new_skb) { skb_copy_queue_mapping(new_skb, skb); @@ -1206,11 +1262,10 @@ static inline int cpsw_tx_packet_submit(struct cpsw_priv *priv, struct sk_buff *skb, struct cpdma_chan *txch) { - struct cpsw_common *cpsw = priv->cpsw; - skb_tx_timestamp(skb); + return cpdma_chan_submit(txch, skb, skb->data, skb->len, - priv->emac_port + cpsw->data.dual_emac); + priv->emac_port); } static inline void cpsw_add_dual_emac_def_ale_entries( @@ -1283,7 +1338,7 @@ static void cpsw_slave_open(struct cpsw_slave *slave, struct cpsw_priv *priv) slave_port = cpsw_get_slave_port(slave->slave_num); - if (cpsw->data.dual_emac) + if (cpsw_is_dual_mac(cpsw->data.switch_mode)) cpsw_add_dual_emac_def_ale_entries(priv, slave, slave_port); else cpsw_ale_add_mcast(cpsw->ale, priv->ndev->broadcast, @@ -1362,8 +1417,8 @@ static void cpsw_init_host_port(struct cpsw_priv *priv) control_reg = readl(&cpsw->regs->control); control_reg |= CPSW_VLAN_AWARE | CPSW_RX_VLAN_ENCAP; writel(control_reg, &cpsw->regs->control); - fifo_mode = (cpsw->data.dual_emac) ? CPSW_FIFO_DUAL_MAC_MODE : - CPSW_FIFO_NORMAL_MODE; + fifo_mode = cpsw_is_dual_mac(cpsw->data.switch_mode) ? + CPSW_FIFO_DUAL_MAC_MODE : CPSW_FIFO_NORMAL_MODE; writel(fifo_mode, &cpsw->host_port_regs->tx_in_ctl); /* setup host port priority mapping */ @@ -1374,7 +1429,7 @@ static void cpsw_init_host_port(struct cpsw_priv *priv) cpsw_ale_control_set(cpsw->ale, HOST_PORT_NUM, ALE_PORT_STATE, ALE_PORT_STATE_FORWARD); - if (!cpsw->data.dual_emac) { + if (!cpsw_is_dual_mac(cpsw->data.switch_mode)) { cpsw_ale_add_ucast(cpsw->ale, priv->mac_addr, HOST_PORT_NUM, 0, 0); cpsw_ale_add_mcast(cpsw->ale, priv->ndev->broadcast, @@ -1474,14 +1529,19 @@ static int cpsw_ndo_open(struct net_device *ndev) /* Initialize host and slave ports */ if (!cpsw->usage_count) cpsw_init_host_port(priv); - for_each_slave(priv, cpsw_slave_open, priv); - /* Add default VLAN */ - if (!cpsw->data.dual_emac) - cpsw_add_default_vlan(priv); - else - cpsw_ale_add_vlan(cpsw->ale, cpsw->data.default_vlan, - ALE_ALL_PORTS, ALE_ALL_PORTS, 0, 0); + if (!IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV) || + (IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV) && + priv->emac_port != HOST_PORT_NUM)) { + for_each_slave(priv, cpsw_slave_open, priv); + + /* Add default VLAN */ + if (cpsw_is_dual_mac(cpsw->data.switch_mode)) + cpsw_ale_add_vlan(cpsw->ale, cpsw->data.default_vlan, + ALE_ALL_PORTS, ALE_ALL_PORTS, 0, 0); + else + cpsw_add_default_vlan(priv); + } /* initialize shared resources for every ndev */ if (!cpsw->usage_count) { @@ -1575,6 +1635,13 @@ static netdev_tx_t cpsw_ndo_start_xmit(struct sk_buff *skb, struct cpdma_chan *txch; int ret, q_idx; +#if IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV) + if (priv->emac_port == HOST_PORT_NUM) { + dev_kfree_skb_any(skb); + return NETDEV_TX_OK; + } +#endif + if (skb_padto(skb, CPSW_MIN_PACKET_SIZE)) { cpsw_err(priv, tx_err, "packet pad failed\n"); ndev->stats.tx_dropped++; @@ -1655,8 +1722,12 @@ static void cpsw_hwtstamp_v2(struct cpsw_priv *priv) struct cpsw_slave *slave; struct cpsw_common *cpsw = priv->cpsw; u32 ctrl, mtype; + int slave_no = cpsw_slave_index(priv); + + if (slave_no < 0) + return; - slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)]; + slave = &cpsw->slaves[slave_no]; ctrl = slave_read(slave, CPSW2_CONTROL); switch (cpsw->version) { @@ -1791,11 +1862,14 @@ static int cpsw_ndo_ioctl(struct net_device *dev, struct ifreq *req, int cmd) { struct cpsw_priv *priv = netdev_priv(dev); struct cpsw_common *cpsw = priv->cpsw; - int slave_no = cpsw_slave_index(cpsw, priv); + int slave_no = cpsw_slave_index(priv); if (!netif_running(dev)) return -EINVAL; + if (slave_no < 0) + return -EOPNOTSUPP; + switch (cmd) { case SIOCSHWTSTAMP: return cpsw_hwtstamp_set(dev, req); @@ -1832,6 +1906,7 @@ static int cpsw_ndo_set_mac_address(struct net_device *ndev, void *p) struct cpsw_priv *priv = netdev_priv(ndev); struct sockaddr *addr = (struct sockaddr *)p; struct cpsw_common *cpsw = priv->cpsw; + int slave_no = cpsw_slave_index(priv); int flags = 0; u16 vid = 0; int ret; @@ -1845,8 +1920,8 @@ static int cpsw_ndo_set_mac_address(struct net_device *ndev, void *p) return ret; } - if (cpsw->data.dual_emac) { - vid = cpsw->slaves[priv->emac_port].port_vlan; + if (cpsw_is_dual_mac(cpsw->data.switch_mode)) { + vid = cpsw->slaves[slave_no].port_vlan; flags = ALE_VLAN; } @@ -1884,8 +1959,11 @@ static inline int cpsw_add_vlan_ale_entry(struct cpsw_priv *priv, u32 port_mask; struct cpsw_common *cpsw = priv->cpsw; - if (cpsw->data.dual_emac) { - port_mask = (1 << (priv->emac_port + 1)) | ALE_PORT_HOST; + if (cpsw_is_switchdev(cpsw->data.switch_mode)) + return -EOPNOTSUPP; + + if (cpsw_is_dual_mac(cpsw->data.switch_mode)) { + port_mask = (1 << priv->emac_port) | ALE_PORT_HOST; if (priv->ndev->flags & IFF_ALLMULTI) unreg_mcast_mask = port_mask; @@ -1929,6 +2007,9 @@ static int cpsw_ndo_vlan_rx_add_vid(struct net_device *ndev, struct cpsw_common *cpsw = priv->cpsw; int ret; + if (cpsw_is_switchdev(cpsw->data.switch_mode)) + return 0; + if (vid == cpsw->data.default_vlan) return 0; @@ -1938,7 +2019,7 @@ static int cpsw_ndo_vlan_rx_add_vid(struct net_device *ndev, return ret; } - if (cpsw->data.dual_emac) { + if (cpsw_is_dual_mac(cpsw->data.switch_mode)) { /* In dual EMAC, reserved VLAN id should not be used for * creating VLAN interfaces as this can break the dual * EMAC port separation @@ -1965,6 +2046,9 @@ static int cpsw_ndo_vlan_rx_kill_vid(struct net_device *ndev, struct cpsw_common *cpsw = priv->cpsw; int ret; + if (cpsw_is_switchdev(cpsw->data.switch_mode)) + return 0; + if (vid == cpsw->data.default_vlan) return 0; @@ -1974,7 +2058,7 @@ static int cpsw_ndo_vlan_rx_kill_vid(struct net_device *ndev, return ret; } - if (cpsw->data.dual_emac) { + if (cpsw_is_dual_mac(cpsw->data.switch_mode)) { int i; for (i = 0; i < cpsw->data.slaves; i++) { @@ -1999,6 +2083,24 @@ static int cpsw_ndo_vlan_rx_kill_vid(struct net_device *ndev, return ret; } +static int cpsw_ndo_get_phys_port_name(struct net_device *ndev, char *name, + size_t len) +{ + struct cpsw_priv *priv = netdev_priv(ndev); + struct cpsw_common *cpsw = priv->cpsw; + int err; + + if (!cpsw_is_switchdev(cpsw->data.switch_mode)) + return -EOPNOTSUPP; + + err = snprintf(name, len, "p%d", priv->emac_port); + + if (err >= len) + return -EINVAL; + + return 0; +} + static int cpsw_ndo_set_tx_maxrate(struct net_device *ndev, int queue, u32 rate) { struct cpsw_priv *priv = netdev_priv(ndev); @@ -2065,6 +2167,7 @@ static const struct net_device_ops cpsw_netdev_ops = { #endif .ndo_vlan_rx_add_vid = cpsw_ndo_vlan_rx_add_vid, .ndo_vlan_rx_kill_vid = cpsw_ndo_vlan_rx_kill_vid, + .ndo_get_phys_port_name = cpsw_ndo_get_phys_port_name, }; static int cpsw_get_regs_len(struct net_device *ndev) @@ -2152,7 +2255,10 @@ static int cpsw_get_link_ksettings(struct net_device *ndev, { struct cpsw_priv *priv = netdev_priv(ndev); struct cpsw_common *cpsw = priv->cpsw; - int slave_no = cpsw_slave_index(cpsw, priv); + int slave_no = cpsw_slave_index(priv); + + if (slave_no < 0) + return -EOPNOTSUPP; if (!cpsw->slaves[slave_no].phy) return -EOPNOTSUPP; @@ -2166,7 +2272,10 @@ static int cpsw_set_link_ksettings(struct net_device *ndev, { struct cpsw_priv *priv = netdev_priv(ndev); struct cpsw_common *cpsw = priv->cpsw; - int slave_no = cpsw_slave_index(cpsw, priv); + int slave_no = cpsw_slave_index(priv); + + if (slave_no < 0) + return -EOPNOTSUPP; if (cpsw->slaves[slave_no].phy) return phy_ethtool_ksettings_set(cpsw->slaves[slave_no].phy, @@ -2179,7 +2288,10 @@ static void cpsw_get_wol(struct net_device *ndev, struct ethtool_wolinfo *wol) { struct cpsw_priv *priv = netdev_priv(ndev); struct cpsw_common *cpsw = priv->cpsw; - int slave_no = cpsw_slave_index(cpsw, priv); + int slave_no = cpsw_slave_index(priv); + + if (slave_no < 0) + return; wol->supported = 0; wol->wolopts = 0; @@ -2192,7 +2304,10 @@ static int cpsw_set_wol(struct net_device *ndev, struct ethtool_wolinfo *wol) { struct cpsw_priv *priv = netdev_priv(ndev); struct cpsw_common *cpsw = priv->cpsw; - int slave_no = cpsw_slave_index(cpsw, priv); + int slave_no = cpsw_slave_index(priv); + + if (slave_no < 0) + return -EOPNOTSUPP; if (cpsw->slaves[slave_no].phy) return phy_ethtool_set_wol(cpsw->slaves[slave_no].phy, wol); @@ -2451,7 +2566,10 @@ static int cpsw_get_eee(struct net_device *ndev, struct ethtool_eee *edata) { struct cpsw_priv *priv = netdev_priv(ndev); struct cpsw_common *cpsw = priv->cpsw; - int slave_no = cpsw_slave_index(cpsw, priv); + int slave_no = cpsw_slave_index(priv); + + if (slave_no < 0) + return -EOPNOTSUPP; if (cpsw->slaves[slave_no].phy) return phy_ethtool_get_eee(cpsw->slaves[slave_no].phy, edata); @@ -2463,7 +2581,10 @@ static int cpsw_set_eee(struct net_device *ndev, struct ethtool_eee *edata) { struct cpsw_priv *priv = netdev_priv(ndev); struct cpsw_common *cpsw = priv->cpsw; - int slave_no = cpsw_slave_index(cpsw, priv); + int slave_no = cpsw_slave_index(priv); + + if (slave_no < 0) + return -EOPNOTSUPP; if (cpsw->slaves[slave_no].phy) return phy_ethtool_set_eee(cpsw->slaves[slave_no].phy, edata); @@ -2475,7 +2596,10 @@ static int cpsw_nway_reset(struct net_device *ndev) { struct cpsw_priv *priv = netdev_priv(ndev); struct cpsw_common *cpsw = priv->cpsw; - int slave_no = cpsw_slave_index(cpsw, priv); + int slave_no = cpsw_slave_index(priv); + + if (slave_no < 0) + return -EOPNOTSUPP; if (cpsw->slaves[slave_no].phy) return genphy_restart_aneg(cpsw->slaves[slave_no].phy); @@ -2626,7 +2750,11 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data, data->mac_control = prop; if (of_property_read_bool(node, "dual_emac")) - data->dual_emac = 1; + data->switch_mode = CPSW_DUAL_EMAC; + + /* switchdev overrides DTS */ + if (IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV)) + data->switch_mode = CPSW_SWITCHDEV; /* * Populate all the child nodes here... @@ -2707,7 +2835,7 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data, if (ret) return ret; } - if (data->dual_emac) { + if (cpsw_is_dual_mac(data->switch_mode)) { if (of_property_read_u32(slave_node, "dual_emac_res_vlan", &prop)) { dev_err(&pdev->dev, "Missing dual_emac_res_vlan in DT.\n"); @@ -2787,9 +2915,13 @@ static int cpsw_probe_dual_emac(struct cpsw_priv *priv) } memcpy(ndev->dev_addr, priv_sl2->mac_addr, ETH_ALEN); - priv_sl2->emac_port = 1; + priv_sl2->emac_port = 2; cpsw->slaves[1].ndev = ndev; ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER; + if (cpsw_is_switchdev(cpsw->data.switch_mode)) { + ndev->features |= NETIF_F_NETNS_LOCAL; + cpsw_port_switchdev_init(ndev); + } ndev->netdev_ops = &cpsw_netdev_ops; ndev->ethtool_ops = &cpsw_ethtool_ops; @@ -2806,6 +2938,49 @@ static int cpsw_probe_dual_emac(struct cpsw_priv *priv) return ret; } +static int cpsw_probe_cpu_port(struct cpsw_common *cpsw) +{ + struct cpsw_priv *priv_sl2; + struct net_device *ndev; + int ret = 0; + + ndev = alloc_etherdev_mq(sizeof(struct cpsw_priv), CPSW_MAX_QUEUES); + if (!ndev) { + dev_err(cpsw->dev, "cpsw: error allocating net_device\n"); + return -ENOMEM; + } + + priv_sl2 = netdev_priv(ndev); + priv_sl2->cpsw = cpsw; + priv_sl2->ndev = ndev; + priv_sl2->dev = &ndev->dev; + priv_sl2->msg_enable = netif_msg_init(debug_level, CPSW_DEBUG); + + random_ether_addr(priv_sl2->mac_addr); + dev_info(cpsw->dev, "cpu port: Random MACID = %pM\n", + priv_sl2->mac_addr); + + memcpy(ndev->dev_addr, priv_sl2->mac_addr, ETH_ALEN); + + priv_sl2->emac_port = HOST_PORT_NUM; + ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_NETNS_LOCAL; + + ndev->netdev_ops = &cpsw_netdev_ops; + + /* register the network device */ + SET_NETDEV_DEV(ndev, cpsw->dev); + cpsw_port_switchdev_init(ndev); + ret = register_netdev(ndev); + if (ret) { + dev_err(cpsw->dev, "cpsw: error registering net device\n"); + free_netdev(ndev); + ret = -ENODEV; + } + cpsw->master = ndev; + + return ret; +} + #define CPSW_QUIRK_IRQ BIT(0) static const struct platform_device_id cpsw_devtype[] = { @@ -2844,6 +3019,187 @@ static const struct of_device_id cpsw_of_mtable[] = { }; MODULE_DEVICE_TABLE(of, cpsw_of_mtable); +static bool cpsw_port_dev_check(const struct net_device *dev) +{ + return dev->netdev_ops == &cpsw_netdev_ops; +} + +static void cpsw_fdb_offload_notify(struct net_device *ndev, + struct switchdev_notifier_fdb_info *rcv) +{ + struct switchdev_notifier_fdb_info info; + + info.addr = rcv->addr; + info.vid = rcv->vid; + call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED, + ndev, &info.info); +} + +static void cpsw_switchdev_event_work(struct work_struct *work) +{ + struct cpsw_switchdev_event_work *switchdev_work = + container_of(work, struct cpsw_switchdev_event_work, work); + struct cpsw_priv *priv = switchdev_work->priv; + struct switchdev_notifier_fdb_info *fdb; + struct cpsw_common *cpsw = priv->cpsw; + + rtnl_lock(); + switch (switchdev_work->event) { + case SWITCHDEV_FDB_ADD_TO_DEVICE: + fdb = &switchdev_work->fdb_info; + cpsw_ale_add_ucast(cpsw->ale, (u8 *)fdb->addr, priv->emac_port, + ALE_VLAN | ALE_SECURE, fdb->vid); + cpsw_fdb_offload_notify(priv->ndev, fdb); + break; + case SWITCHDEV_FDB_DEL_TO_DEVICE: + fdb = &switchdev_work->fdb_info; + cpsw_ale_del_ucast(cpsw->ale, (u8 *)fdb->addr, priv->emac_port, + ALE_VLAN | ALE_SECURE, fdb->vid); + break; + default: + break; + } + rtnl_unlock(); + + kfree(switchdev_work->fdb_info.addr); + kfree(switchdev_work); + dev_put(priv->ndev); +} + +/* called under rcu_read_lock() */ +static int cpsw_switchdev_event(struct notifier_block *unused, + unsigned long event, void *ptr) +{ + struct net_device *ndev = switchdev_notifier_info_to_dev(ptr); + struct switchdev_notifier_fdb_info *fdb_info = ptr; + struct cpsw_switchdev_event_work *switchdev_work; + struct cpsw_priv *priv = netdev_priv(ndev); + + if (!cpsw_port_dev_check(ndev)) + return NOTIFY_DONE; + + switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC); + if (WARN_ON(!switchdev_work)) + return NOTIFY_BAD; + + INIT_WORK(&switchdev_work->work, cpsw_switchdev_event_work); + switchdev_work->priv = priv; + switchdev_work->event = event; + + switch (event) { + case SWITCHDEV_FDB_ADD_TO_DEVICE: + case SWITCHDEV_FDB_DEL_TO_DEVICE: + memcpy(&switchdev_work->fdb_info, ptr, + sizeof(switchdev_work->fdb_info)); + switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC); + ether_addr_copy((u8 *)switchdev_work->fdb_info.addr, + fdb_info->addr); + dev_hold(ndev); + break; + default: + kfree(switchdev_work); + return NOTIFY_DONE; + } + + queue_work(system_long_wq, &switchdev_work->work); + + return NOTIFY_DONE; +} + +static struct notifier_block cpsw_switchdev_notifier = { + .notifier_call = cpsw_switchdev_event, +}; + +static void cpsw_netdevice_port_link(struct net_device *ndev) +{ + struct cpsw_priv *priv = netdev_priv(ndev); + struct cpsw_common *cpsw = priv->cpsw; + + if (priv->emac_port != HOST_PORT_NUM) + return; + + cpsw_ale_control_set(cpsw->ale, HOST_PORT_NUM, ALE_P0_UNI_FLOOD, 1); +} + +static void cpsw_netdevice_port_unlink(struct net_device *ndev) +{ + struct cpsw_priv *priv = netdev_priv(ndev); + struct cpsw_common *cpsw = priv->cpsw; + + if (priv->emac_port != HOST_PORT_NUM) + return; + + cpsw_ale_control_set(cpsw->ale, HOST_PORT_NUM, ALE_P0_UNI_FLOOD, 0); +} + +/* netdev notifier */ +static int cpsw_netdevice_event(struct notifier_block *unused, + unsigned long event, void *ptr) +{ + struct net_device *ndev = netdev_notifier_info_to_dev(ptr); + struct netdev_notifier_changeupper_info *info; + + switch (event) { + case NETDEV_CHANGEUPPER: + info = ptr; + if (!info->master) + goto out; + if (info->linking) + cpsw_netdevice_port_link(ndev); + else + cpsw_netdevice_port_unlink(ndev); + break; + default: + return NOTIFY_DONE; + } + +out: + return NOTIFY_DONE; +} + +static struct notifier_block cpsw_netdevice_nb __read_mostly = { + .notifier_call = cpsw_netdevice_event, +}; + +static int cpsw_register_notifiers(struct cpsw_priv *priv) +{ + int ret; + + ret = register_netdevice_notifier(&cpsw_netdevice_nb); + if (ret) { + cpsw_err(priv, probe, "can't register netdevice notifier\n"); + return ret; + } + + ret = register_switchdev_notifier(&cpsw_switchdev_notifier); + if (ret) { + cpsw_err(priv, probe, "can't register switchdev notifier\n"); + goto unreg_netdevice; + } + + return ret; + +unreg_netdevice: + ret = unregister_netdevice_notifier(&cpsw_netdevice_nb); + + return ret; +} + +static int cpsw_unregister_notifiers(struct cpsw_priv *priv) +{ + int ret; + + ret = unregister_switchdev_notifier(&cpsw_switchdev_notifier); + if (ret) + dev_err(priv->dev, "can't unregister switchdev notifier\n"); + + ret += unregister_netdevice_notifier(&cpsw_netdevice_nb); + if (ret) + dev_err(priv->dev, "can't unregister netdevice notifier\n"); + + return ret; +} + static int cpsw_probe(struct platform_device *pdev) { struct clk *clk; @@ -2935,7 +3291,11 @@ static int cpsw_probe(struct platform_device *pdev) cpsw->slaves[i].slave_num = i; cpsw->slaves[0].ndev = ndev; - priv->emac_port = 0; + + if (cpsw_is_switch(cpsw->data.switch_mode)) + priv->emac_port = HOST_PORT_NUM; + else + priv->emac_port = 1; clk = devm_clk_get(&pdev->dev, "fck"); if (IS_ERR(clk)) { @@ -3076,8 +3436,17 @@ static int cpsw_probe(struct platform_device *pdev) cpsw->quirk_irq = true; } - ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_VLAN_CTAG_RX; + if (cpsw_is_switchdev(cpsw->data.switch_mode)) { + ret = cpsw_probe_cpu_port(cpsw); + if (ret) { + cpsw_err(priv, probe, "error probe cpu interface\n"); + goto clean_dma_ret; + } + cpsw_port_switchdev_init(ndev); + ndev->features |= NETIF_F_NETNS_LOCAL; + } + ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_VLAN_CTAG_RX; ndev->netdev_ops = &cpsw_netdev_ops; ndev->ethtool_ops = &cpsw_ethtool_ops; netif_napi_add(ndev, &cpsw->napi_rx, cpsw_rx_poll, CPSW_POLL_WEIGHT); @@ -3093,7 +3462,7 @@ static int cpsw_probe(struct platform_device *pdev) goto clean_dma_ret; } - if (cpsw->data.dual_emac) { + if (!cpsw_is_switch(cpsw->data.switch_mode)) { ret = cpsw_probe_dual_emac(priv); if (ret) { cpsw_err(priv, probe, "error probe slave 2 emac interface\n"); @@ -3139,6 +3508,12 @@ static int cpsw_probe(struct platform_device *pdev) goto clean_dma_ret; } + if (cpsw_is_switchdev(cpsw->data.switch_mode)) { + ret = cpsw_register_notifiers(priv); + if (ret) + goto clean_dma_ret; + } + cpsw_notice(priv, probe, "initialized device (regs %pa, irq %d, pool size %d)\n", &ss_res->start, ndev->irq, dma_params.descs_pool_size); @@ -3164,7 +3539,8 @@ static int cpsw_probe(struct platform_device *pdev) static int cpsw_remove(struct platform_device *pdev) { struct net_device *ndev = platform_get_drvdata(pdev); - struct cpsw_common *cpsw = ndev_to_cpsw(ndev); + struct cpsw_priv *priv = netdev_priv(ndev); + struct cpsw_common *cpsw = priv->cpsw; int ret; ret = pm_runtime_get_sync(&pdev->dev); @@ -3173,7 +3549,10 @@ static int cpsw_remove(struct platform_device *pdev) return ret; } - if (cpsw->data.dual_emac) + if (cpsw_is_switchdev(cpsw->data.switch_mode)) + ret = cpsw_unregister_notifiers(priv); + + if (cpsw->data.switch_mode) unregister_netdev(cpsw->slaves[1].ndev); unregister_netdev(ndev); @@ -3182,8 +3561,10 @@ static int cpsw_remove(struct platform_device *pdev) cpsw_remove_dt(pdev); pm_runtime_put_sync(&pdev->dev); pm_runtime_disable(&pdev->dev); - if (cpsw->data.dual_emac) + if (cpsw->data.switch_mode) free_netdev(cpsw->slaves[1].ndev); + if (cpsw->master) + free_netdev(cpsw->master); free_netdev(ndev); return 0; } @@ -3195,7 +3576,7 @@ static int cpsw_suspend(struct device *dev) struct net_device *ndev = platform_get_drvdata(pdev); struct cpsw_common *cpsw = ndev_to_cpsw(ndev); - if (cpsw->data.dual_emac) { + if (cpsw->data.switch_mode) { int i; for (i = 0; i < cpsw->data.slaves; i++) { @@ -3224,7 +3605,7 @@ static int cpsw_resume(struct device *dev) /* shut up ASSERT_RTNL() warning in netif_set_real_num_tx/rx_queues */ rtnl_lock(); - if (cpsw->data.dual_emac) { + if (cpsw->data.switch_mode) { int i; for (i = 0; i < cpsw->data.slaves; i++) { diff --git a/drivers/net/ethernet/ti/cpsw_priv.h b/drivers/net/ethernet/ti/cpsw_priv.h index 3b02a83..4be5ffc 100644 --- a/drivers/net/ethernet/ti/cpsw_priv.h +++ b/drivers/net/ethernet/ti/cpsw_priv.h @@ -30,6 +30,12 @@ #define CPSW2_TX_PRI_MAP 0x18 /* Tx Header Priority to Switch Pri Mapping */ #define CPSW2_TS_SEQ_MTYPE 0x1c /* Time Sync Sequence ID Offset and Msg Type */ +enum { + CPSW_TI_SWITCH, + CPSW_DUAL_EMAC, + CPSW_SWITCHDEV, +}; + struct cpsw_slave_data { struct device_node *phy_node; char phy_id[MII_BUS_ID_SIZE]; @@ -48,7 +54,7 @@ struct cpsw_platform_data { u32 bd_ram_size; /*buffer descriptor ram size */ u32 mac_control; /* Mac control register */ u16 default_vlan; /* Def VLAN for ALE lookup in VLAN aware mode*/ - bool dual_emac; /* Enable Dual EMAC mode */ + u8 switch_mode; /* Enable Dual EMAC/switchdev mode */ }; struct cpsw_slave { @@ -80,6 +86,7 @@ struct cpsw_common { u32 coal_intvl; u32 bus_freq_mhz; int rx_packet_max; + struct net_device *master; /* used for switchdev */ struct cpsw_slave *slaves; struct cpdma_ctlr *dma; struct cpsw_vector txv[CPSW_MAX_QUEUES]; -- 2.7.4 ^ permalink raw reply related [flat|nested] 53+ messages in thread
* Re: [PATCH 4/4] cpsw: add switchdev support 2018-05-24 6:56 ` [PATCH 4/4] cpsw: add switchdev support Ilias Apalodimas @ 2018-05-24 13:12 ` Andrew Lunn 2018-05-24 13:32 ` Ilias Apalodimas 0 siblings, 1 reply; 53+ messages in thread From: Andrew Lunn @ 2018-05-24 13:12 UTC (permalink / raw) To: Ilias Apalodimas Cc: netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, jiri, ivecera, francois.ozog, yogeshs, spatton > @@ -2626,7 +2750,11 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data, > data->mac_control = prop; > > if (of_property_read_bool(node, "dual_emac")) > - data->dual_emac = 1; > + data->switch_mode = CPSW_DUAL_EMAC; > + > + /* switchdev overrides DTS */ > + if (IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV)) > + data->switch_mode = CPSW_SWITCHDEV; Device tree is supposed to describe the hardware. Using that hardware in different ways is not something you should describe in DT. There are also a lot of IS_ENABLED() here, which i don't like. It is a lot better than #ifdef, but we should try to do better. It would be good to split this cleanly into three parts. A generic library, which does not care about DUAL_MAC or SWITCHDEV. A driver which implements legacy DUAL MAC etc. And a driver which implements SWITCHDEV. We can then give this new switchdev driver a different compatible. It i still encoding in device tree how to use the hardware, but it is more implicit, rather than explicit. Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 4/4] cpsw: add switchdev support 2018-05-24 13:12 ` Andrew Lunn @ 2018-05-24 13:32 ` Ilias Apalodimas 2018-05-24 16:39 ` Andrew Lunn 0 siblings, 1 reply; 53+ messages in thread From: Ilias Apalodimas @ 2018-05-24 13:32 UTC (permalink / raw) To: Andrew Lunn Cc: netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, jiri, ivecera, francois.ozog, yogeshs, spatton On Thu, May 24, 2018 at 03:12:29PM +0200, Andrew Lunn wrote: > Device tree is supposed to describe the hardware. Using that hardware > in different ways is not something you should describe in DT. > The new switchdev mode is applied with a .config option in the kernel. What you see is pre-existing code, so i am not sure if i should change it in this patchset. Your point is valid though and we are on the same page. > There are also a lot of IS_ENABLED() here, which i don't like. It is a > lot better than #ifdef, but we should try to do better. I don't like it either i just tried to clean up code in "hot path" with ifdefs. In theory this should replace "switch mode" in the near future so the ifdefs will go away > It would be > good to split this cleanly into three parts. A generic library, which > does not care about DUAL_MAC or SWITCHDEV. A driver which implements > legacy DUAL MAC etc. And a driver which implements SWITCHDEV. We can > then give this new switchdev driver a different compatible. It i still > encoding in device tree how to use the hardware, but it is more > implicit, rather than explicit. Good idea, i'll sent the next version like that Thanks, Ilias ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 4/4] cpsw: add switchdev support 2018-05-24 13:32 ` Ilias Apalodimas @ 2018-05-24 16:39 ` Andrew Lunn 2018-05-25 4:56 ` Ilias Apalodimas 0 siblings, 1 reply; 53+ messages in thread From: Andrew Lunn @ 2018-05-24 16:39 UTC (permalink / raw) To: Ilias Apalodimas Cc: netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, jiri, ivecera, francois.ozog, yogeshs, spatton On Thu, May 24, 2018 at 04:32:34PM +0300, Ilias Apalodimas wrote: > On Thu, May 24, 2018 at 03:12:29PM +0200, Andrew Lunn wrote: > > Device tree is supposed to describe the hardware. Using that hardware > > in different ways is not something you should describe in DT. > > > The new switchdev mode is applied with a .config option in the kernel. What you > see is pre-existing code, so i am not sure if i should change it in this > patchset. If you break the code up into a library and two drivers, it becomes a moot point. But what i don't like here is that the device tree says to do dual mac. But you ignore that and do sometime else. I would prefer that if DT says dual mac, and switchdev is compiled in, the probe fails with EINVAL. Rather than ignore something, make it clear it is invalid. Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 4/4] cpsw: add switchdev support 2018-05-24 16:39 ` Andrew Lunn @ 2018-05-25 4:56 ` Ilias Apalodimas 2018-06-01 21:48 ` Florian Fainelli 0 siblings, 1 reply; 53+ messages in thread From: Ilias Apalodimas @ 2018-05-25 4:56 UTC (permalink / raw) To: Andrew Lunn Cc: netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, jiri, ivecera, francois.ozog, yogeshs, spatton On Thu, May 24, 2018 at 06:39:04PM +0200, Andrew Lunn wrote: > On Thu, May 24, 2018 at 04:32:34PM +0300, Ilias Apalodimas wrote: > > On Thu, May 24, 2018 at 03:12:29PM +0200, Andrew Lunn wrote: > > > Device tree is supposed to describe the hardware. Using that hardware > > > in different ways is not something you should describe in DT. > > > > > The new switchdev mode is applied with a .config option in the kernel. What you > > see is pre-existing code, so i am not sure if i should change it in this > > patchset. > > If you break the code up into a library and two drivers, it becomes a > moot point. Agree > > But what i don't like here is that the device tree says to do dual > mac. But you ignore that and do sometime else. I would prefer that if > DT says dual mac, and switchdev is compiled in, the probe fails with > EINVAL. Rather than ignore something, make it clear it is invalid. The switch has 3 modes of operation as is. 1. switch mode, to enable that you don't need to add anything on the DTS and linux registers a single netdev interface. 2. dual mac mode, this is when you need to add dual_emac; on the DTS. 3. switchdev mode which is controlled by a .config option, since as you pointed out DTS was not made for controlling config options. I agree that this is far from beautiful. If the driver remains as in though, i'd prefer either keeping what's there or making "switchdev" a DTS option, following the pre-existing erroneous usage rather than making the device unusable. If we end up returning some error and refuse to initialize, users that remote upgrade their equipment, without taking a good look at changelog, will loose access to their devices with no means of remotely fixing that. Regards Ilias ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 4/4] cpsw: add switchdev support 2018-05-25 4:56 ` Ilias Apalodimas @ 2018-06-01 21:48 ` Florian Fainelli 2018-06-02 10:34 ` Ilias Apalodimas 0 siblings, 1 reply; 53+ messages in thread From: Florian Fainelli @ 2018-06-01 21:48 UTC (permalink / raw) To: Ilias Apalodimas, Andrew Lunn Cc: netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, jiri, ivecera, francois.ozog, yogeshs, spatton On 05/24/2018 09:56 PM, Ilias Apalodimas wrote: > On Thu, May 24, 2018 at 06:39:04PM +0200, Andrew Lunn wrote: >> On Thu, May 24, 2018 at 04:32:34PM +0300, Ilias Apalodimas wrote: >>> On Thu, May 24, 2018 at 03:12:29PM +0200, Andrew Lunn wrote: >>>> Device tree is supposed to describe the hardware. Using that hardware >>>> in different ways is not something you should describe in DT. >>>> >>> The new switchdev mode is applied with a .config option in the kernel. What you >>> see is pre-existing code, so i am not sure if i should change it in this >>> patchset. >> >> If you break the code up into a library and two drivers, it becomes a >> moot point. > Agree > >> >> But what i don't like here is that the device tree says to do dual >> mac. But you ignore that and do sometime else. I would prefer that if >> DT says dual mac, and switchdev is compiled in, the probe fails with >> EINVAL. Rather than ignore something, make it clear it is invalid. > The switch has 3 modes of operation as is. > 1. switch mode, to enable that you don't need to add anything on > the DTS and linux registers a single netdev interface. > 2. dual mac mode, this is when you need to add dual_emac; on the DTS. > 3. switchdev mode which is controlled by a .config option, since as you > pointed out DTS was not made for controlling config options. > > I agree that this is far from beautiful. If the driver remains as in though, > i'd prefer either keeping what's there or making "switchdev" a DTS option, > following the pre-existing erroneous usage rather than making the device > unusable. If we end up returning some error and refuse to initialize, users > that remote upgrade their equipment, without taking a good look at changelog, > will loose access to their devices with no means of remotely fixing that. It seems to me that the mistake here is seeing multiple modes of operations for the cpsw. There are not actually many, there is one usage, and then there is what you can and cannot offload. The basic premise with switchdev and DSA (which uses switchdev) is that each user-facing port of your switch needs to work as if it were a normal Ethernet NIC, that is what you call dual-MAC I believe. Then, when you create a bridge and you enslave those ports into the bridge, you need to have forwarding done in hardware between these two ports when the SMAC/DMAC are not for the host/CPU/management interface and you must simultaneously still have the host have the ability to send/receive traffic through the bridge device. It seems to me like this is entirely doable given that the dual MAC use case is supported already. switchdev is just a stateless framework to get notified from the networking stack about what you can possibly offload in hardware, so having a DTS option gate that is unfortunately wrong because it is really implementing a SW policy in DTS which is not what it is meant for. -- Florian ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 4/4] cpsw: add switchdev support 2018-06-01 21:48 ` Florian Fainelli @ 2018-06-02 10:34 ` Ilias Apalodimas 2018-06-02 16:10 ` Florian Fainelli 2018-06-05 21:03 ` Grygorii Strashko 0 siblings, 2 replies; 53+ messages in thread From: Ilias Apalodimas @ 2018-06-02 10:34 UTC (permalink / raw) To: Florian Fainelli Cc: Andrew Lunn, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, jiri, ivecera, francois.ozog, yogeshs, spatton Hi Florian, Thanks for taking time to look into this On Fri, Jun 01, 2018 at 02:48:48PM -0700, Florian Fainelli wrote: > > > On 05/24/2018 09:56 PM, Ilias Apalodimas wrote: > > On Thu, May 24, 2018 at 06:39:04PM +0200, Andrew Lunn wrote: > >> On Thu, May 24, 2018 at 04:32:34PM +0300, Ilias Apalodimas wrote: > >>> On Thu, May 24, 2018 at 03:12:29PM +0200, Andrew Lunn wrote: > >>>> Device tree is supposed to describe the hardware. Using that hardware > >>>> in different ways is not something you should describe in DT. > >>>> > >>> The new switchdev mode is applied with a .config option in the kernel. What you > >>> see is pre-existing code, so i am not sure if i should change it in this > >>> patchset. > >> > >> If you break the code up into a library and two drivers, it becomes a > >> moot point. > > Agree > > > >> > >> But what i don't like here is that the device tree says to do dual > >> mac. But you ignore that and do sometime else. I would prefer that if > >> DT says dual mac, and switchdev is compiled in, the probe fails with > >> EINVAL. Rather than ignore something, make it clear it is invalid. > > The switch has 3 modes of operation as is. > > 1. switch mode, to enable that you don't need to add anything on > > the DTS and linux registers a single netdev interface. > > 2. dual mac mode, this is when you need to add dual_emac; on the DTS. > > 3. switchdev mode which is controlled by a .config option, since as you > > pointed out DTS was not made for controlling config options. > > > > I agree that this is far from beautiful. If the driver remains as in though, > > i'd prefer either keeping what's there or making "switchdev" a DTS option, > > following the pre-existing erroneous usage rather than making the device > > unusable. If we end up returning some error and refuse to initialize, users > > that remote upgrade their equipment, without taking a good look at changelog, > > will loose access to their devices with no means of remotely fixing that. > > It seems to me that the mistake here is seeing multiple modes of > operations for the cpsw. There are not actually many, there is one > usage, and then there is what you can and cannot offload. CPSW has in fact 2 modes of operation, different FIFO usage/lookup entry(it's called ALE in the current driver) by-pass(which is used in dual emac for example) and other features. Again Grygorii is better suited to answer the exact differences. > The basic > premise with switchdev and DSA (which uses switchdev) is that each > user-facing port of your switch needs to work as if it were a normal > Ethernet NIC, that is what you call dual-MAC I believe. Then, when you > create a bridge and you enslave those ports into the bridge, you need to > have forwarding done in hardware between these two ports when the > SMAC/DMAC are not for the host/CPU/management interface and you must > simultaneously still have the host have the ability to send/receive > traffic through the bridge device. Yes dual emac does that. But dual emac configures the port facing VLAN to the CPU port as well. So dual emac splits and uses 2 interfaces. VLAN 1 is configured on port1 + CPU port and VLAN 2 is confired on port 2 + CPU port That's exactly what the current RFC does as well, with the addition of registering a sw0p0 (i'll explain why later on this mail) A little more detail on the issue we are having. On my description sw0p0 -> CPU port, sw0p1 -> port 1 sw0p2 -> port 2. sw0p1/sw0p2 are the ports that have PHYs attached. When we start in the new switchdev mode all interfaces are added to VLAN 0 so CPU port + port1 + port2 are all in the same VLAN group. In that case sw0p1 and sw0p2 are working as you describe. So those 2 interfaces can send/receive traffic normally which matches the switchdev case. When we add them on a bridge(let's say br0), VLAN1(or any default bridge VLAN) is now configured on sw0p1 and sw0p2 but *not* on the CPU port. >From this point on the whole fuunctionality just collapses. The switch will work and offload traffic between sw0p1/sw0p2 but the bridge won't be able to get an ip address (since VLAN1 is not a member of the CPU port and the packet gets dropped). IGMPv2/V3 messages will never reach the br_multicast.c code to trigger switchdev and configure the MDBs on the ports. i am prety sure there are other fail scenarios which i haven't discovered already, but those 2 are the most basic ones. If we add VLAN1 on the CPU port, everything works as intended of course. That's the reason we registered sw0p0 as the CPU port. It can't do any "real" traffic, but you can configure the CPU port independantly and not be forced to do an OR on every VLAN add/delete grouping the CPU port with your port command. The TL;DR version of this is that the switch is working exactly as switchdev is expecting offloading traffic to the hardware when possible as long as the CPU port is member of the proper VLANs Petr's patch solves this for us (9c86ce2c1ae337fc10568a12aea812ed03de8319). We can now do "bridge vlan add dev br0 vid 100 pvid untagged self" and decide when to add the CPU port or not. There are still a couple of cases that are not covered though, if we don't register the CPU port. We cant decide when to forward multicast traffic on the CPU port if a join hasn't been sent from br0. So let's say you got 2 hosts doing multicast and for whatever reason the host wants to see that traffic. With the CPU port present you can do a "bridge mdb add dev br0 port sw0p0 grp 239.1.1.1 permanent" which will offload the traffic to the CPU port and thus the host. If this goes away we are forced to send a join. > It seems to me like this is entirely doable given that the dual MAC use > case is supported already. > > switchdev is just a stateless framework to get notified from the > networking stack about what you can possibly offload in hardware, so > having a DTS option gate that is unfortunately wrong because it is > really implementing a SW policy in DTS which is not what it is meant for. The DTS option for configuration pre-existed, i don't like that either switchdev mode is activated by a .config option not DTS(it just overrides whatever config you have on the DTS). Far from pretty though fair point, i am open to suggestions. > -- > Florian Thanks! Ilias ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 4/4] cpsw: add switchdev support 2018-06-02 10:34 ` Ilias Apalodimas @ 2018-06-02 16:10 ` Florian Fainelli 2018-06-02 16:52 ` Ilias Apalodimas 2018-06-05 21:03 ` Grygorii Strashko 1 sibling, 1 reply; 53+ messages in thread From: Florian Fainelli @ 2018-06-02 16:10 UTC (permalink / raw) To: Ilias Apalodimas Cc: Andrew Lunn, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, jiri, ivecera, francois.ozog, yogeshs, spatton On June 2, 2018 3:34:32 AM MST, Ilias Apalodimas <ilias.apalodimas@linaro.org> wrote: >Hi Florian, > >Thanks for taking time to look into this > >On Fri, Jun 01, 2018 at 02:48:48PM -0700, Florian Fainelli wrote: >> >> >> On 05/24/2018 09:56 PM, Ilias Apalodimas wrote: >> > On Thu, May 24, 2018 at 06:39:04PM +0200, Andrew Lunn wrote: >> >> On Thu, May 24, 2018 at 04:32:34PM +0300, Ilias Apalodimas wrote: >> >>> On Thu, May 24, 2018 at 03:12:29PM +0200, Andrew Lunn wrote: >> >>>> Device tree is supposed to describe the hardware. Using that >hardware >> >>>> in different ways is not something you should describe in DT. >> >>>> >> >>> The new switchdev mode is applied with a .config option in the >kernel. What you >> >>> see is pre-existing code, so i am not sure if i should change it >in this >> >>> patchset. >> >> >> >> If you break the code up into a library and two drivers, it >becomes a >> >> moot point. >> > Agree >> > >> >> >> >> But what i don't like here is that the device tree says to do dual >> >> mac. But you ignore that and do sometime else. I would prefer that >if >> >> DT says dual mac, and switchdev is compiled in, the probe fails >with >> >> EINVAL. Rather than ignore something, make it clear it is invalid. >> > The switch has 3 modes of operation as is. >> > 1. switch mode, to enable that you don't need to add anything on >> > the DTS and linux registers a single netdev interface. >> > 2. dual mac mode, this is when you need to add dual_emac; on the >DTS. >> > 3. switchdev mode which is controlled by a .config option, since as >you >> > pointed out DTS was not made for controlling config options. >> > >> > I agree that this is far from beautiful. If the driver remains as >in though, >> > i'd prefer either keeping what's there or making "switchdev" a DTS >option, >> > following the pre-existing erroneous usage rather than making the >device >> > unusable. If we end up returning some error and refuse to >initialize, users >> > that remote upgrade their equipment, without taking a good look at >changelog, >> > will loose access to their devices with no means of remotely fixing >that. >> >> It seems to me that the mistake here is seeing multiple modes of >> operations for the cpsw. There are not actually many, there is one >> usage, and then there is what you can and cannot offload. >CPSW has in fact 2 modes of operation, different FIFO usage/lookup >entry(it's >called ALE in the current driver) by-pass(which is used in dual emac >for >example) and other features. Again Grygorii is better suited to answer >the >exact differences. >> The basic >> premise with switchdev and DSA (which uses switchdev) is that each >> user-facing port of your switch needs to work as if it were a normal >> Ethernet NIC, that is what you call dual-MAC I believe. Then, when >you >> create a bridge and you enslave those ports into the bridge, you need >to >> have forwarding done in hardware between these two ports when the >> SMAC/DMAC are not for the host/CPU/management interface and you must >> simultaneously still have the host have the ability to send/receive >> traffic through the bridge device. >Yes dual emac does that. But dual emac configures the port facing VLAN >to the >CPU port as well. So dual emac splits and uses 2 interfaces. VLAN 1 is >configured on port1 + CPU port and VLAN 2 is confired on port 2 + CPU >port >That's exactly what the current RFC does as well, with the addition of >registering a sw0p0 (i'll explain why later on this mail) >A little more detail on the issue we are having. On my description >sw0p0 -> CPU port, sw0p1 -> port 1 sw0p2 -> port 2. sw0p1/sw0p2 are the >ports >that have PHYs attached. > >When we start in the new switchdev mode all interfaces are added to >VLAN 0 >so CPU port + port1 + port2 are all in the same VLAN group. In that >case sw0p1 >and sw0p2 are working as you describe. So those 2 interfaces can >send/receive >traffic normally which matches the switchdev case. > >When we add them on a bridge(let's say br0), VLAN1(or any default >bridge VLAN) >is now configured on sw0p1 and sw0p2 but *not* on the CPU port. >From this point on the whole fuunctionality just collapses. The switch >will >work and offload traffic between sw0p1/sw0p2 but the bridge won't be >able to >get an ip address (since VLAN1 is not a member of the CPU port and the >packet >gets dropped). >IGMPv2/V3 messages will never reach the br_multicast.c code to trigger >switchdev and configure the MDBs on the ports. i am prety sure there >are other >fail scenarios which i haven't discovered already, but those 2 are the >most >basic ones. If we add VLAN1 on the CPU port, everything works as >intended of >course. > >That's the reason we registered sw0p0 as the CPU port. It can't do any >"real" >traffic, but you can configure the CPU port independantly and not be >forced to >do an OR on every VLAN add/delete grouping the CPU port with your port >command. >The TL;DR version of this is that the switch is working exactly as >switchdev is >expecting offloading traffic to the hardware when possible as long as >the CPU >port is member of the proper VLANs > >Petr's patch solves this for us >(9c86ce2c1ae337fc10568a12aea812ed03de8319). >We can now do "bridge vlan add dev br0 vid 100 pvid untagged self" and >decide >when to add the CPU port or not. > >There are still a couple of cases that are not covered though, if we >don't >register the CPU port. We cant decide when to forward multicast >traffic on the CPU port if a join hasn't been sent from br0. >So let's say you got 2 hosts doing multicast and for whatever reason >the host >wants to see that traffic. >With the CPU port present you can do a >"bridge mdb add dev br0 port sw0p0 grp 239.1.1.1 permanent" which will >offload >the traffic to the CPU port and thus the host. If this goes away we are >forced >to send a join. Thanks for the detailed explanation. Somehow I was under the impression that cpsw had the ability, through specific DMA descriptor bits to direct traffic towards one external port or another and conversely, have that information from the HW when receiving packets. What you describe is exactly the same problem we have in DSA when the switch advertises DSA_TAG_PROTO_NONE where only VLAN tags could help differentiate traffic from external ports. At some point there was a discuss of making DSA_TAG_PROTO_NONE automatically create one VLAN per port but this is a good source for other problems... Looking forward to your follow-up patch series! -- Florian ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 4/4] cpsw: add switchdev support 2018-06-02 16:10 ` Florian Fainelli @ 2018-06-02 16:52 ` Ilias Apalodimas 0 siblings, 0 replies; 53+ messages in thread From: Ilias Apalodimas @ 2018-06-02 16:52 UTC (permalink / raw) To: Florian Fainelli Cc: Andrew Lunn, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, jiri, ivecera, francois.ozog, yogeshs, spatton On Sat, Jun 02, 2018 at 09:10:08AM -0700, Florian Fainelli wrote: > On June 2, 2018 3:34:32 AM MST, Ilias Apalodimas <ilias.apalodimas@linaro.org> wrote: > >Hi Florian, > > > >Thanks for taking time to look into this > > > >On Fri, Jun 01, 2018 at 02:48:48PM -0700, Florian Fainelli wrote: > >> > >> > >> On 05/24/2018 09:56 PM, Ilias Apalodimas wrote: > >> > On Thu, May 24, 2018 at 06:39:04PM +0200, Andrew Lunn wrote: > >> >> On Thu, May 24, 2018 at 04:32:34PM +0300, Ilias Apalodimas wrote: > >> >>> On Thu, May 24, 2018 at 03:12:29PM +0200, Andrew Lunn wrote: > >> >>>> Device tree is supposed to describe the hardware. Using that > >hardware > >> >>>> in different ways is not something you should describe in DT. > >> >>>> > >> >>> The new switchdev mode is applied with a .config option in the > >kernel. What you > >> >>> see is pre-existing code, so i am not sure if i should change it > >in this > >> >>> patchset. > >> >> > >> >> If you break the code up into a library and two drivers, it > >becomes a > >> >> moot point. > >> > Agree > >> > > >> >> > >> >> But what i don't like here is that the device tree says to do dual > >> >> mac. But you ignore that and do sometime else. I would prefer that > >if > >> >> DT says dual mac, and switchdev is compiled in, the probe fails > >with > >> >> EINVAL. Rather than ignore something, make it clear it is invalid. > >> > The switch has 3 modes of operation as is. > >> > 1. switch mode, to enable that you don't need to add anything on > >> > the DTS and linux registers a single netdev interface. > >> > 2. dual mac mode, this is when you need to add dual_emac; on the > >DTS. > >> > 3. switchdev mode which is controlled by a .config option, since as > >you > >> > pointed out DTS was not made for controlling config options. > >> > > >> > I agree that this is far from beautiful. If the driver remains as > >in though, > >> > i'd prefer either keeping what's there or making "switchdev" a DTS > >option, > >> > following the pre-existing erroneous usage rather than making the > >device > >> > unusable. If we end up returning some error and refuse to > >initialize, users > >> > that remote upgrade their equipment, without taking a good look at > >changelog, > >> > will loose access to their devices with no means of remotely fixing > >that. > >> > >> It seems to me that the mistake here is seeing multiple modes of > >> operations for the cpsw. There are not actually many, there is one > >> usage, and then there is what you can and cannot offload. > >CPSW has in fact 2 modes of operation, different FIFO usage/lookup > >entry(it's > >called ALE in the current driver) by-pass(which is used in dual emac > >for > >example) and other features. Again Grygorii is better suited to answer > >the > >exact differences. > >> The basic > >> premise with switchdev and DSA (which uses switchdev) is that each > >> user-facing port of your switch needs to work as if it were a normal > >> Ethernet NIC, that is what you call dual-MAC I believe. Then, when > >you > >> create a bridge and you enslave those ports into the bridge, you need > >to > >> have forwarding done in hardware between these two ports when the > >> SMAC/DMAC are not for the host/CPU/management interface and you must > >> simultaneously still have the host have the ability to send/receive > >> traffic through the bridge device. > >Yes dual emac does that. But dual emac configures the port facing VLAN > >to the > >CPU port as well. So dual emac splits and uses 2 interfaces. VLAN 1 is > >configured on port1 + CPU port and VLAN 2 is confired on port 2 + CPU > >port > >That's exactly what the current RFC does as well, with the addition of > >registering a sw0p0 (i'll explain why later on this mail) > >A little more detail on the issue we are having. On my description > >sw0p0 -> CPU port, sw0p1 -> port 1 sw0p2 -> port 2. sw0p1/sw0p2 are the > >ports > >that have PHYs attached. > > > >When we start in the new switchdev mode all interfaces are added to > >VLAN 0 > >so CPU port + port1 + port2 are all in the same VLAN group. In that > >case sw0p1 > >and sw0p2 are working as you describe. So those 2 interfaces can > >send/receive > >traffic normally which matches the switchdev case. > > > >When we add them on a bridge(let's say br0), VLAN1(or any default > >bridge VLAN) > >is now configured on sw0p1 and sw0p2 but *not* on the CPU port. > >From this point on the whole fuunctionality just collapses. The switch > >will > >work and offload traffic between sw0p1/sw0p2 but the bridge won't be > >able to > >get an ip address (since VLAN1 is not a member of the CPU port and the > >packet > >gets dropped). > >IGMPv2/V3 messages will never reach the br_multicast.c code to trigger > >switchdev and configure the MDBs on the ports. i am prety sure there > >are other > >fail scenarios which i haven't discovered already, but those 2 are the > >most > >basic ones. If we add VLAN1 on the CPU port, everything works as > >intended of > >course. > > > >That's the reason we registered sw0p0 as the CPU port. It can't do any > >"real" > >traffic, but you can configure the CPU port independantly and not be > >forced to > >do an OR on every VLAN add/delete grouping the CPU port with your port > >command. > >The TL;DR version of this is that the switch is working exactly as > >switchdev is > >expecting offloading traffic to the hardware when possible as long as > >the CPU > >port is member of the proper VLANs > > > >Petr's patch solves this for us > >(9c86ce2c1ae337fc10568a12aea812ed03de8319). > >We can now do "bridge vlan add dev br0 vid 100 pvid untagged self" and > >decide > >when to add the CPU port or not. > > > >There are still a couple of cases that are not covered though, if we > >don't > >register the CPU port. We cant decide when to forward multicast > >traffic on the CPU port if a join hasn't been sent from br0. > >So let's say you got 2 hosts doing multicast and for whatever reason > >the host > >wants to see that traffic. > >With the CPU port present you can do a > >"bridge mdb add dev br0 port sw0p0 grp 239.1.1.1 permanent" which will > >offload > >the traffic to the CPU port and thus the host. If this goes away we are > >forced > >to send a join. > > Thanks for the detailed explanation. Somehow I was under the impression that cpsw had the ability, through specific DMA descriptor bits to direct traffic towards one external port or another and conversely, have that information from the HW when receiving packets. That's one mode of operation when by-passing the ALE if my understanding of the hardware is correct. You can choose not to do that. I am still earning all the details of the ahrdware myself. On Rx though you still need the CPU to participate to receive the packet(and yes the descriptor indicates the port) > What you describe is exactly the same problem we have in DSA when the switch advertises DSA_TAG_PROTO_NONE where only VLAN tags could help differentiate traffic from external ports. At some point there was a discuss of making DSA_TAG_PROTO_NONE automatically create one VLAN per port but this is a good source for other problems... I am pretty sure i am not the first one to encounter this kind of problems > > Looking forward to your follow-up patch series! Will do! > > -- > Florian Thanks Ilias ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 4/4] cpsw: add switchdev support 2018-06-02 10:34 ` Ilias Apalodimas 2018-06-02 16:10 ` Florian Fainelli @ 2018-06-05 21:03 ` Grygorii Strashko 2018-06-05 21:37 ` Florian Fainelli 1 sibling, 1 reply; 53+ messages in thread From: Grygorii Strashko @ 2018-06-05 21:03 UTC (permalink / raw) To: Ilias Apalodimas, Florian Fainelli Cc: Andrew Lunn, netdev, ivan.khoronzhuk, nsekhar, jiri, ivecera, francois.ozog, yogeshs, spatton On 06/02/2018 05:34 AM, Ilias Apalodimas wrote: > Hi Florian, > > Thanks for taking time to look into this > > On Fri, Jun 01, 2018 at 02:48:48PM -0700, Florian Fainelli wrote: >> >> >> On 05/24/2018 09:56 PM, Ilias Apalodimas wrote: >>> On Thu, May 24, 2018 at 06:39:04PM +0200, Andrew Lunn wrote: >>>> On Thu, May 24, 2018 at 04:32:34PM +0300, Ilias Apalodimas wrote: >>>>> On Thu, May 24, 2018 at 03:12:29PM +0200, Andrew Lunn wrote: >>>>>> Device tree is supposed to describe the hardware. Using that hardware >>>>>> in different ways is not something you should describe in DT. >>>>>> >>>>> The new switchdev mode is applied with a .config option in the kernel. What you >>>>> see is pre-existing code, so i am not sure if i should change it in this >>>>> patchset. >>>> >>>> If you break the code up into a library and two drivers, it becomes a >>>> moot point. >>> Agree >>> >>>> >>>> But what i don't like here is that the device tree says to do dual >>>> mac. But you ignore that and do sometime else. I would prefer that if >>>> DT says dual mac, and switchdev is compiled in, the probe fails with >>>> EINVAL. Rather than ignore something, make it clear it is invalid. >>> The switch has 3 modes of operation as is. >>> 1. switch mode, to enable that you don't need to add anything on >>> the DTS and linux registers a single netdev interface. >>> 2. dual mac mode, this is when you need to add dual_emac; on the DTS. >>> 3. switchdev mode which is controlled by a .config option, since as you >>> pointed out DTS was not made for controlling config options. >>> >>> I agree that this is far from beautiful. If the driver remains as in though, >>> i'd prefer either keeping what's there or making "switchdev" a DTS option, >>> following the pre-existing erroneous usage rather than making the device >>> unusable. If we end up returning some error and refuse to initialize, users >>> that remote upgrade their equipment, without taking a good look at changelog, >>> will loose access to their devices with no means of remotely fixing that. >> >> It seems to me that the mistake here is seeing multiple modes of >> operations for the cpsw. There are not actually many, there is one >> usage, and then there is what you can and cannot offload. > CPSW has in fact 2 modes of operation, different FIFO usage/lookup entry(it's > called ALE in the current driver) by-pass(which is used in dual emac for > example) and other features. Again Grygorii is better suited to answer the > exact differences. dual_mac is HW enabled mode of operation, so having DT option is pretty reasonable as for me. 1) when enabled it configures internal FIFOs in special way so both external Ports became equal in the direction toward to Host port 0. TRM "The intention of this mode is to allow packets from both ethernet ports to be written into the FIFO without one port starving the other port." 2) ALE, out of the box, does not support this mode and, as result, two default vlan have to be created to direct traffic P1->P0 (vlan1) and P2-P0 (vlan2), but any packets P0->P1 and P0->P2 have to be directed (and will bypass ALE). This way traffic separated on cpsw egress towards to P0, >> The basic >> premise with switchdev and DSA (which uses switchdev) is that each >> user-facing port of your switch needs to work as if it were a normal >> Ethernet NIC, that is what you call dual-MAC I believe. Then, when you >> create a bridge and you enslave those ports into the bridge, you need to >> have forwarding done in hardware between these two ports when the >> SMAC/DMAC are not for the host/CPU/management interface and you must >> simultaneously still have the host have the ability to send/receive >> traffic through the bridge device. TRM "When operating in dual mac mode the intention is to transfer packets between ports 0 and 1 and ports 0 and 2, but not between ports 1 and 2. Each CPGMAC_SL appears as a single MAC with no bridging between MAC’s. Each CPGMAC_SL has at least one unique (not the same) mac address." So, we cant't talk about forwarding in dual_mac mode. Interfaces expected to be completely independent without any packet leaking between interfaces. !! BUT !! CPSW ALE (switch core) still expected to be used, but for packet filtering - only registered vlans - only registered mcast/bcast - ingress mcast/bcast rate limiting (it's actually more like coalescing - limits number of mcast/bcast packets per sec. And all offloading ALE (val/mdb) entries should always contain two ports in masks: p1&p0 or p2&p0. Never ever all three ports. > Yes dual emac does that. But dual emac configures the port facing VLAN to the > CPU port as well. So dual emac splits and uses 2 interfaces. VLAN 1 is > configured on port1 + CPU port and VLAN 2 is confired on port 2 + CPU port > That's exactly what the current RFC does as well, with the addition of > registering a sw0p0 (i'll explain why later on this mail) > A little more detail on the issue we are having. On my description > sw0p0 -> CPU port, sw0p1 -> port 1 sw0p2 -> port 2. sw0p1/sw0p2 are the ports > that have PHYs attached. > > When we start in the new switchdev mode all interfaces are added to VLAN 0 > so CPU port + port1 + port2 are all in the same VLAN group. In that case sw0p1 > and sw0p2 are working as you describe. So those 2 interfaces can send/receive > traffic normally which matches the switchdev case. > > When we add them on a bridge(let's say br0), VLAN1(or any default bridge VLAN) > is now configured on sw0p1 and sw0p2 but *not* on the CPU port. > From this point on the whole fuunctionality just collapses. The switch will > work and offload traffic between sw0p1/sw0p2 but the bridge won't be able to > get an ip address (since VLAN1 is not a member of the CPU port and the packet > gets dropped). > IGMPv2/V3 messages will never reach the br_multicast.c code to trigger > switchdev and configure the MDBs on the ports. i am prety sure there are other > fail scenarios which i haven't discovered already, but those 2 are the most > basic ones. If we add VLAN1 on the CPU port, everything works as intended of > course. > > That's the reason we registered sw0p0 as the CPU port. It can't do any "real" > traffic, but you can configure the CPU port independantly and not be forced to > do an OR on every VLAN add/delete grouping the CPU port with your port command. > The TL;DR version of this is that the switch is working exactly as switchdev is > expecting offloading traffic to the hardware when possible as long as the CPU > port is member of the proper VLANs > > Petr's patch solves this for us (9c86ce2c1ae337fc10568a12aea812ed03de8319). > We can now do "bridge vlan add dev br0 vid 100 pvid untagged self" and decide > when to add the CPU port or not. > > There are still a couple of cases that are not covered though, if we don't > register the CPU port. We cant decide when to forward multicast > traffic on the CPU port if a join hasn't been sent from br0. > So let's say you got 2 hosts doing multicast and for whatever reason the host > wants to see that traffic. > With the CPU port present you can do a > "bridge mdb add dev br0 port sw0p0 grp 239.1.1.1 permanent" which will offload > the traffic to the CPU port and thus the host. If this goes away we are forced > to send a join. > >> It seems to me like this is entirely doable given that the dual MAC use >> case is supported already. >> >> switchdev is just a stateless framework to get notified from the >> networking stack about what you can possibly offload in hardware, so >> having a DTS option gate that is unfortunately wrong because it is >> really implementing a SW policy in DTS which is not what it is meant for. > The DTS option for configuration pre-existed, i don't like that either switchdev > mode is activated by a .config option not DTS(it just overrides whatever config > you have on the DTS). Far from pretty though fair point, i am open to > suggestions. Again this is option describing HW mode which not expected to be changed on the fly. I'm honestly, propose descope dual_mac and concentrate on switch mode (switchdev) - right now I don't see how dual_mac can be supported with switchdev as per above. The same way as I do not see how we can re-use switchdev with 50% of devices which have "only one" user-facing external port (P1 or P2). -- regards, -grygorii ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 4/4] cpsw: add switchdev support 2018-06-05 21:03 ` Grygorii Strashko @ 2018-06-05 21:37 ` Florian Fainelli 0 siblings, 0 replies; 53+ messages in thread From: Florian Fainelli @ 2018-06-05 21:37 UTC (permalink / raw) To: Grygorii Strashko, Ilias Apalodimas Cc: Andrew Lunn, netdev, ivan.khoronzhuk, nsekhar, jiri, ivecera, francois.ozog, yogeshs, spatton On 06/05/2018 02:03 PM, Grygorii Strashko wrote: > > > On 06/02/2018 05:34 AM, Ilias Apalodimas wrote: >> Hi Florian, >> >> Thanks for taking time to look into this >> >> On Fri, Jun 01, 2018 at 02:48:48PM -0700, Florian Fainelli wrote: >>> >>> >>> On 05/24/2018 09:56 PM, Ilias Apalodimas wrote: >>>> On Thu, May 24, 2018 at 06:39:04PM +0200, Andrew Lunn wrote: >>>>> On Thu, May 24, 2018 at 04:32:34PM +0300, Ilias Apalodimas wrote: >>>>>> On Thu, May 24, 2018 at 03:12:29PM +0200, Andrew Lunn wrote: >>>>>>> Device tree is supposed to describe the hardware. Using that hardware >>>>>>> in different ways is not something you should describe in DT. >>>>>>> >>>>>> The new switchdev mode is applied with a .config option in the kernel. What you >>>>>> see is pre-existing code, so i am not sure if i should change it in this >>>>>> patchset. >>>>> >>>>> If you break the code up into a library and two drivers, it becomes a >>>>> moot point. >>>> Agree >>>> >>>>> >>>>> But what i don't like here is that the device tree says to do dual >>>>> mac. But you ignore that and do sometime else. I would prefer that if >>>>> DT says dual mac, and switchdev is compiled in, the probe fails with >>>>> EINVAL. Rather than ignore something, make it clear it is invalid. >>>> The switch has 3 modes of operation as is. >>>> 1. switch mode, to enable that you don't need to add anything on >>>> the DTS and linux registers a single netdev interface. >>>> 2. dual mac mode, this is when you need to add dual_emac; on the DTS. >>>> 3. switchdev mode which is controlled by a .config option, since as you >>>> pointed out DTS was not made for controlling config options. >>>> >>>> I agree that this is far from beautiful. If the driver remains as in though, >>>> i'd prefer either keeping what's there or making "switchdev" a DTS option, >>>> following the pre-existing erroneous usage rather than making the device >>>> unusable. If we end up returning some error and refuse to initialize, users >>>> that remote upgrade their equipment, without taking a good look at changelog, >>>> will loose access to their devices with no means of remotely fixing that. >>> >>> It seems to me that the mistake here is seeing multiple modes of >>> operations for the cpsw. There are not actually many, there is one >>> usage, and then there is what you can and cannot offload. > >> CPSW has in fact 2 modes of operation, different FIFO usage/lookup entry(it's >> called ALE in the current driver) by-pass(which is used in dual emac for >> example) and other features. Again Grygorii is better suited to answer the >> exact differences. > > dual_mac is HW enabled mode of operation, so having DT option is pretty > reasonable as for me. > 1) when enabled it configures internal FIFOs in special way so both > external Ports became equal in the direction toward to Host port 0. > > TRM "The intention of this mode is to allow packets from both ethernet ports > to be written into the FIFO without one port starving the other port." > > 2) ALE, out of the box, does not support this mode and, as result, two > default vlan have to be created to direct traffic P1->P0 (vlan1) and > P2-P0 (vlan2), but any packets P0->P1 and P0->P2 have to be directed > (and will bypass ALE). This way traffic separated on cpsw egress towards to P0, > >>> The basic >>> premise with switchdev and DSA (which uses switchdev) is that each >>> user-facing port of your switch needs to work as if it were a normal >>> Ethernet NIC, that is what you call dual-MAC I believe. Then, when you >>> create a bridge and you enslave those ports into the bridge, you need to >>> have forwarding done in hardware between these two ports when the >>> SMAC/DMAC are not for the host/CPU/management interface and you must >>> simultaneously still have the host have the ability to send/receive >>> traffic through the bridge device. > > TRM > "When operating in dual mac mode the intention is to transfer packets between ports 0 and 1 and ports 0 > and 2, but not between ports 1 and 2. Each CPGMAC_SL appears as a single MAC with no bridging > between MAC’s. Each CPGMAC_SL has at least one unique (not the same) mac address." > > So, we cant't talk about forwarding in dual_mac mode. Interfaces expected to be > completely independent without any packet leaking between interfaces. > > !! BUT !! CPSW ALE (switch core) still expected to be used, but for packet filtering > - only registered vlans > - only registered mcast/bcast > - ingress mcast/bcast rate limiting (it's actually more like coalescing - > limits number of mcast/bcast packets per sec. > > And all offloading ALE (val/mdb) entries should always contain two ports in masks: > p1&p0 or p2&p0. Never ever all three ports. > >> Yes dual emac does that. But dual emac configures the port facing VLAN to the >> CPU port as well. So dual emac splits and uses 2 interfaces. VLAN 1 is >> configured on port1 + CPU port and VLAN 2 is confired on port 2 + CPU port >> That's exactly what the current RFC does as well, with the addition of >> registering a sw0p0 (i'll explain why later on this mail) >> A little more detail on the issue we are having. On my description >> sw0p0 -> CPU port, sw0p1 -> port 1 sw0p2 -> port 2. sw0p1/sw0p2 are the ports >> that have PHYs attached. >> >> When we start in the new switchdev mode all interfaces are added to VLAN 0 >> so CPU port + port1 + port2 are all in the same VLAN group. In that case sw0p1 >> and sw0p2 are working as you describe. So those 2 interfaces can send/receive >> traffic normally which matches the switchdev case. >> >> When we add them on a bridge(let's say br0), VLAN1(or any default bridge VLAN) >> is now configured on sw0p1 and sw0p2 but *not* on the CPU port. >> From this point on the whole fuunctionality just collapses. The switch will >> work and offload traffic between sw0p1/sw0p2 but the bridge won't be able to >> get an ip address (since VLAN1 is not a member of the CPU port and the packet >> gets dropped). >> IGMPv2/V3 messages will never reach the br_multicast.c code to trigger >> switchdev and configure the MDBs on the ports. i am prety sure there are other >> fail scenarios which i haven't discovered already, but those 2 are the most >> basic ones. If we add VLAN1 on the CPU port, everything works as intended of >> course. >> >> That's the reason we registered sw0p0 as the CPU port. It can't do any "real" >> traffic, but you can configure the CPU port independantly and not be forced to >> do an OR on every VLAN add/delete grouping the CPU port with your port command. >> The TL;DR version of this is that the switch is working exactly as switchdev is >> expecting offloading traffic to the hardware when possible as long as the CPU >> port is member of the proper VLANs >> >> Petr's patch solves this for us (9c86ce2c1ae337fc10568a12aea812ed03de8319). >> We can now do "bridge vlan add dev br0 vid 100 pvid untagged self" and decide >> when to add the CPU port or not. >> >> There are still a couple of cases that are not covered though, if we don't >> register the CPU port. We cant decide when to forward multicast >> traffic on the CPU port if a join hasn't been sent from br0. >> So let's say you got 2 hosts doing multicast and for whatever reason the host >> wants to see that traffic. >> With the CPU port present you can do a >> "bridge mdb add dev br0 port sw0p0 grp 239.1.1.1 permanent" which will offload >> the traffic to the CPU port and thus the host. If this goes away we are forced >> to send a join. >> >>> It seems to me like this is entirely doable given that the dual MAC use >>> case is supported already. >>> >>> switchdev is just a stateless framework to get notified from the >>> networking stack about what you can possibly offload in hardware, so >>> having a DTS option gate that is unfortunately wrong because it is >>> really implementing a SW policy in DTS which is not what it is meant for. >> The DTS option for configuration pre-existed, i don't like that either switchdev >> mode is activated by a .config option not DTS(it just overrides whatever config >> you have on the DTS). Far from pretty though fair point, i am open to >> suggestions. > > Again this is option describing HW mode which not expected to be changed on the fly. > I'm honestly, propose descope dual_mac and concentrate on switch mode (switchdev) - > right now I don't see how dual_mac can be supported with switchdev as per above. > The same way as I do not see how we can re-use switchdev with 50% of devices which > have "only one" user-facing external port (P1 or P2). This is still not appropriate for Device Tree, because this is completely orthogonal from one another. Also, I think you tend to conflate what the switch can do, with in which mode does it start by default, which are, again, two different things. -- Florian ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-24 6:56 [PATCH 0/4] RFC CPSW switchdev mode Ilias Apalodimas ` (3 preceding siblings ...) 2018-05-24 6:56 ` [PATCH 4/4] cpsw: add switchdev support Ilias Apalodimas @ 2018-05-24 8:05 ` Jiri Pirko 2018-05-24 8:48 ` Ilias Apalodimas 2018-06-01 21:29 ` Grygorii Strashko 5 siblings, 1 reply; 53+ messages in thread From: Jiri Pirko @ 2018-05-24 8:05 UTC (permalink / raw) To: Ilias Apalodimas Cc: netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, ivecera, francois.ozog, yogeshs, spatton Thu, May 24, 2018 at 08:56:20AM CEST, ilias.apalodimas@linaro.org wrote: >Hello, > >This is adding a new mode on the cpsw driver based around switchdev. >In order to enable this you need to enable CONFIG_NET_SWITCHDEV, >CONFIG_BRIDGE_VLAN_FILTERING, CONFIG_TI_CPSW_SWITCHDEV >and add to udev config: > >SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="0f011900", \ > ATTR{phys_port_name}!="", NAME="sw0$attr{phys_port_name}" >Since the phys_switch_id is based on cpsw version, users with different >version will need to do 'ip -d link show dev sw0p0 | grep switchid' and >replace with the correct value. > >This patch creates 3 ports, sw0p0, sw0p1 and sw0p2. >sw0p1 and sw0p2 are the netdev interfaces connected to PHY devices >while sw0p0 is the switch 'cpu facing port'. Any reason you need cpu port? We don't need it in mlxsw and also in dsa. What is this device? Could you give me some pointer to description? ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-24 8:05 ` [PATCH 0/4] RFC CPSW switchdev mode Jiri Pirko @ 2018-05-24 8:48 ` Ilias Apalodimas 2018-05-24 12:54 ` Andrew Lunn 0 siblings, 1 reply; 53+ messages in thread From: Ilias Apalodimas @ 2018-05-24 8:48 UTC (permalink / raw) To: Jiri Pirko Cc: netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, ivecera, francois.ozog, yogeshs, spatton On Thu, May 24, 2018 at 10:05:28AM +0200, Jiri Pirko wrote: > Thu, May 24, 2018 at 08:56:20AM CEST, ilias.apalodimas@linaro.org wrote: > Any reason you need cpu port? We don't need it in mlxsw and also in dsa. Yes i've seen that on mlxsw/rocker drivers and i was reluctant adding one here. The reason is that TI wants this configured differently from customer facing ports. Apparently there are existing customers already using the "feature". So OR'ing and adding the cpu port on every operation (add/del vlans add ucast/mcast entries etc) was less favoured. > > What is this device? Could you give me some pointer to description? This is the switch used on TI's AM5728 and BBB boards. I am pretty sure there are other platforms i am not aware of. http://www.ti.com/lit/ug/spruhz6j/spruhz6j.pdf is the techincal reference manual. Section 24.11.5.4 "Initialization and Configuration of CPSW" is the switch part. Thanks, Ilias ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-24 8:48 ` Ilias Apalodimas @ 2018-05-24 12:54 ` Andrew Lunn 2018-05-24 13:44 ` Ivan Vecera 0 siblings, 1 reply; 53+ messages in thread From: Andrew Lunn @ 2018-05-24 12:54 UTC (permalink / raw) To: Ilias Apalodimas Cc: Jiri Pirko, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, ivecera, francois.ozog, yogeshs, spatton On Thu, May 24, 2018 at 11:48:31AM +0300, Ilias Apalodimas wrote: > On Thu, May 24, 2018 at 10:05:28AM +0200, Jiri Pirko wrote: > > Thu, May 24, 2018 at 08:56:20AM CEST, ilias.apalodimas@linaro.org wrote: > > Any reason you need cpu port? We don't need it in mlxsw and also in dsa. > Yes i've seen that on mlxsw/rocker drivers and i was reluctant adding one here. > The reason is that TI wants this configured differently from customer facing > ports. Apparently there are existing customers already using the "feature". > So OR'ing and adding the cpu port on every operation (add/del vlans add > ucast/mcast entries etc) was less favoured. Hi Ilias Nice to see this device moving away from its custom model and towards the switchdev model. Did you consider making a clean break from the existing code and write a new driver. Let the existing customers using the existing driver. Have the new switchdev driver fully conform to switchdev. I don't like having this 'cpu' interface. As you say, it breaks the switchhdev model. If we need to extend the switchdev model to support some use case, lets do that. Please can you fully describe the use cases, so we can discuss how to implement them cleanly within the switchdev model. Thanks Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-24 12:54 ` Andrew Lunn @ 2018-05-24 13:44 ` Ivan Vecera 2018-05-24 14:08 ` Ilias Apalodimas 0 siblings, 1 reply; 53+ messages in thread From: Ivan Vecera @ 2018-05-24 13:44 UTC (permalink / raw) To: Andrew Lunn, Ilias Apalodimas Cc: Jiri Pirko, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton On 24.5.2018 14:54, Andrew Lunn wrote: > On Thu, May 24, 2018 at 11:48:31AM +0300, Ilias Apalodimas wrote: >> On Thu, May 24, 2018 at 10:05:28AM +0200, Jiri Pirko wrote: >>> Thu, May 24, 2018 at 08:56:20AM CEST, ilias.apalodimas@linaro.org wrote: >>> Any reason you need cpu port? We don't need it in mlxsw and also in dsa. >> Yes i've seen that on mlxsw/rocker drivers and i was reluctant adding one here. >> The reason is that TI wants this configured differently from customer facing >> ports. Apparently there are existing customers already using the "feature". >> So OR'ing and adding the cpu port on every operation (add/del vlans add >> ucast/mcast entries etc) was less favoured. > > Hi Ilias > > Nice to see this device moving away from its custom model and towards > the switchdev model. +1 > Did you consider making a clean break from the existing code and write > a new driver. Let the existing customers using the existing > driver. Have the new switchdev driver fully conform to switchdev. I would also prefer fresh new driver. The existing one can be marked as 'bugfix-only' and later pertinently deprecated/removed. > > I don't like having this 'cpu' interface. As you say, it breaks the > switchhdev model. If we need to extend the switchdev model to support > some use case, lets do that. Please can you fully describe the use > cases, so we can discuss how to implement them cleanly within the > switchdev model. +1 Ivan ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-24 13:44 ` Ivan Vecera @ 2018-05-24 14:08 ` Ilias Apalodimas 2018-05-24 14:54 ` Andrew Lunn 2018-06-02 23:28 ` Grygorii Strashko 0 siblings, 2 replies; 53+ messages in thread From: Ilias Apalodimas @ 2018-05-24 14:08 UTC (permalink / raw) To: Ivan Vecera Cc: Andrew Lunn, Jiri Pirko, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton On Thu, May 24, 2018 at 03:44:54PM +0200, Ivan Vecera wrote: > On 24.5.2018 14:54, Andrew Lunn wrote: > > On Thu, May 24, 2018 at 11:48:31AM +0300, Ilias Apalodimas wrote: > >> On Thu, May 24, 2018 at 10:05:28AM +0200, Jiri Pirko wrote: > >>> Thu, May 24, 2018 at 08:56:20AM CEST, ilias.apalodimas@linaro.org wrote: > >>> Any reason you need cpu port? We don't need it in mlxsw and also in dsa. > >> Yes i've seen that on mlxsw/rocker drivers and i was reluctant adding one here. > >> The reason is that TI wants this configured differently from customer facing > >> ports. Apparently there are existing customers already using the "feature". > >> So OR'ing and adding the cpu port on every operation (add/del vlans add > >> ucast/mcast entries etc) was less favoured. > > > > Hi Ilias > > > > Nice to see this device moving away from its custom model and towards > > the switchdev model. > +1 Thanks. To be honest it opens up so many posibilities for common configuration from userspace across vendors that doing something new without it doesn't make any sense (at least to me). > > > Did you consider making a clean break from the existing code and write > > a new driver. Let the existing customers using the existing > > driver. Have the new switchdev driver fully conform to switchdev. > > I would also prefer fresh new driver. The existing one can be marked as > 'bugfix-only' and later pertinently deprecated/removed. Yes, but given the driver and the platforms it's used at, we ended up patching the existing driver. I am not opposed to the idea, but Grygorii is more suited to reply on that. > > > > I don't like having this 'cpu' interface. As you say, it breaks the > > switchhdev model. If we need to extend the switchdev model to support > > some use case, lets do that. Please can you fully describe the use > > cases, so we can discuss how to implement them cleanly within the > > switchdev model. > +1 There's configuration needs from customers adding or not adding a VLAN to the CPU port. In my configuration examples for instance, if the cpu port is not added to the bridge, you cannot get an ip address on it. Similar cases exist for customers on adding MDBs as far as i know. So they want the "customer facing ports" to have the MDBs present but not the cpu port. In some cases (where the CPE/device that has the switch) participates in the traffic they want the cpu port to have the samne MDBs installed. This is just two simple cases that come in mind, again Grygorii is more suited to answer and explain existing/more complex use cases better than me. Adding a cpu port that cannot transmit or receive traffic is a bit "weird", on the other hand you can access it's configuration using the same userspace tools and the same commands you do for the "normal" ports. Extending switchdev might be the proper solution here. Ilias ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-24 14:08 ` Ilias Apalodimas @ 2018-05-24 14:54 ` Andrew Lunn 2018-05-24 15:07 ` Ilias Apalodimas 2018-06-02 23:28 ` Grygorii Strashko 1 sibling, 1 reply; 53+ messages in thread From: Andrew Lunn @ 2018-05-24 14:54 UTC (permalink / raw) To: Ilias Apalodimas Cc: Ivan Vecera, Jiri Pirko, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton > There's configuration needs from customers adding or not adding a VLAN to the > CPU port. In my configuration examples for instance, if the cpu port is not > added to the bridge, you cannot get an ip address on it. If you cannot get an IP address, it is plain broken. The whole idea is that switch port interfaces are just linux interfaces. A linux interface which cannot get an IP address is broken. > Similar cases exist for customers on adding MDBs as far as i know. So they want > the "customer facing ports" to have the MDBs present but not the cpu port. That i can understand. And it should actually work now with switchdev. It performs IGMP snooping, and if there is nothing joining the group on the CPU, it won't add an MDB entry to forward traffic to the CPU. > Adding a cpu port that cannot transmit or receive traffic is a bit "weird" And how is it supposed to send BPDUs? STP is going to be broken.... Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-24 14:54 ` Andrew Lunn @ 2018-05-24 15:07 ` Ilias Apalodimas 2018-05-24 15:25 ` Andrew Lunn 0 siblings, 1 reply; 53+ messages in thread From: Ilias Apalodimas @ 2018-05-24 15:07 UTC (permalink / raw) To: Andrew Lunn Cc: Ivan Vecera, Jiri Pirko, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton On Thu, May 24, 2018 at 04:54:41PM +0200, Andrew Lunn wrote: > If you cannot get an IP address, it is plain broken. The whole idea is > that switch port interfaces are just linux interfaces. A linux > interface which cannot get an IP address is broken. The switch interfaces can get ip addresses just like every linux interface. The cpu port can't (sw0p0) > > > Similar cases exist for customers on adding MDBs as far as i know. So they want > > the "customer facing ports" to have the MDBs present but not the cpu port. > > That i can understand. And it should actually work now with > switchdev. It performs IGMP snooping, and if there is nothing joining > the group on the CPU, it won't add an MDB entry to forward traffic to > the CPU. Yes, but this should be configurable (i.e the customer can deny adding the MDB on the cpu port) > > > Adding a cpu port that cannot transmit or receive traffic is a bit "weird" > > And how is it supposed to send BPDUs? STP is going to be broken.... Not sure about this, i'll have to check Regards Ilias ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-24 15:07 ` Ilias Apalodimas @ 2018-05-24 15:25 ` Andrew Lunn 2018-05-24 16:02 ` Ilias Apalodimas 0 siblings, 1 reply; 53+ messages in thread From: Andrew Lunn @ 2018-05-24 15:25 UTC (permalink / raw) To: Ilias Apalodimas Cc: Ivan Vecera, Jiri Pirko, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton > > That i can understand. And it should actually work now with > > switchdev. It performs IGMP snooping, and if there is nothing joining > > the group on the CPU, it won't add an MDB entry to forward traffic to > > the CPU. > Yes, but this should be configurable (i.e the customer can deny adding the MDB > on the cpu port) O.K, back to the basic idea. Switch ports are just normal Linux interfaces. How would you configure this with two e1000e put in a bridge? I want multicast to be bridged between the two e1000e, but the host stack should not see the packets. Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-24 15:25 ` Andrew Lunn @ 2018-05-24 16:02 ` Ilias Apalodimas 2018-05-24 16:33 ` Andrew Lunn 0 siblings, 1 reply; 53+ messages in thread From: Ilias Apalodimas @ 2018-05-24 16:02 UTC (permalink / raw) To: Andrew Lunn Cc: Ivan Vecera, Jiri Pirko, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton On Thu, May 24, 2018 at 05:25:59PM +0200, Andrew Lunn wrote: > O.K, back to the basic idea. Switch ports are just normal Linux > interfaces. > > How would you configure this with two e1000e put in a bridge? I want > multicast to be bridged between the two e1000e, but the host stack > should not see the packets. I am not sure i am following. I might be missing something. In your case you got two ethernet pci/pcie interfaces bridged through software. You can filter those if needed. In the case we are trying to cover, you got a hardware that offers that capability. Since not all switches are pcie based shouldn't we be able to allow this ? Regards Ilias ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-24 16:02 ` Ilias Apalodimas @ 2018-05-24 16:33 ` Andrew Lunn 2018-05-25 6:29 ` Ilias Apalodimas 0 siblings, 1 reply; 53+ messages in thread From: Andrew Lunn @ 2018-05-24 16:33 UTC (permalink / raw) To: Ilias Apalodimas Cc: Ivan Vecera, Jiri Pirko, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton On Thu, May 24, 2018 at 07:02:54PM +0300, Ilias Apalodimas wrote: > On Thu, May 24, 2018 at 05:25:59PM +0200, Andrew Lunn wrote: > > O.K, back to the basic idea. Switch ports are just normal Linux > > interfaces. > > > > How would you configure this with two e1000e put in a bridge? I want > > multicast to be bridged between the two e1000e, but the host stack > > should not see the packets. > I am not sure i am following. I might be missing something. In your case you > got two ethernet pci/pcie interfaces bridged through software. You can filter > those if needed. In the case we are trying to cover, you got a hardware that > offers that capability. Since not all switches are pcie based shouldn't we be > able to allow this ? switchdev is about offloading what Linux can do to hardware to accelerate it. The switch is a block of accelerator hardware, like a GPU is for accelerating graphics. Linux can render OpenGL, but it is better to hand it over to the GPU accelerator. Same applies here. The Linux bridge can bridge multicast. Using the switchdev API, you can push that down to the accelerator, and let it do it. So you need to think about, how do you make the Linux bridge not pass multicast traffic to the host stack. Then how do you extend the switchdev API so you can push this down to the accelerator. To really get switchdev, you often need to pivot your point of view a bit. People often think, switchdev is about writing drivers for switches. Its not, its about how you offload networking which Linux can do down to a switch. And if the switch cannot accelerate it, you leave Linux to do it. When you get in the details, i think you will find the switchdev API actually already has what you need for this use case. What you need to figure out is how you make the Linux bridge not pass multicast to the host. Well, actually, not pass multicast it has not asked for. Then accelerate it. Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-24 16:33 ` Andrew Lunn @ 2018-05-25 6:29 ` Ilias Apalodimas 2018-05-25 10:28 ` Ilias Apalodimas 2018-05-25 12:09 ` Andrew Lunn 0 siblings, 2 replies; 53+ messages in thread From: Ilias Apalodimas @ 2018-05-25 6:29 UTC (permalink / raw) To: Andrew Lunn Cc: Ivan Vecera, Jiri Pirko, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton On Thu, May 24, 2018 at 06:33:10PM +0200, Andrew Lunn wrote: > On Thu, May 24, 2018 at 07:02:54PM +0300, Ilias Apalodimas wrote: > > On Thu, May 24, 2018 at 05:25:59PM +0200, Andrew Lunn wrote: > > > O.K, back to the basic idea. Switch ports are just normal Linux > > > interfaces. > > > > > > How would you configure this with two e1000e put in a bridge? I want > > > multicast to be bridged between the two e1000e, but the host stack > > > should not see the packets. > > I am not sure i am following. I might be missing something. In your case you > > got two ethernet pci/pcie interfaces bridged through software. You can filter > > those if needed. In the case we are trying to cover, you got a hardware that > > offers that capability. Since not all switches are pcie based shouldn't we be > > able to allow this ? > > switchdev is about offloading what Linux can do to hardware to > accelerate it. The switch is a block of accelerator hardware, like a > GPU is for accelerating graphics. Linux can render OpenGL, but it is > better to hand it over to the GPU accelerator. > > Same applies here. The Linux bridge can bridge multicast. Using the > switchdev API, you can push that down to the accelerator, and let it > do it. > > So you need to think about, how do you make the Linux bridge not pass > multicast traffic to the host stack. Then how do you extend the > switchdev API so you can push this down to the accelerator. > > To really get switchdev, you often need to pivot your point of view a > bit. People often think, switchdev is about writing drivers for > switches. Its not, its about how you offload networking which Linux > can do down to a switch. And if the switch cannot accelerate it, you > leave Linux to do it. > > When you get in the details, i think you will find the switchdev API > actually already has what you need for this use case. What you need to > figure out is how you make the Linux bridge not pass multicast to the > host. Well, actually, not pass multicast it has not asked for. Then > accelerate it. > Understood, if we missed back anything on handling multicast for the cpu port we'll go back and fix it (i am assuming snooping is the answer here). Multicasting is only one part of the equation though. What about the need for vlans/FDBs on that port though? Ilias ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-25 6:29 ` Ilias Apalodimas @ 2018-05-25 10:28 ` Ilias Apalodimas 2018-05-25 11:59 ` Andrew Lunn 2018-05-25 12:09 ` Andrew Lunn 1 sibling, 1 reply; 53+ messages in thread From: Ilias Apalodimas @ 2018-05-25 10:28 UTC (permalink / raw) To: Andrew Lunn Cc: Ivan Vecera, Jiri Pirko, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton On Fri, May 25, 2018 at 09:29:02AM +0300, Ilias Apalodimas wrote: > On Thu, May 24, 2018 at 06:33:10PM +0200, Andrew Lunn wrote: > > On Thu, May 24, 2018 at 07:02:54PM +0300, Ilias Apalodimas wrote: > > > On Thu, May 24, 2018 at 05:25:59PM +0200, Andrew Lunn wrote: > > > > O.K, back to the basic idea. Switch ports are just normal Linux > > > > interfaces. > > > > > > > > How would you configure this with two e1000e put in a bridge? I want > > > > multicast to be bridged between the two e1000e, but the host stack > > > > should not see the packets. > > > I am not sure i am following. I might be missing something. In your case you > > > got two ethernet pci/pcie interfaces bridged through software. You can filter > > > those if needed. In the case we are trying to cover, you got a hardware that > > > offers that capability. Since not all switches are pcie based shouldn't we be > > > able to allow this ? > > > > switchdev is about offloading what Linux can do to hardware to > > accelerate it. The switch is a block of accelerator hardware, like a > > GPU is for accelerating graphics. Linux can render OpenGL, but it is > > better to hand it over to the GPU accelerator. > > > > Same applies here. The Linux bridge can bridge multicast. Using the > > switchdev API, you can push that down to the accelerator, and let it > > do it. > > > > So you need to think about, how do you make the Linux bridge not pass > > multicast traffic to the host stack. Then how do you extend the > > switchdev API so you can push this down to the accelerator. > > > > To really get switchdev, you often need to pivot your point of view a > > bit. People often think, switchdev is about writing drivers for > > switches. Its not, its about how you offload networking which Linux > > can do down to a switch. And if the switch cannot accelerate it, you > > leave Linux to do it. > > > > When you get in the details, i think you will find the switchdev API > > actually already has what you need for this use case. What you need to > > figure out is how you make the Linux bridge not pass multicast to the > > host. Well, actually, not pass multicast it has not asked for. Then > > accelerate it. > > > Understood, if we missed back anything on handling multicast for > the cpu port we'll go back and fix it (i am assuming snooping is the answer > here). Multicasting is only one part of the equation though. What about the > need for vlans/FDBs on that port though? > I just noticed this: https://www.spinics.net/lists/netdev/msg504760.html I tried doing the "bridge vlan add vid 2 dev br0 self" in my initial attempts but didn't get a notification to program the CPU port(with the sef argument). This obviously solves our vlan configuration issue. So it's only static FBDs left. Regards Ilias ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-25 10:28 ` Ilias Apalodimas @ 2018-05-25 11:59 ` Andrew Lunn 0 siblings, 0 replies; 53+ messages in thread From: Andrew Lunn @ 2018-05-25 11:59 UTC (permalink / raw) To: Ilias Apalodimas Cc: Ivan Vecera, Jiri Pirko, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton > So it's only static FBDs left. Please describe the use case. Why would i want to put static FDB addresses on a CPU port? What in the linux network stack needs this? Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-25 6:29 ` Ilias Apalodimas 2018-05-25 10:28 ` Ilias Apalodimas @ 2018-05-25 12:09 ` Andrew Lunn 2018-05-31 15:27 ` Ilias Apalodimas 1 sibling, 1 reply; 53+ messages in thread From: Andrew Lunn @ 2018-05-25 12:09 UTC (permalink / raw) To: Ilias Apalodimas Cc: Ivan Vecera, Jiri Pirko, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton > Understood, if we missed back anything on handling multicast for > the cpu port we'll go back and fix it (i am assuming snooping is the answer > here). It is part of the answer. You should also look at .ndo_set_rx_mode. When the switch ports are not in a bridge, this call i used to pass a list of MAC addresses which the network stack would like to receiver. You probably want to turn that list into MBD and FDBs. Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-25 12:09 ` Andrew Lunn @ 2018-05-31 15:27 ` Ilias Apalodimas 0 siblings, 0 replies; 53+ messages in thread From: Ilias Apalodimas @ 2018-05-31 15:27 UTC (permalink / raw) To: Andrew Lunn Cc: Ivan Vecera, Jiri Pirko, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton Sorry for the late response i had some time to take another look and do some extra testing > switchdev is about offloading what Linux can do to hardware to > accelerate it. The switch is a block of accelerator hardware, like a > GPU is for accelerating graphics. Linux can render OpenGL, but it is > better to hand it over to the GPU accelerator. > > Same applies here. The Linux bridge can bridge multicast. Using the > switchdev API, you can push that down to the accelerator, and let it > do it. > > So you need to think about, how do you make the Linux bridge not pass > multicast traffic to the host stack. Then how do you extend the > switchdev API so you can push this down to the accelerator. > > To really get switchdev, you often need to pivot your point of view a > bit. People often think, switchdev is about writing drivers for > switches. Its not, its about how you offload networking which Linux > can do down to a switch. And if the switch cannot accelerate it, you > leave Linux to do it. > > When you get in the details, i think you will find the switchdev API > actually already has what you need for this use case. What you need to > figure out is how you make the Linux bridge not pass multicast to the > host. Well, actually, not pass multicast it has not asked for. Then > accelerate it. The current driver is already working like that. The difference between the modes of operation is this: By registering the 'cpu port' we choose if the linux host is going to see the br_ip4_multicast_igmp3_report or br_multicast_ipv4_rcv (by configuring the vlan it participates) and trigger switchdev to add the MDBs If the cpu port is member of that VLAN then the dynamic entry shows on 'bridge mdb show' command i.e dev br0 port sw0p1 grp 239.1.1.1 temp offload vid 100 If not the user is able to add it manually. Anyway i got the main points of the RFC, if Petr's patch get accepted i might be able to respin this without registering a CPU port. Regards Ilias ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-24 14:08 ` Ilias Apalodimas 2018-05-24 14:54 ` Andrew Lunn @ 2018-06-02 23:28 ` Grygorii Strashko 2018-06-03 0:08 ` Andrew Lunn ` (3 more replies) 1 sibling, 4 replies; 53+ messages in thread From: Grygorii Strashko @ 2018-06-02 23:28 UTC (permalink / raw) To: Ilias Apalodimas, Ivan Vecera, Andrew Lunn Cc: Jiri Pirko, netdev, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton, Jiri Pirko Hi All, Sry, for delayed reply. On 05/24/2018 09:08 AM, Ilias Apalodimas wrote: > On Thu, May 24, 2018 at 03:44:54PM +0200, Ivan Vecera wrote: >> On 24.5.2018 14:54, Andrew Lunn wrote: >>> On Thu, May 24, 2018 at 11:48:31AM +0300, Ilias Apalodimas wrote: >>>> On Thu, May 24, 2018 at 10:05:28AM +0200, Jiri Pirko wrote: >>>>> Thu, May 24, 2018 at 08:56:20AM CEST, ilias.apalodimas@linaro.org wrote: >>>>> Any reason you need cpu port? We don't need it in mlxsw and also in dsa. >>>> Yes i've seen that on mlxsw/rocker drivers and i was reluctant adding one here. >>>> The reason is that TI wants this configured differently from customer facing >>>> ports. Apparently there are existing customers already using the "feature". >>>> So OR'ing and adding the cpu port on every operation (add/del vlans add >>>> ucast/mcast entries etc) was less favoured. >>> >>> Hi Ilias >>> >>> Nice to see this device moving away from its custom model and towards >>> the switchdev model. >> +1 > Thanks. To be honest it opens up so many posibilities for common configuration > from userspace across vendors that doing something new without it doesn't make > any sense (at least to me). > >> >>> Did you consider making a clean break from the existing code and write >>> a new driver. Let the existing customers using the existing >>> driver. Have the new switchdev driver fully conform to switchdev. >> >> I would also prefer fresh new driver. The existing one can be marked as >> 'bugfix-only' and later pertinently deprecated/removed. > Yes, but given the driver and the platforms it's used at, we ended up patching > the existing driver. I am not opposed to the idea, but Grygorii is more suited > to reply on that. Correct, we considered two options - start from scratch or hack existing driver to get working prototype as fast as possible. Hacking of existing driver was just faster way to go and i agree that new driver might be better approach for the future. > >>> >>> I don't like having this 'cpu' interface. As you say, it breaks the >>> switchhdev model. If we need to extend the switchdev model to support >>> some use case, lets do that. Please can you fully describe the use >>> cases, so we can discuss how to implement them cleanly within the >>> switchdev model. >> +1 > There's configuration needs from customers adding or not adding a VLAN to the > CPU port. In my configuration examples for instance, if the cpu port is not > added to the bridge, you cannot get an ip address on it. > Similar cases exist for customers on adding MDBs as far as i know. So they want > the "customer facing ports" to have the MDBs present but not the cpu port. > In some cases (where the CPE/device that has the switch) participates in the > traffic they want the cpu port to have the samne MDBs installed. > This is just two simple cases that come in mind, again Grygorii is more suited > to answer and explain existing/more complex use cases better than me. > > Adding a cpu port that cannot transmit or receive traffic is a bit "weird", on > the other hand you can access it's configuration using the same userspace tools > and the same commands you do for the "normal" ports. Extending switchdev might > be the proper solution here. I'd try to provide more information about this switch and why we end up adding eth0 port. Please, be patient to me as this is new area for me and I might not be right in some conclusions or can missing smth. *Before this patch set*: Current CPSW driver in switch mode can be described as below +----------------------------------------+ |Linux Host 0 | | | | TI tool (ioctl)/netdev | | + + | +-----------------+----------------------+ | | | Control MMIO | | | Data DMA | | | +------------+-----+-----+-----------------+ | | | | CPDMA | CPSW | | v | | | | | | +-----------+ | | | | P0 | | | | | | | | | +-----+-----+ | | | | FDB: static MAC Port=0 | | +-----+-----+ | | +------> ALE +-----+ | | +------+ | | | | | +-----------+ | | | +---+----+ +----+---+ | | | | | | | | | P1 | | P2 | | +------------------------------------------+ | | | | | PHY | | PHY | +---+----+ +----+---+ | | +---+----+ +---+----+ | | | | | Host 1| | Host 2 | +--------+ +--------+ Note. This is embedded world and in many cases network configuration is static, everything which is not allowed - drop (means no such things like IGMP or even ARP). The core Part of CPSW is ALE module which perform switching ops according to ALE table (fdb/mdb/vlan). From ALE point of view *all* port absolutely equal. And Linux Host 0 in most of customer use cases is the no much different from Hosts 1 or 2, so allowed net traffic to Linux Host 0 have to be very carefully configured as Apps running on it must continue working even if network failed or overloaded (packet storms). So, as per my understanding, P0 meaning here is absolutely not the same as CPU port in DSA. It just another switch port with bad luck to be connected directly to CPU. Current CPSW driver offloads basic switch configuration in HW and additional configuration can be done using TI custom tool (kernel updated to accept additional IOCTL). Current driver creates one netdev ETH0 and all net traffic enter/exit through this device - it possible to distinguish which external port p1/2 received packet or send packet directly to external port p1/2, but current CPSW doesn't do this. Static Unicast FDB entries created in ALE table for Port 0 to specify which unicast traffic need to be accepted by Linux Host 0. By default only registered multicast or broadcast packets forwarded to Linux Host 0. *After this patch set*: goal keep things working the same as max as possible and get rid of TI custom tool. +-------------------------------------------------------+ | Linux Host 0 | | +------------------+ | | | br0 | | | | | | | +---+--------------+ | | | |data | | +-------+--------------+---------------+ | | | | ^ ^ | | | | | | ++-----+ +-+---+-+ +----+-+-+ | | | |sw0p0 | | sw0p1 | |sw0p2 | | | | +------+ +----^--+ +-----^--+ | | | | | | | | +-----------+-------------+ | | | | | +----+v+------------------------------------------------+ | | | Control MMIO | | | Data DMA | | | +------------+-----+-----+-----------------------------+ | | | | CPDMA | CPSW | | v | | | | | | +-----------+ | | | | P0 | | | | | | | | | +-----+-----+ | | | | FDB: static MACP1 port=0 | | | +-----+-----+ FDB: static MACP2 port=0 | | +------> ALE +-----+ | | +------+ | | | | | +-----------+ | | | +---+----+ +----+---+ | | | | | | | | | P1 | | P2 | | +------------------------------------------------------+ | | | | | PHY | | PHY | +--------+ +--------+ In this implementation switch egress traffic to Linux Host 0 always split between sw0p1 and sw0p2 depending on which external port P1/P2 packet was received, sw0p0 doesn't produce any traffic. On switch ingress from Linux Host 0 packets are always sent directly to external ports P1/P2, with assumption that Linux Bridge knows where packet should go, and ALE bypassed in this case. sw0p0 doesn't allows to send any traffic and serves for configuration purposes only. Below I've described some tested use cases (not include full static configuration), but regarding sw0p0 - there is work done by Ivan Khoronzhuk [1] which enables adds MQPRIO and CBS Qdisc and targets AVB network features. It required to offload MQPRIO and CBS parameters on all ports including P0. In case of P0, CPDMA TX channels shapers need to be configured, and in case of sw0p1/sw0p2 internal FIFOS. sw0p0 also expected to be used to configure CPDMA interface in general - number of tx/rx channels, rates, ring sizes. In addition there is set of global CPSW parameters (not related to P1/P2, like MAC Authorization Mode, OUI Deny Mode, crc ) which I've thought can be added to sw0p0 (using ethtool -priv-flags). Additional headache is PTP: we have on PHC, but both external interfaces P1/P2 can timestamp packets. [1] https://lkml.org/lkml/2018/5/18/1134 Below use cases were tried with this approach tried with current LKML, so possible changes to Net/Bridge framework not considered: 1) boot, ping no vlan # ip link add name br0 type bridge # echo 0 > /sys/class/net/br0/bridge/default_pvid # ip link set dev eth2 master br0 # ip link set dev eth0 master br0 # ip link set dev eth1 master br0 # ifconfig br0 192.168.1.2 *Note*: I've had to disable default_pvid as otherwise linux Bridge adds and offloads default vlan 1, but default configuration for CPSW driver is vid 0. + CPSW specific - it can't untag packets for P0. Another option I've found: # ip link set dev br0 type bridge vlan_filtering 1. but anyway, I've found it confusing that Linux bridge adds default vlan when vlan_filtering == 0 2) add vlans. Host 1 - not vlan capable, Host 2/ Linux Host 0 can handle vlans # bridge vlan add dev sw0p1 vid 100 pvid untagged master # bridge vlan add dev sw0p2 vid 100 master # bridge vlan add dev sw0p0 vid 100 master Any combination expected to work. *Note*. Ilias reused IFF_ALLMULTI and IFF_MULTICAST on each interface to configure registered and unregistered multicast ports masks for each ALE VLAN entries. So, # ifconfig eth0 -multicast # ifconfig eth0 --allmulti # bridge vlan add dev sw0p0 vid 100 master expected to stop forwarding any multicast traffic to Linux Host 0 3) MDB Allow mcast 239.1.1.1 between P1/P2 # bridge mdb add dev br0 port sw0p1 grp 239.1.1.1 permanent # bridge mdb add dev br0 port sw0p2 grp 239.1.1.1 permanent Allow mcast 239.1.1.2 between P0/P2 # bridge mdb add dev br0 port sw0p0 grp 239.1.1.2 permanent # bridge mdb add dev br0 port sw0p2 grp 239.1.1.2 permanent *Note !!!!!*. I've found no proper way to add L2 mcast addresses as 01-80-C2-00-00-00 or 01-1B-19-00-00-00. probably I missing smth. 4) stp I did some updates (added port stp state offload/stp mcats addr configuration - will post diff on Monday) and tried stp (kernel STP) by connecting P1/P2 ports to the same switch: # ip link add name br0 type bridge # echo 0 > /sys/class/net/br0/bridge/default_pvid # ip link set dev sw0p2 master br0 # ip link set dev sw0p0 master br0 # ip link set dev sw0p1 master br0 # brctl show # ifconfig br0 192.168.1.2 # brctl stp br0 on # brctl showstp br0 sw0p1 (3) port id 8003 state blocking sw0p2 (1) port id 8001 state forwarding Don't know howto: 1) add FDB entry with "blocked" flag - ALE can discard all packets with SRC/DST address = blocked MAC 2) add multicast MAC address with Supervisory Packet flag set. Such packets will bypass most of checks inside ALE and will be forwarded in all port's states except "disabled". 3) add "unknown vlan configuration" : ALE provides possibility to configure default behavior for tagged packets with "unknown vlan" by configuring - Unknown VLAN Force Untagged Egress ports Mask. - Unknown VLAN Registered Multicast Flood Ports Mask - Unknown VLAN Multicast Flood ports Mask - Unknown VLAN Member ports List 4) The way to detect "brctl stp br0 on/off" If you are here. Thanks for reading this and your patience. -- regards, -grygorii ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-02 23:28 ` Grygorii Strashko @ 2018-06-03 0:08 ` Andrew Lunn 2018-06-05 21:18 ` Grygorii Strashko 2018-06-03 0:26 ` Andrew Lunn ` (2 subsequent siblings) 3 siblings, 1 reply; 53+ messages in thread From: Andrew Lunn @ 2018-06-03 0:08 UTC (permalink / raw) To: Grygorii Strashko Cc: Ilias Apalodimas, Ivan Vecera, Jiri Pirko, netdev, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton On Sat, Jun 02, 2018 at 06:28:22PM -0500, Grygorii Strashko wrote: Hi Grygorii I'm just picking out one thing here... there is lots more good stuff here. > Additional headache is PTP: we have on PHC, but both external interfaces P1/P2 > can timestamp packets. This should not be a problem. The Marvell switches have one PHC, but each port can time stamp packets using this counter. Each port has its own receive and transmit time stamp registers. So i don't think this will cause you problems. Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-03 0:08 ` Andrew Lunn @ 2018-06-05 21:18 ` Grygorii Strashko 2018-06-05 21:28 ` Andrew Lunn 0 siblings, 1 reply; 53+ messages in thread From: Grygorii Strashko @ 2018-06-05 21:18 UTC (permalink / raw) To: Andrew Lunn Cc: Ilias Apalodimas, Ivan Vecera, Jiri Pirko, netdev, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton On 06/02/2018 07:08 PM, Andrew Lunn wrote: > On Sat, Jun 02, 2018 at 06:28:22PM -0500, Grygorii Strashko wrote: > > Hi Grygorii > > I'm just picking out one thing here... there is lots more good stuff here. > >> Additional headache is PTP: we have on PHC, but both external interfaces P1/P2 >> can timestamp packets. > > This should not be a problem. The Marvell switches have one PHC, but > each port can time stamp packets using this counter. Each port has its > own receive and transmit time stamp registers. So i don't think this > will cause you problems. I hope you are right - question is always in number of available options and which one to select - and, most important, explain it to the end user :( For example: phc_index is returned as part of .get_ts_info() = cpsw_get_ts_info(), so which intf should return phc_index? Still not tested, so jut hope ... -- regards, -grygorii ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-05 21:18 ` Grygorii Strashko @ 2018-06-05 21:28 ` Andrew Lunn 2018-06-05 21:42 ` Grygorii Strashko 0 siblings, 1 reply; 53+ messages in thread From: Andrew Lunn @ 2018-06-05 21:28 UTC (permalink / raw) To: Grygorii Strashko Cc: Ilias Apalodimas, Ivan Vecera, Jiri Pirko, netdev, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton > I hope you are right - question is always in number of available options > and which one to select - and, most important, explain it to the end user :( The end customer being ptp4linux? At least for Marvell switches, it is happy about everything except that the switch is a bit slow, so we need to modify some of the time outs in the configuration file. > For example: > phc_index is returned as part of .get_ts_info() = cpsw_get_ts_info(), > so which intf should return phc_index? It is not a 1:1 relationship. See: https://elixir.bootlin.com/linux/latest/source/drivers/net/dsa/mv88e6xxx/hwtstamp.c#L61 All interfaces return the same index. In fact, for a switch, having a PHC per port would be odd. That would mean you need to sync the PHCs in order to act as a boundary clock. Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-05 21:28 ` Andrew Lunn @ 2018-06-05 21:42 ` Grygorii Strashko 2018-06-05 21:55 ` Andrew Lunn 0 siblings, 1 reply; 53+ messages in thread From: Grygorii Strashko @ 2018-06-05 21:42 UTC (permalink / raw) To: Andrew Lunn Cc: Ilias Apalodimas, Ivan Vecera, Jiri Pirko, netdev, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton On 06/05/2018 04:28 PM, Andrew Lunn wrote: >> I hope you are right - question is always in number of available options >> and which one to select - and, most important, explain it to the end user :( > > The end customer being ptp4linux? At least for Marvell switches, it is > happy about everything except that the switch is a bit slow, so we > need to modify some of the time outs in the configuration file. > >> For example: >> phc_index is returned as part of .get_ts_info() = cpsw_get_ts_info(), >> so which intf should return phc_index? > > It is not a 1:1 relationship. See: > > https://elixir.bootlin.com/linux/latest/source/drivers/net/dsa/mv88e6xxx/hwtstamp.c#L61 > > All interfaces return the same index. > > In fact, for a switch, having a PHC per port would be odd. That would > mean you need to sync the PHCs in order to act as a boundary clock. PHC only one, but hw timestamping blocks are per port. -- regards, -grygorii ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-05 21:42 ` Grygorii Strashko @ 2018-06-05 21:55 ` Andrew Lunn 0 siblings, 0 replies; 53+ messages in thread From: Andrew Lunn @ 2018-06-05 21:55 UTC (permalink / raw) To: Grygorii Strashko Cc: Ilias Apalodimas, Ivan Vecera, Jiri Pirko, netdev, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton > PHC only one, but hw timestamping blocks are per port. Yes, same as the Marvell. Per port, there are two receive time stamps and one transmit time stamp. Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-02 23:28 ` Grygorii Strashko 2018-06-03 0:08 ` Andrew Lunn @ 2018-06-03 0:26 ` Andrew Lunn 2018-06-05 23:23 ` Grygorii Strashko 2018-06-03 0:37 ` Andrew Lunn 2018-06-03 0:49 ` Andrew Lunn 3 siblings, 1 reply; 53+ messages in thread From: Andrew Lunn @ 2018-06-03 0:26 UTC (permalink / raw) To: Grygorii Strashko Cc: Ilias Apalodimas, Ivan Vecera, Jiri Pirko, netdev, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton > *After this patch set*: goal keep things working the same as max as > possible and get rid of TI custom tool. We are happy to keep things the same, if they fit with the switchdev model. Anything in your customer TI tool/model which does not fit the switchdev model you won't be able to keep, except if we agree to extend the model. I can say now, sw0p0 is going to cause problems. I really do suggest you drop it for the moment in order to get a minimal driver accepted. sw0p0 does not fit the switchdev model. > Below I've described some tested use cases (not include full static configuration), > but regarding sw0p0 - there is work done by Ivan Khoronzhuk [1] which enables > adds MQPRIO and CBS Qdisc and targets AVB network features. It required to > offload MQPRIO and CBS parameters on all ports including P0. In case of P0, > CPDMA TX channels shapers need to be configured, and in case > of sw0p1/sw0p2 internal FIFOS. > sw0p0 also expected to be used to configure CPDMA interface in general - > number of tx/rx channels, rates, ring sizes. Can this be derives from the configuration on sw0p1 and sw0p2? sw0p1 has 1 tx channel, sw0p2 has 2 tx channels, so give p0 3 tx channels? > In addition there is set of global CPSW parameters (not related to P1/P2, like > MAC Authorization Mode, OUI Deny Mode, crc ) which I've > thought can be added to sw0p0 (using ethtool -priv-flags). You should describe these features, and then we can figure out how best to model them. devlink might be an option if they are switch global. Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-03 0:26 ` Andrew Lunn @ 2018-06-05 23:23 ` Grygorii Strashko 2018-06-05 23:49 ` Andrew Lunn 2018-06-06 8:23 ` Ivan Khoronzhuk 0 siblings, 2 replies; 53+ messages in thread From: Grygorii Strashko @ 2018-06-05 23:23 UTC (permalink / raw) To: Andrew Lunn Cc: Ilias Apalodimas, Ivan Vecera, Jiri Pirko, netdev, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton On 06/02/2018 07:26 PM, Andrew Lunn wrote: >> *After this patch set*: goal keep things working the same as max as >> possible and get rid of TI custom tool. > > We are happy to keep things the same, if they fit with the switchdev > model. Anything in your customer TI tool/model which does not fit the > switchdev model you won't be able to keep, except if we agree to > extend the model. Right. That's the main goal of RFC to identify those gaps. > > I can say now, sw0p0 is going to cause problems. I really do suggest > you drop it for the moment in order to get a minimal driver > accepted. sw0p0 does not fit the switchdev model. Honestly, this is not the first patchset and we started without sw0p0, but then.... (with current LKML) - default vlan offloading breaks traffic reception to P0 (Ilias saying it's fixed in next - good) - adding vlan to P1/P2 works, but not for P0 (again as per Ilias -fixed) - mcast - no way to manually add static record and include or exclude P0. :( above are basic functionality required. > >> Below I've described some tested use cases (not include full static configuration), >> but regarding sw0p0 - there is work done by Ivan Khoronzhuk [1] which enables >> adds MQPRIO and CBS Qdisc and targets AVB network features. It required to >> offload MQPRIO and CBS parameters on all ports including P0. In case of P0, >> CPDMA TX channels shapers need to be configured, and in case >> of sw0p1/sw0p2 internal FIFOS. >> sw0p0 also expected to be used to configure CPDMA interface in general - >> number of tx/rx channels, rates, ring sizes. > > Can this be derives from the configuration on sw0p1 and sw0p2? > sw0p1 has 1 tx channel, sw0p2 has 2 tx channels, so give p0 3 tx > channels? This not exactly what is required, sry I probably will need just repeat what Ivan described in https://lkml.org/lkml/2018/5/18/1135. I'd try to provide this info tomorrow. Unfortunately, I'm not sure if all this reworking and switchdev conversation would make sense if we will not be able to fit Ivan's work in new CPSW driver model ;..( and do AVB bridge. > >> In addition there is set of global CPSW parameters (not related to P1/P2, like >> MAC Authorization Mode, OUI Deny Mode, crc ) which I've >> thought can be added to sw0p0 (using ethtool -priv-flags). > > You should describe these features, and then we can figure out how > best to model them. devlink might be an option if they are switch > global. Ok. that can be postponed. -- regards, -grygorii ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-05 23:23 ` Grygorii Strashko @ 2018-06-05 23:49 ` Andrew Lunn 2018-06-06 8:23 ` Ivan Khoronzhuk 1 sibling, 0 replies; 53+ messages in thread From: Andrew Lunn @ 2018-06-05 23:49 UTC (permalink / raw) To: Grygorii Strashko Cc: Ilias Apalodimas, Ivan Vecera, Jiri Pirko, netdev, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton On Tue, Jun 05, 2018 at 06:23:45PM -0500, Grygorii Strashko wrote: > > > On 06/02/2018 07:26 PM, Andrew Lunn wrote: > >> *After this patch set*: goal keep things working the same as max as > >> possible and get rid of TI custom tool. > > > > We are happy to keep things the same, if they fit with the switchdev > > model. Anything in your customer TI tool/model which does not fit the > > switchdev model you won't be able to keep, except if we agree to > > extend the model. > > Right. That's the main goal of RFC to identify those gaps. > > > > > I can say now, sw0p0 is going to cause problems. I really do suggest > > you drop it for the moment in order to get a minimal driver > > accepted. sw0p0 does not fit the switchdev model. > > Honestly, this is not the first patchset and we started without sw0p0, > but then.... (with current LKML) > - default vlan offloading breaks traffic reception to P0 > (Ilias saying it's fixed in next - good) > - adding vlan to P1/P2 works, but not for P0 (again as per Ilias -fixed) > - mcast - no way to manually add static record and include or exclude P0. > > > :( above are basic functionality required. For a DSA driver, this is way more than basic. A basic DSA driver just provides interfaces, and does everything in software. No offload at all. Generally, FDB offload is next, then MDB, and then VLAN, each as separate patch sets. > Unfortunately, I'm not sure if all this reworking and switchdev conversation would make sense > if we will not be able to fit Ivan's work in new CPSW driver model ;..( > and do AVB bridge. AVB bridge should fit the switchdev model. You can offload TC via switchdev e.g. the b53 has mirred, mellonex has flower and a lot more. Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-05 23:23 ` Grygorii Strashko 2018-06-05 23:49 ` Andrew Lunn @ 2018-06-06 8:23 ` Ivan Khoronzhuk 1 sibling, 0 replies; 53+ messages in thread From: Ivan Khoronzhuk @ 2018-06-06 8:23 UTC (permalink / raw) To: Grygorii Strashko Cc: Andrew Lunn, Ilias Apalodimas, Ivan Vecera, Jiri Pirko, netdev, nsekhar, francois.ozog, yogeshs, spatton On Tue, Jun 05, 2018 at 06:23:45PM -0500, Grygorii Strashko wrote: > > >On 06/02/2018 07:26 PM, Andrew Lunn wrote: >>> *After this patch set*: goal keep things working the same as max as >>> possible and get rid of TI custom tool. >> >> We are happy to keep things the same, if they fit with the switchdev >> model. Anything in your customer TI tool/model which does not fit the >> switchdev model you won't be able to keep, except if we agree to >> extend the model. > >Right. That's the main goal of RFC to identify those gaps. > >> >> I can say now, sw0p0 is going to cause problems. I really do suggest >> you drop it for the moment in order to get a minimal driver >> accepted. sw0p0 does not fit the switchdev model. > >Honestly, this is not the first patchset and we started without sw0p0, >but then.... (with current LKML) >- default vlan offloading breaks traffic reception to P0 > (Ilias saying it's fixed in next - good) >- adding vlan to P1/P2 works, but not for P0 (again as per Ilias -fixed) >- mcast - no way to manually add static record and include or exclude P0. > > >:( above are basic functionality required. > >> >>> Below I've described some tested use cases (not include full static configuration), >>> but regarding sw0p0 - there is work done by Ivan Khoronzhuk [1] which enables >>> adds MQPRIO and CBS Qdisc and targets AVB network features. It required to >>> offload MQPRIO and CBS parameters on all ports including P0. In case of P0, >>> CPDMA TX channels shapers need to be configured, and in case >>> of sw0p1/sw0p2 internal FIFOS. >>> sw0p0 also expected to be used to configure CPDMA interface in general - >>> number of tx/rx channels, rates, ring sizes. >> >> Can this be derives from the configuration on sw0p1 and sw0p2? >> sw0p1 has 1 tx channel, sw0p2 has 2 tx channels, so give p0 3 tx >> channels? > >This not exactly what is required, sry I probably will need just repeat what Ivan >described in https://lkml.org/lkml/2018/5/18/1135. I'd try to provide this info tomorrow. > >Unfortunately, I'm not sure if all this reworking and switchdev conversation would make sense >if we will not be able to fit Ivan's work in new CPSW driver model ;..( >and do AVB bridge. Not sure I tracked everything, but current link example only for dual-emac mode, where i guess no switchdev and thus we have only 2 ports having phys and no need in some common configuration. Only common resources tx queues that are easily shared between two ports. In case of switchdev the same, no need in port 0, at least for cpsw implementation (not sure about others), tx queues cpdma shapers are configured separately and can be done with any netdev presenting ports, thus p0 can be avoided in this case also. For instance, I've created short description, based on current switchdev RFC, showing simple configuration commands for shapers configuration from host side and CBS for every port, w/o p0 usage: https://git.linaro.org/people/ivan.khoronzhuk/tsn_conf.git/tree/README_switchdev Will add it once switchdev decision is stabilized and applied. Basically nothing changed with dual-emac configuration except different rates set for cpdma shapers from host side. Probably, p0 is needed for other configurations, or for compatibility reasons with old versions, but definitely should be created list of all cases, and on my opinion, any common resource for both ports can be configured with any netdev presenting ports if it doesn't conflict with ports configuration itself. > >> >>> In addition there is set of global CPSW parameters (not related to P1/P2, like >>> MAC Authorization Mode, OUI Deny Mode, crc ) which I've >>> thought can be added to sw0p0 (using ethtool -priv-flags). >> >> You should describe these features, and then we can figure out how >> best to model them. devlink might be an option if they are switch >> global. > >Ok. that can be postponed. > >-- >regards, >-grygorii -- Regards, Ivan Khoronzhuk ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-02 23:28 ` Grygorii Strashko 2018-06-03 0:08 ` Andrew Lunn 2018-06-03 0:26 ` Andrew Lunn @ 2018-06-03 0:37 ` Andrew Lunn 2018-06-05 21:31 ` Grygorii Strashko 2018-06-03 0:49 ` Andrew Lunn 3 siblings, 1 reply; 53+ messages in thread From: Andrew Lunn @ 2018-06-03 0:37 UTC (permalink / raw) To: Grygorii Strashko Cc: Ilias Apalodimas, Ivan Vecera, Jiri Pirko, netdev, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton > 1) boot, ping no vlan > > # ip link add name br0 type bridge > # echo 0 > /sys/class/net/br0/bridge/default_pvid > # ip link set dev eth2 master br0 > # ip link set dev eth0 master br0 > # ip link set dev eth1 master br0 > # ifconfig br0 192.168.1.2 > > *Note*: I've had to disable default_pvid as otherwise linux Bridge adds > and offloads default vlan 1, but default configuration for CPSW driver is vid 0. > + CPSW specific - it can't untag packets for P0. > Another option I've found: > # ip link set dev br0 type bridge vlan_filtering 1. > but anyway, I've found it confusing that Linux bridge adds default vlan when vlan_filtering == 0 There are three different configurations here you need to worry about, with respect to vlans: # CONFIG_VLAN_8021Q is not set So you don't have any vlan support in the kernel. CONFIG_VLAN_8021Q=y, vlan_filtering = 0 So you have vlans, but filtering is off CONFIG_VLAN_8021Q=y, vlan_filtering = 1 So you have vlans, and filtering is on. Even with vlan_filtering off, the bridge still does a little with vlans. And you need all three to work correctly. Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-03 0:37 ` Andrew Lunn @ 2018-06-05 21:31 ` Grygorii Strashko 2018-06-05 21:37 ` Andrew Lunn 0 siblings, 1 reply; 53+ messages in thread From: Grygorii Strashko @ 2018-06-05 21:31 UTC (permalink / raw) To: Andrew Lunn Cc: Ilias Apalodimas, Ivan Vecera, Jiri Pirko, netdev, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton On 06/02/2018 07:37 PM, Andrew Lunn wrote: >> 1) boot, ping no vlan >> >> # ip link add name br0 type bridge >> # echo 0 > /sys/class/net/br0/bridge/default_pvid >> # ip link set dev eth2 master br0 >> # ip link set dev eth0 master br0 >> # ip link set dev eth1 master br0 >> # ifconfig br0 192.168.1.2 >> >> *Note*: I've had to disable default_pvid as otherwise linux Bridge adds >> and offloads default vlan 1, but default configuration for CPSW driver is vid 0. >> + CPSW specific - it can't untag packets for P0. >> Another option I've found: >> # ip link set dev br0 type bridge vlan_filtering 1. >> but anyway, I've found it confusing that Linux bridge adds default vlan when vlan_filtering == 0 > > There are three different configurations here you need to worry about, > with respect to vlans: > > # CONFIG_VLAN_8021Q is not set > > So you don't have any vlan support in the kernel. > > CONFIG_VLAN_8021Q=y, vlan_filtering = 0 > > So you have vlans, but filtering is off > > CONFIG_VLAN_8021Q=y, vlan_filtering = 1 > > So you have vlans, and filtering is on. > > Even with vlan_filtering off, the bridge still does a little with > vlans. > > And you need all three to work correctly. > Right, thanky for the info, but still (sry, to be annoying) why default vlan is added by bridge when vlan_filtering == 0? -- regards, -grygorii ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-05 21:31 ` Grygorii Strashko @ 2018-06-05 21:37 ` Andrew Lunn 0 siblings, 0 replies; 53+ messages in thread From: Andrew Lunn @ 2018-06-05 21:37 UTC (permalink / raw) To: Grygorii Strashko Cc: Ilias Apalodimas, Ivan Vecera, Jiri Pirko, netdev, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton > Right, thanky for the info, but still (sry, to be annoying) why default vlan is added by bridge > when vlan_filtering == 0? I have no idea! I just made sure the Marvell driver works with this. You might want to do a git blame and find out who added it, and it might say why. Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-02 23:28 ` Grygorii Strashko ` (2 preceding siblings ...) 2018-06-03 0:37 ` Andrew Lunn @ 2018-06-03 0:49 ` Andrew Lunn 2018-06-05 22:45 ` Grygorii Strashko 3 siblings, 1 reply; 53+ messages in thread From: Andrew Lunn @ 2018-06-03 0:49 UTC (permalink / raw) To: Grygorii Strashko Cc: Ilias Apalodimas, Ivan Vecera, Jiri Pirko, netdev, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton Hi Grygorii > Don't know howto: > 1) add FDB entry with "blocked" flag - ALE can discard all packets with SRC/DST > address = blocked MAC > 2) add multicast MAC address with Supervisory Packet flag set. > Such packets will bypass most of checks inside ALE and will be forwarded in all port's > states except "disabled". > 3) add "unknown vlan configuration" : ALE provides possibility to configure > default behavior for tagged packets with "unknown vlan" by configuring > - Unknown VLAN Force Untagged Egress ports Mask. > - Unknown VLAN Registered Multicast Flood Ports Mask > - Unknown VLAN Multicast Flood ports Mask > - Unknown VLAN Member ports List > 4) The way to detect "brctl stp br0 on/off" You are probably looking at this from the wrong direction. Yes, the switch can do these things. But the real question is, why would the network stack want to do this? As i've said before, you are accelerating the network stack by offloading things to the hardware. Does the software bridge support FDB with a blocked flag? I don't think it does. So you first need to extend the software bridge with this concept. Then you can offload it to the hardware to accelerate it. Does the network stack need for forward specific multicast MAC addresses between bridge ports independent of the state? If there is no need for it, you don't need to accelerate it. Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-03 0:49 ` Andrew Lunn @ 2018-06-05 22:45 ` Grygorii Strashko 2018-06-05 23:40 ` Andrew Lunn 0 siblings, 1 reply; 53+ messages in thread From: Grygorii Strashko @ 2018-06-05 22:45 UTC (permalink / raw) To: Andrew Lunn Cc: Ilias Apalodimas, Ivan Vecera, Jiri Pirko, netdev, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton Hi Andrew, Thanks a lot for you comments. On 06/02/2018 07:49 PM, Andrew Lunn wrote: > Hi Grygorii > >> Don't know howto: >> 1) add FDB entry with "blocked" flag - ALE can discard all packets with SRC/DST >> address = blocked MAC >> 2) add multicast MAC address with Supervisory Packet flag set. >> Such packets will bypass most of checks inside ALE and will be forwarded in all port's >> states except "disabled". >> 3) add "unknown vlan configuration" : ALE provides possibility to configure >> default behavior for tagged packets with "unknown vlan" by configuring >> - Unknown VLAN Force Untagged Egress ports Mask. >> - Unknown VLAN Registered Multicast Flood Ports Mask >> - Unknown VLAN Multicast Flood ports Mask >> - Unknown VLAN Member ports List >> 4) The way to detect "brctl stp br0 on/off" > > You are probably looking at this from the wrong direction. Yes, the > switch can do these things. But the real question is, why would the > network stack want to do this? As i've said before, you are > accelerating the network stack by offloading things to the hardware. Right. Thanks. > > Does the software bridge support FDB with a blocked flag? I don't > think it does. So you first need to extend the software bridge with > this concept. Then you can offload it to the hardware to accelerate > it. Sry, for possible misunderstanding: in "Don't know howto" i've listed things I was not able to discover from code or documentation with hope that expert opinion will help to confirm if this this a real/possible gap or I/we've just missed smth. And if this is a real gap - get "green" or "red" flag for future work (which need to be planned somehow). So, my understanding for (1) "blocked FDB entry support" is reasonable extension for bridge/switchdev ("green"). > > Does the network stack need for forward specific multicast MAC > addresses between bridge ports independent of the state? If there is > no need for it, you don't need to accelerate it. Assume this is about item 2 - this question is related to STP packets. CPSW/ALE will drop STP packets if receiving port is in blocking/learning states unless corresponding mcast entry exist in ALE entry with (Supervisory Packet) flag set (Actually ALE mcast entry has two fields (TRM): Supervisory Packet (SUPER): When set, this field indicates that the packet with a matching multicast destination address is a supervisory packet. Multicast Forward State (MCAST_FWD_STATE): Indicates the port state(s) required for the received port on a destination address lookup in order for the multicast packet to be forwarded to the transmit port(s). A transmit port must be in the Forwarding state in order to forward the packet.) Question 4 was asked with assumption that if (2) not supported and "red" flag - then option (4) can be used as w/a (again if "green" flag) and STP mcast address can be added in ALE on event "stp on". ** "unknown vlan configuration" This is about following use case. Non static network configuration when CPSW based device knows what traffic it can accept (Host port 0), but it has no knowledge (or limited) about network segments attached to Port 1 and Port 2. For example: Host 0 can accept only vlan 100 traffic coming from Port 1. ALE entry: vid =100, port_mask 0x3 But there can be vlan 50 created in attached network segments. Unknown VLAN Force Untagged Egress ports Mask = 0x0 Unknown VLAN Registered Multicast Flood Ports Mask = 0x6 (P1|P2) Unknown VLAN Multicast Flood ports Mask = 0x6 (P1|P2) Unknown VLAN Member ports List = 0x6 (P1|P2) with above configuration packets with "unknown vlan" (no ALE entry) will still be forwarded between port 1 and 2, but not Port 0. So, is it reasonable to add "unknown vlan configuration" to bridge/switchdev if not exist yet? will any other hw known benefit from it? -- regards, -grygorii ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-05 22:45 ` Grygorii Strashko @ 2018-06-05 23:40 ` Andrew Lunn 0 siblings, 0 replies; 53+ messages in thread From: Andrew Lunn @ 2018-06-05 23:40 UTC (permalink / raw) To: Grygorii Strashko Cc: Ilias Apalodimas, Ivan Vecera, Jiri Pirko, netdev, ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton > So, my understanding for (1) "blocked FDB entry support" is reasonable > extension for bridge/switchdev ("green"). You might have to justify it, but yes. > > Does the network stack need for forward specific multicast MAC > > addresses between bridge ports independent of the state? If there is > > no need for it, you don't need to accelerate it. > > Assume this is about item 2 - this question is related to STP packets. > CPSW/ALE will drop STP packets if receiving port is in blocking/learning states > unless corresponding mcast entry exist in ALE entry with (Supervisory Packet) flag set > (Actually ALE mcast entry has two fields (TRM): > Supervisory Packet (SUPER): When set, this field indicates that the packet > with a matching multicast destination address is a supervisory packet. > Multicast Forward State (MCAST_FWD_STATE): Indicates the port state(s) required for the received port > on a destination address lookup in order for the multicast packet to be forwarded to > the transmit port(s). A transmit port must be in the Forwarding state in > order to forward the packet.) So i this case, i would expect your driver to just add these entries to the ALE. No need for configuration for above. Just do it as soon as a port is made a member of a bridge. And remove it when the port leaves the bridge. DSA devices are smarter. They all have hardware which looks for BPDU and forwards them to the host independent of the state of the port. They also tend to have hardware looking for IGMP packets, and will forward them to the host, even if the multicast address is not being forwarded to the host. > ** "unknown vlan configuration" > > This is about following use case. Non static network configuration when > CPSW based device knows what traffic it can accept (Host port 0), but > it has no knowledge (or limited) about network segments attached to Port 1 and Port 2. > > For example: Host 0 can accept only vlan 100 traffic coming from Port 1. > ALE entry: vid =100, port_mask 0x3 > > But there can be vlan 50 created in attached network segments. > Unknown VLAN Force Untagged Egress ports Mask = 0x0 > Unknown VLAN Registered Multicast Flood Ports Mask = 0x6 (P1|P2) > Unknown VLAN Multicast Flood ports Mask = 0x6 (P1|P2) > Unknown VLAN Member ports List = 0x6 (P1|P2) > > with above configuration packets with "unknown vlan" (no ALE entry) will > still be forwarded between port 1 and 2, but not Port 0. > > So, is it reasonable to add "unknown vlan configuration" to bridge/switchdev > if not exist yet? will any other hw known benefit from it? You need to think about the case of two e1000e. How do you configure this setup to do what you want? It probably is already possible. Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-05-24 6:56 [PATCH 0/4] RFC CPSW switchdev mode Ilias Apalodimas ` (4 preceding siblings ...) 2018-05-24 8:05 ` [PATCH 0/4] RFC CPSW switchdev mode Jiri Pirko @ 2018-06-01 21:29 ` Grygorii Strashko 2018-06-02 14:08 ` Andrew Lunn 5 siblings, 1 reply; 53+ messages in thread From: Grygorii Strashko @ 2018-06-01 21:29 UTC (permalink / raw) To: Ilias Apalodimas, netdev, ivan.khoronzhuk, nsekhar, jiri, ivecera Cc: francois.ozog, yogeshs, spatton Hi Ilias, On 05/24/2018 01:56 AM, Ilias Apalodimas wrote: > Hello, > > This is adding a new mode on the cpsw driver based around switchdev. > In order to enable this you need to enable CONFIG_NET_SWITCHDEV, > CONFIG_BRIDGE_VLAN_FILTERING, CONFIG_TI_CPSW_SWITCHDEV > and add to udev config: > > SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="0f011900", \ > ATTR{phys_port_name}!="", NAME="sw0$attr{phys_port_name}" > Since the phys_switch_id is based on cpsw version, users with different > version will need to do 'ip -d link show dev sw0p0 | grep switchid' and > replace with the correct value. > > This patch creates 3 ports, sw0p0, sw0p1 and sw0p2. > sw0p1 and sw0p2 are the netdev interfaces connected to PHY devices > while sw0p0 is the switch 'cpu facing port'. > sw0p0 will be unable to receive and transmit traffic and it's not 100% within > switchdev scope but, it's used to configure switch cpu port individually as > this is needed for various switch features and configuration scenarios. First, sorry for delayed reply it was very tough week (new SoC). Second, Thanks a lot for your great work. I'm still testing it with different use cases and trying to consolidate my reply for all questions. All, thanks for your comments. -- regards, -grygorii ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-01 21:29 ` Grygorii Strashko @ 2018-06-02 14:08 ` Andrew Lunn 2018-06-05 22:59 ` Grygorii Strashko 0 siblings, 1 reply; 53+ messages in thread From: Andrew Lunn @ 2018-06-02 14:08 UTC (permalink / raw) To: Grygorii Strashko Cc: Ilias Apalodimas, netdev, ivan.khoronzhuk, nsekhar, jiri, ivecera, francois.ozog, yogeshs, spatton On Fri, Jun 01, 2018 at 04:29:08PM -0500, Grygorii Strashko wrote: > Hi Ilias, > Second, Thanks a lot for your great work. I'm still testing it with different > use cases and trying to consolidate my reply for all questions. > > All, thanks for your comments. Hi Grygorii Something i've said to Ilias already. I would recommend you don't try to cover all your uses cases with the first version. Keep it simple and clean, don't do anything controversial and get it merged. Then add more features one by one. We can then discuss any odd ball features while being able to look at the complete system, driver, switchdev and the network stack. Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-02 14:08 ` Andrew Lunn @ 2018-06-05 22:59 ` Grygorii Strashko 2018-06-05 23:53 ` Andrew Lunn 0 siblings, 1 reply; 53+ messages in thread From: Grygorii Strashko @ 2018-06-05 22:59 UTC (permalink / raw) To: Andrew Lunn Cc: Ilias Apalodimas, netdev, ivan.khoronzhuk, nsekhar, jiri, ivecera, francois.ozog, yogeshs, spatton On 06/02/2018 09:08 AM, Andrew Lunn wrote: > On Fri, Jun 01, 2018 at 04:29:08PM -0500, Grygorii Strashko wrote: >> Hi Ilias, > > >> Second, Thanks a lot for your great work. I'm still testing it with different >> use cases and trying to consolidate my reply for all questions. >> >> All, thanks for your comments. > > Hi Grygorii > > Something i've said to Ilias already. I would recommend you don't try > to cover all your uses cases with the first version. Keep it simple > and clean, don't do anything controversial and get it merged. Then add > more features one by one. We can then discuss any odd ball features > while being able to look at the complete system, driver, switchdev and > the network stack. > yes. It definitely no problem from my side, except basic customer use-cases simply not working without sw0p0, at least with current LKML :( And I just have to look a little bit in the future as selected approach expected to be extended on future SoC (and other parts of existing SoCs - ICSS-G SW switch) where we going to have more features, like TSN, EST and packet Policing and Classification. And I very, very appreciated for your (and all others) time and comments. Thank you. -- regards, -grygorii ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-05 22:59 ` Grygorii Strashko @ 2018-06-05 23:53 ` Andrew Lunn 2018-06-06 6:42 ` Ilias Apalodimas 0 siblings, 1 reply; 53+ messages in thread From: Andrew Lunn @ 2018-06-05 23:53 UTC (permalink / raw) To: Grygorii Strashko Cc: Ilias Apalodimas, netdev, ivan.khoronzhuk, nsekhar, jiri, ivecera, francois.ozog, yogeshs, spatton > And I just have to look a little bit in the future as selected approach > expected to be extended on future SoC (and other parts of existing SoCs - ICSS-G SW switch) > where we going to have more features, like TSN, EST and packet Policing and Classification. You should probably took a closer look at the Mellonex driver. It has a lot of TC offload, which is what policing and classification is. TSN is being worked on in general, and i think the i210 is taking the lead. So you probably want to keep an eye on that, and join the discussion. Andrew ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 0/4] RFC CPSW switchdev mode 2018-06-05 23:53 ` Andrew Lunn @ 2018-06-06 6:42 ` Ilias Apalodimas 0 siblings, 0 replies; 53+ messages in thread From: Ilias Apalodimas @ 2018-06-06 6:42 UTC (permalink / raw) To: Andrew Lunn Cc: Grygorii Strashko, netdev, ivan.khoronzhuk, nsekhar, jiri, ivecera, francois.ozog, yogeshs, spatton Hi Andrew, >On Wed, Jun 06, 2018 at 01:53:56AM +0200, Andrew Lunn wrote: > > And I just have to look a little bit in the future as selected approach > > expected to be extended on future SoC (and other parts of existing SoCs - ICSS-G SW switch) > > where we going to have more features, like TSN, EST and packet Policing and Classification. > > You should probably took a closer look at the Mellonex driver. It has > a lot of TC offload, which is what policing and classification is. > I did take a close look to both Mellanox and rocker drivers before issuing this RFC and we seem to be close on the approach. What Grygorii is reffering to, is that for CBS to work properly on CPSW there *must* be a way to configure the CPU port individually. > TSN is being worked on in general, and i think the i210 is taking the > lead. So you probably want to keep an eye on that, and join the > discussion. > TSN is not just "one thing". It's a few IEEE standards bundled up to provide the needed functionality. i210 is only implementing CBS at the moment and there's an RFC out there to support what they call "Time based scheduling". I am already having discussions with Jesus on their current work. The idea behind using switchdev is that due to it's design, it's a very convenient way of utilizing netlink and iproute2/tc to configure any kind of future shapers. Since net_device_ops now has a callback to configure hardware schedulers being able to expose netdevs as hardware ports and configure them individually is great. As you can understand you'll end up with TSN capable switches and NICs being configured with the same commands from a userspace point of view. I am not sure this is the proper way to go, but at least for me, sounds like a viable solution. Thanks Ilias ^ permalink raw reply [flat|nested] 53+ messages in thread
end of thread, other threads:[~2018-06-06 8:23 UTC | newest] Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-05-24 6:56 [PATCH 0/4] RFC CPSW switchdev mode Ilias Apalodimas 2018-05-24 6:56 ` [PATCH 1/4] cpsw: move common headers definitions to cpsw_priv.h Ilias Apalodimas 2018-05-24 6:56 ` [PATCH 2/4] cpsw_ale: add support functions for switchdev Ilias Apalodimas 2018-05-24 6:56 ` [PATCH 3/4] cpsw_switchdev: add switchdev support files Ilias Apalodimas 2018-05-24 10:00 ` Maxim Uvarov 2018-05-27 4:39 ` kbuild test robot 2018-05-24 6:56 ` [PATCH 4/4] cpsw: add switchdev support Ilias Apalodimas 2018-05-24 13:12 ` Andrew Lunn 2018-05-24 13:32 ` Ilias Apalodimas 2018-05-24 16:39 ` Andrew Lunn 2018-05-25 4:56 ` Ilias Apalodimas 2018-06-01 21:48 ` Florian Fainelli 2018-06-02 10:34 ` Ilias Apalodimas 2018-06-02 16:10 ` Florian Fainelli 2018-06-02 16:52 ` Ilias Apalodimas 2018-06-05 21:03 ` Grygorii Strashko 2018-06-05 21:37 ` Florian Fainelli 2018-05-24 8:05 ` [PATCH 0/4] RFC CPSW switchdev mode Jiri Pirko 2018-05-24 8:48 ` Ilias Apalodimas 2018-05-24 12:54 ` Andrew Lunn 2018-05-24 13:44 ` Ivan Vecera 2018-05-24 14:08 ` Ilias Apalodimas 2018-05-24 14:54 ` Andrew Lunn 2018-05-24 15:07 ` Ilias Apalodimas 2018-05-24 15:25 ` Andrew Lunn 2018-05-24 16:02 ` Ilias Apalodimas 2018-05-24 16:33 ` Andrew Lunn 2018-05-25 6:29 ` Ilias Apalodimas 2018-05-25 10:28 ` Ilias Apalodimas 2018-05-25 11:59 ` Andrew Lunn 2018-05-25 12:09 ` Andrew Lunn 2018-05-31 15:27 ` Ilias Apalodimas 2018-06-02 23:28 ` Grygorii Strashko 2018-06-03 0:08 ` Andrew Lunn 2018-06-05 21:18 ` Grygorii Strashko 2018-06-05 21:28 ` Andrew Lunn 2018-06-05 21:42 ` Grygorii Strashko 2018-06-05 21:55 ` Andrew Lunn 2018-06-03 0:26 ` Andrew Lunn 2018-06-05 23:23 ` Grygorii Strashko 2018-06-05 23:49 ` Andrew Lunn 2018-06-06 8:23 ` Ivan Khoronzhuk 2018-06-03 0:37 ` Andrew Lunn 2018-06-05 21:31 ` Grygorii Strashko 2018-06-05 21:37 ` Andrew Lunn 2018-06-03 0:49 ` Andrew Lunn 2018-06-05 22:45 ` Grygorii Strashko 2018-06-05 23:40 ` Andrew Lunn 2018-06-01 21:29 ` Grygorii Strashko 2018-06-02 14:08 ` Andrew Lunn 2018-06-05 22:59 ` Grygorii Strashko 2018-06-05 23:53 ` Andrew Lunn 2018-06-06 6:42 ` Ilias Apalodimas
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).