All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 00/10] bnxt_en: Updates for net-next.
@ 2017-07-24 16:34 Michael Chan
  2017-07-24 16:34 ` [PATCH net-next 01/10] bnxt_en: Update firmware interface spec to 1.8.0 Michael Chan
                   ` (10 more replies)
  0 siblings, 11 replies; 38+ messages in thread
From: Michael Chan @ 2017-07-24 16:34 UTC (permalink / raw)
  To: davem; +Cc: netdev

This series includes updating the firmware interface, adding
methods to get and set VEPA/VEB bridge modes, some minor DCBX and ETS
refinements, and 3 patches from Sathya Perla to implement initial
VF representors for SRIOV switching.

Michael Chan (7):
  bnxt_en: Update firmware interface spec to 1.8.0.
  bnxt_en: Retrieve the hardware bridge mode from the firmware.
  bnxt_en: Implement ndo_bridge_{get|set}link methods.
  bnxt_en: Add bnxt_get_num_stats() to centrally get the number of
    ethtool stats.
  bnxt_en: Allow the user to set ethtool stats-block-usecs to 0.
  bnxt_en: Report firmware DCBX agent.
  bnxt_en: Set ETS min_bw parameter for older firmware.

Sathya Perla (3):
  bnxt_en: add support to enable VF-representors
  bnxt_en: add vf-rep RX/TX and netdev implementation
  bnxt_en: add support for port_attr_get and and get_phys_port_name

 drivers/net/ethernet/broadcom/Kconfig             |   1 +
 drivers/net/ethernet/broadcom/bnxt/Makefile       |   2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c         | 237 +++++++++--
 drivers/net/ethernet/broadcom/bnxt/bnxt.h         |  54 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c     |  17 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h     |   1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  33 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h     | 322 ++++++++++++--
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c   |  15 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c     | 489 ++++++++++++++++++++++
 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.h     |  42 ++
 11 files changed, 1134 insertions(+), 79 deletions(-)
 create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
 create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.h

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH net-next 01/10] bnxt_en: Update firmware interface spec to 1.8.0.
  2017-07-24 16:34 [PATCH net-next 00/10] bnxt_en: Updates for net-next Michael Chan
@ 2017-07-24 16:34 ` Michael Chan
  2017-07-24 16:34 ` [PATCH net-next 02/10] bnxt_en: Retrieve the hardware bridge mode from the firmware Michael Chan
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 38+ messages in thread
From: Michael Chan @ 2017-07-24 16:34 UTC (permalink / raw)
  To: davem; +Cc: netdev

VF representors and PTP are added features in the new firmware spec.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c       |   8 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h       |   8 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h   | 322 +++++++++++++++++++++---
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c |   9 +-
 4 files changed, 308 insertions(+), 39 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index e7c8539..2103f14 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5646,7 +5646,7 @@ static int bnxt_hwrm_phy_qcaps(struct bnxt *bp)
 	if (rc)
 		goto hwrm_phy_qcaps_exit;
 
-	if (resp->eee_supported & PORT_PHY_QCAPS_RESP_EEE_SUPPORTED) {
+	if (resp->flags & PORT_PHY_QCAPS_RESP_FLAGS_EEE_SUPPORTED) {
 		struct ethtool_eee *eee = &bp->eee;
 		u16 fw_speeds = le16_to_cpu(resp->supported_speeds_eee_mode);
 
@@ -5686,13 +5686,15 @@ static int bnxt_update_link(struct bnxt *bp, bool chng_link_state)
 
 	memcpy(&link_info->phy_qcfg_resp, resp, sizeof(*resp));
 	link_info->phy_link_status = resp->link;
-	link_info->duplex =  resp->duplex;
+	link_info->duplex = resp->duplex_cfg;
+	if (bp->hwrm_spec_code >= 0x10800)
+		link_info->duplex = resp->duplex_state;
 	link_info->pause = resp->pause;
 	link_info->auto_mode = resp->auto_mode;
 	link_info->auto_pause_setting = resp->auto_pause;
 	link_info->lp_pause = resp->link_partner_adv_pause;
 	link_info->force_pause_setting = resp->force_pause;
-	link_info->duplex_setting = resp->duplex;
+	link_info->duplex_setting = resp->duplex_cfg;
 	if (link_info->phy_link_status == BNXT_LINK_LINK)
 		link_info->link_speed = le16_to_cpu(resp->link_speed);
 	else
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index f34691f..505691a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -12,10 +12,10 @@
 #define BNXT_H
 
 #define DRV_MODULE_NAME		"bnxt_en"
-#define DRV_MODULE_VERSION	"1.7.0"
+#define DRV_MODULE_VERSION	"1.8.0"
 
 #define DRV_VER_MAJ	1
-#define DRV_VER_MIN	7
+#define DRV_VER_MIN	8
 #define DRV_VER_UPD	0
 
 #include <linux/interrupt.h>
@@ -825,8 +825,8 @@ struct bnxt_link_info {
 	u8			loop_back;
 	u8			link_up;
 	u8			duplex;
-#define BNXT_LINK_DUPLEX_HALF	PORT_PHY_QCFG_RESP_DUPLEX_HALF
-#define BNXT_LINK_DUPLEX_FULL	PORT_PHY_QCFG_RESP_DUPLEX_FULL
+#define BNXT_LINK_DUPLEX_HALF	PORT_PHY_QCFG_RESP_DUPLEX_STATE_HALF
+#define BNXT_LINK_DUPLEX_FULL	PORT_PHY_QCFG_RESP_DUPLEX_STATE_FULL
 	u8			pause;
 #define BNXT_LINK_PAUSE_TX	PORT_PHY_QCFG_RESP_PAUSE_TX
 #define BNXT_LINK_PAUSE_RX	PORT_PHY_QCFG_RESP_PAUSE_RX
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h
index 7dc71bb..3ba22e8 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h
@@ -11,14 +11,14 @@
 #ifndef BNXT_HSI_H
 #define BNXT_HSI_H
 
-/* HSI and HWRM Specification 1.7.6 */
+/* HSI and HWRM Specification 1.8.0 */
 #define HWRM_VERSION_MAJOR	1
-#define HWRM_VERSION_MINOR	7
-#define HWRM_VERSION_UPDATE	6
+#define HWRM_VERSION_MINOR	8
+#define HWRM_VERSION_UPDATE	0
 
-#define HWRM_VERSION_RSVD	2 /* non-zero means beta version */
+#define HWRM_VERSION_RSVD	0 /* non-zero means beta version */
 
-#define HWRM_VERSION_STR	"1.7.6.2"
+#define HWRM_VERSION_STR	"1.8.0.0"
 /*
  * Following is the signature for HWRM message field that indicates not
  * applicable (All F's). Need to cast it the size of the field if needed.
@@ -813,7 +813,7 @@ struct hwrm_func_qcfg_output {
 	#define FUNC_QCFG_RESP_FLAGS_FW_DCBX_AGENT_ENABLED	    0x4UL
 	#define FUNC_QCFG_RESP_FLAGS_STD_TX_RING_MODE_ENABLED      0x8UL
 	#define FUNC_QCFG_RESP_FLAGS_FW_LLDP_AGENT_ENABLED	    0x10UL
-	#define FUNC_QCFG_RESP_FLAGS_MULTI_HOST			    0x20UL
+	#define FUNC_QCFG_RESP_FLAGS_MULTI_HOST		    0x20UL
 	u8 mac_address[6];
 	__le16 pci_id;
 	__le16 alloc_rsscos_ctx;
@@ -835,9 +835,8 @@ struct hwrm_func_qcfg_output {
 	u8 port_pf_cnt;
 	#define FUNC_QCFG_RESP_PORT_PF_CNT_UNAVAIL		   0x0UL
 	__le16 dflt_vnic_id;
-	u8 host_cnt;
-	#define FUNC_QCFG_RESP_HOST_CNT_UNAVAIL		   0x0UL
 	u8 unused_0;
+	u8 unused_1;
 	__le32 min_bw;
 	#define FUNC_QCFG_RESP_MIN_BW_BW_VALUE_MASK		    0xfffffffUL
 	#define FUNC_QCFG_RESP_MIN_BW_BW_VALUE_SFT		    0
@@ -874,12 +873,56 @@ struct hwrm_func_qcfg_output {
 	#define FUNC_QCFG_RESP_EVB_MODE_NO_EVB			   0x0UL
 	#define FUNC_QCFG_RESP_EVB_MODE_VEB			   0x1UL
 	#define FUNC_QCFG_RESP_EVB_MODE_VEPA			   0x2UL
-	u8 unused_1;
+	u8 unused_2;
 	__le16 alloc_vfs;
 	__le32 alloc_mcast_filters;
 	__le32 alloc_hw_ring_grps;
 	__le16 alloc_sp_tx_rings;
+	u8 unused_3;
+	u8 valid;
+};
+
+/* hwrm_func_vlan_cfg */
+/* Input (48 bytes) */
+struct hwrm_func_vlan_cfg_input {
+	__le16 req_type;
+	__le16 cmpl_ring;
+	__le16 seq_id;
+	__le16 target_id;
+	__le64 resp_addr;
+	__le16 fid;
+	u8 unused_0;
+	u8 unused_1;
+	__le32 enables;
+	#define FUNC_VLAN_CFG_REQ_ENABLES_STAG_VID		    0x1UL
+	#define FUNC_VLAN_CFG_REQ_ENABLES_CTAG_VID		    0x2UL
+	#define FUNC_VLAN_CFG_REQ_ENABLES_STAG_PCP		    0x4UL
+	#define FUNC_VLAN_CFG_REQ_ENABLES_CTAG_PCP		    0x8UL
+	#define FUNC_VLAN_CFG_REQ_ENABLES_STAG_TPID		    0x10UL
+	#define FUNC_VLAN_CFG_REQ_ENABLES_CTAG_TPID		    0x20UL
+	__le16 stag_vid;
+	u8 stag_pcp;
 	u8 unused_2;
+	__be16 stag_tpid;
+	__le16 ctag_vid;
+	u8 ctag_pcp;
+	u8 unused_3;
+	__be16 ctag_tpid;
+	__le32 rsvd1;
+	__le32 rsvd2;
+	__le32 unused_4;
+};
+
+/* Output (16 bytes) */
+struct hwrm_func_vlan_cfg_output {
+	__le16 error_code;
+	__le16 req_type;
+	__le16 seq_id;
+	__le16 resp_len;
+	__le32 unused_0;
+	u8 unused_1;
+	u8 unused_2;
+	u8 unused_3;
 	u8 valid;
 };
 
@@ -902,6 +945,7 @@ struct hwrm_func_cfg_input {
 	#define FUNC_CFG_REQ_FLAGS_STD_TX_RING_MODE_ENABLE	    0x200UL
 	#define FUNC_CFG_REQ_FLAGS_STD_TX_RING_MODE_DISABLE	    0x400UL
 	#define FUNC_CFG_REQ_FLAGS_VIRT_MAC_PERSIST		    0x800UL
+	#define FUNC_CFG_REQ_FLAGS_NO_AUTOCLEAR_STATISTIC	    0x1000UL
 	__le32 enables;
 	#define FUNC_CFG_REQ_ENABLES_MTU			    0x1UL
 	#define FUNC_CFG_REQ_ENABLES_MRU			    0x2UL
@@ -1456,9 +1500,9 @@ struct hwrm_port_phy_qcfg_output {
 	#define PORT_PHY_QCFG_RESP_LINK_SPEED_50GB		   0x1f4UL
 	#define PORT_PHY_QCFG_RESP_LINK_SPEED_100GB		   0x3e8UL
 	#define PORT_PHY_QCFG_RESP_LINK_SPEED_10MB		   0xffffUL
-	u8 duplex;
-	#define PORT_PHY_QCFG_RESP_DUPLEX_HALF			   0x0UL
-	#define PORT_PHY_QCFG_RESP_DUPLEX_FULL			   0x1UL
+	u8 duplex_cfg;
+	#define PORT_PHY_QCFG_RESP_DUPLEX_CFG_HALF		   0x0UL
+	#define PORT_PHY_QCFG_RESP_DUPLEX_CFG_FULL		   0x1UL
 	u8 pause;
 	#define PORT_PHY_QCFG_RESP_PAUSE_TX			    0x1UL
 	#define PORT_PHY_QCFG_RESP_PAUSE_RX			    0x2UL
@@ -1573,6 +1617,9 @@ struct hwrm_port_phy_qcfg_output {
 	#define PORT_PHY_QCFG_RESP_PHY_TYPE_40G_BASELR4	   0x16UL
 	#define PORT_PHY_QCFG_RESP_PHY_TYPE_40G_BASEER4	   0x17UL
 	#define PORT_PHY_QCFG_RESP_PHY_TYPE_40G_ACTIVE_CABLE      0x18UL
+	#define PORT_PHY_QCFG_RESP_PHY_TYPE_1G_BASET		   0x19UL
+	#define PORT_PHY_QCFG_RESP_PHY_TYPE_1G_BASESX		   0x1aUL
+	#define PORT_PHY_QCFG_RESP_PHY_TYPE_1G_BASECX		   0x1bUL
 	u8 media_type;
 	#define PORT_PHY_QCFG_RESP_MEDIA_TYPE_UNKNOWN		   0x0UL
 	#define PORT_PHY_QCFG_RESP_MEDIA_TYPE_TP		   0x1UL
@@ -1651,14 +1698,16 @@ struct hwrm_port_phy_qcfg_output {
 	#define PORT_PHY_QCFG_RESP_FEC_CFG_FEC_CLAUSE74_ENABLED    0x10UL
 	#define PORT_PHY_QCFG_RESP_FEC_CFG_FEC_CLAUSE91_SUPPORTED  0x20UL
 	#define PORT_PHY_QCFG_RESP_FEC_CFG_FEC_CLAUSE91_ENABLED    0x40UL
+	u8 duplex_state;
+	#define PORT_PHY_QCFG_RESP_DUPLEX_STATE_HALF		   0x0UL
+	#define PORT_PHY_QCFG_RESP_DUPLEX_STATE_FULL		   0x1UL
 	u8 unused_1;
-	u8 unused_2;
 	char phy_vendor_name[16];
 	char phy_vendor_partnumber[16];
-	__le32 unused_3;
+	__le32 unused_2;
+	u8 unused_3;
 	u8 unused_4;
 	u8 unused_5;
-	u8 unused_6;
 	u8 valid;
 };
 
@@ -1744,6 +1793,51 @@ struct hwrm_port_mac_cfg_output {
 	u8 valid;
 };
 
+/* hwrm_port_mac_ptp_qcfg */
+/* Input (24 bytes) */
+struct hwrm_port_mac_ptp_qcfg_input {
+	__le16 req_type;
+	__le16 cmpl_ring;
+	__le16 seq_id;
+	__le16 target_id;
+	__le64 resp_addr;
+	__le16 port_id;
+	__le16 unused_0[3];
+};
+
+/* Output (80 bytes) */
+struct hwrm_port_mac_ptp_qcfg_output {
+	__le16 error_code;
+	__le16 req_type;
+	__le16 seq_id;
+	__le16 resp_len;
+	u8 flags;
+	#define PORT_MAC_PTP_QCFG_RESP_FLAGS_DIRECT_ACCESS	    0x1UL
+	#define PORT_MAC_PTP_QCFG_RESP_FLAGS_HWRM_ACCESS	    0x2UL
+	u8 unused_0;
+	__le16 unused_1;
+	__le32 rx_ts_reg_off_lower;
+	__le32 rx_ts_reg_off_upper;
+	__le32 rx_ts_reg_off_seq_id;
+	__le32 rx_ts_reg_off_src_id_0;
+	__le32 rx_ts_reg_off_src_id_1;
+	__le32 rx_ts_reg_off_src_id_2;
+	__le32 rx_ts_reg_off_domain_id;
+	__le32 rx_ts_reg_off_fifo;
+	__le32 rx_ts_reg_off_fifo_adv;
+	__le32 rx_ts_reg_off_granularity;
+	__le32 tx_ts_reg_off_lower;
+	__le32 tx_ts_reg_off_upper;
+	__le32 tx_ts_reg_off_seq_id;
+	__le32 tx_ts_reg_off_fifo;
+	__le32 tx_ts_reg_off_granularity;
+	__le32 unused_2;
+	u8 unused_3;
+	u8 unused_4;
+	u8 unused_5;
+	u8 valid;
+};
+
 /* hwrm_port_qstats */
 /* Input (40 bytes) */
 struct hwrm_port_qstats_input {
@@ -1874,10 +1968,10 @@ struct hwrm_port_phy_qcaps_output {
 	__le16 req_type;
 	__le16 seq_id;
 	__le16 resp_len;
-	u8 eee_supported;
-	#define PORT_PHY_QCAPS_RESP_EEE_SUPPORTED		    0x1UL
-	#define PORT_PHY_QCAPS_RESP_RSVD1_MASK			    0xfeUL
-	#define PORT_PHY_QCAPS_RESP_RSVD1_SFT			    1
+	u8 flags;
+	#define PORT_PHY_QCAPS_RESP_FLAGS_EEE_SUPPORTED	    0x1UL
+	#define PORT_PHY_QCAPS_RESP_FLAGS_RSVD1_MASK		    0xfeUL
+	#define PORT_PHY_QCAPS_RESP_FLAGS_RSVD1_SFT		    1
 	u8 unused_0;
 	__le16 supported_speeds_force_mode;
 	#define PORT_PHY_QCAPS_RESP_SUPPORTED_SPEEDS_FORCE_MODE_100MBHD 0x1UL
@@ -3152,6 +3246,95 @@ struct hwrm_queue_cos2bw_cfg_output {
 	u8 valid;
 };
 
+/* hwrm_queue_dscp_qcaps */
+/* Input (24 bytes) */
+struct hwrm_queue_dscp_qcaps_input {
+	__le16 req_type;
+	__le16 cmpl_ring;
+	__le16 seq_id;
+	__le16 target_id;
+	__le64 resp_addr;
+	u8 port_id;
+	u8 unused_0[7];
+};
+
+/* Output (16 bytes) */
+struct hwrm_queue_dscp_qcaps_output {
+	__le16 error_code;
+	__le16 req_type;
+	__le16 seq_id;
+	__le16 resp_len;
+	u8 num_dscp_bits;
+	u8 unused_0;
+	__le16 max_entries;
+	u8 unused_1;
+	u8 unused_2;
+	u8 unused_3;
+	u8 valid;
+};
+
+/* hwrm_queue_dscp2pri_qcfg */
+/* Input (32 bytes) */
+struct hwrm_queue_dscp2pri_qcfg_input {
+	__le16 req_type;
+	__le16 cmpl_ring;
+	__le16 seq_id;
+	__le16 target_id;
+	__le64 resp_addr;
+	__le64 dest_data_addr;
+	u8 port_id;
+	u8 unused_0;
+	__le16 dest_data_buffer_size;
+	__le32 unused_1;
+};
+
+/* Output (16 bytes) */
+struct hwrm_queue_dscp2pri_qcfg_output {
+	__le16 error_code;
+	__le16 req_type;
+	__le16 seq_id;
+	__le16 resp_len;
+	__le16 entry_cnt;
+	u8 default_pri;
+	u8 unused_0;
+	u8 unused_1;
+	u8 unused_2;
+	u8 unused_3;
+	u8 valid;
+};
+
+/* hwrm_queue_dscp2pri_cfg */
+/* Input (40 bytes) */
+struct hwrm_queue_dscp2pri_cfg_input {
+	__le16 req_type;
+	__le16 cmpl_ring;
+	__le16 seq_id;
+	__le16 target_id;
+	__le64 resp_addr;
+	__le64 src_data_addr;
+	__le32 flags;
+	#define QUEUE_DSCP2PRI_CFG_REQ_FLAGS_USE_HW_DEFAULT_PRI    0x1UL
+	__le32 enables;
+	#define QUEUE_DSCP2PRI_CFG_REQ_ENABLES_DEFAULT_PRI	    0x1UL
+	u8 port_id;
+	u8 default_pri;
+	__le16 entry_cnt;
+	__le32 unused_0;
+};
+
+/* Output (16 bytes) */
+struct hwrm_queue_dscp2pri_cfg_output {
+	__le16 error_code;
+	__le16 req_type;
+	__le16 seq_id;
+	__le16 resp_len;
+	__le32 unused_0;
+	u8 unused_1;
+	u8 unused_2;
+	u8 unused_3;
+	u8 valid;
+};
+
 /* hwrm_vnic_alloc */
 /* Input (24 bytes) */
 struct hwrm_vnic_alloc_input {
@@ -4038,7 +4221,7 @@ struct hwrm_cfa_encap_record_alloc_input {
 	#define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_IPGRE       0x8UL
 	u8 unused_0;
 	__le16 unused_1;
-	__le32 encap_data[16];
+	__le32 encap_data[20];
 };
 
 /* Output (16 bytes) */
@@ -4120,8 +4303,8 @@ struct hwrm_cfa_ntuple_filter_alloc_input {
 	#define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_ADDR_TYPE_IPV6     0x6UL
 	u8 ip_protocol;
 	#define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_PROTOCOL_UNKNOWN   0x0UL
-	#define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_PROTOCOL_UDP       0x6UL
-	#define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_PROTOCOL_TCP       0x11UL
+	#define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_PROTOCOL_TCP       0x6UL
+	#define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_PROTOCOL_UDP       0x11UL
 	__le16 dst_id;
 	__le16 mirror_vnic_id;
 	u8 tunnel_type;
@@ -4224,6 +4407,58 @@ struct hwrm_cfa_ntuple_filter_cfg_output {
 	u8 valid;
 };
 
+/* hwrm_cfa_vfr_alloc */
+/* Input (32 bytes) */
+struct hwrm_cfa_vfr_alloc_input {
+	__le16 req_type;
+	__le16 cmpl_ring;
+	__le16 seq_id;
+	__le16 target_id;
+	__le64 resp_addr;
+	__le16 vf_id;
+	__le16 reserved;
+	__le32 unused_0;
+	char vfr_name[32];
+};
+
+/* Output (16 bytes) */
+struct hwrm_cfa_vfr_alloc_output {
+	__le16 error_code;
+	__le16 req_type;
+	__le16 seq_id;
+	__le16 resp_len;
+	__le16 rx_cfa_code;
+	__le16 tx_cfa_action;
+	u8 unused_0;
+	u8 unused_1;
+	u8 unused_2;
+	u8 valid;
+};
+
+/* hwrm_cfa_vfr_free */
+/* Input (24 bytes) */
+struct hwrm_cfa_vfr_free_input {
+	__le16 req_type;
+	__le16 cmpl_ring;
+	__le16 seq_id;
+	__le16 target_id;
+	__le64 resp_addr;
+	char vfr_name[32];
+};
+
+/* Output (16 bytes) */
+struct hwrm_cfa_vfr_free_output {
+	__le16 error_code;
+	__le16 req_type;
+	__le16 seq_id;
+	__le16 resp_len;
+	__le32 unused_0;
+	u8 unused_1;
+	u8 unused_2;
+	u8 unused_3;
+	u8 valid;
+};
+
 /* hwrm_tunnel_dst_port_query */
 /* Input (24 bytes) */
 struct hwrm_tunnel_dst_port_query_input {
@@ -4448,12 +4683,13 @@ struct hwrm_fw_reset_input {
 	#define FW_RESET_REQ_EMBEDDED_PROC_TYPE_MGMT		   0x1UL
 	#define FW_RESET_REQ_EMBEDDED_PROC_TYPE_NETCTRL	   0x2UL
 	#define FW_RESET_REQ_EMBEDDED_PROC_TYPE_ROCE		   0x3UL
-	#define FW_RESET_REQ_EMBEDDED_PROC_TYPE_RSVD		   0x4UL
+	#define FW_RESET_REQ_EMBEDDED_PROC_TYPE_HOST		   0x4UL
 	u8 selfrst_status;
 	#define FW_RESET_REQ_SELFRST_STATUS_SELFRSTNONE	   0x0UL
 	#define FW_RESET_REQ_SELFRST_STATUS_SELFRSTASAP	   0x1UL
 	#define FW_RESET_REQ_SELFRST_STATUS_SELFRSTPCIERST	   0x2UL
-	__le16 unused_0[3];
+	u8 host_idx;
+	u8 unused_0[5];
 };
 
 /* Output (16 bytes) */
@@ -4487,7 +4723,7 @@ struct hwrm_fw_qstatus_input {
 	#define FW_QSTATUS_REQ_EMBEDDED_PROC_TYPE_MGMT		   0x1UL
 	#define FW_QSTATUS_REQ_EMBEDDED_PROC_TYPE_NETCTRL	   0x2UL
 	#define FW_QSTATUS_REQ_EMBEDDED_PROC_TYPE_ROCE		   0x3UL
-	#define FW_QSTATUS_REQ_EMBEDDED_PROC_TYPE_RSVD		   0x4UL
+	#define FW_QSTATUS_REQ_EMBEDDED_PROC_TYPE_HOST		   0x4UL
 	u8 unused_0[7];
 };
 
@@ -4572,6 +4808,16 @@ struct hwrm_fw_set_structured_data_output {
 	u8 valid;
 };
 
+/* Command specific Error Codes (8 bytes) */
+struct hwrm_fw_set_structured_data_cmd_err {
+	u8 code;
+	#define FW_SET_STRUCTURED_DATA_CMD_ERR_CODE_UNKNOWN       0x0UL
+	#define FW_SET_STRUCTURED_DATA_CMD_ERR_CODE_BAD_HDR_CNT   0x1UL
+	#define FW_SET_STRUCTURED_DATA_CMD_ERR_CODE_BAD_FMT       0x2UL
+	#define FW_SET_STRUCTURED_DATA_CMD_ERR_CODE_BAD_ID	   0x3UL
+	u8 unused_0[7];
+};
+
 /* hwrm_fw_get_structured_data */
 /* Input (32 bytes) */
 struct hwrm_fw_get_structured_data_input {
@@ -4611,6 +4857,14 @@ struct hwrm_fw_get_structured_data_output {
 	u8 valid;
 };
 
+/* Command specific Error Codes (8 bytes) */
+struct hwrm_fw_get_structured_data_cmd_err {
+	u8 code;
+	#define FW_GET_STRUCTURED_DATA_CMD_ERR_CODE_UNKNOWN       0x0UL
+	#define FW_GET_STRUCTURED_DATA_CMD_ERR_CODE_BAD_ID	   0x3UL
+	u8 unused_0[7];
+};
+
 /* hwrm_exec_fwd_resp */
 /* Input (128 bytes) */
 struct hwrm_exec_fwd_resp_input {
@@ -5411,7 +5665,7 @@ struct cmd_nums {
 	#define HWRM_PORT_LPBK_CLR_STATS			   (0x26UL)
 	#define HWRM_PORT_PHY_QCFG				   (0x27UL)
 	#define HWRM_PORT_MAC_QCFG				   (0x28UL)
-	#define RESERVED7					   (0x29UL)
+	#define HWRM_PORT_MAC_PTP_QCFG				   (0x29UL)
 	#define HWRM_PORT_PHY_QCAPS				   (0x2aUL)
 	#define HWRM_PORT_PHY_I2C_WRITE			   (0x2bUL)
 	#define HWRM_PORT_PHY_I2C_READ				   (0x2cUL)
@@ -5421,14 +5675,17 @@ struct cmd_nums {
 	#define HWRM_QUEUE_QPORTCFG				   (0x30UL)
 	#define HWRM_QUEUE_QCFG				   (0x31UL)
 	#define HWRM_QUEUE_CFG					   (0x32UL)
-	#define RESERVED2					   (0x33UL)
-	#define RESERVED3					   (0x34UL)
+	#define HWRM_FUNC_VLAN_CFG				   (0x33UL)
+	#define HWRM_FUNC_VLAN_QCFG				   (0x34UL)
 	#define HWRM_QUEUE_PFCENABLE_QCFG			   (0x35UL)
 	#define HWRM_QUEUE_PFCENABLE_CFG			   (0x36UL)
 	#define HWRM_QUEUE_PRI2COS_QCFG			   (0x37UL)
 	#define HWRM_QUEUE_PRI2COS_CFG				   (0x38UL)
 	#define HWRM_QUEUE_COS2BW_QCFG				   (0x39UL)
 	#define HWRM_QUEUE_COS2BW_CFG				   (0x3aUL)
+	#define HWRM_QUEUE_DSCP_QCAPS				   (0x3bUL)
+	#define HWRM_QUEUE_DSCP2PRI_QCFG			   (0x3cUL)
+	#define HWRM_QUEUE_DSCP2PRI_CFG			   (0x3dUL)
 	#define HWRM_VNIC_ALLOC				   (0x40UL)
 	#define HWRM_VNIC_FREE					   (0x41UL)
 	#define HWRM_VNIC_CFG					   (0x42UL)
@@ -5455,7 +5712,7 @@ struct cmd_nums {
 	#define HWRM_CFA_L2_FILTER_FREE			   (0x91UL)
 	#define HWRM_CFA_L2_FILTER_CFG				   (0x92UL)
 	#define HWRM_CFA_L2_SET_RX_MASK			   (0x93UL)
-	#define RESERVED4					   (0x94UL)
+	#define HWRM_CFA_VLAN_ANTISPOOF_CFG			   (0x94UL)
 	#define HWRM_CFA_TUNNEL_FILTER_ALLOC			   (0x95UL)
 	#define HWRM_CFA_TUNNEL_FILTER_FREE			   (0x96UL)
 	#define HWRM_CFA_ENCAP_RECORD_ALLOC			   (0x97UL)
@@ -5494,6 +5751,8 @@ struct cmd_nums {
 	#define HWRM_CFA_METER_PROFILE_CFG			   (0xf7UL)
 	#define HWRM_CFA_METER_INSTANCE_ALLOC			   (0xf8UL)
 	#define HWRM_CFA_METER_INSTANCE_FREE			   (0xf9UL)
+	#define HWRM_CFA_VFR_ALLOC				   (0xfdUL)
+	#define HWRM_CFA_VFR_FREE				   (0xfeUL)
 	#define HWRM_CFA_VF_PAIR_ALLOC				   (0x100UL)
 	#define HWRM_CFA_VF_PAIR_FREE				   (0x101UL)
 	#define HWRM_CFA_VF_PAIR_INFO				   (0x102UL)
@@ -5502,6 +5761,9 @@ struct cmd_nums {
 	#define HWRM_CFA_FLOW_FLUSH				   (0x105UL)
 	#define HWRM_CFA_FLOW_STATS				   (0x106UL)
 	#define HWRM_CFA_FLOW_INFO				   (0x107UL)
+	#define HWRM_CFA_DECAP_FILTER_ALLOC			   (0x108UL)
+	#define HWRM_CFA_DECAP_FILTER_FREE			   (0x109UL)
+	#define HWRM_CFA_VLAN_ANTISPOOF_QCFG			   (0x10aUL)
 	#define HWRM_SELFTEST_QLIST				   (0x200UL)
 	#define HWRM_SELFTEST_EXEC				   (0x201UL)
 	#define HWRM_SELFTEST_IRQ				   (0x202UL)
@@ -5510,6 +5772,8 @@ struct cmd_nums {
 	#define HWRM_DBG_WRITE_DIRECT				   (0xff12UL)
 	#define HWRM_DBG_WRITE_INDIRECT			   (0xff13UL)
 	#define HWRM_DBG_DUMP					   (0xff14UL)
+	#define HWRM_DBG_ERASE_NVM				   (0xff15UL)
+	#define HWRM_DBG_CFG					   (0xff16UL)
 	#define HWRM_NVM_FACTORY_DEFAULTS			   (0xffeeUL)
 	#define HWRM_NVM_VALIDATE_OPTION			   (0xffefUL)
 	#define HWRM_NVM_FLUSH					   (0xfff0UL)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index b8e7248..fde7256 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -794,8 +794,10 @@ static int bnxt_vf_set_link(struct bnxt *bp, struct bnxt_vf_info *vf)
 					PORT_PHY_QCFG_RESP_LINK_LINK;
 				phy_qcfg_resp.link_speed = cpu_to_le16(
 					PORT_PHY_QCFG_RESP_LINK_SPEED_10GB);
-				phy_qcfg_resp.duplex =
-					PORT_PHY_QCFG_RESP_DUPLEX_FULL;
+				phy_qcfg_resp.duplex_cfg =
+					PORT_PHY_QCFG_RESP_DUPLEX_CFG_FULL;
+				phy_qcfg_resp.duplex_state =
+					PORT_PHY_QCFG_RESP_DUPLEX_STATE_FULL;
 				phy_qcfg_resp.pause =
 					(PORT_PHY_QCFG_RESP_PAUSE_TX |
 					 PORT_PHY_QCFG_RESP_PAUSE_RX);
@@ -804,7 +806,8 @@ static int bnxt_vf_set_link(struct bnxt *bp, struct bnxt_vf_info *vf)
 			/* force link down */
 			phy_qcfg_resp.link = PORT_PHY_QCFG_RESP_LINK_NO_LINK;
 			phy_qcfg_resp.link_speed = 0;
-			phy_qcfg_resp.duplex = PORT_PHY_QCFG_RESP_DUPLEX_HALF;
+			phy_qcfg_resp.duplex_state =
+				PORT_PHY_QCFG_RESP_DUPLEX_STATE_HALF;
 			phy_qcfg_resp.pause = 0;
 		}
 		rc = bnxt_hwrm_fwd_resp(bp, vf, &phy_qcfg_resp,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH net-next 02/10] bnxt_en: Retrieve the hardware bridge mode from the firmware.
  2017-07-24 16:34 [PATCH net-next 00/10] bnxt_en: Updates for net-next Michael Chan
  2017-07-24 16:34 ` [PATCH net-next 01/10] bnxt_en: Update firmware interface spec to 1.8.0 Michael Chan
@ 2017-07-24 16:34 ` Michael Chan
  2017-07-24 16:34 ` [PATCH net-next 03/10] bnxt_en: Implement ndo_bridge_{get|set}link methods Michael Chan
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 38+ messages in thread
From: Michael Chan @ 2017-07-24 16:34 UTC (permalink / raw)
  To: davem; +Cc: netdev

Retrieve and store the hardware bridge mode, so that we can implement
ndo_bridge_{get|set)link methods in the next patch.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 ++++++++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 2103f14..ec8a195 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -33,6 +33,7 @@
 #include <linux/mii.h>
 #include <linux/if.h>
 #include <linux/if_vlan.h>
+#include <linux/if_bridge.h>
 #include <linux/rtc.h>
 #include <linux/bpf.h>
 #include <net/ip.h>
@@ -4610,6 +4611,13 @@ static int bnxt_hwrm_func_qcfg(struct bnxt *bp)
 		bp->port_partition_type = resp->port_partition_type;
 		break;
 	}
+	if (bp->hwrm_spec_code < 0x10707 ||
+	    resp->evb_mode == FUNC_QCFG_RESP_EVB_MODE_VEB)
+		bp->br_mode = BRIDGE_MODE_VEB;
+	else if (resp->evb_mode == FUNC_QCFG_RESP_EVB_MODE_VEPA)
+		bp->br_mode = BRIDGE_MODE_VEPA;
+	else
+		bp->br_mode = BRIDGE_MODE_UNDEF;
 
 func_qcfg_exit:
 	mutex_unlock(&bp->hwrm_cmd_lock);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 505691a..15d5a10 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1164,6 +1164,7 @@ struct bnxt {
 	u8			nge_port_cnt;
 	__le16			nge_fw_dst_port_id;
 	u8			port_partition_type;
+	u16			br_mode;
 
 	u16			rx_coal_ticks;
 	u16			rx_coal_ticks_irq;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH net-next 03/10] bnxt_en: Implement ndo_bridge_{get|set}link methods.
  2017-07-24 16:34 [PATCH net-next 00/10] bnxt_en: Updates for net-next Michael Chan
  2017-07-24 16:34 ` [PATCH net-next 01/10] bnxt_en: Update firmware interface spec to 1.8.0 Michael Chan
  2017-07-24 16:34 ` [PATCH net-next 02/10] bnxt_en: Retrieve the hardware bridge mode from the firmware Michael Chan
@ 2017-07-24 16:34 ` Michael Chan
  2017-07-24 16:34 ` [PATCH net-next 04/10] bnxt_en: Add bnxt_get_num_stats() to centrally get the number of ethtool stats Michael Chan
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 38+ messages in thread
From: Michael Chan @ 2017-07-24 16:34 UTC (permalink / raw)
  To: davem; +Cc: netdev

To allow users to set the hardware bridging mode to VEB or VEPA.  Only
single function PF can change the bridging mode.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 67 +++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index ec8a195..4acaeaf 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4919,6 +4919,26 @@ static void bnxt_hwrm_resource_free(struct bnxt *bp, bool close_path,
 	}
 }
 
+static int bnxt_hwrm_set_br_mode(struct bnxt *bp, u16 br_mode)
+{
+	struct hwrm_func_cfg_input req = {0};
+	int rc;
+
+	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_CFG, -1, -1);
+	req.fid = cpu_to_le16(0xffff);
+	req.enables = cpu_to_le32(FUNC_CFG_REQ_ENABLES_EVB_MODE);
+	if (br_mode == BRIDGE_MODE_VEB)
+		req.evb_mode = FUNC_CFG_REQ_EVB_MODE_VEB;
+	else if (br_mode == BRIDGE_MODE_VEPA)
+		req.evb_mode = FUNC_CFG_REQ_EVB_MODE_VEPA;
+	else
+		return -EINVAL;
+	rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+	if (rc)
+		rc = -EIO;
+	return rc;
+}
+
 static int bnxt_setup_vnic(struct bnxt *bp, u16 vnic_id)
 {
 	struct bnxt_vnic_info *vnic = &bp->vnic_info[vnic_id];
@@ -7432,6 +7452,51 @@ static void bnxt_udp_tunnel_del(struct net_device *dev,
 	schedule_work(&bp->sp_task);
 }
 
+static int bnxt_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
+			       struct net_device *dev, u32 filter_mask,
+			       int nlflags)
+{
+	struct bnxt *bp = netdev_priv(dev);
+
+	return ndo_dflt_bridge_getlink(skb, pid, seq, dev, bp->br_mode, 0, 0,
+				       nlflags, filter_mask, NULL);
+}
+
+static int bnxt_bridge_setlink(struct net_device *dev, struct nlmsghdr *nlh,
+			       u16 flags)
+{
+	struct bnxt *bp = netdev_priv(dev);
+	struct nlattr *attr, *br_spec;
+	int rem, rc = 0;
+
+	if (bp->hwrm_spec_code < 0x10708 || !BNXT_SINGLE_PF(bp))
+		return -EOPNOTSUPP;
+
+	br_spec = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg), IFLA_AF_SPEC);
+	if (!br_spec)
+		return -EINVAL;
+
+	nla_for_each_nested(attr, br_spec, rem) {
+		u16 mode;
+
+		if (nla_type(attr) != IFLA_BRIDGE_MODE)
+			continue;
+
+		if (nla_len(attr) < sizeof(mode))
+			return -EINVAL;
+
+		mode = nla_get_u16(attr);
+		if (mode == bp->br_mode)
+			break;
+
+		rc = bnxt_hwrm_set_br_mode(bp, mode);
+		if (!rc)
+			bp->br_mode = mode;
+		break;
+	}
+	return rc;
+}
+
 static const struct net_device_ops bnxt_netdev_ops = {
 	.ndo_open		= bnxt_open,
 	.ndo_start_xmit		= bnxt_start_xmit,
@@ -7463,6 +7528,8 @@ static void bnxt_udp_tunnel_del(struct net_device *dev,
 	.ndo_udp_tunnel_add	= bnxt_udp_tunnel_add,
 	.ndo_udp_tunnel_del	= bnxt_udp_tunnel_del,
 	.ndo_xdp		= bnxt_xdp,
+	.ndo_bridge_getlink	= bnxt_bridge_getlink,
+	.ndo_bridge_setlink	= bnxt_bridge_setlink,
 };
 
 static void bnxt_remove_one(struct pci_dev *pdev)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH net-next 04/10] bnxt_en: Add bnxt_get_num_stats() to centrally get the number of ethtool stats.
  2017-07-24 16:34 [PATCH net-next 00/10] bnxt_en: Updates for net-next Michael Chan
                   ` (2 preceding siblings ...)
  2017-07-24 16:34 ` [PATCH net-next 03/10] bnxt_en: Implement ndo_bridge_{get|set}link methods Michael Chan
@ 2017-07-24 16:34 ` Michael Chan
  2017-07-24 16:34 ` [PATCH net-next 05/10] bnxt_en: Allow the user to set ethtool stats-block-usecs to 0 Michael Chan
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 38+ messages in thread
From: Michael Chan @ 2017-07-24 16:34 UTC (permalink / raw)
  To: davem; +Cc: netdev

Instead of duplicating the logic multiple times.  Also, it is unnecessary
to zero the buffer in .get_ethtool_stats() because it is already zeroed
by the caller.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 25 ++++++++++++-----------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index be6acad..4661b11 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -198,19 +198,23 @@ static int bnxt_set_coalesce(struct net_device *dev,
 
 #define BNXT_NUM_PORT_STATS ARRAY_SIZE(bnxt_port_stats_arr)
 
+static int bnxt_get_num_stats(struct bnxt *bp)
+{
+	int num_stats = BNXT_NUM_STATS * bp->cp_nr_rings;
+
+	if (bp->flags & BNXT_FLAG_PORT_STATS)
+		num_stats += BNXT_NUM_PORT_STATS;
+
+	return num_stats;
+}
+
 static int bnxt_get_sset_count(struct net_device *dev, int sset)
 {
 	struct bnxt *bp = netdev_priv(dev);
 
 	switch (sset) {
-	case ETH_SS_STATS: {
-		int num_stats = BNXT_NUM_STATS * bp->cp_nr_rings;
-
-		if (bp->flags & BNXT_FLAG_PORT_STATS)
-			num_stats += BNXT_NUM_PORT_STATS;
-
-		return num_stats;
-	}
+	case ETH_SS_STATS:
+		return bnxt_get_num_stats(bp);
 	case ETH_SS_TEST:
 		if (!bp->num_tests)
 			return -EOPNOTSUPP;
@@ -225,11 +229,8 @@ static void bnxt_get_ethtool_stats(struct net_device *dev,
 {
 	u32 i, j = 0;
 	struct bnxt *bp = netdev_priv(dev);
-	u32 buf_size = sizeof(struct ctx_hw_stats) * bp->cp_nr_rings;
 	u32 stat_fields = sizeof(struct ctx_hw_stats) / 8;
 
-	memset(buf, 0, buf_size);
-
 	if (!bp->bnapi)
 		return;
 
@@ -835,7 +836,7 @@ static void bnxt_get_drvinfo(struct net_device *dev,
 		strlcpy(info->fw_version, bp->fw_ver_str,
 			sizeof(info->fw_version));
 	strlcpy(info->bus_info, pci_name(bp->pdev), sizeof(info->bus_info));
-	info->n_stats = BNXT_NUM_STATS * bp->cp_nr_rings;
+	info->n_stats = bnxt_get_num_stats(bp);
 	info->testinfo_len = bp->num_tests;
 	/* TODO CHIMP_FW: eeprom dump details */
 	info->eedump_len = 0;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH net-next 05/10] bnxt_en: Allow the user to set ethtool stats-block-usecs to 0.
  2017-07-24 16:34 [PATCH net-next 00/10] bnxt_en: Updates for net-next Michael Chan
                   ` (3 preceding siblings ...)
  2017-07-24 16:34 ` [PATCH net-next 04/10] bnxt_en: Add bnxt_get_num_stats() to centrally get the number of ethtool stats Michael Chan
@ 2017-07-24 16:34 ` Michael Chan
  2017-07-24 16:34 ` [PATCH net-next 06/10] bnxt_en: Report firmware DCBX agent Michael Chan
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 38+ messages in thread
From: Michael Chan @ 2017-07-24 16:34 UTC (permalink / raw)
  To: davem; +Cc: netdev

For debugging purpose, it is sometimes useful to disable periodic
port statistics updates, so that the firmware logs will not be
filled with statistics update messages.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c         | 3 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 8 +++++---
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 4acaeaf..5df9670 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -6843,7 +6843,8 @@ static void bnxt_timer(unsigned long data)
 	if (atomic_read(&bp->intr_sem) != 0)
 		goto bnxt_restart_timer;
 
-	if (bp->link_info.link_up && (bp->flags & BNXT_FLAG_PORT_STATS)) {
+	if (bp->link_info.link_up && (bp->flags & BNXT_FLAG_PORT_STATS) &&
+	    bp->stats_coal_ticks) {
 		set_bit(BNXT_PERIODIC_STATS_SP_EVENT, &bp->sp_event);
 		schedule_work(&bp->sp_task);
 	}
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 4661b11..140e769 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -86,9 +86,11 @@ static int bnxt_set_coalesce(struct net_device *dev,
 	if (bp->stats_coal_ticks != coal->stats_block_coalesce_usecs) {
 		u32 stats_ticks = coal->stats_block_coalesce_usecs;
 
-		stats_ticks = clamp_t(u32, stats_ticks,
-				      BNXT_MIN_STATS_COAL_TICKS,
-				      BNXT_MAX_STATS_COAL_TICKS);
+		/* Allow 0, which means disable. */
+		if (stats_ticks)
+			stats_ticks = clamp_t(u32, stats_ticks,
+					      BNXT_MIN_STATS_COAL_TICKS,
+					      BNXT_MAX_STATS_COAL_TICKS);
 		stats_ticks = rounddown(stats_ticks, BNXT_MIN_STATS_COAL_TICKS);
 		bp->stats_coal_ticks = stats_ticks;
 		update_stats = true;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH net-next 06/10] bnxt_en: Report firmware DCBX agent.
  2017-07-24 16:34 [PATCH net-next 00/10] bnxt_en: Updates for net-next Michael Chan
                   ` (4 preceding siblings ...)
  2017-07-24 16:34 ` [PATCH net-next 05/10] bnxt_en: Allow the user to set ethtool stats-block-usecs to 0 Michael Chan
@ 2017-07-24 16:34 ` Michael Chan
  2017-07-24 16:34 ` [PATCH net-next 07/10] bnxt_en: Set ETS min_bw parameter for older firmware Michael Chan
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 38+ messages in thread
From: Michael Chan @ 2017-07-24 16:34 UTC (permalink / raw)
  To: davem; +Cc: netdev

Report DCB_CAP_DCBX_LLD_MANAGED only if the firmware DCBX agent is enabled
and running for PF or VF.  Otherwise, if both LLDP and DCBX agents are
disabled in firmware, we report DCB_CAP_DCBX_LLD_HOST and allow host
IEEE DCB settings.  This patch refines the current logic in the driver.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 19 ++++++++++---------
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |  1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c | 11 ++++++++---
 3 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 5df9670..95fea26 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4578,6 +4578,7 @@ static int bnxt_hwrm_func_qcfg(struct bnxt *bp)
 {
 	struct hwrm_func_qcfg_input req = {0};
 	struct hwrm_func_qcfg_output *resp = bp->hwrm_cmd_resp_addr;
+	u16 flags;
 	int rc;
 
 	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_QCFG, -1, -1);
@@ -4594,15 +4595,15 @@ static int bnxt_hwrm_func_qcfg(struct bnxt *bp)
 		vf->vlan = le16_to_cpu(resp->vlan) & VLAN_VID_MASK;
 	}
 #endif
-	if (BNXT_PF(bp)) {
-		u16 flags = le16_to_cpu(resp->flags);
-
-		if (flags & (FUNC_QCFG_RESP_FLAGS_FW_DCBX_AGENT_ENABLED |
-			     FUNC_QCFG_RESP_FLAGS_FW_LLDP_AGENT_ENABLED))
-			bp->flags |= BNXT_FLAG_FW_LLDP_AGENT;
-		if (flags & FUNC_QCFG_RESP_FLAGS_MULTI_HOST)
-			bp->flags |= BNXT_FLAG_MULTI_HOST;
-	}
+	flags = le16_to_cpu(resp->flags);
+	if (flags & (FUNC_QCFG_RESP_FLAGS_FW_DCBX_AGENT_ENABLED |
+		     FUNC_QCFG_RESP_FLAGS_FW_LLDP_AGENT_ENABLED)) {
+		bp->flags |= BNXT_FLAG_FW_LLDP_AGENT;
+		if (flags & FUNC_QCFG_RESP_FLAGS_FW_DCBX_AGENT_ENABLED)
+			bp->flags |= BNXT_FLAG_FW_DCBX_AGENT;
+	}
+	if (BNXT_PF(bp) && (flags & FUNC_QCFG_RESP_FLAGS_MULTI_HOST))
+		bp->flags |= BNXT_FLAG_MULTI_HOST;
 
 	switch (resp->port_partition_type) {
 	case FUNC_QCFG_RESP_PORT_PARTITION_TYPE_NPAR1_0:
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 15d5a10..6b781be 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1027,6 +1027,7 @@ struct bnxt {
 	#define BNXT_FLAG_MULTI_HOST	0x100000
 	#define BNXT_FLAG_SHORT_CMD	0x200000
 	#define BNXT_FLAG_DOUBLE_DB	0x400000
+	#define BNXT_FLAG_FW_DCBX_AGENT	0x800000
 	#define BNXT_FLAG_CHIP_NITRO_A0	0x1000000
 
 	#define BNXT_FLAG_ALL_CONFIG_FEATS (BNXT_FLAG_TPA |		\
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
index 5c6dd0c..c014589 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
@@ -549,13 +549,18 @@ static u8 bnxt_dcbnl_setdcbx(struct net_device *dev, u8 mode)
 {
 	struct bnxt *bp = netdev_priv(dev);
 
-	/* only support IEEE */
-	if ((mode & DCB_CAP_DCBX_VER_CEE) || !(mode & DCB_CAP_DCBX_VER_IEEE))
+	/* All firmware DCBX settings are set in NVRAM */
+	if (bp->dcbx_cap & DCB_CAP_DCBX_LLD_MANAGED)
 		return 1;
 
 	if (mode & DCB_CAP_DCBX_HOST) {
 		if (BNXT_VF(bp) || (bp->flags & BNXT_FLAG_FW_LLDP_AGENT))
 			return 1;
+
+		/* only support IEEE */
+		if ((mode & DCB_CAP_DCBX_VER_CEE) ||
+		    !(mode & DCB_CAP_DCBX_VER_IEEE))
+			return 1;
 	}
 
 	if (mode == bp->dcbx_cap)
@@ -584,7 +589,7 @@ void bnxt_dcb_init(struct bnxt *bp)
 	bp->dcbx_cap = DCB_CAP_DCBX_VER_IEEE;
 	if (BNXT_PF(bp) && !(bp->flags & BNXT_FLAG_FW_LLDP_AGENT))
 		bp->dcbx_cap |= DCB_CAP_DCBX_HOST;
-	else
+	else if (bp->flags & BNXT_FLAG_FW_DCBX_AGENT)
 		bp->dcbx_cap |= DCB_CAP_DCBX_LLD_MANAGED;
 	bp->dev->dcbnl_ops = &dcbnl_ops;
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH net-next 07/10] bnxt_en: Set ETS min_bw parameter for older firmware.
  2017-07-24 16:34 [PATCH net-next 00/10] bnxt_en: Updates for net-next Michael Chan
                   ` (5 preceding siblings ...)
  2017-07-24 16:34 ` [PATCH net-next 06/10] bnxt_en: Report firmware DCBX agent Michael Chan
@ 2017-07-24 16:34 ` Michael Chan
  2017-07-24 16:34 ` [PATCH net-next 08/10] bnxt_en: add support to enable VF-representors Michael Chan
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 38+ messages in thread
From: Michael Chan @ 2017-07-24 16:34 UTC (permalink / raw)
  To: davem; +Cc: netdev

In addition to the ETS weight, older firmware also requires the min_bw
parameter to be set for it to work properly.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c | 6 ++++++
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
index c014589..aa1f3a2 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
@@ -93,6 +93,12 @@ static int bnxt_hwrm_queue_cos2bw_cfg(struct bnxt *bp, struct ieee_ets *ets,
 			cos2bw.tsa =
 				QUEUE_COS2BW_QCFG_RESP_QUEUE_ID0_TSA_ASSIGN_ETS;
 			cos2bw.bw_weight = ets->tc_tx_bw[i];
+			/* older firmware requires min_bw to be set to the
+			 * same weight value in percent.
+			 */
+			cos2bw.min_bw =
+				cpu_to_le32((ets->tc_tx_bw[i] * 100) |
+					    BW_VALUE_UNIT_PERCENT1_100);
 		}
 		memcpy(data, &cos2bw.queue_id, sizeof(cos2bw) - 4);
 		if (i == 0) {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h
index ecd0a5e..d2e0af9 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h
@@ -26,6 +26,7 @@ struct bnxt_cos2bw_cfg {
 	u8			queue_id;
 	__le32			min_bw;
 	__le32			max_bw;
+#define BW_VALUE_UNIT_PERCENT1_100		(0x1UL << 29)
 	u8			tsa;
 	u8			pri_lvl;
 	u8			bw_weight;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH net-next 08/10] bnxt_en: add support to enable VF-representors
  2017-07-24 16:34 [PATCH net-next 00/10] bnxt_en: Updates for net-next Michael Chan
                   ` (6 preceding siblings ...)
  2017-07-24 16:34 ` [PATCH net-next 07/10] bnxt_en: Set ETS min_bw parameter for older firmware Michael Chan
@ 2017-07-24 16:34 ` Michael Chan
  2017-07-25  2:28   ` kbuild test robot
  2017-07-24 16:34 ` [PATCH net-next 09/10] bnxt_en: add vf-rep RX/TX and netdev implementation Michael Chan
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 38+ messages in thread
From: Michael Chan @ 2017-07-24 16:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, Sathya Perla

From: Sathya Perla <sathya.perla@broadcom.com>

This patch is a part of a patch-set that introduces support for
VF-reps in the bnxt_en driver. The driver registers eswitch mode
get/set methods with the devlink interface that allow a user to
enable SRIOV switchdev mode. When enabled, the driver registers
a VF-rep netdev object for each VF with the stack. This can
essentially bring the VFs unders the management perview of the
hypervisor and applications such as OVS.

The next patch in the series, adds the RX/TX routines and a slim
netdev implementation for the VF-reps.

Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/Kconfig           |   1 +
 drivers/net/ethernet/broadcom/bnxt/Makefile     |   2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c       |   9 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h       |  32 ++++
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c |   6 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c   | 244 ++++++++++++++++++++++++
 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.h   |  38 ++++
 7 files changed, 330 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
 create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.h

diff --git a/drivers/net/ethernet/broadcom/Kconfig b/drivers/net/ethernet/broadcom/Kconfig
index 9641380..285f8bc 100644
--- a/drivers/net/ethernet/broadcom/Kconfig
+++ b/drivers/net/ethernet/broadcom/Kconfig
@@ -193,6 +193,7 @@ config SYSTEMPORT
 config BNXT
 	tristate "Broadcom NetXtreme-C/E support"
 	depends on PCI
+	depends on MAY_USE_DEVLINK
 	select FW_LOADER
 	select LIBCRC32C
 	---help---
diff --git a/drivers/net/ethernet/broadcom/bnxt/Makefile b/drivers/net/ethernet/broadcom/bnxt/Makefile
index a7ca45b..d141a22 100644
--- a/drivers/net/ethernet/broadcom/bnxt/Makefile
+++ b/drivers/net/ethernet/broadcom/bnxt/Makefile
@@ -1,3 +1,3 @@
 obj-$(CONFIG_BNXT) += bnxt_en.o
 
-bnxt_en-y := bnxt.o bnxt_sriov.o bnxt_ethtool.o bnxt_dcb.o bnxt_ulp.o bnxt_xdp.o
+bnxt_en-y := bnxt.o bnxt_sriov.o bnxt_ethtool.o bnxt_dcb.o bnxt_ulp.o bnxt_xdp.o bnxt_vfr.o
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 95fea26..ebdeeb4 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -57,6 +57,7 @@
 #include "bnxt_ethtool.h"
 #include "bnxt_dcb.h"
 #include "bnxt_xdp.h"
+#include "bnxt_vfr.h"
 
 #define BNXT_TX_TIMEOUT		(5 * HZ)
 
@@ -7539,8 +7540,10 @@ static void bnxt_remove_one(struct pci_dev *pdev)
 	struct net_device *dev = pci_get_drvdata(pdev);
 	struct bnxt *bp = netdev_priv(dev);
 
-	if (BNXT_PF(bp))
+	if (BNXT_PF(bp)) {
 		bnxt_sriov_disable(bp);
+		bnxt_dl_unregister(bp);
+	}
 
 	pci_disable_pcie_error_reporting(pdev);
 	unregister_netdev(dev);
@@ -7843,6 +7846,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 #ifdef CONFIG_BNXT_SRIOV
 	init_waitqueue_head(&bp->sriov_cfg_wait);
+	mutex_init(&bp->sriov_lock);
 #endif
 	bp->gro_func = bnxt_gro_func_5730x;
 	if (BNXT_CHIP_P4_PLUS(bp))
@@ -7934,6 +7938,9 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (rc)
 		goto init_err_clr_int;
 
+	if (BNXT_PF(bp))
+		bnxt_dl_register(bp);
+
 	netdev_info(dev, "%s found at mem %lx, node addr %pM\n",
 		    board_info[ent->driver_data].name,
 		    (long)pci_resource_start(pdev, 0), dev->dev_addr);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 6b781be..a7d5f42 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -19,6 +19,7 @@
 #define DRV_VER_UPD	0
 
 #include <linux/interrupt.h>
+#include <net/devlink.h>
 
 struct tx_bd {
 	__le32 tx_bd_len_flags_type;
@@ -618,6 +619,8 @@ struct bnxt_tpa_info {
 
 #define BNXT_TPA_OUTER_L3_OFF(hdr_info)	\
 	((hdr_info) & 0x1ff)
+
+	u16			cfa_code; /* cfa_code in TPA start compl */
 };
 
 struct bnxt_rx_ring_info {
@@ -928,6 +931,23 @@ struct bnxt_test_info {
 #define BNXT_CAG_REG_LEGACY_INT_STATUS	0x4014
 #define BNXT_CAG_REG_BASE		0x300000
 
+struct bnxt_vf_rep_stats {
+	u64			packets;
+	u64			bytes;
+	u64			dropped;
+};
+
+struct bnxt_vf_rep {
+	struct bnxt			*bp;
+	struct net_device		*dev;
+	u16				vf_idx;
+	u16				tx_cfa_action;
+	u16				rx_cfa_code;
+
+	struct bnxt_vf_rep_stats	rx_stats;
+	struct bnxt_vf_rep_stats	tx_stats;
+};
+
 struct bnxt {
 	void __iomem		*bar0;
 	void __iomem		*bar1;
@@ -1208,6 +1228,12 @@ struct bnxt {
 	wait_queue_head_t	sriov_cfg_wait;
 	bool			sriov_cfg;
 #define BNXT_SRIOV_CFG_WAIT_TMO	msecs_to_jiffies(10000)
+
+	/* lock to protect VF-rep creation/cleanup via
+	 * multiple paths such as ->sriov_configure() and
+	 * devlink ->eswitch_mode_set()
+	 */
+	struct mutex		sriov_lock;
 #endif
 
 #define BNXT_NTP_FLTR_MAX_FLTR	4096
@@ -1234,6 +1260,12 @@ struct bnxt {
 	struct bnxt_led_info	leds[BNXT_MAX_LED];
 
 	struct bpf_prog		*xdp_prog;
+
+	/* devlink interface and vf-rep structs */
+	struct devlink		*dl;
+	enum devlink_eswitch_mode eswitch_mode;
+	struct bnxt_vf_rep	**vf_reps; /* array of vf-rep ptrs */
+	u16			*cfa_code_map; /* cfa_code -> vf_idx map */
 };
 
 #define BNXT_RX_STATS_OFFSET(counter)			\
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index fde7256..d37925a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -18,6 +18,7 @@
 #include "bnxt.h"
 #include "bnxt_ulp.h"
 #include "bnxt_sriov.h"
+#include "bnxt_vfr.h"
 #include "bnxt_ethtool.h"
 
 #ifdef CONFIG_BNXT_SRIOV
@@ -587,6 +588,10 @@ void bnxt_sriov_disable(struct bnxt *bp)
 	if (!num_vfs)
 		return;
 
+	/* synchronize VF and VF-rep create and destroy */
+	mutex_lock(&bp->sriov_lock);
+	bnxt_vf_reps_destroy(bp);
+
 	if (pci_vfs_assigned(bp->pdev)) {
 		bnxt_hwrm_fwd_async_event_cmpl(
 			bp, NULL, ASYNC_EVENT_CMPL_EVENT_ID_PF_DRVR_UNLOAD);
@@ -597,6 +602,7 @@ void bnxt_sriov_disable(struct bnxt *bp)
 		/* Free the HW resources reserved for various VF's */
 		bnxt_hwrm_func_vf_resource_free(bp, num_vfs);
 	}
+	mutex_unlock(&bp->sriov_lock);
 
 	bnxt_free_vf_resources(bp);
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
new file mode 100644
index 0000000..eab358c
--- /dev/null
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
@@ -0,0 +1,244 @@
+/* Broadcom NetXtreme-C/E network driver.
+ *
+ * Copyright (c) 2016-2017 Broadcom Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ */
+#include <linux/pci.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/rtnetlink.h>
+#include <linux/jhash.h>
+
+#include "bnxt_hsi.h"
+#include "bnxt.h"
+#include "bnxt_vfr.h"
+
+#define CFA_HANDLE_INVALID		0xffff
+
+static void __bnxt_vf_reps_destroy(struct bnxt *bp)
+{
+	u16 num_vfs = pci_num_vf(bp->pdev);
+	struct bnxt_vf_rep *vf_rep;
+	int i;
+
+	for (i = 0; i < num_vfs; i++) {
+		vf_rep = bp->vf_reps[i];
+		if (vf_rep) {
+			if (vf_rep->dev) {
+				/* if register_netdev failed, then netdev_ops
+				 * would have been set to NULL
+				 */
+				if (vf_rep->dev->netdev_ops)
+					unregister_netdev(vf_rep->dev);
+				free_netdev(vf_rep->dev);
+			}
+		}
+	}
+
+	kfree(bp->vf_reps);
+	bp->vf_reps = NULL;
+}
+
+void bnxt_vf_reps_destroy(struct bnxt *bp)
+{
+	bool closed = false;
+
+	if (bp->eswitch_mode != DEVLINK_ESWITCH_MODE_SWITCHDEV)
+		return;
+
+	if (!bp->vf_reps)
+		return;
+
+	/* Ensure that parent PF's and VF-reps' RX/TX has been quiesced
+	 * before proceeding with VF-rep cleanup.
+	 */
+	rtnl_lock();
+	if (netif_running(bp->dev)) {
+		bnxt_close_nic(bp, false, false);
+		closed = true;
+	}
+	bp->eswitch_mode = DEVLINK_ESWITCH_MODE_LEGACY;
+
+	if (closed)
+		bnxt_open_nic(bp, false, false);
+	rtnl_unlock();
+
+	/* Need to call vf_reps_destroy() outside of rntl_lock
+	 * as unregister_netdev takes rtnl_lock
+	 */
+	__bnxt_vf_reps_destroy(bp);
+}
+
+/* Use the OUI of the PF's perm addr and report the same mac addr
+ * for the same VF-rep each time
+ */
+static void bnxt_vf_rep_eth_addr_gen(u8 *src_mac, u16 vf_idx, u8 *mac)
+{
+	u32 addr;
+
+	ether_addr_copy(mac, src_mac);
+
+	addr = jhash(src_mac, ETH_ALEN, 0) + vf_idx;
+	mac[3] = (u8)(addr & 0xFF);
+	mac[4] = (u8)((addr >> 8) & 0xFF);
+	mac[5] = (u8)((addr >> 16) & 0xFF);
+}
+
+static void bnxt_vf_rep_netdev_init(struct bnxt *bp, struct bnxt_vf_rep *vf_rep,
+				    struct net_device *dev)
+{
+	struct net_device *pf_dev = bp->dev;
+
+	/* Just inherit all the featues of the parent PF as the VF-R
+	 * uses the RX/TX rings of the parent PF
+	 */
+	dev->hw_features = pf_dev->hw_features;
+	dev->gso_partial_features = pf_dev->gso_partial_features;
+	dev->vlan_features = pf_dev->vlan_features;
+	dev->hw_enc_features = pf_dev->hw_enc_features;
+	dev->features |= pf_dev->features;
+	bnxt_vf_rep_eth_addr_gen(bp->pf.mac_addr, vf_rep->vf_idx,
+				 dev->perm_addr);
+	ether_addr_copy(dev->dev_addr, dev->perm_addr);
+}
+
+static int bnxt_vf_reps_create(struct bnxt *bp)
+{
+	u16 num_vfs = pci_num_vf(bp->pdev);
+	struct bnxt_vf_rep *vf_rep;
+	struct net_device *dev;
+	int rc, i;
+
+	bp->vf_reps = kcalloc(num_vfs, sizeof(vf_rep), GFP_KERNEL);
+	if (!bp->vf_reps)
+		return -ENOMEM;
+
+	for (i = 0; i < num_vfs; i++) {
+		dev = alloc_etherdev(sizeof(*vf_rep));
+		if (!dev) {
+			rc = -ENOMEM;
+			goto err;
+		}
+
+		vf_rep = netdev_priv(dev);
+		bp->vf_reps[i] = vf_rep;
+		vf_rep->dev = dev;
+		vf_rep->bp = bp;
+		vf_rep->vf_idx = i;
+		vf_rep->tx_cfa_action = CFA_HANDLE_INVALID;
+
+		bnxt_vf_rep_netdev_init(bp, vf_rep, dev);
+		rc = register_netdev(dev);
+		if (rc) {
+			/* no need for unregister_netdev in cleanup */
+			dev->netdev_ops = NULL;
+			goto err;
+		}
+	}
+
+	bp->eswitch_mode = DEVLINK_ESWITCH_MODE_SWITCHDEV;
+	return 0;
+
+err:
+	netdev_info(bp->dev, "%s error=%d", __func__, rc);
+	__bnxt_vf_reps_destroy(bp);
+	return rc;
+}
+
+/* Devlink related routines */
+static int bnxt_dl_eswitch_mode_get(struct devlink *devlink, u16 *mode)
+{
+	struct bnxt *bp = bnxt_get_bp_from_dl(devlink);
+
+	*mode = bp->eswitch_mode;
+	return 0;
+}
+
+static int bnxt_dl_eswitch_mode_set(struct devlink *devlink, u16 mode)
+{
+	struct bnxt *bp = bnxt_get_bp_from_dl(devlink);
+	int rc = 0;
+
+	mutex_lock(&bp->sriov_lock);
+	if (bp->eswitch_mode == mode) {
+		netdev_info(bp->dev, "already in %s eswitch mode",
+			    mode == DEVLINK_ESWITCH_MODE_LEGACY ?
+			    "legacy" : "switchdev");
+		rc = -EINVAL;
+		goto done;
+	}
+
+	switch (mode) {
+	case DEVLINK_ESWITCH_MODE_LEGACY:
+		bnxt_vf_reps_destroy(bp);
+		break;
+
+	case DEVLINK_ESWITCH_MODE_SWITCHDEV:
+		if (pci_num_vf(bp->pdev) == 0) {
+			netdev_info(bp->dev,
+				    "Enable VFs before setting swtichdev mode");
+			rc = -EPERM;
+			goto done;
+		}
+		rc = bnxt_vf_reps_create(bp);
+		break;
+
+	default:
+		rc = -EINVAL;
+		goto done;
+	}
+done:
+	mutex_unlock(&bp->sriov_lock);
+	return rc;
+}
+
+static const struct devlink_ops bnxt_dl_ops = {
+	.eswitch_mode_set = bnxt_dl_eswitch_mode_set,
+	.eswitch_mode_get = bnxt_dl_eswitch_mode_get
+};
+
+int bnxt_dl_register(struct bnxt *bp)
+{
+	struct devlink *dl;
+	int rc;
+
+	if (!pci_find_ext_capability(bp->pdev, PCI_EXT_CAP_ID_SRIOV))
+		return 0;
+
+	if (bp->hwrm_spec_code < 0x10800) {
+		netdev_warn(bp->dev, "Firmware does not support SR-IOV E-Switch SWITCHDEV mode.\n");
+		return -ENOTSUPP;
+	}
+
+	dl = devlink_alloc(&bnxt_dl_ops, sizeof(struct bnxt_dl));
+	if (!dl) {
+		netdev_warn(bp->dev, "devlink_alloc failed");
+		return -ENOMEM;
+	}
+
+	bnxt_link_bp_to_dl(dl, bp);
+	bp->eswitch_mode = DEVLINK_ESWITCH_MODE_LEGACY;
+	rc = devlink_register(dl, &bp->pdev->dev);
+	if (rc) {
+		bnxt_link_bp_to_dl(dl, NULL);
+		devlink_free(dl);
+		netdev_warn(bp->dev, "devlink_register failed. rc=%d", rc);
+		return rc;
+	}
+
+	return 0;
+}
+
+void bnxt_dl_unregister(struct bnxt *bp)
+{
+	struct devlink *dl = bp->dl;
+
+	if (!dl)
+		return;
+
+	devlink_unregister(dl);
+	devlink_free(dl);
+}
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.h
new file mode 100644
index 0000000..310c9c5
--- /dev/null
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.h
@@ -0,0 +1,38 @@
+/* Broadcom NetXtreme-C/E network driver.
+ *
+ * Copyright (c) 2016-2017 Broadcom Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ */
+
+#ifndef BNXT_VFR_H
+#define BNXT_VFR_H
+
+#define	MAX_CFA_CODE			65536
+
+/* Struct to hold housekeeping info needed by devlink interface */
+struct bnxt_dl {
+	struct bnxt *bp;	/* back ptr to the controlling dev */
+};
+
+static inline struct bnxt *bnxt_get_bp_from_dl(struct devlink *dl)
+{
+	return ((struct bnxt_dl *)devlink_priv(dl))->bp;
+}
+
+static inline void bnxt_link_bp_to_dl(struct devlink *dl, struct bnxt *bp)
+{
+	struct bnxt_dl *bp_dl = devlink_priv(dl);
+
+	bp_dl->bp = bp;
+	if (bp)
+		bp->dl = dl;
+}
+
+int bnxt_dl_register(struct bnxt *bp);
+void bnxt_dl_unregister(struct bnxt *bp);
+void bnxt_vf_reps_destroy(struct bnxt *bp);
+
+#endif /* BNXT_VFR_H */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH net-next 09/10] bnxt_en: add vf-rep RX/TX and netdev implementation
  2017-07-24 16:34 [PATCH net-next 00/10] bnxt_en: Updates for net-next Michael Chan
                   ` (7 preceding siblings ...)
  2017-07-24 16:34 ` [PATCH net-next 08/10] bnxt_en: add support to enable VF-representors Michael Chan
@ 2017-07-24 16:34 ` Michael Chan
  2017-07-24 16:34 ` [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name Michael Chan
  2017-07-25  0:32 ` [PATCH net-next 00/10] bnxt_en: Updates for net-next David Miller
  10 siblings, 0 replies; 38+ messages in thread
From: Michael Chan @ 2017-07-24 16:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, Sathya Perla

From: Sathya Perla <sathya.perla@broadcom.com>

This patch introduces the RX/TX and a simple netdev implementation
for VF-reps. The VF-reps use the RX/TX rings of the PF. For each VF-rep
the PF driver issues a VFR_ALLOC FW cmd that returns "cfa_code"
and "cfa_action" values. The FW sets up the filter tables in such
a way that VF traffic by default (in absence of other rules)
gets punted to the parent PF. The cfa_code value in the RX-compl
informs the driver of the source VF. For traffic being transmitted
from the VF-rep, the TX BD is tagged with a cfa_action value that
informs the HW to punt it to the corresponding VF.

Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |  68 ++++++--
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |  10 ++
 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c | 218 +++++++++++++++++++++++++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.h |   4 +
 4 files changed, 288 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index ebdeeb4..f262fe6 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -245,6 +245,16 @@ static bool bnxt_vf_pciid(enum board_idx idx)
 	TX_BD_FLAGS_LHINT_2048_AND_LARGER,
 };
 
+static u16 bnxt_xmit_get_cfa_action(struct sk_buff *skb)
+{
+	struct metadata_dst *md_dst = skb_metadata_dst(skb);
+
+	if (!md_dst || md_dst->type != METADATA_HW_PORT_MUX)
+		return 0;
+
+	return md_dst->u.port_info.port_id;
+}
+
 static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct bnxt *bp = netdev_priv(dev);
@@ -289,7 +299,7 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	tx_buf->nr_frags = last_frag;
 
 	vlan_tag_flags = 0;
-	cfa_action = 0;
+	cfa_action = bnxt_xmit_get_cfa_action(skb);
 	if (skb_vlan_tag_present(skb)) {
 		vlan_tag_flags = TX_BD_CFA_META_KEY_VLAN |
 				 skb_vlan_tag_get(skb);
@@ -324,7 +334,8 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			tx_push1->tx_bd_hsize_lflags = 0;
 
 		tx_push1->tx_bd_cfa_meta = cpu_to_le32(vlan_tag_flags);
-		tx_push1->tx_bd_cfa_action = cpu_to_le32(cfa_action);
+		tx_push1->tx_bd_cfa_action =
+			cpu_to_le32(cfa_action << TX_BD_CFA_ACTION_SHIFT);
 
 		end = pdata + length;
 		end = PTR_ALIGN(end, 8) - 1;
@@ -429,7 +440,8 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	txbd->tx_bd_len_flags_type = cpu_to_le32(flags);
 
 	txbd1->tx_bd_cfa_meta = cpu_to_le32(vlan_tag_flags);
-	txbd1->tx_bd_cfa_action = cpu_to_le32(cfa_action);
+	txbd1->tx_bd_cfa_action =
+			cpu_to_le32(cfa_action << TX_BD_CFA_ACTION_SHIFT);
 	for (i = 0; i < last_frag; i++) {
 		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
@@ -1034,7 +1046,10 @@ static void bnxt_tpa_start(struct bnxt *bp, struct bnxt_rx_ring_info *rxr,
 		bnxt_sched_reset(bp, rxr);
 		return;
 	}
-
+	/* Store cfa_code in tpa_info to use in tpa_end
+	 * completion processing.
+	 */
+	tpa_info->cfa_code = TPA_START_CFA_CODE(tpa_start1);
 	prod_rx_buf->data = tpa_info->data;
 	prod_rx_buf->data_ptr = tpa_info->data_ptr;
 
@@ -1269,6 +1284,17 @@ static inline struct sk_buff *bnxt_gro_skb(struct bnxt *bp,
 	return skb;
 }
 
+/* Given the cfa_code of a received packet determine which
+ * netdev (vf-rep or PF) the packet is destined to.
+ */
+static struct net_device *bnxt_get_pkt_dev(struct bnxt *bp, u16 cfa_code)
+{
+	struct net_device *dev = bnxt_get_vf_rep(bp, cfa_code);
+
+	/* if vf-rep dev is NULL, the must belongs to the PF */
+	return dev ? dev : bp->dev;
+}
+
 static inline struct sk_buff *bnxt_tpa_end(struct bnxt *bp,
 					   struct bnxt_napi *bnapi,
 					   u32 *raw_cons,
@@ -1362,7 +1388,9 @@ static inline struct sk_buff *bnxt_tpa_end(struct bnxt *bp,
 			return NULL;
 		}
 	}
-	skb->protocol = eth_type_trans(skb, bp->dev);
+
+	skb->protocol =
+		eth_type_trans(skb, bnxt_get_pkt_dev(bp, tpa_info->cfa_code));
 
 	if (tpa_info->hash_type != PKT_HASH_TYPE_NONE)
 		skb_set_hash(skb, tpa_info->rss_hash, tpa_info->hash_type);
@@ -1389,6 +1417,18 @@ static inline struct sk_buff *bnxt_tpa_end(struct bnxt *bp,
 	return skb;
 }
 
+static void bnxt_deliver_skb(struct bnxt *bp, struct bnxt_napi *bnapi,
+			     struct sk_buff *skb)
+{
+	if (skb->dev != bp->dev) {
+		/* this packet belongs to a vf-rep */
+		bnxt_vf_rep_rx(bp, skb);
+		return;
+	}
+	skb_record_rx_queue(skb, bnapi->index);
+	napi_gro_receive(&bnapi->napi, skb);
+}
+
 /* returns the following:
  * 1       - 1 packet successfully received
  * 0       - successful TPA_START, packet not completed yet
@@ -1405,7 +1445,7 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_napi *bnapi, u32 *raw_cons,
 	struct rx_cmp *rxcmp;
 	struct rx_cmp_ext *rxcmp1;
 	u32 tmp_raw_cons = *raw_cons;
-	u16 cons, prod, cp_cons = RING_CMP(tmp_raw_cons);
+	u16 cfa_code, cons, prod, cp_cons = RING_CMP(tmp_raw_cons);
 	struct bnxt_sw_rx_bd *rx_buf;
 	unsigned int len;
 	u8 *data_ptr, agg_bufs, cmp_type;
@@ -1447,8 +1487,7 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_napi *bnapi, u32 *raw_cons,
 
 		rc = -ENOMEM;
 		if (likely(skb)) {
-			skb_record_rx_queue(skb, bnapi->index);
-			napi_gro_receive(&bnapi->napi, skb);
+			bnxt_deliver_skb(bp, bnapi, skb);
 			rc = 1;
 		}
 		*event |= BNXT_RX_EVENT;
@@ -1537,7 +1576,8 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_napi *bnapi, u32 *raw_cons,
 		skb_set_hash(skb, le32_to_cpu(rxcmp->rx_cmp_rss_hash), type);
 	}
 
-	skb->protocol = eth_type_trans(skb, dev);
+	cfa_code = RX_CMP_CFA_CODE(rxcmp1);
+	skb->protocol = eth_type_trans(skb, bnxt_get_pkt_dev(bp, cfa_code));
 
 	if ((rxcmp1->rx_cmp_flags2 &
 	     cpu_to_le32(RX_CMP_FLAGS2_META_FORMAT_VLAN)) &&
@@ -1562,8 +1602,7 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_napi *bnapi, u32 *raw_cons,
 		}
 	}
 
-	skb_record_rx_queue(skb, bnapi->index);
-	napi_gro_receive(&bnapi->napi, skb);
+	bnxt_deliver_skb(bp, bnapi, skb);
 	rc = 1;
 
 next_rx:
@@ -6246,6 +6285,9 @@ static int __bnxt_open_nic(struct bnxt *bp, bool irq_re_init, bool link_re_init)
 	/* Poll link status and check for SFP+ module status */
 	bnxt_get_port_module_status(bp);
 
+	/* VF-reps may need to be re-opened after the PF is re-opened */
+	if (BNXT_PF(bp))
+		bnxt_vf_reps_open(bp);
 	return 0;
 
 open_err:
@@ -6334,6 +6376,10 @@ int bnxt_close_nic(struct bnxt *bp, bool irq_re_init, bool link_re_init)
 		if (rc)
 			netdev_warn(bp->dev, "timeout waiting for SRIOV config operation to complete!\n");
 	}
+
+	/* Close the VF-reps before closing PF */
+	if (BNXT_PF(bp))
+		bnxt_vf_reps_close(bp);
 #endif
 	/* Change device state to avoid TX queue wake up's */
 	bnxt_tx_disable(bp);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index a7d5f42..63756f0 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -20,6 +20,7 @@
 
 #include <linux/interrupt.h>
 #include <net/devlink.h>
+#include <net/dst_metadata.h>
 
 struct tx_bd {
 	__le32 tx_bd_len_flags_type;
@@ -243,6 +244,10 @@ struct rx_cmp_ext {
 	    ((le32_to_cpu((rxcmp1)->rx_cmp_flags2) &			\
 	     RX_CMP_FLAGS2_T_L4_CS_CALC) >> 3)
 
+#define RX_CMP_CFA_CODE(rxcmpl1)					\
+	((le32_to_cpu((rxcmpl1)->rx_cmp_cfa_code_errors_v2) &		\
+	  RX_CMPL_CFA_CODE_MASK) >> RX_CMPL_CFA_CODE_SFT)
+
 struct rx_agg_cmp {
 	__le32 rx_agg_cmp_len_flags_type;
 	#define RX_AGG_CMP_TYPE					(0x3f << 0)
@@ -312,6 +317,10 @@ struct rx_tpa_start_cmp_ext {
 	__le32 rx_tpa_start_cmp_hdr_info;
 };
 
+#define TPA_START_CFA_CODE(rx_tpa_start)				\
+	((le32_to_cpu((rx_tpa_start)->rx_tpa_start_cmp_cfa_code_v2) &	\
+	 RX_TPA_START_CMP_CFA_CODE) >> RX_TPA_START_CMPL_CFA_CODE_SHIFT)
+
 struct rx_tpa_end_cmp {
 	__le32 rx_tpa_end_cmp_len_flags_type;
 	#define RX_TPA_END_CMP_TYPE				(0x3f << 0)
@@ -940,6 +949,7 @@ struct bnxt_vf_rep_stats {
 struct bnxt_vf_rep {
 	struct bnxt			*bp;
 	struct net_device		*dev;
+	struct metadata_dst		*dst;
 	u16				vf_idx;
 	u16				tx_cfa_action;
 	u16				rx_cfa_code;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
index eab358c..60bdb181 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
@@ -17,6 +17,178 @@
 #include "bnxt_vfr.h"
 
 #define CFA_HANDLE_INVALID		0xffff
+#define VF_IDX_INVALID			0xffff
+
+static int hwrm_cfa_vfr_alloc(struct bnxt *bp, u16 vf_idx,
+			      u16 *tx_cfa_action, u16 *rx_cfa_code)
+{
+	struct hwrm_cfa_vfr_alloc_output *resp = bp->hwrm_cmd_resp_addr;
+	struct hwrm_cfa_vfr_alloc_input req = { 0 };
+	int rc;
+
+	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_CFA_VFR_ALLOC, -1, -1);
+	req.vf_id = cpu_to_le16(vf_idx);
+	sprintf(req.vfr_name, "vfr%d", vf_idx);
+
+	mutex_lock(&bp->hwrm_cmd_lock);
+	rc = _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+	if (!rc) {
+		*tx_cfa_action = le16_to_cpu(resp->tx_cfa_action);
+		*rx_cfa_code = le16_to_cpu(resp->rx_cfa_code);
+		netdev_dbg(bp->dev, "tx_cfa_action=0x%x, rx_cfa_code=0x%x",
+			   *tx_cfa_action, *rx_cfa_code);
+	} else {
+		netdev_info(bp->dev, "%s error rc=%d", __func__, rc);
+	}
+
+	mutex_unlock(&bp->hwrm_cmd_lock);
+	return rc;
+}
+
+static int hwrm_cfa_vfr_free(struct bnxt *bp, u16 vf_idx)
+{
+	struct hwrm_cfa_vfr_free_input req = { 0 };
+	int rc;
+
+	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_CFA_VFR_FREE, -1, -1);
+	sprintf(req.vfr_name, "vfr%d", vf_idx);
+
+	rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+	if (rc)
+		netdev_info(bp->dev, "%s error rc=%d", __func__, rc);
+	return rc;
+}
+
+static int bnxt_vf_rep_open(struct net_device *dev)
+{
+	struct bnxt_vf_rep *vf_rep = netdev_priv(dev);
+	struct bnxt *bp = vf_rep->bp;
+
+	/* Enable link and TX only if the parent PF is open. */
+	if (netif_running(bp->dev)) {
+		netif_carrier_on(dev);
+		netif_tx_start_all_queues(dev);
+	}
+	return 0;
+}
+
+static int bnxt_vf_rep_close(struct net_device *dev)
+{
+	netif_carrier_off(dev);
+	netif_tx_disable(dev);
+
+	return 0;
+}
+
+static netdev_tx_t bnxt_vf_rep_xmit(struct sk_buff *skb,
+				    struct net_device *dev)
+{
+	struct bnxt_vf_rep *vf_rep = netdev_priv(dev);
+	int rc, len = skb->len;
+
+	skb_dst_drop(skb);
+	dst_hold((struct dst_entry *)vf_rep->dst);
+	skb_dst_set(skb, (struct dst_entry *)vf_rep->dst);
+	skb->dev = vf_rep->dst->u.port_info.lower_dev;
+
+	rc = dev_queue_xmit(skb);
+	if (!rc) {
+		vf_rep->tx_stats.packets++;
+		vf_rep->tx_stats.bytes += len;
+	}
+	return rc;
+}
+
+static void
+bnxt_vf_rep_get_stats64(struct net_device *dev,
+			struct rtnl_link_stats64 *stats)
+{
+	struct bnxt_vf_rep *vf_rep = netdev_priv(dev);
+
+	stats->rx_packets = vf_rep->rx_stats.packets;
+	stats->rx_bytes = vf_rep->rx_stats.bytes;
+	stats->tx_packets = vf_rep->tx_stats.packets;
+	stats->tx_bytes = vf_rep->tx_stats.bytes;
+}
+
+struct net_device *bnxt_get_vf_rep(struct bnxt *bp, u16 cfa_code)
+{
+	u16 vf_idx;
+
+	if (cfa_code && bp->cfa_code_map && BNXT_PF(bp)) {
+		vf_idx = bp->cfa_code_map[cfa_code];
+		if (vf_idx != VF_IDX_INVALID)
+			return bp->vf_reps[vf_idx]->dev;
+	}
+	return NULL;
+}
+
+void bnxt_vf_rep_rx(struct bnxt *bp, struct sk_buff *skb)
+{
+	struct bnxt_vf_rep *vf_rep = netdev_priv(skb->dev);
+	struct bnxt_vf_rep_stats *rx_stats;
+
+	rx_stats = &vf_rep->rx_stats;
+	vf_rep->rx_stats.bytes += skb->len;
+	vf_rep->rx_stats.packets++;
+
+	netif_receive_skb(skb);
+}
+
+static void bnxt_vf_rep_get_drvinfo(struct net_device *dev,
+				    struct ethtool_drvinfo *info)
+{
+	strlcpy(info->driver, DRV_MODULE_NAME, sizeof(info->driver));
+	strlcpy(info->version, DRV_MODULE_VERSION, sizeof(info->version));
+}
+
+static const struct ethtool_ops bnxt_vf_rep_ethtool_ops = {
+	.get_drvinfo		= bnxt_vf_rep_get_drvinfo
+};
+
+static const struct net_device_ops bnxt_vf_rep_netdev_ops = {
+	.ndo_open		= bnxt_vf_rep_open,
+	.ndo_stop		= bnxt_vf_rep_close,
+	.ndo_start_xmit		= bnxt_vf_rep_xmit,
+	.ndo_get_stats64	= bnxt_vf_rep_get_stats64
+};
+
+/* Called when the parent PF interface is closed:
+ * As the mode transition from SWITCHDEV to LEGACY
+ * happens under the rtnl_lock() this routine is safe
+ * under the rtnl_lock()
+ */
+void bnxt_vf_reps_close(struct bnxt *bp)
+{
+	struct bnxt_vf_rep *vf_rep;
+	u16 num_vfs, i;
+
+	if (bp->eswitch_mode != DEVLINK_ESWITCH_MODE_SWITCHDEV)
+		return;
+
+	num_vfs = pci_num_vf(bp->pdev);
+	for (i = 0; i < num_vfs; i++) {
+		vf_rep = bp->vf_reps[i];
+		if (netif_running(vf_rep->dev))
+			bnxt_vf_rep_close(vf_rep->dev);
+	}
+}
+
+/* Called when the parent PF interface is opened (re-opened):
+ * As the mode transition from SWITCHDEV to LEGACY
+ * happen under the rtnl_lock() this routine is safe
+ * under the rtnl_lock()
+ */
+void bnxt_vf_reps_open(struct bnxt *bp)
+{
+	int i;
+
+	if (bp->eswitch_mode != DEVLINK_ESWITCH_MODE_SWITCHDEV)
+		return;
+
+	for (i = 0; i < pci_num_vf(bp->pdev); i++)
+		bnxt_vf_rep_open(bp->vf_reps[i]->dev);
+}
 
 static void __bnxt_vf_reps_destroy(struct bnxt *bp)
 {
@@ -27,6 +199,11 @@ static void __bnxt_vf_reps_destroy(struct bnxt *bp)
 	for (i = 0; i < num_vfs; i++) {
 		vf_rep = bp->vf_reps[i];
 		if (vf_rep) {
+			dst_release((struct dst_entry *)vf_rep->dst);
+
+			if (vf_rep->tx_cfa_action != CFA_HANDLE_INVALID)
+				hwrm_cfa_vfr_free(bp, vf_rep->vf_idx);
+
 			if (vf_rep->dev) {
 				/* if register_netdev failed, then netdev_ops
 				 * would have been set to NULL
@@ -60,6 +237,9 @@ void bnxt_vf_reps_destroy(struct bnxt *bp)
 		bnxt_close_nic(bp, false, false);
 		closed = true;
 	}
+	/* un-publish cfa_code_map so that RX path can't see it anymore */
+	kfree(bp->cfa_code_map);
+	bp->cfa_code_map = NULL;
 	bp->eswitch_mode = DEVLINK_ESWITCH_MODE_LEGACY;
 
 	if (closed)
@@ -92,6 +272,8 @@ static void bnxt_vf_rep_netdev_init(struct bnxt *bp, struct bnxt_vf_rep *vf_rep,
 {
 	struct net_device *pf_dev = bp->dev;
 
+	dev->netdev_ops = &bnxt_vf_rep_netdev_ops;
+	dev->ethtool_ops = &bnxt_vf_rep_ethtool_ops;
 	/* Just inherit all the featues of the parent PF as the VF-R
 	 * uses the RX/TX rings of the parent PF
 	 */
@@ -107,7 +289,7 @@ static void bnxt_vf_rep_netdev_init(struct bnxt *bp, struct bnxt_vf_rep *vf_rep,
 
 static int bnxt_vf_reps_create(struct bnxt *bp)
 {
-	u16 num_vfs = pci_num_vf(bp->pdev);
+	u16 *cfa_code_map = NULL, num_vfs = pci_num_vf(bp->pdev);
 	struct bnxt_vf_rep *vf_rep;
 	struct net_device *dev;
 	int rc, i;
@@ -116,6 +298,16 @@ static int bnxt_vf_reps_create(struct bnxt *bp)
 	if (!bp->vf_reps)
 		return -ENOMEM;
 
+	/* storage for cfa_code to vf-idx mapping */
+	cfa_code_map = kmalloc(sizeof(*bp->cfa_code_map) * MAX_CFA_CODE,
+			       GFP_KERNEL);
+	if (!cfa_code_map) {
+		rc = -ENOMEM;
+		goto err;
+	}
+	for (i = 0; i < MAX_CFA_CODE; i++)
+		cfa_code_map[i] = VF_IDX_INVALID;
+
 	for (i = 0; i < num_vfs; i++) {
 		dev = alloc_etherdev(sizeof(*vf_rep));
 		if (!dev) {
@@ -130,6 +322,26 @@ static int bnxt_vf_reps_create(struct bnxt *bp)
 		vf_rep->vf_idx = i;
 		vf_rep->tx_cfa_action = CFA_HANDLE_INVALID;
 
+		/* get cfa handles from FW */
+		rc = hwrm_cfa_vfr_alloc(bp, vf_rep->vf_idx,
+					&vf_rep->tx_cfa_action,
+					&vf_rep->rx_cfa_code);
+		if (rc) {
+			rc = -ENOLINK;
+			goto err;
+		}
+		cfa_code_map[vf_rep->rx_cfa_code] = vf_rep->vf_idx;
+
+		vf_rep->dst = metadata_dst_alloc(0, METADATA_HW_PORT_MUX,
+						 GFP_KERNEL);
+		if (!vf_rep->dst) {
+			rc = -ENOMEM;
+			goto err;
+		}
+		/* only cfa_action is needed to mux a packet while TXing */
+		vf_rep->dst->u.port_info.port_id = vf_rep->tx_cfa_action;
+		vf_rep->dst->u.port_info.lower_dev = bp->dev;
+
 		bnxt_vf_rep_netdev_init(bp, vf_rep, dev);
 		rc = register_netdev(dev);
 		if (rc) {
@@ -139,11 +351,15 @@ static int bnxt_vf_reps_create(struct bnxt *bp)
 		}
 	}
 
+	/* publish cfa_code_map only after all VF-reps have been initialized */
+	bp->cfa_code_map = cfa_code_map;
 	bp->eswitch_mode = DEVLINK_ESWITCH_MODE_SWITCHDEV;
+	netif_keep_dst(bp->dev);
 	return 0;
 
 err:
 	netdev_info(bp->dev, "%s error=%d", __func__, rc);
+	kfree(cfa_code_map);
 	__bnxt_vf_reps_destroy(bp);
 	return rc;
 }
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.h
index 310c9c5..c6cd55a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.h
@@ -34,5 +34,9 @@ static inline void bnxt_link_bp_to_dl(struct devlink *dl, struct bnxt *bp)
 int bnxt_dl_register(struct bnxt *bp);
 void bnxt_dl_unregister(struct bnxt *bp);
 void bnxt_vf_reps_destroy(struct bnxt *bp);
+void bnxt_vf_reps_close(struct bnxt *bp);
+void bnxt_vf_reps_open(struct bnxt *bp);
+void bnxt_vf_rep_rx(struct bnxt *bp, struct sk_buff *skb);
+struct net_device *bnxt_get_vf_rep(struct bnxt *bp, u16 cfa_code);
 
 #endif /* BNXT_VFR_H */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name
  2017-07-24 16:34 [PATCH net-next 00/10] bnxt_en: Updates for net-next Michael Chan
                   ` (8 preceding siblings ...)
  2017-07-24 16:34 ` [PATCH net-next 09/10] bnxt_en: add vf-rep RX/TX and netdev implementation Michael Chan
@ 2017-07-24 16:34 ` Michael Chan
  2017-07-25  2:44   ` kbuild test robot
  2017-07-25  4:45   ` Jakub Kicinski
  2017-07-25  0:32 ` [PATCH net-next 00/10] bnxt_en: Updates for net-next David Miller
  10 siblings, 2 replies; 38+ messages in thread
From: Michael Chan @ 2017-07-24 16:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, Sathya Perla

From: Sathya Perla <sathya.perla@broadcom.com>

This patch adds support for the switchdev_port_attr_get() and
ndo_get_phys_port_name() methods for the PF and the VF-reps.
Using this support a user application can deduce that the PF
(when in the ESWITCH_SWDEV mode) and it's VF-reps form a switch.

Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 57 +++++++++++++++++++++++++++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |  2 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c | 31 ++++++++++++++-
 3 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index f262fe6..82cbe18 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -7546,6 +7546,61 @@ static int bnxt_bridge_setlink(struct net_device *dev, struct nlmsghdr *nlh,
 	return rc;
 }
 
+static int bnxt_get_phys_port_name(struct net_device *dev, char *buf,
+				   size_t len)
+{
+	struct bnxt *bp = netdev_priv(dev);
+	int rc;
+
+	/* The PF and it's VF-reps only support the switchdev framework */
+	if (!BNXT_PF(bp))
+		return -EOPNOTSUPP;
+
+	/* The switch-id that the pf belongs to is exported by
+	 * the switchdev ndo. This name is just to distinguish from the
+	 * vf-rep ports.
+	 */
+	rc = snprintf(buf, len, "pf%d", bp->pf.port_id);
+
+	if (rc >= len)
+		return -EOPNOTSUPP;
+	return 0;
+}
+
+int bnxt_port_attr_get(struct bnxt *bp, struct switchdev_attr *attr)
+{
+	if (bp->eswitch_mode != DEVLINK_ESWITCH_MODE_SWITCHDEV)
+		return -EOPNOTSUPP;
+
+	/* The PF and it's VF-reps only support the switchdev framework */
+	if (!BNXT_PF(bp))
+		return -EOPNOTSUPP;
+
+	switch (attr->id) {
+	case SWITCHDEV_ATTR_ID_PORT_PARENT_ID:
+		/* In SRIOV each PF-pool (PF + child VFs) serves as a
+		 * switching domain, the PF's perm mac-addr can be used
+		 * as the unique parent-id
+		 */
+		attr->u.ppid.id_len = ETH_ALEN;
+		ether_addr_copy(attr->u.ppid.id, bp->pf.mac_addr);
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+	return 0;
+}
+
+static int bnxt_swdev_port_attr_get(struct net_device *dev,
+				    struct switchdev_attr *attr)
+{
+	return bnxt_port_attr_get(netdev_priv(dev), attr);
+}
+
+static const struct switchdev_ops bnxt_switchdev_ops = {
+	.switchdev_port_attr_get	= bnxt_swdev_port_attr_get
+};
+
 static const struct net_device_ops bnxt_netdev_ops = {
 	.ndo_open		= bnxt_open,
 	.ndo_start_xmit		= bnxt_start_xmit,
@@ -7579,6 +7634,7 @@ static int bnxt_bridge_setlink(struct net_device *dev, struct nlmsghdr *nlh,
 	.ndo_xdp		= bnxt_xdp,
 	.ndo_bridge_getlink	= bnxt_bridge_getlink,
 	.ndo_bridge_setlink	= bnxt_bridge_setlink,
+	.ndo_get_phys_port_name = bnxt_get_phys_port_name
 };
 
 static void bnxt_remove_one(struct pci_dev *pdev)
@@ -7838,6 +7894,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	dev->netdev_ops = &bnxt_netdev_ops;
 	dev->watchdog_timeo = BNXT_TX_TIMEOUT;
 	dev->ethtool_ops = &bnxt_ethtool_ops;
+	dev->switchdev_ops = &bnxt_switchdev_ops;
 	pci_set_drvdata(pdev, dev);
 
 	rc = bnxt_alloc_hwrm_resources(bp);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 63756f0..2d84d57 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -21,6 +21,7 @@
 #include <linux/interrupt.h>
 #include <net/devlink.h>
 #include <net/dst_metadata.h>
+#include <net/switchdev.h>
 
 struct tx_bd {
 	__le32 tx_bd_len_flags_type;
@@ -1350,4 +1351,5 @@ int bnxt_reserve_rings(struct bnxt *bp, int tx, int rx, bool sh, int tcs,
 int bnxt_setup_mq_tc(struct net_device *dev, u8 tc);
 int bnxt_get_max_rings(struct bnxt *, int *, int *, bool);
 void bnxt_restore_pf_fw_resources(struct bnxt *bp);
+int bnxt_port_attr_get(struct bnxt *bp, struct switchdev_attr *attr);
 #endif
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
index 60bdb181..83478e9 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
@@ -135,6 +135,18 @@ void bnxt_vf_rep_rx(struct bnxt *bp, struct sk_buff *skb)
 	netif_receive_skb(skb);
 }
 
+static int bnxt_vf_rep_get_phys_port_name(struct net_device *dev, char *buf,
+					  size_t len)
+{
+	struct bnxt_vf_rep *vf_rep = netdev_priv(dev);
+	int rc;
+
+	rc = snprintf(buf, len, "vfr%d", vf_rep->vf_idx);
+	if (rc >= len)
+		return -EOPNOTSUPP;
+	return 0;
+}
+
 static void bnxt_vf_rep_get_drvinfo(struct net_device *dev,
 				    struct ethtool_drvinfo *info)
 {
@@ -142,6 +154,21 @@ static void bnxt_vf_rep_get_drvinfo(struct net_device *dev,
 	strlcpy(info->version, DRV_MODULE_VERSION, sizeof(info->version));
 }
 
+static int bnxt_vf_rep_port_attr_get(struct net_device *dev,
+				     struct switchdev_attr *attr)
+{
+	struct bnxt_vf_rep *vf_rep = netdev_priv(dev);
+
+	/* as only PORT_PARENT_ID is supported currently use common code
+	 * between PF and VF-rep for now.
+	 */
+	return bnxt_port_attr_get(vf_rep->bp, attr);
+}
+
+static const struct switchdev_ops bnxt_vf_rep_switchdev_ops = {
+	.switchdev_port_attr_get	= bnxt_vf_rep_port_attr_get
+};
+
 static const struct ethtool_ops bnxt_vf_rep_ethtool_ops = {
 	.get_drvinfo		= bnxt_vf_rep_get_drvinfo
 };
@@ -150,7 +177,8 @@ static void bnxt_vf_rep_get_drvinfo(struct net_device *dev,
 	.ndo_open		= bnxt_vf_rep_open,
 	.ndo_stop		= bnxt_vf_rep_close,
 	.ndo_start_xmit		= bnxt_vf_rep_xmit,
-	.ndo_get_stats64	= bnxt_vf_rep_get_stats64
+	.ndo_get_stats64	= bnxt_vf_rep_get_stats64,
+	.ndo_get_phys_port_name = bnxt_vf_rep_get_phys_port_name
 };
 
 /* Called when the parent PF interface is closed:
@@ -274,6 +302,7 @@ static void bnxt_vf_rep_netdev_init(struct bnxt *bp, struct bnxt_vf_rep *vf_rep,
 
 	dev->netdev_ops = &bnxt_vf_rep_netdev_ops;
 	dev->ethtool_ops = &bnxt_vf_rep_ethtool_ops;
+	dev->switchdev_ops = &bnxt_vf_rep_switchdev_ops;
 	/* Just inherit all the featues of the parent PF as the VF-R
 	 * uses the RX/TX rings of the parent PF
 	 */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next 00/10] bnxt_en: Updates for net-next.
  2017-07-24 16:34 [PATCH net-next 00/10] bnxt_en: Updates for net-next Michael Chan
                   ` (9 preceding siblings ...)
  2017-07-24 16:34 ` [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name Michael Chan
@ 2017-07-25  0:32 ` David Miller
  10 siblings, 0 replies; 38+ messages in thread
From: David Miller @ 2017-07-25  0:32 UTC (permalink / raw)
  To: michael.chan; +Cc: netdev

From: Michael Chan <michael.chan@broadcom.com>
Date: Mon, 24 Jul 2017 12:34:19 -0400

> This series includes updating the firmware interface, adding
> methods to get and set VEPA/VEB bridge modes, some minor DCBX and ETS
> refinements, and 3 patches from Sathya Perla to implement initial
> VF representors for SRIOV switching.

Series applied, thank you.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next 08/10] bnxt_en: add support to enable VF-representors
  2017-07-24 16:34 ` [PATCH net-next 08/10] bnxt_en: add support to enable VF-representors Michael Chan
@ 2017-07-25  2:28   ` kbuild test robot
  0 siblings, 0 replies; 38+ messages in thread
From: kbuild test robot @ 2017-07-25  2:28 UTC (permalink / raw)
  To: Michael Chan; +Cc: kbuild-all, davem, netdev, Sathya Perla

[-- Attachment #1: Type: text/plain, Size: 7055 bytes --]

Hi Sathya,

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Michael-Chan/bnxt_en-Update-firmware-interface-spec-to-1-8-0/20170725-094835
config: x86_64-randconfig-x016-201730 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   In file included from include/linux/kernfs.h:13:0,
                    from include/linux/sysfs.h:15,
                    from include/linux/kobject.h:21,
                    from include/linux/pci.h:28,
                    from drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c:9:
   drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c: In function 'bnxt_dl_eswitch_mode_set':
>> drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c:165:16: error: 'struct bnxt' has no member named 'sriov_lock'
     mutex_lock(&bp->sriov_lock);
                   ^
   include/linux/mutex.h:164:44: note: in definition of macro 'mutex_lock'
    #define mutex_lock(lock) mutex_lock_nested(lock, 0)
                                               ^~~~
   drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c:194:18: error: 'struct bnxt' has no member named 'sriov_lock'
     mutex_unlock(&bp->sriov_lock);
                     ^~

vim +165 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c

   > 9	#include <linux/pci.h>
    10	#include <linux/netdevice.h>
    11	#include <linux/etherdevice.h>
    12	#include <linux/rtnetlink.h>
    13	#include <linux/jhash.h>
    14	
    15	#include "bnxt_hsi.h"
    16	#include "bnxt.h"
    17	#include "bnxt_vfr.h"
    18	
    19	#define CFA_HANDLE_INVALID		0xffff
    20	
    21	static void __bnxt_vf_reps_destroy(struct bnxt *bp)
    22	{
    23		u16 num_vfs = pci_num_vf(bp->pdev);
    24		struct bnxt_vf_rep *vf_rep;
    25		int i;
    26	
    27		for (i = 0; i < num_vfs; i++) {
    28			vf_rep = bp->vf_reps[i];
    29			if (vf_rep) {
    30				if (vf_rep->dev) {
    31					/* if register_netdev failed, then netdev_ops
    32					 * would have been set to NULL
    33					 */
    34					if (vf_rep->dev->netdev_ops)
    35						unregister_netdev(vf_rep->dev);
    36					free_netdev(vf_rep->dev);
    37				}
    38			}
    39		}
    40	
    41		kfree(bp->vf_reps);
    42		bp->vf_reps = NULL;
    43	}
    44	
    45	void bnxt_vf_reps_destroy(struct bnxt *bp)
    46	{
    47		bool closed = false;
    48	
    49		if (bp->eswitch_mode != DEVLINK_ESWITCH_MODE_SWITCHDEV)
    50			return;
    51	
    52		if (!bp->vf_reps)
    53			return;
    54	
    55		/* Ensure that parent PF's and VF-reps' RX/TX has been quiesced
    56		 * before proceeding with VF-rep cleanup.
    57		 */
    58		rtnl_lock();
    59		if (netif_running(bp->dev)) {
    60			bnxt_close_nic(bp, false, false);
    61			closed = true;
    62		}
    63		bp->eswitch_mode = DEVLINK_ESWITCH_MODE_LEGACY;
    64	
    65		if (closed)
    66			bnxt_open_nic(bp, false, false);
    67		rtnl_unlock();
    68	
    69		/* Need to call vf_reps_destroy() outside of rntl_lock
    70		 * as unregister_netdev takes rtnl_lock
    71		 */
    72		__bnxt_vf_reps_destroy(bp);
    73	}
    74	
    75	/* Use the OUI of the PF's perm addr and report the same mac addr
    76	 * for the same VF-rep each time
    77	 */
    78	static void bnxt_vf_rep_eth_addr_gen(u8 *src_mac, u16 vf_idx, u8 *mac)
    79	{
    80		u32 addr;
    81	
    82		ether_addr_copy(mac, src_mac);
    83	
    84		addr = jhash(src_mac, ETH_ALEN, 0) + vf_idx;
    85		mac[3] = (u8)(addr & 0xFF);
    86		mac[4] = (u8)((addr >> 8) & 0xFF);
    87		mac[5] = (u8)((addr >> 16) & 0xFF);
    88	}
    89	
    90	static void bnxt_vf_rep_netdev_init(struct bnxt *bp, struct bnxt_vf_rep *vf_rep,
    91					    struct net_device *dev)
    92	{
    93		struct net_device *pf_dev = bp->dev;
    94	
    95		/* Just inherit all the featues of the parent PF as the VF-R
    96		 * uses the RX/TX rings of the parent PF
    97		 */
    98		dev->hw_features = pf_dev->hw_features;
    99		dev->gso_partial_features = pf_dev->gso_partial_features;
   100		dev->vlan_features = pf_dev->vlan_features;
   101		dev->hw_enc_features = pf_dev->hw_enc_features;
   102		dev->features |= pf_dev->features;
   103		bnxt_vf_rep_eth_addr_gen(bp->pf.mac_addr, vf_rep->vf_idx,
   104					 dev->perm_addr);
   105		ether_addr_copy(dev->dev_addr, dev->perm_addr);
   106	}
   107	
   108	static int bnxt_vf_reps_create(struct bnxt *bp)
   109	{
   110		u16 num_vfs = pci_num_vf(bp->pdev);
   111		struct bnxt_vf_rep *vf_rep;
   112		struct net_device *dev;
   113		int rc, i;
   114	
   115		bp->vf_reps = kcalloc(num_vfs, sizeof(vf_rep), GFP_KERNEL);
   116		if (!bp->vf_reps)
   117			return -ENOMEM;
   118	
   119		for (i = 0; i < num_vfs; i++) {
   120			dev = alloc_etherdev(sizeof(*vf_rep));
   121			if (!dev) {
   122				rc = -ENOMEM;
   123				goto err;
   124			}
   125	
   126			vf_rep = netdev_priv(dev);
   127			bp->vf_reps[i] = vf_rep;
   128			vf_rep->dev = dev;
   129			vf_rep->bp = bp;
   130			vf_rep->vf_idx = i;
   131			vf_rep->tx_cfa_action = CFA_HANDLE_INVALID;
   132	
   133			bnxt_vf_rep_netdev_init(bp, vf_rep, dev);
   134			rc = register_netdev(dev);
   135			if (rc) {
   136				/* no need for unregister_netdev in cleanup */
   137				dev->netdev_ops = NULL;
   138				goto err;
   139			}
   140		}
   141	
   142		bp->eswitch_mode = DEVLINK_ESWITCH_MODE_SWITCHDEV;
   143		return 0;
   144	
   145	err:
   146		netdev_info(bp->dev, "%s error=%d", __func__, rc);
   147		__bnxt_vf_reps_destroy(bp);
   148		return rc;
   149	}
   150	
   151	/* Devlink related routines */
   152	static int bnxt_dl_eswitch_mode_get(struct devlink *devlink, u16 *mode)
   153	{
   154		struct bnxt *bp = bnxt_get_bp_from_dl(devlink);
   155	
   156		*mode = bp->eswitch_mode;
   157		return 0;
   158	}
   159	
   160	static int bnxt_dl_eswitch_mode_set(struct devlink *devlink, u16 mode)
   161	{
   162		struct bnxt *bp = bnxt_get_bp_from_dl(devlink);
   163		int rc = 0;
   164	
 > 165		mutex_lock(&bp->sriov_lock);
   166		if (bp->eswitch_mode == mode) {
   167			netdev_info(bp->dev, "already in %s eswitch mode",
   168				    mode == DEVLINK_ESWITCH_MODE_LEGACY ?
   169				    "legacy" : "switchdev");
   170			rc = -EINVAL;
   171			goto done;
   172		}
   173	
   174		switch (mode) {
   175		case DEVLINK_ESWITCH_MODE_LEGACY:
   176			bnxt_vf_reps_destroy(bp);
   177			break;
   178	
   179		case DEVLINK_ESWITCH_MODE_SWITCHDEV:
   180			if (pci_num_vf(bp->pdev) == 0) {
   181				netdev_info(bp->dev,
   182					    "Enable VFs before setting swtichdev mode");
   183				rc = -EPERM;
   184				goto done;
   185			}
   186			rc = bnxt_vf_reps_create(bp);
   187			break;
   188	
   189		default:
   190			rc = -EINVAL;
   191			goto done;
   192		}
   193	done:
   194		mutex_unlock(&bp->sriov_lock);
   195		return rc;
   196	}
   197	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 31958 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name
  2017-07-24 16:34 ` [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name Michael Chan
@ 2017-07-25  2:44   ` kbuild test robot
  2017-07-25  4:06     ` Jakub Kicinski
  2017-07-25  4:45   ` Jakub Kicinski
  1 sibling, 1 reply; 38+ messages in thread
From: kbuild test robot @ 2017-07-25  2:44 UTC (permalink / raw)
  To: Michael Chan; +Cc: kbuild-all, davem, netdev, Sathya Perla

[-- Attachment #1: Type: text/plain, Size: 5020 bytes --]

Hi Sathya,

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Michael-Chan/bnxt_en-Update-firmware-interface-spec-to-1-8-0/20170725-094835
config: x86_64-randconfig-x016-201730 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/net/ethernet/broadcom/bnxt/bnxt.c: In function 'bnxt_init_one':
>> drivers/net/ethernet/broadcom/bnxt/bnxt.c:7897:5: error: 'struct net_device' has no member named 'switchdev_ops'; did you mean 'netdev_ops'?
     dev->switchdev_ops = &bnxt_switchdev_ops;
        ^~
--
   drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c: In function 'bnxt_vf_rep_netdev_init':
>> drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c:305:5: error: 'struct net_device' has no member named 'switchdev_ops'; did you mean 'netdev_ops'?
     dev->switchdev_ops = &bnxt_vf_rep_switchdev_ops;
        ^~
   In file included from include/linux/kernfs.h:13:0,
                    from include/linux/sysfs.h:15,
                    from include/linux/kobject.h:21,
                    from include/linux/pci.h:28,
                    from drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c:9:
   drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c: In function 'bnxt_dl_eswitch_mode_set':
   drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c:410:16: error: 'struct bnxt' has no member named 'sriov_lock'
     mutex_lock(&bp->sriov_lock);
                   ^
   include/linux/mutex.h:164:44: note: in definition of macro 'mutex_lock'
    #define mutex_lock(lock) mutex_lock_nested(lock, 0)
                                               ^~~~
   drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c:439:18: error: 'struct bnxt' has no member named 'sriov_lock'
     mutex_unlock(&bp->sriov_lock);
                     ^~

vim +7897 drivers/net/ethernet/broadcom/bnxt/bnxt.c

  7863	
  7864	static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
  7865	{
  7866		static int version_printed;
  7867		struct net_device *dev;
  7868		struct bnxt *bp;
  7869		int rc, max_irqs;
  7870	
  7871		if (pci_is_bridge(pdev))
  7872			return -ENODEV;
  7873	
  7874		if (version_printed++ == 0)
  7875			pr_info("%s", version);
  7876	
  7877		max_irqs = bnxt_get_max_irq(pdev);
  7878		dev = alloc_etherdev_mq(sizeof(*bp), max_irqs);
  7879		if (!dev)
  7880			return -ENOMEM;
  7881	
  7882		bp = netdev_priv(dev);
  7883	
  7884		if (bnxt_vf_pciid(ent->driver_data))
  7885			bp->flags |= BNXT_FLAG_VF;
  7886	
  7887		if (pdev->msix_cap)
  7888			bp->flags |= BNXT_FLAG_MSIX_CAP;
  7889	
  7890		rc = bnxt_init_board(pdev, dev);
  7891		if (rc < 0)
  7892			goto init_err_free;
  7893	
  7894		dev->netdev_ops = &bnxt_netdev_ops;
  7895		dev->watchdog_timeo = BNXT_TX_TIMEOUT;
  7896		dev->ethtool_ops = &bnxt_ethtool_ops;
> 7897		dev->switchdev_ops = &bnxt_switchdev_ops;
  7898		pci_set_drvdata(pdev, dev);
  7899	
  7900		rc = bnxt_alloc_hwrm_resources(bp);
  7901		if (rc)
  7902			goto init_err_pci_clean;
  7903	
  7904		mutex_init(&bp->hwrm_cmd_lock);
  7905		rc = bnxt_hwrm_ver_get(bp);
  7906		if (rc)
  7907			goto init_err_pci_clean;
  7908	
  7909		if (bp->flags & BNXT_FLAG_SHORT_CMD) {
  7910			rc = bnxt_alloc_hwrm_short_cmd_req(bp);
  7911			if (rc)
  7912				goto init_err_pci_clean;
  7913		}
  7914	
  7915		rc = bnxt_hwrm_func_reset(bp);
  7916		if (rc)
  7917			goto init_err_pci_clean;
  7918	
  7919		bnxt_hwrm_fw_set_time(bp);
  7920	
  7921		dev->hw_features = NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SG |
  7922				   NETIF_F_TSO | NETIF_F_TSO6 |
  7923				   NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
  7924				   NETIF_F_GSO_IPXIP4 |
  7925				   NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_GRE_CSUM |
  7926				   NETIF_F_GSO_PARTIAL | NETIF_F_RXHASH |
  7927				   NETIF_F_RXCSUM | NETIF_F_GRO;
  7928	
  7929		if (!BNXT_CHIP_TYPE_NITRO_A0(bp))
  7930			dev->hw_features |= NETIF_F_LRO;
  7931	
  7932		dev->hw_enc_features =
  7933				NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SG |
  7934				NETIF_F_TSO | NETIF_F_TSO6 |
  7935				NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
  7936				NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_GRE_CSUM |
  7937				NETIF_F_GSO_IPXIP4 | NETIF_F_GSO_PARTIAL;
  7938		dev->gso_partial_features = NETIF_F_GSO_UDP_TUNNEL_CSUM |
  7939					    NETIF_F_GSO_GRE_CSUM;
  7940		dev->vlan_features = dev->hw_features | NETIF_F_HIGHDMA;
  7941		dev->hw_features |= NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_CTAG_TX |
  7942				    NETIF_F_HW_VLAN_STAG_RX | NETIF_F_HW_VLAN_STAG_TX;
  7943		dev->features |= dev->hw_features | NETIF_F_HIGHDMA;
  7944		dev->priv_flags |= IFF_UNICAST_FLT;
  7945	
  7946		/* MTU range: 60 - 9500 */
  7947		dev->min_mtu = ETH_ZLEN;
  7948		dev->max_mtu = BNXT_MAX_MTU;
  7949	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 31958 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name
  2017-07-25  2:44   ` kbuild test robot
@ 2017-07-25  4:06     ` Jakub Kicinski
  2017-07-25  4:21       ` David Miller
  0 siblings, 1 reply; 38+ messages in thread
From: Jakub Kicinski @ 2017-07-25  4:06 UTC (permalink / raw)
  To: Michael Chan; +Cc: kbuild-all, davem, netdev, Sathya Perla

On Tue, 25 Jul 2017 10:44:49 +0800, kbuild test robot wrote:
>    drivers/net/ethernet/broadcom/bnxt/bnxt.c: In function 'bnxt_init_one':
> >> drivers/net/ethernet/broadcom/bnxt/bnxt.c:7897:5: error: 'struct net_device' has no member named 'switchdev_ops'; did you mean 'netdev_ops'?  
>      dev->switchdev_ops = &bnxt_switchdev_ops;
>         ^~

Check out SWITCHDEV_SET_OPS() from commit d643a75ac2bc ("net: switchdev:
add SET_SWITCHDEV_OPS helper").

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name
  2017-07-25  4:06     ` Jakub Kicinski
@ 2017-07-25  4:21       ` David Miller
  0 siblings, 0 replies; 38+ messages in thread
From: David Miller @ 2017-07-25  4:21 UTC (permalink / raw)
  To: kubakici; +Cc: michael.chan, kbuild-all, netdev, sathya.perla

From: Jakub Kicinski <kubakici@wp.pl>
Date: Mon, 24 Jul 2017 21:06:17 -0700

> On Tue, 25 Jul 2017 10:44:49 +0800, kbuild test robot wrote:
>>    drivers/net/ethernet/broadcom/bnxt/bnxt.c: In function 'bnxt_init_one':
>> >> drivers/net/ethernet/broadcom/bnxt/bnxt.c:7897:5: error: 'struct net_device' has no member named 'switchdev_ops'; did you mean 'netdev_ops'?  
>>      dev->switchdev_ops = &bnxt_switchdev_ops;
>>         ^~
> 
> Check out SWITCHDEV_SET_OPS() from commit d643a75ac2bc ("net: switchdev:
> add SET_SWITCHDEV_OPS helper").

This was easy enough so I just pushed out:

====================
>From bc88055ab72c0eaa080926c888628b77d2055513 Mon Sep 17 00:00:00 2001
From: "David S. Miller" <davem@davemloft.net>
Date: Mon, 24 Jul 2017 21:20:16 -0700
Subject: [PATCH] bnxt_en: Use SWITCHDEV_SET_OPS().

Suggested by Jakub Kicinski.

Fixes: c124a62ff2dd ("bnxt_en: add support for port_attr_get and and get_phys_port_name")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 82cbe1804821..badbc3550338 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -7894,7 +7894,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	dev->netdev_ops = &bnxt_netdev_ops;
 	dev->watchdog_timeo = BNXT_TX_TIMEOUT;
 	dev->ethtool_ops = &bnxt_ethtool_ops;
-	dev->switchdev_ops = &bnxt_switchdev_ops;
+	SWITCHDEV_SET_OPS(dev, &bnxt_switchdev_ops);
 	pci_set_drvdata(pdev, dev);
 
 	rc = bnxt_alloc_hwrm_resources(bp);
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name
  2017-07-24 16:34 ` [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name Michael Chan
  2017-07-25  2:44   ` kbuild test robot
@ 2017-07-25  4:45   ` Jakub Kicinski
  2017-07-25  5:13     ` [RFC] switchdev: clarify ndo_get_phys_port_name() formats Jakub Kicinski
                       ` (2 more replies)
  1 sibling, 3 replies; 38+ messages in thread
From: Jakub Kicinski @ 2017-07-25  4:45 UTC (permalink / raw)
  To: Michael Chan; +Cc: davem, netdev, Sathya Perla

On Mon, 24 Jul 2017 12:34:29 -0400, Michael Chan wrote:
> From: Sathya Perla <sathya.perla@broadcom.com>
> 
> This patch adds support for the switchdev_port_attr_get() and
> ndo_get_phys_port_name() methods for the PF and the VF-reps.
> Using this support a user application can deduce that the PF
> (when in the ESWITCH_SWDEV mode) and it's VF-reps form a switch.
> 
> Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
> Signed-off-by: Michael Chan <michael.chan@broadcom.com>
> ---
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 57 +++++++++++++++++++++++++++
>  drivers/net/ethernet/broadcom/bnxt/bnxt.h     |  2 +
>  drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c | 31 ++++++++++++++-
>  3 files changed, 89 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index f262fe6..82cbe18 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -7546,6 +7546,61 @@ static int bnxt_bridge_setlink(struct net_device *dev, struct nlmsghdr *nlh,
>  	return rc;
>  }
>  
> +static int bnxt_get_phys_port_name(struct net_device *dev, char *buf,
> +				   size_t len)
> +{
> +	struct bnxt *bp = netdev_priv(dev);
> +	int rc;
> +
> +	/* The PF and it's VF-reps only support the switchdev framework */
> +	if (!BNXT_PF(bp))
> +		return -EOPNOTSUPP;
> +
> +	/* The switch-id that the pf belongs to is exported by
> +	 * the switchdev ndo. This name is just to distinguish from the
> +	 * vf-rep ports.
> +	 */
> +	rc = snprintf(buf, len, "pf%d", bp->pf.port_id);

This is worrisome.  What do you mean by PF?  In my experience, since
for years PFs had one-to-one association with physical ports people use
the terms interchangeably.  This causes a lot of headaches in proper
eswitch modelling.

I'm not sure whether this is a HW limitation or engineers trying to
make things "easy for users" but most VF-representor patchsets we've
seen on netdev don't include PF representors.  NICs have 3 types of
ports - PFs, VFs and external ports/MACs.  For some reason there is a
tendency to keep pretending PF and physical port is the same entity,
which among other things makes it impossible to install VF->PF rules
(as in from VF to the host, not to the network).

Most high performance NIC vendors also have multi-PF NICs, even though
those are not deployed by wider public.  If we keep pretending PF is
the external port, those will get very awkward to model.  This is a bit
of a pet peeve of mine :)

If this bnxt_get_phys_port_name() relates to the external port, please
change your implementation to comply with
Documentation/networking/switchdev.txt, in particular:

> Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y
> is the port name or ID, and Z is the sub-port name or ID.  For example, sw1p1s0
> would be sub-port 0 on port 1 on switch 1.

So for physical ports the convention is p<port_id>, and in case of
breakout p<port_id>s<subport_id>.

> +static int bnxt_vf_rep_get_phys_port_name(struct net_device *dev, char *buf,
> +					  size_t len)
> +{
> +	struct bnxt_vf_rep *vf_rep = netdev_priv(dev);
> +	int rc;
> +
> +	rc = snprintf(buf, len, "vfr%d", vf_rep->vf_idx);

This is even worse.  We already have two naming conventions in the
kernel, mlx5 uses "%d" for legacy reasons.  nfp uses pf%dvf%d for vfs,
which is better for two reasons: (a) it works with multi-PF devices;
(b) naming something representor from switchdev perspective is
pointless, you're not calling the external port netdev a physical port
representor, even though it has the exact same relation to the port as
with VFs (i.e. egressing traffic on the netdev will cause the switch
to send to the given port).

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC] switchdev: clarify ndo_get_phys_port_name() formats
  2017-07-25  4:45   ` Jakub Kicinski
@ 2017-07-25  5:13     ` Jakub Kicinski
  2017-07-25 15:22       ` Andy Gospodarek
  2017-07-26  5:48       ` Jiri Pirko
  2017-07-25  9:55     ` [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name Sathya Perla
  2017-07-26  9:13     ` Or Gerlitz
  2 siblings, 2 replies; 38+ messages in thread
From: Jakub Kicinski @ 2017-07-25  5:13 UTC (permalink / raw)
  To: netdev
  Cc: Jiri Pirko, Or Gerlitz, Michael Chan, Sathya Perla, simon.horman,
	davem, Jakub Kicinski

We are still in position where we can suggest uniform naming
convention for ndo_get_phys_port_name().  switchdev.txt file
already contained a suggestion of how to name external ports.
Since the use of switchdev for SR-IOV NIC's eswitches is growing,
establish a format for ports of those devices as well.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 Documentation/networking/switchdev.txt | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
index 3e7b946dea27..7c4b6025fb4b 100644
--- a/Documentation/networking/switchdev.txt
+++ b/Documentation/networking/switchdev.txt
@@ -119,9 +119,17 @@ into 4 10G ports, resulting in 4 port netdevs, the device can give a unique
 SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \
 	ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}"
 
-Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y
-is the port name or ID, and Z is the sub-port name or ID.  For example, sw1p1s0
-would be sub-port 0 on port 1 on switch 1.
+Suggested formats of the port name returned by ndo_get_phys_port_name are:
+ - pA     for external ports;
+ - pAsB   for split external ports;
+ - pfC    for PF ports (so called PF representors);
+ - pfCvfD for VF ports (so called VF representors).
+Where A is the port name or ID; B is the sub-port name or ID; C is PCIe
+Physical Function name or ID and D is PCIe Virtual Function name or ID.
+
+Suggested naming convention for switches is "swX", where X is the switch name
+or ID, plus the port name.  For example, sw1p1s0 would be sub-port 0 on port 1
+on switch 1.
 
 Port Features
 ^^^^^^^^^^^^^
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name
  2017-07-25  4:45   ` Jakub Kicinski
  2017-07-25  5:13     ` [RFC] switchdev: clarify ndo_get_phys_port_name() formats Jakub Kicinski
@ 2017-07-25  9:55     ` Sathya Perla
  2017-07-25 22:27       ` Jakub Kicinski
  2017-07-26  9:13     ` Or Gerlitz
  2 siblings, 1 reply; 38+ messages in thread
From: Sathya Perla @ 2017-07-25  9:55 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Michael Chan, David Miller, netdev

On Tue, Jul 25, 2017 at 10:15 AM, Jakub Kicinski <kubakici@wp.pl> wrote:
...
>> +static int bnxt_get_phys_port_name(struct net_device *dev, char *buf,
>> +                                size_t len)
>> +{
>> +     struct bnxt *bp = netdev_priv(dev);
>> +     int rc;
>> +
>> +     /* The PF and it's VF-reps only support the switchdev framework */
>> +     if (!BNXT_PF(bp))
>> +             return -EOPNOTSUPP;
>> +
>> +     /* The switch-id that the pf belongs to is exported by
>> +      * the switchdev ndo. This name is just to distinguish from the
>> +      * vf-rep ports.
>> +      */
>> +     rc = snprintf(buf, len, "pf%d", bp->pf.port_id);
>
> This is worrisome.  What do you mean by PF?  In my experience, since
> for years PFs had one-to-one association with physical ports people use
> the terms interchangeably.  This causes a lot of headaches in proper
> eswitch modelling.
>
> I'm not sure whether this is a HW limitation or engineers trying to
> make things "easy for users" but most VF-representor patchsets we've
> seen on netdev don't include PF representors.  NICs have 3 types of
> ports - PFs, VFs and external ports/MACs.  For some reason there is a
> tendency to keep pretending PF and physical port is the same entity,
> which among other things makes it impossible to install VF->PF rules
> (as in from VF to the host, not to the network).
>
> Most high performance NIC vendors also have multi-PF NICs, even though
> those are not deployed by wider public.  If we keep pretending PF is
> the external port, those will get very awkward to model.  This is a bit
> of a pet peeve of mine :)

Jakub, we'll consider implementing a PF-rep as an add-on feature...

>
> If this bnxt_get_phys_port_name() relates to the external port, please
> change your implementation to comply with
> Documentation/networking/switchdev.txt, in particular:
>
>> Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y
>> is the port name or ID, and Z is the sub-port name or ID.  For example, sw1p1s0
>> would be sub-port 0 on port 1 on switch 1.
>
> So for physical ports the convention is p<port_id>, and in case of
> breakout p<port_id>s<subport_id>.

I'm sending a follow-up patch that fixes the naming for both the
external port and the VF-reps as proposed in your RFC ( switchdev:
clarify ndo_get_phys_port_name() formats)

thanks,
-Sathya

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] switchdev: clarify ndo_get_phys_port_name() formats
  2017-07-25  5:13     ` [RFC] switchdev: clarify ndo_get_phys_port_name() formats Jakub Kicinski
@ 2017-07-25 15:22       ` Andy Gospodarek
  2017-07-25 22:26         ` Jakub Kicinski
  2017-07-26  5:48       ` Jiri Pirko
  1 sibling, 1 reply; 38+ messages in thread
From: Andy Gospodarek @ 2017-07-25 15:22 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Jiri Pirko, Or Gerlitz, Michael Chan, Sathya Perla,
	simon.horman, davem

On Mon, Jul 24, 2017 at 10:13:44PM -0700, Jakub Kicinski wrote:
> We are still in position where we can suggest uniform naming
> convention for ndo_get_phys_port_name().  switchdev.txt file
> already contained a suggestion of how to name external ports.
> Since the use of switchdev for SR-IOV NIC's eswitches is growing,
> establish a format for ports of those devices as well.
> 
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>

This is a nice addition and I suspect there could be even more done to
update this file to cover the VF rep usage.

> ---
>  Documentation/networking/switchdev.txt | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
> index 3e7b946dea27..7c4b6025fb4b 100644
> --- a/Documentation/networking/switchdev.txt
> +++ b/Documentation/networking/switchdev.txt
> @@ -119,9 +119,17 @@ into 4 10G ports, resulting in 4 port netdevs, the device can give a unique
>  SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \
>  	ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}"
>  
> -Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y
> -is the port name or ID, and Z is the sub-port name or ID.  For example, sw1p1s0
> -would be sub-port 0 on port 1 on switch 1.
> +Suggested formats of the port name returned by ndo_get_phys_port_name are:
> + - pA     for external ports;
> + - pAsB   for split external ports;
> + - pfC    for PF ports (so called PF representors);
> + - pfCvfD for VF ports (so called VF representors).

I hate to clutter this up, but might be also need to add:

 - pfCsB    for split PF ports (so called PF representors);
 - pfCsBvfD for split VF ports (so called VF representors).

or are we comfortable that these additions to the name for split ports
are implied?

> +Where A is the port name or ID; B is the sub-port name or ID; C is PCIe
> +Physical Function name or ID and D is PCIe Virtual Function name or ID.
> +
> +Suggested naming convention for switches is "swX", where X is the switch name
> +or ID, plus the port name.  For example, sw1p1s0 would be sub-port 0 on port 1
> +on switch 1.
>  
>  Port Features
>  ^^^^^^^^^^^^^

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] switchdev: clarify ndo_get_phys_port_name() formats
  2017-07-25 15:22       ` Andy Gospodarek
@ 2017-07-25 22:26         ` Jakub Kicinski
  2017-07-26  1:48           ` Andy Gospodarek
  0 siblings, 1 reply; 38+ messages in thread
From: Jakub Kicinski @ 2017-07-25 22:26 UTC (permalink / raw)
  To: Andy Gospodarek
  Cc: netdev, Jiri Pirko, Or Gerlitz, Michael Chan, Sathya Perla,
	simon.horman, davem

On Tue, 25 Jul 2017 11:22:41 -0400, Andy Gospodarek wrote:
> On Mon, Jul 24, 2017 at 10:13:44PM -0700, Jakub Kicinski wrote:
> > We are still in position where we can suggest uniform naming
> > convention for ndo_get_phys_port_name().  switchdev.txt file
> > already contained a suggestion of how to name external ports.
> > Since the use of switchdev for SR-IOV NIC's eswitches is growing,
> > establish a format for ports of those devices as well.
> > 
> > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>  
> 
> This is a nice addition and I suspect there could be even more done to
> update this file to cover the VF rep usage.
> 
> > ---
> >  Documentation/networking/switchdev.txt | 14 +++++++++++---
> >  1 file changed, 11 insertions(+), 3 deletions(-)
> > 
> > diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
> > index 3e7b946dea27..7c4b6025fb4b 100644
> > --- a/Documentation/networking/switchdev.txt
> > +++ b/Documentation/networking/switchdev.txt
> > @@ -119,9 +119,17 @@ into 4 10G ports, resulting in 4 port netdevs, the device can give a unique
> >  SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \
> >  	ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}"
> >  
> > -Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y
> > -is the port name or ID, and Z is the sub-port name or ID.  For example, sw1p1s0
> > -would be sub-port 0 on port 1 on switch 1.
> > +Suggested formats of the port name returned by ndo_get_phys_port_name are:
> > + - pA     for external ports;
> > + - pAsB   for split external ports;
> > + - pfC    for PF ports (so called PF representors);
> > + - pfCvfD for VF ports (so called VF representors).  
> 
> I hate to clutter this up, but might be also need to add:
> 
>  - pfCsB    for split PF ports (so called PF representors);
>  - pfCsBvfD for split VF ports (so called VF representors).
> 
> or are we comfortable that these additions to the name for split ports
> are implied?

Hm..  What is a split PF port?  Splits happen on the physical port - see
my rant on the thread this is a reply to ;)  PFs are PCIe functions,
on the opposite side of the eswitch from the wires.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name
  2017-07-25  9:55     ` [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name Sathya Perla
@ 2017-07-25 22:27       ` Jakub Kicinski
  0 siblings, 0 replies; 38+ messages in thread
From: Jakub Kicinski @ 2017-07-25 22:27 UTC (permalink / raw)
  To: Sathya Perla; +Cc: Michael Chan, David Miller, netdev

On Tue, 25 Jul 2017 15:25:19 +0530, Sathya Perla wrote:
> >> Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y
> >> is the port name or ID, and Z is the sub-port name or ID.  For example, sw1p1s0
> >> would be sub-port 0 on port 1 on switch 1.  
> >
> > So for physical ports the convention is p<port_id>, and in case of
> > breakout p<port_id>s<subport_id>.  
> 
> I'm sending a follow-up patch that fixes the naming for both the
> external port and the VF-reps as proposed in your RFC ( switchdev:
> clarify ndo_get_phys_port_name() formats)

Thanks!

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] switchdev: clarify ndo_get_phys_port_name() formats
  2017-07-25 22:26         ` Jakub Kicinski
@ 2017-07-26  1:48           ` Andy Gospodarek
  2017-07-26  2:34             ` Jakub Kicinski
  0 siblings, 1 reply; 38+ messages in thread
From: Andy Gospodarek @ 2017-07-26  1:48 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Jiri Pirko, Or Gerlitz, Michael Chan, Sathya Perla,
	simon.horman, davem

On Tue, Jul 25, 2017 at 03:26:47PM -0700, Jakub Kicinski wrote:
> On Tue, 25 Jul 2017 11:22:41 -0400, Andy Gospodarek wrote:
> > On Mon, Jul 24, 2017 at 10:13:44PM -0700, Jakub Kicinski wrote:
> > > We are still in position where we can suggest uniform naming
> > > convention for ndo_get_phys_port_name().  switchdev.txt file
> > > already contained a suggestion of how to name external ports.
> > > Since the use of switchdev for SR-IOV NIC's eswitches is growing,
> > > establish a format for ports of those devices as well.
> > > 
> > > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>  
> > 
> > This is a nice addition and I suspect there could be even more done to
> > update this file to cover the VF rep usage.
> > 
> > > ---
> > >  Documentation/networking/switchdev.txt | 14 +++++++++++---
> > >  1 file changed, 11 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
> > > index 3e7b946dea27..7c4b6025fb4b 100644
> > > --- a/Documentation/networking/switchdev.txt
> > > +++ b/Documentation/networking/switchdev.txt
> > > @@ -119,9 +119,17 @@ into 4 10G ports, resulting in 4 port netdevs, the device can give a unique
> > >  SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \
> > >  	ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}"
> > >  
> > > -Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y
> > > -is the port name or ID, and Z is the sub-port name or ID.  For example, sw1p1s0
> > > -would be sub-port 0 on port 1 on switch 1.
> > > +Suggested formats of the port name returned by ndo_get_phys_port_name are:
> > > + - pA     for external ports;
> > > + - pAsB   for split external ports;
> > > + - pfC    for PF ports (so called PF representors);
> > > + - pfCvfD for VF ports (so called VF representors).  
> > 
> > I hate to clutter this up, but might be also need to add:
> > 
> >  - pfCsB    for split PF ports (so called PF representors);
> >  - pfCsBvfD for split VF ports (so called VF representors).
> > 
> > or are we comfortable that these additions to the name for split ports
> > are implied?
> 
> Hm..  What is a split PF port?  Splits happen on the physical port - see
> my rant on the thread this is a reply to ;)  PFs are PCIe functions,
> on the opposite side of the eswitch from the wires.

I'm with you that I think there is value in separate netdevs to
represent "PFs, VFs and external ports/MACs" -- particularly for the
use-case you to create rules to control PF<->VF traffic.

So while I'm not saying it is a _great_ idea to support such a thing as
port-splitting of PFs, I suggested this addition as I'm not willing to restrict
such a design/implementation if a vendor or customer desired.  It seemed
useful to provde some guidance on how to name them -- even if we do not
like them.  :-)

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] switchdev: clarify ndo_get_phys_port_name() formats
  2017-07-26  1:48           ` Andy Gospodarek
@ 2017-07-26  2:34             ` Jakub Kicinski
  2017-07-26 17:54               ` Andy Gospodarek
  0 siblings, 1 reply; 38+ messages in thread
From: Jakub Kicinski @ 2017-07-26  2:34 UTC (permalink / raw)
  To: Andy Gospodarek
  Cc: netdev, Jiri Pirko, Or Gerlitz, Michael Chan, Sathya Perla,
	simon.horman, davem

On Tue, 25 Jul 2017 21:48:15 -0400, Andy Gospodarek wrote:
> On Tue, Jul 25, 2017 at 03:26:47PM -0700, Jakub Kicinski wrote:
> > On Tue, 25 Jul 2017 11:22:41 -0400, Andy Gospodarek wrote:  
> > > On Mon, Jul 24, 2017 at 10:13:44PM -0700, Jakub Kicinski wrote:  
> > > > We are still in position where we can suggest uniform naming
> > > > convention for ndo_get_phys_port_name().  switchdev.txt file
> > > > already contained a suggestion of how to name external ports.
> > > > Since the use of switchdev for SR-IOV NIC's eswitches is growing,
> > > > establish a format for ports of those devices as well.
> > > > 
> > > > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>    
> > > 
> > > This is a nice addition and I suspect there could be even more done to
> > > update this file to cover the VF rep usage.
> > >   
> > > > ---
> > > >  Documentation/networking/switchdev.txt | 14 +++++++++++---
> > > >  1 file changed, 11 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
> > > > index 3e7b946dea27..7c4b6025fb4b 100644
> > > > --- a/Documentation/networking/switchdev.txt
> > > > +++ b/Documentation/networking/switchdev.txt
> > > > @@ -119,9 +119,17 @@ into 4 10G ports, resulting in 4 port netdevs, the device can give a unique
> > > >  SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \
> > > >  	ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}"
> > > >  
> > > > -Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y
> > > > -is the port name or ID, and Z is the sub-port name or ID.  For example, sw1p1s0
> > > > -would be sub-port 0 on port 1 on switch 1.
> > > > +Suggested formats of the port name returned by ndo_get_phys_port_name are:
> > > > + - pA     for external ports;
> > > > + - pAsB   for split external ports;
> > > > + - pfC    for PF ports (so called PF representors);
> > > > + - pfCvfD for VF ports (so called VF representors).    
> > > 
> > > I hate to clutter this up, but might be also need to add:
> > > 
> > >  - pfCsB    for split PF ports (so called PF representors);
> > >  - pfCsBvfD for split VF ports (so called VF representors).
> > > 
> > > or are we comfortable that these additions to the name for split ports
> > > are implied?  
> > 
> > Hm..  What is a split PF port?  Splits happen on the physical port - see
> > my rant on the thread this is a reply to ;)  PFs are PCIe functions,
> > on the opposite side of the eswitch from the wires.  
> 
> I'm with you that I think there is value in separate netdevs to
> represent "PFs, VFs and external ports/MACs" -- particularly for the
> use-case you to create rules to control PF<->VF traffic.
> 
> So while I'm not saying it is a _great_ idea to support such a thing as
> port-splitting of PFs, I suggested this addition as I'm not willing to restrict
> such a design/implementation if a vendor or customer desired.  It seemed
> useful to provde some guidance on how to name them -- even if we do not
> like them.  :-)

If I understand you correctly split PF would be a situation where
device has multiple port instances on the PCIe PF side?  IOW switch sees
multiple endpoints on the PF side?  Let me attempt an ASCII diagram :)

                                                                           
                    HOST A             ||          HOST B                  
                                       ||                                  
        PF A       | V | V | V | V     ||       PF B        | V | V | V               
                   | F | F | F | F ... ||                   | F | F | F ...           
 port A0 | port A1 | 0 | 1 | 2 | 3     || port B0 | port B1 | 0 | 1 | 2            
                                       ||                                  
             PCI Express link          ||        PCI Express link          
        \      \      \  |   |   |          |        |      /   /   /         
         \      \      \ |   |   |          |        |     /   /   /          
          \______\______\'   |   |          |        '____/___/___/
                         /---------------------------\                         
                         |<<==========               |
                         |             ==========>>  |
                         |     SR-IOV e-switch       |                         
                         |<<==========               |                         
                         |             ==========>>  |                          
                         \---------------------------/                     
                              |        |         |                          
                              |        |         |                          
                                 ||         ||                               
                          MAC 0  ||  MAC 1  || MAC 2                   
                                 ||         ||                              


Seems to be a valid configuration, perhaps this would actually be of
some use in container workloads, especially if the ports could be
instantiated at runtime in high numbers.  I would be cautious though
with calling the instances splits.  The more different PFs look from
MACs the better IMHO.  Do you actually have that problem today?  Is
there any HW supported upstream which would benefit from this?  Could we
decide on naming when we have an example implementation?  In theory
nothing stops us from splitting VFs the same way.

Another note on PF netdevs, perhaps the most awkward thing about them,
is that they result in two netdevs being visible to the host.  This is
not incorrect, since VFs if unassigned to VMs will end up creating an
"actual" netdev and the switchdev port representor too, but it rubs
some people the wrong way.  Which in turn makes those people try to not
spawn separate netdevs, which is incorrect IMHO, and breaks down e.g.
when the real netdev gets assigned to a namespace.

I'm not sure if this clarifies my thinking, I have, however, seem to
have drawn a moose :)

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] switchdev: clarify ndo_get_phys_port_name() formats
  2017-07-25  5:13     ` [RFC] switchdev: clarify ndo_get_phys_port_name() formats Jakub Kicinski
  2017-07-25 15:22       ` Andy Gospodarek
@ 2017-07-26  5:48       ` Jiri Pirko
  2017-07-26  8:13         ` Jakub Kicinski
  1 sibling, 1 reply; 38+ messages in thread
From: Jiri Pirko @ 2017-07-26  5:48 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Jiri Pirko, Or Gerlitz, Michael Chan, Sathya Perla,
	simon.horman, davem

Tue, Jul 25, 2017 at 07:13:44AM CEST, jakub.kicinski@netronome.com wrote:
>We are still in position where we can suggest uniform naming
>convention for ndo_get_phys_port_name().  switchdev.txt file
>already contained a suggestion of how to name external ports.
>Since the use of switchdev for SR-IOV NIC's eswitches is growing,
>establish a format for ports of those devices as well.
>
>Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
>---
> Documentation/networking/switchdev.txt | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
>diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
>index 3e7b946dea27..7c4b6025fb4b 100644
>--- a/Documentation/networking/switchdev.txt
>+++ b/Documentation/networking/switchdev.txt
>@@ -119,9 +119,17 @@ into 4 10G ports, resulting in 4 port netdevs, the device can give a unique
> SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \
> 	ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}"
> 
>-Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y
>-is the port name or ID, and Z is the sub-port name or ID.  For example, sw1p1s0
>-would be sub-port 0 on port 1 on switch 1.
>+Suggested formats of the port name returned by ndo_get_phys_port_name are:
>+ - pA     for external ports;
>+ - pAsB   for split external ports;
>+ - pfC    for PF ports (so called PF representors);
>+ - pfCvfD for VF ports (so called VF representors).

I think it would make sense if the driver would just fill-up a struct in
the ndo call and core would generate the string.


>+Where A is the port name or ID; B is the sub-port name or ID; C is PCIe
>+Physical Function name or ID and D is PCIe Virtual Function name or ID.
>+
>+Suggested naming convention for switches is "swX", where X is the switch name
>+or ID, plus the port name.  For example, sw1p1s0 would be sub-port 0 on port 1
>+on switch 1.
> 
> Port Features
> ^^^^^^^^^^^^^
>-- 
>2.11.0
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] switchdev: clarify ndo_get_phys_port_name() formats
  2017-07-26  5:48       ` Jiri Pirko
@ 2017-07-26  8:13         ` Jakub Kicinski
  2017-07-26  9:23           ` Or Gerlitz
  0 siblings, 1 reply; 38+ messages in thread
From: Jakub Kicinski @ 2017-07-26  8:13 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, Or Gerlitz, Michael Chan, Sathya Perla, simon.horman, davem

On Wed, 26 Jul 2017 07:48:40 +0200, Jiri Pirko wrote:
> Tue, Jul 25, 2017 at 07:13:44AM CEST, jakub.kicinski@netronome.com wrote:
> >We are still in position where we can suggest uniform naming
> >convention for ndo_get_phys_port_name().  switchdev.txt file
> >already contained a suggestion of how to name external ports.
> >Since the use of switchdev for SR-IOV NIC's eswitches is growing,
> >establish a format for ports of those devices as well.
> >
> >Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> >---
> > Documentation/networking/switchdev.txt | 14 +++++++++++---
> > 1 file changed, 11 insertions(+), 3 deletions(-)
> >
> >diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
> >index 3e7b946dea27..7c4b6025fb4b 100644
> >--- a/Documentation/networking/switchdev.txt
> >+++ b/Documentation/networking/switchdev.txt
> >@@ -119,9 +119,17 @@ into 4 10G ports, resulting in 4 port netdevs, the device can give a unique
> > SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \
> > 	ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}"
> > 
> >-Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y
> >-is the port name or ID, and Z is the sub-port name or ID.  For example, sw1p1s0
> >-would be sub-port 0 on port 1 on switch 1.
> >+Suggested formats of the port name returned by ndo_get_phys_port_name are:
> >+ - pA     for external ports;
> >+ - pAsB   for split external ports;
> >+ - pfC    for PF ports (so called PF representors);
> >+ - pfCvfD for VF ports (so called VF representors).  
> 
> I think it would make sense if the driver would just fill-up a struct in
> the ndo call and core would generate the string.

I do like the idea of core generating the string.  I have a temptation
to try to involve devlink in this somehow, since port id and split info
is already available there.  Perhaps that would only overcomplicate
things.

The other question is: can we remove the option to generate an arbitrary
string completely?  I think Or and mlx5 folks may object since it would
mean mlx5 VF repr names would change.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name
  2017-07-25  4:45   ` Jakub Kicinski
  2017-07-25  5:13     ` [RFC] switchdev: clarify ndo_get_phys_port_name() formats Jakub Kicinski
  2017-07-25  9:55     ` [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name Sathya Perla
@ 2017-07-26  9:13     ` Or Gerlitz
  2 siblings, 0 replies; 38+ messages in thread
From: Or Gerlitz @ 2017-07-26  9:13 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Michael Chan, David Miller, Linux Netdev List, Sathya Perla, Jiri Pirko

On Tue, Jul 25, 2017 at 7:45 AM, Jakub Kicinski <kubakici@wp.pl> wrote:

> This is even worse.  We already have two naming conventions in the
> kernel, mlx5 uses "%d" for legacy reasons.  nfp uses pf%dvf%d for vfs,

To make it clear, in mlx5 this is used only for the offloads/switchdev mode
and not for legacy mode, in the latter mode, a  VF probed to VM doesn't
have host side representation so there's no where to use that.

Or.

> which is better for two reasons: (a) it works with multi-PF devices;
> (b) naming something representor from switchdev perspective is
> pointless, you're not calling the external port netdev a physical port
> representor, even though it has the exact same relation to the port as
> with VFs (i.e. egressing traffic on the netdev will cause the switch
> to send to the given port).

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] switchdev: clarify ndo_get_phys_port_name() formats
  2017-07-26  8:13         ` Jakub Kicinski
@ 2017-07-26  9:23           ` Or Gerlitz
  2017-07-26 20:11             ` Jakub Kicinski
  0 siblings, 1 reply; 38+ messages in thread
From: Or Gerlitz @ 2017-07-26  9:23 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jiri Pirko, Linux Netdev List, Or Gerlitz, Michael Chan,
	Sathya Perla, Simon Horman, David Miller

On Wed, Jul 26, 2017 at 11:13 AM, Jakub Kicinski <kubakici@wp.pl> wrote:
> On Wed, 26 Jul 2017 07:48:40 +0200, Jiri Pirko wrote:

>> I think it would make sense if the driver would just fill-up a struct in
>> the ndo call and core would generate the string.

> I do like the idea of core generating the string.  I have a temptation
> to try to involve devlink in this somehow, since port id and split info
> is already available there.  Perhaps that would only overcomplicate
> things.

> The other question is: can we remove the option to generate an arbitrary
> string completely?  I think Or and mlx5 folks may object since it would
> mean mlx5 VF repr names would change.

What we have today is the representor driver instance setting the VF index as
the of phys port name and we're telling users to have this sort of udev rule:

SUBSYSTEM=="net", ACTION=="add",
ATTR{phys_switch_id}=="<phys_switch_id>", \ ATTR{phys_port_name}!="",
NAME="$PF_NIC$attr{phys_port_name}"

that has affiliation to  the PF where this VF belongs. AFAIK this
model/assumption is now under push to
higher level (open stack), so lets see if/what we want to change here
w.r.t to sriov offloading drivers.

I would opt for an approach where the value returned by the kernel is
the minimal possible and
user-space has flexibility to further orchestrate that with udev
rules. I wasn't fully clear on Jakub's suggestion
which parts must come from the kernel. Do we have any length
limitation for the phys port name?

Or.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] switchdev: clarify ndo_get_phys_port_name() formats
  2017-07-26  2:34             ` Jakub Kicinski
@ 2017-07-26 17:54               ` Andy Gospodarek
  0 siblings, 0 replies; 38+ messages in thread
From: Andy Gospodarek @ 2017-07-26 17:54 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Jiri Pirko, Or Gerlitz, Michael Chan, Sathya Perla,
	simon.horman, davem

On Tue, Jul 25, 2017 at 07:34:47PM -0700, Jakub Kicinski wrote:
> On Tue, 25 Jul 2017 21:48:15 -0400, Andy Gospodarek wrote:
> > On Tue, Jul 25, 2017 at 03:26:47PM -0700, Jakub Kicinski wrote:
> > > On Tue, 25 Jul 2017 11:22:41 -0400, Andy Gospodarek wrote:  
> > > > On Mon, Jul 24, 2017 at 10:13:44PM -0700, Jakub Kicinski wrote:  
> > > > > We are still in position where we can suggest uniform naming
> > > > > convention for ndo_get_phys_port_name().  switchdev.txt file
> > > > > already contained a suggestion of how to name external ports.
> > > > > Since the use of switchdev for SR-IOV NIC's eswitches is growing,
> > > > > establish a format for ports of those devices as well.
> > > > > 
> > > > > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>    
> > > > 
> > > > This is a nice addition and I suspect there could be even more done to
> > > > update this file to cover the VF rep usage.
> > > >   
> > > > > ---
> > > > >  Documentation/networking/switchdev.txt | 14 +++++++++++---
> > > > >  1 file changed, 11 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
> > > > > index 3e7b946dea27..7c4b6025fb4b 100644
> > > > > --- a/Documentation/networking/switchdev.txt
> > > > > +++ b/Documentation/networking/switchdev.txt
> > > > > @@ -119,9 +119,17 @@ into 4 10G ports, resulting in 4 port netdevs, the device can give a unique
> > > > >  SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \
> > > > >  	ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}"
> > > > >  
> > > > > -Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y
> > > > > -is the port name or ID, and Z is the sub-port name or ID.  For example, sw1p1s0
> > > > > -would be sub-port 0 on port 1 on switch 1.
> > > > > +Suggested formats of the port name returned by ndo_get_phys_port_name are:
> > > > > + - pA     for external ports;
> > > > > + - pAsB   for split external ports;
> > > > > + - pfC    for PF ports (so called PF representors);
> > > > > + - pfCvfD for VF ports (so called VF representors).    
> > > > 
> > > > I hate to clutter this up, but might be also need to add:
> > > > 
> > > >  - pfCsB    for split PF ports (so called PF representors);
> > > >  - pfCsBvfD for split VF ports (so called VF representors).
> > > > 
> > > > or are we comfortable that these additions to the name for split ports
> > > > are implied?  
> > > 
> > > Hm..  What is a split PF port?  Splits happen on the physical port - see
> > > my rant on the thread this is a reply to ;)  PFs are PCIe functions,
> > > on the opposite side of the eswitch from the wires.  
> > 
> > I'm with you that I think there is value in separate netdevs to
> > represent "PFs, VFs and external ports/MACs" -- particularly for the
> > use-case you to create rules to control PF<->VF traffic.
> > 
> > So while I'm not saying it is a _great_ idea to support such a thing as
> > port-splitting of PFs, I suggested this addition as I'm not willing to restrict
> > such a design/implementation if a vendor or customer desired.  It seemed
> > useful to provde some guidance on how to name them -- even if we do not
> > like them.  :-)
> 
> If I understand you correctly split PF would be a situation where
> device has multiple port instances on the PCIe PF side?  IOW switch sees
> multiple endpoints on the PF side?  Let me attempt an ASCII diagram :)
> 
>                                                                            
>                     HOST A             ||          HOST B                  
>                                        ||                                  
>         PF A       | V | V | V | V     ||       PF B        | V | V | V               
>                    | F | F | F | F ... ||                   | F | F | F ...           
>  port A0 | port A1 | 0 | 1 | 2 | 3     || port B0 | port B1 | 0 | 1 | 2            
>                                        ||                                  
>              PCI Express link          ||        PCI Express link          
>         \      \      \  |   |   |          |        |      /   /   /         
>          \      \      \ |   |   |          |        |     /   /   /          
>           \______\______\'   |   |          |        '____/___/___/
>                          /---------------------------\                         
>                          |<<==========               |
>                          |             ==========>>  |
>                          |     SR-IOV e-switch       |                         
>                          |<<==========               |                         
>                          |             ==========>>  |                          
>                          \---------------------------/                     
>                               |        |         |                          
>                               |        |         |                          
>                                  ||         ||                               
>                           MAC 0  ||  MAC 1  || MAC 2                   
>                                  ||         ||                              
> 
> 
> Seems to be a valid configuration, perhaps this would actually be of
> some use in container workloads, especially if the ports could be
> instantiated at runtime in high numbers.  I would be cautious though
> with calling the instances splits.  The more different PFs look from
> MACs the better IMHO.  Do you actually have that problem today?
> 
> Is there any HW supported upstream which would benefit from this?  Could we
> decide on naming when we have an example implementation?  In theory
> nothing stops us from splitting VFs the same way.

Not that I know about right now.

> Another note on PF netdevs, perhaps the most awkward thing about them,
> is that they result in two netdevs being visible to the host.  This is
> not incorrect, since VFs if unassigned to VMs will end up creating an
> "actual" netdev and the switchdev port representor too, but it rubs
> some people the wrong way.  Which in turn makes those people try to not
> spawn separate netdevs, which is incorrect IMHO, and breaks down e.g.
> when the real netdev gets assigned to a namespace.

For me, the most awkward part of having a separate netdev for the PF and
the MAC is that is really not how things were thought about in the
nominal switching case (the non e-switch case).

Since idea behind switchdev when it was created was to make sure that
each front-panel port on a switch was represented by a netdev in the
kernel (and the 'CPU interface' was abstracted away by the driver) I was
always a bit uneasy about having a separate netdev allocated for the CPU
port when in the switching case it really wasn't necessary.

> I'm not sure if this clarifies my thinking, I have, however, seem to
> have drawn a moose :)

Which looks great, BTW.  The moose may turn out to be one of the major
benefits from this thread!

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] switchdev: clarify ndo_get_phys_port_name() formats
  2017-07-26  9:23           ` Or Gerlitz
@ 2017-07-26 20:11             ` Jakub Kicinski
  2017-07-27 10:30               ` Or Gerlitz
  0 siblings, 1 reply; 38+ messages in thread
From: Jakub Kicinski @ 2017-07-26 20:11 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Jiri Pirko, Linux Netdev List, Or Gerlitz, Michael Chan,
	Sathya Perla, Simon Horman, David Miller

On Wed, 26 Jul 2017 12:23:17 +0300, Or Gerlitz wrote:
> On Wed, Jul 26, 2017 at 11:13 AM, Jakub Kicinski <kubakici@wp.pl> wrote:
> > On Wed, 26 Jul 2017 07:48:40 +0200, Jiri Pirko wrote:  
> 
> >> I think it would make sense if the driver would just fill-up a struct in
> >> the ndo call and core would generate the string.  
> 
> > I do like the idea of core generating the string.  I have a temptation
> > to try to involve devlink in this somehow, since port id and split info
> > is already available there.  Perhaps that would only overcomplicate
> > things.  
> 
> > The other question is: can we remove the option to generate an arbitrary
> > string completely?  I think Or and mlx5 folks may object since it would
> > mean mlx5 VF repr names would change.  
> 
> What we have today is the representor driver instance setting the VF index as
> the of phys port name and we're telling users to have this sort of udev rule:
> 
> SUBSYSTEM=="net", ACTION=="add",
> ATTR{phys_switch_id}=="<phys_switch_id>", \ ATTR{phys_port_name}!="",
> NAME="$PF_NIC$attr{phys_port_name}"

Example names generated by this rule would be pfnic_0, pfnic_1 for vf
representors 0 and 1?

> that has affiliation to  the PF where this VF belongs. AFAIK this
> model/assumption is now under push to
> higher level (open stack), so lets see if/what we want to change here
> w.r.t to sriov offloading drivers.
> 
> I would opt for an approach where the value returned by the kernel is
> the minimal possible and
> user-space has flexibility to further orchestrate that with udev
> rules. I wasn't fully clear on Jakub's suggestion
> which parts must come from the kernel. Do we have any length
> limitation for the phys port name?

Looks like the limit today is IFNAMSIZ.  I'm in favor of leaving the
flexibility to the userspace, why I suggested adding the pf%d or
pf%dvf%d to the name is that I don't think we have any other way today
of telling whether representor is for a physical port, VF or PF.

If I understand mlx5 code, you're not reporting port ids for physical
ports so presence of the name already implies it's a VF but in case you
want to add port splitting support, for example, reporting the name on
physical ports will become more of a necessity.

If we adopt Jiri's suggestion of returning structured data it will be
very easy to give user space type and indexes separately, but we should
probably still return the string for backwards compatibility.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] switchdev: clarify ndo_get_phys_port_name() formats
  2017-07-26 20:11             ` Jakub Kicinski
@ 2017-07-27 10:30               ` Or Gerlitz
  2017-07-28  2:31                 ` [RFC] switchdev: generate phys_port_name in the core Jakub Kicinski
  0 siblings, 1 reply; 38+ messages in thread
From: Or Gerlitz @ 2017-07-27 10:30 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jiri Pirko, Linux Netdev List, Or Gerlitz, Michael Chan,
	Sathya Perla, Simon Horman, David Miller

On Wed, Jul 26, 2017 at 11:11 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
> On Wed, 26 Jul 2017 12:23:17 +0300, Or Gerlitz wrote:
>> On Wed, Jul 26, 2017 at 11:13 AM, Jakub Kicinski <kubakici@wp.pl> wrote:
>> > On Wed, 26 Jul 2017 07:48:40 +0200, Jiri Pirko wrote:
>>
>> >> I think it would make sense if the driver would just fill-up a struct in
>> >> the ndo call and core would generate the string.
>>
>> > I do like the idea of core generating the string.  I have a temptation
>> > to try to involve devlink in this somehow, since port id and split info
>> > is already available there.  Perhaps that would only overcomplicate
>> > things.
>>
>> > The other question is: can we remove the option to generate an arbitrary
>> > string completely?  I think Or and mlx5 folks may object since it would
>> > mean mlx5 VF repr names would change.
>>
>> What we have today is the representor driver instance setting the VF index as
>> the of phys port name and we're telling users to have this sort of udev rule:
>>
>> SUBSYSTEM=="net", ACTION=="add",
>> ATTR{phys_switch_id}=="<phys_switch_id>", \ ATTR{phys_port_name}!="",
>> NAME="$PF_NIC$attr{phys_port_name}"
>
> Example names generated by this rule would be pfnic_0, pfnic_1 for vf
> representors 0 and 1?

YES

>> that has affiliation to  the PF where this VF belongs. AFAIK this
>> model/assumption is now under push to
>> higher level (open stack), so lets see if/what we want to change here
>> w.r.t to sriov offloading drivers.
>>
>> I would opt for an approach where the value returned by the kernel is
>> the minimal possible and
>> user-space has flexibility to further orchestrate that with udev
>> rules. I wasn't fully clear on Jakub's suggestion
>> which parts must come from the kernel. Do we have any length
>> limitation for the phys port name?
>
> Looks like the limit today is IFNAMSIZ.  I'm in favor of leaving the
> flexibility to the userspace, why I suggested adding the pf%d or
> pf%dvf%d to the name is that I don't think we have any other way today
> of telling whether representor is for a physical port, VF or PF.

> If I understand mlx5 code, you're not reporting port ids for physical
> ports so presence of the name already implies it's a VF but in case you

yes, currently in our model the PF serves as the uplink representor
when running in offloads mode.

> want to add port splitting support, for example, reporting the name on
> physical ports will become more of a necessity.

> If we adopt Jiri's suggestion of returning structured data it will be
> very easy to give user space type and indexes separately, but we should
> probably still return the string for backwards compatibility.

I am not still clear how the structured data would look like

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC] switchdev: generate phys_port_name in the core
  2017-07-27 10:30               ` Or Gerlitz
@ 2017-07-28  2:31                 ` Jakub Kicinski
  2017-07-28  2:37                   ` Jakub Kicinski
                                     ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Jakub Kicinski @ 2017-07-28  2:31 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Jiri Pirko, netdev, Or Gerlitz, Michael Chan, Sathya Perla,
	David Miller, simon.horman, Jakub Kicinski

On Thu, 27 Jul 2017 13:30:44 +0300, Or Gerlitz wrote:
> > want to add port splitting support, for example, reporting the name on
> > physical ports will become more of a necessity.  
> 
> > If we adopt Jiri's suggestion of returning structured data it will be
> > very easy to give user space type and indexes separately, but we should
> > probably still return the string for backwards compatibility.  
> 
> I am not still clear how the structured data would look like

I decided to just quickly write the code, that should be easier to 
understand.  We can probably leave out the netlink part of the API
if there is no need for it right now, but that's what I ment by
returning the information in a more structured way.

Tested-by: nobody :)
Suggested-by: Jiri (if I understood correctly)
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c |  8 ++-
 drivers/net/ethernet/mellanox/mlxsw/switchx2.c   | 10 ++--
 drivers/net/ethernet/netronome/nfp/nfp_port.c    | 26 ++++-----
 drivers/net/ethernet/netronome/nfp/nfp_port.h    |  4 +-
 include/linux/netdevice.h                        | 18 ++++++-
 include/uapi/linux/if_link.h                     | 16 ++++++
 net/core/dev.c                                   | 31 +++++++++--
 net/core/rtnetlink.c                             | 69 ++++++++++++++++++++++++
 8 files changed, 153 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 45e60be9c277..7a71291b8ec3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -637,16 +637,14 @@ static int mlx5e_rep_close(struct net_device *dev)
 }
 
 static int mlx5e_rep_get_phys_port_name(struct net_device *dev,
-					char *buf, size_t len)
+					struct netdev_port_info *info)
 {
 	struct mlx5e_priv *priv = netdev_priv(dev);
 	struct mlx5e_rep_priv *rpriv = priv->ppriv;
 	struct mlx5_eswitch_rep *rep = rpriv->rep;
-	int ret;
 
-	ret = snprintf(buf, len, "%d", rep->vport - 1);
-	if (ret >= len)
-		return -EOPNOTSUPP;
+	info->type = NETDEV_PORT_PCI_VF;
+	info->pci.vf_id = rep->vport - 1;
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
index 3b0f72455681..383b8b5f41cf 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
@@ -413,15 +413,13 @@ mlxsw_sx_port_get_stats64(struct net_device *dev,
 	stats->tx_dropped	= tx_dropped;
 }
 
-static int mlxsw_sx_port_get_phys_port_name(struct net_device *dev, char *name,
-					    size_t len)
+static int mlxsw_sx_port_get_phys_port_name(struct net_device *dev,
+					    struct netdev_port_info *info)
 {
 	struct mlxsw_sx_port *mlxsw_sx_port = netdev_priv(dev);
-	int err;
 
-	err = snprintf(name, len, "p%d", mlxsw_sx_port->mapping.module + 1);
-	if (err >= len)
-		return -EINVAL;
+	info->type = NETDEV_PORT_EXTERNAL;
+	info->port.id = mlxsw_sx_port->mapping.module + 1;
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_port.c b/drivers/net/ethernet/netronome/nfp/nfp_port.c
index d16a7b78ba9b..8f5c37b9a79c 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_port.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_port.c
@@ -143,11 +143,11 @@ struct nfp_eth_table_port *nfp_port_get_eth_port(struct nfp_port *port)
 }
 
 int
-nfp_port_get_phys_port_name(struct net_device *netdev, char *name, size_t len)
+nfp_port_get_phys_port_name(struct net_device *netdev,
+			    struct netdev_port_info *info)
 {
 	struct nfp_eth_table_port *eth_port;
 	struct nfp_port *port;
-	int n;
 
 	port = nfp_port_from_netdev(netdev);
 	if (!port)
@@ -159,25 +159,27 @@ nfp_port_get_phys_port_name(struct net_device *netdev, char *name, size_t len)
 		if (!eth_port)
 			return -EOPNOTSUPP;
 
-		if (!eth_port->is_split)
-			n = snprintf(name, len, "p%d", eth_port->label_port);
-		else
-			n = snprintf(name, len, "p%ds%d", eth_port->label_port,
-				     eth_port->label_subport);
+		info->type = NETDEV_PORT_EXTERNAL;
+		info->external.id = eth_port->label_port;
+
+		if (eth_port->is_split) {
+			info->type = NETDEV_PORT_EXTERNAL_SPLIT;
+			info->external.split_id = eth_port->label_subport;
+		}
 		break;
 	case NFP_PORT_PF_PORT:
-		n = snprintf(name, len, "pf%d", port->pf_id);
+		info->type = NETDEV_PORT_PCI_PF;
+		info->pci.pf_id = port->pf_id;
 		break;
 	case NFP_PORT_VF_PORT:
-		n = snprintf(name, len, "pf%dvf%d", port->pf_id, port->vf_id);
+		info->type = NETDEV_PORT_PCI_VF;
+		info->pci.pf_id = port->pf_id;
+		info->pci.vf_id = port->vf_id;
 		break;
 	default:
 		return -EOPNOTSUPP;
 	}
 
-	if (n >= len)
-		return -EINVAL;
-
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_port.h b/drivers/net/ethernet/netronome/nfp/nfp_port.h
index 56c76926c82a..03b8746feb29 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_port.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_port.h
@@ -118,8 +118,8 @@ nfp_port_from_id(struct nfp_pf *pf, enum nfp_port_type type, unsigned int id);
 struct nfp_eth_table_port *__nfp_port_get_eth_port(struct nfp_port *port);
 struct nfp_eth_table_port *nfp_port_get_eth_port(struct nfp_port *port);
 
-int
-nfp_port_get_phys_port_name(struct net_device *netdev, char *name, size_t len);
+int nfp_port_get_phys_port_name(struct net_device *netdev,
+				struct netdev_port_info *info);
 int nfp_port_configure(struct net_device *netdev, bool configed);
 
 struct nfp_port *
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3a3cdc1b1f31..0a055df701ef 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -845,6 +845,22 @@ struct xfrmdev_ops {
 };
 #endif
 
+struct netdev_port_info {
+	int type;
+
+	union {
+		struct {
+			u32 id;
+			u32 split_id;
+		} external;
+
+		struct {
+			u32 pf_id;
+			u32 vf_id;
+		} pci;
+	};
+};
+
 /*
  * This structure defines the management hooks for network devices.
  * The following hooks can be defined; unless noted otherwise, they are
@@ -1306,7 +1322,7 @@ struct net_device_ops {
 	int			(*ndo_get_phys_port_id)(struct net_device *dev,
 							struct netdev_phys_item_id *ppid);
 	int			(*ndo_get_phys_port_name)(struct net_device *dev,
-							  char *name, size_t len);
+							  struct netdev_port_info *info);
 	void			(*ndo_udp_tunnel_add)(struct net_device *dev,
 						      struct udp_tunnel_info *ti);
 	void			(*ndo_udp_tunnel_del)(struct net_device *dev,
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 8d062c58d5cb..e00ff0333e3f 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -158,6 +158,7 @@ enum {
 	IFLA_PAD,
 	IFLA_XDP,
 	IFLA_EVENT,
+	IFLA_PORT_INFO,
 	__IFLA_MAX
 };
 
@@ -927,4 +928,19 @@ enum {
 	IFLA_EVENT_BONDING_OPTIONS,	/* change in bonding options */
 };
 
+enum {
+	NETDEV_PORT_EXTERNAL,
+	NETDEV_PORT_EXTERNAL_SPLIT,
+	NETDEV_PORT_PCI_PF,
+	NETDEV_PORT_PCI_VF,
+};
+
+enum {
+	IFLA_PORT_INFO_TYPE,
+	IFLA_PORT_INFO_EXTERNAL_ID,
+	IFLA_PORT_INFO_EXTERNAL_SPLIT_ID,
+	IFLA_PORT_INFO_PCI_PF_ID,
+	IFLA_PORT_INFO_PCI_VF_ID,
+};
+
 #endif /* _UAPI_LINUX_IF_LINK_H */
diff --git a/net/core/dev.c b/net/core/dev.c
index 8ea6b4b42611..8fe3f697234e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6947,14 +6947,39 @@ EXPORT_SYMBOL(dev_get_phys_port_id);
  *
  *	Get device physical port name
  */
-int dev_get_phys_port_name(struct net_device *dev,
-			   char *name, size_t len)
+int dev_get_phys_port_name(struct net_device *dev, char *name, size_t len)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
+	struct netdev_port_info info = {};
+	int ret;
 
 	if (!ops->ndo_get_phys_port_name)
 		return -EOPNOTSUPP;
-	return ops->ndo_get_phys_port_name(dev, name, len);
+	ret = ops->ndo_get_phys_port_name(dev, &info);
+	if (ret)
+		return ret;
+
+	switch (info.type) {
+	case NETDEV_PORT_EXTERNAL:
+		ret = snprintf(name, len, "p%d", info.external.id);
+		break;
+	case NETDEV_PORT_EXTERNAL_SPLIT:
+		ret = snprintf(name, len, "p%ds%d",
+			       info.external.id, info.external.split_id);
+		break;
+	case NETDEV_PORT_PCI_PF:
+		ret = snprintf(name, len, "pf%d", info.pci.pf_id);
+		break;
+	case NETDEV_PORT_PCI_VF:
+		ret = snprintf(name, len, "pf%dvf%d",
+			       info.pci.pf_id, info.pci.vf_id);
+		break;
+	}
+
+	if (ret > len)
+		return -EINVAL;
+
+	return 0;
 }
 EXPORT_SYMBOL(dev_get_phys_port_name);
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 9201e3621351..1eb181ef705f 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -907,6 +907,15 @@ static size_t rtnl_xdp_size(void)
 	return xdp_size;
 }
 
+static size_t rtnl_port_info_size(void)
+{
+	size_t port_info_size = nla_total_size(0) + /* nest IFLA_PORT_INFO */
+				nla_total_size(4) + /* EXTERNAL_ID or PF_ID */
+				nla_total_size(4);  /* SPLIT_ID or VF_ID*/
+
+	return port_info_size;
+}
+
 static noinline size_t if_nlmsg_size(const struct net_device *dev,
 				     u32 ext_filter_mask)
 {
@@ -946,6 +955,7 @@ static noinline size_t if_nlmsg_size(const struct net_device *dev,
 	       + nla_total_size(IFNAMSIZ) /* IFLA_PHYS_PORT_NAME */
 	       + rtnl_xdp_size() /* IFLA_XDP */
 	       + nla_total_size(4)  /* IFLA_EVENT */
+	       + rtnl_port_info_size() /* IFLA_PORT_INFO */
 	       + nla_total_size(1); /* IFLA_PROTO_DOWN */
 
 }
@@ -1330,6 +1340,62 @@ static u32 rtnl_get_event(unsigned long event)
 	return rtnl_event_type;
 }
 
+static int rtnl_port_info_fill(struct sk_buff *skb, struct net_device *dev)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	struct netdev_port_info info = {};
+	struct nlattr *port_info;
+	int err;
+
+	if (!ops->ndo_get_phys_port_name)
+		return 0;
+
+	port_info = nla_nest_start(skb, IFLA_PORT_INFO);
+	if (!port_info)
+		return -EMSGSIZE;
+
+	err = ops->ndo_get_phys_port_name(dev, &info);
+	if (err)
+		goto err_cancel;
+
+	err = nla_put_u32(skb, IFLA_PORT_INFO_TYPE, info.type);
+	if (err)
+		goto err_cancel;
+
+	switch (info.type) {
+	case NETDEV_PORT_EXTERNAL_SPLIT:
+		err = nla_put_u32(skb, IFLA_PORT_INFO_EXTERNAL_SPLIT_ID,
+				  info.external.split_id);
+		if (err)
+			goto err_cancel;
+		/* fall through */
+	case NETDEV_PORT_EXTERNAL:
+		err = nla_put_u32(skb, IFLA_PORT_INFO_EXTERNAL_ID,
+				  info.external.id);
+		if (err)
+			goto err_cancel;
+		break;
+	case NETDEV_PORT_PCI_VF:
+		err = nla_put_u32(skb, IFLA_PORT_INFO_PCI_VF_ID,
+				  info.pci.vf_id);
+		if (err)
+			goto err_cancel;
+		/* fall through */
+	case NETDEV_PORT_PCI_PF:
+		err = nla_put_u32(skb, IFLA_PORT_INFO_PCI_PF_ID,
+				  info.pci.pf_id);
+		if (err)
+			goto err_cancel;
+		break;
+	}
+
+	return 0;
+
+err_cancel:
+	nla_nest_cancel(skb, port_info);
+	return err;
+}
+
 static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
 			    int type, u32 pid, u32 seq, u32 change,
 			    unsigned int flags, u32 ext_filter_mask,
@@ -1435,6 +1501,9 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
 	if (rtnl_xdp_fill(skb, dev))
 		goto nla_put_failure;
 
+	if (rtnl_port_info_fill(skb, dev))
+		goto nla_put_failure;
+
 	if (dev->rtnl_link_ops || rtnl_have_link_slave_info(dev)) {
 		if (rtnl_link_fill(skb, dev) < 0)
 			goto nla_put_failure;
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [RFC] switchdev: generate phys_port_name in the core
  2017-07-28  2:31                 ` [RFC] switchdev: generate phys_port_name in the core Jakub Kicinski
@ 2017-07-28  2:37                   ` Jakub Kicinski
  2017-07-28  5:35                   ` Jiri Pirko
  2017-07-28 14:13                   ` Andrew Lunn
  2 siblings, 0 replies; 38+ messages in thread
From: Jakub Kicinski @ 2017-07-28  2:37 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Jiri Pirko, netdev, Or Gerlitz, Michael Chan, Sathya Perla,
	David Miller, simon.horman

On Thu, 27 Jul 2017 19:31:22 -0700, Jakub Kicinski wrote:
> +static size_t rtnl_port_info_size(void)
> +{
> +	size_t port_info_size = nla_total_size(0) + /* nest IFLA_PORT_INFO */

				nla_total_size(4) + /* TYPE */

> +				nla_total_size(4) + /* EXTERNAL_ID or PF_ID */
> +				nla_total_size(4);  /* SPLIT_ID or VF_ID*/
> +
> +	return port_info_size;
> +}
> +

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] switchdev: generate phys_port_name in the core
  2017-07-28  2:31                 ` [RFC] switchdev: generate phys_port_name in the core Jakub Kicinski
  2017-07-28  2:37                   ` Jakub Kicinski
@ 2017-07-28  5:35                   ` Jiri Pirko
  2017-07-28  5:58                     ` Jiri Pirko
  2017-07-28  7:27                     ` Jakub Kicinski
  2017-07-28 14:13                   ` Andrew Lunn
  2 siblings, 2 replies; 38+ messages in thread
From: Jiri Pirko @ 2017-07-28  5:35 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Or Gerlitz, netdev, Or Gerlitz, Michael Chan, Sathya Perla,
	David Miller, simon.horman

Fri, Jul 28, 2017 at 04:31:22AM CEST, jakub.kicinski@netronome.com wrote:
>On Thu, 27 Jul 2017 13:30:44 +0300, Or Gerlitz wrote:
>> > want to add port splitting support, for example, reporting the name on
>> > physical ports will become more of a necessity.  
>> 
>> > If we adopt Jiri's suggestion of returning structured data it will be
>> > very easy to give user space type and indexes separately, but we should
>> > probably still return the string for backwards compatibility.  
>> 
>> I am not still clear how the structured data would look like
>
>I decided to just quickly write the code, that should be easier to 
>understand.  We can probably leave out the netlink part of the API
>if there is no need for it right now, but that's what I ment by
>returning the information in a more structured way.
>
>Tested-by: nobody :)
>Suggested-by: Jiri (if I understood correctly)

Yes, you did :) Couple of nits inlined.


>---
> drivers/net/ethernet/mellanox/mlx5/core/en_rep.c |  8 ++-
> drivers/net/ethernet/mellanox/mlxsw/switchx2.c   | 10 ++--
> drivers/net/ethernet/netronome/nfp/nfp_port.c    | 26 ++++-----
> drivers/net/ethernet/netronome/nfp/nfp_port.h    |  4 +-
> include/linux/netdevice.h                        | 18 ++++++-
> include/uapi/linux/if_link.h                     | 16 ++++++
> net/core/dev.c                                   | 31 +++++++++--
> net/core/rtnetlink.c                             | 69 ++++++++++++++++++++++++
> 8 files changed, 153 insertions(+), 29 deletions(-)
>
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
>index 45e60be9c277..7a71291b8ec3 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
>@@ -637,16 +637,14 @@ static int mlx5e_rep_close(struct net_device *dev)
> }
> 
> static int mlx5e_rep_get_phys_port_name(struct net_device *dev,
>-					char *buf, size_t len)
>+					struct netdev_port_info *info)

Either we rename ndo to something like ndo_get_port_info or you rename
the struct to netdev_port_name_info. These 2 should be in sync


> {
> 	struct mlx5e_priv *priv = netdev_priv(dev);
> 	struct mlx5e_rep_priv *rpriv = priv->ppriv;
> 	struct mlx5_eswitch_rep *rep = rpriv->rep;
>-	int ret;
> 
>-	ret = snprintf(buf, len, "%d", rep->vport - 1);
>-	if (ret >= len)
>-		return -EOPNOTSUPP;
>+	info->type = NETDEV_PORT_PCI_VF;

NETDEV_PORT_TYPE_PCI_VF
or
NETDEV_PORT_NAME_TYPE_PCI_VF
depends on the option you chose above.


>+	info->pci.vf_id = rep->vport - 1;
> 
> 	return 0;
> }
>diff --git a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
>index 3b0f72455681..383b8b5f41cf 100644
>--- a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
>+++ b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
>@@ -413,15 +413,13 @@ mlxsw_sx_port_get_stats64(struct net_device *dev,
> 	stats->tx_dropped	= tx_dropped;
> }
> 
>-static int mlxsw_sx_port_get_phys_port_name(struct net_device *dev, char *name,
>-					    size_t len)
>+static int mlxsw_sx_port_get_phys_port_name(struct net_device *dev,
>+					    struct netdev_port_info *info)
> {
> 	struct mlxsw_sx_port *mlxsw_sx_port = netdev_priv(dev);
>-	int err;
> 
>-	err = snprintf(name, len, "p%d", mlxsw_sx_port->mapping.module + 1);
>-	if (err >= len)
>-		return -EINVAL;
>+	info->type = NETDEV_PORT_EXTERNAL;
>+	info->port.id = mlxsw_sx_port->mapping.module + 1;
> 
> 	return 0;
> }
>diff --git a/drivers/net/ethernet/netronome/nfp/nfp_port.c b/drivers/net/ethernet/netronome/nfp/nfp_port.c
>index d16a7b78ba9b..8f5c37b9a79c 100644
>--- a/drivers/net/ethernet/netronome/nfp/nfp_port.c
>+++ b/drivers/net/ethernet/netronome/nfp/nfp_port.c
>@@ -143,11 +143,11 @@ struct nfp_eth_table_port *nfp_port_get_eth_port(struct nfp_port *port)
> }
> 
> int
>-nfp_port_get_phys_port_name(struct net_device *netdev, char *name, size_t len)
>+nfp_port_get_phys_port_name(struct net_device *netdev,
>+			    struct netdev_port_info *info)
> {
> 	struct nfp_eth_table_port *eth_port;
> 	struct nfp_port *port;
>-	int n;
> 
> 	port = nfp_port_from_netdev(netdev);
> 	if (!port)
>@@ -159,25 +159,27 @@ nfp_port_get_phys_port_name(struct net_device *netdev, char *name, size_t len)
> 		if (!eth_port)
> 			return -EOPNOTSUPP;
> 
>-		if (!eth_port->is_split)
>-			n = snprintf(name, len, "p%d", eth_port->label_port);
>-		else
>-			n = snprintf(name, len, "p%ds%d", eth_port->label_port,
>-				     eth_port->label_subport);
>+		info->type = NETDEV_PORT_EXTERNAL;
>+		info->external.id = eth_port->label_port;
>+
>+		if (eth_port->is_split) {
>+			info->type = NETDEV_PORT_EXTERNAL_SPLIT;
>+			info->external.split_id = eth_port->label_subport;
>+		}
> 		break;
> 	case NFP_PORT_PF_PORT:
>-		n = snprintf(name, len, "pf%d", port->pf_id);
>+		info->type = NETDEV_PORT_PCI_PF;
>+		info->pci.pf_id = port->pf_id;
> 		break;
> 	case NFP_PORT_VF_PORT:
>-		n = snprintf(name, len, "pf%dvf%d", port->pf_id, port->vf_id);
>+		info->type = NETDEV_PORT_PCI_VF;
>+		info->pci.pf_id = port->pf_id;
>+		info->pci.vf_id = port->vf_id;
> 		break;
> 	default:
> 		return -EOPNOTSUPP;
> 	}
> 
>-	if (n >= len)
>-		return -EINVAL;
>-
> 	return 0;
> }
> 
>diff --git a/drivers/net/ethernet/netronome/nfp/nfp_port.h b/drivers/net/ethernet/netronome/nfp/nfp_port.h
>index 56c76926c82a..03b8746feb29 100644
>--- a/drivers/net/ethernet/netronome/nfp/nfp_port.h
>+++ b/drivers/net/ethernet/netronome/nfp/nfp_port.h
>@@ -118,8 +118,8 @@ nfp_port_from_id(struct nfp_pf *pf, enum nfp_port_type type, unsigned int id);
> struct nfp_eth_table_port *__nfp_port_get_eth_port(struct nfp_port *port);
> struct nfp_eth_table_port *nfp_port_get_eth_port(struct nfp_port *port);
> 
>-int
>-nfp_port_get_phys_port_name(struct net_device *netdev, char *name, size_t len);
>+int nfp_port_get_phys_port_name(struct net_device *netdev,
>+				struct netdev_port_info *info);
> int nfp_port_configure(struct net_device *netdev, bool configed);
> 
> struct nfp_port *
>diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>index 3a3cdc1b1f31..0a055df701ef 100644
>--- a/include/linux/netdevice.h
>+++ b/include/linux/netdevice.h
>@@ -845,6 +845,22 @@ struct xfrmdev_ops {
> };
> #endif
> 
>+struct netdev_port_info {
>+	int type;

enum please


>+
>+	union {
>+		struct {
>+			u32 id;
>+			u32 split_id;
>+		} external;
>+
>+		struct {
>+			u32 pf_id;
>+			u32 vf_id;
>+		} pci;
>+	};
>+};
>+
> /*
>  * This structure defines the management hooks for network devices.
>  * The following hooks can be defined; unless noted otherwise, they are
>@@ -1306,7 +1322,7 @@ struct net_device_ops {
> 	int			(*ndo_get_phys_port_id)(struct net_device *dev,
> 							struct netdev_phys_item_id *ppid);
> 	int			(*ndo_get_phys_port_name)(struct net_device *dev,
>-							  char *name, size_t len);
>+							  struct netdev_port_info *info);
> 	void			(*ndo_udp_tunnel_add)(struct net_device *dev,
> 						      struct udp_tunnel_info *ti);
> 	void			(*ndo_udp_tunnel_del)(struct net_device *dev,
>diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
>index 8d062c58d5cb..e00ff0333e3f 100644
>--- a/include/uapi/linux/if_link.h
>+++ b/include/uapi/linux/if_link.h
>@@ -158,6 +158,7 @@ enum {
> 	IFLA_PAD,
> 	IFLA_XDP,
> 	IFLA_EVENT,
>+	IFLA_PORT_INFO,
> 	__IFLA_MAX
> };
> 
>@@ -927,4 +928,19 @@ enum {
> 	IFLA_EVENT_BONDING_OPTIONS,	/* change in bonding options */
> };
> 
>+enum {
>+	NETDEV_PORT_EXTERNAL,
>+	NETDEV_PORT_EXTERNAL_SPLIT,
>+	NETDEV_PORT_PCI_PF,

Isn't PF also EXTERNAL? Cant VF be split? What I'm getting at, shoudn't
these be flags?


>+	NETDEV_PORT_PCI_VF,
>+};
>+

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] switchdev: generate phys_port_name in the core
  2017-07-28  5:35                   ` Jiri Pirko
@ 2017-07-28  5:58                     ` Jiri Pirko
  2017-07-28  7:28                       ` Jakub Kicinski
  2017-07-28  7:27                     ` Jakub Kicinski
  1 sibling, 1 reply; 38+ messages in thread
From: Jiri Pirko @ 2017-07-28  5:58 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Or Gerlitz, netdev, Or Gerlitz, Michael Chan, Sathya Perla,
	David Miller, simon.horman

Fri, Jul 28, 2017 at 07:35:07AM CEST, jiri@resnulli.us wrote:
>Fri, Jul 28, 2017 at 04:31:22AM CEST, jakub.kicinski@netronome.com wrote:
>>On Thu, 27 Jul 2017 13:30:44 +0300, Or Gerlitz wrote:
>>> > want to add port splitting support, for example, reporting the name on
>>> > physical ports will become more of a necessity.  
>>> 
>>> > If we adopt Jiri's suggestion of returning structured data it will be
>>> > very easy to give user space type and indexes separately, but we should
>>> > probably still return the string for backwards compatibility.  
>>> 
>>> I am not still clear how the structured data would look like
>>
>>I decided to just quickly write the code, that should be easier to 
>>understand.  We can probably leave out the netlink part of the API
>>if there is no need for it right now, but that's what I ment by
>>returning the information in a more structured way.
>>
>>Tested-by: nobody :)
>>Suggested-by: Jiri (if I understood correctly)
>
>Yes, you did :) Couple of nits inlined.
>
>
>>---
>> drivers/net/ethernet/mellanox/mlx5/core/en_rep.c |  8 ++-
>> drivers/net/ethernet/mellanox/mlxsw/switchx2.c   | 10 ++--
>> drivers/net/ethernet/netronome/nfp/nfp_port.c    | 26 ++++-----
>> drivers/net/ethernet/netronome/nfp/nfp_port.h    |  4 +-
>> include/linux/netdevice.h                        | 18 ++++++-
>> include/uapi/linux/if_link.h                     | 16 ++++++
>> net/core/dev.c                                   | 31 +++++++++--
>> net/core/rtnetlink.c                             | 69 ++++++++++++++++++++++++
>> 8 files changed, 153 insertions(+), 29 deletions(-)
>>
>>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
>>index 45e60be9c277..7a71291b8ec3 100644
>>--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
>>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
>>@@ -637,16 +637,14 @@ static int mlx5e_rep_close(struct net_device *dev)
>> }
>> 
>> static int mlx5e_rep_get_phys_port_name(struct net_device *dev,
>>-					char *buf, size_t len)
>>+					struct netdev_port_info *info)
>
>Either we rename ndo to something like ndo_get_port_info or you rename
>the struct to netdev_port_name_info. These 2 should be in sync

As this is strictly related to port name, I think it should be:
struct netdev_phys_port_name_info

And the rest named in-sync with this

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] switchdev: generate phys_port_name in the core
  2017-07-28  5:35                   ` Jiri Pirko
  2017-07-28  5:58                     ` Jiri Pirko
@ 2017-07-28  7:27                     ` Jakub Kicinski
  1 sibling, 0 replies; 38+ messages in thread
From: Jakub Kicinski @ 2017-07-28  7:27 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Or Gerlitz, netdev, Or Gerlitz, Michael Chan, Sathya Perla,
	David Miller, simon.horman

On Fri, 28 Jul 2017 07:35:07 +0200, Jiri Pirko wrote:
> >+enum {
> >+	NETDEV_PORT_EXTERNAL,
> >+	NETDEV_PORT_EXTERNAL_SPLIT,
> >+	NETDEV_PORT_PCI_PF,  
> 
> Isn't PF also EXTERNAL? Cant VF be split? What I'm getting at, shoudn't
> these be flags?

By external PF do you mean not connected to the current host?  Yes we
could make a flag like that.  Right now I have picked "external" to
avoid too specific names like MAC/PHY/Ethernet...  There could be a
MAC-to-MAC connection for instance, so no PHY involved.  Should we go
with NETDEV_PORT_ETH?

I will make SPLIT into a flag.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] switchdev: generate phys_port_name in the core
  2017-07-28  5:58                     ` Jiri Pirko
@ 2017-07-28  7:28                       ` Jakub Kicinski
  0 siblings, 0 replies; 38+ messages in thread
From: Jakub Kicinski @ 2017-07-28  7:28 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Or Gerlitz, netdev, Or Gerlitz, Michael Chan, Sathya Perla,
	David Miller, simon.horman

On Fri, 28 Jul 2017 07:58:15 +0200, Jiri Pirko wrote:
> Fri, Jul 28, 2017 at 07:35:07AM CEST, jiri@resnulli.us wrote:
> >Fri, Jul 28, 2017 at 04:31:22AM CEST, jakub.kicinski@netronome.com wrote:  
> >>On Thu, 27 Jul 2017 13:30:44 +0300, Or Gerlitz wrote:  
> >>> > want to add port splitting support, for example, reporting the name on
> >>> > physical ports will become more of a necessity.    
> >>>   
> >>> > If we adopt Jiri's suggestion of returning structured data it will be
> >>> > very easy to give user space type and indexes separately, but we should
> >>> > probably still return the string for backwards compatibility.    
> >>> 
> >>> I am not still clear how the structured data would look like  
> >>
> >>I decided to just quickly write the code, that should be easier to 
> >>understand.  We can probably leave out the netlink part of the API
> >>if there is no need for it right now, but that's what I ment by
> >>returning the information in a more structured way.
> >>
> >>Tested-by: nobody :)
> >>Suggested-by: Jiri (if I understood correctly)  
> >
> >Yes, you did :) Couple of nits inlined.
> >
> >  
> >>---
> >> drivers/net/ethernet/mellanox/mlx5/core/en_rep.c |  8 ++-
> >> drivers/net/ethernet/mellanox/mlxsw/switchx2.c   | 10 ++--
> >> drivers/net/ethernet/netronome/nfp/nfp_port.c    | 26 ++++-----
> >> drivers/net/ethernet/netronome/nfp/nfp_port.h    |  4 +-
> >> include/linux/netdevice.h                        | 18 ++++++-
> >> include/uapi/linux/if_link.h                     | 16 ++++++
> >> net/core/dev.c                                   | 31 +++++++++--
> >> net/core/rtnetlink.c                             | 69 ++++++++++++++++++++++++
> >> 8 files changed, 153 insertions(+), 29 deletions(-)
> >>
> >>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
> >>index 45e60be9c277..7a71291b8ec3 100644
> >>--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
> >>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
> >>@@ -637,16 +637,14 @@ static int mlx5e_rep_close(struct net_device *dev)
> >> }
> >> 
> >> static int mlx5e_rep_get_phys_port_name(struct net_device *dev,
> >>-					char *buf, size_t len)
> >>+					struct netdev_port_info *info)  
> >
> >Either we rename ndo to something like ndo_get_port_info or you rename
> >the struct to netdev_port_name_info. These 2 should be in sync  
> 
> As this is strictly related to port name, I think it should be:
> struct netdev_phys_port_name_info
> 
> And the rest named in-sync with this

Hm..  Since we would return the type and IDs via netlink, I was
actually leaning towards ndo_get_port_info.  "Name" somehow implies a
string in my mind.  Perhaps the "info" part is where I went wrong, could
we go with netdev_port_label?  Or netdev_port_ident?  I would also
prefer to avoid the phys in name, since it doesn't sit well with PFs
vs VFs.

Ah, naming things...

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] switchdev: generate phys_port_name in the core
  2017-07-28  2:31                 ` [RFC] switchdev: generate phys_port_name in the core Jakub Kicinski
  2017-07-28  2:37                   ` Jakub Kicinski
  2017-07-28  5:35                   ` Jiri Pirko
@ 2017-07-28 14:13                   ` Andrew Lunn
  2 siblings, 0 replies; 38+ messages in thread
From: Andrew Lunn @ 2017-07-28 14:13 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Or Gerlitz, Jiri Pirko, netdev, Or Gerlitz, Michael Chan,
	Sathya Perla, David Miller, simon.horman

On Thu, Jul 27, 2017 at 07:31:22PM -0700, Jakub Kicinski wrote:
> On Thu, 27 Jul 2017 13:30:44 +0300, Or Gerlitz wrote:
> > > want to add port splitting support, for example, reporting the name on
> > > physical ports will become more of a necessity.  
> > 
> > > If we adopt Jiri's suggestion of returning structured data it will be
> > > very easy to give user space type and indexes separately, but we should
> > > probably still return the string for backwards compatibility.  
> > 
> > I am not still clear how the structured data would look like
> 
> I decided to just quickly write the code, that should be easier to 
> understand.  We can probably leave out the netlink part of the API
> if there is no need for it right now, but that's what I ment by
> returning the information in a more structured way.
> 
> Tested-by: nobody :)
> Suggested-by: Jiri (if I understood correctly)
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/en_rep.c |  8 ++-
>  drivers/net/ethernet/mellanox/mlxsw/switchx2.c   | 10 ++--
>  drivers/net/ethernet/netronome/nfp/nfp_port.c    | 26 ++++-----
>  drivers/net/ethernet/netronome/nfp/nfp_port.h    |  4 +-
>  include/linux/netdevice.h                        | 18 ++++++-
>  include/uapi/linux/if_link.h                     | 16 ++++++
>  net/core/dev.c                                   | 31 +++++++++--
>  net/core/rtnetlink.c                             | 69 ++++++++++++++++++++++++

Hi Jakub

Don't forget net/dsa/slave.c when you go from RFC to a real patch for
submission.

	Andrew

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2017-07-28 14:13 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-24 16:34 [PATCH net-next 00/10] bnxt_en: Updates for net-next Michael Chan
2017-07-24 16:34 ` [PATCH net-next 01/10] bnxt_en: Update firmware interface spec to 1.8.0 Michael Chan
2017-07-24 16:34 ` [PATCH net-next 02/10] bnxt_en: Retrieve the hardware bridge mode from the firmware Michael Chan
2017-07-24 16:34 ` [PATCH net-next 03/10] bnxt_en: Implement ndo_bridge_{get|set}link methods Michael Chan
2017-07-24 16:34 ` [PATCH net-next 04/10] bnxt_en: Add bnxt_get_num_stats() to centrally get the number of ethtool stats Michael Chan
2017-07-24 16:34 ` [PATCH net-next 05/10] bnxt_en: Allow the user to set ethtool stats-block-usecs to 0 Michael Chan
2017-07-24 16:34 ` [PATCH net-next 06/10] bnxt_en: Report firmware DCBX agent Michael Chan
2017-07-24 16:34 ` [PATCH net-next 07/10] bnxt_en: Set ETS min_bw parameter for older firmware Michael Chan
2017-07-24 16:34 ` [PATCH net-next 08/10] bnxt_en: add support to enable VF-representors Michael Chan
2017-07-25  2:28   ` kbuild test robot
2017-07-24 16:34 ` [PATCH net-next 09/10] bnxt_en: add vf-rep RX/TX and netdev implementation Michael Chan
2017-07-24 16:34 ` [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name Michael Chan
2017-07-25  2:44   ` kbuild test robot
2017-07-25  4:06     ` Jakub Kicinski
2017-07-25  4:21       ` David Miller
2017-07-25  4:45   ` Jakub Kicinski
2017-07-25  5:13     ` [RFC] switchdev: clarify ndo_get_phys_port_name() formats Jakub Kicinski
2017-07-25 15:22       ` Andy Gospodarek
2017-07-25 22:26         ` Jakub Kicinski
2017-07-26  1:48           ` Andy Gospodarek
2017-07-26  2:34             ` Jakub Kicinski
2017-07-26 17:54               ` Andy Gospodarek
2017-07-26  5:48       ` Jiri Pirko
2017-07-26  8:13         ` Jakub Kicinski
2017-07-26  9:23           ` Or Gerlitz
2017-07-26 20:11             ` Jakub Kicinski
2017-07-27 10:30               ` Or Gerlitz
2017-07-28  2:31                 ` [RFC] switchdev: generate phys_port_name in the core Jakub Kicinski
2017-07-28  2:37                   ` Jakub Kicinski
2017-07-28  5:35                   ` Jiri Pirko
2017-07-28  5:58                     ` Jiri Pirko
2017-07-28  7:28                       ` Jakub Kicinski
2017-07-28  7:27                     ` Jakub Kicinski
2017-07-28 14:13                   ` Andrew Lunn
2017-07-25  9:55     ` [PATCH net-next 10/10] bnxt_en: add support for port_attr_get and and get_phys_port_name Sathya Perla
2017-07-25 22:27       ` Jakub Kicinski
2017-07-26  9:13     ` Or Gerlitz
2017-07-25  0:32 ` [PATCH net-next 00/10] bnxt_en: Updates for net-next David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.