linux-kselftest.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 net-next 0/4] Add ethtool interface for SyncE
@ 2021-12-01 18:02 Maciej Machnikowski
  2021-12-01 18:02 ` [PATCH v4 net-next 1/4] ice: add support detecting features based on netlist Maciej Machnikowski
                   ` (3 more replies)
  0 siblings, 4 replies; 27+ messages in thread
From: Maciej Machnikowski @ 2021-12-01 18:02 UTC (permalink / raw)
  To: maciej.machnikowski, netdev, intel-wired-lan, arkadiusz.kubalewski
  Cc: richardcochran, abyagowi, anthony.l.nguyen, davem, kuba,
	linux-kselftest, idosch, mkubecek, saeed, michael.chan, petrm

Synchronous Ethernet networks use a physical layer clock to syntonize
the frequency across different network elements.

Basic SyncE node defined in the ITU-T G.8264 consist of an Ethernet
Equipment Clock (EEC) and have the ability to recover synchronization
from the synchronization inputs - either traffic interfaces or external
frequency sources.
The EEC can synchronize its frequency (syntonize) to any of those sources.
It is also able to select synchronization source through priority tables
and synchronization status messaging. It also provides neccessary
filtering and holdover capabilities

This patch series introduces basic interface for reading and configuring
recover clocks on a SyncE capable device

v4:
- Dropped EEC_STATE reporting (TBD: DPLL subsystem)
- moved recovered clock configuration to ethtool netlink

v3:
- remove RTM_GETRCLKRANGE
- return state of all possible pins in the RTM_GETRCLKSTATE
- clarify documentation

v2:
- improved documentation
- fixed kdoc warning

RFC history:
v2:
- removed whitespace changes
- fix issues reported by test robot
v3:
- Changed naming from SyncE to EEC
- Clarify cover letter and commit message for patch 1
v4:
- Removed sync_source and pin_idx info
- Changed one structure to attributes
- Added EEC_SRC_PORT flag to indicate that the EEC is synchronized
  to the recovered clock of a port that returns the state
v5:
- add EEC source as an optiona attribute
- implement support for recovered clocks
- align states returned by EEC to ITU-T G.781
v6:
- fix EEC clock state reporting
- add documentation
- fix descriptions in code comments

Maciej Machnikowski (4):
  ice: add support detecting features based on netlist
  ethtool: Add ability to configure recovered clock for SyncE feature
  ice: add support for monitoring SyncE DPLL state
  ice: add support for SyncE recovered clocks

 Documentation/networking/ethtool-netlink.rst  |  67 +++++
 drivers/net/ethernet/intel/ice/ice.h          |   7 +
 .../net/ethernet/intel/ice/ice_adminq_cmd.h   |  70 ++++-
 drivers/net/ethernet/intel/ice/ice_common.c   | 224 +++++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.h   |  20 +-
 drivers/net/ethernet/intel/ice/ice_devids.h   |   3 +
 drivers/net/ethernet/intel/ice/ice_ethtool.c  |  97 +++++++
 drivers/net/ethernet/intel/ice/ice_lib.c      |   6 +-
 drivers/net/ethernet/intel/ice/ice_ptp.c      |  35 +++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c   |  49 ++++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.h   |  36 +++
 drivers/net/ethernet/intel/ice/ice_type.h     |   1 +
 include/linux/ethtool.h                       |   9 +
 include/uapi/linux/ethtool_netlink.h          |  21 ++
 net/ethtool/Makefile                          |   3 +-
 net/ethtool/netlink.c                         |  20 ++
 net/ethtool/netlink.h                         |   4 +
 net/ethtool/synce.c                           | 267 ++++++++++++++++++
 18 files changed, 935 insertions(+), 4 deletions(-)
 create mode 100644 net/ethtool/synce.c

-- 
2.26.3


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v4 net-next 1/4] ice: add support detecting features based on netlist
  2021-12-01 18:02 [PATCH v4 net-next 0/4] Add ethtool interface for SyncE Maciej Machnikowski
@ 2021-12-01 18:02 ` Maciej Machnikowski
  2021-12-01 18:02 ` [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature Maciej Machnikowski
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 27+ messages in thread
From: Maciej Machnikowski @ 2021-12-01 18:02 UTC (permalink / raw)
  To: maciej.machnikowski, netdev, intel-wired-lan, arkadiusz.kubalewski
  Cc: richardcochran, abyagowi, anthony.l.nguyen, davem, kuba,
	linux-kselftest, idosch, mkubecek, saeed, michael.chan, petrm

Add new functions to check netlist of a given board for:
- Recovered Clock device,
- Clock Generation Unit,
- Clock Multiplexer,

Initialize feature bits depending on detected components.

Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h          |   2 +
 .../net/ethernet/intel/ice/ice_adminq_cmd.h   |   7 +-
 drivers/net/ethernet/intel/ice/ice_common.c   | 123 ++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.h   |   9 ++
 drivers/net/ethernet/intel/ice/ice_lib.c      |   6 +-
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c   |   1 +
 drivers/net/ethernet/intel/ice/ice_type.h     |   1 +
 7 files changed, 147 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index b67ad51cbcc9..cb6b4c53584b 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -183,6 +183,8 @@
 
 enum ice_feature {
 	ICE_F_DSCP,
+	ICE_F_CGU,
+	ICE_F_PHY_RCLK,
 	ICE_F_SMA_CTRL,
 	ICE_F_MAX
 };
diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
index 4eef3488d86f..339c2a86f680 100644
--- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
@@ -1297,6 +1297,8 @@ struct ice_aqc_link_topo_params {
 #define ICE_AQC_LINK_TOPO_NODE_TYPE_CAGE	6
 #define ICE_AQC_LINK_TOPO_NODE_TYPE_MEZZ	7
 #define ICE_AQC_LINK_TOPO_NODE_TYPE_ID_EEPROM	8
+#define ICE_AQC_LINK_TOPO_NODE_TYPE_CLK_CTRL	9
+#define ICE_AQC_LINK_TOPO_NODE_TYPE_CLK_MUX	10
 #define ICE_AQC_LINK_TOPO_NODE_CTX_S		4
 #define ICE_AQC_LINK_TOPO_NODE_CTX_M		\
 				(0xF << ICE_AQC_LINK_TOPO_NODE_CTX_S)
@@ -1333,7 +1335,10 @@ struct ice_aqc_link_topo_addr {
 struct ice_aqc_get_link_topo {
 	struct ice_aqc_link_topo_addr addr;
 	u8 node_part_num;
-#define ICE_AQC_GET_LINK_TOPO_NODE_NR_PCA9575	0x21
+#define ICE_AQC_GET_LINK_TOPO_NODE_NR_PCA9575		0x21
+#define ICE_ACQ_GET_LINK_TOPO_NODE_NR_ZL30632_80032	0x24
+#define ICE_ACQ_GET_LINK_TOPO_NODE_NR_PKVL		0x31
+#define ICE_ACQ_GET_LINK_TOPO_NODE_NR_GEN_CLK_MUX	0x47
 	u8 rsvd[9];
 };
 
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index b3066d0fea8b..35903b282885 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -274,6 +274,79 @@ ice_aq_get_link_topo_handle(struct ice_port_info *pi, u8 node_type,
 	return ice_aq_send_cmd(pi->hw, &desc, NULL, 0, cd);
 }
 
+/**
+ * ice_aq_get_netlist_node
+ * @hw: pointer to the hw struct
+ * @cmd: get_link_topo AQ structure
+ * @node_part_number: output node part number if node found
+ * @node_handle: output node handle parameter if node found
+ */
+enum ice_status
+ice_aq_get_netlist_node(struct ice_hw *hw, struct ice_aqc_get_link_topo *cmd,
+			u8 *node_part_number, u16 *node_handle)
+{
+	struct ice_aq_desc desc;
+
+	ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_get_link_topo);
+	desc.params.get_link_topo = *cmd;
+
+	if (ice_aq_send_cmd(hw, &desc, NULL, 0, NULL))
+		return ICE_ERR_NOT_SUPPORTED;
+
+	if (node_handle)
+		*node_handle =
+			le16_to_cpu(desc.params.get_link_topo.addr.handle);
+	if (node_part_number)
+		*node_part_number = desc.params.get_link_topo.node_part_num;
+
+	return ICE_SUCCESS;
+}
+
+#define MAX_NETLIST_SIZE 10
+/**
+ * ice_find_netlist_node
+ * @hw: pointer to the hw struct
+ * @node_type_ctx: type of netlist node to look for
+ * @node_part_number: node part number to look for
+ * @node_handle: output parameter if node found - optional
+ *
+ * Find and return the node handle for a given node type and part number in the
+ * netlist. When found ICE_SUCCESS is returned, ICE_ERR_DOES_NOT_EXIST
+ * otherwise. If @node_handle provided, it would be set to found node handle.
+ */
+enum ice_status
+ice_find_netlist_node(struct ice_hw *hw, u8 node_type_ctx, u8 node_part_number,
+		      u16 *node_handle)
+{
+	struct ice_aqc_get_link_topo cmd;
+	u8 rec_node_part_number;
+	enum ice_status status;
+	u16 rec_node_handle;
+	u8 idx;
+
+	for (idx = 0; idx < MAX_NETLIST_SIZE; idx++) {
+		memset(&cmd, 0, sizeof(cmd));
+
+		cmd.addr.topo_params.node_type_ctx =
+			(node_type_ctx << ICE_AQC_LINK_TOPO_NODE_TYPE_S);
+		cmd.addr.topo_params.index = idx;
+
+		status = ice_aq_get_netlist_node(hw, &cmd,
+						 &rec_node_part_number,
+						 &rec_node_handle);
+		if (status)
+			return status;
+
+		if (rec_node_part_number == node_part_number) {
+			if (node_handle)
+				*node_handle = rec_node_handle;
+			return ICE_SUCCESS;
+		}
+	}
+
+	return ICE_ERR_DOES_NOT_EXIST;
+}
+
 /**
  * ice_is_media_cage_present
  * @pi: port information structure
@@ -5083,3 +5156,53 @@ bool ice_fw_supports_report_dflt_cfg(struct ice_hw *hw)
 	}
 	return false;
 }
+
+/**
+ * ice_is_phy_rclk_present_e810t
+ * @hw: pointer to the hw struct
+ *
+ * Check if the PHY Recovered Clock device is present in the netlist
+ */
+bool ice_is_phy_rclk_present_e810t(struct ice_hw *hw)
+{
+	if (ice_find_netlist_node(hw, ICE_AQC_LINK_TOPO_NODE_TYPE_CLK_CTRL,
+				  ICE_ACQ_GET_LINK_TOPO_NODE_NR_PKVL, NULL))
+		return false;
+
+	return true;
+}
+
+/**
+ * ice_is_cgu_present_e810t
+ * @hw: pointer to the hw struct
+ *
+ * Check if the Clock Generation Unit (CGU) device is present in the netlist
+ */
+bool ice_is_cgu_present_e810t(struct ice_hw *hw)
+{
+	if (!ice_find_netlist_node(hw, ICE_AQC_LINK_TOPO_NODE_TYPE_CLK_CTRL,
+				   ICE_ACQ_GET_LINK_TOPO_NODE_NR_ZL30632_80032,
+				   NULL)) {
+		hw->cgu_part_number =
+			ICE_ACQ_GET_LINK_TOPO_NODE_NR_ZL30632_80032;
+		return true;
+	}
+	return false;
+}
+
+/**
+ * ice_is_clock_mux_present_e810t
+ * @hw: pointer to the hw struct
+ *
+ * Check if the Clock Multiplexer device is present in the netlist
+ */
+bool ice_is_clock_mux_present_e810t(struct ice_hw *hw)
+{
+	if (ice_find_netlist_node(hw, ICE_AQC_LINK_TOPO_NODE_TYPE_CLK_MUX,
+				  ICE_ACQ_GET_LINK_TOPO_NODE_NR_GEN_CLK_MUX,
+				  NULL))
+		return false;
+
+	return true;
+}
+
diff --git a/drivers/net/ethernet/intel/ice/ice_common.h b/drivers/net/ethernet/intel/ice/ice_common.h
index 65c1b3244264..b20a5c085246 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.h
+++ b/drivers/net/ethernet/intel/ice/ice_common.h
@@ -89,6 +89,12 @@ ice_aq_get_phy_caps(struct ice_port_info *pi, bool qual_mods, u8 report_mode,
 		    struct ice_aqc_get_phy_caps_data *caps,
 		    struct ice_sq_cd *cd);
 enum ice_status
+ice_aq_get_netlist_node(struct ice_hw *hw, struct ice_aqc_get_link_topo *cmd,
+			u8 *node_part_number, u16 *node_handle);
+enum ice_status
+ice_find_netlist_node(struct ice_hw *hw, u8 node_type_ctx, u8 node_part_number,
+		      u16 *node_handle);
+enum ice_status
 ice_aq_list_caps(struct ice_hw *hw, void *buf, u16 buf_size, u32 *cap_count,
 		 enum ice_adminq_opc opc, struct ice_sq_cd *cd);
 enum ice_status
@@ -206,4 +212,7 @@ bool ice_fw_supports_lldp_fltr_ctrl(struct ice_hw *hw);
 enum ice_status
 ice_lldp_fltr_add_remove(struct ice_hw *hw, u16 vsi_num, bool add);
 bool ice_fw_supports_report_dflt_cfg(struct ice_hw *hw);
+bool ice_is_phy_rclk_present_e810t(struct ice_hw *hw);
+bool ice_is_cgu_present_e810t(struct ice_hw *hw);
+bool ice_is_clock_mux_present_e810t(struct ice_hw *hw);
 #endif /* _ICE_COMMON_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 09a3297cd63c..18c30b2912e3 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -4188,8 +4188,12 @@ void ice_init_feature_support(struct ice_pf *pf)
 	case ICE_DEV_ID_E810C_QSFP:
 	case ICE_DEV_ID_E810C_SFP:
 		ice_set_feature_support(pf, ICE_F_DSCP);
-		if (ice_is_e810t(&pf->hw))
+		if (ice_is_clock_mux_present_e810t(&pf->hw))
 			ice_set_feature_support(pf, ICE_F_SMA_CTRL);
+		if (ice_is_phy_rclk_present_e810t(&pf->hw))
+			ice_set_feature_support(pf, ICE_F_PHY_RCLK);
+		if (ice_is_cgu_present_e810t(&pf->hw))
+			ice_set_feature_support(pf, ICE_F_CGU);
 		break;
 	default:
 		break;
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
index 29f947c0cd2e..aa257db36765 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
@@ -800,3 +800,4 @@ bool ice_is_pca9575_present(struct ice_hw *hw)
 
 	return !status && handle;
 }
+
diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h
index 9e0c2923c62e..a9dc16641bd4 100644
--- a/drivers/net/ethernet/intel/ice/ice_type.h
+++ b/drivers/net/ethernet/intel/ice/ice_type.h
@@ -920,6 +920,7 @@ struct ice_hw {
 	struct list_head rss_list_head;
 	struct ice_mbx_snapshot mbx_snapshot;
 	u16 io_expander_handle;
+	u8 cgu_part_number;
 };
 
 /* Statistics collected by each port, VSI, VEB, and S-channel */
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-01 18:02 [PATCH v4 net-next 0/4] Add ethtool interface for SyncE Maciej Machnikowski
  2021-12-01 18:02 ` [PATCH v4 net-next 1/4] ice: add support detecting features based on netlist Maciej Machnikowski
@ 2021-12-01 18:02 ` Maciej Machnikowski
  2021-12-02 11:48   ` kernel test robot
  2021-12-02 12:43   ` Ido Schimmel
  2021-12-01 18:02 ` [PATCH v4 net-next 3/4] ice: add support for monitoring SyncE DPLL state Maciej Machnikowski
  2021-12-01 18:02 ` [PATCH v4 net-next 4/4] ice: add support for SyncE recovered clocks Maciej Machnikowski
  3 siblings, 2 replies; 27+ messages in thread
From: Maciej Machnikowski @ 2021-12-01 18:02 UTC (permalink / raw)
  To: maciej.machnikowski, netdev, intel-wired-lan, arkadiusz.kubalewski
  Cc: richardcochran, abyagowi, anthony.l.nguyen, davem, kuba,
	linux-kselftest, idosch, mkubecek, saeed, michael.chan, petrm

Add netlink ethtool messages:
- ETHTOOL_MSG_RCLK_GET
- ETHTOOL_MSG_RCLK_SET
Required for controling basic SyncE functionality - configuration of
recovered reference clocks from PHY port on netdevice supporting SyncE.

- ETHTOOL_MSG_RCLK_GET - will read the current state if pin index is
			given or will return allowed range if it's not

- ETHTOOL_MSG_RCLK_SET - will configure given recovered clock pin to
			output recovered clock

Co-developed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 Documentation/networking/ethtool-netlink.rst |  67 +++++
 include/linux/ethtool.h                      |   9 +
 include/uapi/linux/ethtool_netlink.h         |  21 ++
 net/ethtool/Makefile                         |   3 +-
 net/ethtool/netlink.c                        |  20 ++
 net/ethtool/netlink.h                        |   4 +
 net/ethtool/synce.c                          | 267 +++++++++++++++++++
 7 files changed, 390 insertions(+), 1 deletion(-)
 create mode 100644 net/ethtool/synce.c

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 9d98e0511249..51db9338785a 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -220,6 +220,8 @@ Userspace to kernel:
   ``ETHTOOL_MSG_PHC_VCLOCKS_GET``       get PHC virtual clocks info
   ``ETHTOOL_MSG_MODULE_SET``            set transceiver module parameters
   ``ETHTOOL_MSG_MODULE_GET``            get transceiver module parameters
+  ``ETHTOOL_RCLK_GET``                  get recovered clock parameters
+  ``ETHTOOL_RCLK_SET``                  set recovered clock parameters
   ===================================== =================================
 
 Kernel to userspace:
@@ -260,6 +262,8 @@ Kernel to userspace:
   ``ETHTOOL_MSG_STATS_GET_REPLY``          standard statistics
   ``ETHTOOL_MSG_PHC_VCLOCKS_GET_REPLY``    PHC virtual clocks info
   ``ETHTOOL_MSG_MODULE_GET_REPLY``         transceiver module parameters
+  ``ETHTOOL_MSG_RCLK_GET_REPLY``           reference recovered clock config
+  ``ETHTOOL_MSG_RCLK_NTF``                 reference recovered clock config
   ======================================== =================================
 
 ``GET`` requests are sent by userspace applications to retrieve device
@@ -1598,6 +1602,67 @@ For SFF-8636 modules, low power mode is forced by the host according to table
 For CMIS modules, low power mode is forced by the host according to table 6-12
 in revision 5.0 of the specification.
 
+RCLK_GET
+========
+
+Get status of an output pin for PHY recovered frequency clock.
+
+Request contents:
+
+  ======================================  ======  ==========================
+  ``ETHTOOL_A_RCLK_HEADER``               nested  request header
+  ``ETHTOOL_A_RCLK_OUT_PIN_IDX``          u32     index of a pin
+  ======================================  ======  ==========================
+
+Kernel response contents:
+
+  ======================================  ======  ==========================
+  ``ETHTOOL_A_RCLK_OUT_PIN_IDX``          u32     index of a pin
+  ``ETHTOOL_A_RCLK_PIN_FLAGS``            u32     state of a pin
+  ``ETHTOOL_A_RCLK_RANGE_MIN_PIN``        u32     min index of RCLK pins
+  ``ETHTOOL_A_RCLK_RANGE_MAX_PIN``        u32     max index of RCLK pins
+  ======================================  ======  ==========================
+
+Supported device can have mulitple reference recover clock pins available
+to be used as source of frequency for a DPLL.
+Once a pin on given port is enabled. The PHY recovered frequency is being
+fed onto that pin, and can be used by DPLL to synchonize with its signal.
+Pins don't have to start with index equal 0 - device can also have different
+external sources pins.
+
+The ``ETHTOOL_A_RCLK_OUT_PIN_IDX`` is optional parameter. If present in
+the RCLK_GET request, the ``ETHTOOL_A_RCLK_PIN_ENABLED`` is provided in a
+response, it contatins state of the pin pointed by the index. Values are:
+
+.. kernel-doc:: include/uapi/linux/ethtool.h
+    :identifiers: ethtool_rclk_pin_state
+
+If ``ETHTOOL_A_RCLK_OUT_PIN_IDX`` is not present in the RCLK_GET request,
+the range of available pins is returned:
+``ETHTOOL_A_RCLK_RANGE_MIN_PIN`` is lowest possible index of a pin available
+for recovering frequency from PHY.
+``ETHTOOL_A_RCLK_RANGE_MAX_PIN`` is highest possible index of a pin available
+for recovering frequency from PHY.
+
+RCLK_SET
+==========
+
+Set status of an output pin for PHY recovered frequency clock.
+
+Request contents:
+
+  ======================================  ======  ========================
+  ``ETHTOOL_A_RCLK_HEADER``               nested  request header
+  ``ETHTOOL_A_RCLK_OUT_PIN_IDX``          u32     index of a pin
+  ``ETHTOOL_A_RCLK_PIN_FLAGS``            u32      requested state
+  ======================================  ======  ========================
+
+``ETHTOOL_A_RCLK_OUT_PIN_IDX`` is a index of a pin for which the change of
+state is requested. Values of ``ETHTOOL_A_RCLK_PIN_ENABLED`` are:
+
+.. kernel-doc:: include/uapi/linux/ethtool.h
+    :identifiers: ethtool_rclk_pin_state
+
 Request translation
 ===================
 
@@ -1699,4 +1764,6 @@ are netlink only.
   n/a                                 ``ETHTOOL_MSG_PHC_VCLOCKS_GET``
   n/a                                 ``ETHTOOL_MSG_MODULE_GET``
   n/a                                 ``ETHTOOL_MSG_MODULE_SET``
+  n/a                                 ``ETHTOOL_MSG_RCLK_GET``
+  n/a                                 ``ETHTOOL_MSG_RCLK_SET``
   =================================== =====================================
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index a26f37a27167..e344e5153f9b 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -614,6 +614,9 @@ struct ethtool_module_power_mode_params {
  *	plugged-in.
  * @set_module_power_mode: Set the power mode policy for the plug-in module
  *	used by the network device.
+ * @get_rclk_range: Get range of valid Reference Clock input pins for the port.
+ * @get_rclk_state: Get state of Reference Clock input signal pin.
+ * @set_rclk_out: Enable/disable Reference Clock input signal pin.
  *
  * All operations are optional (i.e. the function pointer may be set
  * to %NULL) and callers must take this into account.  Callers must
@@ -750,6 +753,12 @@ struct ethtool_ops {
 	int	(*set_module_power_mode)(struct net_device *dev,
 					 const struct ethtool_module_power_mode_params *params,
 					 struct netlink_ext_ack *extack);
+	int	(*get_rclk_range)(struct net_device *dev, u32 *min_idx,
+				  u32 *max_idx, struct netlink_ext_ack *extack);
+	int	(*get_rclk_state)(struct net_device *dev, u32 out_idx,
+				  bool *ena, struct netlink_ext_ack *extack);
+	int	(*set_rclk_out)(struct net_device *dev, u32 out_idx, bool ena,
+				struct netlink_ext_ack *extack);
 };
 
 int ethtool_check_ops(const struct ethtool_ops *ops);
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index cca6e474a085..9a860f052a36 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -49,6 +49,8 @@ enum {
 	ETHTOOL_MSG_PHC_VCLOCKS_GET,
 	ETHTOOL_MSG_MODULE_GET,
 	ETHTOOL_MSG_MODULE_SET,
+	ETHTOOL_MSG_RCLK_GET,
+	ETHTOOL_MSG_RCLK_SET,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_USER_CNT,
@@ -94,6 +96,8 @@ enum {
 	ETHTOOL_MSG_PHC_VCLOCKS_GET_REPLY,
 	ETHTOOL_MSG_MODULE_GET_REPLY,
 	ETHTOOL_MSG_MODULE_NTF,
+	ETHTOOL_MSG_RCLK_GET_REPLY,
+	ETHTOOL_MSG_RCLK_NTF,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_KERNEL_CNT,
@@ -853,6 +857,23 @@ enum {
 	ETHTOOL_A_MODULE_MAX = (__ETHTOOL_A_MODULE_CNT - 1)
 };
 
+/* REF CLK */
+
+enum {
+	ETHTOOL_A_RCLK_UNSPEC,
+	ETHTOOL_A_RCLK_HEADER,		/* nest - _A_HEADER_* */
+	ETHTOOL_A_RCLK_OUT_PIN_IDX,	/* u32 */
+	ETHTOOL_A_RCLK_PIN_FLAGS,	/* u32 */
+	ETHTOOL_A_RCLK_PIN_MIN,		/* u32 */
+	ETHTOOL_A_RCLK_PIN_MAX,		/* u32 */
+
+	/* add new constants above here */
+	__ETHTOOL_A_RCLK_CNT,
+	ETHTOOL_A_RCLK_MAX = (__ETHTOOL_A_RCLK_CNT - 1)
+};
+
+#define ETHTOOL_RCLK_PIN_FLAGS_ENA	(1 << 0)
+
 /* generic netlink info */
 #define ETHTOOL_GENL_NAME "ethtool"
 #define ETHTOOL_GENL_VERSION 1
diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile
index b76432e70e6b..dd6de311a9c2 100644
--- a/net/ethtool/Makefile
+++ b/net/ethtool/Makefile
@@ -7,4 +7,5 @@ obj-$(CONFIG_ETHTOOL_NETLINK)	+= ethtool_nl.o
 ethtool_nl-y	:= netlink.o bitset.o strset.o linkinfo.o linkmodes.o \
 		   linkstate.o debug.o wol.o features.o privflags.o rings.o \
 		   channels.o coalesce.o pause.o eee.o tsinfo.o cabletest.o \
-		   tunnels.o fec.o eeprom.o stats.o phc_vclocks.o module.o
+		   tunnels.o fec.o eeprom.o stats.o phc_vclocks.o module.o \
+		   synce.o
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index 38b44c0291b1..76ee82c687fe 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -283,6 +283,7 @@ ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = {
 	[ETHTOOL_MSG_STATS_GET]		= &ethnl_stats_request_ops,
 	[ETHTOOL_MSG_PHC_VCLOCKS_GET]	= &ethnl_phc_vclocks_request_ops,
 	[ETHTOOL_MSG_MODULE_GET]	= &ethnl_module_request_ops,
+	[ETHTOOL_MSG_RCLK_GET]		= &ethnl_rclk_request_ops,
 };
 
 static struct ethnl_dump_ctx *ethnl_dump_context(struct netlink_callback *cb)
@@ -595,6 +596,7 @@ ethnl_default_notify_ops[ETHTOOL_MSG_KERNEL_MAX + 1] = {
 	[ETHTOOL_MSG_EEE_NTF]		= &ethnl_eee_request_ops,
 	[ETHTOOL_MSG_FEC_NTF]		= &ethnl_fec_request_ops,
 	[ETHTOOL_MSG_MODULE_NTF]	= &ethnl_module_request_ops,
+	[ETHTOOL_MSG_RCLK_NTF]		= &ethnl_rclk_request_ops,
 };
 
 /* default notification handler */
@@ -689,6 +691,7 @@ static const ethnl_notify_handler_t ethnl_notify_handlers[] = {
 	[ETHTOOL_MSG_EEE_NTF]		= ethnl_default_notify,
 	[ETHTOOL_MSG_FEC_NTF]		= ethnl_default_notify,
 	[ETHTOOL_MSG_MODULE_NTF]	= ethnl_default_notify,
+	[ETHTOOL_MSG_RCLK_NTF]		= ethnl_default_notify,
 };
 
 void ethtool_notify(struct net_device *dev, unsigned int cmd, const void *data)
@@ -1018,6 +1021,23 @@ static const struct genl_ops ethtool_genl_ops[] = {
 		.policy = ethnl_module_set_policy,
 		.maxattr = ARRAY_SIZE(ethnl_module_set_policy) - 1,
 	},
+	{
+		.cmd	= ETHTOOL_MSG_RCLK_GET,
+		.flags	= GENL_UNS_ADMIN_PERM,
+		.doit	= ethnl_default_doit,
+		.start	= ethnl_default_start,
+		.dumpit	= ethnl_default_dumpit,
+		.done	= ethnl_default_done,
+		.policy = ethnl_rclk_get_policy,
+		.maxattr = ARRAY_SIZE(ethnl_rclk_get_policy) - 1,
+	},
+	{
+		.cmd	= ETHTOOL_MSG_RCLK_SET,
+		.flags	= GENL_UNS_ADMIN_PERM,
+		.doit	= ethnl_set_rclk,
+		.policy = ethnl_rclk_set_policy,
+		.maxattr = ARRAY_SIZE(ethnl_rclk_set_policy) - 1,
+	},
 };
 
 static const struct genl_multicast_group ethtool_nl_mcgrps[] = {
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index 490598e5eedd..bdeb559f0db7 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -338,6 +338,7 @@ extern const struct ethnl_request_ops ethnl_module_eeprom_request_ops;
 extern const struct ethnl_request_ops ethnl_stats_request_ops;
 extern const struct ethnl_request_ops ethnl_phc_vclocks_request_ops;
 extern const struct ethnl_request_ops ethnl_module_request_ops;
+extern const struct ethnl_request_ops ethnl_rclk_request_ops;
 
 extern const struct nla_policy ethnl_header_policy[ETHTOOL_A_HEADER_FLAGS + 1];
 extern const struct nla_policy ethnl_header_policy_stats[ETHTOOL_A_HEADER_FLAGS + 1];
@@ -376,6 +377,8 @@ extern const struct nla_policy ethnl_stats_get_policy[ETHTOOL_A_STATS_GROUPS + 1
 extern const struct nla_policy ethnl_phc_vclocks_get_policy[ETHTOOL_A_PHC_VCLOCKS_HEADER + 1];
 extern const struct nla_policy ethnl_module_get_policy[ETHTOOL_A_MODULE_HEADER + 1];
 extern const struct nla_policy ethnl_module_set_policy[ETHTOOL_A_MODULE_POWER_MODE_POLICY + 1];
+extern const struct nla_policy ethnl_rclk_get_policy[ETHTOOL_A_RCLK_OUT_PIN_IDX + 1];
+extern const struct nla_policy ethnl_rclk_set_policy[ETHTOOL_A_RCLK_PIN_FLAGS + 1];
 
 int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info);
 int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info);
@@ -395,6 +398,7 @@ int ethnl_tunnel_info_start(struct netlink_callback *cb);
 int ethnl_tunnel_info_dumpit(struct sk_buff *skb, struct netlink_callback *cb);
 int ethnl_set_fec(struct sk_buff *skb, struct genl_info *info);
 int ethnl_set_module(struct sk_buff *skb, struct genl_info *info);
+int ethnl_set_rclk(struct sk_buff *skb, struct genl_info *info);
 
 extern const char stats_std_names[__ETHTOOL_STATS_CNT][ETH_GSTRING_LEN];
 extern const char stats_eth_phy_names[__ETHTOOL_A_STATS_ETH_PHY_CNT][ETH_GSTRING_LEN];
diff --git a/net/ethtool/synce.c b/net/ethtool/synce.c
new file mode 100644
index 000000000000..f4ebb4c57d4d
--- /dev/null
+++ b/net/ethtool/synce.c
@@ -0,0 +1,267 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/ethtool.h>
+#include "netlink.h"
+
+struct rclk_out_pin_info {
+	u32 idx;
+	bool valid;
+};
+
+struct rclk_request_data {
+	struct ethnl_req_info base;
+	struct rclk_out_pin_info out_pin;
+};
+
+struct rclk_pin_state_info {
+	u32 range_min;
+	u32 range_max;
+	u32 flags;
+	u32 idx;
+};
+
+struct rclk_reply_data {
+	struct ethnl_reply_data	base;
+	struct rclk_pin_state_info pin_state;
+};
+
+#define RCLK_REPDATA(__reply_base) \
+	container_of(__reply_base, struct rclk_reply_data, base)
+
+#define RCLK_REQDATA(__req_base) \
+	container_of(__req_base, struct rclk_request_data, base)
+
+/* RCLK_GET */
+
+const struct nla_policy
+ethnl_rclk_get_policy[ETHTOOL_A_RCLK_OUT_PIN_IDX + 1] = {
+	[ETHTOOL_A_RCLK_HEADER] = NLA_POLICY_NESTED(ethnl_header_policy),
+	[ETHTOOL_A_RCLK_OUT_PIN_IDX] = { .type = NLA_U32 },
+};
+
+static int rclk_parse_request(struct ethnl_req_info *req_base,
+			      struct nlattr **tb,
+			      struct netlink_ext_ack *extack)
+{
+	struct rclk_request_data *req = RCLK_REQDATA(req_base);
+
+	if (tb[ETHTOOL_A_RCLK_OUT_PIN_IDX]) {
+		req->out_pin.idx = nla_get_u32(tb[ETHTOOL_A_RCLK_OUT_PIN_IDX]);
+		req->out_pin.valid = true;
+	}
+
+	return 0;
+}
+
+static int rclk_state_get(struct net_device *dev,
+			  struct rclk_reply_data *data,
+			  struct netlink_ext_ack *extack,
+			  u32 out_idx)
+{
+	const struct ethtool_ops *ops = dev->ethtool_ops;
+	bool pin_state;
+	int ret;
+
+	if (!ops->get_rclk_state)
+		return -EOPNOTSUPP;
+
+	ret = ops->get_rclk_state(dev, out_idx, &pin_state, extack);
+	if (ret)
+		return ret;
+
+	data->pin_state.flags = pin_state ? ETHTOOL_RCLK_PIN_FLAGS_ENA : 0;
+	data->pin_state.idx = out_idx;
+
+	return ret;
+}
+
+static int rclk_range_get(struct net_device *dev,
+			  struct rclk_reply_data *data,
+			  struct netlink_ext_ack *extack)
+{
+	const struct ethtool_ops *ops = dev->ethtool_ops;
+	u32 min_idx, max_idx;
+	int ret;
+
+	if (!ops->get_rclk_range)
+		return -EOPNOTSUPP;
+
+	ret = ops->get_rclk_range(dev, &min_idx, &max_idx, extack);
+	if (ret)
+		return ret;
+
+	data->pin_state.range_min = min_idx;
+	data->pin_state.range_max = max_idx;
+
+	return ret;
+}
+
+static int rclk_prepare_data(const struct ethnl_req_info *req_base,
+			     struct ethnl_reply_data *reply_base,
+			     struct genl_info *info)
+{
+	struct rclk_reply_data *reply = RCLK_REPDATA(reply_base);
+	struct rclk_request_data *request = RCLK_REQDATA(req_base);
+	struct netlink_ext_ack *extack = info ? info->extack : NULL;
+	struct net_device *dev = reply_base->dev;
+	int ret;
+
+	memset(&reply->pin_state, 0, sizeof(reply->pin_state));
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		return ret;
+
+	if (request->out_pin.valid)
+		ret = rclk_state_get(dev, reply, extack,
+				     request->out_pin.idx);
+	else
+		ret = rclk_range_get(dev, reply, extack);
+
+	ethnl_ops_complete(dev);
+
+	return ret;
+}
+
+static int rclk_fill_reply(struct sk_buff *skb,
+			   const struct ethnl_req_info *req_base,
+			   const struct ethnl_reply_data *reply_base)
+{
+	const struct rclk_reply_data *reply = RCLK_REPDATA(reply_base);
+	const struct rclk_request_data *request = RCLK_REQDATA(req_base);
+
+	if (request->out_pin.valid) {
+		if (nla_put_u32(skb, ETHTOOL_A_RCLK_PIN_FLAGS,
+				reply->pin_state.flags))
+			return -EMSGSIZE;
+		if (nla_put_u32(skb, ETHTOOL_A_RCLK_OUT_PIN_IDX,
+				reply->pin_state.idx))
+			return -EMSGSIZE;
+	} else {
+		if (nla_put_u32(skb, ETHTOOL_A_RCLK_PIN_MIN,
+				reply->pin_state.range_min))
+			return -EMSGSIZE;
+		if (nla_put_u32(skb, ETHTOOL_A_RCLK_PIN_MAX,
+				reply->pin_state.range_max))
+			return -EMSGSIZE;
+	}
+
+	return 0;
+}
+
+static int rclk_reply_size(const struct ethnl_req_info *req_base,
+			   const struct ethnl_reply_data *reply_base)
+{
+	const struct rclk_request_data *request = RCLK_REQDATA(req_base);
+
+	if (request->out_pin.valid)
+		return nla_total_size(sizeof(u32)) +	/* ETHTOOL_A_RCLK_PIN_FLAGS */
+		       nla_total_size(sizeof(u32));	/* ETHTOOL_A_RCLK_OUT_PIN_IDX */
+	else
+		return nla_total_size(sizeof(u32)) +	/* ETHTOOL_A_RCLK_PIN_MIN */
+		       nla_total_size(sizeof(u32));	/* ETHTOOL_A_RCLK_PIN_MAX */
+}
+
+const struct ethnl_request_ops ethnl_rclk_request_ops = {
+	.request_cmd		= ETHTOOL_MSG_RCLK_GET,
+	.reply_cmd		= ETHTOOL_MSG_RCLK_GET_REPLY,
+	.hdr_attr		= ETHTOOL_A_RCLK_HEADER,
+	.req_info_size		= sizeof(struct rclk_request_data),
+	.reply_data_size	= sizeof(struct rclk_reply_data),
+
+	.parse_request		= rclk_parse_request,
+	.prepare_data		= rclk_prepare_data,
+	.reply_size		= rclk_reply_size,
+	.fill_reply		= rclk_fill_reply,
+};
+
+/* RCLK SET */
+
+const struct nla_policy
+ethnl_rclk_set_policy[ETHTOOL_A_RCLK_PIN_FLAGS + 1] = {
+	[ETHTOOL_A_RCLK_HEADER] = NLA_POLICY_NESTED(ethnl_header_policy),
+	[ETHTOOL_A_RCLK_OUT_PIN_IDX] = { .type = NLA_U32 },
+	[ETHTOOL_A_RCLK_PIN_FLAGS] = { .type = NLA_U32 },
+};
+
+static int rclk_set_state(struct net_device *dev, struct nlattr **tb,
+			  bool *p_mod, struct netlink_ext_ack *extack)
+{
+	const struct ethtool_ops *ops = dev->ethtool_ops;
+	bool old_state, new_state;
+	u32 min_idx, max_idx;
+	u32 out_idx;
+	int ret;
+
+	if (!tb[ETHTOOL_A_RCLK_PIN_FLAGS] &&
+	    !tb[ETHTOOL_A_RCLK_OUT_PIN_IDX])
+		return 0;
+
+	if (!ops->set_rclk_out || !ops->get_rclk_range) {
+		NL_SET_ERR_MSG_ATTR(extack,
+				    tb[ETHTOOL_A_RCLK_PIN_FLAGS],
+				    "Setting recovered clock state is not supported by this device");
+		return -EOPNOTSUPP;
+	}
+
+	ret = ops->get_rclk_range(dev, &min_idx, &max_idx, extack);
+	if (ret)
+		return ret;
+
+	out_idx = nla_get_u32(tb[ETHTOOL_A_RCLK_OUT_PIN_IDX]);
+	if (out_idx < min_idx || out_idx > max_idx) {
+		NL_SET_ERR_MSG_ATTR(extack,
+				    tb[ETHTOOL_A_RCLK_OUT_PIN_IDX],
+				    "Requested recovered clock pin index is out of range");
+		return -EINVAL;
+	}
+
+	ret = ops->get_rclk_state(dev, out_idx, &old_state, extack);
+	if (ret < 0)
+		return ret;
+
+	new_state = !!(nla_get_u32(tb[ETHTOOL_A_RCLK_PIN_FLAGS]) &
+		       ETHTOOL_RCLK_PIN_FLAGS_ENA);
+
+	/* If state changed - flag need for sending the notification */
+	*p_mod = old_state != new_state;
+
+	return ops->set_rclk_out(dev, out_idx, new_state, extack);
+}
+
+int ethnl_set_rclk(struct sk_buff *skb, struct genl_info *info)
+{
+	struct ethnl_req_info req_info = {};
+	struct nlattr **tb = info->attrs;
+	struct net_device *dev;
+	bool mod = false;
+	int ret;
+
+	ret = ethnl_parse_header_dev_get(&req_info, tb[ETHTOOL_A_RCLK_HEADER],
+					 genl_info_net(info), info->extack,
+					 true);
+	if (ret < 0)
+		return ret;
+	dev = req_info.dev;
+
+	rtnl_lock();
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		goto out_rtnl;
+
+	ret = rclk_set_state(dev, tb, &mod, info->extack);
+	if (ret < 0)
+		goto out_ops;
+
+	if (!mod)
+		goto out_ops;
+
+	ethtool_notify(dev, ETHTOOL_MSG_RCLK_NTF, NULL);
+
+out_ops:
+	ethnl_ops_complete(dev);
+out_rtnl:
+	rtnl_unlock();
+	dev_put(dev);
+	return ret;
+}
+
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 net-next 3/4] ice: add support for monitoring SyncE DPLL state
  2021-12-01 18:02 [PATCH v4 net-next 0/4] Add ethtool interface for SyncE Maciej Machnikowski
  2021-12-01 18:02 ` [PATCH v4 net-next 1/4] ice: add support detecting features based on netlist Maciej Machnikowski
  2021-12-01 18:02 ` [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature Maciej Machnikowski
@ 2021-12-01 18:02 ` Maciej Machnikowski
  2021-12-01 18:02 ` [PATCH v4 net-next 4/4] ice: add support for SyncE recovered clocks Maciej Machnikowski
  3 siblings, 0 replies; 27+ messages in thread
From: Maciej Machnikowski @ 2021-12-01 18:02 UTC (permalink / raw)
  To: maciej.machnikowski, netdev, intel-wired-lan, arkadiusz.kubalewski
  Cc: richardcochran, abyagowi, anthony.l.nguyen, davem, kuba,
	linux-kselftest, idosch, mkubecek, saeed, michael.chan, petrm

Implement SyncE DPLL monitoring for E810-T devices.
Poll loop will periodically check the state of the DPLL and cache it
in the pf structure. State changes will be logged in the system log.

Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h          |  5 ++
 .../net/ethernet/intel/ice/ice_adminq_cmd.h   | 34 +++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.c   | 36 ++++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.h   |  5 +-
 drivers/net/ethernet/intel/ice/ice_devids.h   |  3 ++
 drivers/net/ethernet/intel/ice/ice_ptp.c      | 35 ++++++++++++++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c   | 48 +++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.h   | 34 +++++++++++++
 8 files changed, 199 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index cb6b4c53584b..2dcc8fd6dff5 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -607,6 +607,11 @@ struct ice_pf {
 #define ICE_VF_AGG_NODE_ID_START	65
 #define ICE_MAX_VF_AGG_NODES		32
 	struct ice_agg_node vf_agg_node[ICE_MAX_VF_AGG_NODES];
+
+	enum ice_eec_state synce_dpll_state;
+	u8 synce_dpll_pin;
+	enum ice_eec_state ptp_dpll_state;
+	u8 ptp_dpll_pin;
 };
 
 struct ice_netdev_priv {
diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
index 339c2a86f680..11226af7a9a4 100644
--- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
@@ -1808,6 +1808,36 @@ struct ice_aqc_add_rdma_qset_data {
 	struct ice_aqc_add_tx_rdma_qset_entry rdma_qsets[];
 };
 
+/* Get CGU DPLL status (direct 0x0C66) */
+struct ice_aqc_get_cgu_dpll_status {
+	u8 dpll_num;
+	u8 ref_state;
+#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_LOS		BIT(0)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_SCM		BIT(1)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_CFM		BIT(2)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_GST		BIT(3)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_PFM		BIT(4)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_ESYNC	BIT(6)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_FAST_LOCK_EN	BIT(7)
+	__le16 dpll_state;
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_LOCK		BIT(0)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO		BIT(1)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO_READY	BIT(2)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_FLHIT		BIT(5)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_PSLHIT	BIT(7)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SHIFT	8
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SEL	\
+	ICE_M(0x1F, ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SHIFT)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_MODE_SHIFT	13
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_MODE \
+	ICE_M(0x7, ICE_AQC_GET_CGU_DPLL_STATUS_STATE_MODE_SHIFT)
+	__le32 phase_offset_h;
+	__le32 phase_offset_l;
+	u8 eec_mode;
+	u8 rsvd[1];
+	__le16 node_handle;
+};
+
 /* Configure Firmware Logging Command (indirect 0xFF09)
  * Logging Information Read Response (indirect 0xFF10)
  * Note: The 0xFF10 command has no input parameters.
@@ -2039,6 +2069,7 @@ struct ice_aq_desc {
 		struct ice_aqc_fw_logging fw_logging;
 		struct ice_aqc_get_clear_fw_log get_clear_fw_log;
 		struct ice_aqc_download_pkg download_pkg;
+		struct ice_aqc_get_cgu_dpll_status get_cgu_dpll_status;
 		struct ice_aqc_driver_shared_params drv_shared_params;
 		struct ice_aqc_set_mac_lb set_mac_lb;
 		struct ice_aqc_alloc_free_res_cmd sw_res_ctrl;
@@ -2205,6 +2236,9 @@ enum ice_adminq_opc {
 	ice_aqc_opc_update_pkg				= 0x0C42,
 	ice_aqc_opc_get_pkg_info_list			= 0x0C43,
 
+	/* 1588/SyncE commands/events */
+	ice_aqc_opc_get_cgu_dpll_status			= 0x0C66,
+
 	ice_aqc_opc_driver_shared_params		= 0x0C90,
 
 	/* Standalone Commands/Events */
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index 35903b282885..8069141ac105 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -4644,6 +4644,42 @@ ice_dis_vsi_rdma_qset(struct ice_port_info *pi, u16 count, u32 *qset_teid,
 	return ice_status_to_errno(status);
 }
 
+/**
+ * ice_aq_get_cgu_dpll_status
+ * @hw: pointer to the HW struct
+ * @dpll_num: DPLL index
+ * @ref_state: Reference clock state
+ * @dpll_state: DPLL state
+ * @phase_offset: Phase offset in ps
+ * @eec_mode: EEC_mode
+ *
+ * Get CGU DPLL status (0x0C66)
+ */
+enum ice_status
+ice_aq_get_cgu_dpll_status(struct ice_hw *hw, u8 dpll_num, u8 *ref_state,
+			   u16 *dpll_state, u64 *phase_offset, u8 *eec_mode)
+{
+	struct ice_aqc_get_cgu_dpll_status *cmd;
+	struct ice_aq_desc desc;
+	enum ice_status status;
+
+	ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_get_cgu_dpll_status);
+	cmd = &desc.params.get_cgu_dpll_status;
+	cmd->dpll_num = dpll_num;
+
+	status = ice_aq_send_cmd(hw, &desc, NULL, 0, NULL);
+	if (!status) {
+		*ref_state = cmd->ref_state;
+		*dpll_state = le16_to_cpu(cmd->dpll_state);
+		*phase_offset = le32_to_cpu(cmd->phase_offset_h);
+		*phase_offset <<= 32;
+		*phase_offset += le32_to_cpu(cmd->phase_offset_l);
+		*eec_mode = cmd->eec_mode;
+	}
+
+	return status;
+}
+
 /**
  * ice_replay_pre_init - replay pre initialization
  * @hw: pointer to the HW struct
diff --git a/drivers/net/ethernet/intel/ice/ice_common.h b/drivers/net/ethernet/intel/ice/ice_common.h
index b20a5c085246..aaed388a40a8 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.h
+++ b/drivers/net/ethernet/intel/ice/ice_common.h
@@ -106,6 +106,7 @@ enum ice_status
 ice_aq_manage_mac_write(struct ice_hw *hw, const u8 *mac_addr, u8 flags,
 			struct ice_sq_cd *cd);
 bool ice_is_e810(struct ice_hw *hw);
+bool ice_is_e810t(struct ice_hw *hw);
 enum ice_status ice_clear_pf_cfg(struct ice_hw *hw);
 enum ice_status
 ice_aq_set_phy_cfg(struct ice_hw *hw, struct ice_port_info *pi,
@@ -162,6 +163,9 @@ ice_cfg_vsi_rdma(struct ice_port_info *pi, u16 vsi_handle, u16 tc_bitmap,
 int
 ice_ena_vsi_rdma_qset(struct ice_port_info *pi, u16 vsi_handle, u8 tc,
 		      u16 *rdma_qset, u16 num_qsets, u32 *qset_teid);
+enum ice_status
+ice_aq_get_cgu_dpll_status(struct ice_hw *hw, u8 dpll_num, u8 *ref_state,
+			   u16 *dpll_state, u64 *phase_offset, u8 *eec_mode);
 int
 ice_dis_vsi_rdma_qset(struct ice_port_info *pi, u16 count, u32 *qset_teid,
 		      u16 *q_id);
@@ -189,7 +193,6 @@ ice_stat_update40(struct ice_hw *hw, u32 reg, bool prev_stat_loaded,
 void
 ice_stat_update32(struct ice_hw *hw, u32 reg, bool prev_stat_loaded,
 		  u64 *prev_stat, u64 *cur_stat);
-bool ice_is_e810t(struct ice_hw *hw);
 enum ice_status
 ice_sched_query_elem(struct ice_hw *hw, u32 node_teid,
 		     struct ice_aqc_txsched_elem_data *buf);
diff --git a/drivers/net/ethernet/intel/ice/ice_devids.h b/drivers/net/ethernet/intel/ice/ice_devids.h
index 61dd2f18dee8..0b654d417d29 100644
--- a/drivers/net/ethernet/intel/ice/ice_devids.h
+++ b/drivers/net/ethernet/intel/ice/ice_devids.h
@@ -58,4 +58,7 @@
 /* Intel(R) Ethernet Connection E822-L 1GbE */
 #define ICE_DEV_ID_E822L_SGMII		0x189A
 
+#define ICE_SUBDEV_ID_E810T		0x000E
+#define ICE_SUBDEV_ID_E810T2		0x000F
+
 #endif /* _ICE_DEVIDS_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index bf7247c6f58e..bb502c19d53a 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -1766,6 +1766,36 @@ static void ice_ptp_tx_tstamp_cleanup(struct ice_ptp_tx *tx)
 	}
 }
 
+static void ice_handle_cgu_state(struct ice_pf *pf)
+{
+	enum ice_eec_state cgu_state;
+	u8 pin;
+
+	cgu_state = ice_get_zl_dpll_state(&pf->hw, ICE_CGU_DPLL_SYNCE, &pin);
+	if (pf->synce_dpll_state != cgu_state) {
+		pf->synce_dpll_state = cgu_state;
+		pf->synce_dpll_pin = pin;
+
+		dev_warn(ice_pf_to_dev(pf),
+			 "<DPLL%i> state changed to: %d, pin %d",
+			 ICE_CGU_DPLL_SYNCE,
+			 pf->synce_dpll_state,
+			 pin);
+	}
+
+	cgu_state = ice_get_zl_dpll_state(&pf->hw, ICE_CGU_DPLL_PTP, &pin);
+	if (pf->ptp_dpll_state != cgu_state) {
+		pf->ptp_dpll_state = cgu_state;
+		pf->ptp_dpll_pin = pin;
+
+		dev_warn(ice_pf_to_dev(pf),
+			 "<DPLL%i> state changed to: %d, pin %d",
+			 ICE_CGU_DPLL_PTP,
+			 pf->ptp_dpll_state,
+			 pin);
+	}
+}
+
 static void ice_ptp_periodic_work(struct kthread_work *work)
 {
 	struct ice_ptp *ptp = container_of(work, struct ice_ptp, work.work);
@@ -1774,6 +1804,10 @@ static void ice_ptp_periodic_work(struct kthread_work *work)
 	if (!test_bit(ICE_FLAG_PTP, pf->flags))
 		return;
 
+	if (ice_is_feature_supported(pf, ICE_F_CGU) &&
+	    pf->hw.func_caps.ts_func_info.src_tmr_owned)
+		ice_handle_cgu_state(pf);
+
 	ice_ptp_update_cached_phctime(pf);
 
 	ice_ptp_tx_tstamp_cleanup(&pf->ptp.port.tx);
@@ -1958,3 +1992,4 @@ void ice_ptp_release(struct ice_pf *pf)
 
 	dev_info(ice_pf_to_dev(pf), "Removed PTP clock\n");
 }
+
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
index aa257db36765..b4300bf3e4ce 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
@@ -375,6 +375,54 @@ static int ice_ptp_port_cmd_e810(struct ice_hw *hw, enum ice_ptp_tmr_cmd cmd)
 	return 0;
 }
 
+/**
+ * ice_get_zl_dpll_state - get the state of the DPLL
+ * @hw: pointer to the hw struct
+ * @dpll_idx: Index of internal DPLL unit
+ * @pin: pointer to a buffer for returning currently active pin
+ *
+ * This function will read the state of the DPLL(dpll_idx). If optional
+ * parameter pin is given it'll be used to retrieve currently active pin.
+ *
+ * Return: state of the DPLL
+ */
+enum ice_eec_state
+ice_get_zl_dpll_state(struct ice_hw *hw, u8 dpll_idx, u8 *pin)
+{
+	enum ice_status status;
+	u64 phase_offset;
+	u16 dpll_state;
+	u8 ref_state;
+	u8 eec_mode;
+
+	if (dpll_idx >= ICE_CGU_DPLL_MAX)
+		return ICE_EEC_STATE_INVALID;
+
+	status = ice_aq_get_cgu_dpll_status(hw, dpll_idx, &ref_state,
+					    &dpll_state, &phase_offset,
+					    &eec_mode);
+	if (status)
+		return ICE_EEC_STATE_INVALID;
+
+	if (pin) {
+		/* current ref pin in dpll_state_refsel_status_X register */
+		*pin = (dpll_state &
+			ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SEL) >>
+		       ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SHIFT;
+	}
+
+	if (dpll_state & ICE_AQC_GET_CGU_DPLL_STATUS_STATE_LOCK) {
+		if (dpll_state & ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO_READY)
+			return ICE_EEC_STATE_LOCKED_HO_ACQ;
+		else
+			return ICE_EEC_STATE_LOCKED;
+	} else if ((dpll_state & ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO) &&
+		  (dpll_state & ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO_READY)) {
+		return ICE_EEC_STATE_HOLDOVER;
+	}
+	return ICE_EEC_STATE_FREERUN;
+}
+
 /* Device agnostic functions
  *
  * The following functions implement useful behavior to hide the differences
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h
index b2984b5c22c1..28b04ec40bae 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h
+++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h
@@ -12,6 +12,18 @@ enum ice_ptp_tmr_cmd {
 	READ_TIME
 };
 
+enum ice_eec_state {
+	ICE_EEC_STATE_INVALID = 0,       /* state is not valid */
+	ICE_EEC_STATE_FREERUN,           /* clock is free-running */
+	ICE_EEC_STATE_LOCKED,            /* clock is locked to the reference,
+					  * but the holdover memory is not valid
+					  */
+	ICE_EEC_STATE_LOCKED_HO_ACQ,     /* clock is locked to the reference
+					  * and holdover memory is valid
+					  */
+	ICE_EEC_STATE_HOLDOVER,          /* clock is in holdover mode */
+};
+
 /* Increment value to generate nanoseconds in the GLTSYN_TIME_L register for
  * the E810 devices. Based off of a PLL with an 812.5 MHz frequency.
  */
@@ -33,6 +45,8 @@ int ice_ptp_init_phy_e810(struct ice_hw *hw);
 int ice_read_sma_ctrl_e810t(struct ice_hw *hw, u8 *data);
 int ice_write_sma_ctrl_e810t(struct ice_hw *hw, u8 data);
 bool ice_is_pca9575_present(struct ice_hw *hw);
+enum ice_eec_state
+ice_get_zl_dpll_state(struct ice_hw *hw, u8 dpll_idx, u8 *pin);
 
 #define PFTSYN_SEM_BYTES	4
 
@@ -98,4 +112,24 @@ bool ice_is_pca9575_present(struct ice_hw *hw);
 #define ICE_SMA_MAX_BIT_E810T	7
 #define ICE_PCA9575_P1_OFFSET	8
 
+enum ice_e810t_cgu_dpll {
+	ICE_CGU_DPLL_SYNCE,
+	ICE_CGU_DPLL_PTP,
+	ICE_CGU_DPLL_MAX
+};
+
+enum ice_e810t_cgu_pins {
+	REF0P,
+	REF0N,
+	REF1P,
+	REF1N,
+	REF2P,
+	REF2N,
+	REF3P,
+	REF3N,
+	REF4P,
+	REF4N,
+	NUM_E810T_CGU_PINS
+};
+
 #endif /* _ICE_PTP_HW_H_ */
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 net-next 4/4] ice: add support for SyncE recovered clocks
  2021-12-01 18:02 [PATCH v4 net-next 0/4] Add ethtool interface for SyncE Maciej Machnikowski
                   ` (2 preceding siblings ...)
  2021-12-01 18:02 ` [PATCH v4 net-next 3/4] ice: add support for monitoring SyncE DPLL state Maciej Machnikowski
@ 2021-12-01 18:02 ` Maciej Machnikowski
  2021-12-02  1:56   ` Jakub Kicinski
  3 siblings, 1 reply; 27+ messages in thread
From: Maciej Machnikowski @ 2021-12-01 18:02 UTC (permalink / raw)
  To: maciej.machnikowski, netdev, intel-wired-lan, arkadiusz.kubalewski
  Cc: richardcochran, abyagowi, anthony.l.nguyen, davem, kuba,
	linux-kselftest, idosch, mkubecek, saeed, michael.chan, petrm

Implement ethtool netlink functions for handling SyncE recovered clocks
configuration on ice driver:
- ETHTOOL_MSG_RCLK_SET
- ETHTOOL_MSG_RCLK_GET

Co-developed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 .../net/ethernet/intel/ice/ice_adminq_cmd.h   | 29 ++++++
 drivers/net/ethernet/intel/ice/ice_common.c   | 65 +++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.h   |  6 ++
 drivers/net/ethernet/intel/ice/ice_ethtool.c  | 97 +++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.h   |  2 +
 5 files changed, 199 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
index 11226af7a9a4..aed03200bb99 100644
--- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
@@ -1281,6 +1281,31 @@ struct ice_aqc_set_mac_lb {
 	u8 reserved[15];
 };
 
+/* Set PHY recovered clock output (direct 0x0630) */
+struct ice_aqc_set_phy_rec_clk_out {
+	u8 phy_output;
+	u8 port_num;
+	u8 flags;
+#define ICE_AQC_SET_PHY_REC_CLK_OUT_OUT_EN	BIT(0)
+#define ICE_AQC_SET_PHY_REC_CLK_OUT_CURR_PORT	0xFF
+	u8 rsvd;
+	__le32 freq;
+	u8 rsvd2[6];
+	__le16 node_handle;
+};
+
+/* Get PHY recovered clock output (direct 0x0631) */
+struct ice_aqc_get_phy_rec_clk_out {
+	u8 phy_output;
+	u8 port_num;
+	u8 flags;
+#define ICE_AQC_GET_PHY_REC_CLK_OUT_OUT_EN	BIT(0)
+	u8 rsvd;
+	__le32 freq;
+	u8 rsvd2[6];
+	__le16 node_handle;
+};
+
 struct ice_aqc_link_topo_params {
 	u8 lport_num;
 	u8 lport_num_valid;
@@ -2033,6 +2058,8 @@ struct ice_aq_desc {
 		struct ice_aqc_get_phy_caps get_phy;
 		struct ice_aqc_set_phy_cfg set_phy;
 		struct ice_aqc_restart_an restart_an;
+		struct ice_aqc_set_phy_rec_clk_out set_phy_rec_clk_out;
+		struct ice_aqc_get_phy_rec_clk_out get_phy_rec_clk_out;
 		struct ice_aqc_gpio read_write_gpio;
 		struct ice_aqc_sff_eeprom read_write_sff_param;
 		struct ice_aqc_set_port_id_led set_port_id_led;
@@ -2188,6 +2215,8 @@ enum ice_adminq_opc {
 	ice_aqc_opc_get_link_status			= 0x0607,
 	ice_aqc_opc_set_event_mask			= 0x0613,
 	ice_aqc_opc_set_mac_lb				= 0x0620,
+	ice_aqc_opc_set_phy_rec_clk_out			= 0x0630,
+	ice_aqc_opc_get_phy_rec_clk_out			= 0x0631,
 	ice_aqc_opc_get_link_topo			= 0x06E0,
 	ice_aqc_opc_set_port_id_led			= 0x06E9,
 	ice_aqc_opc_set_gpio				= 0x06EC,
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index 8069141ac105..29d302ea1e56 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -5242,3 +5242,68 @@ bool ice_is_clock_mux_present_e810t(struct ice_hw *hw)
 	return true;
 }
 
+/**
+ * ice_aq_set_phy_rec_clk_out - set RCLK phy out
+ * @hw: pointer to the HW struct
+ * @phy_output: PHY reference clock output pin
+ * @enable: GPIO state to be applied
+ * @freq: PHY output frequency
+ *
+ * Set CGU reference priority (0x0630)
+ * Return 0 on success or negative value on failure.
+ */
+enum ice_status
+ice_aq_set_phy_rec_clk_out(struct ice_hw *hw, u8 phy_output, bool enable,
+			   u32 *freq)
+{
+	struct ice_aqc_set_phy_rec_clk_out *cmd;
+	struct ice_aq_desc desc;
+	enum ice_status status;
+
+	ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_set_phy_rec_clk_out);
+	cmd = &desc.params.set_phy_rec_clk_out;
+	cmd->phy_output = phy_output;
+	cmd->port_num = ICE_AQC_SET_PHY_REC_CLK_OUT_CURR_PORT;
+	cmd->flags = enable & ICE_AQC_SET_PHY_REC_CLK_OUT_OUT_EN;
+	cmd->freq = cpu_to_le32(*freq);
+
+	status = ice_aq_send_cmd(hw, &desc, NULL, 0, NULL);
+	if (!status)
+		*freq = le32_to_cpu(cmd->freq);
+
+	return status;
+}
+
+/**
+ * ice_aq_get_phy_rec_clk_out
+ * @hw: pointer to the HW struct
+ * @phy_output: PHY reference clock output pin
+ * @port_num: Port number
+ * @flags: PHY flags
+ * @freq: PHY output frequency
+ *
+ * Get PHY recovered clock output (0x0631)
+ */
+enum ice_status
+ice_aq_get_phy_rec_clk_out(struct ice_hw *hw, u8 phy_output, u8 *port_num,
+			   u8 *flags, u32 *freq)
+{
+	struct ice_aqc_get_phy_rec_clk_out *cmd;
+	struct ice_aq_desc desc;
+	enum ice_status status;
+
+	ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_get_phy_rec_clk_out);
+	cmd = &desc.params.get_phy_rec_clk_out;
+	cmd->phy_output = phy_output;
+	cmd->port_num = *port_num;
+
+	status = ice_aq_send_cmd(hw, &desc, NULL, 0, NULL);
+	if (!status) {
+		*port_num = cmd->port_num;
+		*flags = cmd->flags;
+		*freq = le32_to_cpu(cmd->freq);
+	}
+
+	return status;
+}
+
diff --git a/drivers/net/ethernet/intel/ice/ice_common.h b/drivers/net/ethernet/intel/ice/ice_common.h
index aaed388a40a8..8a99c8364173 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.h
+++ b/drivers/net/ethernet/intel/ice/ice_common.h
@@ -166,6 +166,12 @@ ice_ena_vsi_rdma_qset(struct ice_port_info *pi, u16 vsi_handle, u8 tc,
 enum ice_status
 ice_aq_get_cgu_dpll_status(struct ice_hw *hw, u8 dpll_num, u8 *ref_state,
 			   u16 *dpll_state, u64 *phase_offset, u8 *eec_mode);
+enum ice_status
+ice_aq_set_phy_rec_clk_out(struct ice_hw *hw, u8 phy_output, bool enable,
+			   u32 *freq);
+enum ice_status
+ice_aq_get_phy_rec_clk_out(struct ice_hw *hw, u8 phy_output, u8 *port_num,
+			   u8 *flags, u32 *freq);
 int
 ice_dis_vsi_rdma_qset(struct ice_port_info *pi, u16 count, u32 *qset_teid,
 		      u16 *q_id);
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index 5af2faaa21e1..c9e16bb9470e 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -4076,6 +4076,100 @@ ice_get_module_eeprom(struct net_device *netdev,
 	return 0;
 }
 
+/**
+ * ice_get_rclk_range - get range of recovered clock indices
+ * @netdev: network interface device structure
+ * @min_idx: min rclk index
+ * @max_idx: max rclk index
+ * @ena_mask: bitmask of pin states
+ * @extack: netlink extended ack
+ */
+static int
+ice_get_rclk_range(struct net_device *netdev, u32 *min_idx, u32 *max_idx,
+		   struct netlink_ext_ack *extack)
+{
+	struct ice_netdev_priv *np = netdev_priv(netdev);
+	struct ice_vsi *vsi = np->vsi;
+	struct ice_pf *pf = vsi->back;
+
+	if (!ice_is_feature_supported(pf, ICE_F_CGU))
+		return -EOPNOTSUPP;
+
+	*min_idx = 0;
+	*max_idx = ICE_RCLK_PIN_MAX;
+
+	return 0;
+}
+
+/**
+ * ice_get_rclk_state - get state of a recovered frequency output pin
+ * @netdev: network interface device structure
+ * @out_idx: index of a questioned pin
+ * @ena: returned state of a pin
+ * @extack: netlink extended ack
+ */
+static int
+ice_get_rclk_state(struct net_device *netdev, u32 out_idx,
+		   bool *ena, struct netlink_ext_ack *extack)
+{
+	u8 port_num = ICE_AQC_SET_PHY_REC_CLK_OUT_CURR_PORT, flags;
+	struct ice_netdev_priv *np = netdev_priv(netdev);
+	struct ice_vsi *vsi = np->vsi;
+	struct ice_pf *pf = vsi->back;
+	u32 freq;
+	int ret;
+
+	if (!ice_is_feature_supported(pf, ICE_F_CGU))
+		return -EOPNOTSUPP;
+
+	if (out_idx > ICE_RCLK_PIN_MAX)
+		return -EINVAL;
+
+	ret = ice_aq_get_phy_rec_clk_out(&pf->hw, out_idx,
+					 &port_num, &flags, &freq);
+	if (ret)
+		return ret;
+
+	if (flags & ICE_AQC_GET_PHY_REC_CLK_OUT_OUT_EN)
+		*ena = true;
+	else
+		*ena = false;
+
+	return ret;
+}
+
+/**
+ * ice_set_rclk_out - enable/disable recovered clock redirection to the
+ * output pin
+ * @netdev: network interface device structure
+ * @out_idx: index of pin being configured
+ * @ena: requested state of a pin
+ * @extack: netlink extended ack
+ */
+static int
+ice_set_rclk_out(struct net_device *netdev, u32 out_idx, bool ena,
+		 struct netlink_ext_ack *extack)
+{
+	struct ice_netdev_priv *np = netdev_priv(netdev);
+	struct ice_vsi *vsi = np->vsi;
+	struct ice_pf *pf = vsi->back;
+	enum ice_status ret;
+	u32 freq;
+
+	if (!ice_is_feature_supported(pf, ICE_F_CGU))
+		return -EOPNOTSUPP;
+
+	if (out_idx > ICE_RCLK_PIN_MAX)
+		return -EINVAL;
+
+	ret = ice_aq_set_phy_rec_clk_out(&pf->hw, out_idx,
+					 ena, &freq);
+	if (ret)
+		return ret;
+
+	return ret;
+}
+
 static const struct ethtool_ops ice_ethtool_ops = {
 	.supported_coalesce_params = ETHTOOL_COALESCE_USECS |
 				     ETHTOOL_COALESCE_USE_ADAPTIVE |
@@ -4121,6 +4215,9 @@ static const struct ethtool_ops ice_ethtool_ops = {
 	.set_fecparam		= ice_set_fecparam,
 	.get_module_info	= ice_get_module_info,
 	.get_module_eeprom	= ice_get_module_eeprom,
+	.get_rclk_range		= ice_get_rclk_range,
+	.get_rclk_state		= ice_get_rclk_state,
+	.set_rclk_out		= ice_set_rclk_out,
 };
 
 static const struct ethtool_ops ice_ethtool_safe_mode_ops = {
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h
index 28b04ec40bae..865ca680b62e 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h
+++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h
@@ -132,4 +132,6 @@ enum ice_e810t_cgu_pins {
 	NUM_E810T_CGU_PINS
 };
 
+#define ICE_RCLK_PIN_MAX	(REF1N - REF1P)
+
 #endif /* _ICE_PTP_HW_H_ */
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 4/4] ice: add support for SyncE recovered clocks
  2021-12-01 18:02 ` [PATCH v4 net-next 4/4] ice: add support for SyncE recovered clocks Maciej Machnikowski
@ 2021-12-02  1:56   ` Jakub Kicinski
  0 siblings, 0 replies; 27+ messages in thread
From: Jakub Kicinski @ 2021-12-02  1:56 UTC (permalink / raw)
  To: Maciej Machnikowski
  Cc: netdev, intel-wired-lan, arkadiusz.kubalewski, richardcochran,
	abyagowi, anthony.l.nguyen, davem, linux-kselftest, idosch,
	mkubecek, saeed, michael.chan, petrm

On Wed,  1 Dec 2021 19:02:08 +0100 Maciej Machnikowski wrote:
> Implement ethtool netlink functions for handling SyncE recovered clocks
> configuration on ice driver:
> - ETHTOOL_MSG_RCLK_SET
> - ETHTOOL_MSG_RCLK_GET
> 
> Co-developed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
> Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>

drivers/net/ethernet/intel/ice/ice_ethtool.c:4090: warning: Excess function parameter 'ena_mask' description in 'ice_get_rclk_range'

drivers/net/ethernet/intel/ice/ice_dcb_nl.c:66:6: warning: variable 'bwcfg' set but not used [-Wunused-but-set-variable]
        int bwcfg = 0, bwrec = 0;
            ^

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-01 18:02 ` [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature Maciej Machnikowski
@ 2021-12-02 11:48   ` kernel test robot
  2021-12-02 12:43   ` Ido Schimmel
  1 sibling, 0 replies; 27+ messages in thread
From: kernel test robot @ 2021-12-02 11:48 UTC (permalink / raw)
  To: Maciej Machnikowski, netdev, intel-wired-lan, arkadiusz.kubalewski
  Cc: kbuild-all, richardcochran, abyagowi, anthony.l.nguyen, davem,
	kuba, linux-kselftest

Hi Maciej,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Maciej-Machnikowski/Add-ethtool-interface-for-SyncE/20211202-021915
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 23ea630f86c70cbe6691f9f839e7b6742f0e9ad3
reproduce: make htmldocs

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

include/uapi/linux/ethtool.h:1: warning: 'ethtool_rclk_pin_state' not found

vim +/ethtool_rclk_pin_state +1 include/uapi/linux/ethtool.h

6f52b16c5b29b8 Greg Kroah-Hartman 2017-11-01  @1  /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
607ca46e97a1b6 David Howells      2012-10-13   2  /*
607ca46e97a1b6 David Howells      2012-10-13   3   * ethtool.h: Defines for Linux ethtool.
607ca46e97a1b6 David Howells      2012-10-13   4   *
607ca46e97a1b6 David Howells      2012-10-13   5   * Copyright (C) 1998 David S. Miller (davem@redhat.com)
607ca46e97a1b6 David Howells      2012-10-13   6   * Copyright 2001 Jeff Garzik <jgarzik@pobox.com>
607ca46e97a1b6 David Howells      2012-10-13   7   * Portions Copyright 2001 Sun Microsystems (thockin@sun.com)
607ca46e97a1b6 David Howells      2012-10-13   8   * Portions Copyright 2002 Intel (eli.kupermann@intel.com,
607ca46e97a1b6 David Howells      2012-10-13   9   *                                christopher.leech@intel.com,
607ca46e97a1b6 David Howells      2012-10-13  10   *                                scott.feldman@intel.com)
607ca46e97a1b6 David Howells      2012-10-13  11   * Portions Copyright (C) Sun Microsystems 2008
607ca46e97a1b6 David Howells      2012-10-13  12   */
607ca46e97a1b6 David Howells      2012-10-13  13  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-01 18:02 ` [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature Maciej Machnikowski
  2021-12-02 11:48   ` kernel test robot
@ 2021-12-02 12:43   ` Ido Schimmel
  2021-12-02 15:17     ` Machnikowski, Maciej
  1 sibling, 1 reply; 27+ messages in thread
From: Ido Schimmel @ 2021-12-02 12:43 UTC (permalink / raw)
  To: Maciej Machnikowski
  Cc: netdev, intel-wired-lan, arkadiusz.kubalewski, richardcochran,
	abyagowi, anthony.l.nguyen, davem, kuba, linux-kselftest,
	mkubecek, saeed, michael.chan, petrm

On Wed, Dec 01, 2021 at 07:02:06PM +0100, Maciej Machnikowski wrote:
> +RCLK_GET
> +========
> +
> +Get status of an output pin for PHY recovered frequency clock.
> +
> +Request contents:
> +
> +  ======================================  ======  ==========================
> +  ``ETHTOOL_A_RCLK_HEADER``               nested  request header
> +  ``ETHTOOL_A_RCLK_OUT_PIN_IDX``          u32     index of a pin
> +  ======================================  ======  ==========================
> +
> +Kernel response contents:
> +
> +  ======================================  ======  ==========================
> +  ``ETHTOOL_A_RCLK_OUT_PIN_IDX``          u32     index of a pin
> +  ``ETHTOOL_A_RCLK_PIN_FLAGS``            u32     state of a pin
> +  ``ETHTOOL_A_RCLK_RANGE_MIN_PIN``        u32     min index of RCLK pins
> +  ``ETHTOOL_A_RCLK_RANGE_MAX_PIN``        u32     max index of RCLK pins
> +  ======================================  ======  ==========================
> +
> +Supported device can have mulitple reference recover clock pins available

s/mulitple/multiple/

> +to be used as source of frequency for a DPLL.
> +Once a pin on given port is enabled. The PHY recovered frequency is being
> +fed onto that pin, and can be used by DPLL to synchonize with its signal.

s/synchonize/synchronize/

Please run a spell checker on documentation

> +Pins don't have to start with index equal 0 - device can also have different
> +external sources pins.
> +
> +The ``ETHTOOL_A_RCLK_OUT_PIN_IDX`` is optional parameter. If present in
> +the RCLK_GET request, the ``ETHTOOL_A_RCLK_PIN_ENABLED`` is provided in a

The `ETHTOOL_A_RCLK_PIN_ENABLED` attribute is no where to be found in
this submission

> +response, it contatins state of the pin pointed by the index. Values are:

s/contatins/contains/

> +
> +.. kernel-doc:: include/uapi/linux/ethtool.h
> +    :identifiers: ethtool_rclk_pin_state

This structure is also no where to be found

> +
> +If ``ETHTOOL_A_RCLK_OUT_PIN_IDX`` is not present in the RCLK_GET request,
> +the range of available pins is returned:
> +``ETHTOOL_A_RCLK_RANGE_MIN_PIN`` is lowest possible index of a pin available
> +for recovering frequency from PHY.
> +``ETHTOOL_A_RCLK_RANGE_MAX_PIN`` is highest possible index of a pin available
> +for recovering frequency from PHY.
> +
> +RCLK_SET
> +==========
> +
> +Set status of an output pin for PHY recovered frequency clock.
> +
> +Request contents:
> +
> +  ======================================  ======  ========================
> +  ``ETHTOOL_A_RCLK_HEADER``               nested  request header
> +  ``ETHTOOL_A_RCLK_OUT_PIN_IDX``          u32     index of a pin
> +  ``ETHTOOL_A_RCLK_PIN_FLAGS``            u32      requested state
> +  ======================================  ======  ========================
> +
> +``ETHTOOL_A_RCLK_OUT_PIN_IDX`` is a index of a pin for which the change of
> +state is requested. Values of ``ETHTOOL_A_RCLK_PIN_ENABLED`` are:
> +
> +.. kernel-doc:: include/uapi/linux/ethtool.h
> +    :identifiers: ethtool_rclk_pin_state

Same.

Looking at the diagram from the previous submission [1]:

      ┌──────────┬──────────┐
      │ RX       │ TX       │
  1   │ ports    │ ports    │ 1
  ───►├─────┐    │          ├─────►
  2   │     │    │          │ 2
  ───►├───┐ │    │          ├─────►
  3   │   │ │    │          │ 3
  ───►├─┐ │ │    │          ├─────►
      │ ▼ ▼ ▼    │          │
      │ ──────   │          │
      │ \____/   │          │
      └──┼──┼────┴──────────┘
        1│ 2│        ▲
 RCLK out│  │        │ TX CLK in
         ▼  ▼        │
       ┌─────────────┴───┐
       │                 │
       │       SEC       │
       │                 │
       └─────────────────┘

Given a netdev (1, 2 or 3 in the diagram), the RCLK_SET message allows
me to redirect the frequency recovered from this netdev to the EEC via
either pin 1, pin 2 or both.

Given a netdev, the RCLK_GET message allows me to query the range of
pins (RCLK out 1-2 in the diagram) through which the frequency can be
fed into the EEC.

Questions:

1. The query for all the above netdevs will return the same range of
pins. How does user space know that these are the same pins? That is,
how does user space know that RCLK_SET message to redirect the frequency
recovered from netdev 1 to pin 1 will be overridden by the same message
but for netdev 2?

2. How does user space know the mapping between a netdev and an EEC?
That is, how does user space know that RCLK_SET message for netdev 1
will cause the Tx frequency of netdev 2 to change according to the
frequency recovered from netdev 1?

3. If user space sends two RCLK_SET messages to redirect the frequency
recovered from netdev 1 to RCLK out 1 and from netdev 2 to RCLK out 2,
how does it know which recovered frequency is actually used an input to
the EEC?

4. Why these pins are represented as attributes of a netdev and not as
attributes of the EEC? That is, why are they represented as output pins
of the PHY as opposed to input pins of the EEC?

5. What is the problem with the following model?

- The EEC is a separate object with following attributes:
  * State: Invalid / Freerun / Locked / etc
  * Sources: Netdev / external / etc
  * Potentially more

- Notifications are emitted to user space when the state of the EEC
  changes. Drivers will either poll the state from the device or get
  interrupts

- The mapping from netdev to EEC is queried via ethtool

[1] https://lore.kernel.org/netdev/20211110114448.2792314-1-maciej.machnikowski@intel.com/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-02 12:43   ` Ido Schimmel
@ 2021-12-02 15:17     ` Machnikowski, Maciej
  2021-12-02 16:35       ` Ido Schimmel
  0 siblings, 1 reply; 27+ messages in thread
From: Machnikowski, Maciej @ 2021-12-02 15:17 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, intel-wired-lan, Kubalewski, Arkadiusz, richardcochran,
	abyagowi, Nguyen, Anthony L, davem, kuba, linux-kselftest,
	mkubecek, saeed, michael.chan, petrm

> -----Original Message-----
> From: Ido Schimmel <idosch@idosch.org>
> Sent: Thursday, December 2, 2021 1:44 PM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> recovered clock for SyncE feature
> 
> On Wed, Dec 01, 2021 at 07:02:06PM +0100, Maciej Machnikowski wrote:
> > +RCLK_GET
> > +========
> > +
> > +Get status of an output pin for PHY recovered frequency clock.
> > +
> > +Request contents:
> > +
> > +  ======================================  ======
> ==========================
> > +  ``ETHTOOL_A_RCLK_HEADER``               nested  request header
> > +  ``ETHTOOL_A_RCLK_OUT_PIN_IDX``          u32     index of a pin
> > +  ======================================  ======
> ==========================
> > +
> > +Kernel response contents:
> > +
> > +  ======================================  ======
> ==========================
> > +  ``ETHTOOL_A_RCLK_OUT_PIN_IDX``          u32     index of a pin
> > +  ``ETHTOOL_A_RCLK_PIN_FLAGS``            u32     state of a pin
> > +  ``ETHTOOL_A_RCLK_RANGE_MIN_PIN``        u32     min index of RCLK pins
> > +  ``ETHTOOL_A_RCLK_RANGE_MAX_PIN``        u32     max index of RCLK
> pins
> > +  ======================================  ======
> ==========================
> > +
> > +Supported device can have mulitple reference recover clock pins available
> 
> s/mulitple/multiple/
> 
> > +to be used as source of frequency for a DPLL.
> > +Once a pin on given port is enabled. The PHY recovered frequency is being
> > +fed onto that pin, and can be used by DPLL to synchonize with its signal.
> 
> s/synchonize/synchronize/
> 
> Please run a spell checker on documentation
> 
> > +Pins don't have to start with index equal 0 - device can also have different
> > +external sources pins.
> > +
> > +The ``ETHTOOL_A_RCLK_OUT_PIN_IDX`` is optional parameter. If present
> in
> > +the RCLK_GET request, the ``ETHTOOL_A_RCLK_PIN_ENABLED`` is
> provided in a
> 
> The `ETHTOOL_A_RCLK_PIN_ENABLED` attribute is no where to be found in
> this submission
> 
> > +response, it contatins state of the pin pointed by the index. Values are:
> 
> s/contatins/contains/
> 
> > +
> > +.. kernel-doc:: include/uapi/linux/ethtool.h
> > +    :identifiers: ethtool_rclk_pin_state
> 
> This structure is also no where to be found
> 
> > +
> > +If ``ETHTOOL_A_RCLK_OUT_PIN_IDX`` is not present in the RCLK_GET
> request,
> > +the range of available pins is returned:
> > +``ETHTOOL_A_RCLK_RANGE_MIN_PIN`` is lowest possible index of a pin
> available
> > +for recovering frequency from PHY.
> > +``ETHTOOL_A_RCLK_RANGE_MAX_PIN`` is highest possible index of a pin
> available
> > +for recovering frequency from PHY.
> > +
> > +RCLK_SET
> > +==========
> > +
> > +Set status of an output pin for PHY recovered frequency clock.
> > +
> > +Request contents:
> > +
> > +  ======================================  ======
> ========================
> > +  ``ETHTOOL_A_RCLK_HEADER``               nested  request header
> > +  ``ETHTOOL_A_RCLK_OUT_PIN_IDX``          u32     index of a pin
> > +  ``ETHTOOL_A_RCLK_PIN_FLAGS``            u32      requested state
> > +  ======================================  ======
> ========================
> > +
> > +``ETHTOOL_A_RCLK_OUT_PIN_IDX`` is a index of a pin for which the
> change of
> > +state is requested. Values of ``ETHTOOL_A_RCLK_PIN_ENABLED`` are:
> > +
> > +.. kernel-doc:: include/uapi/linux/ethtool.h
> > +    :identifiers: ethtool_rclk_pin_state
> 
> Same.

Done - rewritten the manual

> Looking at the diagram from the previous submission [1]:
> 
>       ┌──────────┬──────────┐
>       │ RX       │ TX       │
>   1   │ ports    │ ports    │ 1
>   ───►├─────┐    │          ├─────►
>   2   │     │    │          │ 2
>   ───►├───┐ │    │          ├─────►
>   3   │   │ │    │          │ 3
>   ───►├─┐ │ │    │          ├─────►
>       │ ▼ ▼ ▼    │          │
>       │ ──────   │          │
>       │ \____/   │          │
>       └──┼──┼────┴──────────┘
>         1│ 2│        ▲
>  RCLK out│  │        │ TX CLK in
>          ▼  ▼        │
>        ┌─────────────┴───┐
>        │                 │
>        │       SEC       │
>        │                 │
>        └─────────────────┘
> 
> Given a netdev (1, 2 or 3 in the diagram), the RCLK_SET message allows
> me to redirect the frequency recovered from this netdev to the EEC via
> either pin 1, pin 2 or both.
> 
> Given a netdev, the RCLK_GET message allows me to query the range of
> pins (RCLK out 1-2 in the diagram) through which the frequency can be
> fed into the EEC.
> 
> Questions:
> 
> 1. The query for all the above netdevs will return the same range of
> pins. How does user space know that these are the same pins? That is,
> how does user space know that RCLK_SET message to redirect the frequency
> recovered from netdev 1 to pin 1 will be overridden by the same message
> but for netdev 2?

We don't have a way to do so right now. When we have EEC subsystem in place
the right thing to do will be to add EEC input index and EEC index as additional
arguments

> 2. How does user space know the mapping between a netdev and an EEC?
> That is, how does user space know that RCLK_SET message for netdev 1
> will cause the Tx frequency of netdev 2 to change according to the
> frequency recovered from netdev 1?

Ditto - currently we don't have any entity to link the pins to ATM,
but we can address that in userspace just like PTP pins are used now

> 3. If user space sends two RCLK_SET messages to redirect the frequency
> recovered from netdev 1 to RCLK out 1 and from netdev 2 to RCLK out 2,
> how does it know which recovered frequency is actually used an input to
> the EEC?
>
> 4. Why these pins are represented as attributes of a netdev and not as
> attributes of the EEC? That is, why are they represented as output pins
> of the PHY as opposed to input pins of the EEC?

They are 2 separate beings. Recovered clock outputs are controlled
separately from EEC inputs. If we mix them it'll be hard to control everything
especially that a single EEC can support multiple devices.
Also if we make those pins attributes of the EEC it'll become extremally hard
to map them to netdevs and control them from the userspace app that will
receive the ESMC message with a given QL level on netdev X.
 
> 5. What is the problem with the following model?
> 
> - The EEC is a separate object with following attributes:
>   * State: Invalid / Freerun / Locked / etc
>   * Sources: Netdev / external / etc
>   * Potentially more
> 
> - Notifications are emitted to user space when the state of the EEC
>   changes. Drivers will either poll the state from the device or get
>   interrupts
> 
> - The mapping from netdev to EEC is queried via ethtool

Yep - that will be part of the EEC (DPLL) subsystem

> [1] https://lore.kernel.org/netdev/20211110114448.2792314-1-
> maciej.machnikowski@intel.com/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-02 15:17     ` Machnikowski, Maciej
@ 2021-12-02 16:35       ` Ido Schimmel
  2021-12-02 17:20         ` Machnikowski, Maciej
  0 siblings, 1 reply; 27+ messages in thread
From: Ido Schimmel @ 2021-12-02 16:35 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: netdev, intel-wired-lan, Kubalewski, Arkadiusz, richardcochran,
	abyagowi, Nguyen, Anthony L, davem, kuba, linux-kselftest,
	mkubecek, saeed, michael.chan, petrm

On Thu, Dec 02, 2021 at 03:17:06PM +0000, Machnikowski, Maciej wrote:
> > -----Original Message-----
> > From: Ido Schimmel <idosch@idosch.org>
> > Sent: Thursday, December 2, 2021 1:44 PM
> > To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> > Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> > recovered clock for SyncE feature
> > 
> > On Wed, Dec 01, 2021 at 07:02:06PM +0100, Maciej Machnikowski wrote:
> > > +RCLK_GET
> > > +========
> > > +
> > > +Get status of an output pin for PHY recovered frequency clock.
> > > +
> > > +Request contents:
> > > +
> > > +  ======================================  ======
> > ==========================
> > > +  ``ETHTOOL_A_RCLK_HEADER``               nested  request header
> > > +  ``ETHTOOL_A_RCLK_OUT_PIN_IDX``          u32     index of a pin
> > > +  ======================================  ======
> > ==========================
> > > +
> > > +Kernel response contents:
> > > +
> > > +  ======================================  ======
> > ==========================
> > > +  ``ETHTOOL_A_RCLK_OUT_PIN_IDX``          u32     index of a pin
> > > +  ``ETHTOOL_A_RCLK_PIN_FLAGS``            u32     state of a pin
> > > +  ``ETHTOOL_A_RCLK_RANGE_MIN_PIN``        u32     min index of RCLK pins
> > > +  ``ETHTOOL_A_RCLK_RANGE_MAX_PIN``        u32     max index of RCLK
> > pins
> > > +  ======================================  ======
> > ==========================
> > > +
> > > +Supported device can have mulitple reference recover clock pins available
> > 
> > s/mulitple/multiple/
> > 
> > > +to be used as source of frequency for a DPLL.
> > > +Once a pin on given port is enabled. The PHY recovered frequency is being
> > > +fed onto that pin, and can be used by DPLL to synchonize with its signal.
> > 
> > s/synchonize/synchronize/
> > 
> > Please run a spell checker on documentation
> > 
> > > +Pins don't have to start with index equal 0 - device can also have different
> > > +external sources pins.
> > > +
> > > +The ``ETHTOOL_A_RCLK_OUT_PIN_IDX`` is optional parameter. If present
> > in
> > > +the RCLK_GET request, the ``ETHTOOL_A_RCLK_PIN_ENABLED`` is
> > provided in a
> > 
> > The `ETHTOOL_A_RCLK_PIN_ENABLED` attribute is no where to be found in
> > this submission
> > 
> > > +response, it contatins state of the pin pointed by the index. Values are:
> > 
> > s/contatins/contains/
> > 
> > > +
> > > +.. kernel-doc:: include/uapi/linux/ethtool.h
> > > +    :identifiers: ethtool_rclk_pin_state
> > 
> > This structure is also no where to be found
> > 
> > > +
> > > +If ``ETHTOOL_A_RCLK_OUT_PIN_IDX`` is not present in the RCLK_GET
> > request,
> > > +the range of available pins is returned:
> > > +``ETHTOOL_A_RCLK_RANGE_MIN_PIN`` is lowest possible index of a pin
> > available
> > > +for recovering frequency from PHY.
> > > +``ETHTOOL_A_RCLK_RANGE_MAX_PIN`` is highest possible index of a pin
> > available
> > > +for recovering frequency from PHY.
> > > +
> > > +RCLK_SET
> > > +==========
> > > +
> > > +Set status of an output pin for PHY recovered frequency clock.
> > > +
> > > +Request contents:
> > > +
> > > +  ======================================  ======
> > ========================
> > > +  ``ETHTOOL_A_RCLK_HEADER``               nested  request header
> > > +  ``ETHTOOL_A_RCLK_OUT_PIN_IDX``          u32     index of a pin
> > > +  ``ETHTOOL_A_RCLK_PIN_FLAGS``            u32      requested state
> > > +  ======================================  ======
> > ========================
> > > +
> > > +``ETHTOOL_A_RCLK_OUT_PIN_IDX`` is a index of a pin for which the
> > change of
> > > +state is requested. Values of ``ETHTOOL_A_RCLK_PIN_ENABLED`` are:
> > > +
> > > +.. kernel-doc:: include/uapi/linux/ethtool.h
> > > +    :identifiers: ethtool_rclk_pin_state
> > 
> > Same.
> 
> Done - rewritten the manual
> 
> > Looking at the diagram from the previous submission [1]:
> > 
> >       ┌──────────┬──────────┐
> >       │ RX       │ TX       │
> >   1   │ ports    │ ports    │ 1
> >   ───►├─────┐    │          ├─────►
> >   2   │     │    │          │ 2
> >   ───►├───┐ │    │          ├─────►
> >   3   │   │ │    │          │ 3
> >   ───►├─┐ │ │    │          ├─────►
> >       │ ▼ ▼ ▼    │          │
> >       │ ──────   │          │
> >       │ \____/   │          │
> >       └──┼──┼────┴──────────┘
> >         1│ 2│        ▲
> >  RCLK out│  │        │ TX CLK in
> >          ▼  ▼        │
> >        ┌─────────────┴───┐
> >        │                 │
> >        │       SEC       │
> >        │                 │
> >        └─────────────────┘
> > 
> > Given a netdev (1, 2 or 3 in the diagram), the RCLK_SET message allows
> > me to redirect the frequency recovered from this netdev to the EEC via
> > either pin 1, pin 2 or both.
> > 
> > Given a netdev, the RCLK_GET message allows me to query the range of
> > pins (RCLK out 1-2 in the diagram) through which the frequency can be
> > fed into the EEC.
> > 
> > Questions:
> > 
> > 1. The query for all the above netdevs will return the same range of
> > pins. How does user space know that these are the same pins? That is,
> > how does user space know that RCLK_SET message to redirect the frequency
> > recovered from netdev 1 to pin 1 will be overridden by the same message
> > but for netdev 2?
> 
> We don't have a way to do so right now. When we have EEC subsystem in place
> the right thing to do will be to add EEC input index and EEC index as additional
> arguments
> 
> > 2. How does user space know the mapping between a netdev and an EEC?
> > That is, how does user space know that RCLK_SET message for netdev 1
> > will cause the Tx frequency of netdev 2 to change according to the
> > frequency recovered from netdev 1?
> 
> Ditto - currently we don't have any entity to link the pins to ATM,
> but we can address that in userspace just like PTP pins are used now
> 
> > 3. If user space sends two RCLK_SET messages to redirect the frequency
> > recovered from netdev 1 to RCLK out 1 and from netdev 2 to RCLK out 2,
> > how does it know which recovered frequency is actually used an input to
> > the EEC?

User space doesn't know this as well?

> >
> > 4. Why these pins are represented as attributes of a netdev and not as
> > attributes of the EEC? That is, why are they represented as output pins
> > of the PHY as opposed to input pins of the EEC?
> 
> They are 2 separate beings. Recovered clock outputs are controlled
> separately from EEC inputs. 

Separate how? What does it mean that they are controlled separately? In
which sense? That redirection of recovered frequency to pin is
controlled via PHY registers whereas priority setting between EEC inputs
is controlled via EEC registers? If so, this is an implementation detail
of a specific design. It is not of any importance to user space.

> If we mix them it'll be hard to control everything especially that a
> single EEC can support multiple devices.

Hard how? Please provide concrete examples.

What do you mean by "multiple devices"? A multi-port adapter with a
single EEC or something else?

> Also if we make those pins attributes of the EEC it'll become extremally hard
> to map them to netdevs and control them from the userspace app that will
> receive the ESMC message with a given QL level on netdev X.

Hard how? What is the problem with something like:

# eec set source 1 type netdev dev swp1

The EEC object should be registered by the same entity that registers
the netdevs whose Tx frequency is controlled by the EEC, the MAC driver.

>  
> > 5. What is the problem with the following model?
> > 
> > - The EEC is a separate object with following attributes:
> >   * State: Invalid / Freerun / Locked / etc
> >   * Sources: Netdev / external / etc
> >   * Potentially more
> > 
> > - Notifications are emitted to user space when the state of the EEC
> >   changes. Drivers will either poll the state from the device or get
> >   interrupts
> > 
> > - The mapping from netdev to EEC is queried via ethtool
> 
> Yep - that will be part of the EEC (DPLL) subsystem

This model avoids all the problems I pointed out in the current
proposal.

> 
> > [1] https://lore.kernel.org/netdev/20211110114448.2792314-1-
> > maciej.machnikowski@intel.com/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-02 16:35       ` Ido Schimmel
@ 2021-12-02 17:20         ` Machnikowski, Maciej
  2021-12-03 14:26           ` Petr Machata
  2021-12-03 15:45           ` Ido Schimmel
  0 siblings, 2 replies; 27+ messages in thread
From: Machnikowski, Maciej @ 2021-12-02 17:20 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, intel-wired-lan, Kubalewski, Arkadiusz, richardcochran,
	abyagowi, Nguyen, Anthony L, davem, kuba, linux-kselftest,
	mkubecek, saeed, michael.chan, petrm

> -----Original Message-----
> From: Ido Schimmel <idosch@idosch.org>
> Sent: Thursday, December 2, 2021 5:36 PM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> recovered clock for SyncE feature
> 
> On Thu, Dec 02, 2021 at 03:17:06PM +0000, Machnikowski, Maciej wrote:
> > > -----Original Message-----
> > > From: Ido Schimmel <idosch@idosch.org>
> > > Sent: Thursday, December 2, 2021 1:44 PM
> > > To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> > > Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> > > recovered clock for SyncE feature
> > >
> > > On Wed, Dec 01, 2021 at 07:02:06PM +0100, Maciej Machnikowski wrote:
> > > Looking at the diagram from the previous submission [1]:
> > >
> > >       ┌──────────┬──────────┐
> > >       │ RX       │ TX       │
> > >   1   │ ports    │ ports    │ 1
> > >   ───►├─────┐    │          ├─────►
> > >   2   │     │    │          │ 2
> > >   ───►├───┐ │    │          ├─────►
> > >   3   │   │ │    │          │ 3
> > >   ───►├─┐ │ │    │          ├─────►
> > >       │ ▼ ▼ ▼    │          │
> > >       │ ──────   │          │
> > >       │ \____/   │          │
> > >       └──┼──┼────┴──────────┘
> > >         1│ 2│        ▲
> > >  RCLK out│  │        │ TX CLK in
> > >          ▼  ▼        │
> > >        ┌─────────────┴───┐
> > >        │                 │
> > >        │       SEC       │
> > >        │                 │
> > >        └─────────────────┘
> > >
> > > Given a netdev (1, 2 or 3 in the diagram), the RCLK_SET message allows
> > > me to redirect the frequency recovered from this netdev to the EEC via
> > > either pin 1, pin 2 or both.
> > >
> > > Given a netdev, the RCLK_GET message allows me to query the range of
> > > pins (RCLK out 1-2 in the diagram) through which the frequency can be
> > > fed into the EEC.
> > >
> > > Questions:
> > >
> > > 1. The query for all the above netdevs will return the same range of
> > > pins. How does user space know that these are the same pins? That is,
> > > how does user space know that RCLK_SET message to redirect the
> frequency
> > > recovered from netdev 1 to pin 1 will be overridden by the same
> message
> > > but for netdev 2?
> >
> > We don't have a way to do so right now. When we have EEC subsystem in
> place
> > the right thing to do will be to add EEC input index and EEC index as
> additional
> > arguments
> >
> > > 2. How does user space know the mapping between a netdev and an
> EEC?
> > > That is, how does user space know that RCLK_SET message for netdev 1
> > > will cause the Tx frequency of netdev 2 to change according to the
> > > frequency recovered from netdev 1?
> >
> > Ditto - currently we don't have any entity to link the pins to ATM,
> > but we can address that in userspace just like PTP pins are used now
> >
> > > 3. If user space sends two RCLK_SET messages to redirect the frequency
> > > recovered from netdev 1 to RCLK out 1 and from netdev 2 to RCLK out 2,
> > > how does it know which recovered frequency is actually used an input to
> > > the EEC?
> 
> User space doesn't know this as well?

In current model it can come from the config file. Once we implement DPLL
subsystem we can implement connection between pins and DPLLs if they are
known.

> > >
> > > 4. Why these pins are represented as attributes of a netdev and not as
> > > attributes of the EEC? That is, why are they represented as output pins
> > > of the PHY as opposed to input pins of the EEC?
> >
> > They are 2 separate beings. Recovered clock outputs are controlled
> > separately from EEC inputs.
> 
> Separate how? What does it mean that they are controlled separately? In
> which sense? That redirection of recovered frequency to pin is
> controlled via PHY registers whereas priority setting between EEC inputs
> is controlled via EEC registers? If so, this is an implementation detail
> of a specific design. It is not of any importance to user space.

They belong to different devices. EEC registers are physically in the DPLL
hanging over I2C and recovered clocks are in the PHY/integrated PHY in
the MAC. Depending on system architecture you may have control over
one piece only

> > If we mix them it'll be hard to control everything especially that a
> > single EEC can support multiple devices.
> 
> Hard how? Please provide concrete examples.

From the EEC perspective it's one to many relation - one EEC input pin will serve
even 4,16,48 netdevs. I don't see easy way of starting from EEC input of EEC device
and figuring out which netdevs are connected to it to talk to the right one.
In current model it's as simple as:
- I received QL-PRC on netdev ens4f0
- I send back enable recovered clock on pin 0 of the ens4f0
- go to EEC that will be linked to it
- see the state of it - if its locked - report QL-EEC downsteam

How would you this control look in the EEC/DPLL implementation? Maybe
I missed something.
 
> What do you mean by "multiple devices"? A multi-port adapter with a
> single EEC or something else?

Multiple MACs that use a single EEC clock.

> > Also if we make those pins attributes of the EEC it'll become extremally
> hard
> > to map them to netdevs and control them from the userspace app that will
> > receive the ESMC message with a given QL level on netdev X.
> 
> Hard how? What is the problem with something like:
> 
> # eec set source 1 type netdev dev swp1
> 
> The EEC object should be registered by the same entity that registers
> the netdevs whose Tx frequency is controlled by the EEC, the MAC driver.

But the EEC object may not be controlled by the MAC - in which case
this model won't work.

> >
> > > 5. What is the problem with the following model?
> > >
> > > - The EEC is a separate object with following attributes:
> > >   * State: Invalid / Freerun / Locked / etc
> > >   * Sources: Netdev / external / etc
> > >   * Potentially more
> > >
> > > - Notifications are emitted to user space when the state of the EEC
> > >   changes. Drivers will either poll the state from the device or get
> > >   interrupts
> > >
> > > - The mapping from netdev to EEC is queried via ethtool
> >
> > Yep - that will be part of the EEC (DPLL) subsystem
> 
> This model avoids all the problems I pointed out in the current
> proposal.

That's the go-to model, but first we need control over the source as well :)
Regards
Maciek

> >
> > > [1] https://lore.kernel.org/netdev/20211110114448.2792314-1-
> > > maciej.machnikowski@intel.com/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-02 17:20         ` Machnikowski, Maciej
@ 2021-12-03 14:26           ` Petr Machata
  2021-12-03 14:55             ` Machnikowski, Maciej
  2021-12-03 15:45           ` Ido Schimmel
  1 sibling, 1 reply; 27+ messages in thread
From: Petr Machata @ 2021-12-03 14:26 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: Ido Schimmel, netdev, intel-wired-lan, Kubalewski, Arkadiusz,
	richardcochran, abyagowi, Nguyen, Anthony L, davem, kuba,
	linux-kselftest, mkubecek, saeed, michael.chan, petrm


Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:

>> -----Original Message-----
>> From: Ido Schimmel <idosch@idosch.org>
>>
>> On Thu, Dec 02, 2021 at 03:17:06PM +0000, Machnikowski, Maciej wrote:
>> > > -----Original Message-----
>> > > From: Ido Schimmel <idosch@idosch.org>
>> > >
>> > > On Wed, Dec 01, 2021 at 07:02:06PM +0100, Maciej Machnikowski wrote:
>> > > Looking at the diagram from the previous submission [1]:
>> > >
>> > >       ┌──────────┬──────────┐
>> > >       │ RX       │ TX       │
>> > >   1   │ ports    │ ports    │ 1
>> > >   ───►├─────┐    │          ├─────►
>> > >   2   │     │    │          │ 2
>> > >   ───►├───┐ │    │          ├─────►
>> > >   3   │   │ │    │          │ 3
>> > >   ───►├─┐ │ │    │          ├─────►
>> > >       │ ▼ ▼ ▼    │          │
>> > >       │ ──────   │          │
>> > >       │ \____/   │          │
>> > >       └──┼──┼────┴──────────┘
>> > >         1│ 2│        ▲
>> > >  RCLK out│  │        │ TX CLK in
>> > >          ▼  ▼        │
>> > >        ┌─────────────┴───┐
>> > >        │                 │
>> > >        │       SEC       │
>> > >        │                 │
>> > >        └─────────────────┘
>> > >
>> > > Given a netdev (1, 2 or 3 in the diagram), the RCLK_SET message allows
>> > > me to redirect the frequency recovered from this netdev to the EEC via
>> > > either pin 1, pin 2 or both.
>> > >
>> > > Given a netdev, the RCLK_GET message allows me to query the range of
>> > > pins (RCLK out 1-2 in the diagram) through which the frequency can be
>> > > fed into the EEC.
>> > >
>> > > Questions:
>> > >
>> > > 1. The query for all the above netdevs will return the same range
>> > > of pins. How does user space know that these are the same pins?
>> > > That is, how does user space know that RCLK_SET message to
>> > > redirect the frequency recovered from netdev 1 to pin 1 will be
>> > > overridden by the same message but for netdev 2?
>> >
>> > We don't have a way to do so right now. When we have EEC subsystem
>> > in place the right thing to do will be to add EEC input index and
>> > EEC index as additional arguments
>> >
>> > > 2. How does user space know the mapping between a netdev and an
>> > > EEC? That is, how does user space know that RCLK_SET message for
>> > > netdev 1 will cause the Tx frequency of netdev 2 to change
>> > > according to the frequency recovered from netdev 1?
>> >
>> > Ditto - currently we don't have any entity to link the pins to ATM,
>> > but we can address that in userspace just like PTP pins are used
>> > now
>> >
>> > > 3. If user space sends two RCLK_SET messages to redirect the
>> > > frequency recovered from netdev 1 to RCLK out 1 and from netdev 2
>> > > to RCLK out 2, how does it know which recovered frequency is
>> > > actually used an input to the EEC?
>> 
>> User space doesn't know this as well?
>
> In current model it can come from the config file. Once we implement DPLL
> subsystem we can implement connection between pins and DPLLs if they are
> known.
>
>> > >
>> > > 4. Why these pins are represented as attributes of a netdev and not as
>> > > attributes of the EEC? That is, why are they represented as output pins
>> > > of the PHY as opposed to input pins of the EEC?
>> >
>> > They are 2 separate beings. Recovered clock outputs are controlled
>> > separately from EEC inputs.
>> 
>> Separate how? What does it mean that they are controlled separately? In
>> which sense? That redirection of recovered frequency to pin is
>> controlled via PHY registers whereas priority setting between EEC inputs
>> is controlled via EEC registers? If so, this is an implementation detail
>> of a specific design. It is not of any importance to user space.
>
> They belong to different devices. EEC registers are physically in the DPLL
> hanging over I2C and recovered clocks are in the PHY/integrated PHY in
> the MAC. Depending on system architecture you may have control over
> one piece only

What does ETHTOOL_MSG_RCLK_SET actually configure, physically? Say I
have this message:

ETHTOOL_MSG_RCLK_SET dev = eth0
- ETHTOOL_A_RCLK_OUT_PIN_IDX = n
- ETHTOOL_A_RCLK_PIN_FLAGS |= ETHTOOL_RCLK_PIN_FLAGS_ENA

Eventually this lands in ops->set_rclk_out(dev, out_idx, new_state).
What does the MAC driver do next?

>> > If we mix them it'll be hard to control everything especially that a
>> > single EEC can support multiple devices.
>> 
>> Hard how? Please provide concrete examples.
>
> From the EEC perspective it's one to many relation - one EEC input pin will serve
> even 4,16,48 netdevs. I don't see easy way of starting from EEC input of EEC device
> and figuring out which netdevs are connected to it to talk to the right one.
> In current model it's as simple as:
> - I received QL-PRC on netdev ens4f0
> - I send back enable recovered clock on pin 0 of the ens4f0

How do I know it's pin 0 though? Config file?

> - go to EEC that will be linked to it
> - see the state of it - if its locked - report QL-EEC downsteam
>
> How would you this control look in the EEC/DPLL implementation? Maybe
> I missed something.

In the EEC-centric model this is what happens:

- QL-PRC packet is received on ens4f0
- Userspace consults a UAPI to figure out what EEC and pin ID this
  netdevice corresponds to
- Userspace instructs through a UAPI the indicated EEC to use the
  indicated pin as a source
- Userspace then monitors the indicated EEC through a UAPI. When the EEC
  locks, QL-EEC is reported downstream

>> What do you mean by "multiple devices"? A multi-port adapter with a
>> single EEC or something else?
>
> Multiple MACs that use a single EEC clock.
>
>> > Also if we make those pins attributes of the EEC it'll become extremally hard
>> > to map them to netdevs and control them from the userspace app that will
>> > receive the ESMC message with a given QL level on netdev X.
>> 
>> Hard how? What is the problem with something like:
>> 
>> # eec set source 1 type netdev dev swp1
>> 
>> The EEC object should be registered by the same entity that registers
>> the netdevs whose Tx frequency is controlled by the EEC, the MAC driver.
>
> But the EEC object may not be controlled by the MAC - in which case
> this model won't work.

In that case the driver for the device that controls EEC would
instantiates the object. It doesn't have to be a MAC driver.

But if it is controlled by the MAC, the MAC driver instantiates it. And
can set up the connection between the MAC and the EEC, so that in the
shell snippet above "eec" knows how to get the EEC handle from the
netdevice.

>> >
>> > > 5. What is the problem with the following model?
>> > >
>> > > - The EEC is a separate object with following attributes:
>> > >   * State: Invalid / Freerun / Locked / etc
>> > >   * Sources: Netdev / external / etc
>> > >   * Potentially more
>> > >
>> > > - Notifications are emitted to user space when the state of the EEC
>> > >   changes. Drivers will either poll the state from the device or get
>> > >   interrupts
>> > >
>> > > - The mapping from netdev to EEC is queried via ethtool
>> >
>> > Yep - that will be part of the EEC (DPLL) subsystem
>>
>> This model avoids all the problems I pointed out in the current
>> proposal.
>
> That's the go-to model, but first we need control over the source as
> well :)

Why is that? Can you illustrate a case that breaks with the above model?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-03 14:26           ` Petr Machata
@ 2021-12-03 14:55             ` Machnikowski, Maciej
  2021-12-03 15:58               ` Ido Schimmel
  2021-12-03 16:26               ` Petr Machata
  0 siblings, 2 replies; 27+ messages in thread
From: Machnikowski, Maciej @ 2021-12-03 14:55 UTC (permalink / raw)
  To: Petr Machata
  Cc: Ido Schimmel, netdev, intel-wired-lan, Kubalewski, Arkadiusz,
	richardcochran, abyagowi, Nguyen, Anthony L, davem, kuba,
	linux-kselftest, mkubecek, saeed, michael.chan

> -----Original Message-----
> From: Petr Machata <petrm@nvidia.com>
> Sent: Friday, December 3, 2021 3:27 PM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> recovered clock for SyncE feature
> 
> 
> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
> 
> >> -----Original Message-----
> >> From: Ido Schimmel <idosch@idosch.org>
> >>
> >> On Thu, Dec 02, 2021 at 03:17:06PM +0000, Machnikowski, Maciej wrote:
> >> > > -----Original Message-----
> >> > > From: Ido Schimmel <idosch@idosch.org>
> >> > >
> >> > > On Wed, Dec 01, 2021 at 07:02:06PM +0100, Maciej Machnikowski
> wrote:
> >> > > Looking at the diagram from the previous submission [1]:
> >> > >
> >> > >       ┌──────────┬──────────┐
> >> > >       │ RX       │ TX       │
> >> > >   1   │ ports    │ ports    │ 1
> >> > >   ───►├─────┐    │          ├─────►
> >> > >   2   │     │    │          │ 2
> >> > >   ───►├───┐ │    │          ├─────►
> >> > >   3   │   │ │    │          │ 3
> >> > >   ───►├─┐ │ │    │          ├─────►
> >> > >       │ ▼ ▼ ▼    │          │
> >> > >       │ ──────   │          │
> >> > >       │ \____/   │          │
> >> > >       └──┼──┼────┴──────────┘
> >> > >         1│ 2│        ▲
> >> > >  RCLK out│  │        │ TX CLK in
> >> > >          ▼  ▼        │
> >> > >        ┌─────────────┴───┐
> >> > >        │                 │
> >> > >        │       SEC       │
> >> > >        │                 │
> >> > >        └─────────────────┘
> >> > >
> >> > > Given a netdev (1, 2 or 3 in the diagram), the RCLK_SET message
> allows
> >> > > me to redirect the frequency recovered from this netdev to the EEC
> via
> >> > > either pin 1, pin 2 or both.
> >> > >
> >> > > Given a netdev, the RCLK_GET message allows me to query the range
> of
> >> > > pins (RCLK out 1-2 in the diagram) through which the frequency can be
> >> > > fed into the EEC.
> >> > >
> >> > > Questions:
> >> > >
> >> > > 1. The query for all the above netdevs will return the same range
> >> > > of pins. How does user space know that these are the same pins?
> >> > > That is, how does user space know that RCLK_SET message to
> >> > > redirect the frequency recovered from netdev 1 to pin 1 will be
> >> > > overridden by the same message but for netdev 2?
> >> >
> >> > We don't have a way to do so right now. When we have EEC subsystem
> >> > in place the right thing to do will be to add EEC input index and
> >> > EEC index as additional arguments
> >> >
> >> > > 2. How does user space know the mapping between a netdev and an
> >> > > EEC? That is, how does user space know that RCLK_SET message for
> >> > > netdev 1 will cause the Tx frequency of netdev 2 to change
> >> > > according to the frequency recovered from netdev 1?
> >> >
> >> > Ditto - currently we don't have any entity to link the pins to ATM,
> >> > but we can address that in userspace just like PTP pins are used
> >> > now
> >> >
> >> > > 3. If user space sends two RCLK_SET messages to redirect the
> >> > > frequency recovered from netdev 1 to RCLK out 1 and from netdev 2
> >> > > to RCLK out 2, how does it know which recovered frequency is
> >> > > actually used an input to the EEC?
> >>
> >> User space doesn't know this as well?
> >
> > In current model it can come from the config file. Once we implement DPLL
> > subsystem we can implement connection between pins and DPLLs if they
> are
> > known.
> >
> >> > >
> >> > > 4. Why these pins are represented as attributes of a netdev and not as
> >> > > attributes of the EEC? That is, why are they represented as output
> pins
> >> > > of the PHY as opposed to input pins of the EEC?
> >> >
> >> > They are 2 separate beings. Recovered clock outputs are controlled
> >> > separately from EEC inputs.
> >>
> >> Separate how? What does it mean that they are controlled separately? In
> >> which sense? That redirection of recovered frequency to pin is
> >> controlled via PHY registers whereas priority setting between EEC inputs
> >> is controlled via EEC registers? If so, this is an implementation detail
> >> of a specific design. It is not of any importance to user space.
> >
> > They belong to different devices. EEC registers are physically in the DPLL
> > hanging over I2C and recovered clocks are in the PHY/integrated PHY in
> > the MAC. Depending on system architecture you may have control over
> > one piece only
> 
> What does ETHTOOL_MSG_RCLK_SET actually configure, physically? Say I
> have this message:
> 
> ETHTOOL_MSG_RCLK_SET dev = eth0
> - ETHTOOL_A_RCLK_OUT_PIN_IDX = n
> - ETHTOOL_A_RCLK_PIN_FLAGS |= ETHTOOL_RCLK_PIN_FLAGS_ENA
> 
> Eventually this lands in ops->set_rclk_out(dev, out_idx, new_state).
> What does the MAC driver do next?

It goes to the PTY layer, enables the clock recovery from a given physical lane, 
optionally configure the clock divider and pin output muxes. This will be 
HW-specific though, but the general concept will look like that.

> >> > If we mix them it'll be hard to control everything especially that a
> >> > single EEC can support multiple devices.
> >>
> >> Hard how? Please provide concrete examples.
> >
> > From the EEC perspective it's one to many relation - one EEC input pin will
> serve
> > even 4,16,48 netdevs. I don't see easy way of starting from EEC input of EEC
> device
> > and figuring out which netdevs are connected to it to talk to the right one.
> > In current model it's as simple as:
> > - I received QL-PRC on netdev ens4f0
> > - I send back enable recovered clock on pin 0 of the ens4f0
> 
> How do I know it's pin 0 though? Config file?

You can find that by sending the ETHTOOL_MSG_RCLK_GET without any pin
index to get the acceptable/supported range.

> > - go to EEC that will be linked to it
> > - see the state of it - if its locked - report QL-EEC downsteam
> >
> > How would you this control look in the EEC/DPLL implementation? Maybe
> > I missed something.
> 
> In the EEC-centric model this is what happens:
> 
> - QL-PRC packet is received on ens4f0
> - Userspace consults a UAPI to figure out what EEC and pin ID this
>   netdevice corresponds to
> - Userspace instructs through a UAPI the indicated EEC to use the
>   indicated pin as a source
> - Userspace then monitors the indicated EEC through a UAPI. When the EEC
>   locks, QL-EEC is reported downstream

This is still missing the port/lane->pin mapping. This is what will happen in 
the EEC/DPLL subsystem.

> >> What do you mean by "multiple devices"? A multi-port adapter with a
> >> single EEC or something else?
> >
> > Multiple MACs that use a single EEC clock.
> >
> >> > Also if we make those pins attributes of the EEC it'll become extremally
> hard
> >> > to map them to netdevs and control them from the userspace app that
> will
> >> > receive the ESMC message with a given QL level on netdev X.
> >>
> >> Hard how? What is the problem with something like:
> >>
> >> # eec set source 1 type netdev dev swp1
> >>
> >> The EEC object should be registered by the same entity that registers
> >> the netdevs whose Tx frequency is controlled by the EEC, the MAC driver.
> >
> > But the EEC object may not be controlled by the MAC - in which case
> > this model won't work.
> 
> In that case the driver for the device that controls EEC would
> instantiates the object. It doesn't have to be a MAC driver.
> 
> But if it is controlled by the MAC, the MAC driver instantiates it. And
> can set up the connection between the MAC and the EEC, so that in the
> shell snippet above "eec" knows how to get the EEC handle from the
> netdevice.

But it still needs to talk to MAC driver somehow to enable the clock
recovery on a given pin - that's where the API defined here is needed.

> >> >
> >> > > 5. What is the problem with the following model?
> >> > >
> >> > > - The EEC is a separate object with following attributes:
> >> > >   * State: Invalid / Freerun / Locked / etc
> >> > >   * Sources: Netdev / external / etc
> >> > >   * Potentially more
> >> > >
> >> > > - Notifications are emitted to user space when the state of the EEC
> >> > >   changes. Drivers will either poll the state from the device or get
> >> > >   interrupts
> >> > >
> >> > > - The mapping from netdev to EEC is queried via ethtool
> >> >
> >> > Yep - that will be part of the EEC (DPLL) subsystem
> >>
> >> This model avoids all the problems I pointed out in the current
> >> proposal.
> >
> > That's the go-to model, but first we need control over the source as
> > well :)
> 
> Why is that? Can you illustrate a case that breaks with the above model?

If you have 32 port switch chip with 2 recovered clock outputs how will you
tell the chip to get the 18th port to pin 0 and from port 20 to pin 1? That's
the part those patches addresses. The further side of "which clock should the
EEC use" belongs to the DPLL subsystem and I agree with that.

Or to put it into different words:
This API will configure given quality level  frequency reference outputs on chip's
Dedicated outputs. On a board you will connect those to the EEC's reference inputs.

The EEC's job is to validate the inputs and lock to them following certain rules,
The PHY/MAC (and this API) job is to deliver reference signals to the EEC. 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-02 17:20         ` Machnikowski, Maciej
  2021-12-03 14:26           ` Petr Machata
@ 2021-12-03 15:45           ` Ido Schimmel
  2021-12-03 16:18             ` Machnikowski, Maciej
  1 sibling, 1 reply; 27+ messages in thread
From: Ido Schimmel @ 2021-12-03 15:45 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: netdev, intel-wired-lan, Kubalewski, Arkadiusz, richardcochran,
	abyagowi, Nguyen, Anthony L, davem, kuba, linux-kselftest,
	mkubecek, saeed, michael.chan, petrm

On Thu, Dec 02, 2021 at 05:20:24PM +0000, Machnikowski, Maciej wrote:
> > -----Original Message-----
> > From: Ido Schimmel <idosch@idosch.org>
> > Sent: Thursday, December 2, 2021 5:36 PM
> > To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> > Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> > recovered clock for SyncE feature
> > 
> > On Thu, Dec 02, 2021 at 03:17:06PM +0000, Machnikowski, Maciej wrote:
> > > > -----Original Message-----
> > > > From: Ido Schimmel <idosch@idosch.org>
> > > > Sent: Thursday, December 2, 2021 1:44 PM
> > > > To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> > > > Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> > > > recovered clock for SyncE feature
> > > >
> > > > On Wed, Dec 01, 2021 at 07:02:06PM +0100, Maciej Machnikowski wrote:
> > > > Looking at the diagram from the previous submission [1]:
> > > >
> > > >       ┌──────────┬──────────┐
> > > >       │ RX       │ TX       │
> > > >   1   │ ports    │ ports    │ 1
> > > >   ───►├─────┐    │          ├─────►
> > > >   2   │     │    │          │ 2
> > > >   ───►├───┐ │    │          ├─────►
> > > >   3   │   │ │    │          │ 3
> > > >   ───►├─┐ │ │    │          ├─────►
> > > >       │ ▼ ▼ ▼    │          │
> > > >       │ ──────   │          │
> > > >       │ \____/   │          │
> > > >       └──┼──┼────┴──────────┘
> > > >         1│ 2│        ▲
> > > >  RCLK out│  │        │ TX CLK in
> > > >          ▼  ▼        │
> > > >        ┌─────────────┴───┐
> > > >        │                 │
> > > >        │       SEC       │
> > > >        │                 │
> > > >        └─────────────────┘
> > > >
> > > > Given a netdev (1, 2 or 3 in the diagram), the RCLK_SET message allows
> > > > me to redirect the frequency recovered from this netdev to the EEC via
> > > > either pin 1, pin 2 or both.
> > > >
> > > > Given a netdev, the RCLK_GET message allows me to query the range of
> > > > pins (RCLK out 1-2 in the diagram) through which the frequency can be
> > > > fed into the EEC.
> > > >
> > > > Questions:
> > > >
> > > > 1. The query for all the above netdevs will return the same range of
> > > > pins. How does user space know that these are the same pins? That is,
> > > > how does user space know that RCLK_SET message to redirect the
> > frequency
> > > > recovered from netdev 1 to pin 1 will be overridden by the same
> > message
> > > > but for netdev 2?
> > >
> > > We don't have a way to do so right now. When we have EEC subsystem in
> > place
> > > the right thing to do will be to add EEC input index and EEC index as
> > additional
> > > arguments
> > >
> > > > 2. How does user space know the mapping between a netdev and an
> > EEC?
> > > > That is, how does user space know that RCLK_SET message for netdev 1
> > > > will cause the Tx frequency of netdev 2 to change according to the
> > > > frequency recovered from netdev 1?
> > >
> > > Ditto - currently we don't have any entity to link the pins to ATM,
> > > but we can address that in userspace just like PTP pins are used now
> > >
> > > > 3. If user space sends two RCLK_SET messages to redirect the frequency
> > > > recovered from netdev 1 to RCLK out 1 and from netdev 2 to RCLK out 2,
> > > > how does it know which recovered frequency is actually used an input to
> > > > the EEC?
> > 
> > User space doesn't know this as well?
> 
> In current model it can come from the config file. Once we implement DPLL
> subsystem we can implement connection between pins and DPLLs if they are
> known.

To be clear, no SyncE patches should be accepted before we have a DPLL
subsystem or however the subsystem that will model the EEC is going to
be called.

You are asking us to buy into a new uAPI that can never be removed. We
pointed out numerous problems with this uAPI and suggested a model that
solves them. When asked why it can't work we are answered with vague
arguments about this model being "hard".

In addition, without a representation of the EEC, these patches have no
value for user space. They basically allow user space to redirect the
recovered frequency from a netdev to an object that does not exist.
User space doesn't know if the object is successfully tracking the
frequency (the EEC state) and does not know which other components are
utilizing this recovered frequency as input (e.g., other netdevs, PHC).

BTW, what is the use case for enabling two EEC inputs simultaneously?
Some seamless failover?

> 
> > > >
> > > > 4. Why these pins are represented as attributes of a netdev and not as
> > > > attributes of the EEC? That is, why are they represented as output pins
> > > > of the PHY as opposed to input pins of the EEC?
> > >
> > > They are 2 separate beings. Recovered clock outputs are controlled
> > > separately from EEC inputs.
> > 
> > Separate how? What does it mean that they are controlled separately? In
> > which sense? That redirection of recovered frequency to pin is
> > controlled via PHY registers whereas priority setting between EEC inputs
> > is controlled via EEC registers? If so, this is an implementation detail
> > of a specific design. It is not of any importance to user space.
> 
> They belong to different devices. EEC registers are physically in the DPLL
> hanging over I2C and recovered clocks are in the PHY/integrated PHY in
> the MAC. Depending on system architecture you may have control over
> one piece only

These are implementation details of a specific design and should not
influence the design of the uAPI. The uAPI should be influenced by the
logical task that it is trying to achieve.

> 
> > > If we mix them it'll be hard to control everything especially that a
> > > single EEC can support multiple devices.
> > 
> > Hard how? Please provide concrete examples.
> 
> From the EEC perspective it's one to many relation - one EEC input pin will serve
> even 4,16,48 netdevs. I don't see easy way of starting from EEC input of EEC device
> and figuring out which netdevs are connected to it to talk to the right one.
> In current model it's as simple as:
> - I received QL-PRC on netdev ens4f0
> - I send back enable recovered clock on pin 0 of the ens4f0
> - go to EEC that will be linked to it
> - see the state of it - if its locked - report QL-EEC downsteam
> 
> How would you this control look in the EEC/DPLL implementation? Maybe
> I missed something.

Petr already replied.

>  
> > What do you mean by "multiple devices"? A multi-port adapter with a
> > single EEC or something else?
> 
> Multiple MACs that use a single EEC clock.
> 
> > > Also if we make those pins attributes of the EEC it'll become extremally
> > hard
> > > to map them to netdevs and control them from the userspace app that will
> > > receive the ESMC message with a given QL level on netdev X.
> > 
> > Hard how? What is the problem with something like:
> > 
> > # eec set source 1 type netdev dev swp1
> > 
> > The EEC object should be registered by the same entity that registers
> > the netdevs whose Tx frequency is controlled by the EEC, the MAC driver.
> 
> But the EEC object may not be controlled by the MAC - in which case
> this model won't work.

Why wouldn't it work? Leave individual kernel modules alone and look at
the kernel. It is registering all the necessary logical objects such
netdevs, PHCs and EECs. There is no way user space knows better than the
kernel how these objects fit together as the purpose of the kernel is to
abstract the hardware to user space.

User space's request to use the Rx frequency recovered from netdev X as
an input to EEC Y will be processed by the DPLL subsystem. In turn, this
subsystem will invoke whichever kernel modules it needs to fulfill the
request.

> 
> > >
> > > > 5. What is the problem with the following model?
> > > >
> > > > - The EEC is a separate object with following attributes:
> > > >   * State: Invalid / Freerun / Locked / etc
> > > >   * Sources: Netdev / external / etc
> > > >   * Potentially more
> > > >
> > > > - Notifications are emitted to user space when the state of the EEC
> > > >   changes. Drivers will either poll the state from the device or get
> > > >   interrupts
> > > >
> > > > - The mapping from netdev to EEC is queried via ethtool
> > >
> > > Yep - that will be part of the EEC (DPLL) subsystem
> > 
> > This model avoids all the problems I pointed out in the current
> > proposal.
> 
> That's the go-to model, but first we need control over the source as well :)

The point that we are trying to make is that like the EEC state, the
source is also an EEC attribute and not a netdev attribute.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-03 14:55             ` Machnikowski, Maciej
@ 2021-12-03 15:58               ` Ido Schimmel
  2021-12-03 16:26               ` Petr Machata
  1 sibling, 0 replies; 27+ messages in thread
From: Ido Schimmel @ 2021-12-03 15:58 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: Petr Machata, netdev, intel-wired-lan, Kubalewski, Arkadiusz,
	richardcochran, abyagowi, Nguyen, Anthony L, davem, kuba,
	linux-kselftest, mkubecek, saeed, michael.chan

On Fri, Dec 03, 2021 at 02:55:05PM +0000, Machnikowski, Maciej wrote:
> If you have 32 port switch chip with 2 recovered clock outputs how will you
> tell the chip to get the 18th port to pin 0 and from port 20 to pin 1? That's
> the part those patches addresses. The further side of "which clock should the
> EEC use" belongs to the DPLL subsystem and I agree with that.
> 
> Or to put it into different words:
> This API will configure given quality level  frequency reference outputs on chip's
> Dedicated outputs. On a board you will connect those to the EEC's reference inputs.

So these outputs are hardwired into the EEC's inputs and are therefore
only meaningful as EEC inputs? If so, why these outputs are not
configured via the EEC object?

> 
> The EEC's job is to validate the inputs and lock to them following certain rules,
> The PHY/MAC (and this API) job is to deliver reference signals to the EEC. 
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-03 15:45           ` Ido Schimmel
@ 2021-12-03 16:18             ` Machnikowski, Maciej
  2021-12-03 18:21               ` Petr Machata
  2021-12-05 12:24               ` Ido Schimmel
  0 siblings, 2 replies; 27+ messages in thread
From: Machnikowski, Maciej @ 2021-12-03 16:18 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, intel-wired-lan, Kubalewski, Arkadiusz, richardcochran,
	abyagowi, Nguyen, Anthony L, davem, kuba, linux-kselftest,
	mkubecek, saeed, michael.chan, petrm

> -----Original Message-----
> From: Ido Schimmel <idosch@idosch.org>
> Sent: Friday, December 3, 2021 4:46 PM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> recovered clock for SyncE feature
> 
> On Thu, Dec 02, 2021 at 05:20:24PM +0000, Machnikowski, Maciej wrote:
> > > -----Original Message-----
> > > From: Ido Schimmel <idosch@idosch.org>
> > > Sent: Thursday, December 2, 2021 5:36 PM
> > > To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> > > Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> > > recovered clock for SyncE feature
> > >
> > > On Thu, Dec 02, 2021 at 03:17:06PM +0000, Machnikowski, Maciej wrote:
> > > > > -----Original Message-----
> > > > > From: Ido Schimmel <idosch@idosch.org>
> > > > > Sent: Thursday, December 2, 2021 1:44 PM
> > > > > To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> > > > > Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> > > > > recovered clock for SyncE feature
> > > > >
> > > > > On Wed, Dec 01, 2021 at 07:02:06PM +0100, Maciej Machnikowski
> wrote:
> > > > > Looking at the diagram from the previous submission [1]:
> > > > >
> > > > >       ┌──────────┬──────────┐
> > > > >       │ RX       │ TX       │
> > > > >   1   │ ports    │ ports    │ 1
> > > > >   ───►├─────┐    │          ├─────►
> > > > >   2   │     │    │          │ 2
> > > > >   ───►├───┐ │    │          ├─────►
> > > > >   3   │   │ │    │          │ 3
> > > > >   ───►├─┐ │ │    │          ├─────►
> > > > >       │ ▼ ▼ ▼    │          │
> > > > >       │ ──────   │          │
> > > > >       │ \____/   │          │
> > > > >       └──┼──┼────┴──────────┘
> > > > >         1│ 2│        ▲
> > > > >  RCLK out│  │        │ TX CLK in
> > > > >          ▼  ▼        │
> > > > >        ┌─────────────┴───┐
> > > > >        │                 │
> > > > >        │       SEC       │
> > > > >        │                 │
> > > > >        └─────────────────┘
> > > > >
> > > > > Given a netdev (1, 2 or 3 in the diagram), the RCLK_SET message
> allows
> > > > > me to redirect the frequency recovered from this netdev to the EEC
> via
> > > > > either pin 1, pin 2 or both.
> > > > >
> > > > > Given a netdev, the RCLK_GET message allows me to query the range
> of
> > > > > pins (RCLK out 1-2 in the diagram) through which the frequency can
> be
> > > > > fed into the EEC.
> > > > >
> > > > > Questions:
> > > > >
> > > > > 1. The query for all the above netdevs will return the same range of
> > > > > pins. How does user space know that these are the same pins? That
> is,
> > > > > how does user space know that RCLK_SET message to redirect the
> > > frequency
> > > > > recovered from netdev 1 to pin 1 will be overridden by the same
> > > message
> > > > > but for netdev 2?
> > > >
> > > > We don't have a way to do so right now. When we have EEC subsystem
> in
> > > place
> > > > the right thing to do will be to add EEC input index and EEC index as
> > > additional
> > > > arguments
> > > >
> > > > > 2. How does user space know the mapping between a netdev and an
> > > EEC?
> > > > > That is, how does user space know that RCLK_SET message for netdev
> 1
> > > > > will cause the Tx frequency of netdev 2 to change according to the
> > > > > frequency recovered from netdev 1?
> > > >
> > > > Ditto - currently we don't have any entity to link the pins to ATM,
> > > > but we can address that in userspace just like PTP pins are used now
> > > >
> > > > > 3. If user space sends two RCLK_SET messages to redirect the
> frequency
> > > > > recovered from netdev 1 to RCLK out 1 and from netdev 2 to RCLK out
> 2,
> > > > > how does it know which recovered frequency is actually used an input
> to
> > > > > the EEC?
> > >
> > > User space doesn't know this as well?
> >
> > In current model it can come from the config file. Once we implement DPLL
> > subsystem we can implement connection between pins and DPLLs if they
> are
> > known.
> 
> To be clear, no SyncE patches should be accepted before we have a DPLL
> subsystem or however the subsystem that will model the EEC is going to
> be called.
> 
> You are asking us to buy into a new uAPI that can never be removed. We
> pointed out numerous problems with this uAPI and suggested a model that
> solves them. When asked why it can't work we are answered with vague
> arguments about this model being "hard".

My argument was never "it's hard" - the answer is we need both APIs.

> In addition, without a representation of the EEC, these patches have no
> value for user space. They basically allow user space to redirect the
> recovered frequency from a netdev to an object that does not exist.
> User space doesn't know if the object is successfully tracking the
> frequency (the EEC state) and does not know which other components are
> utilizing this recovered frequency as input (e.g., other netdevs, PHC).

That's also not true - the proposed uAPI lets you enable recovered frequency
output pins and redirect the right clock to them. In some implementations
you may not have anything else.

> BTW, what is the use case for enabling two EEC inputs simultaneously?
> Some seamless failover?

Mainly - redundacy

> >
> > > > >
> > > > > 4. Why these pins are represented as attributes of a netdev and not
> as
> > > > > attributes of the EEC? That is, why are they represented as output
> pins
> > > > > of the PHY as opposed to input pins of the EEC?
> > > >
> > > > They are 2 separate beings. Recovered clock outputs are controlled
> > > > separately from EEC inputs.
> > >
> > > Separate how? What does it mean that they are controlled separately? In
> > > which sense? That redirection of recovered frequency to pin is
> > > controlled via PHY registers whereas priority setting between EEC inputs
> > > is controlled via EEC registers? If so, this is an implementation detail
> > > of a specific design. It is not of any importance to user space.
> >
> > They belong to different devices. EEC registers are physically in the DPLL
> > hanging over I2C and recovered clocks are in the PHY/integrated PHY in
> > the MAC. Depending on system architecture you may have control over
> > one piece only
> 
> These are implementation details of a specific design and should not
> influence the design of the uAPI. The uAPI should be influenced by the
> logical task that it is trying to achieve.

There are 2 logical tasks:
1. Enable clocks that are recovered from a specific netdev
2. Control the EEC

They are both needed to get to the full solution, but are independent from 
each other. You can't put RCLK redirection to the EEC as it's one to many 
relation and you will need to call the netdev to enable it anyway.

Also, when we tried to add EEC state to PTP subsystem the answer was
that we can't mix subsystems. The proposal to configure recovered
clocks  through EEC would mix netdev with EEC.

> >
> > > > If we mix them it'll be hard to control everything especially that a
> > > > single EEC can support multiple devices.
> > >
> > > Hard how? Please provide concrete examples.
> >
> > From the EEC perspective it's one to many relation - one EEC input pin will
> serve
> > even 4,16,48 netdevs. I don't see easy way of starting from EEC input of EEC
> device
> > and figuring out which netdevs are connected to it to talk to the right one.
> > In current model it's as simple as:
> > - I received QL-PRC on netdev ens4f0
> > - I send back enable recovered clock on pin 0 of the ens4f0
> > - go to EEC that will be linked to it
> > - see the state of it - if its locked - report QL-EEC downsteam
> >
> > How would you this control look in the EEC/DPLL implementation? Maybe
> > I missed something.
> 
> Petr already replied.

See my response there. 

> >
> > > What do you mean by "multiple devices"? A multi-port adapter with a
> > > single EEC or something else?
> >
> > Multiple MACs that use a single EEC clock.
> >
> > > > Also if we make those pins attributes of the EEC it'll become extremally
> > > hard
> > > > to map them to netdevs and control them from the userspace app that
> will
> > > > receive the ESMC message with a given QL level on netdev X.
> > >
> > > Hard how? What is the problem with something like:
> > >
> > > # eec set source 1 type netdev dev swp1
> > >
> > > The EEC object should be registered by the same entity that registers
> > > the netdevs whose Tx frequency is controlled by the EEC, the MAC
> driver.
> >
> > But the EEC object may not be controlled by the MAC - in which case
> > this model won't work.
> 
> Why wouldn't it work? Leave individual kernel modules alone and look at
> the kernel. It is registering all the necessary logical objects such
> netdevs, PHCs and EECs. There is no way user space knows better than the
> kernel how these objects fit together as the purpose of the kernel is to
> abstract the hardware to user space.
> 
> User space's request to use the Rx frequency recovered from netdev X as
> an input to EEC Y will be processed by the DPLL subsystem. In turn, this
> subsystem will invoke whichever kernel modules it needs to fulfill the
> request.

But how would that call go through the kernel? What would you like to give
to the EEC object and how should it react. I'm fine with the changes, but
I don't see the solution in that proposal and this model would mix independent
subsystems.
The netdev -> EEC should be a downstream relation, just like the PTP is now
If a netdev wants to check what's the state of EEC driving it - it can do it, but
I don't see a way for the EEC subsystem to directly configure something in
Potentially couple different MAC chips without calling a kind of netdev API.
And that's what those patches address.

> >
> > > >
> > > > > 5. What is the problem with the following model?
> > > > >
> > > > > - The EEC is a separate object with following attributes:
> > > > >   * State: Invalid / Freerun / Locked / etc
> > > > >   * Sources: Netdev / external / etc
> > > > >   * Potentially more
> > > > >
> > > > > - Notifications are emitted to user space when the state of the EEC
> > > > >   changes. Drivers will either poll the state from the device or get
> > > > >   interrupts
> > > > >
> > > > > - The mapping from netdev to EEC is queried via ethtool
> > > >
> > > > Yep - that will be part of the EEC (DPLL) subsystem
> > >
> > > This model avoids all the problems I pointed out in the current
> > > proposal.
> >
> > That's the go-to model, but first we need control over the source as well :)
> 
> The point that we are trying to make is that like the EEC state, the
> source is also an EEC attribute and not a netdev attribute.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-03 14:55             ` Machnikowski, Maciej
  2021-12-03 15:58               ` Ido Schimmel
@ 2021-12-03 16:26               ` Petr Machata
  2021-12-03 16:50                 ` Machnikowski, Maciej
  1 sibling, 1 reply; 27+ messages in thread
From: Petr Machata @ 2021-12-03 16:26 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: Petr Machata, Ido Schimmel, netdev, intel-wired-lan, Kubalewski,
	Arkadiusz, richardcochran, abyagowi, Nguyen, Anthony L, davem,
	kuba, linux-kselftest, mkubecek, saeed, michael.chan


Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:

>> -----Original Message-----
>> From: Petr Machata <petrm@nvidia.com>
>> Sent: Friday, December 3, 2021 3:27 PM
>> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
>> Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
>> recovered clock for SyncE feature
>> 
>> 
>> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
>> 
>> >> -----Original Message-----
>> >> From: Ido Schimmel <idosch@idosch.org>
>> >>
>> >> On Thu, Dec 02, 2021 at 03:17:06PM +0000, Machnikowski, Maciej wrote:
>> >> > > -----Original Message-----
>> >> > > From: Ido Schimmel <idosch@idosch.org>
>> >> > >
>> >> > > On Wed, Dec 01, 2021 at 07:02:06PM +0100, Maciej Machnikowski
>> wrote:
>> >> > > Looking at the diagram from the previous submission [1]:
>> >> > >
>> >> > >       ┌──────────┬──────────┐
>> >> > >       │ RX       │ TX       │
>> >> > >   1   │ ports    │ ports    │ 1
>> >> > >   ───►├─────┐    │          ├─────►
>> >> > >   2   │     │    │          │ 2
>> >> > >   ───►├───┐ │    │          ├─────►
>> >> > >   3   │   │ │    │          │ 3
>> >> > >   ───►├─┐ │ │    │          ├─────►
>> >> > >       │ ▼ ▼ ▼    │          │
>> >> > >       │ ──────   │          │
>> >> > >       │ \____/   │          │
>> >> > >       └──┼──┼────┴──────────┘
>> >> > >         1│ 2│        ▲
>> >> > >  RCLK out│  │        │ TX CLK in
>> >> > >          ▼  ▼        │
>> >> > >        ┌─────────────┴───┐
>> >> > >        │                 │
>> >> > >        │       SEC       │
>> >> > >        │                 │
>> >> > >        └─────────────────┘
>> >> > >
>> >> > > Given a netdev (1, 2 or 3 in the diagram), the RCLK_SET message
>> allows
>> >> > > me to redirect the frequency recovered from this netdev to the EEC
>> via
>> >> > > either pin 1, pin 2 or both.
>> >> > >
>> >> > > Given a netdev, the RCLK_GET message allows me to query the range
>> of
>> >> > > pins (RCLK out 1-2 in the diagram) through which the frequency can be
>> >> > > fed into the EEC.
>> >> > >
>> >> > > Questions:
>> >> > >
>> >> > > 1. The query for all the above netdevs will return the same range
>> >> > > of pins. How does user space know that these are the same pins?
>> >> > > That is, how does user space know that RCLK_SET message to
>> >> > > redirect the frequency recovered from netdev 1 to pin 1 will be
>> >> > > overridden by the same message but for netdev 2?
>> >> >
>> >> > We don't have a way to do so right now. When we have EEC subsystem
>> >> > in place the right thing to do will be to add EEC input index and
>> >> > EEC index as additional arguments
>> >> >
>> >> > > 2. How does user space know the mapping between a netdev and an
>> >> > > EEC? That is, how does user space know that RCLK_SET message for
>> >> > > netdev 1 will cause the Tx frequency of netdev 2 to change
>> >> > > according to the frequency recovered from netdev 1?
>> >> >
>> >> > Ditto - currently we don't have any entity to link the pins to ATM,
>> >> > but we can address that in userspace just like PTP pins are used
>> >> > now
>> >> >
>> >> > > 3. If user space sends two RCLK_SET messages to redirect the
>> >> > > frequency recovered from netdev 1 to RCLK out 1 and from netdev 2
>> >> > > to RCLK out 2, how does it know which recovered frequency is
>> >> > > actually used an input to the EEC?
>> >>
>> >> User space doesn't know this as well?
>> >
>> > In current model it can come from the config file. Once we implement DPLL
>> > subsystem we can implement connection between pins and DPLLs if they
>> are
>> > known.
>> >
>> >> > >
>> >> > > 4. Why these pins are represented as attributes of a netdev and not as
>> >> > > attributes of the EEC? That is, why are they represented as output
>> pins
>> >> > > of the PHY as opposed to input pins of the EEC?
>> >> >
>> >> > They are 2 separate beings. Recovered clock outputs are controlled
>> >> > separately from EEC inputs.
>> >>
>> >> Separate how? What does it mean that they are controlled separately? In
>> >> which sense? That redirection of recovered frequency to pin is
>> >> controlled via PHY registers whereas priority setting between EEC inputs
>> >> is controlled via EEC registers? If so, this is an implementation detail
>> >> of a specific design. It is not of any importance to user space.
>> >
>> > They belong to different devices. EEC registers are physically in the DPLL
>> > hanging over I2C and recovered clocks are in the PHY/integrated PHY in
>> > the MAC. Depending on system architecture you may have control over
>> > one piece only
>> 
>> What does ETHTOOL_MSG_RCLK_SET actually configure, physically? Say I
>> have this message:
>> 
>> ETHTOOL_MSG_RCLK_SET dev = eth0
>> - ETHTOOL_A_RCLK_OUT_PIN_IDX = n
>> - ETHTOOL_A_RCLK_PIN_FLAGS |= ETHTOOL_RCLK_PIN_FLAGS_ENA
>> 
>> Eventually this lands in ops->set_rclk_out(dev, out_idx, new_state).
>> What does the MAC driver do next?
>
> It goes to the PTY layer, enables the clock recovery from a given physical lane, 
> optionally configure the clock divider and pin output muxes. This will be 
> HW-specific though, but the general concept will look like that.

The reason I am asking is that I suspect that by exposing this
functionality through netdev, you assume that the NIC driver will do
whatever EEC configuration necessary _anyway_. So why couldn't it just
instantiate the EEC object as well?

>> >> > If we mix them it'll be hard to control everything especially that a
>> >> > single EEC can support multiple devices.
>> >>
>> >> Hard how? Please provide concrete examples.
>> >
>> > From the EEC perspective it's one to many relation - one EEC input
>> > pin will serve even 4,16,48 netdevs. I don't see easy way of
>> > starting from EEC input of EEC device and figuring out which
>> > netdevs are connected to it to talk to the right one. In current
>> > model it's as simple as:
>> > - I received QL-PRC on netdev ens4f0
>> > - I send back enable recovered clock on pin 0 of the ens4f0
>> 
>> How do I know it's pin 0 though? Config file?
>
> You can find that by sending the ETHTOOL_MSG_RCLK_GET without any pin
> index to get the acceptable/supported range.

Ha, OK, pin0 means the RCLK pin. OK.

>> > - go to EEC that will be linked to it
>> > - see the state of it - if its locked - report QL-EEC downsteam
>> >
>> > How would you this control look in the EEC/DPLL implementation? Maybe
>> > I missed something.
>> 
>> In the EEC-centric model this is what happens:
>> 
>> - QL-PRC packet is received on ens4f0
>> - Userspace consults a UAPI to figure out what EEC and pin ID this
>>   netdevice corresponds to
>> - Userspace instructs through a UAPI the indicated EEC to use the
>>   indicated pin as a source
>> - Userspace then monitors the indicated EEC through a UAPI. When the EEC
>>   locks, QL-EEC is reported downstream
>
> This is still missing the port/lane->pin mapping. This is what will
> happen in the EEC/DPLL subsystem.

You asked how the control looks in the ECC-centric model. So this is
how. That this stuff is missing is fairly obvious, we are talking about
a different model.

I don't buy the "extremely hard" argument. The set of steps to do might
be longer, but they are still just steps. No jumps, hoops, sommersaults.
On the flip side we get a proper UAPI that can stay useful for a while.

>> >> What do you mean by "multiple devices"? A multi-port adapter with a
>> >> single EEC or something else?
>> >
>> > Multiple MACs that use a single EEC clock.
>> >
>> >> > Also if we make those pins attributes of the EEC it'll become
>> >> > extremally hard to map them to netdevs and control them from the
>> >> > userspace app that will receive the ESMC message with a given QL
>> >> > level on netdev X.
>> >>
>> >> Hard how? What is the problem with something like:
>> >>
>> >> # eec set source 1 type netdev dev swp1
>> >>
>> >> The EEC object should be registered by the same entity that registers
>> >> the netdevs whose Tx frequency is controlled by the EEC, the MAC driver.
>> >
>> > But the EEC object may not be controlled by the MAC - in which case
>> > this model won't work.
>> 
>> In that case the driver for the device that controls EEC would
>> instantiates the object. It doesn't have to be a MAC driver.
>> 
>> But if it is controlled by the MAC, the MAC driver instantiates it. And
>> can set up the connection between the MAC and the EEC, so that in the
>> shell snippet above "eec" knows how to get the EEC handle from the
>> netdevice.
>
> But it still needs to talk to MAC driver somehow to enable the clock
> recovery on a given pin - that's where the API defined here is needed.

Yes, there needs to be an API between the EEC object and its owner. That
API can be internal though. E.g. a set of callbacks or a notifier chain.
This is how loose coupling is typically done in the kernel.

>> >> > > 5. What is the problem with the following model?
>> >> > >
>> >> > > - The EEC is a separate object with following attributes:
>> >> > >   * State: Invalid / Freerun / Locked / etc
>> >> > >   * Sources: Netdev / external / etc
>> >> > >   * Potentially more
>> >> > >
>> >> > > - Notifications are emitted to user space when the state of the EEC
>> >> > >   changes. Drivers will either poll the state from the device or get
>> >> > >   interrupts
>> >> > >
>> >> > > - The mapping from netdev to EEC is queried via ethtool
>> >> >
>> >> > Yep - that will be part of the EEC (DPLL) subsystem
>> >>
>> >> This model avoids all the problems I pointed out in the current
>> >> proposal.
>> >
>> > That's the go-to model, but first we need control over the source as
>> > well :)
>> 
>> Why is that? Can you illustrate a case that breaks with the above model?
>
> If you have 32 port switch chip with 2 recovered clock outputs how will you
> tell the chip to get the 18th port to pin 0 and from port 20 to pin 1? That's
> the part those patches addresses. The further side of "which clock should the
> EEC use" belongs to the DPLL subsystem and I agree with that.

So the claim is that in some cases the owner of the EEC does not know
about the netdevices?

If that is the case, how do netdevices know about the EEC, like the
netdev-centric model assumes?

Anyway, to answer the question, something like the following would
happen:

- Ask EEC to enumerate all input pins it knows about
- Find the one that references swp18
- Ask EEC to forward that input pin to output pin 0
- Repeat for swp20 and output pin 1

The switch driver (or multi-port NIC driver) just instantiates all of
netdevices, the EEC object, and pin objects, and therefore can set up
arbitrary linking between the three.

> Or to put it into different words:
> This API will configure given quality level  frequency reference outputs on chip's
> Dedicated outputs. On a board you will connect those to the EEC's reference inputs.
>
> The EEC's job is to validate the inputs and lock to them following certain rules,
> The PHY/MAC (and this API) job is to deliver reference signals to the EEC. 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-03 16:26               ` Petr Machata
@ 2021-12-03 16:50                 ` Machnikowski, Maciej
  2021-12-03 18:44                   ` Petr Machata
  0 siblings, 1 reply; 27+ messages in thread
From: Machnikowski, Maciej @ 2021-12-03 16:50 UTC (permalink / raw)
  To: Petr Machata
  Cc: Ido Schimmel, netdev, intel-wired-lan, Kubalewski, Arkadiusz,
	richardcochran, abyagowi, Nguyen, Anthony L, davem, kuba,
	linux-kselftest, mkubecek, saeed, michael.chan

> -----Original Message-----
> From: Petr Machata <petrm@nvidia.com>
> Sent: Friday, December 3, 2021 5:26 PM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> recovered clock for SyncE feature
> 
> 
> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
> 
> >> -----Original Message-----
> >> From: Petr Machata <petrm@nvidia.com>
> >> Sent: Friday, December 3, 2021 3:27 PM
> >> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> >> Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> >> recovered clock for SyncE feature
> >>
> >>
> >> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
> >>
> >> >> -----Original Message-----
> >> >> From: Ido Schimmel <idosch@idosch.org>
> >> >>
> >> >> On Thu, Dec 02, 2021 at 03:17:06PM +0000, Machnikowski, Maciej
> wrote:
> >> >> > > -----Original Message-----
> >> >> > > From: Ido Schimmel <idosch@idosch.org>
> >> >> > >
> >> >> > > On Wed, Dec 01, 2021 at 07:02:06PM +0100, Maciej Machnikowski
> >> wrote:
> >> >> > > Looking at the diagram from the previous submission [1]:
> >> >> > >
> >> >> > >       ┌──────────┬──────────┐
> >> >> > >       │ RX       │ TX       │
> >> >> > >   1   │ ports    │ ports    │ 1
> >> >> > >   ───►├─────┐    │          ├─────►
> >> >> > >   2   │     │    │          │ 2
> >> >> > >   ───►├───┐ │    │          ├─────►
> >> >> > >   3   │   │ │    │          │ 3
> >> >> > >   ───►├─┐ │ │    │          ├─────►
> >> >> > >       │ ▼ ▼ ▼    │          │
> >> >> > >       │ ──────   │          │
> >> >> > >       │ \____/   │          │
> >> >> > >       └──┼──┼────┴──────────┘
> >> >> > >         1│ 2│        ▲
> >> >> > >  RCLK out│  │        │ TX CLK in
> >> >> > >          ▼  ▼        │
> >> >> > >        ┌─────────────┴───┐
> >> >> > >        │                 │
> >> >> > >        │       SEC       │
> >> >> > >        │                 │
> >> >> > >        └─────────────────┘
> >> >> > >
> >> >> > > Given a netdev (1, 2 or 3 in the diagram), the RCLK_SET message
> >> allows
> >> >> > > me to redirect the frequency recovered from this netdev to the
> EEC
> >> via
> >> >> > > either pin 1, pin 2 or both.
> >> >> > >
> >> >> > > Given a netdev, the RCLK_GET message allows me to query the
> range
> >> of
> >> >> > > pins (RCLK out 1-2 in the diagram) through which the frequency can
> be
> >> >> > > fed into the EEC.
> >> >> > >
> >> >> > > Questions:
> >> >> > >
> >> >> > > 1. The query for all the above netdevs will return the same range
> >> >> > > of pins. How does user space know that these are the same pins?
> >> >> > > That is, how does user space know that RCLK_SET message to
> >> >> > > redirect the frequency recovered from netdev 1 to pin 1 will be
> >> >> > > overridden by the same message but for netdev 2?
> >> >> >
> >> >> > We don't have a way to do so right now. When we have EEC
> subsystem
> >> >> > in place the right thing to do will be to add EEC input index and
> >> >> > EEC index as additional arguments
> >> >> >
> >> >> > > 2. How does user space know the mapping between a netdev and
> an
> >> >> > > EEC? That is, how does user space know that RCLK_SET message
> for
> >> >> > > netdev 1 will cause the Tx frequency of netdev 2 to change
> >> >> > > according to the frequency recovered from netdev 1?
> >> >> >
> >> >> > Ditto - currently we don't have any entity to link the pins to ATM,
> >> >> > but we can address that in userspace just like PTP pins are used
> >> >> > now
> >> >> >
> >> >> > > 3. If user space sends two RCLK_SET messages to redirect the
> >> >> > > frequency recovered from netdev 1 to RCLK out 1 and from netdev
> 2
> >> >> > > to RCLK out 2, how does it know which recovered frequency is
> >> >> > > actually used an input to the EEC?
> >> >>
> >> >> User space doesn't know this as well?
> >> >
> >> > In current model it can come from the config file. Once we implement
> DPLL
> >> > subsystem we can implement connection between pins and DPLLs if
> they
> >> are
> >> > known.
> >> >
> >> >> > >
> >> >> > > 4. Why these pins are represented as attributes of a netdev and
> not as
> >> >> > > attributes of the EEC? That is, why are they represented as output
> >> pins
> >> >> > > of the PHY as opposed to input pins of the EEC?
> >> >> >
> >> >> > They are 2 separate beings. Recovered clock outputs are controlled
> >> >> > separately from EEC inputs.
> >> >>
> >> >> Separate how? What does it mean that they are controlled separately?
> In
> >> >> which sense? That redirection of recovered frequency to pin is
> >> >> controlled via PHY registers whereas priority setting between EEC
> inputs
> >> >> is controlled via EEC registers? If so, this is an implementation detail
> >> >> of a specific design. It is not of any importance to user space.
> >> >
> >> > They belong to different devices. EEC registers are physically in the DPLL
> >> > hanging over I2C and recovered clocks are in the PHY/integrated PHY in
> >> > the MAC. Depending on system architecture you may have control over
> >> > one piece only
> >>
> >> What does ETHTOOL_MSG_RCLK_SET actually configure, physically? Say I
> >> have this message:
> >>
> >> ETHTOOL_MSG_RCLK_SET dev = eth0
> >> - ETHTOOL_A_RCLK_OUT_PIN_IDX = n
> >> - ETHTOOL_A_RCLK_PIN_FLAGS |= ETHTOOL_RCLK_PIN_FLAGS_ENA
> >>
> >> Eventually this lands in ops->set_rclk_out(dev, out_idx, new_state).
> >> What does the MAC driver do next?
> >
> > It goes to the PTY layer, enables the clock recovery from a given physical
> lane,
> > optionally configure the clock divider and pin output muxes. This will be
> > HW-specific though, but the general concept will look like that.
> 
> The reason I am asking is that I suspect that by exposing this
> functionality through netdev, you assume that the NIC driver will do
> whatever EEC configuration necessary _anyway_. So why couldn't it just
> instantiate the EEC object as well?

Not necessarily. The EEC can be supported by totally different driver. I.e there
are Renesas DPLL drivers available now in the ptp subsystem. The DPLL
can be connected anywhere in the system.

> >> >> > If we mix them it'll be hard to control everything especially that a
> >> >> > single EEC can support multiple devices.
> >> >>
> >> >> Hard how? Please provide concrete examples.
> >> >
> >> > From the EEC perspective it's one to many relation - one EEC input
> >> > pin will serve even 4,16,48 netdevs. I don't see easy way of
> >> > starting from EEC input of EEC device and figuring out which
> >> > netdevs are connected to it to talk to the right one. In current
> >> > model it's as simple as:
> >> > - I received QL-PRC on netdev ens4f0
> >> > - I send back enable recovered clock on pin 0 of the ens4f0
> >>
> >> How do I know it's pin 0 though? Config file?
> >
> > You can find that by sending the ETHTOOL_MSG_RCLK_GET without any
> pin
> > index to get the acceptable/supported range.
> 
> Ha, OK, pin0 means the RCLK pin. OK.
> 
> >> > - go to EEC that will be linked to it
> >> > - see the state of it - if its locked - report QL-EEC downsteam
> >> >
> >> > How would you this control look in the EEC/DPLL implementation?
> Maybe
> >> > I missed something.
> >>
> >> In the EEC-centric model this is what happens:
> >>
> >> - QL-PRC packet is received on ens4f0
> >> - Userspace consults a UAPI to figure out what EEC and pin ID this
> >>   netdevice corresponds to
> >> - Userspace instructs through a UAPI the indicated EEC to use the
> >>   indicated pin as a source
> >> - Userspace then monitors the indicated EEC through a UAPI. When the
> EEC
> >>   locks, QL-EEC is reported downstream
> >
> > This is still missing the port/lane->pin mapping. This is what will
> > happen in the EEC/DPLL subsystem.
> 
> You asked how the control looks in the ECC-centric model. So this is
> how. That this stuff is missing is fairly obvious, we are talking about
> a different model.
> 
> I don't buy the "extremely hard" argument. The set of steps to do might
> be longer, but they are still just steps. No jumps, hoops, sommersaults.
> On the flip side we get a proper UAPI that can stay useful for a while.
> 
> >> >> What do you mean by "multiple devices"? A multi-port adapter with a
> >> >> single EEC or something else?
> >> >
> >> > Multiple MACs that use a single EEC clock.
> >> >
> >> >> > Also if we make those pins attributes of the EEC it'll become
> >> >> > extremally hard to map them to netdevs and control them from the
> >> >> > userspace app that will receive the ESMC message with a given QL
> >> >> > level on netdev X.
> >> >>
> >> >> Hard how? What is the problem with something like:
> >> >>
> >> >> # eec set source 1 type netdev dev swp1
> >> >>
> >> >> The EEC object should be registered by the same entity that registers
> >> >> the netdevs whose Tx frequency is controlled by the EEC, the MAC
> driver.
> >> >
> >> > But the EEC object may not be controlled by the MAC - in which case
> >> > this model won't work.
> >>
> >> In that case the driver for the device that controls EEC would
> >> instantiates the object. It doesn't have to be a MAC driver.
> >>
> >> But if it is controlled by the MAC, the MAC driver instantiates it. And
> >> can set up the connection between the MAC and the EEC, so that in the
> >> shell snippet above "eec" knows how to get the EEC handle from the
> >> netdevice.
> >
> > But it still needs to talk to MAC driver somehow to enable the clock
> > recovery on a given pin - that's where the API defined here is needed.
> 
> Yes, there needs to be an API between the EEC object and its owner. That
> API can be internal though. E.g. a set of callbacks or a notifier chain.
> This is how loose coupling is typically done in the kernel.
> 
> >> >> > > 5. What is the problem with the following model?
> >> >> > >
> >> >> > > - The EEC is a separate object with following attributes:
> >> >> > >   * State: Invalid / Freerun / Locked / etc
> >> >> > >   * Sources: Netdev / external / etc
> >> >> > >   * Potentially more
> >> >> > >
> >> >> > > - Notifications are emitted to user space when the state of the EEC
> >> >> > >   changes. Drivers will either poll the state from the device or get
> >> >> > >   interrupts
> >> >> > >
> >> >> > > - The mapping from netdev to EEC is queried via ethtool
> >> >> >
> >> >> > Yep - that will be part of the EEC (DPLL) subsystem
> >> >>
> >> >> This model avoids all the problems I pointed out in the current
> >> >> proposal.
> >> >
> >> > That's the go-to model, but first we need control over the source as
> >> > well :)
> >>
> >> Why is that? Can you illustrate a case that breaks with the above model?
> >
> > If you have 32 port switch chip with 2 recovered clock outputs how will you
> > tell the chip to get the 18th port to pin 0 and from port 20 to pin 1? That's
> > the part those patches addresses. The further side of "which clock should
> the
> > EEC use" belongs to the DPLL subsystem and I agree with that.
> 
> So the claim is that in some cases the owner of the EEC does not know
> about the netdevices?
> 
> If that is the case, how do netdevices know about the EEC, like the
> netdev-centric model assumes?
> 
> Anyway, to answer the question, something like the following would
> happen:
> 
> - Ask EEC to enumerate all input pins it knows about
> - Find the one that references swp18
> - Ask EEC to forward that input pin to output pin 0
> - Repeat for swp20 and output pin 1
> 
> The switch driver (or multi-port NIC driver) just instantiates all of
> netdevices, the EEC object, and pin objects, and therefore can set up
> arbitrary linking between the three.

This will end up with a model in which pin X of the EEC will link to dozens
ports - userspace tool would need to find out the relation between them and
EECs somehow. It's far more convenient if a given netdev knows where it is
connected to and which pin can it drive.

I.e. send the netdev swp20 ETHTOOL_MSG_RCLK_GET and get the pin indexes
of the EEC and send the future message to find which EEC that is (or even return
EEC index in RCLK_GET?). Set the recovered clock on that pin with the 
ETHTOOL_MSG_RCLK_SET.
Then go to the given EEC and configure it to use the pin that was returned
before as a frequency source and monitor the EEC state.

Additionally, the EEC device may be instantiated by a totally different driver,
in which case the relation between its pins and netdevs may not even be known.

> > Or to put it into different words:
> > This API will configure given quality level  frequency reference outputs on
> chip's
> > Dedicated outputs. On a board you will connect those to the EEC's
> reference inputs.
> >
> > The EEC's job is to validate the inputs and lock to them following certain
> rules,
> > The PHY/MAC (and this API) job is to deliver reference signals to the EEC.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-03 16:18             ` Machnikowski, Maciej
@ 2021-12-03 18:21               ` Petr Machata
  2021-12-05 12:24               ` Ido Schimmel
  1 sibling, 0 replies; 27+ messages in thread
From: Petr Machata @ 2021-12-03 18:21 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: Ido Schimmel, netdev, intel-wired-lan, Kubalewski, Arkadiusz,
	richardcochran, abyagowi, Nguyen, Anthony L, davem, kuba,
	linux-kselftest, mkubecek, saeed, michael.chan, petrm


Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:

>> -----Original Message-----
>> From: Ido Schimmel <idosch@idosch.org>
>> Sent: Friday, December 3, 2021 4:46 PM
>> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
>> Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
>> recovered clock for SyncE feature
>> 
>> On Thu, Dec 02, 2021 at 05:20:24PM +0000, Machnikowski, Maciej wrote:
>> > > -----Original Message-----
>> > > From: Ido Schimmel <idosch@idosch.org>
>> > > Sent: Thursday, December 2, 2021 5:36 PM
>> > > To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
>> > > Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
>> > > recovered clock for SyncE feature
>> > >
>> > > On Thu, Dec 02, 2021 at 03:17:06PM +0000, Machnikowski, Maciej wrote:
>> > > > > -----Original Message-----
>> > > > > From: Ido Schimmel <idosch@idosch.org>
>> > > > > Sent: Thursday, December 2, 2021 1:44 PM
>> > > > > To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
>> > > > > Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
>> > > > > recovered clock for SyncE feature
>> > > > >
>> > > > > On Wed, Dec 01, 2021 at 07:02:06PM +0100, Maciej Machnikowski wrote:
>> > > > > Looking at the diagram from the previous submission [1]:
>> > > > >
>> > > > >       ┌──────────┬──────────┐
>> > > > >       │ RX       │ TX       │
>> > > > >   1   │ ports    │ ports    │ 1
>> > > > >   ───►├─────┐    │          ├─────►
>> > > > >   2   │     │    │          │ 2
>> > > > >   ───►├───┐ │    │          ├─────►
>> > > > >   3   │   │ │    │          │ 3
>> > > > >   ───►├─┐ │ │    │          ├─────►
>> > > > >       │ ▼ ▼ ▼    │          │
>> > > > >       │ ──────   │          │
>> > > > >       │ \____/   │          │
>> > > > >       └──┼──┼────┴──────────┘
>> > > > >         1│ 2│        ▲
>> > > > >  RCLK out│  │        │ TX CLK in
>> > > > >          ▼  ▼        │
>> > > > >        ┌─────────────┴───┐
>> > > > >        │                 │
>> > > > >        │       SEC       │
>> > > > >        │                 │
>> > > > >        └─────────────────┘
>> > > > >
>> > > > > Given a netdev (1, 2 or 3 in the diagram), the RCLK_SET
>> > > > > message allows me to redirect the frequency recovered from
>> > > > > this netdev to the EEC via either pin 1, pin 2 or both.
>> > > > >
>> > > > > Given a netdev, the RCLK_GET message allows me to query the
>> > > > > range of pins (RCLK out 1-2 in the diagram) through which the
>> > > > > frequency can be fed into the EEC.
>> > > > >
>> > > > > Questions:
>> > > > >
>> > > > > 1. The query for all the above netdevs will return the same
>> > > > > range of pins. How does user space know that these are the
>> > > > > same pins? That is, how does user space know that RCLK_SET
>> > > > > message to redirect the frequency recovered from netdev 1 to
>> > > > > pin 1 will be overridden by the same message but for netdev
>> > > > > 2?
>> > > >
>> > > > We don't have a way to do so right now. When we have EEC
>> > > > subsystem in place the right thing to do will be to add EEC
>> > > > input index and EEC index as additional arguments
>> > > >
>> > > > > 2. How does user space know the mapping between a netdev and
>> > > > > an EEC? That is, how does user space know that RCLK_SET
>> > > > > message for netdev 1 will cause the Tx frequency of netdev 2
>> > > > > to change according to the frequency recovered from netdev 1?
>> > > >
>> > > > Ditto - currently we don't have any entity to link the pins to
>> > > > ATM, but we can address that in userspace just like PTP pins
>> > > > are used now
>> > > >
>> > > > > 3. If user space sends two RCLK_SET messages to redirect the
>> > > > > frequency recovered from netdev 1 to RCLK out 1 and from
>> > > > > netdev 2 to RCLK out 2, how does it know which recovered
>> > > > > frequency is actually used an input to the EEC?
>> > >
>> > > User space doesn't know this as well?
>> >
>> > In current model it can come from the config file. Once we
>> > implement DPLL subsystem we can implement connection between pins
>> > and DPLLs if they are known.
>> 
>> To be clear, no SyncE patches should be accepted before we have a
>> DPLL subsystem or however the subsystem that will model the EEC is
>> going to be called.
>> 
>> You are asking us to buy into a new uAPI that can never be removed.
>> We pointed out numerous problems with this uAPI and suggested a model
>> that solves them. When asked why it can't work we are answered with
>> vague arguments about this model being "hard".
>
> My argument was never "it's hard" - the answer is we need both APIs.
>
>> In addition, without a representation of the EEC, these patches have
>> no value for user space. They basically allow user space to redirect
>> the recovered frequency from a netdev to an object that does not
>> exist. User space doesn't know if the object is successfully tracking
>> the frequency (the EEC state) and does not know which other
>> components are utilizing this recovered frequency as input (e.g.,
>> other netdevs, PHC).
>
> That's also not true - the proposed uAPI lets you enable recovered
> frequency output pins and redirect the right clock to them. In some
> implementations you may not have anything else.

Wait, are there EEC deployments where there is no way to determine the
EEC state?

>> BTW, what is the use case for enabling two EEC inputs simultaneously?
>> Some seamless failover?
>
> Mainly - redundacy
>
>> >
>> > > > >
>> > > > > 4. Why these pins are represented as attributes of a netdev
>> > > > > and not as attributes of the EEC? That is, why are they
>> > > > > represented as output pins of the PHY as opposed to input
>> > > > > pins of the EEC?
>> > > >
>> > > > They are 2 separate beings. Recovered clock outputs are
>> > > > controlled separately from EEC inputs.
>> > >
>> > > Separate how? What does it mean that they are controlled
>> > > separately? In which sense? That redirection of recovered
>> > > frequency to pin is controlled via PHY registers whereas priority
>> > > setting between EEC inputs is controlled via EEC registers? If
>> > > so, this is an implementation detail of a specific design. It is
>> > > not of any importance to user space.
>> >
>> > They belong to different devices. EEC registers are physically in
>> > the DPLL hanging over I2C and recovered clocks are in the
>> > PHY/integrated PHY in the MAC. Depending on system architecture you
>> > may have control over one piece only
>> 
>> These are implementation details of a specific design and should not
>> influence the design of the uAPI. The uAPI should be influenced by
>> the logical task that it is trying to achieve.
>
> There are 2 logical tasks:
> 1. Enable clocks that are recovered from a specific netdev
> 2. Control the EEC
>
> They are both needed to get to the full solution, but are independent
> from each other. You can't put RCLK redirection to the EEC as it's one
> to many relation and you will need to call the netdev to enable it
> anyway.

"Call the netdev"? When EEC decides a configuration needs to be done, it
will defer to a callback set up by whoever created the EEC object. EEC
doesn't care. If you have a disk that somehow contains an EEC to
syntonize disk spinning across the data center, go ahead and create the
object from a disk driver. Then the EEC object will invoke disk driver
code.

> Also, when we tried to add EEC state to PTP subsystem the answer was
> that we can't mix subsystems. The proposal to configure recovered
> clocks through EEC would mix netdev with EEC.

Involving MAC driver through an abstract interface is not mixing
subsystems. It's just loose coupling.

>> > > What do you mean by "multiple devices"? A multi-port adapter with
>> > > a single EEC or something else?
>> >
>> > Multiple MACs that use a single EEC clock.
>> >
>> > > > Also if we make those pins attributes of the EEC it'll become
>> > > > extremally hard to map them to netdevs and control them from
>> > > > the userspace app that will receive the ESMC message with a
>> > > > given QL level on netdev X.
>> > >
>> > > Hard how? What is the problem with something like:
>> > >
>> > > # eec set source 1 type netdev dev swp1
>> > >
>> > > The EEC object should be registered by the same entity that
>> > > registers the netdevs whose Tx frequency is controlled by the
>> > > EEC, the MAC driver.
>> >
>> > But the EEC object may not be controlled by the MAC - in which case
>> > this model won't work.
>> 
>> Why wouldn't it work? Leave individual kernel modules alone and look
>> at the kernel. It is registering all the necessary logical objects
>> such netdevs, PHCs and EECs. There is no way user space knows better
>> than the kernel how these objects fit together as the purpose of the
>> kernel is to abstract the hardware to user space.
>> 
>> User space's request to use the Rx frequency recovered from netdev X
>> as an input to EEC Y will be processed by the DPLL subsystem. In
>> turn, this subsystem will invoke whichever kernel modules it needs to
>> fulfill the request.
>
> But how would that call go through the kernel? What would you like to
> give to the EEC object and how should it react. I'm fine with the
> changes, but I don't see the solution in that proposal

You will give EEC object handle, RCLK source handle, and a handle of the
output pin to configure. These are all objects in the EEC subsystem.

Some of the RCLK sources are pre-attached to a netdevice, so they carry
an ifindex reference. Some are external and do not have a netdevice
(that's for NIC-to-NIC frequency bridges, external GPS's and whatnot).

Eventually to implement the request, the EEC object would call its
creator through a callback appropriate for the request.

> and this model would mix independent subsystems.

The only place where netdevices are tightly coupled to the EEC are those
pre-attached pins. But OK, EEC just happens to be very, very often part
of a NIC, and being able to say, this RCLK comes from swp1, is just
very, very handy. But it is not a requirement. The EEC model can just as
easily represent external pins, or weird stuff like boards that have
nothing _but_ external pins.

> The netdev -> EEC should be a downstream relation, just like the PTP
> is now If a netdev wants to check what's the state of EEC driving it -
> it can do it, but I don't see a way for the EEC subsystem to directly
> configure something in Potentially couple different MAC chips without
> calling a kind of netdev API. And that's what those patches address.

Either the device packages everything, e.g. a switch, or an EEC-enabled
NIC. In that case, the NIC driver instantiates the EEC, and pins, and
RCLK sources, and netdevices. EEC configuration ends up getting handled
by this device driver, because that's the way it set things up.

Or we have a NIC separate from the EEC, but there is still an option to
hook those up somehow. That looks like something that should probably be
represented by an EEC with some external RCLK sources. (Or maybe they
are just inout pins or whatever, that is a detail.) Then the EEC driver
ends up instantiating the object, and implementing the requests. And the
admin needs to have external information to know that external pin such
and such is actually connected to PHY such and such.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-03 16:50                 ` Machnikowski, Maciej
@ 2021-12-03 18:44                   ` Petr Machata
  2021-12-03 19:07                     ` Machnikowski, Maciej
  0 siblings, 1 reply; 27+ messages in thread
From: Petr Machata @ 2021-12-03 18:44 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: Petr Machata, Ido Schimmel, netdev, intel-wired-lan, Kubalewski,
	Arkadiusz, richardcochran, abyagowi, Nguyen, Anthony L, davem,
	kuba, linux-kselftest, mkubecek, saeed, michael.chan


Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:

>> -----Original Message-----
>> From: Petr Machata <petrm@nvidia.com>
>> 
>> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
>> 
>> >> -----Original Message-----
>> >> From: Petr Machata <petrm@nvidia.com>
>> >>
>> >> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
>> >>
>> >> >> -----Original Message-----
>> >> >> From: Ido Schimmel <idosch@idosch.org>
>> >> >>
>> >> >> On Thu, Dec 02, 2021 at 03:17:06PM +0000, Machnikowski, Maciej wrote:
>> >> >> > > -----Original Message-----
>> >> >> > > From: Ido Schimmel <idosch@idosch.org>
>> >> >> > >
>> >> >> > > 4. Why these pins are represented as attributes of a netdev
>> >> >> > > and not as attributes of the EEC? That is, why are they
>> >> >> > > represented as output pins of the PHY as opposed to input
>> >> >> > > pins of the EEC?
>> >> >> >
>> >> >> > They are 2 separate beings. Recovered clock outputs are
>> >> >> > controlled separately from EEC inputs.
>> >> >>
>> >> >> Separate how? What does it mean that they are controlled
>> >> >> separately? In which sense? That redirection of recovered
>> >> >> frequency to pin is controlled via PHY registers whereas
>> >> >> priority setting between EEC inputs is controlled via EEC
>> >> >> registers? If so, this is an implementation detail of a
>> >> >> specific design. It is not of any importance to user space.
>> >> >
>> >> > They belong to different devices. EEC registers are physically
>> >> > in the DPLL hanging over I2C and recovered clocks are in the
>> >> > PHY/integrated PHY in the MAC. Depending on system architecture
>> >> > you may have control over one piece only
>> >>
>> >> What does ETHTOOL_MSG_RCLK_SET actually configure, physically? Say
>> >> I have this message:
>> >>
>> >> ETHTOOL_MSG_RCLK_SET dev = eth0
>> >> - ETHTOOL_A_RCLK_OUT_PIN_IDX = n
>> >> - ETHTOOL_A_RCLK_PIN_FLAGS |= ETHTOOL_RCLK_PIN_FLAGS_ENA
>> >>
>> >> Eventually this lands in ops->set_rclk_out(dev, out_idx,
>> >> new_state). What does the MAC driver do next?
>> >
>> > It goes to the PTY layer, enables the clock recovery from a given
>> > physical lane, optionally configure the clock divider and pin
>> > output muxes. This will be HW-specific though, but the general
>> > concept will look like that.
>> 
>> The reason I am asking is that I suspect that by exposing this
>> functionality through netdev, you assume that the NIC driver will do
>> whatever EEC configuration necessary _anyway_. So why couldn't it just
>> instantiate the EEC object as well?
>
> Not necessarily. The EEC can be supported by totally different driver.
> I.e there are Renesas DPLL drivers available now in the ptp subsystem.
> The DPLL can be connected anywhere in the system.
>
>> >> >> > > 5. What is the problem with the following model?
>> >> >> > >
>> >> >> > > - The EEC is a separate object with following attributes:
>> >> >> > >   * State: Invalid / Freerun / Locked / etc
>> >> >> > >   * Sources: Netdev / external / etc
>> >> >> > >   * Potentially more
>> >> >> > >
>> >> >> > > - Notifications are emitted to user space when the state of
>> >> >> > >   the EEC changes. Drivers will either poll the state from
>> >> >> > >   the device or get interrupts
>> >> >> > >
>> >> >> > > - The mapping from netdev to EEC is queried via ethtool
>> >> >> >
>> >> >> > Yep - that will be part of the EEC (DPLL) subsystem
>> >> >>
>> >> >> This model avoids all the problems I pointed out in the current
>> >> >> proposal.
>> >> >
>> >> > That's the go-to model, but first we need control over the
>> >> > source as well :)
>> >>
>> >> Why is that? Can you illustrate a case that breaks with the above
>> >> model?
>> >
>> > If you have 32 port switch chip with 2 recovered clock outputs how
>> > will you tell the chip to get the 18th port to pin 0 and from port
>> > 20 to pin 1? That's the part those patches addresses. The further
>> > side of "which clock should the EEC use" belongs to the DPLL
>> > subsystem and I agree with that.
>> 
>> So the claim is that in some cases the owner of the EEC does not know
>> about the netdevices?
>> 
>> If that is the case, how do netdevices know about the EEC, like the
>> netdev-centric model assumes?
>> 
>> Anyway, to answer the question, something like the following would
>> happen:
>> 
>> - Ask EEC to enumerate all input pins it knows about
>> - Find the one that references swp18
>> - Ask EEC to forward that input pin to output pin 0
>> - Repeat for swp20 and output pin 1
>> 
>> The switch driver (or multi-port NIC driver) just instantiates all of
>> netdevices, the EEC object, and pin objects, and therefore can set up
>> arbitrary linking between the three.
>
> This will end up with a model in which pin X of the EEC will link to
>dozens ports - userspace tool would need to find out the relation
>between them and EECs somehow.

Indeed. If you have EEC connected to a bunch of ports, the EEC object is
related to a bunch of netdevices. The UAPI needs to have tools to dump
these objects so that it is possible to discover what is connected
where.

This configuration will also not change during the lifetime of the EEC
object, so tools can cache it.

> It's far more convenient if a given netdev knows where it is connected
> to and which pin can it drive.

Yeah, it is of course possible to add references from the netdevice to
the EEC object directly, so that the tool just needs to ask a netdevice
what EEC / RCLK source ID it maps to.

This has mostly nothing to do with the model itself.

> I.e. send the netdev swp20 ETHTOOL_MSG_RCLK_GET and get the pin
> indexes of the EEC and send the future message to find which EEC that
> is (or even return EEC index in RCLK_GET?).

Since the pin index on its own is useless, it would make sense to return
both pieces of information at the same time.

> Set the recovered clock on that pin with the ETHTOOL_MSG_RCLK_SET.

Nope.

> Then go to the given EEC and configure it to use the pin that was
> returned before as a frequency source and monitor the EEC state.

Yep.

EEC will invoke a callback to set up the tracking. If something special
needs to be done to "set the recovered clock on that pin", the handler
of that callback will do it.

> Additionally, the EEC device may be instantiated by a totally
> different driver, in which case the relation between its pins and
> netdevs may not even be known.

Like an EEC, some PHYs, but the MAC driver does not know about both
pieces? Who sets up the connection between the two? The box admin
through some cabling? SoC designer?

Also, what does the external EEC actually do with the signal from the
PHY? Tune to it and forward to the other PHYs in the complex?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-03 18:44                   ` Petr Machata
@ 2021-12-03 19:07                     ` Machnikowski, Maciej
  2021-12-06 14:40                       ` Petr Machata
  0 siblings, 1 reply; 27+ messages in thread
From: Machnikowski, Maciej @ 2021-12-03 19:07 UTC (permalink / raw)
  To: Petr Machata
  Cc: Ido Schimmel, netdev, intel-wired-lan, Kubalewski, Arkadiusz,
	richardcochran, abyagowi, Nguyen, Anthony L, davem, kuba,
	linux-kselftest, mkubecek, saeed, michael.chan

> -----Original Message-----
> From: Petr Machata <petrm@nvidia.com>
> Sent: Friday, December 3, 2021 7:45 PM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> recovered clock for SyncE feature
> 
> 
> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
> 
> >> -----Original Message-----
> >> From: Petr Machata <petrm@nvidia.com>
> >>
> >> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
> >>
> >> >> -----Original Message-----
> >> >> From: Petr Machata <petrm@nvidia.com>
> >> >>
> >> >> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
> >> >>
> >> >> >> -----Original Message-----
> >> >> >> From: Ido Schimmel <idosch@idosch.org>
> >> >> >>
> >> >> >> On Thu, Dec 02, 2021 at 03:17:06PM +0000, Machnikowski, Maciej
> wrote:
> >> >> >> > > -----Original Message-----
> >> >> >> > > From: Ido Schimmel <idosch@idosch.org>
> >> >> >> > >
> >> >> >> > > 4. Why these pins are represented as attributes of a netdev
> >> >> >> > > and not as attributes of the EEC? That is, why are they
> >> >> >> > > represented as output pins of the PHY as opposed to input
> >> >> >> > > pins of the EEC?
> >> >> >> >
> >> >> >> > They are 2 separate beings. Recovered clock outputs are
> >> >> >> > controlled separately from EEC inputs.
> >> >> >>
> >> >> >> Separate how? What does it mean that they are controlled
> >> >> >> separately? In which sense? That redirection of recovered
> >> >> >> frequency to pin is controlled via PHY registers whereas
> >> >> >> priority setting between EEC inputs is controlled via EEC
> >> >> >> registers? If so, this is an implementation detail of a
> >> >> >> specific design. It is not of any importance to user space.
> >> >> >
> >> >> > They belong to different devices. EEC registers are physically
> >> >> > in the DPLL hanging over I2C and recovered clocks are in the
> >> >> > PHY/integrated PHY in the MAC. Depending on system architecture
> >> >> > you may have control over one piece only
> >> >>
> >> >> What does ETHTOOL_MSG_RCLK_SET actually configure, physically?
> Say
> >> >> I have this message:
> >> >>
> >> >> ETHTOOL_MSG_RCLK_SET dev = eth0
> >> >> - ETHTOOL_A_RCLK_OUT_PIN_IDX = n
> >> >> - ETHTOOL_A_RCLK_PIN_FLAGS |= ETHTOOL_RCLK_PIN_FLAGS_ENA
> >> >>
> >> >> Eventually this lands in ops->set_rclk_out(dev, out_idx,
> >> >> new_state). What does the MAC driver do next?
> >> >
> >> > It goes to the PTY layer, enables the clock recovery from a given
> >> > physical lane, optionally configure the clock divider and pin
> >> > output muxes. This will be HW-specific though, but the general
> >> > concept will look like that.
> >>
> >> The reason I am asking is that I suspect that by exposing this
> >> functionality through netdev, you assume that the NIC driver will do
> >> whatever EEC configuration necessary _anyway_. So why couldn't it just
> >> instantiate the EEC object as well?
> >
> > Not necessarily. The EEC can be supported by totally different driver.
> > I.e there are Renesas DPLL drivers available now in the ptp subsystem.
> > The DPLL can be connected anywhere in the system.
> >
> >> >> >> > > 5. What is the problem with the following model?
> >> >> >> > >
> >> >> >> > > - The EEC is a separate object with following attributes:
> >> >> >> > >   * State: Invalid / Freerun / Locked / etc
> >> >> >> > >   * Sources: Netdev / external / etc
> >> >> >> > >   * Potentially more
> >> >> >> > >
> >> >> >> > > - Notifications are emitted to user space when the state of
> >> >> >> > >   the EEC changes. Drivers will either poll the state from
> >> >> >> > >   the device or get interrupts
> >> >> >> > >
> >> >> >> > > - The mapping from netdev to EEC is queried via ethtool
> >> >> >> >
> >> >> >> > Yep - that will be part of the EEC (DPLL) subsystem
> >> >> >>
> >> >> >> This model avoids all the problems I pointed out in the current
> >> >> >> proposal.
> >> >> >
> >> >> > That's the go-to model, but first we need control over the
> >> >> > source as well :)
> >> >>
> >> >> Why is that? Can you illustrate a case that breaks with the above
> >> >> model?
> >> >
> >> > If you have 32 port switch chip with 2 recovered clock outputs how
> >> > will you tell the chip to get the 18th port to pin 0 and from port
> >> > 20 to pin 1? That's the part those patches addresses. The further
> >> > side of "which clock should the EEC use" belongs to the DPLL
> >> > subsystem and I agree with that.
> >>
> >> So the claim is that in some cases the owner of the EEC does not know
> >> about the netdevices?
> >>
> >> If that is the case, how do netdevices know about the EEC, like the
> >> netdev-centric model assumes?
> >>
> >> Anyway, to answer the question, something like the following would
> >> happen:
> >>
> >> - Ask EEC to enumerate all input pins it knows about
> >> - Find the one that references swp18
> >> - Ask EEC to forward that input pin to output pin 0
> >> - Repeat for swp20 and output pin 1
> >>
> >> The switch driver (or multi-port NIC driver) just instantiates all of
> >> netdevices, the EEC object, and pin objects, and therefore can set up
> >> arbitrary linking between the three.
> >
> > This will end up with a model in which pin X of the EEC will link to
> >dozens ports - userspace tool would need to find out the relation
> >between them and EECs somehow.
> 
> Indeed. If you have EEC connected to a bunch of ports, the EEC object is
> related to a bunch of netdevices. The UAPI needs to have tools to dump
> these objects so that it is possible to discover what is connected
> where.
> 
> This configuration will also not change during the lifetime of the EEC
> object, so tools can cache it.
> 
> > It's far more convenient if a given netdev knows where it is connected
> > to and which pin can it drive.
> 
> Yeah, it is of course possible to add references from the netdevice to
> the EEC object directly, so that the tool just needs to ask a netdevice
> what EEC / RCLK source ID it maps to.
> 
> This has mostly nothing to do with the model itself.
> 
> > I.e. send the netdev swp20 ETHTOOL_MSG_RCLK_GET and get the pin
> > indexes of the EEC and send the future message to find which EEC that
> > is (or even return EEC index in RCLK_GET?).
> 
> Since the pin index on its own is useless, it would make sense to return
> both pieces of information at the same time.
> 
> > Set the recovered clock on that pin with the ETHTOOL_MSG_RCLK_SET.
> 
> Nope.
> 
> > Then go to the given EEC and configure it to use the pin that was
> > returned before as a frequency source and monitor the EEC state.
> 
> Yep.
> 
> EEC will invoke a callback to set up the tracking. If something special
> needs to be done to "set the recovered clock on that pin", the handler
> of that callback will do it.
> 
> > Additionally, the EEC device may be instantiated by a totally
> > different driver, in which case the relation between its pins and
> > netdevs may not even be known.
> 
> Like an EEC, some PHYs, but the MAC driver does not know about both
> pieces? Who sets up the connection between the two? The box admin
> through some cabling? SoC designer?
> 
> Also, what does the external EEC actually do with the signal from the
> PHY? Tune to it and forward to the other PHYs in the complex?
Yes - it can also apply HW filters to it.

The EEC model will not work when you have the following system:
SoC with some ethernet ports with driver A
Switch chip with N ports with driver B
EEC/DPLL with driver C
Both SoC and Switch ASIC can recover clock and use the cleaned
clock from the DPLL.

In that case you can't create any relation between EEC and recover
clock pins that would enable the EEC subsystem to control
recovered clocks, because you have 3 independent drivers.

The model you proposed assumes that the MAC/Switch is in
charge of the DPLL, but that's not always true. The model where
recovered clock outputs are controlled independently can support
both models and is more flexible. It can also address the mode where
you want to use the recovered clock as a source for RF part of your
system and don't have any EEC to control from the netdev side.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-03 16:18             ` Machnikowski, Maciej
  2021-12-03 18:21               ` Petr Machata
@ 2021-12-05 12:24               ` Ido Schimmel
  2021-12-06  9:15                 ` Machnikowski, Maciej
  1 sibling, 1 reply; 27+ messages in thread
From: Ido Schimmel @ 2021-12-05 12:24 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: netdev, intel-wired-lan, Kubalewski, Arkadiusz, richardcochran,
	abyagowi, Nguyen, Anthony L, davem, kuba, linux-kselftest,
	mkubecek, saeed, michael.chan, petrm

On Fri, Dec 03, 2021 at 04:18:18PM +0000, Machnikowski, Maciej wrote:
> > -----Original Message-----
> > From: Ido Schimmel <idosch@idosch.org>
> > Sent: Friday, December 3, 2021 4:46 PM
> > To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> > Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> > recovered clock for SyncE feature
> > 
> > On Thu, Dec 02, 2021 at 05:20:24PM +0000, Machnikowski, Maciej wrote:
> > > > -----Original Message-----
> > > > From: Ido Schimmel <idosch@idosch.org>
> > > > Sent: Thursday, December 2, 2021 5:36 PM
> > > > To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> > > > Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> > > > recovered clock for SyncE feature
> > > >
> > > > On Thu, Dec 02, 2021 at 03:17:06PM +0000, Machnikowski, Maciej wrote:
> > > > > > -----Original Message-----
> > > > > > From: Ido Schimmel <idosch@idosch.org>
> > > > > > Sent: Thursday, December 2, 2021 1:44 PM
> > > > > > To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> > > > > > Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> > > > > > recovered clock for SyncE feature
> > > > > >
> > > > > > On Wed, Dec 01, 2021 at 07:02:06PM +0100, Maciej Machnikowski
> > wrote:
> > > > > > Looking at the diagram from the previous submission [1]:
> > > > > >
> > > > > >       ┌──────────┬──────────┐
> > > > > >       │ RX       │ TX       │
> > > > > >   1   │ ports    │ ports    │ 1
> > > > > >   ───►├─────┐    │          ├─────►
> > > > > >   2   │     │    │          │ 2
> > > > > >   ───►├───┐ │    │          ├─────►
> > > > > >   3   │   │ │    │          │ 3
> > > > > >   ───►├─┐ │ │    │          ├─────►
> > > > > >       │ ▼ ▼ ▼    │          │
> > > > > >       │ ──────   │          │
> > > > > >       │ \____/   │          │
> > > > > >       └──┼──┼────┴──────────┘
> > > > > >         1│ 2│        ▲
> > > > > >  RCLK out│  │        │ TX CLK in
> > > > > >          ▼  ▼        │
> > > > > >        ┌─────────────┴───┐
> > > > > >        │                 │
> > > > > >        │       SEC       │
> > > > > >        │                 │
> > > > > >        └─────────────────┘
> > > > > >
> > > > > > Given a netdev (1, 2 or 3 in the diagram), the RCLK_SET message
> > allows
> > > > > > me to redirect the frequency recovered from this netdev to the EEC
> > via
> > > > > > either pin 1, pin 2 or both.
> > > > > >
> > > > > > Given a netdev, the RCLK_GET message allows me to query the range
> > of
> > > > > > pins (RCLK out 1-2 in the diagram) through which the frequency can
> > be
> > > > > > fed into the EEC.
> > > > > >
> > > > > > Questions:
> > > > > >
> > > > > > 1. The query for all the above netdevs will return the same range of
> > > > > > pins. How does user space know that these are the same pins? That
> > is,
> > > > > > how does user space know that RCLK_SET message to redirect the
> > > > frequency
> > > > > > recovered from netdev 1 to pin 1 will be overridden by the same
> > > > message
> > > > > > but for netdev 2?
> > > > >
> > > > > We don't have a way to do so right now. When we have EEC subsystem
> > in
> > > > place
> > > > > the right thing to do will be to add EEC input index and EEC index as
> > > > additional
> > > > > arguments
> > > > >
> > > > > > 2. How does user space know the mapping between a netdev and an
> > > > EEC?
> > > > > > That is, how does user space know that RCLK_SET message for netdev
> > 1
> > > > > > will cause the Tx frequency of netdev 2 to change according to the
> > > > > > frequency recovered from netdev 1?
> > > > >
> > > > > Ditto - currently we don't have any entity to link the pins to ATM,
> > > > > but we can address that in userspace just like PTP pins are used now
> > > > >
> > > > > > 3. If user space sends two RCLK_SET messages to redirect the
> > frequency
> > > > > > recovered from netdev 1 to RCLK out 1 and from netdev 2 to RCLK out
> > 2,
> > > > > > how does it know which recovered frequency is actually used an input
> > to
> > > > > > the EEC?
> > > >
> > > > User space doesn't know this as well?
> > >
> > > In current model it can come from the config file. Once we implement DPLL
> > > subsystem we can implement connection between pins and DPLLs if they
> > are
> > > known.
> > 
> > To be clear, no SyncE patches should be accepted before we have a DPLL
> > subsystem or however the subsystem that will model the EEC is going to
> > be called.
> > 
> > You are asking us to buy into a new uAPI that can never be removed. We
> > pointed out numerous problems with this uAPI and suggested a model that
> > solves them. When asked why it can't work we are answered with vague
> > arguments about this model being "hard".
> 
> My argument was never "it's hard" - the answer is we need both APIs.

We are discussing whether two APIs are actually necessary or whether EEC
source configuration can be done via the EEC. The answer cannot be "the
answer is we need both APIs".

> 
> > In addition, without a representation of the EEC, these patches have no
> > value for user space. They basically allow user space to redirect the
> > recovered frequency from a netdev to an object that does not exist.
> > User space doesn't know if the object is successfully tracking the
> > frequency (the EEC state) and does not know which other components are
> > utilizing this recovered frequency as input (e.g., other netdevs, PHC).
> 
> That's also not true - the proposed uAPI lets you enable recovered frequency
> output pins and redirect the right clock to them. In some implementations
> you may not have anything else.

What isn't true? That these patches have no value for user space? This
is 100% true. You admitted that this is incomplete work. There is no
reason to merge one API without the other. At the very least, we need to
see an explanation of how the two APIs work together. This is missing
from the patchset, which prompted these questions:

https://lore.kernel.org/netdev/Yai%2Fe5jz3NZAg0pm@shredder/

> 
> > BTW, what is the use case for enabling two EEC inputs simultaneously?
> > Some seamless failover?
> 
> Mainly - redundacy
> 
> > >
> > > > > >
> > > > > > 4. Why these pins are represented as attributes of a netdev and not
> > as
> > > > > > attributes of the EEC? That is, why are they represented as output
> > pins
> > > > > > of the PHY as opposed to input pins of the EEC?
> > > > >
> > > > > They are 2 separate beings. Recovered clock outputs are controlled
> > > > > separately from EEC inputs.
> > > >
> > > > Separate how? What does it mean that they are controlled separately? In
> > > > which sense? That redirection of recovered frequency to pin is
> > > > controlled via PHY registers whereas priority setting between EEC inputs
> > > > is controlled via EEC registers? If so, this is an implementation detail
> > > > of a specific design. It is not of any importance to user space.
> > >
> > > They belong to different devices. EEC registers are physically in the DPLL
> > > hanging over I2C and recovered clocks are in the PHY/integrated PHY in
> > > the MAC. Depending on system architecture you may have control over
> > > one piece only
> > 
> > These are implementation details of a specific design and should not
> > influence the design of the uAPI. The uAPI should be influenced by the
> > logical task that it is trying to achieve.
> 
> There are 2 logical tasks:
> 1. Enable clocks that are recovered from a specific netdev

I already replied about this here:

https://lore.kernel.org/netdev/Yao+nK40D0+u8UKL@shredder/

If the recovered clock outputs are only meaningful as EEC inputs, then
there is no reason not to configure them through the EEC object. The
fact that you think that the *internal* kernel plumbing (that can be
improved over time) will be "hard" is not a reason to end up with a
*user* API (that cannot be changed) where the *Ethernet* Equipment Clock
is ignorant of its *Ethernet* ports.

With your proposal where the EEC is only aware of pins, how does user
space answer the question of what is the source of the EEC? It needs to
issue RCLK_GET dump? How does it even know that the source is a netdev
and not an external one? And if the EEC object knows that the source is
a netdev, how come it does not know which netdev?

> 2. Control the EEC
> 
> They are both needed to get to the full solution, but are independent from 
> each other. You can't put RCLK redirection to the EEC as it's one to many 
> relation and you will need to call the netdev to enable it anyway.

So what if I need to call the netdev? The EEC cannot be so disjoint from
the associated netdevs. After all, EEC stands for *Ethernet* Equipment
Clock. In the common case, the EEC will transfer the frequency from one
netdev to another. In the less common case, it will transfer the
frequency from an external source to a netdev.

> 
> Also, when we tried to add EEC state to PTP subsystem the answer was
> that we can't mix subsystems. 

SyncE doesn't belong in PTP because PTP can work without SyncE and SyncE
can work without PTP. The fact that the primary use case for SyncE might
be PTP doesn't mean that SyncE belongs in PTP subsystem.

> The proposal to configure recovered clocks  through EEC would mix
> netdev with EEC.

I don't believe that *Ethernet* Equipment Clock and *Ethernet* ports
should be so disjoint so that the EEC doesn't know about:

a. The netdev from which it is recovering its frequency
b. The netdevs that it is controlling

If the netdevs are smart enough to report the EEC input pins and EEC
association to user space, then they are also smart enough to register
themselves internally in the kernel with the EEC. They can all appear as
virtual input/output pins of the EEC that can be enabled/disabled by
user space. In addition, you can have physical (named) pins for external
sources / outputs and another virtual output pin towards the PHC.

> 
> > >
> > > > > If we mix them it'll be hard to control everything especially that a
> > > > > single EEC can support multiple devices.
> > > >
> > > > Hard how? Please provide concrete examples.
> > >
> > > From the EEC perspective it's one to many relation - one EEC input pin will
> > serve
> > > even 4,16,48 netdevs. I don't see easy way of starting from EEC input of EEC
> > device
> > > and figuring out which netdevs are connected to it to talk to the right one.
> > > In current model it's as simple as:
> > > - I received QL-PRC on netdev ens4f0
> > > - I send back enable recovered clock on pin 0 of the ens4f0
> > > - go to EEC that will be linked to it
> > > - see the state of it - if its locked - report QL-EEC downsteam
> > >
> > > How would you this control look in the EEC/DPLL implementation? Maybe
> > > I missed something.
> > 
> > Petr already replied.
> 
> See my response there. 
> 
> > >
> > > > What do you mean by "multiple devices"? A multi-port adapter with a
> > > > single EEC or something else?
> > >
> > > Multiple MACs that use a single EEC clock.
> > >
> > > > > Also if we make those pins attributes of the EEC it'll become extremally
> > > > hard
> > > > > to map them to netdevs and control them from the userspace app that
> > will
> > > > > receive the ESMC message with a given QL level on netdev X.
> > > >
> > > > Hard how? What is the problem with something like:
> > > >
> > > > # eec set source 1 type netdev dev swp1
> > > >
> > > > The EEC object should be registered by the same entity that registers
> > > > the netdevs whose Tx frequency is controlled by the EEC, the MAC
> > driver.
> > >
> > > But the EEC object may not be controlled by the MAC - in which case
> > > this model won't work.
> > 
> > Why wouldn't it work? Leave individual kernel modules alone and look at
> > the kernel. It is registering all the necessary logical objects such
> > netdevs, PHCs and EECs. There is no way user space knows better than the
> > kernel how these objects fit together as the purpose of the kernel is to
> > abstract the hardware to user space.
> > 
> > User space's request to use the Rx frequency recovered from netdev X as
> > an input to EEC Y will be processed by the DPLL subsystem. In turn, this
> > subsystem will invoke whichever kernel modules it needs to fulfill the
> > request.
> 
> But how would that call go through the kernel? What would you like to give
> to the EEC object and how should it react. I'm fine with the changes, but
> I don't see the solution in that proposal and this model would mix independent
> subsystems.
> The netdev -> EEC should be a downstream relation, just like the PTP is now
> If a netdev wants to check what's the state of EEC driving it - it can do it, but
> I don't see a way for the EEC subsystem to directly configure something in
> Potentially couple different MAC chips without calling a kind of netdev API.
> And that's what those patches address.
> 
> > >
> > > > >
> > > > > > 5. What is the problem with the following model?
> > > > > >
> > > > > > - The EEC is a separate object with following attributes:
> > > > > >   * State: Invalid / Freerun / Locked / etc
> > > > > >   * Sources: Netdev / external / etc
> > > > > >   * Potentially more
> > > > > >
> > > > > > - Notifications are emitted to user space when the state of the EEC
> > > > > >   changes. Drivers will either poll the state from the device or get
> > > > > >   interrupts
> > > > > >
> > > > > > - The mapping from netdev to EEC is queried via ethtool
> > > > >
> > > > > Yep - that will be part of the EEC (DPLL) subsystem
> > > >
> > > > This model avoids all the problems I pointed out in the current
> > > > proposal.
> > >
> > > That's the go-to model, but first we need control over the source as well :)
> > 
> > The point that we are trying to make is that like the EEC state, the
> > source is also an EEC attribute and not a netdev attribute.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-05 12:24               ` Ido Schimmel
@ 2021-12-06  9:15                 ` Machnikowski, Maciej
  0 siblings, 0 replies; 27+ messages in thread
From: Machnikowski, Maciej @ 2021-12-06  9:15 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, intel-wired-lan, Kubalewski, Arkadiusz, richardcochran,
	abyagowi, Nguyen, Anthony L, davem, kuba, linux-kselftest,
	mkubecek, saeed, michael.chan, petrm

> -----Original Message-----
> From: Ido Schimmel <idosch@idosch.org>
> Sent: Sunday, December 5, 2021 1:24 PM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> recovered clock for SyncE feature


OK I see where the misunderstanding comes from. The subsystem we'll
Develop will NOT be EEC subsystem, but the DPLL subsystem. The EEC is only 
one potential use of the DPLL, but there are many more.
By principle all DPLL chips have configurable inputs, configurable PLL block,
and configurable outputs - that's what the new subsystem will configure
and expose. And the input block is shared between multiple DPLLs internally.

Unfortunately, we have no way of representing all connections to a given
DPLL and we have to rely on manual/labels anyway - just like we do with
PTP pins. We control them, but have no idea where they are connected
physically.

My assumption is that the DPLL block will follow the same principle and 
will expose a set of inputs and set of outputs that uAPI will configure.

Now with that in mind:


> > My argument was never "it's hard" - the answer is we need both APIs.
> 
> We are discussing whether two APIs are actually necessary or whether EEC
> source configuration can be done via the EEC. The answer cannot be "the
> answer is we need both APIs".

We need both APIs because a single recovered clock can be connected to:
- Multiple DPLLs - either internal to the chip, or via fanouts to different chips
- a FPGA
- a RF HW that may not expose any DPLL

Given that - we cannot hook control over recovered clocks to a DPLL subsystem,
as it's not the only consumer of that output.

> >
> > > In addition, without a representation of the EEC, these patches have no
> > > value for user space. They basically allow user space to redirect the
> > > recovered frequency from a netdev to an object that does not exist.
> > > User space doesn't know if the object is successfully tracking the
> > > frequency (the EEC state) and does not know which other components
> are
> > > utilizing this recovered frequency as input (e.g., other netdevs, PHC).
> >
> > That's also not true - the proposed uAPI lets you enable recovered
> frequency
> > output pins and redirect the right clock to them. In some implementations
> > you may not have anything else.
> 
> What isn't true? That these patches have no value for user space? This
> is 100% true. You admitted that this is incomplete work. There is no
> reason to merge one API without the other. At the very least, we need to
> see an explanation of how the two APIs work together. This is missing
> from the patchset, which prompted these questions:
> 
> https://lore.kernel.org/netdev/Yai%2Fe5jz3NZAg0pm@shredder/

That's what I try to explain. A given DPLL will have multiple reference frequencies
to choose from, but the sources of them will be configured independently.
With the sources like:
- 1PPS/10MHz from the GNSS
- 1PPS/10MHz  from external source
- 1PPS from the PTP block
- Recovered clock

Additionally, a given DPLL chip may have many (2, 4, 6, 8 +) DPLLs inside,
each one using the same reference signals for different purposes.

Also there is a reason to merge this without DPLL subsystem for all devices
that use recovered clocks for a purpose other than SyncE.

[...]
> > > >
> > > > They belong to different devices. EEC registers are physically in the DPLL
> > > > hanging over I2C and recovered clocks are in the PHY/integrated PHY in
> > > > the MAC. Depending on system architecture you may have control over
> > > > one piece only
> > >
> > > These are implementation details of a specific design and should not
> > > influence the design of the uAPI. The uAPI should be influenced by the
> > > logical task that it is trying to achieve.
> >
> > There are 2 logical tasks:
> > 1. Enable clocks that are recovered from a specific netdev
> 
> I already replied about this here:
> 
> https://lore.kernel.org/netdev/Yao+nK40D0+u8UKL@shredder/
> 
> If the recovered clock outputs are only meaningful as EEC inputs, then
> there is no reason not to configure them through the EEC object. The
> fact that you think that the *internal* kernel plumbing (that can be
> improved over time) will be "hard" is not a reason to end up with a
> *user* API (that cannot be changed) where the *Ethernet* Equipment Clock
> is ignorant of its *Ethernet* ports.

Like I mentioned above - it won’t be EEC subsystem. Also the signal is relevant
to other devices - not only EEC and not even only to DPLLs.

> With your proposal where the EEC is only aware of pins, how does user
> space answer the question of what is the source of the EEC? It needs to
> issue RCLK_GET dump? How does it even know that the source is a netdev
> and not an external one? And if the EEC object knows that the source is
> a netdev, how come it does not know which netdev?

This needs to be addressed by the DPLL subsystem - I'd say using labels would be
a good way to manage it - in the same way we use them in the PTP subsystem

> > 2. Control the EEC
> >
> > They are both needed to get to the full solution, but are independent from
> > each other. You can't put RCLK redirection to the EEC as it's one to many
> > relation and you will need to call the netdev to enable it anyway.
> 
> So what if I need to call the netdev? The EEC cannot be so disjoint from
> the associated netdevs. After all, EEC stands for *Ethernet* Equipment
> Clock. In the common case, the EEC will transfer the frequency from one
> netdev to another. In the less common case, it will transfer the
> frequency from an external source to a netdev.

Again - the DPLLs linked to the netdev is just one of many use cases. Other
DPLLs not linked to netdevs will also exist and use the PHY recovered clock.

> >
> > Also, when we tried to add EEC state to PTP subsystem the answer was
> > that we can't mix subsystems.
> 
> SyncE doesn't belong in PTP because PTP can work without SyncE and SyncE
> can work without PTP. The fact that the primary use case for SyncE might
> be PTP doesn't mean that SyncE belongs in PTP subsystem.
> 
> > The proposal to configure recovered clocks  through EEC would mix
> > netdev with EEC.
> 
> I don't believe that *Ethernet* Equipment Clock and *Ethernet* ports
> should be so disjoint so that the EEC doesn't know about:
> 
> a. The netdev from which it is recovering its frequency
> b. The netdevs that it is controlling
> 
> If the netdevs are smart enough to report the EEC input pins and EEC
> association to user space, then they are also smart enough to register
> themselves internally in the kernel with the EEC. They can all appear as
> virtual input/output pins of the EEC that can be enabled/disabled by
> user space. In addition, you can have physical (named) pins for external
> sources / outputs and another virtual output pin towards the PHC.

Netdevs needs to know which DPLL is used as its source.
If we want to make a DPLL aware of who's driving the signal we can create
internal plumbing between PHY output pins and some/all DPLL references
that are linked to it - if that connection is known.

If such plumbing is known we can add registration of a netdev that's using
the recovered clock output to the reference inputs. That way the DPLL would
see which netdev (or other device) is the source from its reference input 
system.

Hope this clarifies why we need both uAPIs.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-03 19:07                     ` Machnikowski, Maciej
@ 2021-12-06 14:40                       ` Petr Machata
  2021-12-06 15:09                         ` Machnikowski, Maciej
  0 siblings, 1 reply; 27+ messages in thread
From: Petr Machata @ 2021-12-06 14:40 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: Petr Machata, Ido Schimmel, netdev, intel-wired-lan, Kubalewski,
	Arkadiusz, richardcochran, abyagowi, Nguyen, Anthony L, davem,
	kuba, linux-kselftest, mkubecek, saeed, michael.chan


Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:

>> -----Original Message-----
>> From: Petr Machata <petrm@nvidia.com>
>> 
>> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
>> 
>> > Additionally, the EEC device may be instantiated by a totally
>> > different driver, in which case the relation between its pins and
>> > netdevs may not even be known.
>> 
>> Like an EEC, some PHYs, but the MAC driver does not know about both
>> pieces? Who sets up the connection between the two? The box admin
>> through some cabling? SoC designer?
>> 
>> Also, what does the external EEC actually do with the signal from the
>> PHY? Tune to it and forward to the other PHYs in the complex?
>
> Yes - it can also apply HW filters to it.

Sounds like this device should have an EEC instance of its own then.

Maybe we need to call it something else than "EEC". PLL? Something that
does not have the standardization connotations, because several
instances would be present in a system with several NICs.

> The EEC model will not work when you have the following system:
> SoC with some ethernet ports with driver A
> Switch chip with N ports with driver B
> EEC/DPLL with driver C
> Both SoC and Switch ASIC can recover clock and use the cleaned
> clock from the DPLL.
>
> In that case you can't create any relation between EEC and recover
> clock pins that would enable the EEC subsystem to control
> recovered clocks, because you have 3 independent drivers.

I think that in that case you have several EEC instances. Those are
connected by some wiring that is external to the devices themselves. I
am not sure who should be in charge of describing the wiring. Device
tree? Config file?

> The model you proposed assumes that the MAC/Switch is in
> charge of the DPLL, but that's not always true.

The EEC-centric model does not in fact assume that. It lets anyone to
set up an EEC object.

The netdev-centric UAPI assumes that the driver behind the netdev knows
about how many RCLK out pins there are. So it can certainly instantiate
a DPLL object instead, with those pins as external pins, and leave the
connection of the external pins to the EEC proper implicit.

That gives userspace exactly the same information as the netdev-centric
UAPI, but now userspace doesn't need to know about netdevs, and
synchronously-spinning drives, and GPS receivers, each of which is
handled through a dedicated set of netlink messages / sysctls / what
have you. The userspace needs to know about EEC subsystem, and that's
it.

> The model where recovered clock outputs are controlled independently
> can support both models and is more flexible. It can also address the

- Anyone can instantiate EEC objects
- Only things with ports instantiate netdevs

How is the latter one more flexible?

> mode where you want to use the recovered clock as a source for RF part
> of your system and don't have any EEC to control from the netdev side.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-06 14:40                       ` Petr Machata
@ 2021-12-06 15:09                         ` Machnikowski, Maciej
  2021-12-06 16:00                           ` Petr Machata
  0 siblings, 1 reply; 27+ messages in thread
From: Machnikowski, Maciej @ 2021-12-06 15:09 UTC (permalink / raw)
  To: Petr Machata
  Cc: Ido Schimmel, netdev, intel-wired-lan, Kubalewski, Arkadiusz,
	richardcochran, abyagowi, Nguyen, Anthony L, davem, kuba,
	linux-kselftest, mkubecek, saeed, michael.chan

> -----Original Message-----
> From: Petr Machata <petrm@nvidia.com>
> Sent: Monday, December 6, 2021 3:41 PM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> recovered clock for SyncE feature
> 
> 
> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
> 
> >> -----Original Message-----
> >> From: Petr Machata <petrm@nvidia.com>
> >>
> >> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
> >>
> >> > Additionally, the EEC device may be instantiated by a totally
> >> > different driver, in which case the relation between its pins and
> >> > netdevs may not even be known.
> >>
> >> Like an EEC, some PHYs, but the MAC driver does not know about both
> >> pieces? Who sets up the connection between the two? The box admin
> >> through some cabling? SoC designer?
> >>
> >> Also, what does the external EEC actually do with the signal from the
> >> PHY? Tune to it and forward to the other PHYs in the complex?
> >
> > Yes - it can also apply HW filters to it.
> 
> Sounds like this device should have an EEC instance of its own then.
> 
> Maybe we need to call it something else than "EEC". PLL? Something that
> does not have the standardization connotations, because several
> instances would be present in a system with several NICs.

There will be no EEC/EEC subsystem, but the DPLL. Every driver would 
be able to create a DPLL object so that we can easily use it from 
non-netdev devices as well. See the other mail for more details.

> > The EEC model will not work when you have the following system:
> > SoC with some ethernet ports with driver A
> > Switch chip with N ports with driver B
> > EEC/DPLL with driver C
> > Both SoC and Switch ASIC can recover clock and use the cleaned
> > clock from the DPLL.
> >
> > In that case you can't create any relation between EEC and recover
> > clock pins that would enable the EEC subsystem to control
> > recovered clocks, because you have 3 independent drivers.
> 
> I think that in that case you have several EEC instances. Those are
> connected by some wiring that is external to the devices themselves. I
> am not sure who should be in charge of describing the wiring. Device
> tree? Config file?

In some complex systems you'll need to create a relation between netdevs
and DPLLs in a config file, so it is the only way to describe all possible
scenarios.
We can't assume any connections between them, as that's design specific,
just like PTP pins are. They have labels, but it's up to the system
integrator to define how they are used.
We can consider creating some of them if they are known to the driver
and belong to the same driver.

> > The model you proposed assumes that the MAC/Switch is in
> > charge of the DPLL, but that's not always true.
> 
> The EEC-centric model does not in fact assume that. It lets anyone to
> set up an EEC object.
> 
> The netdev-centric UAPI assumes that the driver behind the netdev knows
> about how many RCLK out pins there are. So it can certainly instantiate
> a DPLL object instead, with those pins as external pins, and leave the
> connection of the external pins to the EEC proper implicit.

Netdev will know how many RCLK outputs are there, as that's the function
of a given MAC/PHY/Retimer.

> That gives userspace exactly the same information as the netdev-centric
> UAPI, but now userspace doesn't need to know about netdevs, and
> synchronously-spinning drives, and GPS receivers, each of which is
> handled through a dedicated set of netlink messages / sysctls / what
> have you. The userspace needs to know about EEC subsystem, and that's
> it.

I believe the direction is to make the connection between a netdev and
its related DPLL that's serving as EEC in a similar way the link to a PTP 
device is created. Userspace app will need to go to DPLL subsystem
to understand what's the current frequency source for a given netdev.

That's still independent uAPI from the one defined by those patches.

> > The model where recovered clock outputs are controlled independently
> > can support both models and is more flexible. It can also address the
> 
> - Anyone can instantiate EEC objects
> - Only things with ports instantiate netdevs
> 
> How is the latter one more flexible?

- Everything can instantiate DPLL object,
- Only a netdev can instantiate recovered clock outputs, which can be 
  connected to any other part of the system - not only a DPLL.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-06 15:09                         ` Machnikowski, Maciej
@ 2021-12-06 16:00                           ` Petr Machata
  2021-12-06 18:49                             ` Machnikowski, Maciej
  0 siblings, 1 reply; 27+ messages in thread
From: Petr Machata @ 2021-12-06 16:00 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: Petr Machata, Ido Schimmel, netdev, intel-wired-lan, Kubalewski,
	Arkadiusz, richardcochran, abyagowi, Nguyen, Anthony L, davem,
	kuba, linux-kselftest, mkubecek, saeed, michael.chan


Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:

>> -----Original Message-----
>> From: Petr Machata <petrm@nvidia.com>
>> Sent: Monday, December 6, 2021 3:41 PM
>> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
>> Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
>> recovered clock for SyncE feature
>> 
>> 
>> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
>> 
>> >> -----Original Message-----
>> >> From: Petr Machata <petrm@nvidia.com>
>> >>
>> >> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
>> >>
>> >> > Additionally, the EEC device may be instantiated by a totally
>> >> > different driver, in which case the relation between its pins
>> >> > and netdevs may not even be known.
>> >>
>> >> Like an EEC, some PHYs, but the MAC driver does not know about
>> >> both pieces? Who sets up the connection between the two? The box
>> >> admin through some cabling? SoC designer?
>> >>
>> >> Also, what does the external EEC actually do with the signal from
>> >> the PHY? Tune to it and forward to the other PHYs in the complex?
>> >
>> > Yes - it can also apply HW filters to it.
>> 
>> Sounds like this device should have an EEC instance of its own then.
>> 
>> Maybe we need to call it something else than "EEC". PLL? Something
>> that does not have the standardization connotations, because several
>> instances would be present in a system with several NICs.
>
> There will be no EEC/EEC subsystem, but the DPLL. Every driver would
> be able to create a DPLL object so that we can easily use it from
> non-netdev devices as well. See the other mail for more details.

Yes, this makes sense to me.

>> > The EEC model will not work when you have the following system:
>> > SoC with some ethernet ports with driver A
>> > Switch chip with N ports with driver B
>> > EEC/DPLL with driver C
>> > Both SoC and Switch ASIC can recover clock and use the cleaned
>> > clock from the DPLL.
>> >
>> > In that case you can't create any relation between EEC and recover
>> > clock pins that would enable the EEC subsystem to control
>> > recovered clocks, because you have 3 independent drivers.
>> 
>> I think that in that case you have several EEC instances. Those are
>> connected by some wiring that is external to the devices themselves. I
>> am not sure who should be in charge of describing the wiring. Device
>> tree? Config file?
>
> In some complex systems you'll need to create a relation between
> netdevs and DPLLs in a config file, so it is the only way to describe
> all possible scenarios. We can't assume any connections between them,
> as that's design specific, just like PTP pins are. They have labels,
> but it's up to the system integrator to define how they are used. We
> can consider creating some of them if they are known to the driver and
> belong to the same driver.

Agreed.

>> > The model you proposed assumes that the MAC/Switch is in
>> > charge of the DPLL, but that's not always true.
>> 
>> The EEC-centric model does not in fact assume that. It lets anyone to
>> set up an EEC object.
>> 
>> The netdev-centric UAPI assumes that the driver behind the netdev
>> knows about how many RCLK out pins there are. So it can certainly
>> instantiate a DPLL object instead, with those pins as external pins,
>> and leave the connection of the external pins to the EEC proper
>> implicit.
>
> Netdev will know how many RCLK outputs are there, as that's the
> function of a given MAC/PHY/Retimer.

So... spawn a DPLL with that number of virtual pins?

>> That gives userspace exactly the same information as the
>> netdev-centric UAPI, but now userspace doesn't need to know about
>> netdevs, and synchronously-spinning drives, and GPS receivers, each
>> of which is handled through a dedicated set of netlink messages /
>> sysctls / what have you. The userspace needs to know about EEC
>> subsystem, and that's it.
>
> I believe the direction is to make the connection between a netdev and
> its related DPLL that's serving as EEC in a similar way the link to a
> PTP device is created. Userspace app will need to go to DPLL subsystem
> to understand what's the current frequency source for a given netdev.

But the way PTP and netdevs are linked is that PTP clock is instantiated
independently, and then this clock is referenced by the netdevice. I do
not object to that at all, in fact I believe I mentioned this a couple
times already.

I'm objecting to accessing the PTP clock _through_ the netdev UAPI.
Because, how will non-NIC-bound DPLLs be represented? Well, through some
other UAPI, obviously. So userspace will need to know about all classes
of devices that can carry frequency signal.

Alternatively, both NIC drivers and other drivers can instantiate some
common type of DPLL-related object. Then any userspace tool that knows
how to work with objects of that type automatically knows how to handle
NICs, and GPSs, and whatever craziness someone dreams up.

> That's still independent uAPI from the one defined by those patches.
>
>> > The model where recovered clock outputs are controlled independently
>> > can support both models and is more flexible. It can also address the
>> 
>> - Anyone can instantiate EEC objects
>> - Only things with ports instantiate netdevs
>> 
>> How is the latter one more flexible?
>
> - Everything can instantiate DPLL object,
> - Only a netdev can instantiate recovered clock outputs, which can be
>   connected to any other part of the system - not only a DPLL.

If the frequency source devices are truly so different from the general
DPLL circuits that thay cannot possibly be expressed as the same type of
object, then by all means, represent them as something else. DPLL
frequency source, whatever.

But don't hide the API behind netdevs just because some NICs carry
DPLLs. Non-NIC frequency sources do exist, and the subsystem should
support them _in the same way_ as the NIC ones.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature
  2021-12-06 16:00                           ` Petr Machata
@ 2021-12-06 18:49                             ` Machnikowski, Maciej
  0 siblings, 0 replies; 27+ messages in thread
From: Machnikowski, Maciej @ 2021-12-06 18:49 UTC (permalink / raw)
  To: Petr Machata
  Cc: Ido Schimmel, netdev, intel-wired-lan, Kubalewski, Arkadiusz,
	richardcochran, abyagowi, Nguyen, Anthony L, davem, kuba,
	linux-kselftest, mkubecek, saeed, michael.chan

> -----Original Message-----
> From: Petr Machata <petrm@nvidia.com>
> Sent: Monday, December 6, 2021 5:00 PM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> recovered clock for SyncE feature
> 
> 
> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
> 
> >> -----Original Message-----
> >> From: Petr Machata <petrm@nvidia.com>
> >> Sent: Monday, December 6, 2021 3:41 PM
> >> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> >> Subject: Re: [PATCH v4 net-next 2/4] ethtool: Add ability to configure
> >> recovered clock for SyncE feature
> >>
> >>
> >> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
> >>
> >> >> -----Original Message-----
> >> >> From: Petr Machata <petrm@nvidia.com>
> >> >>
> >> >> Machnikowski, Maciej <maciej.machnikowski@intel.com> writes:
> >> >>
> >> >> > Additionally, the EEC device may be instantiated by a totally
> >> >> > different driver, in which case the relation between its pins
> >> >> > and netdevs may not even be known.
> >> >>
> >> >> Like an EEC, some PHYs, but the MAC driver does not know about
> >> >> both pieces? Who sets up the connection between the two? The box
> >> >> admin through some cabling? SoC designer?
> >> >>
> >> >> Also, what does the external EEC actually do with the signal from
> >> >> the PHY? Tune to it and forward to the other PHYs in the complex?
> >> >
> >> > Yes - it can also apply HW filters to it.
> >>
> >> Sounds like this device should have an EEC instance of its own then.
> >>
> >> Maybe we need to call it something else than "EEC". PLL? Something
> >> that does not have the standardization connotations, because several
> >> instances would be present in a system with several NICs.
> >
> > There will be no EEC/EEC subsystem, but the DPLL. Every driver would
> > be able to create a DPLL object so that we can easily use it from
> > non-netdev devices as well. See the other mail for more details.
> 
> Yes, this makes sense to me.
> 
> >> > The EEC model will not work when you have the following system:
> >> > SoC with some ethernet ports with driver A
> >> > Switch chip with N ports with driver B
> >> > EEC/DPLL with driver C
> >> > Both SoC and Switch ASIC can recover clock and use the cleaned
> >> > clock from the DPLL.
> >> >
> >> > In that case you can't create any relation between EEC and recover
> >> > clock pins that would enable the EEC subsystem to control
> >> > recovered clocks, because you have 3 independent drivers.
> >>
> >> I think that in that case you have several EEC instances. Those are
> >> connected by some wiring that is external to the devices themselves. I
> >> am not sure who should be in charge of describing the wiring. Device
> >> tree? Config file?
> >
> > In some complex systems you'll need to create a relation between
> > netdevs and DPLLs in a config file, so it is the only way to describe
> > all possible scenarios. We can't assume any connections between them,
> > as that's design specific, just like PTP pins are. They have labels,
> > but it's up to the system integrator to define how they are used. We
> > can consider creating some of them if they are known to the driver and
> > belong to the same driver.
> 
> Agreed.
> 
> >> > The model you proposed assumes that the MAC/Switch is in
> >> > charge of the DPLL, but that's not always true.
> >>
> >> The EEC-centric model does not in fact assume that. It lets anyone to
> >> set up an EEC object.
> >>
> >> The netdev-centric UAPI assumes that the driver behind the netdev
> >> knows about how many RCLK out pins there are. So it can certainly
> >> instantiate a DPLL object instead, with those pins as external pins,
> >> and leave the connection of the external pins to the EEC proper
> >> implicit.
> >
> > Netdev will know how many RCLK outputs are there, as that's the
> > function of a given MAC/PHY/Retimer.
> 
> So... spawn a DPLL with that number of virtual pins?

Recovered clock has different properties than a DPLL output.
Think of it as of a specific GPIO pin that sends the clock signal.


> >> That gives userspace exactly the same information as the
> >> netdev-centric UAPI, but now userspace doesn't need to know about
> >> netdevs, and synchronously-spinning drives, and GPS receivers, each
> >> of which is handled through a dedicated set of netlink messages /
> >> sysctls / what have you. The userspace needs to know about EEC
> >> subsystem, and that's it.
> >
> > I believe the direction is to make the connection between a netdev and
> > its related DPLL that's serving as EEC in a similar way the link to a
> > PTP device is created. Userspace app will need to go to DPLL subsystem
> > to understand what's the current frequency source for a given netdev.
> 
> But the way PTP and netdevs are linked is that PTP clock is instantiated
> independently, and then this clock is referenced by the netdevice. I do
> not object to that at all, in fact I believe I mentioned this a couple
> times already.
> 
> I'm objecting to accessing the PTP clock _through_ the netdev UAPI.
> Because, how will non-NIC-bound DPLLs be represented? Well, through
> some
> other UAPI, obviously. So userspace will need to know about all classes
> of devices that can carry frequency signal.

That's why we'll link to an instantiated DPLL like we do to PTP.
Those patches are only enabling recovered clock outputs - not DPLL.

> Alternatively, both NIC drivers and other drivers can instantiate some
> common type of DPLL-related object. Then any userspace tool that knows
> how to work with objects of that type automatically knows how to handle
> NICs, and GPSs, and whatever craziness someone dreams up.

I see no benefit in adding a new object like that - other subsystems
already have their own implementations of such GPIOs and they are
always implementation-specific.

> > That's still independent uAPI from the one defined by those patches.
> >
> >> > The model where recovered clock outputs are controlled independently
> >> > can support both models and is more flexible. It can also address the
> >>
> >> - Anyone can instantiate EEC objects
> >> - Only things with ports instantiate netdevs
> >>
> >> How is the latter one more flexible?
> >
> > - Everything can instantiate DPLL object,
> > - Only a netdev can instantiate recovered clock outputs, which can be
> >   connected to any other part of the system - not only a DPLL.
> 
> If the frequency source devices are truly so different from the general
> DPLL circuits that thay cannot possibly be expressed as the same type of
> object, then by all means, represent them as something else. DPLL
> frequency source, whatever.
> 
> But don't hide the API behind netdevs just because some NICs carry
> DPLLs. Non-NIC frequency sources do exist, and the subsystem should
> support them _in the same way_ as the NIC ones.

That's not the goal. This API gives control over a simple logic that's inside the
netdev that allows outputting the clock signal to a given physical output.
Internally it controls muxes and dividers.

NIC frequency source is actually different to other sources. It's tightly
coupled to a given netdev port, depends on link speed and the link itself. 
That's why it needs a separate API coupled with the netdev subsystem.

Also other subsystems have similar controls embedded in them - like PTP
has its pins subsystem. I don't see a reason to make yet another subsystem,
as all of them are custom.

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2021-12-06 18:49 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-01 18:02 [PATCH v4 net-next 0/4] Add ethtool interface for SyncE Maciej Machnikowski
2021-12-01 18:02 ` [PATCH v4 net-next 1/4] ice: add support detecting features based on netlist Maciej Machnikowski
2021-12-01 18:02 ` [PATCH v4 net-next 2/4] ethtool: Add ability to configure recovered clock for SyncE feature Maciej Machnikowski
2021-12-02 11:48   ` kernel test robot
2021-12-02 12:43   ` Ido Schimmel
2021-12-02 15:17     ` Machnikowski, Maciej
2021-12-02 16:35       ` Ido Schimmel
2021-12-02 17:20         ` Machnikowski, Maciej
2021-12-03 14:26           ` Petr Machata
2021-12-03 14:55             ` Machnikowski, Maciej
2021-12-03 15:58               ` Ido Schimmel
2021-12-03 16:26               ` Petr Machata
2021-12-03 16:50                 ` Machnikowski, Maciej
2021-12-03 18:44                   ` Petr Machata
2021-12-03 19:07                     ` Machnikowski, Maciej
2021-12-06 14:40                       ` Petr Machata
2021-12-06 15:09                         ` Machnikowski, Maciej
2021-12-06 16:00                           ` Petr Machata
2021-12-06 18:49                             ` Machnikowski, Maciej
2021-12-03 15:45           ` Ido Schimmel
2021-12-03 16:18             ` Machnikowski, Maciej
2021-12-03 18:21               ` Petr Machata
2021-12-05 12:24               ` Ido Schimmel
2021-12-06  9:15                 ` Machnikowski, Maciej
2021-12-01 18:02 ` [PATCH v4 net-next 3/4] ice: add support for monitoring SyncE DPLL state Maciej Machnikowski
2021-12-01 18:02 ` [PATCH v4 net-next 4/4] ice: add support for SyncE recovered clocks Maciej Machnikowski
2021-12-02  1:56   ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).