All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v2 0/2] Add support for DSFP transceiver type
@ 2020-11-23  9:19 Moshe Shemesh
  2020-11-23  9:19 ` [PATCH net-next v2 1/2] ethtool: Add CMIS 4.0 module type to UAPI Moshe Shemesh
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Moshe Shemesh @ 2020-11-23  9:19 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Andrew Lunn, Adrian Pop, Michal Kubecek
  Cc: netdev, Vladyslav Tarasiuk, Moshe Shemesh, Moshe Shemesh

Add support for new cable module type DSFP (Dual Small Form-Factor Pluggable
transceiver). DSFP EEPROM memory layout is compatible with CMIS 4.0 spec. Add
CMIS 4.0 module type to UAPI and implement DSFP EEPROM dump in mlx5.

Change log:
v1 -> v2
- Added comments on accessing only the mandatory part of passive and
  active cables.

Vladyslav Tarasiuk (2):
  ethtool: Add CMIS 4.0 module type to UAPI
  net/mlx5e: Add DSFP EEPROM dump support to ethtool

 .../ethernet/mellanox/mlx5/core/en_ethtool.c  | 12 ++++-
 .../net/ethernet/mellanox/mlx5/core/port.c    | 52 ++++++++++++++++---
 include/linux/mlx5/port.h                     |  1 +
 include/uapi/linux/ethtool.h                  |  3 ++
 4 files changed, 60 insertions(+), 8 deletions(-)

-- 
2.18.2


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH net-next v2 1/2] ethtool: Add CMIS 4.0 module type to UAPI
  2020-11-23  9:19 [PATCH net-next v2 0/2] Add support for DSFP transceiver type Moshe Shemesh
@ 2020-11-23  9:19 ` Moshe Shemesh
  2020-11-23 22:40   ` Jesse Brandeburg
  2020-11-23  9:19 ` [PATCH net-next v2 2/2] net/mlx5e: Add DSFP EEPROM dump support to ethtool Moshe Shemesh
  2020-11-24  1:14 ` [PATCH net-next v2 0/2] Add support for DSFP transceiver type Andrew Lunn
  2 siblings, 1 reply; 19+ messages in thread
From: Moshe Shemesh @ 2020-11-23  9:19 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Andrew Lunn, Adrian Pop, Michal Kubecek
  Cc: netdev, Vladyslav Tarasiuk, Moshe Shemesh

From: Vladyslav Tarasiuk <vladyslavt@nvidia.com>

CMIS 4.0 document describes a universal EEPROM memory layout, which is
used for some modules such as DSFP, OSFP and QSFP-DD modules. In order
to distinguish them in userspace from existing standards, add
corresponding values.

CMIS 4.0 EERPOM memory includes mandatory and optional pages, the max
read length 768B includes passive and active cables mandatory pages.

Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
---
 include/uapi/linux/ethtool.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index 9ca87bc73c44..0ec4c0ea3235 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -1861,9 +1861,12 @@ static inline int ethtool_validate_duplex(__u8 duplex)
 #define ETH_MODULE_SFF_8636_LEN		256
 #define ETH_MODULE_SFF_8436		0x4
 #define ETH_MODULE_SFF_8436_LEN		256
+#define ETH_MODULE_CMIS_4		0x5
+#define ETH_MODULE_CMIS_4_LEN		256
 
 #define ETH_MODULE_SFF_8636_MAX_LEN     640
 #define ETH_MODULE_SFF_8436_MAX_LEN     640
+#define ETH_MODULE_CMIS_4_MAX_LEN	768
 
 /* Reset flags */
 /* The reset() operation must clear the flags for the components which
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH net-next v2 2/2] net/mlx5e: Add DSFP EEPROM dump support to ethtool
  2020-11-23  9:19 [PATCH net-next v2 0/2] Add support for DSFP transceiver type Moshe Shemesh
  2020-11-23  9:19 ` [PATCH net-next v2 1/2] ethtool: Add CMIS 4.0 module type to UAPI Moshe Shemesh
@ 2020-11-23  9:19 ` Moshe Shemesh
  2020-11-24  1:14 ` [PATCH net-next v2 0/2] Add support for DSFP transceiver type Andrew Lunn
  2 siblings, 0 replies; 19+ messages in thread
From: Moshe Shemesh @ 2020-11-23  9:19 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Andrew Lunn, Adrian Pop, Michal Kubecek
  Cc: netdev, Vladyslav Tarasiuk, Moshe Shemesh

From: Vladyslav Tarasiuk <vladyslavt@nvidia.com>

DSFP is a new cable module type, which EEPROM uses memory layout
described in CMIS 4.0 document. Use corresponding standard value for
userspace ethtool to distinguish DSFP's layout from older standards.

Add DSFP module ID in accordance to SFF-8024.

DSFP module memory can be flat or paged, which is indicated by a
flat_mem bit. In first case, only page 00 is available, and in second -
multiple pages: 00h, 01h, 02h, 10h and 11h. These five pages in bank
zero include the mandatory part for passive and active cables.

Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/en_ethtool.c  | 12 ++++-
 .../net/ethernet/mellanox/mlx5/core/port.c    | 52 ++++++++++++++++---
 include/linux/mlx5/port.h                     |  1 +
 3 files changed, 57 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 42e61dc28ead..e6e80f1b0e94 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -1659,8 +1659,8 @@ static int mlx5e_get_module_info(struct net_device *netdev,
 	int size_read = 0;
 	u8 data[4] = {0};
 
-	size_read = mlx5_query_module_eeprom(dev, 0, 2, data);
-	if (size_read < 2)
+	size_read = mlx5_query_module_eeprom(dev, 0, 3, data);
+	if (size_read < 3)
 		return -EIO;
 
 	/* data[0] = identifier byte */
@@ -1680,6 +1680,14 @@ static int mlx5e_get_module_info(struct net_device *netdev,
 			modinfo->eeprom_len = ETH_MODULE_SFF_8436_MAX_LEN;
 		}
 		break;
+	case MLX5_MODULE_ID_DSFP:
+		modinfo->type = ETH_MODULE_CMIS_4;
+		/* check flat_mem bit, zero indicates paged memory */
+		if (data[2] & 0x80)
+			modinfo->eeprom_len = ETH_MODULE_CMIS_4_LEN;
+		else
+			modinfo->eeprom_len = ETH_MODULE_CMIS_4_MAX_LEN;
+		break;
 	case MLX5_MODULE_ID_SFP:
 		modinfo->type       = ETH_MODULE_SFF_8472;
 		modinfo->eeprom_len = ETH_MODULE_SFF_8472_LEN;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c
index 4bb219565c58..df8e3d024479 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c
@@ -311,13 +311,9 @@ static int mlx5_query_module_id(struct mlx5_core_dev *dev, int module_num,
 	return 0;
 }
 
-static int mlx5_qsfp_eeprom_page(u16 offset)
+static int mlx5_eeprom_high_page_num(u16 offset)
 {
-	if (offset < MLX5_EEPROM_PAGE_LENGTH)
-		/* Addresses between 0-255 - page 00 */
-		return 0;
-
-	/* Addresses between 256 - 639 belongs to pages 01, 02 and 03
+	/* Addresses 256 and higher belong to pages 01, 02, etc.
 	 * For example, offset = 400 belongs to page 02:
 	 * 1 + ((400 - 256)/128) = 2
 	 */
@@ -325,6 +321,16 @@ static int mlx5_qsfp_eeprom_page(u16 offset)
 		    MLX5_EEPROM_HIGH_PAGE_LENGTH);
 }
 
+static int mlx5_qsfp_eeprom_page(u16 offset)
+{
+	if (offset < MLX5_EEPROM_PAGE_LENGTH)
+		/* Addresses between 0-255 - page 00 */
+		return 0;
+
+	/* Addresses between 256 - 639 belong to pages 01, 02 and 03 */
+	return mlx5_eeprom_high_page_num(offset);
+}
+
 static int mlx5_qsfp_eeprom_high_page_offset(int page_num)
 {
 	if (!page_num) /* Page 0 always start from low page */
@@ -341,6 +347,37 @@ static void mlx5_qsfp_eeprom_params_set(u16 *i2c_addr, int *page_num, u16 *offse
 	*offset -=  mlx5_qsfp_eeprom_high_page_offset(*page_num);
 }
 
+static int mlx5_dsfp_eeprom_high_page_offset(int page_num)
+{
+	if (!page_num)
+		return 0;
+
+	return (page_num < 0x10 ? page_num : page_num - 13) * MLX5_EEPROM_HIGH_PAGE_LENGTH;
+}
+
+static int mlx5_dsfp_eeprom_page(u16 offset)
+{
+	if (offset < MLX5_EEPROM_PAGE_LENGTH)
+		return 0;
+
+	if (offset < MLX5_EEPROM_PAGE_LENGTH + (MLX5_EEPROM_HIGH_PAGE_LENGTH * 2))
+		/* Addresses 0 - 511 - pages 00, 01 and 02 */
+		return mlx5_eeprom_high_page_num(offset);
+
+	/* Offsets 512 - 767 belong to pages 10h and 11h.
+	 * For example, offset = 700 belongs to page 11:
+	 * 13 + 1 + ((700 - 256) / 128) = 17 = 0x11
+	 */
+	return 13 + mlx5_eeprom_high_page_num(offset);
+}
+
+static void mlx5_dsfp_eeprom_params_set(u16 *i2c_addr, int *page_num, u16 *offset)
+{
+	*i2c_addr = MLX5_I2C_ADDR_LOW;
+	*page_num = mlx5_dsfp_eeprom_page(*offset);
+	*offset -= mlx5_dsfp_eeprom_high_page_offset(*page_num);
+}
+
 static void mlx5_sfp_eeprom_params_set(u16 *i2c_addr, int *page_num, u16 *offset)
 {
 	*i2c_addr = MLX5_I2C_ADDR_LOW;
@@ -380,6 +417,9 @@ int mlx5_query_module_eeprom(struct mlx5_core_dev *dev,
 	case MLX5_MODULE_ID_QSFP28:
 		mlx5_qsfp_eeprom_params_set(&i2c_addr, &page_num, &offset);
 		break;
+	case MLX5_MODULE_ID_DSFP:
+		mlx5_dsfp_eeprom_params_set(&i2c_addr, &page_num, &offset);
+		break;
 	default:
 		mlx5_core_err(dev, "Module ID not recognized: 0x%x\n", module_id);
 		return -EINVAL;
diff --git a/include/linux/mlx5/port.h b/include/linux/mlx5/port.h
index 23edd2db4803..ad4b2e778d46 100644
--- a/include/linux/mlx5/port.h
+++ b/include/linux/mlx5/port.h
@@ -45,6 +45,7 @@ enum mlx5_module_id {
 	MLX5_MODULE_ID_QSFP             = 0xC,
 	MLX5_MODULE_ID_QSFP_PLUS        = 0xD,
 	MLX5_MODULE_ID_QSFP28           = 0x11,
+	MLX5_MODULE_ID_DSFP             = 0x1B,
 };
 
 enum mlx5_an_status {
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 1/2] ethtool: Add CMIS 4.0 module type to UAPI
  2020-11-23  9:19 ` [PATCH net-next v2 1/2] ethtool: Add CMIS 4.0 module type to UAPI Moshe Shemesh
@ 2020-11-23 22:40   ` Jesse Brandeburg
  2020-11-25 10:41     ` Moshe Shemesh
  0 siblings, 1 reply; 19+ messages in thread
From: Jesse Brandeburg @ 2020-11-23 22:40 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Adrian Pop,
	Michal Kubecek, netdev, Vladyslav Tarasiuk, Moshe Shemesh

Moshe Shemesh wrote:

> From: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
> 
> CMIS 4.0 document describes a universal EEPROM memory layout, which is
> used for some modules such as DSFP, OSFP and QSFP-DD modules. In order
> to distinguish them in userspace from existing standards, add
> corresponding values.
> 
> CMIS 4.0 EERPOM memory includes mandatory and optional pages, the max

typo? s/EERPOM/EEPROM

> read length 768B includes passive and active cables mandatory pages.
> 
> Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
> Reviewed-by: Moshe Shemesh <moshe@nvidia.com>

rest was ok.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 0/2] Add support for DSFP transceiver type
  2020-11-23  9:19 [PATCH net-next v2 0/2] Add support for DSFP transceiver type Moshe Shemesh
  2020-11-23  9:19 ` [PATCH net-next v2 1/2] ethtool: Add CMIS 4.0 module type to UAPI Moshe Shemesh
  2020-11-23  9:19 ` [PATCH net-next v2 2/2] net/mlx5e: Add DSFP EEPROM dump support to ethtool Moshe Shemesh
@ 2020-11-24  1:14 ` Andrew Lunn
  2020-11-24 21:16   ` Jakub Kicinski
  2 siblings, 1 reply; 19+ messages in thread
From: Andrew Lunn @ 2020-11-24  1:14 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: David S. Miller, Jakub Kicinski, Adrian Pop, Michal Kubecek,
	netdev, Vladyslav Tarasiuk, Moshe Shemesh

On Mon, Nov 23, 2020 at 11:19:56AM +0200, Moshe Shemesh wrote:
> Add support for new cable module type DSFP (Dual Small Form-Factor Pluggable
> transceiver). DSFP EEPROM memory layout is compatible with CMIS 4.0 spec. Add
> CMIS 4.0 module type to UAPI and implement DSFP EEPROM dump in mlx5.

So the patches themselves look O.K.

But we are yet again kicking the can down the road and not fixing the
underlying inflexibility of the API.

Do we want to keep kicking the can, or is now the time to do the work
on this API?

   Andrew

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 0/2] Add support for DSFP transceiver type
  2020-11-24  1:14 ` [PATCH net-next v2 0/2] Add support for DSFP transceiver type Andrew Lunn
@ 2020-11-24 21:16   ` Jakub Kicinski
  2020-11-25 10:40     ` Moshe Shemesh
  0 siblings, 1 reply; 19+ messages in thread
From: Jakub Kicinski @ 2020-11-24 21:16 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Moshe Shemesh, David S. Miller, Adrian Pop, Michal Kubecek,
	netdev, Vladyslav Tarasiuk, Moshe Shemesh

On Tue, 24 Nov 2020 02:14:59 +0100 Andrew Lunn wrote:
> On Mon, Nov 23, 2020 at 11:19:56AM +0200, Moshe Shemesh wrote:
> > Add support for new cable module type DSFP (Dual Small Form-Factor Pluggable
> > transceiver). DSFP EEPROM memory layout is compatible with CMIS 4.0 spec. Add
> > CMIS 4.0 module type to UAPI and implement DSFP EEPROM dump in mlx5.  
> 
> So the patches themselves look O.K.
> 
> But we are yet again kicking the can down the road and not fixing the
> underlying inflexibility of the API.
> 
> Do we want to keep kicking the can, or is now the time to do the work
> on this API?

This is hardly rocket science. Let's do it right.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 0/2] Add support for DSFP transceiver type
  2020-11-24 21:16   ` Jakub Kicinski
@ 2020-11-25 10:40     ` Moshe Shemesh
  2020-11-25 14:18       ` Andrew Lunn
  0 siblings, 1 reply; 19+ messages in thread
From: Moshe Shemesh @ 2020-11-25 10:40 UTC (permalink / raw)
  To: Jakub Kicinski, Andrew Lunn
  Cc: Moshe Shemesh, David S. Miller, Adrian Pop, Michal Kubecek,
	netdev, Vladyslav Tarasiuk


On 11/24/2020 11:16 PM, Jakub Kicinski wrote:
> On Tue, 24 Nov 2020 02:14:59 +0100 Andrew Lunn wrote:
>> On Mon, Nov 23, 2020 at 11:19:56AM +0200, Moshe Shemesh wrote:
>>> Add support for new cable module type DSFP (Dual Small Form-Factor Pluggable
>>> transceiver). DSFP EEPROM memory layout is compatible with CMIS 4.0 spec. Add
>>> CMIS 4.0 module type to UAPI and implement DSFP EEPROM dump in mlx5.
>> So the patches themselves look O.K.
>>
>> But we are yet again kicking the can down the road and not fixing the
>> underlying inflexibility of the API.
>>
>> Do we want to keep kicking the can, or is now the time to do the work
>> on this API?
> This is hardly rocket science. Let's do it right.

OK, we will add API options to select bank and page to read any specific 
page the user selects. So advanced user will use it get the optional 
pages he needs, but what about non advanced user who wants to use the 
current API with a current script for DSFP EEPROM. Isn't it better that 
he will get the 5 mandatory pages then keep it not supported ?


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 1/2] ethtool: Add CMIS 4.0 module type to UAPI
  2020-11-23 22:40   ` Jesse Brandeburg
@ 2020-11-25 10:41     ` Moshe Shemesh
  0 siblings, 0 replies; 19+ messages in thread
From: Moshe Shemesh @ 2020-11-25 10:41 UTC (permalink / raw)
  To: Jesse Brandeburg, Moshe Shemesh
  Cc: David S. Miller, Jakub Kicinski, Andrew Lunn, Adrian Pop,
	Michal Kubecek, netdev, Vladyslav Tarasiuk


On 11/24/2020 12:40 AM, Jesse Brandeburg wrote:
> External email: Use caution opening links or attachments
>
>
> Moshe Shemesh wrote:
>
>> From: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
>>
>> CMIS 4.0 document describes a universal EEPROM memory layout, which is
>> used for some modules such as DSFP, OSFP and QSFP-DD modules. In order
>> to distinguish them in userspace from existing standards, add
>> corresponding values.
>>
>> CMIS 4.0 EERPOM memory includes mandatory and optional pages, the max
> typo? s/EERPOM/EEPROM
Right, thanks.
>
>> read length 768B includes passive and active cables mandatory pages.
>>
>> Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
>> Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
> rest was ok.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 0/2] Add support for DSFP transceiver type
  2020-11-25 10:40     ` Moshe Shemesh
@ 2020-11-25 14:18       ` Andrew Lunn
  2020-11-26 14:23         ` Moshe Shemesh
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Lunn @ 2020-11-25 14:18 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: Jakub Kicinski, Moshe Shemesh, David S. Miller, Adrian Pop,
	Michal Kubecek, netdev, Vladyslav Tarasiuk

> OK, we will add API options to select bank and page to read any specific
> page the user selects. So advanced user will use it get the optional pages
> he needs, but what about non advanced user who wants to use the current API
> with a current script for DSFP EEPROM. Isn't it better that he will get the
> 5 mandatory pages then keep it not supported ?

Users using ethtool will not see a difference. They get a dump of what
ethtool knows how to decode. It should try the netlink API first, and
then fall back to the old ioctl interface.

If i was implementing the ethtool side of it, i would probably do some
sort of caching system. We know page 0 should always exist, so
pre-load that into the cache. Try the netlink API first. If that
fails, use the ioctl interface. If the ioctl is used, put everything
returned into the cache. The decoder can then start decoding, see what
bits are set indicating other pages should be available. Ask for them
from the cache. The netlink API can go fetch them and load them into
the cache. If they cannot be loaded return ENODEV, and the decoder has
to skip what it wanted to decode. If you do it correctly, the decoder
should not care about ioctl vs netlink.

I can do a follow up patch for the generic SFP code in
drivers/net/phy, once you have done the first implementation. But i
only have a limited number of SFPs and most are 1G only. Russell King
can hopefully test with his collection.

     Andrew

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 0/2] Add support for DSFP transceiver type
  2020-11-25 14:18       ` Andrew Lunn
@ 2020-11-26 14:23         ` Moshe Shemesh
  2020-11-26 15:21           ` Andrew Lunn
  0 siblings, 1 reply; 19+ messages in thread
From: Moshe Shemesh @ 2020-11-26 14:23 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Jakub Kicinski, Moshe Shemesh, David S. Miller, Adrian Pop,
	Michal Kubecek, netdev, Vladyslav Tarasiuk, Maxim Mikityanskiy


On 11/25/2020 4:18 PM, Andrew Lunn wrote:
> External email: Use caution opening links or attachments
>
>
>> OK, we will add API options to select bank and page to read any specific
>> page the user selects. So advanced user will use it get the optional pages
>> he needs, but what about non advanced user who wants to use the current API
>> with a current script for DSFP EEPROM. Isn't it better that he will get the
>> 5 mandatory pages then keep it not supported ?
> Users using ethtool will not see a difference. They get a dump of what
> ethtool knows how to decode. It should try the netlink API first, and
> then fall back to the old ioctl interface.
Yes, it makes sense that whenever command not supported by netlink API 
it falls back to old ioctl interface. As I see it we want here to add 
bank and page options to netlink API  to get data from specific page.
>
> If i was implementing the ethtool side of it, i would probably do some
> sort of caching system. We know page 0 should always exist, so
> pre-load that into the cache. Try the netlink API first. If that
> fails, use the ioctl interface. If the ioctl is used, put everything
> returned into the cache.
I am not sure what you mean by cache here. Don't you want to read page 0 
once you got the ethtool command to read from the module ? If not, then 
at what stage ?
>   The decoder can then start decoding, see what
> bits are set indicating other pages should be available. Ask for them
> from the cache. The netlink API can go fetch them and load them into
> the cache. If they cannot be loaded return ENODEV, and the decoder has
> to skip what it wanted to decode.

So the decoder should read page 0 and check according to page 0 and 
specification which pages should be present, right ?

What about the global offset that we currently got when user doesn't 
specify a page, do you mean that this global offset goes through the 
optional and non optional pages that exist and skip the ones that are 
missing according to the specific EEPROM ?

>   If you do it correctly, the decoder
> should not care about ioctl vs netlink.
>
> I can do a follow up patch for the generic SFP code in
> drivers/net/phy, once you have done the first implementation. But i
> only have a limited number of SFPs and most are 1G only. Russell King
> can hopefully test with his collection.
>
>       Andrew

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 0/2] Add support for DSFP transceiver type
  2020-11-26 14:23         ` Moshe Shemesh
@ 2020-11-26 15:21           ` Andrew Lunn
  2020-11-27 15:12             ` Moshe Shemesh
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Lunn @ 2020-11-26 15:21 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: Jakub Kicinski, Moshe Shemesh, David S. Miller, Adrian Pop,
	Michal Kubecek, netdev, Vladyslav Tarasiuk, Maxim Mikityanskiy

> > If i was implementing the ethtool side of it, i would probably do some
> > sort of caching system. We know page 0 should always exist, so
> > pre-load that into the cache. Try the netlink API first. If that
> > fails, use the ioctl interface. If the ioctl is used, put everything
> > returned into the cache.
> I am not sure what you mean by cache here. Don't you want to read page 0
> once you got the ethtool command to read from the module ? If not, then at
> what stage ?

At the beginning, you try the netlink API and ask for pager 0, bytes
0-127. If you get a page, put it into the cache. If not, use the ioctl
interface, which could return one page, or multiple pages. Put them
all into the cache.

> >   The decoder can then start decoding, see what
> > bits are set indicating other pages should be available. Ask for them
> > from the cache. The netlink API can go fetch them and load them into
> > the cache. If they cannot be loaded return ENODEV, and the decoder has
> > to skip what it wanted to decode.
> 
> So the decoder should read page 0 and check according to page 0 and
> specification which pages should be present, right ?

Yes. It ask the cache, give me a pointer to page 0, bytes 0-127. It
then decodes that, looking at the enumeration data to indicate what
other pages should be available. Maybe it decides, page 0, bytes
128-255 should exist, so it asks the cache for a pointer to that. If
using netlink, it would ask the kernel for that data, put it into the
cache, and return a pointer. If using ioctl, it already knows if it
has that data, so it just returns a pointer, so says sorry, not
available.

> What about the global offset that we currently got when user doesn't specify
> a page, do you mean that this global offset goes through the optional and
> non optional pages that exist and skip the ones that are missing according
> to the specific EEPROM ?

ethtool -m|--dump-module-eeprom|--module-info devname [raw on|off] [hex on|off] [offset N] [length N]

So you mean [offset N] [length N].

That is going to be hard, but the API is broken for complex SFPs with
optional pages. And it is not well defined exactly what offset means.
You can keep backwards compatibility by identifying the SFP from page
0, and then reading the pages in the order the ioctl would do. Let
user space handle it, for those SFPs which the kernel already
supports. For SFPs which the kernel does not support, i would just
return not supported. You can do the same for raw. However, for new
SFPs, for raw you can run the decoder but output to /dev/null. That
loads into the cache all the pages which the decoder knows about. You
can then dump the cache. You probably need a new format, to give an
indication of what each page actually is.

Maybe you want to add new options [page N] [ bank N] to allow
arbitrary queries to be made? Again, you can answer these from the
cache, so the old ioctl interface could work if asked for pages which
the old API had.

    Andrew

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 0/2] Add support for DSFP transceiver type
  2020-11-26 15:21           ` Andrew Lunn
@ 2020-11-27 15:12             ` Moshe Shemesh
  2020-11-27 15:56               ` Andrew Lunn
  0 siblings, 1 reply; 19+ messages in thread
From: Moshe Shemesh @ 2020-11-27 15:12 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Jakub Kicinski, Moshe Shemesh, David S. Miller, Adrian Pop,
	Michal Kubecek, netdev, Vladyslav Tarasiuk, Maxim Mikityanskiy


On 11/26/2020 5:21 PM, Andrew Lunn wrote:
>>> If i was implementing the ethtool side of it, i would probably do some
>>> sort of caching system. We know page 0 should always exist, so
>>> pre-load that into the cache. Try the netlink API first. If that
>>> fails, use the ioctl interface. If the ioctl is used, put everything
>>> returned into the cache.
>> I am not sure what you mean by cache here. Don't you want to read page 0
>> once you got the ethtool command to read from the module ? If not, then at
>> what stage ?
> At the beginning, you try the netlink API and ask for pager 0, bytes
> 0-127. If you get a page, put it into the cache. If not, use the ioctl
> interface, which could return one page, or multiple pages. Put them
> all into the cache.


OK, but if the caching system is checking one time netlink and one time 
ioctl, it means this cache should be in user space, or did you mean to 
have this cache in kernel ?

>>>    The decoder can then start decoding, see what
>>> bits are set indicating other pages should be available. Ask for them
>>> from the cache. The netlink API can go fetch them and load them into
>>> the cache. If they cannot be loaded return ENODEV, and the decoder has
>>> to skip what it wanted to decode.
>> So the decoder should read page 0 and check according to page 0 and
>> specification which pages should be present, right ?
> Yes. It ask the cache, give me a pointer to page 0, bytes 0-127. It
> then decodes that, looking at the enumeration data to indicate what
> other pages should be available. Maybe it decides, page 0, bytes
> 128-255 should exist, so it asks the cache for a pointer to that. If
> using netlink, it would ask the kernel for that data, put it into the
> cache, and return a pointer. If using ioctl, it already knows if it
> has that data, so it just returns a pointer, so says sorry, not
> available.
>
>> What about the global offset that we currently got when user doesn't specify
>> a page, do you mean that this global offset goes through the optional and
>> non optional pages that exist and skip the ones that are missing according
>> to the specific EEPROM ?
> ethtool -m|--dump-module-eeprom|--module-info devname [raw on|off] [hex on|off] [offset N] [length N]
>
> So you mean [offset N] [length N].


Yes, that's the current options and we can either try coding new 
implementation for that or just call the current ioctl implementation. 
The new code can be triggered once options [bank N] and [Page N] are used.

>
> That is going to be hard, but the API is broken for complex SFPs with
> optional pages. And it is not well defined exactly what offset means.
> You can keep backwards compatibility by identifying the SFP from page
> 0, and then reading the pages in the order the ioctl would do. Let
> user space handle it, for those SFPs which the kernel already
> supports. For SFPs which the kernel does not support, i would just
> return not supported. You can do the same for raw. However, for new
> SFPs, for raw you can run the decoder but output to /dev/null. That
> loads into the cache all the pages which the decoder knows about. You
> can then dump the cache. You probably need a new format, to give an
> indication of what each page actually is.
OK, if I got it right on current API [offset N] [length N] just call 
ioctl current implementation, while using the option [raw on] will call 
new implementation for new SFPs (CMIS 4). Also using [bank N] and [page 
N] will call new implementation for new SFPs.
> Maybe you want to add new options [page N] [ bank N] to allow
> arbitrary queries to be made?
Exactly what I meant, I actually thought of letting the user ask for the 
page he wants, he should know what he needs.
> Again, you can answer these from the
> cache, so the old ioctl interface could work if asked for pages which
> the old API had.
Yes, for the simple EEPROM types that have 1 or 4 pages, ioctl read 
should be enough to get the data.
>
>      Andrew

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 0/2] Add support for DSFP transceiver type
  2020-11-27 15:12             ` Moshe Shemesh
@ 2020-11-27 15:56               ` Andrew Lunn
  2020-12-29 10:23                 ` Vladyslav Tarasiuk
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Lunn @ 2020-11-27 15:56 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: Jakub Kicinski, Moshe Shemesh, David S. Miller, Adrian Pop,
	Michal Kubecek, netdev, Vladyslav Tarasiuk, Maxim Mikityanskiy

> OK, but if the caching system is checking one time netlink and one time
> ioctl, it means this cache should be in user space, or did you mean to have
> this cache in kernel ?

This is all in userspace, in the ethtool code.

> > > What about the global offset that we currently got when user doesn't specify
> > > a page, do you mean that this global offset goes through the optional and
> > > non optional pages that exist and skip the ones that are missing according
> > > to the specific EEPROM ?
> > ethtool -m|--dump-module-eeprom|--module-info devname [raw on|off] [hex on|off] [offset N] [length N]
> > 
> > So you mean [offset N] [length N].
> 
> 
> Yes, that's the current options and we can either try coding new
> implementation for that or just call the current ioctl implementation. The
> new code can be triggered once options [bank N] and [Page N] are used.

You cannot rely on the ioctl being available. New drivers won't
implement it, if they have the netlink code. Drivers will convert from
get_module_info to whatever new ndo call you add for netlink.

> OK, if I got it right on current API [offset N] [length N] just call ioctl
> current implementation, while using the option [raw on] will call new
> implementation for new SFPs (CMIS 4). Also using [bank N] and [page N] will
> call new implementation for new SFPs.

Not just CMIS. All SFPs.

    Andrew

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 0/2] Add support for DSFP transceiver type
  2020-11-27 15:56               ` Andrew Lunn
@ 2020-12-29 10:23                 ` Vladyslav Tarasiuk
  2020-12-29 16:25                   ` Andrew Lunn
  0 siblings, 1 reply; 19+ messages in thread
From: Vladyslav Tarasiuk @ 2020-12-29 10:23 UTC (permalink / raw)
  To: Andrew Lunn, Moshe Shemesh
  Cc: Jakub Kicinski, Moshe Shemesh, David S. Miller, Adrian Pop,
	Michal Kubecek, netdev, Maxim Mikityanskiy, vladyslavt


On 27-Nov-20 17:56, Andrew Lunn wrote:
>> OK, but if the caching system is checking one time netlink and one time
>> ioctl, it means this cache should be in user space, or did you mean to have
>> this cache in kernel ?
> This is all in userspace, in the ethtool code.
>
>>>> What about the global offset that we currently got when user doesn't specify
>>>> a page, do you mean that this global offset goes through the optional and
>>>> non optional pages that exist and skip the ones that are missing according
>>>> to the specific EEPROM ?
>>> ethtool -m|--dump-module-eeprom|--module-info devname [raw on|off] [hex on|off] [offset N] [length N]
>>>
>>> So you mean [offset N] [length N].
>>
>> Yes, that's the current options and we can either try coding new
>> implementation for that or just call the current ioctl implementation. The
>> new code can be triggered once options [bank N] and [Page N] are used.
> You cannot rely on the ioctl being available. New drivers won't
> implement it, if they have the netlink code. Drivers will convert from
> get_module_info to whatever new ndo call you add for netlink.
>
>> OK, if I got it right on current API [offset N] [length N] just call ioctl
>> current implementation, while using the option [raw on] will call new
>> implementation for new SFPs (CMIS 4). Also using [bank N] and [page N] will
>> call new implementation for new SFPs.
> Not just CMIS. All SFPs.
>
>      Andrew

Hi Andrew,

Following this conversation, I wrote some pseudocode checking if I'm on 
right path here.
Please review:

struct eeprom_page {
         u8 page_number;
         u8 bank_number;
         u16 offset;
         u16 data_length;
         u8 *data;
}

Check every requested page, offset and length availability in userspace,
depending on module standard and EEPROM data.
================================================================================

cache_fetch_add(page_number, bank_number, offset, length)
{
         ethnla_put_u8 (msgbuff, ETHTOOL_A_EEPROM_PAGE_NUMBER, page_number);
         ethnla_put_u8 (msgbuff, ETHTOOL_A_EEPROM_BANK_NUMBER, bank_number);
         ethnla_put_u16(msgbuff, ETHTOOL_A_EEPROM_PAGE_OFFSET, offset);
         ethnla_put_u16(msgbuff, ETHTOOL_A_EEPROM_PAGE_LENGTH, length);
         err = nlsock_send_get_request(eeprom_page_reply_cb) // caches a 
page
         if (err < 0)
                 return err;
}

// Any call should be a pair of cache_fetch_add() with error check and only
// then a cache_get(), but for pseudocode this will do
cache_get_page(page_number, bank_number, offset, length)
{
         if (!cache_contains(page_number, bank_number, offset, length))
                 cache_fetch_add(page_number, bank_number, offset, length);
         return cache_get(page_number, bank_number);
}

print_cmis_y()
{
         page_data = cache_get_page(0, 0, 0, 256)->data;
         print_page_zero(page_data);
         page_data = cache_get_page(1, 0, 128, 128)->data;
         print_page_one(page_data);
}

print_human_readable()
{
         spec_id = cache_get_page(0, 0, 0, 128)->data[0];
         switch (spec_id) {
         case sff_xxxx:
                 print_sff_xxxx();
         case cmis_y:
                 print_cmis_y();
         default:
                 print_hex();
         }
}

getmodule_reply_cb()
{
         if (offset || hex || bank_number || page number)
                 print_hex();
         else
                 // if _human_readable() decoder needs more than page 
00, it will
                 // fetch it on demand
                 print_human_readable();
}

eeprom_page_reply_cb()
{
         cache_add(new_page);
}

page_available(page_number)
{
         spec_id = cache_get_page(0, 0, 0, 128)->data[0];

         // check bits indicating page availability
         switch (spec_id) {
         case sff_xxxx:
                 return is_page_available_sff_xxxx(page_number);
         case cmis_y:
                 return is_page_available_cmis_y(page_number, bank_number);
         default:
                 return -EINVAL;
         }
}

nl_getmodule()
{
         msg_init();

         // request first low page
         ret = cache_fetch_add(0, 0, 0, 128);

         if (ret < 0) // netlink unsupported, fall back to ioctl
                 return ret;

         netlink_cmd_check();
         nl_parser()
         // check bits indicating page availability according to used spec
         if (page_available(page_number, bank_number)) {
                 if (cache_contains(page_number, bank_number, offset, 
length))
                         print_page(page_number, bank_number, offset, 
length)
                 else {
                         ret = cache_fetch_add();
                         if (ret < 0)
                                 return ret;
                         print_page(page_number, bank_number, offset, 
length);
                 }
         } else {
                 return -EINVAL;
         }
}


Or just proceed to request any page and depend on kernel parameter 
validation
================================================================================

getmodule_reply_cb()
{
         if (offset || hex || bank_number || page_number)
                 print_hex();
         else
                 // if _human_readable() decoder needs more than page 
00, it will
                 // fetch it on demand
                 print_human_readable();
}

nl_getmodule()
{
         // request page straight away
         netlink_cmd_check();
         nl_parser();
         /*
          * nl_parser() sets page_number, page_bank, offset and length and
          * prepares to pass them to the kernel. See 
eeprom_page_parse_request()
          */
         ret = nlsock_sendmsg(getmodule_reply_cb);

         if (ret < 0)
                 retrun ret;
}



Kernel
This part is the same for both userspace variants
================================================================================

kernel_fallback(request, data)
{
         struct ethtool_eeprom eeprom;

         if (request->bank_number)
                 return -EINVAL;
         // also take into account page 00 size
         eeprom.offset = request->offset + page_size * request->page_number;
         eeprom.len = request->length;
         eeprom.cmd = ETHTOOL_GMODULEEEPROM;
         eeprom.data = data;

         err = ops->get_module_eeprom(dev, &eeprom, eeprom.data)
         if (err < 0)
                 return err;

         // prepare data for response via netlink like for supported ndo
}

eeprom_page_parse_request()
{
         struct eeprom_page_req_info *request;

         request->offset = nla_get_u16(tb[ETHTOOL_A_EEPROM_PAGE_OFFSET]);
         request->length = nla_get_u16(tb[ETHTOOL_A_EEPROM_PAGE_LENGTH]);
         request->page_number = 
nla_get_u8(tb[ETHTOOL_A_EEPROM_PAGE_NUMBER]);
         request->bank_number = 
nla_get_u8(tb[ETHTOOL_A_EEPROM_PAGE_BANK_NUMBER]);
}

eeprom_page_prepare_data(request)
{
         u8 data[128];
         u8 page_zero_data[128];

         if (!ops->get_module_eeprom_page) {
                 if (ops->get_module_info && ops->get_module_eeprom)
                         return kernel_fallback(request, &data);
                 else
                         return -EOPNOTSUPP;
         }
         // fetch page 00 for further checks
         err = ops->get_module_eeprom_page(dev, 0, 128, 0, 0, 
&page_zero_data);
         if (err < 0)
                 return err;

         // Checks according to spec, similar to page_available() in 
userspace.
         // May be ommited and passed directly to the driver for it to 
validate
         if (is_request_valid(request, &page_zero_data))
                 err = ops->get_module_eeprom_page(dev, request->offset,
request->length,
request->page_number,
request->bank_number, &data);
         else
                 return -EINVAL;

}

Driver
It is required to implement get_module_eeprom_page() ndo, where it queries
its EEPROM and copies to u8 *data array allocated by the kernel previously.
The ndo has the following prototype:
int get_module_eeprom_page(struct net_device *, u16 offset, u16 length,
                            u8 page_number, u8 bank_number, u8 *data);

The driver may test whatever it may need in this ndo implementation.
Its parameters may also be passed using a special struct.
===============================================================================

driver_get_module_eeprom_page()
{
         vendor_query_module_eeprom(page_number, bank_number, offset, 
length, data);
}


Thanks,
Vlad

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 0/2] Add support for DSFP transceiver type
  2020-12-29 10:23                 ` Vladyslav Tarasiuk
@ 2020-12-29 16:25                   ` Andrew Lunn
  2020-12-30 13:55                     ` Vladyslav Tarasiuk
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Lunn @ 2020-12-29 16:25 UTC (permalink / raw)
  To: Vladyslav Tarasiuk
  Cc: Moshe Shemesh, Jakub Kicinski, Moshe Shemesh, David S. Miller,
	Adrian Pop, Michal Kubecek, netdev, Maxim Mikityanskiy

> Hi Andrew,
> 
> Following this conversation, I wrote some pseudocode checking if I'm on
> right path here.
> Please review:
> 
> struct eeprom_page {
>         u8 page_number;
>         u8 bank_number;
>         u16 offset;
>         u16 data_length;
>         u8 *data;
> }

I'm wondering about offset and data_length, in this context. I would
expect you always ask the kernel for the full page, not part of
it. Even when user space asks for just part of a page. That keeps you
cache management simpler. But maybe some indicator of low/high is
needed, since many pages are actually 1/2 pages?

The other thing to consider is SFF-8472 and its use of two different
i2c addresses, A0h and A2h. These are different to pages and banks.

> print_human_readable()
> {
>         spec_id = cache_get_page(0, 0, 0, 128)->data[0];
>         switch (spec_id) {
>         case sff_xxxx:
>                 print_sff_xxxx();
>         case cmis_y:
>                 print_cmis_y();
>         default:
>                 print_hex();
>         }
> }

You want to keep as much of the existing ethtool code as you can, but
the basic idea looks O.K.

> getmodule_reply_cb()
> {
>         if (offset || hex || bank_number || page number)
>                 print_hex();
>         else
>                 // if _human_readable() decoder needs more than page 00, it
> will
>                 // fetch it on demand
>                 print_human_readable();
> }

Things get interesting here. Say this is page 0, and
print_human_readable() finds a bit indicating page 1 is valid. So it
requests page 1. We go recursive. While deep down in
print_human_readable(), we send the next netlink message and call
getmodule_reply_cb() when the answer appears. I've had problems with
some of the netlink code not liking recursive calls.

So i suggest you try to find a different structure for the code. Try
to complete the netlink call before doing the decoding. So add the
page to the cache and then return. Do the decode after
nlsock_sendmsg() has returned.

> Driver
> It is required to implement get_module_eeprom_page() ndo, where it queries
> its EEPROM and copies to u8 *data array allocated by the kernel previously.
> The ndo has the following prototype:
> int get_module_eeprom_page(struct net_device *, u16 offset, u16 length,
>                            u8 page_number, u8 bank_number, u8 *data);


I would include extack here, so we can get better error messages.

  Andrew

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 0/2] Add support for DSFP transceiver type
  2020-12-29 16:25                   ` Andrew Lunn
@ 2020-12-30 13:55                     ` Vladyslav Tarasiuk
  2020-12-30 15:36                       ` Andrew Lunn
  0 siblings, 1 reply; 19+ messages in thread
From: Vladyslav Tarasiuk @ 2020-12-30 13:55 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Moshe Shemesh, Jakub Kicinski, Moshe Shemesh, David S. Miller,
	Adrian Pop, Michal Kubecek, netdev, Maxim Mikityanskiy,
	vladyslavt


On 29-Dec-20 18:25, Andrew Lunn wrote:
>> Hi Andrew,
>>
>> Following this conversation, I wrote some pseudocode checking if I'm on
>> right path here.
>> Please review:
>>
>> struct eeprom_page {
>>          u8 page_number;
>>          u8 bank_number;
>>          u16 offset;
>>          u16 data_length;
>>          u8 *data;
>> }
> I'm wondering about offset and data_length, in this context. I would
> expect you always ask the kernel for the full page, not part of
> it. Even when user space asks for just part of a page. That keeps you
> cache management simpler.
As far as I know, there may be bytes, which may change on read.
For example, clear on read values in CMIS 4.0.
Then, retrieving whole page every time may be incorrect.
So I kept these for cases, when user asks for specific few bytes.
> But maybe some indicator of low/high is
> needed, since many pages are actually 1/2 pages?
I was planning to use offset and data_length fields to indicate the
available page. For example, high page will have offset 128 and
data_length 128.
> The other thing to consider is SFF-8472 and its use of two different
> i2c addresses, A0h and A2h. These are different to pages and banks.

I wasn't aware of that. It complicates things a bit, should we add a
parameter of i2c address? So in this case page 0 will be with i2c
address A0h. And if user needs page 0 from i2c address A2h, he will
specify it in command line. And for other specs, this parameter will
not be supported.

>> print_human_readable()
>> {
>>          spec_id = cache_get_page(0, 0, 0, 128)->data[0];
>>          switch (spec_id) {
>>          case sff_xxxx:
>>                  print_sff_xxxx();
>>          case cmis_y:
>>                  print_cmis_y();
>>          default:
>>                  print_hex();
>>          }
>> }
> You want to keep as much of the existing ethtool code as you can, but
> the basic idea looks O.K.
Yes, under print_sff_xxxx(), for example, I meant using existing functions.
Either as is, or refactoring according to cache requirements.
>> getmodule_reply_cb()
>> {
>>          if (offset || hex || bank_number || page number)
>>                  print_hex();
>>          else
>>                  // if _human_readable() decoder needs more than page 00, it
>> will
>>                  // fetch it on demand
>>                  print_human_readable();
>> }
> Things get interesting here. Say this is page 0, and
> print_human_readable() finds a bit indicating page 1 is valid. So it
> requests page 1. We go recursive. While deep down in
> print_human_readable(), we send the next netlink message and call
> getmodule_reply_cb() when the answer appears. I've had problems with
> some of the netlink code not liking recursive calls.
>
> So i suggest you try to find a different structure for the code. Try
> to complete the netlink call before doing the decoding. So add the
> page to the cache and then return. Do the decode after
> nlsock_sendmsg() has returned.
I'm thinking about a standard-specific function, which will prefetch pages
needed by print_human_readable(). It will check the standard ID, and go 
request
pages and add them to the cache. Then, decoder kicks with already cached
pages. This will eliminate recursive netlink calls.
>> Driver
>> It is required to implement get_module_eeprom_page() ndo, where it queries
>> its EEPROM and copies to u8 *data array allocated by the kernel previously.
>> The ndo has the following prototype:
>> int get_module_eeprom_page(struct net_device *, u16 offset, u16 length,
>>                             u8 page_number, u8 bank_number, u8 *data);
>
> I would include extack here, so we can get better error messages.

OK, I will add extack.

Thanks,
Vlad


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 0/2] Add support for DSFP transceiver type
  2020-12-30 13:55                     ` Vladyslav Tarasiuk
@ 2020-12-30 15:36                       ` Andrew Lunn
  2021-01-04 15:24                         ` Vladyslav Tarasiuk
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Lunn @ 2020-12-30 15:36 UTC (permalink / raw)
  To: Vladyslav Tarasiuk
  Cc: Moshe Shemesh, Jakub Kicinski, Moshe Shemesh, David S. Miller,
	Adrian Pop, Michal Kubecek, netdev, Maxim Mikityanskiy

On Wed, Dec 30, 2020 at 03:55:02PM +0200, Vladyslav Tarasiuk wrote:
> 
> On 29-Dec-20 18:25, Andrew Lunn wrote:
> > > Hi Andrew,
> > > 
> > > Following this conversation, I wrote some pseudocode checking if I'm on
> > > right path here.
> > > Please review:
> > > 
> > > struct eeprom_page {
> > >          u8 page_number;
> > >          u8 bank_number;
> > >          u16 offset;
> > >          u16 data_length;
> > >          u8 *data;
> > > }
> > I'm wondering about offset and data_length, in this context. I would
> > expect you always ask the kernel for the full page, not part of
> > it. Even when user space asks for just part of a page. That keeps you
> > cache management simpler.
> As far as I know, there may be bytes, which may change on read.
> For example, clear on read values in CMIS 4.0.

Ah, i did not know there were such bits. I will go read the spec. But
it should not really matter. If the SFP driver is interested in these
bits, it will have to intercept the read and act on the values.

> I wasn't aware of that. It complicates things a bit, should we add a
> parameter of i2c address? So in this case page 0 will be with i2c
> address A0h. And if user needs page 0 from i2c address A2h, he will
> specify it in command line.

Not on the command line. You should be able to determine from reading
page 0 at A0h is the diagnostics are at A2h or a page of A0h. That is
the whole point of this API, we decode the first page, and that tells
us what other pages should be available. So adding the i2c address to
the netlink message would be sensible. And i would not be too
surprised if there are SFPs with proprietary registers on other
addresses, which could be interesting to dump, if you can get access
to the needed datasheets.

   Andrew

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 0/2] Add support for DSFP transceiver type
  2020-12-30 15:36                       ` Andrew Lunn
@ 2021-01-04 15:24                         ` Vladyslav Tarasiuk
  2021-01-04 15:48                           ` Andrew Lunn
  0 siblings, 1 reply; 19+ messages in thread
From: Vladyslav Tarasiuk @ 2021-01-04 15:24 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Moshe Shemesh, Jakub Kicinski, Moshe Shemesh, David S. Miller,
	Adrian Pop, Michal Kubecek, netdev, Maxim Mikityanskiy,
	vladyslavt


On 30-Dec-20 17:36, Andrew Lunn wrote:
> On Wed, Dec 30, 2020 at 03:55:02PM +0200, Vladyslav Tarasiuk wrote:
>> On 29-Dec-20 18:25, Andrew Lunn wrote:
>>>> Hi Andrew,
>>>>
>>>> Following this conversation, I wrote some pseudocode checking if I'm on
>>>> right path here.
>>>> Please review:
>>>>
>>>> struct eeprom_page {
>>>>           u8 page_number;
>>>>           u8 bank_number;
>>>>           u16 offset;
>>>>           u16 data_length;
>>>>           u8 *data;
>>>> }
>>> I'm wondering about offset and data_length, in this context. I would
>>> expect you always ask the kernel for the full page, not part of
>>> it. Even when user space asks for just part of a page. That keeps you
>>> cache management simpler.
>> As far as I know, there may be bytes, which may change on read.
>> For example, clear on read values in CMIS 4.0.
> Ah, i did not know there were such bits. I will go read the spec. But
> it should not really matter. If the SFP driver is interested in these
> bits, it will have to intercept the read and act on the values.

But in case user requests a few bytes from a page with clear-on-read
values, reading full page will clear all such bytesfrom user perspective
even if they were not requested. Driver may intercept the read, but for
user it will look like those bytes were not set.

Current user interface allows arbitrary reads, so I wanted to keep this
behavior pretty much exactly like it is now - request specific part of
a page, get this part without any extra data.

>> I wasn't aware of that. It complicates things a bit, should we add a
>> parameter of i2c address? So in this case page 0 will be with i2c
>> address A0h. And if user needs page 0 from i2c address A2h, he will
>> specify it in command line.
> Not on the command line. You should be able to determine from reading
> page 0 at A0h is the diagnostics are at A2h or a page of A0h. That is
> the whole point of this API, we decode the first page, and that tells
> us what other pages should be available. So adding the i2c address to
> the netlink message would be sensible. And i would not be too
> surprised if there are SFPs with proprietary registers on other
> addresses, which could be interesting to dump, if you can get access
> to the needed datasheets.
Without command line argument user will not be able to request a single
A2h page, for example. He will see it only in some kind of general dump -
with human-readable decoder usage or multiple page dump.

And same goes forpages on other i2c addresses. How to know what to dump,
if user does not provide i2c address and there is no way to know what to
request from proprietary SFPs?

Thanks,
Vlad


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 0/2] Add support for DSFP transceiver type
  2021-01-04 15:24                         ` Vladyslav Tarasiuk
@ 2021-01-04 15:48                           ` Andrew Lunn
  0 siblings, 0 replies; 19+ messages in thread
From: Andrew Lunn @ 2021-01-04 15:48 UTC (permalink / raw)
  To: Vladyslav Tarasiuk
  Cc: Moshe Shemesh, Jakub Kicinski, Moshe Shemesh, David S. Miller,
	Adrian Pop, Michal Kubecek, netdev, Maxim Mikityanskiy

On Mon, Jan 04, 2021 at 05:24:11PM +0200, Vladyslav Tarasiuk wrote:
> 
> On 30-Dec-20 17:36, Andrew Lunn wrote:
> > On Wed, Dec 30, 2020 at 03:55:02PM +0200, Vladyslav Tarasiuk wrote:
> > > On 29-Dec-20 18:25, Andrew Lunn wrote:
> > > > > Hi Andrew,
> > > > > 
> > > > > Following this conversation, I wrote some pseudocode checking if I'm on
> > > > > right path here.
> > > > > Please review:
> > > > > 
> > > > > struct eeprom_page {
> > > > >           u8 page_number;
> > > > >           u8 bank_number;
> > > > >           u16 offset;
> > > > >           u16 data_length;
> > > > >           u8 *data;
> > > > > }
> > > > I'm wondering about offset and data_length, in this context. I would
> > > > expect you always ask the kernel for the full page, not part of
> > > > it. Even when user space asks for just part of a page. That keeps you
> > > > cache management simpler.
> > > As far as I know, there may be bytes, which may change on read.
> > > For example, clear on read values in CMIS 4.0.
> > Ah, i did not know there were such bits. I will go read the spec. But
> > it should not really matter. If the SFP driver is interested in these
> > bits, it will have to intercept the read and act on the values.
> 
> But in case user requests a few bytes from a page with clear-on-read
> values, reading full page will clear all such bytesfrom user perspective
> even if they were not requested. Driver may intercept the read, but for
> user it will look like those bytes were not set.

Yes, O.K. Reading individual words does make sense.

> Without command line argument user will not be able to request a single
> A2h page, for example. He will see it only in some kind of general dump -
> with human-readable decoder usage or multiple page dump.
> 
> And same goes forpages on other i2c addresses. How to know what to dump,
> if user does not provide i2c address and there is no way to know what to
> request from proprietary SFPs?

So we should look at this from the perspective of use cases. Currently
we have:

ethtool -m|--dump-module-eeprom|--module-info devname [raw on|off] [hex on|off] [offset N] [length N]

If you use it without any of [raw on|off] [hex on|off] [offset N]
[length N] it decodes what it finds. As soon as you pass any of these
options, the decoding is disabled and is just dumps values, either raw
or hex.

I would say, i2c address as a parameter can be added, but again,
passing it disables decoding, is just dumps raw or hex.

When not passed, and decoding is enabled, the decoder should decide if
A2 is available based on what is finds in page 0, and ask for it.

We also need to clearly define what offset and length means in this
context. It has to be within a specific page if page, bank or i2c
address has been passed, unlike what it currently means which is
offset into the current blob returned by the kernel.

I also took a look at CMIS. It has interesting semantics for address
wrap around when doing multiple byte reads. A read which reaches 127
wraps around to 0. A read which reached 255 wraps around to 128. So
for the kernel API, we probably do not want to allow offset/length to
cause a wrap around. You can only read within the low 128 bytes, or
the upper 128 bytes.

   Andrew

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2021-01-04 15:49 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-23  9:19 [PATCH net-next v2 0/2] Add support for DSFP transceiver type Moshe Shemesh
2020-11-23  9:19 ` [PATCH net-next v2 1/2] ethtool: Add CMIS 4.0 module type to UAPI Moshe Shemesh
2020-11-23 22:40   ` Jesse Brandeburg
2020-11-25 10:41     ` Moshe Shemesh
2020-11-23  9:19 ` [PATCH net-next v2 2/2] net/mlx5e: Add DSFP EEPROM dump support to ethtool Moshe Shemesh
2020-11-24  1:14 ` [PATCH net-next v2 0/2] Add support for DSFP transceiver type Andrew Lunn
2020-11-24 21:16   ` Jakub Kicinski
2020-11-25 10:40     ` Moshe Shemesh
2020-11-25 14:18       ` Andrew Lunn
2020-11-26 14:23         ` Moshe Shemesh
2020-11-26 15:21           ` Andrew Lunn
2020-11-27 15:12             ` Moshe Shemesh
2020-11-27 15:56               ` Andrew Lunn
2020-12-29 10:23                 ` Vladyslav Tarasiuk
2020-12-29 16:25                   ` Andrew Lunn
2020-12-30 13:55                     ` Vladyslav Tarasiuk
2020-12-30 15:36                       ` Andrew Lunn
2021-01-04 15:24                         ` Vladyslav Tarasiuk
2021-01-04 15:48                           ` Andrew Lunn

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.