netdev.vger.kernel.org archive mirror
* [PATCH net-next v4 0/6] add ethernet driver for Tehuti Networks TN40xx chips
@ 2024-05-01 23:05 FUJITA Tomonori
  2024-05-01 23:05 ` [PATCH net-next v4 1/6] net: tn40xx: add pci " FUJITA Tomonori
                   ` (5 more replies)
  0 siblings, 6 replies; 22+ messages in thread
From: FUJITA Tomonori @ 2024-05-01 23:05 UTC
  To: netdev; +Cc: andrew, kuba, jiri, horms

This patchset adds a new 10G ethernet driver for Tehuti Networks
TN40xx chips. Note that mainline already has a driver for Tehuti
Networks hardware (drivers/net/ethernet/tehuti/tehuti.[hc]), which
supports TN30xx chips.

Multiple vendors (DLink, Asus, Edimax, QNAP, etc) developed adapters
based on TN40xx chips. Tehuti Networks went out of business, but the
drivers are still distributed under GPL2 with some of the hardware
(and are also available on some sites). I'm trying to upstream this
driver, with some changes, along with a new PHY driver in Rust.

The major change is replacing a PHY abstraction layer with
PHYLIB. TN40xx chips are used with various PHY hardware (AMCC QT2025,
TI TLK10232, Aqrate AQR105, and Marvell MV88X3120, MV88X3310, and
MV88E2010). So the original driver has its own PHY abstraction layer
to handle them.

I've also been working on a new PHY driver for QT2025 in Rust [1]. For
now, I enable only adapters using QT2025 PHY in the PCI ID table of
this driver. I've tested this driver and the QT2025 PHY driver with
an Edimax EN-9320 10G adapter. In mainline, there are PHY drivers for
AQR105 and Marvell PHYs, which could work for some TN40xx adapters
with this driver.

To make reviewing easier, this patchset has only basic functions. Once
merged, I'll submit features like ethtool support.

v4:
- fix warning on 32bit build
- fix inline warnings
- fix header file inclusion
- fix TN40_NDEV_TXQ_LEN
- remove 'select PHYLIB' in Kconfig
- fix access to phydev
- clean up readx_poll_timeout_atomic usage
v3: https://lore.kernel.org/netdev/20240429043827.44407-2-fujita.tomonori@gmail.com/
- remove driver version
- use prefixes tn40_/TN40_ for all function, struct and define names
v2: https://lore.kernel.org/netdev/20240425010354.32605-1-fujita.tomonori@gmail.com/
- split mdio patch into mdio and phy support
- add phylink support
- clean up mdio read/write
- use the standard bit operation macros
- use upper_32/lower_32_bits macro
- use tn40_ prefix instead of bdx_
- fix Sparse errors
- fix compiler warnings
- fix style issues
v1: https://lore.kernel.org/netdev/20240415104352.4685-1-fujita.tomonori@gmail.com/

[1] https://lore.kernel.org/netdev/20240415104701.4772-1-fujita.tomonori@gmail.com/


FUJITA Tomonori (6):
  net: tn40xx: add pci driver for Tehuti Networks TN40xx chips
  net: tn40xx: add register defines
  net: tn40xx: add basic Tx handling
  net: tn40xx: add basic Rx handling
  net: tn40xx: add mdio bus support
  net: tn40xx: add PHYLIB support

 MAINTAINERS                             |    8 +-
 drivers/net/ethernet/tehuti/Kconfig     |   14 +
 drivers/net/ethernet/tehuti/Makefile    |    3 +
 drivers/net/ethernet/tehuti/tn40.c      | 1880 +++++++++++++++++++++++
 drivers/net/ethernet/tehuti/tn40.h      |  259 ++++
 drivers/net/ethernet/tehuti/tn40_mdio.c |  134 ++
 drivers/net/ethernet/tehuti/tn40_phy.c  |   67 +
 drivers/net/ethernet/tehuti/tn40_regs.h |  245 +++
 8 files changed, 2609 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/tehuti/tn40.c
 create mode 100644 drivers/net/ethernet/tehuti/tn40.h
 create mode 100644 drivers/net/ethernet/tehuti/tn40_mdio.c
 create mode 100644 drivers/net/ethernet/tehuti/tn40_phy.c
 create mode 100644 drivers/net/ethernet/tehuti/tn40_regs.h


base-commit: d5115a55ffb5253743346ddf628a890417e2935e
-- 
2.34.1



* [PATCH net-next v4 1/6] net: tn40xx: add pci driver for Tehuti Networks TN40xx chips
  2024-05-01 23:05 [PATCH net-next v4 0/6] add ethernet driver for Tehuti Networks TN40xx chips FUJITA Tomonori
@ 2024-05-01 23:05 ` FUJITA Tomonori
  2024-05-07  1:38   ` Jakub Kicinski
  2024-05-01 23:05 ` [PATCH net-next v4 2/6] net: tn40xx: add register defines FUJITA Tomonori
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: FUJITA Tomonori @ 2024-05-01 23:05 UTC
  To: netdev; +Cc: andrew, kuba, jiri, horms

This just adds the scaffolding for an ethernet driver for Tehuti
Networks TN40xx chips.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
---
 MAINTAINERS                          |  8 +++-
 drivers/net/ethernet/tehuti/Kconfig  | 12 ++++++
 drivers/net/ethernet/tehuti/Makefile |  3 ++
 drivers/net/ethernet/tehuti/tn40.c   | 58 ++++++++++++++++++++++++++++
 drivers/net/ethernet/tehuti/tn40.h   | 11 ++++++
 5 files changed, 91 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/tehuti/tn40.c
 create mode 100644 drivers/net/ethernet/tehuti/tn40.h

diff --git a/MAINTAINERS b/MAINTAINERS
index ab89edc6974d..3b8f1b8cc2f4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -21841,7 +21841,13 @@ TEHUTI ETHERNET DRIVER
 M:	Andy Gospodarek <andy@greyhouse.net>
 L:	netdev@vger.kernel.org
 S:	Supported
-F:	drivers/net/ethernet/tehuti/*
+F:	drivers/net/ethernet/tehuti/tehuti.*
+
+TEHUTI TN40XX ETHERNET DRIVER
+M:	FUJITA Tomonori <fujita.tomonori@gmail.com>
+L:	netdev@vger.kernel.org
+S:	Supported
+F:	drivers/net/ethernet/tehuti/tn40*
 
 TELECOM CLOCK DRIVER FOR MCPL0010
 M:	Mark Gross <markgross@kernel.org>
diff --git a/drivers/net/ethernet/tehuti/Kconfig b/drivers/net/ethernet/tehuti/Kconfig
index 8735633765a1..849e3b4a71c1 100644
--- a/drivers/net/ethernet/tehuti/Kconfig
+++ b/drivers/net/ethernet/tehuti/Kconfig
@@ -23,4 +23,16 @@ config TEHUTI
 	help
 	  Tehuti Networks 10G Ethernet NIC
 
+config TEHUTI_TN40
+	tristate "Tehuti Networks TN40xx 10G Ethernet adapters"
+	depends on PCI
+	help
+	  This driver supports 10G Ethernet adapters using Tehuti Networks
+	  TN40xx chips. Currently, adapters with Applied Micro Circuits
+	  Corporation QT2025 are supported; Tehuti Networks TN9310,
+	  DLink DXE-810S, ASUS XG-C100F, and Edimax EN-9320.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called tn40xx.
+
 endif # NET_VENDOR_TEHUTI
diff --git a/drivers/net/ethernet/tehuti/Makefile b/drivers/net/ethernet/tehuti/Makefile
index 13a0ddd62088..1c468d99e476 100644
--- a/drivers/net/ethernet/tehuti/Makefile
+++ b/drivers/net/ethernet/tehuti/Makefile
@@ -4,3 +4,6 @@
 #
 
 obj-$(CONFIG_TEHUTI) += tehuti.o
+
+tn40xx-y := tn40.o
+obj-$(CONFIG_TEHUTI_TN40) += tn40xx.o
diff --git a/drivers/net/ethernet/tehuti/tn40.c b/drivers/net/ethernet/tehuti/tn40.c
new file mode 100644
index 000000000000..a7c4fad249c9
--- /dev/null
+++ b/drivers/net/ethernet/tehuti/tn40.c
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-2.0+
+/* Copyright (c) Tehuti Networks Ltd. */
+
+#include <linux/pci.h>
+
+#include "tn40.h"
+
+static int tn40_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+{
+	int ret;
+
+	ret = pci_enable_device(pdev);
+	if (ret)
+		return ret;
+
+	if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64))) {
+		ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
+		if (ret) {
+			dev_err(&pdev->dev, "failed to set DMA mask.\n");
+			goto err_disable_device;
+		}
+	}
+	return 0;
+err_disable_device:
+	pci_disable_device(pdev);
+	return ret;
+}
+
+static void tn40_remove(struct pci_dev *pdev)
+{
+	pci_disable_device(pdev);
+}
+
+static const struct pci_device_id tn40_id_table[] = {
+	{ PCI_DEVICE_SUB(PCI_VENDOR_ID_TEHUTI, 0x4022,
+			 PCI_VENDOR_ID_TEHUTI, 0x3015) },
+	{ PCI_DEVICE_SUB(PCI_VENDOR_ID_TEHUTI, 0x4022,
+			 PCI_VENDOR_ID_DLINK, 0x4d00) },
+	{ PCI_DEVICE_SUB(PCI_VENDOR_ID_TEHUTI, 0x4022,
+			 PCI_VENDOR_ID_ASUSTEK, 0x8709) },
+	{ PCI_DEVICE_SUB(PCI_VENDOR_ID_TEHUTI, 0x4022,
+			 PCI_VENDOR_ID_EDIMAX, 0x8103) },
+	{ }
+};
+
+static struct pci_driver tn40_driver = {
+	.name = TN40_DRV_NAME,
+	.id_table = tn40_id_table,
+	.probe = tn40_probe,
+	.remove = tn40_remove,
+};
+
+module_pci_driver(tn40_driver);
+
+MODULE_DEVICE_TABLE(pci, tn40_id_table);
+MODULE_AUTHOR("Tehuti networks");
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Tehuti Network TN40xx Driver");
diff --git a/drivers/net/ethernet/tehuti/tn40.h b/drivers/net/ethernet/tehuti/tn40.h
new file mode 100644
index 000000000000..a5c5b558f56c
--- /dev/null
+++ b/drivers/net/ethernet/tehuti/tn40.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/* Copyright (c) Tehuti Networks Ltd. */
+
+#ifndef _TN40_H_
+#define _TN40_H_
+
+#define TN40_DRV_NAME "tn40xx"
+
+#define PCI_VENDOR_ID_EDIMAX 0x1432
+
+#endif /* _TN40_H_ */
-- 
2.34.1



* [PATCH net-next v4 2/6] net: tn40xx: add register defines
  2024-05-01 23:05 [PATCH net-next v4 0/6] add ethernet driver for Tehuti Networks TN40xx chips FUJITA Tomonori
  2024-05-01 23:05 ` [PATCH net-next v4 1/6] net: tn40xx: add pci " FUJITA Tomonori
@ 2024-05-01 23:05 ` FUJITA Tomonori
  2024-05-01 23:05 ` [PATCH net-next v4 3/6] net: tn40xx: add basic Tx handling FUJITA Tomonori
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: FUJITA Tomonori @ 2024-05-01 23:05 UTC
  To: netdev; +Cc: andrew, kuba, jiri, horms

This adds several defines to handle registers in Tehuti Networks
TN40xx chips for later patches.
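
As a rough illustration of how the mask-style defines in this header
are meant to be used, here is a user-space sketch. SK_GENMASK() and
SK_FIELD_GET() are simplified, 32-bit-only stand-ins written for this
example; they are not the kernel's GENMASK()/FIELD_GET() definitions.
The MDIO_* names and mdio_status_ok() are likewise hypothetical and
only mirror TN40_GET_MDIO_BUSY()/TN40_GET_MDIO_RD_ERR().

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel macros (assumption: 32-bit only). */
#define SK_GENMASK(h, l) \
	((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l)))
/* Mask the value, then divide by the lowest set bit of the mask,
 * which is equivalent to shifting the field down to bit 0.
 */
#define SK_FIELD_GET(mask, val) \
	(((uint32_t)(val) & (mask)) / ((mask) & -(mask)))

/* Mirrors of the MDIO status accessors: bit 0 is busy, bit 1 is read error. */
#define MDIO_BUSY(x)	SK_FIELD_GET(SK_GENMASK(0, 0), (x))
#define MDIO_RD_ERR(x)	SK_FIELD_GET(SK_GENMASK(1, 1), (x))

/* Hypothetical helper: a command finished cleanly when neither bit is set. */
static int mdio_status_ok(uint32_t stat)
{
	return !MDIO_BUSY(stat) && !MDIO_RD_ERR(stat);
}
```

With these stand-ins, SK_GENMASK(14, 3) yields 0x7ff8, the same value
the header spells out for TN40_TXF_WPTR_WR_PTR (bits 14:3).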

Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
---
 drivers/net/ethernet/tehuti/tn40.h      |   2 +
 drivers/net/ethernet/tehuti/tn40_regs.h | 245 ++++++++++++++++++++++++
 2 files changed, 247 insertions(+)
 create mode 100644 drivers/net/ethernet/tehuti/tn40_regs.h

diff --git a/drivers/net/ethernet/tehuti/tn40.h b/drivers/net/ethernet/tehuti/tn40.h
index a5c5b558f56c..c0ac64a19c31 100644
--- a/drivers/net/ethernet/tehuti/tn40.h
+++ b/drivers/net/ethernet/tehuti/tn40.h
@@ -4,6 +4,8 @@
 #ifndef _TN40_H_
 #define _TN40_H_
 
+#include "tn40_regs.h"
+
 #define TN40_DRV_NAME "tn40xx"
 
 #define PCI_VENDOR_ID_EDIMAX 0x1432
diff --git a/drivers/net/ethernet/tehuti/tn40_regs.h b/drivers/net/ethernet/tehuti/tn40_regs.h
new file mode 100644
index 000000000000..95171aa57a9e
--- /dev/null
+++ b/drivers/net/ethernet/tehuti/tn40_regs.h
@@ -0,0 +1,245 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/* Copyright (c) Tehuti Networks Ltd. */
+
+#ifndef _TN40_REGS_H_
+#define _TN40_REGS_H_
+
+/* Register region size */
+#define TN40_REGS_SIZE 0x10000
+
+/* Registers from 0x0000-0x00fc were remapped to 0x4000-0x40fc */
+#define TN40_REG_TXD_CFG1_0 0x4000
+#define TN40_REG_TXD_CFG1_1 0x4004
+#define TN40_REG_TXD_CFG1_2 0x4008
+#define TN40_REG_TXD_CFG1_3 0x400C
+
+#define TN40_REG_RXF_CFG1_0 0x4010
+#define TN40_REG_RXF_CFG1_1 0x4014
+#define TN40_REG_RXF_CFG1_2 0x4018
+#define TN40_REG_RXF_CFG1_3 0x401C
+
+#define TN40_REG_RXD_CFG1_0 0x4020
+#define TN40_REG_RXD_CFG1_1 0x4024
+#define TN40_REG_RXD_CFG1_2 0x4028
+#define TN40_REG_RXD_CFG1_3 0x402C
+
+#define TN40_REG_TXF_CFG1_0 0x4030
+#define TN40_REG_TXF_CFG1_1 0x4034
+#define TN40_REG_TXF_CFG1_2 0x4038
+#define TN40_REG_TXF_CFG1_3 0x403C
+
+#define TN40_REG_TXD_CFG0_0 0x4040
+#define TN40_REG_TXD_CFG0_1 0x4044
+#define TN40_REG_TXD_CFG0_2 0x4048
+#define TN40_REG_TXD_CFG0_3 0x404C
+
+#define TN40_REG_RXF_CFG0_0 0x4050
+#define TN40_REG_RXF_CFG0_1 0x4054
+#define TN40_REG_RXF_CFG0_2 0x4058
+#define TN40_REG_RXF_CFG0_3 0x405C
+
+#define TN40_REG_RXD_CFG0_0 0x4060
+#define TN40_REG_RXD_CFG0_1 0x4064
+#define TN40_REG_RXD_CFG0_2 0x4068
+#define TN40_REG_RXD_CFG0_3 0x406C
+
+#define TN40_REG_TXF_CFG0_0 0x4070
+#define TN40_REG_TXF_CFG0_1 0x4074
+#define TN40_REG_TXF_CFG0_2 0x4078
+#define TN40_REG_TXF_CFG0_3 0x407C
+
+#define TN40_REG_TXD_WPTR_0 0x4080
+#define TN40_REG_TXD_WPTR_1 0x4084
+#define TN40_REG_TXD_WPTR_2 0x4088
+#define TN40_REG_TXD_WPTR_3 0x408C
+
+#define TN40_REG_RXF_WPTR_0 0x4090
+#define TN40_REG_RXF_WPTR_1 0x4094
+#define TN40_REG_RXF_WPTR_2 0x4098
+#define TN40_REG_RXF_WPTR_3 0x409C
+
+#define TN40_REG_RXD_WPTR_0 0x40A0
+#define TN40_REG_RXD_WPTR_1 0x40A4
+#define TN40_REG_RXD_WPTR_2 0x40A8
+#define TN40_REG_RXD_WPTR_3 0x40AC
+
+#define TN40_REG_TXF_WPTR_0 0x40B0
+#define TN40_REG_TXF_WPTR_1 0x40B4
+#define TN40_REG_TXF_WPTR_2 0x40B8
+#define TN40_REG_TXF_WPTR_3 0x40BC
+
+#define TN40_REG_TXD_RPTR_0 0x40C0
+#define TN40_REG_TXD_RPTR_1 0x40C4
+#define TN40_REG_TXD_RPTR_2 0x40C8
+#define TN40_REG_TXD_RPTR_3 0x40CC
+
+#define TN40_REG_RXF_RPTR_0 0x40D0
+#define TN40_REG_RXF_RPTR_1 0x40D4
+#define TN40_REG_RXF_RPTR_2 0x40D8
+#define TN40_REG_RXF_RPTR_3 0x40DC
+
+#define TN40_REG_RXD_RPTR_0 0x40E0
+#define TN40_REG_RXD_RPTR_1 0x40E4
+#define TN40_REG_RXD_RPTR_2 0x40E8
+#define TN40_REG_RXD_RPTR_3 0x40EC
+
+#define TN40_REG_TXF_RPTR_0 0x40F0
+#define TN40_REG_TXF_RPTR_1 0x40F4
+#define TN40_REG_TXF_RPTR_2 0x40F8
+#define TN40_REG_TXF_RPTR_3 0x40FC
+
+/* Hardware versioning */
+#define TN40_FPGA_VER 0x5030
+
+/* Registers from 0x0100-0x0150 were remapped to 0x5100-0x5150 */
+#define TN40_REG_ISR TN40_REG_ISR0
+#define TN40_REG_ISR0 0x5100
+
+#define TN40_REG_IMR TN40_REG_IMR0
+#define TN40_REG_IMR0 0x5110
+
+#define TN40_REG_RDINTCM0 0x5120
+#define TN40_REG_RDINTCM2 0x5128
+
+#define TN40_REG_TDINTCM0 0x5130
+
+#define TN40_REG_ISR_MSK0 0x5140
+
+#define TN40_REG_INIT_SEMAPHORE 0x5170
+#define TN40_REG_INIT_STATUS 0x5180
+
+#define TN40_REG_MAC_LNK_STAT 0x0200
+#define TN40_MAC_LINK_STAT 0x0004 /* Link state */
+
+#define TN40_REG_BLNK_LED 0x0210
+
+#define TN40_REG_GMAC_RXF_A 0x1240
+
+#define TN40_REG_UNC_MAC0_A 0x1250
+#define TN40_REG_UNC_MAC1_A 0x1260
+#define TN40_REG_UNC_MAC2_A 0x1270
+
+#define TN40_REG_VLAN_0 0x1800
+
+#define TN40_REG_MAX_FRAME_A 0x12C0
+
+#define TN40_REG_RX_MAC_MCST0 0x1A80
+#define TN40_REG_RX_MAC_MCST1 0x1A84
+#define TN40_MAC_MCST_NUM 15
+#define TN40_REG_RX_MCST_HASH0 0x1A00
+#define TN40_MAC_MCST_HASH_NUM 8
+
+#define TN40_REG_VPC 0x2300
+#define TN40_REG_VIC 0x2320
+#define TN40_REG_VGLB 0x2340
+
+#define TN40_REG_CLKPLL 0x5000
+
+/* MDIO interface */
+
+#define TN40_REG_MDIO_CMD_STAT 0x6030
+#define TN40_REG_MDIO_CMD 0x6034
+#define TN40_REG_MDIO_DATA 0x6038
+#define TN40_REG_MDIO_ADDR 0x603C
+#define TN40_GET_MDIO_BUSY(x) FIELD_GET(GENMASK(0, 0), (x))
+#define TN40_GET_MDIO_RD_ERR(x) FIELD_GET(GENMASK(1, 1), (x))
+
+#define TN40_REG_REVISION 0x6000
+#define TN40_REG_SCRATCH 0x6004
+#define TN40_REG_CTRLST 0x6008
+#define TN40_REG_MAC_ADDR_0 0x600C
+#define TN40_REG_MAC_ADDR_1 0x6010
+#define TN40_REG_FRM_LENGTH 0x6014
+#define TN40_REG_PAUSE_QUANT 0x6054
+#define TN40_REG_RX_FIFO_SECTION 0x601C
+#define TN40_REG_TX_FIFO_SECTION 0x6020
+#define TN40_REG_RX_FULLNESS 0x6024
+#define TN40_REG_TX_FULLNESS 0x6028
+#define TN40_REG_HASHTABLE 0x602C
+
+#define TN40_REG_RST_PORT 0x7000
+#define TN40_REG_DIS_PORT 0x7010
+#define TN40_REG_RST_QU 0x7020
+#define TN40_REG_DIS_QU 0x7030
+
+#define TN40_REG_CTRLST_TX_ENA 0x0001
+#define TN40_REG_CTRLST_RX_ENA 0x0002
+#define TN40_REG_CTRLST_PRM_ENA 0x0010
+#define TN40_REG_CTRLST_PAD_ENA 0x0020
+
+#define TN40_REG_CTRLST_BASE (TN40_REG_CTRLST_PAD_ENA | TN40_REG_CTRLST_PRM_ENA)
+
+/* TXD TXF RXF RXD  CONFIG 0x0000 --- 0x007c */
+#define TN40_TX_RX_CFG1_BASE 0xffffffff /*0-31 */
+#define TN40_TX_RX_CFG0_BASE 0xfffff000 /*31:12 */
+#define TN40_TX_RX_CFG0_RSVD 0x00000ffc /*11:2 */
+#define TN40_TX_RX_CFG0_SIZE 0x00000003 /*1:0 */
+
+/* TXD TXF RXF RXD  WRITE 0x0080 --- 0x00BC */
+#define TN40_TXF_WPTR_WR_PTR 0x00007ff8 /*14:3 */
+
+/* TXD TXF RXF RXD  READ  0x00C0 --- 0x00FC */
+#define TN40_TXF_RPTR_RD_PTR 0x00007ff8 /*14:3 */
+
+/* The last 4 bits are dropped; size is rounded to 16 */
+#define TN40_TXF_WPTR_MASK 0x7ff0
+
+/* regISR 0x0100 */
+/* regIMR 0x0110 */
+#define TN40_IMR_INPROG 0x80000000 /*31 */
+#define TN40_IR_LNKCHG1 0x10000000 /*28 */
+#define TN40_IR_LNKCHG0 0x08000000 /*27 */
+#define TN40_IR_GPIO 0x04000000 /*26 */
+#define TN40_IR_RFRSH 0x02000000 /*25 */
+#define TN40_IR_RSVD 0x01000000 /*24 */
+#define TN40_IR_SWI 0x00800000 /*23 */
+#define TN40_IR_RX_FREE_3 0x00400000 /*22 */
+#define TN40_IR_RX_FREE_2 0x00200000 /*21 */
+#define TN40_IR_RX_FREE_1 0x00100000 /*20 */
+#define TN40_IR_RX_FREE_0 0x00080000 /*19 */
+#define TN40_IR_TX_FREE_3 0x00040000 /*18 */
+#define TN40_IR_TX_FREE_2 0x00020000 /*17 */
+#define TN40_IR_TX_FREE_1 0x00010000 /*16 */
+#define TN40_IR_TX_FREE_0 0x00008000 /*15 */
+#define TN40_IR_RX_DESC_3 0x00004000 /*14 */
+#define TN40_IR_RX_DESC_2 0x00002000 /*13 */
+#define TN40_IR_RX_DESC_1 0x00001000 /*12 */
+#define TN40_IR_RX_DESC_0 0x00000800 /*11 */
+#define TN40_IR_PSE 0x00000400 /*10 */
+#define TN40_IR_TMR3 0x00000200 /* 9 */
+#define TN40_IR_TMR2 0x00000100 /* 8 */
+#define TN40_IR_TMR1 0x00000080 /* 7 */
+#define TN40_IR_TMR0 0x00000040 /* 6 */
+#define TN40_IR_VNT 0x00000020 /* 5 */
+#define TN40_IR_RxFL 0x00000010 /* 4 */
+#define TN40_IR_SDPERR 0x00000008 /* 3 */
+#define TN40_IR_TR 0x00000004 /* 2 */
+#define TN40_IR_PCIE_LINK 0x00000002 /* 1 */
+#define TN40_IR_PCIE_TOUT 0x00000001 /* 0 */
+
+#define TN40_IR_EXTRA						\
+	(TN40_IR_RX_FREE_0 | TN40_IR_LNKCHG0 | TN40_IR_LNKCHG1 |\
+	TN40_IR_PSE | TN40_IR_TMR0 | TN40_IR_PCIE_LINK |	\
+	TN40_IR_PCIE_TOUT)
+
+#define TN40_GMAC_RX_FILTER_OSEN 0x1000 /* shared OS enable */
+#define TN40_GMAC_RX_FILTER_TXFC 0x0400 /* Tx flow control */
+#define TN40_GMAC_RX_FILTER_RSV0 0x0200 /* reserved */
+#define TN40_GMAC_RX_FILTER_FDA 0x0100 /* filter out direct address */
+#define TN40_GMAC_RX_FILTER_AOF 0x0080 /* accept over run */
+#define TN40_GMAC_RX_FILTER_ACF 0x0040 /* accept control frames */
+#define TN40_GMAC_RX_FILTER_ARUNT 0x0020 /* accept under run */
+#define TN40_GMAC_RX_FILTER_ACRC 0x0010 /* accept crc error */
+#define TN40_GMAC_RX_FILTER_AM 0x0008 /* accept multicast */
+#define TN40_GMAC_RX_FILTER_AB 0x0004 /* accept broadcast */
+#define TN40_GMAC_RX_FILTER_PRM 0x0001 /* [0:1] promiscuous mode */
+
+#define TN40_MAX_FRAME_AB_VAL 0x3fff /* 13:0 */
+
+#define TN40_CLKPLL_PLLLKD 0x0200 /* 9 */
+#define TN40_CLKPLL_RSTEND 0x0100 /* 8 */
+#define TN40_CLKPLL_SFTRST 0x0001 /* 0 */
+
+#define TN40_CLKPLL_LKD (TN40_CLKPLL_PLLLKD | TN40_CLKPLL_RSTEND)
+
+#endif
-- 
2.34.1



* [PATCH net-next v4 3/6] net: tn40xx: add basic Tx handling
  2024-05-01 23:05 [PATCH net-next v4 0/6] add ethernet driver for Tehuti Networks TN40xx chips FUJITA Tomonori
  2024-05-01 23:05 ` [PATCH net-next v4 1/6] net: tn40xx: add pci " FUJITA Tomonori
  2024-05-01 23:05 ` [PATCH net-next v4 2/6] net: tn40xx: add register defines FUJITA Tomonori
@ 2024-05-01 23:05 ` FUJITA Tomonori
  2024-05-07  1:51   ` Jakub Kicinski
  2024-05-01 23:05 ` [PATCH net-next v4 4/6] net: tn40xx: add basic Rx handling FUJITA Tomonori
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: FUJITA Tomonori @ 2024-05-01 23:05 UTC
  To: netdev; +Cc: andrew, kuba, jiri, horms

This patch adds device-specific structures to initialize the hardware
with basic Tx handling. The original driver loads firmware that is
embedded in a header file. This driver uses the kernel firmware APIs
instead.

The Tx logic uses three major data structures: two ring buffers shared
with the NIC and one database. One ring buffer is used to send the NIC
information about packets to be sent. The other is used to get
information from the NIC about packets that have been sent. The
database keeps the information about DMA mappings. After a packet is
sent, the db is used to free the resources used for the packet.
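
As a rough illustration of the database described above, the
following user-space sketch models it as a cyclic buffer in which one
slot always stays unused, so that a full db never looks like an empty
one. All names here (struct tx_map, struct txdb, db_push(),
db_reclaim_packet(), DB_SLOTS) are hypothetical simplifications, not
the driver's code: the address is a plain integer, and "unmapping" is
modelled by summing the recorded lengths.

```c
#include <stdbool.h>
#include <stddef.h>

#define DB_SLOTS 8	/* hypothetical size; one slot is always kept empty */

/* Simplified stand-in for one db entry. */
struct tx_map {
	unsigned long addr;	/* stands in for a DMA address or skb pointer */
	int len;		/* > 0: mapped bytes, <= 0: end-of-packet marker */
};

struct txdb {
	struct tx_map slot[DB_SLOTS];
	size_t rptr;		/* next entry to reclaim */
	size_t wptr;		/* next free entry */
};

static size_t db_next(size_t idx)
{
	return (idx + 1) % DB_SLOTS;
}

static bool db_empty(const struct txdb *db)
{
	return db->rptr == db->wptr;
}

/* Full means advancing wptr would make it equal to rptr. */
static bool db_full(const struct txdb *db)
{
	return db_next(db->wptr) == db->rptr;
}

/* Record one mapping (or the end-of-packet marker) at the write pointer. */
static bool db_push(struct txdb *db, unsigned long addr, int len)
{
	if (db_full(db))
		return false;
	db->slot[db->wptr].addr = addr;
	db->slot[db->wptr].len = len;
	db->wptr = db_next(db->wptr);
	return true;
}

/* Walk one packet's entries up to and including its marker.
 * Returns the number of bytes that would be unmapped, or -1 if empty.
 */
static int db_reclaim_packet(struct txdb *db)
{
	int bytes = 0;

	if (db_empty(db))
		return -1;
	while (!db_empty(db) && db->slot[db->rptr].len > 0) {
		bytes += db->slot[db->rptr].len;
		db->rptr = db_next(db->rptr);
	}
	if (!db_empty(db))
		db->rptr = db_next(db->rptr);	/* consume the marker */
	return bytes;
}
```

In this sketch a sender would call db_push() once per mapped buffer
and once more with a non-positive length to close the packet; the
completion path then calls db_reclaim_packet() once per acknowledged
packet.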

Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
---
 drivers/net/ethernet/tehuti/Kconfig |    1 +
 drivers/net/ethernet/tehuti/tn40.c  | 1258 +++++++++++++++++++++++++++
 drivers/net/ethernet/tehuti/tn40.h  |  167 ++++
 3 files changed, 1426 insertions(+)

diff --git a/drivers/net/ethernet/tehuti/Kconfig b/drivers/net/ethernet/tehuti/Kconfig
index 849e3b4a71c1..4198fd59e42e 100644
--- a/drivers/net/ethernet/tehuti/Kconfig
+++ b/drivers/net/ethernet/tehuti/Kconfig
@@ -26,6 +26,7 @@ config TEHUTI
 config TEHUTI_TN40
 	tristate "Tehuti Networks TN40xx 10G Ethernet adapters"
 	depends on PCI
+	select FW_LOADER
 	help
 	  This driver supports 10G Ethernet adapters using Tehuti Networks
 	  TN40xx chips. Currently, adapters with Applied Micro Circuits
diff --git a/drivers/net/ethernet/tehuti/tn40.c b/drivers/net/ethernet/tehuti/tn40.c
index a7c4fad249c9..5fa5c3c12f5c 100644
--- a/drivers/net/ethernet/tehuti/tn40.c
+++ b/drivers/net/ethernet/tehuti/tn40.c
@@ -1,14 +1,1184 @@
 // SPDX-License-Identifier: GPL-2.0+
 /* Copyright (c) Tehuti Networks Ltd. */
 
+#include <linux/bitfield.h>
+#include <linux/ethtool.h>
+#include <linux/firmware.h>
+#include <linux/if_vlan.h>
+#include <linux/netdevice.h>
 #include <linux/pci.h>
 
 #include "tn40.h"
 
+#define TN40_SHORT_PACKET_SIZE 60
+#define TN40_FIRMWARE_NAME "tn40xx-14.fw"
+
+static void tn40_enable_interrupts(struct tn40_priv *priv)
+{
+	tn40_write_reg(priv, TN40_REG_IMR, priv->isr_mask);
+}
+
+static void tn40_disable_interrupts(struct tn40_priv *priv)
+{
+	tn40_write_reg(priv, TN40_REG_IMR, 0);
+}
+
+static int tn40_fifo_alloc(struct tn40_priv *priv, struct tn40_fifo *f,
+			   int fsz_type,
+			   u16 reg_cfg0, u16 reg_cfg1,
+			   u16 reg_rptr, u16 reg_wptr)
+{
+	u16 memsz = TN40_FIFO_SIZE * (1 << fsz_type);
+	u64 cfg_base;
+
+	memset(f, 0, sizeof(struct tn40_fifo));
+	/* 1K extra space is allocated at the end of the fifo to simplify
+	 * processing of descriptors that wrap around the fifo's end.
+	 */
+	f->va = dma_alloc_coherent(&priv->pdev->dev,
+				   memsz + TN40_FIFO_EXTRA_SPACE, &f->da,
+				   GFP_KERNEL);
+	if (!f->va)
+		return -ENOMEM;
+
+	f->reg_cfg0 = reg_cfg0;
+	f->reg_cfg1 = reg_cfg1;
+	f->reg_rptr = reg_rptr;
+	f->reg_wptr = reg_wptr;
+	f->rptr = 0;
+	f->wptr = 0;
+	f->memsz = memsz;
+	f->size_mask = memsz - 1;
+	cfg_base = lower_32_bits((f->da & TN40_TX_RX_CFG0_BASE) | fsz_type);
+	tn40_write_reg(priv, reg_cfg0, cfg_base);
+	tn40_write_reg(priv, reg_cfg1, upper_32_bits(f->da));
+	return 0;
+}
+
+static void tn40_fifo_free(struct tn40_priv *priv, struct tn40_fifo *f)
+{
+	dma_free_coherent(&priv->pdev->dev,
+			  f->memsz + TN40_FIFO_EXTRA_SPACE, f->va, f->da);
+}
+
+/* TX HW/SW interaction overview
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ * There are 2 types of TX communication channels between driver and NIC.
+ * 1) TX Free Fifo - TXF - Holds ack descriptors for sent packets.
+ * 2) TX Data Fifo - TXD - Holds descriptors of full buffers.
+ *
+ * Currently the NIC supports TSO, checksumming and gather DMA.
+ * UFO and IP fragmentation are on the way.
+ *
+ * TX SW Data Structures
+ * ~~~~~~~~~~~~~~~~~~~~~
+ * TXDB is used to keep track of all skbs owned by SW and their DMA addresses.
+ * For TX case, ownership lasts from getting the packet via hard_xmit and
+ * until the HW acknowledges sending the packet by TXF descriptors.
+ * TXDB is implemented as a cyclic buffer.
+ *
+ * FIFO objects keep info about the fifo's size and location, relevant HW
+ * registers, usage and skb db. Each TXD and TXF fifo has its own fifo
+ * structure. Implemented as simple struct.
+ *
+ * TX SW Execution Flow
+ * ~~~~~~~~~~~~~~~~~~~~
+ * OS calls the driver's hard_xmit method with a packet to send. The driver
+ * creates DMA mappings, builds TXD descriptors and kicks the HW by updating
+ * TXD WPTR.
+ *
+ * When a packet is sent, the HW writes a TXF descriptor and the SW
+ * frees the original skb. To prevent TXD fifo overflow without
+ * reading HW registers every time, the SW deploys a "tx level"
+ * technique. Upon startup, the tx level is initialized to the TXD fifo
+ * length. For every sent packet, the SW gets its TXD descriptor size
+ * (from a pre-calculated array) and subtracts it from the tx level. The
+ * size is also stored in the txdb. When a TXF ack arrives, the SW fetches
+ * the size of the original TXD descriptor from the txdb and adds it
+ * to the tx level. When the tx level drops below some predefined
+ * threshold, the driver stops the TX queue. When the tx level rises
+ * above that level, the TX queue is enabled again.
+ *
+ * This technique avoids excessive reading of RPTR and WPTR registers.
+ * As our benchmarks show, it adds 1.5 Gbit/sec to the NIC's throughput.
+ */
+static void tn40_do_tx_db_ptr_next(struct tn40_txdb *db,
+				   struct tn40_tx_map **pptr)
+{
+	++*pptr;
+	if (unlikely(*pptr == db->end))
+		*pptr = db->start;
+}
+
+static void tn40_tx_db_inc_rptr(struct tn40_txdb *db)
+{
+	tn40_do_tx_db_ptr_next(db, &db->rptr);
+}
+
+static void tn40_tx_db_inc_wptr(struct tn40_txdb *db)
+{
+	tn40_do_tx_db_ptr_next(db, &db->wptr);
+}
+
+static int tn40_tx_db_init(struct tn40_txdb *d, int sz_type)
+{
+	int memsz = TN40_FIFO_SIZE * (1 << (sz_type + 1));
+
+	d->start = vzalloc(memsz);
+	if (!d->start)
+		return -ENOMEM;
+	/* In order to differentiate between an empty db state and a full db
+	 * state at least one element should always be empty in order to
+	 * avoid rptr == wptr, which means that the db is empty.
+	 */
+	d->size = memsz / sizeof(struct tn40_tx_map) - 1;
+	d->end = d->start + d->size + 1;	/* just after last element */
+
+	/* All dbs are created empty */
+	d->rptr = d->start;
+	d->wptr = d->start;
+	return 0;
+}
+
+static void tn40_tx_db_close(struct tn40_txdb *d)
+{
+	if (d->start) {
+		vfree(d->start);
+		d->start = NULL;
+	}
+}
+
+/* Sizes of tx desc (including padding if needed) as function of the SKB's
+ * frag number
+ */
+static struct {
+	u16 bytes;
+	u16 qwords;		/* qword = 64 bit */
+} tn40_txd_sizes[TN40_MAX_PBL];
+
+static void tn40_pbl_set(struct tn40_pbl *pbl, dma_addr_t dma, int len)
+{
+	pbl->len = cpu_to_le32(len);
+	pbl->pa_lo = cpu_to_le32(lower_32_bits(dma));
+	pbl->pa_hi = cpu_to_le32(upper_32_bits(dma));
+}
+
+static void tn40_txdb_set(struct tn40_txdb *db, dma_addr_t dma, int len)
+{
+	db->wptr->len = len;
+	db->wptr->addr.dma = dma;
+}
+
+struct tn40_mapping_info {
+	dma_addr_t dma;
+	size_t size;
+};
+
+/**
+ * tn40_tx_map_skb - create and store DMA mappings for skb's data blocks
+ * @priv: NIC private structure
+ * @skb: socket buffer to map
+ * @txdd: pointer to tx descriptor to be updated
+ * @pkt_len: pointer to unsigned int that receives the mapped packet length
+ *
+ * This function creates DMA mappings for skb's data blocks and writes them to
+ * PBL of a new tx descriptor. It also stores them in the tx db, so they could
+ * be unmapped after the data has been sent. It is the responsibility of the
+ * caller to make sure that there is enough space in the txdb. The last
+ * element holds a pointer to skb itself and is marked with a zero length.
+ *
+ * Return: 0 on success and negative value on error.
+ */
+static int tn40_tx_map_skb(struct tn40_priv *priv, struct sk_buff *skb,
+			   struct tn40_txd_desc *txdd, unsigned int *pkt_len)
+{
+	struct tn40_mapping_info info[TN40_MAX_PBL];
+	int nr_frags = skb_shinfo(skb)->nr_frags;
+	struct tn40_pbl *pbl = &txdd->pbl[0];
+	struct tn40_txdb *db = &priv->txdb;
+	unsigned int size;
+	int i, len, ret;
+	dma_addr_t dma;
+
+	netdev_dbg(priv->ndev, "TX skb %p skbLen %d dataLen %d frags %d\n", skb,
+		   skb->len, skb->data_len, nr_frags);
+	if (nr_frags > TN40_MAX_PBL - 1) {
+		ret = skb_linearize(skb);
+		if (ret)
+			return ret;
+		nr_frags = skb_shinfo(skb)->nr_frags;
+	}
+	/* initial skb */
+	len = skb->len - skb->data_len;
+	dma = dma_map_single(&priv->pdev->dev, skb->data, len,
+			     DMA_TO_DEVICE);
+	ret = dma_mapping_error(&priv->pdev->dev, dma);
+	if (ret)
+		return ret;
+
+	tn40_txdb_set(db, dma, len);
+	tn40_pbl_set(pbl++, db->wptr->addr.dma, db->wptr->len);
+	*pkt_len = db->wptr->len;
+
+	for (i = 0; i < nr_frags; i++) {
+		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+
+		size = skb_frag_size(frag);
+		dma = skb_frag_dma_map(&priv->pdev->dev, frag, 0,
+				       size, DMA_TO_DEVICE);
+
+		ret = dma_mapping_error(&priv->pdev->dev, dma);
+		if (ret)
+			goto mapping_error;
+		info[i].dma = dma;
+		info[i].size = size;
+	}
+
+	for (i = 0; i < nr_frags; i++) {
+		tn40_tx_db_inc_wptr(db);
+		tn40_txdb_set(db, info[i].dma, info[i].size);
+		tn40_pbl_set(pbl++, db->wptr->addr.dma, db->wptr->len);
+		*pkt_len += db->wptr->len;
+	}
+
+	/* SHORT_PKT_FIX */
+	if (skb->len < TN40_SHORT_PACKET_SIZE)
+		++nr_frags;
+
+	/* Add skb clean up info. */
+	tn40_tx_db_inc_wptr(db);
+	db->wptr->len = -tn40_txd_sizes[nr_frags].bytes;
+	db->wptr->addr.skb = skb;
+	tn40_tx_db_inc_wptr(db);
+
+	return 0;
+ mapping_error:
+	dma_unmap_single(&priv->pdev->dev, db->wptr->addr.dma, db->wptr->len,
+			 DMA_TO_DEVICE);
+	for (; i > 0; i--)
+		dma_unmap_page(&priv->pdev->dev, info[i - 1].dma,
+			       info[i - 1].size, DMA_TO_DEVICE);
+	return -ENOMEM;
+}
+
+static void tn40_init_txd_sizes(void)
+{
+	int i, lwords;
+
+	if (tn40_txd_sizes[0].bytes)
+		return;
+
+	/* 7 - is number of lwords in txd with one phys buffer
+	 * 3 - is number of lwords used for every additional phys buffer
+	 */
+	for (i = 0; i < TN40_MAX_PBL; i++) {
+		lwords = 7 + (i * 3);
+		if (lwords & 1)
+			lwords++;	/* pad it with 1 lword */
+		tn40_txd_sizes[i].qwords = lwords >> 1;
+		tn40_txd_sizes[i].bytes = lwords << 2;
+	}
+}
+
+static int tn40_create_tx_ring(struct tn40_priv *priv)
+{
+	int ret;
+
+	ret = tn40_fifo_alloc(priv, &priv->txd_fifo0.m, priv->txd_size,
+			      TN40_REG_TXD_CFG0_0, TN40_REG_TXD_CFG1_0,
+			      TN40_REG_TXD_RPTR_0, TN40_REG_TXD_WPTR_0);
+	if (ret)
+		return ret;
+
+	ret = tn40_fifo_alloc(priv, &priv->txf_fifo0.m, priv->txf_size,
+			      TN40_REG_TXF_CFG0_0, TN40_REG_TXF_CFG1_0,
+			      TN40_REG_TXF_RPTR_0, TN40_REG_TXF_WPTR_0);
+	if (ret)
+		goto err_free_txd;
+
+	/* The TX db has to keep mappings for all packets sent (on
+	 * TxD) and not yet reclaimed (on TxF).
+	 */
+	ret = tn40_tx_db_init(&priv->txdb, max(priv->txd_size, priv->txf_size));
+	if (ret)
+		goto err_free_txf;
+
+	/* SHORT_PKT_FIX */
+	priv->b0_len = 64;
+	priv->b0_va = dma_alloc_coherent(&priv->pdev->dev, priv->b0_len,
+					 &priv->b0_dma, GFP_KERNEL);
+	if (!priv->b0_va)
+		goto err_free_db;
+
+	priv->tx_level = TN40_MAX_TX_LEVEL;
+	priv->tx_update_mark = priv->tx_level - 1024;
+	return 0;
+err_free_db:
+	tn40_tx_db_close(&priv->txdb);
+err_free_txf:
+	tn40_fifo_free(priv, &priv->txf_fifo0.m);
+err_free_txd:
+	tn40_fifo_free(priv, &priv->txd_fifo0.m);
+	return -ENOMEM;
+}
+
+/**
+ * tn40_tx_space - Calculate the available space in the TX fifo.
+ * @priv: NIC private structure
+ *
+ * Return: available space in TX fifo in bytes
+ */
+static int tn40_tx_space(struct tn40_priv *priv)
+{
+	struct tn40_txd_fifo *f = &priv->txd_fifo0;
+	int fsize;
+
+	f->m.rptr = tn40_read_reg(priv, f->m.reg_rptr) & TN40_TXF_WPTR_WR_PTR;
+	fsize = f->m.rptr - f->m.wptr;
+	if (fsize <= 0)
+		fsize = f->m.memsz + fsize;
+	return fsize;
+}
+
+#define TN40_TXD_FULL_CHECKSUM 7
+
+static int tn40_start_xmit(struct sk_buff *skb, struct net_device *ndev)
+{
+	struct tn40_priv *priv = netdev_priv(ndev);
+	struct tn40_txd_fifo *f = &priv->txd_fifo0;
+	int txd_checksum = TN40_TXD_FULL_CHECKSUM;
+	struct tn40_txd_desc *txdd;
+	int nr_frags, len, err;
+	unsigned int pkt_len;
+	int txd_vlan_id = 0;
+	int txd_lgsnd = 0;
+	int txd_vtag = 0;
+	int txd_mss = 0;
+
+	/* Build tx descriptor */
+	txdd = (struct tn40_txd_desc *)(f->m.va + f->m.wptr);
+	err = tn40_tx_map_skb(priv, skb, txdd, &pkt_len);
+	if (err) {
+		dev_kfree_skb(skb);
+		return NETDEV_TX_OK;
+	}
+	nr_frags = skb_shinfo(skb)->nr_frags;
+	if (unlikely(skb->ip_summed != CHECKSUM_PARTIAL))
+		txd_checksum = 0;
+
+	if (skb_shinfo(skb)->gso_size) {
+		txd_mss = skb_shinfo(skb)->gso_size;
+		txd_lgsnd = 1;
+		netdev_dbg(priv->ndev, "skb %p pkt len %d gso size = %d\n", skb,
+			   pkt_len, txd_mss);
+	}
+	if (skb_vlan_tag_present(skb)) {
+		/* Don't cut VLAN ID to 12 bits */
+		txd_vlan_id = skb_vlan_tag_get(skb);
+		txd_vtag = 1;
+	}
+	txdd->va_hi = 0;
+	txdd->va_lo = 0;
+	txdd->length = cpu_to_le16(pkt_len);
+	txdd->mss = cpu_to_le16(txd_mss);
+	txdd->txd_val1 =
+		cpu_to_le32(TN40_TXD_W1_VAL
+			    (tn40_txd_sizes[nr_frags].qwords, txd_checksum,
+			     txd_vtag, txd_lgsnd, txd_vlan_id));
+	netdev_dbg(priv->ndev, "=== w1 qwords[%d] %d =====\n", nr_frags,
+		   tn40_txd_sizes[nr_frags].qwords);
+	netdev_dbg(priv->ndev, "=== TxD desc =====================\n");
+	netdev_dbg(priv->ndev, "=== w1: 0x%x ================\n",
+		   txdd->txd_val1);
+	netdev_dbg(priv->ndev, "=== w2: mss 0x%x len 0x%x\n", txdd->mss,
+		   txdd->length);
+	/* SHORT_PKT_FIX */
+	if (pkt_len < TN40_SHORT_PACKET_SIZE) {
+		struct tn40_pbl *pbl = &txdd->pbl[++nr_frags];
+
+		txdd->length = cpu_to_le16(TN40_SHORT_PACKET_SIZE);
+		txdd->txd_val1 =
+			cpu_to_le32(TN40_TXD_W1_VAL
+				    (tn40_txd_sizes[nr_frags].qwords,
+				     txd_checksum, txd_vtag, txd_lgsnd,
+				     txd_vlan_id));
+		pbl->len = cpu_to_le32(TN40_SHORT_PACKET_SIZE - pkt_len);
+		pbl->pa_lo = cpu_to_le32(lower_32_bits(priv->b0_dma));
+		pbl->pa_hi = cpu_to_le32(upper_32_bits(priv->b0_dma));
+		netdev_dbg(priv->ndev, "=== SHORT_PKT_FIX   ==============\n");
+		netdev_dbg(priv->ndev, "=== nr_frags : %d   ==============\n",
+			   nr_frags);
+	}
+
+	/* Increment TXD write pointer. In case of fifo wrapping copy
+	 * remainder of the descriptor to the beginning.
+	 */
+	f->m.wptr += tn40_txd_sizes[nr_frags].bytes;
+	len = f->m.wptr - f->m.memsz;
+	if (unlikely(len >= 0)) {
+		f->m.wptr = len;
+		if (len > 0)
+			memcpy(f->m.va, f->m.va + f->m.memsz, len);
+	}
+	/* Force memory writes to complete before letting the HW know
+	 * there are new descriptors to fetch.
+	 */
+	wmb();
+
+	priv->tx_level -= tn40_txd_sizes[nr_frags].bytes;
+	if (priv->tx_level > priv->tx_update_mark) {
+		tn40_write_reg(priv, f->m.reg_wptr,
+			       f->m.wptr & TN40_TXF_WPTR_WR_PTR);
+	} else {
+		if (priv->tx_noupd++ > TN40_NO_UPD_PACKETS) {
+			priv->tx_noupd = 0;
+			tn40_write_reg(priv, f->m.reg_wptr,
+				       f->m.wptr & TN40_TXF_WPTR_WR_PTR);
+		}
+	}
+
+	netif_trans_update(ndev);
+	priv->net_stats.tx_packets++;
+	priv->net_stats.tx_bytes += pkt_len;
+	if (priv->tx_level < TN40_MIN_TX_LEVEL) {
+		netdev_dbg(priv->ndev, "TX Q STOP level %d\n", priv->tx_level);
+		netif_stop_queue(ndev);
+	}
+
+	return NETDEV_TX_OK;
+}
+
+static void tn40_tx_free_skbs(struct tn40_priv *priv)
+{
+	struct tn40_txdb *db = &priv->txdb;
+
+	while (db->rptr != db->wptr) {
+		if (likely(db->rptr->len))
+			dma_unmap_page(&priv->pdev->dev, db->rptr->addr.dma,
+				       db->rptr->len, DMA_TO_DEVICE);
+		else
+			dev_kfree_skb(db->rptr->addr.skb);
+		tn40_tx_db_inc_rptr(db);
+	}
+}
+
+static void tn40_destroy_tx_ring(struct tn40_priv *priv)
+{
+	tn40_tx_free_skbs(priv);
+	tn40_fifo_free(priv, &priv->txd_fifo0.m);
+	tn40_fifo_free(priv, &priv->txf_fifo0.m);
+	tn40_tx_db_close(&priv->txdb);
+	/* SHORT_PKT_FIX */
+	if (priv->b0_len) {
+		dma_free_coherent(&priv->pdev->dev, priv->b0_len, priv->b0_va,
+				  priv->b0_dma);
+		priv->b0_len = 0;
+	}
+}
+
+/**
+ * tn40_tx_push_desc - Push a descriptor to TxD fifo.
+ *
+ * @priv: NIC private structure
+ * @data: desc's data
+ * @size: desc's size
+ *
+ * This function pushes desc to TxD fifo and wraps around if needed.
+ *
+ * This function does not check for available space, nor does it check
+ * that the data size is smaller than the fifo size. Checking for
+ * space is the responsibility of the caller.
+ */
+static void tn40_tx_push_desc(struct tn40_priv *priv, void *data, int size)
+{
+	struct tn40_txd_fifo *f = &priv->txd_fifo0;
+	int i = f->m.memsz - f->m.wptr;
+
+	if (size == 0)
+		return;
+
+	if (i > size) {
+		memcpy(f->m.va + f->m.wptr, data, size);
+		f->m.wptr += size;
+	} else {
+		memcpy(f->m.va + f->m.wptr, data, i);
+		f->m.wptr = size - i;
+		memcpy(f->m.va, data + i, f->m.wptr);
+	}
+	tn40_write_reg(priv, f->m.reg_wptr, f->m.wptr & TN40_TXF_WPTR_WR_PTR);
+}
+
+/**
+ * tn40_tx_push_desc_safe - push descriptor to TxD fifo in a safe way.
+ *
+ * @priv: NIC private structure
+ * @data: descriptor data
+ * @size: descriptor size
+ *
+ * This function does check for available space and, if necessary,
+ * waits for the NIC to read existing data before writing new data.
+ */
+static void tn40_tx_push_desc_safe(struct tn40_priv *priv, void *data, int size)
+{
+	int timer = 0;
+
+	while (size > 0) {
+		/* We subtract 8 because a completely full fifo has
+		 * rptr == wptr, which is also the empty condition. The
+		 * driver can tell the two cases apart, but it is
+		 * unclear whether the HW can.
+		 */
+		int avail = tn40_tx_space(priv) - 8;
+
+		if (avail <= 0) {
+			if (timer++ > 300) /* Prevent endless loop */
+				break;
+
+			udelay(50); /* Give the HW a chance to clean the fifo */
+			continue;
+		}
+		avail = min(avail, size);
+		netdev_dbg(priv->ndev,
+			   "about to push  %d bytes starting %p size %d\n",
+			   avail, data, size);
+		tn40_tx_push_desc(priv, data, avail);
+		size -= avail;
+		data += avail;
+	}
+}
+
+static int tn40_set_link_speed(struct tn40_priv *priv, u32 speed)
+{
+	u32 val;
+	int i;
+
+	netdev_dbg(priv->ndev, "speed %d\n", speed);
+	switch (speed) {
+	case SPEED_10000:
+	case SPEED_5000:
+	case SPEED_2500:
+		netdev_dbg(priv->ndev, "link_speed %d\n", speed);
+
+		tn40_write_reg(priv, 0x1010, 0x217);	/*ETHSD.REFCLK_CONF  */
+		tn40_write_reg(priv, 0x104c, 0x4c);	/*ETHSD.L0_RX_PCNT  */
+		tn40_write_reg(priv, 0x1050, 0x4c);	/*ETHSD.L1_RX_PCNT  */
+		tn40_write_reg(priv, 0x1054, 0x4c);	/*ETHSD.L2_RX_PCNT  */
+		tn40_write_reg(priv, 0x1058, 0x4c);	/*ETHSD.L3_RX_PCNT  */
+		tn40_write_reg(priv, 0x102c, 0x434);	/*ETHSD.L0_TX_PCNT  */
+		tn40_write_reg(priv, 0x1030, 0x434);	/*ETHSD.L1_TX_PCNT  */
+		tn40_write_reg(priv, 0x1034, 0x434);	/*ETHSD.L2_TX_PCNT  */
+		tn40_write_reg(priv, 0x1038, 0x434);	/*ETHSD.L3_TX_PCNT  */
+		tn40_write_reg(priv, 0x6300, 0x0400);	/*MAC.PCS_CTRL */
+
+		tn40_write_reg(priv, 0x1018, 0x00);	/*Mike2 */
+		udelay(5);
+		tn40_write_reg(priv, 0x1018, 0x04);	/*Mike2 */
+		udelay(5);
+		tn40_write_reg(priv, 0x1018, 0x06);	/*Mike2 */
+		udelay(5);
+		/*MikeFix1 */
+		/*L0: 0x103c , L1: 0x1040 , L2: 0x1044 , L3: 0x1048 =0x81644 */
+		tn40_write_reg(priv, 0x103c, 0x81644);	/*ETHSD.L0_TX_DCNT  */
+		tn40_write_reg(priv, 0x1040, 0x81644);	/*ETHSD.L1_TX_DCNT  */
+		tn40_write_reg(priv, 0x1044, 0x81644);	/*ETHSD.L2_TX_DCNT  */
+		tn40_write_reg(priv, 0x1048, 0x81644);	/*ETHSD.L3_TX_DCNT  */
+		tn40_write_reg(priv, 0x1014, 0x043);	/*ETHSD.INIT_STAT */
+		for (i = 1000; i; i--) {
+			udelay(50);
+			/*ETHSD.INIT_STAT */
+			val = tn40_read_reg(priv, 0x1014);
+			if (val & (1 << 9)) {
+				/*ETHSD.INIT_STAT */
+				tn40_write_reg(priv, 0x1014, 0x3);
+				/*ETHSD.INIT_STAT */
+				val = tn40_read_reg(priv, 0x1014);
+
+				break;
+			}
+		}
+		if (!i)
+			netdev_err(priv->ndev, "MAC init timeout!\n");
+
+		tn40_write_reg(priv, 0x6350, 0x0);	/*MAC.PCS_IF_MODE */
+		tn40_write_reg(priv, TN40_REG_CTRLST, 0xC13);	/*0x93//0x13 */
+		tn40_write_reg(priv, 0x111c, 0x7ff);	/*MAC.MAC_RST_CNT */
+		for (i = 40; i--;)
+			udelay(50);
+
+		tn40_write_reg(priv, 0x111c, 0x0);	/*MAC.MAC_RST_CNT */
+		break;
+
+	case SPEED_1000:
+	case SPEED_100:
+		tn40_write_reg(priv, 0x1010, 0x613);	/*ETHSD.REFCLK_CONF */
+		tn40_write_reg(priv, 0x104c, 0x4d);	/*ETHSD.L0_RX_PCNT  */
+		tn40_write_reg(priv, 0x1050, 0x0);	/*ETHSD.L1_RX_PCNT  */
+		tn40_write_reg(priv, 0x1054, 0x0);	/*ETHSD.L2_RX_PCNT  */
+		tn40_write_reg(priv, 0x1058, 0x0);	/*ETHSD.L3_RX_PCNT  */
+		tn40_write_reg(priv, 0x102c, 0x35);	/*ETHSD.L0_TX_PCNT  */
+		tn40_write_reg(priv, 0x1030, 0x0);	/*ETHSD.L1_TX_PCNT  */
+		tn40_write_reg(priv, 0x1034, 0x0);	/*ETHSD.L2_TX_PCNT  */
+		tn40_write_reg(priv, 0x1038, 0x0);	/*ETHSD.L3_TX_PCNT  */
+		tn40_write_reg(priv, 0x6300, 0x01140);	/*MAC.PCS_CTRL */
+
+		tn40_write_reg(priv, 0x1014, 0x043);	/*ETHSD.INIT_STAT */
+		for (i = 1000; i; i--) {
+			udelay(50);
+			val = tn40_read_reg(priv, 0x1014); /*ETHSD.INIT_STAT */
+			if (val & (1 << 9)) {
+				/*ETHSD.INIT_STAT */
+				tn40_write_reg(priv, 0x1014, 0x3);
+				/*ETHSD.INIT_STAT */
+				val = tn40_read_reg(priv, 0x1014);
+
+				break;
+			}
+		}
+		if (!i)
+			netdev_err(priv->ndev, "MAC init timeout!\n");
+
+		tn40_write_reg(priv, 0x6350, 0x2b);	/*MAC.PCS_IF_MODE 1g */
+		tn40_write_reg(priv, 0x6310, 0x9801);	/*MAC.PCS_DEV_AB */
+
+		tn40_write_reg(priv, 0x6314, 0x1);	/*MAC.PCS_PART_AB */
+		tn40_write_reg(priv, 0x6348, 0xc8);	/*MAC.PCS_LINK_LO */
+		tn40_write_reg(priv, 0x634c, 0xc8);	/*MAC.PCS_LINK_HI */
+		udelay(50);
+		tn40_write_reg(priv, TN40_REG_CTRLST, 0xC13);	/*0x93//0x13 */
+		tn40_write_reg(priv, 0x111c, 0x7ff);	/*MAC.MAC_RST_CNT */
+		for (i = 40; i--;)
+			udelay(50);
+
+		tn40_write_reg(priv, 0x111c, 0x0);	/*MAC.MAC_RST_CNT */
+		tn40_write_reg(priv, 0x6300, 0x1140);	/*MAC.PCS_CTRL */
+		break;
+
+	case 0:		/* Link down */
+		tn40_write_reg(priv, 0x104c, 0x0);	/*ETHSD.L0_RX_PCNT  */
+		tn40_write_reg(priv, 0x1050, 0x0);	/*ETHSD.L1_RX_PCNT  */
+		tn40_write_reg(priv, 0x1054, 0x0);	/*ETHSD.L2_RX_PCNT  */
+		tn40_write_reg(priv, 0x1058, 0x0);	/*ETHSD.L3_RX_PCNT  */
+		tn40_write_reg(priv, 0x102c, 0x0);	/*ETHSD.L0_TX_PCNT  */
+		tn40_write_reg(priv, 0x1030, 0x0);	/*ETHSD.L1_TX_PCNT  */
+		tn40_write_reg(priv, 0x1034, 0x0);	/*ETHSD.L2_TX_PCNT  */
+		tn40_write_reg(priv, 0x1038, 0x0);	/*ETHSD.L3_TX_PCNT  */
+
+		tn40_write_reg(priv, TN40_REG_CTRLST, 0x800);
+		tn40_write_reg(priv, 0x111c, 0x7ff);	/*MAC.MAC_RST_CNT */
+		for (i = 40; i--;)
+			udelay(50);
+		tn40_write_reg(priv, 0x111c, 0x0);	/*MAC.MAC_RST_CNT */
+		break;
+
+	default:
+		netdev_err(priv->ndev,
+			   "Link speed was not identified yet (%d)\n", speed);
+		speed = 0;
+		break;
+	}
+	return speed;
+}
+
+#define TN40_LINK_LOOP_MAX 10
+
+static void tn40_link_changed(struct tn40_priv *priv)
+{
+	u32 link = tn40_read_reg(priv,
+				 TN40_REG_MAC_LNK_STAT) & TN40_MAC_LINK_STAT;
+	if (!link) {
+		if (netif_carrier_ok(priv->ndev) && priv->link)
+			netif_stop_queue(priv->ndev);
+
+		priv->link = 0;
+		if (priv->link_loop_cnt++ > TN40_LINK_LOOP_MAX) {
+			/* MAC reset */
+			tn40_set_link_speed(priv, 0);
+			priv->link_loop_cnt = 0;
+		}
+		tn40_write_reg(priv, 0x5150, 1000000);
+		return;
+	}
+	if (!netif_carrier_ok(priv->ndev) && link)
+		netif_wake_queue(priv->ndev);
+
+	priv->link = link;
+}
+
+static void tn40_isr_extra(struct tn40_priv *priv, u32 isr)
+{
+	if (isr & (TN40_IR_LNKCHG0 | TN40_IR_LNKCHG1 | TN40_IR_TMR0)) {
+		netdev_dbg(priv->ndev, "isr = 0x%x\n", isr);
+		tn40_link_changed(priv);
+	}
+}
+
+static irqreturn_t tn40_isr_napi(int irq, void *dev)
+{
+	struct tn40_priv *priv = netdev_priv((struct net_device *)dev);
+	u32 isr;
+
+	isr = tn40_read_reg(priv, TN40_REG_ISR_MSK0);
+
+	if (unlikely(!isr)) {
+		tn40_enable_interrupts(priv);
+		return IRQ_NONE;	/* Not our interrupt */
+	}
+
+	if (isr & TN40_IR_EXTRA)
+		tn40_isr_extra(priv, isr);
+
+	if (isr & (TN40_IR_RX_DESC_0 | TN40_IR_TX_FREE_0 | TN40_IR_TMR1)) {
+		/* We get here if an interrupt has slipped into the
+		 * small time window between these lines in
+		 * tn40_poll: tn40_enable_interrupts(priv); return 0;
+		 *
+		 * Currently interrupts are disabled (since we read
+		 * the ISR register) and we have failed to register
+		 * the next poll. So we read the regs to trigger the
+		 * chip and allow further interrupts.
+		 */
+		tn40_read_reg(priv, TN40_REG_TXF_WPTR_0);
+		tn40_read_reg(priv, TN40_REG_RXD_WPTR_0);
+	}
+
+	tn40_enable_interrupts(priv);
+	return IRQ_HANDLED;
+}
+
+static int tn40_fw_load(struct tn40_priv *priv)
+{
+	const struct firmware *fw = NULL;
+	int master, i, ret;
+
+	ret = request_firmware(&fw, TN40_FIRMWARE_NAME, &priv->pdev->dev);
+	if (ret)
+		return ret;
+
+	master = tn40_read_reg(priv, TN40_REG_INIT_SEMAPHORE);
+	if (!tn40_read_reg(priv, TN40_REG_INIT_STATUS) && master) {
+		netdev_dbg(priv->ndev, "Loading FW...\n");
+		tn40_tx_push_desc_safe(priv, (void *)fw->data, fw->size);
+		mdelay(100);
+	}
+	for (i = 0; i < 200; i++) {
+		if (tn40_read_reg(priv, TN40_REG_INIT_STATUS))
+			break;
+		mdelay(2);
+	}
+	if (master)
+		tn40_write_reg(priv, TN40_REG_INIT_SEMAPHORE, 1);
+
+	if (i == 200) {
+		netdev_err(priv->ndev, "firmware loading failed\n");
+		netdev_dbg(priv->ndev, "VPC: 0x%x VIC: 0x%x STATUS: 0x%x\n",
+			   tn40_read_reg(priv, TN40_REG_VPC),
+			   tn40_read_reg(priv, TN40_REG_VIC),
+			   tn40_read_reg(priv, TN40_REG_INIT_STATUS));
+		ret = -EIO;
+	} else {
+		netdev_dbg(priv->ndev, "firmware loading success\n");
+	}
+	release_firmware(fw);
+	return ret;
+}
+
+static void tn40_restore_mac(struct net_device *ndev, struct tn40_priv *priv)
+{
+	u32 val;
+
+	netdev_dbg(priv->ndev, "mac0 =%x mac1 =%x mac2 =%x\n",
+		   tn40_read_reg(priv, TN40_REG_UNC_MAC0_A),
+		   tn40_read_reg(priv, TN40_REG_UNC_MAC1_A),
+		   tn40_read_reg(priv, TN40_REG_UNC_MAC2_A));
+
+	val = (ndev->dev_addr[0] << 8) | (ndev->dev_addr[1]);
+	tn40_write_reg(priv, TN40_REG_UNC_MAC2_A, val);
+	val = (ndev->dev_addr[2] << 8) | (ndev->dev_addr[3]);
+	tn40_write_reg(priv, TN40_REG_UNC_MAC1_A, val);
+	val = (ndev->dev_addr[4] << 8) | (ndev->dev_addr[5]);
+	tn40_write_reg(priv, TN40_REG_UNC_MAC0_A, val);
+
+	/* More than IP MAC address */
+	tn40_write_reg(priv, TN40_REG_MAC_ADDR_0,
+		       (ndev->dev_addr[3] << 24) | (ndev->dev_addr[2] << 16) |
+		       (ndev->dev_addr[1] << 8) | (ndev->dev_addr[0]));
+	tn40_write_reg(priv, TN40_REG_MAC_ADDR_1,
+		       (ndev->dev_addr[5] << 8) | (ndev->dev_addr[4]));
+
+	netdev_dbg(priv->ndev, "mac0 =%x mac1 =%x mac2 =%x\n",
+		   tn40_read_reg(priv, TN40_REG_UNC_MAC0_A),
+		   tn40_read_reg(priv, TN40_REG_UNC_MAC1_A),
+		   tn40_read_reg(priv, TN40_REG_UNC_MAC2_A));
+}
+
+static void tn40_hw_start(struct tn40_priv *priv)
+{
+	tn40_write_reg(priv, TN40_REG_FRM_LENGTH, 0X3FE0);
+	tn40_write_reg(priv, TN40_REG_GMAC_RXF_A, 0X10fd);
+	/*MikeFix1 */
+	/*L0: 0x103c , L1: 0x1040 , L2: 0x1044 , L3: 0x1048 =0x81644 */
+	tn40_write_reg(priv, 0x103c, 0x81644);	/*ETHSD.L0_TX_DCNT  */
+	tn40_write_reg(priv, 0x1040, 0x81644);	/*ETHSD.L1_TX_DCNT  */
+	tn40_write_reg(priv, 0x1044, 0x81644);	/*ETHSD.L2_TX_DCNT  */
+	tn40_write_reg(priv, 0x1048, 0x81644);	/*ETHSD.L3_TX_DCNT  */
+	tn40_write_reg(priv, TN40_REG_RX_FIFO_SECTION, 0x10);
+	tn40_write_reg(priv, TN40_REG_TX_FIFO_SECTION, 0xE00010);
+	tn40_write_reg(priv, TN40_REG_RX_FULLNESS, 0);
+	tn40_write_reg(priv, TN40_REG_TX_FULLNESS, 0);
+
+	tn40_write_reg(priv, TN40_REG_VGLB, 0);
+	tn40_write_reg(priv, TN40_REG_RDINTCM0, priv->rdintcm);
+	tn40_write_reg(priv, TN40_REG_RDINTCM2, 0);
+
+	/* old val = 0x300064 */
+	tn40_write_reg(priv, TN40_REG_TDINTCM0, priv->tdintcm);
+
+	/* Enable timer interrupt once in 2 secs. */
+	tn40_restore_mac(priv->ndev, priv);
+
+	/* Pause frame */
+	tn40_write_reg(priv, 0x12E0, 0x28);
+	tn40_write_reg(priv, TN40_REG_PAUSE_QUANT, 0xFFFF);
+	tn40_write_reg(priv, 0x6064, 0xF);
+
+	tn40_write_reg(priv, TN40_REG_GMAC_RXF_A,
+		       TN40_GMAC_RX_FILTER_OSEN | TN40_GMAC_RX_FILTER_TXFC |
+		       TN40_GMAC_RX_FILTER_AM | TN40_GMAC_RX_FILTER_AB);
+
+	tn40_link_changed(priv);
+	tn40_enable_interrupts(priv);
+}
+
+static int tn40_hw_reset(struct tn40_priv *priv)
+{
+	u32 val, i;
+
+	/* Reset sequences: read, write 1, read, write 0 */
+	val = tn40_read_reg(priv, TN40_REG_CLKPLL);
+	tn40_write_reg(priv, TN40_REG_CLKPLL, (val | TN40_CLKPLL_SFTRST) + 0x8);
+	udelay(50);
+	val = tn40_read_reg(priv, TN40_REG_CLKPLL);
+	tn40_write_reg(priv, TN40_REG_CLKPLL, val & ~TN40_CLKPLL_SFTRST);
+
+	/* Check that the PLLs are locked and reset ended */
+	for (i = 0; i < 70; i++, mdelay(10)) {
+		if ((tn40_read_reg(priv, TN40_REG_CLKPLL) & TN40_CLKPLL_LKD) ==
+		    TN40_CLKPLL_LKD) {
+			udelay(50);
+			/* Do any PCI-E read transaction */
+			tn40_read_reg(priv, TN40_REG_RXD_CFG0_0);
+			return 0;
+		}
+	}
+	return -EIO;
+}
+
+static void tn40_sw_reset(struct tn40_priv *priv)
+{
+	int i;
+
+	/* 1. load MAC (obsolete) */
+	/* 2. disable Rx (and Tx) */
+	tn40_write_reg(priv, TN40_REG_GMAC_RXF_A, 0);
+	mdelay(100);
+	/* 3. Disable port */
+	tn40_write_reg(priv, TN40_REG_DIS_PORT, 1);
+	/* 4. Disable queue */
+	tn40_write_reg(priv, TN40_REG_DIS_QU, 1);
+	/* 5. Wait until hw is disabled */
+	for (i = 0; i < 50; i++) {
+		if (tn40_read_reg(priv, TN40_REG_RST_PORT) & 1)
+			break;
+		mdelay(10);
+	}
+	if (i == 50)
+		netdev_err(priv->ndev, "SW reset timeout. continuing anyway\n");
+
+	/* 6. Disable interrupts */
+	tn40_write_reg(priv, TN40_REG_RDINTCM0, 0);
+	tn40_write_reg(priv, TN40_REG_TDINTCM0, 0);
+	tn40_write_reg(priv, TN40_REG_IMR, 0);
+	tn40_read_reg(priv, TN40_REG_ISR);
+
+	/* 7. Reset queue */
+	tn40_write_reg(priv, TN40_REG_RST_QU, 1);
+	/* 8. Reset port */
+	tn40_write_reg(priv, TN40_REG_RST_PORT, 1);
+	/* 9. Zero all read and write pointers */
+	for (i = TN40_REG_TXD_WPTR_0; i <= TN40_REG_TXF_RPTR_3; i += 0x10)
+		tn40_write_reg(priv, i, 0);
+	/* 10. Unset port disable */
+	tn40_write_reg(priv, TN40_REG_DIS_PORT, 0);
+	/* 11. Unset queue disable */
+	tn40_write_reg(priv, TN40_REG_DIS_QU, 0);
+	/* 12. Unset queue reset */
+	tn40_write_reg(priv, TN40_REG_RST_QU, 0);
+	/* 13. Unset port reset */
+	tn40_write_reg(priv, TN40_REG_RST_PORT, 0);
+	/* 14. Enable Rx */
+	/* Skipped. will be done later */
+}
+
+static int tn40_start(struct tn40_priv *priv)
+{
+	int ret;
+
+	ret = tn40_create_tx_ring(priv);
+	if (ret) {
+		netdev_err(priv->ndev, "failed to init tx ring %d\n", ret);
+		return ret;
+	}
+
+	ret = request_irq(priv->pdev->irq, &tn40_isr_napi, IRQF_SHARED,
+			  priv->ndev->name, priv->ndev);
+	if (ret) {
+		netdev_err(priv->ndev, "failed to request irq %d\n", ret);
+		goto err_tx_ring;
+	}
+
+	tn40_hw_start(priv);
+	return 0;
+err_tx_ring:
+	tn40_destroy_tx_ring(priv);
+	return ret;
+}
+
+static int tn40_close(struct net_device *ndev)
+{
+	struct tn40_priv *priv = netdev_priv(ndev);
+
+	tn40_disable_interrupts(priv);
+	free_irq(priv->pdev->irq, priv->ndev);
+	tn40_sw_reset(priv);
+	tn40_destroy_tx_ring(priv);
+	return 0;
+}
+
+static int tn40_open(struct net_device *dev)
+{
+	struct tn40_priv *priv = netdev_priv(dev);
+	int ret;
+
+	tn40_sw_reset(priv);
+	ret = tn40_start(priv);
+	if (ret) {
+		netdev_err(dev, "failed to start %d\n", ret);
+		return ret;
+	}
+	return 0;
+}
+
+static void __tn40_vlan_rx_vid(struct net_device *ndev, uint16_t vid,
+			       int enable)
+{
+	struct tn40_priv *priv = netdev_priv(ndev);
+	u32 reg, bit, val;
+
+	netdev_dbg(priv->ndev, "vid =%d value =%d\n", (int)vid, enable);
+	if (unlikely(vid >= 4096)) {
+		netdev_err(priv->ndev, "invalid VID: %u (>= 4096)\n", vid);
+		return;
+	}
+	reg = TN40_REG_VLAN_0 + (vid / 32) * 4;
+	bit = 1 << vid % 32;
+	val = tn40_read_reg(priv, reg);
+	netdev_dbg(priv->ndev, "reg =%x, val =%x, bit =%x\n", reg, val, bit);
+	if (enable)
+		val |= bit;
+	else
+		val &= ~bit;
+	netdev_dbg(priv->ndev, "new val %x\n", val);
+	tn40_write_reg(priv, reg, val);
+}
+
+static int tn40_vlan_rx_add_vid(struct net_device *ndev,
+				__always_unused __be16 proto, u16 vid)
+{
+	__tn40_vlan_rx_vid(ndev, vid, 1);
+	return 0;
+}
+
+static int tn40_vlan_rx_kill_vid(struct net_device *ndev,
+				 __always_unused __be16 proto, u16 vid)
+{
+	__tn40_vlan_rx_vid(ndev, vid, 0);
+	return 0;
+}
+
+static void tn40_setmulti(struct net_device *ndev)
+{
+	struct tn40_priv *priv = netdev_priv(ndev);
+
+	u32 rxf_val = TN40_GMAC_RX_FILTER_AM | TN40_GMAC_RX_FILTER_AB |
+		TN40_GMAC_RX_FILTER_OSEN | TN40_GMAC_RX_FILTER_TXFC;
+	int i;
+
+	/* IMF - imperfect (hash) rx multicast filter */
+	/* PMF - perfect rx multicast filter */
+
+	/* FIXME: RXE(OFF) */
+	if (ndev->flags & IFF_PROMISC) {
+		rxf_val |= TN40_GMAC_RX_FILTER_PRM;
+	} else if (ndev->flags & IFF_ALLMULTI) {
+		/* set IMF to accept all multicast frames */
+		for (i = 0; i < TN40_MAC_MCST_HASH_NUM; i++)
+			tn40_write_reg(priv,
+				       TN40_REG_RX_MCST_HASH0 + i * 4, ~0);
+	} else if (netdev_mc_count(ndev)) {
+		u8 hash;
+		struct netdev_hw_addr *mclist;
+		u32 reg, val;
+
+		/* Set IMF to deny all multicast frames */
+		for (i = 0; i < TN40_MAC_MCST_HASH_NUM; i++)
+			tn40_write_reg(priv,
+				       TN40_REG_RX_MCST_HASH0 + i * 4, 0);
+
+		/* Set PMF to deny all multicast frames */
+		for (i = 0; i < TN40_MAC_MCST_NUM; i++) {
+			tn40_write_reg(priv,
+				       TN40_REG_RX_MAC_MCST0 + i * 8, 0);
+			tn40_write_reg(priv,
+				       TN40_REG_RX_MAC_MCST1 + i * 8, 0);
+		}
+		/* Use PMF to accept first MAC_MCST_NUM (15) addresses */
+
+		/* TBD: Sort the addresses and write them in ascending
+		 * order into RX_MAC_MCST regs. We skip this phase for
+		 * now and accept ALL multicast frames through IMF.
+		 * Accept the rest of the addresses through IMF.
+		 */
+		netdev_for_each_mc_addr(mclist, ndev) {
+			hash = 0;
+			for (i = 0; i < ETH_ALEN; i++)
+				hash ^= mclist->addr[i];
+
+			reg = TN40_REG_RX_MCST_HASH0 + ((hash >> 5) << 2);
+			val = tn40_read_reg(priv, reg);
+			val |= (1 << (hash % 32));
+			tn40_write_reg(priv, reg, val);
+		}
+	} else {
+		rxf_val |= TN40_GMAC_RX_FILTER_AB;
+	}
+	tn40_write_reg(priv, TN40_REG_GMAC_RXF_A, rxf_val);
+	/* Enable RX */
+	/* FIXME: RXE(ON) */
+}
+
+static int tn40_set_mac(struct net_device *ndev, void *p)
+{
+	struct tn40_priv *priv = netdev_priv(ndev);
+	struct sockaddr *addr = p;
+
+	eth_hw_addr_set(ndev, addr->sa_data);
+	tn40_restore_mac(ndev, priv);
+	return 0;
+}
+
+static void tn40_mac_init(struct tn40_priv *priv)
+{
+	u8 addr[ETH_ALEN];
+	u64 val;
+
+	val = (u64)tn40_read_reg(priv, TN40_REG_UNC_MAC0_A);
+	val |= (u64)tn40_read_reg(priv, TN40_REG_UNC_MAC1_A) << 16;
+	val |= (u64)tn40_read_reg(priv, TN40_REG_UNC_MAC2_A) << 32;
+
+	u64_to_ether_addr(val, addr);
+	eth_hw_addr_set(priv->ndev, addr);
+}
+
+static struct net_device_stats *tn40_get_stats(struct net_device *ndev)
+{
+	struct tn40_priv *priv = netdev_priv(ndev);
+
+	return &priv->net_stats;
+}
+
+static const struct net_device_ops tn40_netdev_ops = {
+	.ndo_open = tn40_open,
+	.ndo_stop = tn40_close,
+	.ndo_start_xmit = tn40_start_xmit,
+	.ndo_validate_addr = eth_validate_addr,
+	.ndo_set_rx_mode = tn40_setmulti,
+	.ndo_get_stats = tn40_get_stats,
+	.ndo_set_mac_address = tn40_set_mac,
+	.ndo_vlan_rx_add_vid = tn40_vlan_rx_add_vid,
+	.ndo_vlan_rx_kill_vid = tn40_vlan_rx_kill_vid,
+};
+
+static int tn40_priv_init(struct tn40_priv *priv)
+{
+	int ret;
+
+	ret = tn40_hw_reset(priv);
+	if (ret)
+		return ret;
+
+	/* Set GPIO[9:0] to output 0 */
+	tn40_write_reg(priv, 0x51E0, 0x30010006);	/* GPIO_OE_ WR CMD */
+	tn40_write_reg(priv, 0x51F0, 0x0);	/* GPIO_OE_ DATA */
+	tn40_write_reg(priv, TN40_REG_MDIO_CMD_STAT, 0x3ec8);
+
+	/* We use tx descriptors to load the firmware. */
+	ret = tn40_create_tx_ring(priv);
+	if (ret)
+		return ret;
+	ret = tn40_fw_load(priv);
+	tn40_destroy_tx_ring(priv);
+	return ret;
+}
+
+static struct net_device *tn40_netdev_alloc(struct pci_dev *pdev)
+{
+	struct net_device *ndev;
+
+	ndev = devm_alloc_etherdev(&pdev->dev, sizeof(struct tn40_priv));
+	if (!ndev)
+		return NULL;
+	ndev->netdev_ops = &tn40_netdev_ops;
+	ndev->tx_queue_len = TN40_NDEV_TXQ_LEN;
+	ndev->mem_start = pci_resource_start(pdev, 0);
+	ndev->mem_end = pci_resource_end(pdev, 0);
+	ndev->min_mtu = ETH_ZLEN;
+	ndev->max_mtu = TN40_MAX_MTU;
+
+	ndev->features = NETIF_F_IP_CSUM |
+		NETIF_F_SG |
+		NETIF_F_FRAGLIST |
+		NETIF_F_TSO | NETIF_F_GRO |
+		NETIF_F_RXCSUM |
+		NETIF_F_RXHASH |
+		NETIF_F_HW_VLAN_CTAG_TX |
+		NETIF_F_HW_VLAN_CTAG_RX |
+		NETIF_F_HW_VLAN_CTAG_FILTER;
+	ndev->vlan_features = NETIF_F_IP_CSUM |
+			       NETIF_F_SG |
+			       NETIF_F_TSO | NETIF_F_GRO | NETIF_F_RXHASH;
+
+	if (dma_get_mask(&pdev->dev) == DMA_BIT_MASK(64)) {
+		ndev->features |= NETIF_F_HIGHDMA;
+		ndev->vlan_features |= NETIF_F_HIGHDMA;
+	}
+	ndev->hw_features |= ndev->features;
+
+	SET_NETDEV_DEV(ndev, &pdev->dev);
+	netif_carrier_off(ndev);
+	netif_stop_queue(ndev);
+	return ndev;
+}
+
 static int tn40_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
+	struct net_device *ndev;
+	struct tn40_priv *priv;
+	unsigned int nvec = 1;
+	void __iomem *regs;
 	int ret;
 
+	tn40_init_txd_sizes();
+
 	ret = pci_enable_device(pdev);
 	if (ret)
 		return ret;
@@ -20,7 +1190,85 @@ static int tn40_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 			goto err_disable_device;
 		}
 	}
+
+	ret = pci_request_regions(pdev, TN40_DRV_NAME);
+	if (ret) {
+		dev_err(&pdev->dev, "failed to request PCI regions.\n");
+		goto err_disable_device;
+	}
+
+	pci_set_master(pdev);
+
+	regs = pci_iomap(pdev, 0, TN40_REGS_SIZE);
+	if (!regs) {
+		ret = -EIO;
+		dev_err(&pdev->dev, "failed to map PCI bar.\n");
+		goto err_free_regions;
+	}
+
+	ndev = tn40_netdev_alloc(pdev);
+	if (!ndev) {
+		ret = -ENOMEM;
+		dev_err(&pdev->dev, "failed to allocate netdev.\n");
+		goto err_iounmap;
+	}
+
+	priv = netdev_priv(ndev);
+	pci_set_drvdata(pdev, priv);
+
+	priv->regs = regs;
+	priv->pdev = pdev;
+	priv->ndev = ndev;
+	/* Initialize fifo sizes. */
+	priv->txd_size = 3;
+	priv->txf_size = 3;
+	priv->rxd_size = 3;
+	priv->rxf_size = 3;
+	/* Initialize the initial coalescing registers. */
+	priv->rdintcm = TN40_INT_REG_VAL(0x20, 1, 4, 12);
+	priv->tdintcm = TN40_INT_REG_VAL(0x20, 1, 0, 12);
+
+	ret = tn40_hw_reset(priv);
+	if (ret) {
+		dev_err(&pdev->dev, "failed to reset HW.\n");
+		goto err_unset_drvdata;
+	}
+
+	ret = pci_alloc_irq_vectors(pdev, 1, nvec, PCI_IRQ_MSI);
+	if (ret < 0) {
+		dev_err(&pdev->dev, "failed to allocate irq.\n");
+		goto err_unset_drvdata;
+	}
+
+	priv->stats_flag =
+		((tn40_read_reg(priv, TN40_FPGA_VER) & 0xFFF) != 308);
+
+	priv->isr_mask = TN40_IR_RX_FREE_0 | TN40_IR_LNKCHG0 | TN40_IR_PSE |
+		TN40_IR_TMR0 | TN40_IR_RX_DESC_0 | TN40_IR_TX_FREE_0 |
+		TN40_IR_TMR1;
+
+	tn40_mac_init(priv);
+
+	ret = tn40_priv_init(priv);
+	if (ret) {
+		dev_err(&pdev->dev, "failed to initialize tn40_priv.\n");
+		goto err_free_irq;
+	}
+
+	ret = register_netdev(ndev);
+	if (ret) {
+		dev_err(&pdev->dev, "failed to register netdev.\n");
+		goto err_free_irq;
+	}
 	return 0;
+err_free_irq:
+	pci_free_irq_vectors(pdev);
+err_unset_drvdata:
+	pci_set_drvdata(pdev, NULL);
+err_iounmap:
+	iounmap(regs);
+err_free_regions:
+	pci_release_regions(pdev);
 err_disable_device:
 	pci_disable_device(pdev);
 	return ret;
@@ -28,6 +1276,15 @@ static int tn40_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 static void tn40_remove(struct pci_dev *pdev)
 {
+	struct tn40_priv *priv = pci_get_drvdata(pdev);
+	struct net_device *ndev = priv->ndev;
+
+	unregister_netdev(ndev);
+
+	pci_free_irq_vectors(priv->pdev);
+	pci_set_drvdata(pdev, NULL);
+	iounmap(priv->regs);
+	pci_release_regions(pdev);
 	pci_disable_device(pdev);
 }
 
@@ -55,4 +1312,5 @@ module_pci_driver(tn40_driver);
 MODULE_DEVICE_TABLE(pci, tn40_id_table);
 MODULE_AUTHOR("Tehuti networks");
 MODULE_LICENSE("GPL");
+MODULE_FIRMWARE(TN40_FIRMWARE_NAME);
 MODULE_DESCRIPTION("Tehuti Network TN40xx Driver");
diff --git a/drivers/net/ethernet/tehuti/tn40.h b/drivers/net/ethernet/tehuti/tn40.h
index c0ac64a19c31..d532bca6ec37 100644
--- a/drivers/net/ethernet/tehuti/tn40.h
+++ b/drivers/net/ethernet/tehuti/tn40.h
@@ -10,4 +10,171 @@
 
 #define PCI_VENDOR_ID_EDIMAX 0x1432
 
+#define TN40_MDIO_SPEED_1MHZ (1)
+#define TN40_MDIO_SPEED_6MHZ (6)
+
+/* netdev tx queue len for Luxor. The default value is 1000.
+ * ifconfig eth1 txqueuelen 3000 - to change it at runtime.
+ */
+#define TN40_NDEV_TXQ_LEN 1000
+
+#define TN40_FIFO_SIZE 4096
+#define TN40_FIFO_EXTRA_SPACE 1024
+
+#define TN40_TXF_DESC_SZ 16
+#define TN40_MAX_TX_LEVEL (priv->txd_fifo0.m.memsz - 16)
+#define TN40_MIN_TX_LEVEL 256
+#define TN40_NO_UPD_PACKETS 40
+#define TN40_MAX_MTU BIT(14)
+
+#define TN40_PCK_TH_MULT 128
+#define TN40_INT_COAL_MULT 2
+
+#define TN40_INT_REG_VAL(coal, coal_rc, rxf_th, pck_th) (	\
+	FIELD_PREP(GENMASK(14, 0), (coal)) |		\
+	FIELD_PREP(BIT(15), (coal_rc)) |		\
+	FIELD_PREP(GENMASK(19, 16), (rxf_th)) |		\
+	FIELD_PREP(GENMASK(31, 20), (pck_th))		\
+	)
+
+struct tn40_fifo {
+	dma_addr_t da; /* Physical address of fifo (used by HW) */
+	char *va; /* Virtual address of fifo (used by SW) */
+	u32 rptr, wptr;
+	 /* Cached values of RPTR and WPTR registers,
+	  * they're 32 bits on both 32 and 64 archs.
+	  */
+	u16 reg_cfg0;
+	u16 reg_cfg1;
+	u16 reg_rptr;
+	u16 reg_wptr;
+	u16 memsz; /* Memory size allocated for fifo */
+	u16 size_mask;
+	u16 pktsz; /* Skb packet size to allocate */
+	u16 rcvno; /* Number of buffers that come from this RXF */
+};
+
+struct tn40_txf_fifo {
+	struct tn40_fifo m; /* The minimal set of variables used by all fifos */
+};
+
+struct tn40_txd_fifo {
+	struct tn40_fifo m; /* The minimal set of variables used by all fifos */
+};
+
+union tn40_tx_dma_addr {
+	dma_addr_t dma;
+	struct sk_buff *skb;
+};
+
+/* Entry in the db.
+ * if len == 0 addr is dma
+ * if len != 0 addr is skb
+ */
+struct tn40_tx_map {
+	union tn40_tx_dma_addr addr;
+	int len;
+};
+
+/* tx database - implemented as circular fifo buffer */
+struct tn40_txdb {
+	struct tn40_tx_map *start; /* Points to the first element */
+	struct tn40_tx_map *end; /* Points just AFTER the last element */
+	struct tn40_tx_map *rptr; /* Points to the next element to read */
+	struct tn40_tx_map *wptr; /* Points to the next element to write */
+	int size; /* Number of elements in the db */
+};
+
+struct tn40_priv {
+	struct net_device *ndev;
+	struct pci_dev *pdev;
+
+	/* Tx FIFOs: 1 for data desc, 1 for empty (acks) desc */
+	struct tn40_txd_fifo txd_fifo0;
+	struct tn40_txf_fifo txf_fifo0;
+	struct tn40_txdb txdb;
+	int tx_level;
+	int tx_update_mark;
+	int tx_noupd;
+
+	int stats_flag;
+	struct net_device_stats net_stats;
+
+	u8 txd_size;
+	u8 txf_size;
+	u8 rxd_size;
+	u8 rxf_size;
+	u32 rdintcm;
+	u32 tdintcm;
+
+	u32 isr_mask;
+	int link;
+	u32 link_loop_cnt;
+
+	void __iomem *regs;
+
+	/* SHORT_PKT_FIX */
+	u32 b0_len;
+	dma_addr_t b0_dma; /* Physical address of buffer */
+	char *b0_va; /* Virtual address of buffer */
+};
+
+#define TN40_MAX_PBL (19)
+/* PBL describes each virtual buffer to be transmitted from the host. */
+struct tn40_pbl {
+	__le32 pa_lo;
+	__le32 pa_hi;
+	__le32 len;
+};
+
+/* First word for TXD descriptor. It means: type = 3 for regular Tx packet,
+ * hw_csum = 7 for IP+UDP+TCP HW checksums.
+ */
+#define TN40_TXD_W1_VAL(bc, checksum, vtag, lgsnd, vlan_id) (		\
+	GENMASK(17, 16) |						\
+	FIELD_PREP(GENMASK(4, 0), (bc)) |				\
+	FIELD_PREP(GENMASK(7, 5), (checksum)) |				\
+	FIELD_PREP(BIT(8), (vtag)) |					\
+	FIELD_PREP(GENMASK(12, 9), (lgsnd)) |				\
+	FIELD_PREP(GENMASK(15, 13),					\
+		   FIELD_GET(GENMASK(15, 13), (vlan_id))) |		\
+	FIELD_PREP(GENMASK(31, 20),					\
+		   FIELD_GET(GENMASK(11, 0), (vlan_id)))		\
+	)
+
+struct tn40_txd_desc {
+	__le32 txd_val1;
+	__le16 mss;
+	__le16 length;
+	__le32 va_lo;
+	__le32 va_hi;
+	struct tn40_pbl pbl[]; /* Fragments */
+} __packed;
+
+struct tn40_txf_desc {
+	u32 status;
+	u32 va_lo; /* VAdr[31:0] */
+	u32 va_hi; /* VAdr[63:32] */
+	u32 pad;
+} __packed;
+
+/* 32 bit kernels use 16 bits for page_offset. Do not increase
+ * TN40_MAX_PAGE_SIZE beyond 64K!
+ */
+#if BITS_PER_LONG > 32
+#define TN40_MAX_PAGE_SIZE 0x40000
+#else
+#define TN40_MAX_PAGE_SIZE 0x10000
+#endif
+
+static inline u32 tn40_read_reg(struct tn40_priv *priv, u32 reg)
+{
+	return readl(priv->regs + reg);
+}
+
+static inline void tn40_write_reg(struct tn40_priv *priv, u32 reg, u32 val)
+{
+	writel(val, priv->regs + reg);
+}
+
 #endif /* _TN40XX_H */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH net-next v4 4/6] net: tn40xx: add basic Rx handling
  2024-05-01 23:05 [PATCH net-next v4 0/6] add ethernet driver for Tehuti Networks TN40xx chips FUJITA Tomonori
                   ` (2 preceding siblings ...)
  2024-05-01 23:05 ` [PATCH net-next v4 3/6] net: tn40xx: add basic Tx handling FUJITA Tomonori
@ 2024-05-01 23:05 ` FUJITA Tomonori
  2024-05-06  9:20   ` Paolo Abeni
  2024-05-07  1:58   ` Jakub Kicinski
  2024-05-01 23:05 ` [PATCH net-next v4 5/6] net: tn40xx: add mdio bus support FUJITA Tomonori
  2024-05-01 23:05 ` [PATCH net-next v4 6/6] net: tn40xx: add PHYLIB support FUJITA Tomonori
  5 siblings, 2 replies; 22+ messages in thread
From: FUJITA Tomonori @ 2024-05-01 23:05 UTC (permalink / raw)
  To: netdev; +Cc: andrew, kuba, jiri, horms

This patch adds basic Rx handling. The Rx logic uses three major data
structures: two ring buffers shared with the NIC and one database. One
ring buffer tells the NIC which memory is available for storing
received packets. The other returns information from the NIC about
received packets. The database keeps the DMA mapping information.
After a packet arrives, the database is used to pass the packet to the
network stack.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
---
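For illustration only, a rough userspace sketch of the free-slot
bookkeeping in the database. It mirrors tn40_rxdb_alloc(),
tn40_rxdb_alloc_elem() and tn40_rxdb_free_elem() from this patch. The
rxdb_sketch names are made up here, calloc() stands in for vzalloc(),
and the per-slot DMA mapping data (struct tn40_rx_map) is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace model of the driver's tn40_rxdb: a stack of free slot
 * indices stored right after the header. Hypothetical names; the
 * slot payload (struct tn40_rx_map in the driver) is omitted.
 */
struct rxdb_sketch {
	int *stack;	/* free slot indices */
	int nelem;	/* total number of slots */
	int top;	/* number of free slots */
};

static struct rxdb_sketch *rxdb_sketch_alloc(int nelem)
{
	struct rxdb_sketch *db;
	int i;

	db = calloc(1, sizeof(*db) + nelem * sizeof(int));
	if (!db)
		return NULL;
	db->stack = (int *)(db + 1);
	db->nelem = nelem;
	db->top = nelem;
	/* Slot 0 ends up on top of the stack, so it is handed out first. */
	for (i = 0; i < nelem; i++)
		db->stack[i] = nelem - i - 1;
	return db;
}

/* Pop a free slot index. Unlike the driver, this checks for underflow. */
static int rxdb_sketch_get(struct rxdb_sketch *db)
{
	assert(db->top > 0);
	return db->stack[--db->top];
}

/* Push a slot index back once its buffer has been consumed. */
static void rxdb_sketch_put(struct rxdb_sketch *db, int idx)
{
	assert(db->top < db->nelem);
	db->stack[db->top++] = idx;
}
```

As the code in this patch reads, the index popped here is what
tn40_rx_alloc_buffers() writes into the va_lo field of an RxF
descriptor. tn40_rx_receive() later takes the index from the va_lo
field of the matching RxD descriptor, looks up the mapping, and pushes
the index back.
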
 drivers/net/ethernet/tehuti/tn40.c | 538 ++++++++++++++++++++++++++++-
 drivers/net/ethernet/tehuti/tn40.h |  69 ++++
 2 files changed, 606 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/tehuti/tn40.c b/drivers/net/ethernet/tehuti/tn40.c
index 5fa5c3c12f5c..6f9c5fc88d45 100644
--- a/drivers/net/ethernet/tehuti/tn40.c
+++ b/drivers/net/ethernet/tehuti/tn40.c
@@ -61,6 +61,455 @@ static void tn40_fifo_free(struct tn40_priv *priv, struct tn40_fifo *f)
 			  f->memsz + TN40_FIFO_EXTRA_SPACE, f->va, f->da);
 }
 
+static struct tn40_rxdb *tn40_rxdb_alloc(int nelem)
+{
+	size_t size = sizeof(struct tn40_rxdb) + (nelem * sizeof(int)) +
+	    (nelem * sizeof(struct tn40_rx_map));
+	struct tn40_rxdb *db;
+	int i;
+
+	db = vzalloc(size);
+	if (db) {
+		db->stack = (int *)(db + 1);
+		db->elems = (void *)(db->stack + nelem);
+		db->nelem = nelem;
+		db->top = nelem;
+		/* make the first alloc close to db struct */
+		for (i = 0; i < nelem; i++)
+			db->stack[i] = nelem - i - 1;
+	}
+	return db;
+}
+
+static void tn40_rxdb_free(struct tn40_rxdb *db)
+{
+	vfree(db);
+}
+
+static int tn40_rxdb_alloc_elem(struct tn40_rxdb *db)
+{
+	return db->stack[--(db->top)];
+}
+
+static void *tn40_rxdb_addr_elem(struct tn40_rxdb *db, unsigned int n)
+{
+	return db->elems + n;
+}
+
+static int tn40_rxdb_available(struct tn40_rxdb *db)
+{
+	return db->top;
+}
+
+static void tn40_rxdb_free_elem(struct tn40_rxdb *db, unsigned int n)
+{
+	db->stack[(db->top)++] = n;
+}
+
+static struct tn40_rx_page *tn40_rx_page_alloc(struct tn40_priv *priv)
+{
+	struct tn40_rx_page *rx_page = &priv->rx_page_table.rx_pages;
+	int page_size = priv->rx_page_table.page_size;
+	struct page *page;
+	gfp_t gfp_mask;
+	dma_addr_t dma;
+
+	gfp_mask = GFP_ATOMIC | __GFP_NOWARN;
+	if (page_size > PAGE_SIZE)
+		gfp_mask |= __GFP_COMP;
+
+	page = alloc_pages(gfp_mask, get_order(page_size));
+	if (likely(page)) {
+		netdev_dbg(priv->ndev, "map page %p size %d\n",
+			   page, page_size);
+		dma = dma_map_page(&priv->pdev->dev, page, 0, page_size,
+				   DMA_FROM_DEVICE);
+		if (unlikely(dma_mapping_error(&priv->pdev->dev, dma))) {
+			netdev_err(priv->ndev, "failed to map page %d\n",
+				   page_size);
+			__free_pages(page, get_order(page_size));
+			return NULL;
+		}
+	} else {
+		return NULL;
+	}
+
+	rx_page->page = page;
+	rx_page->dma = dma;
+	return rx_page;
+}
+
+static int tn40_rx_page_size(struct tn40_priv *priv)
+{
+	int dno = tn40_rxdb_available(priv->rxdb0) - 1;
+
+	priv->rx_page_table.page_size =
+	    min(TN40_MAX_PAGE_SIZE, dno * priv->rx_page_table.buf_size);
+
+	return priv->rx_page_table.page_size;
+}
+
+static void tn40_rx_page_reuse(struct tn40_priv *priv, struct tn40_rx_map *dm)
+{
+	if (dm->off == 0)
+		dma_unmap_page(&priv->pdev->dev, dm->dma, dm->size,
+			       DMA_FROM_DEVICE);
+}
+
+static void tn40_rx_page_ref(struct tn40_rx_page *rx_page)
+{
+	get_page(rx_page->page);
+}
+
+static void tn40_rx_page_put(struct tn40_priv *priv, struct tn40_rx_map *dm)
+{
+	if (dm->off == 0)
+		dma_unmap_page(&priv->pdev->dev, dm->dma, dm->size,
+			       DMA_FROM_DEVICE);
+	put_page(dm->rx_page.page);
+}
+
+static void tn40_dm_rx_page_set(register struct tn40_rx_map *dm,
+				struct tn40_rx_page *rx_page)
+{
+	dm->rx_page.page = rx_page->page;
+}
+
+/**
+ * tn40_create_rx_ring - Initialize all Rx related HW and SW resources
+ * @priv: NIC private structure
+ *
+ * create_rx_ring creates rxf and rxd fifos, updates the relevant HW registers,
+ * preallocates skbs for rx. It assumes that Rx is disabled in HW. Funcs are
+ * grouped for better cache usage.
+ *
+ * RxD fifo is smaller than RxF fifo by design. Upon high load, RxD will be
+ * filled and packets will be dropped by the NIC without getting into the host
+ * or generating interrupts. In this situation the host has no chance of
+ * processing all the packets. Dropping packets by the NIC is cheaper, since it
+ * takes 0 CPU cycles.
+ *
+ * Return: 0 on success and negative value on error.
+ */
+static int tn40_create_rx_ring(struct tn40_priv *priv)
+{
+	int ret, pkt_size, nr;
+
+	ret = tn40_fifo_alloc(priv, &priv->rxd_fifo0.m, priv->rxd_size,
+			      TN40_REG_RXD_CFG0_0, TN40_REG_RXD_CFG1_0,
+			      TN40_REG_RXD_RPTR_0, TN40_REG_RXD_WPTR_0);
+	if (ret)
+		return ret;
+
+	ret = tn40_fifo_alloc(priv, &priv->rxf_fifo0.m, priv->rxf_size,
+			      TN40_REG_RXF_CFG0_0, TN40_REG_RXF_CFG1_0,
+			      TN40_REG_RXF_RPTR_0, TN40_REG_RXF_WPTR_0);
+	if (ret)
+		goto err_free_rxd;
+
+	pkt_size = priv->ndev->mtu + VLAN_ETH_HLEN;
+	priv->rxf_fifo0.m.pktsz = pkt_size;
+	nr = priv->rxf_fifo0.m.memsz / sizeof(struct tn40_rxf_desc);
+	priv->rxdb0 = tn40_rxdb_alloc(nr);
+	if (!priv->rxdb0) {
+		ret = -ENOMEM;
+		goto err_free_rxf;
+	}
+
+	priv->rx_page_table.buf_size = round_up(pkt_size, SMP_CACHE_BYTES);
+	return 0;
+err_free_rxf:
+	tn40_fifo_free(priv, &priv->rxf_fifo0.m);
+err_free_rxd:
+	tn40_fifo_free(priv, &priv->rxd_fifo0.m);
+	return ret;
+}
+
+static void tn40_rx_free_buffers(struct tn40_priv *priv, struct tn40_rxdb *db,
+				 struct tn40_rxf_fifo *f)
+{
+	struct tn40_rx_map *dm;
+	u16 i;
+
+	netdev_dbg(priv->ndev, "total =%d free =%d busy =%d\n", db->nelem,
+		   tn40_rxdb_available(db),
+		   db->nelem - tn40_rxdb_available(db));
+	while (tn40_rxdb_available(db) > 0) {
+		i = tn40_rxdb_alloc_elem(db);
+		dm = tn40_rxdb_addr_elem(db, i);
+		dm->dma = 0;
+	}
+	for (i = 0; i < db->nelem; i++) {
+		dm = tn40_rxdb_addr_elem(db, i);
+		if (dm->dma && dm->rx_page.page)
+			tn40_rx_page_put(priv, dm);
+	}
+}
+
+static void tn40_destroy_rx_ring(struct tn40_priv *priv)
+{
+	if (priv->rxdb0) {
+		tn40_rx_free_buffers(priv, priv->rxdb0, &priv->rxf_fifo0);
+		tn40_rxdb_free(priv->rxdb0);
+		priv->rxdb0 = NULL;
+	}
+	tn40_fifo_free(priv, &priv->rxf_fifo0.m);
+	tn40_fifo_free(priv, &priv->rxd_fifo0.m);
+}
+
+/**
+ * tn40_rx_alloc_buffers - Fill rxf fifo with new skbs.
+ *
+ * @priv: NIC's private structure
+ *
+ * rx_alloc_buffers allocates skbs, builds rxf descs and pushes them (rxf
+ * descr) into the rxf fifo.  Skb's virtual and physical addresses are stored
+ * in skb db.
+ * To calculate the free space, we use the cached values of RPTR and WPTR
+ * when needed. This function also updates RPTR and WPTR.
+ */
+static void tn40_rx_alloc_buffers(struct tn40_priv *priv)
+{
+	int buf_size = priv->rx_page_table.buf_size;
+	struct tn40_rxf_fifo *f = &priv->rxf_fifo0;
+	struct tn40_rx_page *rx_page = NULL;
+	struct tn40_rxdb *db = priv->rxdb0;
+	struct tn40_rxf_desc *rxfd;
+	struct tn40_rx_map *dm;
+	int dno, delta, idx;
+	int page_off = -1;
+	int n_pages = 0;
+	u64 dma = 0ULL;
+	int page_size;
+
+	dno = tn40_rxdb_available(db) - 1;
+	page_size = tn40_rx_page_size(priv);
+	netdev_dbg(priv->ndev, "dno %d page_size %d buf_size %d\n", dno,
+		   page_size, priv->rx_page_table.buf_size);
+	while (dno > 0) {
+		/* We allocate large pages (e.g. 64KB) and store
+		 * multiple packet buffers in each page. The packet
+		 * buffers are stored backwards in each page (starting
+		 * from the highest address). We utilize the fact that
+		 * the last buffer in each page has a 0 offset to
+		 * detect that all the buffers were processed in order
+		 * to unmap the page.
+		 */
+		if (unlikely(page_off < 0)) {
+			rx_page = tn40_rx_page_alloc(priv);
+			if (!rx_page) {
+				u32 timeout = 1000000;	/* 1/5 sec */
+
+				tn40_write_reg(priv, 0x5154, timeout);
+				netdev_dbg(priv->ndev,
+					   "system memory is temporarily low\n");
+				break;
+			}
+			page_off = ((page_size / buf_size) - 1) * buf_size;
+			dma = rx_page->dma;
+			n_pages += 1;
+		} else {
+			tn40_rx_page_ref(rx_page);
+			/* Page is already allocated and mapped, just
+			 * increment the page usage count.
+			 */
+		}
+		rxfd = (struct tn40_rxf_desc *)(f->m.va + f->m.wptr);
+		idx = tn40_rxdb_alloc_elem(db);
+		dm = tn40_rxdb_addr_elem(db, idx);
+		dm->size = page_size;
+		tn40_dm_rx_page_set(dm, rx_page);
+		dm->off = page_off;
+		dm->dma = dma + page_off;
+		netdev_dbg(priv->ndev, "dm size %d off %d dma %llx\n", dm->size,
+			   dm->off, dm->dma);
+		page_off -= buf_size;
+
+		rxfd->info = cpu_to_le32(0x10003);	/* INFO =1 BC =3 */
+		rxfd->va_lo = cpu_to_le32(idx);
+		rxfd->pa_lo = cpu_to_le32(lower_32_bits(dm->dma));
+		rxfd->pa_hi = cpu_to_le32(upper_32_bits(dm->dma));
+		rxfd->len = cpu_to_le32(f->m.pktsz);
+		f->m.wptr += sizeof(struct tn40_rxf_desc);
+		delta = f->m.wptr - f->m.memsz;
+		if (unlikely(delta >= 0)) {
+			f->m.wptr = delta;
+			if (delta > 0) {
+				memcpy(f->m.va, f->m.va + f->m.memsz, delta);
+				netdev_dbg(priv->ndev,
+					   "wrapped rxf descriptor\n");
+			}
+		}
+		dno--;
+	}
+	netdev_dbg(priv->ndev, "n_pages %d\n", n_pages);
+	/* TBD: Do not update WPTR if no desc were written */
+	tn40_write_reg(priv, f->m.reg_wptr, f->m.wptr & TN40_TXF_WPTR_WR_PTR);
+	netdev_dbg(priv->ndev, "write_reg 0x%04x f->m.reg_wptr 0x%x\n",
+		   f->m.reg_wptr, f->m.wptr & TN40_TXF_WPTR_WR_PTR);
+	netdev_dbg(priv->ndev, "read_reg  0x%04x f->m.reg_rptr=0x%x\n",
+		   f->m.reg_rptr, tn40_read_reg(priv, f->m.reg_rptr));
+	netdev_dbg(priv->ndev, "write_reg 0x%04x f->m.reg_wptr=0x%x\n",
+		   f->m.reg_wptr, tn40_read_reg(priv, f->m.reg_wptr));
+}
+
+static void tn40_recycle_skb(struct tn40_priv *priv, struct tn40_rxd_desc *rxdd)
+{
+	struct tn40_rx_map *dm = tn40_rxdb_addr_elem(priv->rxdb0,
+						     le32_to_cpu(rxdd->va_lo));
+	struct tn40_rxf_fifo *f = &priv->rxf_fifo0;
+	struct tn40_rxf_desc *rxfd;
+	int delta;
+
+	rxfd = (struct tn40_rxf_desc *)(f->m.va + f->m.wptr);
+	rxfd->info = cpu_to_le32(0x10003);	/* INFO=1 BC=3 */
+	rxfd->va_lo = rxdd->va_lo;
+	rxfd->pa_lo = cpu_to_le32(lower_32_bits(dm->dma));
+	rxfd->pa_hi = cpu_to_le32(upper_32_bits(dm->dma));
+	rxfd->len = cpu_to_le32(f->m.pktsz);
+	f->m.wptr += sizeof(struct tn40_rxf_desc);
+	delta = f->m.wptr - f->m.memsz;
+	if (unlikely(delta >= 0)) {
+		f->m.wptr = delta;
+		if (delta > 0) {
+			memcpy(f->m.va, f->m.va + f->m.memsz, delta);
+			netdev_dbg(priv->ndev, "wrapped rxf descriptor\n");
+		}
+	}
+}
+
+static int tn40_rx_receive(struct tn40_priv *priv, struct tn40_rxd_fifo *f,
+			   int budget)
+{
+	u32 rxd_val1, rxd_err, pkt_id;
+	struct tn40_rx_page *rx_page;
+	int tmp_len, size, done = 0;
+	struct tn40_rxdb *db = NULL;
+	struct tn40_rxd_desc *rxdd;
+	struct tn40_rx_map *dm;
+	struct sk_buff *skb;
+	u16 len, rxd_vlan;
+
+	f->m.wptr = tn40_read_reg(priv, f->m.reg_wptr) & TN40_TXF_WPTR_WR_PTR;
+	size = f->m.wptr - f->m.rptr;
+	if (size < 0)
+		size += f->m.memsz;	/* Size is negative :-) */
+
+	while (size > 0) {
+		rxdd = (struct tn40_rxd_desc *)(f->m.va + f->m.rptr);
+		db = priv->rxdb0;
+
+		/* We have a chicken and egg problem here. If the
+		 * descriptor is wrapped we first need to copy the tail
+		 * of the descriptor to the end of the buffer before
+		 * extracting values from the descriptor. However in
+		 * order to know if the descriptor is wrapped we need to
+		 * obtain the length of the descriptor from (the
+		 * wrapped) descriptor. Luckily the length is the first
+		 * word of the descriptor. Descriptor lengths are
+		 * multiples of 8 bytes so in case of a wrapped
+		 * descriptor the first 8 bytes are guaranteed to appear
+		 * before the end of the buffer. We first obtain the
+		 * length, we then copy the rest of the descriptor if
+		 * needed and then extract the rest of the values from
+		 * the descriptor.
+		 *
+		 * Do not change the order of operations as it will
+		 * break the code!!!
+		 */
+		rxd_val1 = le32_to_cpu(rxdd->rxd_val1);
+		tmp_len = TN40_GET_RXD_BC(rxd_val1) << 3;
+		pkt_id = TN40_GET_RXD_PKT_ID(rxd_val1);
+		size -= tmp_len;
+		/* CHECK FOR A PARTIALLY ARRIVED DESCRIPTOR */
+		if (size < 0) {
+			netdev_dbg(priv->ndev,
+				   "%s partially arrived desc tmp_len %d\n",
+				   __func__, tmp_len);
+			break;
+		}
+		/* Make sure that the descriptor has fully arrived
+		 * before reading the rest of the descriptor.
+		 */
+		rmb();
+
+		/* A special treatment is given to non-contiguous
+		 * descriptors that start near the end, wrap around
+		 * and continue at the beginning. The second part is
+		 * copied right after the first, and then descriptor
+		 * is interpreted as normal. The fifo has an extra
+		 * space to allow such operations.
+		 */
+
+		/* HAVE WE REACHED THE END OF THE QUEUE? */
+		f->m.rptr += tmp_len;
+		tmp_len = f->m.rptr - f->m.memsz;
+		if (unlikely(tmp_len >= 0)) {
+			f->m.rptr = tmp_len;
+			if (tmp_len > 0) {
+				/* COPY PARTIAL DESCRIPTOR
+				 * TO THE END OF THE QUEUE
+				 */
+				netdev_dbg(priv->ndev,
+					   "wrapped desc rptr=%d tmp_len=%d\n",
+					   f->m.rptr, tmp_len);
+				memcpy(f->m.va + f->m.memsz, f->m.va, tmp_len);
+			}
+		}
+		dm = tn40_rxdb_addr_elem(db, le32_to_cpu(rxdd->va_lo));
+		prefetch(dm);
+		rx_page = &dm->rx_page;
+
+		len = le16_to_cpu(rxdd->len);
+		rxd_vlan = le16_to_cpu(rxdd->rxd_vlan);
+		/* CHECK FOR ERRORS */
+		rxd_err = TN40_GET_RXD_ERR(rxd_val1);
+		if (unlikely(rxd_err)) {
+			netdev_err(priv->ndev, "rxd_err = 0x%x\n", rxd_err);
+			priv->net_stats.rx_errors++;
+			tn40_recycle_skb(priv, rxdd);
+			continue;
+		}
+
+		/* In this case we obtain a pre-allocated skb from
+		 * napi. We add a frag with the page/off/len tuple of
+		 * the buffer that we have just read and then call
+		 * napi_gro_frags() to process the
+		 * packet. The same skb is used again and again to
+		 * handle all packets, which eliminates the need to
+		 * allocate an skb for each packet.
+		 */
+		skb = napi_get_frags(&priv->napi);
+		if (!skb) {
+			netdev_err(priv->ndev, "napi_get_frags failed\n");
+			break;
+		}
+		skb->ip_summed =
+		    (pkt_id == 0) ? CHECKSUM_NONE : CHECKSUM_UNNECESSARY;
+		skb_add_rx_frag(skb, 0, rx_page->page, dm->off, len,
+				SKB_TRUESIZE(len));
+		tn40_rxdb_free_elem(db, le32_to_cpu(rxdd->va_lo));
+
+		/* PROCESS PACKET */
+		if (TN40_GET_RXD_VTAG(rxd_val1))
+			__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q),
+					       TN40_GET_RXD_VLAN_TCI(rxd_vlan));
+		napi_gro_frags(&priv->napi);
+
+		tn40_rx_page_reuse(priv, dm);
+		priv->net_stats.rx_bytes += len;
+
+		if (unlikely(++done >= budget))
+			break;
+	}
+
+	priv->net_stats.rx_packets += done;
+	/* FIXME: Do something to minimize pci accesses */
+	tn40_write_reg(priv, f->m.reg_rptr, f->m.rptr & TN40_TXF_WPTR_WR_PTR);
+	tn40_rx_alloc_buffers(priv);
+	return done;
+}
+
 /* TX HW/SW interaction overview
  * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  * There are 2 types of TX communication channels between driver and NIC.
@@ -448,6 +897,56 @@ static int tn40_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 	return NETDEV_TX_OK;
 }
 
+static void tn40_tx_cleanup(struct tn40_priv *priv)
+{
+	struct tn40_txf_fifo *f = &priv->txf_fifo0;
+	struct tn40_txdb *db = &priv->txdb;
+	int tx_level = 0;
+
+	f->m.wptr = tn40_read_reg(priv, f->m.reg_wptr) & TN40_TXF_WPTR_MASK;
+
+	netif_tx_lock(priv->ndev);
+	while (f->m.wptr != f->m.rptr) {
+		f->m.rptr += TN40_TXF_DESC_SZ;
+		f->m.rptr &= f->m.size_mask;
+		/* Unmap all fragments */
+		/* First has to come tx_maps containing DMA */
+		do {
+			dma_unmap_page(&priv->pdev->dev, db->rptr->addr.dma,
+				       db->rptr->len, DMA_TO_DEVICE);
+			tn40_tx_db_inc_rptr(db);
+		} while (db->rptr->len > 0);
+		tx_level -= db->rptr->len; /* '-' Because the len is negative */
+
+		/* Now should come skb pointer - free it */
+		dev_kfree_skb_any(db->rptr->addr.skb);
+		netdev_dbg(priv->ndev, "dev_kfree_skb_any %p %d\n",
+			   db->rptr->addr.skb, -db->rptr->len);
+		tn40_tx_db_inc_rptr(db);
+	}
+
+	/* Let the HW know which TXF descriptors were cleaned */
+	tn40_write_reg(priv, f->m.reg_rptr, f->m.rptr & TN40_TXF_WPTR_WR_PTR);
+
+	/* We reclaimed resources, so in case the Q is stopped by xmit
+	 * callback, we resume the transmission and use tx_lock to
+	 * synchronize with xmit.
+	 */
+	priv->tx_level += tx_level;
+	if (priv->tx_noupd) {
+		priv->tx_noupd = 0;
+		tn40_write_reg(priv, priv->txd_fifo0.m.reg_wptr,
+			       priv->txd_fifo0.m.wptr & TN40_TXF_WPTR_WR_PTR);
+	}
+	if (unlikely(netif_queue_stopped(priv->ndev) &&
+		     netif_carrier_ok(priv->ndev) &&
+		     (priv->tx_level >= TN40_MAX_TX_LEVEL / 2))) {
+		netdev_dbg(priv->ndev, "TX Q WAKE level %d\n", priv->tx_level);
+		netif_wake_queue(priv->ndev);
+	}
+	netif_tx_unlock(priv->ndev);
+}
+
 static void tn40_tx_free_skbs(struct tn40_priv *priv)
 {
 	struct tn40_txdb *db = &priv->txdb;
@@ -728,6 +1227,10 @@ static irqreturn_t tn40_isr_napi(int irq, void *dev)
 		tn40_isr_extra(priv, isr);
 
 	if (isr & (TN40_IR_RX_DESC_0 | TN40_IR_TX_FREE_0 | TN40_IR_TMR1)) {
+		if (likely(napi_schedule_prep(&priv->napi))) {
+			__napi_schedule(&priv->napi);
+			return IRQ_HANDLED;
+		}
 		/* We get here if an interrupt has slept into the
 		 * small time window between these lines in
 		 * tn40_poll: tn40_enable_interrupts(priv); return 0;
@@ -745,6 +1248,21 @@ static irqreturn_t tn40_isr_napi(int irq, void *dev)
 	return IRQ_HANDLED;
 }
 
+static int tn40_poll(struct napi_struct *napi, int budget)
+{
+	struct tn40_priv *priv = container_of(napi, struct tn40_priv, napi);
+	int work_done;
+
+	tn40_tx_cleanup(priv);
+
+	work_done = tn40_rx_receive(priv, &priv->rxd_fifo0, budget);
+	if (work_done < budget) {
+		napi_complete(napi);
+		tn40_enable_interrupts(priv);
+	}
+	return work_done;
+}
+
 static int tn40_fw_load(struct tn40_priv *priv)
 {
 	const struct firmware *fw = NULL;
@@ -827,6 +1345,8 @@ static void tn40_hw_start(struct tn40_priv *priv)
 	tn40_write_reg(priv, TN40_REG_TX_FULLNESS, 0);
 
 	tn40_write_reg(priv, TN40_REG_VGLB, 0);
+	tn40_write_reg(priv, TN40_REG_MAX_FRAME_A,
+		       priv->rxf_fifo0.m.pktsz & TN40_MAX_FRAME_AB_VAL);
 	tn40_write_reg(priv, TN40_REG_RDINTCM0, priv->rdintcm);
 	tn40_write_reg(priv, TN40_REG_RDINTCM2, 0);
 
@@ -929,15 +1449,25 @@ static int tn40_start(struct tn40_priv *priv)
 		return ret;
 	}
 
+	ret = tn40_create_rx_ring(priv);
+	if (ret) {
+		netdev_err(priv->ndev, "failed to rx init %d\n", ret);
+		goto err_tx_ring;
+	}
+
+	tn40_rx_alloc_buffers(priv);
+
 	ret = request_irq(priv->pdev->irq, &tn40_isr_napi, IRQF_SHARED,
 			  priv->ndev->name, priv->ndev);
 	if (ret) {
 		netdev_err(priv->ndev, "failed to request irq %d\n", ret);
-		goto err_tx_ring;
+		goto err_rx_ring;
 	}
 
 	tn40_hw_start(priv);
 	return 0;
+err_rx_ring:
+	tn40_destroy_rx_ring(priv);
 err_tx_ring:
 	tn40_destroy_tx_ring(priv);
 	return ret;
@@ -947,9 +1477,12 @@ static int tn40_close(struct net_device *ndev)
 {
 	struct tn40_priv *priv = netdev_priv(ndev);
 
+	netif_napi_del(&priv->napi);
+	napi_disable(&priv->napi);
 	tn40_disable_interrupts(priv);
 	free_irq(priv->pdev->irq, priv->ndev);
 	tn40_sw_reset(priv);
+	tn40_destroy_rx_ring(priv);
 	tn40_destroy_tx_ring(priv);
 	return 0;
 }
@@ -965,6 +1498,8 @@ static int tn40_open(struct net_device *dev)
 		netdev_err(dev, "failed to start %d\n", ret);
 		return ret;
 	}
+	napi_enable(&priv->napi);
+	netif_start_queue(priv->ndev);
 	return 0;
 }
 
@@ -1215,6 +1750,7 @@ static int tn40_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	priv = netdev_priv(ndev);
 	pci_set_drvdata(pdev, priv);
+	netif_napi_add(ndev, &priv->napi, tn40_poll);
 
 	priv->regs = regs;
 	priv->pdev = pdev;
diff --git a/drivers/net/ethernet/tehuti/tn40.h b/drivers/net/ethernet/tehuti/tn40.h
index d532bca6ec37..cbb7b80c66b1 100644
--- a/drivers/net/ethernet/tehuti/tn40.h
+++ b/drivers/net/ethernet/tehuti/tn40.h
@@ -62,6 +62,33 @@ struct tn40_txd_fifo {
 	struct tn40_fifo m; /* The minimal set of variables used by all fifos */
 };
 
+struct tn40_rxf_fifo {
+	struct tn40_fifo m; /* The minimal set of variables used by all fifos */
+};
+
+struct tn40_rxd_fifo {
+	struct tn40_fifo m; /* The minimal set of variables used by all fifos */
+};
+
+struct tn40_rx_page {
+	struct page *page;
+	u64 dma;
+};
+
+struct tn40_rx_map {
+	struct tn40_rx_page rx_page;
+	u64 dma;
+	u32 off;
+	u32 size; /* Mapped area (i.e. page) size */
+};
+
+struct tn40_rxdb {
+	int *stack;
+	struct tn40_rx_map *elems;
+	int nelem;
+	int top;
+};
+
 union tn40_tx_dma_addr {
 	dma_addr_t dma;
 	struct sk_buff *skb;
@@ -85,10 +112,24 @@ struct tn40_txdb {
 	int size; /* Number of elements in the db */
 };
 
+struct tn40_rx_page_table {
+	int page_size;
+	int buf_size;
+	struct tn40_rx_page rx_pages;
+};
+
 struct tn40_priv {
 	struct net_device *ndev;
 	struct pci_dev *pdev;
 
+	struct napi_struct napi;
+	/* RX FIFOs: 1 for data (full) descs, and 1 for free descs */
+	struct tn40_rxd_fifo rxd_fifo0;
+	struct tn40_rxf_fifo rxf_fifo0;
+	struct tn40_rxdb *rxdb0; /* Rx dbs to store skb pointers */
+	int napi_stop;
+	struct vlan_group *vlgrp;
+
 	/* Tx FIFOs: 1 for data desc, 1 for empty (acks) desc */
 	struct tn40_txd_fifo txd_fifo0;
 	struct tn40_txf_fifo txf_fifo0;
@@ -117,6 +158,34 @@ struct tn40_priv {
 	u32 b0_len;
 	dma_addr_t b0_dma; /* Physical address of buffer */
 	char *b0_va; /* Virtual address of buffer */
+
+	struct tn40_rx_page_table rx_page_table;
+};
+
+/* RX FREE descriptor - 64bit */
+struct tn40_rxf_desc {
+	__le32 info; /* Buffer Count + Info - described below */
+	__le32 va_lo; /* VAdr[31:0] */
+	__le32 va_hi; /* VAdr[63:32] */
+	__le32 pa_lo; /* PAdr[31:0] */
+	__le32 pa_hi; /* PAdr[63:32] */
+	__le32 len; /* Buffer Length */
+};
+
+#define TN40_GET_RXD_BC(x) FIELD_GET(GENMASK(4, 0), (x))
+#define TN40_GET_RXD_ERR(x) FIELD_GET(GENMASK(26, 21), (x))
+#define TN40_GET_RXD_PKT_ID(x) FIELD_GET(GENMASK(30, 28), (x))
+#define TN40_GET_RXD_VTAG(x) FIELD_GET(BIT(31), (x))
+#define TN40_GET_RXD_VLAN_TCI(x) FIELD_GET(GENMASK(15, 0), (x))
+
+struct tn40_rxd_desc {
+	__le32 rxd_val1;
+	__le16 len;
+	__le16 rxd_vlan;
+	__le32 va_lo;
+	__le32 va_hi;
+	__le32 rss_lo;
+	__le32 rss_hash;
 };
 
 #define TN40_MAX_PBL (19)
-- 
2.34.1



* [PATCH net-next v4 5/6] net: tn40xx: add mdio bus support
  2024-05-01 23:05 [PATCH net-next v4 0/6] add ethernet driver for Tehuti Networks TN40xx chips FUJITA Tomonori
                   ` (3 preceding siblings ...)
  2024-05-01 23:05 ` [PATCH net-next v4 4/6] net: tn40xx: add basic Rx handling FUJITA Tomonori
@ 2024-05-01 23:05 ` FUJITA Tomonori
  2024-05-01 23:05 ` [PATCH net-next v4 6/6] net: tn40xx: add PHYLIB support FUJITA Tomonori
  5 siblings, 0 replies; 22+ messages in thread
From: FUJITA Tomonori @ 2024-05-01 23:05 UTC (permalink / raw)
  To: netdev; +Cc: andrew, kuba, jiri, horms

This patch adds support for the MDIO bus. A later patch adds PHYLIB
support on top of this.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
---
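For illustration, a small sketch of how the command word written to
TN40_REG_MDIO_CMD is put together in tn40_mdio_read() and
tn40_mdio_write(). The mdio_sketch_* names are hypothetical. The bit
positions are taken from the code in this patch, and calling bit 15 a
"read" flag is an assumption based on it being set only on the read
path.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 15 is ORed into the command only by tn40_mdio_read(); treating
 * it as a "start read" flag is an assumption of this sketch.
 */
#define MDIO_SKETCH_READ (1u << 15)

/* Pack the clause 45 device (MMD) number into bits 4:0 and the port
 * (PHY address) into bits 9:5. Both values are masked to 5 bits.
 */
static uint32_t mdio_sketch_cmd(int port, int device)
{
	return (uint32_t)((device & 0x1F) | ((port & 0x1F) << 5));
}

/* Same address word with the read flag set. */
static uint32_t mdio_sketch_read_cmd(int port, int device)
{
	return MDIO_SKETCH_READ | mdio_sketch_cmd(port, device);
}
```

In the driver, the address word is written first, followed by the
register number to TN40_REG_MDIO_ADDR. A read then writes the word
again with bit 15 set, while a write stores the value to
TN40_REG_MDIO_DATA. Each step waits for the busy bit to clear.
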
 drivers/net/ethernet/tehuti/Makefile    |   2 +-
 drivers/net/ethernet/tehuti/tn40.c      |   6 ++
 drivers/net/ethernet/tehuti/tn40.h      |   3 +
 drivers/net/ethernet/tehuti/tn40_mdio.c | 134 ++++++++++++++++++++++++
 4 files changed, 144 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/tehuti/tn40_mdio.c

diff --git a/drivers/net/ethernet/tehuti/Makefile b/drivers/net/ethernet/tehuti/Makefile
index 1c468d99e476..7a0fe586a243 100644
--- a/drivers/net/ethernet/tehuti/Makefile
+++ b/drivers/net/ethernet/tehuti/Makefile
@@ -5,5 +5,5 @@
 
 obj-$(CONFIG_TEHUTI) += tehuti.o
 
-tn40xx-y := tn40.o
+tn40xx-y := tn40.o tn40_mdio.o
 obj-$(CONFIG_TEHUTI_TN40) += tn40xx.o
diff --git a/drivers/net/ethernet/tehuti/tn40.c b/drivers/net/ethernet/tehuti/tn40.c
index 6f9c5fc88d45..db1f781b8063 100644
--- a/drivers/net/ethernet/tehuti/tn40.c
+++ b/drivers/net/ethernet/tehuti/tn40.c
@@ -1776,6 +1776,12 @@ static int tn40_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err_unset_drvdata;
 	}
 
+	ret = tn40_mdiobus_init(priv);
+	if (ret) {
+		dev_err(&pdev->dev, "failed to initialize mdio bus.\n");
+		goto err_free_irq;
+	}
+
 	priv->stats_flag =
 		((tn40_read_reg(priv, TN40_FPGA_VER) & 0xFFF) != 308);
 
diff --git a/drivers/net/ethernet/tehuti/tn40.h b/drivers/net/ethernet/tehuti/tn40.h
index cbb7b80c66b1..ce991041caf9 100644
--- a/drivers/net/ethernet/tehuti/tn40.h
+++ b/drivers/net/ethernet/tehuti/tn40.h
@@ -160,6 +160,7 @@ struct tn40_priv {
 	char *b0_va; /* Virtual address of buffer */
 
 	struct tn40_rx_page_table rx_page_table;
+	struct mii_bus *mdio;
 };
 
 /* RX FREE descriptor - 64bit */
@@ -246,4 +247,6 @@ static inline void tn40_write_reg(struct tn40_priv *priv, u32 reg, u32 val)
 	writel(val, priv->regs + reg);
 }
 
+int tn40_mdiobus_init(struct tn40_priv *priv);
+
 #endif /* _TN40XX_H */
diff --git a/drivers/net/ethernet/tehuti/tn40_mdio.c b/drivers/net/ethernet/tehuti/tn40_mdio.c
new file mode 100644
index 000000000000..64ef7f40f25d
--- /dev/null
+++ b/drivers/net/ethernet/tehuti/tn40_mdio.c
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0+
+/* Copyright (c) Tehuti Networks Ltd. */
+
+#include <linux/netdevice.h>
+#include <linux/pci.h>
+#include <linux/phylink.h>
+
+#include "tn40.h"
+
+static void tn40_mdio_set_speed(struct tn40_priv *priv, u32 speed)
+{
+	void __iomem *regs = priv->regs;
+	int mdio_cfg;
+
+	mdio_cfg = readl(regs + TN40_REG_MDIO_CMD_STAT);
+	if (speed == 1)
+		mdio_cfg = (0x7d << 7) | 0x08;	/* 1MHz */
+	else
+		mdio_cfg = 0xA08;	/* 6MHz */
+	mdio_cfg |= (1 << 6);
+	writel(mdio_cfg, regs + TN40_REG_MDIO_CMD_STAT);
+	msleep(100);
+}
+
+static u32 tn40_mdio_stat(struct tn40_priv *priv)
+{
+	void __iomem *regs = priv->regs;
+
+	return readl(regs + TN40_REG_MDIO_CMD_STAT);
+}
+
+static int tn40_mdio_get(struct tn40_priv *priv, u32 *val)
+{
+	u32 stat;
+
+	return readx_poll_timeout_atomic(tn40_mdio_stat, priv, stat,
+					 TN40_GET_MDIO_BUSY(stat) == 0, 10,
+					 10000);
+}
+
+static int tn40_mdio_read(struct tn40_priv *priv, int port, int device,
+			  u16 regnum)
+{
+	void __iomem *regs = priv->regs;
+	u32 tmp_reg, i;
+
+	/* wait until MDIO is not busy */
+	if (tn40_mdio_get(priv, NULL))
+		return -EIO;
+
+	i = ((device & 0x1F) | ((port & 0x1F) << 5));
+	writel(i, regs + TN40_REG_MDIO_CMD);
+	writel((u32)regnum, regs + TN40_REG_MDIO_ADDR);
+	if (tn40_mdio_get(priv, NULL))
+		return -EIO;
+
+	writel(((1 << 15) | i), regs + TN40_REG_MDIO_CMD);
+	/* read CMD_STAT until not busy */
+	if (tn40_mdio_get(priv, NULL))
+		return -EIO;
+
+	tmp_reg = readl(regs + TN40_REG_MDIO_DATA);
+	return lower_16_bits(tmp_reg);
+}
+
+static int tn40_mdio_write(struct tn40_priv *priv, int port, int device,
+			   u16 regnum, u16 data)
+{
+	void __iomem *regs = priv->regs;
+	u32 tmp_reg = 0;
+	int ret;
+
+	/* wait until MDIO is not busy */
+	if (tn40_mdio_get(priv, NULL))
+		return -EIO;
+	writel(((device & 0x1F) | ((port & 0x1F) << 5)),
+	       regs + TN40_REG_MDIO_CMD);
+	writel((u32)regnum, regs + TN40_REG_MDIO_ADDR);
+	if (tn40_mdio_get(priv, NULL))
+		return -EIO;
+	writel((u32)data, regs + TN40_REG_MDIO_DATA);
+	/* read CMD_STAT until not busy */
+	ret = tn40_mdio_get(priv, &tmp_reg);
+	if (ret)
+		return -EIO;
+
+	if (TN40_GET_MDIO_RD_ERR(tmp_reg)) {
+		dev_err(&priv->pdev->dev, "MDIO error after write command\n");
+		return -EIO;
+	}
+	return 0;
+}
+
+static int tn40_mdio_read_cb(struct mii_bus *mii_bus, int addr, int devnum,
+			     int regnum)
+{
+	return tn40_mdio_read(mii_bus->priv, addr, devnum, regnum);
+}
+
+static int tn40_mdio_write_cb(struct mii_bus *mii_bus, int addr, int devnum,
+			      int regnum, u16 val)
+{
+	return tn40_mdio_write(mii_bus->priv, addr, devnum, regnum, val);
+}
+
+int tn40_mdiobus_init(struct tn40_priv *priv)
+{
+	struct pci_dev *pdev = priv->pdev;
+	struct mii_bus *bus;
+	int ret;
+
+	bus = devm_mdiobus_alloc(&pdev->dev);
+	if (!bus)
+		return -ENOMEM;
+
+	bus->name = TN40_DRV_NAME;
+	bus->parent = &pdev->dev;
+	snprintf(bus->id, MII_BUS_ID_SIZE, "tn40xx-%x-%x",
+		 pci_domain_nr(pdev->bus), pci_dev_id(pdev));
+	bus->priv = priv;
+
+	bus->read_c45 = tn40_mdio_read_cb;
+	bus->write_c45 = tn40_mdio_write_cb;
+
+	ret = devm_mdiobus_register(&pdev->dev, bus);
+	if (ret) {
+		dev_err(&pdev->dev, "failed to register mdiobus %d %u %u\n",
+			ret, bus->state, MDIOBUS_UNREGISTERED);
+		return ret;
+	}
+	tn40_mdio_set_speed(priv, TN40_MDIO_SPEED_6MHZ);
+	priv->mdio = bus;
+	return 0;
+}
-- 
2.34.1



* [PATCH net-next v4 6/6] net: tn40xx: add PHYLIB support
  2024-05-01 23:05 [PATCH net-next v4 0/6] add ethernet driver for Tehuti Networks TN40xx chips FUJITA Tomonori
                   ` (4 preceding siblings ...)
  2024-05-01 23:05 ` [PATCH net-next v4 5/6] net: tn40xx: add mdio bus support FUJITA Tomonori
@ 2024-05-01 23:05 ` FUJITA Tomonori
  2024-05-08 12:21   ` Andrew Lunn
  5 siblings, 1 reply; 22+ messages in thread
From: FUJITA Tomonori @ 2024-05-01 23:05 UTC (permalink / raw)
  To: netdev; +Cc: andrew, kuba, jiri, horms

This patch adds support for multiple PHYs with PHYLIB. Adapters with
TN40xx chips use a variety of PHYs: AMCC QT2025, TI TLK10232, Aqrate
AQR105, and Marvell 88X3120, 88X3310, and MV88E2010.

For now, the PCI ID table of this driver enables only adapters using
the QT2025 PHY. I've tested this driver and the QT2025 PHY driver with
an Edimax EN-9320 10G adapter.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
---
 drivers/net/ethernet/tehuti/Kconfig    |  1 +
 drivers/net/ethernet/tehuti/Makefile   |  2 +-
 drivers/net/ethernet/tehuti/tn40.c     | 34 ++++++++++---
 drivers/net/ethernet/tehuti/tn40.h     |  7 +++
 drivers/net/ethernet/tehuti/tn40_phy.c | 67 ++++++++++++++++++++++++++
 5 files changed, 104 insertions(+), 7 deletions(-)
 create mode 100644 drivers/net/ethernet/tehuti/tn40_phy.c

diff --git a/drivers/net/ethernet/tehuti/Kconfig b/drivers/net/ethernet/tehuti/Kconfig
index 4198fd59e42e..6ad5d37eb0e4 100644
--- a/drivers/net/ethernet/tehuti/Kconfig
+++ b/drivers/net/ethernet/tehuti/Kconfig
@@ -27,6 +27,7 @@ config TEHUTI_TN40
 	tristate "Tehuti Networks TN40xx 10G Ethernet adapters"
 	depends on PCI
 	select FW_LOADER
+	select PHYLINK
 	help
 	  This driver supports 10G Ethernet adapters using Tehuti Networks
 	  TN40xx chips. Currently, adapters with Applied Micro Circuits
diff --git a/drivers/net/ethernet/tehuti/Makefile b/drivers/net/ethernet/tehuti/Makefile
index 7a0fe586a243..0d4f4d63a65c 100644
--- a/drivers/net/ethernet/tehuti/Makefile
+++ b/drivers/net/ethernet/tehuti/Makefile
@@ -5,5 +5,5 @@
 
 obj-$(CONFIG_TEHUTI) += tehuti.o
 
-tn40xx-y := tn40.o tn40_mdio.o
+tn40xx-y := tn40.o tn40_mdio.o tn40_phy.o
 obj-$(CONFIG_TEHUTI_TN40) += tn40xx.o
diff --git a/drivers/net/ethernet/tehuti/tn40.c b/drivers/net/ethernet/tehuti/tn40.c
index db1f781b8063..bf9c00513a0c 100644
--- a/drivers/net/ethernet/tehuti/tn40.c
+++ b/drivers/net/ethernet/tehuti/tn40.c
@@ -7,6 +7,7 @@
 #include <linux/if_vlan.h>
 #include <linux/netdevice.h>
 #include <linux/pci.h>
+#include <linux/phylink.h>
 
 #include "tn40.h"
 
@@ -1185,21 +1186,25 @@ static void tn40_link_changed(struct tn40_priv *priv)
 	u32 link = tn40_read_reg(priv,
 				 TN40_REG_MAC_LNK_STAT) & TN40_MAC_LINK_STAT;
 	if (!link) {
-		if (netif_carrier_ok(priv->ndev) && priv->link)
+		if (netif_carrier_ok(priv->ndev) && priv->link) {
 			netif_stop_queue(priv->ndev);
+			phylink_mac_change(priv->phylink, false);
+		}
 
 		priv->link = 0;
 		if (priv->link_loop_cnt++ > TN40_LINK_LOOP_MAX) {
 			/* MAC reset */
 			tn40_set_link_speed(priv, 0);
+			tn40_set_link_speed(priv, priv->speed);
 			priv->link_loop_cnt = 0;
 		}
 		tn40_write_reg(priv, 0x5150, 1000000);
 		return;
 	}
-	if (!netif_carrier_ok(priv->ndev) && !link)
+	if (!netif_carrier_ok(priv->ndev) && !link) {
 		netif_wake_queue(priv->ndev);
-
+		phylink_mac_change(priv->phylink, true);
+	}
 	priv->link = link;
 }
 
@@ -1477,6 +1482,9 @@ static int tn40_close(struct net_device *ndev)
 {
 	struct tn40_priv *priv = netdev_priv(ndev);
 
+	phylink_stop(priv->phylink);
+	phylink_disconnect_phy(priv->phylink);
+
 	netif_napi_del(&priv->napi);
 	napi_disable(&priv->napi);
 	tn40_disable_interrupts(priv);
@@ -1492,10 +1500,17 @@ static int tn40_open(struct net_device *dev)
 	struct tn40_priv *priv = netdev_priv(dev);
 	int ret;
 
+	ret = phylink_connect_phy(priv->phylink, priv->phydev);
+	if (ret)
+		return ret;
+
 	tn40_sw_reset(priv);
+	phylink_start(priv->phylink);
 	ret = tn40_start(priv);
 	if (ret) {
 		netdev_err(dev, "failed to start %d\n", ret);
+		phylink_stop(priv->phylink);
+		phylink_disconnect_phy(priv->phylink);
 		return ret;
 	}
 	napi_enable(&priv->napi);
@@ -1790,19 +1805,25 @@ static int tn40_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		TN40_IR_TMR1;
 
 	tn40_mac_init(priv);
-
+	ret = tn40_phy_register(priv);
+	if (ret) {
+		dev_err(&pdev->dev, "failed to set up PHY.\n");
+		goto err_free_irq;
+	}
 	ret = tn40_priv_init(priv);
 	if (ret) {
 		dev_err(&pdev->dev, "failed to initialize tn40_priv.\n");
-		goto err_free_irq;
+		goto err_unregister_phydev;
 	}
 
 	ret = register_netdev(ndev);
 	if (ret) {
 		dev_err(&pdev->dev, "failed to register netdev.\n");
-		goto err_free_irq;
+		goto err_unregister_phydev;
 	}
 	return 0;
+err_unregister_phydev:
+	tn40_phy_unregister(priv);
 err_free_irq:
 	pci_free_irq_vectors(pdev);
 err_unset_drvdata:
@@ -1823,6 +1844,7 @@ static void tn40_remove(struct pci_dev *pdev)
 
 	unregister_netdev(ndev);
 
+	tn40_phy_unregister(priv);
 	pci_free_irq_vectors(priv->pdev);
 	pci_set_drvdata(pdev, NULL);
 	iounmap(priv->regs);
diff --git a/drivers/net/ethernet/tehuti/tn40.h b/drivers/net/ethernet/tehuti/tn40.h
index ce991041caf9..cfe7f2318be2 100644
--- a/drivers/net/ethernet/tehuti/tn40.h
+++ b/drivers/net/ethernet/tehuti/tn40.h
@@ -161,6 +161,10 @@ struct tn40_priv {
 
 	struct tn40_rx_page_table rx_page_table;
 	struct mii_bus *mdio;
+	struct phy_device *phydev;
+	struct phylink *phylink;
+	struct phylink_config phylink_config;
+	int speed;
 };
 
 /* RX FREE descriptor - 64bit */
@@ -249,4 +253,7 @@ static inline void tn40_write_reg(struct tn40_priv *priv, u32 reg, u32 val)
 
 int tn40_mdiobus_init(struct tn40_priv *priv);
 
+int tn40_phy_register(struct tn40_priv *priv);
+void tn40_phy_unregister(struct tn40_priv *priv);
+
 #endif /* _TN40XX_H */
diff --git a/drivers/net/ethernet/tehuti/tn40_phy.c b/drivers/net/ethernet/tehuti/tn40_phy.c
new file mode 100644
index 000000000000..97aa3e100a3b
--- /dev/null
+++ b/drivers/net/ethernet/tehuti/tn40_phy.c
@@ -0,0 +1,67 @@
+// SPDX-License-Identifier: GPL-2.0+
+/* Copyright (c) Tehuti Networks Ltd. */
+
+#include <linux/netdevice.h>
+#include <linux/pci.h>
+#include <linux/phylink.h>
+
+#include "tn40.h"
+
+static void tn40_link_up(struct phylink_config *config, struct phy_device *phy,
+			 unsigned int mode, phy_interface_t interface,
+			 int speed, int duplex, bool tx_pause, bool rx_pause)
+{
+	struct tn40_priv *priv = container_of(config, struct tn40_priv,
+					      phylink_config);
+
+	priv->speed = speed;
+}
+
+static void tn40_link_down(struct phylink_config *config, unsigned int mode,
+			   phy_interface_t interface)
+{
+}
+
+static void tn40_mac_config(struct phylink_config *config, unsigned int mode,
+			    const struct phylink_link_state *state)
+{
+}
+
+static const struct phylink_mac_ops tn40_mac_ops = {
+	.mac_config = tn40_mac_config,
+	.mac_link_up = tn40_link_up,
+	.mac_link_down = tn40_link_down,
+};
+
+int tn40_phy_register(struct tn40_priv *priv)
+{
+	struct phylink_config *config;
+	struct phy_device *phydev;
+	struct phylink *phylink;
+
+	phydev = phy_find_first(priv->mdio);
+	if (!phydev) {
+		dev_err(&priv->pdev->dev, "PHY isn't found\n");
+		return -1;
+	}
+
+	config = &priv->phylink_config;
+	config->dev = &priv->ndev->dev;
+	config->type = PHYLINK_NETDEV;
+	config->mac_capabilities = MAC_10000FD | MLO_AN_PHY;
+	__set_bit(PHY_INTERFACE_MODE_XAUI, config->supported_interfaces);
+
+	phylink = phylink_create(config, NULL, PHY_INTERFACE_MODE_XAUI,
+				 &tn40_mac_ops);
+	if (IS_ERR(phylink))
+		return PTR_ERR(phylink);
+
+	priv->phydev = phydev;
+	priv->phylink = phylink;
+	return 0;
+}
+
+void tn40_phy_unregister(struct tn40_priv *priv)
+{
+	phylink_destroy(priv->phylink);
+}
-- 
2.34.1



* Re: [PATCH net-next v4 4/6] net: tn40xx: add basic Rx handling
  2024-05-01 23:05 ` [PATCH net-next v4 4/6] net: tn40xx: add basic Rx handling FUJITA Tomonori
@ 2024-05-06  9:20   ` Paolo Abeni
  2024-05-08  7:26     ` FUJITA Tomonori
  2024-05-07  1:58   ` Jakub Kicinski
  1 sibling, 1 reply; 22+ messages in thread
From: Paolo Abeni @ 2024-05-06  9:20 UTC (permalink / raw)
  To: FUJITA Tomonori, netdev; +Cc: andrew, kuba, jiri, horms


On Thu, 2024-05-02 at 08:05 +0900, FUJITA Tomonori wrote:
> +static struct tn40_rx_page *tn40_rx_page_alloc(struct tn40_priv *priv)
> +{
> +	struct tn40_rx_page *rx_page = &priv->rx_page_table.rx_pages;
> +	int page_size = priv->rx_page_table.page_size;
> +	struct page *page;
> +	gfp_t gfp_mask;
> +	dma_addr_t dma;
> +
> +	gfp_mask = GFP_ATOMIC | __GFP_NOWARN;
> +	if (page_size > PAGE_SIZE)
> +		gfp_mask |= __GFP_COMP;
> +
> +	page = alloc_pages(gfp_mask, get_order(page_size));
> +	if (likely(page)) {

Note that this allocation scheme can be problematic when the NIC
receives traffic from many different streams/connections: a single packet
can keep a full order-4 page in use, leading to overall memory usage
much greater than what truesize will report.

See commit 3226b158e67c. Here the under-estimation could fare worse.

Drivers usually use order-0 or order-1 pages.
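
To put rough numbers on this, here is a minimal sketch, assuming a
4096-byte base page and a hypothetical 2048-byte per-packet buffer
(neither figure comes from the patch, and the helper names are made up
for illustration). It computes how many bytes one allocation of a given
order pins, and how far that can exceed what a single buffer accounts
for when one lingering packet keeps the whole allocation alive.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes pinned by one page allocation of the given order. */
static size_t alloc_bytes(unsigned int order, size_t base_page_size)
{
	return base_page_size << order;
}

/*
 * Worst case: one buffer of buf_size bytes keeps the whole allocation
 * alive. Returns how many times more memory is pinned than that single
 * buffer accounts for. buf_size is an assumed value, not the driver's
 * actual truesize.
 */
static size_t worst_case_overhead(unsigned int order, size_t base_page_size,
				  size_t buf_size)
{
	assert(buf_size > 0);
	return alloc_bytes(order, base_page_size) / buf_size;
}
```

With these assumed sizes, an order-4 allocation pins 65536 bytes, 32
times the 2048 bytes one buffer accounts for; order-0 and order-1
allocations bound the ratio at 2 and 4.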

[...]
> +static void tn40_recycle_skb(struct tn40_priv *priv, struct tn40_rxd_desc *rxdd)
> +{

Minor nit: the function name is confusing, as it recycles an
internal buffer, not an skbuff.

Cheers,

Paolo



* Re: [PATCH net-next v4 1/6] net: tn40xx: add pci driver for Tehuti Networks TN40xx chips
  2024-05-01 23:05 ` [PATCH net-next v4 1/6] net: tn40xx: add pci " FUJITA Tomonori
@ 2024-05-07  1:38   ` Jakub Kicinski
  2024-05-08  7:36     ` FUJITA Tomonori
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Kicinski @ 2024-05-07  1:38 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: netdev, andrew, jiri, horms

On Thu,  2 May 2024 08:05:47 +0900 FUJITA Tomonori wrote:
> +	if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64))) {
> +		ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));

This fallback is unnecessary, please see commit f0ed939b6a or one of
many similar removals.


* Re: [PATCH net-next v4 3/6] net: tn40xx: add basic Tx handling
  2024-05-01 23:05 ` [PATCH net-next v4 3/6] net: tn40xx: add basic Tx handling FUJITA Tomonori
@ 2024-05-07  1:51   ` Jakub Kicinski
  2024-05-08  7:41     ` FUJITA Tomonori
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Kicinski @ 2024-05-07  1:51 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: netdev, andrew, jiri, horms

On Thu,  2 May 2024 08:05:49 +0900 FUJITA Tomonori wrote:
> +	err = tn40_tx_map_skb(priv, skb, txdd, &pkt_len);
> +	if (err) {
> +		dev_kfree_skb(skb);
> +		return NETDEV_TX_OK;

make sure you always count drops

> +	.ndo_get_stats = tn40_get_stats,

ndo_get_stats64 is the standard these days

> +struct tn40_txd_desc {
> +	__le32 txd_val1;
> +	__le16 mss;
> +	__le16 length;
> +	__le32 va_lo;
> +	__le32 va_hi;
> +	struct tn40_pbl pbl[]; /* Fragments */
> +} __packed;
> +
> +struct tn40_txf_desc {
> +	u32 status;
> +	u32 va_lo; /* VAdr[31:0] */
> +	u32 va_hi; /* VAdr[63:32] */
> +	u32 pad;
> +} __packed;

Can these be unaligned? There don't seem to be any holes in these
structs, so it's not necessary to pack them unless you want them to be
unaligned.
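
A quick userspace check of this point, as a sketch assuming GCC or
Clang on a common ABI where uint32_t is 4-byte aligned. The struct
names are stand-ins for tn40_txf_desc, with the kernel types replaced
by <stdint.h> ones; the helpers compare the layout with and without
the packed attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for tn40_txf_desc without the packed attribute. */
struct txf_plain {
	uint32_t status;
	uint32_t va_lo;
	uint32_t va_hi;
	uint32_t pad;
};

/* The same fields with the attribute the patch used. */
struct txf_packed {
	uint32_t status;
	uint32_t va_lo;
	uint32_t va_hi;
	uint32_t pad;
} __attribute__((packed));

/* Size and field offsets are identical: there are no holes to remove. */
static void check_layout(void)
{
	assert(sizeof(struct txf_plain) == 16);
	assert(sizeof(struct txf_packed) == 16);
	assert(offsetof(struct txf_plain, va_hi) ==
	       offsetof(struct txf_packed, va_hi));
	assert(offsetof(struct txf_plain, pad) ==
	       offsetof(struct txf_packed, pad));
}

/* What does change is the alignment the compiler may assume. */
static size_t plain_align(void)
{
	return _Alignof(struct txf_plain);
}

static size_t packed_align(void)
{
	return _Alignof(struct txf_packed);
}
```

On such an ABI both variants are 16 bytes with identical offsets; only
the assumed alignment drops from 4 to 1, so packing matters only if the
descriptors can really sit at unaligned addresses.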


* Re: [PATCH net-next v4 4/6] net: tn40xx: add basic Rx handling
  2024-05-01 23:05 ` [PATCH net-next v4 4/6] net: tn40xx: add basic Rx handling FUJITA Tomonori
  2024-05-06  9:20   ` Paolo Abeni
@ 2024-05-07  1:58   ` Jakub Kicinski
  2024-05-08  7:43     ` FUJITA Tomonori
  1 sibling, 1 reply; 22+ messages in thread
From: Jakub Kicinski @ 2024-05-07  1:58 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: netdev, andrew, jiri, horms

On Thu,  2 May 2024 08:05:50 +0900 FUJITA Tomonori wrote:
> @@ -745,6 +1248,21 @@ static irqreturn_t tn40_isr_napi(int irq, void *dev)
>  	return IRQ_HANDLED;
>  }
>  
> +static int tn40_poll(struct napi_struct *napi, int budget)
> +{
> +	struct tn40_priv *priv = container_of(napi, struct tn40_priv, napi);
> +	int work_done;
> +
> +	tn40_tx_cleanup(priv);
> +
> +	work_done = tn40_rx_receive(priv, &priv->rxd_fifo0, budget);
> +	if (work_done < budget) {
> +		napi_complete(napi);

napi_complete_done() works better with busy polling and such 

> +		tn40_enable_interrupts(priv);
> +	}
> +	return work_done;
> +}
> +

> +	netif_napi_del(&priv->napi);
> +	napi_disable(&priv->napi);

These two lines are likely in the wrong order


* Re: [PATCH net-next v4 4/6] net: tn40xx: add basic Rx handling
  2024-05-06  9:20   ` Paolo Abeni
@ 2024-05-08  7:26     ` FUJITA Tomonori
  0 siblings, 0 replies; 22+ messages in thread
From: FUJITA Tomonori @ 2024-05-08  7:26 UTC (permalink / raw)
  To: pabeni; +Cc: fujita.tomonori, netdev, andrew, kuba, jiri, horms

Hi,

Thanks for reviewing the patch!

On Mon, 06 May 2024 11:20:55 +0200
Paolo Abeni <pabeni@redhat.com> wrote:

> 
> On Thu, 2024-05-02 at 08:05 +0900, FUJITA Tomonori wrote:
>> +static struct tn40_rx_page *tn40_rx_page_alloc(struct tn40_priv *priv)
>> +{
>> +	struct tn40_rx_page *rx_page = &priv->rx_page_table.rx_pages;
>> +	int page_size = priv->rx_page_table.page_size;
>> +	struct page *page;
>> +	gfp_t gfp_mask;
>> +	dma_addr_t dma;
>> +
>> +	gfp_mask = GFP_ATOMIC | __GFP_NOWARN;
>> +	if (page_size > PAGE_SIZE)
>> +		gfp_mask |= __GFP_COMP;
>> +
>> +	page = alloc_pages(gfp_mask, get_order(page_size));
>> +	if (likely(page)) {
> 
> Note that this allocation scheme can be problematic when the NIC
> receives traffic from many different streams/connections: a single packet
> can keep a full order-4 page in use, leading to overall memory usage
> much greater than what truesize will report.
> 
> See commit 3226b158e67c. Here the under-estimation could fare worse.
> 
> Drivers usually use order-0 or order-1 pages.

Understood. I fixed the driver to use only order-0 or order-1 pages.

> [...]
>> +static void tn40_recycle_skb(struct tn40_priv *priv, struct tn40_rxd_desc *rxdd)
>> +{
> 
> Minor nit: the function name is confusing, as it recycles an
> internal buffer, not an skbuff.

Sure, I'll go with recycle_rx_buffer instead.


* Re: [PATCH net-next v4 1/6] net: tn40xx: add pci driver for Tehuti Networks TN40xx chips
  2024-05-07  1:38   ` Jakub Kicinski
@ 2024-05-08  7:36     ` FUJITA Tomonori
  2024-05-08 13:49       ` Jakub Kicinski
  0 siblings, 1 reply; 22+ messages in thread
From: FUJITA Tomonori @ 2024-05-08  7:36 UTC (permalink / raw)
  To: kuba; +Cc: fujita.tomonori, netdev, andrew, jiri, horms

Hi,

On Mon, 6 May 2024 18:38:25 -0700
Jakub Kicinski <kuba@kernel.org> wrote:

> On Thu,  2 May 2024 08:05:47 +0900 FUJITA Tomonori wrote:
>> +	if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64))) {
>> +		ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
> 
> This fallback is unnecessary, please see commit f0ed939b6a or one of
> many similar removals.

I see, fixed.

It might not be necessary to check the returned value here? I kept the
check, like the majority of drivers, though.


* Re: [PATCH net-next v4 3/6] net: tn40xx: add basic Tx handling
  2024-05-07  1:51   ` Jakub Kicinski
@ 2024-05-08  7:41     ` FUJITA Tomonori
  0 siblings, 0 replies; 22+ messages in thread
From: FUJITA Tomonori @ 2024-05-08  7:41 UTC (permalink / raw)
  To: kuba; +Cc: fujita.tomonori, netdev, andrew, jiri, horms

Hi,

On Mon, 6 May 2024 18:51:29 -0700
Jakub Kicinski <kuba@kernel.org> wrote:

> On Thu,  2 May 2024 08:05:49 +0900 FUJITA Tomonori wrote:
>> +	err = tn40_tx_map_skb(priv, skb, txdd, &pkt_len);
>> +	if (err) {
>> +		dev_kfree_skb(skb);
>> +		return NETDEV_TX_OK;
> 
> make sure you always count drops

Fixed.


>> +	.ndo_get_stats = tn40_get_stats,
> 
> ndo_get_stats64 is the standard these days

Fixed. I'll work on hardware stats support after this is merged (it seems
that some firmware versions support hw stats).


>> +struct tn40_txd_desc {
>> +	__le32 txd_val1;
>> +	__le16 mss;
>> +	__le16 length;
>> +	__le32 va_lo;
>> +	__le32 va_hi;
>> +	struct tn40_pbl pbl[]; /* Fragments */
>> +} __packed;
>> +
>> +struct tn40_txf_desc {
>> +	u32 status;
>> +	u32 va_lo; /* VAdr[31:0] */
>> +	u32 va_hi; /* VAdr[63:32] */
>> +	u32 pad;
>> +} __packed;
> 
> Can these be unaligned? There don't seem to be any holes in these
> structs, so it's not necessary to pack them unless you want them to be
> unaligned.

Indeed, not necessary. Removed.


* Re: [PATCH net-next v4 4/6] net: tn40xx: add basic Rx handling
  2024-05-07  1:58   ` Jakub Kicinski
@ 2024-05-08  7:43     ` FUJITA Tomonori
  0 siblings, 0 replies; 22+ messages in thread
From: FUJITA Tomonori @ 2024-05-08  7:43 UTC (permalink / raw)
  To: kuba; +Cc: fujita.tomonori, netdev, andrew, jiri, horms

Hi,

On Mon, 6 May 2024 18:58:37 -0700
Jakub Kicinski <kuba@kernel.org> wrote:

> On Thu,  2 May 2024 08:05:50 +0900 FUJITA Tomonori wrote:
>> @@ -745,6 +1248,21 @@ static irqreturn_t tn40_isr_napi(int irq, void *dev)
>>  	return IRQ_HANDLED;
>>  }
>>  
>> +static int tn40_poll(struct napi_struct *napi, int budget)
>> +{
>> +	struct tn40_priv *priv = container_of(napi, struct tn40_priv, napi);
>> +	int work_done;
>> +
>> +	tn40_tx_cleanup(priv);
>> +
>> +	work_done = tn40_rx_receive(priv, &priv->rxd_fifo0, budget);
>> +	if (work_done < budget) {
>> +		napi_complete(napi);
> 
> napi_complete_done() works better with busy polling and such 

Understood, fixed.

I also fixed the function to handle the cases where budget is zero.


>> +		tn40_enable_interrupts(priv);
>> +	}
>> +	return work_done;
>> +}
>> +
> 
>> +	netif_napi_del(&priv->napi);
>> +	napi_disable(&priv->napi);
> 
> These two lines are likely in the wrong order

Ah, fixed.


Thanks a lot!


* Re: [PATCH net-next v4 6/6] net: tn40xx: add PHYLIB support
  2024-05-01 23:05 ` [PATCH net-next v4 6/6] net: tn40xx: add PHYLIB support FUJITA Tomonori
@ 2024-05-08 12:21   ` Andrew Lunn
  2024-05-08 13:18     ` FUJITA Tomonori
  0 siblings, 1 reply; 22+ messages in thread
From: Andrew Lunn @ 2024-05-08 12:21 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: netdev, kuba, jiri, horms

On Thu, May 02, 2024 at 08:05:52AM +0900, FUJITA Tomonori wrote:
> This patch adds support for multiple PHY hardware with PHYLIB. The
> adapters with TN40xx chips use multiple PHY hardware; AMCC QT2025, TI
> TLK10232, Aqrate AQR105, and Marvell 88X3120, 88X3310, and MV88E2010.
> 
> For now, the PCI ID table of this driver enables adapters using only
> QT2025 PHY. I've tested this driver and the QT2025 PHY driver with
> Edimax EN-9320 10G adapter.
> 
> Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
> ---
>  drivers/net/ethernet/tehuti/Kconfig    |  1 +
>  drivers/net/ethernet/tehuti/Makefile   |  2 +-
>  drivers/net/ethernet/tehuti/tn40.c     | 34 ++++++++++---
>  drivers/net/ethernet/tehuti/tn40.h     |  7 +++
>  drivers/net/ethernet/tehuti/tn40_phy.c | 67 ++++++++++++++++++++++++++
>  5 files changed, 104 insertions(+), 7 deletions(-)
>  create mode 100644 drivers/net/ethernet/tehuti/tn40_phy.c
> 
> diff --git a/drivers/net/ethernet/tehuti/Kconfig b/drivers/net/ethernet/tehuti/Kconfig
> index 4198fd59e42e..6ad5d37eb0e4 100644
> --- a/drivers/net/ethernet/tehuti/Kconfig
> +++ b/drivers/net/ethernet/tehuti/Kconfig
> @@ -27,6 +27,7 @@ config TEHUTI_TN40
>  	tristate "Tehuti Networks TN40xx 10G Ethernet adapters"
>  	depends on PCI
>  	select FW_LOADER
> +	select PHYLINK
>  	help
>  	  This driver supports 10G Ethernet adapters using Tehuti Networks
>  	  TN40xx chips. Currently, adapters with Applied Micro Circuits
> diff --git a/drivers/net/ethernet/tehuti/Makefile b/drivers/net/ethernet/tehuti/Makefile
> index 7a0fe586a243..0d4f4d63a65c 100644
> --- a/drivers/net/ethernet/tehuti/Makefile
> +++ b/drivers/net/ethernet/tehuti/Makefile
> @@ -5,5 +5,5 @@
>  
>  obj-$(CONFIG_TEHUTI) += tehuti.o
>  
> -tn40xx-y := tn40.o tn40_mdio.o
> +tn40xx-y := tn40.o tn40_mdio.o tn40_phy.o
>  obj-$(CONFIG_TEHUTI_TN40) += tn40xx.o
> diff --git a/drivers/net/ethernet/tehuti/tn40.c b/drivers/net/ethernet/tehuti/tn40.c
> index db1f781b8063..bf9c00513a0c 100644
> --- a/drivers/net/ethernet/tehuti/tn40.c
> +++ b/drivers/net/ethernet/tehuti/tn40.c
> @@ -7,6 +7,7 @@
>  #include <linux/if_vlan.h>
>  #include <linux/netdevice.h>
>  #include <linux/pci.h>
> +#include <linux/phylink.h>
>  
>  #include "tn40.h"
>  
> @@ -1185,21 +1186,25 @@ static void tn40_link_changed(struct tn40_priv *priv)
>  	u32 link = tn40_read_reg(priv,
>  				 TN40_REG_MAC_LNK_STAT) & TN40_MAC_LINK_STAT;
>  	if (!link) {
> -		if (netif_carrier_ok(priv->ndev) && priv->link)
> +		if (netif_carrier_ok(priv->ndev) && priv->link) {
>  			netif_stop_queue(priv->ndev);
> +			phylink_mac_change(priv->phylink, false);
> +		}

What exactly does link_changed mean?

The normal use case for calling phylink_mac_change() is that you have
received an interrupt from something like the PCS, or the PHY. The MAC
driver itself cannot fully evaluate if the link is up because there
can be multiple parts in that decision. Is the SFP reporting LOS? Does
the PCS SERDES have sync, etc. So all you do is forward the interrupt
to phylink. phylink will then look at everything it knows about and
decide the state of the link, and maybe call one of the callbacks
indicating the link is now up/down.

>  		priv->link = 0;
>  		if (priv->link_loop_cnt++ > TN40_LINK_LOOP_MAX) {
>  			/* MAC reset */
>  			tn40_set_link_speed(priv, 0);
> +			tn40_set_link_speed(priv, priv->speed);
>  			priv->link_loop_cnt = 0;

This should move into the link_down callback.

> -	if (!netif_carrier_ok(priv->ndev) && !link)
> +	if (!netif_carrier_ok(priv->ndev) && !link) {
>  		netif_wake_queue(priv->ndev);

and this should be in the link_up callback.
> +static void tn40_link_up(struct phylink_config *config, struct phy_device *phy,
> +			 unsigned int mode, phy_interface_t interface,
> +			 int speed, int duplex, bool tx_pause, bool rx_pause)
> +{
> +	struct tn40_priv *priv = container_of(config, struct tn40_priv,
> +					      phylink_config);
> +
> +	priv->speed = speed;

This is where you should take any actions needed to make the MAC send
packets, at the correct rate.

> +}
> +
> +static void tn40_link_down(struct phylink_config *config, unsigned int mode,
> +			   phy_interface_t interface)
> +{

And here you should stop the MAC sending packets.

> +}

> +
> +static void tn40_mac_config(struct phylink_config *config, unsigned int mode,
> +			    const struct phylink_link_state *state)
> +{

I know at the moment you only support 10G. When you add support for
1G, this is where you will need to configure the MAC to swap between
the different modes. phylink will tell you which mode to use,
10GBaseX, 1000BaseX, SGMII, etc. You might want to move the existing
code for 10GBaseX here.

For the next version, please also Cc: Russell King, the phylink
Maintainer.

    Andrew


* Re: [PATCH net-next v4 6/6] net: tn40xx: add PHYLIB support
  2024-05-08 12:21   ` Andrew Lunn
@ 2024-05-08 13:18     ` FUJITA Tomonori
  2024-05-08 14:03       ` Andrew Lunn
  2024-05-09  4:23       ` FUJITA Tomonori
  0 siblings, 2 replies; 22+ messages in thread
From: FUJITA Tomonori @ 2024-05-08 13:18 UTC (permalink / raw)
  To: andrew; +Cc: fujita.tomonori, netdev, kuba, jiri, horms

Hi,

On Wed, 8 May 2024 14:21:29 +0200
Andrew Lunn <andrew@lunn.ch> wrote:

>> --- a/drivers/net/ethernet/tehuti/tn40.c
>> +++ b/drivers/net/ethernet/tehuti/tn40.c
>> @@ -7,6 +7,7 @@
>>  #include <linux/if_vlan.h>
>>  #include <linux/netdevice.h>
>>  #include <linux/pci.h>
>> +#include <linux/phylink.h>
>>  
>>  #include "tn40.h"
>>  
>> @@ -1185,21 +1186,25 @@ static void tn40_link_changed(struct tn40_priv *priv)
>>  	u32 link = tn40_read_reg(priv,
>>  				 TN40_REG_MAC_LNK_STAT) & TN40_MAC_LINK_STAT;
>>  	if (!link) {
>> -		if (netif_carrier_ok(priv->ndev) && priv->link)
>> +		if (netif_carrier_ok(priv->ndev) && priv->link) {
>>  			netif_stop_queue(priv->ndev);
>> +			phylink_mac_change(priv->phylink, false);
>> +		}
> 
> What exactly does link_changed mean?
> 
> The normal use case for calling phylink_mac_change() is that you have
> received an interrupt from something like the PCS, or the PHY. The MAC
> driver itself cannot fully evaluate if the link is up because there
> can be multiple parts in that decision. Is the SFP reporting LOS? Does

The original driver receives an interrupt from a PHY (or something),
then reads the register (TN40_REG_MAC_LNK_STAT) to evaluate the state
of the link; it doesn't use information from the PHY.


> the PCS SERDES have sync, etc. So all you do is forward the interrupt
> to phylink. phylink will then look at everything it knows about and
> decide the state of the link, and maybe call one of the callbacks
> indicating the link is now up/down.

What function should be used to forward an interrupt to phylink?
Something equivalent to phy_mac_interrupt() in phylib?


>>  		priv->link = 0;
>>  		if (priv->link_loop_cnt++ > TN40_LINK_LOOP_MAX) {
>>  			/* MAC reset */
>>  			tn40_set_link_speed(priv, 0);
>> +			tn40_set_link_speed(priv, priv->speed);
>>  			priv->link_loop_cnt = 0;
> 
> This should move into the link_down callback.

I'll try phylink callbacks to see if they would work. 


>> -	if (!netif_carrier_ok(priv->ndev) && !link)
>> +	if (!netif_carrier_ok(priv->ndev) && !link) {
>>  		netif_wake_queue(priv->ndev);
> 
> and this should be in the link_up callback.
>> +static void tn40_link_up(struct phylink_config *config, struct phy_device *phy,
>> +			 unsigned int mode, phy_interface_t interface,
>> +			 int speed, int duplex, bool tx_pause, bool rx_pause)
>> +{
>> +	struct tn40_priv *priv = container_of(config, struct tn40_priv,
>> +					      phylink_config);
>> +
>> +	priv->speed = speed;
> 
> This is where you should take any actions needed to make the MAC send
> packets, at the correct rate.
> 
>> +}
>> +
>> +static void tn40_link_down(struct phylink_config *config, unsigned int mode,
>> +			   phy_interface_t interface)
>> +{
> 
> And here you should stop the MAC sending packets.
> 
>> +}
> 
>> +
>> +static void tn40_mac_config(struct phylink_config *config, unsigned int mode,
>> +			    const struct phylink_link_state *state)
>> +{
> 
> I know at the moment you only support 10G. When you add support for
> 1G, this is where you will need to configure the MAC to swap between
> the different modes. phylink will tell you which mode to use,
> 10GBaseX, 1000BaseX, SGMII, etc. You might want to move the existing
> code for 10GBaseX here.

Yeah.

The original driver configures the MAC for 10G with QT2025 PHY so I'm
not sure things would work well. I'll experiment once I get a 1G SFP.


> For the next version, please also Cc: Russell King, the phylink
> Maintainer.

Sure, I'll do so in v6.


* Re: [PATCH net-next v4 1/6] net: tn40xx: add pci driver for Tehuti Networks TN40xx chips
  2024-05-08  7:36     ` FUJITA Tomonori
@ 2024-05-08 13:49       ` Jakub Kicinski
  0 siblings, 0 replies; 22+ messages in thread
From: Jakub Kicinski @ 2024-05-08 13:49 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: netdev, andrew, jiri, horms

On Wed, 08 May 2024 16:36:18 +0900 (JST) FUJITA Tomonori wrote:
> > On Thu,  2 May 2024 08:05:47 +0900 FUJITA Tomonori wrote:  
> >> +	if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64))) {
> >> +		ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));  
> > 
> > This fallback is unnecessary, please see commit f0ed939b6a or one of
> > many similar removals.
> 
> I see, fixed.
> 
> It might not be necessary to check the returned value here? I kept the
> check, like the majority of drivers, though.

Right, keep the error check. It's just that the failure, if it happens,
will not be related to the length of the mask. So "fallback" to 32b if
64b fails is unnecessary.


* Re: [PATCH net-next v4 6/6] net: tn40xx: add PHYLIB support
  2024-05-08 13:18     ` FUJITA Tomonori
@ 2024-05-08 14:03       ` Andrew Lunn
  2024-05-09  4:23       ` FUJITA Tomonori
  1 sibling, 0 replies; 22+ messages in thread
From: Andrew Lunn @ 2024-05-08 14:03 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: netdev, kuba, jiri, horms

On Wed, May 08, 2024 at 10:18:51PM +0900, FUJITA Tomonori wrote:
> Hi,
> 
> On Wed, 8 May 2024 14:21:29 +0200
> Andrew Lunn <andrew@lunn.ch> wrote:
> 
> >> --- a/drivers/net/ethernet/tehuti/tn40.c
> >> +++ b/drivers/net/ethernet/tehuti/tn40.c
> >> @@ -7,6 +7,7 @@
> >>  #include <linux/if_vlan.h>
> >>  #include <linux/netdevice.h>
> >>  #include <linux/pci.h>
> >> +#include <linux/phylink.h>
> >>  
> >>  #include "tn40.h"
> >>  
> >> @@ -1185,21 +1186,25 @@ static void tn40_link_changed(struct tn40_priv *priv)
> >>  	u32 link = tn40_read_reg(priv,
> >>  				 TN40_REG_MAC_LNK_STAT) & TN40_MAC_LINK_STAT;
> >>  	if (!link) {
> >> -		if (netif_carrier_ok(priv->ndev) && priv->link)
> >> +		if (netif_carrier_ok(priv->ndev) && priv->link) {
> >>  			netif_stop_queue(priv->ndev);
> >> +			phylink_mac_change(priv->phylink, false);
> >> +		}
> > 
> > What exactly does link_changed mean?
> > 
> > The normal use case for calling phylink_mac_change() is that you have
> > received an interrupt from something like the PCS, or the PHY. The MAC
> > driver itself cannot fully evaluate if the link is up because there
> > can be multiple parts in that decision. Is the SFP reporting LOS? Does
> 
> The original driver receives an interrupt from a PHY (or something),
> then reads the register (TN40_REG_MAC_LNK_STAT) to evaluate the state
> of the link; it doesn't use information from the PHY.

So I guess this is the PCS state. Call phylink_mac_change() with this
state. Don't do anything else here.

       Andrew


* Re: [PATCH net-next v4 6/6] net: tn40xx: add PHYLIB support
  2024-05-08 13:18     ` FUJITA Tomonori
  2024-05-08 14:03       ` Andrew Lunn
@ 2024-05-09  4:23       ` FUJITA Tomonori
  2024-05-09 13:37         ` Andrew Lunn
  1 sibling, 1 reply; 22+ messages in thread
From: FUJITA Tomonori @ 2024-05-09  4:23 UTC (permalink / raw)
  To: andrew; +Cc: netdev, kuba, jiri, horms, fujita.tomonori

Hi,

On Wed, 08 May 2024 22:18:51 +0900 (JST)
FUJITA Tomonori <fujita.tomonori@gmail.com> wrote:

>>>  		priv->link = 0;
>>>  		if (priv->link_loop_cnt++ > TN40_LINK_LOOP_MAX) {
>>>  			/* MAC reset */
>>>  			tn40_set_link_speed(priv, 0);
>>> +			tn40_set_link_speed(priv, priv->speed);
>>>  			priv->link_loop_cnt = 0;
>> 
>> This should move into the link_down callback.
> 
> I'll try phylink callbacks to see if they would work. 

I found that the link_down callback doesn't work well for the MAC
reset above.

Currently, when the TN40_REG_MAC_LNK_STAT register reports that the link
is down, the driver configures the MAC to generate an interrupt
periodically; tn40_write_reg(priv, 0x5150, 1000000) is called in
tn40_link_changed().

Eventually, the counter exceeds TN40_LINK_LOOP_MAX and then the driver
executes the MAC reset. Without the MAC reset, the NIC will not work.

The link_down callback is called only when the link goes down, so it
can't be used to trigger the MAC reset.
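
The mismatch can be modelled in userspace. This is only a sketch under
stated assumptions: struct link_model, link_tick() and run_link_down()
are hypothetical names, LINK_LOOP_MAX is an arbitrary stand-in for
TN40_LINK_LOOP_MAX, and none of it touches phylink or the hardware.
Each call to link_tick() represents one periodic interrupt that samples
the link-status bit; the model counts how often an edge-triggered
link-down notification would fire versus how often the counter-based
reset fires.

```c
#include <assert.h>
#include <stdbool.h>

#define LINK_LOOP_MAX 3	/* arbitrary stand-in for TN40_LINK_LOOP_MAX */

struct link_model {
	bool last_up;		/* last link state that was observed */
	int link_down_calls;	/* edge-triggered: once per up->down change */
	int loop_cnt;		/* ticks counted while the link stays down */
	int mac_resets;		/* level-triggered: repeats while link is down */
};

/* One periodic interrupt; 'up' is the sampled link-status bit. */
static void link_tick(struct link_model *m, bool up)
{
	if (m->last_up && !up)
		m->link_down_calls++;
	m->last_up = up;

	if (up)
		return;

	/* Same shape as the counter check in tn40_link_changed(). */
	if (m->loop_cnt++ > LINK_LOOP_MAX) {
		m->mac_resets++;
		m->loop_cnt = 0;
	}
}

/* Start with the link up, then feed 'ticks' consecutive link-down samples. */
static struct link_model run_link_down(int ticks)
{
	struct link_model m = { .last_up = true };

	assert(ticks >= 0);
	for (int i = 0; i < ticks; i++)
		link_tick(&m, false);
	return m;
}
```

With LINK_LOOP_MAX set to 3, run_link_down(12) records one link-down
notification but two MAC resets: the notification marks the transition
only, while the reset depends on the link staying down across many
ticks.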


* Re: [PATCH net-next v4 6/6] net: tn40xx: add PHYLIB support
  2024-05-09  4:23       ` FUJITA Tomonori
@ 2024-05-09 13:37         ` Andrew Lunn
  2024-05-12  5:20           ` FUJITA Tomonori
  0 siblings, 1 reply; 22+ messages in thread
From: Andrew Lunn @ 2024-05-09 13:37 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: netdev, kuba, jiri, horms

On Thu, May 09, 2024 at 01:23:41PM +0900, FUJITA Tomonori wrote:
> Hi,
> 
> On Wed, 08 May 2024 22:18:51 +0900 (JST)
> FUJITA Tomonori <fujita.tomonori@gmail.com> wrote:
> 
> >>>  		priv->link = 0;
> >>>  		if (priv->link_loop_cnt++ > TN40_LINK_LOOP_MAX) {
> >>>  			/* MAC reset */
> >>>  			tn40_set_link_speed(priv, 0);
> >>> +			tn40_set_link_speed(priv, priv->speed);
> >>>  			priv->link_loop_cnt = 0;
> >> 
> >> This should move into the link_down callback.
> > 
> > I'll try phylink callbacks to see if they would work. 
> 
> I found that the link_down callback doesn't work well for the MAC
> reset above.
> 
> Currently, when the TN40_REG_MAC_LNK_STAT register reports that the link
> is down, the driver configures the MAC to generate an interrupt
> periodically; tn40_write_reg(priv, 0x5150, 1000000) is called in
> tn40_link_changed().
> 
> Eventually, the counter exceeds TN40_LINK_LOOP_MAX and then the driver
> executes the MAC reset. Without the MAC reset, the NIC will not work.
> 
> The link_down callback is called only when the link goes down, so it
> can't be used to trigger the MAC reset.

So this sounds like a hardware bug workaround.

But it might also be to do with auto-neg. The MAC PCS/SERDES and the
PHY PCS/SERDES, depending on the mode, should be performing auto-neg,
to indicate things like pause. For some hardware, you need to restart
autoneg when the line side gets link. It could be this hardware has
no way to do that, other than hit the whole thing with a reset?

Take a look at struct phylink_pcs_ops and see if you can map bits of
the driver to this structure. It might be you can implement a PCS, and
have the pcs_an_restart do the MAC reset.

     Andrew


* Re: [PATCH net-next v4 6/6] net: tn40xx: add PHYLIB support
  2024-05-09 13:37         ` Andrew Lunn
@ 2024-05-12  5:20           ` FUJITA Tomonori
  0 siblings, 0 replies; 22+ messages in thread
From: FUJITA Tomonori @ 2024-05-12  5:20 UTC (permalink / raw)
  To: andrew; +Cc: fujita.tomonori, netdev, kuba, jiri, horms

Hi,

On Thu, 9 May 2024 15:37:55 +0200
Andrew Lunn <andrew@lunn.ch> wrote:

> On Thu, May 09, 2024 at 01:23:41PM +0900, FUJITA Tomonori wrote:
>> Hi,
>> 
>> On Wed, 08 May 2024 22:18:51 +0900 (JST)
>> FUJITA Tomonori <fujita.tomonori@gmail.com> wrote:
>> 
>> >>>  		priv->link = 0;
>> >>>  		if (priv->link_loop_cnt++ > TN40_LINK_LOOP_MAX) {
>> >>>  			/* MAC reset */
>> >>>  			tn40_set_link_speed(priv, 0);
>> >>> +			tn40_set_link_speed(priv, priv->speed);
>> >>>  			priv->link_loop_cnt = 0;
>> >> 
>> >> This should move into the link_down callback.
>> > 
>> > I'll try phylink callbacks to see if they would work. 
>> 
>> I found that the link_down callback doesn't work well for the MAC
>> reset above.
>> 
>> Currently, when the TN40_REG_MAC_LNK_STAT register reports that the link
>> is down, the driver configures the MAC to generate an interrupt
>> periodically; tn40_write_reg(priv, 0x5150, 1000000) is called in
>> tn40_link_changed().
>> 
>> Eventually, the counter exceeds TN40_LINK_LOOP_MAX and then the driver
>> executes the MAC reset. Without the MAC reset, the NIC will not work.
>> 
>> The link_down callback is called only when the link goes down, so it
>> can't be used to trigger the MAC reset.
> 
> So this sounds like a hardware bug workaround.

Yeah, looks so.

> But it might also be to do with auto-neg. The MAC PCS/SERDES and the
> PHY PCS/SERDES, depending on the mode, should be performing auto-neg,
> to indicate things like pause. For some hardware, you need to restart
> autoneg when the line side gets link. It could be this hardware has
> no way to do that, other than hit the whole thing with a reset?

I can't find a function that does that in the original driver.


> Take a look at struct phylink_pcs_ops and see if you can map bits of
> the driver to this structure. It might be you can implement a PCS, and
> have the pcs_an_restart do the MAC reset.

I can't find anything that checks the link periodically until the link
is recovered.

I dropped the workaround and now do a reset every time the link goes
down (mac_link_down). It seems to work.
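
A rough userspace sketch of that approach, for illustration only. It
assumes the reset is the same pair of tn40_set_link_speed() calls shown
in the hunk quoted above, issued from the link-down path. The cut-down
struct tn40_priv, the logging stub, and the simplified
tn40_mac_link_down() signature are hypothetical stand-ins, not the
driver's actual code:

```c
#include <assert.h>

#define TN40_LOG_MAX 8	/* hypothetical: size of the call log below */

/* Hypothetical, cut-down stand-in for the driver's private data. */
struct tn40_priv {
	int speed;		/* last configured link speed */
	int log[TN40_LOG_MAX];	/* speeds passed to tn40_set_link_speed() */
	int nlog;		/* number of recorded calls */
};

/* Stub: the real function programs the MAC; this one only records the call. */
static void tn40_set_link_speed(struct tn40_priv *priv, int speed)
{
	assert(priv->nlog < TN40_LOG_MAX);
	priv->log[priv->nlog++] = speed;
}

/*
 * Model of the link-down path: reset the MAC by setting speed 0, then
 * restore the configured speed (assumed to match the quoted hunk).
 */
static void tn40_mac_link_down(struct tn40_priv *priv)
{
	tn40_set_link_speed(priv, 0);
	tn40_set_link_speed(priv, priv->speed);
}
```

With this shape there is no link_loop_cnt and no periodic interrupt to
count; each link-down event triggers exactly one reset sequence.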


Thread overview: 22+ messages
2024-05-01 23:05 [PATCH net-next v4 0/6] add ethernet driver for Tehuti Networks TN40xx chips FUJITA Tomonori
2024-05-01 23:05 ` [PATCH net-next v4 1/6] net: tn40xx: add pci " FUJITA Tomonori
2024-05-07  1:38   ` Jakub Kicinski
2024-05-08  7:36     ` FUJITA Tomonori
2024-05-08 13:49       ` Jakub Kicinski
2024-05-01 23:05 ` [PATCH net-next v4 2/6] net: tn40xx: add register defines FUJITA Tomonori
2024-05-01 23:05 ` [PATCH net-next v4 3/6] net: tn40xx: add basic Tx handling FUJITA Tomonori
2024-05-07  1:51   ` Jakub Kicinski
2024-05-08  7:41     ` FUJITA Tomonori
2024-05-01 23:05 ` [PATCH net-next v4 4/6] net: tn40xx: add basic Rx handling FUJITA Tomonori
2024-05-06  9:20   ` Paolo Abeni
2024-05-08  7:26     ` FUJITA Tomonori
2024-05-07  1:58   ` Jakub Kicinski
2024-05-08  7:43     ` FUJITA Tomonori
2024-05-01 23:05 ` [PATCH net-next v4 5/6] net: tn40xx: add mdio bus support FUJITA Tomonori
2024-05-01 23:05 ` [PATCH net-next v4 6/6] net: tn40xx: add PHYLIB support FUJITA Tomonori
2024-05-08 12:21   ` Andrew Lunn
2024-05-08 13:18     ` FUJITA Tomonori
2024-05-08 14:03       ` Andrew Lunn
2024-05-09  4:23       ` FUJITA Tomonori
2024-05-09 13:37         ` Andrew Lunn
2024-05-12  5:20           ` FUJITA Tomonori
