netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v9 0/3] add hisilicon hip04 ethernet driver
@ 2014-12-11 11:42 Ding Tianhong
  2014-12-11 11:42 ` [PATCH net-next v9 1/3] Documentation: add Device tree bindings for Hisilicon hip04 ethernet Ding Tianhong
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Ding Tianhong @ 2014-12-11 11:42 UTC (permalink / raw)
  To: zhangfei.gao-QSEj5FYQhm4dnm+yROfE0A,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q, linux-lFZ/pmaqli7XmaaqVzeoHQ,
	arnd-r2nGTMty4D4, f.fainelli-Re5JQEeQqe8AvxtiuMwx3w,
	sergei.shtylyov-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8,
	mark.rutland-5wv7dgnIgG8, David.Laight-ZS65k/vG3HxXrIkS9f7CXA,
	eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	xuwei5-C8/M+/jPZTeaMJb+Lgu22Q
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	netdev-u79uwXL29TY76Z2rM5mHXA, devicetree-u79uwXL29TY76Z2rM5mHXA

v9:
- There is no tx completion interrupts to free DMAd Tx packets, it means taht
  we rely on new tx packets arriving to run the destructors of completed packets,
  which open up space in their sockets's send queues. Sometimes we don't get such
  new packets causing Tx to stall, a single UDP transmitter is a good example of
  this situation, so we need a clean up workqueue to reclaims completed packets,
  the workqueue will only free the last packets which is already stay for several jiffies.
  Also fix some format cleanups.

v8:
- Use poll to reclaim xmitted buffer as workaround since no tx done interrupt 

v7:
- Remove select NET_CORE in 0002

v6:
- Suggest by Russell: Use netdev_sent_queue & netdev_completed_queue to solve latency issue 
  Also shorten the period of timer, which is used to wakeup the queue since no
  tx completed interrupt.

v5:
  no big change, fix typo

v4:
- Modify accoringly to the suggetion from Arnd, Florian, Eric, David
  Use of_parse_phandle_with_fixed_args & syscon_node_to_regmap get ppe info
  Add skb_orphan() and tx_timer for reclaim since no tx_finished interrupt
  Update timeout, and move of_phy_connect to probe to reuse open/stop

v3:
- Suggest from Arnd, use syscon & regmap_write/read to replace static void __iomem *ppebase.
  Modify hisilicon-hip04-net.txt accrordingly to suggestion from Florian and Sergei.

v2:
- Got many suggestions from Russell, Arnd, Florian, Mark and Sergei
  Remove memcpy, use dma_map/unmap_single, use dma_alloc_coherent rather than dma_pool, etc.
  Refer property in ethernet.txt, change ppe description, etc.

Zhangfei Gao (3):
  Documentation: add Device tree bindings for Hisilicon hip04 ethernet
  net: hisilicon: new hip04 MDIO driver
  net: hisilicon: new hip04 ethernet driver

 .../bindings/net/hisilicon-hip04-net.txt           |  88 +++
 drivers/net/ethernet/hisilicon/Kconfig             |   9 +
 drivers/net/ethernet/hisilicon/Makefile            |   1 +
 drivers/net/ethernet/hisilicon/hip04_eth.c         | 876 +++++++++++++++++++++
 drivers/net/ethernet/hisilicon/hip04_mdio.c        | 186 +++++
 5 files changed, 1160 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hip04-net.txt
 create mode 100644 drivers/net/ethernet/hisilicon/hip04_eth.c
 create mode 100644 drivers/net/ethernet/hisilicon/hip04_mdio.c

-- 
1.8.0


--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH net-next v9 1/3] Documentation: add Device tree bindings for Hisilicon hip04 ethernet
  2014-12-11 11:42 [PATCH net-next v9 0/3] add hisilicon hip04 ethernet driver Ding Tianhong
@ 2014-12-11 11:42 ` Ding Tianhong
  2014-12-11 11:42 ` [PATCH net-next v9 2/3] net: hisilicon: new hip04 MDIO driver Ding Tianhong
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Ding Tianhong @ 2014-12-11 11:42 UTC (permalink / raw)
  To: zhangfei.gao, davem, linux, arnd, f.fainelli, sergei.shtylyov,
	mark.rutland, David.Laight, eric.dumazet, xuwei5
  Cc: linux-arm-kernel, netdev, devicetree

From: Zhangfei Gao <zhangfei.gao@linaro.org>

This patch adds the Device Tree bindings for the Hisilicon hip04
Ethernet controller, including 100M / 1000M controller.

Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 .../bindings/net/hisilicon-hip04-net.txt           | 88 ++++++++++++++++++++++
 1 file changed, 88 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hip04-net.txt

diff --git a/Documentation/devicetree/bindings/net/hisilicon-hip04-net.txt b/Documentation/devicetree/bindings/net/hisilicon-hip04-net.txt
new file mode 100644
index 0000000..988fc69
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/hisilicon-hip04-net.txt
@@ -0,0 +1,88 @@
+Hisilicon hip04 Ethernet Controller
+
+* Ethernet controller node
+
+Required properties:
+- compatible: should be "hisilicon,hip04-mac".
+- reg: address and length of the register set for the device.
+- interrupts: interrupt for the device.
+- port-handle: <phandle port channel>
+	phandle, specifies a reference to the syscon ppe node
+	port, port number connected to the controller
+	channel, recv channel start from channel * number (RX_DESC_NUM)
+- phy-mode: see ethernet.txt [1].
+
+Optional properties:
+- phy-handle: see ethernet.txt [1].
+
+[1] Documentation/devicetree/bindings/net/ethernet.txt
+
+
+* Ethernet ppe node:
+Control rx & tx fifos of all ethernet controllers.
+Have 2048 recv channels shared by all ethernet controllers, only if no overlap.
+Each controller's recv channel start from channel * number (RX_DESC_NUM).
+
+Required properties:
+- compatible: "hisilicon,hip04-ppe", "syscon".
+- reg: address and length of the register set for the device.
+
+
+* MDIO bus node:
+
+Required properties:
+
+- compatible: should be "hisilicon,hip04-mdio".
+- Inherits from MDIO bus node binding [2]
+[2] Documentation/devicetree/bindings/net/phy.txt
+
+Example:
+	mdio {
+		compatible = "hisilicon,hip04-mdio";
+		reg = <0x28f1000 0x1000>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		phy0: ethernet-phy@0 {
+			compatible = "ethernet-phy-ieee802.3-c22";
+			reg = <0>;
+			marvell,reg-init = <18 0x14 0 0x8001>;
+		};
+
+		phy1: ethernet-phy@1 {
+			compatible = "ethernet-phy-ieee802.3-c22";
+			reg = <1>;
+			marvell,reg-init = <18 0x14 0 0x8001>;
+		};
+	};
+
+	ppe: ppe@28c0000 {
+		compatible = "hisilicon,hip04-ppe", "syscon";
+		reg = <0x28c0000 0x10000>;
+	};
+
+	fe: ethernet@28b0000 {
+		compatible = "hisilicon,hip04-mac";
+		reg = <0x28b0000 0x10000>;
+		interrupts = <0 413 4>;
+		phy-mode = "mii";
+		port-handle = <&ppe 31 0>;
+	};
+
+	ge0: ethernet@2800000 {
+		compatible = "hisilicon,hip04-mac";
+		reg = <0x2800000 0x10000>;
+		interrupts = <0 402 4>;
+		phy-mode = "sgmii";
+		port-handle = <&ppe 0 1>;
+		phy-handle = <&phy0>;
+	};
+
+	ge8: ethernet@2880000 {
+		compatible = "hisilicon,hip04-mac";
+		reg = <0x2880000 0x10000>;
+		interrupts = <0 410 4>;
+		phy-mode = "sgmii";
+		port-handle = <&ppe 8 2>;
+		phy-handle = <&phy1>;
+	};
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH net-next v9 2/3] net: hisilicon: new hip04 MDIO driver
  2014-12-11 11:42 [PATCH net-next v9 0/3] add hisilicon hip04 ethernet driver Ding Tianhong
  2014-12-11 11:42 ` [PATCH net-next v9 1/3] Documentation: add Device tree bindings for Hisilicon hip04 ethernet Ding Tianhong
@ 2014-12-11 11:42 ` Ding Tianhong
  2014-12-11 11:42 ` [PATCH net-next v9 3/3] net: hisilicon: new hip04 ethernet driver Ding Tianhong
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Ding Tianhong @ 2014-12-11 11:42 UTC (permalink / raw)
  To: zhangfei.gao, davem, linux, arnd, f.fainelli, sergei.shtylyov,
	mark.rutland, David.Laight, eric.dumazet, xuwei5
  Cc: linux-arm-kernel, netdev, devicetree

From: Zhangfei Gao <zhangfei.gao@linaro.org>

Hisilicon hip04 platform mdio driver
Reuse Marvell phy drivers/net/phy/marvell.c

Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/Kconfig      |   9 ++
 drivers/net/ethernet/hisilicon/Makefile     |   1 +
 drivers/net/ethernet/hisilicon/hip04_mdio.c | 186 ++++++++++++++++++++++++++++
 3 files changed, 196 insertions(+)
 create mode 100644 drivers/net/ethernet/hisilicon/hip04_mdio.c

diff --git a/drivers/net/ethernet/hisilicon/Kconfig b/drivers/net/ethernet/hisilicon/Kconfig
index e942173..a54d897 100644
--- a/drivers/net/ethernet/hisilicon/Kconfig
+++ b/drivers/net/ethernet/hisilicon/Kconfig
@@ -24,4 +24,13 @@ config HIX5HD2_GMAC
 	help
 	  This selects the hix5hd2 mac family network device.
 
+config HIP04_ETH
+	tristate "HISILICON P04 Ethernet support"
+	select PHYLIB
+	select MARVELL_PHY
+	select MFD_SYSCON
+	---help---
+	  If you wish to compile a kernel for a hardware with hisilicon p04 SoC and
+	  want to use the internal ethernet then you should answer Y to this.
+
 endif # NET_VENDOR_HISILICON
diff --git a/drivers/net/ethernet/hisilicon/Makefile b/drivers/net/ethernet/hisilicon/Makefile
index 9175e846..40115a7 100644
--- a/drivers/net/ethernet/hisilicon/Makefile
+++ b/drivers/net/ethernet/hisilicon/Makefile
@@ -3,3 +3,4 @@
 #
 
 obj-$(CONFIG_HIX5HD2_GMAC) += hix5hd2_gmac.o
+obj-$(CONFIG_HIP04_ETH) += hip04_mdio.o
diff --git a/drivers/net/ethernet/hisilicon/hip04_mdio.c b/drivers/net/ethernet/hisilicon/hip04_mdio.c
new file mode 100644
index 0000000..b3bac25
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hip04_mdio.c
@@ -0,0 +1,186 @@
+/* Copyright (c) 2014 Linaro Ltd.
+ * Copyright (c) 2014 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/io.h>
+#include <linux/of_mdio.h>
+#include <linux/delay.h>
+
+#define MDIO_CMD_REG		0x0
+#define MDIO_ADDR_REG		0x4
+#define MDIO_WDATA_REG		0x8
+#define MDIO_RDATA_REG		0xc
+#define MDIO_STA_REG		0x10
+
+#define MDIO_START		BIT(14)
+#define MDIO_R_VALID		BIT(1)
+#define MDIO_READ	        (BIT(12) | BIT(11) | MDIO_START)
+#define MDIO_WRITE	        (BIT(12) | BIT(10) | MDIO_START)
+
+struct hip04_mdio_priv {
+	void __iomem *base;
+};
+
+#define WAIT_TIMEOUT 10
+static int hip04_mdio_wait_ready(struct mii_bus *bus)
+{
+	struct hip04_mdio_priv *priv = bus->priv;
+	int i;
+
+	for (i = 0; readl_relaxed(priv->base + MDIO_CMD_REG) & MDIO_START; i++) {
+		if (i == WAIT_TIMEOUT)
+			return -ETIMEDOUT;
+		msleep(20);
+	}
+
+	return 0;
+}
+
+static int hip04_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
+{
+	struct hip04_mdio_priv *priv = bus->priv;
+	u32 val;
+	int ret;
+
+	ret = hip04_mdio_wait_ready(bus);
+	if (ret < 0)
+		goto out;
+
+	val = regnum | (mii_id << 5) | MDIO_READ;
+	writel_relaxed(val, priv->base + MDIO_CMD_REG);
+
+	ret = hip04_mdio_wait_ready(bus);
+	if (ret < 0)
+		goto out;
+
+	val = readl_relaxed(priv->base + MDIO_STA_REG);
+	if (val & MDIO_R_VALID) {
+		dev_err(bus->parent, "SMI bus read not valid\n");
+		ret = -ENODEV;
+		goto out;
+	}
+
+	val = readl_relaxed(priv->base + MDIO_RDATA_REG);
+	ret = val & 0xFFFF;
+out:
+	return ret;
+}
+
+static int hip04_mdio_write(struct mii_bus *bus, int mii_id,
+			    int regnum, u16 value)
+{
+	struct hip04_mdio_priv *priv = bus->priv;
+	u32 val;
+	int ret;
+
+	ret = hip04_mdio_wait_ready(bus);
+	if (ret < 0)
+		goto out;
+
+	writel_relaxed(value, priv->base + MDIO_WDATA_REG);
+	val = regnum | (mii_id << 5) | MDIO_WRITE;
+	writel_relaxed(val, priv->base + MDIO_CMD_REG);
+out:
+	return ret;
+}
+
+static int hip04_mdio_reset(struct mii_bus *bus)
+{
+	int temp, i;
+
+	for (i = 0; i < PHY_MAX_ADDR; i++) {
+		hip04_mdio_write(bus, i, 22, 0);
+		temp = hip04_mdio_read(bus, i, MII_BMCR);
+		if (temp < 0)
+			continue;
+
+		temp |= BMCR_RESET;
+		if (hip04_mdio_write(bus, i, MII_BMCR, temp) < 0)
+			continue;
+	}
+
+	mdelay(500);
+	return 0;
+}
+
+static int hip04_mdio_probe(struct platform_device *pdev)
+{
+	struct resource *r;
+	struct mii_bus *bus;
+	struct hip04_mdio_priv *priv;
+	int ret;
+
+	bus = mdiobus_alloc_size(sizeof(struct hip04_mdio_priv));
+	if (!bus) {
+		dev_err(&pdev->dev, "Cannot allocate MDIO bus\n");
+		return -ENOMEM;
+	}
+
+	bus->name = "hip04_mdio_bus";
+	bus->read = hip04_mdio_read;
+	bus->write = hip04_mdio_write;
+	bus->reset = hip04_mdio_reset;
+	snprintf(bus->id, MII_BUS_ID_SIZE, "%s-mii", dev_name(&pdev->dev));
+	bus->parent = &pdev->dev;
+	priv = bus->priv;
+
+	r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	priv->base = devm_ioremap_resource(&pdev->dev, r);
+	if (IS_ERR(priv->base)) {
+		ret = PTR_ERR(priv->base);
+		goto out_mdio;
+	}
+
+	ret = of_mdiobus_register(bus, pdev->dev.of_node);
+	if (ret < 0) {
+		dev_err(&pdev->dev, "Cannot register MDIO bus (%d)\n", ret);
+		goto out_mdio;
+	}
+
+	platform_set_drvdata(pdev, bus);
+
+	return 0;
+
+out_mdio:
+	mdiobus_free(bus);
+	return ret;
+}
+
+static int hip04_mdio_remove(struct platform_device *pdev)
+{
+	struct mii_bus *bus = platform_get_drvdata(pdev);
+
+	mdiobus_unregister(bus);
+	mdiobus_free(bus);
+
+	return 0;
+}
+
+static const struct of_device_id hip04_mdio_match[] = {
+	{ .compatible = "hisilicon,hip04-mdio" },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, hip04_mdio_match);
+
+static struct platform_driver hip04_mdio_driver = {
+	.probe = hip04_mdio_probe,
+	.remove = hip04_mdio_remove,
+	.driver = {
+		.name = "hip04-mdio",
+		.owner = THIS_MODULE,
+		.of_match_table = hip04_mdio_match,
+	},
+};
+
+module_platform_driver(hip04_mdio_driver);
+
+MODULE_DESCRIPTION("HISILICON P04 MDIO interface driver");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS("platform:hip04-mdio");
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH net-next v9 3/3] net: hisilicon: new hip04 ethernet driver
  2014-12-11 11:42 [PATCH net-next v9 0/3] add hisilicon hip04 ethernet driver Ding Tianhong
  2014-12-11 11:42 ` [PATCH net-next v9 1/3] Documentation: add Device tree bindings for Hisilicon hip04 ethernet Ding Tianhong
  2014-12-11 11:42 ` [PATCH net-next v9 2/3] net: hisilicon: new hip04 MDIO driver Ding Tianhong
@ 2014-12-11 11:42 ` Ding Tianhong
       [not found] ` <1418298150-4944-1-git-send-email-dingtianhong-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
  2014-12-15 13:29 ` Arnd Bergmann
  4 siblings, 0 replies; 10+ messages in thread
From: Ding Tianhong @ 2014-12-11 11:42 UTC (permalink / raw)
  To: zhangfei.gao, davem, linux, arnd, f.fainelli, sergei.shtylyov,
	mark.rutland, David.Laight, eric.dumazet, xuwei5
  Cc: linux-arm-kernel, netdev, devicetree

From: Zhangfei Gao <zhangfei.gao@linaro.org>

Support Hisilicon hip04 ethernet driver, including 100M / 1000M controller.
The controller has no tx done interrupt, reclaim xmitted buffer in the poll.

Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/Makefile    |   2 +-
 drivers/net/ethernet/hisilicon/hip04_eth.c | 876 +++++++++++++++++++++++++++++
 2 files changed, 877 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/hisilicon/hip04_eth.c

diff --git a/drivers/net/ethernet/hisilicon/Makefile b/drivers/net/ethernet/hisilicon/Makefile
index 40115a7..6c14540 100644
--- a/drivers/net/ethernet/hisilicon/Makefile
+++ b/drivers/net/ethernet/hisilicon/Makefile
@@ -3,4 +3,4 @@
 #
 
 obj-$(CONFIG_HIX5HD2_GMAC) += hix5hd2_gmac.o
-obj-$(CONFIG_HIP04_ETH) += hip04_mdio.o
+obj-$(CONFIG_HIP04_ETH) += hip04_mdio.o hip04_eth.o
diff --git a/drivers/net/ethernet/hisilicon/hip04_eth.c b/drivers/net/ethernet/hisilicon/hip04_eth.c
new file mode 100644
index 0000000..9d37b67
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hip04_eth.c
@@ -0,0 +1,876 @@
+
+/* Copyright (c) 2014 Linaro Ltd.
+ * Copyright (c) 2014 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/module.h>
+#include <linux/etherdevice.h>
+#include <linux/platform_device.h>
+#include <linux/interrupt.h>
+#include <linux/of_address.h>
+#include <linux/phy.h>
+#include <linux/of_mdio.h>
+#include <linux/of_net.h>
+#include <linux/mfd/syscon.h>
+#include <linux/regmap.h>
+
+#define PPE_CFG_RX_ADDR			0x100
+#define PPE_CFG_POOL_GRP		0x300
+#define PPE_CFG_RX_BUF_SIZE		0x400
+#define PPE_CFG_RX_FIFO_SIZE		0x500
+#define PPE_CURR_BUF_CNT		0xa200
+
+#define GE_DUPLEX_TYPE			0x08
+#define GE_MAX_FRM_SIZE_REG		0x3c
+#define GE_PORT_MODE			0x40
+#define GE_PORT_EN			0x44
+#define GE_SHORT_RUNTS_THR_REG		0x50
+#define GE_TX_LOCAL_PAGE_REG		0x5c
+#define GE_TRANSMIT_CONTROL_REG		0x60
+#define GE_CF_CRC_STRIP_REG		0x1b0
+#define GE_MODE_CHANGE_REG		0x1b4
+#define GE_RECV_CONTROL_REG		0x1e0
+#define GE_STATION_MAC_ADDRESS		0x210
+#define PPE_CFG_CPU_ADD_ADDR		0x580
+#define PPE_CFG_MAX_FRAME_LEN_REG	0x408
+#define PPE_CFG_BUS_CTRL_REG		0x424
+#define PPE_CFG_RX_CTRL_REG		0x428
+#define PPE_CFG_RX_PKT_MODE_REG		0x438
+#define PPE_CFG_QOS_VMID_GEN		0x500
+#define PPE_CFG_RX_PKT_INT		0x538
+#define PPE_INTEN			0x600
+#define PPE_INTSTS			0x608
+#define PPE_RINT			0x604
+#define PPE_CFG_STS_MODE		0x700
+#define PPE_HIS_RX_PKT_CNT		0x804
+
+/* REG_INTERRUPT */
+#define RCV_INT				BIT(10)
+#define RCV_NOBUF			BIT(8)
+#define RCV_DROP			BIT(7)
+#define TX_DROP				BIT(6)
+#define DEF_INT_ERR			(RCV_NOBUF | RCV_DROP | TX_DROP)
+#define DEF_INT_MASK			(RCV_INT | DEF_INT_ERR)
+
+/* TX descriptor config */
+#define TX_FREE_MEM			BIT(0)
+#define TX_READ_ALLOC_L3		BIT(1)
+#define TX_FINISH_CACHE_INV		BIT(2)
+#define TX_CLEAR_WB			BIT(4)
+#define TX_L3_CHECKSUM			BIT(5)
+#define TX_LOOP_BACK			BIT(11)
+
+/* RX error */
+#define RX_PKT_DROP			BIT(0)
+#define RX_L2_ERR			BIT(1)
+#define RX_PKT_ERR			(RX_PKT_DROP | RX_L2_ERR)
+
+#define SGMII_SPEED_1000		0x08
+#define SGMII_SPEED_100			0x07
+#define SGMII_SPEED_10			0x06
+#define MII_SPEED_100			0x01
+#define MII_SPEED_10			0x00
+
+#define GE_DUPLEX_FULL			BIT(0)
+#define GE_DUPLEX_HALF			0x00
+#define GE_MODE_CHANGE_EN		BIT(0)
+
+#define GE_TX_AUTO_NEG			BIT(5)
+#define GE_TX_ADD_CRC			BIT(6)
+#define GE_TX_SHORT_PAD_THROUGH		BIT(7)
+
+#define GE_RX_STRIP_CRC			BIT(0)
+#define GE_RX_STRIP_PAD			BIT(3)
+#define GE_RX_PAD_EN			BIT(4)
+
+#define GE_AUTO_NEG_CTL			BIT(0)
+
+#define GE_RX_INT_THRESHOLD		BIT(6)
+#define GE_RX_TIMEOUT			0x04
+
+#define GE_RX_PORT_EN			BIT(1)
+#define GE_TX_PORT_EN			BIT(2)
+
+#define PPE_CFG_STS_RX_PKT_CNT_RC	BIT(12)
+
+#define PPE_CFG_RX_PKT_ALIGN		BIT(18)
+#define PPE_CFG_QOS_VMID_MODE		BIT(14)
+#define PPE_CFG_QOS_VMID_GRP_SHIFT	8
+
+#define PPE_CFG_RX_FIFO_FSFU		BIT(11)
+#define PPE_CFG_RX_DEPTH_SHIFT		16
+#define PPE_CFG_RX_START_SHIFT		0
+#define PPE_CFG_RX_CTRL_ALIGN_SHIFT	11
+
+#define PPE_CFG_BUS_LOCAL_REL		BIT(14)
+#define PPE_CFG_BUS_BIG_ENDIEN		BIT(0)
+
+#define RX_DESC_NUM			128
+#define TX_DESC_NUM			256
+#define TX_NEXT(N)			(((N) + 1) & (TX_DESC_NUM-1))
+#define RX_NEXT(N)			(((N) + 1) & (RX_DESC_NUM-1))
+
+#define GMAC_PPE_RX_PKT_MAX_LEN		379
+#define GMAC_MAX_PKT_LEN		1516
+#define GMAC_MIN_PKT_LEN		31
+#define RX_BUF_SIZE			1600
+#define RESET_TIMEOUT			1000
+#define TX_TIMEOUT			(6 * HZ)
+
+#define DRV_NAME			"hip04-ether"
+
+struct tx_desc {
+	u32 send_addr;
+	u32 send_size;
+	u32 next_addr;
+	u32 cfg;
+	u32 wb_addr;
+} __aligned(64);
+
+struct rx_desc {
+	u16 reserved_16;
+	u16 pkt_len;
+	u32 reserve1[3];
+	u32 pkt_err;
+	u32 reserve2[4];
+};
+
+struct hip04_priv {
+	void __iomem *base;
+	int phy_mode;
+	int chan;
+	unsigned int port;
+	unsigned int speed;
+	unsigned int duplex;
+	unsigned int reg_inten;
+
+	struct napi_struct napi;
+	struct net_device *ndev;
+
+	struct tx_desc *tx_desc;
+	dma_addr_t tx_desc_dma;
+	struct sk_buff *tx_skb[TX_DESC_NUM];
+	dma_addr_t tx_phys[TX_DESC_NUM];
+	unsigned int tx_head;
+	unsigned int tx_tail;
+	int tx_count;
+	unsigned long last_tx;
+
+	unsigned char *rx_buf[RX_DESC_NUM];
+	dma_addr_t rx_phys[RX_DESC_NUM];
+	unsigned int rx_head;
+	unsigned int rx_buf_size;
+
+	struct device_node *phy_node;
+	struct phy_device *phy;
+	struct regmap *map;
+	struct work_struct tx_timeout_task;
+
+	struct workqueue_struct *wq;
+	struct delayed_work tx_clean_task;
+};
+
+static void hip04_config_port(struct net_device *ndev, u32 speed, u32 duplex)
+{
+	struct hip04_priv *priv = netdev_priv(ndev);
+	u32 val;
+
+	priv->speed = speed;
+	priv->duplex = duplex;
+
+	switch (priv->phy_mode) {
+	case PHY_INTERFACE_MODE_SGMII:
+		if (speed == SPEED_1000)
+			val = SGMII_SPEED_1000;
+		else if (speed == SPEED_100)
+			val = SGMII_SPEED_100;
+		else
+			val = SGMII_SPEED_10;
+		break;
+	case PHY_INTERFACE_MODE_MII:
+		if (speed == SPEED_100)
+			val = MII_SPEED_100;
+		else
+			val = MII_SPEED_10;
+		break;
+	default:
+		netdev_warn(ndev, "not supported mode\n");
+		val = MII_SPEED_10;
+		break;
+	}
+	writel_relaxed(val, priv->base + GE_PORT_MODE);
+
+	val = duplex ? GE_DUPLEX_FULL : GE_DUPLEX_HALF;
+	writel_relaxed(val, priv->base + GE_DUPLEX_TYPE);
+
+	val = GE_MODE_CHANGE_EN;
+	writel_relaxed(val, priv->base + GE_MODE_CHANGE_REG);
+}
+
+static void hip04_reset_ppe(struct hip04_priv *priv)
+{
+	u32 val, tmp, timeout = 0;
+
+	do {
+		regmap_read(priv->map, priv->port * 4 + PPE_CURR_BUF_CNT, &val);
+		regmap_read(priv->map, priv->port * 4 + PPE_CFG_RX_ADDR, &tmp);
+		if (timeout++ > RESET_TIMEOUT)
+			break;
+	} while (val & 0xfff);
+}
+
+static void hip04_config_fifo(struct hip04_priv *priv)
+{
+	u32 val;
+
+	val = readl_relaxed(priv->base + PPE_CFG_STS_MODE);
+	val |= PPE_CFG_STS_RX_PKT_CNT_RC;
+	writel_relaxed(val, priv->base + PPE_CFG_STS_MODE);
+
+	val = BIT(priv->port);
+	regmap_write(priv->map, priv->port * 4 + PPE_CFG_POOL_GRP, val);
+
+	val = priv->port << PPE_CFG_QOS_VMID_GRP_SHIFT;
+	val |= PPE_CFG_QOS_VMID_MODE;
+	writel_relaxed(val, priv->base + PPE_CFG_QOS_VMID_GEN);
+
+	val = RX_BUF_SIZE;
+	regmap_write(priv->map, priv->port * 4 + PPE_CFG_RX_BUF_SIZE, val);
+
+	val = RX_DESC_NUM << PPE_CFG_RX_DEPTH_SHIFT;
+	val |= PPE_CFG_RX_FIFO_FSFU;
+	val |= priv->chan << PPE_CFG_RX_START_SHIFT;
+	regmap_write(priv->map, priv->port * 4 + PPE_CFG_RX_FIFO_SIZE, val);
+
+	val = NET_IP_ALIGN << PPE_CFG_RX_CTRL_ALIGN_SHIFT;
+	writel_relaxed(val, priv->base + PPE_CFG_RX_CTRL_REG);
+
+	val = PPE_CFG_RX_PKT_ALIGN;
+	writel_relaxed(val, priv->base + PPE_CFG_RX_PKT_MODE_REG);
+
+	val = PPE_CFG_BUS_LOCAL_REL | PPE_CFG_BUS_BIG_ENDIEN;
+	writel_relaxed(val, priv->base + PPE_CFG_BUS_CTRL_REG);
+
+	val = GMAC_PPE_RX_PKT_MAX_LEN;
+	writel_relaxed(val, priv->base + PPE_CFG_MAX_FRAME_LEN_REG);
+
+	val = GMAC_MAX_PKT_LEN;
+	writel_relaxed(val, priv->base + GE_MAX_FRM_SIZE_REG);
+
+	val = GMAC_MIN_PKT_LEN;
+	writel_relaxed(val, priv->base + GE_SHORT_RUNTS_THR_REG);
+
+	val = readl_relaxed(priv->base + GE_TRANSMIT_CONTROL_REG);
+	val |= GE_TX_AUTO_NEG | GE_TX_ADD_CRC | GE_TX_SHORT_PAD_THROUGH;
+	writel_relaxed(val, priv->base + GE_TRANSMIT_CONTROL_REG);
+
+	val = GE_RX_STRIP_CRC;
+	writel_relaxed(val, priv->base + GE_CF_CRC_STRIP_REG);
+
+	val = readl_relaxed(priv->base + GE_RECV_CONTROL_REG);
+	val |= GE_RX_STRIP_PAD | GE_RX_PAD_EN;
+	writel_relaxed(val, priv->base + GE_RECV_CONTROL_REG);
+
+	val = GE_AUTO_NEG_CTL;
+	writel_relaxed(val, priv->base + GE_TX_LOCAL_PAGE_REG);
+}
+
+static void hip04_mac_enable(struct net_device *ndev)
+{
+	struct hip04_priv *priv = netdev_priv(ndev);
+	u32 val;
+
+	/* enable tx & rx */
+	val = readl_relaxed(priv->base + GE_PORT_EN);
+	val |= GE_RX_PORT_EN | GE_TX_PORT_EN;
+	writel_relaxed(val, priv->base + GE_PORT_EN);
+
+	/* clear rx int */
+	val = RCV_INT;
+	writel_relaxed(val, priv->base + PPE_RINT);
+
+	/* config recv int */
+	val = GE_RX_INT_THRESHOLD | GE_RX_TIMEOUT;
+	writel_relaxed(val, priv->base + PPE_CFG_RX_PKT_INT);
+
+	/* enable interrupt */
+	priv->reg_inten = DEF_INT_MASK;
+	writel_relaxed(priv->reg_inten, priv->base + PPE_INTEN);
+}
+
+static void hip04_mac_disable(struct net_device *ndev)
+{
+	struct hip04_priv *priv = netdev_priv(ndev);
+	u32 val;
+
+	/* disable int */
+	priv->reg_inten &= ~(DEF_INT_MASK);
+	writel_relaxed(priv->reg_inten, priv->base + PPE_INTEN);
+
+	/* disable tx & rx */
+	val = readl_relaxed(priv->base + GE_PORT_EN);
+	val &= ~(GE_RX_PORT_EN | GE_TX_PORT_EN);
+	writel_relaxed(val, priv->base + GE_PORT_EN);
+}
+
+static void hip04_set_xmit_desc(struct hip04_priv *priv, dma_addr_t phys)
+{
+	writel(phys, priv->base + PPE_CFG_CPU_ADD_ADDR);
+}
+
+static void hip04_set_recv_desc(struct hip04_priv *priv, dma_addr_t phys)
+{
+	regmap_write(priv->map, priv->port * 4 + PPE_CFG_RX_ADDR, phys);
+}
+
+static u32 hip04_recv_cnt(struct hip04_priv *priv)
+{
+	return readl(priv->base + PPE_HIS_RX_PKT_CNT);
+}
+
+static void hip04_update_mac_address(struct net_device *ndev)
+{
+	struct hip04_priv *priv = netdev_priv(ndev);
+
+	writel_relaxed(((ndev->dev_addr[0] << 8) | (ndev->dev_addr[1])),
+		       priv->base + GE_STATION_MAC_ADDRESS);
+	writel_relaxed(((ndev->dev_addr[2] << 24) | (ndev->dev_addr[3] << 16) |
+			(ndev->dev_addr[4] << 8) | (ndev->dev_addr[5])),
+		       priv->base + GE_STATION_MAC_ADDRESS + 4);
+}
+
+static int hip04_set_mac_address(struct net_device *ndev, void *addr)
+{
+	eth_mac_addr(ndev, addr);
+	hip04_update_mac_address(ndev);
+	return 0;
+}
+
+static void hip04_tx_reclaim(struct net_device *ndev, bool force)
+{
+	struct hip04_priv *priv = netdev_priv(ndev);
+	unsigned tx_tail = priv->tx_tail;
+	struct tx_desc *desc;
+	unsigned int bytes_compl = 0, pkts_compl = 0;
+
+	if (priv->tx_count == 0)
+		goto out;
+
+	while ((tx_tail != priv->tx_head) || (priv->tx_count == TX_DESC_NUM)) {
+		desc = &priv->tx_desc[priv->tx_tail];
+		if (desc->send_addr != 0) {
+			if (force)
+				desc->send_addr = 0;
+			else
+				break;
+		}
+
+		if (priv->tx_phys[tx_tail]) {
+			dma_unmap_single(&ndev->dev, priv->tx_phys[tx_tail],
+					 priv->tx_skb[tx_tail]->len,
+					 DMA_TO_DEVICE);
+			priv->tx_phys[tx_tail] = 0;
+		}
+		pkts_compl++;
+		bytes_compl += priv->tx_skb[tx_tail]->len;
+		dev_kfree_skb(priv->tx_skb[tx_tail]);
+		priv->tx_skb[tx_tail] = NULL;
+		tx_tail = TX_NEXT(tx_tail);
+		priv->tx_count--;
+
+		if (priv->tx_count <= 0)
+			break;
+	}
+
+	priv->tx_tail = tx_tail;
+
+	/* Ensure tx_tail & tx_count visible to xmit */
+	smp_mb();
+out:
+
+	if (pkts_compl || bytes_compl)
+		netdev_completed_queue(ndev, pkts_compl, bytes_compl);
+
+	if (unlikely(netif_queue_stopped(ndev)) &&
+	    (priv->tx_count < TX_DESC_NUM))
+		netif_wake_queue(ndev);
+}
+
+static void hip04_tx_clean_monitor(struct work_struct *work)
+{
+	struct hip04_priv *priv = container_of(work, struct hip04_priv,
+					       tx_clean_task.work);
+	struct net_device *ndev = priv->ndev;
+	int delta_in_ticks = msecs_to_jiffies(1000);
+
+	if (!time_in_range(jiffies, priv->last_tx,
+			   priv->last_tx + delta_in_ticks)) {
+		netif_tx_lock(ndev);
+		hip04_tx_reclaim(ndev, false);
+		netif_tx_unlock(ndev);
+	}
+	queue_delayed_work(priv->wq, &priv->tx_clean_task, delta_in_ticks);
+}
+
+static int hip04_mac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
+{
+	struct hip04_priv *priv = netdev_priv(ndev);
+	struct net_device_stats *stats = &ndev->stats;
+	unsigned int tx_head = priv->tx_head;
+	struct tx_desc *desc = &priv->tx_desc[tx_head];
+	dma_addr_t phys;
+
+	if (priv->tx_count >= TX_DESC_NUM) {
+		netif_stop_queue(ndev);
+		return NETDEV_TX_BUSY;
+	}
+
+	hip04_tx_reclaim(ndev, false);
+
+	phys = dma_map_single(&ndev->dev, skb->data, skb->len, DMA_TO_DEVICE);
+	if (dma_mapping_error(&ndev->dev, phys)) {
+		dev_kfree_skb(skb);
+		return NETDEV_TX_OK;
+	}
+
+	priv->tx_skb[tx_head] = skb;
+	priv->tx_phys[tx_head] = phys;
+	desc->send_addr = cpu_to_be32(phys);
+	desc->send_size = cpu_to_be32(skb->len);
+	desc->cfg = cpu_to_be32(TX_CLEAR_WB | TX_FINISH_CACHE_INV);
+	phys = priv->tx_desc_dma + tx_head * sizeof(struct tx_desc);
+	desc->wb_addr = cpu_to_be32(phys);
+	skb_tx_timestamp(skb);
+
+	/* Don't wait up for transmitted skbs to be freed. */
+	skb_orphan(skb);
+
+	hip04_set_xmit_desc(priv, phys);
+	priv->tx_head = TX_NEXT(tx_head);
+	netdev_sent_queue(ndev, skb->len);
+
+	stats->tx_bytes += skb->len;
+	stats->tx_packets++;
+	priv->tx_count++;
+	priv->last_tx = jiffies;
+
+	/* Ensure tx_head & tx_count update visible to tx reclaim */
+	smp_mb();
+
+	return NETDEV_TX_OK;
+}
+
+static int hip04_rx_poll(struct napi_struct *napi, int budget)
+{
+	struct hip04_priv *priv = container_of(napi, struct hip04_priv, napi);
+	struct net_device *ndev = priv->ndev;
+	struct net_device_stats *stats = &ndev->stats;
+	unsigned int cnt = hip04_recv_cnt(priv);
+	struct rx_desc *desc;
+	struct sk_buff *skb;
+	unsigned char *buf;
+	bool last = false;
+	dma_addr_t phys;
+	int rx = 0;
+	u16 len;
+	u32 err;
+
+	while (cnt && !last) {
+		buf = priv->rx_buf[priv->rx_head];
+		skb = build_skb(buf, priv->rx_buf_size);
+		if (unlikely(!skb))
+			net_dbg_ratelimited("build_skb failed\n");
+
+		dma_unmap_single(&ndev->dev, priv->rx_phys[priv->rx_head],
+				 RX_BUF_SIZE, DMA_FROM_DEVICE);
+		priv->rx_phys[priv->rx_head] = 0;
+
+		desc = (struct rx_desc *)skb->data;
+		len = be16_to_cpu(desc->pkt_len);
+		err = be32_to_cpu(desc->pkt_err);
+
+		if (0 == len) {
+			dev_kfree_skb_any(skb);
+			last = true;
+		} else if ((err & RX_PKT_ERR) || (len >= GMAC_MAX_PKT_LEN)) {
+			dev_kfree_skb_any(skb);
+			stats->rx_dropped++;
+			stats->rx_errors++;
+		} else {
+			skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN);
+			skb_put(skb, len);
+			skb->protocol = eth_type_trans(skb, ndev);
+			napi_gro_receive(&priv->napi, skb);
+			stats->rx_packets++;
+			stats->rx_bytes += len;
+			rx++;
+		}
+
+		buf = netdev_alloc_frag(priv->rx_buf_size);
+		if (!buf)
+			return -ENOMEM;
+		phys = dma_map_single(&ndev->dev, buf,
+				      RX_BUF_SIZE, DMA_FROM_DEVICE);
+		if (dma_mapping_error(&ndev->dev, phys))
+			return -EIO;
+		priv->rx_buf[priv->rx_head] = buf;
+		priv->rx_phys[priv->rx_head] = phys;
+		hip04_set_recv_desc(priv, phys);
+
+		priv->rx_head = RX_NEXT(priv->rx_head);
+		if (rx >= budget)
+			goto done;
+
+		if (--cnt == 0)
+			cnt = hip04_recv_cnt(priv);
+	}
+
+	if (!(priv->reg_inten & RCV_INT)) {
+		/* enable rx interrupt */
+		priv->reg_inten |= RCV_INT;
+		writel_relaxed(priv->reg_inten, priv->base + PPE_INTEN);
+	}
+	napi_complete(napi);
+done:
+	return rx;
+}
+
+static irqreturn_t hip04_mac_interrupt(int irq, void *dev_id)
+{
+	struct net_device *ndev = (struct net_device *)dev_id;
+	struct hip04_priv *priv = netdev_priv(ndev);
+	struct net_device_stats *stats = &ndev->stats;
+	u32 ists = readl_relaxed(priv->base + PPE_INTSTS);
+
+	writel_relaxed(DEF_INT_MASK, priv->base + PPE_RINT);
+
+	if (unlikely(ists & DEF_INT_ERR)) {
+		if (ists & (RCV_NOBUF | RCV_DROP))
+			stats->rx_errors++;
+			stats->rx_dropped++;
+			netdev_err(ndev, "rx drop\n");
+		if (ists & TX_DROP) {
+			stats->tx_dropped++;
+			netdev_err(ndev, "tx drop\n");
+		}
+	}
+
+	if (ists & RCV_INT) {
+		/* disable rx interrupt */
+		priv->reg_inten &= ~(RCV_INT);
+		writel_relaxed(priv->reg_inten, priv->base + PPE_INTEN);
+		napi_schedule(&priv->napi);
+	}
+
+	return IRQ_HANDLED;
+}
+
+static void hip04_adjust_link(struct net_device *ndev)
+{
+	struct hip04_priv *priv = netdev_priv(ndev);
+	struct phy_device *phy = priv->phy;
+
+	if ((priv->speed != phy->speed) || (priv->duplex != phy->duplex)) {
+		hip04_config_port(ndev, phy->speed, phy->duplex);
+		phy_print_status(phy);
+	}
+}
+
+static int hip04_mac_open(struct net_device *ndev)
+{
+	struct hip04_priv *priv = netdev_priv(ndev);
+	int i;
+
+	priv->rx_head = 0;
+	priv->tx_head = 0;
+	priv->tx_tail = 0;
+	priv->tx_count = 0;
+	hip04_reset_ppe(priv);
+
+	for (i = 0; i < RX_DESC_NUM; i++) {
+		dma_addr_t phys;
+
+		phys = dma_map_single(&ndev->dev, priv->rx_buf[i],
+				      RX_BUF_SIZE, DMA_FROM_DEVICE);
+		if (dma_mapping_error(&ndev->dev, phys))
+			return -EIO;
+
+		priv->rx_phys[i] = phys;
+		hip04_set_recv_desc(priv, phys);
+	}
+
+	if (priv->phy)
+		phy_start(priv->phy);
+
+	netdev_reset_queue(ndev);
+	netif_start_queue(ndev);
+	hip04_mac_enable(ndev);
+	napi_enable(&priv->napi);
+
+	INIT_DELAYED_WORK(&priv->tx_clean_task, hip04_tx_clean_monitor);
+	queue_delayed_work(priv->wq, &priv->tx_clean_task, 0);
+
+	return 0;
+}
+
+static int hip04_mac_stop(struct net_device *ndev)
+{
+	struct hip04_priv *priv = netdev_priv(ndev);
+	int i;
+
+	cancel_delayed_work_sync(&priv->tx_clean_task);
+
+	napi_disable(&priv->napi);
+	netif_stop_queue(ndev);
+	hip04_mac_disable(ndev);
+	hip04_tx_reclaim(ndev, true);
+	hip04_reset_ppe(priv);
+
+	if (priv->phy)
+		phy_stop(priv->phy);
+
+	for (i = 0; i < RX_DESC_NUM; i++) {
+		if (priv->rx_phys[i]) {
+			dma_unmap_single(&ndev->dev, priv->rx_phys[i],
+					 RX_BUF_SIZE, DMA_FROM_DEVICE);
+			priv->rx_phys[i] = 0;
+		}
+	}
+
+	return 0;
+}
+
+static void hip04_timeout(struct net_device *ndev)
+{
+	struct hip04_priv *priv = netdev_priv(ndev);
+
+	schedule_work(&priv->tx_timeout_task);
+}
+
+static void hip04_tx_timeout_task(struct work_struct *work)
+{
+	struct hip04_priv *priv;
+
+	priv = container_of(work, struct hip04_priv, tx_timeout_task);
+	hip04_mac_stop(priv->ndev);
+	hip04_mac_open(priv->ndev);
+}
+
+static struct net_device_stats *hip04_get_stats(struct net_device *ndev)
+{
+	return &ndev->stats;
+}
+
+static struct net_device_ops hip04_netdev_ops = {
+	.ndo_open		= hip04_mac_open,
+	.ndo_stop		= hip04_mac_stop,
+	.ndo_get_stats		= hip04_get_stats,
+	.ndo_start_xmit		= hip04_mac_start_xmit,
+	.ndo_set_mac_address	= hip04_set_mac_address,
+	.ndo_tx_timeout         = hip04_timeout,
+	.ndo_validate_addr	= eth_validate_addr,
+	.ndo_change_mtu		= eth_change_mtu,
+};
+
+static int hip04_alloc_ring(struct net_device *ndev, struct device *d)
+{
+	struct hip04_priv *priv = netdev_priv(ndev);
+	int i;
+
+	priv->tx_desc = dma_alloc_coherent(d,
+			TX_DESC_NUM * sizeof(struct tx_desc),
+			&priv->tx_desc_dma, GFP_KERNEL);
+	if (!priv->tx_desc)
+		return -ENOMEM;
+
+	priv->rx_buf_size = RX_BUF_SIZE +
+			    SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+	for (i = 0; i < RX_DESC_NUM; i++) {
+		priv->rx_buf[i] = netdev_alloc_frag(priv->rx_buf_size);
+		if (!priv->rx_buf[i])
+			return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static void hip04_free_ring(struct net_device *ndev, struct device *d)
+{
+	struct hip04_priv *priv = netdev_priv(ndev);
+	int i;
+
+	for (i = 0; i < RX_DESC_NUM; i++)
+		if (priv->rx_buf[i])
+			put_page(virt_to_head_page(priv->rx_buf[i]));
+
+	for (i = 0; i < TX_DESC_NUM; i++)
+		if (priv->tx_skb[i])
+			dev_kfree_skb_any(priv->tx_skb[i]);
+
+	dma_free_coherent(d, TX_DESC_NUM * sizeof(struct tx_desc),
+			  priv->tx_desc, priv->tx_desc_dma);
+}
+
+static int hip04_mac_probe(struct platform_device *pdev)
+{
+	struct device *d = &pdev->dev;
+	struct device_node *node = d->of_node;
+	struct of_phandle_args arg;
+	struct net_device *ndev;
+	struct hip04_priv *priv;
+	struct resource *res;
+	unsigned int irq;
+	int ret;
+
+	ndev = alloc_etherdev(sizeof(struct hip04_priv));
+	if (!ndev)
+		return -ENOMEM;
+
+	priv = netdev_priv(ndev);
+	priv->ndev = ndev;
+	platform_set_drvdata(pdev, ndev);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	priv->base = devm_ioremap_resource(d, res);
+	if (IS_ERR(priv->base)) {
+		ret = PTR_ERR(priv->base);
+		goto init_fail;
+	}
+
+	ret = of_parse_phandle_with_fixed_args(node, "port-handle", 2, 0, &arg);
+	if (ret < 0) {
+		dev_warn(d, "no port-handle\n");
+		goto init_fail;
+	}
+
+	priv->port = arg.args[0];
+	priv->chan = arg.args[1] * RX_DESC_NUM;
+
+	priv->map = syscon_node_to_regmap(arg.np);
+	if (IS_ERR(priv->map)) {
+		dev_warn(d, "no syscon hisilicon,hip04-ppe\n");
+		ret = PTR_ERR(priv->map);
+		goto init_fail;
+	}
+
+	priv->phy_mode = of_get_phy_mode(node);
+	if (priv->phy_mode < 0) {
+		dev_warn(d, "not find phy-mode\n");
+		ret = -EINVAL;
+		goto init_fail;
+	}
+
+	irq = platform_get_irq(pdev, 0);
+	if (irq <= 0) {
+		ret = -EINVAL;
+		goto init_fail;
+	}
+
+	ret = devm_request_irq(d, irq, hip04_mac_interrupt,
+			       0, pdev->name, ndev);
+	if (ret) {
+		netdev_err(ndev, "devm_request_irq failed\n");
+		goto init_fail;
+	}
+
+	priv->phy_node = of_parse_phandle(node, "phy-handle", 0);
+	if (priv->phy_node) {
+		priv->phy = of_phy_connect(ndev, priv->phy_node,
+			&hip04_adjust_link, 0, priv->phy_mode);
+		if (!priv->phy) {
+			ret = -EPROBE_DEFER;
+			goto init_fail;
+		}
+	}
+
+	priv->wq = create_singlethread_workqueue(ndev->name);
+	if (!priv->wq) {
+		ret = -ENOMEM;
+		goto init_fail;
+	}
+
+	INIT_WORK(&priv->tx_timeout_task, hip04_tx_timeout_task);
+
+	ether_setup(ndev);
+	ndev->netdev_ops = &hip04_netdev_ops;
+	ndev->watchdog_timeo = TX_TIMEOUT;
+	ndev->priv_flags |= IFF_UNICAST_FLT;
+	ndev->irq = irq;
+	netif_napi_add(ndev, &priv->napi, hip04_rx_poll, NAPI_POLL_WEIGHT);
+	SET_NETDEV_DEV(ndev, &pdev->dev);
+
+	hip04_reset_ppe(priv);
+	if (priv->phy_mode == PHY_INTERFACE_MODE_MII)
+		hip04_config_port(ndev, SPEED_100, DUPLEX_FULL);
+
+	hip04_config_fifo(priv);
+	random_ether_addr(ndev->dev_addr);
+	hip04_update_mac_address(ndev);
+
+	ret = hip04_alloc_ring(ndev, d);
+	if (ret) {
+		netdev_err(ndev, "alloc ring fail\n");
+		goto alloc_fail;
+	}
+
+	ret = register_netdev(ndev);
+	if (ret) {
+		free_netdev(ndev);
+		goto alloc_fail;
+	}
+
+	return 0;
+
+alloc_fail:
+	hip04_free_ring(ndev, d);
+init_fail:
+	of_node_put(priv->phy_node);
+	free_netdev(ndev);
+	return ret;
+}
+
+static int hip04_remove(struct platform_device *pdev)
+{
+	struct net_device *ndev = platform_get_drvdata(pdev);
+	struct hip04_priv *priv = netdev_priv(ndev);
+	struct device *d = &pdev->dev;
+
+	if (priv->phy)
+		phy_disconnect(priv->phy);
+
+	hip04_free_ring(ndev, d);
+	unregister_netdev(ndev);
+	free_irq(ndev->irq, ndev);
+	of_node_put(priv->phy_node);
+	cancel_work_sync(&priv->tx_timeout_task);
+	if (priv->wq)
+		destroy_workqueue(priv->wq);
+	free_netdev(ndev);
+
+	return 0;
+}
+
+static const struct of_device_id hip04_mac_match[] = {
+	{ .compatible = "hisilicon,hip04-mac" },
+	{ }
+};
+
+static struct platform_driver hip04_mac_driver = {
+	.probe	= hip04_mac_probe,
+	.remove	= hip04_remove,
+	.driver	= {
+		.name		= DRV_NAME,
+		.owner		= THIS_MODULE,
+		.of_match_table	= hip04_mac_match,
+	},
+};
+module_platform_driver(hip04_mac_driver);
+
+MODULE_DESCRIPTION("HISILICON P04 Ethernet driver");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS("platform:hip04-ether");
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next v9 0/3] add hisilicon hip04 ethernet driver
       [not found] ` <1418298150-4944-1-git-send-email-dingtianhong-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2014-12-11 17:04   ` David Miller
  2014-12-12  1:46     ` Ding Tianhong
  0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2014-12-11 17:04 UTC (permalink / raw)
  To: dingtianhong-hv44wF8Li93QT0dZR+AlfA
  Cc: zhangfei.gao-QSEj5FYQhm4dnm+yROfE0A,
	linux-lFZ/pmaqli7XmaaqVzeoHQ, arnd-r2nGTMty4D4,
	f.fainelli-Re5JQEeQqe8AvxtiuMwx3w,
	sergei.shtylyov-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8,
	mark.rutland-5wv7dgnIgG8, David.Laight-ZS65k/vG3HxXrIkS9f7CXA,
	eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	xuwei5-C8/M+/jPZTeaMJb+Lgu22Q,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	netdev-u79uwXL29TY76Z2rM5mHXA, devicetree-u79uwXL29TY76Z2rM5mHXA


The net-next tree is closed, therefore it is not appropriate to
submit net-next changes at this time.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next v9 0/3] add hisilicon hip04 ethernet driver
  2014-12-11 17:04   ` [PATCH net-next v9 0/3] add hisilicon " David Miller
@ 2014-12-12  1:46     ` Ding Tianhong
  0 siblings, 0 replies; 10+ messages in thread
From: Ding Tianhong @ 2014-12-12  1:46 UTC (permalink / raw)
  To: David Miller
  Cc: zhangfei.gao, linux, arnd, f.fainelli, sergei.shtylyov,
	mark.rutland, David.Laight, eric.dumazet, xuwei5,
	linux-arm-kernel, netdev, devicetree

On 2014/12/12 1:04, David Miller wrote:
> 
> The net-next tree is closed, therefore it is not appropriate to
> submit net-next changes at this time.
> 
> .
> 
Ok, sure, thanks.

Ding

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next v9 0/3] add hisilicon hip04 ethernet driver
  2014-12-11 11:42 [PATCH net-next v9 0/3] add hisilicon hip04 ethernet driver Ding Tianhong
                   ` (3 preceding siblings ...)
       [not found] ` <1418298150-4944-1-git-send-email-dingtianhong-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2014-12-15 13:29 ` Arnd Bergmann
  2014-12-16  7:57   ` Ding Tianhong
  4 siblings, 1 reply; 10+ messages in thread
From: Arnd Bergmann @ 2014-12-15 13:29 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Ding Tianhong, zhangfei.gao, davem, linux, f.fainelli,
	sergei.shtylyov, mark.rutland, David.Laight, eric.dumazet,
	xuwei5, netdev, devicetree

On Thursday 11 December 2014 19:42:27 Ding Tianhong wrote:
> v9:
> - There is no tx completion interrupts to free DMAd Tx packets, it means taht
>   we rely on new tx packets arriving to run the destructors of completed packets,
>   which open up space in their sockets's send queues. Sometimes we don't get such
>   new packets causing Tx to stall, a single UDP transmitter is a good example of
>   this situation, so we need a clean up workqueue to reclaims completed packets,
>   the workqueue will only free the last packets which is already stay for several jiffies.
>   Also fix some format cleanups.

You must have missed the reply in which David Miller explained why
you can't call skb_orphan and rely on an occasional cleanup of the
queue. Please use something like the patch below:

- drop the broken skb_orphan call
- drop the workqueue
- batch cleanup based on tx_coalesce_frames/usecs for better throughput
- use a reasonable default tx timeout (200us, could be shorted
  based on measurements) with a range timer
- fix napi poll function return value
- use a lockless queue for cleanup

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Feel free to fold this into your patch rather than keeping it separate.
Completely untested, probably contains some bugs. Please ask if you
have questions about this.

diff --git a/drivers/net/ethernet/hisilicon/hip04_eth.c b/drivers/net/ethernet/hisilicon/hip04_eth.c
index 9d37b670db6a..d85c5287654e 100644
--- a/drivers/net/ethernet/hisilicon/hip04_eth.c
+++ b/drivers/net/ethernet/hisilicon/hip04_eth.c
@@ -12,6 +12,7 @@
 #include <linux/etherdevice.h>
 #include <linux/platform_device.h>
 #include <linux/interrupt.h>
+#include <linux/ktime.h>
 #include <linux/of_address.h>
 #include <linux/phy.h>
 #include <linux/of_mdio.h>
@@ -157,9 +158,11 @@ struct hip04_priv {
 	struct sk_buff *tx_skb[TX_DESC_NUM];
 	dma_addr_t tx_phys[TX_DESC_NUM];
 	unsigned int tx_head;
-	unsigned int tx_tail;
-	int tx_count;
-	unsigned long last_tx;
+
+	/* FIXME: make these adjustable through ethtool */
+	int tx_coalesce_frames;
+	int tx_coalesce_usecs;
+	struct hrtimer tx_coalesce_timer;
 
 	unsigned char *rx_buf[RX_DESC_NUM];
 	dma_addr_t rx_phys[RX_DESC_NUM];
@@ -171,10 +174,15 @@ struct hip04_priv {
 	struct regmap *map;
 	struct work_struct tx_timeout_task;
 
-	struct workqueue_struct *wq;
-	struct delayed_work tx_clean_task;
+	/* written only by tx cleanup */
+	unsigned int tx_tail ____cacheline_aligned_in_smp;
 };
 
+static inline unsigned int tx_count(unsigned int head, unsigned int tail)
+{
+	return (head - tail) % (TX_DESC_NUM - 1);
+}
+
 static void hip04_config_port(struct net_device *ndev, u32 speed, u32 duplex)
 {
 	struct hip04_priv *priv = netdev_priv(ndev);
@@ -351,18 +359,21 @@ static int hip04_set_mac_address(struct net_device *ndev, void *addr)
 	return 0;
 }
 
-static void hip04_tx_reclaim(struct net_device *ndev, bool force)
+static int hip04_tx_reclaim(struct net_device *ndev, bool force)
 {
 	struct hip04_priv *priv = netdev_priv(ndev);
 	unsigned tx_tail = priv->tx_tail;
 	struct tx_desc *desc;
 	unsigned int bytes_compl = 0, pkts_compl = 0;
+	unsigned int count;
 
-	if (priv->tx_count == 0)
+	smp_rmb();
+	count = tx_count(ACCESS_ONCE(priv->tx_head), tx_tail);
+	if (count == 0)
 		goto out;
 
-	while ((tx_tail != priv->tx_head) || (priv->tx_count == TX_DESC_NUM)) {
-		desc = &priv->tx_desc[priv->tx_tail];
+	while (count) {
+		desc = &priv->tx_desc[tx_tail];
 		if (desc->send_addr != 0) {
 			if (force)
 				desc->send_addr = 0;
@@ -381,57 +392,37 @@ static void hip04_tx_reclaim(struct net_device *ndev, bool force)
 		dev_kfree_skb(priv->tx_skb[tx_tail]);
 		priv->tx_skb[tx_tail] = NULL;
 		tx_tail = TX_NEXT(tx_tail);
-		priv->tx_count--;
-
-		if (priv->tx_count <= 0)
-			break;
+		count--;
 	}
 
 	priv->tx_tail = tx_tail;
+	smp_wmb(); /* Ensure tx_tail visible to xmit */
 
-	/* Ensure tx_tail & tx_count visible to xmit */
-	smp_mb();
 out:
-
 	if (pkts_compl || bytes_compl)
 		netdev_completed_queue(ndev, pkts_compl, bytes_compl);
 
-	if (unlikely(netif_queue_stopped(ndev)) &&
-	    (priv->tx_count < TX_DESC_NUM))
+	if (unlikely(netif_queue_stopped(ndev)) && (count < (TX_DESC_NUM - 1)))
 		netif_wake_queue(ndev);
-}
 
-static void hip04_tx_clean_monitor(struct work_struct *work)
-{
-	struct hip04_priv *priv = container_of(work, struct hip04_priv,
-					       tx_clean_task.work);
-	struct net_device *ndev = priv->ndev;
-	int delta_in_ticks = msecs_to_jiffies(1000);
-
-	if (!time_in_range(jiffies, priv->last_tx,
-			   priv->last_tx + delta_in_ticks)) {
-		netif_tx_lock(ndev);
-		hip04_tx_reclaim(ndev, false);
-		netif_tx_unlock(ndev);
-	}
-	queue_delayed_work(priv->wq, &priv->tx_clean_task, delta_in_ticks);
+	return count;
 }
 
 static int hip04_mac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 {
 	struct hip04_priv *priv = netdev_priv(ndev);
 	struct net_device_stats *stats = &ndev->stats;
-	unsigned int tx_head = priv->tx_head;
+	unsigned int tx_head = priv->tx_head, count;
 	struct tx_desc *desc = &priv->tx_desc[tx_head];
 	dma_addr_t phys;
 
-	if (priv->tx_count >= TX_DESC_NUM) {
+	smp_rmb();
+	count = tx_count(tx_head, ACCESS_ONCE(priv->tx_tail));
+	if (count == (TX_DESC_NUM - 1)) {
 		netif_stop_queue(ndev);
 		return NETDEV_TX_BUSY;
 	}
 
-	hip04_tx_reclaim(ndev, false);
-
 	phys = dma_map_single(&ndev->dev, skb->data, skb->len, DMA_TO_DEVICE);
 	if (dma_mapping_error(&ndev->dev, phys)) {
 		dev_kfree_skb(skb);
@@ -447,20 +438,33 @@ static int hip04_mac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 	desc->wb_addr = cpu_to_be32(phys);
 	skb_tx_timestamp(skb);
 
-	/* Don't wait up for transmitted skbs to be freed. */
-	skb_orphan(skb);
-
+	/* FIXME: eliminate this mmio access if xmit_more is set */
 	hip04_set_xmit_desc(priv, phys);
 	priv->tx_head = TX_NEXT(tx_head);
+	count++;
 	netdev_sent_queue(ndev, skb->len);
 
 	stats->tx_bytes += skb->len;
 	stats->tx_packets++;
-	priv->tx_count++;
-	priv->last_tx = jiffies;
 
-	/* Ensure tx_head & tx_count update visible to tx reclaim */
-	smp_mb();
+	/* Ensure tx_head update visible to tx reclaim */
+	smp_wmb();
+
+	/* queue is getting full, better start cleaning up now */
+	if (count >= priv->tx_coalesce_frames) {
+		if (napi_schedule_prep(&priv->napi)) {
+			/* disable rx interrupt and timer */
+			priv->reg_inten &= ~(RCV_INT);
+			writel_relaxed(DEF_INT_MASK & ~RCV_INT,
+				       priv->base + PPE_INTEN);
+			hrtimer_cancel(&priv->tx_coalesce_timer);
+			__napi_schedule(&priv->napi);
+		}
+	} else if (!hrtimer_is_queued(&priv->tx_coalesce_timer)) {
+		/* cleanup not pending yet, start a new timer */
+		hrtimer_start_expires(&priv->tx_coalesce_timer,
+				      HRTIMER_MODE_REL);
+	}
 
 	return NETDEV_TX_OK;
 }
@@ -477,6 +481,7 @@ static int hip04_rx_poll(struct napi_struct *napi, int budget)
 	bool last = false;
 	dma_addr_t phys;
 	int rx = 0;
+	int tx_remaining;
 	u16 len;
 	u32 err;
 
@@ -513,11 +518,11 @@ static int hip04_rx_poll(struct napi_struct *napi, int budget)
 
 		buf = netdev_alloc_frag(priv->rx_buf_size);
 		if (!buf)
-			return -ENOMEM;
+			goto done;
 		phys = dma_map_single(&ndev->dev, buf,
 				      RX_BUF_SIZE, DMA_FROM_DEVICE);
 		if (dma_mapping_error(&ndev->dev, phys))
-			return -EIO;
+			goto done;
 		priv->rx_buf[priv->rx_head] = buf;
 		priv->rx_phys[priv->rx_head] = phys;
 		hip04_set_recv_desc(priv, phys);
@@ -537,6 +542,11 @@ static int hip04_rx_poll(struct napi_struct *napi, int budget)
 	}
 	napi_complete(napi);
 done:
+	/* clean up tx descriptors and start a new timer if necessary */
+	tx_remaining = hip04_tx_reclaim(ndev, false);
+	if (rx < budget && tx_remaining)
+		hrtimer_start_expires(&priv->tx_coalesce_timer, HRTIMER_MODE_REL);
+
 	return rx;
 }
 
@@ -547,6 +557,9 @@ static irqreturn_t hip04_mac_interrupt(int irq, void *dev_id)
 	struct net_device_stats *stats = &ndev->stats;
 	u32 ists = readl_relaxed(priv->base + PPE_INTSTS);
 
+	if (!ists)
+		return IRQ_NONE;
+
 	writel_relaxed(DEF_INT_MASK, priv->base + PPE_RINT);
 
 	if (unlikely(ists & DEF_INT_ERR)) {
@@ -560,16 +573,32 @@ static irqreturn_t hip04_mac_interrupt(int irq, void *dev_id)
 		}
 	}
 
-	if (ists & RCV_INT) {
+	if (ists & RCV_INT && napi_schedule_prep(&priv->napi)) {
 		/* disable rx interrupt */
 		priv->reg_inten &= ~(RCV_INT);
-		writel_relaxed(priv->reg_inten, priv->base + PPE_INTEN);
-		napi_schedule(&priv->napi);
+		writel_relaxed(DEF_INT_MASK & ~RCV_INT, priv->base + PPE_INTEN);
+		hrtimer_cancel(&priv->tx_coalesce_timer);
+		__napi_schedule(&priv->napi);
 	}
 
 	return IRQ_HANDLED;
 }
 
+enum hrtimer_restart tx_done(struct hrtimer *hrtimer)
+{
+	struct hip04_priv *priv;
+	priv = container_of(hrtimer, struct hip04_priv, tx_coalesce_timer);
+
+	if (napi_schedule_prep(&priv->napi)) {
+		/* disable rx interrupt */
+		priv->reg_inten &= ~(RCV_INT);
+		writel_relaxed(DEF_INT_MASK & ~RCV_INT, priv->base + PPE_INTEN);
+		__napi_schedule(&priv->napi);
+	}
+
+	return HRTIMER_NORESTART;
+}
+
 static void hip04_adjust_link(struct net_device *ndev)
 {
 	struct hip04_priv *priv = netdev_priv(ndev);
@@ -589,7 +618,6 @@ static int hip04_mac_open(struct net_device *ndev)
 	priv->rx_head = 0;
 	priv->tx_head = 0;
 	priv->tx_tail = 0;
-	priv->tx_count = 0;
 	hip04_reset_ppe(priv);
 
 	for (i = 0; i < RX_DESC_NUM; i++) {
@@ -612,9 +640,6 @@ static int hip04_mac_open(struct net_device *ndev)
 	hip04_mac_enable(ndev);
 	napi_enable(&priv->napi);
 
-	INIT_DELAYED_WORK(&priv->tx_clean_task, hip04_tx_clean_monitor);
-	queue_delayed_work(priv->wq, &priv->tx_clean_task, 0);
-
 	return 0;
 }
 
@@ -623,8 +648,6 @@ static int hip04_mac_stop(struct net_device *ndev)
 	struct hip04_priv *priv = netdev_priv(ndev);
 	int i;
 
-	cancel_delayed_work_sync(&priv->tx_clean_task);
-
 	napi_disable(&priv->napi);
 	netif_stop_queue(ndev);
 	hip04_mac_disable(ndev);
@@ -725,6 +748,7 @@ static int hip04_mac_probe(struct platform_device *pdev)
 	struct hip04_priv *priv;
 	struct resource *res;
 	unsigned int irq;
+	ktime_t txtime;
 	int ret;
 
 	ndev = alloc_etherdev(sizeof(struct hip04_priv));
@@ -751,6 +775,21 @@ static int hip04_mac_probe(struct platform_device *pdev)
 	priv->port = arg.args[0];
 	priv->chan = arg.args[1] * RX_DESC_NUM;
 
+	hrtimer_init(&priv->tx_coalesce_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+
+	/*
+	 * BQL will try to keep the TX queue as short as possible, but it can't
+	 * be faster than tx_coalesce_usecs, so we need a fast timeout here,
+	 * but also long enough to gather up enough frames to ensure we don't
+	 * get more interrupts than necessary.
+	 * 200us is enough for 16 frames of 1500 bytes at gigabit ethernet rate
+	 */
+	priv->tx_coalesce_frames = TX_DESC_NUM * 3 / 4;
+	priv->tx_coalesce_usecs = 200;
+	/* allow timer to fire after half the time at the earliest */
+	txtime = ktime_set(0, priv->tx_coalesce_usecs * NSEC_PER_USEC / 2);
+	hrtimer_set_expires_range(&priv->tx_coalesce_timer, txtime, txtime);
+
 	priv->map = syscon_node_to_regmap(arg.np);
 	if (IS_ERR(priv->map)) {
 		dev_warn(d, "no syscon hisilicon,hip04-ppe\n");
@@ -788,12 +827,6 @@ static int hip04_mac_probe(struct platform_device *pdev)
 		}
 	}
 
-	priv->wq = create_singlethread_workqueue(ndev->name);
-	if (!priv->wq) {
-		ret = -ENOMEM;
-		goto init_fail;
-	}
-
 	INIT_WORK(&priv->tx_timeout_task, hip04_tx_timeout_task);
 
 	ether_setup(ndev);
@@ -848,8 +881,6 @@ static int hip04_remove(struct platform_device *pdev)
 	free_irq(ndev->irq, ndev);
 	of_node_put(priv->phy_node);
 	cancel_work_sync(&priv->tx_timeout_task);
-	if (priv->wq)
-		destroy_workqueue(priv->wq);
 	free_netdev(ndev);
 
 	return 0;

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next v9 0/3] add hisilicon hip04 ethernet driver
  2014-12-15 13:29 ` Arnd Bergmann
@ 2014-12-16  7:57   ` Ding Tianhong
  2014-12-16  8:54     ` Arnd Bergmann
  0 siblings, 1 reply; 10+ messages in thread
From: Ding Tianhong @ 2014-12-16  7:57 UTC (permalink / raw)
  To: Arnd Bergmann, linux-arm-kernel
  Cc: zhangfei.gao, davem, linux, f.fainelli, sergei.shtylyov,
	mark.rutland, David.Laight, eric.dumazet, xuwei5, netdev,
	devicetree

On 2014/12/15 21:29, Arnd Bergmann wrote:
> On Thursday 11 December 2014 19:42:27 Ding Tianhong wrote:
>> v9:
>> - There is no tx completion interrupts to free DMAd Tx packets, it means taht
>>   we rely on new tx packets arriving to run the destructors of completed packets,
>>   which open up space in their sockets's send queues. Sometimes we don't get such
>>   new packets causing Tx to stall, a single UDP transmitter is a good example of
>>   this situation, so we need a clean up workqueue to reclaims completed packets,
>>   the workqueue will only free the last packets which is already stay for several jiffies.
>>   Also fix some format cleanups.
> 
> You must have missed the reply in which David Miller explained why
> you can't call skb_orphan and rely on an occasional cleanup of the
> queue. Please use something like the patch below:
> 
> - drop the broken skb_orphan call
> - drop the workqueue
> - batch cleanup based on tx_coalesce_frames/usecs for better throughput
> - use a reasonable default tx timeout (200us, could be shorted
>   based on measurements) with a range timer
> - fix napi poll function return value
> - use a lockless queue for cleanup
> 

Ok, agree, my comments see below.

> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> 
> Feel free to fold this into your patch rather than keeping it separate.
> Completely untested, probably contains some bugs. Please ask if you
> have questions about this.
> 
> diff --git a/drivers/net/ethernet/hisilicon/hip04_eth.c b/drivers/net/ethernet/hisilicon/hip04_eth.c
> index 9d37b670db6a..d85c5287654e 100644
> --- a/drivers/net/ethernet/hisilicon/hip04_eth.c
> +++ b/drivers/net/ethernet/hisilicon/hip04_eth.c
> @@ -12,6 +12,7 @@
>  #include <linux/etherdevice.h>
>  #include <linux/platform_device.h>
>  #include <linux/interrupt.h>
> +#include <linux/ktime.h>
>  #include <linux/of_address.h>
>  #include <linux/phy.h>
>  #include <linux/of_mdio.h>
> @@ -157,9 +158,11 @@ struct hip04_priv {
>  	struct sk_buff *tx_skb[TX_DESC_NUM];
>  	dma_addr_t tx_phys[TX_DESC_NUM];
>  	unsigned int tx_head;
> -	unsigned int tx_tail;
> -	int tx_count;
> -	unsigned long last_tx;
> +
> +	/* FIXME: make these adjustable through ethtool */
> +	int tx_coalesce_frames;
> +	int tx_coalesce_usecs;
> +	struct hrtimer tx_coalesce_timer;
>  
>  	unsigned char *rx_buf[RX_DESC_NUM];
>  	dma_addr_t rx_phys[RX_DESC_NUM];
> @@ -171,10 +174,15 @@ struct hip04_priv {
>  	struct regmap *map;
>  	struct work_struct tx_timeout_task;
>  
> -	struct workqueue_struct *wq;
> -	struct delayed_work tx_clean_task;
> +	/* written only by tx cleanup */
> +	unsigned int tx_tail ____cacheline_aligned_in_smp;
>  };
>  
> +static inline unsigned int tx_count(unsigned int head, unsigned int tail)
> +{
> +	return (head - tail) % (TX_DESC_NUM - 1);
> +}
> +
>  static void hip04_config_port(struct net_device *ndev, u32 speed, u32 duplex)
>  {
>  	struct hip04_priv *priv = netdev_priv(ndev);
> @@ -351,18 +359,21 @@ static int hip04_set_mac_address(struct net_device *ndev, void *addr)
>  	return 0;
>  }
>  
> -static void hip04_tx_reclaim(struct net_device *ndev, bool force)
> +static int hip04_tx_reclaim(struct net_device *ndev, bool force)
>  {
>  	struct hip04_priv *priv = netdev_priv(ndev);
>  	unsigned tx_tail = priv->tx_tail;
>  	struct tx_desc *desc;
>  	unsigned int bytes_compl = 0, pkts_compl = 0;
> +	unsigned int count;
>  
> -	if (priv->tx_count == 0)
> +	smp_rmb();
> +	count = tx_count(ACCESS_ONCE(priv->tx_head), tx_tail);
> +	if (count == 0)
>  		goto out;
>  
> -	while ((tx_tail != priv->tx_head) || (priv->tx_count == TX_DESC_NUM)) {
> -		desc = &priv->tx_desc[priv->tx_tail];
> +	while (count) {
> +		desc = &priv->tx_desc[tx_tail];
>  		if (desc->send_addr != 0) {
>  			if (force)
>  				desc->send_addr = 0;
> @@ -381,57 +392,37 @@ static void hip04_tx_reclaim(struct net_device *ndev, bool force)
>  		dev_kfree_skb(priv->tx_skb[tx_tail]);
>  		priv->tx_skb[tx_tail] = NULL;
>  		tx_tail = TX_NEXT(tx_tail);
> -		priv->tx_count--;
> -
> -		if (priv->tx_count <= 0)
> -			break;
> +		count--;
>  	}
>  
>  	priv->tx_tail = tx_tail;
> +	smp_wmb(); /* Ensure tx_tail visible to xmit */
>  
> -	/* Ensure tx_tail & tx_count visible to xmit */
> -	smp_mb();
>  out:
> -
>  	if (pkts_compl || bytes_compl)
>  		netdev_completed_queue(ndev, pkts_compl, bytes_compl);
>  
> -	if (unlikely(netif_queue_stopped(ndev)) &&
> -	    (priv->tx_count < TX_DESC_NUM))
> +	if (unlikely(netif_queue_stopped(ndev)) && (count < (TX_DESC_NUM - 1)))
>  		netif_wake_queue(ndev);
> -}
>  
> -static void hip04_tx_clean_monitor(struct work_struct *work)
> -{
> -	struct hip04_priv *priv = container_of(work, struct hip04_priv,
> -					       tx_clean_task.work);
> -	struct net_device *ndev = priv->ndev;
> -	int delta_in_ticks = msecs_to_jiffies(1000);
> -
> -	if (!time_in_range(jiffies, priv->last_tx,
> -			   priv->last_tx + delta_in_ticks)) {
> -		netif_tx_lock(ndev);
> -		hip04_tx_reclaim(ndev, false);
> -		netif_tx_unlock(ndev);
> -	}
> -	queue_delayed_work(priv->wq, &priv->tx_clean_task, delta_in_ticks);
> +	return count;

I think should return pkts_compl, because may break from the loop, the pkts_compl may smaller than count.
and we need to add netif_tx_lock() to protect this function to avoid concurrency conflict.

>  }
>  
>  static int hip04_mac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>  {
>  	struct hip04_priv *priv = netdev_priv(ndev);
>  	struct net_device_stats *stats = &ndev->stats;
> -	unsigned int tx_head = priv->tx_head;
> +	unsigned int tx_head = priv->tx_head, count;
>  	struct tx_desc *desc = &priv->tx_desc[tx_head];
>  	dma_addr_t phys;
>  
> -	if (priv->tx_count >= TX_DESC_NUM) {
> +	smp_rmb();
> +	count = tx_count(tx_head, ACCESS_ONCE(priv->tx_tail));
> +	if (count == (TX_DESC_NUM - 1)) {
>  		netif_stop_queue(ndev);
>  		return NETDEV_TX_BUSY;
>  	}
>  
> -	hip04_tx_reclaim(ndev, false);
> -
>  	phys = dma_map_single(&ndev->dev, skb->data, skb->len, DMA_TO_DEVICE);
>  	if (dma_mapping_error(&ndev->dev, phys)) {
>  		dev_kfree_skb(skb);
> @@ -447,20 +438,33 @@ static int hip04_mac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>  	desc->wb_addr = cpu_to_be32(phys);
>  	skb_tx_timestamp(skb);
>  
> -	/* Don't wait up for transmitted skbs to be freed. */
> -	skb_orphan(skb);
> -
> +	/* FIXME: eliminate this mmio access if xmit_more is set */
>  	hip04_set_xmit_desc(priv, phys);
>  	priv->tx_head = TX_NEXT(tx_head);
> +	count++;
>  	netdev_sent_queue(ndev, skb->len);
>  
>  	stats->tx_bytes += skb->len;
>  	stats->tx_packets++;
> -	priv->tx_count++;
> -	priv->last_tx = jiffies;
>  
> -	/* Ensure tx_head & tx_count update visible to tx reclaim */
> -	smp_mb();
> +	/* Ensure tx_head update visible to tx reclaim */
> +	smp_wmb();
> +
> +	/* queue is getting full, better start cleaning up now */
> +	if (count >= priv->tx_coalesce_frames) {
> +		if (napi_schedule_prep(&priv->napi)) {
> +			/* disable rx interrupt and timer */
> +			priv->reg_inten &= ~(RCV_INT);
> +			writel_relaxed(DEF_INT_MASK & ~RCV_INT,
> +				       priv->base + PPE_INTEN);
> +			hrtimer_cancel(&priv->tx_coalesce_timer);
> +			__napi_schedule(&priv->napi);
> +		}
> +	} else if (!hrtimer_is_queued(&priv->tx_coalesce_timer)) {
> +		/* cleanup not pending yet, start a new timer */
> +		hrtimer_start_expires(&priv->tx_coalesce_timer,
> +				      HRTIMER_MODE_REL);
> +	}
>  
>  	return NETDEV_TX_OK;
>  }
> @@ -477,6 +481,7 @@ static int hip04_rx_poll(struct napi_struct *napi, int budget)
>  	bool last = false;
>  	dma_addr_t phys;
>  	int rx = 0;
> +	int tx_remaining;
>  	u16 len;
>  	u32 err;
>  
> @@ -513,11 +518,11 @@ static int hip04_rx_poll(struct napi_struct *napi, int budget)
>  
>  		buf = netdev_alloc_frag(priv->rx_buf_size);
>  		if (!buf)
> -			return -ENOMEM;
> +			goto done;
>  		phys = dma_map_single(&ndev->dev, buf,
>  				      RX_BUF_SIZE, DMA_FROM_DEVICE);
>  		if (dma_mapping_error(&ndev->dev, phys))
> -			return -EIO;
> +			goto done;
>  		priv->rx_buf[priv->rx_head] = buf;
>  		priv->rx_phys[priv->rx_head] = phys;
>  		hip04_set_recv_desc(priv, phys);
> @@ -537,6 +542,11 @@ static int hip04_rx_poll(struct napi_struct *napi, int budget)
>  	}
>  	napi_complete(napi);
>  done:
> +	/* clean up tx descriptors and start a new timer if necessary */
> +	tx_remaining = hip04_tx_reclaim(ndev, false);
> +	if (rx < budget && tx_remaining)
> +		hrtimer_start_expires(&priv->tx_coalesce_timer, HRTIMER_MODE_REL);
> +
>  	return rx;
>  }
>  
> @@ -547,6 +557,9 @@ static irqreturn_t hip04_mac_interrupt(int irq, void *dev_id)
>  	struct net_device_stats *stats = &ndev->stats;
>  	u32 ists = readl_relaxed(priv->base + PPE_INTSTS);
>  
> +	if (!ists)
> +		return IRQ_NONE;
> +
>  	writel_relaxed(DEF_INT_MASK, priv->base + PPE_RINT);
>  
>  	if (unlikely(ists & DEF_INT_ERR)) {
> @@ -560,16 +573,32 @@ static irqreturn_t hip04_mac_interrupt(int irq, void *dev_id)
>  		}
>  	}
>  
> -	if (ists & RCV_INT) {
> +	if (ists & RCV_INT && napi_schedule_prep(&priv->napi)) {
>  		/* disable rx interrupt */
>  		priv->reg_inten &= ~(RCV_INT);
> -		writel_relaxed(priv->reg_inten, priv->base + PPE_INTEN);
> -		napi_schedule(&priv->napi);
> +		writel_relaxed(DEF_INT_MASK & ~RCV_INT, priv->base + PPE_INTEN);
> +		hrtimer_cancel(&priv->tx_coalesce_timer);
> +		__napi_schedule(&priv->napi);
>  	}
>  
>  	return IRQ_HANDLED;
>  }
>  
> +enum hrtimer_restart tx_done(struct hrtimer *hrtimer)
> +{
> +	struct hip04_priv *priv;
> +	priv = container_of(hrtimer, struct hip04_priv, tx_coalesce_timer);
> +
> +	if (napi_schedule_prep(&priv->napi)) {
> +		/* disable rx interrupt */
> +		priv->reg_inten &= ~(RCV_INT);
> +		writel_relaxed(DEF_INT_MASK & ~RCV_INT, priv->base + PPE_INTEN);
> +		__napi_schedule(&priv->napi);
> +	}
> +
> +	return HRTIMER_NORESTART;
> +}
> +
>  static void hip04_adjust_link(struct net_device *ndev)
>  {
>  	struct hip04_priv *priv = netdev_priv(ndev);
> @@ -589,7 +618,6 @@ static int hip04_mac_open(struct net_device *ndev)
>  	priv->rx_head = 0;
>  	priv->tx_head = 0;
>  	priv->tx_tail = 0;
> -	priv->tx_count = 0;
>  	hip04_reset_ppe(priv);
>  
>  	for (i = 0; i < RX_DESC_NUM; i++) {
> @@ -612,9 +640,6 @@ static int hip04_mac_open(struct net_device *ndev)
>  	hip04_mac_enable(ndev);
>  	napi_enable(&priv->napi);
>  
> -	INIT_DELAYED_WORK(&priv->tx_clean_task, hip04_tx_clean_monitor);
> -	queue_delayed_work(priv->wq, &priv->tx_clean_task, 0);
> -
>  	return 0;
>  }
>  
> @@ -623,8 +648,6 @@ static int hip04_mac_stop(struct net_device *ndev)
>  	struct hip04_priv *priv = netdev_priv(ndev);
>  	int i;
>  
> -	cancel_delayed_work_sync(&priv->tx_clean_task);
> -
I think we should cancle the hrtimer when closed and queue the timer when open.

>  	napi_disable(&priv->napi);
>  	netif_stop_queue(ndev);
>  	hip04_mac_disable(ndev);
> @@ -725,6 +748,7 @@ static int hip04_mac_probe(struct platform_device *pdev)
>  	struct hip04_priv *priv;
>  	struct resource *res;
>  	unsigned int irq;
> +	ktime_t txtime;
>  	int ret;
>  
>  	ndev = alloc_etherdev(sizeof(struct hip04_priv));
> @@ -751,6 +775,21 @@ static int hip04_mac_probe(struct platform_device *pdev)
>  	priv->port = arg.args[0];
>  	priv->chan = arg.args[1] * RX_DESC_NUM;
>  
> +	hrtimer_init(&priv->tx_coalesce_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> +
> +	/*
> +	 * BQL will try to keep the TX queue as short as possible, but it can't
> +	 * be faster than tx_coalesce_usecs, so we need a fast timeout here,
> +	 * but also long enough to gather up enough frames to ensure we don't
> +	 * get more interrupts than necessary.
> +	 * 200us is enough for 16 frames of 1500 bytes at gigabit ethernet rate
> +	 */
> +	priv->tx_coalesce_frames = TX_DESC_NUM * 3 / 4;
> +	priv->tx_coalesce_usecs = 200;
> +	/* allow timer to fire after half the time at the earliest */
> +	txtime = ktime_set(0, priv->tx_coalesce_usecs * NSEC_PER_USEC / 2);
> +	hrtimer_set_expires_range(&priv->tx_coalesce_timer, txtime, txtime);
> +

I think miss the line:
 priv->tx_coalesce_timer.function = tx_done;


Regards
Ding
>  	priv->map = syscon_node_to_regmap(arg.np);
>  	if (IS_ERR(priv->map)) {
>  		dev_warn(d, "no syscon hisilicon,hip04-ppe\n");
> @@ -788,12 +827,6 @@ static int hip04_mac_probe(struct platform_device *pdev)
>  		}
>  	}
>  
> -	priv->wq = create_singlethread_workqueue(ndev->name);
> -	if (!priv->wq) {
> -		ret = -ENOMEM;
> -		goto init_fail;
> -	}
> -
>  	INIT_WORK(&priv->tx_timeout_task, hip04_tx_timeout_task);
>  
>  	ether_setup(ndev);
> @@ -848,8 +881,6 @@ static int hip04_remove(struct platform_device *pdev)
>  	free_irq(ndev->irq, ndev);
>  	of_node_put(priv->phy_node);
>  	cancel_work_sync(&priv->tx_timeout_task);
> -	if (priv->wq)
> -		destroy_workqueue(priv->wq);
>  	free_netdev(ndev);
>  
>  	return 0;
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next v9 0/3] add hisilicon hip04 ethernet driver
  2014-12-16  7:57   ` Ding Tianhong
@ 2014-12-16  8:54     ` Arnd Bergmann
  2014-12-16 10:08       ` Ding Tianhong
  0 siblings, 1 reply; 10+ messages in thread
From: Arnd Bergmann @ 2014-12-16  8:54 UTC (permalink / raw)
  To: Ding Tianhong
  Cc: linux-arm-kernel, zhangfei.gao, davem, linux, f.fainelli,
	sergei.shtylyov, mark.rutland, David.Laight, eric.dumazet,
	xuwei5, netdev, devicetree

On Tuesday 16 December 2014 15:57:21 Ding Tianhong wrote:
> On 2014/12/15 21:29, Arnd Bergmann wrote:
> > On Thursday 11 December 2014 19:42:27 Ding Tianhong wrote:
> > @@ -381,57 +392,37 @@ static void hip04_tx_reclaim(struct net_device *ndev, bool force)
> >  		dev_kfree_skb(priv->tx_skb[tx_tail]);
> >  		priv->tx_skb[tx_tail] = NULL;
> >  		tx_tail = TX_NEXT(tx_tail);
> > -		priv->tx_count--;
> > -
> > -		if (priv->tx_count <= 0)
> > -			break;
> > +		count--;
> >  	}
> >  
...
> > -	queue_delayed_work(priv->wq, &priv->tx_clean_task, delta_in_ticks);
> > +	return count;
> 
> I think should return pkts_compl, because may break from the loop, the
> pkts_compl may smaller than count.

The calling convention I used is to return the packets that are remaining
on the queue. Only if that is nonzero we need to reschedule the timer.

> and we need to add netif_tx_lock() to protect this function to avoid concurrency conflict.

Oh, did I miss something? The idea was that the start_xmit function only updates
the tx_head pointer and reads the tx_tail, while the tx_reclaim function does
the reverse, and writes to a different cache line, in order to allow a lockless
queue traversal.

Can you point to a specific struct member that still need to be protected by
the lock? Did I miss a race that would allow both functions to exit with
the timer disabled and entries left on the queue?

> > @@ -623,8 +648,6 @@ static int hip04_mac_stop(struct net_device *ndev)
> >  	struct hip04_priv *priv = netdev_priv(ndev);
> >  	int i;
> >  
> > -	cancel_delayed_work_sync(&priv->tx_clean_task);
> > -
> I think we should cancle the hrtimer when closed and queue the timer when open.

I was expecting that force-cleaning up the tx queue would be enough for that.
It it not?

I suppose it can't hurt to cancel the timer here anyway, and maybe use
WARN_ON() if it's still active.

Starting the timer after opening seems wrong though: at that point there are
no packets on the queue yet. The timer should always start ticking at the
exact point when the first packet is put on the queue while the timer is
not already pending.

> >  	napi_disable(&priv->napi);
> >  	netif_stop_queue(ndev);
> >  	hip04_mac_disable(ndev);
> > @@ -725,6 +748,7 @@ static int hip04_mac_probe(struct platform_device *pdev)
> >  	struct hip04_priv *priv;
> >  	struct resource *res;
> >  	unsigned int irq;
> > +	ktime_t txtime;
> >  	int ret;
> >  
> >  	ndev = alloc_etherdev(sizeof(struct hip04_priv));
> > @@ -751,6 +775,21 @@ static int hip04_mac_probe(struct platform_device *pdev)
> >  	priv->port = arg.args[0];
> >  	priv->chan = arg.args[1] * RX_DESC_NUM;
> >  
> > +	hrtimer_init(&priv->tx_coalesce_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> > +
> > +	/*
> > +	 * BQL will try to keep the TX queue as short as possible, but it can't
> > +	 * be faster than tx_coalesce_usecs, so we need a fast timeout here,
> > +	 * but also long enough to gather up enough frames to ensure we don't
> > +	 * get more interrupts than necessary.
> > +	 * 200us is enough for 16 frames of 1500 bytes at gigabit ethernet rate
> > +	 */
> > +	priv->tx_coalesce_frames = TX_DESC_NUM * 3 / 4;
> > +	priv->tx_coalesce_usecs = 200;
> > +	/* allow timer to fire after half the time at the earliest */
> > +	txtime = ktime_set(0, priv->tx_coalesce_usecs * NSEC_PER_USEC / 2);
> > +	hrtimer_set_expires_range(&priv->tx_coalesce_timer, txtime, txtime);
> > +
> 
> I think miss the line:
>  priv->tx_coalesce_timer.function = tx_done;

Yes, good point.

	Arnd

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next v9 0/3] add hisilicon hip04 ethernet driver
  2014-12-16  8:54     ` Arnd Bergmann
@ 2014-12-16 10:08       ` Ding Tianhong
  0 siblings, 0 replies; 10+ messages in thread
From: Ding Tianhong @ 2014-12-16 10:08 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel, zhangfei.gao, davem, linux, f.fainelli,
	sergei.shtylyov, mark.rutland, David.Laight, eric.dumazet,
	xuwei5, netdev, devicetree

On 2014/12/16 16:54, Arnd Bergmann wrote:
> On Tuesday 16 December 2014 15:57:21 Ding Tianhong wrote:
>> On 2014/12/15 21:29, Arnd Bergmann wrote:
>>> On Thursday 11 December 2014 19:42:27 Ding Tianhong wrote:
>>> @@ -381,57 +392,37 @@ static void hip04_tx_reclaim(struct net_device *ndev, bool force)
>>>  		dev_kfree_skb(priv->tx_skb[tx_tail]);
>>>  		priv->tx_skb[tx_tail] = NULL;
>>>  		tx_tail = TX_NEXT(tx_tail);
>>> -		priv->tx_count--;
>>> -
>>> -		if (priv->tx_count <= 0)
>>> -			break;
>>> +		count--;
>>>  	}
>>>  
> ...
>>> -	queue_delayed_work(priv->wq, &priv->tx_clean_task, delta_in_ticks);
>>> +	return count;
>>
>> I think should return pkts_compl, because may break from the loop, the
>> pkts_compl may smaller than count.
> 
> The calling convention I used is to return the packets that are remaining
> on the queue. Only if that is nonzero we need to reschedule the timer.
> 

OK, agree.

>> and we need to add netif_tx_lock() to protect this function to avoid concurrency conflict.
> 
> Oh, did I miss something? The idea was that the start_xmit function only updates
> the tx_head pointer and reads the tx_tail, while the tx_reclaim function does
> the reverse, and writes to a different cache line, in order to allow a lockless
> queue traversal.
> 
> Can you point to a specific struct member that still need to be protected by
> the lock? Did I miss a race that would allow both functions to exit with
> the timer disabled and entries left on the queue?
> 
OK, got it, no problem.

>>> @@ -623,8 +648,6 @@ static int hip04_mac_stop(struct net_device *ndev)
>>>  	struct hip04_priv *priv = netdev_priv(ndev);
>>>  	int i;
>>>  
>>> -	cancel_delayed_work_sync(&priv->tx_clean_task);
>>> -
>> I think we should cancle the hrtimer when closed and queue the timer when open.
> 
> I was expecting that force-cleaning up the tx queue would be enough for that.
> It it not?
> 
> I suppose it can't hurt to cancel the timer here anyway, and maybe use
> WARN_ON() if it's still active.
> 

Ok, I found no need to worry about this, when the dev is closed, the napi will disable and will not enter timer again.

> Starting the timer after opening seems wrong though: at that point there are
> no packets on the queue yet. The timer should always start ticking at the
> exact point when the first packet is put on the queue while the timer is
> not already pending.
> 
Ok.

>>>  	napi_disable(&priv->napi);
>>>  	netif_stop_queue(ndev);
>>>  	hip04_mac_disable(ndev);
>>> @@ -725,6 +748,7 @@ static int hip04_mac_probe(struct platform_device *pdev)
>>>  	struct hip04_priv *priv;
>>>  	struct resource *res;
>>>  	unsigned int irq;
>>> +	ktime_t txtime;
>>>  	int ret;
>>>  
>>>  	ndev = alloc_etherdev(sizeof(struct hip04_priv));
>>> @@ -751,6 +775,21 @@ static int hip04_mac_probe(struct platform_device *pdev)
>>>  	priv->port = arg.args[0];
>>>  	priv->chan = arg.args[1] * RX_DESC_NUM;
>>>  
>>> +	hrtimer_init(&priv->tx_coalesce_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
>>> +
>>> +	/*
>>> +	 * BQL will try to keep the TX queue as short as possible, but it can't
>>> +	 * be faster than tx_coalesce_usecs, so we need a fast timeout here,
>>> +	 * but also long enough to gather up enough frames to ensure we don't
>>> +	 * get more interrupts than necessary.
>>> +	 * 200us is enough for 16 frames of 1500 bytes at gigabit ethernet rate
>>> +	 */
>>> +	priv->tx_coalesce_frames = TX_DESC_NUM * 3 / 4;
>>> +	priv->tx_coalesce_usecs = 200;
>>> +	/* allow timer to fire after half the time at the earliest */
>>> +	txtime = ktime_set(0, priv->tx_coalesce_usecs * NSEC_PER_USEC / 2);
>>> +	hrtimer_set_expires_range(&priv->tx_coalesce_timer, txtime, txtime);
>>> +
>>
>> I think miss the line:
>>  priv->tx_coalesce_timer.function = tx_done;
> 
> Yes, good point.
> 

I will send v10 when the net-next open again, and these days will test this driver, thanks a lot.

Ding

> 	Arnd
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> .
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-12-16 10:09 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-11 11:42 [PATCH net-next v9 0/3] add hisilicon hip04 ethernet driver Ding Tianhong
2014-12-11 11:42 ` [PATCH net-next v9 1/3] Documentation: add Device tree bindings for Hisilicon hip04 ethernet Ding Tianhong
2014-12-11 11:42 ` [PATCH net-next v9 2/3] net: hisilicon: new hip04 MDIO driver Ding Tianhong
2014-12-11 11:42 ` [PATCH net-next v9 3/3] net: hisilicon: new hip04 ethernet driver Ding Tianhong
     [not found] ` <1418298150-4944-1-git-send-email-dingtianhong-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2014-12-11 17:04   ` [PATCH net-next v9 0/3] add hisilicon " David Miller
2014-12-12  1:46     ` Ding Tianhong
2014-12-15 13:29 ` Arnd Bergmann
2014-12-16  7:57   ` Ding Tianhong
2014-12-16  8:54     ` Arnd Bergmann
2014-12-16 10:08       ` Ding Tianhong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).