All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/17] stmmac: enhance driver performances and update the version
@ 2016-02-29 13:27 Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 01/17] stmmac: share reset function between dwmac100 and dwmac1000 Alexandre TORGUE
                   ` (19 more replies)
  0 siblings, 20 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

According to Giuseppe, I send the v3 series.

This is a subset of patches to rework the driver in order to improve its
performances and make it more robust under stress conditions.

All patches have been ported on STi mainstream kernel branch and
tested on ARM STiH4xx platforms and newer ones.

This series also updates the driver version and prepares it
to include further development to support new chips.

In detail, these patches are:

o to rework and improve the internal DMA bus settings

  Fine tuning is mandatory on some platforms for both
  performance and stability issues.

o to rework and optimize the descriptor management.

  This will help a lot on performance side and preparing
  the inclusion on the GMAC4.x.

o to add a set of optimizations for both xmit and rx functions.

  These will help a lot on performance side and making the driver
  more robust in case of low memory conditions and under some
  stress test, performed for example on IP-STB.

Below some throughput figures obtained on some boxes before and after
the patches.

                       nuttcp (mbps)       iperf (Mbps)
------------------------------------------------------------------
                      tcp     udp          tcp      udp
                   tx   rx   tx  rx      tx   rx   tx  rx
                    ------------------------------------------
   old             680   800 480  506    760  800   600  700
   new             830   880 540  630    840  880   700   800

==============================================================================

V2: - rx_copybreak is now managed by using ethtool.
V3: - improve comments on PCIe detailing that there are no regressions
    - rework some APIs to properly define some params as bool as expected
    - rework the formula to get the element inside the ring. Comparing V2,
	patches 4 and 13 have been merged because the same formula have been
	used. After this rework, no evident benefit has been noticed in terms
	of performances so the table above is still valid. Disassembling the
	code for SH4 and ARM, with the new formula just an instr is saved
	(depending on compiler flags) and this gives us not so relevanti gain,
	for example, on SH4 where some instr are executed in the same pipeline
	stage.
	Ring sizes are now fixed and maybe they can be reworked to be tuned
	w/o using stmmaceth= cmdline option. Indeed, nobody change these sizes
	and indeed the numbers selected by default respect the budget and
	avoid to pass invalid setup. These are the best driver default sizes
	for ring and chain.

==============================================================================
Fabrice Gasnier (3):
  stmmac: merge get_rx_owner into rx_status routine.
  stmmac: optimize tx clean function
  stmmac: fix phy init when attached to a phy

Giuseppe Cavallaro (14):
  stmmac: share reset function between dwmac100 and dwmac1000
  stmmac: rework DMA bus setting and introduce new platform AXI
    structure
  stmmac: change descriptor layout
  stmmac: review RX/TX ring management
  stmmac: add length field to dma data
  stmmac: add last_segment field to dma data
  stmmac: add is_jumbo field to dma data
  stmmac: optimize tx desc management
  stmmac: set dirty index out of the loop
  stmmac: first frame prep at the end of xmit routine
  stmmac: do not poll phy handler when attach a switch
  stmmac: do not perform zero-copy for rx frames
  stmmac: tune rx copy via threshold.
  stmmac: update version to Oct_2015

 Documentation/devicetree/bindings/net/stmmac.txt   |  54 ++-
 drivers/net/ethernet/stmicro/stmmac/chain_mode.c   |  37 +-
 drivers/net/ethernet/stmicro/stmmac/common.h       |  39 +-
 drivers/net/ethernet/stmicro/stmmac/descs.h        | 330 +++++++-------
 drivers/net/ethernet/stmicro/stmmac/descs_com.h    |  77 ++--
 drivers/net/ethernet/stmicro/stmmac/dwmac100.h     |   1 -
 drivers/net/ethernet/stmicro/stmmac/dwmac1000.h    |   3 +-
 .../net/ethernet/stmicro/stmmac/dwmac1000_dma.c    | 111 +++--
 drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c |  22 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h    |  39 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c    |  21 +
 drivers/net/ethernet/stmicro/stmmac/enh_desc.c     | 226 +++++-----
 drivers/net/ethernet/stmicro/stmmac/norm_desc.c    | 150 ++++---
 drivers/net/ethernet/stmicro/stmmac/ring_mode.c    |  32 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac.h       |   9 +-
 .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c   |  41 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  | 473 ++++++++++++---------
 drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c   |   4 +-
 .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |  42 +-
 include/linux/stmmac.h                             |  17 +-
 20 files changed, 1016 insertions(+), 712 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v3 01/17] stmmac: share reset function between dwmac100 and dwmac1000
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 02/17] stmmac: rework DMA bus setting and introduce new platform AXI structure Alexandre TORGUE
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Giuseppe Cavallaro <peppe.cavallaro@st.com>

This patch is to share the same reset procedure between dwmac100 and
dwmac1000 chips.
This will also help on enhancing the driver and support new chips.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 1e19c8f..bac0e44 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -376,7 +376,8 @@ extern const struct stmmac_desc_ops ndesc_ops;
 /* Specific DMA helpers */
 struct stmmac_dma_ops {
 	/* DMA core initialization */
-	int (*init) (void __iomem *ioaddr, int pbl, int fb, int mb,
+	int (*reset)(void __iomem *ioaddr);
+	void (*init)(void __iomem *ioaddr, int pbl, int fb, int mb,
 		     int burst_len, u32 dma_tx, u32 dma_rx, int atds);
 	/* Dump DMA registers */
 	void (*dump_regs) (void __iomem *ioaddr);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac100.h b/drivers/net/ethernet/stmicro/stmmac/dwmac100.h
index 2ec6aea..1657acf 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac100.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac100.h
@@ -95,7 +95,6 @@
 #define DMA_BUS_MODE_DSL_MASK	0x0000007c	/* Descriptor Skip Length */
 #define DMA_BUS_MODE_DSL_SHIFT	2	/*   (in DWORDS)      */
 #define DMA_BUS_MODE_BAR_BUS	0x00000002	/* Bar-Bus Arbitration */
-#define DMA_BUS_MODE_SFT_RESET	0x00000001	/* Software Reset */
 #define DMA_BUS_MODE_DEFAULT	0x00000000
 
 /* DMA Control register defines */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
index 8831a05..9d36ae7 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
@@ -221,7 +221,6 @@ enum inter_frame_gap {
 
 /*--- DMA BLOCK defines ---*/
 /* DMA Bus Mode register defines */
-#define DMA_BUS_MODE_SFT_RESET	0x00000001	/* Software Reset */
 #define DMA_BUS_MODE_DA		0x00000002	/* Arbitration scheme */
 #define DMA_BUS_MODE_DSL_MASK	0x0000007c	/* Descriptor Skip Length */
 #define DMA_BUS_MODE_DSL_SHIFT	2		/*   (in DWORDS)      */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
index 0e8937c..5f0aea5 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
@@ -30,23 +30,10 @@
 #include "dwmac1000.h"
 #include "dwmac_dma.h"
 
-static int dwmac1000_dma_init(void __iomem *ioaddr, int pbl, int fb, int mb,
-			      int burst_len, u32 dma_tx, u32 dma_rx, int atds)
+static void dwmac1000_dma_init(void __iomem *ioaddr, int pbl, int fb, int mb,
+			       int burst_len, u32 dma_tx, u32 dma_rx, int atds)
 {
-	u32 value = readl(ioaddr + DMA_BUS_MODE);
-	int limit;
-
-	/* DMA SW reset */
-	value |= DMA_BUS_MODE_SFT_RESET;
-	writel(value, ioaddr + DMA_BUS_MODE);
-	limit = 10;
-	while (limit--) {
-		if (!(readl(ioaddr + DMA_BUS_MODE) & DMA_BUS_MODE_SFT_RESET))
-			break;
-		mdelay(10);
-	}
-	if (limit < 0)
-		return -EBUSY;
+	u32 value;
 
 	/*
 	 * Set the DMA PBL (Programmable Burst Length) mode
@@ -102,8 +89,6 @@ static int dwmac1000_dma_init(void __iomem *ioaddr, int pbl, int fb, int mb,
 	 */
 	writel(dma_tx, ioaddr + DMA_TX_BASE_ADDR);
 	writel(dma_rx, ioaddr + DMA_RCV_BASE_ADDR);
-
-	return 0;
 }
 
 static u32 dwmac1000_configure_fc(u32 csr6, int rxfifosz)
@@ -205,6 +190,7 @@ static void dwmac1000_rx_watchdog(void __iomem *ioaddr, u32 riwt)
 }
 
 const struct stmmac_dma_ops dwmac1000_dma_ops = {
+	.reset = dwmac_dma_reset,
 	.init = dwmac1000_dma_init,
 	.dump_regs = dwmac1000_dump_dma_regs,
 	.dma_mode = dwmac1000_dma_operation_mode,
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
index 9d0971c..c40582a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
@@ -32,24 +32,9 @@
 #include "dwmac100.h"
 #include "dwmac_dma.h"
 
-static int dwmac100_dma_init(void __iomem *ioaddr, int pbl, int fb, int mb,
-			     int burst_len, u32 dma_tx, u32 dma_rx, int atds)
+static void dwmac100_dma_init(void __iomem *ioaddr, int pbl, int fb, int mb,
+			      int burst_len, u32 dma_tx, u32 dma_rx, int atds)
 {
-	u32 value = readl(ioaddr + DMA_BUS_MODE);
-	int limit;
-
-	/* DMA SW reset */
-	value |= DMA_BUS_MODE_SFT_RESET;
-	writel(value, ioaddr + DMA_BUS_MODE);
-	limit = 10;
-	while (limit--) {
-		if (!(readl(ioaddr + DMA_BUS_MODE) & DMA_BUS_MODE_SFT_RESET))
-			break;
-		mdelay(10);
-	}
-	if (limit < 0)
-		return -EBUSY;
-
 	/* Enable Application Access by writing to DMA CSR0 */
 	writel(DMA_BUS_MODE_DEFAULT | (pbl << DMA_BUS_MODE_PBL_SHIFT),
 	       ioaddr + DMA_BUS_MODE);
@@ -62,8 +47,6 @@ static int dwmac100_dma_init(void __iomem *ioaddr, int pbl, int fb, int mb,
 	 */
 	writel(dma_tx, ioaddr + DMA_TX_BASE_ADDR);
 	writel(dma_rx, ioaddr + DMA_RCV_BASE_ADDR);
-
-	return 0;
 }
 
 /* Store and Forward capability is not used at all.
@@ -131,6 +114,7 @@ static void dwmac100_dma_diagnostic_fr(void *data, struct stmmac_extra_stats *x,
 }
 
 const struct stmmac_dma_ops dwmac100_dma_ops = {
+	.reset = dwmac_dma_reset,
 	.init = dwmac100_dma_init,
 	.dump_regs = dwmac100_dump_dma_regs,
 	.dma_mode = dwmac100_dma_operation_mode,
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
index def266d..13ca90e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
@@ -35,6 +35,10 @@
 #define DMA_CONTROL		0x00001018	/* Ctrl (Operational Mode) */
 #define DMA_INTR_ENA		0x0000101c	/* Interrupt Enable */
 #define DMA_MISSED_FRAME_CTR	0x00001020	/* Missed Frame Counter */
+
+/* SW Reset */
+#define DMA_BUS_MODE_SFT_RESET	0x00000001	/* Software Reset */
+
 /* Rx watchdog register */
 #define DMA_RX_WATCHDOG		0x00001024
 /* AXI Bus Mode */
@@ -112,5 +116,6 @@ void dwmac_dma_stop_tx(void __iomem *ioaddr);
 void dwmac_dma_start_rx(void __iomem *ioaddr);
 void dwmac_dma_stop_rx(void __iomem *ioaddr);
 int dwmac_dma_interrupt(void __iomem *ioaddr, struct stmmac_extra_stats *x);
+int dwmac_dma_reset(void __iomem *ioaddr);
 
 #endif /* __DWMAC_DMA_H__ */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
index 484e3cf..84e3e84 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
@@ -26,6 +26,27 @@
 
 #define GMAC_HI_REG_AE		0x80000000
 
+int dwmac_dma_reset(void __iomem *ioaddr)
+{
+	u32 value = readl(ioaddr + DMA_BUS_MODE);
+	int limit;
+
+	/* DMA SW reset */
+	value |= DMA_BUS_MODE_SFT_RESET;
+	writel(value, ioaddr + DMA_BUS_MODE);
+	limit = 10;
+	while (limit--) {
+		if (!(readl(ioaddr + DMA_BUS_MODE) & DMA_BUS_MODE_SFT_RESET))
+			break;
+		mdelay(10);
+	}
+
+	if (limit < 0)
+		return -EBUSY;
+
+	return 0;
+}
+
 /* CSR1 enables the transmit DMA to check for new descriptor */
 void dwmac_enable_dma_transmission(void __iomem *ioaddr)
 {
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index c21015b..13752e9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1638,6 +1638,7 @@ static int stmmac_init_dma_engine(struct stmmac_priv *priv)
 	int pbl = DEFAULT_DMA_PBL, fixed_burst = 0, burst_len = 0;
 	int mixed_burst = 0;
 	int atds = 0;
+	int ret = 0;
 
 	if (priv->plat->dma_cfg) {
 		pbl = priv->plat->dma_cfg->pbl;
@@ -1649,9 +1650,16 @@ static int stmmac_init_dma_engine(struct stmmac_priv *priv)
 	if (priv->extend_desc && (priv->mode == STMMAC_RING_MODE))
 		atds = 1;
 
-	return priv->hw->dma->init(priv->ioaddr, pbl, fixed_burst, mixed_burst,
-				   burst_len, priv->dma_tx_phy,
-				   priv->dma_rx_phy, atds);
+	ret = priv->hw->dma->reset(priv->ioaddr);
+	if (ret) {
+		dev_err(priv->device, "Failed to reset the dma\n");
+		return ret;
+	}
+
+	priv->hw->dma->init(priv->ioaddr, pbl, fixed_burst, mixed_burst,
+			    burst_len, priv->dma_tx_phy,
+			    priv->dma_rx_phy, atds);
+	return ret;
 }
 
 /**
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 02/17] stmmac: rework DMA bus setting and introduce new platform AXI structure
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 01/17] stmmac: share reset function between dwmac100 and dwmac1000 Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 03/17] stmmac: change descriptor layout Alexandre TORGUE
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Giuseppe Cavallaro <peppe.cavallaro@st.com>

This patch restructures the DMA bus settings and this is done
by introducing a new platform structure used for programming
the AXI Bus Mode Register inside the DMA module.
This structure can be populated from device-tree as documented in the
binding txt file.

After initializing the DMA, the AXI register can be optionally tuned
for platform drivers based.
This patch also reworks some parameters to make coherent the DMA
configuration now that AXI register is introduced.
For example, the burst_len is managed by using the mentioned axi
support above; so the snps,burst-len parameter has been removed.
It makes sense to provide the AAL parameter from DT to Address-Aligned
Beats inside the Register0 and review the PBL settings when initialize
the engine.

For PCI glue, rebuilding the story of this setting, it
was added to align a configuration so not for fixing some
known problem. No issue raised after this patch.
It is safe to use the default burst length instead of
tuning it to the maximum value

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/Documentation/devicetree/bindings/net/stmmac.txt b/Documentation/devicetree/bindings/net/stmmac.txt
index e862a92..6605d19 100644
--- a/Documentation/devicetree/bindings/net/stmmac.txt
+++ b/Documentation/devicetree/bindings/net/stmmac.txt
@@ -17,7 +17,25 @@ Required properties:
 	The 1st cell is reset pre-delay in micro seconds.
 	The 2nd cell is reset pulse in micro seconds.
 	The 3rd cell is reset post-delay in micro seconds.
+
+Optional properties:
+- resets: Should contain a phandle to the STMMAC reset signal, if any
+- reset-names: Should contain the reset signal name "stmmaceth", if a
+	reset phandle is given
+- max-frame-size: See ethernet.txt file in the same directory
+- clocks: If present, the first clock should be the GMAC main clock and
+  the second clock should be peripheral's register interface clock. Further
+  clocks may be specified in derived bindings.
+- clock-names: One name for each entry in the clocks property, the
+  first one should be "stmmaceth" and the second one should be "pclk".
+- clk_ptp_ref: this is the PTP reference clock; in case of the PTP is
+  available this clock is used for programming the Timestamp Addend Register.
+  If not passed then the system clock will be used and this is fine on some
+  platforms.
+- tx-fifo-depth: See ethernet.txt file in the same directory
+- rx-fifo-depth: See ethernet.txt file in the same directory
 - snps,pbl		Programmable Burst Length
+- snps,aal		Address-Aligned Beats
 - snps,fixed-burst	Program the DMA to use the fixed burst mode
 - snps,mixed-burst	Program the DMA to use the mixed burst mode
 - snps,force_thresh_dma_mode	Force DMA to use the threshold mode for
@@ -29,27 +47,28 @@ Required properties:
 				supported by this device instance
 - snps,perfect-filter-entries:	Number of perfect filter entries supported
 				by this device instance
-
-Optional properties:
-- resets: Should contain a phandle to the STMMAC reset signal, if any
-- reset-names: Should contain the reset signal name "stmmaceth", if a
-	reset phandle is given
-- max-frame-size: See ethernet.txt file in the same directory
-- clocks: If present, the first clock should be the GMAC main clock
-  The optional second clock should be peripheral's register interface clock.
-  The third optional clock should be the ptp reference clock.
-  Further clocks may be specified in derived bindings.
-- clock-names: One name for each entry in the clocks property.
-  The first one should be "stmmaceth".
-  The optional second one should be "pclk".
-  The optional third one should be "clk_ptp_ref".
-- snps,burst_len: The AXI burst lenth value of the AXI BUS MODE register.
-- tx-fifo-depth: See ethernet.txt file in the same directory
-- rx-fifo-depth: See ethernet.txt file in the same directory
+- AXI BUS Mode parameters: below the list of all the parameters to program the
+			   AXI register inside the DMA module:
+	- snps,lpi_en: enable Low Power Interface
+	- snps,xit_frm: unlock on WoL
+	- snps,wr_osr_lmt: max write oustanding req. limit
+	- snps,rd_osr_lmt: max read oustanding req. limit
+	- snps,kbbe: do not cross 1KiB boundary.
+	- snps,axi_all: align address
+	- snps,blen: this is a vector of supported burst length.
+	- snps,fb: fixed-burst
+	- snps,mb: mixed-burst
+	- snps,rb: rebuild INCRx Burst
 - mdio: with compatible = "snps,dwmac-mdio", create and register mdio bus.
 
 Examples:
 
+	stmmac_axi_setup: stmmac-axi-config {
+		snps,wr_osr_lmt = <0xf>;
+		snps,rd_osr_lmt = <0xf>;
+		snps,blen = <256 128 64 32 0 0 0>;
+	};
+
 	gmac0: ethernet@e0800000 {
 		compatible = "st,spear600-gmac";
 		reg = <0xe0800000 0x8000>;
@@ -65,6 +84,7 @@ Examples:
 		tx-fifo-depth = <16384>;
 		clocks = <&clock>;
 		clock-names = "stmmaceth";
+		snps,axi-config = <&stmmac_axi_setup>;
 		mdio0 {
 			#address-cells = <1>;
 			#size-cells = <0>;
diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index bac0e44..586a336 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -27,6 +27,7 @@
 
 #include <linux/etherdevice.h>
 #include <linux/netdevice.h>
+#include <linux/stmmac.h>
 #include <linux/phy.h>
 #include <linux/module.h>
 #if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
@@ -378,7 +379,9 @@ struct stmmac_dma_ops {
 	/* DMA core initialization */
 	int (*reset)(void __iomem *ioaddr);
 	void (*init)(void __iomem *ioaddr, int pbl, int fb, int mb,
-		     int burst_len, u32 dma_tx, u32 dma_rx, int atds);
+		     int aal, u32 dma_tx, u32 dma_rx, int atds);
+	/* Configure the AXI Bus Mode Register */
+	void (*axi)(void __iomem *ioaddr, struct stmmac_axi *axi);
 	/* Dump DMA registers */
 	void (*dump_regs) (void __iomem *ioaddr);
 	/* Set tx/rx threshold in the csr6 register
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
index 9d36ae7..b0593a4 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
@@ -240,7 +240,7 @@ enum rx_tx_priority_ratio {
 #define DMA_BUS_MODE_RPBL_MASK	0x003e0000	/* Rx-Programmable Burst Len */
 #define DMA_BUS_MODE_RPBL_SHIFT	17
 #define DMA_BUS_MODE_USP	0x00800000
-#define DMA_BUS_MODE_PBL	0x01000000
+#define DMA_BUS_MODE_MAXPBL	0x01000000
 #define DMA_BUS_MODE_AAL	0x02000000
 
 /* DMA CRS Control and Status Register Mapping */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
index 5f0aea5..da32d60 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
@@ -30,24 +30,76 @@
 #include "dwmac1000.h"
 #include "dwmac_dma.h"
 
+static void dwmac1000_dma_axi(void __iomem *ioaddr, struct stmmac_axi *axi)
+{
+	u32 value = readl(ioaddr + DMA_AXI_BUS_MODE);
+	int i;
+
+	pr_info("dwmac1000: Master AXI performs %s burst length\n",
+		!(value & DMA_AXI_UNDEF) ? "fixed" : "any");
+
+	if (axi->axi_lpi_en)
+		value |= DMA_AXI_EN_LPI;
+	if (axi->axi_xit_frm)
+		value |= DMA_AXI_LPI_XIT_FRM;
+
+	value |= (axi->axi_wr_osr_lmt & DMA_AXI_WR_OSR_LMT_MASK) <<
+		 DMA_AXI_WR_OSR_LMT_SHIFT;
+
+	value |= (axi->axi_rd_osr_lmt & DMA_AXI_RD_OSR_LMT_MASK) <<
+		 DMA_AXI_RD_OSR_LMT_SHIFT;
+
+	/* Depending on the UNDEF bit the Master AXI will perform any burst
+	 * length according to the BLEN programmed (by default all BLEN are
+	 * set).
+	 */
+	for (i = 0; i < AXI_BLEN; i++) {
+		switch (axi->axi_blen[i]) {
+		case 256:
+			value |= DMA_AXI_BLEN256;
+			break;
+		case 128:
+			value |= DMA_AXI_BLEN128;
+			break;
+		case 64:
+			value |= DMA_AXI_BLEN64;
+			break;
+		case 32:
+			value |= DMA_AXI_BLEN32;
+			break;
+		case 16:
+			value |= DMA_AXI_BLEN16;
+			break;
+		case 8:
+			value |= DMA_AXI_BLEN8;
+			break;
+		case 4:
+			value |= DMA_AXI_BLEN4;
+			break;
+		}
+	}
+
+	writel(value, ioaddr + DMA_AXI_BUS_MODE);
+}
+
 static void dwmac1000_dma_init(void __iomem *ioaddr, int pbl, int fb, int mb,
-			       int burst_len, u32 dma_tx, u32 dma_rx, int atds)
+			       int aal, u32 dma_tx, u32 dma_rx, int atds)
 {
-	u32 value;
+	u32 value = readl(ioaddr + DMA_BUS_MODE);
 
 	/*
-	 * Set the DMA PBL (Programmable Burst Length) mode
-	 * Before stmmac core 3.50 this mode bit was 4xPBL, and
+	 * Set the DMA PBL (Programmable Burst Length) mode.
+	 *
+	 * Note: before stmmac core 3.50 this mode bit was 4xPBL, and
 	 * post 3.5 mode bit acts as 8*PBL.
-	 * For core rev < 3.5, when the core is set for 4xPBL mode, the
-	 * DMA transfers the data in 4, 8, 16, 32, 64 & 128 beats
-	 * depending on pbl value.
-	 * For core rev > 3.5, when the core is set for 8xPBL mode, the
-	 * DMA transfers the data in 8, 16, 32, 64, 128 & 256 beats
-	 * depending on pbl value.
+	 *
+	 * This configuration doesn't take care about the Separate PBL
+	 * so only the bits: 13-8 are programmed with the PBL passed from the
+	 * platform.
 	 */
-	value = DMA_BUS_MODE_PBL | ((pbl << DMA_BUS_MODE_PBL_SHIFT) |
-				    (pbl << DMA_BUS_MODE_RPBL_SHIFT));
+	value |= DMA_BUS_MODE_MAXPBL;
+	value &= ~DMA_BUS_MODE_PBL_MASK;
+	value |= (pbl << DMA_BUS_MODE_PBL_SHIFT);
 
 	/* Set the Fixed burst mode */
 	if (fb)
@@ -60,26 +112,10 @@ static void dwmac1000_dma_init(void __iomem *ioaddr, int pbl, int fb, int mb,
 	if (atds)
 		value |= DMA_BUS_MODE_ATDS;
 
-	writel(value, ioaddr + DMA_BUS_MODE);
+	if (aal)
+		value |= DMA_BUS_MODE_AAL;
 
-	/* In case of GMAC AXI configuration, program the DMA_AXI_BUS_MODE
-	 * for supported bursts.
-	 *
-	 * Note: This is applicable only for revision GMACv3.61a. For
-	 * older version this register is reserved and shall have no
-	 * effect.
-	 *
-	 * Note:
-	 *  For Fixed Burst Mode: if we directly write 0xFF to this
-	 *  register using the configurations pass from platform code,
-	 *  this would ensure that all bursts supported by core are set
-	 *  and those which are not supported would remain ineffective.
-	 *
-	 *  For Non Fixed Burst Mode: provide the maximum value of the
-	 *  burst length. Any burst equal or below the provided burst
-	 *  length would be allowed to perform.
-	 */
-	writel(burst_len, ioaddr + DMA_AXI_BUS_MODE);
+	writel(value, ioaddr + DMA_BUS_MODE);
 
 	/* Mask interrupts by writing to CSR7 */
 	writel(DMA_INTR_DEFAULT_MASK, ioaddr + DMA_INTR_ENA);
@@ -192,6 +228,7 @@ static void dwmac1000_rx_watchdog(void __iomem *ioaddr, u32 riwt)
 const struct stmmac_dma_ops dwmac1000_dma_ops = {
 	.reset = dwmac_dma_reset,
 	.init = dwmac1000_dma_init,
+	.axi = dwmac1000_dma_axi,
 	.dump_regs = dwmac1000_dump_dma_regs,
 	.dma_mode = dwmac1000_dma_operation_mode,
 	.enable_dma_transmission = dwmac_enable_dma_transmission,
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
index c40582a..61f54c9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
@@ -33,7 +33,7 @@
 #include "dwmac_dma.h"
 
 static void dwmac100_dma_init(void __iomem *ioaddr, int pbl, int fb, int mb,
-			      int burst_len, u32 dma_tx, u32 dma_rx, int atds)
+			      int aal, u32 dma_tx, u32 dma_rx, int atds)
 {
 	/* Enable Application Access by writing to DMA CSR0 */
 	writel(DMA_BUS_MODE_DEFAULT | (pbl << DMA_BUS_MODE_PBL_SHIFT),
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
index 13ca90e..726d9d9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
@@ -41,8 +41,40 @@
 
 /* Rx watchdog register */
 #define DMA_RX_WATCHDOG		0x00001024
-/* AXI Bus Mode */
+
+/* AXI Master Bus Mode */
 #define DMA_AXI_BUS_MODE	0x00001028
+
+#define DMA_AXI_EN_LPI		BIT(31)
+#define DMA_AXI_LPI_XIT_FRM	BIT(30)
+#define DMA_AXI_WR_OSR_LMT	GENMASK(23, 20)
+#define DMA_AXI_WR_OSR_LMT_SHIFT	20
+#define DMA_AXI_WR_OSR_LMT_MASK	0xf
+#define DMA_AXI_RD_OSR_LMT	GENMASK(19, 16)
+#define DMA_AXI_RD_OSR_LMT_SHIFT	16
+#define DMA_AXI_RD_OSR_LMT_MASK	0xf
+
+#define DMA_AXI_OSR_MAX		0xf
+#define DMA_AXI_MAX_OSR_LIMIT ((DMA_AXI_OSR_MAX << DMA_AXI_WR_OSR_LMT_SHIFT) | \
+			       (DMA_AXI_OSR_MAX << DMA_AXI_RD_OSR_LMT_SHIFT))
+#define	DMA_AXI_1KBBE		BIT(13)
+#define DMA_AXI_AAL		BIT(12)
+#define DMA_AXI_BLEN256		BIT(7)
+#define DMA_AXI_BLEN128		BIT(6)
+#define DMA_AXI_BLEN64		BIT(5)
+#define DMA_AXI_BLEN32		BIT(4)
+#define DMA_AXI_BLEN16		BIT(3)
+#define DMA_AXI_BLEN8		BIT(2)
+#define DMA_AXI_BLEN4		BIT(1)
+#define DMA_BURST_LEN_DEFAULT	(DMA_AXI_BLEN256 | DMA_AXI_BLEN128 | \
+				 DMA_AXI_BLEN64 | DMA_AXI_BLEN32 | \
+				 DMA_AXI_BLEN16 | DMA_AXI_BLEN8 | \
+				 DMA_AXI_BLEN4)
+
+#define DMA_AXI_UNDEF		BIT(0)
+
+#define DMA_AXI_BURST_LEN_MASK	0x000000FE
+
 #define DMA_CUR_TX_BUF_ADDR	0x00001050	/* Current Host Tx Buffer */
 #define DMA_CUR_RX_BUF_ADDR	0x00001054	/* Current Host Rx Buffer */
 #define DMA_HW_FEATURE		0x00001058	/* HW Feature Register */
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 13752e9..89c2626 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1635,7 +1635,7 @@ static void stmmac_check_ether_addr(struct stmmac_priv *priv)
  */
 static int stmmac_init_dma_engine(struct stmmac_priv *priv)
 {
-	int pbl = DEFAULT_DMA_PBL, fixed_burst = 0, burst_len = 0;
+	int pbl = DEFAULT_DMA_PBL, fixed_burst = 0, aal = 0;
 	int mixed_burst = 0;
 	int atds = 0;
 	int ret = 0;
@@ -1644,7 +1644,7 @@ static int stmmac_init_dma_engine(struct stmmac_priv *priv)
 		pbl = priv->plat->dma_cfg->pbl;
 		fixed_burst = priv->plat->dma_cfg->fixed_burst;
 		mixed_burst = priv->plat->dma_cfg->mixed_burst;
-		burst_len = priv->plat->dma_cfg->burst_len;
+		aal = priv->plat->dma_cfg->aal;
 	}
 
 	if (priv->extend_desc && (priv->mode == STMMAC_RING_MODE))
@@ -1657,8 +1657,12 @@ static int stmmac_init_dma_engine(struct stmmac_priv *priv)
 	}
 
 	priv->hw->dma->init(priv->ioaddr, pbl, fixed_burst, mixed_burst,
-			    burst_len, priv->dma_tx_phy,
-			    priv->dma_rx_phy, atds);
+			    aal, priv->dma_tx_phy, priv->dma_rx_phy, atds);
+
+	if ((priv->synopsys_id >= DWMAC_CORE_3_50) &&
+	    (priv->plat->axi && priv->hw->dma->axi))
+		priv->hw->dma->axi(priv->ioaddr, priv->plat->axi);
+
 	return ret;
 }
 
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c
index d71a721..ae43887 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c
@@ -81,7 +81,7 @@ static void stmmac_default_data(struct plat_stmmacenet_data *plat)
 	plat->mdio_bus_data->phy_mask = 0;
 
 	plat->dma_cfg->pbl = 32;
-	plat->dma_cfg->burst_len = DMA_AXI_BLEN_256;
+	/* TODO: AXI */
 
 	/* Set default value for multicast hash bins */
 	plat->multicast_filter_bins = HASH_TABLE_SIZE;
@@ -115,8 +115,8 @@ static int quark_default_data(struct plat_stmmacenet_data *plat,
 	plat->mdio_bus_data->phy_mask = 0;
 
 	plat->dma_cfg->pbl = 16;
-	plat->dma_cfg->burst_len = DMA_AXI_BLEN_256;
 	plat->dma_cfg->fixed_burst = 1;
+	/* AXI (TODO) */
 
 	/* Set default value for multicast hash bins */
 	plat->multicast_filter_bins = HASH_TABLE_SIZE;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
index 6a52fa1..69ccf48 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
@@ -96,6 +96,42 @@ static int dwmac1000_validate_ucast_entries(int ucast_entries)
 }
 
 /**
+ * stmmac_axi_setup - parse DT parameters for programming the AXI register
+ * @pdev: platform device
+ * @priv: driver private struct.
+ * Description:
+ * if required, from device-tree the AXI internal register can be tuned
+ * by using platform parameters.
+ */
+static struct stmmac_axi *stmmac_axi_setup(struct platform_device *pdev)
+{
+	struct device_node *np;
+	struct stmmac_axi *axi;
+
+	np = of_parse_phandle(pdev->dev.of_node, "snps,axi-config", 0);
+	if (!np)
+		return NULL;
+
+	axi = kzalloc(sizeof(axi), GFP_KERNEL);
+	if (!axi)
+		return ERR_PTR(-ENOMEM);
+
+	axi->axi_lpi_en = of_property_read_bool(np, "snps,lpi_en");
+	axi->axi_xit_frm = of_property_read_bool(np, "snps,xit_frm");
+	axi->axi_kbbe = of_property_read_bool(np, "snps,axi_kbbe");
+	axi->axi_axi_all = of_property_read_bool(np, "snps,axi_all");
+	axi->axi_fb = of_property_read_bool(np, "snps,axi_fb");
+	axi->axi_mb = of_property_read_bool(np, "snps,axi_mb");
+	axi->axi_rb =  of_property_read_bool(np, "snps,axi_rb");
+
+	of_property_read_u32(np, "snps,wr_osr_lmt", &axi->axi_wr_osr_lmt);
+	of_property_read_u32(np, "snps,rd_osr_lmt", &axi->axi_rd_osr_lmt);
+	of_property_read_u32_array(np, "snps,blen", axi->axi_blen, AXI_BLEN);
+
+	return axi;
+}
+
+/**
  * stmmac_probe_config_dt - parse device-tree driver parameters
  * @pdev: platform_device structure
  * @plat: driver data platform structure
@@ -216,13 +252,11 @@ stmmac_probe_config_dt(struct platform_device *pdev, const char **mac)
 		}
 		plat->dma_cfg = dma_cfg;
 		of_property_read_u32(np, "snps,pbl", &dma_cfg->pbl);
+		dma_cfg->aal = of_property_read_bool(np, "snps,aal");
 		dma_cfg->fixed_burst =
 			of_property_read_bool(np, "snps,fixed-burst");
 		dma_cfg->mixed_burst =
 			of_property_read_bool(np, "snps,mixed-burst");
-		of_property_read_u32(np, "snps,burst_len", &dma_cfg->burst_len);
-		if (dma_cfg->burst_len < 0 || dma_cfg->burst_len > 256)
-			dma_cfg->burst_len = 0;
 	}
 	plat->force_thresh_dma_mode = of_property_read_bool(np, "snps,force_thresh_dma_mode");
 	if (plat->force_thresh_dma_mode) {
@@ -230,6 +264,8 @@ stmmac_probe_config_dt(struct platform_device *pdev, const char **mac)
 		pr_warn("force_sf_dma_mode is ignored if force_thresh_dma_mode is set.");
 	}
 
+	plat->axi = stmmac_axi_setup(pdev);
+
 	return plat;
 }
 #else
diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
index eead8ab..6e53fa8 100644
--- a/include/linux/stmmac.h
+++ b/include/linux/stmmac.h
@@ -90,7 +90,21 @@ struct stmmac_dma_cfg {
 	int pbl;
 	int fixed_burst;
 	int mixed_burst;
-	int burst_len;
+	bool aal;
+};
+
+#define AXI_BLEN	7
+struct stmmac_axi {
+	bool axi_lpi_en;
+	bool axi_xit_frm;
+	u32 axi_wr_osr_lmt;
+	u32 axi_rd_osr_lmt;
+	bool axi_kbbe;
+	bool axi_axi_all;
+	u32 axi_blen[AXI_BLEN];
+	bool axi_fb;
+	bool axi_mb;
+	bool axi_rb;
 };
 
 struct plat_stmmacenet_data {
@@ -122,5 +136,6 @@ struct plat_stmmacenet_data {
 	int (*init)(struct platform_device *pdev, void *priv);
 	void (*exit)(struct platform_device *pdev, void *priv);
 	void *bsp_priv;
+	struct stmmac_axi *axi;
 };
 #endif
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 03/17] stmmac: change descriptor layout
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 01/17] stmmac: share reset function between dwmac100 and dwmac1000 Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 02/17] stmmac: rework DMA bus setting and introduce new platform AXI structure Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 04/17] stmmac: review RX/TX ring management Alexandre TORGUE
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Giuseppe Cavallaro <peppe.cavallaro@st.com>

This patch completely changes the descriptor layout to improve
the whole performances due to the single read usage of the
descriptors in critical paths.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/descs.h b/drivers/net/ethernet/stmicro/stmmac/descs.h
index 799c292..2e4c171 100644
--- a/drivers/net/ethernet/stmicro/stmmac/descs.h
+++ b/drivers/net/ethernet/stmicro/stmmac/descs.h
@@ -1,6 +1,6 @@
 /*******************************************************************************
-  Header File to describe the DMA descriptors.
-  Enhanced descriptors have been in case of DWMAC1000 Cores.
+  Header File to describe the DMA descriptors and related definitions.
+  This is for DWMAC100 and 1000 cores.
 
   This program is free software; you can redistribute it and/or modify it
   under the terms and conditions of the GNU General Public License,
@@ -24,198 +24,164 @@
 #ifndef __DESCS_H__
 #define __DESCS_H__
 
+#include <linux/bitops.h>
+
+/* Normal receive descriptor defines */
+
+/* RDES0 */
+#define	RDES0_PAYLOAD_CSUM_ERR	BIT(0)
+#define	RDES0_CRC_ERROR		BIT(1)
+#define	RDES0_DRIBBLING		BIT(2)
+#define	RDES0_MII_ERROR		BIT(3)
+#define	RDES0_RECEIVE_WATCHDOG	BIT(4)
+#define	RDES0_FRAME_TYPE	BIT(5)
+#define	RDES0_COLLISION		BIT(6)
+#define	RDES0_IPC_CSUM_ERROR	BIT(7)
+#define	RDES0_LAST_DESCRIPTOR	BIT(8)
+#define	RDES0_FIRST_DESCRIPTOR	BIT(9)
+#define	RDES0_VLAN_TAG		BIT(10)
+#define	RDES0_OVERFLOW_ERROR	BIT(11)
+#define	RDES0_LENGTH_ERROR	BIT(12)
+#define	RDES0_SA_FILTER_FAIL	BIT(13)
+#define	RDES0_DESCRIPTOR_ERROR	BIT(14)
+#define	RDES0_ERROR_SUMMARY	BIT(15)
+#define	RDES0_FRAME_LEN_MASK	GENMASK(29, 16)
+#define RDES0_FRAME_LEN_SHIFT	16
+#define	RDES0_DA_FILTER_FAIL	BIT(30)
+#define	RDES0_OWN		BIT(31)
+			/* RDES1 */
+#define	RDES1_BUFFER1_SIZE_MASK		GENMASK(10, 0)
+#define	RDES1_BUFFER2_SIZE_MASK		GENMASK(21, 11)
+#define	RDES1_BUFFER2_SIZE_SHIFT	11
+#define	RDES1_SECOND_ADDRESS_CHAINED	BIT(24)
+#define	RDES1_END_RING			BIT(25)
+#define	RDES1_DISABLE_IC		BIT(31)
+
+/* Enhanced receive descriptor defines */
+
+/* RDES0 (similar to normal RDES) */
+#define	 ERDES0_RX_MAC_ADDR	BIT(0)
+
+/* RDES1: completely differ from normal desc definitions */
+#define	ERDES1_BUFFER1_SIZE_MASK	GENMASK(12, 0)
+#define	ERDES1_SECOND_ADDRESS_CHAINED	BIT(14)
+#define	ERDES1_END_RING			BIT(15)
+#define	ERDES1_BUFFER2_SIZE_MASK	GENMASK(28, 16)
+#define ERDES1_BUFFER2_SIZE_SHIFT	16
+#define	ERDES1_DISABLE_IC		BIT(31)
+
+/* Normal transmit descriptor defines */
+/* TDES0 */
+#define	TDES0_DEFERRED			BIT(0)
+#define	TDES0_UNDERFLOW_ERROR		BIT(1)
+#define	TDES0_EXCESSIVE_DEFERRAL	BIT(2)
+#define	TDES0_COLLISION_COUNT_MASK	GENMASK(6, 3)
+#define	TDES0_VLAN_FRAME		BIT(7)
+#define	TDES0_EXCESSIVE_COLLISIONS	BIT(8)
+#define	TDES0_LATE_COLLISION		BIT(9)
+#define	TDES0_NO_CARRIER		BIT(10)
+#define	TDES0_LOSS_CARRIER		BIT(11)
+#define	TDES0_PAYLOAD_ERROR		BIT(12)
+#define	TDES0_FRAME_FLUSHED		BIT(13)
+#define	TDES0_JABBER_TIMEOUT		BIT(14)
+#define	TDES0_ERROR_SUMMARY		BIT(15)
+#define	TDES0_IP_HEADER_ERROR		BIT(16)
+#define	TDES0_TIME_STAMP_STATUS		BIT(17)
+#define	TDES0_OWN			BIT(31)
+/* TDES1 */
+#define	TDES1_BUFFER1_SIZE_MASK		GENMASK(10, 0)
+#define	TDES1_BUFFER2_SIZE_MASK		GENMASK(21, 11)
+#define	TDES1_BUFFER2_SIZE_SHIFT	11
+#define	TDES1_TIME_STAMP_ENABLE		BIT(22)
+#define	TDES1_DISABLE_PADDING		BIT(23)
+#define	TDES1_SECOND_ADDRESS_CHAINED	BIT(24)
+#define	TDES1_END_RING			BIT(25)
+#define	TDES1_CRC_DISABLE		BIT(26)
+#define	TDES1_CHECKSUM_INSERTION_MASK	GENMASK(28, 27)
+#define	TDES1_CHECKSUM_INSERTION_SHIFT	27
+#define	TDES1_FIRST_SEGMENT		BIT(29)
+#define	TDES1_LAST_SEGMENT		BIT(30)
+#define	TDES1_INTERRUPT			BIT(31)
+
+/* Enhanced transmit descriptor defines */
+/* TDES0 */
+#define	ETDES0_DEFERRED			BIT(0)
+#define	ETDES0_UNDERFLOW_ERROR		BIT(1)
+#define	ETDES0_EXCESSIVE_DEFERRAL	BIT(2)
+#define	ETDES0_COLLISION_COUNT_MASK	GENMASK(6, 3)
+#define	ETDES0_VLAN_FRAME		BIT(7)
+#define	ETDES0_EXCESSIVE_COLLISIONS	BIT(8)
+#define	ETDES0_LATE_COLLISION		BIT(9)
+#define	ETDES0_NO_CARRIER		BIT(10)
+#define	ETDES0_LOSS_CARRIER		BIT(11)
+#define	ETDES0_PAYLOAD_ERROR		BIT(12)
+#define	ETDES0_FRAME_FLUSHED		BIT(13)
+#define	ETDES0_JABBER_TIMEOUT		BIT(14)
+#define	ETDES0_ERROR_SUMMARY		BIT(15)
+#define	ETDES0_IP_HEADER_ERROR		BIT(16)
+#define	ETDES0_TIME_STAMP_STATUS	BIT(17)
+#define	ETDES0_SECOND_ADDRESS_CHAINED	BIT(20)
+#define	ETDES0_END_RING			BIT(21)
+#define	ETDES0_CHECKSUM_INSERTION_MASK	GENMASK(23, 22)
+#define	ETDES0_CHECKSUM_INSERTION_SHIFT	22
+#define	ETDES0_TIME_STAMP_ENABLE	BIT(25)
+#define	ETDES0_DISABLE_PADDING		BIT(26)
+#define	ETDES0_CRC_DISABLE		BIT(27)
+#define	ETDES0_FIRST_SEGMENT		BIT(28)
+#define	ETDES0_LAST_SEGMENT		BIT(29)
+#define	ETDES0_INTERRUPT		BIT(30)
+#define	ETDES0_OWN			BIT(31)
+/* TDES1 */
+#define	ETDES1_BUFFER1_SIZE_MASK	GENMASK(12, 0)
+#define	ETDES1_BUFFER2_SIZE_MASK	GENMASK(28, 16)
+#define	ETDES1_BUFFER2_SIZE_SHIFT	16
+
+/* Extended Receive descriptor definitions */
+#define	ERDES4_IP_PAYLOAD_TYPE_MASK	GENMASK(2, 6)
+#define	ERDES4_IP_HDR_ERR		BIT(3)
+#define	ERDES4_IP_PAYLOAD_ERR		BIT(4)
+#define	ERDES4_IP_CSUM_BYPASSED		BIT(5)
+#define	ERDES4_IPV4_PKT_RCVD		BIT(6)
+#define	ERDES4_IPV6_PKT_RCVD		BIT(7)
+#define	ERDES4_MSG_TYPE_MASK		GENMASK(11, 8)
+#define	ERDES4_PTP_FRAME_TYPE		BIT(12)
+#define	ERDES4_PTP_VER			BIT(13)
+#define	ERDES4_TIMESTAMP_DROPPED	BIT(14)
+#define	ERDES4_AV_PKT_RCVD		BIT(16)
+#define	ERDES4_AV_TAGGED_PKT_RCVD	BIT(17)
+#define	ERDES4_VLAN_TAG_PRI_VAL_MASK	GENMASK(20, 18)
+#define	ERDES4_L3_FILTER_MATCH		BIT(24)
+#define	ERDES4_L4_FILTER_MATCH		BIT(25)
+#define	ERDES4_L3_L4_FILT_NO_MATCH_MASK	GENMASK(27, 26)
+
+/* Extended RDES4 message type definitions */
+#define RDES_EXT_NO_PTP			0
+#define RDES_EXT_SYNC			1
+#define RDES_EXT_FOLLOW_UP		2
+#define RDES_EXT_DELAY_REQ		3
+#define RDES_EXT_DELAY_RESP		4
+#define RDES_EXT_PDELAY_REQ		5
+#define RDES_EXT_PDELAY_RESP		6
+#define RDES_EXT_PDELAY_FOLLOW_UP	7
+
 /* Basic descriptor structure for normal and alternate descriptors */
 struct dma_desc {
-	/* Receive descriptor */
-	union {
-		struct {
-			/* RDES0 */
-			u32 payload_csum_error:1;
-			u32 crc_error:1;
-			u32 dribbling:1;
-			u32 mii_error:1;
-			u32 receive_watchdog:1;
-			u32 frame_type:1;
-			u32 collision:1;
-			u32 ipc_csum_error:1;
-			u32 last_descriptor:1;
-			u32 first_descriptor:1;
-			u32 vlan_tag:1;
-			u32 overflow_error:1;
-			u32 length_error:1;
-			u32 sa_filter_fail:1;
-			u32 descriptor_error:1;
-			u32 error_summary:1;
-			u32 frame_length:14;
-			u32 da_filter_fail:1;
-			u32 own:1;
-			/* RDES1 */
-			u32 buffer1_size:11;
-			u32 buffer2_size:11;
-			u32 reserved1:2;
-			u32 second_address_chained:1;
-			u32 end_ring:1;
-			u32 reserved2:5;
-			u32 disable_ic:1;
-
-		} rx;
-		struct {
-			/* RDES0 */
-			u32 rx_mac_addr:1;
-			u32 crc_error:1;
-			u32 dribbling:1;
-			u32 error_gmii:1;
-			u32 receive_watchdog:1;
-			u32 frame_type:1;
-			u32 late_collision:1;
-			u32 ipc_csum_error:1;
-			u32 last_descriptor:1;
-			u32 first_descriptor:1;
-			u32 vlan_tag:1;
-			u32 overflow_error:1;
-			u32 length_error:1;
-			u32 sa_filter_fail:1;
-			u32 descriptor_error:1;
-			u32 error_summary:1;
-			u32 frame_length:14;
-			u32 da_filter_fail:1;
-			u32 own:1;
-			/* RDES1 */
-			u32 buffer1_size:13;
-			u32 reserved1:1;
-			u32 second_address_chained:1;
-			u32 end_ring:1;
-			u32 buffer2_size:13;
-			u32 reserved2:2;
-			u32 disable_ic:1;
-		} erx;		/* -- enhanced -- */
-
-		/* Transmit descriptor */
-		struct {
-			/* TDES0 */
-			u32 deferred:1;
-			u32 underflow_error:1;
-			u32 excessive_deferral:1;
-			u32 collision_count:4;
-			u32 vlan_frame:1;
-			u32 excessive_collisions:1;
-			u32 late_collision:1;
-			u32 no_carrier:1;
-			u32 loss_carrier:1;
-			u32 payload_error:1;
-			u32 frame_flushed:1;
-			u32 jabber_timeout:1;
-			u32 error_summary:1;
-			u32 ip_header_error:1;
-			u32 time_stamp_status:1;
-			u32 reserved1:13;
-			u32 own:1;
-			/* TDES1 */
-			u32 buffer1_size:11;
-			u32 buffer2_size:11;
-			u32 time_stamp_enable:1;
-			u32 disable_padding:1;
-			u32 second_address_chained:1;
-			u32 end_ring:1;
-			u32 crc_disable:1;
-			u32 checksum_insertion:2;
-			u32 first_segment:1;
-			u32 last_segment:1;
-			u32 interrupt:1;
-		} tx;
-		struct {
-			/* TDES0 */
-			u32 deferred:1;
-			u32 underflow_error:1;
-			u32 excessive_deferral:1;
-			u32 collision_count:4;
-			u32 vlan_frame:1;
-			u32 excessive_collisions:1;
-			u32 late_collision:1;
-			u32 no_carrier:1;
-			u32 loss_carrier:1;
-			u32 payload_error:1;
-			u32 frame_flushed:1;
-			u32 jabber_timeout:1;
-			u32 error_summary:1;
-			u32 ip_header_error:1;
-			u32 time_stamp_status:1;
-			u32 reserved1:2;
-			u32 second_address_chained:1;
-			u32 end_ring:1;
-			u32 checksum_insertion:2;
-			u32 reserved2:1;
-			u32 time_stamp_enable:1;
-			u32 disable_padding:1;
-			u32 crc_disable:1;
-			u32 first_segment:1;
-			u32 last_segment:1;
-			u32 interrupt:1;
-			u32 own:1;
-			/* TDES1 */
-			u32 buffer1_size:13;
-			u32 reserved3:3;
-			u32 buffer2_size:13;
-			u32 reserved4:3;
-		} etx;		/* -- enhanced -- */
-
-		u64 all_flags;
-	} des01;
+	unsigned int des0;
+	unsigned int des1;
 	unsigned int des2;
 	unsigned int des3;
 };
 
-/* Extended descriptor structure (supported by new SYNP GMAC generations) */
+/* Extended descriptor structure (e.g. >= databook 3.50a) */
 struct dma_extended_desc {
-	struct dma_desc basic;
-	union {
-		struct {
-			u32 ip_payload_type:3;
-			u32 ip_hdr_err:1;
-			u32 ip_payload_err:1;
-			u32 ip_csum_bypassed:1;
-			u32 ipv4_pkt_rcvd:1;
-			u32 ipv6_pkt_rcvd:1;
-			u32 msg_type:4;
-			u32 ptp_frame_type:1;
-			u32 ptp_ver:1;
-			u32 timestamp_dropped:1;
-			u32 reserved:1;
-			u32 av_pkt_rcvd:1;
-			u32 av_tagged_pkt_rcvd:1;
-			u32 vlan_tag_priority_val:3;
-			u32 reserved3:3;
-			u32 l3_filter_match:1;
-			u32 l4_filter_match:1;
-			u32 l3_l4_filter_no_match:2;
-			u32 reserved4:4;
-		} erx;
-		struct {
-			u32 reserved;
-		} etx;
-	} des4;
+	struct dma_desc basic;	/* Basic descriptors */
+	unsigned int des4;	/* Extended Status */
 	unsigned int des5;	/* Reserved */
 	unsigned int des6;	/* Tx/Rx Timestamp Low */
 	unsigned int des7;	/* Tx/Rx Timestamp High */
 };
 
 /* Transmit checksum insertion control */
-enum tdes_csum_insertion {
-	cic_disabled = 0,	/* Checksum Insertion Control */
-	cic_only_ip = 1,	/* Only IP header */
-	/* IP header but pseudoheader is not calculated */
-	cic_no_pseudoheader = 2,
-	cic_full = 3,		/* IP header and pseudoheader */
-};
-
-/* Extended RDES4 definitions */
-#define RDES_EXT_NO_PTP			0
-#define RDES_EXT_SYNC			0x1
-#define RDES_EXT_FOLLOW_UP		0x2
-#define RDES_EXT_DELAY_REQ		0x3
-#define RDES_EXT_DELAY_RESP		0x4
-#define RDES_EXT_PDELAY_REQ		0x5
-#define RDES_EXT_PDELAY_RESP		0x6
-#define RDES_EXT_PDELAY_FOLLOW_UP	0x7
+#define	TX_CIC_FULL	3	/* Include IP header and pseudoheader */
 
 #endif /* __DESCS_H__ */
diff --git a/drivers/net/ethernet/stmicro/stmmac/descs_com.h b/drivers/net/ethernet/stmicro/stmmac/descs_com.h
index 6f2cc78..7635a46 100644
--- a/drivers/net/ethernet/stmicro/stmmac/descs_com.h
+++ b/drivers/net/ethernet/stmicro/stmmac/descs_com.h
@@ -35,100 +35,91 @@
 /* Enhanced descriptors */
 static inline void ehn_desc_rx_set_on_ring(struct dma_desc *p, int end)
 {
-	p->des01.erx.buffer2_size = BUF_SIZE_8KiB - 1;
-	if (end)
-		p->des01.erx.end_ring = 1;
-}
+	p->des1 |= ((BUF_SIZE_8KiB - 1) << ERDES1_BUFFER2_SIZE_SHIFT)
+		   & ERDES1_BUFFER2_SIZE_MASK;
 
-static inline void ehn_desc_tx_set_on_ring(struct dma_desc *p, int end)
-{
 	if (end)
-		p->des01.etx.end_ring = 1;
+		p->des1 |= ERDES1_END_RING;
 }
 
-static inline void enh_desc_end_tx_desc_on_ring(struct dma_desc *p, int ter)
+static inline void enh_desc_end_tx_desc_on_ring(struct dma_desc *p, int end)
 {
-	p->des01.etx.end_ring = ter;
+	if (end)
+		p->des0 |= ETDES0_END_RING;
+	else
+		p->des0 &= ~ETDES0_END_RING;
 }
 
 static inline void enh_set_tx_desc_len_on_ring(struct dma_desc *p, int len)
 {
 	if (unlikely(len > BUF_SIZE_4KiB)) {
-		p->des01.etx.buffer1_size = BUF_SIZE_4KiB;
-		p->des01.etx.buffer2_size = len - BUF_SIZE_4KiB;
+		p->des1 |= (((len - BUF_SIZE_4KiB) << ETDES1_BUFFER2_SIZE_SHIFT)
+			    & ETDES1_BUFFER2_SIZE_MASK) | (BUF_SIZE_4KiB
+			    & ETDES1_BUFFER1_SIZE_MASK);
 	} else
-		p->des01.etx.buffer1_size = len;
+		p->des1 |= (len & ETDES1_BUFFER1_SIZE_MASK);
 }
 
 /* Normal descriptors */
 static inline void ndesc_rx_set_on_ring(struct dma_desc *p, int end)
 {
-	p->des01.rx.buffer2_size = BUF_SIZE_2KiB - 1;
-	if (end)
-		p->des01.rx.end_ring = 1;
-}
+	p->des1 |= ((BUF_SIZE_2KiB - 1) << RDES1_BUFFER2_SIZE_SHIFT)
+		    & RDES1_BUFFER2_SIZE_MASK;
 
-static inline void ndesc_tx_set_on_ring(struct dma_desc *p, int end)
-{
 	if (end)
-		p->des01.tx.end_ring = 1;
+		p->des1 |= RDES1_END_RING;
 }
 
-static inline void ndesc_end_tx_desc_on_ring(struct dma_desc *p, int ter)
+static inline void ndesc_end_tx_desc_on_ring(struct dma_desc *p, int end)
 {
-	p->des01.tx.end_ring = ter;
+	if (end)
+		p->des1 |= TDES1_END_RING;
+	else
+		p->des1 &= ~TDES1_END_RING;
 }
 
 static inline void norm_set_tx_desc_len_on_ring(struct dma_desc *p, int len)
 {
 	if (unlikely(len > BUF_SIZE_2KiB)) {
-		p->des01.etx.buffer1_size = BUF_SIZE_2KiB - 1;
-		p->des01.etx.buffer2_size = len - p->des01.etx.buffer1_size;
+		unsigned int buffer1 = (BUF_SIZE_2KiB - 1)
+					& TDES1_BUFFER1_SIZE_MASK;
+		p->des1 |= ((((len - buffer1) << TDES1_BUFFER2_SIZE_SHIFT)
+			    & TDES1_BUFFER2_SIZE_MASK) | buffer1);
 	} else
-		p->des01.tx.buffer1_size = len;
+		p->des1 |= (len & TDES1_BUFFER1_SIZE_MASK);
 }
 
 /* Specific functions used for Chain mode */
 
 /* Enhanced descriptors */
-static inline void ehn_desc_rx_set_on_chain(struct dma_desc *p, int end)
-{
-	p->des01.erx.second_address_chained = 1;
-}
-
-static inline void ehn_desc_tx_set_on_chain(struct dma_desc *p, int end)
+static inline void ehn_desc_rx_set_on_chain(struct dma_desc *p)
 {
-	p->des01.etx.second_address_chained = 1;
+	p->des1 |= ERDES1_SECOND_ADDRESS_CHAINED;
 }
 
-static inline void enh_desc_end_tx_desc_on_chain(struct dma_desc *p, int ter)
+static inline void enh_desc_end_tx_desc_on_chain(struct dma_desc *p)
 {
-	p->des01.etx.second_address_chained = 1;
+	p->des0 |= ETDES0_SECOND_ADDRESS_CHAINED;
 }
 
 static inline void enh_set_tx_desc_len_on_chain(struct dma_desc *p, int len)
 {
-	p->des01.etx.buffer1_size = len;
+	p->des1 |= (len & ETDES1_BUFFER1_SIZE_MASK);
 }
 
 /* Normal descriptors */
 static inline void ndesc_rx_set_on_chain(struct dma_desc *p, int end)
 {
-	p->des01.rx.second_address_chained = 1;
-}
-
-static inline void ndesc_tx_set_on_chain(struct dma_desc *p, int ring_size)
-{
-	p->des01.tx.second_address_chained = 1;
+	p->des1 |= RDES1_SECOND_ADDRESS_CHAINED;
 }
 
-static inline void ndesc_end_tx_desc_on_chain(struct dma_desc *p, int ter)
+static inline void ndesc_tx_set_on_chain(struct dma_desc *p)
 {
-	p->des01.tx.second_address_chained = 1;
+	p->des1 |= TDES1_SECOND_ADDRESS_CHAINED;
 }
 
 static inline void norm_set_tx_desc_len_on_chain(struct dma_desc *p, int len)
 {
-	p->des01.tx.buffer1_size = len;
+	p->des1 |= len & TDES1_BUFFER1_SIZE_MASK;
 }
 #endif /* __DESC_COM_H__ */
diff --git a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
index 7d94444..716b807 100644
--- a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
@@ -1,7 +1,7 @@
 /*******************************************************************************
   This contains the functions to handle the enhanced descriptors.
 
-  Copyright (C) 2007-2009  STMicroelectronics Ltd
+  Copyright (C) 2007-2014  STMicroelectronics Ltd
 
   This program is free software; you can redistribute it and/or modify it
   under the terms and conditions of the GNU General Public License,
@@ -29,44 +29,44 @@
 static int enh_desc_get_tx_status(void *data, struct stmmac_extra_stats *x,
 				  struct dma_desc *p, void __iomem *ioaddr)
 {
-	int ret = 0;
 	struct net_device_stats *stats = (struct net_device_stats *)data;
+	unsigned int tdes0 = p->des0;
+	int ret = 0;
 
-	if (unlikely(p->des01.etx.error_summary)) {
-		if (unlikely(p->des01.etx.jabber_timeout))
+	if (unlikely(tdes0 & ETDES0_ERROR_SUMMARY)) {
+		if (unlikely(tdes0 & ETDES0_JABBER_TIMEOUT))
 			x->tx_jabber++;
 
-		if (unlikely(p->des01.etx.frame_flushed)) {
+		if (unlikely(tdes0 & ETDES0_FRAME_FLUSHED)) {
 			x->tx_frame_flushed++;
 			dwmac_dma_flush_tx_fifo(ioaddr);
 		}
 
-		if (unlikely(p->des01.etx.loss_carrier)) {
+		if (unlikely(tdes0 & ETDES0_LOSS_CARRIER)) {
 			x->tx_losscarrier++;
 			stats->tx_carrier_errors++;
 		}
-		if (unlikely(p->des01.etx.no_carrier)) {
+		if (unlikely(tdes0 & ETDES0_NO_CARRIER)) {
 			x->tx_carrier++;
 			stats->tx_carrier_errors++;
 		}
-		if (unlikely(p->des01.etx.late_collision))
-			stats->collisions += p->des01.etx.collision_count;
-
-		if (unlikely(p->des01.etx.excessive_collisions))
-			stats->collisions += p->des01.etx.collision_count;
+		if (unlikely((tdes0 & ETDES0_LATE_COLLISION) ||
+			     (tdes0 & ETDES0_EXCESSIVE_COLLISIONS)))
+			stats->collisions +=
+				(tdes0 & ETDES0_COLLISION_COUNT_MASK) >> 3;
 
-		if (unlikely(p->des01.etx.excessive_deferral))
+		if (unlikely(tdes0 & ETDES0_EXCESSIVE_DEFERRAL))
 			x->tx_deferred++;
 
-		if (unlikely(p->des01.etx.underflow_error)) {
+		if (unlikely(tdes0 & ETDES0_UNDERFLOW_ERROR)) {
 			dwmac_dma_flush_tx_fifo(ioaddr);
 			x->tx_underflow++;
 		}
 
-		if (unlikely(p->des01.etx.ip_header_error))
+		if (unlikely(tdes0 & ETDES0_IP_HEADER_ERROR))
 			x->tx_ip_header_error++;
 
-		if (unlikely(p->des01.etx.payload_error)) {
+		if (unlikely(tdes0 & ETDES0_PAYLOAD_ERROR)) {
 			x->tx_payload_error++;
 			dwmac_dma_flush_tx_fifo(ioaddr);
 		}
@@ -74,11 +74,11 @@ static int enh_desc_get_tx_status(void *data, struct stmmac_extra_stats *x,
 		ret = -1;
 	}
 
-	if (unlikely(p->des01.etx.deferred))
+	if (unlikely(tdes0 & ETDES0_DEFERRED))
 		x->tx_deferred++;
 
 #ifdef STMMAC_VLAN_TAG_USED
-	if (p->des01.etx.vlan_frame)
+	if (tdes0 & ETDES0_VLAN_FRAME)
 		x->tx_vlan++;
 #endif
 
@@ -87,7 +87,7 @@ static int enh_desc_get_tx_status(void *data, struct stmmac_extra_stats *x,
 
 static int enh_desc_get_tx_len(struct dma_desc *p)
 {
-	return p->des01.etx.buffer1_size;
+	return (p->des1 & ETDES1_BUFFER1_SIZE_MASK);
 }
 
 static int enh_desc_coe_rdes0(int ipc_err, int type, int payload_err)
@@ -126,50 +126,55 @@ static int enh_desc_coe_rdes0(int ipc_err, int type, int payload_err)
 static void enh_desc_get_ext_status(void *data, struct stmmac_extra_stats *x,
 				    struct dma_extended_desc *p)
 {
-	if (unlikely(p->basic.des01.erx.rx_mac_addr)) {
-		if (p->des4.erx.ip_hdr_err)
+	unsigned int rdes0 = p->basic.des0;
+	unsigned int rdes4 = p->des4;
+
+	if (unlikely(rdes0 & ERDES0_RX_MAC_ADDR)) {
+		int message_type = (rdes4 & ERDES4_MSG_TYPE_MASK) >> 8;
+
+		if (rdes4 & ERDES4_IP_HDR_ERR)
 			x->ip_hdr_err++;
-		if (p->des4.erx.ip_payload_err)
+		if (rdes4 & ERDES4_IP_PAYLOAD_ERR)
 			x->ip_payload_err++;
-		if (p->des4.erx.ip_csum_bypassed)
+		if (rdes4 & ERDES4_IP_CSUM_BYPASSED)
 			x->ip_csum_bypassed++;
-		if (p->des4.erx.ipv4_pkt_rcvd)
+		if (rdes4 & ERDES4_IPV4_PKT_RCVD)
 			x->ipv4_pkt_rcvd++;
-		if (p->des4.erx.ipv6_pkt_rcvd)
+		if (rdes4 & ERDES4_IPV6_PKT_RCVD)
 			x->ipv6_pkt_rcvd++;
-		if (p->des4.erx.msg_type == RDES_EXT_SYNC)
+		if (message_type == RDES_EXT_SYNC)
 			x->rx_msg_type_sync++;
-		else if (p->des4.erx.msg_type == RDES_EXT_FOLLOW_UP)
+		else if (message_type == RDES_EXT_FOLLOW_UP)
 			x->rx_msg_type_follow_up++;
-		else if (p->des4.erx.msg_type == RDES_EXT_DELAY_REQ)
+		else if (message_type == RDES_EXT_DELAY_REQ)
 			x->rx_msg_type_delay_req++;
-		else if (p->des4.erx.msg_type == RDES_EXT_DELAY_RESP)
+		else if (message_type == RDES_EXT_DELAY_RESP)
 			x->rx_msg_type_delay_resp++;
-		else if (p->des4.erx.msg_type == RDES_EXT_PDELAY_REQ)
+		else if (message_type == RDES_EXT_PDELAY_REQ)
 			x->rx_msg_type_pdelay_req++;
-		else if (p->des4.erx.msg_type == RDES_EXT_PDELAY_RESP)
+		else if (message_type == RDES_EXT_PDELAY_RESP)
 			x->rx_msg_type_pdelay_resp++;
-		else if (p->des4.erx.msg_type == RDES_EXT_PDELAY_FOLLOW_UP)
+		else if (message_type == RDES_EXT_PDELAY_FOLLOW_UP)
 			x->rx_msg_type_pdelay_follow_up++;
 		else
 			x->rx_msg_type_ext_no_ptp++;
-		if (p->des4.erx.ptp_frame_type)
+		if (rdes4 & ERDES4_PTP_FRAME_TYPE)
 			x->ptp_frame_type++;
-		if (p->des4.erx.ptp_ver)
+		if (rdes4 & ERDES4_PTP_VER)
 			x->ptp_ver++;
-		if (p->des4.erx.timestamp_dropped)
+		if (rdes4 & ERDES4_TIMESTAMP_DROPPED)
 			x->timestamp_dropped++;
-		if (p->des4.erx.av_pkt_rcvd)
+		if (rdes4 & ERDES4_AV_PKT_RCVD)
 			x->av_pkt_rcvd++;
-		if (p->des4.erx.av_tagged_pkt_rcvd)
+		if (rdes4 & ERDES4_AV_TAGGED_PKT_RCVD)
 			x->av_tagged_pkt_rcvd++;
-		if (p->des4.erx.vlan_tag_priority_val)
+		if ((rdes4 & ERDES4_VLAN_TAG_PRI_VAL_MASK) >> 18)
 			x->vlan_tag_priority_val++;
-		if (p->des4.erx.l3_filter_match)
+		if (rdes4 & ERDES4_L3_FILTER_MATCH)
 			x->l3_filter_match++;
-		if (p->des4.erx.l4_filter_match)
+		if (rdes4 & ERDES4_L4_FILTER_MATCH)
 			x->l4_filter_match++;
-		if (p->des4.erx.l3_l4_filter_no_match)
+		if ((rdes4 & ERDES4_L3_L4_FILT_NO_MATCH_MASK) >> 26)
 			x->l3_l4_filter_no_match++;
 	}
 }
@@ -177,30 +182,30 @@ static void enh_desc_get_ext_status(void *data, struct stmmac_extra_stats *x,
 static int enh_desc_get_rx_status(void *data, struct stmmac_extra_stats *x,
 				  struct dma_desc *p)
 {
-	int ret = good_frame;
 	struct net_device_stats *stats = (struct net_device_stats *)data;
+	unsigned int rdes0 = p->des0;
+	int ret = good_frame;
 
-	if (unlikely(p->des01.erx.error_summary)) {
-		if (unlikely(p->des01.erx.descriptor_error)) {
+	if (unlikely(rdes0 & RDES0_ERROR_SUMMARY)) {
+		if (unlikely(rdes0 & RDES0_DESCRIPTOR_ERROR)) {
 			x->rx_desc++;
 			stats->rx_length_errors++;
 		}
-		if (unlikely(p->des01.erx.overflow_error))
+		if (unlikely(rdes0 & RDES0_OVERFLOW_ERROR))
 			x->rx_gmac_overflow++;
 
-		if (unlikely(p->des01.erx.ipc_csum_error))
+		if (unlikely(rdes0 & RDES0_IPC_CSUM_ERROR))
 			pr_err("\tIPC Csum Error/Giant frame\n");
 
-		if (unlikely(p->des01.erx.late_collision)) {
+		if (unlikely(rdes0 & RDES0_COLLISION))
 			stats->collisions++;
-		}
-		if (unlikely(p->des01.erx.receive_watchdog))
+		if (unlikely(rdes0 & RDES0_RECEIVE_WATCHDOG))
 			x->rx_watchdog++;
 
-		if (unlikely(p->des01.erx.error_gmii))
+		if (unlikely(rdes0 & RDES0_MII_ERROR))	/* GMII */
 			x->rx_mii++;
 
-		if (unlikely(p->des01.erx.crc_error)) {
+		if (unlikely(rdes0 & RDES0_CRC_ERROR)) {
 			x->rx_crc++;
 			stats->rx_crc_errors++;
 		}
@@ -211,26 +216,27 @@ static int enh_desc_get_rx_status(void *data, struct stmmac_extra_stats *x,
 	 * It doesn't match with the information reported into the databook.
 	 * At any rate, we need to understand if the CSUM hw computation is ok
 	 * and report this info to the upper layers. */
-	ret = enh_desc_coe_rdes0(p->des01.erx.ipc_csum_error,
-		p->des01.erx.frame_type, p->des01.erx.rx_mac_addr);
+	ret = enh_desc_coe_rdes0(!!(rdes0 & RDES0_IPC_CSUM_ERROR),
+				 !!(rdes0 & RDES0_FRAME_TYPE),
+				 !!(rdes0 & ERDES0_RX_MAC_ADDR));
 
-	if (unlikely(p->des01.erx.dribbling))
+	if (unlikely(rdes0 & RDES0_DRIBBLING))
 		x->dribbling_bit++;
 
-	if (unlikely(p->des01.erx.sa_filter_fail)) {
+	if (unlikely(rdes0 & RDES0_SA_FILTER_FAIL)) {
 		x->sa_rx_filter_fail++;
 		ret = discard_frame;
 	}
-	if (unlikely(p->des01.erx.da_filter_fail)) {
+	if (unlikely(rdes0 & RDES0_DA_FILTER_FAIL)) {
 		x->da_rx_filter_fail++;
 		ret = discard_frame;
 	}
-	if (unlikely(p->des01.erx.length_error)) {
+	if (unlikely(rdes0 & RDES0_LENGTH_ERROR)) {
 		x->rx_length++;
 		ret = discard_frame;
 	}
 #ifdef STMMAC_VLAN_TAG_USED
-	if (p->des01.erx.vlan_tag)
+	if (rdes0 & RDES0_VLAN_TAG)
 		x->rx_vlan++;
 #endif
 
@@ -240,60 +246,59 @@ static int enh_desc_get_rx_status(void *data, struct stmmac_extra_stats *x,
 static void enh_desc_init_rx_desc(struct dma_desc *p, int disable_rx_ic,
 				  int mode, int end)
 {
-	p->des01.all_flags = 0;
-	p->des01.erx.own = 1;
-	p->des01.erx.buffer1_size = BUF_SIZE_8KiB - 1;
+	p->des0 |= RDES0_OWN;
+	p->des1 |= ((BUF_SIZE_8KiB - 1) & ERDES1_BUFFER1_SIZE_MASK);
 
 	if (mode == STMMAC_CHAIN_MODE)
-		ehn_desc_rx_set_on_chain(p, end);
+		ehn_desc_rx_set_on_chain(p);
 	else
 		ehn_desc_rx_set_on_ring(p, end);
 
 	if (disable_rx_ic)
-		p->des01.erx.disable_ic = 1;
+		p->des1 |= ERDES1_DISABLE_IC;
 }
 
 static void enh_desc_init_tx_desc(struct dma_desc *p, int mode, int end)
 {
-	p->des01.all_flags = 0;
+	p->des0 &= ~ETDES0_OWN;
 	if (mode == STMMAC_CHAIN_MODE)
-		ehn_desc_tx_set_on_chain(p, end);
+		enh_desc_end_tx_desc_on_chain(p);
 	else
-		ehn_desc_tx_set_on_ring(p, end);
+		enh_desc_end_tx_desc_on_ring(p, end);
 }
 
 static int enh_desc_get_tx_owner(struct dma_desc *p)
 {
-	return p->des01.etx.own;
+	return (p->des0 & ETDES0_OWN) >> 31;
 }
 
 static int enh_desc_get_rx_owner(struct dma_desc *p)
 {
-	return p->des01.erx.own;
+	return (p->des0 & RDES0_OWN) >> 31;
 }
 
 static void enh_desc_set_tx_owner(struct dma_desc *p)
 {
-	p->des01.etx.own = 1;
+	p->des0 |= ETDES0_OWN;
 }
 
 static void enh_desc_set_rx_owner(struct dma_desc *p)
 {
-	p->des01.erx.own = 1;
+	p->des0 |= RDES0_OWN;
 }
 
 static int enh_desc_get_tx_ls(struct dma_desc *p)
 {
-	return p->des01.etx.last_segment;
+	return (p->des0 & ETDES0_LAST_SEGMENT) >> 29;
 }
 
 static void enh_desc_release_tx_desc(struct dma_desc *p, int mode)
 {
-	int ter = p->des01.etx.end_ring;
+	int ter = (p->des0 & ETDES0_END_RING) >> 21;
 
 	memset(p, 0, offsetof(struct dma_desc, des2));
 	if (mode == STMMAC_CHAIN_MODE)
-		enh_desc_end_tx_desc_on_chain(p, ter);
+		enh_desc_end_tx_desc_on_chain(p);
 	else
 		enh_desc_end_tx_desc_on_ring(p, ter);
 }
@@ -301,49 +306,60 @@ static void enh_desc_release_tx_desc(struct dma_desc *p, int mode)
 static void enh_desc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
 				     int csum_flag, int mode)
 {
-	p->des01.etx.first_segment = is_fs;
+	unsigned int tdes0 = p->des0;
+
+	if (is_fs)
+		tdes0 |= ETDES0_FIRST_SEGMENT;
+	else
+		tdes0 &= ~ETDES0_FIRST_SEGMENT;
+
+	if (likely(csum_flag))
+		tdes0 |= (TX_CIC_FULL << ETDES0_CHECKSUM_INSERTION_SHIFT);
+	else
+		tdes0 &= ~(TX_CIC_FULL << ETDES0_CHECKSUM_INSERTION_SHIFT);
+
+	p->des0 = tdes0;
 
 	if (mode == STMMAC_CHAIN_MODE)
 		enh_set_tx_desc_len_on_chain(p, len);
 	else
 		enh_set_tx_desc_len_on_ring(p, len);
-
-	if (likely(csum_flag))
-		p->des01.etx.checksum_insertion = cic_full;
 }
 
 static void enh_desc_clear_tx_ic(struct dma_desc *p)
 {
-	p->des01.etx.interrupt = 0;
+	p->des0 &= ~ETDES0_INTERRUPT;
 }
 
 static void enh_desc_close_tx_desc(struct dma_desc *p)
 {
-	p->des01.etx.last_segment = 1;
-	p->des01.etx.interrupt = 1;
+	p->des0 |= ETDES0_LAST_SEGMENT | ETDES0_INTERRUPT;
 }
 
 static int enh_desc_get_rx_frame_len(struct dma_desc *p, int rx_coe_type)
 {
+	unsigned int csum = 0;
 	/* The type-1 checksum offload engines append the checksum at
 	 * the end of frame and the two bytes of checksum are added in
 	 * the length.
 	 * Adjust for that in the framelen for type-1 checksum offload
-	 * engines. */
+	 * engines.
+	 */
 	if (rx_coe_type == STMMAC_RX_COE_TYPE1)
-		return p->des01.erx.frame_length - 2;
-	else
-		return p->des01.erx.frame_length;
+		csum = 2;
+
+	return (((p->des0 & RDES0_FRAME_LEN_MASK) >> RDES0_FRAME_LEN_SHIFT) -
+		csum);
 }
 
 static void enh_desc_enable_tx_timestamp(struct dma_desc *p)
 {
-	p->des01.etx.time_stamp_enable = 1;
+	p->des0 |= ETDES0_TIME_STAMP_ENABLE;
 }
 
 static int enh_desc_get_tx_timestamp_status(struct dma_desc *p)
 {
-	return p->des01.etx.time_stamp_status;
+	return (p->des0 & ETDES0_TIME_STAMP_STATUS) >> 17;
 }
 
 static u64 enh_desc_get_timestamp(void *desc, u32 ats)
@@ -368,7 +384,7 @@ static int enh_desc_get_rx_timestamp_status(void *desc, u32 ats)
 {
 	if (ats) {
 		struct dma_extended_desc *p = (struct dma_extended_desc *)desc;
-		return p->basic.des01.erx.ipc_csum_error;
+		return (p->basic.des0 & RDES0_IPC_CSUM_ERROR) >> 7;
 	} else {
 		struct dma_desc *p = (struct dma_desc *)desc;
 		if ((p->des2 == 0xffffffff) && (p->des3 == 0xffffffff))
diff --git a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
index 48c3456..460c573 100644
--- a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
@@ -29,33 +29,38 @@
 static int ndesc_get_tx_status(void *data, struct stmmac_extra_stats *x,
 			       struct dma_desc *p, void __iomem *ioaddr)
 {
-	int ret = 0;
 	struct net_device_stats *stats = (struct net_device_stats *)data;
+	unsigned int tdes0 = p->des0;
+	int ret = 0;
 
-	if (unlikely(p->des01.tx.error_summary)) {
-		if (unlikely(p->des01.tx.underflow_error)) {
+	if (unlikely(tdes0 & TDES0_ERROR_SUMMARY)) {
+		if (unlikely(tdes0 & TDES0_UNDERFLOW_ERROR)) {
 			x->tx_underflow++;
 			stats->tx_fifo_errors++;
 		}
-		if (unlikely(p->des01.tx.no_carrier)) {
+		if (unlikely(tdes0 & TDES0_NO_CARRIER)) {
 			x->tx_carrier++;
 			stats->tx_carrier_errors++;
 		}
-		if (unlikely(p->des01.tx.loss_carrier)) {
+		if (unlikely(tdes0 & TDES0_LOSS_CARRIER)) {
 			x->tx_losscarrier++;
 			stats->tx_carrier_errors++;
 		}
-		if (unlikely((p->des01.tx.excessive_deferral) ||
-			     (p->des01.tx.excessive_collisions) ||
-			     (p->des01.tx.late_collision)))
-			stats->collisions += p->des01.tx.collision_count;
+		if (unlikely((tdes0 & TDES0_EXCESSIVE_DEFERRAL) ||
+			     (tdes0 & TDES0_EXCESSIVE_COLLISIONS) ||
+			     (tdes0 & TDES0_LATE_COLLISION))) {
+			unsigned int collisions;
+
+			collisions = (tdes0 & TDES0_COLLISION_COUNT_MASK) >> 3;
+			stats->collisions += collisions;
+		}
 		ret = -1;
 	}
 
-	if (p->des01.etx.vlan_frame)
+	if (tdes0 & TDES0_VLAN_FRAME)
 		x->tx_vlan++;
 
-	if (unlikely(p->des01.tx.deferred))
+	if (unlikely(tdes0 & TDES0_DEFERRED))
 		x->tx_deferred++;
 
 	return ret;
@@ -63,7 +68,7 @@ static int ndesc_get_tx_status(void *data, struct stmmac_extra_stats *x,
 
 static int ndesc_get_tx_len(struct dma_desc *p)
 {
-	return p->des01.tx.buffer1_size;
+	return (p->des1 & RDES1_BUFFER1_SIZE_MASK);
 }
 
 /* This function verifies if each incoming frame has some errors
@@ -74,47 +79,48 @@ static int ndesc_get_rx_status(void *data, struct stmmac_extra_stats *x,
 			       struct dma_desc *p)
 {
 	int ret = good_frame;
+	unsigned int rdes0 = p->des0;
 	struct net_device_stats *stats = (struct net_device_stats *)data;
 
-	if (unlikely(p->des01.rx.last_descriptor == 0)) {
+	if (unlikely(!(rdes0 & RDES0_LAST_DESCRIPTOR))) {
 		pr_warn("%s: Oversized frame spanned multiple buffers\n",
 			__func__);
 		stats->rx_length_errors++;
 		return discard_frame;
 	}
 
-	if (unlikely(p->des01.rx.error_summary)) {
-		if (unlikely(p->des01.rx.descriptor_error))
+	if (unlikely(rdes0 & RDES0_ERROR_SUMMARY)) {
+		if (unlikely(rdes0 & RDES0_DESCRIPTOR_ERROR))
 			x->rx_desc++;
-		if (unlikely(p->des01.rx.sa_filter_fail))
+		if (unlikely(rdes0 & RDES0_SA_FILTER_FAIL))
 			x->sa_filter_fail++;
-		if (unlikely(p->des01.rx.overflow_error))
+		if (unlikely(rdes0 & RDES0_OVERFLOW_ERROR))
 			x->overflow_error++;
-		if (unlikely(p->des01.rx.ipc_csum_error))
+		if (unlikely(rdes0 & RDES0_IPC_CSUM_ERROR))
 			x->ipc_csum_error++;
-		if (unlikely(p->des01.rx.collision)) {
+		if (unlikely(rdes0 & RDES0_COLLISION)) {
 			x->rx_collision++;
 			stats->collisions++;
 		}
-		if (unlikely(p->des01.rx.crc_error)) {
+		if (unlikely(rdes0 & RDES0_CRC_ERROR)) {
 			x->rx_crc++;
 			stats->rx_crc_errors++;
 		}
 		ret = discard_frame;
 	}
-	if (unlikely(p->des01.rx.dribbling))
+	if (unlikely(rdes0 & RDES0_DRIBBLING))
 		x->dribbling_bit++;
 
-	if (unlikely(p->des01.rx.length_error)) {
+	if (unlikely(rdes0 & RDES0_LENGTH_ERROR)) {
 		x->rx_length++;
 		ret = discard_frame;
 	}
-	if (unlikely(p->des01.rx.mii_error)) {
+	if (unlikely(rdes0 & RDES0_MII_ERROR)) {
 		x->rx_mii++;
 		ret = discard_frame;
 	}
 #ifdef STMMAC_VLAN_TAG_USED
-	if (p->des01.rx.vlan_tag)
+	if (rdes0 & RDES0_VLAN_TAG)
 		x->vlan_tag++;
 #endif
 	return ret;
@@ -123,9 +129,8 @@ static int ndesc_get_rx_status(void *data, struct stmmac_extra_stats *x,
 static void ndesc_init_rx_desc(struct dma_desc *p, int disable_rx_ic, int mode,
 			       int end)
 {
-	p->des01.all_flags = 0;
-	p->des01.rx.own = 1;
-	p->des01.rx.buffer1_size = BUF_SIZE_2KiB - 1;
+	p->des0 |= RDES0_OWN;
+	p->des1 |= (BUF_SIZE_2KiB - 1) & RDES1_BUFFER1_SIZE_MASK;
 
 	if (mode == STMMAC_CHAIN_MODE)
 		ndesc_rx_set_on_chain(p, end);
@@ -133,50 +138,50 @@ static void ndesc_init_rx_desc(struct dma_desc *p, int disable_rx_ic, int mode,
 		ndesc_rx_set_on_ring(p, end);
 
 	if (disable_rx_ic)
-		p->des01.rx.disable_ic = 1;
+		p->des1 |= RDES1_DISABLE_IC;
 }
 
 static void ndesc_init_tx_desc(struct dma_desc *p, int mode, int end)
 {
-	p->des01.all_flags = 0;
+	p->des0 &= ~TDES0_OWN;
 	if (mode == STMMAC_CHAIN_MODE)
-		ndesc_tx_set_on_chain(p, end);
+		ndesc_tx_set_on_chain(p);
 	else
-		ndesc_tx_set_on_ring(p, end);
+		ndesc_end_tx_desc_on_ring(p, end);
 }
 
 static int ndesc_get_tx_owner(struct dma_desc *p)
 {
-	return p->des01.tx.own;
+	return (p->des0 & TDES0_OWN) >> 31;
 }
 
 static int ndesc_get_rx_owner(struct dma_desc *p)
 {
-	return p->des01.rx.own;
+	return (p->des0 & RDES0_OWN) >> 31;
 }
 
 static void ndesc_set_tx_owner(struct dma_desc *p)
 {
-	p->des01.tx.own = 1;
+	p->des0 |= TDES0_OWN;
 }
 
 static void ndesc_set_rx_owner(struct dma_desc *p)
 {
-	p->des01.rx.own = 1;
+	p->des0 |= RDES0_OWN;
 }
 
 static int ndesc_get_tx_ls(struct dma_desc *p)
 {
-	return p->des01.tx.last_segment;
+	return (p->des1 & TDES1_LAST_SEGMENT) >> 30;
 }
 
 static void ndesc_release_tx_desc(struct dma_desc *p, int mode)
 {
-	int ter = p->des01.tx.end_ring;
+	int ter = (p->des1 & TDES1_END_RING) >> 25;
 
 	memset(p, 0, offsetof(struct dma_desc, des2));
 	if (mode == STMMAC_CHAIN_MODE)
-		ndesc_end_tx_desc_on_chain(p, ter);
+		ndesc_tx_set_on_chain(p);
 	else
 		ndesc_end_tx_desc_on_ring(p, ter);
 }
@@ -184,48 +189,62 @@ static void ndesc_release_tx_desc(struct dma_desc *p, int mode)
 static void ndesc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
 				  int csum_flag, int mode)
 {
-	p->des01.tx.first_segment = is_fs;
+	unsigned int tdes1 = p->des1;
+
+	if (is_fs)
+		tdes1 |= TDES1_FIRST_SEGMENT;
+	else
+		tdes1 &= ~TDES1_FIRST_SEGMENT;
+
+	if (likely(csum_flag))
+		tdes1 |= (TX_CIC_FULL) << TDES1_CHECKSUM_INSERTION_SHIFT;
+	else
+		tdes1 &= ~(TX_CIC_FULL << TDES1_CHECKSUM_INSERTION_SHIFT);
+
+	p->des1 = tdes1;
+
 	if (mode == STMMAC_CHAIN_MODE)
 		norm_set_tx_desc_len_on_chain(p, len);
 	else
 		norm_set_tx_desc_len_on_ring(p, len);
-
-	if (likely(csum_flag))
-		p->des01.tx.checksum_insertion = cic_full;
 }
 
 static void ndesc_clear_tx_ic(struct dma_desc *p)
 {
-	p->des01.tx.interrupt = 0;
+	p->des1 &= ~TDES1_INTERRUPT;
 }
 
 static void ndesc_close_tx_desc(struct dma_desc *p)
 {
-	p->des01.tx.last_segment = 1;
-	p->des01.tx.interrupt = 1;
+	p->des1 |= TDES1_LAST_SEGMENT | TDES1_INTERRUPT;
 }
 
 static int ndesc_get_rx_frame_len(struct dma_desc *p, int rx_coe_type)
 {
+	unsigned int csum = 0;
+
 	/* The type-1 checksum offload engines append the checksum at
 	 * the end of frame and the two bytes of checksum are added in
 	 * the length.
 	 * Adjust for that in the framelen for type-1 checksum offload
-	 * engines. */
+	 * engines
+	 */
 	if (rx_coe_type == STMMAC_RX_COE_TYPE1)
-		return p->des01.rx.frame_length - 2;
-	else
-		return p->des01.rx.frame_length;
+		csum = 2;
+
+	return (((p->des0 & RDES0_FRAME_LEN_MASK) >> RDES0_FRAME_LEN_SHIFT) -
+		csum);
+
 }
 
 static void ndesc_enable_tx_timestamp(struct dma_desc *p)
 {
-	p->des01.tx.time_stamp_enable = 1;
+	p->des1 |= TDES1_TIME_STAMP_ENABLE;
 }
 
 static int ndesc_get_tx_timestamp_status(struct dma_desc *p)
 {
-	return p->des01.tx.time_stamp_status;
+	return (p->des0 & TDES0_TIME_STAMP_STATUS) >> 17;
 }
 
 static u64 ndesc_get_timestamp(void *desc, u32 ats)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 04/17] stmmac: review RX/TX ring management
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (2 preceding siblings ...)
  2016-02-29 13:27 ` [PATCH v3 03/17] stmmac: change descriptor layout Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 05/17] stmmac: add length field to dma data Alexandre TORGUE
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Giuseppe Cavallaro <peppe.cavallaro@st.com>

This patch is to rework the ring management now optimized.
The indexes into the ring buffer are always incremented, and
the entry is accessed via doing a modulo to find the "real"
position in the ring.
It is inefficient, modulo is an expensive operation.

The formula [(entry + 1) & (size - 1)] is now adopted on
a ring that is power-of-2 in size.
Then, the number of elements cannot be set by command line but
it is fixed.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
index cf28dab..2763772 100644
--- a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
+++ b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
@@ -31,8 +31,7 @@
 static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 {
 	struct stmmac_priv *priv = (struct stmmac_priv *)p;
-	unsigned int txsize = priv->dma_tx_size;
-	unsigned int entry = priv->cur_tx % txsize;
+	unsigned int entry = priv->cur_tx;
 	struct dma_desc *desc = priv->dma_tx + entry;
 	unsigned int nopaged_len = skb_headlen(skb);
 	unsigned int bmax;
@@ -54,7 +53,7 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 
 	while (len != 0) {
 		priv->tx_skbuff[entry] = NULL;
-		entry = (++priv->cur_tx) % txsize;
+		entry = STMMAC_GET_ENTRY(entry, DMA_TX_SIZE);
 		desc = priv->dma_tx + entry;
 
 		if (len > bmax) {
@@ -82,6 +81,9 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 			len = 0;
 		}
 	}
+
+	priv->cur_tx = entry;
+
 	return entry;
 }
 
@@ -138,7 +140,7 @@ static void stmmac_refill_desc3(void *priv_ptr, struct dma_desc *p)
 		 */
 		p->des3 = (unsigned int)(priv->dma_rx_phy +
 					 (((priv->dirty_rx) + 1) %
-					  priv->dma_rx_size) *
+					  DMA_RX_SIZE) *
 					 sizeof(struct dma_desc));
 }
 
@@ -151,10 +153,9 @@ static void stmmac_clean_desc3(void *priv_ptr, struct dma_desc *p)
 		 * 1588-2002 time stamping is enabled, hence reinitialize it
 		 * to keep explicit chaining in the descriptor.
 		 */
-		p->des3 = (unsigned int)(priv->dma_tx_phy +
-					 (((priv->dirty_tx + 1) %
-					   priv->dma_tx_size) *
-					  sizeof(struct dma_desc)));
+		p->des3 = (unsigned int)((priv->dma_tx_phy +
+					  ((priv->dirty_tx + 1) % DMA_TX_SIZE))
+					  * sizeof(struct dma_desc));
 }
 
 const struct stmmac_mode_ops chain_mode_ops = {
diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 586a336..09291af 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -42,6 +42,10 @@
 #define	DWMAC_CORE_3_40	0x34
 #define	DWMAC_CORE_3_50	0x35
 
+#define DMA_TX_SIZE 512
+#define DMA_RX_SIZE 512
+#define STMMAC_GET_ENTRY(x, size)	((x + 1) & (size - 1))
+
 #undef FRAME_FILTER_DEBUG
 /* #define FRAME_FILTER_DEBUG */
 
diff --git a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
index 5dd50c6..4358a87 100644
--- a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
+++ b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
@@ -31,8 +31,7 @@
 static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 {
 	struct stmmac_priv *priv = (struct stmmac_priv *)p;
-	unsigned int txsize = priv->dma_tx_size;
-	unsigned int entry = priv->cur_tx % txsize;
+	unsigned int entry = priv->cur_tx;
 	struct dma_desc *desc;
 	unsigned int nopaged_len = skb_headlen(skb);
 	unsigned int bmax, len;
@@ -62,7 +61,7 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 						STMMAC_RING_MODE);
 		wmb();
 		priv->tx_skbuff[entry] = NULL;
-		entry = (++priv->cur_tx) % txsize;
+		entry = STMMAC_GET_ENTRY(entry, DMA_TX_SIZE);
 
 		if (priv->extend_desc)
 			desc = (struct dma_desc *)(priv->dma_etx + entry);
@@ -90,6 +89,8 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 						STMMAC_RING_MODE);
 	}
 
+	priv->cur_tx = entry;
+
 	return entry;
 }
 
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 1f3b33a..7ae7c64 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -54,7 +54,6 @@ struct stmmac_priv {
 	struct sk_buff **tx_skbuff;
 	unsigned int cur_tx;
 	unsigned int dirty_tx;
-	unsigned int dma_tx_size;
 	u32 tx_count_frames;
 	u32 tx_coal_frames;
 	u32 tx_coal_timer;
@@ -71,7 +70,6 @@ struct stmmac_priv {
 	struct sk_buff **rx_skbuff;
 	unsigned int cur_rx;
 	unsigned int dirty_rx;
-	unsigned int dma_rx_size;
 	unsigned int dma_buf_sz;
 	u32 rx_riwt;
 	int hwts_rx_en;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 89c2626..eb555f0 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -71,15 +71,7 @@ static int phyaddr = -1;
 module_param(phyaddr, int, S_IRUGO);
 MODULE_PARM_DESC(phyaddr, "Physical device address");
 
-#define DMA_TX_SIZE 256
-static int dma_txsize = DMA_TX_SIZE;
-module_param(dma_txsize, int, S_IRUGO | S_IWUSR);
-MODULE_PARM_DESC(dma_txsize, "Number of descriptors in the TX list");
-
-#define DMA_RX_SIZE 256
-static int dma_rxsize = DMA_RX_SIZE;
-module_param(dma_rxsize, int, S_IRUGO | S_IWUSR);
-MODULE_PARM_DESC(dma_rxsize, "Number of descriptors in the RX list");
+#define STMMAC_TX_THRESH	(DMA_TX_SIZE / 4)
 
 static int flow_ctrl = FLOW_OFF;
 module_param(flow_ctrl, int, S_IRUGO | S_IWUSR);
@@ -134,10 +126,6 @@ static void stmmac_verify_args(void)
 {
 	if (unlikely(watchdog < 0))
 		watchdog = TX_TIMEO;
-	if (unlikely(dma_rxsize < 0))
-		dma_rxsize = DMA_RX_SIZE;
-	if (unlikely(dma_txsize < 0))
-		dma_txsize = DMA_TX_SIZE;
 	if (unlikely((buf_sz < DEFAULT_BUFSIZE) || (buf_sz > BUF_SIZE_16KiB)))
 		buf_sz = DEFAULT_BUFSIZE;
 	if (unlikely(flow_ctrl > 1))
@@ -197,12 +185,28 @@ static void print_pkt(unsigned char *buf, int len)
 	print_hex_dump_bytes("", DUMP_PREFIX_OFFSET, buf, len);
 }
 
-/* minimum number of free TX descriptors required to wake up TX process */
-#define STMMAC_TX_THRESH(x)	(x->dma_tx_size/4)
-
 static inline u32 stmmac_tx_avail(struct stmmac_priv *priv)
 {
-	return priv->dirty_tx + priv->dma_tx_size - priv->cur_tx - 1;
+	unsigned avail;
+
+	if (priv->dirty_tx > priv->cur_tx)
+		avail = priv->dirty_tx - priv->cur_tx - 1;
+	else
+		avail = DMA_TX_SIZE - priv->cur_tx + priv->dirty_tx - 1;
+
+	return avail;
+}
+
+static inline u32 stmmac_rx_dirty(struct stmmac_priv *priv)
+{
+	unsigned dirty;
+
+	if (priv->dirty_rx <= priv->cur_rx)
+		dirty = priv->cur_rx - priv->dirty_rx;
+	else
+		dirty = DMA_RX_SIZE - priv->dirty_rx + priv->cur_rx;
+
+	return dirty;
 }
 
 /**
@@ -906,19 +910,16 @@ static void stmmac_display_ring(void *head, int size, int extend_desc)
 
 static void stmmac_display_rings(struct stmmac_priv *priv)
 {
-	unsigned int txsize = priv->dma_tx_size;
-	unsigned int rxsize = priv->dma_rx_size;
-
 	if (priv->extend_desc) {
 		pr_info("Extended RX descriptor ring:\n");
-		stmmac_display_ring((void *)priv->dma_erx, rxsize, 1);
+		stmmac_display_ring((void *)priv->dma_erx, DMA_RX_SIZE, 1);
 		pr_info("Extended TX descriptor ring:\n");
-		stmmac_display_ring((void *)priv->dma_etx, txsize, 1);
+		stmmac_display_ring((void *)priv->dma_etx, DMA_TX_SIZE, 1);
 	} else {
 		pr_info("RX descriptor ring:\n");
-		stmmac_display_ring((void *)priv->dma_rx, rxsize, 0);
+		stmmac_display_ring((void *)priv->dma_rx, DMA_RX_SIZE, 0);
 		pr_info("TX descriptor ring:\n");
-		stmmac_display_ring((void *)priv->dma_tx, txsize, 0);
+		stmmac_display_ring((void *)priv->dma_tx, DMA_TX_SIZE, 0);
 	}
 }
 
@@ -947,28 +948,26 @@ static int stmmac_set_bfsize(int mtu, int bufsize)
 static void stmmac_clear_descriptors(struct stmmac_priv *priv)
 {
 	int i;
-	unsigned int txsize = priv->dma_tx_size;
-	unsigned int rxsize = priv->dma_rx_size;
 
 	/* Clear the Rx/Tx descriptors */
-	for (i = 0; i < rxsize; i++)
+	for (i = 0; i < DMA_RX_SIZE; i++)
 		if (priv->extend_desc)
 			priv->hw->desc->init_rx_desc(&priv->dma_erx[i].basic,
 						     priv->use_riwt, priv->mode,
-						     (i == rxsize - 1));
+						     (i == DMA_RX_SIZE - 1));
 		else
 			priv->hw->desc->init_rx_desc(&priv->dma_rx[i],
 						     priv->use_riwt, priv->mode,
-						     (i == rxsize - 1));
-	for (i = 0; i < txsize; i++)
+						     (i == DMA_RX_SIZE - 1));
+	for (i = 0; i < DMA_TX_SIZE; i++)
 		if (priv->extend_desc)
 			priv->hw->desc->init_tx_desc(&priv->dma_etx[i].basic,
 						     priv->mode,
-						     (i == txsize - 1));
+						     (i == DMA_TX_SIZE - 1));
 		else
 			priv->hw->desc->init_tx_desc(&priv->dma_tx[i],
 						     priv->mode,
-						     (i == txsize - 1));
+						     (i == DMA_TX_SIZE - 1));
 }
 
 /**
@@ -1031,8 +1030,6 @@ static int init_dma_desc_rings(struct net_device *dev, gfp_t flags)
 {
 	int i;
 	struct stmmac_priv *priv = netdev_priv(dev);
-	unsigned int txsize = priv->dma_tx_size;
-	unsigned int rxsize = priv->dma_rx_size;
 	unsigned int bfsize = 0;
 	int ret = -ENOMEM;
 
@@ -1044,10 +1041,6 @@ static int init_dma_desc_rings(struct net_device *dev, gfp_t flags)
 
 	priv->dma_buf_sz = bfsize;
 
-	if (netif_msg_probe(priv))
-		pr_debug("%s: txsize %d, rxsize %d, bfsize %d\n", __func__,
-			 txsize, rxsize, bfsize);
-
 	if (netif_msg_probe(priv)) {
 		pr_debug("(%s) dma_rx_phy=0x%08x dma_tx_phy=0x%08x\n", __func__,
 			 (u32) priv->dma_rx_phy, (u32) priv->dma_tx_phy);
@@ -1055,7 +1048,7 @@ static int init_dma_desc_rings(struct net_device *dev, gfp_t flags)
 		/* RX INITIALIZATION */
 		pr_debug("\tSKB addresses:\nskb\t\tskb data\tdma data\n");
 	}
-	for (i = 0; i < rxsize; i++) {
+	for (i = 0; i < DMA_RX_SIZE; i++) {
 		struct dma_desc *p;
 		if (priv->extend_desc)
 			p = &((priv->dma_erx + i)->basic);
@@ -1072,26 +1065,26 @@ static int init_dma_desc_rings(struct net_device *dev, gfp_t flags)
 				 (unsigned int)priv->rx_skbuff_dma[i]);
 	}
 	priv->cur_rx = 0;
-	priv->dirty_rx = (unsigned int)(i - rxsize);
+	priv->dirty_rx = (unsigned int)(i - DMA_RX_SIZE);
 	buf_sz = bfsize;
 
 	/* Setup the chained descriptor addresses */
 	if (priv->mode == STMMAC_CHAIN_MODE) {
 		if (priv->extend_desc) {
 			priv->hw->mode->init(priv->dma_erx, priv->dma_rx_phy,
-					     rxsize, 1);
+					     DMA_RX_SIZE, 1);
 			priv->hw->mode->init(priv->dma_etx, priv->dma_tx_phy,
-					     txsize, 1);
+					     DMA_TX_SIZE, 1);
 		} else {
 			priv->hw->mode->init(priv->dma_rx, priv->dma_rx_phy,
-					     rxsize, 0);
+					     DMA_RX_SIZE, 0);
 			priv->hw->mode->init(priv->dma_tx, priv->dma_tx_phy,
-					     txsize, 0);
+					     DMA_TX_SIZE, 0);
 		}
 	}
 
 	/* TX INITIALIZATION */
-	for (i = 0; i < txsize; i++) {
+	for (i = 0; i < DMA_TX_SIZE; i++) {
 		struct dma_desc *p;
 		if (priv->extend_desc)
 			p = &((priv->dma_etx + i)->basic);
@@ -1123,7 +1116,7 @@ static void dma_free_rx_skbufs(struct stmmac_priv *priv)
 {
 	int i;
 
-	for (i = 0; i < priv->dma_rx_size; i++)
+	for (i = 0; i < DMA_RX_SIZE; i++)
 		stmmac_free_rx_buffers(priv, i);
 }
 
@@ -1131,7 +1124,7 @@ static void dma_free_tx_skbufs(struct stmmac_priv *priv)
 {
 	int i;
 
-	for (i = 0; i < priv->dma_tx_size; i++) {
+	for (i = 0; i < DMA_TX_SIZE; i++) {
 		struct dma_desc *p;
 
 		if (priv->extend_desc)
@@ -1171,33 +1164,31 @@ static void dma_free_tx_skbufs(struct stmmac_priv *priv)
  */
 static int alloc_dma_desc_resources(struct stmmac_priv *priv)
 {
-	unsigned int txsize = priv->dma_tx_size;
-	unsigned int rxsize = priv->dma_rx_size;
 	int ret = -ENOMEM;
 
-	priv->rx_skbuff_dma = kmalloc_array(rxsize, sizeof(dma_addr_t),
+	priv->rx_skbuff_dma = kmalloc_array(DMA_RX_SIZE, sizeof(dma_addr_t),
 					    GFP_KERNEL);
 	if (!priv->rx_skbuff_dma)
 		return -ENOMEM;
 
-	priv->rx_skbuff = kmalloc_array(rxsize, sizeof(struct sk_buff *),
+	priv->rx_skbuff = kmalloc_array(DMA_RX_SIZE, sizeof(struct sk_buff *),
 					GFP_KERNEL);
 	if (!priv->rx_skbuff)
 		goto err_rx_skbuff;
 
-	priv->tx_skbuff_dma = kmalloc_array(txsize,
+	priv->tx_skbuff_dma = kmalloc_array(DMA_TX_SIZE,
 					    sizeof(*priv->tx_skbuff_dma),
 					    GFP_KERNEL);
 	if (!priv->tx_skbuff_dma)
 		goto err_tx_skbuff_dma;
 
-	priv->tx_skbuff = kmalloc_array(txsize, sizeof(struct sk_buff *),
+	priv->tx_skbuff = kmalloc_array(DMA_TX_SIZE, sizeof(struct sk_buff *),
 					GFP_KERNEL);
 	if (!priv->tx_skbuff)
 		goto err_tx_skbuff;
 
 	if (priv->extend_desc) {
-		priv->dma_erx = dma_zalloc_coherent(priv->device, rxsize *
+		priv->dma_erx = dma_zalloc_coherent(priv->device, DMA_RX_SIZE *
 						    sizeof(struct
 							   dma_extended_desc),
 						    &priv->dma_rx_phy,
@@ -1205,31 +1196,31 @@ static int alloc_dma_desc_resources(struct stmmac_priv *priv)
 		if (!priv->dma_erx)
 			goto err_dma;
 
-		priv->dma_etx = dma_zalloc_coherent(priv->device, txsize *
+		priv->dma_etx = dma_zalloc_coherent(priv->device, DMA_TX_SIZE *
 						    sizeof(struct
 							   dma_extended_desc),
 						    &priv->dma_tx_phy,
 						    GFP_KERNEL);
 		if (!priv->dma_etx) {
-			dma_free_coherent(priv->device, priv->dma_rx_size *
+			dma_free_coherent(priv->device, DMA_RX_SIZE *
 					  sizeof(struct dma_extended_desc),
 					  priv->dma_erx, priv->dma_rx_phy);
 			goto err_dma;
 		}
 	} else {
-		priv->dma_rx = dma_zalloc_coherent(priv->device, rxsize *
+		priv->dma_rx = dma_zalloc_coherent(priv->device, DMA_RX_SIZE *
 						   sizeof(struct dma_desc),
 						   &priv->dma_rx_phy,
 						   GFP_KERNEL);
 		if (!priv->dma_rx)
 			goto err_dma;
 
-		priv->dma_tx = dma_zalloc_coherent(priv->device, txsize *
+		priv->dma_tx = dma_zalloc_coherent(priv->device, DMA_TX_SIZE *
 						   sizeof(struct dma_desc),
 						   &priv->dma_tx_phy,
 						   GFP_KERNEL);
 		if (!priv->dma_tx) {
-			dma_free_coherent(priv->device, priv->dma_rx_size *
+			dma_free_coherent(priv->device, DMA_RX_SIZE *
 					  sizeof(struct dma_desc),
 					  priv->dma_rx, priv->dma_rx_phy);
 			goto err_dma;
@@ -1258,16 +1249,16 @@ static void free_dma_desc_resources(struct stmmac_priv *priv)
 	/* Free DMA regions of consistent memory previously allocated */
 	if (!priv->extend_desc) {
 		dma_free_coherent(priv->device,
-				  priv->dma_tx_size * sizeof(struct dma_desc),
+				  DMA_TX_SIZE * sizeof(struct dma_desc),
 				  priv->dma_tx, priv->dma_tx_phy);
 		dma_free_coherent(priv->device,
-				  priv->dma_rx_size * sizeof(struct dma_desc),
+				  DMA_RX_SIZE * sizeof(struct dma_desc),
 				  priv->dma_rx, priv->dma_rx_phy);
 	} else {
-		dma_free_coherent(priv->device, priv->dma_tx_size *
+		dma_free_coherent(priv->device, DMA_TX_SIZE *
 				  sizeof(struct dma_extended_desc),
 				  priv->dma_etx, priv->dma_tx_phy);
-		dma_free_coherent(priv->device, priv->dma_rx_size *
+		dma_free_coherent(priv->device, DMA_RX_SIZE *
 				  sizeof(struct dma_extended_desc),
 				  priv->dma_erx, priv->dma_rx_phy);
 	}
@@ -1312,16 +1303,15 @@ static void stmmac_dma_operation_mode(struct stmmac_priv *priv)
  */
 static void stmmac_tx_clean(struct stmmac_priv *priv)
 {
-	unsigned int txsize = priv->dma_tx_size;
 	unsigned int bytes_compl = 0, pkts_compl = 0;
+	unsigned int entry = priv->dirty_tx;
 
 	spin_lock(&priv->tx_lock);
 
 	priv->xstats.tx_clean++;
 
-	while (priv->dirty_tx != priv->cur_tx) {
+	while (entry != priv->cur_tx) {
 		int last;
-		unsigned int entry = priv->dirty_tx % txsize;
 		struct sk_buff *skb = priv->tx_skbuff[entry];
 		struct dma_desc *p;
 
@@ -1378,16 +1368,17 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
 
 		priv->hw->desc->release_tx_desc(p, priv->mode);
 
-		priv->dirty_tx++;
+		entry = STMMAC_GET_ENTRY(entry, DMA_TX_SIZE);
+		priv->dirty_tx = entry;
 	}
 
 	netdev_completed_queue(priv->dev, pkts_compl, bytes_compl);
 
 	if (unlikely(netif_queue_stopped(priv->dev) &&
-		     stmmac_tx_avail(priv) > STMMAC_TX_THRESH(priv))) {
+		     stmmac_tx_avail(priv) > STMMAC_TX_THRESH)) {
 		netif_tx_lock(priv->dev);
 		if (netif_queue_stopped(priv->dev) &&
-		    stmmac_tx_avail(priv) > STMMAC_TX_THRESH(priv)) {
+		    stmmac_tx_avail(priv) > STMMAC_TX_THRESH) {
 			if (netif_msg_tx_done(priv))
 				pr_debug("%s: restart transmit\n", __func__);
 			netif_wake_queue(priv->dev);
@@ -1421,20 +1412,19 @@ static inline void stmmac_disable_dma_irq(struct stmmac_priv *priv)
 static void stmmac_tx_err(struct stmmac_priv *priv)
 {
 	int i;
-	int txsize = priv->dma_tx_size;
 	netif_stop_queue(priv->dev);
 
 	priv->hw->dma->stop_tx(priv->ioaddr);
 	dma_free_tx_skbufs(priv);
-	for (i = 0; i < txsize; i++)
+	for (i = 0; i < DMA_TX_SIZE; i++)
 		if (priv->extend_desc)
 			priv->hw->desc->init_tx_desc(&priv->dma_etx[i].basic,
 						     priv->mode,
-						     (i == txsize - 1));
+						     (i == DMA_TX_SIZE - 1));
 		else
 			priv->hw->desc->init_tx_desc(&priv->dma_tx[i],
 						     priv->mode,
-						     (i == txsize - 1));
+						     (i == DMA_TX_SIZE - 1));
 	priv->dirty_tx = 0;
 	priv->cur_tx = 0;
 	netdev_reset_queue(priv->dev);
@@ -1811,9 +1801,6 @@ static int stmmac_open(struct net_device *dev)
 	memset(&priv->xstats, 0, sizeof(struct stmmac_extra_stats));
 	priv->xstats.threshold = tc;
 
-	/* Create and initialize the TX/RX descriptors chains. */
-	priv->dma_tx_size = STMMAC_ALIGN(dma_txsize);
-	priv->dma_rx_size = STMMAC_ALIGN(dma_rxsize);
 	priv->dma_buf_sz = STMMAC_ALIGN(buf_sz);
 
 	ret = alloc_dma_desc_resources(priv);
@@ -1955,7 +1942,6 @@ static int stmmac_release(struct net_device *dev)
 static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct stmmac_priv *priv = netdev_priv(dev);
-	unsigned int txsize = priv->dma_tx_size;
 	int entry;
 	int i, csum_insertion = 0, is_jumbo = 0;
 	int nfrags = skb_shinfo(skb)->nr_frags;
@@ -1978,7 +1964,8 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (priv->tx_path_in_lpi_mode)
 		stmmac_disable_eee_mode(priv);
 
-	entry = priv->cur_tx % txsize;
+	entry = priv->cur_tx;
+
 
 	csum_insertion = (skb->ip_summed == CHECKSUM_PARTIAL);
 
@@ -2013,7 +2000,8 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 		int len = skb_frag_size(frag);
 
 		priv->tx_skbuff[entry] = NULL;
-		entry = (++priv->cur_tx) % txsize;
+		entry = STMMAC_GET_ENTRY(entry, DMA_TX_SIZE);
+
 		if (priv->extend_desc)
 			desc = (struct dma_desc *)(priv->dma_etx + entry);
 		else
@@ -2056,17 +2044,21 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 	priv->hw->desc->set_tx_owner(first);
 	wmb();
 
-	priv->cur_tx++;
+	entry = STMMAC_GET_ENTRY(entry, DMA_TX_SIZE);
+
+	priv->cur_tx = entry;
 
 	if (netif_msg_pktdata(priv)) {
 		pr_debug("%s: curr %d dirty=%d entry=%d, first=%p, nfrags=%d",
-			__func__, (priv->cur_tx % txsize),
-			(priv->dirty_tx % txsize), entry, first, nfrags);
+			__func__, (priv->cur_tx % DMA_TX_SIZE),
+			(priv->dirty_tx % DMA_TX_SIZE), entry, first, nfrags);
 
 		if (priv->extend_desc)
-			stmmac_display_ring((void *)priv->dma_etx, txsize, 1);
+			stmmac_display_ring((void *)priv->dma_etx,
+					    DMA_TX_SIZE, 1);
 		else
-			stmmac_display_ring((void *)priv->dma_tx, txsize, 0);
+			stmmac_display_ring((void *)priv->dma_tx,
+					    DMA_TX_SIZE, 0);
 
 		pr_debug(">>> frame to be transmitted: ");
 		print_pkt(skb->data, skb->len);
@@ -2128,11 +2120,11 @@ static void stmmac_rx_vlan(struct net_device *dev, struct sk_buff *skb)
  */
 static inline void stmmac_rx_refill(struct stmmac_priv *priv)
 {
-	unsigned int rxsize = priv->dma_rx_size;
 	int bfsize = priv->dma_buf_sz;
+	unsigned int entry = priv->dirty_rx;
+	int dirty = stmmac_rx_dirty(priv);
 
-	for (; priv->cur_rx - priv->dirty_rx > 0; priv->dirty_rx++) {
-		unsigned int entry = priv->dirty_rx % rxsize;
+	while (dirty-- > 0) {
 		struct dma_desc *p;
 
 		if (priv->extend_desc)
@@ -2168,7 +2160,10 @@ static inline void stmmac_rx_refill(struct stmmac_priv *priv)
 		wmb();
 		priv->hw->desc->set_rx_owner(p);
 		wmb();
+
+		entry = STMMAC_GET_ENTRY(entry, DMA_RX_SIZE);
 	}
+	priv->dirty_rx = entry;
 }
 
 /**
@@ -2180,8 +2175,7 @@ static inline void stmmac_rx_refill(struct stmmac_priv *priv)
  */
 static int stmmac_rx(struct stmmac_priv *priv, int limit)
 {
-	unsigned int rxsize = priv->dma_rx_size;
-	unsigned int entry = priv->cur_rx % rxsize;
+	unsigned int entry = priv->cur_rx;
 	unsigned int next_entry;
 	unsigned int count = 0;
 	int coe = priv->hw->rx_csum;
@@ -2189,9 +2183,11 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit)
 	if (netif_msg_rx_status(priv)) {
 		pr_debug("%s: descriptor ring:\n", __func__);
 		if (priv->extend_desc)
-			stmmac_display_ring((void *)priv->dma_erx, rxsize, 1);
+			stmmac_display_ring((void *)priv->dma_erx,
+					    DMA_RX_SIZE, 1);
 		else
-			stmmac_display_ring((void *)priv->dma_rx, rxsize, 0);
+			stmmac_display_ring((void *)priv->dma_rx,
+					    DMA_RX_SIZE, 0);
 	}
 	while (count < limit) {
 		int status;
@@ -2207,7 +2203,9 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit)
 
 		count++;
 
-		next_entry = (++priv->cur_rx) % rxsize;
+		priv->cur_rx = STMMAC_GET_ENTRY(priv->cur_rx, DMA_RX_SIZE);
+		next_entry = priv->cur_rx;
+
 		if (priv->extend_desc)
 			prefetch(priv->dma_erx + next_entry);
 		else
@@ -2567,19 +2565,17 @@ static int stmmac_sysfs_ring_read(struct seq_file *seq, void *v)
 {
 	struct net_device *dev = seq->private;
 	struct stmmac_priv *priv = netdev_priv(dev);
-	unsigned int txsize = priv->dma_tx_size;
-	unsigned int rxsize = priv->dma_rx_size;
 
 	if (priv->extend_desc) {
 		seq_printf(seq, "Extended RX descriptor ring:\n");
-		sysfs_display_ring((void *)priv->dma_erx, rxsize, 1, seq);
+		sysfs_display_ring((void *)priv->dma_erx, DMA_RX_SIZE, 1, seq);
 		seq_printf(seq, "Extended TX descriptor ring:\n");
-		sysfs_display_ring((void *)priv->dma_etx, txsize, 1, seq);
+		sysfs_display_ring((void *)priv->dma_etx, DMA_TX_SIZE, 1, seq);
 	} else {
 		seq_printf(seq, "RX descriptor ring:\n");
-		sysfs_display_ring((void *)priv->dma_rx, rxsize, 0, seq);
+		sysfs_display_ring((void *)priv->dma_rx, DMA_RX_SIZE, 0, seq);
 		seq_printf(seq, "TX descriptor ring:\n");
-		sysfs_display_ring((void *)priv->dma_tx, txsize, 0, seq);
+		sysfs_display_ring((void *)priv->dma_tx, DMA_TX_SIZE, 0, seq);
 	}
 
 	return 0;
@@ -3149,12 +3145,6 @@ static int __init stmmac_cmdline_opt(char *str)
 		} else if (!strncmp(opt, "phyaddr:", 8)) {
 			if (kstrtoint(opt + 8, 0, &phyaddr))
 				goto err;
-		} else if (!strncmp(opt, "dma_txsize:", 11)) {
-			if (kstrtoint(opt + 11, 0, &dma_txsize))
-				goto err;
-		} else if (!strncmp(opt, "dma_rxsize:", 11)) {
-			if (kstrtoint(opt + 11, 0, &dma_rxsize))
-				goto err;
 		} else if (!strncmp(opt, "buf_sz:", 7)) {
 			if (kstrtoint(opt + 7, 0, &buf_sz))
 				goto err;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 05/17] stmmac: add length field to dma data
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (3 preceding siblings ...)
  2016-02-29 13:27 ` [PATCH v3 04/17] stmmac: review RX/TX ring management Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 06/17] stmmac: add last_segment " Alexandre TORGUE
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Giuseppe Cavallaro <peppe.cavallaro@st.com>

Currently, the code pulls out the length field when
unmapping a buffer directly from the descriptor. This will result
in an uncached read to a dma_alloc_coherent() region. There is no
need to do this, so this patch simply puts the value directly into
a data structure which will hit the cache.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
index 2763772..7fa7ab0d 100644
--- a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
+++ b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
@@ -49,6 +49,7 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 	if (dma_mapping_error(priv->device, desc->des2))
 		return -1;
 	priv->tx_skbuff_dma[entry].buf = desc->des2;
+	priv->tx_skbuff_dma[entry].len = bmax;
 	priv->hw->desc->prepare_tx_desc(desc, 1, bmax, csum, STMMAC_CHAIN_MODE);
 
 	while (len != 0) {
@@ -63,6 +64,7 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 			if (dma_mapping_error(priv->device, desc->des2))
 				return -1;
 			priv->tx_skbuff_dma[entry].buf = desc->des2;
+			priv->tx_skbuff_dma[entry].len = bmax;
 			priv->hw->desc->prepare_tx_desc(desc, 0, bmax, csum,
 							STMMAC_CHAIN_MODE);
 			priv->hw->desc->set_tx_owner(desc);
@@ -75,6 +77,7 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 			if (dma_mapping_error(priv->device, desc->des2))
 				return -1;
 			priv->tx_skbuff_dma[entry].buf = desc->des2;
+			priv->tx_skbuff_dma[entry].len = len;
 			priv->hw->desc->prepare_tx_desc(desc, 0, len, csum,
 							STMMAC_CHAIN_MODE);
 			priv->hw->desc->set_tx_owner(desc);
diff --git a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
index 4358a87..cfc2f24 100644
--- a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
+++ b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
@@ -56,6 +56,8 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 			return -1;
 
 		priv->tx_skbuff_dma[entry].buf = desc->des2;
+		priv->tx_skbuff_dma[entry].len = bmax;
+
 		desc->des3 = desc->des2 + BUF_SIZE_4KiB;
 		priv->hw->desc->prepare_tx_desc(desc, 1, bmax, csum,
 						STMMAC_RING_MODE);
@@ -73,6 +75,8 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 		if (dma_mapping_error(priv->device, desc->des2))
 			return -1;
 		priv->tx_skbuff_dma[entry].buf = desc->des2;
+		priv->tx_skbuff_dma[entry].len = len;
+
 		desc->des3 = desc->des2 + BUF_SIZE_4KiB;
 		priv->hw->desc->prepare_tx_desc(desc, 0, len, csum,
 						STMMAC_RING_MODE);
@@ -84,6 +88,7 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 		if (dma_mapping_error(priv->device, desc->des2))
 			return -1;
 		priv->tx_skbuff_dma[entry].buf = desc->des2;
+		priv->tx_skbuff_dma[entry].len = nopaged_len;
 		desc->des3 = desc->des2 + BUF_SIZE_4KiB;
 		priv->hw->desc->prepare_tx_desc(desc, 1, nopaged_len, csum,
 						STMMAC_RING_MODE);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 7ae7c64..c497460 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -45,6 +45,7 @@ struct stmmac_resources {
 struct stmmac_tx_info {
 	dma_addr_t buf;
 	bool map_as_page;
+	unsigned len;
 };
 
 struct stmmac_priv {
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index eb555f0..90a946f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1093,6 +1093,7 @@ static int init_dma_desc_rings(struct net_device *dev, gfp_t flags)
 		p->des2 = 0;
 		priv->tx_skbuff_dma[i].buf = 0;
 		priv->tx_skbuff_dma[i].map_as_page = false;
+		priv->tx_skbuff_dma[i].len = 0;
 		priv->tx_skbuff[i] = NULL;
 	}
 
@@ -1136,12 +1137,12 @@ static void dma_free_tx_skbufs(struct stmmac_priv *priv)
 			if (priv->tx_skbuff_dma[i].map_as_page)
 				dma_unmap_page(priv->device,
 					       priv->tx_skbuff_dma[i].buf,
-					       priv->hw->desc->get_tx_len(p),
+					       priv->tx_skbuff_dma[i].len,
 					       DMA_TO_DEVICE);
 			else
 				dma_unmap_single(priv->device,
 						 priv->tx_skbuff_dma[i].buf,
-						 priv->hw->desc->get_tx_len(p),
+						 priv->tx_skbuff_dma[i].len,
 						 DMA_TO_DEVICE);
 		}
 
@@ -1347,12 +1348,12 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
 			if (priv->tx_skbuff_dma[entry].map_as_page)
 				dma_unmap_page(priv->device,
 					       priv->tx_skbuff_dma[entry].buf,
-					       priv->hw->desc->get_tx_len(p),
+					       priv->tx_skbuff_dma[entry].len,
 					       DMA_TO_DEVICE);
 			else
 				dma_unmap_single(priv->device,
 						 priv->tx_skbuff_dma[entry].buf,
-						 priv->hw->desc->get_tx_len(p),
+						 priv->tx_skbuff_dma[entry].len,
 						 DMA_TO_DEVICE);
 			priv->tx_skbuff_dma[entry].buf = 0;
 			priv->tx_skbuff_dma[entry].map_as_page = false;
@@ -1986,6 +1987,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 		if (dma_mapping_error(priv->device, desc->des2))
 			goto dma_map_err;
 		priv->tx_skbuff_dma[entry].buf = desc->des2;
+		priv->tx_skbuff_dma[entry].len = nopaged_len;
 		priv->hw->desc->prepare_tx_desc(desc, 1, nopaged_len,
 						csum_insertion, priv->mode);
 	} else {
@@ -2014,6 +2016,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 
 		priv->tx_skbuff_dma[entry].buf = desc->des2;
 		priv->tx_skbuff_dma[entry].map_as_page = true;
+		priv->tx_skbuff_dma[entry].len = len;
 		priv->hw->desc->prepare_tx_desc(desc, 0, len, csum_insertion,
 						priv->mode);
 		wmb();
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 06/17] stmmac: add last_segment field to dma data
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (4 preceding siblings ...)
  2016-02-29 13:27 ` [PATCH v3 05/17] stmmac: add length field to dma data Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 07/17] stmmac: add is_jumbo " Alexandre TORGUE
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Giuseppe Cavallaro <peppe.cavallaro@st.com>

last_segment field is read twice from dma descriptors in stmmac_clean().
Add last_segment to dma data so that this flag is from priv
structure in cache instead of memory.
It avoids reading twice from memory for each loop in stmmac_clean().

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
index 7fa7ab0d..355eafb 100644
--- a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
+++ b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
@@ -150,8 +150,9 @@ static void stmmac_refill_desc3(void *priv_ptr, struct dma_desc *p)
 static void stmmac_clean_desc3(void *priv_ptr, struct dma_desc *p)
 {
 	struct stmmac_priv *priv = (struct stmmac_priv *)priv_ptr;
+	unsigned int entry = priv->dirty_tx;
 
-	if (priv->hw->desc->get_tx_ls(p) && !priv->extend_desc)
+	if (priv->tx_skbuff_dma[entry].last_segment && !priv->extend_desc)
 		/* NOTE: Device will overwrite des3 with timestamp value if
 		 * 1588-2002 time stamping is enabled, hence reinitialize it
 		 * to keep explicit chaining in the descriptor.
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index c497460..0436918 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -46,6 +46,7 @@ struct stmmac_tx_info {
 	dma_addr_t buf;
 	bool map_as_page;
 	unsigned len;
+	bool last_segment;
 };
 
 struct stmmac_priv {
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 90a946f..feae0de 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1094,6 +1094,7 @@ static int init_dma_desc_rings(struct net_device *dev, gfp_t flags)
 		priv->tx_skbuff_dma[i].buf = 0;
 		priv->tx_skbuff_dma[i].map_as_page = false;
 		priv->tx_skbuff_dma[i].len = 0;
+		priv->tx_skbuff_dma[i].last_segment = false;
 		priv->tx_skbuff[i] = NULL;
 	}
 
@@ -1326,7 +1327,7 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
 			break;
 
 		/* Verify tx error by looking at the last segment. */
-		last = priv->hw->desc->get_tx_ls(p);
+		last = priv->tx_skbuff_dma[entry].last_segment;
 		if (likely(last)) {
 			int tx_error =
 			    priv->hw->desc->tx_status(&priv->dev->stats,
@@ -1359,6 +1360,7 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
 			priv->tx_skbuff_dma[entry].map_as_page = false;
 		}
 		priv->hw->mode->clean_desc3(priv, p);
+		priv->tx_skbuff_dma[entry].last_segment = false;
 
 		if (likely(skb != NULL)) {
 			pkts_compl++;
@@ -2028,6 +2030,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	/* Finalize the latest segment. */
 	priv->hw->desc->close_tx_desc(desc);
+	priv->tx_skbuff_dma[entry].last_segment = true;
 
 	wmb();
 	/* According to the coalesce parameter the IC bit for the latest
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 07/17] stmmac: add is_jumbo field to dma data
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (5 preceding siblings ...)
  2016-02-29 13:27 ` [PATCH v3 06/17] stmmac: add last_segment " Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 08/17] stmmac: merge get_rx_owner into rx_status routine Alexandre TORGUE
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Giuseppe Cavallaro <peppe.cavallaro@st.com>

Optimize tx_clean by avoiding a des3 read in stmmac_clean_desc3().

In ring mode, TX, des3 seems only used when xmit a jumbo frame.
In case of normal descriptors, it may also be used for time
stamping.
Clean it in the above two case, without reading it.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
index 355eafb..dacb654 100644
--- a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
+++ b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
@@ -152,7 +152,8 @@ static void stmmac_clean_desc3(void *priv_ptr, struct dma_desc *p)
 	struct stmmac_priv *priv = (struct stmmac_priv *)priv_ptr;
 	unsigned int entry = priv->dirty_tx;
 
-	if (priv->tx_skbuff_dma[entry].last_segment && !priv->extend_desc)
+	if (priv->tx_skbuff_dma[entry].last_segment && !priv->extend_desc &&
+	    priv->hwts_tx_en)
 		/* NOTE: Device will overwrite des3 with timestamp value if
 		 * 1588-2002 time stamping is enabled, hence reinitialize it
 		 * to keep explicit chaining in the descriptor.
diff --git a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
index cfc2f24..c648774 100644
--- a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
+++ b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
@@ -57,6 +57,7 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 
 		priv->tx_skbuff_dma[entry].buf = desc->des2;
 		priv->tx_skbuff_dma[entry].len = bmax;
+		priv->tx_skbuff_dma[entry].is_jumbo = true;
 
 		desc->des3 = desc->des2 + BUF_SIZE_4KiB;
 		priv->hw->desc->prepare_tx_desc(desc, 1, bmax, csum,
@@ -76,6 +77,7 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 			return -1;
 		priv->tx_skbuff_dma[entry].buf = desc->des2;
 		priv->tx_skbuff_dma[entry].len = len;
+		priv->tx_skbuff_dma[entry].is_jumbo = true;
 
 		desc->des3 = desc->des2 + BUF_SIZE_4KiB;
 		priv->hw->desc->prepare_tx_desc(desc, 0, len, csum,
@@ -89,6 +91,7 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 			return -1;
 		priv->tx_skbuff_dma[entry].buf = desc->des2;
 		priv->tx_skbuff_dma[entry].len = nopaged_len;
+		priv->tx_skbuff_dma[entry].is_jumbo = true;
 		desc->des3 = desc->des2 + BUF_SIZE_4KiB;
 		priv->hw->desc->prepare_tx_desc(desc, 1, nopaged_len, csum,
 						STMMAC_RING_MODE);
@@ -126,7 +129,13 @@ static void stmmac_init_desc3(struct dma_desc *p)
 
 static void stmmac_clean_desc3(void *priv_ptr, struct dma_desc *p)
 {
-	if (unlikely(p->des3))
+	struct stmmac_priv *priv = (struct stmmac_priv *)priv_ptr;
+	unsigned int entry = priv->dirty_tx;
+
+	/* des3 is only used for jumbo frames tx or time stamping */
+	if (unlikely(priv->tx_skbuff_dma[entry].is_jumbo ||
+		     (priv->tx_skbuff_dma[entry].last_segment &&
+		      !priv->extend_desc && priv->hwts_tx_en)))
 		p->des3 = 0;
 }
 
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 0436918..0d01f3e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -47,6 +47,7 @@ struct stmmac_tx_info {
 	bool map_as_page;
 	unsigned len;
 	bool last_segment;
+	bool is_jumbo;
 };
 
 struct stmmac_priv {
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index feae0de..0194a8f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1361,6 +1361,7 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
 		}
 		priv->hw->mode->clean_desc3(priv, p);
 		priv->tx_skbuff_dma[entry].last_segment = false;
+		priv->tx_skbuff_dma[entry].is_jumbo = false;
 
 		if (likely(skb != NULL)) {
 			pkts_compl++;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 08/17] stmmac: merge get_rx_owner into rx_status routine.
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (6 preceding siblings ...)
  2016-02-29 13:27 ` [PATCH v3 07/17] stmmac: add is_jumbo " Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 09/17] stmmac: optimize tx desc management Alexandre TORGUE
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Fabrice Gasnier <fabrice.gasnier@st.com>

The RDES0 register can be read several times while doing RX of a
packet.
This patch slightly improves RX path performance by reading rdes0
once for two operation: check rx owner, get rx status bits.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 09291af..3ba268e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -238,10 +238,11 @@ struct stmmac_extra_stats {
 
 /* Rx IPC status */
 enum rx_frame_status {
-	good_frame = 0,
-	discard_frame = 1,
-	csum_none = 2,
-	llc_snap = 4,
+	good_frame = 0x0,
+	discard_frame = 0x1,
+	csum_none = 0x2,
+	llc_snap = 0x4,
+	dma_own = 0x8,
 };
 
 enum dma_irq_status {
@@ -356,7 +357,6 @@ struct stmmac_desc_ops {
 	/* Get the buffer size from the descriptor */
 	int (*get_tx_len) (struct dma_desc *p);
 	/* Handle extra events on specific interrupts hw dependent */
-	int (*get_rx_owner) (struct dma_desc *p);
 	void (*set_rx_owner) (struct dma_desc *p);
 	/* Get the receive frame size */
 	int (*get_rx_frame_len) (struct dma_desc *p, int rx_coe_type);
diff --git a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
index 716b807..1a2fce9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
@@ -186,6 +186,9 @@ static int enh_desc_get_rx_status(void *data, struct stmmac_extra_stats *x,
 	unsigned int rdes0 = p->des0;
 	int ret = good_frame;
 
+	if (unlikely(rdes0 & RDES0_OWN))
+		return dma_own;
+
 	if (unlikely(rdes0 & RDES0_ERROR_SUMMARY)) {
 		if (unlikely(rdes0 & RDES0_DESCRIPTOR_ERROR)) {
 			x->rx_desc++;
@@ -272,11 +275,6 @@ static int enh_desc_get_tx_owner(struct dma_desc *p)
 	return (p->des0 & ETDES0_OWN) >> 31;
 }
 
-static int enh_desc_get_rx_owner(struct dma_desc *p)
-{
-	return (p->des0 & RDES0_OWN) >> 31;
-}
-
 static void enh_desc_set_tx_owner(struct dma_desc *p)
 {
 	p->des0 |= ETDES0_OWN;
@@ -402,7 +400,6 @@ const struct stmmac_desc_ops enh_desc_ops = {
 	.init_rx_desc = enh_desc_init_rx_desc,
 	.init_tx_desc = enh_desc_init_tx_desc,
 	.get_tx_owner = enh_desc_get_tx_owner,
-	.get_rx_owner = enh_desc_get_rx_owner,
 	.release_tx_desc = enh_desc_release_tx_desc,
 	.prepare_tx_desc = enh_desc_prepare_tx_desc,
 	.clear_tx_ic = enh_desc_clear_tx_ic,
diff --git a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
index 460c573..5a91932 100644
--- a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
@@ -82,6 +82,9 @@ static int ndesc_get_rx_status(void *data, struct stmmac_extra_stats *x,
 	unsigned int rdes0 = p->des0;
 	struct net_device_stats *stats = (struct net_device_stats *)data;
 
+	if (unlikely(rdes0 & RDES0_OWN))
+		return dma_own;
+
 	if (unlikely(!(rdes0 & RDES0_LAST_DESCRIPTOR))) {
 		pr_warn("%s: Oversized frame spanned multiple buffers\n",
 			__func__);
@@ -155,11 +158,6 @@ static int ndesc_get_tx_owner(struct dma_desc *p)
 	return (p->des0 & TDES0_OWN) >> 31;
 }
 
-static int ndesc_get_rx_owner(struct dma_desc *p)
-{
-	return (p->des0 & RDES0_OWN) >> 31;
-}
-
 static void ndesc_set_tx_owner(struct dma_desc *p)
 {
 	p->des0 |= TDES0_OWN;
@@ -277,7 +275,6 @@ const struct stmmac_desc_ops ndesc_ops = {
 	.init_rx_desc = ndesc_init_rx_desc,
 	.init_tx_desc = ndesc_init_tx_desc,
 	.get_tx_owner = ndesc_get_tx_owner,
-	.get_rx_owner = ndesc_get_rx_owner,
 	.release_tx_desc = ndesc_release_tx_desc,
 	.prepare_tx_desc = ndesc_prepare_tx_desc,
 	.clear_tx_ic = ndesc_clear_tx_ic,
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 0194a8f..796d7c6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2205,7 +2205,11 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit)
 		else
 			p = priv->dma_rx + entry;
 
-		if (priv->hw->desc->get_rx_owner(p))
+		/* read the status of the incoming frame */
+		status = priv->hw->desc->rx_status(&priv->dev->stats,
+						   &priv->xstats, p);
+		/* check if managed by the DMA otherwise go ahead */
+		if (unlikely(status & dma_own))
 			break;
 
 		count++;
@@ -2218,9 +2222,6 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit)
 		else
 			prefetch(priv->dma_rx + next_entry);
 
-		/* read the status of the incoming frame */
-		status = priv->hw->desc->rx_status(&priv->dev->stats,
-						   &priv->xstats, p);
 		if ((priv->extend_desc) && (priv->hw->desc->rx_extended_status))
 			priv->hw->desc->rx_extended_status(&priv->dev->stats,
 							   &priv->xstats,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 09/17] stmmac: optimize tx desc management
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (7 preceding siblings ...)
  2016-02-29 13:27 ` [PATCH v3 08/17] stmmac: merge get_rx_owner into rx_status routine Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 10/17] stmmac: optimize tx clean function Alexandre TORGUE
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Giuseppe Cavallaro <peppe.cavallaro@st.com>

This patch is to optimize the way to manage the TDES inside the
xmit function. When prepare the frame, some settings (e.g. OWN
bit) can be merged. This has been reworked to improve the tx
performances.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
index dacb654..b3e669a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
+++ b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c
@@ -50,7 +50,9 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 		return -1;
 	priv->tx_skbuff_dma[entry].buf = desc->des2;
 	priv->tx_skbuff_dma[entry].len = bmax;
-	priv->hw->desc->prepare_tx_desc(desc, 1, bmax, csum, STMMAC_CHAIN_MODE);
+	/* do not close the descriptor and do not set own bit */
+	priv->hw->desc->prepare_tx_desc(desc, 1, bmax, csum, STMMAC_CHAIN_MODE,
+					0, false);
 
 	while (len != 0) {
 		priv->tx_skbuff[entry] = NULL;
@@ -66,8 +68,8 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 			priv->tx_skbuff_dma[entry].buf = desc->des2;
 			priv->tx_skbuff_dma[entry].len = bmax;
 			priv->hw->desc->prepare_tx_desc(desc, 0, bmax, csum,
-							STMMAC_CHAIN_MODE);
-			priv->hw->desc->set_tx_owner(desc);
+							STMMAC_CHAIN_MODE, 1,
+							false);
 			len -= bmax;
 			i++;
 		} else {
@@ -78,9 +80,10 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 				return -1;
 			priv->tx_skbuff_dma[entry].buf = desc->des2;
 			priv->tx_skbuff_dma[entry].len = len;
+			/* last descriptor can be set now */
 			priv->hw->desc->prepare_tx_desc(desc, 0, len, csum,
-							STMMAC_CHAIN_MODE);
-			priv->hw->desc->set_tx_owner(desc);
+							STMMAC_CHAIN_MODE, 1,
+							true);
 			len = 0;
 		}
 	}
diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 3ba268e..885c0f9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -338,12 +338,11 @@ struct stmmac_desc_ops {
 
 	/* Invoked by the xmit function to prepare the tx descriptor */
 	void (*prepare_tx_desc) (struct dma_desc *p, int is_fs, int len,
-				 int csum_flag, int mode);
+				 bool csum_flag, int mode, bool tx_own,
+				 bool ls_ic);
 	/* Set/get the owner of the descriptor */
 	void (*set_tx_owner) (struct dma_desc *p);
 	int (*get_tx_owner) (struct dma_desc *p);
-	/* Invoked by the xmit function to close the tx descriptor */
-	void (*close_tx_desc) (struct dma_desc *p);
 	/* Clean the tx descriptor as soon as the tx irq is received */
 	void (*release_tx_desc) (struct dma_desc *p, int mode);
 	/* Clear interrupt on tx frame completion. When this bit is
diff --git a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
index 1a2fce9..1abd80e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
@@ -302,7 +302,8 @@ static void enh_desc_release_tx_desc(struct dma_desc *p, int mode)
 }
 
 static void enh_desc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
-				     int csum_flag, int mode)
+				     bool csum_flag, int mode, bool tx_own,
+				     bool ls_ic)
 {
 	unsigned int tdes0 = p->des0;
 
@@ -316,6 +317,19 @@ static void enh_desc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
 	else
 		tdes0 &= ~(TX_CIC_FULL << ETDES0_CHECKSUM_INSERTION_SHIFT);
 
+	if (tx_own)
+		tdes0 |= ETDES0_OWN;
+
+	if (is_fs & tx_own)
+		/* When the own bit, for the first frame, has to be set, all
+		 * descriptors for the same frame has to be set before, to
+		 * avoid race condition.
+		 */
+		wmb();
+
+	if (ls_ic)
+		tdes0 |= ETDES0_LAST_SEGMENT | ETDES0_INTERRUPT;
+
 	p->des0 = tdes0;
 
 	if (mode == STMMAC_CHAIN_MODE)
@@ -329,11 +343,6 @@ static void enh_desc_clear_tx_ic(struct dma_desc *p)
 	p->des0 &= ~ETDES0_INTERRUPT;
 }
 
-static void enh_desc_close_tx_desc(struct dma_desc *p)
-{
-	p->des0 |= ETDES0_LAST_SEGMENT | ETDES0_INTERRUPT;
-}
-
 static int enh_desc_get_rx_frame_len(struct dma_desc *p, int rx_coe_type)
 {
 	unsigned int csum = 0;
@@ -403,7 +412,6 @@ const struct stmmac_desc_ops enh_desc_ops = {
 	.release_tx_desc = enh_desc_release_tx_desc,
 	.prepare_tx_desc = enh_desc_prepare_tx_desc,
 	.clear_tx_ic = enh_desc_clear_tx_ic,
-	.close_tx_desc = enh_desc_close_tx_desc,
 	.get_tx_ls = enh_desc_get_tx_ls,
 	.set_tx_owner = enh_desc_set_tx_owner,
 	.set_rx_owner = enh_desc_set_rx_owner,
diff --git a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
index 5a91932..19cc12d 100644
--- a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
@@ -185,7 +185,8 @@ static void ndesc_release_tx_desc(struct dma_desc *p, int mode)
 }
 
 static void ndesc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
-				  int csum_flag, int mode)
+				  bool csum_flag, int mode, bool tx_own,
+				  bool ls_ic)
 {
 	unsigned int tdes1 = p->des1;
 
@@ -199,6 +200,12 @@ static void ndesc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
 	else
 		tdes1 &= ~(TX_CIC_FULL << TDES1_CHECKSUM_INSERTION_SHIFT);
 
+	if (tx_own)
+		tdes1 |= TDES0_OWN;
+
+	if (ls_ic)
+		tdes1 |= TDES1_LAST_SEGMENT | TDES1_INTERRUPT;
+
 	p->des1 = tdes1;
 
 	if (mode == STMMAC_CHAIN_MODE)
@@ -212,11 +219,6 @@ static void ndesc_clear_tx_ic(struct dma_desc *p)
 	p->des1 &= ~TDES1_INTERRUPT;
 }
 
-static void ndesc_close_tx_desc(struct dma_desc *p)
-{
-	p->des1 |= TDES1_LAST_SEGMENT | TDES1_INTERRUPT;
-}
-
 static int ndesc_get_rx_frame_len(struct dma_desc *p, int rx_coe_type)
 {
 	unsigned int csum = 0;
@@ -278,7 +280,6 @@ const struct stmmac_desc_ops ndesc_ops = {
 	.release_tx_desc = ndesc_release_tx_desc,
 	.prepare_tx_desc = ndesc_prepare_tx_desc,
 	.clear_tx_ic = ndesc_clear_tx_ic,
-	.close_tx_desc = ndesc_close_tx_desc,
 	.get_tx_ls = ndesc_get_tx_ls,
 	.set_tx_owner = ndesc_set_tx_owner,
 	.set_rx_owner = ndesc_set_rx_owner,
diff --git a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
index c648774..11c7164 100644
--- a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
+++ b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
@@ -61,7 +61,7 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 
 		desc->des3 = desc->des2 + BUF_SIZE_4KiB;
 		priv->hw->desc->prepare_tx_desc(desc, 1, bmax, csum,
-						STMMAC_RING_MODE);
+						STMMAC_RING_MODE, 0, false);
 		wmb();
 		priv->tx_skbuff[entry] = NULL;
 		entry = STMMAC_GET_ENTRY(entry, DMA_TX_SIZE);
@@ -81,9 +81,8 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 
 		desc->des3 = desc->des2 + BUF_SIZE_4KiB;
 		priv->hw->desc->prepare_tx_desc(desc, 0, len, csum,
-						STMMAC_RING_MODE);
+						STMMAC_RING_MODE, 1, true);
 		wmb();
-		priv->hw->desc->set_tx_owner(desc);
 	} else {
 		desc->des2 = dma_map_single(priv->device, skb->data,
 					    nopaged_len, DMA_TO_DEVICE);
@@ -94,7 +93,7 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 		priv->tx_skbuff_dma[entry].is_jumbo = true;
 		desc->des3 = desc->des2 + BUF_SIZE_4KiB;
 		priv->hw->desc->prepare_tx_desc(desc, 1, nopaged_len, csum,
-						STMMAC_RING_MODE);
+						STMMAC_RING_MODE, 0, true);
 	}
 
 	priv->cur_tx = entry;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 796d7c6..24c3608 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1991,8 +1991,10 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 			goto dma_map_err;
 		priv->tx_skbuff_dma[entry].buf = desc->des2;
 		priv->tx_skbuff_dma[entry].len = nopaged_len;
+		/* do not set the own at this stage */
 		priv->hw->desc->prepare_tx_desc(desc, 1, nopaged_len,
-						csum_insertion, priv->mode);
+						csum_insertion, priv->mode, 0,
+						nfrags == 0);
 	} else {
 		desc = first;
 		entry = priv->hw->mode->jumbo_frm(priv, skb, csum_insertion);
@@ -2003,6 +2005,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 	for (i = 0; i < nfrags; i++) {
 		const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 		int len = skb_frag_size(frag);
+		bool last_segment = (i == (nfrags - 1));
 
 		priv->tx_skbuff[entry] = NULL;
 		entry = STMMAC_GET_ENTRY(entry, DMA_TX_SIZE);
@@ -2021,19 +2024,12 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 		priv->tx_skbuff_dma[entry].map_as_page = true;
 		priv->tx_skbuff_dma[entry].len = len;
 		priv->hw->desc->prepare_tx_desc(desc, 0, len, csum_insertion,
-						priv->mode);
-		wmb();
-		priv->hw->desc->set_tx_owner(desc);
-		wmb();
+						priv->mode, 1, last_segment);
+		priv->tx_skbuff_dma[entry].last_segment = last_segment;
 	}
 
 	priv->tx_skbuff[entry] = skb;
 
-	/* Finalize the latest segment. */
-	priv->hw->desc->close_tx_desc(desc);
-	priv->tx_skbuff_dma[entry].last_segment = true;
-
-	wmb();
 	/* According to the coalesce parameter the IC bit for the latest
 	 * segment could be reset and the timer re-started to invoke the
 	 * stmmac_tx function. This approach takes care about the fragments.
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 10/17] stmmac: optimize tx clean function
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (8 preceding siblings ...)
  2016-02-29 13:27 ` [PATCH v3 09/17] stmmac: optimize tx desc management Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 11/17] stmmac: set dirty index out of the loop Alexandre TORGUE
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Fabrice Gasnier <fabrice.gasnier@st.com>

This patch "inline" get_tx_owner and get_ls routines. It Results in a
unique read to tdes0, instead of three, to check TX_OWN and LS bits,
and other status bits.

It helps improve driver TX path by removing two uncached read/writes
inside TX clean loop for enhanced descriptors but not for normal ones
because the des1 must be read in any case.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 885c0f9..7ccb147 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -245,6 +245,14 @@ enum rx_frame_status {
 	dma_own = 0x8,
 };
 
+/* Tx status */
+enum tx_frame_status {
+	tx_done = 0x0,
+	tx_not_ls = 0x1,
+	tx_err = 0x2,
+	tx_dma_own = 0x4,
+};
+
 enum dma_irq_status {
 	tx_hard_error = 0x1,
 	tx_hard_error_bump_tc = 0x2,
diff --git a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
index 1abd80e..957610b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
@@ -31,7 +31,15 @@ static int enh_desc_get_tx_status(void *data, struct stmmac_extra_stats *x,
 {
 	struct net_device_stats *stats = (struct net_device_stats *)data;
 	unsigned int tdes0 = p->des0;
-	int ret = 0;
+	int ret = tx_done;
+
+	/* Get tx owner first */
+	if (unlikely(tdes0 & ETDES0_OWN))
+		return tx_dma_own;
+
+	/* Verify tx error by looking at the last segment. */
+	if (likely(!(tdes0 & ETDES0_LAST_SEGMENT)))
+		return tx_not_ls;
 
 	if (unlikely(tdes0 & ETDES0_ERROR_SUMMARY)) {
 		if (unlikely(tdes0 & ETDES0_JABBER_TIMEOUT))
@@ -71,7 +79,7 @@ static int enh_desc_get_tx_status(void *data, struct stmmac_extra_stats *x,
 			dwmac_dma_flush_tx_fifo(ioaddr);
 		}
 
-		ret = -1;
+		ret = tx_err;
 	}
 
 	if (unlikely(tdes0 & ETDES0_DEFERRED))
diff --git a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
index 19cc12d..122fb5a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
@@ -31,7 +31,16 @@ static int ndesc_get_tx_status(void *data, struct stmmac_extra_stats *x,
 {
 	struct net_device_stats *stats = (struct net_device_stats *)data;
 	unsigned int tdes0 = p->des0;
-	int ret = 0;
+	unsigned int tdes1 = p->des1;
+	int ret = tx_done;
+
+	/* Get tx owner first */
+	if (unlikely(tdes0 & TDES0_OWN))
+		return tx_dma_own;
+
+	/* Verify tx error by looking at the last segment. */
+	if (likely(!(tdes1 & TDES1_LAST_SEGMENT)))
+		return tx_not_ls;
 
 	if (unlikely(tdes0 & TDES0_ERROR_SUMMARY)) {
 		if (unlikely(tdes0 & TDES0_UNDERFLOW_ERROR)) {
@@ -54,7 +63,7 @@ static int ndesc_get_tx_status(void *data, struct stmmac_extra_stats *x,
 			collisions = (tdes0 & TDES0_COLLISION_COUNT_MASK) >> 3;
 			stats->collisions += collisions;
 		}
-		ret = -1;
+		ret = tx_err;
 	}
 
 	if (tdes0 & TDES0_VLAN_FRAME)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 24c3608..d31179f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1313,32 +1313,31 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
 	priv->xstats.tx_clean++;
 
 	while (entry != priv->cur_tx) {
-		int last;
 		struct sk_buff *skb = priv->tx_skbuff[entry];
 		struct dma_desc *p;
+		int status;
 
 		if (priv->extend_desc)
 			p = (struct dma_desc *)(priv->dma_etx + entry);
 		else
 			p = priv->dma_tx + entry;
 
-		/* Check if the descriptor is owned by the DMA. */
-		if (priv->hw->desc->get_tx_owner(p))
-			break;
-
-		/* Verify tx error by looking at the last segment. */
-		last = priv->tx_skbuff_dma[entry].last_segment;
-		if (likely(last)) {
-			int tx_error =
-			    priv->hw->desc->tx_status(&priv->dev->stats,
+		status = priv->hw->desc->tx_status(&priv->dev->stats,
 						      &priv->xstats, p,
 						      priv->ioaddr);
-			if (likely(tx_error == 0)) {
+		/* Check if the descriptor is owned by the DMA */
+		if (unlikely(status & tx_dma_own))
+			break;
+
+		/* Just consider the last segment and ...*/
+		if (likely(!(status & tx_not_ls))) {
+			/* ... verify the status error condition */
+			if (unlikely(status & tx_err)) {
+				priv->dev->stats.tx_errors++;
+			} else {
 				priv->dev->stats.tx_packets++;
 				priv->xstats.tx_pkt_n++;
-			} else
-				priv->dev->stats.tx_errors++;
-
+			}
 			stmmac_get_tx_hwtstamp(priv, entry, skb);
 		}
 		if (netif_msg_tx_done(priv))
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 11/17] stmmac: set dirty index out of the loop
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (9 preceding siblings ...)
  2016-02-29 13:27 ` [PATCH v3 10/17] stmmac: optimize tx clean function Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 12/17] stmmac: first frame prep at the end of xmit routine Alexandre TORGUE
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Giuseppe Cavallaro <peppe.cavallaro@st.com>

The dirty index can be updated out of the loop where all the
tx resources are claimed. This will help on performances too.
Also a useless debug printk has been removed from the main loop.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index d31179f..2e4c10a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1340,9 +1340,6 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
 			}
 			stmmac_get_tx_hwtstamp(priv, entry, skb);
 		}
-		if (netif_msg_tx_done(priv))
-			pr_debug("%s: curr %d, dirty %d\n", __func__,
-				 priv->cur_tx, priv->dirty_tx);
 
 		if (likely(priv->tx_skbuff_dma[entry].buf)) {
 			if (priv->tx_skbuff_dma[entry].map_as_page)
@@ -1372,8 +1369,8 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
 		priv->hw->desc->release_tx_desc(p, priv->mode);
 
 		entry = STMMAC_GET_ENTRY(entry, DMA_TX_SIZE);
-		priv->dirty_tx = entry;
 	}
+	priv->dirty_tx = entry;
 
 	netdev_completed_queue(priv->dev, pkts_compl, bytes_compl);
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 12/17] stmmac: first frame prep at the end of xmit routine
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (10 preceding siblings ...)
  2016-02-29 13:27 ` [PATCH v3 11/17] stmmac: set dirty index out of the loop Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 13/17] stmmac: do not poll phy handler when attach a switch Alexandre TORGUE
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Giuseppe Cavallaro <peppe.cavallaro@st.com>

This patch is to fill the first descriptor just before granting
the DMA engine so at the end of the xmit.
The patch takes care about the algorithm adopted to mitigate the
interrupts, then it fixes the last segment in case of no fragments.
Moreover, this new implementation does not pass any "ter" field when
prepare the descriptors because this is not necessary.
The patch also details the memory barrier in the xmit.

As final results, this patch guarantees the same performances
but fixing a case if small datagram are sent. In fact, this
kind of test is impacted if no coalesce is done.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 7ccb147..f96d257 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -100,7 +100,7 @@ struct stmmac_extra_stats {
 	unsigned long napi_poll;
 	unsigned long tx_normal_irq_n;
 	unsigned long tx_clean;
-	unsigned long tx_reset_ic_bit;
+	unsigned long tx_set_ic_bit;
 	unsigned long irq_receive_pmt_irq_n;
 	/* MMC info */
 	unsigned long mmc_tx_irq_n;
@@ -347,7 +347,7 @@ struct stmmac_desc_ops {
 	/* Invoked by the xmit function to prepare the tx descriptor */
 	void (*prepare_tx_desc) (struct dma_desc *p, int is_fs, int len,
 				 bool csum_flag, int mode, bool tx_own,
-				 bool ls_ic);
+				 bool ls);
 	/* Set/get the owner of the descriptor */
 	void (*set_tx_owner) (struct dma_desc *p);
 	int (*get_tx_owner) (struct dma_desc *p);
@@ -355,7 +355,7 @@ struct stmmac_desc_ops {
 	void (*release_tx_desc) (struct dma_desc *p, int mode);
 	/* Clear interrupt on tx frame completion. When this bit is
 	 * set an interrupt happens as soon as the frame is transmitted */
-	void (*clear_tx_ic) (struct dma_desc *p);
+	void (*set_tx_ic)(struct dma_desc *p);
 	/* Last tx segment reports the transmit status */
 	int (*get_tx_ls) (struct dma_desc *p);
 	/* Return the transmit status looking at the TDES1 */
diff --git a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
index 957610b..cfb018c 100644
--- a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
@@ -311,10 +311,15 @@ static void enh_desc_release_tx_desc(struct dma_desc *p, int mode)
 
 static void enh_desc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
 				     bool csum_flag, int mode, bool tx_own,
-				     bool ls_ic)
+				     bool ls)
 {
 	unsigned int tdes0 = p->des0;
 
+	if (mode == STMMAC_CHAIN_MODE)
+		enh_set_tx_desc_len_on_chain(p, len);
+	else
+		enh_set_tx_desc_len_on_ring(p, len);
+
 	if (is_fs)
 		tdes0 |= ETDES0_FIRST_SEGMENT;
 	else
@@ -325,6 +330,10 @@ static void enh_desc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
 	else
 		tdes0 &= ~(TX_CIC_FULL << ETDES0_CHECKSUM_INSERTION_SHIFT);
 
+	if (ls)
+		tdes0 |= ETDES0_LAST_SEGMENT;
+
+	/* Finally set the OWN bit. Later the DMA will start! */
 	if (tx_own)
 		tdes0 |= ETDES0_OWN;
 
@@ -335,20 +344,12 @@ static void enh_desc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
 		 */
 		wmb();
 
-	if (ls_ic)
-		tdes0 |= ETDES0_LAST_SEGMENT | ETDES0_INTERRUPT;
-
 	p->des0 = tdes0;
-
-	if (mode == STMMAC_CHAIN_MODE)
-		enh_set_tx_desc_len_on_chain(p, len);
-	else
-		enh_set_tx_desc_len_on_ring(p, len);
 }
 
-static void enh_desc_clear_tx_ic(struct dma_desc *p)
+static void enh_desc_set_tx_ic(struct dma_desc *p)
 {
-	p->des0 &= ~ETDES0_INTERRUPT;
+	p->des0 |= ETDES0_INTERRUPT;
 }
 
 static int enh_desc_get_rx_frame_len(struct dma_desc *p, int rx_coe_type)
@@ -419,7 +420,7 @@ const struct stmmac_desc_ops enh_desc_ops = {
 	.get_tx_owner = enh_desc_get_tx_owner,
 	.release_tx_desc = enh_desc_release_tx_desc,
 	.prepare_tx_desc = enh_desc_prepare_tx_desc,
-	.clear_tx_ic = enh_desc_clear_tx_ic,
+	.set_tx_ic = enh_desc_set_tx_ic,
 	.get_tx_ls = enh_desc_get_tx_ls,
 	.set_tx_owner = enh_desc_set_tx_owner,
 	.set_rx_owner = enh_desc_set_rx_owner,
diff --git a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
index 122fb5a..e13228f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
@@ -195,10 +195,15 @@ static void ndesc_release_tx_desc(struct dma_desc *p, int mode)
 
 static void ndesc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
 				  bool csum_flag, int mode, bool tx_own,
-				  bool ls_ic)
+				  bool ls)
 {
 	unsigned int tdes1 = p->des1;
 
+	if (mode == STMMAC_CHAIN_MODE)
+		norm_set_tx_desc_len_on_chain(p, len);
+	else
+		norm_set_tx_desc_len_on_ring(p, len);
+
 	if (is_fs)
 		tdes1 |= TDES1_FIRST_SEGMENT;
 	else
@@ -209,23 +214,18 @@ static void ndesc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
 	else
 		tdes1 &= ~(TX_CIC_FULL << TDES1_CHECKSUM_INSERTION_SHIFT);
 
+	if (ls)
+		tdes1 |= TDES1_LAST_SEGMENT;
+
 	if (tx_own)
 		tdes1 |= TDES0_OWN;
 
-	if (ls_ic)
-		tdes1 |= TDES1_LAST_SEGMENT | TDES1_INTERRUPT;
-
 	p->des1 = tdes1;
-
-	if (mode == STMMAC_CHAIN_MODE)
-		norm_set_tx_desc_len_on_chain(p, len);
-	else
-		norm_set_tx_desc_len_on_ring(p, len);
 }
 
-static void ndesc_clear_tx_ic(struct dma_desc *p)
+static void ndesc_set_tx_ic(struct dma_desc *p)
 {
-	p->des1 &= ~TDES1_INTERRUPT;
+	p->des1 |= TDES1_INTERRUPT;
 }
 
 static int ndesc_get_rx_frame_len(struct dma_desc *p, int rx_coe_type)
@@ -288,7 +288,7 @@ const struct stmmac_desc_ops ndesc_ops = {
 	.get_tx_owner = ndesc_get_tx_owner,
 	.release_tx_desc = ndesc_release_tx_desc,
 	.prepare_tx_desc = ndesc_prepare_tx_desc,
-	.clear_tx_ic = ndesc_clear_tx_ic,
+	.set_tx_ic = ndesc_set_tx_ic,
 	.get_tx_ls = ndesc_get_tx_ls,
 	.set_tx_owner = ndesc_set_tx_owner,
 	.set_rx_owner = ndesc_set_rx_owner,
diff --git a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
index 11c7164..7723b5d 100644
--- a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
+++ b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
@@ -62,7 +62,6 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 		desc->des3 = desc->des2 + BUF_SIZE_4KiB;
 		priv->hw->desc->prepare_tx_desc(desc, 1, bmax, csum,
 						STMMAC_RING_MODE, 0, false);
-		wmb();
 		priv->tx_skbuff[entry] = NULL;
 		entry = STMMAC_GET_ENTRY(entry, DMA_TX_SIZE);
 
@@ -82,7 +81,6 @@ static int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 		desc->des3 = desc->des2 + BUF_SIZE_4KiB;
 		priv->hw->desc->prepare_tx_desc(desc, 0, len, csum,
 						STMMAC_RING_MODE, 1, true);
-		wmb();
 	} else {
 		desc->des2 = dma_map_single(priv->device, skb->data,
 					    nopaged_len, DMA_TO_DEVICE);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
index 4c6486c..c803d4c 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
@@ -97,7 +97,7 @@ static const struct stmmac_stats stmmac_gstrings_stats[] = {
 	STMMAC_STAT(napi_poll),
 	STMMAC_STAT(tx_normal_irq_n),
 	STMMAC_STAT(tx_clean),
-	STMMAC_STAT(tx_reset_ic_bit),
+	STMMAC_STAT(tx_set_ic_bit),
 	STMMAC_STAT(irq_receive_pmt_irq_n),
 	/* MMC info */
 	STMMAC_STAT(mmc_tx_irq_n),
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 2e4c10a..90b2612 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1942,12 +1942,12 @@ static int stmmac_release(struct net_device *dev)
 static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct stmmac_priv *priv = netdev_priv(dev);
-	int entry;
+	unsigned int nopaged_len = skb_headlen(skb);
 	int i, csum_insertion = 0, is_jumbo = 0;
 	int nfrags = skb_shinfo(skb)->nr_frags;
+	unsigned int entry, first_entry;
 	struct dma_desc *desc, *first;
-	unsigned int nopaged_len = skb_headlen(skb);
-	unsigned int enh_desc = priv->plat->enh_desc;
+	unsigned int enh_desc;
 
 	spin_lock(&priv->tx_lock);
 
@@ -1965,34 +1965,25 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 		stmmac_disable_eee_mode(priv);
 
 	entry = priv->cur_tx;
-
+	first_entry = entry;
 
 	csum_insertion = (skb->ip_summed == CHECKSUM_PARTIAL);
 
-	if (priv->extend_desc)
+	if (likely(priv->extend_desc))
 		desc = (struct dma_desc *)(priv->dma_etx + entry);
 	else
 		desc = priv->dma_tx + entry;
 
 	first = desc;
 
+	priv->tx_skbuff[first_entry] = skb;
+
+	enh_desc = priv->plat->enh_desc;
 	/* To program the descriptors according to the size of the frame */
 	if (enh_desc)
 		is_jumbo = priv->hw->mode->is_jumbo_frm(skb->len, enh_desc);
 
-	if (likely(!is_jumbo)) {
-		desc->des2 = dma_map_single(priv->device, skb->data,
-					    nopaged_len, DMA_TO_DEVICE);
-		if (dma_mapping_error(priv->device, desc->des2))
-			goto dma_map_err;
-		priv->tx_skbuff_dma[entry].buf = desc->des2;
-		priv->tx_skbuff_dma[entry].len = nopaged_len;
-		/* do not set the own at this stage */
-		priv->hw->desc->prepare_tx_desc(desc, 1, nopaged_len,
-						csum_insertion, priv->mode, 0,
-						nfrags == 0);
-	} else {
-		desc = first;
+	if (unlikely(is_jumbo)) {
 		entry = priv->hw->mode->jumbo_frm(priv, skb, csum_insertion);
 		if (unlikely(entry < 0))
 			goto dma_map_err;
@@ -2003,10 +1994,9 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 		int len = skb_frag_size(frag);
 		bool last_segment = (i == (nfrags - 1));
 
-		priv->tx_skbuff[entry] = NULL;
 		entry = STMMAC_GET_ENTRY(entry, DMA_TX_SIZE);
 
-		if (priv->extend_desc)
+		if (likely(priv->extend_desc))
 			desc = (struct dma_desc *)(priv->dma_etx + entry);
 		else
 			desc = priv->dma_tx + entry;
@@ -2016,41 +2006,25 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 		if (dma_mapping_error(priv->device, desc->des2))
 			goto dma_map_err; /* should reuse desc w/o issues */
 
+		priv->tx_skbuff[entry] = NULL;
 		priv->tx_skbuff_dma[entry].buf = desc->des2;
 		priv->tx_skbuff_dma[entry].map_as_page = true;
 		priv->tx_skbuff_dma[entry].len = len;
+		priv->tx_skbuff_dma[entry].last_segment = last_segment;
+
+		/* Prepare the descriptor and set the own bit too */
 		priv->hw->desc->prepare_tx_desc(desc, 0, len, csum_insertion,
 						priv->mode, 1, last_segment);
-		priv->tx_skbuff_dma[entry].last_segment = last_segment;
 	}
 
-	priv->tx_skbuff[entry] = skb;
-
-	/* According to the coalesce parameter the IC bit for the latest
-	 * segment could be reset and the timer re-started to invoke the
-	 * stmmac_tx function. This approach takes care about the fragments.
-	 */
-	priv->tx_count_frames += nfrags + 1;
-	if (priv->tx_coal_frames > priv->tx_count_frames) {
-		priv->hw->desc->clear_tx_ic(desc);
-		priv->xstats.tx_reset_ic_bit++;
-		mod_timer(&priv->txtimer,
-			  STMMAC_COAL_TIMER(priv->tx_coal_timer));
-	} else
-		priv->tx_count_frames = 0;
-
-	/* To avoid raise condition */
-	priv->hw->desc->set_tx_owner(first);
-	wmb();
-
 	entry = STMMAC_GET_ENTRY(entry, DMA_TX_SIZE);
 
 	priv->cur_tx = entry;
 
 	if (netif_msg_pktdata(priv)) {
-		pr_debug("%s: curr %d dirty=%d entry=%d, first=%p, nfrags=%d",
-			__func__, (priv->cur_tx % DMA_TX_SIZE),
-			(priv->dirty_tx % DMA_TX_SIZE), entry, first, nfrags);
+		pr_debug("%s: curr=%d dirty=%d f=%d, e=%d, first=%p, nfrags=%d",
+			 __func__, priv->cur_tx, priv->dirty_tx, first_entry,
+			 entry, first, nfrags);
 
 		if (priv->extend_desc)
 			stmmac_display_ring((void *)priv->dma_etx,
@@ -2062,6 +2036,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 		pr_debug(">>> frame to be transmitted: ");
 		print_pkt(skb->data, skb->len);
 	}
+
 	if (unlikely(stmmac_tx_avail(priv) <= (MAX_SKB_FRAGS + 1))) {
 		if (netif_msg_hw(priv))
 			pr_debug("%s: stop transmitted packets\n", __func__);
@@ -2070,16 +2045,59 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	dev->stats.tx_bytes += skb->len;
 
-	if (unlikely((skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
-		     priv->hwts_tx_en)) {
-		/* declare that device is doing timestamping */
-		skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
-		priv->hw->desc->enable_tx_timestamp(first);
+	/* According to the coalesce parameter the IC bit for the latest
+	 * segment is reset and the timer re-started to clean the tx status.
+	 * This approach takes care about the fragments: desc is the first
+	 * element in case of no SG.
+	 */
+	priv->tx_count_frames += nfrags + 1;
+	if (likely(priv->tx_coal_frames > priv->tx_count_frames)) {
+		mod_timer(&priv->txtimer,
+			  STMMAC_COAL_TIMER(priv->tx_coal_timer));
+	} else {
+		priv->tx_count_frames = 0;
+		priv->hw->desc->set_tx_ic(desc);
+		priv->xstats.tx_set_ic_bit++;
 	}
 
 	if (!priv->hwts_tx_en)
 		skb_tx_timestamp(skb);
 
+	/* Ready to fill the first descriptor and set the OWN bit w/o any
+	 * problems because all the descriptors are actually ready to be
+	 * passed to the DMA engine.
+	 */
+	if (likely(!is_jumbo)) {
+		bool last_segment = (nfrags == 0);
+
+		first->des2 = dma_map_single(priv->device, skb->data,
+					     nopaged_len, DMA_TO_DEVICE);
+		if (dma_mapping_error(priv->device, first->des2))
+			goto dma_map_err;
+
+		priv->tx_skbuff_dma[first_entry].buf = first->des2;
+		priv->tx_skbuff_dma[first_entry].len = nopaged_len;
+		priv->tx_skbuff_dma[first_entry].last_segment = last_segment;
+
+		if (unlikely((skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
+			     priv->hwts_tx_en)) {
+			/* declare that device is doing timestamping */
+			skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
+			priv->hw->desc->enable_tx_timestamp(first);
+		}
+
+		/* Prepare the first descriptor setting the OWN bit too */
+		priv->hw->desc->prepare_tx_desc(first, 1, nopaged_len,
+						csum_insertion, priv->mode, 1,
+						last_segment);
+
+		/* The own bit must be the latest setting done when prepare the
+		 * descriptor and then barrier is needed to make sure that
+		 * all is coherent before granting the DMA engine.
+		 */
+		smp_wmb();
+	}
+
 	netdev_sent_queue(dev, skb->len);
 	priv->hw->dma->enable_dma_transmission(priv->ioaddr);
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 13/17] stmmac: do not poll phy handler when attach a switch
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (11 preceding siblings ...)
  2016-02-29 13:27 ` [PATCH v3 12/17] stmmac: first frame prep at the end of xmit routine Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 14/17] stmmac: fix phy init when attached to a phy Alexandre TORGUE
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Giuseppe Cavallaro <peppe.cavallaro@st.com>

This patch avoids to call the stmmac_adjust_link when
the driver is connected to a switch by using the FIXED_PHY
support. Prior this patch the phydev->irq was set as PHY_POLL
so periodically the phy handler was invoked spending useless
time because the link cannot actually change.
Note that the stmmac_adjust_link will be called just one
time and this guarantees that the ST glue logic will be
setup according to the mode and speed fixed.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 90b2612..eab7ac0 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -866,6 +866,11 @@ static int stmmac_init_phy(struct net_device *dev)
 		phy_disconnect(phydev);
 		return -ENODEV;
 	}
+
+	/* If attached to a switch, there is no reason to poll phy handler */
+	if (!strcmp(priv->plat->phy_bus_name, "fixed"))
+		phydev->irq = PHY_IGNORE_INTERRUPT;
+
 	pr_debug("stmmac_init_phy:  %s: attached to PHY (UID 0x%x)"
 		 " Link = %d\n", dev->name, phydev->phy_id, phydev->link);
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 14/17] stmmac: fix phy init when attached to a phy
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (12 preceding siblings ...)
  2016-02-29 13:27 ` [PATCH v3 13/17] stmmac: do not poll phy handler when attach a switch Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 15/17] stmmac: do not perform zero-copy for rx frames Alexandre TORGUE
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Fabrice Gasnier <fabrice.gasnier@st.com>

phy_bus_name can be NULL when "fixed-link" property isn't used.
Then, since "stmmac: do not poll phy handler when attach a switch",
phy_bus_name ptr needs to be checked before strcmp is called.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index eab7ac0..3cc1355 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -868,8 +868,9 @@ static int stmmac_init_phy(struct net_device *dev)
 	}
 
 	/* If attached to a switch, there is no reason to poll phy handler */
-	if (!strcmp(priv->plat->phy_bus_name, "fixed"))
-		phydev->irq = PHY_IGNORE_INTERRUPT;
+	if (priv->plat->phy_bus_name)
+		if (!strcmp(priv->plat->phy_bus_name, "fixed"))
+			phydev->irq = PHY_IGNORE_INTERRUPT;
 
 	pr_debug("stmmac_init_phy:  %s: attached to PHY (UID 0x%x)"
 		 " Link = %d\n", dev->name, phydev->phy_id, phydev->link);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 15/17] stmmac: do not perform zero-copy for rx frames
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (13 preceding siblings ...)
  2016-02-29 13:27 ` [PATCH v3 14/17] stmmac: fix phy init when attached to a phy Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 16/17] stmmac: tune rx copy via threshold Alexandre TORGUE
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Giuseppe Cavallaro <peppe.cavallaro@st.com>

This patch is to allow this driver to copy tiny frames during the reception
process. This is giving more stability while stressing the driver on STi
embedded systems.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 0d01f3e..221f5cd 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -74,6 +74,7 @@ struct stmmac_priv {
 	unsigned int cur_rx;
 	unsigned int dirty_rx;
 	unsigned int dma_buf_sz;
+	unsigned int rx_copybreak;
 	u32 rx_riwt;
 	int hwts_rx_en;
 	dma_addr_t *rx_skbuff_dma;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
index c803d4c..3c7928e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
@@ -781,6 +781,43 @@ static int stmmac_get_ts_info(struct net_device *dev,
 		return ethtool_op_get_ts_info(dev, info);
 }
 
+static int stmmac_get_tunable(struct net_device *dev,
+			      const struct ethtool_tunable *tuna, void *data)
+{
+	struct stmmac_priv *priv = netdev_priv(dev);
+	int ret = 0;
+
+	switch (tuna->id) {
+	case ETHTOOL_RX_COPYBREAK:
+		*(u32 *)data = priv->rx_copybreak;
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	return ret;
+}
+
+static int stmmac_set_tunable(struct net_device *dev,
+			      const struct ethtool_tunable *tuna,
+			      const void *data)
+{
+	struct stmmac_priv *priv = netdev_priv(dev);
+	int ret = 0;
+
+	switch (tuna->id) {
+	case ETHTOOL_RX_COPYBREAK:
+		priv->rx_copybreak = *(u32 *)data;
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	return ret;
+}
+
 static const struct ethtool_ops stmmac_ethtool_ops = {
 	.begin = stmmac_check_if_running,
 	.get_drvinfo = stmmac_ethtool_getdrvinfo,
@@ -803,6 +840,8 @@ static const struct ethtool_ops stmmac_ethtool_ops = {
 	.get_ts_info = stmmac_get_ts_info,
 	.get_coalesce = stmmac_get_coalesce,
 	.set_coalesce = stmmac_set_coalesce,
+	.get_tunable = stmmac_get_tunable,
+	.set_tunable = stmmac_set_tunable,
 };
 
 void stmmac_set_ethtool_ops(struct net_device *netdev)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 3cc1355..2ffe8dd 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -91,6 +91,8 @@ static int buf_sz = DEFAULT_BUFSIZE;
 module_param(buf_sz, int, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(buf_sz, "DMA buffer size");
 
+#define	STMMAC_RX_COPYBREAK	256
+
 static const u32 default_msg_level = (NETIF_MSG_DRV | NETIF_MSG_PROBE |
 				      NETIF_MSG_LINK | NETIF_MSG_IFUP |
 				      NETIF_MSG_IFDOWN | NETIF_MSG_TIMER);
@@ -1808,6 +1810,7 @@ static int stmmac_open(struct net_device *dev)
 	priv->xstats.threshold = tc;
 
 	priv->dma_buf_sz = STMMAC_ALIGN(buf_sz);
+	priv->rx_copybreak = STMMAC_RX_COPYBREAK;
 
 	ret = alloc_dma_desc_resources(priv);
 	if (ret < 0) {
@@ -2159,8 +2162,7 @@ static inline void stmmac_rx_refill(struct stmmac_priv *priv)
 			struct sk_buff *skb;
 
 			skb = netdev_alloc_skb_ip_align(priv->dev, bfsize);
-
-			if (unlikely(skb == NULL))
+			if (unlikely(!skb))
 				break;
 
 			priv->rx_skbuff[entry] = skb;
@@ -2282,23 +2284,52 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit)
 					pr_debug("\tframe size %d, COE: %d\n",
 						 frame_len, status);
 			}
-			skb = priv->rx_skbuff[entry];
-			if (unlikely(!skb)) {
-				pr_err("%s: Inconsistent Rx descriptor chain\n",
-				       priv->dev->name);
-				priv->dev->stats.rx_dropped++;
-				break;
+
+			if (unlikely(frame_len < priv->rx_copybreak)) {
+				skb = netdev_alloc_skb_ip_align(priv->dev,
+								frame_len);
+				if (unlikely(!skb)) {
+					if (net_ratelimit())
+						dev_warn(priv->device,
+							 "packet dropped\n");
+					priv->dev->stats.rx_dropped++;
+					break;
+				}
+
+				dma_sync_single_for_cpu(priv->device,
+							priv->rx_skbuff_dma
+							[entry], frame_len,
+							DMA_FROM_DEVICE);
+				skb_copy_to_linear_data(skb,
+							priv->
+							rx_skbuff[entry]->data,
+							frame_len);
+
+				skb_put(skb, frame_len);
+				dma_sync_single_for_device(priv->device,
+							   priv->rx_skbuff_dma
+							   [entry], frame_len,
+							   DMA_FROM_DEVICE);
+			} else {
+				skb = priv->rx_skbuff[entry];
+				if (unlikely(!skb)) {
+					pr_err("%s: Inconsistent Rx chain\n",
+					       priv->dev->name);
+					priv->dev->stats.rx_dropped++;
+					break;
+				}
+				prefetch(skb->data - NET_IP_ALIGN);
+				priv->rx_skbuff[entry] = NULL;
+
+				skb_put(skb, frame_len);
+				dma_unmap_single(priv->device,
+						 priv->rx_skbuff_dma[entry],
+						 priv->dma_buf_sz,
+						 DMA_FROM_DEVICE);
 			}
-			prefetch(skb->data - NET_IP_ALIGN);
-			priv->rx_skbuff[entry] = NULL;
 
 			stmmac_get_rx_hwtstamp(priv, entry, skb);
 
-			skb_put(skb, frame_len);
-			dma_unmap_single(priv->device,
-					 priv->rx_skbuff_dma[entry],
-					 priv->dma_buf_sz, DMA_FROM_DEVICE);
-
 			if (netif_msg_pktdata(priv)) {
 				pr_debug("frame received (%dbytes)", frame_len);
 				print_pkt(skb->data, frame_len);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 16/17] stmmac: tune rx copy via threshold.
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (14 preceding siblings ...)
  2016-02-29 13:27 ` [PATCH v3 15/17] stmmac: do not perform zero-copy for rx frames Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 13:27 ` [PATCH v3 17/17] stmmac: update version to Oct_2015 Alexandre TORGUE
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Giuseppe Cavallaro <peppe.cavallaro@st.com>

There is a threshold now used to also limit the skb allocation
when use zero-copy. This is to avoid that there are incoherence
in the ring due to a failure on skb allocation under very
aggressive testing and under low memory conditions.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 221f5cd..d6c244f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -75,6 +75,7 @@ struct stmmac_priv {
 	unsigned int dirty_rx;
 	unsigned int dma_buf_sz;
 	unsigned int rx_copybreak;
+	unsigned int rx_zeroc_thresh;
 	u32 rx_riwt;
 	int hwts_rx_en;
 	dma_addr_t *rx_skbuff_dma;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 2ffe8dd..4c5ce98 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -72,6 +72,7 @@ module_param(phyaddr, int, S_IRUGO);
 MODULE_PARM_DESC(phyaddr, "Physical device address");
 
 #define STMMAC_TX_THRESH	(DMA_TX_SIZE / 4)
+#define STMMAC_RX_THRESH	(DMA_RX_SIZE / 4)
 
 static int flow_ctrl = FLOW_OFF;
 module_param(flow_ctrl, int, S_IRUGO | S_IWUSR);
@@ -2138,6 +2139,14 @@ static void stmmac_rx_vlan(struct net_device *dev, struct sk_buff *skb)
 }
 
 
+static inline int stmmac_rx_threshold_count(struct stmmac_priv *priv)
+{
+	if (priv->rx_zeroc_thresh < STMMAC_RX_THRESH)
+		return 0;
+
+	return 1;
+}
+
 /**
  * stmmac_rx_refill - refill used skb preallocated buffers
  * @priv: driver private structure
@@ -2162,8 +2171,15 @@ static inline void stmmac_rx_refill(struct stmmac_priv *priv)
 			struct sk_buff *skb;
 
 			skb = netdev_alloc_skb_ip_align(priv->dev, bfsize);
-			if (unlikely(!skb))
+			if (unlikely(!skb)) {
+				/* so for a while no zero-copy! */
+				priv->rx_zeroc_thresh = STMMAC_RX_THRESH;
+				if (unlikely(net_ratelimit()))
+					dev_err(priv->device,
+						"fail to alloc skb entry %d\n",
+						entry);
 				break;
+			}
 
 			priv->rx_skbuff[entry] = skb;
 			priv->rx_skbuff_dma[entry] =
@@ -2179,9 +2195,13 @@ static inline void stmmac_rx_refill(struct stmmac_priv *priv)
 
 			priv->hw->mode->refill_desc3(priv, p);
 
+			if (priv->rx_zeroc_thresh > 0)
+				priv->rx_zeroc_thresh--;
+
 			if (netif_msg_rx_status(priv))
 				pr_debug("\trefill entry #%d\n", entry);
 		}
+
 		wmb();
 		priv->hw->desc->set_rx_owner(p);
 		wmb();
@@ -2285,7 +2305,8 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit)
 						 frame_len, status);
 			}
 
-			if (unlikely(frame_len < priv->rx_copybreak)) {
+			if (unlikely((frame_len < priv->rx_copybreak) ||
+				     stmmac_rx_threshold_count(priv))) {
 				skb = netdev_alloc_skb_ip_align(priv->dev,
 								frame_len);
 				if (unlikely(!skb)) {
@@ -2320,6 +2341,7 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit)
 				}
 				prefetch(skb->data - NET_IP_ALIGN);
 				priv->rx_skbuff[entry] = NULL;
+				priv->rx_zeroc_thresh++;
 
 				skb_put(skb, frame_len);
 				dma_unmap_single(priv->device,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 17/17] stmmac: update version to Oct_2015
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (15 preceding siblings ...)
  2016-02-29 13:27 ` [PATCH v3 16/17] stmmac: tune rx copy via threshold Alexandre TORGUE
@ 2016-02-29 13:27 ` Alexandre TORGUE
  2016-02-29 14:19 ` [PATCH v3 00/17] stmmac: enhance driver performances and update the version Giuseppe CAVALLARO
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Alexandre TORGUE @ 2016-02-29 13:27 UTC (permalink / raw)
  To: netdev; +Cc: peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Giuseppe Cavallaro <peppe.cavallaro@st.com>

This patch just updates the driver to the version fully
tested on STi platforms. This version is Oct_2015.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index d6c244f..8bbab97 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -24,7 +24,7 @@
 #define __STMMAC_H__
 
 #define STMMAC_RESOURCE_NAME   "stmmaceth"
-#define DRV_MODULE_VERSION	"March_2013"
+#define DRV_MODULE_VERSION	"Oct_2015"
 
 #include <linux/clk.h>
 #include <linux/stmmac.h>
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 00/17] stmmac: enhance driver performances and update the version
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (16 preceding siblings ...)
  2016-02-29 13:27 ` [PATCH v3 17/17] stmmac: update version to Oct_2015 Alexandre TORGUE
@ 2016-02-29 14:19 ` Giuseppe CAVALLARO
  2016-02-29 16:50 ` Giuseppe CAVALLARO
  2016-03-02 19:27 ` David Miller
  19 siblings, 0 replies; 23+ messages in thread
From: Giuseppe CAVALLARO @ 2016-02-29 14:19 UTC (permalink / raw)
  To: Alexandre TORGUE, netdev; +Cc: fabrice.gasnier, maxime.coquelin

Thanks a lot Alex for this and for the next development needed
to enhance the stmmac to support new chips etc.

Peppe

On 2/29/2016 2:27 PM, Alexandre TORGUE wrote:
> According to Giuseppe, I send the v3 series.
>
> This is a subset of patches to rework the driver in order to improve its
> performances and make it more robust under stress conditions.
>
> All patches have been ported on STi mainstream kernel branch and
> tested on ARM STiH4xx platforms and newer ones.
>
> This series also updates the driver version and prepares it
> to include further development to support new chips.
>
> In detail, these patches are:
>
> o to rework and improve the internal DMA bus settings
>
>    Fine tuning is mandatory on some platforms for both
>    performance and stability issues.
>
> o to rework and optimize the descriptor management.
>
>    This will help a lot on performance side and preparing
>    the inclusion on the GMAC4.x.
>
> o to add a set of optimizations for both xmit and rx functions.
>
>    These will help a lot on performance side and making the driver
>    more robust in case of low memory conditions and under some
>    stress test, performed for example on IP-STB.
>
> Below some throughput figures obtained on some boxes before and after
> the patches.
>
>                         nuttcp (mbps)       iperf (Mbps)
> ------------------------------------------------------------------
>                        tcp     udp          tcp      udp
>                     tx   rx   tx  rx      tx   rx   tx  rx
>                      ------------------------------------------
>     old             680   800 480  506    760  800   600  700
>     new             830   880 540  630    840  880   700   800
>
> ==============================================================================
>
> V2: - rx_copybreak is now managed by using ethtool.
> V3: - improve comments on PCIe detailing that there are no regressions
>      - rework some APIs to properly define some params as bool as expected
>      - rework the formula to get the element inside the ring. Comparing V2,
> 	patches 4 and 13 have been merged because the same formula have been
> 	used. After this rework, no evident benefit has been noticed in terms
> 	of performances so the table above is still valid. Disassembling the
> 	code for SH4 and ARM, with the new formula just an instr is saved
> 	(depending on compiler flags) and this gives us not so relevanti gain,
> 	for example, on SH4 where some instr are executed in the same pipeline
> 	stage.
> 	Ring sizes are now fixed and maybe they can be reworked to be tuned
> 	w/o using stmmaceth= cmdline option. Indeed, nobody change these sizes
> 	and indeed the numbers selected by default respect the budget and
> 	avoid to pass invalid setup. These are the best driver default sizes
> 	for ring and chain.
>
> ==============================================================================
> Fabrice Gasnier (3):
>    stmmac: merge get_rx_owner into rx_status routine.
>    stmmac: optimize tx clean function
>    stmmac: fix phy init when attached to a phy
>
> Giuseppe Cavallaro (14):
>    stmmac: share reset function between dwmac100 and dwmac1000
>    stmmac: rework DMA bus setting and introduce new platform AXI
>      structure
>    stmmac: change descriptor layout
>    stmmac: review RX/TX ring management
>    stmmac: add length field to dma data
>    stmmac: add last_segment field to dma data
>    stmmac: add is_jumbo field to dma data
>    stmmac: optimize tx desc management
>    stmmac: set dirty index out of the loop
>    stmmac: first frame prep at the end of xmit routine
>    stmmac: do not poll phy handler when attach a switch
>    stmmac: do not perform zero-copy for rx frames
>    stmmac: tune rx copy via threshold.
>    stmmac: update version to Oct_2015
>
>   Documentation/devicetree/bindings/net/stmmac.txt   |  54 ++-
>   drivers/net/ethernet/stmicro/stmmac/chain_mode.c   |  37 +-
>   drivers/net/ethernet/stmicro/stmmac/common.h       |  39 +-
>   drivers/net/ethernet/stmicro/stmmac/descs.h        | 330 +++++++-------
>   drivers/net/ethernet/stmicro/stmmac/descs_com.h    |  77 ++--
>   drivers/net/ethernet/stmicro/stmmac/dwmac100.h     |   1 -
>   drivers/net/ethernet/stmicro/stmmac/dwmac1000.h    |   3 +-
>   .../net/ethernet/stmicro/stmmac/dwmac1000_dma.c    | 111 +++--
>   drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c |  22 +-
>   drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h    |  39 +-
>   drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c    |  21 +
>   drivers/net/ethernet/stmicro/stmmac/enh_desc.c     | 226 +++++-----
>   drivers/net/ethernet/stmicro/stmmac/norm_desc.c    | 150 ++++---
>   drivers/net/ethernet/stmicro/stmmac/ring_mode.c    |  32 +-
>   drivers/net/ethernet/stmicro/stmmac/stmmac.h       |   9 +-
>   .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c   |  41 +-
>   drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  | 473 ++++++++++++---------
>   drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c   |   4 +-
>   .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |  42 +-
>   include/linux/stmmac.h                             |  17 +-
>   20 files changed, 1016 insertions(+), 712 deletions(-)
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 00/17] stmmac: enhance driver performances and update the version
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (17 preceding siblings ...)
  2016-02-29 14:19 ` [PATCH v3 00/17] stmmac: enhance driver performances and update the version Giuseppe CAVALLARO
@ 2016-02-29 16:50 ` Giuseppe CAVALLARO
  2016-03-01  9:15   ` Lars Persson
  2016-03-02 19:27 ` David Miller
  19 siblings, 1 reply; 23+ messages in thread
From: Giuseppe CAVALLARO @ 2016-02-29 16:50 UTC (permalink / raw)
  To: netdev, maxime.coquelin, rabinv, larper, Rayagond K, lars.persson
  Cc: Alexandre TORGUE, fabrice.gasnier, Joachim Eastwood

Gents

on top of these patches, there is a new train to enhance the stmmac to
support the DWMAC_4.x chips. They will be proposed very soon and on
top of this update (as soon as reviewed and merged).

In our context, it has been very useful working with the same driver
that runs fine on several (x86, arm, sh4) boxes with different SYNP
MAC/GMAC IPs (starting from MAC10/100 Database Release 1.5 to databook
3.70a and .... 4.00a and 4.10a). We got the benefit to have all the
features already supported by stmmac plus the good performances
available with TSO on gmac4.

I can image no big issues to enhance the stmmac on supporting the
new 4.00a and 4.10a although there is some other work made by
Rabin and Larper.

I guess, stmmac users will continue to be happy to continue to have the
same d.d. working on their platforms with new gmac.
But! it also makes sense to avoid to have two drivers that aim
to do the same job. Or to get more synergy on the same code as
done in the past with Rayagond for PTP and EEE.

If you have some concern or advice, please do not hesitate to ask.
We will try to send the patches soon to show the code in case of
people are interested in.

Kind Regards
Peppe

On 2/29/2016 2:27 PM, Alexandre TORGUE wrote:
> According to Giuseppe, I send the v3 series.
>
> This is a subset of patches to rework the driver in order to improve its
> performances and make it more robust under stress conditions.
>
> All patches have been ported on STi mainstream kernel branch and
> tested on ARM STiH4xx platforms and newer ones.
>
> This series also updates the driver version and prepares it
> to include further development to support new chips.
>
> In detail, these patches are:
>
> o to rework and improve the internal DMA bus settings
>
>    Fine tuning is mandatory on some platforms for both
>    performance and stability issues.
>
> o to rework and optimize the descriptor management.
>
>    This will help a lot on performance side and preparing
>    the inclusion on the GMAC4.x.
>
> o to add a set of optimizations for both xmit and rx functions.
>
>    These will help a lot on performance side and making the driver
>    more robust in case of low memory conditions and under some
>    stress test, performed for example on IP-STB.
>
> Below some throughput figures obtained on some boxes before and after
> the patches.
>
>                         nuttcp (mbps)       iperf (Mbps)
> ------------------------------------------------------------------
>                        tcp     udp          tcp      udp
>                     tx   rx   tx  rx      tx   rx   tx  rx
>                      ------------------------------------------
>     old             680   800 480  506    760  800   600  700
>     new             830   880 540  630    840  880   700   800
>
> ==============================================================================
>
> V2: - rx_copybreak is now managed by using ethtool.
> V3: - improve comments on PCIe detailing that there are no regressions
>      - rework some APIs to properly define some params as bool as expected
>      - rework the formula to get the element inside the ring. Comparing V2,
> 	patches 4 and 13 have been merged because the same formula have been
> 	used. After this rework, no evident benefit has been noticed in terms
> 	of performances so the table above is still valid. Disassembling the
> 	code for SH4 and ARM, with the new formula just an instr is saved
> 	(depending on compiler flags) and this gives us not so relevanti gain,
> 	for example, on SH4 where some instr are executed in the same pipeline
> 	stage.
> 	Ring sizes are now fixed and maybe they can be reworked to be tuned
> 	w/o using stmmaceth= cmdline option. Indeed, nobody change these sizes
> 	and indeed the numbers selected by default respect the budget and
> 	avoid to pass invalid setup. These are the best driver default sizes
> 	for ring and chain.
>
> ==============================================================================
> Fabrice Gasnier (3):
>    stmmac: merge get_rx_owner into rx_status routine.
>    stmmac: optimize tx clean function
>    stmmac: fix phy init when attached to a phy
>
> Giuseppe Cavallaro (14):
>    stmmac: share reset function between dwmac100 and dwmac1000
>    stmmac: rework DMA bus setting and introduce new platform AXI
>      structure
>    stmmac: change descriptor layout
>    stmmac: review RX/TX ring management
>    stmmac: add length field to dma data
>    stmmac: add last_segment field to dma data
>    stmmac: add is_jumbo field to dma data
>    stmmac: optimize tx desc management
>    stmmac: set dirty index out of the loop
>    stmmac: first frame prep at the end of xmit routine
>    stmmac: do not poll phy handler when attach a switch
>    stmmac: do not perform zero-copy for rx frames
>    stmmac: tune rx copy via threshold.
>    stmmac: update version to Oct_2015
>
>   Documentation/devicetree/bindings/net/stmmac.txt   |  54 ++-
>   drivers/net/ethernet/stmicro/stmmac/chain_mode.c   |  37 +-
>   drivers/net/ethernet/stmicro/stmmac/common.h       |  39 +-
>   drivers/net/ethernet/stmicro/stmmac/descs.h        | 330 +++++++-------
>   drivers/net/ethernet/stmicro/stmmac/descs_com.h    |  77 ++--
>   drivers/net/ethernet/stmicro/stmmac/dwmac100.h     |   1 -
>   drivers/net/ethernet/stmicro/stmmac/dwmac1000.h    |   3 +-
>   .../net/ethernet/stmicro/stmmac/dwmac1000_dma.c    | 111 +++--
>   drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c |  22 +-
>   drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h    |  39 +-
>   drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c    |  21 +
>   drivers/net/ethernet/stmicro/stmmac/enh_desc.c     | 226 +++++-----
>   drivers/net/ethernet/stmicro/stmmac/norm_desc.c    | 150 ++++---
>   drivers/net/ethernet/stmicro/stmmac/ring_mode.c    |  32 +-
>   drivers/net/ethernet/stmicro/stmmac/stmmac.h       |   9 +-
>   .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c   |  41 +-
>   drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  | 473 ++++++++++++---------
>   drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c   |   4 +-
>   .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |  42 +-
>   include/linux/stmmac.h                             |  17 +-
>   20 files changed, 1016 insertions(+), 712 deletions(-)
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 00/17] stmmac: enhance driver performances and update the version
  2016-02-29 16:50 ` Giuseppe CAVALLARO
@ 2016-03-01  9:15   ` Lars Persson
  2016-03-01  9:40     ` Giuseppe CAVALLARO
  0 siblings, 1 reply; 23+ messages in thread
From: Lars Persson @ 2016-03-01  9:15 UTC (permalink / raw)
  To: Giuseppe CAVALLARO, netdev, maxime.coquelin, rabinv, larper, Rayagond K
  Cc: Alexandre TORGUE, fabrice.gasnier, Joachim Eastwood

Hi Peppe

We look forward to seeing the patches for 4.10a support. I hope that we 
can work together to have one driver supporting all chips with the 4.x 
GMACs.

- Lars


On 02/29/2016 05:50 PM, Giuseppe CAVALLARO wrote:
> Gents
>
> on top of these patches, there is a new train to enhance the stmmac to
> support the DWMAC_4.x chips. They will be proposed very soon and on
> top of this update (as soon as reviewed and merged).
>
> In our context, it has been very useful working with the same driver
> that runs fine on several (x86, arm, sh4) boxes with different SYNP
> MAC/GMAC IPs (starting from MAC10/100 Database Release 1.5 to databook
> 3.70a and .... 4.00a and 4.10a). We got the benefit to have all the
> features already supported by stmmac plus the good performances
> available with TSO on gmac4.
>
> I can image no big issues to enhance the stmmac on supporting the
> new 4.00a and 4.10a although there is some other work made by
> Rabin and Larper.
>
> I guess, stmmac users will continue to be happy to continue to have the
> same d.d. working on their platforms with new gmac.
> But! it also makes sense to avoid to have two drivers that aim
> to do the same job. Or to get more synergy on the same code as
> done in the past with Rayagond for PTP and EEE.
>
> If you have some concern or advice, please do not hesitate to ask.
> We will try to send the patches soon to show the code in case of
> people are interested in.
>
> Kind Regards
> Peppe
>
> On 2/29/2016 2:27 PM, Alexandre TORGUE wrote:
>> According to Giuseppe, I send the v3 series.
>>
>> This is a subset of patches to rework the driver in order to improve its
>> performances and make it more robust under stress conditions.
>>
>> All patches have been ported on STi mainstream kernel branch and
>> tested on ARM STiH4xx platforms and newer ones.
>>
>> This series also updates the driver version and prepares it
>> to include further development to support new chips.
>>
>> In detail, these patches are:
>>
>> o to rework and improve the internal DMA bus settings
>>
>>    Fine tuning is mandatory on some platforms for both
>>    performance and stability issues.
>>
>> o to rework and optimize the descriptor management.
>>
>>    This will help a lot on performance side and preparing
>>    the inclusion on the GMAC4.x.
>>
>> o to add a set of optimizations for both xmit and rx functions.
>>
>>    These will help a lot on performance side and making the driver
>>    more robust in case of low memory conditions and under some
>>    stress test, performed for example on IP-STB.
>>
>> Below some throughput figures obtained on some boxes before and after
>> the patches.
>>
>>                         nuttcp (mbps)       iperf (Mbps)
>> ------------------------------------------------------------------
>>                        tcp     udp          tcp      udp
>>                     tx   rx   tx  rx      tx   rx   tx  rx
>>                      ------------------------------------------
>>     old             680   800 480  506    760  800   600  700
>>     new             830   880 540  630    840  880   700   800
>>
>> ==============================================================================
>>
>>
>> V2: - rx_copybreak is now managed by using ethtool.
>> V3: - improve comments on PCIe detailing that there are no regressions
>>      - rework some APIs to properly define some params as bool as
>> expected
>>      - rework the formula to get the element inside the ring.
>> Comparing V2,
>>     patches 4 and 13 have been merged because the same formula have been
>>     used. After this rework, no evident benefit has been noticed in terms
>>     of performances so the table above is still valid. Disassembling the
>>     code for SH4 and ARM, with the new formula just an instr is saved
>>     (depending on compiler flags) and this gives us not so relevanti
>> gain,
>>     for example, on SH4 where some instr are executed in the same
>> pipeline
>>     stage.
>>     Ring sizes are now fixed and maybe they can be reworked to be tuned
>>     w/o using stmmaceth= cmdline option. Indeed, nobody change these
>> sizes
>>     and indeed the numbers selected by default respect the budget and
>>     avoid to pass invalid setup. These are the best driver default sizes
>>     for ring and chain.
>>
>> ==============================================================================
>>
>> Fabrice Gasnier (3):
>>    stmmac: merge get_rx_owner into rx_status routine.
>>    stmmac: optimize tx clean function
>>    stmmac: fix phy init when attached to a phy
>>
>> Giuseppe Cavallaro (14):
>>    stmmac: share reset function between dwmac100 and dwmac1000
>>    stmmac: rework DMA bus setting and introduce new platform AXI
>>      structure
>>    stmmac: change descriptor layout
>>    stmmac: review RX/TX ring management
>>    stmmac: add length field to dma data
>>    stmmac: add last_segment field to dma data
>>    stmmac: add is_jumbo field to dma data
>>    stmmac: optimize tx desc management
>>    stmmac: set dirty index out of the loop
>>    stmmac: first frame prep at the end of xmit routine
>>    stmmac: do not poll phy handler when attach a switch
>>    stmmac: do not perform zero-copy for rx frames
>>    stmmac: tune rx copy via threshold.
>>    stmmac: update version to Oct_2015
>>
>>   Documentation/devicetree/bindings/net/stmmac.txt   |  54 ++-
>>   drivers/net/ethernet/stmicro/stmmac/chain_mode.c   |  37 +-
>>   drivers/net/ethernet/stmicro/stmmac/common.h       |  39 +-
>>   drivers/net/ethernet/stmicro/stmmac/descs.h        | 330 +++++++-------
>>   drivers/net/ethernet/stmicro/stmmac/descs_com.h    |  77 ++--
>>   drivers/net/ethernet/stmicro/stmmac/dwmac100.h     |   1 -
>>   drivers/net/ethernet/stmicro/stmmac/dwmac1000.h    |   3 +-
>>   .../net/ethernet/stmicro/stmmac/dwmac1000_dma.c    | 111 +++--
>>   drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c |  22 +-
>>   drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h    |  39 +-
>>   drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c    |  21 +
>>   drivers/net/ethernet/stmicro/stmmac/enh_desc.c     | 226 +++++-----
>>   drivers/net/ethernet/stmicro/stmmac/norm_desc.c    | 150 ++++---
>>   drivers/net/ethernet/stmicro/stmmac/ring_mode.c    |  32 +-
>>   drivers/net/ethernet/stmicro/stmmac/stmmac.h       |   9 +-
>>   .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c   |  41 +-
>>   drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  | 473
>> ++++++++++++---------
>>   drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c   |   4 +-
>>   .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |  42 +-
>>   include/linux/stmmac.h                             |  17 +-
>>   20 files changed, 1016 insertions(+), 712 deletions(-)
>>
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 00/17] stmmac: enhance driver performances and update the version
  2016-03-01  9:15   ` Lars Persson
@ 2016-03-01  9:40     ` Giuseppe CAVALLARO
  0 siblings, 0 replies; 23+ messages in thread
From: Giuseppe CAVALLARO @ 2016-03-01  9:40 UTC (permalink / raw)
  To: Lars Persson, netdev, maxime.coquelin, rabinv, larper, Rayagond K
  Cc: Alexandre TORGUE, fabrice.gasnier, Joachim Eastwood

Hello Lars

On 3/1/2016 10:15 AM, Lars Persson wrote:
> Hi Peppe
>
> We look forward to seeing the patches for 4.10a support. I hope that we
> can work together to have one driver supporting all chips with the 4.x
> GMACs.

many thanks for your prompt feedback. So we will proceed to provide
the gmac4 patches on top of this subset for net-next branch.
For sure, we will work together to have the best solution for these
chips.

Regards
Peppe

>
> - Lars
>
>
> On 02/29/2016 05:50 PM, Giuseppe CAVALLARO wrote:
>> Gents
>>
>> on top of these patches, there is a new train to enhance the stmmac to
>> support the DWMAC_4.x chips. They will be proposed very soon and on
>> top of this update (as soon as reviewed and merged).
>>
>> In our context, it has been very useful working with the same driver
>> that runs fine on several (x86, arm, sh4) boxes with different SYNP
>> MAC/GMAC IPs (starting from MAC10/100 Database Release 1.5 to databook
>> 3.70a and .... 4.00a and 4.10a). We got the benefit to have all the
>> features already supported by stmmac plus the good performances
>> available with TSO on gmac4.
>>
>> I can image no big issues to enhance the stmmac on supporting the
>> new 4.00a and 4.10a although there is some other work made by
>> Rabin and Larper.
>>
>> I guess, stmmac users will continue to be happy to continue to have the
>> same d.d. working on their platforms with new gmac.
>> But! it also makes sense to avoid to have two drivers that aim
>> to do the same job. Or to get more synergy on the same code as
>> done in the past with Rayagond for PTP and EEE.
>>
>> If you have some concern or advice, please do not hesitate to ask.
>> We will try to send the patches soon to show the code in case of
>> people are interested in.
>>
>> Kind Regards
>> Peppe
>>
>> On 2/29/2016 2:27 PM, Alexandre TORGUE wrote:
>>> According to Giuseppe, I send the v3 series.
>>>
>>> This is a subset of patches to rework the driver in order to improve its
>>> performances and make it more robust under stress conditions.
>>>
>>> All patches have been ported on STi mainstream kernel branch and
>>> tested on ARM STiH4xx platforms and newer ones.
>>>
>>> This series also updates the driver version and prepares it
>>> to include further development to support new chips.
>>>
>>> In detail, these patches are:
>>>
>>> o to rework and improve the internal DMA bus settings
>>>
>>>    Fine tuning is mandatory on some platforms for both
>>>    performance and stability issues.
>>>
>>> o to rework and optimize the descriptor management.
>>>
>>>    This will help a lot on performance side and preparing
>>>    the inclusion on the GMAC4.x.
>>>
>>> o to add a set of optimizations for both xmit and rx functions.
>>>
>>>    These will help a lot on performance side and making the driver
>>>    more robust in case of low memory conditions and under some
>>>    stress test, performed for example on IP-STB.
>>>
>>> Below some throughput figures obtained on some boxes before and after
>>> the patches.
>>>
>>>                         nuttcp (mbps)       iperf (Mbps)
>>> ------------------------------------------------------------------
>>>                        tcp     udp          tcp      udp
>>>                     tx   rx   tx  rx      tx   rx   tx  rx
>>>                      ------------------------------------------
>>>     old             680   800 480  506    760  800   600  700
>>>     new             830   880 540  630    840  880   700   800
>>>
>>> ==============================================================================
>>>
>>>
>>>
>>> V2: - rx_copybreak is now managed by using ethtool.
>>> V3: - improve comments on PCIe detailing that there are no regressions
>>>      - rework some APIs to properly define some params as bool as
>>> expected
>>>      - rework the formula to get the element inside the ring.
>>> Comparing V2,
>>>     patches 4 and 13 have been merged because the same formula have been
>>>     used. After this rework, no evident benefit has been noticed in
>>> terms
>>>     of performances so the table above is still valid. Disassembling the
>>>     code for SH4 and ARM, with the new formula just an instr is saved
>>>     (depending on compiler flags) and this gives us not so relevanti
>>> gain,
>>>     for example, on SH4 where some instr are executed in the same
>>> pipeline
>>>     stage.
>>>     Ring sizes are now fixed and maybe they can be reworked to be tuned
>>>     w/o using stmmaceth= cmdline option. Indeed, nobody change these
>>> sizes
>>>     and indeed the numbers selected by default respect the budget and
>>>     avoid to pass invalid setup. These are the best driver default sizes
>>>     for ring and chain.
>>>
>>> ==============================================================================
>>>
>>>
>>> Fabrice Gasnier (3):
>>>    stmmac: merge get_rx_owner into rx_status routine.
>>>    stmmac: optimize tx clean function
>>>    stmmac: fix phy init when attached to a phy
>>>
>>> Giuseppe Cavallaro (14):
>>>    stmmac: share reset function between dwmac100 and dwmac1000
>>>    stmmac: rework DMA bus setting and introduce new platform AXI
>>>      structure
>>>    stmmac: change descriptor layout
>>>    stmmac: review RX/TX ring management
>>>    stmmac: add length field to dma data
>>>    stmmac: add last_segment field to dma data
>>>    stmmac: add is_jumbo field to dma data
>>>    stmmac: optimize tx desc management
>>>    stmmac: set dirty index out of the loop
>>>    stmmac: first frame prep at the end of xmit routine
>>>    stmmac: do not poll phy handler when attach a switch
>>>    stmmac: do not perform zero-copy for rx frames
>>>    stmmac: tune rx copy via threshold.
>>>    stmmac: update version to Oct_2015
>>>
>>>   Documentation/devicetree/bindings/net/stmmac.txt   |  54 ++-
>>>   drivers/net/ethernet/stmicro/stmmac/chain_mode.c   |  37 +-
>>>   drivers/net/ethernet/stmicro/stmmac/common.h       |  39 +-
>>>   drivers/net/ethernet/stmicro/stmmac/descs.h        | 330
>>> +++++++-------
>>>   drivers/net/ethernet/stmicro/stmmac/descs_com.h    |  77 ++--
>>>   drivers/net/ethernet/stmicro/stmmac/dwmac100.h     |   1 -
>>>   drivers/net/ethernet/stmicro/stmmac/dwmac1000.h    |   3 +-
>>>   .../net/ethernet/stmicro/stmmac/dwmac1000_dma.c    | 111 +++--
>>>   drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c |  22 +-
>>>   drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h    |  39 +-
>>>   drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c    |  21 +
>>>   drivers/net/ethernet/stmicro/stmmac/enh_desc.c     | 226 +++++-----
>>>   drivers/net/ethernet/stmicro/stmmac/norm_desc.c    | 150 ++++---
>>>   drivers/net/ethernet/stmicro/stmmac/ring_mode.c    |  32 +-
>>>   drivers/net/ethernet/stmicro/stmmac/stmmac.h       |   9 +-
>>>   .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c   |  41 +-
>>>   drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  | 473
>>> ++++++++++++---------
>>>   drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c   |   4 +-
>>>   .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |  42 +-
>>>   include/linux/stmmac.h                             |  17 +-
>>>   20 files changed, 1016 insertions(+), 712 deletions(-)
>>>
>>
>
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 00/17] stmmac: enhance driver performances and update the version
  2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
                   ` (18 preceding siblings ...)
  2016-02-29 16:50 ` Giuseppe CAVALLARO
@ 2016-03-02 19:27 ` David Miller
  19 siblings, 0 replies; 23+ messages in thread
From: David Miller @ 2016-03-02 19:27 UTC (permalink / raw)
  To: alexandre.torgue
  Cc: netdev, peppe.cavallaro, fabrice.gasnier, maxime.coquelin

From: Alexandre TORGUE <alexandre.torgue@st.com>
Date: Mon, 29 Feb 2016 14:27:26 +0100

> According to Giuseppe, I send the v3 series.

Series applied to net-next, thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2016-03-02 19:27 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-29 13:27 [PATCH v3 00/17] stmmac: enhance driver performances and update the version Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 01/17] stmmac: share reset function between dwmac100 and dwmac1000 Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 02/17] stmmac: rework DMA bus setting and introduce new platform AXI structure Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 03/17] stmmac: change descriptor layout Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 04/17] stmmac: review RX/TX ring management Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 05/17] stmmac: add length field to dma data Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 06/17] stmmac: add last_segment " Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 07/17] stmmac: add is_jumbo " Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 08/17] stmmac: merge get_rx_owner into rx_status routine Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 09/17] stmmac: optimize tx desc management Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 10/17] stmmac: optimize tx clean function Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 11/17] stmmac: set dirty index out of the loop Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 12/17] stmmac: first frame prep at the end of xmit routine Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 13/17] stmmac: do not poll phy handler when attach a switch Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 14/17] stmmac: fix phy init when attached to a phy Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 15/17] stmmac: do not perform zero-copy for rx frames Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 16/17] stmmac: tune rx copy via threshold Alexandre TORGUE
2016-02-29 13:27 ` [PATCH v3 17/17] stmmac: update version to Oct_2015 Alexandre TORGUE
2016-02-29 14:19 ` [PATCH v3 00/17] stmmac: enhance driver performances and update the version Giuseppe CAVALLARO
2016-02-29 16:50 ` Giuseppe CAVALLARO
2016-03-01  9:15   ` Lars Persson
2016-03-01  9:40     ` Giuseppe CAVALLARO
2016-03-02 19:27 ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.