linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 1/7] i40e/i40evf: Eliminate duplicate barriers on weakly-ordered archs
       [not found] <1521829277-9398-1-git-send-email-okaya@codeaurora.org>
@ 2018-03-23 18:21 ` Sinan Kaya
  2018-03-23 18:30   ` [Intel-wired-lan] " Alexander Duyck
  2018-03-23 18:21 ` [PATCH v6 2/7] ixgbe: eliminate " Sinan Kaya
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Sinan Kaya @ 2018-03-23 18:21 UTC (permalink / raw)
  To: jeffrey.t.kirsher
  Cc: netdev, timur, sulrich, linux-arm-msm, linux-arm-kernel,
	Sinan Kaya, intel-wired-lan, linux-kernel

Code includes wmb() followed by writel(). writel() already has a barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   | 24 ++++++++++++++++++++----
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c |  9 +++++++--
 2 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index c6972bd..fc10cc0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -186,7 +186,13 @@ static int i40e_program_fdir_filter(struct i40e_fdir_filter *fdir_data,
 	/* Mark the data descriptor to be watched */
 	first->next_to_watch = tx_desc;
 
-	writel(tx_ring->next_to_use, tx_ring->tail);
+	writel_relaxed(tx_ring->next_to_use, tx_ring->tail);
+
+	/* We need this if more than one processor can write to our tail
+	 * at a time, it synchronizes IO on IA64/Altix systems
+	 */
+	mmiowb();
+
 	return 0;
 
 dma_fail:
@@ -1529,7 +1535,12 @@ static inline void i40e_release_rx_desc(struct i40e_ring *rx_ring, u32 val)
 	 * such as IA-64).
 	 */
 	wmb();
-	writel(val, rx_ring->tail);
+	writel_relaxed(val, rx_ring->tail);
+
+	/* We need this if more than one processor can write to our tail
+	 * at a time, it synchronizes IO on IA64/Altix systems
+	 */
+	mmiowb();
 }
 
 /**
@@ -2412,7 +2423,12 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
 		 */
 		wmb();
 
-		writel(xdp_ring->next_to_use, xdp_ring->tail);
+		writel_relaxed(xdp_ring->next_to_use, xdp_ring->tail);
+
+		/* We need this if more than one processor can write to our tail
+		 * at a time, it synchronizes IO on IA64/Altix systems
+		 */
+		mmiowb();
 	}
 
 	rx_ring->skb = skb;
@@ -3437,7 +3453,7 @@ static inline int i40e_tx_map(struct i40e_ring *tx_ring, struct sk_buff *skb,
 
 	/* notify HW of packet */
 	if (netif_xmit_stopped(txring_txq(tx_ring)) || !skb->xmit_more) {
-		writel(i, tx_ring->tail);
+		writel_relaxed(i, tx_ring->tail);
 
 		/* we need this if more than one processor can write to our tail
 		 * at a time, it synchronizes IO on IA64/Altix systems
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
index 1ae112f..ca02762 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
@@ -810,7 +810,12 @@ static inline void i40e_release_rx_desc(struct i40e_ring *rx_ring, u32 val)
 	 * such as IA-64).
 	 */
 	wmb();
-	writel(val, rx_ring->tail);
+	writel_relaxed(val, rx_ring->tail);
+
+	/* We need this if more than one processor can write to our tail
+	 * at a time, it synchronizes IO on IA64/Altix systems
+	 */
+	mmiowb();
 }
 
 /**
@@ -2379,7 +2384,7 @@ static inline void i40evf_tx_map(struct i40e_ring *tx_ring, struct sk_buff *skb,
 
 	/* notify HW of packet */
 	if (netif_xmit_stopped(txring_txq(tx_ring)) || !skb->xmit_more) {
-		writel(i, tx_ring->tail);
+		writel_relaxed(i, tx_ring->tail);
 
 		/* we need this if more than one processor can write to our tail
 		 * at a time, it synchronizes IO on IA64/Altix systems
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 2/7] ixgbe: eliminate duplicate barriers on weakly-ordered archs
       [not found] <1521829277-9398-1-git-send-email-okaya@codeaurora.org>
  2018-03-23 18:21 ` [PATCH v6 1/7] i40e/i40evf: Eliminate duplicate barriers on weakly-ordered archs Sinan Kaya
@ 2018-03-23 18:21 ` Sinan Kaya
  2018-03-23 18:21 ` [PATCH v6 3/7] igbvf: " Sinan Kaya
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Sinan Kaya @ 2018-03-23 18:21 UTC (permalink / raw)
  To: jeffrey.t.kirsher
  Cc: netdev, timur, sulrich, linux-arm-msm, linux-arm-kernel,
	Sinan Kaya, intel-wired-lan, linux-kernel

Code includes wmb() followed by writel() in multiple places. writel()
already has a barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index e3b32ea..1ecc2f5 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1701,7 +1701,12 @@ void ixgbe_alloc_rx_buffers(struct ixgbe_ring *rx_ring, u16 cleaned_count)
 		 * such as IA-64).
 		 */
 		wmb();
-		writel(i, rx_ring->tail);
+		writel_relaxed(i, rx_ring->tail);
+
+		/* We need this if more than one processor can write to our tail
+		 * at a time, it synchronizes IO on IA64/Altix systems
+		 */
+		mmiowb();
 	}
 }
 
@@ -2470,7 +2475,12 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
 		 * know there are new descriptors to fetch.
 		 */
 		wmb();
-		writel(ring->next_to_use, ring->tail);
+		writel_relaxed(ring->next_to_use, ring->tail);
+
+		/* We need this if more than one processor can write to our tail
+		 * at a time, it synchronizes IO on IA64/Altix systems
+		 */
+		mmiowb();
 
 		xdp_do_flush_map();
 	}
@@ -8101,7 +8111,7 @@ static int ixgbe_tx_map(struct ixgbe_ring *tx_ring,
 	ixgbe_maybe_stop_tx(tx_ring, DESC_NEEDED);
 
 	if (netif_xmit_stopped(txring_txq(tx_ring)) || !skb->xmit_more) {
-		writel(i, tx_ring->tail);
+		writel_relaxed(i, tx_ring->tail);
 
 		/* we need this if more than one processor can write to our tail
 		 * at a time, it synchronizes IO on IA64/Altix systems
@@ -10038,7 +10048,12 @@ static void ixgbe_xdp_flush(struct net_device *dev)
 	 * are new descriptors to fetch.
 	 */
 	wmb();
-	writel(ring->next_to_use, ring->tail);
+	writel_relaxed(ring->next_to_use, ring->tail);
+
+	/* We need this if more than one processor can write to our tail
+	 * at a time, it synchronizes IO on IA64/Altix systems
+	 */
+	mmiowb();
 
 	return;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 3/7] igbvf: eliminate duplicate barriers on weakly-ordered archs
       [not found] <1521829277-9398-1-git-send-email-okaya@codeaurora.org>
  2018-03-23 18:21 ` [PATCH v6 1/7] i40e/i40evf: Eliminate duplicate barriers on weakly-ordered archs Sinan Kaya
  2018-03-23 18:21 ` [PATCH v6 2/7] ixgbe: eliminate " Sinan Kaya
@ 2018-03-23 18:21 ` Sinan Kaya
  2018-03-23 18:21 ` [PATCH v6 4/7] igb: " Sinan Kaya
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Sinan Kaya @ 2018-03-23 18:21 UTC (permalink / raw)
  To: jeffrey.t.kirsher
  Cc: netdev, timur, sulrich, linux-arm-msm, linux-arm-kernel,
	Sinan Kaya, intel-wired-lan, linux-kernel

Code includes wmb() followed by writel(). writel() already has a barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/intel/igbvf/netdev.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igbvf/netdev.c b/drivers/net/ethernet/intel/igbvf/netdev.c
index fa07876..6dfd3dc 100644
--- a/drivers/net/ethernet/intel/igbvf/netdev.c
+++ b/drivers/net/ethernet/intel/igbvf/netdev.c
@@ -252,7 +252,12 @@ static void igbvf_alloc_rx_buffers(struct igbvf_ring *rx_ring,
 		 * such as IA-64).
 		*/
 		wmb();
-		writel(i, adapter->hw.hw_addr + rx_ring->tail);
+		writel_relaxed(i, adapter->hw.hw_addr + rx_ring->tail);
+
+		/* We need this if more than one processor can write to our tail
+		 * at a time, it synchronizes IO on IA64/Altix systems
+		 */
+		mmiowb();
 	}
 }
 
@@ -2298,7 +2303,7 @@ static inline void igbvf_tx_queue_adv(struct igbvf_adapter *adapter,
 
 	tx_ring->buffer_info[first].next_to_watch = tx_desc;
 	tx_ring->next_to_use = i;
-	writel(i, adapter->hw.hw_addr + tx_ring->tail);
+	writel_relaxed(i, adapter->hw.hw_addr + tx_ring->tail);
 	/* we need this if more than one processor can write to our tail
 	 * at a time, it synchronizes IO on IA64/Altix systems
 	 */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 4/7] igb: eliminate duplicate barriers on weakly-ordered archs
       [not found] <1521829277-9398-1-git-send-email-okaya@codeaurora.org>
                   ` (2 preceding siblings ...)
  2018-03-23 18:21 ` [PATCH v6 3/7] igbvf: " Sinan Kaya
@ 2018-03-23 18:21 ` Sinan Kaya
  2018-03-23 18:21 ` [PATCH v6 5/7] fm10k: Eliminate " Sinan Kaya
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Sinan Kaya @ 2018-03-23 18:21 UTC (permalink / raw)
  To: jeffrey.t.kirsher
  Cc: netdev, timur, sulrich, linux-arm-msm, linux-arm-kernel,
	Sinan Kaya, intel-wired-lan, linux-kernel

Code includes wmb() followed by writel(). writel() already has a barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 3d4ff3c..570af25 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -5674,7 +5674,7 @@ static int igb_tx_map(struct igb_ring *tx_ring,
 	igb_maybe_stop_tx(tx_ring, DESC_NEEDED);
 
 	if (netif_xmit_stopped(txring_txq(tx_ring)) || !skb->xmit_more) {
-		writel(i, tx_ring->tail);
+		writel_relaxed(i, tx_ring->tail);
 
 		/* we need this if more than one processor can write to our tail
 		 * at a time, it synchronizes IO on IA64/Altix systems
@@ -8079,7 +8079,12 @@ void igb_alloc_rx_buffers(struct igb_ring *rx_ring, u16 cleaned_count)
 		 * such as IA-64).
 		 */
 		wmb();
-		writel(i, rx_ring->tail);
+		writel_relaxed(i, rx_ring->tail);
+
+		/* We need this if more than one processor can write to our tail
+		 * at a time, it synchronizes IO on IA64/Altix systems
+		 */
+		mmiowb();
 	}
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 5/7] fm10k: Eliminate duplicate barriers on weakly-ordered archs
       [not found] <1521829277-9398-1-git-send-email-okaya@codeaurora.org>
                   ` (3 preceding siblings ...)
  2018-03-23 18:21 ` [PATCH v6 4/7] igb: " Sinan Kaya
@ 2018-03-23 18:21 ` Sinan Kaya
  2018-03-23 18:21 ` [PATCH v6 6/7] ixgbevf: keep writel() closer to wmb() Sinan Kaya
  2018-03-23 18:21 ` [PATCH v6 7/7] ixgbevf: eliminate duplicate barriers on weakly-ordered archs Sinan Kaya
  6 siblings, 0 replies; 12+ messages in thread
From: Sinan Kaya @ 2018-03-23 18:21 UTC (permalink / raw)
  To: jeffrey.t.kirsher
  Cc: netdev, timur, sulrich, linux-arm-msm, linux-arm-kernel,
	Sinan Kaya, intel-wired-lan, linux-kernel

Code includes wmb() followed by writel(). writel() already has a
barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing
the register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/intel/fm10k/fm10k_main.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index 409554d..360ff9b 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -180,7 +180,12 @@ void fm10k_alloc_rx_buffers(struct fm10k_ring *rx_ring, u16 cleaned_count)
 		wmb();
 
 		/* notify hardware of new descriptors */
-		writel(i, rx_ring->tail);
+		writel_relaxed(i, rx_ring->tail);
+
+		/* We need this if more than one processor can write to our tail
+		 * at a time, it synchronizes IO on IA64/Altix systems
+		 */
+		mmiowb();
 	}
 }
 
@@ -1055,7 +1060,7 @@ static void fm10k_tx_map(struct fm10k_ring *tx_ring,
 
 	/* notify HW of packet */
 	if (netif_xmit_stopped(txring_txq(tx_ring)) || !skb->xmit_more) {
-		writel(i, tx_ring->tail);
+		writel_relaxed(i, tx_ring->tail);
 
 		/* we need this if more than one processor can write to our tail
 		 * at a time, it synchronizes IO on IA64/Altix systems
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 6/7] ixgbevf: keep writel() closer to wmb()
       [not found] <1521829277-9398-1-git-send-email-okaya@codeaurora.org>
                   ` (4 preceding siblings ...)
  2018-03-23 18:21 ` [PATCH v6 5/7] fm10k: Eliminate " Sinan Kaya
@ 2018-03-23 18:21 ` Sinan Kaya
  2018-03-23 18:21 ` [PATCH v6 7/7] ixgbevf: eliminate duplicate barriers on weakly-ordered archs Sinan Kaya
  6 siblings, 0 replies; 12+ messages in thread
From: Sinan Kaya @ 2018-03-23 18:21 UTC (permalink / raw)
  To: jeffrey.t.kirsher
  Cc: netdev, timur, sulrich, linux-arm-msm, linux-arm-kernel,
	Sinan Kaya, intel-wired-lan, linux-kernel

Remove ixgbevf_write_tail() in favor of moving writel() close to
wmb().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      | 5 -----
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 6 +++---
 2 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
index f712646..f97091d 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
@@ -312,11 +312,6 @@ static inline u16 ixgbevf_desc_unused(struct ixgbevf_ring *ring)
 	return ((ntc > ntu) ? 0 : ring->count) + ntc - ntu - 1;
 }
 
-static inline void ixgbevf_write_tail(struct ixgbevf_ring *ring, u32 value)
-{
-	writel(value, ring->tail);
-}
-
 #define IXGBEVF_RX_DESC(R, i)	\
 	(&(((union ixgbe_adv_rx_desc *)((R)->desc))[i]))
 #define IXGBEVF_TX_DESC(R, i)	\
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 3d9033f..815cb1a 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -725,7 +725,7 @@ static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
 		 * such as IA-64).
 		 */
 		wmb();
-		ixgbevf_write_tail(rx_ring, i);
+		writel(i, rx_ring->tail);
 	}
 }
 
@@ -1232,7 +1232,7 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 		 * know there are new descriptors to fetch.
 		 */
 		wmb();
-		ixgbevf_write_tail(xdp_ring, xdp_ring->next_to_use);
+		writel(xdp_ring->next_to_use, xdp_ring->tail);
 	}
 
 	u64_stats_update_begin(&rx_ring->syncp);
@@ -4004,7 +4004,7 @@ static void ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
 	tx_ring->next_to_use = i;
 
 	/* notify HW of packet */
-	ixgbevf_write_tail(tx_ring, i);
+	writel(i, tx_ring->tail);
 
 	return;
 dma_error:
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 7/7] ixgbevf: eliminate duplicate barriers on weakly-ordered archs
       [not found] <1521829277-9398-1-git-send-email-okaya@codeaurora.org>
                   ` (5 preceding siblings ...)
  2018-03-23 18:21 ` [PATCH v6 6/7] ixgbevf: keep writel() closer to wmb() Sinan Kaya
@ 2018-03-23 18:21 ` Sinan Kaya
  2018-03-23 18:25   ` [Intel-wired-lan] " Alexander Duyck
  6 siblings, 1 reply; 12+ messages in thread
From: Sinan Kaya @ 2018-03-23 18:21 UTC (permalink / raw)
  To: jeffrey.t.kirsher
  Cc: netdev, timur, sulrich, linux-arm-msm, linux-arm-kernel,
	Sinan Kaya, intel-wired-lan, linux-kernel

Code includes wmb() followed by writel() in multiple places. writel()
already has a barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 815cb1a..9e684b1 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -725,7 +725,12 @@ static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
 		 * such as IA-64).
 		 */
 		wmb();
-		writel(i, rx_ring->tail);
+		writel_relaxed(i, rx_ring->tail);
+
+		/* We need this if more than one processor can write to our tail
+		 * at a time, it synchronizes IO on IA64/Altix systems
+		 */
+		mmiowb();
 	}
 }
 
@@ -1232,7 +1237,12 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 		 * know there are new descriptors to fetch.
 		 */
 		wmb();
-		writel(xdp_ring->next_to_use, xdp_ring->tail);
+		writel_relaxed(xdp_ring->next_to_use, xdp_ring->tail);
+
+		/* We need this if more than one processor can write to our tail
+		 * at a time, it synchronizes IO on IA64/Altix systems
+		 */
+		mmiowb();
 	}
 
 	u64_stats_update_begin(&rx_ring->syncp);
@@ -4004,7 +4014,12 @@ static void ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
 	tx_ring->next_to_use = i;
 
 	/* notify HW of packet */
-	writel(i, tx_ring->tail);
+	writel_relaxed(i, tx_ring->tail);
+
+	/* We need this if more than one processor can write to our tail
+	 * at a time, it synchronizes IO on IA64/Altix systems
+	 */
+	mmiowb();
 
 	return;
 dma_error:
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [Intel-wired-lan] [PATCH v6 7/7] ixgbevf: eliminate duplicate barriers on weakly-ordered archs
  2018-03-23 18:21 ` [PATCH v6 7/7] ixgbevf: eliminate duplicate barriers on weakly-ordered archs Sinan Kaya
@ 2018-03-23 18:25   ` Alexander Duyck
  2018-03-23 18:27     ` Sinan Kaya
  0 siblings, 1 reply; 12+ messages in thread
From: Alexander Duyck @ 2018-03-23 18:25 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Jeff Kirsher, sulrich, Netdev, Timur Tabi, LKML, intel-wired-lan,
	linux-arm-msm, linux-arm-kernel

On Fri, Mar 23, 2018 at 11:21 AM, Sinan Kaya <okaya@codeaurora.org> wrote:
> Code includes wmb() followed by writel() in multiple places. writel()
> already has a barrier on some architectures like arm64.
>
> This ends up CPU observing two barriers back to back before executing the
> register write.
>
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
>
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> ---
>  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> index 815cb1a..9e684b1 100644
> --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> @@ -725,7 +725,12 @@ static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
>                  * such as IA-64).
>                  */
>                 wmb();
> -               writel(i, rx_ring->tail);
> +               writel_relaxed(i, rx_ring->tail);
> +
> +               /* We need this if more than one processor can write to our tail
> +                * at a time, it synchronizes IO on IA64/Altix systems
> +                */
> +               mmiowb();
>         }

The mmiowb shouldn't be needed for Rx. Only one CPU will be running
NAPI for the queue and we will synchronize this with a full writel
anyway when we re-enable the interrupts.

>  }
>
> @@ -1232,7 +1237,12 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
>                  * know there are new descriptors to fetch.
>                  */
>                 wmb();
> -               writel(xdp_ring->next_to_use, xdp_ring->tail);
> +               writel_relaxed(xdp_ring->next_to_use, xdp_ring->tail);
> +
> +               /* We need this if more than one processor can write to our tail
> +                * at a time, it synchronizes IO on IA64/Altix systems
> +                */
> +               mmiowb();
>         }
>
>         u64_stats_update_begin(&rx_ring->syncp);
> @@ -4004,7 +4014,12 @@ static void ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
>         tx_ring->next_to_use = i;
>
>         /* notify HW of packet */
> -       writel(i, tx_ring->tail);
> +       writel_relaxed(i, tx_ring->tail);
> +
> +       /* We need this if more than one processor can write to our tail
> +        * at a time, it synchronizes IO on IA64/Altix systems
> +        */
> +       mmiowb();
>
>         return;
>  dma_error:
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-wired-lan] [PATCH v6 7/7] ixgbevf: eliminate duplicate barriers on weakly-ordered archs
  2018-03-23 18:25   ` [Intel-wired-lan] " Alexander Duyck
@ 2018-03-23 18:27     ` Sinan Kaya
  2018-03-23 18:31       ` Alexander Duyck
  0 siblings, 1 reply; 12+ messages in thread
From: Sinan Kaya @ 2018-03-23 18:27 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Jeff Kirsher, sulrich, Netdev, Timur Tabi, LKML, intel-wired-lan,
	linux-arm-msm, linux-arm-kernel

On 3/23/2018 2:25 PM, Alexander Duyck wrote:
>> +               /* We need this if more than one processor can write to our tail
>> +                * at a time, it synchronizes IO on IA64/Altix systems
>> +                */
>> +               mmiowb();
>>         }
> The mmiowb shouldn't be needed for Rx. Only one CPU will be running
> NAPI for the queue and we will synchronize this with a full writel
> anyway when we re-enable the interrupts.
> 

OK. I can fix this on the next version. I did a blanket search and replace for
my writel_relaxed() changes as I don't know the code well enough. 

Please point me to the redundant ones.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-wired-lan] [PATCH v6 1/7] i40e/i40evf: Eliminate duplicate barriers on weakly-ordered archs
  2018-03-23 18:21 ` [PATCH v6 1/7] i40e/i40evf: Eliminate duplicate barriers on weakly-ordered archs Sinan Kaya
@ 2018-03-23 18:30   ` Alexander Duyck
  0 siblings, 0 replies; 12+ messages in thread
From: Alexander Duyck @ 2018-03-23 18:30 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Jeff Kirsher, sulrich, Netdev, Timur Tabi, LKML, intel-wired-lan,
	linux-arm-msm, linux-arm-kernel

On Fri, Mar 23, 2018 at 11:21 AM, Sinan Kaya <okaya@codeaurora.org> wrote:
> Code includes wmb() followed by writel(). writel() already has a barrier
> on some architectures like arm64.
>
> This ends up CPU observing two barriers back to back before executing the
> register write.
>
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
>
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_txrx.c   | 24 ++++++++++++++++++++----
>  drivers/net/ethernet/intel/i40evf/i40e_txrx.c |  9 +++++++--
>  2 files changed, 27 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> index c6972bd..fc10cc0 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> @@ -186,7 +186,13 @@ static int i40e_program_fdir_filter(struct i40e_fdir_filter *fdir_data,
>         /* Mark the data descriptor to be watched */
>         first->next_to_watch = tx_desc;
>
> -       writel(tx_ring->next_to_use, tx_ring->tail);
> +       writel_relaxed(tx_ring->next_to_use, tx_ring->tail);
> +
> +       /* We need this if more than one processor can write to our tail
> +        * at a time, it synchronizes IO on IA64/Altix systems
> +        */
> +       mmiowb();
> +
>         return 0;
>

The addition of mmiowb here is valid. All of the others in this patch
are invalid.

>  dma_fail:
> @@ -1529,7 +1535,12 @@ static inline void i40e_release_rx_desc(struct i40e_ring *rx_ring, u32 val)
>          * such as IA-64).
>          */
>         wmb();
> -       writel(val, rx_ring->tail);
> +       writel_relaxed(val, rx_ring->tail);
> +
> +       /* We need this if more than one processor can write to our tail
> +        * at a time, it synchronizes IO on IA64/Altix systems
> +        */
> +       mmiowb();
>  }
>
>  /**
> @@ -2412,7 +2423,12 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
>                  */
>                 wmb();
>
> -               writel(xdp_ring->next_to_use, xdp_ring->tail);
> +               writel_relaxed(xdp_ring->next_to_use, xdp_ring->tail);
> +
> +               /* We need this if more than one processor can write to our tail
> +                * at a time, it synchronizes IO on IA64/Altix systems
> +                */
> +               mmiowb();
>         }
>
>         rx_ring->skb = skb;
> @@ -3437,7 +3453,7 @@ static inline int i40e_tx_map(struct i40e_ring *tx_ring, struct sk_buff *skb,
>
>         /* notify HW of packet */
>         if (netif_xmit_stopped(txring_txq(tx_ring)) || !skb->xmit_more) {
> -               writel(i, tx_ring->tail);
> +               writel_relaxed(i, tx_ring->tail);
>
>                 /* we need this if more than one processor can write to our tail
>                  * at a time, it synchronizes IO on IA64/Altix systems
> diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
> index 1ae112f..ca02762 100644
> --- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
> +++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
> @@ -810,7 +810,12 @@ static inline void i40e_release_rx_desc(struct i40e_ring *rx_ring, u32 val)
>          * such as IA-64).
>          */
>         wmb();
> -       writel(val, rx_ring->tail);
> +       writel_relaxed(val, rx_ring->tail);
> +
> +       /* We need this if more than one processor can write to our tail
> +        * at a time, it synchronizes IO on IA64/Altix systems
> +        */
> +       mmiowb();
>  }
>
>  /**
> @@ -2379,7 +2384,7 @@ static inline void i40evf_tx_map(struct i40e_ring *tx_ring, struct sk_buff *skb,
>
>         /* notify HW of packet */
>         if (netif_xmit_stopped(txring_txq(tx_ring)) || !skb->xmit_more) {
> -               writel(i, tx_ring->tail);
> +               writel_relaxed(i, tx_ring->tail);
>
>                 /* we need this if more than one processor can write to our tail
>                  * at a time, it synchronizes IO on IA64/Altix systems
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-wired-lan] [PATCH v6 7/7] ixgbevf: eliminate duplicate barriers on weakly-ordered archs
  2018-03-23 18:27     ` Sinan Kaya
@ 2018-03-23 18:31       ` Alexander Duyck
  2018-03-23 18:45         ` Sinan Kaya
  0 siblings, 1 reply; 12+ messages in thread
From: Alexander Duyck @ 2018-03-23 18:31 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Jeff Kirsher, sulrich, Netdev, Timur Tabi, LKML, intel-wired-lan,
	linux-arm-msm, linux-arm-kernel

On Fri, Mar 23, 2018 at 11:27 AM, Sinan Kaya <okaya@codeaurora.org> wrote:
> On 3/23/2018 2:25 PM, Alexander Duyck wrote:
>>> +               /* We need this if more than one processor can write to our tail
>>> +                * at a time, it synchronizes IO on IA64/Altix systems
>>> +                */
>>> +               mmiowb();
>>>         }
>> The mmiowb shouldn't be needed for Rx. Only one CPU will be running
>> NAPI for the queue and we will synchronize this with a full writel
>> anyway when we re-enable the interrupts.
>>
>
> OK. I can fix this on the next version. I did a blanket search and replace for
> my writel_relaxed() changes as I don't know the code well enough.
>
> Please point me to the redundant ones.

So from what I can tell only this file and i40e needed any additional
mmiowb calls added. The rest are not needed.

- Alex

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-wired-lan] [PATCH v6 7/7] ixgbevf: eliminate duplicate barriers on weakly-ordered archs
  2018-03-23 18:31       ` Alexander Duyck
@ 2018-03-23 18:45         ` Sinan Kaya
  0 siblings, 0 replies; 12+ messages in thread
From: Sinan Kaya @ 2018-03-23 18:45 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Jeff Kirsher, sulrich, Netdev, Timur Tabi, LKML, intel-wired-lan,
	linux-arm-msm, linux-arm-kernel

On 3/23/2018 2:31 PM, Alexander Duyck wrote:
>> Please point me to the redundant ones.
> So from what I can tell only this file and i40e needed any additional
> mmiowb calls added. The rest are not needed.


Thanks, I'll clean up between 2..6 and then make your suggested changes
on 1 and 7.


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-03-23 18:45 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1521829277-9398-1-git-send-email-okaya@codeaurora.org>
2018-03-23 18:21 ` [PATCH v6 1/7] i40e/i40evf: Eliminate duplicate barriers on weakly-ordered archs Sinan Kaya
2018-03-23 18:30   ` [Intel-wired-lan] " Alexander Duyck
2018-03-23 18:21 ` [PATCH v6 2/7] ixgbe: eliminate " Sinan Kaya
2018-03-23 18:21 ` [PATCH v6 3/7] igbvf: " Sinan Kaya
2018-03-23 18:21 ` [PATCH v6 4/7] igb: " Sinan Kaya
2018-03-23 18:21 ` [PATCH v6 5/7] fm10k: Eliminate " Sinan Kaya
2018-03-23 18:21 ` [PATCH v6 6/7] ixgbevf: keep writel() closer to wmb() Sinan Kaya
2018-03-23 18:21 ` [PATCH v6 7/7] ixgbevf: eliminate duplicate barriers on weakly-ordered archs Sinan Kaya
2018-03-23 18:25   ` [Intel-wired-lan] " Alexander Duyck
2018-03-23 18:27     ` Sinan Kaya
2018-03-23 18:31       ` Alexander Duyck
2018-03-23 18:45         ` Sinan Kaya

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).