All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/7] scsi: Eliminate duplicate barriers on weakly-ordered archs
@ 2018-03-20  2:50 ` Sinan Kaya
  0 siblings, 0 replies; 26+ messages in thread
From: Sinan Kaya @ 2018-03-20  2:50 UTC (permalink / raw)
  To: linux-scsi, timur, sulrich; +Cc: Sinan Kaya, linux-arm-msm, linux-arm-kernel

Code includes wmb() followed by writel() in multiple places. writel()
already has a barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

I did a regex search for wmb() followed by writel() in each drivers
directory.
I scrubbed the ones I care about in this series.

I considered "ease of change", "popular usage" and "performance critical
path" as the determining criteria for my filtering.

We used relaxed API heavily on ARM for a long time but
it did not exist on other architectures. For this reason, relaxed
architectures have been paying double penalty in order to use the common
drivers.

Now that relaxed API is present on all architectures, we can go and scrub
all drivers to see what needs to change and what can remain.

We start with mostly used ones and hope to increase the coverage over time.
It will take a while to cover all drivers.

Feel free to apply patches individually.

Changes since v3:
- https://www.spinics.net/lists/arm-kernel/msg641851.html
- group patches together into subsystems scsi:...
- collect reviewed and tested bys
- scrub barrier()

Sinan Kaya (7):
  scsi: hpsa: Eliminate duplicate barriers on weakly-ordered archs
  scsi: megaraid: eliminate duplicate barriers on weakly-ordered archs
  scsi: dpt_i2o: eliminate duplicate barriers on weakly-ordered archs
  scsi: lpfc: Eliminate duplicate barriers on weakly-ordered archs
  scsi: ipr: Eliminate duplicate barriers on weakly-ordered archs
  scsi: bnx2i: Eliminate duplicate barriers on weakly-ordered archs
  scsi: csiostor: Eliminate duplicate barriers on weakly-ordered archs

 drivers/scsi/bnx2i/bnx2i_hwi.c              | 2 +-
 drivers/scsi/csiostor/csio_hw.h             | 4 ++++
 drivers/scsi/csiostor/csio_wr.c             | 2 +-
 drivers/scsi/dpt_i2o.c                      | 6 +++---
 drivers/scsi/hpsa.h                         | 2 +-
 drivers/scsi/ipr.c                          | 7 ++++---
 drivers/scsi/lpfc/lpfc_sli.c                | 4 ++--
 drivers/scsi/megaraid/megaraid_mbox.c       | 8 ++++----
 drivers/scsi/megaraid/megaraid_mbox.h       | 2 ++
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 4 ++--
 10 files changed, 24 insertions(+), 17 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 0/7] scsi: Eliminate duplicate barriers on weakly-ordered archs
@ 2018-03-20  2:50 ` Sinan Kaya
  0 siblings, 0 replies; 26+ messages in thread
From: Sinan Kaya @ 2018-03-20  2:50 UTC (permalink / raw)
  To: linux-arm-kernel

Code includes wmb() followed by writel() in multiple places. writel()
already has a barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

I did a regex search for wmb() followed by writel() in each drivers
directory.
I scrubbed the ones I care about in this series.

I considered "ease of change", "popular usage" and "performance critical
path" as the determining criteria for my filtering.

We used relaxed API heavily on ARM for a long time but
it did not exist on other architectures. For this reason, relaxed
architectures have been paying double penalty in order to use the common
drivers.

Now that relaxed API is present on all architectures, we can go and scrub
all drivers to see what needs to change and what can remain.

We start with mostly used ones and hope to increase the coverage over time.
It will take a while to cover all drivers.

Feel free to apply patches individually.

Changes since v3:
- https://www.spinics.net/lists/arm-kernel/msg641851.html
- group patches together into subsystems scsi:...
- collect reviewed and tested bys
- scrub barrier()

Sinan Kaya (7):
  scsi: hpsa: Eliminate duplicate barriers on weakly-ordered archs
  scsi: megaraid: eliminate duplicate barriers on weakly-ordered archs
  scsi: dpt_i2o: eliminate duplicate barriers on weakly-ordered archs
  scsi: lpfc: Eliminate duplicate barriers on weakly-ordered archs
  scsi: ipr: Eliminate duplicate barriers on weakly-ordered archs
  scsi: bnx2i: Eliminate duplicate barriers on weakly-ordered archs
  scsi: csiostor: Eliminate duplicate barriers on weakly-ordered archs

 drivers/scsi/bnx2i/bnx2i_hwi.c              | 2 +-
 drivers/scsi/csiostor/csio_hw.h             | 4 ++++
 drivers/scsi/csiostor/csio_wr.c             | 2 +-
 drivers/scsi/dpt_i2o.c                      | 6 +++---
 drivers/scsi/hpsa.h                         | 2 +-
 drivers/scsi/ipr.c                          | 7 ++++---
 drivers/scsi/lpfc/lpfc_sli.c                | 4 ++--
 drivers/scsi/megaraid/megaraid_mbox.c       | 8 ++++----
 drivers/scsi/megaraid/megaraid_mbox.h       | 2 ++
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 4 ++--
 10 files changed, 24 insertions(+), 17 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 1/7] scsi: hpsa: Eliminate duplicate barriers on weakly-ordered archs
  2018-03-20  2:50 ` Sinan Kaya
@ 2018-03-20  2:50   ` Sinan Kaya
  -1 siblings, 0 replies; 26+ messages in thread
From: Sinan Kaya @ 2018-03-20  2:50 UTC (permalink / raw)
  To: linux-scsi, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, Don Brace,
	James E.J. Bottomley, Martin K. Petersen, esc.storagedev,
	linux-kernel

Code includes wmb() followed by writel(). writel() already has a
barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing
the register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/hpsa.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 018f980..c7d7e6a 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -599,7 +599,7 @@ static unsigned long SA5_ioaccel_mode1_completed(struct ctlr_info *h, u8 q)
 		 * but with current driver design this is easiest.
 		 */
 		wmb();
-		writel((q << 24) | rq->current_entry, h->vaddr +
+		writel_relaxed((q << 24) | rq->current_entry, h->vaddr +
 				IOACCEL_MODE1_CONSUMER_INDEX);
 		atomic_dec(&h->commands_outstanding);
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 1/7] scsi: hpsa: Eliminate duplicate barriers on weakly-ordered archs
@ 2018-03-20  2:50   ` Sinan Kaya
  0 siblings, 0 replies; 26+ messages in thread
From: Sinan Kaya @ 2018-03-20  2:50 UTC (permalink / raw)
  To: linux-arm-kernel

Code includes wmb() followed by writel(). writel() already has a
barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing
the register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/hpsa.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 018f980..c7d7e6a 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -599,7 +599,7 @@ static unsigned long SA5_ioaccel_mode1_completed(struct ctlr_info *h, u8 q)
 		 * but with current driver design this is easiest.
 		 */
 		wmb();
-		writel((q << 24) | rq->current_entry, h->vaddr +
+		writel_relaxed((q << 24) | rq->current_entry, h->vaddr +
 				IOACCEL_MODE1_CONSUMER_INDEX);
 		atomic_dec(&h->commands_outstanding);
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 2/7] scsi: megaraid: eliminate duplicate barriers on weakly-ordered archs
  2018-03-20  2:50 ` Sinan Kaya
@ 2018-03-20  2:50   ` Sinan Kaya
  -1 siblings, 0 replies; 26+ messages in thread
From: Sinan Kaya @ 2018-03-20  2:50 UTC (permalink / raw)
  To: linux-scsi, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, Kashyap Desai,
	Sumit Saxena, Shivasharan S, James E.J. Bottomley,
	Martin K. Petersen, megaraidlinux.pdl, linux-kernel

Code includes barrier() followed by writel(). writel() already has a
barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Create a new wrapper function with relaxed write operator. Use the new
wrapper when a write is following a barrier().

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/megaraid/megaraid_mbox.c       | 8 ++++----
 drivers/scsi/megaraid/megaraid_mbox.h       | 2 ++
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 4 ++--
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
index 530358c..8c4fc6e 100644
--- a/drivers/scsi/megaraid/megaraid_mbox.c
+++ b/drivers/scsi/megaraid/megaraid_mbox.c
@@ -1439,7 +1439,7 @@ mbox_post_cmd(adapter_t *adapter, scb_t *scb)
 	mbox->ack	= 0;
 	wmb();
 
-	WRINDOOR(raid_dev, raid_dev->mbox_dma | 0x1);
+	WRINDOOR_RELAXED(raid_dev, raid_dev->mbox_dma | 0x1);
 
 	spin_unlock_irqrestore(MAILBOX_LOCK(raid_dev), flags);
 
@@ -2752,7 +2752,7 @@ mbox_post_sync_cmd(adapter_t *adapter, uint8_t raw_mbox[])
 	mbox->status		= 0xFF;
 
 	wmb();
-	WRINDOOR(raid_dev, raid_dev->mbox_dma | 0x1);
+	WRINDOOR_RELAXED(raid_dev, raid_dev->mbox_dma | 0x1);
 
 	// wait for maximum 1 second for status to post. If the status is not
 	// available within 1 second, assume FW is initializing and wait
@@ -2877,7 +2877,7 @@ mbox_post_sync_cmd_fast(adapter_t *adapter, uint8_t raw_mbox[])
 	mbox->status		= 0xFF;
 
 	wmb();
-	WRINDOOR(raid_dev, raid_dev->mbox_dma | 0x1);
+	WRINDOOR_RELAXED(raid_dev, raid_dev->mbox_dma | 0x1);
 
 	for (i = 0; i < MBOX_SYNC_WAIT_CNT; i++) {
 		if (mbox->numstatus != 0xFF) break;
@@ -3329,7 +3329,7 @@ megaraid_mbox_fire_sync_cmd(adapter_t *adapter)
 	mbox->status		= 0;
 
 	wmb();
-	WRINDOOR(raid_dev, raid_dev->mbox_dma | 0x1);
+	WRINDOOR_RELAXED(raid_dev, raid_dev->mbox_dma | 0x1);
 
 	/* Wait for maximum 1 min for status to post.
 	 * If the Firmware SUPPORTS the ABOVE COMMAND,
diff --git a/drivers/scsi/megaraid/megaraid_mbox.h b/drivers/scsi/megaraid/megaraid_mbox.h
index c1d86d9..641cbd4 100644
--- a/drivers/scsi/megaraid/megaraid_mbox.h
+++ b/drivers/scsi/megaraid/megaraid_mbox.h
@@ -230,6 +230,8 @@ typedef struct {
 
 #define RDINDOOR(rdev)		readl((rdev)->baseaddr + 0x20)
 #define RDOUTDOOR(rdev)		readl((rdev)->baseaddr + 0x2C)
+#define WRINDOOR_RELAXED(rdev, value)	\
+	writel_relaxed(value, (rdev)->baseaddr + 0x20)
 #define WRINDOOR(rdev, value)	writel(value, (rdev)->baseaddr + 0x20)
 #define WROUTDOOR(rdev, value)	writel(value, (rdev)->baseaddr + 0x2C)
 
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index 073ced0..f560496 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -3479,11 +3479,11 @@ complete_cmd_fusion(struct megasas_instance *instance, u32 MSIxIndex)
 
 	wmb();
 	if (instance->msix_combined)
-		writel(((MSIxIndex & 0x7) << 24) |
+		writel_relaxed(((MSIxIndex & 0x7) << 24) |
 			fusion->last_reply_idx[MSIxIndex],
 			instance->reply_post_host_index_addr[MSIxIndex/8]);
 	else
-		writel((MSIxIndex << 24) |
+		writel_relaxed((MSIxIndex << 24) |
 			fusion->last_reply_idx[MSIxIndex],
 			instance->reply_post_host_index_addr[0]);
 	megasas_check_and_restore_queue_depth(instance);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 2/7] scsi: megaraid: eliminate duplicate barriers on weakly-ordered archs
@ 2018-03-20  2:50   ` Sinan Kaya
  0 siblings, 0 replies; 26+ messages in thread
From: Sinan Kaya @ 2018-03-20  2:50 UTC (permalink / raw)
  To: linux-arm-kernel

Code includes barrier() followed by writel(). writel() already has a
barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Create a new wrapper function with relaxed write operator. Use the new
wrapper when a write is following a barrier().

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/megaraid/megaraid_mbox.c       | 8 ++++----
 drivers/scsi/megaraid/megaraid_mbox.h       | 2 ++
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 4 ++--
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
index 530358c..8c4fc6e 100644
--- a/drivers/scsi/megaraid/megaraid_mbox.c
+++ b/drivers/scsi/megaraid/megaraid_mbox.c
@@ -1439,7 +1439,7 @@ mbox_post_cmd(adapter_t *adapter, scb_t *scb)
 	mbox->ack	= 0;
 	wmb();
 
-	WRINDOOR(raid_dev, raid_dev->mbox_dma | 0x1);
+	WRINDOOR_RELAXED(raid_dev, raid_dev->mbox_dma | 0x1);
 
 	spin_unlock_irqrestore(MAILBOX_LOCK(raid_dev), flags);
 
@@ -2752,7 +2752,7 @@ mbox_post_sync_cmd(adapter_t *adapter, uint8_t raw_mbox[])
 	mbox->status		= 0xFF;
 
 	wmb();
-	WRINDOOR(raid_dev, raid_dev->mbox_dma | 0x1);
+	WRINDOOR_RELAXED(raid_dev, raid_dev->mbox_dma | 0x1);
 
 	// wait for maximum 1 second for status to post. If the status is not
 	// available within 1 second, assume FW is initializing and wait
@@ -2877,7 +2877,7 @@ mbox_post_sync_cmd_fast(adapter_t *adapter, uint8_t raw_mbox[])
 	mbox->status		= 0xFF;
 
 	wmb();
-	WRINDOOR(raid_dev, raid_dev->mbox_dma | 0x1);
+	WRINDOOR_RELAXED(raid_dev, raid_dev->mbox_dma | 0x1);
 
 	for (i = 0; i < MBOX_SYNC_WAIT_CNT; i++) {
 		if (mbox->numstatus != 0xFF) break;
@@ -3329,7 +3329,7 @@ megaraid_mbox_fire_sync_cmd(adapter_t *adapter)
 	mbox->status		= 0;
 
 	wmb();
-	WRINDOOR(raid_dev, raid_dev->mbox_dma | 0x1);
+	WRINDOOR_RELAXED(raid_dev, raid_dev->mbox_dma | 0x1);
 
 	/* Wait for maximum 1 min for status to post.
 	 * If the Firmware SUPPORTS the ABOVE COMMAND,
diff --git a/drivers/scsi/megaraid/megaraid_mbox.h b/drivers/scsi/megaraid/megaraid_mbox.h
index c1d86d9..641cbd4 100644
--- a/drivers/scsi/megaraid/megaraid_mbox.h
+++ b/drivers/scsi/megaraid/megaraid_mbox.h
@@ -230,6 +230,8 @@ typedef struct {
 
 #define RDINDOOR(rdev)		readl((rdev)->baseaddr + 0x20)
 #define RDOUTDOOR(rdev)		readl((rdev)->baseaddr + 0x2C)
+#define WRINDOOR_RELAXED(rdev, value)	\
+	writel_relaxed(value, (rdev)->baseaddr + 0x20)
 #define WRINDOOR(rdev, value)	writel(value, (rdev)->baseaddr + 0x20)
 #define WROUTDOOR(rdev, value)	writel(value, (rdev)->baseaddr + 0x2C)
 
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index 073ced0..f560496 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -3479,11 +3479,11 @@ complete_cmd_fusion(struct megasas_instance *instance, u32 MSIxIndex)
 
 	wmb();
 	if (instance->msix_combined)
-		writel(((MSIxIndex & 0x7) << 24) |
+		writel_relaxed(((MSIxIndex & 0x7) << 24) |
 			fusion->last_reply_idx[MSIxIndex],
 			instance->reply_post_host_index_addr[MSIxIndex/8]);
 	else
-		writel((MSIxIndex << 24) |
+		writel_relaxed((MSIxIndex << 24) |
 			fusion->last_reply_idx[MSIxIndex],
 			instance->reply_post_host_index_addr[0]);
 	megasas_check_and_restore_queue_depth(instance);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 3/7] scsi: dpt_i2o: eliminate duplicate barriers on weakly-ordered archs
  2018-03-20  2:50 ` Sinan Kaya
@ 2018-03-20  2:50   ` Sinan Kaya
  -1 siblings, 0 replies; 26+ messages in thread
From: Sinan Kaya @ 2018-03-20  2:50 UTC (permalink / raw)
  To: linux-scsi, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya,
	Adaptec OEM Raid Solutions, James E.J. Bottomley,
	Martin K. Petersen, linux-kernel

Code includes barrier() followed by writel(). writel() already has a
barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/dpt_i2o.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/dpt_i2o.c b/drivers/scsi/dpt_i2o.c
index fd172b0..3c1e64a 100644
--- a/drivers/scsi/dpt_i2o.c
+++ b/drivers/scsi/dpt_i2o.c
@@ -1300,7 +1300,7 @@ static s32 adpt_i2o_post_this(adpt_hba* pHba, u32* data, int len)
 	wmb();
 
 	//post message
-	writel(m, pHba->post_port);
+	writel_relaxed(m, pHba->post_port);
 	wmb();
 
 	return 0;
@@ -1390,7 +1390,7 @@ static s32 adpt_i2o_reset_hba(adpt_hba* pHba)
 
 	memcpy_toio(pHba->msg_addr_virt+m, msg, sizeof(msg));
 	wmb();
-	writel(m, pHba->post_port);
+	writel_relaxed(m, pHba->post_port);
 	wmb();
 
 	while(*status == 0){
@@ -2797,7 +2797,7 @@ static s32 adpt_send_nop(adpt_hba*pHba,u32 m)
 	writel( 0,&msg[2]);
 	wmb();
 
-	writel(m, pHba->post_port);
+	writel_relaxed(m, pHba->post_port);
 	wmb();
 	return 0;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 3/7] scsi: dpt_i2o: eliminate duplicate barriers on weakly-ordered archs
@ 2018-03-20  2:50   ` Sinan Kaya
  0 siblings, 0 replies; 26+ messages in thread
From: Sinan Kaya @ 2018-03-20  2:50 UTC (permalink / raw)
  To: linux-arm-kernel

Code includes barrier() followed by writel(). writel() already has a
barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/dpt_i2o.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/dpt_i2o.c b/drivers/scsi/dpt_i2o.c
index fd172b0..3c1e64a 100644
--- a/drivers/scsi/dpt_i2o.c
+++ b/drivers/scsi/dpt_i2o.c
@@ -1300,7 +1300,7 @@ static s32 adpt_i2o_post_this(adpt_hba* pHba, u32* data, int len)
 	wmb();
 
 	//post message
-	writel(m, pHba->post_port);
+	writel_relaxed(m, pHba->post_port);
 	wmb();
 
 	return 0;
@@ -1390,7 +1390,7 @@ static s32 adpt_i2o_reset_hba(adpt_hba* pHba)
 
 	memcpy_toio(pHba->msg_addr_virt+m, msg, sizeof(msg));
 	wmb();
-	writel(m, pHba->post_port);
+	writel_relaxed(m, pHba->post_port);
 	wmb();
 
 	while(*status == 0){
@@ -2797,7 +2797,7 @@ static s32 adpt_send_nop(adpt_hba*pHba,u32 m)
 	writel( 0,&msg[2]);
 	wmb();
 
-	writel(m, pHba->post_port);
+	writel_relaxed(m, pHba->post_port);
 	wmb();
 	return 0;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 4/7] scsi: lpfc: Eliminate duplicate barriers on weakly-ordered archs
  2018-03-20  2:50 ` Sinan Kaya
@ 2018-03-20  2:50   ` Sinan Kaya
  -1 siblings, 0 replies; 26+ messages in thread
From: Sinan Kaya @ 2018-03-20  2:50 UTC (permalink / raw)
  To: linux-scsi, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, James Smart,
	Dick Kennedy, James E.J. Bottomley, Martin K. Petersen,
	linux-kernel

Code includes barrier() followed by writel(). writel() already has a
barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/lpfc/lpfc_sli.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 5f5528a..7dae7d3 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -1647,7 +1647,7 @@ lpfc_sli_update_full_ring(struct lpfc_hba *phba, struct lpfc_sli_ring *pring)
 	 * Set ring 'ringno' to SET R0CE_REQ in Chip Att register.
 	 * The HBA will tell us when an IOCB entry is available.
 	 */
-	writel((CA_R0ATT|CA_R0CE_REQ) << (ringno*4), phba->CAregaddr);
+	writel_relaxed((CA_R0ATT|CA_R0CE_REQ) << (ringno*4), phba->CAregaddr);
 	readl(phba->CAregaddr); /* flush */
 
 	pring->stats.iocb_cmd_full++;
@@ -1672,7 +1672,7 @@ lpfc_sli_update_ring(struct lpfc_hba *phba, struct lpfc_sli_ring *pring)
 	 */
 	if (!(phba->sli3_options & LPFC_SLI3_CRP_ENABLED)) {
 		wmb();
-		writel(CA_R0ATT << (ringno * 4), phba->CAregaddr);
+		writel_relaxed(CA_R0ATT << (ringno * 4), phba->CAregaddr);
 		readl(phba->CAregaddr); /* flush */
 	}
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 4/7] scsi: lpfc: Eliminate duplicate barriers on weakly-ordered archs
@ 2018-03-20  2:50   ` Sinan Kaya
  0 siblings, 0 replies; 26+ messages in thread
From: Sinan Kaya @ 2018-03-20  2:50 UTC (permalink / raw)
  To: linux-arm-kernel

Code includes barrier() followed by writel(). writel() already has a
barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/lpfc/lpfc_sli.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 5f5528a..7dae7d3 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -1647,7 +1647,7 @@ lpfc_sli_update_full_ring(struct lpfc_hba *phba, struct lpfc_sli_ring *pring)
 	 * Set ring 'ringno' to SET R0CE_REQ in Chip Att register.
 	 * The HBA will tell us when an IOCB entry is available.
 	 */
-	writel((CA_R0ATT|CA_R0CE_REQ) << (ringno*4), phba->CAregaddr);
+	writel_relaxed((CA_R0ATT|CA_R0CE_REQ) << (ringno*4), phba->CAregaddr);
 	readl(phba->CAregaddr); /* flush */
 
 	pring->stats.iocb_cmd_full++;
@@ -1672,7 +1672,7 @@ lpfc_sli_update_ring(struct lpfc_hba *phba, struct lpfc_sli_ring *pring)
 	 */
 	if (!(phba->sli3_options & LPFC_SLI3_CRP_ENABLED)) {
 		wmb();
-		writel(CA_R0ATT << (ringno * 4), phba->CAregaddr);
+		writel_relaxed(CA_R0ATT << (ringno * 4), phba->CAregaddr);
 		readl(phba->CAregaddr); /* flush */
 	}
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 5/7] scsi: ipr: Eliminate duplicate barriers on weakly-ordered archs
  2018-03-20  2:50 ` Sinan Kaya
@ 2018-03-20  2:50   ` Sinan Kaya
  -1 siblings, 0 replies; 26+ messages in thread
From: Sinan Kaya @ 2018-03-20  2:50 UTC (permalink / raw)
  To: linux-scsi, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, Brian King,
	James E.J. Bottomley, Martin K. Petersen, linux-kernel

Code includes barrier() followed by writel(). writel() already has a
barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writeX() to
writeX_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/ipr.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
index e07dd99..209adac 100644
--- a/drivers/scsi/ipr.c
+++ b/drivers/scsi/ipr.c
@@ -762,9 +762,9 @@ static void ipr_mask_and_clear_interrupts(struct ipr_ioa_cfg *ioa_cfg,
 
 	/* Set interrupt mask to stop all new interrupts */
 	if (ioa_cfg->sis64)
-		writeq(~0, ioa_cfg->regs.set_interrupt_mask_reg);
+		writeq_relaxed(~0, ioa_cfg->regs.set_interrupt_mask_reg);
 	else
-		writel(~0, ioa_cfg->regs.set_interrupt_mask_reg);
+		writel_relaxed(~0, ioa_cfg->regs.set_interrupt_mask_reg);
 
 	/* Clear any pending interrupts */
 	if (ioa_cfg->sis64)
@@ -8435,7 +8435,8 @@ static int ipr_reset_enable_ioa(struct ipr_cmnd *ipr_cmd)
 	wmb();
 	if (ioa_cfg->sis64) {
 		/* Set the adapter to the correct endian mode. */
-		writel(IPR_ENDIAN_SWAP_KEY, ioa_cfg->regs.endian_swap_reg);
+		writel_relaxed(IPR_ENDIAN_SWAP_KEY,
+			       ioa_cfg->regs.endian_swap_reg);
 		int_reg = readl(ioa_cfg->regs.endian_swap_reg);
 	}
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 5/7] scsi: ipr: Eliminate duplicate barriers on weakly-ordered archs
@ 2018-03-20  2:50   ` Sinan Kaya
  0 siblings, 0 replies; 26+ messages in thread
From: Sinan Kaya @ 2018-03-20  2:50 UTC (permalink / raw)
  To: linux-arm-kernel

Code includes barrier() followed by writel(). writel() already has a
barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writeX() to
writeX_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/ipr.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
index e07dd99..209adac 100644
--- a/drivers/scsi/ipr.c
+++ b/drivers/scsi/ipr.c
@@ -762,9 +762,9 @@ static void ipr_mask_and_clear_interrupts(struct ipr_ioa_cfg *ioa_cfg,
 
 	/* Set interrupt mask to stop all new interrupts */
 	if (ioa_cfg->sis64)
-		writeq(~0, ioa_cfg->regs.set_interrupt_mask_reg);
+		writeq_relaxed(~0, ioa_cfg->regs.set_interrupt_mask_reg);
 	else
-		writel(~0, ioa_cfg->regs.set_interrupt_mask_reg);
+		writel_relaxed(~0, ioa_cfg->regs.set_interrupt_mask_reg);
 
 	/* Clear any pending interrupts */
 	if (ioa_cfg->sis64)
@@ -8435,7 +8435,8 @@ static int ipr_reset_enable_ioa(struct ipr_cmnd *ipr_cmd)
 	wmb();
 	if (ioa_cfg->sis64) {
 		/* Set the adapter to the correct endian mode. */
-		writel(IPR_ENDIAN_SWAP_KEY, ioa_cfg->regs.endian_swap_reg);
+		writel_relaxed(IPR_ENDIAN_SWAP_KEY,
+			       ioa_cfg->regs.endian_swap_reg);
 		int_reg = readl(ioa_cfg->regs.endian_swap_reg);
 	}
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 6/7] scsi: bnx2i: Eliminate duplicate barriers on weakly-ordered archs
  2018-03-20  2:50 ` Sinan Kaya
@ 2018-03-20  2:50   ` Sinan Kaya
  -1 siblings, 0 replies; 26+ messages in thread
From: Sinan Kaya @ 2018-03-20  2:50 UTC (permalink / raw)
  To: linux-scsi, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya,
	QLogic-Storage-Upstream, James E.J. Bottomley,
	Martin K. Petersen, linux-kernel

Code includes barrier() followed by writel(). writel() already has a
barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/bnx2i/bnx2i_hwi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/bnx2i/bnx2i_hwi.c b/drivers/scsi/bnx2i/bnx2i_hwi.c
index 8f03a86..075735b 100644
--- a/drivers/scsi/bnx2i/bnx2i_hwi.c
+++ b/drivers/scsi/bnx2i/bnx2i_hwi.c
@@ -278,7 +278,7 @@ static void bnx2i_ring_sq_dbell(struct bnx2i_conn *bnx2i_conn, int count)
 		sq_db->prod_idx = ep->qp.sq_prod_idx;
 		bnx2i_ring_577xx_doorbell(bnx2i_conn);
 	} else
-		writew(count, ep->qp.ctx_base + CNIC_SEND_DOORBELL);
+		writew_relaxed(count, ep->qp.ctx_base + CNIC_SEND_DOORBELL);
 
 	mmiowb(); /* flush posted PCI writes */
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 6/7] scsi: bnx2i: Eliminate duplicate barriers on weakly-ordered archs
@ 2018-03-20  2:50   ` Sinan Kaya
  0 siblings, 0 replies; 26+ messages in thread
From: Sinan Kaya @ 2018-03-20  2:50 UTC (permalink / raw)
  To: linux-arm-kernel

Code includes barrier() followed by writel(). writel() already has a
barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/bnx2i/bnx2i_hwi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/bnx2i/bnx2i_hwi.c b/drivers/scsi/bnx2i/bnx2i_hwi.c
index 8f03a86..075735b 100644
--- a/drivers/scsi/bnx2i/bnx2i_hwi.c
+++ b/drivers/scsi/bnx2i/bnx2i_hwi.c
@@ -278,7 +278,7 @@ static void bnx2i_ring_sq_dbell(struct bnx2i_conn *bnx2i_conn, int count)
 		sq_db->prod_idx = ep->qp.sq_prod_idx;
 		bnx2i_ring_577xx_doorbell(bnx2i_conn);
 	} else
-		writew(count, ep->qp.ctx_base + CNIC_SEND_DOORBELL);
+		writew_relaxed(count, ep->qp.ctx_base + CNIC_SEND_DOORBELL);
 
 	mmiowb(); /* flush posted PCI writes */
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 7/7] scsi: csiostor: Eliminate duplicate barriers on weakly-ordered archs
  2018-03-20  2:50 ` Sinan Kaya
@ 2018-03-20  2:50   ` Sinan Kaya
  -1 siblings, 0 replies; 26+ messages in thread
From: Sinan Kaya @ 2018-03-20  2:50 UTC (permalink / raw)
  To: linux-scsi, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya,
	James E.J. Bottomley, Martin K. Petersen, Varun Prakash,
	Peter Senna Tschudin, Romain Perier, linux-kernel

Code includes barrier() followed by writel(). writel() already has a
barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Create a new wrapper function with relaxed write operator. Use the new
wrapper when a write is following a barrier().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/csiostor/csio_hw.h | 4 ++++
 drivers/scsi/csiostor/csio_wr.c | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/csiostor/csio_hw.h b/drivers/scsi/csiostor/csio_hw.h
index 30f5f52..9fd8b00 100644
--- a/drivers/scsi/csiostor/csio_hw.h
+++ b/drivers/scsi/csiostor/csio_hw.h
@@ -512,6 +512,10 @@ struct csio_hw {
 						csio_reg((_h)->regstart, (_r)))
 #define	csio_wr_reg16(_h, _v, _r)	writew((_v), \
 						csio_reg((_h)->regstart, (_r)))
+
+#define	csio_wr_reg32_relaxed(_h, _v, _r) \
+	writel_relaxed((_v), csio_reg((_h)->regstart, (_r)))
+
 #define	csio_wr_reg32(_h, _v, _r)	writel((_v), \
 						csio_reg((_h)->regstart, (_r)))
 #define	csio_wr_reg64(_h, _v, _r)	writeq((_v), \
diff --git a/drivers/scsi/csiostor/csio_wr.c b/drivers/scsi/csiostor/csio_wr.c
index c0a1778..db26222 100644
--- a/drivers/scsi/csiostor/csio_wr.c
+++ b/drivers/scsi/csiostor/csio_wr.c
@@ -984,7 +984,7 @@ csio_wr_issue(struct csio_hw *hw, int qidx, bool prio)
 
 	wmb();
 	/* Ring SGE Doorbell writing q->pidx into it */
-	csio_wr_reg32(hw, DBPRIO_V(prio) | QID_V(q->un.eq.physeqid) |
+	csio_wr_reg32_relaxed(hw, DBPRIO_V(prio) | QID_V(q->un.eq.physeqid) |
 			  PIDX_T5_V(q->inc_idx) | DBTYPE_F,
 			  MYPF_REG(SGE_PF_KDOORBELL_A));
 	q->inc_idx = 0;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 7/7] scsi: csiostor: Eliminate duplicate barriers on weakly-ordered archs
@ 2018-03-20  2:50   ` Sinan Kaya
  0 siblings, 0 replies; 26+ messages in thread
From: Sinan Kaya @ 2018-03-20  2:50 UTC (permalink / raw)
  To: linux-arm-kernel

Code includes barrier() followed by writel(). writel() already has a
barrier
on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Create a new wrapper function with relaxed write operator. Use the new
wrapper when a write is following a barrier().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/csiostor/csio_hw.h | 4 ++++
 drivers/scsi/csiostor/csio_wr.c | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/csiostor/csio_hw.h b/drivers/scsi/csiostor/csio_hw.h
index 30f5f52..9fd8b00 100644
--- a/drivers/scsi/csiostor/csio_hw.h
+++ b/drivers/scsi/csiostor/csio_hw.h
@@ -512,6 +512,10 @@ struct csio_hw {
 						csio_reg((_h)->regstart, (_r)))
 #define	csio_wr_reg16(_h, _v, _r)	writew((_v), \
 						csio_reg((_h)->regstart, (_r)))
+
+#define	csio_wr_reg32_relaxed(_h, _v, _r) \
+	writel_relaxed((_v), csio_reg((_h)->regstart, (_r)))
+
 #define	csio_wr_reg32(_h, _v, _r)	writel((_v), \
 						csio_reg((_h)->regstart, (_r)))
 #define	csio_wr_reg64(_h, _v, _r)	writeq((_v), \
diff --git a/drivers/scsi/csiostor/csio_wr.c b/drivers/scsi/csiostor/csio_wr.c
index c0a1778..db26222 100644
--- a/drivers/scsi/csiostor/csio_wr.c
+++ b/drivers/scsi/csiostor/csio_wr.c
@@ -984,7 +984,7 @@ csio_wr_issue(struct csio_hw *hw, int qidx, bool prio)
 
 	wmb();
 	/* Ring SGE Doorbell writing q->pidx into it */
-	csio_wr_reg32(hw, DBPRIO_V(prio) | QID_V(q->un.eq.physeqid) |
+	csio_wr_reg32_relaxed(hw, DBPRIO_V(prio) | QID_V(q->un.eq.physeqid) |
 			  PIDX_T5_V(q->inc_idx) | DBTYPE_F,
 			  MYPF_REG(SGE_PF_KDOORBELL_A));
 	q->inc_idx = 0;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 3/7] scsi: dpt_i2o: eliminate duplicate barriers on weakly-ordered archs
  2018-03-20  2:50   ` Sinan Kaya
@ 2018-03-20 10:23     ` Christoph Hellwig
  -1 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-03-20 10:23 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-scsi, timur, sulrich, linux-arm-msm, linux-arm-kernel,
	Adaptec OEM Raid Solutions, James E.J. Bottomley,
	Martin K. Petersen, linux-kernel

This is a basically unmaintained driver for historic hardware, and not the
right place for micro-optimizations.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 3/7] scsi: dpt_i2o: eliminate duplicate barriers on weakly-ordered archs
@ 2018-03-20 10:23     ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-03-20 10:23 UTC (permalink / raw)
  To: linux-arm-kernel

This is a basically unmaintained driver for historic hardware, and not the
right place for micro-optimizations.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 3/7] scsi: dpt_i2o: eliminate duplicate barriers on weakly-ordered archs
  2018-03-20 10:23     ` Christoph Hellwig
@ 2018-03-20 12:13       ` okaya at codeaurora.org
  -1 siblings, 0 replies; 26+ messages in thread
From: okaya @ 2018-03-20 12:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-scsi, timur, sulrich, linux-arm-msm, linux-arm-kernel,
	Adaptec OEM Raid Solutions, James E.J. Bottomley,
	Martin K. Petersen, linux-kernel

On 2018-03-20 06:23, Christoph Hellwig wrote:
> This is a basically unmaintained driver for historic hardware, and not 
> the
> right place for micro-optimizations.

Sure, I can drop this on the next version.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 3/7] scsi: dpt_i2o: eliminate duplicate barriers on weakly-ordered archs
@ 2018-03-20 12:13       ` okaya at codeaurora.org
  0 siblings, 0 replies; 26+ messages in thread
From: okaya at codeaurora.org @ 2018-03-20 12:13 UTC (permalink / raw)
  To: linux-arm-kernel

On 2018-03-20 06:23, Christoph Hellwig wrote:
> This is a basically unmaintained driver for historic hardware, and not 
> the
> right place for micro-optimizations.

Sure, I can drop this on the next version.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH v4 1/7] scsi: hpsa: Eliminate duplicate barriers on weakly-ordered archs
  2018-03-20  2:50   ` Sinan Kaya
@ 2018-03-20 14:01     ` Don Brace
  -1 siblings, 0 replies; 26+ messages in thread
From: Don Brace @ 2018-03-20 14:01 UTC (permalink / raw)
  To: Sinan Kaya, linux-scsi, timur, sulrich
  Cc: James E.J. Bottomley, Martin K. Petersen, linux-arm-msm,
	esc.storagedev, linux-kernel, linux-arm-kernel

> -----Original Message-----
> From: Sinan Kaya [mailto:okaya@codeaurora.org]
> Sent: Monday, March 19, 2018 9:50 PM
> To: linux-scsi@vger.kernel.org; timur@codeaurora.org; sulrich@codeaurora.org
> Cc: linux-arm-msm@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> Sinan Kaya <okaya@codeaurora.org>; Don Brace <don.brace@microsemi.com>;
> James E.J. Bottomley <jejb@linux.vnet.ibm.com>; Martin K. Petersen
> <martin.petersen@oracle.com>; esc.storagedev
> <esc.storagedev@microsemi.com>; linux-kernel@vger.kernel.org
> Subject: [PATCH v4 1/7] scsi: hpsa: Eliminate duplicate barriers on weakly-
> ordered archs
> 
> EXTERNAL EMAIL
> 
> 
> Code includes wmb() followed by writel(). writel() already has a
> barrier on some architectures like arm64.
> 
> This ends up CPU observing two barriers back to back before executing
> the register write.
> 
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
> 
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>

Acked-by: Don Brace <don.brace@microsemi.com>

> ---
>  drivers/scsi/hpsa.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
> index 018f980..c7d7e6a 100644
> --- a/drivers/scsi/hpsa.h
> +++ b/drivers/scsi/hpsa.h
> @@ -599,7 +599,7 @@ static unsigned long
> SA5_ioaccel_mode1_completed(struct ctlr_info *h, u8 q)
>                  * but with current driver design this is easiest.
>                  */
>                 wmb();
> -               writel((q << 24) | rq->current_entry, h->vaddr +
> +               writel_relaxed((q << 24) | rq->current_entry, h->vaddr +
>                                 IOACCEL_MODE1_CONSUMER_INDEX);
>                 atomic_dec(&h->commands_outstanding);
>         }
> --
> 2.7.4

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 1/7] scsi: hpsa: Eliminate duplicate barriers on weakly-ordered archs
@ 2018-03-20 14:01     ` Don Brace
  0 siblings, 0 replies; 26+ messages in thread
From: Don Brace @ 2018-03-20 14:01 UTC (permalink / raw)
  To: linux-arm-kernel

> -----Original Message-----
> From: Sinan Kaya [mailto:okaya at codeaurora.org]
> Sent: Monday, March 19, 2018 9:50 PM
> To: linux-scsi at vger.kernel.org; timur at codeaurora.org; sulrich at codeaurora.org
> Cc: linux-arm-msm at vger.kernel.org; linux-arm-kernel at lists.infradead.org;
> Sinan Kaya <okaya@codeaurora.org>; Don Brace <don.brace@microsemi.com>;
> James E.J. Bottomley <jejb@linux.vnet.ibm.com>; Martin K. Petersen
> <martin.petersen@oracle.com>; esc.storagedev
> <esc.storagedev@microsemi.com>; linux-kernel at vger.kernel.org
> Subject: [PATCH v4 1/7] scsi: hpsa: Eliminate duplicate barriers on weakly-
> ordered archs
> 
> EXTERNAL EMAIL
> 
> 
> Code includes wmb() followed by writel(). writel() already has a
> barrier on some architectures like arm64.
> 
> This ends up CPU observing two barriers back to back before executing
> the register write.
> 
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
> 
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>

Acked-by: Don Brace <don.brace@microsemi.com>

> ---
>  drivers/scsi/hpsa.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
> index 018f980..c7d7e6a 100644
> --- a/drivers/scsi/hpsa.h
> +++ b/drivers/scsi/hpsa.h
> @@ -599,7 +599,7 @@ static unsigned long
> SA5_ioaccel_mode1_completed(struct ctlr_info *h, u8 q)
>                  * but with current driver design this is easiest.
>                  */
>                 wmb();
> -               writel((q << 24) | rq->current_entry, h->vaddr +
> +               writel_relaxed((q << 24) | rq->current_entry, h->vaddr +
>                                 IOACCEL_MODE1_CONSUMER_INDEX);
>                 atomic_dec(&h->commands_outstanding);
>         }
> --
> 2.7.4

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 1/7] scsi: hpsa: Eliminate duplicate barriers on weakly-ordered archs
  2018-03-20  2:50   ` Sinan Kaya
@ 2018-03-20 16:51     ` Laurence Oberman
  -1 siblings, 0 replies; 26+ messages in thread
From: Laurence Oberman @ 2018-03-20 16:51 UTC (permalink / raw)
  To: Sinan Kaya, linux-scsi, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Don Brace, James E.J. Bottomley,
	Martin K. Petersen, esc.storagedev, linux-kernel

On Mon, 2018-03-19 at 22:50 -0400, Sinan Kaya wrote:
> Code includes wmb() followed by writel(). writel() already has a
> barrier on some architectures like arm64.
> 
> This ends up CPU observing two barriers back to back before executing
> the register write.
> 
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
> 
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> ---
>  drivers/scsi/hpsa.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
> index 018f980..c7d7e6a 100644
> --- a/drivers/scsi/hpsa.h
> +++ b/drivers/scsi/hpsa.h
> @@ -599,7 +599,7 @@ static unsigned long
> SA5_ioaccel_mode1_completed(struct ctlr_info *h, u8 q)
>  		 * but with current driver design this is easiest.
>  		 */
>  		wmb();
> -		writel((q << 24) | rq->current_entry, h->vaddr +
> +		writel_relaxed((q << 24) | rq->current_entry, h-
> >vaddr +
>  				IOACCEL_MODE1_CONSUMER_INDEX);
>  		atomic_dec(&h->commands_outstanding);
>  	}

This looks like it would work for the x86_64 and arm because of how its
defined architecture specific for the x86_64 and the arm64

I guess its up to Don and the driver folks and if its worth the change.
I am generally not a fan of messing with these barrier things though.

Reviewed-by: Laurence Oberman <loberman@redhat.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 1/7] scsi: hpsa: Eliminate duplicate barriers on weakly-ordered archs
@ 2018-03-20 16:51     ` Laurence Oberman
  0 siblings, 0 replies; 26+ messages in thread
From: Laurence Oberman @ 2018-03-20 16:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 2018-03-19 at 22:50 -0400, Sinan Kaya wrote:
> Code includes wmb() followed by writel(). writel() already has a
> barrier on some architectures like arm64.
> 
> This ends up CPU observing two barriers back to back before executing
> the register write.
> 
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
> 
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> ---
> ?drivers/scsi/hpsa.h | 2 +-
> ?1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
> index 018f980..c7d7e6a 100644
> --- a/drivers/scsi/hpsa.h
> +++ b/drivers/scsi/hpsa.h
> @@ -599,7 +599,7 @@ static unsigned long
> SA5_ioaccel_mode1_completed(struct ctlr_info *h, u8 q)
> ?		?* but with current driver design this is easiest.
> ?		?*/
> ?		wmb();
> -		writel((q << 24) | rq->current_entry, h->vaddr +
> +		writel_relaxed((q << 24) | rq->current_entry, h-
> >vaddr +
> ?				IOACCEL_MODE1_CONSUMER_INDEX);
> ?		atomic_dec(&h->commands_outstanding);
> ?	}

This looks like it would work for the x86_64 and arm because of how its
defined architecture specific for the x86_64 and the arm64

I guess its up to Don and the driver folks and if its worth the change.
I am generally not a fan of messing with these barrier things though.

Reviewed-by: Laurence Oberman <loberman@redhat.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 5/7] scsi: ipr: Eliminate duplicate barriers on weakly-ordered archs
  2018-03-20  2:50   ` Sinan Kaya
@ 2018-05-29 20:45     ` Brian King
  -1 siblings, 0 replies; 26+ messages in thread
From: Brian King @ 2018-05-29 20:45 UTC (permalink / raw)
  To: Sinan Kaya, linux-scsi, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Brian King,
	James E.J. Bottomley, Martin K. Petersen, linux-kernel

On 03/19/2018 09:50 PM, Sinan Kaya wrote:
> Code includes barrier() followed by writel(). writel() already has a
> barrier
> on some architectures like arm64.
> 
> This ends up CPU observing two barriers back to back before executing the
> register write.
> 
> Since code already has an explicit barrier call, changing writeX() to
> writeX_relaxed().
> 
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> ---
>  drivers/scsi/ipr.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
> index e07dd99..209adac 100644
> --- a/drivers/scsi/ipr.c
> +++ b/drivers/scsi/ipr.c
> @@ -762,9 +762,9 @@ static void ipr_mask_and_clear_interrupts(struct ipr_ioa_cfg *ioa_cfg,
> 
>  	/* Set interrupt mask to stop all new interrupts */
>  	if (ioa_cfg->sis64)
> -		writeq(~0, ioa_cfg->regs.set_interrupt_mask_reg);
> +		writeq_relaxed(~0, ioa_cfg->regs.set_interrupt_mask_reg);
>  	else
> -		writel(~0, ioa_cfg->regs.set_interrupt_mask_reg);
> +		writel_relaxed(~0, ioa_cfg->regs.set_interrupt_mask_reg);
> 
>  	/* Clear any pending interrupts */
>  	if (ioa_cfg->sis64)
> @@ -8435,7 +8435,8 @@ static int ipr_reset_enable_ioa(struct ipr_cmnd *ipr_cmd)
>  	wmb();
>  	if (ioa_cfg->sis64) {
>  		/* Set the adapter to the correct endian mode. */
> -		writel(IPR_ENDIAN_SWAP_KEY, ioa_cfg->regs.endian_swap_reg);
> +		writel_relaxed(IPR_ENDIAN_SWAP_KEY,
> +			       ioa_cfg->regs.endian_swap_reg);
>  		int_reg = readl(ioa_cfg->regs.endian_swap_reg);
>  	}
> 

Looks fine to me. Thanks.

Acked-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 5/7] scsi: ipr: Eliminate duplicate barriers on weakly-ordered archs
@ 2018-05-29 20:45     ` Brian King
  0 siblings, 0 replies; 26+ messages in thread
From: Brian King @ 2018-05-29 20:45 UTC (permalink / raw)
  To: linux-arm-kernel

On 03/19/2018 09:50 PM, Sinan Kaya wrote:
> Code includes barrier() followed by writel(). writel() already has a
> barrier
> on some architectures like arm64.
> 
> This ends up CPU observing two barriers back to back before executing the
> register write.
> 
> Since code already has an explicit barrier call, changing writeX() to
> writeX_relaxed().
> 
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> ---
>  drivers/scsi/ipr.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
> index e07dd99..209adac 100644
> --- a/drivers/scsi/ipr.c
> +++ b/drivers/scsi/ipr.c
> @@ -762,9 +762,9 @@ static void ipr_mask_and_clear_interrupts(struct ipr_ioa_cfg *ioa_cfg,
> 
>  	/* Set interrupt mask to stop all new interrupts */
>  	if (ioa_cfg->sis64)
> -		writeq(~0, ioa_cfg->regs.set_interrupt_mask_reg);
> +		writeq_relaxed(~0, ioa_cfg->regs.set_interrupt_mask_reg);
>  	else
> -		writel(~0, ioa_cfg->regs.set_interrupt_mask_reg);
> +		writel_relaxed(~0, ioa_cfg->regs.set_interrupt_mask_reg);
> 
>  	/* Clear any pending interrupts */
>  	if (ioa_cfg->sis64)
> @@ -8435,7 +8435,8 @@ static int ipr_reset_enable_ioa(struct ipr_cmnd *ipr_cmd)
>  	wmb();
>  	if (ioa_cfg->sis64) {
>  		/* Set the adapter to the correct endian mode. */
> -		writel(IPR_ENDIAN_SWAP_KEY, ioa_cfg->regs.endian_swap_reg);
> +		writel_relaxed(IPR_ENDIAN_SWAP_KEY,
> +			       ioa_cfg->regs.endian_swap_reg);
>  		int_reg = readl(ioa_cfg->regs.endian_swap_reg);
>  	}
> 

Looks fine to me. Thanks.

Acked-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2018-05-29 20:45 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-20  2:50 [PATCH v4 0/7] scsi: Eliminate duplicate barriers on weakly-ordered archs Sinan Kaya
2018-03-20  2:50 ` Sinan Kaya
2018-03-20  2:50 ` [PATCH v4 1/7] scsi: hpsa: " Sinan Kaya
2018-03-20  2:50   ` Sinan Kaya
2018-03-20 14:01   ` Don Brace
2018-03-20 14:01     ` Don Brace
2018-03-20 16:51   ` Laurence Oberman
2018-03-20 16:51     ` Laurence Oberman
2018-03-20  2:50 ` [PATCH v4 2/7] scsi: megaraid: eliminate " Sinan Kaya
2018-03-20  2:50   ` Sinan Kaya
2018-03-20  2:50 ` [PATCH v4 3/7] scsi: dpt_i2o: " Sinan Kaya
2018-03-20  2:50   ` Sinan Kaya
2018-03-20 10:23   ` Christoph Hellwig
2018-03-20 10:23     ` Christoph Hellwig
2018-03-20 12:13     ` okaya
2018-03-20 12:13       ` okaya at codeaurora.org
2018-03-20  2:50 ` [PATCH v4 4/7] scsi: lpfc: Eliminate " Sinan Kaya
2018-03-20  2:50   ` Sinan Kaya
2018-03-20  2:50 ` [PATCH v4 5/7] scsi: ipr: " Sinan Kaya
2018-03-20  2:50   ` Sinan Kaya
2018-05-29 20:45   ` Brian King
2018-05-29 20:45     ` Brian King
2018-03-20  2:50 ` [PATCH v4 6/7] scsi: bnx2i: " Sinan Kaya
2018-03-20  2:50   ` Sinan Kaya
2018-03-20  2:50 ` [PATCH v4 7/7] scsi: csiostor: " Sinan Kaya
2018-03-20  2:50   ` Sinan Kaya

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.