* [PATCH v5 1/3] RDMA/bnxt_re: Eliminate duplicate barriers on weakly-ordered archs
       [not found] <1521736009-23387-1-git-send-email-okaya@codeaurora.org>
@ 2018-03-22 16:26 ` Sinan Kaya
  2018-03-22 16:26 ` [PATCH v5 2/3] RDMA/i40iw: " Sinan Kaya
  2018-03-22 16:26 ` [PATCH v5 3/3] RDMA/qedr: eliminate duplicate barriers on weakly-ordered archs #2 Sinan Kaya
  2 siblings, 0 replies; 9+ messages in thread
From: Sinan Kaya @ 2018-03-22 16:26 UTC (permalink / raw)
  To: linux-rdma, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, Selvin Xavier,
	Devesh Sharma, Somnath Kotur, Sriharsha Basavapatna,
	Doug Ledford, Jason Gunthorpe, linux-kernel

The code issues wmb() followed by writel(). On some architectures, such as
arm64, writel() already includes a barrier.

As a result, the CPU executes two barriers back to back before performing
the register write.

Since the code already has an explicit barrier, change writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
index 8329ec6..10f9a26 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
@@ -181,10 +181,11 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw, struct cmdq_base *req,
 
 	/* ring CMDQ DB */
 	wmb();
-	writel(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
-	       rcfw->cmdq_bar_reg_prod_off);
-	writel(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
-	       rcfw->cmdq_bar_reg_trig_off);
+	writel_relaxed(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
+		       rcfw->cmdq_bar_reg_prod_off);
+	writel_relaxed(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
+		       rcfw->cmdq_bar_reg_trig_off);
+	mmiowb();
 done:
 	spin_unlock_irqrestore(&cmdq->lock, flags);
 	/* Return the CREQ response pointer */
-- 
2.7.4
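
The duplicate barrier described in the commit message can be counted in a
small userspace mock. This is only a sketch under stated assumptions: the
wmb(), writel() and writel_relaxed() below are hypothetical stand-ins, not the
kernel helpers; struct fake_regs and barrier_count are invented for
illustration; and modelling writel() as "barrier plus store" reflects the
arm64 behaviour the commit message describes, not every architecture.

```c
#include <stdint.h>

/* Hypothetical userspace stand-ins; the real helpers live in the kernel. */
static unsigned int barrier_count;	/* number of full barriers issued */

static inline void wmb(void)
{
	__sync_synchronize();		/* GCC/Clang full-barrier builtin */
	barrier_count++;
}

static inline void writel_relaxed(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;			/* plain store, no implied barrier */
}

static inline void writel(uint32_t val, volatile uint32_t *addr)
{
	wmb();				/* models the barrier inside writel() */
	writel_relaxed(val, addr);
}

/* Fake register file standing in for the device BAR (assumption). */
struct fake_regs {
	volatile uint32_t prod;
	volatile uint32_t trig;
};

/* Shape of the code before the patch: wmb() plus two writel() calls. */
static void ring_db_before(struct fake_regs *r, uint32_t prod, uint32_t trig)
{
	wmb();
	writel(prod, &r->prod);
	writel(trig, &r->trig);
}

/* Shape of the code after the patch: one wmb(), then relaxed writes. */
static void ring_db_after(struct fake_regs *r, uint32_t prod, uint32_t trig)
{
	wmb();
	writel_relaxed(prod, &r->prod);
	writel_relaxed(trig, &r->trig);
}
```

In this mock, ring_db_before() issues three barriers and ring_db_after()
issues one, while both leave the same values in the fake registers.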


* [PATCH v5 2/3] RDMA/i40iw: Eliminate duplicate barriers on weakly-ordered archs
       [not found] <1521736009-23387-1-git-send-email-okaya@codeaurora.org>
  2018-03-22 16:26 ` [PATCH v5 1/3] RDMA/bnxt_re: Eliminate duplicate barriers on weakly-ordered archs Sinan Kaya
@ 2018-03-22 16:26 ` Sinan Kaya
  2018-03-23 19:15   ` Sinan Kaya
  2018-03-22 16:26 ` [PATCH v5 3/3] RDMA/qedr: eliminate duplicate barriers on weakly-ordered archs #2 Sinan Kaya
  2 siblings, 1 reply; 9+ messages in thread
From: Sinan Kaya @ 2018-03-22 16:26 UTC (permalink / raw)
  To: linux-rdma, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, Faisal Latif,
	Shiraz Saleem, Doug Ledford, Jason Gunthorpe, linux-kernel

The code issues wmb() followed by writel(). On some architectures, such as
arm64, writel() already includes a barrier.

As a result, the CPU executes two barriers back to back before performing
the register write.

Create a new wrapper function, i40iw_wr32_relaxed(), that uses the relaxed
write operator, and use it wherever a write follows a wmb().

Since the code already has an explicit barrier, change writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/infiniband/hw/i40iw/i40iw_ctrl.c  |  6 ++++--
 drivers/infiniband/hw/i40iw/i40iw_osdep.h |  1 +
 drivers/infiniband/hw/i40iw/i40iw_uk.c    |  3 ++-
 drivers/infiniband/hw/i40iw/i40iw_utils.c | 11 +++++++++++
 4 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/i40iw/i40iw_ctrl.c b/drivers/infiniband/hw/i40iw/i40iw_ctrl.c
index c74fd33..47f473e 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_ctrl.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_ctrl.c
@@ -706,9 +706,11 @@ static void i40iw_sc_ccq_arm(struct i40iw_sc_cq *ccq)
 	wmb();       /* make sure shadow area is updated before arming */
 
 	if (ccq->dev->is_pf)
-		i40iw_wr32(ccq->dev->hw, I40E_PFPE_CQARM, ccq->cq_uk.cq_id);
+		i40iw_wr32_relaxed(ccq->dev->hw, I40E_PFPE_CQARM,
+				   ccq->cq_uk.cq_id);
 	else
-		i40iw_wr32(ccq->dev->hw, I40E_VFPE_CQARM1, ccq->cq_uk.cq_id);
+		i40iw_wr32_relaxed(ccq->dev->hw, I40E_VFPE_CQARM1,
+				   ccq->cq_uk.cq_id);
 }
 
 /**
diff --git a/drivers/infiniband/hw/i40iw/i40iw_osdep.h b/drivers/infiniband/hw/i40iw/i40iw_osdep.h
index f27be3e..e06f4b9 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_osdep.h
+++ b/drivers/infiniband/hw/i40iw/i40iw_osdep.h
@@ -213,5 +213,6 @@ void i40iw_hw_stats_start_timer(struct i40iw_sc_vsi *vsi);
 void i40iw_hw_stats_stop_timer(struct i40iw_sc_vsi *vsi);
 #define i40iw_mmiowb() mmiowb()
 void i40iw_wr32(struct i40iw_hw *hw, u32 reg, u32 value);
+void i40iw_wr32_relaxed(struct i40iw_hw *hw, u32 reg, u32 value);
 u32  i40iw_rd32(struct i40iw_hw *hw, u32 reg);
 #endif				/* _I40IW_OSDEP_H_ */
diff --git a/drivers/infiniband/hw/i40iw/i40iw_uk.c b/drivers/infiniband/hw/i40iw/i40iw_uk.c
index 8afa5a6..f936fc2 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_uk.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_uk.c
@@ -723,7 +723,8 @@ static void i40iw_cq_request_notification(struct i40iw_cq_uk *cq,
 
 	wmb(); /* make sure WQE is populated before valid bit is set */
 
-	writel(cq->cq_id, cq->cqe_alloc_reg);
+	writel_relaxed(cq->cq_id, cq->cqe_alloc_reg);
+	mmiowb();
 }
 
 /**
diff --git a/drivers/infiniband/hw/i40iw/i40iw_utils.c b/drivers/infiniband/hw/i40iw/i40iw_utils.c
index ddc1056..99aa6f8 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_utils.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_utils.c
@@ -125,6 +125,17 @@ inline void i40iw_wr32(struct i40iw_hw *hw, u32 reg, u32 value)
 }
 
 /**
+ * i40iw_wr32_relaxed - write 32 bits to hw register without ordering
+ * @hw: hardware information including registers
+ * @reg: register offset
+ * @value: value to write to register
+ */
+inline void i40iw_wr32_relaxed(struct i40iw_hw *hw, u32 reg, u32 value)
+{
+	writel_relaxed(value, hw->hw_addr + reg);
+}
+
+/**
  * i40iw_rd32 - read a 32 bit hw register
  * @hw: hardware information including registers
  * @reg: register offset
-- 
2.7.4
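
The wrapper pair this patch introduces (an ordered write and a relaxed write
over the same base-plus-offset addressing) can be sketched in userspace as
follows. Everything here is an assumption made for illustration: struct
fake_hw stands in for struct i40iw_hw, and the mock_* helpers are stubs, not
the kernel's writel(), writel_relaxed() or readl().

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct i40iw_hw: only the mapped base matters. */
struct fake_hw {
	uint8_t *hw_addr;
};

/* Userspace stubs, NOT the kernel implementations. */
static inline void mock_writel_relaxed(uint32_t val, void *addr)
{
	memcpy(addr, &val, sizeof(val));	/* plain store, no barrier */
}

static inline void mock_writel(uint32_t val, void *addr)
{
	__sync_synchronize();			/* models the barrier in writel() */
	mock_writel_relaxed(val, addr);
}

static inline uint32_t mock_readl(const void *addr)
{
	uint32_t val;

	memcpy(&val, addr, sizeof(val));
	return val;
}

/* Ordered write: a barrier is issued before the store. */
static inline void wr32(struct fake_hw *hw, uint32_t reg, uint32_t value)
{
	mock_writel(value, hw->hw_addr + reg);
}

/* Relaxed write: the caller must already have issued the barrier it needs. */
static inline void wr32_relaxed(struct fake_hw *hw, uint32_t reg, uint32_t value)
{
	mock_writel_relaxed(value, hw->hw_addr + reg);
}

static inline uint32_t rd32(struct fake_hw *hw, uint32_t reg)
{
	return mock_readl(hw->hw_addr + reg);
}
```

Both wrappers store the same value at the same offset; the only difference
is whether the barrier is implied by the call or left to the caller.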


* [PATCH v5 3/3] RDMA/qedr: eliminate duplicate barriers on weakly-ordered archs #2
       [not found] <1521736009-23387-1-git-send-email-okaya@codeaurora.org>
  2018-03-22 16:26 ` [PATCH v5 1/3] RDMA/bnxt_re: Eliminate duplicate barriers on weakly-ordered archs Sinan Kaya
  2018-03-22 16:26 ` [PATCH v5 2/3] RDMA/i40iw: " Sinan Kaya
@ 2018-03-22 16:26 ` Sinan Kaya
  2018-04-03  2:29   ` Sinan Kaya
  2 siblings, 1 reply; 9+ messages in thread
From: Sinan Kaya @ 2018-03-22 16:26 UTC (permalink / raw)
  To: linux-rdma, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, Michal Kalderon,
	Ariel Elior, Doug Ledford, Jason Gunthorpe, linux-kernel

The code issues wmb() followed by writel() in multiple places. On some
architectures, such as arm64, writel() already includes a barrier.

As a result, the CPU executes two barriers back to back before performing
the register write.

Since the code already has an explicit barrier, change writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/infiniband/hw/qedr/verbs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index 53f00db..d6bd950 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -860,7 +860,7 @@ static void doorbell_cq(struct qedr_cq *cq, u32 cons, u8 flags)
 	wmb();
 	cq->db.data.agg_flags = flags;
 	cq->db.data.value = cpu_to_le32(cons);
-	writeq(cq->db.raw, cq->db_addr);
+	writeq_relaxed(cq->db.raw, cq->db_addr);
 
 	/* Make sure write would stick */
 	mmiowb();
@@ -3338,7 +3338,7 @@ int qedr_post_recv(struct ib_qp *ibqp, struct ib_recv_wr *wr,
 
 		qp->rq.db_data.data.value++;
 
-		writel(qp->rq.db_data.raw, qp->rq.db);
+		writel_relaxed(qp->rq.db_data.raw, qp->rq.db);
 
 		/* Make sure write sticks */
 		mmiowb();
-- 
2.7.4


* Re: [PATCH v5 2/3] RDMA/i40iw: Eliminate duplicate barriers on weakly-ordered archs
  2018-03-22 16:26 ` [PATCH v5 2/3] RDMA/i40iw: " Sinan Kaya
@ 2018-03-23 19:15   ` Sinan Kaya
  0 siblings, 0 replies; 9+ messages in thread
From: Sinan Kaya @ 2018-03-23 19:15 UTC (permalink / raw)
  To: linux-rdma, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Faisal Latif, Shiraz Saleem,
	Doug Ledford, Jason Gunthorpe, linux-kernel

On 3/22/2018 12:26 PM, Sinan Kaya wrote:
> +++ b/drivers/infiniband/hw/i40iw/i40iw_ctrl.c
> @@ -706,9 +706,11 @@ static void i40iw_sc_ccq_arm(struct i40iw_sc_cq *ccq)
>  	wmb();       /* make sure shadow area is updated before arming */
>  
>  	if (ccq->dev->is_pf)
> -		i40iw_wr32(ccq->dev->hw, I40E_PFPE_CQARM, ccq->cq_uk.cq_id);
> +		i40iw_wr32_relaxed(ccq->dev->hw, I40E_PFPE_CQARM,
> +				   ccq->cq_uk.cq_id);
>  	else
> -		i40iw_wr32(ccq->dev->hw, I40E_VFPE_CQARM1, ccq->cq_uk.cq_id);
> +		i40iw_wr32_relaxed(ccq->dev->hw, I40E_VFPE_CQARM1,
> +				   ccq->cq_uk.cq_id);

Do we want an mmiowb() here?

>  }


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


* Re: [PATCH v5 3/3] RDMA/qedr: eliminate duplicate barriers on weakly-ordered archs #2
  2018-03-22 16:26 ` [PATCH v5 3/3] RDMA/qedr: eliminate duplicate barriers on weakly-ordered archs #2 Sinan Kaya
@ 2018-04-03  2:29   ` Sinan Kaya
  2018-04-03  7:42     ` Kalderon, Michal
  0 siblings, 1 reply; 9+ messages in thread
From: Sinan Kaya @ 2018-04-03  2:29 UTC (permalink / raw)
  To: linux-rdma, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Michal Kalderon, Ariel Elior,
	Doug Ledford, Jason Gunthorpe, linux-kernel

On 3/22/2018 12:26 PM, Sinan Kaya wrote:
> @@ -860,7 +860,7 @@ static void doorbell_cq(struct qedr_cq *cq, u32 cons, u8 flags)
>  	wmb();
>  	cq->db.data.agg_flags = flags;
>  	cq->db.data.value = cpu_to_le32(cons);
> -	writeq(cq->db.raw, cq->db_addr);
> +	writeq_relaxed(cq->db.raw, cq->db_addr);

Given the direction to get rid of wmb() in front of writeX() functions, I have been
reviewing this code. Under normal circumstances, I could get rid of all the wmb()
calls as follows.

However, I am now having doubts. Are these wmb() calls also used as SMP barriers?
I can't find any smp_Xmb() in the drivers/infiniband/hw/qedr directory.

static void doorbell_cq(struct qedr_cq *cq, u32 cons, u8 flags)
 {
-       /* Flush data before signalling doorbell */
-       wmb();
        cq->db.data.agg_flags = flags;
        cq->db.data.value = cpu_to_le32(cons);
        writeq(cq->db.raw, cq->db_addr);
@@ -1870,8 +1868,7 @@ static int qedr_update_qp_state(struct qedr_dev *dev,
                         */

                        if (rdma_protocol_roce(&dev->ibdev, 1)) {
-                               wmb();
-                               writel_relaxed(qp->rq.db_data.raw, qp->rq.db);
+                               writel(qp->rq.db_data.raw, qp->rq.db);
                                /* Make sure write takes effect */
                                mmiowb();
                        }
@@ -3275,8 +3272,7 @@ int qedr_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
         * unchanged). For performance reasons we avoid checking for this
         * redundant doorbell.
         */
-       wmb();
-       writel_relaxed(qp->sq.db_data.raw, qp->sq.db);
+       writel(qp->sq.db_data.raw, qp->sq.db);

        /* Make sure write sticks */
        mmiowb();
@@ -3362,9 +3358,6 @@ int qedr_post_recv(struct ib_qp *ibqp, struct ib_recv_wr *wr,

                qedr_inc_sw_prod(&qp->rq);

-               /* Flush all the writes before signalling doorbell */
-               wmb();






* RE: [PATCH v5 3/3] RDMA/qedr: eliminate duplicate barriers on weakly-ordered archs #2
  2018-04-03  2:29   ` Sinan Kaya
@ 2018-04-03  7:42     ` Kalderon, Michal
  2018-04-03 17:47       ` Sinan Kaya
  2018-04-03 20:03       ` Jason Gunthorpe
  0 siblings, 2 replies; 9+ messages in thread
From: Kalderon, Michal @ 2018-04-03  7:42 UTC (permalink / raw)
  To: Sinan Kaya, linux-rdma, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Elior, Ariel, Doug Ledford,
	Jason Gunthorpe, linux-kernel, Elior, Ariel

> From: Sinan Kaya [mailto:okaya@codeaurora.org]
> Sent: Tuesday, April 03, 2018 5:30 AM
> To: linux-rdma@vger.kernel.org; timur@codeaurora.org;
> sulrich@codeaurora.org
> Cc: linux-arm-msm@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> Kalderon, Michal <Michal.Kalderon@cavium.com>; Elior, Ariel
> <Ariel.Elior@cavium.com>; Doug Ledford <dledford@redhat.com>; Jason
> Gunthorpe <jgg@ziepe.ca>; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v5 3/3] RDMA/qedr: eliminate duplicate barriers on
> weakly-ordered archs #2
> 
> On 3/22/2018 12:26 PM, Sinan Kaya wrote:
> > @@ -860,7 +860,7 @@ static void doorbell_cq(struct qedr_cq *cq, u32
> cons, u8 flags)
> >  	wmb();
> >  	cq->db.data.agg_flags = flags;
> >  	cq->db.data.value = cpu_to_le32(cons);
> > -	writeq(cq->db.raw, cq->db_addr);
> > +	writeq_relaxed(cq->db.raw, cq->db_addr);
> 
> Given the direction to get rid of wmb() in front of writeX() functions, I have
> been reviewing this code. Under normal circumstances, I can get rid of all
> wmb() as follows.
> 
> However, I started having my doubts now. Are these wmb() used as a SMP
> barrier too?
> I can't find any smp_Xmb() in drivers/infiniband/hw/qedr directory.

Your doubts are well placed. Your initial patch series changed writel to writel_relaxed;
simply removing the wmb is dangerous. The wmb before writel is used to make sure the
HW observes the changes in memory before we trigger the doorbell. SMP barriers here
wouldn't suffice, as even on a single processor we still need to make sure memory is
updated and does not remain in cache when the HW accesses it.
Reviewing the qedr barriers, I can find places where this may not have been necessary,
but you definitely can't simply remove these wmb barriers.

> 
> static void doorbell_cq(struct qedr_cq *cq, u32 cons, u8 flags)  {
> -       /* Flush data before signalling doorbell */
> -       wmb();
>         cq->db.data.agg_flags = flags;
>         cq->db.data.value = cpu_to_le32(cons);
>         writeq(cq->db.raw, cq->db_addr); @@ -1870,8 +1868,7 @@ static int
> qedr_update_qp_state(struct qedr_dev *dev,
>                          */
> 
>                         if (rdma_protocol_roce(&dev->ibdev, 1)) {
> -                               wmb();
> -                               writel_relaxed(qp->rq.db_data.raw, qp->rq.db);
> +                               writel(qp->rq.db_data.raw, qp->rq.db);
>                                 /* Make sure write takes effect */
>                                 mmiowb();
>                         }
> @@ -3275,8 +3272,7 @@ int qedr_post_send(struct ib_qp *ibqp, struct
> ib_send_wr *wr,
>          * unchanged). For performance reasons we avoid checking for this
>          * redundant doorbell.
>          */
> -       wmb();
> -       writel_relaxed(qp->sq.db_data.raw, qp->sq.db);
> +       writel(qp->sq.db_data.raw, qp->sq.db);
> 
>         /* Make sure write sticks */
>         mmiowb();
> @@ -3362,9 +3358,6 @@ int qedr_post_recv(struct ib_qp *ibqp, struct
> ib_recv_wr *wr,
> 
>                 qedr_inc_sw_prod(&qp->rq);
> 
> -               /* Flush all the writes before signalling doorbell */
> -               wmb();
> 

* Re: [PATCH v5 3/3] RDMA/qedr: eliminate duplicate barriers on weakly-ordered archs #2
  2018-04-03  7:42     ` Kalderon, Michal
@ 2018-04-03 17:47       ` Sinan Kaya
  2018-04-03 20:03       ` Jason Gunthorpe
  1 sibling, 0 replies; 9+ messages in thread
From: Sinan Kaya @ 2018-04-03 17:47 UTC (permalink / raw)
  To: Kalderon, Michal, linux-rdma, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Elior, Ariel, Doug Ledford,
	Jason Gunthorpe, linux-kernel

On 4/3/2018 3:42 AM, Kalderon, Michal wrote:
> The wmb before writel are used to make sure the
> HW observes the changes in memory before we trigger the doorbell. 

According to Linus, writel() guarantees observability. No extra
barrier is necessary.

https://www.mail-archive.com/netdev@vger.kernel.org/msg225806.html

There shouldn't be any wmb() in drivers unless it is used for a
very well-known reason. 

APIs like readX() and writeX() guarantee observability.


* Re: [PATCH v5 3/3] RDMA/qedr: eliminate duplicate barriers on weakly-ordered archs #2
  2018-04-03  7:42     ` Kalderon, Michal
  2018-04-03 17:47       ` Sinan Kaya
@ 2018-04-03 20:03       ` Jason Gunthorpe
  2018-04-04 11:54         ` Kalderon, Michal
  1 sibling, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2018-04-03 20:03 UTC (permalink / raw)
  To: Kalderon, Michal
  Cc: Sinan Kaya, linux-rdma, timur, sulrich, linux-arm-msm,
	linux-arm-kernel, Elior, Ariel, Doug Ledford, linux-kernel

On Tue, Apr 03, 2018 at 07:42:28AM +0000, Kalderon, Michal wrote:
> > From: Sinan Kaya [mailto:okaya@codeaurora.org]
> > Sent: Tuesday, April 03, 2018 5:30 AM
> > To: linux-rdma@vger.kernel.org; timur@codeaurora.org;
> > sulrich@codeaurora.org
> > Cc: linux-arm-msm@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> > Kalderon, Michal <Michal.Kalderon@cavium.com>; Elior, Ariel
> > <Ariel.Elior@cavium.com>; Doug Ledford <dledford@redhat.com>; Jason
> > Gunthorpe <jgg@ziepe.ca>; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH v5 3/3] RDMA/qedr: eliminate duplicate barriers on
> > weakly-ordered archs #2
> > 
> > On 3/22/2018 12:26 PM, Sinan Kaya wrote:
> > > @@ -860,7 +860,7 @@ static void doorbell_cq(struct qedr_cq *cq, u32
> > cons, u8 flags)
> > >  	wmb();
> > >  	cq->db.data.agg_flags = flags;
> > >  	cq->db.data.value = cpu_to_le32(cons);
> > > -	writeq(cq->db.raw, cq->db_addr);
> > > +	writeq_relaxed(cq->db.raw, cq->db_addr);
> > 
> > Given the direction to get rid of wmb() in front of writeX() functions, I have
> > been reviewing this code. Under normal circumstances, I can get rid of all
> > wmb() as follows.
> > 
> > However, I started having my doubts now. Are these wmb() used as a SMP
> > barrier too?
> > I can't find any smp_Xmb() in drivers/infiniband/hw/qedr directory.
> 
> Your doubts are in place. You initial patch series modified writel to writel_relaxed
> Simply removing the wmb is dangerous. The wmb before writel are used to make sure the
> HW observes the changes in memory before we trigger the doorbell. Smp barriers here
> wouldn't suffice, as on a single processor. we still need to make sure memory is updated
> and not remained in cache when HW accesses it.
> Reviewing the qedr barriers, I can find places where this may have not been necessary, 
> But definitely you can't simply remove this wmb barriers. 

As Sinan said, the consensus is that wmb();writel(); is redundant if
the only purpose of the wmb is to order DMA and system memory.

So can you review these patches on that basis please? Is the WMB doing
something else, eg SMP related? If yes, please send a patch adding
appropriate comments.

Thanks,
Jason


* RE: [PATCH v5 3/3] RDMA/qedr: eliminate duplicate barriers on weakly-ordered archs #2
  2018-04-03 20:03       ` Jason Gunthorpe
@ 2018-04-04 11:54         ` Kalderon, Michal
  0 siblings, 0 replies; 9+ messages in thread
From: Kalderon, Michal @ 2018-04-04 11:54 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Sinan Kaya, linux-rdma, timur, sulrich, linux-arm-msm,
	linux-arm-kernel, Elior, Ariel, Doug Ledford, linux-kernel

> From: Jason Gunthorpe [mailto:jgg@ziepe.ca]
> Sent: Tuesday, April 03, 2018 11:04 PM
> 
> On Tue, Apr 03, 2018 at 07:42:28AM +0000, Kalderon, Michal wrote:
> > > From: Sinan Kaya [mailto:okaya@codeaurora.org]
> > > Sent: Tuesday, April 03, 2018 5:30 AM
> > > To: linux-rdma@vger.kernel.org; timur@codeaurora.org;
> > > sulrich@codeaurora.org
> > > Cc: linux-arm-msm@vger.kernel.org;
> > > linux-arm-kernel@lists.infradead.org;
> > > Kalderon, Michal <Michal.Kalderon@cavium.com>; Elior, Ariel
> > > <Ariel.Elior@cavium.com>; Doug Ledford <dledford@redhat.com>; Jason
> > > Gunthorpe <jgg@ziepe.ca>; linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH v5 3/3] RDMA/qedr: eliminate duplicate barriers
> > > on weakly-ordered archs #2
> > >
> > > On 3/22/2018 12:26 PM, Sinan Kaya wrote:
> > > > @@ -860,7 +860,7 @@ static void doorbell_cq(struct qedr_cq *cq,
> > > > u32
> > > cons, u8 flags)
> > > >  	wmb();
> > > >  	cq->db.data.agg_flags = flags;
> > > >  	cq->db.data.value = cpu_to_le32(cons);
> > > > -	writeq(cq->db.raw, cq->db_addr);
> > > > +	writeq_relaxed(cq->db.raw, cq->db_addr);
> > >
> > > Given the direction to get rid of wmb() in front of writeX()
> > > functions, I have been reviewing this code. Under normal
> > > circumstances, I can get rid of all
> > > wmb() as follows.
> > >
> > > However, I started having my doubts now. Are these wmb() used as a
> > > SMP barrier too?
> > > I can't find any smp_Xmb() in drivers/infiniband/hw/qedr directory.
> >
> > Your doubts are in place. You initial patch series modified writel to
> > writel_relaxed Simply removing the wmb is dangerous. The wmb before
> > writel are used to make sure the HW observes the changes in memory
> > before we trigger the doorbell. Smp barriers here wouldn't suffice, as
> > on a single processor. we still need to make sure memory is updated and
> not remained in cache when HW accesses it.
> > Reviewing the qedr barriers, I can find places where this may have not
> > been necessary, But definitely you can't simply remove this wmb barriers.
> 
> As Sinan said, the consensus is that wmb();writel(); is redundant if the only
> purpose of the wmb is to order DMA and system memory.
> 
> So can you review these patches on that basis please? Is the WMB doing
> something else, eg SMP related? If yes, please send a patch adding
> appropriate comments.

Thanks, Sinan and Jason, for the references and explanations. I've reviewed the wmb
usages in qedr and am about to send a patch that replaces two of them with smp_wmb
and completely removes two others that, given your explanation, turned out to be
redundant. Thanks.

> 
> Thanks,
> Jason
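
For readers unfamiliar with the CPU-to-CPU case mentioned above, the pairing
that smp_wmb() and smp_rmb() provide can be approximated in userspace with C11
fences. This is a rough analogy only, not the qedr change itself: struct
mailbox and both functions are hypothetical, and C11 release and acquire
fences are somewhat stronger than the kernel's smp_wmb() and smp_rmb().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared structure: a payload plus a flag another CPU polls. */
struct mailbox {
	uint32_t payload;	/* ordinary memory */
	atomic_uint ready;	/* set once the payload is valid */
};

/* Writer side: make the payload visible before the flag. */
static void mailbox_publish(struct mailbox *m, uint32_t value)
{
	m->payload = value;
	atomic_thread_fence(memory_order_release);	/* role of smp_wmb() */
	atomic_store_explicit(&m->ready, 1, memory_order_relaxed);
}

/* Reader side: check the flag, then read the payload after the fence. */
static bool mailbox_consume(struct mailbox *m, uint32_t *out)
{
	if (!atomic_load_explicit(&m->ready, memory_order_relaxed))
		return false;
	atomic_thread_fence(memory_order_acquire);	/* role of smp_rmb() */
	*out = m->payload;
	return true;
}
```

No device is involved here: both sides are CPUs touching ordinary memory,
which is the situation where an SMP barrier rather than wmb() is the
appropriate tool.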
