linux-kernel.vger.kernel.org archive mirror
* [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index
@ 2020-12-06 10:57 Eli Cohen
  2020-12-07  2:51 ` Jason Wang
  2020-12-08 21:45 ` Michael S. Tsirkin
  0 siblings, 2 replies; 11+ messages in thread
From: Eli Cohen @ 2020-12-06 10:57 UTC (permalink / raw)
  To: mst, jasowang, virtualization, linux-kernel; +Cc: lulu, Eli Cohen

Make sure to put a write memory barrier after updating the CQ consumer index
so the hardware knows that there are available CQE slots in the queue.

Failure to do this can cause the RX doorbell record to be updated before the
CQ consumer index, resulting in CQ overrun.

Change-Id: Ib0ae4c118cce524c9f492b32569179f3c1f04cc1
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Eli Cohen <elic@nvidia.com>
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 1f4089c6f9d7..295f46eea2a5 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -478,6 +478,11 @@ static int mlx5_vdpa_poll_one(struct mlx5_vdpa_cq *vcq)
 static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue *mvq, int num)
 {
 	mlx5_cq_set_ci(&mvq->cq.mcq);
+
+	/* make sure CQ consumer update is visible to the hardware before updating
+	 * RX doorbell record.
+	 */
+	wmb();
 	rx_post(&mvq->vqqp, num);
 	if (mvq->event_cb.callback)
 		mvq->event_cb.callback(mvq->event_cb.private);
-- 
2.27.0
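To make the hazard concrete, here is a minimal annotated sketch of the patched
path. The calls are the ones from the diff above; the description of what the
device does with each doorbell record is a reading of the commit message, not
taken from hardware documentation:

	static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue *mvq, int num)
	{
		/* Store 1: advance the CQ consumer index in the CQ doorbell
		 * record (a plain store to coherent RAM), freeing CQE slots.
		 */
		mlx5_cq_set_ci(&mvq->cq.mcq);

		/* Without a barrier, a weakly ordered CPU (or the compiler)
		 * may make store 2 visible before store 1; the device would
		 * then see fresh RX buffers while still believing the CQ is
		 * full, and overrun it.
		 */
		wmb();

		/* Store 2: update the RX doorbell record so the device sees
		 * the newly posted receive buffers.
		 */
		rx_post(&mvq->vqqp, num);

		if (mvq->event_cb.callback)
			mvq->event_cb.callback(mvq->event_cb.private);
	}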


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index
  2020-12-06 10:57 [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index Eli Cohen
@ 2020-12-07  2:51 ` Jason Wang
  2020-12-08  9:15   ` Eli Cohen
  2020-12-08 21:45 ` Michael S. Tsirkin
  1 sibling, 1 reply; 11+ messages in thread
From: Jason Wang @ 2020-12-07  2:51 UTC (permalink / raw)
  To: Eli Cohen, mst, virtualization, linux-kernel; +Cc: lulu


On 2020/12/6 6:57 PM, Eli Cohen wrote:
> Make sure to put a write memory barrier after updating the CQ consumer index
> so the hardware knows that there are available CQE slots in the queue.
>
> Failure to do this can cause the RX doorbell record to be updated before the
> CQ consumer index, resulting in CQ overrun.
>
> Change-Id: Ib0ae4c118cce524c9f492b32569179f3c1f04cc1
> Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
> Signed-off-by: Eli Cohen <elic@nvidia.com>
> ---
>   drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index 1f4089c6f9d7..295f46eea2a5 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -478,6 +478,11 @@ static int mlx5_vdpa_poll_one(struct mlx5_vdpa_cq *vcq)
>   static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue *mvq, int num)
>   {
>   	mlx5_cq_set_ci(&mvq->cq.mcq);
> +
> +	/* make sure CQ consumer update is visible to the hardware before updating
> +	 * RX doorbell record.
> +	 */
> +	wmb();
>   	rx_post(&mvq->vqqp, num);
>   	if (mvq->event_cb.callback)
>   		mvq->event_cb.callback(mvq->event_cb.private);


Acked-by: Jason Wang <jasowang@redhat.com>



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index
  2020-12-07  2:51 ` Jason Wang
@ 2020-12-08  9:15   ` Eli Cohen
  2020-12-08 13:53     ` Michael S. Tsirkin
  0 siblings, 1 reply; 11+ messages in thread
From: Eli Cohen @ 2020-12-08  9:15 UTC (permalink / raw)
  To: Jason Wang; +Cc: mst, virtualization, linux-kernel, lulu, elic

On Mon, Dec 07, 2020 at 10:51:44AM +0800, Jason Wang wrote:
> 
> On 2020/12/6 6:57 PM, Eli Cohen wrote:
> > Make sure to put a write memory barrier after updating the CQ consumer index
> > so the hardware knows that there are available CQE slots in the queue.
> > 
> > Failure to do this can cause the RX doorbell record to be updated before the
> > CQ consumer index, resulting in CQ overrun.
> > 
> > Change-Id: Ib0ae4c118cce524c9f492b32569179f3c1f04cc1

Michael, I left this gerrit ID by mistake. Can you remove it before
merging?

> > Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
> > Signed-off-by: Eli Cohen <elic@nvidia.com>
> > ---
> >   drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++++
> >   1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > index 1f4089c6f9d7..295f46eea2a5 100644
> > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > @@ -478,6 +478,11 @@ static int mlx5_vdpa_poll_one(struct mlx5_vdpa_cq *vcq)
> >   static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue *mvq, int num)
> >   {
> >   	mlx5_cq_set_ci(&mvq->cq.mcq);
> > +
> > +	/* make sure CQ consumer update is visible to the hardware before updating
> > +	 * RX doorbell record.
> > +	 */
> > +	wmb();
> >   	rx_post(&mvq->vqqp, num);
> >   	if (mvq->event_cb.callback)
> >   		mvq->event_cb.callback(mvq->event_cb.private);
> 
> 
> Acked-by: Jason Wang <jasowang@redhat.com>
> 
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index
  2020-12-08  9:15   ` Eli Cohen
@ 2020-12-08 13:53     ` Michael S. Tsirkin
  0 siblings, 0 replies; 11+ messages in thread
From: Michael S. Tsirkin @ 2020-12-08 13:53 UTC (permalink / raw)
  To: Eli Cohen; +Cc: Jason Wang, virtualization, linux-kernel, lulu

On Tue, Dec 08, 2020 at 11:15:00AM +0200, Eli Cohen wrote:
> On Mon, Dec 07, 2020 at 10:51:44AM +0800, Jason Wang wrote:
> > 
> > On 2020/12/6 6:57 PM, Eli Cohen wrote:
> > > Make sure to put a write memory barrier after updating the CQ consumer index
> > > so the hardware knows that there are available CQE slots in the queue.
> > > 
> > > Failure to do this can cause the RX doorbell record to be updated before the
> > > CQ consumer index, resulting in CQ overrun.
> > > 
> > > Change-Id: Ib0ae4c118cce524c9f492b32569179f3c1f04cc1
> 
> Michael, I left this gerrit ID by mistake. Can you remove it before
> merging?

No problem.

> > > Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
> > > Signed-off-by: Eli Cohen <elic@nvidia.com>
> > > ---
> > >   drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++++
> > >   1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > index 1f4089c6f9d7..295f46eea2a5 100644
> > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > @@ -478,6 +478,11 @@ static int mlx5_vdpa_poll_one(struct mlx5_vdpa_cq *vcq)
> > >   static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue *mvq, int num)
> > >   {
> > >   	mlx5_cq_set_ci(&mvq->cq.mcq);
> > > +
> > > +	/* make sure CQ consumer update is visible to the hardware before updating
> > > +	 * RX doorbell record.
> > > +	 */
> > > +	wmb();
> > >   	rx_post(&mvq->vqqp, num);
> > >   	if (mvq->event_cb.callback)
> > >   		mvq->event_cb.callback(mvq->event_cb.private);
> > 
> > 
> > Acked-by: Jason Wang <jasowang@redhat.com>
> > 
> > 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index
  2020-12-06 10:57 [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index Eli Cohen
  2020-12-07  2:51 ` Jason Wang
@ 2020-12-08 21:45 ` Michael S. Tsirkin
  2020-12-09  6:02   ` Eli Cohen
  1 sibling, 1 reply; 11+ messages in thread
From: Michael S. Tsirkin @ 2020-12-08 21:45 UTC (permalink / raw)
  To: Eli Cohen; +Cc: jasowang, virtualization, linux-kernel, lulu

On Sun, Dec 06, 2020 at 12:57:19PM +0200, Eli Cohen wrote:
> Make sure to put a write memory barrier after updating the CQ consumer index
> so the hardware knows that there are available CQE slots in the queue.
> 
> Failure to do this can cause the RX doorbell record to be updated before the
> CQ consumer index, resulting in CQ overrun.
> 
> Change-Id: Ib0ae4c118cce524c9f492b32569179f3c1f04cc1
> Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
> Signed-off-by: Eli Cohen <elic@nvidia.com>

Aren't both memory writes? And given that, isn't dma_wmb() sufficient
here?


> ---
>  drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index 1f4089c6f9d7..295f46eea2a5 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -478,6 +478,11 @@ static int mlx5_vdpa_poll_one(struct mlx5_vdpa_cq *vcq)
>  static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue *mvq, int num)
>  {
>  	mlx5_cq_set_ci(&mvq->cq.mcq);
> +
> +	/* make sure CQ consumer update is visible to the hardware before updating
> +	 * RX doorbell record.
> +	 */
> +	wmb();
>  	rx_post(&mvq->vqqp, num);
>  	if (mvq->event_cb.callback)
>  		mvq->event_cb.callback(mvq->event_cb.private);
> -- 
> 2.27.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index
  2020-12-08 21:45 ` Michael S. Tsirkin
@ 2020-12-09  6:02   ` Eli Cohen
  2020-12-09  6:46     ` Michael S. Tsirkin
  0 siblings, 1 reply; 11+ messages in thread
From: Eli Cohen @ 2020-12-09  6:02 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: jasowang, virtualization, linux-kernel, lulu

On Tue, Dec 08, 2020 at 04:45:04PM -0500, Michael S. Tsirkin wrote:
> On Sun, Dec 06, 2020 at 12:57:19PM +0200, Eli Cohen wrote:
> > Make sure to put a write memory barrier after updating the CQ consumer index
> > so the hardware knows that there are available CQE slots in the queue.
> > 
> > Failure to do this can cause the RX doorbell record to be updated before the
> > CQ consumer index, resulting in CQ overrun.
> > 
> > Change-Id: Ib0ae4c118cce524c9f492b32569179f3c1f04cc1
> > Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
> > Signed-off-by: Eli Cohen <elic@nvidia.com>
> 
> Aren't both memory writes?

Not sure what exactly you mean here.

> And given that, isn't dma_wmb() sufficient here?

I agree that dma_wmb() is more appropriate here.

> 
> 
> > ---
> >  drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > index 1f4089c6f9d7..295f46eea2a5 100644
> > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > @@ -478,6 +478,11 @@ static int mlx5_vdpa_poll_one(struct mlx5_vdpa_cq *vcq)
> >  static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue *mvq, int num)
> >  {
> >  	mlx5_cq_set_ci(&mvq->cq.mcq);
> > +
> > +	/* make sure CQ consumer update is visible to the hardware before updating
> > +	 * RX doorbell record.
> > +	 */
> > +	wmb();
> >  	rx_post(&mvq->vqqp, num);
> >  	if (mvq->event_cb.callback)
> >  		mvq->event_cb.callback(mvq->event_cb.private);
> > -- 
> > 2.27.0
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index
  2020-12-09  6:02   ` Eli Cohen
@ 2020-12-09  6:46     ` Michael S. Tsirkin
  2020-12-09  6:58       ` Eli Cohen
  0 siblings, 1 reply; 11+ messages in thread
From: Michael S. Tsirkin @ 2020-12-09  6:46 UTC (permalink / raw)
  To: Eli Cohen; +Cc: jasowang, virtualization, linux-kernel, lulu

On Wed, Dec 09, 2020 at 08:02:30AM +0200, Eli Cohen wrote:
> On Tue, Dec 08, 2020 at 04:45:04PM -0500, Michael S. Tsirkin wrote:
> > On Sun, Dec 06, 2020 at 12:57:19PM +0200, Eli Cohen wrote:
> > > Make sure to put a write memory barrier after updating the CQ consumer index
> > > so the hardware knows that there are available CQE slots in the queue.
> > > 
> > > Failure to do this can cause the RX doorbell record to be updated before the
> > > CQ consumer index, resulting in CQ overrun.
> > > 
> > > Change-Id: Ib0ae4c118cce524c9f492b32569179f3c1f04cc1
> > > Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
> > > Signed-off-by: Eli Cohen <elic@nvidia.com>
> > 
> > Aren't both memory writes?
> 
> Not sure what exactly you mean here.

Both updates are CPU writes into RAM that hardware then reads
using DMA.
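In other words, both sides of the ordering are plain stores to coherent
memory, which is exactly the case dma_wmb() is defined for. An illustrative
sketch (the field names here are hypothetical, not the driver's actual ones):

	/* Both doorbell records live in coherent, DMA-visible RAM. */
	cq_dbr->consumer_index = cpu_to_be32(new_ci);	/* write 1 */

	dma_wmb();	/* orders RAM writes as observed by the device */

	rq_dbr->receive_counter = cpu_to_be32(new_tail);	/* write 2 */

	/* wmb() would also order these stores against MMIO writes (e.g. a
	 * writel() to a doorbell register), a stronger and costlier
	 * guarantee that is unnecessary when no MMIO write is involved.
	 */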

> > And given that, isn't dma_wmb() sufficient here?
> 
> I agree that dma_wmb() is more appropriate here.
> 
> > 
> > 
> > > ---
> > >  drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++++
> > >  1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > index 1f4089c6f9d7..295f46eea2a5 100644
> > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > @@ -478,6 +478,11 @@ static int mlx5_vdpa_poll_one(struct mlx5_vdpa_cq *vcq)
> > >  static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue *mvq, int num)
> > >  {
> > >  	mlx5_cq_set_ci(&mvq->cq.mcq);
> > > +
> > > +	/* make sure CQ consumer update is visible to the hardware before updating
> > > +	 * RX doorbell record.
> > > +	 */
> > > +	wmb();
> > >  	rx_post(&mvq->vqqp, num);
> > >  	if (mvq->event_cb.callback)
> > >  		mvq->event_cb.callback(mvq->event_cb.private);
> > > -- 
> > > 2.27.0
> > 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index
  2020-12-09  6:46     ` Michael S. Tsirkin
@ 2020-12-09  6:58       ` Eli Cohen
  2020-12-09  8:05         ` Michael S. Tsirkin
  0 siblings, 1 reply; 11+ messages in thread
From: Eli Cohen @ 2020-12-09  6:58 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: jasowang, virtualization, linux-kernel, lulu

On Wed, Dec 09, 2020 at 01:46:22AM -0500, Michael S. Tsirkin wrote:
> On Wed, Dec 09, 2020 at 08:02:30AM +0200, Eli Cohen wrote:
> > On Tue, Dec 08, 2020 at 04:45:04PM -0500, Michael S. Tsirkin wrote:
> > > On Sun, Dec 06, 2020 at 12:57:19PM +0200, Eli Cohen wrote:
> > > > Make sure to put a write memory barrier after updating the CQ consumer index
> > > > so the hardware knows that there are available CQE slots in the queue.
> > > > 
> > > > Failure to do this can cause the RX doorbell record to be updated before the
> > > > CQ consumer index, resulting in CQ overrun.
> > > > 
> > > > Change-Id: Ib0ae4c118cce524c9f492b32569179f3c1f04cc1
> > > > Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
> > > > Signed-off-by: Eli Cohen <elic@nvidia.com>
> > > 
> > > Aren't both memory writes?
> > 
> > Not sure what exactly you mean here.
> 
> Both updates are CPU writes into RAM that hardware then reads
> using DMA.
> 

You mean why I did not put a memory barrier right after updating the
receive doorbell record?

I thought about this and I think it is not required. Suppose it takes a
very long time till the hardware can actually see this update. The worst
effect would be that the hardware will drop received packets if it sees
none available due to the delayed update. Eventually it will see
the update and will continue working.

If I put a memory barrier, I add some delay waiting for the CPU to flush
the write before continuing. I tried both options while checking packet
rate and couldn't see a noticeable difference in either case.

> > > And given that, isn't dma_wmb() sufficient here?
> > 
> > I agree that dma_wmb() is more appropriate here.
> > 
> > > 
> > > 
> > > > ---
> > > >  drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++++
> > > >  1 file changed, 5 insertions(+)
> > > > 
> > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > index 1f4089c6f9d7..295f46eea2a5 100644
> > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > @@ -478,6 +478,11 @@ static int mlx5_vdpa_poll_one(struct mlx5_vdpa_cq *vcq)
> > > >  static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue *mvq, int num)
> > > >  {
> > > >  	mlx5_cq_set_ci(&mvq->cq.mcq);
> > > > +
> > > > +	/* make sure CQ consumer update is visible to the hardware before updating
> > > > +	 * RX doorbell record.
> > > > +	 */
> > > > +	wmb();
> > > >  	rx_post(&mvq->vqqp, num);
> > > >  	if (mvq->event_cb.callback)
> > > >  		mvq->event_cb.callback(mvq->event_cb.private);
> > > > -- 
> > > > 2.27.0
> > > 
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index
  2020-12-09  6:58       ` Eli Cohen
@ 2020-12-09  8:05         ` Michael S. Tsirkin
  2020-12-09  9:38           ` Eli Cohen
  0 siblings, 1 reply; 11+ messages in thread
From: Michael S. Tsirkin @ 2020-12-09  8:05 UTC (permalink / raw)
  To: Eli Cohen; +Cc: jasowang, virtualization, linux-kernel, lulu

On Wed, Dec 09, 2020 at 08:58:46AM +0200, Eli Cohen wrote:
> On Wed, Dec 09, 2020 at 01:46:22AM -0500, Michael S. Tsirkin wrote:
> > On Wed, Dec 09, 2020 at 08:02:30AM +0200, Eli Cohen wrote:
> > > On Tue, Dec 08, 2020 at 04:45:04PM -0500, Michael S. Tsirkin wrote:
> > > > On Sun, Dec 06, 2020 at 12:57:19PM +0200, Eli Cohen wrote:
> > > > > Make sure to put a write memory barrier after updating the CQ consumer index
> > > > > so the hardware knows that there are available CQE slots in the queue.
> > > > > 
> > > > > Failure to do this can cause the RX doorbell record to be updated before the
> > > > > CQ consumer index, resulting in CQ overrun.
> > > > > 
> > > > > Change-Id: Ib0ae4c118cce524c9f492b32569179f3c1f04cc1
> > > > > Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
> > > > > Signed-off-by: Eli Cohen <elic@nvidia.com>
> > > > 
> > > > Aren't both memory writes?
> > > 
> > > Not sure what exactly you mean here.
> > 
> > Both updates are CPU writes into RAM that hardware then reads
> > using DMA.
> > 
> 
> You mean why I did not put a memory barrier right after updating the
> receive doorbell record?

Sorry about being unclear.  I just tried to give justification for why
dma_wmb seems more appropriate than wmb here. If you need to
order memory writes wrt writes to the card, that is different, but generally
writeX and friends will handle the ordering for you, except when
using relaxed memory mappings - then wmb is generally necessary.
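A sketch of the two MMIO cases described above (device and field names
hypothetical):

	/* descriptor in coherent RAM */
	desc->status = cpu_to_le32(DESC_READY);

	/* writeX and friends: writel() implies the needed ordering, so the
	 * descriptor store above is visible to the device before the
	 * doorbell write lands.
	 */
	writel(tail, dev->doorbell);

	/* relaxed variant: writel_relaxed() is not ordered against normal
	 * memory accesses, so an explicit barrier is needed first.
	 */
	wmb();
	writel_relaxed(tail, dev->doorbell);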

> I thought about this and I think it is not required. Suppose it takes a
> very long time till the hardware can actually see this update. The worst
> effect would be that the hardware will drop received packets if it sees
> none available due to the delayed update. Eventually it will see
> the update and will continue working.
> 
> If I put a memory barrier, I add some delay waiting for the CPU to flush
> the write before continuing. I tried both options while checking packet
> rate and couldn't see a noticeable difference in either case.


makes sense.

> > > > And given that, isn't dma_wmb() sufficient here?
> > > 
> > > I agree that dma_wmb() is more appropriate here.
> > > 
> > > > 
> > > > 
> > > > > ---
> > > > >  drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++++
> > > > >  1 file changed, 5 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > index 1f4089c6f9d7..295f46eea2a5 100644
> > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > @@ -478,6 +478,11 @@ static int mlx5_vdpa_poll_one(struct mlx5_vdpa_cq *vcq)
> > > > >  static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue *mvq, int num)
> > > > >  {
> > > > >  	mlx5_cq_set_ci(&mvq->cq.mcq);
> > > > > +
> > > > > +	/* make sure CQ consumer update is visible to the hardware before updating
> > > > > +	 * RX doorbell record.
> > > > > +	 */
> > > > > +	wmb();
> > > > >  	rx_post(&mvq->vqqp, num);
> > > > >  	if (mvq->event_cb.callback)
> > > > >  		mvq->event_cb.callback(mvq->event_cb.private);
> > > > > -- 
> > > > > 2.27.0
> > > > 
> > 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index
  2020-12-09  8:05         ` Michael S. Tsirkin
@ 2020-12-09  9:38           ` Eli Cohen
  2020-12-09 12:47             ` Michael S. Tsirkin
  0 siblings, 1 reply; 11+ messages in thread
From: Eli Cohen @ 2020-12-09  9:38 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: jasowang, virtualization, linux-kernel, lulu

On Wed, Dec 09, 2020 at 03:05:42AM -0500, Michael S. Tsirkin wrote:
> On Wed, Dec 09, 2020 at 08:58:46AM +0200, Eli Cohen wrote:
> > On Wed, Dec 09, 2020 at 01:46:22AM -0500, Michael S. Tsirkin wrote:
> > > On Wed, Dec 09, 2020 at 08:02:30AM +0200, Eli Cohen wrote:
> > > > On Tue, Dec 08, 2020 at 04:45:04PM -0500, Michael S. Tsirkin wrote:
> > > > > On Sun, Dec 06, 2020 at 12:57:19PM +0200, Eli Cohen wrote:
> > > > > > Make sure to put a write memory barrier after updating the CQ consumer index
> > > > > > so the hardware knows that there are available CQE slots in the queue.
> > > > > > 
> > > > > > Failure to do this can cause the RX doorbell record to be updated before the
> > > > > > CQ consumer index, resulting in CQ overrun.
> > > > > > 
> > > > > > Change-Id: Ib0ae4c118cce524c9f492b32569179f3c1f04cc1
> > > > > > Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
> > > > > > Signed-off-by: Eli Cohen <elic@nvidia.com>
> > > > > 
> > > > > Aren't both memory writes?
> > > > 
> > > > Not sure what exactly you mean here.
> > > 
> > > Both updates are CPU writes into RAM that hardware then reads
> > > using DMA.
> > > 
> > 
> > You mean why I did not put a memory barrier right after updating the
> > receive doorbell record?
> 
> Sorry about being unclear.  I just tried to give justification for why
> dma_wmb seems more appropriate than wmb here. If you need to
> order memory writes wrt writes to the card, that is different, but generally
> writeX and friends will handle the ordering for you, except when
> using relaxed memory mappings - then wmb is generally necessary.
> 

Bear in mind, we're writing to memory (not I/O memory). In this case, we
want this write to be visible to the DMA device.

https://www.kernel.org/doc/Documentation/memory-barriers.txt gives a
similar example using dma_wmb() to flush updates to make them visible
to the hardware before notifying it to come and inspect this
memory.
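The example in that document is roughly the following (abridged; DEVICE_OWN
and DESC_NOTIFY are the document's illustrative names):

	if (desc->status != DEVICE_OWN) {
		/* do not read data until we own descriptor */
		dma_rmb();

		/* read/modify data */
		read_data = desc->data;
		desc->data = write_data;

		/* flush modifications before status update */
		dma_wmb();

		/* assign ownership */
		desc->status = DEVICE_OWN;

		/* notify device of new descriptors */
		writel(DESC_NOTIFY, doorbell);
	}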


> > I thought about this and I think it is not required. Suppose it takes a
> > very long time till the hardware can actually see this update. The worst
> > effect would be that the hardware will drop received packets if it sees
> > none available due to the delayed update. Eventually it will see
> > the update and will continue working.
> > 
> > If I put a memory barrier, I add some delay waiting for the CPU to flush
> > the write before continuing. I tried both options while checking packet
> > rate and couldn't see a noticeable difference in either case.
> 
> 
> makes sense.
> 
> > > > > And given that, isn't dma_wmb() sufficient here?
> > > > 
> > > > I agree that dma_wmb() is more appropriate here.
> > > > 
> > > > > 
> > > > > 
> > > > > > ---
> > > > > >  drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++++
> > > > > >  1 file changed, 5 insertions(+)
> > > > > > 
> > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > index 1f4089c6f9d7..295f46eea2a5 100644
> > > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > @@ -478,6 +478,11 @@ static int mlx5_vdpa_poll_one(struct mlx5_vdpa_cq *vcq)
> > > > > >  static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue *mvq, int num)
> > > > > >  {
> > > > > >  	mlx5_cq_set_ci(&mvq->cq.mcq);
> > > > > > +
> > > > > > +	/* make sure CQ consumer update is visible to the hardware before updating
> > > > > > +	 * RX doorbell record.
> > > > > > +	 */
> > > > > > +	wmb();
> > > > > >  	rx_post(&mvq->vqqp, num);
> > > > > >  	if (mvq->event_cb.callback)
> > > > > >  		mvq->event_cb.callback(mvq->event_cb.private);
> > > > > > -- 
> > > > > > 2.27.0
> > > > > 
> > > 
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index
  2020-12-09  9:38           ` Eli Cohen
@ 2020-12-09 12:47             ` Michael S. Tsirkin
  0 siblings, 0 replies; 11+ messages in thread
From: Michael S. Tsirkin @ 2020-12-09 12:47 UTC (permalink / raw)
  To: Eli Cohen; +Cc: jasowang, virtualization, linux-kernel, lulu

On Wed, Dec 09, 2020 at 11:38:36AM +0200, Eli Cohen wrote:
> On Wed, Dec 09, 2020 at 03:05:42AM -0500, Michael S. Tsirkin wrote:
> > On Wed, Dec 09, 2020 at 08:58:46AM +0200, Eli Cohen wrote:
> > > On Wed, Dec 09, 2020 at 01:46:22AM -0500, Michael S. Tsirkin wrote:
> > > > On Wed, Dec 09, 2020 at 08:02:30AM +0200, Eli Cohen wrote:
> > > > > On Tue, Dec 08, 2020 at 04:45:04PM -0500, Michael S. Tsirkin wrote:
> > > > > > On Sun, Dec 06, 2020 at 12:57:19PM +0200, Eli Cohen wrote:
> > > > > > > Make sure to put a write memory barrier after updating the CQ consumer index
> > > > > > > so the hardware knows that there are available CQE slots in the queue.
> > > > > > > 
> > > > > > > Failure to do this can cause the RX doorbell record to be updated before the
> > > > > > > CQ consumer index, resulting in CQ overrun.
> > > > > > > 
> > > > > > > Change-Id: Ib0ae4c118cce524c9f492b32569179f3c1f04cc1
> > > > > > > Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
> > > > > > > Signed-off-by: Eli Cohen <elic@nvidia.com>
> > > > > > 
> > > > > > Aren't both memory writes?
> > > > > 
> > > > > Not sure what exactly you mean here.
> > > > 
> > > > Both updates are CPU writes into RAM that hardware then reads
> > > > using DMA.
> > > > 
> > > 
> > > You mean why I did not put a memory barrier right after updating the
> > > receive doorbell record?
> > 
> > Sorry about being unclear.  I just tried to give justification for why
> > dma_wmb seems more appropriate than wmb here. If you need to
> > order memory writes wrt writes to the card, that is different, but generally
> > writeX and friends will handle the ordering for you, except when
> > using relaxed memory mappings - then wmb is generally necessary.
> > 
> 
> Bear in mind, we're writing to memory (not I/O memory). In this case, we
> want this write to be visible to the DMA device.
> 
> https://www.kernel.org/doc/Documentation/memory-barriers.txt gives a
> similar example using dma_wmb() to flush updates to make them visible
> to the hardware before notifying it to come and inspect this
> memory.

Exactly.

> 
> > > I thought about this and I think it is not required. Suppose it takes a
> > > very long time till the hardware can actually see this update. The worst
> > effect would be that the hardware will drop received packets if it sees
> > none available due to the delayed update. Eventually it will see
> > > the update and will continue working.
> > > 
> > If I put a memory barrier, I add some delay waiting for the CPU to flush
> > the write before continuing. I tried both options while checking packet
> > rate and couldn't see a noticeable difference in either case.
> > 
> > 
> > makes sense.
> > 
> > > > > > And given that, isn't dma_wmb() sufficient here?
> > > > > 
> > > > > I agree that dma_wmb() is more appropriate here.
> > > > > 
> > > > > > 
> > > > > > 
> > > > > > > ---
> > > > > > >  drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++++
> > > > > > >  1 file changed, 5 insertions(+)
> > > > > > > 
> > > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > index 1f4089c6f9d7..295f46eea2a5 100644
> > > > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > @@ -478,6 +478,11 @@ static int mlx5_vdpa_poll_one(struct mlx5_vdpa_cq *vcq)
> > > > > > >  static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue *mvq, int num)
> > > > > > >  {
> > > > > > >  	mlx5_cq_set_ci(&mvq->cq.mcq);
> > > > > > > +
> > > > > > > +	/* make sure CQ consumer update is visible to the hardware before updating
> > > > > > > +	 * RX doorbell record.
> > > > > > > +	 */
> > > > > > > +	wmb();
> > > > > > >  	rx_post(&mvq->vqqp, num);
> > > > > > >  	if (mvq->event_cb.callback)
> > > > > > >  		mvq->event_cb.callback(mvq->event_cb.private);
> > > > > > > -- 
> > > > > > > 2.27.0
> > > > > > 
> > > > 
> > 


^ permalink raw reply	[flat|nested] 11+ messages in thread

Thread overview: 11+ messages
2020-12-06 10:57 [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index Eli Cohen
2020-12-07  2:51 ` Jason Wang
2020-12-08  9:15   ` Eli Cohen
2020-12-08 13:53     ` Michael S. Tsirkin
2020-12-08 21:45 ` Michael S. Tsirkin
2020-12-09  6:02   ` Eli Cohen
2020-12-09  6:46     ` Michael S. Tsirkin
2020-12-09  6:58       ` Eli Cohen
2020-12-09  8:05         ` Michael S. Tsirkin
2020-12-09  9:38           ` Eli Cohen
2020-12-09 12:47             ` Michael S. Tsirkin
