* [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating @ 2018-01-12 3:42 Jianchao Wang 2018-01-12 16:32 ` Jason Gunthorpe 0 siblings, 1 reply; 27+ messages in thread
From: Jianchao Wang @ 2018-01-12 3:42 UTC (permalink / raw)
To: tariqt; +Cc: junxiao.bi, netdev, linux-rdma, linux-kernel

A customer reported a memory corruption issue on a previous mlx4_en driver
version, where order-3 pages and multiple page reference counting were
still used.

One of the root causes turned out to be that the HW may see stale rx_descs
when the prod db update reaches the HW before the rx_desc writes do.
Especially when crossing an order-3 page boundary and starting a new page,
the HW may write to pages that have already been freed and allocated again
by others.

To fix this, add a wmb between the rx_desc and prod db updates to ensure
ordering. Even though order-0 pages and page recycling have since been
introduced, reordering between rx_desc and prod db updates could still
corrupt different inbound packets.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 85e28ef..eefa82c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -555,7 +555,7 @@ static void mlx4_en_refill_rx_buffers(struct mlx4_en_priv *priv,
 			break;
 		ring->prod++;
 	} while (likely(--missing));
-
+	wmb(); /* ensure rx_desc updating reaches HW before prod db updating */
 	mlx4_en_update_rx_prod_db(ring);
 }
--
2.7.4

^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-12 3:42 [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating Jianchao Wang @ 2018-01-12 16:32 ` Jason Gunthorpe 2018-01-12 16:46 ` Eric Dumazet 0 siblings, 1 reply; 27+ messages in thread From: Jason Gunthorpe @ 2018-01-12 16:32 UTC (permalink / raw) To: Jianchao Wang Cc: tariqt, junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed On Fri, Jan 12, 2018 at 11:42:22AM +0800, Jianchao Wang wrote: > Customer reported memory corruption issue on previous mlx4_en driver > version where the order-3 pages and multiple page reference counting > were still used. > > Finally, find out one of the root causes is that the HW may see stale > rx_descs due to prod db updating reaches HW before rx_desc. Especially > when cross order-3 pages boundary and update a new one, HW may write > on the pages which may has been freed and allocated again by others. > > To fix it, add a wmb between rx_desc and prod db updating to ensure > the order. Even thougth order-0 and page recycling has been introduced, > the disorder between rx_desc and prod db still could lead to corruption > on different inbound packages. > > Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> > drivers/net/ethernet/mellanox/mlx4/en_rx.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c > index 85e28ef..eefa82c 100644 > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c > @@ -555,7 +555,7 @@ static void mlx4_en_refill_rx_buffers(struct mlx4_en_priv *priv, > break; > ring->prod++; > } while (likely(--missing)); > - > + wmb(); /* ensure rx_desc updating reaches HW before prod db updating */ > mlx4_en_update_rx_prod_db(ring); > } > Does this need to be dma_wmb(), and should it be in mlx4_en_update_rx_prod_db ? Jason ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-12 16:32 ` Jason Gunthorpe @ 2018-01-12 16:46 ` Eric Dumazet 2018-01-12 19:53 ` Saeed Mahameed 2018-01-14 2:40 ` jianchao.wang 0 siblings, 2 replies; 27+ messages in thread From: Eric Dumazet @ 2018-01-12 16:46 UTC (permalink / raw) To: Jason Gunthorpe, Jianchao Wang Cc: tariqt, junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed On Fri, 2018-01-12 at 09:32 -0700, Jason Gunthorpe wrote: > On Fri, Jan 12, 2018 at 11:42:22AM +0800, Jianchao Wang wrote: > > Customer reported memory corruption issue on previous mlx4_en driver > > version where the order-3 pages and multiple page reference counting > > were still used. > > > > Finally, find out one of the root causes is that the HW may see stale > > rx_descs due to prod db updating reaches HW before rx_desc. Especially > > when cross order-3 pages boundary and update a new one, HW may write > > on the pages which may has been freed and allocated again by others. > > > > To fix it, add a wmb between rx_desc and prod db updating to ensure > > the order. Even thougth order-0 and page recycling has been introduced, > > the disorder between rx_desc and prod db still could lead to corruption > > on different inbound packages. 
> > > > Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> > > drivers/net/ethernet/mellanox/mlx4/en_rx.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c > > index 85e28ef..eefa82c 100644 > > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c > > @@ -555,7 +555,7 @@ static void mlx4_en_refill_rx_buffers(struct mlx4_en_priv *priv, > > break; > > ring->prod++; > > } while (likely(--missing)); > > - > > + wmb(); /* ensure rx_desc updating reaches HW before prod db updating */ > > mlx4_en_update_rx_prod_db(ring); > > } > > > > Does this need to be dma_wmb(), and should it be in > mlx4_en_update_rx_prod_db ? > +1 on dma_wmb() On what architecture bug was observed ? In any case, the barrier should be moved in mlx4_en_update_rx_prod_db() I think. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-12 16:46 ` Eric Dumazet @ 2018-01-12 19:53 ` Saeed Mahameed 2018-01-12 20:16 ` Eric Dumazet 2018-01-14 2:40 ` jianchao.wang 1 sibling, 1 reply; 27+ messages in thread From: Saeed Mahameed @ 2018-01-12 19:53 UTC (permalink / raw) To: Eric Dumazet, Jason Gunthorpe, Jianchao Wang Cc: tariqt, junxiao.bi, netdev, linux-rdma, linux-kernel On 01/12/2018 08:46 AM, Eric Dumazet wrote: > On Fri, 2018-01-12 at 09:32 -0700, Jason Gunthorpe wrote: >> On Fri, Jan 12, 2018 at 11:42:22AM +0800, Jianchao Wang wrote: >>> Customer reported memory corruption issue on previous mlx4_en driver >>> version where the order-3 pages and multiple page reference counting >>> were still used. >>> >>> Finally, find out one of the root causes is that the HW may see stale >>> rx_descs due to prod db updating reaches HW before rx_desc. Especially >>> when cross order-3 pages boundary and update a new one, HW may write >>> on the pages which may has been freed and allocated again by others. >>> >>> To fix it, add a wmb between rx_desc and prod db updating to ensure >>> the order. Even thougth order-0 and page recycling has been introduced, >>> the disorder between rx_desc and prod db still could lead to corruption >>> on different inbound packages. 
>>> >>> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> >>> drivers/net/ethernet/mellanox/mlx4/en_rx.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c >>> index 85e28ef..eefa82c 100644 >>> +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c >>> @@ -555,7 +555,7 @@ static void mlx4_en_refill_rx_buffers(struct mlx4_en_priv *priv, >>> break; >>> ring->prod++; >>> } while (likely(--missing)); >>> - >>> + wmb(); /* ensure rx_desc updating reaches HW before prod db updating */ >>> mlx4_en_update_rx_prod_db(ring); >>> } >>> >> >> Does this need to be dma_wmb(), and should it be in >> mlx4_en_update_rx_prod_db ? >> > > +1 on dma_wmb() > > On what architecture bug was observed ? > > In any case, the barrier should be moved in mlx4_en_update_rx_prod_db() > I think. > +1 on dma_wmb(), thanks Eric for reviewing this. The barrier is also needed elsewhere in the code as well, but I wouldn't put it in mlx4_en_update_rx_prod_db(), just to allow batch filling of all rx rings and then hit the barrier only once. As a rule of thumb, mem barriers are the ring API caller responsibility. e.g. in mlx4_en_activate_rx_rings(): between mlx4_en_fill_rx_buffers(priv); and the loop that updates rx prod for all rings ring, the dma_wmb is needed, see below. diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c index b4d144e67514..65541721a240 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c @@ -370,6 +370,8 @@ int mlx4_en_activate_rx_rings(struct mlx4_en_priv *priv) if (err) goto err_buffers; + dma_wmb(); + for (ring_ind = 0; ring_ind < priv->rx_ring_num; ring_ind++) { ring = priv->rx_ring[ring_ind]; ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-12 19:53 ` Saeed Mahameed @ 2018-01-12 20:16 ` Eric Dumazet 2018-01-12 21:01 ` Saeed Mahameed 0 siblings, 1 reply; 27+ messages in thread From: Eric Dumazet @ 2018-01-12 20:16 UTC (permalink / raw) To: Saeed Mahameed, Jason Gunthorpe, Jianchao Wang Cc: tariqt, junxiao.bi, netdev, linux-rdma, linux-kernel On Fri, 2018-01-12 at 11:53 -0800, Saeed Mahameed wrote: > > On 01/12/2018 08:46 AM, Eric Dumazet wrote: > > On Fri, 2018-01-12 at 09:32 -0700, Jason Gunthorpe wrote: > > > On Fri, Jan 12, 2018 at 11:42:22AM +0800, Jianchao Wang wrote: > > > > Customer reported memory corruption issue on previous mlx4_en driver > > > > version where the order-3 pages and multiple page reference counting > > > > were still used. > > > > > > > > Finally, find out one of the root causes is that the HW may see stale > > > > rx_descs due to prod db updating reaches HW before rx_desc. Especially > > > > when cross order-3 pages boundary and update a new one, HW may write > > > > on the pages which may has been freed and allocated again by others. > > > > > > > > To fix it, add a wmb between rx_desc and prod db updating to ensure > > > > the order. Even thougth order-0 and page recycling has been introduced, > > > > the disorder between rx_desc and prod db still could lead to corruption > > > > on different inbound packages. 
> > > > > > > > Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> > > > > drivers/net/ethernet/mellanox/mlx4/en_rx.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c > > > > index 85e28ef..eefa82c 100644 > > > > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c > > > > @@ -555,7 +555,7 @@ static void mlx4_en_refill_rx_buffers(struct mlx4_en_priv *priv, > > > > break; > > > > ring->prod++; > > > > } while (likely(--missing)); > > > > - > > > > + wmb(); /* ensure rx_desc updating reaches HW before prod db updating */ > > > > mlx4_en_update_rx_prod_db(ring); > > > > } > > > > > > > > > > Does this need to be dma_wmb(), and should it be in > > > mlx4_en_update_rx_prod_db ? > > > > > > > +1 on dma_wmb() > > > > On what architecture bug was observed ? > > > > In any case, the barrier should be moved in mlx4_en_update_rx_prod_db() > > I think. > > > > +1 on dma_wmb(), thanks Eric for reviewing this. > > The barrier is also needed elsewhere in the code as well, but I wouldn't > put it in mlx4_en_update_rx_prod_db(), just to allow batch filling of > all rx rings and then hit the barrier only once. As a rule of thumb, mem > barriers are the ring API caller responsibility. > > e.g. in mlx4_en_activate_rx_rings(): > between mlx4_en_fill_rx_buffers(priv); and the loop that updates rx prod > for all rings ring, the dma_wmb is needed, see below. 
> > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c > b/drivers/net/ethernet/mellanox/mlx4/en_rx.c > index b4d144e67514..65541721a240 100644 > --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c > @@ -370,6 +370,8 @@ int mlx4_en_activate_rx_rings(struct mlx4_en_priv *priv) > if (err) > goto err_buffers; > > + dma_wmb(); > + > for (ring_ind = 0; ring_ind < priv->rx_ring_num; ring_ind++) { > ring = priv->rx_ring[ring_ind]; Why bother, considering dma_wmb() is a nop on x86, simply a compiler barrier. Putting it in mlx4_en_update_rx_prod_db() and have no obscure bugs... Also we might change the existing wmb() in mlx4_en_process_rx_cq() by dma_wmb(), that would help performance a bit. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-12 20:16 ` Eric Dumazet @ 2018-01-12 21:01 ` Saeed Mahameed 2018-01-12 21:21 ` Eric Dumazet 2018-01-13 19:15 ` Jason Gunthorpe 0 siblings, 2 replies; 27+ messages in thread From: Saeed Mahameed @ 2018-01-12 21:01 UTC (permalink / raw) To: Eric Dumazet, Jason Gunthorpe, Jianchao Wang Cc: tariqt, junxiao.bi, netdev, linux-rdma, linux-kernel On 01/12/2018 12:16 PM, Eric Dumazet wrote: > On Fri, 2018-01-12 at 11:53 -0800, Saeed Mahameed wrote: >> >> On 01/12/2018 08:46 AM, Eric Dumazet wrote: >>> On Fri, 2018-01-12 at 09:32 -0700, Jason Gunthorpe wrote: >>>> On Fri, Jan 12, 2018 at 11:42:22AM +0800, Jianchao Wang wrote: >>>>> Customer reported memory corruption issue on previous mlx4_en driver >>>>> version where the order-3 pages and multiple page reference counting >>>>> were still used. >>>>> >>>>> Finally, find out one of the root causes is that the HW may see stale >>>>> rx_descs due to prod db updating reaches HW before rx_desc. Especially >>>>> when cross order-3 pages boundary and update a new one, HW may write >>>>> on the pages which may has been freed and allocated again by others. >>>>> >>>>> To fix it, add a wmb between rx_desc and prod db updating to ensure >>>>> the order. Even thougth order-0 and page recycling has been introduced, >>>>> the disorder between rx_desc and prod db still could lead to corruption >>>>> on different inbound packages. 
>>>>> >>>>> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> >>>>> drivers/net/ethernet/mellanox/mlx4/en_rx.c | 2 +- >>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c >>>>> index 85e28ef..eefa82c 100644 >>>>> +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c >>>>> @@ -555,7 +555,7 @@ static void mlx4_en_refill_rx_buffers(struct mlx4_en_priv *priv, >>>>> break; >>>>> ring->prod++; >>>>> } while (likely(--missing)); >>>>> - >>>>> + wmb(); /* ensure rx_desc updating reaches HW before prod db updating */ >>>>> mlx4_en_update_rx_prod_db(ring); >>>>> } >>>>> >>>> >>>> Does this need to be dma_wmb(), and should it be in >>>> mlx4_en_update_rx_prod_db ? >>>> >>> >>> +1 on dma_wmb() >>> >>> On what architecture bug was observed ? >>> >>> In any case, the barrier should be moved in mlx4_en_update_rx_prod_db() >>> I think. >>> >> >> +1 on dma_wmb(), thanks Eric for reviewing this. >> >> The barrier is also needed elsewhere in the code as well, but I wouldn't >> put it in mlx4_en_update_rx_prod_db(), just to allow batch filling of >> all rx rings and then hit the barrier only once. As a rule of thumb, mem >> barriers are the ring API caller responsibility. >> >> e.g. in mlx4_en_activate_rx_rings(): >> between mlx4_en_fill_rx_buffers(priv); and the loop that updates rx prod >> for all rings ring, the dma_wmb is needed, see below. 
>> >> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c >> b/drivers/net/ethernet/mellanox/mlx4/en_rx.c >> index b4d144e67514..65541721a240 100644 >> --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c >> +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c >> @@ -370,6 +370,8 @@ int mlx4_en_activate_rx_rings(struct mlx4_en_priv *priv) >> if (err) >> goto err_buffers; >> >> + dma_wmb(); >> + >> for (ring_ind = 0; ring_ind < priv->rx_ring_num; ring_ind++) { >> ring = priv->rx_ring[ring_ind]; > > > Why bother, considering dma_wmb() is a nop on x86, > simply a compiler barrier. > > Putting it in mlx4_en_update_rx_prod_db() and have no obscure bugs... > Simply putting a memory barrier on the top or the bottom of a functions, means nothing unless you are looking at the whole picture, of all the callers of that function to understand why is it there. which is better to grasp ?: update_doorbell() { dma_wmb(); ring->db = prod; } or fill buffers(); dma_wmb(); update_doorbell(); I simply like the 2nd one since with one look you can understand what this dma_wmb is protecting. Anyway this is truly a nit, Tariq can decide what is better for him :). > Also we might change the existing wmb() in mlx4_en_process_rx_cq() by > dma_wmb(), that would help performance a bit. > > +1, Tariq will you handle ? ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-12 21:01 ` Saeed Mahameed @ 2018-01-12 21:21 ` Eric Dumazet 0 siblings, 0 replies; 27+ messages in thread
From: Eric Dumazet @ 2018-01-12 21:21 UTC (permalink / raw)
To: Saeed Mahameed, Jason Gunthorpe, Jianchao Wang
Cc: tariqt, junxiao.bi, netdev, linux-rdma, linux-kernel

On Fri, 2018-01-12 at 13:01 -0800, Saeed Mahameed wrote:
> which is better to grasp ?:
>
> update_doorbell() {
>     dma_wmb();
>     ring->db = prod;
> }

This one is IMO the most secure one (least surprise).

Considering the time it took to discover this bug, I would really play
safe.

But obviously I do not maintain mlx4.

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-12 21:01 ` Saeed Mahameed 2018-01-12 21:21 ` Eric Dumazet @ 2018-01-13 19:15 ` Jason Gunthorpe 1 sibling, 0 replies; 27+ messages in thread
From: Jason Gunthorpe @ 2018-01-13 19:15 UTC (permalink / raw)
To: Saeed Mahameed
Cc: Eric Dumazet, Jianchao Wang, tariqt, junxiao.bi, netdev, linux-rdma, linux-kernel

On Fri, Jan 12, 2018 at 01:01:56PM -0800, Saeed Mahameed wrote:
> Simply putting a memory barrier on the top or the bottom of a function
> means nothing unless you are looking at the whole picture, at all the
> callers of that function, to understand why it is there.

When I review code I want to see the memory barrier placed *directly*
before the write which allows the DMA. So yes, this is my preference:

> update_doorbell() {
>     dma_wmb();
>     ring->db = prod;
> }

Conceptually what is happening here is very similar to what
smp_store_release() does for SMP cases. In most cases wmb should always
be strongly connected with a following write.

smp_store_release() is called 'release' because the write it
incorporates allows the other CPU to 'see' what is being protected.

Similarly here, the write to the db allows the device to 'see' the new
ring data.

And this is a bad idea:

> fill buffers();
> dma_wmb();
> update_doorbell();

> I simply like the 2nd one since with one look you can understand
> what this dma_wmb is protecting.

What do you think the wmb is protecting in the above? It isn't the
fill.

Jason

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-12 16:46 ` Eric Dumazet 2018-01-12 19:53 ` Saeed Mahameed @ 2018-01-14 2:40 ` jianchao.wang 2018-01-14 9:47 ` Tariq Toukan 1 sibling, 1 reply; 27+ messages in thread
From: jianchao.wang @ 2018-01-14 2:40 UTC (permalink / raw)
To: Eric Dumazet, Jason Gunthorpe
Cc: tariqt, junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed

Dear all,

Thanks for the kind responses and reviews. That's really appreciated.

On 01/13/2018 12:46 AM, Eric Dumazet wrote:
>> Does this need to be dma_wmb(), and should it be in
>> mlx4_en_update_rx_prod_db ?
>>
> +1 on dma_wmb()
>
> On what architecture bug was observed ?

This issue was observed on x86-64.

I will send a new patch, which replaces wmb() with dma_wmb(), to the
customer to confirm.

Thanks
Jianchao

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-14 2:40 ` jianchao.wang @ 2018-01-14 9:47 ` Tariq Toukan 2018-01-15 5:50 ` jianchao.wang 0 siblings, 1 reply; 27+ messages in thread From: Tariq Toukan @ 2018-01-14 9:47 UTC (permalink / raw) To: jianchao.wang, Eric Dumazet, Jason Gunthorpe Cc: tariqt, junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed Thanks Jianchao for your patch. And Thank you guys for your reviews, much appreciated. I was off-work on Friday and Saturday. On 14/01/2018 4:40 AM, jianchao.wang wrote: > Dear all > > Thanks for the kindly response and reviewing. That's really appreciated. > > On 01/13/2018 12:46 AM, Eric Dumazet wrote: >>> Does this need to be dma_wmb(), and should it be in >>> mlx4_en_update_rx_prod_db ? >>> >> +1 on dma_wmb() >> >> On what architecture bug was observed ? > This issue was observed on x86-64. > And I will send a new patch, in which replace wmb() with dma_wmb(), to customer > to confirm. +1 on dma_wmb, let us know once customer confirms. Please place it within mlx4_en_update_rx_prod_db as suggested. All other calls to mlx4_en_update_rx_prod_db are in control/slow path so I prefer being on the safe side, and care less about bulking the barrier. Thanks, Tariq ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-14 9:47 ` Tariq Toukan @ 2018-01-15 5:50 ` jianchao.wang 2018-01-19 15:16 ` jianchao.wang 0 siblings, 1 reply; 27+ messages in thread From: jianchao.wang @ 2018-01-15 5:50 UTC (permalink / raw) To: Tariq Toukan, Eric Dumazet, Jason Gunthorpe Cc: junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed Hi Tariq Thanks for your kindly response. On 01/14/2018 05:47 PM, Tariq Toukan wrote: > Thanks Jianchao for your patch. > > And Thank you guys for your reviews, much appreciated. > I was off-work on Friday and Saturday. > > On 14/01/2018 4:40 AM, jianchao.wang wrote: >> Dear all >> >> Thanks for the kindly response and reviewing. That's really appreciated. >> >> On 01/13/2018 12:46 AM, Eric Dumazet wrote: >>>> Does this need to be dma_wmb(), and should it be in >>>> mlx4_en_update_rx_prod_db ? >>>> >>> +1 on dma_wmb() >>> >>> On what architecture bug was observed ? >> This issue was observed on x86-64. >> And I will send a new patch, in which replace wmb() with dma_wmb(), to customer >> to confirm. > > +1 on dma_wmb, let us know once customer confirms. > Please place it within mlx4_en_update_rx_prod_db as suggested. Yes, I have recommended it to customer. Once I get the result, I will share it here. > All other calls to mlx4_en_update_rx_prod_db are in control/slow path so I prefer being on the safe side, and care less about bulking the barrier. > > Thanks, > Tariq > ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-15 5:50 ` jianchao.wang @ 2018-01-19 15:16 ` jianchao.wang 2018-01-19 15:49 ` Eric Dumazet 0 siblings, 1 reply; 27+ messages in thread From: jianchao.wang @ 2018-01-19 15:16 UTC (permalink / raw) To: Tariq Toukan, Eric Dumazet, Jason Gunthorpe Cc: junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed Hi Tariq Very sad that the crash was reproduced again after applied the patch. --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c @@ -252,6 +252,7 @@ static inline bool mlx4_en_is_ring_empty(struct mlx4_en_rx_ring *ring) static inline void mlx4_en_update_rx_prod_db(struct mlx4_en_rx_ring *ring) { + dma_wmb(); *ring->wqres.db.db = cpu_to_be32(ring->prod & 0xffff); } I analyzed the kdump, it should be a memory corruption. Thanks Jianchao On 01/15/2018 01:50 PM, jianchao.wang wrote: > Hi Tariq > > Thanks for your kindly response. > > On 01/14/2018 05:47 PM, Tariq Toukan wrote: >> Thanks Jianchao for your patch. >> >> And Thank you guys for your reviews, much appreciated. >> I was off-work on Friday and Saturday. >> >> On 14/01/2018 4:40 AM, jianchao.wang wrote: >>> Dear all >>> >>> Thanks for the kindly response and reviewing. That's really appreciated. >>> >>> On 01/13/2018 12:46 AM, Eric Dumazet wrote: >>>>> Does this need to be dma_wmb(), and should it be in >>>>> mlx4_en_update_rx_prod_db ? >>>>> >>>> +1 on dma_wmb() >>>> >>>> On what architecture bug was observed ? >>> This issue was observed on x86-64. >>> And I will send a new patch, in which replace wmb() with dma_wmb(), to customer >>> to confirm. >> >> +1 on dma_wmb, let us know once customer confirms. >> Please place it within mlx4_en_update_rx_prod_db as suggested. > Yes, I have recommended it to customer. > Once I get the result, I will share it here. 
>> All other calls to mlx4_en_update_rx_prod_db are in control/slow path so I prefer being on the safe side, and care less about bulking the barrier. >> >> Thanks, >> Tariq >> > ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-19 15:16 ` jianchao.wang @ 2018-01-19 15:49 ` Eric Dumazet 2018-01-21 9:31 ` Tariq Toukan 0 siblings, 1 reply; 27+ messages in thread From: Eric Dumazet @ 2018-01-19 15:49 UTC (permalink / raw) To: jianchao.wang, Tariq Toukan, Jason Gunthorpe Cc: junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote: > Hi Tariq > > Very sad that the crash was reproduced again after applied the patch. > > --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c > @@ -252,6 +252,7 @@ static inline bool mlx4_en_is_ring_empty(struct mlx4_en_rx_ring *ring) > > static inline void mlx4_en_update_rx_prod_db(struct mlx4_en_rx_ring *ring) > { > + dma_wmb(); So... is wmb() here fixing the issue ? > *ring->wqres.db.db = cpu_to_be32(ring->prod & 0xffff); > } > > I analyzed the kdump, it should be a memory corruption. > > Thanks > Jianchao > On 01/15/2018 01:50 PM, jianchao.wang wrote: > > Hi Tariq > > > > Thanks for your kindly response. > > > > On 01/14/2018 05:47 PM, Tariq Toukan wrote: > > > Thanks Jianchao for your patch. > > > > > > And Thank you guys for your reviews, much appreciated. > > > I was off-work on Friday and Saturday. > > > > > > On 14/01/2018 4:40 AM, jianchao.wang wrote: > > > > Dear all > > > > > > > > Thanks for the kindly response and reviewing. That's really appreciated. > > > > > > > > On 01/13/2018 12:46 AM, Eric Dumazet wrote: > > > > > > Does this need to be dma_wmb(), and should it be in > > > > > > mlx4_en_update_rx_prod_db ? > > > > > > > > > > > > > > > > +1 on dma_wmb() > > > > > > > > > > On what architecture bug was observed ? > > > > > > > > This issue was observed on x86-64. > > > > And I will send a new patch, in which replace wmb() with dma_wmb(), to customer > > > > to confirm. > > > > > > +1 on dma_wmb, let us know once customer confirms. 
> > > Please place it within mlx4_en_update_rx_prod_db as suggested. > > > > Yes, I have recommended it to customer. > > Once I get the result, I will share it here. > > > All other calls to mlx4_en_update_rx_prod_db are in control/slow path so I prefer being on the safe side, and care less about bulking the barrier. > > > > > > Thanks, > > > Tariq > > > ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-19 15:49 ` Eric Dumazet @ 2018-01-21 9:31 ` Tariq Toukan 2018-01-21 16:24 ` Tariq Toukan 2018-01-21 20:40 ` Jason Gunthorpe 0 siblings, 2 replies; 27+ messages in thread From: Tariq Toukan @ 2018-01-21 9:31 UTC (permalink / raw) To: Eric Dumazet, jianchao.wang, Tariq Toukan, Jason Gunthorpe Cc: junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed On 19/01/2018 5:49 PM, Eric Dumazet wrote: > On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote: >> Hi Tariq >> >> Very sad that the crash was reproduced again after applied the patch. >> >> --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c >> +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c >> @@ -252,6 +252,7 @@ static inline bool mlx4_en_is_ring_empty(struct mlx4_en_rx_ring *ring) >> >> static inline void mlx4_en_update_rx_prod_db(struct mlx4_en_rx_ring *ring) >> { >> + dma_wmb(); > > So... is wmb() here fixing the issue ? > >> *ring->wqres.db.db = cpu_to_be32(ring->prod & 0xffff); >> } >> >> I analyzed the kdump, it should be a memory corruption. >> >> Thanks >> Jianchao Hmm, this is actually consistent with the example below [1]. AIU from the example, it seems that the dma_wmb/dma_rmb barriers are good for synchronizing cpu/device accesses to the "Streaming DMA mapped" buffers (the descriptors, went through the dma_map_page() API), but not for the doorbell (a coherent memory, typically allocated via dma_alloc_coherent) that requires using the stronger wmb() barrier. [1] Documentation/memory-barriers.txt (*) dma_wmb(); (*) dma_rmb(); These are for use with consistent memory to guarantee the ordering of writes or reads of shared memory accessible to both the CPU and a DMA capable device. 
For example, consider a device driver that shares memory with a device and uses a descriptor status value to indicate if the descriptor belongs to the device or the CPU, and a doorbell to notify it when new descriptors are available: if (desc->status != DEVICE_OWN) { /* do not read data until we own descriptor */ dma_rmb(); /* read/modify data */ read_data = desc->data; desc->data = write_data; /* flush modifications before status update */ dma_wmb(); /* assign ownership */ desc->status = DEVICE_OWN; /* force memory to sync before notifying device via MMIO */ wmb(); /* notify device of new descriptors */ writel(DESC_NOTIFY, doorbell); } The dma_rmb() allows us guarantee the device has released ownership before we read the data from the descriptor, and the dma_wmb() allows us to guarantee the data is written to the descriptor before the device can see it now has ownership. The wmb() is needed to guarantee that the cache coherent memory writes have completed before attempting a write to the cache incoherent MMIO region. See Documentation/DMA-API.txt for more information on consistent memory. >> On 01/15/2018 01:50 PM, jianchao.wang wrote: >>> Hi Tariq >>> >>> Thanks for your kindly response. >>> >>> On 01/14/2018 05:47 PM, Tariq Toukan wrote: >>>> Thanks Jianchao for your patch. >>>> >>>> And Thank you guys for your reviews, much appreciated. >>>> I was off-work on Friday and Saturday. >>>> >>>> On 14/01/2018 4:40 AM, jianchao.wang wrote: >>>>> Dear all >>>>> >>>>> Thanks for the kindly response and reviewing. That's really appreciated. >>>>> >>>>> On 01/13/2018 12:46 AM, Eric Dumazet wrote: >>>>>>> Does this need to be dma_wmb(), and should it be in >>>>>>> mlx4_en_update_rx_prod_db ? >>>>>>> >>>>>> >>>>>> +1 on dma_wmb() >>>>>> >>>>>> On what architecture bug was observed ? >>>>> >>>>> This issue was observed on x86-64. >>>>> And I will send a new patch, in which replace wmb() with dma_wmb(), to customer >>>>> to confirm. 
>>>> >>>> +1 on dma_wmb, let us know once customer confirms. >>>> Please place it within mlx4_en_update_rx_prod_db as suggested. >>> >>> Yes, I have recommended it to customer. >>> Once I get the result, I will share it here. >>>> All other calls to mlx4_en_update_rx_prod_db are in control/slow path so I prefer being on the safe side, and care less about bulking the barrier. >>>> >>>> Thanks, >>>> Tariq >>>> > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-21 9:31 ` Tariq Toukan @ 2018-01-21 16:24 ` Tariq Toukan 2018-01-21 16:43 ` Eric Dumazet 2018-01-22 2:12 ` jianchao.wang 2018-01-21 20:40 ` Jason Gunthorpe 1 sibling, 2 replies; 27+ messages in thread From: Tariq Toukan @ 2018-01-21 16:24 UTC (permalink / raw) To: Tariq Toukan, Eric Dumazet, jianchao.wang, Jason Gunthorpe Cc: junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed On 21/01/2018 11:31 AM, Tariq Toukan wrote: > > > On 19/01/2018 5:49 PM, Eric Dumazet wrote: >> On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote: >>> Hi Tariq >>> >>> Very sad that the crash was reproduced again after applied the patch. Memory barriers vary for different Archs, can you please share more details regarding arch and repro steps? >>> >>> --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c >>> +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c >>> @@ -252,6 +252,7 @@ static inline bool mlx4_en_is_ring_empty(struct >>> mlx4_en_rx_ring *ring) >>> static inline void mlx4_en_update_rx_prod_db(struct mlx4_en_rx_ring >>> *ring) >>> { >>> + dma_wmb(); >> >> So... is wmb() here fixing the issue ? >> >>> *ring->wqres.db.db = cpu_to_be32(ring->prod & 0xffff); >>> } >>> >>> I analyzed the kdump, it should be a memory corruption. >>> >>> Thanks >>> Jianchao > > Hmm, this is actually consistent with the example below [1]. > > AIU from the example, it seems that the dma_wmb/dma_rmb barriers are > good for synchronizing cpu/device accesses to the "Streaming DMA mapped" > buffers (the descriptors, went through the dma_map_page() API), but not > for the doorbell (a coherent memory, typically allocated via > dma_alloc_coherent) that requires using the stronger wmb() barrier. 
> > > [1] Documentation/memory-barriers.txt > > (*) dma_wmb(); > (*) dma_rmb(); > > These are for use with consistent memory to guarantee the ordering > of writes or reads of shared memory accessible to both the CPU and a > DMA capable device. > > For example, consider a device driver that shares memory with a > device > and uses a descriptor status value to indicate if the descriptor > belongs > to the device or the CPU, and a doorbell to notify it when new > descriptors are available: > > if (desc->status != DEVICE_OWN) { > /* do not read data until we own descriptor */ > dma_rmb(); > > /* read/modify data */ > read_data = desc->data; > desc->data = write_data; > > /* flush modifications before status update */ > dma_wmb(); > > /* assign ownership */ > desc->status = DEVICE_OWN; > > /* force memory to sync before notifying device via MMIO */ > wmb(); > > /* notify device of new descriptors */ > writel(DESC_NOTIFY, doorbell); > } > > The dma_rmb() allows us guarantee the device has released ownership > before we read the data from the descriptor, and the dma_wmb() allows > us to guarantee the data is written to the descriptor before the > device > can see it now has ownership. The wmb() is needed to guarantee > that the > cache coherent memory writes have completed before attempting a > write to > the cache incoherent MMIO region. > > See Documentation/DMA-API.txt for more information on consistent > memory. > > >>> On 01/15/2018 01:50 PM, jianchao.wang wrote: >>>> Hi Tariq >>>> >>>> Thanks for your kindly response. >>>> >>>> On 01/14/2018 05:47 PM, Tariq Toukan wrote: >>>>> Thanks Jianchao for your patch. >>>>> >>>>> And Thank you guys for your reviews, much appreciated. >>>>> I was off-work on Friday and Saturday. >>>>> >>>>> On 14/01/2018 4:40 AM, jianchao.wang wrote: >>>>>> Dear all >>>>>> >>>>>> Thanks for the kindly response and reviewing. That's really >>>>>> appreciated. 
>>>>>> >>>>>> On 01/13/2018 12:46 AM, Eric Dumazet wrote: >>>>>>>> Does this need to be dma_wmb(), and should it be in >>>>>>>> mlx4_en_update_rx_prod_db ? >>>>>>>> >>>>>>> >>>>>>> +1 on dma_wmb() >>>>>>> >>>>>>> On what architecture bug was observed ? >>>>>> >>>>>> This issue was observed on x86-64. >>>>>> And I will send a new patch, in which replace wmb() with >>>>>> dma_wmb(), to customer >>>>>> to confirm. >>>>> >>>>> +1 on dma_wmb, let us know once customer confirms. >>>>> Please place it within mlx4_en_update_rx_prod_db as suggested. >>>> >>>> Yes, I have recommended it to customer. >>>> Once I get the result, I will share it here. >>>>> All other calls to mlx4_en_update_rx_prod_db are in control/slow >>>>> path so I prefer being on the safe side, and care less about >>>>> bulking the barrier. >>>>> >>>>> Thanks, >>>>> Tariq >>>>> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-21 16:24 ` Tariq Toukan @ 2018-01-21 16:43 ` Eric Dumazet 2018-01-22 2:40 ` jianchao.wang 2018-01-22 2:12 ` jianchao.wang 1 sibling, 1 reply; 27+ messages in thread From: Eric Dumazet @ 2018-01-21 16:43 UTC (permalink / raw) To: Tariq Toukan, jianchao.wang, Jason Gunthorpe Cc: junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed On Sun, 2018-01-21 at 18:24 +0200, Tariq Toukan wrote: > > On 21/01/2018 11:31 AM, Tariq Toukan wrote: > > > > > > On 19/01/2018 5:49 PM, Eric Dumazet wrote: > > > On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote: > > > > Hi Tariq > > > > > > > > Very sad that the crash was reproduced again after applied the patch. > > Memory barriers vary for different Archs, can you please share more > details regarding arch and repro steps? Yeah, mlx4 NICs in Google fleet receive trillions of packets per second, and we never noticed an issue. Although we are using a slightly different driver, using order-0 pages and fast pages recycling. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-21 16:43 ` Eric Dumazet @ 2018-01-22 2:40 ` jianchao.wang 2018-01-22 15:47 ` Jason Gunthorpe 0 siblings, 1 reply; 27+ messages in thread From: jianchao.wang @ 2018-01-22 2:40 UTC (permalink / raw) To: Eric Dumazet, Tariq Toukan, Jason Gunthorpe Cc: junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed Hi Eric On 01/22/2018 12:43 AM, Eric Dumazet wrote: > On Sun, 2018-01-21 at 18:24 +0200, Tariq Toukan wrote: >> >> On 21/01/2018 11:31 AM, Tariq Toukan wrote: >>> >>> >>> On 19/01/2018 5:49 PM, Eric Dumazet wrote: >>>> On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote: >>>>> Hi Tariq >>>>> >>>>> Very sad that the crash was reproduced again after applied the patch. >> >> Memory barriers vary for different Archs, can you please share more >> details regarding arch and repro steps? > > Yeah, mlx4 NICs in Google fleet receive trillions of packets per > second, and we never noticed an issue. > > Although we are using a slightly different driver, using order-0 pages > and fast pages recycling. > > The driver we use will set the page reference count to (size of pages)/stride; the pages will be freed by the networking stack when the reference count becomes zero, and the order-3 pages may be allocated again soon. This gives the NIC device a chance to corrupt pages which have been allocated by others, such as slab. In the current version with order-0 pages and page recycling, the corruption may sometimes occur on inbound packets and just cause some bad and invalid packets which will be dropped. Thanks Jianchao ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-22 2:40 ` jianchao.wang @ 2018-01-22 15:47 ` Jason Gunthorpe 2018-01-23 3:25 ` jianchao.wang 0 siblings, 1 reply; 27+ messages in thread From: Jason Gunthorpe @ 2018-01-22 15:47 UTC (permalink / raw) To: jianchao.wang Cc: Eric Dumazet, Tariq Toukan, junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed On Mon, Jan 22, 2018 at 10:40:53AM +0800, jianchao.wang wrote: > Hi Eric > > On 01/22/2018 12:43 AM, Eric Dumazet wrote: > > On Sun, 2018-01-21 at 18:24 +0200, Tariq Toukan wrote: > >> > >> On 21/01/2018 11:31 AM, Tariq Toukan wrote: > >>> > >>> > >>> On 19/01/2018 5:49 PM, Eric Dumazet wrote: > >>>> On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote: > >>>>> Hi Tariq > >>>>> > >>>>> Very sad that the crash was reproduced again after applied the patch. > >> > >> Memory barriers vary for different Archs, can you please share more > >> details regarding arch and repro steps? > > > > Yeah, mlx4 NICs in Google fleet receive trillions of packets per > > second, and we never noticed an issue. > > > > Although we are using a slightly different driver, using order-0 pages > > and fast pages recycling. > > > > > The driver we use will will set the page reference count to (size of pages)/stride, the > pages will be freed by networking stack when the reference become zero, and the order-3 > pages maybe allocated soon, this give NIC device a chance to corrupt the pages which have > been allocated by others, such as slab. But it looks like the wmb() is placed when stuffing new rx descriptors into the device - how can it prevent corruption of pages where ownership was transferred from device to the host? That sounds more like a rmb() is missing someplace to me... (Granted the missing wmb() is a bug, but it may not be fully solving this issue??) Jason ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-22 15:47 ` Jason Gunthorpe @ 2018-01-23 3:25 ` jianchao.wang 0 siblings, 0 replies; 27+ messages in thread From: jianchao.wang @ 2018-01-23 3:25 UTC (permalink / raw) To: Jason Gunthorpe Cc: Eric Dumazet, Tariq Toukan, junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed Hi Jason Thanks for your kindly response. On 01/22/2018 11:47 PM, Jason Gunthorpe wrote: >>> Yeah, mlx4 NICs in Google fleet receive trillions of packets per >>> second, and we never noticed an issue. >>> >>> Although we are using a slightly different driver, using order-0 pages >>> and fast pages recycling. >>> >>> >> The driver we use will will set the page reference count to (size of pages)/stride, the >> pages will be freed by networking stack when the reference become zero, and the order-3 >> pages maybe allocated soon, this give NIC device a chance to corrupt the pages which have >> been allocated by others, such as slab. > But it looks like the wmb() is placed when stuffing new rx descriptors > into the device - how can it prevent corruption of pages where > ownership was transfered from device to the host? That sounds more like a > rmb() is missing someplace to me... > The device may see the prod_db updating before rx_desc updating. Then it will get stale rx_descs. These stale rx_descs may contain pages that have been freed to the host. > (Granted the missing wmb() is a bug, but it may not be fully solving this > issue??) The wmb() here fixes one of the customer's test cases. Thanks Jianchao ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-21 16:24 ` Tariq Toukan @ 2018-01-21 16:43 ` Eric Dumazet 2018-01-22 2:40 ` jianchao.wang 2018-01-22 2:12 ` jianchao.wang 1 sibling, 2 replies; 27+ messages in thread From: jianchao.wang @ 2018-01-22 2:12 UTC (permalink / raw) To: Tariq Toukan, Eric Dumazet, Jason Gunthorpe Cc: junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed Hi Tariq and all Many thanks for your kindly and detailed response and comment. On 01/22/2018 12:24 AM, Tariq Toukan wrote: > > > On 21/01/2018 11:31 AM, Tariq Toukan wrote: >> >> >> On 19/01/2018 5:49 PM, Eric Dumazet wrote: >>> On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote: >>>> Hi Tariq >>>> >>>> Very sad that the crash was reproduced again after applied the patch. > > Memory barriers vary for different Archs, can you please share more details regarding arch and repro steps? The hardware is HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015 The xen is installed. The crash occurred in DOM0. Regarding the repro steps, it is a customer's test which does heavy disk I/O over NFS storage without any guest. The patch that can fix this issue is as follows: --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c @@ -1005,6 +1005,7 @@ out: wmb(); /* ensure HW sees CQ consumer before we post new buffers */ ring->cons = cq->mcq.cons_index; mlx4_en_refill_rx_buffers(priv, ring); + wmb(); mlx4_en_update_rx_prod_db(ring); return polled; } Thanks Jianchao > >>>> >>>> --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c >>>> +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c >>>> @@ -252,6 +252,7 @@ static inline bool mlx4_en_is_ring_empty(struct mlx4_en_rx_ring *ring) >>>> static inline void mlx4_en_update_rx_prod_db(struct mlx4_en_rx_ring *ring) >>>> { >>>> + dma_wmb(); >>> >>> So... is wmb() here fixing the issue ?
>>> >>>> *ring->wqres.db.db = cpu_to_be32(ring->prod & 0xffff); >>>> } >>>> >>>> I analyzed the kdump, it should be a memory corruption. >>>> >>>> Thanks >>>> Jianchao >> >> Hmm, this is actually consistent with the example below [1]. >> >> AIU from the example, it seems that the dma_wmb/dma_rmb barriers are good for synchronizing cpu/device accesses to the "Streaming DMA mapped" buffers (the descriptors, went through the dma_map_page() API), but not for the doorbell (a coherent memory, typically allocated via dma_alloc_coherent) that requires using the stronger wmb() barrier. >> >> >> [1] Documentation/memory-barriers.txt >> >> (*) dma_wmb(); >> (*) dma_rmb(); >> >> These are for use with consistent memory to guarantee the ordering >> of writes or reads of shared memory accessible to both the CPU and a >> DMA capable device. >> >> For example, consider a device driver that shares memory with a device >> and uses a descriptor status value to indicate if the descriptor belongs >> to the device or the CPU, and a doorbell to notify it when new >> descriptors are available: >> >> if (desc->status != DEVICE_OWN) { >> /* do not read data until we own descriptor */ >> dma_rmb(); >> >> /* read/modify data */ >> read_data = desc->data; >> desc->data = write_data; >> >> /* flush modifications before status update */ >> dma_wmb(); >> >> /* assign ownership */ >> desc->status = DEVICE_OWN; >> >> /* force memory to sync before notifying device via MMIO */ >> wmb(); >> >> /* notify device of new descriptors */ >> writel(DESC_NOTIFY, doorbell); >> } >> >> The dma_rmb() allows us guarantee the device has released ownership >> before we read the data from the descriptor, and the dma_wmb() allows >> us to guarantee the data is written to the descriptor before the device >> can see it now has ownership. The wmb() is needed to guarantee that the >> cache coherent memory writes have completed before attempting a write to >> the cache incoherent MMIO region. 
>> >> See Documentation/DMA-API.txt for more information on consistent memory. >> >> >>>> On 01/15/2018 01:50 PM, jianchao.wang wrote: >>>>> Hi Tariq >>>>> >>>>> Thanks for your kindly response. >>>>> >>>>> On 01/14/2018 05:47 PM, Tariq Toukan wrote: >>>>>> Thanks Jianchao for your patch. >>>>>> >>>>>> And Thank you guys for your reviews, much appreciated. >>>>>> I was off-work on Friday and Saturday. >>>>>> >>>>>> On 14/01/2018 4:40 AM, jianchao.wang wrote: >>>>>>> Dear all >>>>>>> >>>>>>> Thanks for the kindly response and reviewing. That's really appreciated. >>>>>>> >>>>>>> On 01/13/2018 12:46 AM, Eric Dumazet wrote: >>>>>>>>> Does this need to be dma_wmb(), and should it be in >>>>>>>>> mlx4_en_update_rx_prod_db ? >>>>>>>>> >>>>>>>> >>>>>>>> +1 on dma_wmb() >>>>>>>> >>>>>>>> On what architecture bug was observed ? >>>>>>> >>>>>>> This issue was observed on x86-64. >>>>>>> And I will send a new patch, in which replace wmb() with dma_wmb(), to customer >>>>>>> to confirm. >>>>>> >>>>>> +1 on dma_wmb, let us know once customer confirms. >>>>>> Please place it within mlx4_en_update_rx_prod_db as suggested. >>>>> >>>>> Yes, I have recommended it to customer. >>>>> Once I get the result, I will share it here. >>>>>> All other calls to mlx4_en_update_rx_prod_db are in control/slow path so I prefer being on the safe side, and care less about bulking the barrier. >>>>>> >>>>>> Thanks, >>>>>> Tariq >>>>>> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at https://urldefense.proofpoint.com/v2/url?u=http-3A__vger.kernel.org_majordomo-2Dinfo.html&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ&m=s8_-sqvK_-1EHwvxh5DBpBIakIb0lpcn0fN6zbFxgpk&s=q3jITeGfYvYPdMo8vqfURwAbUNbSrVi2pkJfmPVGUH8&e= >>> > ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-22 2:12 ` jianchao.wang @ 2018-01-25 3:27 ` jianchao.wang 2018-01-25 3:55 ` Eric Dumazet 0 siblings, 1 reply; 27+ messages in thread From: jianchao.wang @ 2018-01-25 3:27 UTC (permalink / raw) To: Tariq Toukan, Eric Dumazet, Jason Gunthorpe Cc: junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed Hi Tariq On 01/22/2018 10:12 AM, jianchao.wang wrote: >>> On 19/01/2018 5:49 PM, Eric Dumazet wrote: >>>> On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote: >>>>> Hi Tariq >>>>> >>>>> Very sad that the crash was reproduced again after applied the patch. >> Memory barriers vary for different Archs, can you please share more details regarding arch and repro steps? > The hardware is HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015 > The xen is installed. The crash occurred in DOM0. > Regarding to the repro steps, it is a customer's test which does heavy disk I/O over NFS storage without any guest. > What is the final suggestion on this? If we use wmb() there, is the performance pulled down? Thanks in advance Jianchao ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-25 3:27 ` jianchao.wang @ 2018-01-25 3:55 ` Eric Dumazet 2018-01-25 6:25 ` jianchao.wang 0 siblings, 1 reply; 27+ messages in thread From: Eric Dumazet @ 2018-01-25 3:55 UTC (permalink / raw) To: jianchao.wang, Tariq Toukan, Jason Gunthorpe Cc: junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed On Thu, 2018-01-25 at 11:27 +0800, jianchao.wang wrote: > Hi Tariq > > On 01/22/2018 10:12 AM, jianchao.wang wrote: > > > > On 19/01/2018 5:49 PM, Eric Dumazet wrote: > > > > > On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote: > > > > > > Hi Tariq > > > > > > > > > > > > Very sad that the crash was reproduced again after applied the patch. > > > > > > Memory barriers vary for different Archs, can you please share more details regarding arch and repro steps? > > The hardware is HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015 > > The xen is installed. The crash occurred in DOM0. > > Regarding to the repro steps, it is a customer's test which does heavy disk I/O over NFS storage without any guest. > > > > What is the finial suggestion on this ? > If use wmb there, is the performance pulled down ? Since https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=dad42c3038a59d27fced28ee4ec1d4a891b28155 we batch allocations, so mlx4_en_refill_rx_buffers() is not called that often. I doubt the additional wmb() will have serious impact there. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-25 3:55 ` Eric Dumazet @ 2018-01-25 6:25 ` jianchao.wang 2018-01-25 9:54 ` Tariq Toukan 0 siblings, 1 reply; 27+ messages in thread From: jianchao.wang @ 2018-01-25 6:25 UTC (permalink / raw) To: Eric Dumazet, Tariq Toukan, Jason Gunthorpe Cc: junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed Hi Eric Thanks for your kindly response and suggestion. That's really appreciated. Jianchao On 01/25/2018 11:55 AM, Eric Dumazet wrote: > On Thu, 2018-01-25 at 11:27 +0800, jianchao.wang wrote: >> Hi Tariq >> >> On 01/22/2018 10:12 AM, jianchao.wang wrote: >>>>> On 19/01/2018 5:49 PM, Eric Dumazet wrote: >>>>>> On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote: >>>>>>> Hi Tariq >>>>>>> >>>>>>> Very sad that the crash was reproduced again after applied the patch. >>>> >>>> Memory barriers vary for different Archs, can you please share more details regarding arch and repro steps? >>> The hardware is HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015 >>> The xen is installed. The crash occurred in DOM0. >>> Regarding to the repro steps, it is a customer's test which does heavy disk I/O over NFS storage without any guest. >>> >> >> What is the finial suggestion on this ? >> If use wmb there, is the performance pulled down ? > > Since https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=dad42c3038a59d27fced28ee4ec1d4a891b28155 > > we batch allocations, so mlx4_en_refill_rx_buffers() is not called that often. > > I doubt the additional wmb() will have serious impact there. > > ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-25 6:25 ` jianchao.wang @ 2018-01-25 9:54 ` Tariq Toukan 2018-01-27 12:41 ` jianchao.wang 0 siblings, 1 reply; 27+ messages in thread From: Tariq Toukan @ 2018-01-25 9:54 UTC (permalink / raw) To: jianchao.wang, Eric Dumazet, Tariq Toukan, Jason Gunthorpe Cc: junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed On 25/01/2018 8:25 AM, jianchao.wang wrote: > Hi Eric > > Thanks for you kindly response and suggestion. > That's really appreciated. > > Jianchao > > On 01/25/2018 11:55 AM, Eric Dumazet wrote: >> On Thu, 2018-01-25 at 11:27 +0800, jianchao.wang wrote: >>> Hi Tariq >>> >>> On 01/22/2018 10:12 AM, jianchao.wang wrote: >>>>>> On 19/01/2018 5:49 PM, Eric Dumazet wrote: >>>>>>> On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote: >>>>>>>> Hi Tariq >>>>>>>> >>>>>>>> Very sad that the crash was reproduced again after applied the patch. >>>>> >>>>> Memory barriers vary for different Archs, can you please share more details regarding arch and repro steps? >>>> The hardware is HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015 >>>> The xen is installed. The crash occurred in DOM0. >>>> Regarding to the repro steps, it is a customer's test which does heavy disk I/O over NFS storage without any guest. >>>> >>> >>> What is the finial suggestion on this ? >>> If use wmb there, is the performance pulled down ? I want to evaluate this effect. I agree with Eric, expected impact is restricted, especially after batching the allocations. 
>> >> Since https://urldefense.proofpoint.com/v2/url?u=https-3A__git.kernel.org_pub_scm_linux_kernel_git_davem_net-2Dnext.git_commit_-3Fid-3Ddad42c3038a59d27fced28ee4ec1d4a891b28155&d=DwICaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ&m=c0oI8duFkyFBILMQYDsqRApHQrOlLY_2uGiz_utcd7s&s=E4_XKmSI0B63qB0DLQ1EX_fj1bOP78ZdeYADBf33B-k&e= >> >> we batch allocations, so mlx4_en_refill_rx_buffers() is not called that often. >> >> I doubt the additional wmb() will have serious impact there. >> I will test the effect (it'll be beginning of next week). I'll update so we can make a more confident decision. Thanks, Tariq >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-25 9:54 ` Tariq Toukan @ 2018-01-27 12:41 ` jianchao.wang [not found] ` <d9883261-e93e-400a-757c-3a81d8b6aca1@mellanox.com> 0 siblings, 1 reply; 27+ messages in thread From: jianchao.wang @ 2018-01-27 12:41 UTC (permalink / raw) To: Tariq Toukan, Eric Dumazet, Jason Gunthorpe Cc: junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed Hi Tariq Thanks for your kindly response. That's really appreciated. On 01/25/2018 05:54 PM, Tariq Toukan wrote: > > > On 25/01/2018 8:25 AM, jianchao.wang wrote: >> Hi Eric >> >> Thanks for you kindly response and suggestion. >> That's really appreciated. >> >> Jianchao >> >> On 01/25/2018 11:55 AM, Eric Dumazet wrote: >>> On Thu, 2018-01-25 at 11:27 +0800, jianchao.wang wrote: >>>> Hi Tariq >>>> >>>> On 01/22/2018 10:12 AM, jianchao.wang wrote: >>>>>>> On 19/01/2018 5:49 PM, Eric Dumazet wrote: >>>>>>>> On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote: >>>>>>>>> Hi Tariq >>>>>>>>> >>>>>>>>> Very sad that the crash was reproduced again after applied the patch. >>>>>> >>>>>> Memory barriers vary for different Archs, can you please share more details regarding arch and repro steps? >>>>> The hardware is HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015 >>>>> The xen is installed. The crash occurred in DOM0. >>>>> Regarding to the repro steps, it is a customer's test which does heavy disk I/O over NFS storage without any guest. >>>>> >>>> >>>> What is the finial suggestion on this ? >>>> If use wmb there, is the performance pulled down ? > > I want to evaluate this effect. 
> I agree with Eric, expected impact is restricted, especially after batching the allocations.> >>> >>> Since https://urldefense.proofpoint.com/v2/url?u=https-3A__git.kernel.org_pub_scm_linux_kernel_git_davem_net-2Dnext.git_commit_-3Fid-3Ddad42c3038a59d27fced28ee4ec1d4a891b28155&d=DwICaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ&m=c0oI8duFkyFBILMQYDsqRApHQrOlLY_2uGiz_utcd7s&s=E4_XKmSI0B63qB0DLQ1EX_fj1bOP78ZdeYADBf33B-k&e= >>> >>> we batch allocations, so mlx4_en_refill_rx_buffers() is not called that often. >>> >>> I doubt the additional wmb() will have serious impact there. >>> > > I will test the effect (it'll be beginning of next week). > I'll update so we can make a more confident decision. > I have also sent patches with wmb and batching allocations to customer and let them check whether the performance is impacted. And update here asap when get feedback. > Thanks, > Tariq > >>> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at https://urldefense.proofpoint.com/v2/url?u=http-3A__vger.kernel.org_majordomo-2Dinfo.html&d=DwICaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ&m=f0myCdBQoRjaklxGau_S9ZtQKSQYALW9p2MIuTMAEYo&s=447fFu-xZoLvmxdaVhijK6cUk4Jcx7GtBCNddQT4GOQ&e= >> > ^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <d9883261-e93e-400a-757c-3a81d8b6aca1@mellanox.com>]
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating [not found] ` <d9883261-e93e-400a-757c-3a81d8b6aca1@mellanox.com> @ 2019-01-02 1:43 ` jianchao.wang 0 siblings, 0 replies; 27+ messages in thread From: jianchao.wang @ 2019-01-02 1:43 UTC (permalink / raw) To: Tariq Toukan, Eric Dumazet, Jason Gunthorpe Cc: junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed On 12/31/18 12:27 AM, Tariq Toukan wrote: > > > On 1/27/2018 2:41 PM, jianchao.wang wrote: >> Hi Tariq >> >> Thanks for your kindly response. >> That's really appreciated. >> >> On 01/25/2018 05:54 PM, Tariq Toukan wrote: >>> >>> >>> On 25/01/2018 8:25 AM, jianchao.wang wrote: >>>> Hi Eric >>>> >>>> Thanks for you kindly response and suggestion. >>>> That's really appreciated. >>>> >>>> Jianchao >>>> >>>> On 01/25/2018 11:55 AM, Eric Dumazet wrote: >>>>> On Thu, 2018-01-25 at 11:27 +0800, jianchao.wang wrote: >>>>>> Hi Tariq >>>>>> >>>>>> On 01/22/2018 10:12 AM, jianchao.wang wrote: >>>>>>>>> On 19/01/2018 5:49 PM, Eric Dumazet wrote: >>>>>>>>>> On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote: >>>>>>>>>>> Hi Tariq >>>>>>>>>>> >>>>>>>>>>> Very sad that the crash was reproduced again after applied the patch. >>>>>>>> >>>>>>>> Memory barriers vary for different Archs, can you please share more details regarding arch and repro steps? >>>>>>> The hardware is HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015 >>>>>>> The xen is installed. The crash occurred in DOM0. >>>>>>> Regarding to the repro steps, it is a customer's test which does heavy disk I/O over NFS storage without any guest. >>>>>>> >>>>>> >>>>>> What is the finial suggestion on this ? >>>>>> If use wmb there, is the performance pulled down ? >>> >>> I want to evaluate this effect. 
>>> I agree with Eric, expected impact is restricted, especially after batching the allocations. > >>>>> >>>>> Since https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=dad42c3038a59d27fced28ee4ec1d4a891b28155 >>>>> >>>>> we batch allocations, so mlx4_en_refill_rx_buffers() is not called that often. >>>>> >>>>> I doubt the additional wmb() will have serious impact there. >>>>> >>> >>> I will test the effect (it'll be beginning of next week). >>> I'll update so we can make a more confident decision. >>> >> I have also sent patches with wmb and batching allocations to customer and let them check whether the performance is impacted. >> And update here asap when get feedback. >> >>> Thanks, >>> Tariq >>> > > Hi Jianchao, > > I am interested to push this bug fix. > Do you want me to submit, or do it yourself? > Can you elaborate regarding the arch with the repro? > > This is the patch I suggest: > > --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c > @@ -161,6 +161,8 @@ static bool mlx4_en_is_ring_empty(const struct > mlx4_en_rx_ring *ring) > > static inline void mlx4_en_update_rx_prod_db(struct mlx4_en_rx_ring *ring) > { > + /* ensure rx_desc updating reaches HW before prod db updating */ > + wmb(); > *ring->wqres.db.db = cpu_to_be32(ring->prod & 0xffff); > } > Hi Tariq Happy new year! The customer provided confusing test results for us. The fix cannot fix their issue. We finally found an upstream fix, 5d70bd5c98d0e655bde2aae2b5251bdd44df5e71 (net/mlx4_en: fix potential use-after-free with dma_unmap_page), which killed the issue in Oct 2018. That's really a long way. Please go ahead with this patch.
Thanks Jianchao ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating 2018-01-21 9:31 ` Tariq Toukan 2018-01-21 16:24 ` Tariq Toukan @ 2018-01-21 20:40 ` Jason Gunthorpe 1 sibling, 0 replies; 27+ messages in thread From: Jason Gunthorpe @ 2018-01-21 20:40 UTC (permalink / raw) To: Tariq Toukan Cc: Eric Dumazet, jianchao.wang, junxiao.bi, netdev, linux-rdma, linux-kernel, Saeed Mahameed > Hmm, this is actually consistent with the example below [1]. > > AIU from the example, it seems that the dma_wmb/dma_rmb barriers are good > for synchronizing cpu/device accesses to the "Streaming DMA mapped" buffers > (the descriptors, went through the dma_map_page() API), but not for the > doorbell (a coherent memory, typically allocated via dma_alloc_coherent) > that requires using the stronger wmb() barrier. If x86 truly requires a wmb() (aka SFENCE) here then the userspace RDMA stuff is broken too, and that has been tested to death at this point.. I looked into this at one point and I thought I concluded that x86 did not require an SFENCE between a posted PCI write and writes to system memory to guarantee order with respect to the PCI device? Well, so long as non-temporal stores and other specialty accesses are not being used.. Is there a chance a fancy sse optimized memcpy or memset, crypto or something is being involved here? However, Documentation/memory-barriers.txt does seem pretty clear that the kernel definition of wmb() makes it required here, even if it might be overkill for x86? Jason ^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2019-01-02 1:43 UTC | newest] Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-01-12 3:42 [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating Jianchao Wang 2018-01-12 16:32 ` Jason Gunthorpe 2018-01-12 16:46 ` Eric Dumazet 2018-01-12 19:53 ` Saeed Mahameed 2018-01-12 20:16 ` Eric Dumazet 2018-01-12 21:01 ` Saeed Mahameed 2018-01-12 21:21 ` Eric Dumazet 2018-01-13 19:15 ` Jason Gunthorpe 2018-01-14 2:40 ` jianchao.wang 2018-01-14 9:47 ` Tariq Toukan 2018-01-15 5:50 ` jianchao.wang 2018-01-19 15:16 ` jianchao.wang 2018-01-19 15:49 ` Eric Dumazet 2018-01-21 9:31 ` Tariq Toukan 2018-01-21 16:24 ` Tariq Toukan 2018-01-21 16:43 ` Eric Dumazet 2018-01-22 2:40 ` jianchao.wang 2018-01-22 15:47 ` Jason Gunthorpe 2018-01-23 3:25 ` jianchao.wang 2018-01-22 2:12 ` jianchao.wang 2018-01-25 3:27 ` jianchao.wang 2018-01-25 3:55 ` Eric Dumazet 2018-01-25 6:25 ` jianchao.wang 2018-01-25 9:54 ` Tariq Toukan 2018-01-27 12:41 ` jianchao.wang [not found] ` <d9883261-e93e-400a-757c-3a81d8b6aca1@mellanox.com> 2019-01-02 1:43 ` jianchao.wang 2018-01-21 20:40 ` Jason Gunthorpe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).