netdev.vger.kernel.org archive mirror
* [Patch net-next] mlx5: use RCU lock in mlx5_eq_cq_get()
@ 2019-02-06  0:35 Cong Wang
  2019-02-06 12:02 ` Tariq Toukan
  2019-02-06 17:35 ` Saeed Mahameed
  0 siblings, 2 replies; 11+ messages in thread
From: Cong Wang @ 2019-02-06  0:35 UTC (permalink / raw)
  To: netdev; +Cc: Cong Wang, Saeed Mahameed, Tariq Toukan

mlx5_eq_cq_get() is called from the IRQ handler, and the spinlock inside
it becomes heavily contended when we test a heavy workload with 60 RX
queues and 80 CPUs; the contention shows up clearly in the flame graph.

In fact, radix_tree_lookup() is perfectly safe under the RCU read lock,
so we don't have to take a spinlock on this hot path. This is pretty much
the same approach as commit 291c566a2891
("net/mlx4_core: Fix racy CQ (Completion Queue) free"). Slow paths
are still serialized with the spinlock, and together with synchronize_irq()
it should be safe to move the fast path to the RCU read lock.

This patch itself reduces the latency by about 50% with our workload.
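
To make the above concrete, here is a generic sketch of the lookup/teardown
pattern this change relies on (made-up names, not the mlx5 code; it assumes
the object is created with an initial refcount of 1 and an initialized
completion, and is inserted into the tree under table->lock). The fast path
takes a reference under rcu_read_lock() only, while removal is serialized by
the spinlock and then waits out in-flight IRQ handlers and outstanding
references before freeing:

#include <linux/completion.h>
#include <linux/interrupt.h>
#include <linux/radix-tree.h>
#include <linux/rcupdate.h>
#include <linux/refcount.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/types.h>

struct obj {
	u32 id;
	refcount_t refcount;
	struct completion free;		/* completed when refcount drops to zero */
};

struct obj_table {
	struct radix_tree_root tree;	/* RCU-safe for readers */
	spinlock_t lock;		/* serializes insert/delete only */
};

/* Fast path (IRQ handler): look up and take a reference under RCU only. */
static struct obj *obj_get(struct obj_table *table, u32 id)
{
	struct obj *o;

	rcu_read_lock();
	o = radix_tree_lookup(&table->tree, id);
	if (o)
		refcount_inc(&o->refcount);	/* must stay inside the RCU section */
	rcu_read_unlock();

	return o;
}

static void obj_put(struct obj *o)
{
	if (refcount_dec_and_test(&o->refcount))
		complete(&o->free);
}

/* Slow path: unpublish, fence the IRQ handler, wait for references, free. */
static void obj_remove(struct obj_table *table, struct obj *o, unsigned int irqn)
{
	spin_lock(&table->lock);
	radix_tree_delete(&table->tree, o->id);
	spin_unlock(&table->lock);

	synchronize_irq(irqn);		/* no handler can still find the entry */
	obj_put(o);			/* drop the table's initial reference */
	wait_for_completion(&o->free);	/* wait for holders from the fast path */
	kfree(o);
}

In mlx5 the hold side is mlx5_cq_hold(), as seen in the diff below; the
sketch only illustrates the ordering, not the exact driver code.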

Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index ee04aab65a9f..7092457705a2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -114,11 +114,11 @@ static struct mlx5_core_cq *mlx5_eq_cq_get(struct mlx5_eq *eq, u32 cqn)
 	struct mlx5_cq_table *table = &eq->cq_table;
 	struct mlx5_core_cq *cq = NULL;
 
-	spin_lock(&table->lock);
+	rcu_read_lock();
 	cq = radix_tree_lookup(&table->tree, cqn);
 	if (likely(cq))
 		mlx5_cq_hold(cq);
-	spin_unlock(&table->lock);
+	rcu_read_unlock();
 
 	return cq;
 }
@@ -371,9 +371,9 @@ int mlx5_eq_add_cq(struct mlx5_eq *eq, struct mlx5_core_cq *cq)
 	struct mlx5_cq_table *table = &eq->cq_table;
 	int err;
 
-	spin_lock_irq(&table->lock);
+	spin_lock(&table->lock);
 	err = radix_tree_insert(&table->tree, cq->cqn, cq);
-	spin_unlock_irq(&table->lock);
+	spin_unlock(&table->lock);
 
 	return err;
 }
@@ -383,9 +383,9 @@ int mlx5_eq_del_cq(struct mlx5_eq *eq, struct mlx5_core_cq *cq)
 	struct mlx5_cq_table *table = &eq->cq_table;
 	struct mlx5_core_cq *tmp;
 
-	spin_lock_irq(&table->lock);
+	spin_lock(&table->lock);
 	tmp = radix_tree_delete(&table->tree, cq->cqn);
-	spin_unlock_irq(&table->lock);
+	spin_unlock(&table->lock);
 
 	if (!tmp) {
 		mlx5_core_warn(eq->dev, "cq 0x%x not found in eq 0x%x tree\n", eq->eqn, cq->cqn);
-- 
2.20.1



* Re: [Patch net-next] mlx5: use RCU lock in mlx5_eq_cq_get()
  2019-02-06  0:35 [Patch net-next] mlx5: use RCU lock in mlx5_eq_cq_get() Cong Wang
@ 2019-02-06 12:02 ` Tariq Toukan
  2019-02-06 16:55   ` Saeed Mahameed
  2019-02-06 17:35 ` Saeed Mahameed
  1 sibling, 1 reply; 11+ messages in thread
From: Tariq Toukan @ 2019-02-06 12:02 UTC (permalink / raw)
  To: Cong Wang, netdev; +Cc: Saeed Mahameed



On 2/6/2019 2:35 AM, Cong Wang wrote:
> mlx5_eq_cq_get() is called in IRQ handler, the spinlock inside
> gets a lot of contentions when we test some heavy workload
> with 60 RX queues and 80 CPU's, and it is clearly shown in the
> flame graph.
> 
> In fact, radix_tree_lookup() is perfectly fine with RCU read lock,
> we don't have to take a spinlock on this hot path. It is pretty much
> similar to commit 291c566a2891
> ("net/mlx4_core: Fix racy CQ (Completion Queue) free"). Slow paths
> are still serialized with the spinlock, and with synchronize_irq()
> it should be safe to just move the fast path to RCU read lock.
> 
> This patch itself reduces the latency by about 50% with our workload.
> 
> Cc: Saeed Mahameed <saeedm@mellanox.com>
> Cc: Tariq Toukan <tariqt@mellanox.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> ---
>   drivers/net/ethernet/mellanox/mlx5/core/eq.c | 12 ++++++------
>   1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> index ee04aab65a9f..7092457705a2 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> @@ -114,11 +114,11 @@ static struct mlx5_core_cq *mlx5_eq_cq_get(struct mlx5_eq *eq, u32 cqn)
>   	struct mlx5_cq_table *table = &eq->cq_table;
>   	struct mlx5_core_cq *cq = NULL;
>   
> -	spin_lock(&table->lock);
> +	rcu_read_lock();
>   	cq = radix_tree_lookup(&table->tree, cqn);
>   	if (likely(cq))
>   		mlx5_cq_hold(cq);
> -	spin_unlock(&table->lock);
> +	rcu_read_unlock();

Thanks for your patch.

I think we can improve it further by taking the if statement out of the 
critical section.

Other than that, patch LGTM.

Regards,
Tariq

>   
>   	return cq;
>   }
> @@ -371,9 +371,9 @@ int mlx5_eq_add_cq(struct mlx5_eq *eq, struct mlx5_core_cq *cq)
>   	struct mlx5_cq_table *table = &eq->cq_table;
>   	int err;
>   
> -	spin_lock_irq(&table->lock);
> +	spin_lock(&table->lock);
>   	err = radix_tree_insert(&table->tree, cq->cqn, cq);
> -	spin_unlock_irq(&table->lock);
> +	spin_unlock(&table->lock);
>   
>   	return err;
>   }
> @@ -383,9 +383,9 @@ int mlx5_eq_del_cq(struct mlx5_eq *eq, struct mlx5_core_cq *cq)
>   	struct mlx5_cq_table *table = &eq->cq_table;
>   	struct mlx5_core_cq *tmp;
>   
> -	spin_lock_irq(&table->lock);
> +	spin_lock(&table->lock);
>   	tmp = radix_tree_delete(&table->tree, cq->cqn);
> -	spin_unlock_irq(&table->lock);
> +	spin_unlock(&table->lock);
>   
>   	if (!tmp) {
>   		mlx5_core_warn(eq->dev, "cq 0x%x not found in eq 0x%x tree\n", eq->eqn, cq->cqn);
> 


* Re: [Patch net-next] mlx5: use RCU lock in mlx5_eq_cq_get()
  2019-02-06 12:02 ` Tariq Toukan
@ 2019-02-06 16:55   ` Saeed Mahameed
  2019-02-06 17:15     ` Cong Wang
  2019-02-06 17:17     ` Eric Dumazet
  0 siblings, 2 replies; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-06 16:55 UTC (permalink / raw)
  To: netdev, Tariq Toukan, xiyou.wangcong

On Wed, 2019-02-06 at 12:02 +0000, Tariq Toukan wrote:
> 
> On 2/6/2019 2:35 AM, Cong Wang wrote:
> > mlx5_eq_cq_get() is called in IRQ handler, the spinlock inside
> > gets a lot of contentions when we test some heavy workload
> > with 60 RX queues and 80 CPU's, and it is clearly shown in the
> > flame graph.
> > 


Hi Cong,

The patch is OK with me, but I really doubt that you can hit contention
on the latest upstream driver, since we already have a spinlock per EQ,
which means a spinlock per core; each EQ (core) MSI-X handler can only
access one spinlock (its own). So I am surprised that you got the
contention. Maybe you are not running the latest upstream driver?

What is the workload?

> > In fact, radix_tree_lookup() is perfectly fine with RCU read lock,
> > we don't have to take a spinlock on this hot path. It is pretty
> > much
> > similar to commit 291c566a2891
> > ("net/mlx4_core: Fix racy CQ (Completion Queue) free"). Slow paths
> > are still serialized with the spinlock, and with synchronize_irq()
> > it should be safe to just move the fast path to RCU read lock.
> > 
> > This patch itself reduces the latency by about 50% with our
> > workload.
> > 
> > Cc: Saeed Mahameed <saeedm@mellanox.com>
> > Cc: Tariq Toukan <tariqt@mellanox.com>
> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> > ---
> >   drivers/net/ethernet/mellanox/mlx5/core/eq.c | 12 ++++++------
> >   1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > index ee04aab65a9f..7092457705a2 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > @@ -114,11 +114,11 @@ static struct mlx5_core_cq
> > *mlx5_eq_cq_get(struct mlx5_eq *eq, u32 cqn)
> >   	struct mlx5_cq_table *table = &eq->cq_table;
> >   	struct mlx5_core_cq *cq = NULL;
> >   
> > -	spin_lock(&table->lock);
> > +	rcu_read_lock();
> >   	cq = radix_tree_lookup(&table->tree, cqn);
> >   	if (likely(cq))
> >   		mlx5_cq_hold(cq);
> > -	spin_unlock(&table->lock);
> > +	rcu_read_unlock();
> 
> Thanks for you patch.
> 
> I think we can improve it further, by taking the if statement out of
> the 
> critical section.
> 

No, mlx5_cq_hold() must stay under the RCU read lock, otherwise the cq
might get freed before the IRQ handler gets a chance to increment the
ref count on it.

Another way to do it is to not do any refcounting in the IRQ handler and
instead fence cq removal via synchronize_irq(eq->irqn) in mlx5_eq_del_cq().
But let's keep one approach (refcounting); synchronize_irq()/RCU can be
heavy sometimes, especially on RDMA workloads with many create/destroy cq
calls in loops.
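
For illustration only, here is a hypothetical rewrite (not proposed code)
showing the window that hoisting the hold out of the RCU read section
would open:

/* Hypothetical, broken variant for illustration only: the reference is
 * taken after leaving the RCU read-side critical section.
 */
static struct mlx5_core_cq *broken_eq_cq_get(struct mlx5_eq *eq, u32 cqn)
{
	struct mlx5_cq_table *table = &eq->cq_table;
	struct mlx5_core_cq *cq;

	rcu_read_lock();
	cq = radix_tree_lookup(&table->tree, cqn);
	rcu_read_unlock();

	/*
	 * Window: mlx5_eq_del_cq() can run here, the CQ's last reference
	 * can be dropped and the memory freed -- nothing on the free side
	 * waits for an RCU grace period, only the refcount protects us.
	 * The mlx5_cq_hold() below would then touch freed memory.
	 */
	if (likely(cq))
		mlx5_cq_hold(cq);

	return cq;
}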

> Other than that, patch LGTM.
> 
> Regards,
> Tariq
> 
> >   
> >   	return cq;
> >   }
> > @@ -371,9 +371,9 @@ int mlx5_eq_add_cq(struct mlx5_eq *eq, struct
> > mlx5_core_cq *cq)
> >   	struct mlx5_cq_table *table = &eq->cq_table;
> >   	int err;
> >   
> > -	spin_lock_irq(&table->lock);
> > +	spin_lock(&table->lock);
> >   	err = radix_tree_insert(&table->tree, cq->cqn, cq);
> > -	spin_unlock_irq(&table->lock);
> > +	spin_unlock(&table->lock);
> >   
> >   	return err;
> >   }
> > @@ -383,9 +383,9 @@ int mlx5_eq_del_cq(struct mlx5_eq *eq, struct
> > mlx5_core_cq *cq)
> >   	struct mlx5_cq_table *table = &eq->cq_table;
> >   	struct mlx5_core_cq *tmp;
> >   
> > -	spin_lock_irq(&table->lock);
> > +	spin_lock(&table->lock);
> >   	tmp = radix_tree_delete(&table->tree, cq->cqn);
> > -	spin_unlock_irq(&table->lock);
> > +	spin_unlock(&table->lock);
> >   
> >   	if (!tmp) {
> >   		mlx5_core_warn(eq->dev, "cq 0x%x not found in eq 0x%x
> > tree\n", eq->eqn, cq->cqn);
> > 


* Re: [Patch net-next] mlx5: use RCU lock in mlx5_eq_cq_get()
  2019-02-06 16:55   ` Saeed Mahameed
@ 2019-02-06 17:15     ` Cong Wang
  2019-02-06 17:35       ` Saeed Mahameed
  2019-02-06 17:17     ` Eric Dumazet
  1 sibling, 1 reply; 11+ messages in thread
From: Cong Wang @ 2019-02-06 17:15 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: netdev, Tariq Toukan

On Wed, Feb 6, 2019 at 8:55 AM Saeed Mahameed <saeedm@mellanox.com> wrote:
> Hi Cong,
>
> The patch is ok to me, but i really doubt that you can hit a contention
> on latest upstream driver, since we already have spinlock per EQ, which
> means spinlock per core,  each EQ (core) msix handler can only access
> one spinlock (its own), so I am surprised how you got the contention,
> Maybe you are not running on latest upstream driver ?

We are running the 4.14 stable release. Which commit changes the game
here? We can consider backporting it unless it is complicated.

Also, if you don't like this patch, we are happy to carry it on our own;
sometimes it isn't worth the time to push it upstream.

>
> what is the workload ?
>

It's a memcached RPC performance test; that is all I can tell.
(Apparently I have almost zero knowledge about memcached.)


> > > In fact, radix_tree_lookup() is perfectly fine with RCU read lock,
> > > we don't have to take a spinlock on this hot path. It is pretty
> > > much
> > > similar to commit 291c566a2891
> > > ("net/mlx4_core: Fix racy CQ (Completion Queue) free"). Slow paths
> > > are still serialized with the spinlock, and with synchronize_irq()
> > > it should be safe to just move the fast path to RCU read lock.
> > >
> > > This patch itself reduces the latency by about 50% with our
> > > workload.
> > >
> > > Cc: Saeed Mahameed <saeedm@mellanox.com>
> > > Cc: Tariq Toukan <tariqt@mellanox.com>
> > > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> > > ---
> > >   drivers/net/ethernet/mellanox/mlx5/core/eq.c | 12 ++++++------
> > >   1 file changed, 6 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > > b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > > index ee04aab65a9f..7092457705a2 100644
> > > --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > > @@ -114,11 +114,11 @@ static struct mlx5_core_cq
> > > *mlx5_eq_cq_get(struct mlx5_eq *eq, u32 cqn)
> > >     struct mlx5_cq_table *table = &eq->cq_table;
> > >     struct mlx5_core_cq *cq = NULL;
> > >
> > > -   spin_lock(&table->lock);
> > > +   rcu_read_lock();
> > >     cq = radix_tree_lookup(&table->tree, cqn);
> > >     if (likely(cq))
> > >             mlx5_cq_hold(cq);
> > > -   spin_unlock(&table->lock);
> > > +   rcu_read_unlock();
> >
> > Thanks for you patch.
> >
> > I think we can improve it further, by taking the if statement out of
> > the
> > critical section.
> >
>
> No, mlx5_cq_hold must stay under RCU read, otherwise cq might get freed
> before the irq gets a change to increment ref count on it.
>

Agreed.


Thanks.


* Re: [Patch net-next] mlx5: use RCU lock in mlx5_eq_cq_get()
  2019-02-06 16:55   ` Saeed Mahameed
  2019-02-06 17:15     ` Cong Wang
@ 2019-02-06 17:17     ` Eric Dumazet
  2019-02-06 17:37       ` Saeed Mahameed
  1 sibling, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2019-02-06 17:17 UTC (permalink / raw)
  To: Saeed Mahameed, netdev, Tariq Toukan, xiyou.wangcong



On 02/06/2019 08:55 AM, Saeed Mahameed wrote:
> On Wed, 2019-02-06 at 12:02 +0000, Tariq Toukan wrote:
>>
>> On 2/6/2019 2:35 AM, Cong Wang wrote:
>>> mlx5_eq_cq_get() is called in IRQ handler, the spinlock inside
>>> gets a lot of contentions when we test some heavy workload
>>> with 60 RX queues and 80 CPU's, and it is clearly shown in the
>>> flame graph.
>>>
> 
> 
> Hi Cong,
> 
> The patch is ok to me, but i really doubt that you can hit a contention
> on latest upstream driver, since we already have spinlock per EQ, which
> means spinlock per core,  each EQ (core) msix handler can only access
> one spinlock (its own), so I am surprised how you got the contention,
> Maybe you are not running on latest upstream driver ?
> 
> what is the workload ? 

Surprisingly (or not), atomic operations, even on _not_ contended cache
lines, can stall the CPU enough for perf tools to notice...

If the atomic operation can be trivially replaced by RCU, then do it by
all means.




* Re: [Patch net-next] mlx5: use RCU lock in mlx5_eq_cq_get()
  2019-02-06 17:15     ` Cong Wang
@ 2019-02-06 17:35       ` Saeed Mahameed
  2019-02-06 17:43         ` Cong Wang
  0 siblings, 1 reply; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-06 17:35 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev, Tariq Toukan

On Wed, 2019-02-06 at 09:15 -0800, Cong Wang wrote:
> On Wed, Feb 6, 2019 at 8:55 AM Saeed Mahameed <saeedm@mellanox.com>
> wrote:
> > Hi Cong,
> > 
> > The patch is ok to me, but i really doubt that you can hit a
> > contention
> > on latest upstream driver, since we already have spinlock per EQ,
> > which
> > means spinlock per core,  each EQ (core) msix handler can only
> > access
> > one spinlock (its own), so I am surprised how you got the
> > contention,
> > Maybe you are not running on latest upstream driver ?
> 
> We are running 4.14 stable release. Which commit changes the game
> here? We can consider to backport it unless it is complicated.
> 

OK, so there is no issue upstream; you are just missing the following
patch:

commit 02d92f7903647119e125b24f5470f96cee0d4b4b
Author: Saeed Mahameed <saeedm@mellanox.com>
Date:   Fri Jan 19 16:13:01 2018 -0800

    net/mlx5: CQ Database per EQ
    
    Before this patch the driver had one CQ database protected via one
    spinlock, this spinlock is meant to synchronize between CQ
    adding/removing and CQ IRQ interrupt handling.
[...]
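
For readers without that commit handy: roughly, it moves from one
device-global CQ table protected by a single lock to a table per EQ,
along these lines (simplified sketch with illustrative struct names, not
the literal diff):

#include <linux/radix-tree.h>
#include <linux/spinlock.h>

/* Before 02d92f790364 (sketch): one CQ table for the whole device,
 * shared by every EQ's MSI-X handler -- a single point of contention.
 */
struct mlx5_cq_table {
	spinlock_t		lock;	/* protects radix tree updates */
	struct radix_tree_root	tree;
};

struct old_device_priv {		/* illustrative name */
	struct mlx5_cq_table	cq_table;	/* one table, one lock */
	/* ... */
};

/* After (sketch): each EQ owns its own CQ table, so each core's MSI-X
 * handler only ever takes its own lock.
 */
struct new_mlx5_eq {			/* illustrative name */
	struct mlx5_cq_table	cq_table;
	unsigned int		irqn;
	/* ... */
};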

> Also, if you don't like this patch, we are happy to carry it for our
> own,
> sometimes it isn't worth the time to push into upstream.

I do like it, and it is always worth pushing upstream; we all get to
learn cool new stuff.

> 
> > what is the workload ?
> > 
> 
> It's a memcached RPC performance test, that is all I can tell.

Cool, thanks. So the missing commit should fix your issue.

> (Apparently I have almost zero knowledge about memcached.)
> 
> 
> > > > In fact, radix_tree_lookup() is perfectly fine with RCU read
> > > > lock,
> > > > we don't have to take a spinlock on this hot path. It is pretty
> > > > much
> > > > similar to commit 291c566a2891
> > > > ("net/mlx4_core: Fix racy CQ (Completion Queue) free"). Slow
> > > > paths
> > > > are still serialized with the spinlock, and with
> > > > synchronize_irq()
> > > > it should be safe to just move the fast path to RCU read lock.
> > > > 
> > > > This patch itself reduces the latency by about 50% with our
> > > > workload.
> > > > 
> > > > Cc: Saeed Mahameed <saeedm@mellanox.com>
> > > > Cc: Tariq Toukan <tariqt@mellanox.com>
> > > > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> > > > ---
> > > >   drivers/net/ethernet/mellanox/mlx5/core/eq.c | 12 ++++++-----
> > > > -
> > > >   1 file changed, 6 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > > > b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > > > index ee04aab65a9f..7092457705a2 100644
> > > > --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > > > @@ -114,11 +114,11 @@ static struct mlx5_core_cq
> > > > *mlx5_eq_cq_get(struct mlx5_eq *eq, u32 cqn)
> > > >     struct mlx5_cq_table *table = &eq->cq_table;
> > > >     struct mlx5_core_cq *cq = NULL;
> > > > 
> > > > -   spin_lock(&table->lock);
> > > > +   rcu_read_lock();
> > > >     cq = radix_tree_lookup(&table->tree, cqn);
> > > >     if (likely(cq))
> > > >             mlx5_cq_hold(cq);
> > > > -   spin_unlock(&table->lock);
> > > > +   rcu_read_unlock();
> > > 
> > > Thanks for you patch.
> > > 
> > > I think we can improve it further, by taking the if statement out
> > > of
> > > the
> > > critical section.
> > > 
> > 
> > No, mlx5_cq_hold must stay under RCU read, otherwise cq might get
> > freed
> > before the irq gets a change to increment ref count on it.
> > 
> 
> Agreed.
> 

Cool, I will ack the patch.

> 
> Thanks.


* Re: [Patch net-next] mlx5: use RCU lock in mlx5_eq_cq_get()
  2019-02-06  0:35 [Patch net-next] mlx5: use RCU lock in mlx5_eq_cq_get() Cong Wang
  2019-02-06 12:02 ` Tariq Toukan
@ 2019-02-06 17:35 ` Saeed Mahameed
  2019-02-06 17:40   ` Saeed Mahameed
  1 sibling, 1 reply; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-06 17:35 UTC (permalink / raw)
  To: netdev, xiyou.wangcong; +Cc: Tariq Toukan

On Tue, 2019-02-05 at 16:35 -0800, Cong Wang wrote:
> mlx5_eq_cq_get() is called in IRQ handler, the spinlock inside
> gets a lot of contentions when we test some heavy workload
> with 60 RX queues and 80 CPU's, and it is clearly shown in the
> flame graph.
> 
> In fact, radix_tree_lookup() is perfectly fine with RCU read lock,
> we don't have to take a spinlock on this hot path. It is pretty much
> similar to commit 291c566a2891
> ("net/mlx4_core: Fix racy CQ (Completion Queue) free"). Slow paths
> are still serialized with the spinlock, and with synchronize_irq()
> it should be safe to just move the fast path to RCU read lock.
> 
> This patch itself reduces the latency by about 50% with our workload.
> 
> Cc: Saeed Mahameed <saeedm@mellanox.com>
> Cc: Tariq Toukan <tariqt@mellanox.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Acked-by: Saeed Mahameed <saeedm@mellanox.com>



* Re: [Patch net-next] mlx5: use RCU lock in mlx5_eq_cq_get()
  2019-02-06 17:17     ` Eric Dumazet
@ 2019-02-06 17:37       ` Saeed Mahameed
  0 siblings, 0 replies; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-06 17:37 UTC (permalink / raw)
  To: eric.dumazet, netdev, Tariq Toukan, xiyou.wangcong

On Wed, 2019-02-06 at 09:17 -0800, Eric Dumazet wrote:
> 
> On 02/06/2019 08:55 AM, Saeed Mahameed wrote:
> > On Wed, 2019-02-06 at 12:02 +0000, Tariq Toukan wrote:
> > > On 2/6/2019 2:35 AM, Cong Wang wrote:
> > > > mlx5_eq_cq_get() is called in IRQ handler, the spinlock inside
> > > > gets a lot of contentions when we test some heavy workload
> > > > with 60 RX queues and 80 CPU's, and it is clearly shown in the
> > > > flame graph.
> > > > 
> > 
> > Hi Cong,
> > 
> > The patch is ok to me, but i really doubt that you can hit a
> > contention
> > on latest upstream driver, since we already have spinlock per EQ,
> > which
> > means spinlock per core,  each EQ (core) msix handler can only
> > access
> > one spinlock (its own), so I am surprised how you got the
> > contention,
> > Maybe you are not running on latest upstream driver ?
> > 
> > what is the workload ? 
> 
> Surprisingly (or not), atomic operations, even on _not_ contended
> cache lines can
> stall the cpu enough for perf tools to notice...
> 
> If the atomic operation can be trivially replaced by RCU, then do it
> by any mean.
> 
> 

Totally agree. Thanks, Eric.


* Re: [Patch net-next] mlx5: use RCU lock in mlx5_eq_cq_get()
  2019-02-06 17:35 ` Saeed Mahameed
@ 2019-02-06 17:40   ` Saeed Mahameed
  2019-02-06 22:51     ` Cong Wang
  0 siblings, 1 reply; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-06 17:40 UTC (permalink / raw)
  To: netdev, xiyou.wangcong; +Cc: Tariq Toukan

On Wed, 2019-02-06 at 09:35 -0800, Saeed Mahameed wrote:
> On Tue, 2019-02-05 at 16:35 -0800, Cong Wang wrote:
> > mlx5_eq_cq_get() is called in IRQ handler, the spinlock inside
> > gets a lot of contentions when we test some heavy workload
> > with 60 RX queues and 80 CPU's, and it is clearly shown in the
> > flame graph.
> > 
> > In fact, radix_tree_lookup() is perfectly fine with RCU read lock,
> > we don't have to take a spinlock on this hot path. It is pretty
> > much
> > similar to commit 291c566a2891
> > ("net/mlx4_core: Fix racy CQ (Completion Queue) free"). Slow paths
> > are still serialized with the spinlock, and with synchronize_irq()
> > it should be safe to just move the fast path to RCU read lock.
> > 
> > This patch itself reduces the latency by about 50% with our
> > workload.
> > 
> > Cc: Saeed Mahameed <saeedm@mellanox.com>
> > Cc: Tariq Toukan <tariqt@mellanox.com>
> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> 
> Acked-by: Saeed Mahameed <saeedm@mellanox.com>
> 

Actually, the commit message needs some rework, since there is no
contention upstream. Cong, can you take care of this and post a v2?

Thanks,
Saeed.


* Re: [Patch net-next] mlx5: use RCU lock in mlx5_eq_cq_get()
  2019-02-06 17:35       ` Saeed Mahameed
@ 2019-02-06 17:43         ` Cong Wang
  0 siblings, 0 replies; 11+ messages in thread
From: Cong Wang @ 2019-02-06 17:43 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: netdev, Tariq Toukan

On Wed, Feb 6, 2019 at 9:35 AM Saeed Mahameed <saeedm@mellanox.com> wrote:
>
> On Wed, 2019-02-06 at 09:15 -0800, Cong Wang wrote:
> > On Wed, Feb 6, 2019 at 8:55 AM Saeed Mahameed <saeedm@mellanox.com>
> > wrote:
> > > Hi Cong,
> > >
> > > The patch is ok to me, but i really doubt that you can hit a
> > > contention
> > > on latest upstream driver, since we already have spinlock per EQ,
> > > which
> > > means spinlock per core,  each EQ (core) msix handler can only
> > > access
> > > one spinlock (its own), so I am surprised how you got the
> > > contention,
> > > Maybe you are not running on latest upstream driver ?
> >
> > We are running 4.14 stable release. Which commit changes the game
> > here? We can consider to backport it unless it is complicated.
> >
>
> Ok, so there is no issue upstream, you are just missing the following
> patch:
>
> commit 02d92f7903647119e125b24f5470f96cee0d4b4b
> Author: Saeed Mahameed <saeedm@mellanox.com>
> Date:   Fri Jan 19 16:13:01 2018 -0800
>
>     net/mlx5: CQ Database per EQ
>
>     Before this patch the driver had one CQ database protected via one
>     spinlock, this spinlock is meant to synchronize between CQ
>     adding/removing and CQ IRQ interrupt handling.


Thanks for pointing it out! I will evaluate whether we should backport it
to 4.14.


* Re: [Patch net-next] mlx5: use RCU lock in mlx5_eq_cq_get()
  2019-02-06 17:40   ` Saeed Mahameed
@ 2019-02-06 22:51     ` Cong Wang
  0 siblings, 0 replies; 11+ messages in thread
From: Cong Wang @ 2019-02-06 22:51 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: netdev, Tariq Toukan

On Wed, Feb 6, 2019 at 9:40 AM Saeed Mahameed <saeedm@mellanox.com> wrote:
>
> On Wed, 2019-02-06 at 09:35 -0800, Saeed Mahameed wrote:
> > On Tue, 2019-02-05 at 16:35 -0800, Cong Wang wrote:
> > > mlx5_eq_cq_get() is called in IRQ handler, the spinlock inside
> > > gets a lot of contentions when we test some heavy workload
> > > with 60 RX queues and 80 CPU's, and it is clearly shown in the
> > > flame graph.
> > >
> > > In fact, radix_tree_lookup() is perfectly fine with RCU read lock,
> > > we don't have to take a spinlock on this hot path. It is pretty
> > > much
> > > similar to commit 291c566a2891
> > > ("net/mlx4_core: Fix racy CQ (Completion Queue) free"). Slow paths
> > > are still serialized with the spinlock, and with synchronize_irq()
> > > it should be safe to just move the fast path to RCU read lock.
> > >
> > > This patch itself reduces the latency by about 50% with our
> > > workload.
> > >
> > > Cc: Saeed Mahameed <saeedm@mellanox.com>
> > > Cc: Tariq Toukan <tariqt@mellanox.com>
> > > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> >
> > Acked-by: Saeed Mahameed <saeedm@mellanox.com>
> >
>
> Actually, the commit message needs some rework, since there is no
> contention upstream, Cong can you take care of this and post a V2 ?

I can't verify whether upstream has the contention or not, but yeah, I
can at least mention commit 02d92f7903647119e125b24f547 in the
changelog.

Thanks.


