* [Patch net-next v2] mlx5: use RCU lock in mlx5_eq_cq_get()
@ 2019-02-06 23:00 Cong Wang
2019-02-06 23:36 ` Eric Dumazet
2019-02-11 22:41 ` Saeed Mahameed
0 siblings, 2 replies; 9+ messages in thread
From: Cong Wang @ 2019-02-06 23:00 UTC (permalink / raw)
To: netdev; +Cc: Cong Wang, Saeed Mahameed, Tariq Toukan
mlx5_eq_cq_get() is called from the IRQ handler, and the spinlock inside
becomes heavily contended when we test a heavy workload with 60 RX queues
and 80 CPUs; the contention shows up clearly in the flame graph.

In fact, radix_tree_lookup() is safe under the RCU read lock, so we don't
have to take a spinlock on this hot path. This is very similar to commit
291c566a2891 ("net/mlx4_core: Fix racy CQ (Completion Queue) free"). Slow
paths are still serialized with the spinlock, and with synchronize_irq()
it should be safe to move the fast path to the RCU read lock alone.

This patch reduces the latency by about 50% for our memcached workload on
the 4.14 kernel we tested. Upstream, as Saeed pointed out, this spinlock
was reworked in commit 02d92f790364 ("net/mlx5: CQ Database per EQ"), so
the improvement could be smaller there.
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index ee04aab65a9f..7092457705a2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -114,11 +114,11 @@ static struct mlx5_core_cq *mlx5_eq_cq_get(struct mlx5_eq *eq, u32 cqn)
 	struct mlx5_cq_table *table = &eq->cq_table;
 	struct mlx5_core_cq *cq = NULL;
 
-	spin_lock(&table->lock);
+	rcu_read_lock();
 	cq = radix_tree_lookup(&table->tree, cqn);
 	if (likely(cq))
 		mlx5_cq_hold(cq);
-	spin_unlock(&table->lock);
+	rcu_read_unlock();
 
 	return cq;
 }
@@ -371,9 +371,9 @@ int mlx5_eq_add_cq(struct mlx5_eq *eq, struct mlx5_core_cq *cq)
 	struct mlx5_cq_table *table = &eq->cq_table;
 	int err;
 
-	spin_lock_irq(&table->lock);
+	spin_lock(&table->lock);
 	err = radix_tree_insert(&table->tree, cq->cqn, cq);
-	spin_unlock_irq(&table->lock);
+	spin_unlock(&table->lock);
 
 	return err;
 }
@@ -383,9 +383,9 @@ int mlx5_eq_del_cq(struct mlx5_eq *eq, struct mlx5_core_cq *cq)
 	struct mlx5_cq_table *table = &eq->cq_table;
 	struct mlx5_core_cq *tmp;
 
-	spin_lock_irq(&table->lock);
+	spin_lock(&table->lock);
 	tmp = radix_tree_delete(&table->tree, cq->cqn);
-	spin_unlock_irq(&table->lock);
+	spin_unlock(&table->lock);
 	if (!tmp) {
 		mlx5_core_warn(eq->dev, "cq 0x%x not found in eq 0x%x tree\n", eq->eqn, cq->cqn);
--
2.20.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Patch net-next v2] mlx5: use RCU lock in mlx5_eq_cq_get()
2019-02-06 23:00 [Patch net-next v2] mlx5: use RCU lock in mlx5_eq_cq_get() Cong Wang
@ 2019-02-06 23:36 ` Eric Dumazet
2019-02-06 23:55 ` Eric Dumazet
2019-02-07 0:04 ` Cong Wang
2019-02-11 22:41 ` Saeed Mahameed
1 sibling, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2019-02-06 23:36 UTC (permalink / raw)
To: Cong Wang, netdev; +Cc: Saeed Mahameed, Tariq Toukan
On 02/06/2019 03:00 PM, Cong Wang wrote:
> [...]
> @@ -114,11 +114,11 @@ static struct mlx5_core_cq *mlx5_eq_cq_get(struct mlx5_eq *eq, u32 cqn)
>  	struct mlx5_cq_table *table = &eq->cq_table;
>  	struct mlx5_core_cq *cq = NULL;
>  
> -	spin_lock(&table->lock);
> +	rcu_read_lock();
>  	cq = radix_tree_lookup(&table->tree, cqn);
>  	if (likely(cq))
>  		mlx5_cq_hold(cq);
I suspect that you need a variant that makes sure refcount is not zero.
( Typical RCU rules apply )
	if (cq && !refcount_inc_not_zero(&cq->refcount))
		cq = NULL;
See commit 6fa19f5637a6 ("rds: fix refcount bug in rds_sock_addref")
for a similar issue I fixed recently.
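The inc-not-zero pattern Eric describes can be sketched in userspace with C11 atomics. This is an illustrative model, not the kernel's refcount_t implementation; the function name is made up:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative userspace model of the kernel's refcount_inc_not_zero():
 * take a reference only if the count has not already dropped to zero,
 * i.e. only if the object is not already on its way to being freed. */
static bool ref_inc_not_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	do {
		if (old == 0)
			return false;	/* object dying; the lookup must fail */
	} while (!atomic_compare_exchange_weak(ref, &old, old + 1));

	return true;
}
```

Under typical RCU rules, the lookup can observe an object whose last reference is being dropped, so a plain increment would resurrect a dying object; the compare-exchange loop above refuses to increment once the count has hit zero.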
* Re: [Patch net-next v2] mlx5: use RCU lock in mlx5_eq_cq_get()
2019-02-06 23:36 ` Eric Dumazet
@ 2019-02-06 23:55 ` Eric Dumazet
2019-02-07 0:04 ` Cong Wang
1 sibling, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2019-02-06 23:55 UTC (permalink / raw)
To: Eric Dumazet, Cong Wang, netdev; +Cc: Saeed Mahameed, Tariq Toukan
On 02/06/2019 03:36 PM, Eric Dumazet wrote:
>
> I suspect that you need a variant that makes sure refcount is not zero.
>
> ( Typical RCU rules apply )
>
> 	if (cq && !refcount_inc_not_zero(&cq->refcount))
> 		cq = NULL;
>
>
> See commit 6fa19f5637a6 ("rds: fix refcount bug in rds_sock_addref")
> for a similar issue I fixed recently.
>
By the way, we could also avoid the two atomics on cq->refcount by using
rcu_read_lock()/rcu_read_unlock() in the two callers.
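Eric's suggestion, holding the read-side lock across the whole lookup-and-use sequence so that no per-CQ reference is taken or dropped per event, can be sketched roughly as follows. The read_lock()/read_unlock() pair stands in for rcu_read_lock()/rcu_read_unlock(), which are (nearly) free; all names here are illustrative, not the mlx5 code:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sketch: the caller holds the RCU read-side "lock" across
 * lookup and use, so the handler performs zero per-object atomics. */

struct cq { int events; };

static struct cq *published_cq;		/* models the radix tree entry */

static void read_lock(void)   { }	/* models rcu_read_lock() */
static void read_unlock(void) { }	/* models rcu_read_unlock() */

static int handle_event(void)
{
	int handled = 0;

	read_lock();
	struct cq *cq = published_cq;	/* models radix_tree_lookup() */
	if (cq) {
		cq->events++;		/* used directly: no hold/put pair */
		handled = 1;
	}
	read_unlock();

	return handled;
}
```

The design point is that the hold/put pair in the callee becomes unnecessary once the callers guarantee the object cannot be freed while the read-side critical section is open.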
* Re: [Patch net-next v2] mlx5: use RCU lock in mlx5_eq_cq_get()
2019-02-06 23:36 ` Eric Dumazet
2019-02-06 23:55 ` Eric Dumazet
@ 2019-02-07 0:04 ` Cong Wang
2019-02-07 0:28 ` Eric Dumazet
1 sibling, 1 reply; 9+ messages in thread
From: Cong Wang @ 2019-02-07 0:04 UTC (permalink / raw)
To: Eric Dumazet
Cc: Linux Kernel Network Developers, Saeed Mahameed, Tariq Toukan
On Wed, Feb 6, 2019 at 3:36 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 02/06/2019 03:00 PM, Cong Wang wrote:
> > [...]
> > -	spin_lock(&table->lock);
> > +	rcu_read_lock();
> >  	cq = radix_tree_lookup(&table->tree, cqn);
> >  	if (likely(cq))
> >  		mlx5_cq_hold(cq);
>
> I suspect that you need a variant that makes sure refcount is not zero.
>
> ( Typical RCU rules apply )
>
> 	if (cq && !refcount_inc_not_zero(&cq->refcount))
> 		cq = NULL;
>
>
> See commit 6fa19f5637a6 ("rds: fix refcount bug in rds_sock_addref")
> for a similar issue I fixed recently.
synchronize_irq() is called before mlx5_cq_put(), so I don't
see why readers could get 0 refcnt.
For the rds case you mentioned, it doesn't wait for readers; that is
why it needs to check against 0, and why it is different from this one.
Thanks.
* Re: [Patch net-next v2] mlx5: use RCU lock in mlx5_eq_cq_get()
2019-02-07 0:04 ` Cong Wang
@ 2019-02-07 0:28 ` Eric Dumazet
2019-02-07 0:51 ` Saeed Mahameed
2019-02-07 0:53 ` Cong Wang
0 siblings, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2019-02-07 0:28 UTC (permalink / raw)
To: Cong Wang, Eric Dumazet
Cc: Linux Kernel Network Developers, Saeed Mahameed, Tariq Toukan
On 02/06/2019 04:04 PM, Cong Wang wrote:
> synchronize_irq() is called before mlx5_cq_put(), so I don't
> see why readers could get 0 refcnt.
Then that is all the more reason to get rid of the refcount
increment/decrement completely...

Technically, even rcu_read_lock() and rcu_read_unlock() are not needed,
since synchronize_irq() is enough.
>
> For the rds you mentioned, it doesn't wait for readers, this
> is why it needs to check against 0 and why it is different from
> this one.
>
> Thanks.
>
* Re: [Patch net-next v2] mlx5: use RCU lock in mlx5_eq_cq_get()
2019-02-07 0:28 ` Eric Dumazet
@ 2019-02-07 0:51 ` Saeed Mahameed
2019-02-07 0:53 ` Cong Wang
1 sibling, 0 replies; 9+ messages in thread
From: Saeed Mahameed @ 2019-02-07 0:51 UTC (permalink / raw)
To: Eric Dumazet
Cc: Cong Wang, Linux Kernel Network Developers, Saeed Mahameed, Tariq Toukan
On Wed, Feb 6, 2019 at 4:28 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 02/06/2019 04:04 PM, Cong Wang wrote:
>
> > synchronize_irq() is called before mlx5_cq_put(), so I don't
> > see why readers could get 0 refcnt.
>
> Then that is all the more reason to get rid of the refcount
> increment/decrement completely...
>
> Technically, even rcu_read_lock() and rcu_read_unlock() are not needed,
> since synchronize_irq() is enough.
>
I already suggested this; quoting myself from my first reply to v0 of this patch:

"another way to do it is not to do any refcounting in the irq handler
and fence cq removal via synchronize_irq(eq->irqn) on mlx5_eq_del_cq."

I already have a patch; I was just waiting for Cong to push v2.
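The refcount-free scheme described above can be modeled with a toy single-threaded sketch: the "IRQ handler" loads a published pointer with no lock and no refcount, and deletion unpublishes the pointer before quiescing and freeing. Here quiesce() stands in for synchronize_irq(eq->irqn) and is a no-op because this model has no concurrent handler; all names are illustrative, not the mlx5 code:

```c
#include <assert.h>
#include <stddef.h>

struct cq { int cqn; };

#define TABLE_SIZE 8
static struct cq *cq_table[TABLE_SIZE];	/* models the per-EQ CQ table */

static struct cq *cq_lookup(int cqn)	/* fast path: no lock, no refcount */
{
	return cq_table[cqn % TABLE_SIZE];
}

static void cq_add(struct cq *cq)
{
	cq_table[cq->cqn % TABLE_SIZE] = cq;
}

static void quiesce(void)		/* models synchronize_irq(eq->irqn) */
{
}

static struct cq *cq_del(int cqn)
{
	struct cq *cq = cq_table[cqn % TABLE_SIZE];

	cq_table[cqn % TABLE_SIZE] = NULL;	/* unpublish first ... */
	quiesce();				/* ... then wait out any handler */
	return cq;				/* now safe for the caller to free */
}
```

The ordering is the whole point: once the pointer is unpublished and the IRQ handler has been waited out, no handler can still hold the stale pointer, so no per-event reference counting is needed.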
* Re: [Patch net-next v2] mlx5: use RCU lock in mlx5_eq_cq_get()
2019-02-07 0:28 ` Eric Dumazet
2019-02-07 0:51 ` Saeed Mahameed
@ 2019-02-07 0:53 ` Cong Wang
2019-02-07 0:56 ` Saeed Mahameed
1 sibling, 1 reply; 9+ messages in thread
From: Cong Wang @ 2019-02-07 0:53 UTC (permalink / raw)
To: Eric Dumazet
Cc: Linux Kernel Network Developers, Saeed Mahameed, Tariq Toukan
On Wed, Feb 6, 2019 at 4:28 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 02/06/2019 04:04 PM, Cong Wang wrote:
>
> > synchronize_irq() is called before mlx5_cq_put(), so I don't
> > see why readers could get 0 refcnt.
>
> Then the more reasons to get rid of the refcount increment/decrement completely ...
>
> Technically, even the rcu_read_lock() and rcu_read_unlock() are not needed,
> since synchronize_irq() is enough.
Excellent point.
For the refcnt, I am afraid we still have to hold refcnt for the tasklet,
mlx5_cq_tasklet_cb. But yeah, should be safe to remove from IRQ
path.
* Re: [Patch net-next v2] mlx5: use RCU lock in mlx5_eq_cq_get()
2019-02-07 0:53 ` Cong Wang
@ 2019-02-07 0:56 ` Saeed Mahameed
0 siblings, 0 replies; 9+ messages in thread
From: Saeed Mahameed @ 2019-02-07 0:56 UTC (permalink / raw)
To: Cong Wang
Cc: Eric Dumazet, Linux Kernel Network Developers, Saeed Mahameed,
Tariq Toukan
On Wed, Feb 6, 2019 at 4:53 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Wed, Feb 6, 2019 at 4:28 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> >
> >
> > On 02/06/2019 04:04 PM, Cong Wang wrote:
> >
> > > synchronize_irq() is called before mlx5_cq_put(), so I don't
> > > see why readers could get 0 refcnt.
> >
> > Then that is all the more reason to get rid of the refcount
> > increment/decrement completely...
> >
> > Technically, even rcu_read_lock() and rcu_read_unlock() are not needed,
> > since synchronize_irq() is enough.
>
> Excellent point.
>
> For the refcnt, I am afraid we still have to hold refcnt for the tasklet,
> mlx5_cq_tasklet_cb. But yeah, should be safe to remove from IRQ
> path.
The tasklet path is for RDMA CQs only; netdev CQ handling will be
refcount-free.
* Re: [Patch net-next v2] mlx5: use RCU lock in mlx5_eq_cq_get()
2019-02-06 23:00 [Patch net-next v2] mlx5: use RCU lock in mlx5_eq_cq_get() Cong Wang
2019-02-06 23:36 ` Eric Dumazet
@ 2019-02-11 22:41 ` Saeed Mahameed
1 sibling, 0 replies; 9+ messages in thread
From: Saeed Mahameed @ 2019-02-11 22:41 UTC (permalink / raw)
To: netdev, xiyou.wangcong; +Cc: Tariq Toukan
On Wed, 2019-02-06 at 15:00 -0800, Cong Wang wrote:
> [...]
Applied to mlx5-next
Will be sent to net-next in my next pull request
Thanks!
Thread overview: 9+ messages
2019-02-06 23:00 [Patch net-next v2] mlx5: use RCU lock in mlx5_eq_cq_get() Cong Wang
2019-02-06 23:36 ` Eric Dumazet
2019-02-06 23:55 ` Eric Dumazet
2019-02-07 0:04 ` Cong Wang
2019-02-07 0:28 ` Eric Dumazet
2019-02-07 0:51 ` Saeed Mahameed
2019-02-07 0:53 ` Cong Wang
2019-02-07 0:56 ` Saeed Mahameed
2019-02-11 22:41 ` Saeed Mahameed