netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [net PATCH] xdp: obtain the mem_id mutex before trying to remove an entry.
@ 2019-12-03 22:01 Jonathan Lemon
  2019-12-04  8:32 ` Jesper Dangaard Brouer
  2019-12-05  0:36 ` David Miller
  0 siblings, 2 replies; 7+ messages in thread
From: Jonathan Lemon @ 2019-12-03 22:01 UTC (permalink / raw)
  To: netdev, davem; +Cc: kernel-team, brouer, grygorii.strashko

A lockdep splat was observed when trying to remove an xdp memory
model from the table since the mutex was obtained when trying to
remove the entry, but not before the table walk started:

Fix the splat by obtaining the lock before starting the table walk.

Fixes: c3f812cea0d7 ("page_pool: do not release pool until inflight == 0.")
Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
---
 net/core/xdp.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/core/xdp.c b/net/core/xdp.c
index e334fad0a6b8..7c8390ad4dc6 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -80,12 +80,8 @@ static void mem_xa_remove(struct xdp_mem_allocator *xa)
 {
 	trace_mem_disconnect(xa);
 
-	mutex_lock(&mem_id_lock);
-
 	if (!rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params))
 		call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
-
-	mutex_unlock(&mem_id_lock);
 }
 
 static void mem_allocator_disconnect(void *allocator)
@@ -93,6 +89,8 @@ static void mem_allocator_disconnect(void *allocator)
 	struct xdp_mem_allocator *xa;
 	struct rhashtable_iter iter;
 
+	mutex_lock(&mem_id_lock);
+
 	rhashtable_walk_enter(mem_id_ht, &iter);
 	do {
 		rhashtable_walk_start(&iter);
@@ -106,6 +104,8 @@ static void mem_allocator_disconnect(void *allocator)
 
 	} while (xa == ERR_PTR(-EAGAIN));
 	rhashtable_walk_exit(&iter);
+
+	mutex_unlock(&mem_id_lock);
 }
 
 static void mem_id_disconnect(int id)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [net PATCH] xdp: obtain the mem_id mutex before trying to remove an entry.
  2019-12-03 22:01 [net PATCH] xdp: obtain the mem_id mutex before trying to remove an entry Jonathan Lemon
@ 2019-12-04  8:32 ` Jesper Dangaard Brouer
  2019-12-04 10:07   ` Grygorii Strashko
  2019-12-05  0:36 ` David Miller
  1 sibling, 1 reply; 7+ messages in thread
From: Jesper Dangaard Brouer @ 2019-12-04  8:32 UTC (permalink / raw)
  To: Jonathan Lemon
  Cc: netdev, davem, kernel-team, grygorii.strashko, brouer, Ilias Apalodimas

On Tue, 3 Dec 2019 14:01:14 -0800
Jonathan Lemon <jonathan.lemon@gmail.com> wrote:

> A lockdep splat was observed when trying to remove an xdp memory
> model from the table since the mutex was obtained when trying to
> remove the entry, but not before the table walk started:
> 
> Fix the splat by obtaining the lock before starting the table walk.
> 
> Fixes: c3f812cea0d7 ("page_pool: do not release pool until inflight == 0.")
> Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>

Have you tested if this patch fix the problem reported by Grygorii?

Link: https://lore.kernel.org/netdev/c2de8927-7bca-612f-cdfd-e9112fee412a@ti.com

Grygorii can you test this?

> ---
>  net/core/xdp.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index e334fad0a6b8..7c8390ad4dc6 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -80,12 +80,8 @@ static void mem_xa_remove(struct xdp_mem_allocator *xa)
>  {
>  	trace_mem_disconnect(xa);
>  
> -	mutex_lock(&mem_id_lock);
> -
>  	if (!rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params))
>  		call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
> -
> -	mutex_unlock(&mem_id_lock);
>  }
>  
>  static void mem_allocator_disconnect(void *allocator)
> @@ -93,6 +89,8 @@ static void mem_allocator_disconnect(void *allocator)
>  	struct xdp_mem_allocator *xa;
>  	struct rhashtable_iter iter;
>  
> +	mutex_lock(&mem_id_lock);
> +
>  	rhashtable_walk_enter(mem_id_ht, &iter);
>  	do {
>  		rhashtable_walk_start(&iter);
> @@ -106,6 +104,8 @@ static void mem_allocator_disconnect(void *allocator)
>  
>  	} while (xa == ERR_PTR(-EAGAIN));
>  	rhashtable_walk_exit(&iter);
> +
> +	mutex_unlock(&mem_id_lock);
>  }
>  
>  static void mem_id_disconnect(int id)

Moving the mutex-lock might be a good idea, but I'm not sure that fixes
the issue.  I'm more suspect about the usage of rcu_read_lock() in
xdp_rxq_info_unreg_mem_model(), listed below.

void xdp_rxq_info_unreg_mem_model(struct xdp_rxq_info *xdp_rxq)
{
	struct xdp_mem_allocator *xa;
	int id = xdp_rxq->mem.id;

	if (xdp_rxq->reg_state != REG_STATE_REGISTERED) {
		WARN(1, "Missing register, driver bug");
		return;
	}

	if (id == 0)
		return;

	if (xdp_rxq->mem.type == MEM_TYPE_ZERO_COPY)
		return mem_id_disconnect(id);

	if (xdp_rxq->mem.type == MEM_TYPE_PAGE_POOL) {
		rcu_read_lock();
		xa = rhashtable_lookup(mem_id_ht, &id, mem_id_rht_params);
		page_pool_destroy(xa->page_pool);
		rcu_read_unlock();
	}
}
EXPORT_SYMBOL_GPL(xdp_rxq_info_unreg_mem_model);


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [net PATCH] xdp: obtain the mem_id mutex before trying to remove an entry.
  2019-12-04  8:32 ` Jesper Dangaard Brouer
@ 2019-12-04 10:07   ` Grygorii Strashko
  2019-12-04 11:38     ` Jesper Dangaard Brouer
  2019-12-04 20:28     ` Ilias Apalodimas
  0 siblings, 2 replies; 7+ messages in thread
From: Grygorii Strashko @ 2019-12-04 10:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Jonathan Lemon
  Cc: netdev, davem, kernel-team, Ilias Apalodimas



On 04/12/2019 10:32, Jesper Dangaard Brouer wrote:
> On Tue, 3 Dec 2019 14:01:14 -0800
> Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
> 
>> A lockdep splat was observed when trying to remove an xdp memory
>> model from the table since the mutex was obtained when trying to
>> remove the entry, but not before the table walk started:
>>
>> Fix the splat by obtaining the lock before starting the table walk.
>>
>> Fixes: c3f812cea0d7 ("page_pool: do not release pool until inflight == 0.")
>> Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
>> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
> 
> Have you tested if this patch fix the problem reported by Grygorii?
> 
> Link: https://lore.kernel.org/netdev/c2de8927-7bca-612f-cdfd-e9112fee412a@ti.com
> 
> Grygorii can you test this?

Thanks.
I do not see this trace any more and networking is working after if down/up

Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>


-- 
Best regards,
grygorii

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [net PATCH] xdp: obtain the mem_id mutex before trying to remove an entry.
  2019-12-04 10:07   ` Grygorii Strashko
@ 2019-12-04 11:38     ` Jesper Dangaard Brouer
  2019-12-04 17:30       ` Jonathan Lemon
  2019-12-04 20:28     ` Ilias Apalodimas
  1 sibling, 1 reply; 7+ messages in thread
From: Jesper Dangaard Brouer @ 2019-12-04 11:38 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Jonathan Lemon, netdev, davem, kernel-team, Ilias Apalodimas, brouer

On Wed, 4 Dec 2019 12:07:22 +0200
Grygorii Strashko <grygorii.strashko@ti.com> wrote:

> On 04/12/2019 10:32, Jesper Dangaard Brouer wrote:
> > On Tue, 3 Dec 2019 14:01:14 -0800
> > Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
> >   
> >> A lockdep splat was observed when trying to remove an xdp memory
> >> model from the table since the mutex was obtained when trying to
> >> remove the entry, but not before the table walk started:
> >>
> >> Fix the splat by obtaining the lock before starting the table walk.
> >>
> >> Fixes: c3f812cea0d7 ("page_pool: do not release pool until inflight == 0.")
> >> Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
> >> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>  
> > 
> > Have you tested if this patch fix the problem reported by Grygorii?
> > 
> > Link: https://lore.kernel.org/netdev/c2de8927-7bca-612f-cdfd-e9112fee412a@ti.com
> > 
> > Grygorii can you test this?  
> 
> Thanks.
> I do not see this trace any more and networking is working after if down/up
> 
> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
> 

Well if it fixes you issue, then I guess its okay.

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>

I just though it was related to the rcu_read_lock() around the
page_pool_destroy() call. Guess, I was wrong.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [net PATCH] xdp: obtain the mem_id mutex before trying to remove an entry.
  2019-12-04 11:38     ` Jesper Dangaard Brouer
@ 2019-12-04 17:30       ` Jonathan Lemon
  0 siblings, 0 replies; 7+ messages in thread
From: Jonathan Lemon @ 2019-12-04 17:30 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Grygorii Strashko, netdev, davem, kernel-team, Ilias Apalodimas

On 4 Dec 2019, at 3:38, Jesper Dangaard Brouer wrote:

> On Wed, 4 Dec 2019 12:07:22 +0200
> Grygorii Strashko <grygorii.strashko@ti.com> wrote:
>
>> On 04/12/2019 10:32, Jesper Dangaard Brouer wrote:
>>> On Tue, 3 Dec 2019 14:01:14 -0800
>>> Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
>>>
>>>> A lockdep splat was observed when trying to remove an xdp memory
>>>> model from the table since the mutex was obtained when trying to
>>>> remove the entry, but not before the table walk started:
>>>>
>>>> Fix the splat by obtaining the lock before starting the table walk.
>>>>
>>>> Fixes: c3f812cea0d7 ("page_pool: do not release pool until inflight 
>>>> == 0.")
>>>> Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
>>>> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
>>>
>>> Have you tested if this patch fix the problem reported by Grygorii?

Yes, I reproduced the problem locally, and confirmed that the patch
resolves the issue.
--
Jonathan


>>>
>>> Link: 
>>> https://lore.kernel.org/netdev/c2de8927-7bca-612f-cdfd-e9112fee412a@ti.com
>>>
>>> Grygorii can you test this?
>>
>> Thanks.
>> I do not see this trace any more and networking is working after if 
>> down/up
>>
>> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
>>
>
> Well if it fixes you issue, then I guess its okay.
>
> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
>
> I just though it was related to the rcu_read_lock() around the
> page_pool_destroy() call. Guess, I was wrong.
>
> -- 
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [net PATCH] xdp: obtain the mem_id mutex before trying to remove an entry.
  2019-12-04 10:07   ` Grygorii Strashko
  2019-12-04 11:38     ` Jesper Dangaard Brouer
@ 2019-12-04 20:28     ` Ilias Apalodimas
  1 sibling, 0 replies; 7+ messages in thread
From: Ilias Apalodimas @ 2019-12-04 20:28 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Jesper Dangaard Brouer, Jonathan Lemon, netdev, davem, kernel-team

On Wed, Dec 04, 2019 at 12:07:22PM +0200, Grygorii Strashko wrote:
> 
> 
> On 04/12/2019 10:32, Jesper Dangaard Brouer wrote:
> > On Tue, 3 Dec 2019 14:01:14 -0800
> > Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
> > 
> > > A lockdep splat was observed when trying to remove an xdp memory
> > > model from the table since the mutex was obtained when trying to
> > > remove the entry, but not before the table walk started:
> > > 
> > > Fix the splat by obtaining the lock before starting the table walk.
> > > 
> > > Fixes: c3f812cea0d7 ("page_pool: do not release pool until inflight == 0.")
> > > Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
> > > Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
> > 
> > Have you tested if this patch fix the problem reported by Grygorii?
> > 
> > Link: https://lore.kernel.org/netdev/c2de8927-7bca-612f-cdfd-e9112fee412a@ti.com
> > 
> > Grygorii can you test this?
> 
> Thanks.
> I do not see this trace any more and networking is working after if down/up
> 
> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
> 
> 
> -- 
> Best regards,
> grygorii

Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [net PATCH] xdp: obtain the mem_id mutex before trying to remove an entry.
  2019-12-03 22:01 [net PATCH] xdp: obtain the mem_id mutex before trying to remove an entry Jonathan Lemon
  2019-12-04  8:32 ` Jesper Dangaard Brouer
@ 2019-12-05  0:36 ` David Miller
  1 sibling, 0 replies; 7+ messages in thread
From: David Miller @ 2019-12-05  0:36 UTC (permalink / raw)
  To: jonathan.lemon; +Cc: netdev, kernel-team, brouer, grygorii.strashko

From: Jonathan Lemon <jonathan.lemon@gmail.com>
Date: Tue, 3 Dec 2019 14:01:14 -0800

> A lockdep splat was observed when trying to remove an xdp memory
> model from the table since the mutex was obtained when trying to
> remove the entry, but not before the table walk started:
> 
> Fix the splat by obtaining the lock before starting the table walk.
> 
> Fixes: c3f812cea0d7 ("page_pool: do not release pool until inflight == 0.")
> Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>

Applied, and since this fixes a bug fix that went back to v5.3 in -stable
I will queue it there as well.

Thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-12-05  0:36 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-03 22:01 [net PATCH] xdp: obtain the mem_id mutex before trying to remove an entry Jonathan Lemon
2019-12-04  8:32 ` Jesper Dangaard Brouer
2019-12-04 10:07   ` Grygorii Strashko
2019-12-04 11:38     ` Jesper Dangaard Brouer
2019-12-04 17:30       ` Jonathan Lemon
2019-12-04 20:28     ` Ilias Apalodimas
2019-12-05  0:36 ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).