netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] Possible unsafe page_pool usage in octeontx2
@ 2023-08-23  9:47 Sebastian Andrzej Siewior
  2023-08-23 11:36 ` Ilias Apalodimas
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-08-23  9:47 UTC (permalink / raw)
  To: netdev, Ratheesh Kannoth
  Cc: David S. Miller, Eric Dumazet, Geetha sowjanya, Ilias Apalodimas,
	Jakub Kicinski, Jesper Dangaard Brouer, Paolo Abeni,
	Subbaraya Sundeep, Sunil Goutham, Thomas Gleixner, hariprasad

Hi,

I've been looking at the page_pool locking.

page_pool_alloc_frag() -> page_pool_alloc_pages() ->
__page_pool_get_cached():

There core of the allocation is:
|         /* Caller MUST guarantee safe non-concurrent access, e.g. softirq */
|         if (likely(pool->alloc.count)) {
|                 /* Fast-path */
|                 page = pool->alloc.cache[--pool->alloc.count];

The access to the `cache' array and the `count' variable is not locked.
This is fine as long as there only one consumer per pool. In my
understanding the intention is to have one page_pool per NAPI callback
to ensure this.

The pool can be filled in the same context (within allocation if the
pool is empty). There is also page_pool_recycle_in_cache() which fills
the pool from within skb free, for instance:
 napi_consume_skb() -> skb_release_all() -> skb_release_data() ->
 napi_frag_unref() -> page_pool_return_skb_page().

The last one has the following check here:
|         napi = READ_ONCE(pp->p.napi);
|         allow_direct = napi_safe && napi &&
|                 READ_ONCE(napi->list_owner) == smp_processor_id();

This eventually ends in page_pool_recycle_in_cache() where it adds the
page to the cache buffer if the check above is true (and BH is disabled). 

napi->list_owner is set once NAPI is scheduled until the poll callback
completed. It is safe to add items to list because only one of the two
can run on a single CPU and the completion of them ensured by having BH
disabled the whole time.

This breaks in octeontx2 where a worker is used to fill the buffer:
  otx2_pool_refill_task() -> otx2_alloc_rbuf() -> __otx2_alloc_rbuf() ->
  otx2_alloc_pool_buf() -> page_pool_alloc_frag().

BH is disabled but the add of a page can still happen while NAPI
callback runs on a remote CPU and so corrupting the index/ array.

API wise I would suggest to

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 7ff80b80a6f9f..b50e219470a36 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -612,7 +612,7 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
 			page_pool_dma_sync_for_device(pool, page,
 						      dma_sync_size);
 
-		if (allow_direct && in_softirq() &&
+		if (allow_direct && in_serving_softirq() &&
 		    page_pool_recycle_in_cache(page, pool))
 			return NULL;
 
because the intention (as I understand it) is to be invoked from within
the NAPI callback (while softirq is served) and not if BH is just
disabled due to a lock or so.

It would also make sense to a add WARN_ON_ONCE(!in_serving_softirq()) to
page_pool_alloc_pages() to spot usage outside of softirq. But this will
trigger in every driver since the same function is used in the open
callback to initially setup the HW.

Sebastian

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-23  9:47 [BUG] Possible unsafe page_pool usage in octeontx2 Sebastian Andrzej Siewior
@ 2023-08-23 11:36 ` Ilias Apalodimas
  2023-08-23 13:31   ` Sebastian Andrzej Siewior
  2023-08-23 12:28 ` [EXT] " Ratheesh Kannoth
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 22+ messages in thread
From: Ilias Apalodimas @ 2023-08-23 11:36 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: netdev, Ratheesh Kannoth, David S. Miller, Eric Dumazet,
	Geetha sowjanya, Jakub Kicinski, Jesper Dangaard Brouer,
	Paolo Abeni, Subbaraya Sundeep, Sunil Goutham, Thomas Gleixner,
	hariprasad

Hi Sebastian,

Thanks for the report.


On Wed, 23 Aug 2023 at 12:48, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> Hi,
>
> I've been looking at the page_pool locking.

Apologies for any traumas we caused with that code :)


>
> page_pool_alloc_frag() -> page_pool_alloc_pages() ->
> __page_pool_get_cached():
>
> There core of the allocation is:
> |         /* Caller MUST guarantee safe non-concurrent access, e.g. softirq */
> |         if (likely(pool->alloc.count)) {
> |                 /* Fast-path */
> |                 page = pool->alloc.cache[--pool->alloc.count];
>
> The access to the `cache' array and the `count' variable is not locked.
> This is fine as long as there only one consumer per pool. In my
> understanding the intention is to have one page_pool per NAPI callback
> to ensure this.
>
> The pool can be filled in the same context (within allocation if the
> pool is empty). There is also page_pool_recycle_in_cache() which fills
> the pool from within skb free, for instance:
>  napi_consume_skb() -> skb_release_all() -> skb_release_data() ->
>  napi_frag_unref() -> page_pool_return_skb_page().
>
> The last one has the following check here:
> |         napi = READ_ONCE(pp->p.napi);
> |         allow_direct = napi_safe && napi &&
> |                 READ_ONCE(napi->list_owner) == smp_processor_id();
>
> This eventually ends in page_pool_recycle_in_cache() where it adds the
> page to the cache buffer if the check above is true (and BH is disabled).
>
> napi->list_owner is set once NAPI is scheduled until the poll callback
> completed. It is safe to add items to list because only one of the two
> can run on a single CPU and the completion of them ensured by having BH
> disabled the whole time.
>
> This breaks in octeontx2 where a worker is used to fill the buffer:
>   otx2_pool_refill_task() -> otx2_alloc_rbuf() -> __otx2_alloc_rbuf() ->
>   otx2_alloc_pool_buf() -> page_pool_alloc_frag().
>
> BH is disabled but the add of a page can still happen while NAPI
> callback runs on a remote CPU and so corrupting the index/ array.
>
> API wise I would suggest to
>
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 7ff80b80a6f9f..b50e219470a36 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -612,7 +612,7 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
>                         page_pool_dma_sync_for_device(pool, page,
>                                                       dma_sync_size);
>
> -               if (allow_direct && in_softirq() &&
> +               if (allow_direct && in_serving_softirq() &&
>                     page_pool_recycle_in_cache(page, pool))
>                         return NULL;
>

FWIW we used to have that check.
commit 542bcea4be866b ("net: page_pool: use in_softirq() instead")
changed that, so maybe we should revert that overall?

> because the intention (as I understand it) is to be invoked from within
> the NAPI callback (while softirq is served) and not if BH is just
> disabled due to a lock or so.
>
> It would also make sense to a add WARN_ON_ONCE(!in_serving_softirq()) to
> page_pool_alloc_pages() to spot usage outside of softirq. But this will
> trigger in every driver since the same function is used in the open
> callback to initially setup the HW.

What about adding a check in the cached allocation path in order to
skip the initial page allocation?

Thanks
/Ilias
>
> Sebastian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [EXT] [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-23  9:47 [BUG] Possible unsafe page_pool usage in octeontx2 Sebastian Andrzej Siewior
  2023-08-23 11:36 ` Ilias Apalodimas
@ 2023-08-23 12:28 ` Ratheesh Kannoth
  2023-08-23 12:54   ` Sebastian Andrzej Siewior
  2023-08-23 14:49 ` Jakub Kicinski
  2023-08-23 19:45 ` Jesper Dangaard Brouer
  3 siblings, 1 reply; 22+ messages in thread
From: Ratheesh Kannoth @ 2023-08-23 12:28 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, netdev
  Cc: David S. Miller, Eric Dumazet, Geethasowjanya Akula,
	Ilias Apalodimas, Jakub Kicinski, Jesper Dangaard Brouer,
	Paolo Abeni, Subbaraya Sundeep Bhatta, Sunil Kovvuri Goutham,
	Thomas Gleixner, Hariprasad Kelam

> From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Sent: Wednesday, August 23, 2023 3:18 PM
> Subject: [EXT] [BUG] Possible unsafe page_pool usage in octeontx2
> 
> This breaks in octeontx2 where a worker is used to fill the buffer:
>   otx2_pool_refill_task() -> otx2_alloc_rbuf() -> __otx2_alloc_rbuf() ->
>   otx2_alloc_pool_buf() -> page_pool_alloc_frag().
>
Thanks. I agree.  This path is called in octeontx2 driver only if __otx2_alloc_rbuf() function returns error In below path.
otx2_napi_handler() -->  pfvf->hw_ops->refill_pool_ptrs() ---> cn10k_refill_pool_ptrs() --> otx2_alloc_buffer() --> __otx2_alloc_rbuf(). 
As I understand, the problem is due to workqueue may get scheduled on other CPU. If we use BOUND workqueue, do you think this problem can be solved ?
 
> BH is disabled but the add of a page can still happen while NAPI callback runs
> on a remote CPU and so corrupting the index/ array.
> 
> API wise I would suggest to
> 
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c index
> 7ff80b80a6f9f..b50e219470a36 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -612,7 +612,7 @@ __page_pool_put_page(struct page_pool *pool,
> struct page *page,
>  			page_pool_dma_sync_for_device(pool, page,
>  						      dma_sync_size);
> 
> -		if (allow_direct && in_softirq() &&
> +		if (allow_direct && in_serving_softirq() &&
>  		    page_pool_recycle_in_cache(page, pool))
>  			return NULL;
> 
> because the intention (as I understand it) is to be invoked from within the
> NAPI callback (while softirq is served) and not if BH is just disabled due to a
> lock or so.
Could you help me understand where in_softirq() check will break ?  If  we TX a packet (dev_queue_xmit()) in 
Process context on same core,  in_serving_softirq() check will prevent it from recycling ?



 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: RE: [EXT] [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-23 12:28 ` [EXT] " Ratheesh Kannoth
@ 2023-08-23 12:54   ` Sebastian Andrzej Siewior
  2023-08-24  2:49     ` Ratheesh Kannoth
  0 siblings, 1 reply; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-08-23 12:54 UTC (permalink / raw)
  To: Ratheesh Kannoth
  Cc: netdev, David S. Miller, Eric Dumazet, Geethasowjanya Akula,
	Ilias Apalodimas, Jakub Kicinski, Jesper Dangaard Brouer,
	Paolo Abeni, Subbaraya Sundeep Bhatta, Sunil Kovvuri Goutham,
	Thomas Gleixner, Hariprasad Kelam

On 2023-08-23 12:28:58 [+0000], Ratheesh Kannoth wrote:
> > From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > Sent: Wednesday, August 23, 2023 3:18 PM
> > Subject: [EXT] [BUG] Possible unsafe page_pool usage in octeontx2
> > 
> > This breaks in octeontx2 where a worker is used to fill the buffer:
> >   otx2_pool_refill_task() -> otx2_alloc_rbuf() -> __otx2_alloc_rbuf() ->
> >   otx2_alloc_pool_buf() -> page_pool_alloc_frag().
> >
> As I understand, the problem is due to workqueue may get scheduled on
> other CPU. If we use BOUND workqueue, do you think this problem can be
> solved ?

It would but is still open to less obvious races for instance if the
IRQ/ NAPI is assigned to another CPU while the workqueue is scheduled.
You would have to additional synchronisation to ensure that bad can
happen. This does not make it any simpler nor prettier or serves as a
good example.

I would suggest to stay away from the lock-less buffer if not in NAPI
and feed the pool->ring instead.

> > BH is disabled but the add of a page can still happen while NAPI callback runs
> > on a remote CPU and so corrupting the index/ array.
> > 
> > API wise I would suggest to
> > 
> > diff --git a/net/core/page_pool.c b/net/core/page_pool.c index
> > 7ff80b80a6f9f..b50e219470a36 100644
> > --- a/net/core/page_pool.c
> > +++ b/net/core/page_pool.c
> > @@ -612,7 +612,7 @@ __page_pool_put_page(struct page_pool *pool,
> > struct page *page,
> >  			page_pool_dma_sync_for_device(pool, page,
> >  						      dma_sync_size);
> > 
> > -		if (allow_direct && in_softirq() &&
> > +		if (allow_direct && in_serving_softirq() &&
> >  		    page_pool_recycle_in_cache(page, pool))
> >  			return NULL;
> > 
> > because the intention (as I understand it) is to be invoked from within the
> > NAPI callback (while softirq is served) and not if BH is just disabled due to a
> > lock or so.
> Could you help me understand where in_softirq() check will break ?  If
> we TX a packet (dev_queue_xmit()) in 
> Process context on same core,  in_serving_softirq() check will prevent
> it from recycling ?

If a check is added to page_pool_alloc_pages() then it will trigger if
you fill the buffer from your ->ndo_open() callback.
Also, if you invoke dev_queue_xmit() from process context. But It will
be added to &pool->ring instead.

Sebastian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-23 11:36 ` Ilias Apalodimas
@ 2023-08-23 13:31   ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-08-23 13:31 UTC (permalink / raw)
  To: Ilias Apalodimas
  Cc: netdev, Ratheesh Kannoth, David S. Miller, Eric Dumazet,
	Geetha sowjanya, Jakub Kicinski, Jesper Dangaard Brouer,
	Paolo Abeni, Subbaraya Sundeep, Sunil Goutham, Thomas Gleixner,
	hariprasad

On 2023-08-23 14:36:06 [+0300], Ilias Apalodimas wrote:
> Hi Sebastian,
Hi Ilias,

> >                                                       dma_sync_size);
> >
> > -               if (allow_direct && in_softirq() &&
> > +               if (allow_direct && in_serving_softirq() &&
> >                     page_pool_recycle_in_cache(page, pool))
> >                         return NULL;
> >
> 
> FWIW we used to have that check.
> commit 542bcea4be866b ("net: page_pool: use in_softirq() instead")
> changed that, so maybe we should revert that overall?

The other changes look okay, this in particular I am not sure about. It
would end up in the pool->ring instead of the lock-less cache. It still
depends how it got here. If it comes from page_pool_return_skb_page()
then the list_owner check will fail because it is not set for the
threaded-NAPI case. If it was a real concern, the entry point must have
been page_pool_put_full_page() or later. This basically assumes the same
context.

> > because the intention (as I understand it) is to be invoked from within
> > the NAPI callback (while softirq is served) and not if BH is just
> > disabled due to a lock or so.
> >
> > It would also make sense to a add WARN_ON_ONCE(!in_serving_softirq()) to
> > page_pool_alloc_pages() to spot usage outside of softirq. But this will
> > trigger in every driver since the same function is used in the open
> > callback to initially setup the HW.
> 
> What about adding a check in the cached allocation path in order to
> skip the initial page allocation?

Maybe. I'm a bit worried that this lock-less has no lockdep or similar
checks if everyone plays by the rules.

> Thanks
> /Ilias

Sebastian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-23  9:47 [BUG] Possible unsafe page_pool usage in octeontx2 Sebastian Andrzej Siewior
  2023-08-23 11:36 ` Ilias Apalodimas
  2023-08-23 12:28 ` [EXT] " Ratheesh Kannoth
@ 2023-08-23 14:49 ` Jakub Kicinski
  2023-08-23 19:45 ` Jesper Dangaard Brouer
  3 siblings, 0 replies; 22+ messages in thread
From: Jakub Kicinski @ 2023-08-23 14:49 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: netdev, Ratheesh Kannoth, David S. Miller, Eric Dumazet,
	Geetha sowjanya, Ilias Apalodimas, Jesper Dangaard Brouer,
	Paolo Abeni, Subbaraya Sundeep, Sunil Goutham, Thomas Gleixner,
	hariprasad

On Wed, 23 Aug 2023 11:47:57 +0200 Sebastian Andrzej Siewior wrote:
> The pool can be filled in the same context (within allocation if the
> pool is empty). There is also page_pool_recycle_in_cache() which fills
> the pool from within skb free, for instance:
>  napi_consume_skb() -> skb_release_all() -> skb_release_data() ->
>  napi_frag_unref() -> page_pool_return_skb_page().
> 
> The last one has the following check here:
> |         napi = READ_ONCE(pp->p.napi);
> |         allow_direct = napi_safe && napi &&
> |                 READ_ONCE(napi->list_owner) == smp_processor_id();
> 
> This eventually ends in page_pool_recycle_in_cache() where it adds the
> page to the cache buffer if the check above is true (and BH is disabled). 
> 
> napi->list_owner is set once NAPI is scheduled until the poll callback
> completed. It is safe to add items to list because only one of the two
> can run on a single CPU and the completion of them ensured by having BH
> disabled the whole time.

One clarification - .napi will be NULL for otx2, it's an opt-in for
drivers which allocate from NAPI, and AFAICT otx2 does not opt in.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-23  9:47 [BUG] Possible unsafe page_pool usage in octeontx2 Sebastian Andrzej Siewior
                   ` (2 preceding siblings ...)
  2023-08-23 14:49 ` Jakub Kicinski
@ 2023-08-23 19:45 ` Jesper Dangaard Brouer
  2023-08-24  7:21   ` Ilias Apalodimas
                     ` (2 more replies)
  3 siblings, 3 replies; 22+ messages in thread
From: Jesper Dangaard Brouer @ 2023-08-23 19:45 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, netdev, Ratheesh Kannoth
  Cc: David S. Miller, Eric Dumazet, Geetha sowjanya, Ilias Apalodimas,
	Jakub Kicinski, Jesper Dangaard Brouer, Paolo Abeni,
	Subbaraya Sundeep, Sunil Goutham, Thomas Gleixner, hariprasad,
	Alexander Lobakin, Qingfang DENG

(Cc Olek as he have changes in this code path)

On 23/08/2023 11.47, Sebastian Andrzej Siewior wrote:
> Hi,
> 
> I've been looking at the page_pool locking.
> 
> page_pool_alloc_frag() -> page_pool_alloc_pages() ->
> __page_pool_get_cached():
> 
> There core of the allocation is:
> |         /* Caller MUST guarantee safe non-concurrent access, e.g. softirq */
> |         if (likely(pool->alloc.count)) {
> |                 /* Fast-path */
> |                 page = pool->alloc.cache[--pool->alloc.count];
> 
> The access to the `cache' array and the `count' variable is not locked.
> This is fine as long as there only one consumer per pool. In my
> understanding the intention is to have one page_pool per NAPI callback
> to ensure this.
> 

Yes, the intention is a single PP instance is "bound" to one RX-NAPI.


> The pool can be filled in the same context (within allocation if the
> pool is empty). There is also page_pool_recycle_in_cache() which fills
> the pool from within skb free, for instance:
>   napi_consume_skb() -> skb_release_all() -> skb_release_data() ->
>   napi_frag_unref() -> page_pool_return_skb_page().
> 
> The last one has the following check here:
> |         napi = READ_ONCE(pp->p.napi);
> |         allow_direct = napi_safe && napi &&
> |                 READ_ONCE(napi->list_owner) == smp_processor_id();
> 
> This eventually ends in page_pool_recycle_in_cache() where it adds the
> page to the cache buffer if the check above is true (and BH is disabled).
> 
> napi->list_owner is set once NAPI is scheduled until the poll callback
> completed. It is safe to add items to list because only one of the two
> can run on a single CPU and the completion of them ensured by having BH
> disabled the whole time.
> 
> This breaks in octeontx2 where a worker is used to fill the buffer:
>    otx2_pool_refill_task() -> otx2_alloc_rbuf() -> __otx2_alloc_rbuf() ->
>    otx2_alloc_pool_buf() -> page_pool_alloc_frag().
> 

This seems problematic! - this is NOT allowed.

But otx2_pool_refill_task() is a work-queue, and I though it runs in
process-context.  This WQ process is not allowed to use the lockless PP
cache.  This seems to be a bug!

The problematic part is otx2_alloc_rbuf() that disables BH:

  int otx2_alloc_rbuf(struct otx2_nic *pfvf, struct otx2_pool *pool,
		    dma_addr_t *dma)
  {
	int ret;

	local_bh_disable();
	ret = __otx2_alloc_rbuf(pfvf, pool, dma);
	local_bh_enable();
	return ret;
  }

The fix, can be to not do this local_bh_disable() in this driver?

> BH is disabled but the add of a page can still happen while NAPI
> callback runs on a remote CPU and so corrupting the index/ array.
> 
> API wise I would suggest to
> 
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 7ff80b80a6f9f..b50e219470a36 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -612,7 +612,7 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
>   			page_pool_dma_sync_for_device(pool, page,
>   						      dma_sync_size);
>   
> -		if (allow_direct && in_softirq() &&
> +		if (allow_direct && in_serving_softirq() &&

This is the "return/free/put" code path, where we have "allow_direct" as
a protection in the API.  API users are suppose to use
page_pool_recycle_direct() to indicate this, but as some point we
allowed APIs to expose 'allow_direct'.

The PP-alloc side is more fragile, and maybe the in_serving_softirq()
belongs there.

>   		    page_pool_recycle_in_cache(page, pool))
>   			return NULL;
>   
> because the intention (as I understand it) is to be invoked from within
> the NAPI callback (while softirq is served) and not if BH is just
> disabled due to a lock or so.
>

True, and it used-to-be like this (in_serving_softirq), but as Ilias
wrote it was changed recently.  This was to support threaded-NAPI (in
542bcea4be866b ("net: page_pool: use in_softirq() instead")), which
I understood was one of your (Sebastian's) use-cases.


> It would also make sense to a add WARN_ON_ONCE(!in_serving_softirq()) to
> page_pool_alloc_pages() to spot usage outside of softirq. But this will
> trigger in every driver since the same function is used in the open
> callback to initially setup the HW.
> 

I'm very open to ideas of detecting this.  Since mentioned commit PP is
open to these kind of miss-uses of the API.

One idea would be to leverage that NAPI napi->list_owner will have been
set to something else than -1, when this is NAPI context.  Getting hold
of napi object, could be done via pp->p.napi (but as Jakub wrote this is
opt-in ATM).

--Jesper

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: RE: [EXT] [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-23 12:54   ` Sebastian Andrzej Siewior
@ 2023-08-24  2:49     ` Ratheesh Kannoth
  0 siblings, 0 replies; 22+ messages in thread
From: Ratheesh Kannoth @ 2023-08-24  2:49 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: netdev, David S. Miller, Eric Dumazet, Geethasowjanya Akula,
	Ilias Apalodimas, Jakub Kicinski, Jesper Dangaard Brouer,
	Paolo Abeni, Subbaraya Sundeep Bhatta, Sunil Kovvuri Goutham,
	Thomas Gleixner, Hariprasad Kelam

> From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Sent: Wednesday, August 23, 2023 6:25 PM
> To: Ratheesh Kannoth <rkannoth@marvell.com>
> Subject: Re: RE: [EXT] [BUG] Possible unsafe page_pool usage in octeontx2

> I would suggest to stay away from the lock-less buffer if not in NAPI and feed
> the pool->ring instead.
As Jacub explained, allow_direct will be false as pp->p.napi is 0.
So there is no lockless addition. I think, we don’t have to fix the page pool alloc() in workqueue issue. 
 
-Ratheesh

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-23 19:45 ` Jesper Dangaard Brouer
@ 2023-08-24  7:21   ` Ilias Apalodimas
  2023-08-24  7:42     ` Ilias Apalodimas
  2023-08-24 15:26   ` Alexander Lobakin
  2023-08-25 13:16   ` Jesper Dangaard Brouer
  2 siblings, 1 reply; 22+ messages in thread
From: Ilias Apalodimas @ 2023-08-24  7:21 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Sebastian Andrzej Siewior, netdev, Ratheesh Kannoth,
	David S. Miller, Eric Dumazet, Geetha sowjanya, Jakub Kicinski,
	Paolo Abeni, Subbaraya Sundeep, Sunil Goutham, Thomas Gleixner,
	hariprasad, Alexander Lobakin, Qingfang DENG

[...]

> >
> > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > index 7ff80b80a6f9f..b50e219470a36 100644
> > --- a/net/core/page_pool.c
> > +++ b/net/core/page_pool.c
> > @@ -612,7 +612,7 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
> >                       page_pool_dma_sync_for_device(pool, page,
> >                                                     dma_sync_size);
> >
> > -             if (allow_direct && in_softirq() &&
> > +             if (allow_direct && in_serving_softirq() &&
>
> This is the "return/free/put" code path, where we have "allow_direct" as
> a protection in the API.  API users are suppose to use
> page_pool_recycle_direct() to indicate this, but as some point we
> allowed APIs to expose 'allow_direct'.
>
> The PP-alloc side is more fragile, and maybe the in_serving_softirq()
> belongs there.
>
> >                   page_pool_recycle_in_cache(page, pool))
> >                       return NULL;
> >
> > because the intention (as I understand it) is to be invoked from within
> > the NAPI callback (while softirq is served) and not if BH is just
> > disabled due to a lock or so.
> >
>
> True, and it used-to-be like this (in_serving_softirq), but as Ilias
> wrote it was changed recently.  This was to support threaded-NAPI (in
> 542bcea4be866b ("net: page_pool: use in_softirq() instead")), which
> I understood was one of your (Sebastian's) use-cases.
>
>
> > It would also make sense to a add WARN_ON_ONCE(!in_serving_softirq()) to
> > page_pool_alloc_pages() to spot usage outside of softirq. But this will
> > trigger in every driver since the same function is used in the open
> > callback to initially setup the HW.
> >
>
> I'm very open to ideas of detecting this.  Since mentioned commit PP is
> open to these kind of miss-uses of the API.
>
> One idea would be to leverage that NAPI napi->list_owner will have been
> set to something else than -1, when this is NAPI context.  Getting hold
> of napi object, could be done via pp->p.napi (but as Jakub wrote this is
> opt-in ATM).

I mentioned this earlier, but can't we add the softirq check in
__page_pool_get_cached()?
In theory, when a driver comes up and allocates pages to fill in its
descriptors it will call page_pool_alloc_pages().  That will go
through the slow allocation path, fill up the caches, and return the
last page.  After that, most of the allocations will be served by
__page_pool_get_cached(), and this is supposed to be running during
the driver Rx routine which runs under NAPI.  So eventually we will
hit that warning.

Thanks
/Ilias

>
> --Jesper

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-24  7:21   ` Ilias Apalodimas
@ 2023-08-24  7:42     ` Ilias Apalodimas
  0 siblings, 0 replies; 22+ messages in thread
From: Ilias Apalodimas @ 2023-08-24  7:42 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Sebastian Andrzej Siewior, netdev, Ratheesh Kannoth,
	David S. Miller, Eric Dumazet, Geetha sowjanya, Jakub Kicinski,
	Paolo Abeni, Subbaraya Sundeep, Sunil Goutham, Thomas Gleixner,
	hariprasad, Alexander Lobakin, Qingfang DENG

On Thu, 24 Aug 2023 at 10:21, Ilias Apalodimas
<ilias.apalodimas@linaro.org> wrote:
>
> [...]
>
> > >
> > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > index 7ff80b80a6f9f..b50e219470a36 100644
> > > --- a/net/core/page_pool.c
> > > +++ b/net/core/page_pool.c
> > > @@ -612,7 +612,7 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
> > >                       page_pool_dma_sync_for_device(pool, page,
> > >                                                     dma_sync_size);
> > >
> > > -             if (allow_direct && in_softirq() &&
> > > +             if (allow_direct && in_serving_softirq() &&
> >
> > This is the "return/free/put" code path, where we have "allow_direct" as
> > a protection in the API.  API users are suppose to use
> > page_pool_recycle_direct() to indicate this, but as some point we
> > allowed APIs to expose 'allow_direct'.
> >
> > The PP-alloc side is more fragile, and maybe the in_serving_softirq()
> > belongs there.
> >
> > >                   page_pool_recycle_in_cache(page, pool))
> > >                       return NULL;
> > >
> > > because the intention (as I understand it) is to be invoked from within
> > > the NAPI callback (while softirq is served) and not if BH is just
> > > disabled due to a lock or so.
> > >
> >
> > True, and it used-to-be like this (in_serving_softirq), but as Ilias
> > wrote it was changed recently.  This was to support threaded-NAPI (in
> > 542bcea4be866b ("net: page_pool: use in_softirq() instead")), which
> > I understood was one of your (Sebastian's) use-cases.
> >
> >
> > > It would also make sense to a add WARN_ON_ONCE(!in_serving_softirq()) to
> > > page_pool_alloc_pages() to spot usage outside of softirq. But this will
> > > trigger in every driver since the same function is used in the open
> > > callback to initially setup the HW.
> > >
> >
> > I'm very open to ideas of detecting this.  Since mentioned commit PP is
> > open to these kind of miss-uses of the API.
> >
> > One idea would be to leverage that NAPI napi->list_owner will have been
> > set to something else than -1, when this is NAPI context.  Getting hold
> > of napi object, could be done via pp->p.napi (but as Jakub wrote this is
> > opt-in ATM).
>
> I mentioned this earlier, but can't we add the softirq check in
> __page_pool_get_cached()?
> In theory, when a driver comes up and allocates pages to fill in its
> descriptors it will call page_pool_alloc_pages().  That will go
> through the slow allocation path, fill up the caches, and return the
> last page.  After that, most of the allocations will be served by
> __page_pool_get_cached(), and this is supposed to be running during
> the driver Rx routine which runs under NAPI.  So eventually we will
> hit that warning.

Right... Scratch that, this will still warn on the initial allocation.
The first descriptor will get a page of the slow path, but the rest
will be filled via the caches.


/Ilias
>
> Thanks
> /Ilias
>
> >
> > --Jesper

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-23 19:45 ` Jesper Dangaard Brouer
  2023-08-24  7:21   ` Ilias Apalodimas
@ 2023-08-24 15:26   ` Alexander Lobakin
  2023-08-25 13:22     ` Jesper Dangaard Brouer
  2023-08-25 13:16   ` Jesper Dangaard Brouer
  2 siblings, 1 reply; 22+ messages in thread
From: Alexander Lobakin @ 2023-08-24 15:26 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Sebastian Andrzej Siewior, netdev, Ratheesh Kannoth,
	David S. Miller, Eric Dumazet, Geetha sowjanya, Ilias Apalodimas,
	Jakub Kicinski, Paolo Abeni, Subbaraya Sundeep, Sunil Goutham,
	Thomas Gleixner, hariprasad, Qingfang DENG

From: Jesper Dangaard Brouer <hawk@kernel.org>
Date: Wed, 23 Aug 2023 21:45:04 +0200

> (Cc Olek as he have changes in this code path)

Thanks! I was reading the thread a bit on LKML, but being in the CC list
is more convenient :D

> 
> On 23/08/2023 11.47, Sebastian Andrzej Siewior wrote:
>> Hi,
>>
>> I've been looking at the page_pool locking.
>>
>> page_pool_alloc_frag() -> page_pool_alloc_pages() ->
>> __page_pool_get_cached():
>>
>> There core of the allocation is:
>> |         /* Caller MUST guarantee safe non-concurrent access, e.g.
>> softirq */
>> |         if (likely(pool->alloc.count)) {
>> |                 /* Fast-path */
>> |                 page = pool->alloc.cache[--pool->alloc.count];
>>
>> The access to the `cache' array and the `count' variable is not locked.
>> This is fine as long as there only one consumer per pool. In my
>> understanding the intention is to have one page_pool per NAPI callback
>> to ensure this.
>>
> 
> Yes, the intention is a single PP instance is "bound" to one RX-NAPI.

Isn't that also a misuse of page_pool->p.napi? I thought it can be set
only when page allocation and cache refill happen both inside the same
NAPI polling function. Otx2 uses workqueues to refill the queues,
meaning that consumer and producer can happen in different contexts or
even threads and it shouldn't set p.napi.

> 
> 
>> The pool can be filled in the same context (within allocation if the
>> pool is empty). There is also page_pool_recycle_in_cache() which fills
>> the pool from within skb free, for instance:
>>   napi_consume_skb() -> skb_release_all() -> skb_release_data() ->
>>   napi_frag_unref() -> page_pool_return_skb_page().
>>
>> The last one has the following check here:
>> |         napi = READ_ONCE(pp->p.napi);
>> |         allow_direct = napi_safe && napi &&
>> |                 READ_ONCE(napi->list_owner) == smp_processor_id();
>>
>> This eventually ends in page_pool_recycle_in_cache() where it adds the
>> page to the cache buffer if the check above is true (and BH is disabled).
>>
>> napi->list_owner is set once NAPI is scheduled until the poll callback
>> completed. It is safe to add items to list because only one of the two
>> can run on a single CPU and the completion of them ensured by having BH
>> disabled the whole time.
>>
>> This breaks in octeontx2 where a worker is used to fill the buffer:
>>    otx2_pool_refill_task() -> otx2_alloc_rbuf() -> __otx2_alloc_rbuf() ->
>>    otx2_alloc_pool_buf() -> page_pool_alloc_frag().
>>
> 
> This seems problematic! - this is NOT allowed.
> 
> But otx2_pool_refill_task() is a work-queue, and I though it runs in
> process-context.  This WQ process is not allowed to use the lockless PP
> cache.  This seems to be a bug!
> 
> The problematic part is otx2_alloc_rbuf() that disables BH:
> 
>  int otx2_alloc_rbuf(struct otx2_nic *pfvf, struct otx2_pool *pool,
>             dma_addr_t *dma)
>  {
>     int ret;
> 
>     local_bh_disable();
>     ret = __otx2_alloc_rbuf(pfvf, pool, dma);
>     local_bh_enable();
>     return ret;
>  }
> 
> The fix, can be to not do this local_bh_disable() in this driver?
> 
>> BH is disabled but the add of a page can still happen while NAPI
>> callback runs on a remote CPU and so corrupting the index/ array.
>>
>> API wise I would suggest to
>>
>> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
>> index 7ff80b80a6f9f..b50e219470a36 100644
>> --- a/net/core/page_pool.c
>> +++ b/net/core/page_pool.c
>> @@ -612,7 +612,7 @@ __page_pool_put_page(struct page_pool *pool,
>> struct page *page,
>>               page_pool_dma_sync_for_device(pool, page,
>>                                 dma_sync_size);
>>   -        if (allow_direct && in_softirq() &&
>> +        if (allow_direct && in_serving_softirq() &&
> 
> This is the "return/free/put" code path, where we have "allow_direct" as
> a protection in the API.  API users are suppose to use
> page_pool_recycle_direct() to indicate this, but as some point we
> allowed APIs to expose 'allow_direct'.
> 
> The PP-alloc side is more fragile, and maybe the in_serving_softirq()
> belongs there.
> 
>>               page_pool_recycle_in_cache(page, pool))
>>               return NULL;
>>   because the intention (as I understand it) is to be invoked from within
>> the NAPI callback (while softirq is served) and not if BH is just
>> disabled due to a lock or so.
>>
> 
> True, and it used-to-be like this (in_serving_softirq), but as Ilias
> wrote it was changed recently.  This was to support threaded-NAPI (in
> 542bcea4be866b ("net: page_pool: use in_softirq() instead")), which
> I understood was one of your (Sebastian's) use-cases.
> 
> 
>> It would also make sense to a add WARN_ON_ONCE(!in_serving_softirq()) to
>> page_pool_alloc_pages() to spot usage outside of softirq. But this will
>> trigger in every driver since the same function is used in the open
>> callback to initially setup the HW.
>>
> 
> I'm very open to ideas of detecting this.  Since mentioned commit PP is
> open to these kind of miss-uses of the API.
> 
> One idea would be to leverage that NAPI napi->list_owner will have been
> set to something else than -1, when this is NAPI context.  Getting hold
> of napi object, could be done via pp->p.napi (but as Jakub wrote this is
> opt-in ATM).
> 
> --Jesper

Thanks,
Olek

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-23 19:45 ` Jesper Dangaard Brouer
  2023-08-24  7:21   ` Ilias Apalodimas
  2023-08-24 15:26   ` Alexander Lobakin
@ 2023-08-25 13:16   ` Jesper Dangaard Brouer
  2023-08-30  7:14     ` [EXT] " Ratheesh Kannoth
  2 siblings, 1 reply; 22+ messages in thread
From: Jesper Dangaard Brouer @ 2023-08-25 13:16 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, netdev, Ratheesh Kannoth
  Cc: hawk, David S. Miller, Eric Dumazet, Geetha sowjanya,
	Ilias Apalodimas, Jakub Kicinski, Paolo Abeni, Subbaraya Sundeep,
	Sunil Goutham, Thomas Gleixner, hariprasad, Alexander Lobakin,
	Qingfang DENG



On 23/08/2023 21.45, Jesper Dangaard Brouer wrote:
> On 23/08/2023 11.47, Sebastian Andrzej Siewior wrote:
[...]
>>
>> This breaks in octeontx2 where a worker is used to fill the buffer:
>>    otx2_pool_refill_task() -> otx2_alloc_rbuf() -> __otx2_alloc_rbuf() ->
>>    otx2_alloc_pool_buf() -> page_pool_alloc_frag().
>>
> 
> This seems problematic! - this is NOT allowed.
> 
> But otx2_pool_refill_task() is a work-queue, and I though it runs in
> process-context.  This WQ process is not allowed to use the lockless PP
> cache.  This seems to be a bug!
> 
> The problematic part is otx2_alloc_rbuf() that disables BH:
> 
>   int otx2_alloc_rbuf(struct otx2_nic *pfvf, struct otx2_pool *pool,
>              dma_addr_t *dma)
>   {
>      int ret;
> 
>      local_bh_disable();
>      ret = __otx2_alloc_rbuf(pfvf, pool, dma);
>      local_bh_enable();
>      return ret;
>   }
> 
> The fix, can be to not do this local_bh_disable() in this driver?

Correcting myself. It is not a fix to remove this local_bh_disable().
(which is obvious now I read the code again).

This WQ process is not allowed to use the page_pool_alloc() API this way
(from a work-queue).  The PP alloc-side API must only be used under NAPI
protection.  Thanks for spotting this Sebastian!

Will/can any of the Cc'ed Marvell people work on a fix?

--Jesper

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-24 15:26   ` Alexander Lobakin
@ 2023-08-25 13:22     ` Jesper Dangaard Brouer
  2023-08-25 13:38       ` Alexander Lobakin
  0 siblings, 1 reply; 22+ messages in thread
From: Jesper Dangaard Brouer @ 2023-08-25 13:22 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: hawk, Sebastian Andrzej Siewior, netdev, Ratheesh Kannoth,
	David S. Miller, Eric Dumazet, Geetha sowjanya, Ilias Apalodimas,
	Jakub Kicinski, Paolo Abeni, Subbaraya Sundeep, Sunil Goutham,
	Thomas Gleixner, hariprasad, Qingfang DENG



On 24/08/2023 17.26, Alexander Lobakin wrote:
> From: Jesper Dangaard Brouer<hawk@kernel.org>
> Date: Wed, 23 Aug 2023 21:45:04 +0200
> 
>> (Cc Olek as he have changes in this code path)
> Thanks! I was reading the thread a bit on LKML, but being in the CC list
> is more convenient :D
> 

:D

>> On 23/08/2023 11.47, Sebastian Andrzej Siewior wrote:
>>> Hi,
>>>
>>> I've been looking at the page_pool locking.
>>>
>>> page_pool_alloc_frag() -> page_pool_alloc_pages() ->
>>> __page_pool_get_cached():
>>>
>>> There core of the allocation is:
>>> |         /* Caller MUST guarantee safe non-concurrent access, e.g.
>>> softirq */
>>> |         if (likely(pool->alloc.count)) {
>>> |                 /* Fast-path */
>>> |                 page = pool->alloc.cache[--pool->alloc.count];
>>>
>>> The access to the `cache' array and the `count' variable is not locked.
>>> This is fine as long as there only one consumer per pool. In my
>>> understanding the intention is to have one page_pool per NAPI callback
>>> to ensure this.
>>>
>> Yes, the intention is a single PP instance is "bound" to one RX-NAPI.
>
> Isn't that also a misuse of page_pool->p.napi? I thought it can be set
> only when page allocation and cache refill happen both inside the same
> NAPI polling function. Otx2 uses workqueues to refill the queues,
> meaning that consumer and producer can happen in different contexts or
> even threads and it shouldn't set p.napi.
> 

As Jakub wrote this otx2 driver doesn't set p.napi (so that part of the
problem cannot happen).

That said using workqueues to refill the queues is not compatible with
using page_pool_alloc APIs.  This need to be fixed in driver!

--Jesper

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-25 13:22     ` Jesper Dangaard Brouer
@ 2023-08-25 13:38       ` Alexander Lobakin
  2023-08-25 17:25         ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 22+ messages in thread
From: Alexander Lobakin @ 2023-08-25 13:38 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Sebastian Andrzej Siewior, netdev, Ratheesh Kannoth,
	David S. Miller, Eric Dumazet, Geetha sowjanya, Ilias Apalodimas,
	Jakub Kicinski, Paolo Abeni, Subbaraya Sundeep, Sunil Goutham,
	Thomas Gleixner, hariprasad, Qingfang DENG

From: Jesper Dangaard Brouer <hawk@kernel.org>
Date: Fri, 25 Aug 2023 15:22:05 +0200

> 
> 
> On 24/08/2023 17.26, Alexander Lobakin wrote:
>> From: Jesper Dangaard Brouer<hawk@kernel.org>
>> Date: Wed, 23 Aug 2023 21:45:04 +0200
>>
>>> (Cc Olek as he have changes in this code path)
>> Thanks! I was reading the thread a bit on LKML, but being in the CC list
>> is more convenient :D
>>
> 
> :D
> 
>>> On 23/08/2023 11.47, Sebastian Andrzej Siewior wrote:
>>>> Hi,
>>>>
>>>> I've been looking at the page_pool locking.
>>>>
>>>> page_pool_alloc_frag() -> page_pool_alloc_pages() ->
>>>> __page_pool_get_cached():
>>>>
>>>> There core of the allocation is:
>>>> |         /* Caller MUST guarantee safe non-concurrent access, e.g.
>>>> softirq */
>>>> |         if (likely(pool->alloc.count)) {
>>>> |                 /* Fast-path */
>>>> |                 page = pool->alloc.cache[--pool->alloc.count];
>>>>
>>>> The access to the `cache' array and the `count' variable is not locked.
>>>> This is fine as long as there only one consumer per pool. In my
>>>> understanding the intention is to have one page_pool per NAPI callback
>>>> to ensure this.
>>>>
>>> Yes, the intention is a single PP instance is "bound" to one RX-NAPI.
>>
>> Isn't that also a misuse of page_pool->p.napi? I thought it can be set
>> only when page allocation and cache refill happen both inside the same
>> NAPI polling function. Otx2 uses workqueues to refill the queues,
>> meaning that consumer and producer can happen in different contexts or
>> even threads and it shouldn't set p.napi.
>>
> 
> As Jakub wrote this otx2 driver doesn't set p.napi (so that part of the
> problem cannot happen).

Oh I'm dumb :z

> 
> That said using workqueues to refill the queues is not compatible with
> using page_pool_alloc APIs.  This need to be fixed in driver!

Hmmm I'm wondering now, how should we use Page Pool if we want to run
consume and alloc routines separately? Do you want to say we can't use
Page Pool in that case at all? Quoting other your reply:

> This WQ process is not allowed to use the page_pool_alloc() API this
> way (from a work-queue).  The PP alloc-side API must only be used
> under NAPI protection.

Who did say that? If I don't set p.napi, how is Page Pool then tied to NAPI?
Moreover, when you initially do an ifup/.ndo_open, you have to fill your
Rx queues. It's process context and it can happen on whichever CPU.
Do you mean I can't allocate pages in .ndo_open? :D

> 
> --Jesper

Thanks,
Olek

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-25 13:38       ` Alexander Lobakin
@ 2023-08-25 17:25         ` Jesper Dangaard Brouer
  2023-08-26  0:42           ` Jakub Kicinski
  2023-08-28 11:07           ` Alexander Lobakin
  0 siblings, 2 replies; 22+ messages in thread
From: Jesper Dangaard Brouer @ 2023-08-25 17:25 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: hawk, Sebastian Andrzej Siewior, netdev, Ratheesh Kannoth,
	David S. Miller, Eric Dumazet, Geetha sowjanya, Ilias Apalodimas,
	Jakub Kicinski, Paolo Abeni, Subbaraya Sundeep, Sunil Goutham,
	Thomas Gleixner, hariprasad, Qingfang DENG



On 25/08/2023 15.38, Alexander Lobakin wrote:
> From: Jesper Dangaard Brouer <hawk@kernel.org>
> Date: Fri, 25 Aug 2023 15:22:05 +0200
> 
>>
>>
>> On 24/08/2023 17.26, Alexander Lobakin wrote:
>>> From: Jesper Dangaard Brouer<hawk@kernel.org>
>>> Date: Wed, 23 Aug 2023 21:45:04 +0200
>>>
>>>> (Cc Olek as he have changes in this code path)
>>> Thanks! I was reading the thread a bit on LKML, but being in the CC list
>>> is more convenient :D
>>>
>>
>> :D
>>
>>>> On 23/08/2023 11.47, Sebastian Andrzej Siewior wrote:
>>>>> Hi,
>>>>>
>>>>> I've been looking at the page_pool locking.
>>>>>
>>>>> page_pool_alloc_frag() -> page_pool_alloc_pages() ->
>>>>> __page_pool_get_cached():
>>>>>
>>>>> There core of the allocation is:
>>>>> |         /* Caller MUST guarantee safe non-concurrent access, e.g.
>>>>> softirq */
>>>>> |         if (likely(pool->alloc.count)) {
>>>>> |                 /* Fast-path */
>>>>> |                 page = pool->alloc.cache[--pool->alloc.count];
>>>>>
>>>>> The access to the `cache' array and the `count' variable is not locked.
>>>>> This is fine as long as there only one consumer per pool. In my
>>>>> understanding the intention is to have one page_pool per NAPI callback
>>>>> to ensure this.
>>>>>
>>>> Yes, the intention is a single PP instance is "bound" to one RX-NAPI.
>>>
>>> Isn't that also a misuse of page_pool->p.napi? I thought it can be set
>>> only when page allocation and cache refill happen both inside the same
>>> NAPI polling function. Otx2 uses workqueues to refill the queues,
>>> meaning that consumer and producer can happen in different contexts or
>>> even threads and it shouldn't set p.napi.
>>>
>>
>> As Jakub wrote this otx2 driver doesn't set p.napi (so that part of the
>> problem cannot happen).
> 
> Oh I'm dumb :z
> 
>>
>> That said using workqueues to refill the queues is not compatible with
>> using page_pool_alloc APIs.  This need to be fixed in driver!
> 
> Hmmm I'm wondering now, how should we use Page Pool if we want to run
> consume and alloc routines separately? Do you want to say we can't use
> Page Pool in that case at all? Quoting other your reply:
> 
>> This WQ process is not allowed to use the page_pool_alloc() API this
>> way (from a work-queue).  The PP alloc-side API must only be used
>> under NAPI protection.
> 
> Who did say that? If I don't set p.napi, how is Page Pool then tied to NAPI?

*I* say that (as the PP inventor) as that was the design and intent,
that this is tied to a NAPI instance and rely on the NAPI protection to
make it safe to do lockless access to this cache array.

> Moreover, when you initially do an ifup/.ndo_open, you have to fill your
> Rx queues. It's process context and it can happen on whichever CPU.
> Do you mean I can't allocate pages in .ndo_open? :D

True, that all driver basically allocate from this *before* the RX-ring 
/ NAPI is activated.  That is safe and "allowed" given the driver 
RX-ring is not active yet.  This use-case unfortunately also make it 
harder to add something to the PP API, that detect miss-usage of the API.

Looking again at the driver otx2_txrx.c NAPI code path also calls PP 
directly in otx2_napi_handler() will call refill_pool_ptrs() -> 
otx2_refill_pool_ptrs() -> otx2_alloc_buffer() -> __otx2_alloc_rbuf() -> 
if (pool->page_pool) otx2_alloc_pool_buf() -> page_pool_alloc_frag().

The function otx2_alloc_buffer() can also choose to start a WQ, that 
also called PP alloc API this time via otx2_alloc_rbuf() that have 
BH-disable.  Like Sebastian, I don't think this is safe!

--Jesper

This can be a workaround fix:

$ git diff
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c 
b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
index dce3cea00032..ab7ca146fddf 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
@@ -578,6 +578,10 @@ int otx2_alloc_buffer(struct otx2_nic *pfvf, struct 
otx2_cq_queue *cq,
                 struct refill_work *work;
                 struct delayed_work *dwork;

+               /* page_pool alloc API cannot be used from WQ */
+               if (cq->rbpool->page_pool)
+                       return -ENOMEM;
+
                 work = &pfvf->refill_wrk[cq->cq_idx];
                 dwork = &work->pool_refill_work;
                 /* Schedule a task if no other task is running */

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-25 17:25         ` Jesper Dangaard Brouer
@ 2023-08-26  0:42           ` Jakub Kicinski
  2023-08-28 10:59             ` Alexander Lobakin
  2023-08-28 12:25             ` Jesper Dangaard Brouer
  2023-08-28 11:07           ` Alexander Lobakin
  1 sibling, 2 replies; 22+ messages in thread
From: Jakub Kicinski @ 2023-08-26  0:42 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Alexander Lobakin, Sebastian Andrzej Siewior, netdev,
	Ratheesh Kannoth, David S. Miller, Eric Dumazet, Geetha sowjanya,
	Ilias Apalodimas, Paolo Abeni, Subbaraya Sundeep, Sunil Goutham,
	Thomas Gleixner, hariprasad, Qingfang DENG

On Fri, 25 Aug 2023 19:25:42 +0200 Jesper Dangaard Brouer wrote:
> >> This WQ process is not allowed to use the page_pool_alloc() API this
> >> way (from a work-queue).  The PP alloc-side API must only be used
> >> under NAPI protection.  
> > 
> > Who did say that? If I don't set p.napi, how is Page Pool then tied to NAPI?  
> 
> *I* say that (as the PP inventor) as that was the design and intent,
> that this is tied to a NAPI instance and rely on the NAPI protection to
> make it safe to do lockless access to this cache array.

Absolutely no objection to us making the NAPI / bh context a requirement
past the startup stage, but just to be sure I understand the code -
technically if the driver never recycles direct, does not set the NAPI,
does not use xdp_return_frame_rx_napi etc. - the cache is always empty
so we good?

I wonder if we can add a check like "mark the pool as BH-only on first
BH use, and WARN() on process use afterwards". But I'm not sure what
CONFIG you'd accept that being under ;)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-26  0:42           ` Jakub Kicinski
@ 2023-08-28 10:59             ` Alexander Lobakin
  2023-08-28 12:25             ` Jesper Dangaard Brouer
  1 sibling, 0 replies; 22+ messages in thread
From: Alexander Lobakin @ 2023-08-28 10:59 UTC (permalink / raw)
  To: Jakub Kicinski, Jesper Dangaard Brouer
  Cc: Sebastian Andrzej Siewior, netdev, Ratheesh Kannoth,
	David S. Miller, Eric Dumazet, Geetha sowjanya, Ilias Apalodimas,
	Paolo Abeni, Subbaraya Sundeep, Sunil Goutham, Thomas Gleixner,
	hariprasad, Qingfang DENG

From: Jakub Kicinski <kuba@kernel.org>
Date: Fri, 25 Aug 2023 17:42:58 -0700

> On Fri, 25 Aug 2023 19:25:42 +0200 Jesper Dangaard Brouer wrote:
>>>> This WQ process is not allowed to use the page_pool_alloc() API this
>>>> way (from a work-queue).  The PP alloc-side API must only be used
>>>> under NAPI protection.  
>>>
>>> Who did say that? If I don't set p.napi, how is Page Pool then tied to NAPI?  
>>
>> *I* say that (as the PP inventor) as that was the design and intent,
>> that this is tied to a NAPI instance and rely on the NAPI protection to
>> make it safe to do lockless access to this cache array.
> 
> Absolutely no objection to us making the NAPI / bh context a requirement
> past the startup stage, but just to be sure I understand the code -
> technically if the driver never recycles direct, does not set the NAPI,
> does not use xdp_return_frame_rx_napi etc. - the cache is always empty
> so we good?

+1, I don't say Otx2 is correct, but I don't see any issues in having
consumer and producer running on different cores and in different
contexes, as long as p.napi is not set.

Split queue model is trending and I don't see reasons why PP may require
serializing -> effectively making it work the same way as "single queue"
mode works. Esp. given that we have ptr_ring with locks, not only the
direct cache.

> 
> I wonder if we can add a check like "mark the pool as BH-only on first
> BH use, and WARN() on process use afterwards". But I'm not sure what

Why do we use spin_lock_bh() and friends then and check in_softirq(), if
PP can work only in one context?

> CONFIG you'd accept that being under ;)

Thanks,
Olek

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-25 17:25         ` Jesper Dangaard Brouer
  2023-08-26  0:42           ` Jakub Kicinski
@ 2023-08-28 11:07           ` Alexander Lobakin
  2023-08-28 12:34             ` Jesper Dangaard Brouer
  2023-08-28 16:40             ` Sebastian Andrzej Siewior
  1 sibling, 2 replies; 22+ messages in thread
From: Alexander Lobakin @ 2023-08-28 11:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Sebastian Andrzej Siewior, netdev, Ratheesh Kannoth,
	David S. Miller, Eric Dumazet, Geetha sowjanya, Ilias Apalodimas,
	Jakub Kicinski, Paolo Abeni, Subbaraya Sundeep, Sunil Goutham,
	Thomas Gleixner, hariprasad, Qingfang DENG

From: Jesper Dangaard Brouer <hawk@kernel.org>
Date: Fri, 25 Aug 2023 19:25:42 +0200

> 
> 
> On 25/08/2023 15.38, Alexander Lobakin wrote:
>> From: Jesper Dangaard Brouer <hawk@kernel.org>
>> Date: Fri, 25 Aug 2023 15:22:05 +0200
>>
>>>
>>>
>>> On 24/08/2023 17.26, Alexander Lobakin wrote:
>>>> From: Jesper Dangaard Brouer<hawk@kernel.org>
>>>> Date: Wed, 23 Aug 2023 21:45:04 +0200
>>>>
>>>>> (Cc Olek as he have changes in this code path)
>>>> Thanks! I was reading the thread a bit on LKML, but being in the CC
>>>> list
>>>> is more convenient :D
>>>>
>>>
>>> :D
>>>
>>>>> On 23/08/2023 11.47, Sebastian Andrzej Siewior wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I've been looking at the page_pool locking.
>>>>>>
>>>>>> page_pool_alloc_frag() -> page_pool_alloc_pages() ->
>>>>>> __page_pool_get_cached():
>>>>>>
>>>>>> There core of the allocation is:
>>>>>> |         /* Caller MUST guarantee safe non-concurrent access, e.g.
>>>>>> softirq */
>>>>>> |         if (likely(pool->alloc.count)) {
>>>>>> |                 /* Fast-path */
>>>>>> |                 page = pool->alloc.cache[--pool->alloc.count];
>>>>>>
>>>>>> The access to the `cache' array and the `count' variable is not
>>>>>> locked.
>>>>>> This is fine as long as there only one consumer per pool. In my
>>>>>> understanding the intention is to have one page_pool per NAPI
>>>>>> callback
>>>>>> to ensure this.
>>>>>>
>>>>> Yes, the intention is a single PP instance is "bound" to one RX-NAPI.
>>>>
>>>> Isn't that also a misuse of page_pool->p.napi? I thought it can be set
>>>> only when page allocation and cache refill happen both inside the same
>>>> NAPI polling function. Otx2 uses workqueues to refill the queues,
>>>> meaning that consumer and producer can happen in different contexts or
>>>> even threads and it shouldn't set p.napi.
>>>>
>>>
>>> As Jakub wrote this otx2 driver doesn't set p.napi (so that part of the
>>> problem cannot happen).
>>
>> Oh I'm dumb :z
>>
>>>
>>> That said using workqueues to refill the queues is not compatible with
>>> using page_pool_alloc APIs.  This need to be fixed in driver!
>>
>> Hmmm I'm wondering now, how should we use Page Pool if we want to run
>> consume and alloc routines separately? Do you want to say we can't use
>> Page Pool in that case at all? Quoting other your reply:
>>
>>> This WQ process is not allowed to use the page_pool_alloc() API this
>>> way (from a work-queue).  The PP alloc-side API must only be used
>>> under NAPI protection.
>>
>> Who did say that? If I don't set p.napi, how is Page Pool then tied to
>> NAPI?
> 
> *I* say that (as the PP inventor) as that was the design and intent,

> *I*
> the PP inventor

Sure I got that a couple years ago, no need to keep reminding me that xD

> that this is tied to a NAPI instance and rely on the NAPI protection to
> make it safe to do lockless access to this cache array.
> 
>> Moreover, when you initially do an ifup/.ndo_open, you have to fill your
>> Rx queues. It's process context and it can happen on whichever CPU.
>> Do you mean I can't allocate pages in .ndo_open? :D
> 
> True, that all driver basically allocate from this *before* the RX-ring
> / NAPI is activated.  That is safe and "allowed" given the driver
> RX-ring is not active yet.  This use-case unfortunately also make it
> harder to add something to the PP API, that detect miss-usage of the API.
> 
> Looking again at the driver otx2_txrx.c NAPI code path also calls PP
> directly in otx2_napi_handler() will call refill_pool_ptrs() ->
> otx2_refill_pool_ptrs() -> otx2_alloc_buffer() -> __otx2_alloc_rbuf() ->
> if (pool->page_pool) otx2_alloc_pool_buf() -> page_pool_alloc_frag().
> 
> The function otx2_alloc_buffer() can also choose to start a WQ, that
> also called PP alloc API this time via otx2_alloc_rbuf() that have
> BH-disable.  Like Sebastian, I don't think this is safe!

Disabling BH doesn't look correct to me, but I don't see issues in
having consumer and producer working on different cores, as long as they
use ptr_ring with locks.

> 
> --Jesper
> 
> This can be a workaround fix:
> 
> $ git diff
> diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
> b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
> index dce3cea00032..ab7ca146fddf 100644
> --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
> +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
> @@ -578,6 +578,10 @@ int otx2_alloc_buffer(struct otx2_nic *pfvf, struct
> otx2_cq_queue *cq,
>                 struct refill_work *work;
>                 struct delayed_work *dwork;
> 
> +               /* page_pool alloc API cannot be used from WQ */
> +               if (cq->rbpool->page_pool)
> +                       return -ENOMEM;

I believe that breaks the driver?

> +
>                 work = &pfvf->refill_wrk[cq->cq_idx];
>                 dwork = &work->pool_refill_work;
>                 /* Schedule a task if no other task is running */

Thanks,
Olek

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-26  0:42           ` Jakub Kicinski
  2023-08-28 10:59             ` Alexander Lobakin
@ 2023-08-28 12:25             ` Jesper Dangaard Brouer
  1 sibling, 0 replies; 22+ messages in thread
From: Jesper Dangaard Brouer @ 2023-08-28 12:25 UTC (permalink / raw)
  To: Jakub Kicinski, Jesper Dangaard Brouer
  Cc: Alexander Lobakin, Sebastian Andrzej Siewior, netdev,
	Ratheesh Kannoth, David S. Miller, Eric Dumazet, Geetha sowjanya,
	Ilias Apalodimas, Paolo Abeni, Subbaraya Sundeep, Sunil Goutham,
	Thomas Gleixner, hariprasad, Qingfang DENG



On 26/08/2023 02.42, Jakub Kicinski wrote:
> On Fri, 25 Aug 2023 19:25:42 +0200 Jesper Dangaard Brouer wrote:
>>>> This WQ process is not allowed to use the page_pool_alloc() API this
>>>> way (from a work-queue).  The PP alloc-side API must only be used
>>>> under NAPI protection.
>>>
>>> Who did say that? If I don't set p.napi, how is Page Pool then tied to NAPI?
>>
>> *I* say that (as the PP inventor) as that was the design and intent,
>> that this is tied to a NAPI instance and rely on the NAPI protection to
>> make it safe to do lockless access to this cache array.
> 
> Absolutely no objection to us making the NAPI / bh context a requirement
> past the startup stage, but just to be sure I understand the code -
> technically if the driver never recycles direct, does not set the NAPI,
> does not use xdp_return_frame_rx_napi etc. - the cache is always empty
> so we good?
> 

Nope cache is NOT always empty, the PP cache will be refilled if empty.
Thus, PP alloc side code will touch/use pool->alloc.cache[].  See two
places in code with comment: /* Return last page */.

The PP cache is always refilled; Either from ptr_ring or via
page-allocators bulking APIs.

> I wonder if we can add a check like "mark the pool as BH-only on first
> BH use, and WARN() on process use afterwards". But I'm not sure what
> CONFIG you'd accept that being under ;)
> 

The PP alloc side is designed as a Single Consumer data-structure/model
on alloc side, (via a lockless cache/array).  That on empty cache
fallback to bulk from a Multi Consumer data-structure/model to amortize
that cost.  This is where the PP speedup comes from.

--Jesper

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-28 11:07           ` Alexander Lobakin
@ 2023-08-28 12:34             ` Jesper Dangaard Brouer
  2023-08-28 16:40             ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 22+ messages in thread
From: Jesper Dangaard Brouer @ 2023-08-28 12:34 UTC (permalink / raw)
  To: Alexander Lobakin, Jesper Dangaard Brouer
  Cc: brouer, Sebastian Andrzej Siewior, netdev, Ratheesh Kannoth,
	David S. Miller, Eric Dumazet, Geetha sowjanya, Ilias Apalodimas,
	Jakub Kicinski, Paolo Abeni, Subbaraya Sundeep, Sunil Goutham,
	Thomas Gleixner, hariprasad, Qingfang DENG



On 28/08/2023 13.07, Alexander Lobakin wrote:
>> This can be a workaround fix:
>>
>> $ git diff
>> diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
>> b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
>> index dce3cea00032..ab7ca146fddf 100644
>> --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
>> +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
>> @@ -578,6 +578,10 @@ int otx2_alloc_buffer(struct otx2_nic *pfvf, struct
>> otx2_cq_queue *cq,
>>                  struct refill_work *work;
>>                  struct delayed_work *dwork;
>>
>> +               /* page_pool alloc API cannot be used from WQ */
>> +               if (cq->rbpool->page_pool)
>> +                       return -ENOMEM;
> I believe that breaks the driver?
> 

Why would that break the driver?

AFAIK returning 0 here will break the driver.
We need to return something non-zero, see otx2_refill_pool_ptrs() 
copy-pasted below signature.


>> +
>>                  work = &pfvf->refill_wrk[cq->cq_idx];
>>                  dwork = &work->pool_refill_work;
>>                  /* Schedule a task if no other task is running */


--Jesper

  void otx2_refill_pool_ptrs(void *dev, struct otx2_cq_queue *cq)
  {
	struct otx2_nic *pfvf = dev;
	dma_addr_t bufptr;

	while (cq->pool_ptrs) {
		if (otx2_alloc_buffer(pfvf, cq, &bufptr))
			break;
		otx2_aura_freeptr(pfvf, cq->cq_idx, bufptr + OTX2_HEAD_ROOM);
		cq->pool_ptrs--;
	}
  }


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-28 11:07           ` Alexander Lobakin
  2023-08-28 12:34             ` Jesper Dangaard Brouer
@ 2023-08-28 16:40             ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-08-28 16:40 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: Jesper Dangaard Brouer, netdev, Ratheesh Kannoth,
	David S. Miller, Eric Dumazet, Geetha sowjanya, Ilias Apalodimas,
	Jakub Kicinski, Paolo Abeni, Subbaraya Sundeep, Sunil Goutham,
	Thomas Gleixner, hariprasad, Qingfang DENG

On 2023-08-28 13:07:12 [+0200], Alexander Lobakin wrote:
> > Looking again at the driver otx2_txrx.c NAPI code path also calls PP
> > directly in otx2_napi_handler() will call refill_pool_ptrs() ->
> > otx2_refill_pool_ptrs() -> otx2_alloc_buffer() -> __otx2_alloc_rbuf() ->
> > if (pool->page_pool) otx2_alloc_pool_buf() -> page_pool_alloc_frag().
> > 
> > The function otx2_alloc_buffer() can also choose to start a WQ, that
> > also called PP alloc API this time via otx2_alloc_rbuf() that have
> > BH-disable.  Like Sebastian, I don't think this is safe!
> 
> Disabling BH doesn't look correct to me, but I don't see issues in
> having consumer and producer working on different cores, as long as they
> use ptr_ring with locks.

After learning what p.napi is about, may I point out that there are also
users that don't check that and use page_pool_put_full_page() or later?
While I can't comment on the bpf/XDP users, browsing otx2: there is
this:
| otx2_stop()
| -> otx2_free_hw_resources()
|   -> otx2_cleanup_rx_cqes
|      -> otx2_free_bufs
|        -> page_pool_put_full_page(, true)
| -> cancel_delayed_work_sync()

otx2 is "safe" here due to the in_softirq() check in
__page_pool_put_page(). But: the worker could run and fill the lock less
buffer while otx2_free_bufs() is doing clean up/ removing pages and
filling this buffer on another CPU.
The worker is synchronised after the free. The lack of BH-disable in
otx2_stop()'s path safes the day here.
(It looks somehow suspicious that there is first "free mem" followed by
"sync refill worker" and not the other way around)

Sebastian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [EXT] Re: [BUG] Possible unsafe page_pool usage in octeontx2
  2023-08-25 13:16   ` Jesper Dangaard Brouer
@ 2023-08-30  7:14     ` Ratheesh Kannoth
  0 siblings, 0 replies; 22+ messages in thread
From: Ratheesh Kannoth @ 2023-08-30  7:14 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Sebastian Andrzej Siewior, netdev
  Cc: David S. Miller, Eric Dumazet, Geethasowjanya Akula,
	Ilias Apalodimas, Jakub Kicinski, Paolo Abeni,
	Subbaraya Sundeep Bhatta, Sunil Kovvuri Goutham, Thomas Gleixner,
	Hariprasad Kelam, Alexander Lobakin, Qingfang DENG

> From: Jesper Dangaard Brouer <hawk@kernel.org>
> Subject: [EXT] Re: [BUG] Possible unsafe page_pool usage in octeontx2
> 
> This WQ process is not allowed to use the page_pool_alloc() API this way
> (from a work-queue).  The PP alloc-side API must only be used under NAPI
> protection.  Thanks for spotting this Sebastian!
> 
> Will/can any of the Cc'ed Marvell people work on a fix?

We are working on it.  
 


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2023-08-30  7:15 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-23  9:47 [BUG] Possible unsafe page_pool usage in octeontx2 Sebastian Andrzej Siewior
2023-08-23 11:36 ` Ilias Apalodimas
2023-08-23 13:31   ` Sebastian Andrzej Siewior
2023-08-23 12:28 ` [EXT] " Ratheesh Kannoth
2023-08-23 12:54   ` Sebastian Andrzej Siewior
2023-08-24  2:49     ` Ratheesh Kannoth
2023-08-23 14:49 ` Jakub Kicinski
2023-08-23 19:45 ` Jesper Dangaard Brouer
2023-08-24  7:21   ` Ilias Apalodimas
2023-08-24  7:42     ` Ilias Apalodimas
2023-08-24 15:26   ` Alexander Lobakin
2023-08-25 13:22     ` Jesper Dangaard Brouer
2023-08-25 13:38       ` Alexander Lobakin
2023-08-25 17:25         ` Jesper Dangaard Brouer
2023-08-26  0:42           ` Jakub Kicinski
2023-08-28 10:59             ` Alexander Lobakin
2023-08-28 12:25             ` Jesper Dangaard Brouer
2023-08-28 11:07           ` Alexander Lobakin
2023-08-28 12:34             ` Jesper Dangaard Brouer
2023-08-28 16:40             ` Sebastian Andrzej Siewior
2023-08-25 13:16   ` Jesper Dangaard Brouer
2023-08-30  7:14     ` [EXT] " Ratheesh Kannoth

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).