From: Saeed Mahameed <saeedm@mellanox.com>
To: "ilias.apalodimas@linaro.org" <ilias.apalodimas@linaro.org>,
	Li Rongqing <lirongqing@baidu.com>
Cc: "mhocko@kernel.org" <mhocko@kernel.org>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"linyunsheng@huawei.com" <linyunsheng@huawei.com>,
	"jonathan.lemon@gmail.com" <jonathan.lemon@gmail.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"brouer@redhat.com" <brouer@redhat.com>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"bjorn.topel@intel.com" <bjorn.topel@intel.com>
Subject: Re: RE: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE condition
Date: Tue, 17 Dec 2019 19:38:44 +0000
Message-ID: <3f2d88fdcb00b6cc2925d5a2fab38e50d43d8a52.camel@mellanox.com>
In-Reply-To: <20191216101350.GA6939@apalos.home>

On Mon, 2019-12-16 at 12:13 +0200, Ilias Apalodimas wrote:
> On Mon, Dec 16, 2019 at 04:02:04AM +0000, Li,Rongqing wrote:
> > 
> > > -----Original Message-----
> > > From: Yunsheng Lin [mailto:linyunsheng@huawei.com]
> > > Sent: December 16, 2019 9:51
> > > To: Jesper Dangaard Brouer <brouer@redhat.com>
> > > Cc: Li,Rongqing <lirongqing@baidu.com>; Saeed Mahameed
> > > <saeedm@mellanox.com>; ilias.apalodimas@linaro.org;
> > > jonathan.lemon@gmail.com; netdev@vger.kernel.org;
> > > mhocko@kernel.org; peterz@infradead.org;
> > > Greg Kroah-Hartman <gregkh@linuxfoundation.org>;
> > > bhelgaas@google.com; linux-kernel@vger.kernel.org; Björn Töpel
> > > <bjorn.topel@intel.com>
> > > Subject: Re: [PATCH][v2] page_pool: handle page recycle for
> > > NUMA_NO_NODE condition
> > > 
> > > On 2019/12/13 16:48, Jesper Dangaard Brouer wrote:
> > > > You are basically saying that the NUMA check should be moved to
> > > > allocation time, as it is running the RX-CPU (NAPI).  And eventually
> > > > after some time the pages will come from correct NUMA node.
> > > > 
> > > > I think we can do that, and only affect the semi-fast-path.
> > > > We just need to handle that pages in the ptr_ring that are recycled
> > > > can be from the wrong NUMA node.  In __page_pool_get_cached() when
> > > > consuming pages from the ptr_ring (__ptr_ring_consume_batched), then
> > > > we can evict pages from wrong NUMA node.
> > > 
> > > Yes, that's workable.
> > > 
> > > > For the pool->alloc.cache we either accept, that it will eventually
> > > > after some time be emptied (it is only in a 100% XDP_DROP workload
> > > > that it will continue to reuse same pages).   Or we simply clear the
> > > > pool->alloc.cache when calling page_pool_update_nid().
> > > 
> > > Simply clearing the pool->alloc.cache when calling
> > > page_pool_update_nid() seems better.
> > > 
> > 
> > How about the code below? The driver can configure p.nid to any value,
> > which will then be adjusted during NAPI polling. IRQ migration will not
> > be a problem, but it adds a check to the hot path.
> 
> We'll have to check the impact on some high speed (i.e. 100gbit)
> interface before doing anything like that. Saeed's current patch runs
> once per NAPI. This runs once per packet. The load might be measurable.
> The READ_ONCE is needed in case all producers/consumers run on the
> same CPU, right?
> 

I agree with Ilias, and as I explained, this will make the pool biased
toward the CPU-close node only, and we want to avoid this.

Li, can you please check if this fixes your issue:

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index a6aefe989043..00c99282a306 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -28,6 +28,9 @@ static int page_pool_init(struct page_pool *pool,
 
        memcpy(&pool->p, params, sizeof(pool->p));
 
+       /* overwrite to allow recycling.. */
+       if (pool->p.nid == NUMA_NO_NODE) 
+               pool->p.nid = numa_mem_id(); 
+

If the user wants dev_to_node(), then they can use dev_to_node() at pool
initialization rather than NUMA_NO_NODE.
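
To spell out the intent of the hunk above, here is a minimal userspace
sketch, not kernel code. It assumes the recycle path compares the page's
node against pool->p.nid. numa_mem_id() is mocked by a settable global,
and page_pool_init_nid() and page_is_recyclable() are hypothetical names
used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)

/* Mock of the kernel's numa_mem_id(): the "local" memory node is a
 * settable global so the sketch can run in userspace. */
static int mock_local_node;
static int numa_mem_id(void) { return mock_local_node; }

/* Reduced stand-ins for the real structures; only nid is modelled. */
struct page_pool_params { int nid; };
struct page_pool { struct page_pool_params p; };

/* Hypothetical helper mirroring the proposed page_pool_init() change:
 * copy the params, then replace NUMA_NO_NODE with the local node. */
static void page_pool_init_nid(struct page_pool *pool,
                               const struct page_pool_params *params)
{
        assert(pool != NULL && params != NULL);

        pool->p = *params;

        /* overwrite to allow recycling */
        if (pool->p.nid == NUMA_NO_NODE)
                pool->p.nid = numa_mem_id();
}

/* Assumed shape of the recycle test: a page is kept only when its node
 * matches the pool's preferred node. */
static int page_is_recyclable(const struct page_pool *pool, int page_nid)
{
        return page_nid == pool->p.nid;
}
```

Under that assumption, a pool left at NUMA_NO_NODE never matches any real
page node, so nothing is recycled. After the overwrite, pages from the
local node match. A driver that wants the device's node would pass
dev_to_node() in params.nid, and the fallback is never taken.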


> Thanks
> /Ilias
> > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > index a6aefe989043..4374a6239d17 100644
> > --- a/net/core/page_pool.c
> > +++ b/net/core/page_pool.c
> > @@ -108,6 +108,10 @@ static struct page *__page_pool_get_cached(struct page_pool *pool)
> >                 if (likely(pool->alloc.count)) {
> >                         /* Fast-path */
> >                         page = pool->alloc.cache[--pool->alloc.count];
> > +
> > +                       if (unlikely(READ_ONCE(pool->p.nid) != numa_mem_id()))
> > +                               WRITE_ONCE(pool->p.nid, numa_mem_id());
> > +
> >                         return page;
> >                 }
> >                 refill = true;
> > @@ -155,6 +159,10 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
> >         if (pool->p.order)
> >                 gfp |= __GFP_COMP;
> >  
> > +
> > +       if (unlikely(READ_ONCE(pool->p.nid) != numa_mem_id()))
> > +               WRITE_ONCE(pool->p.nid, numa_mem_id());
> > +
> > +
> >         /* FUTURE development:
> >          *
> >          * Current slow-path essentially falls back to single page
> > Thanks
> > 
> > -Li
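
To make Ilias's cost argument concrete (once per NAPI poll versus once
per packet), here is a rough userspace sketch, not kernel code. All names
are hypothetical: struct pool, pool_update_nid() and the two poll
functions only count how often the nid comparison runs for a given
budget. They do not model real packet processing.

```c
#include <stddef.h>

/* Reduced stand-in for a page pool: the preferred node plus a counter
 * of how many times the nid comparison was executed. */
struct pool {
        int nid;
        unsigned long nid_checks;
};

/* Mock of the node the polling CPU is currently running on. */
static int current_node;

/* Hypothetical update helper: compare, and write only on mismatch. */
static void pool_update_nid(struct pool *pool, int new_nid)
{
        pool->nid_checks++;
        if (pool->nid != new_nid)
                pool->nid = new_nid;
}

/* Variant A: one check at the start of each poll. */
static int poll_check_per_napi(struct pool *pool, int budget)
{
        int done = 0;

        pool_update_nid(pool, current_node);
        while (done < budget)
                done++;         /* stands in for receiving one packet */
        return done;
}

/* Variant B: one check for every packet, as in the quoted hot-path patch. */
static int poll_check_per_packet(struct pool *pool, int budget)
{
        int done = 0;

        while (done < budget) {
                pool_update_nid(pool, current_node);
                done++;         /* stands in for receiving one packet */
        }
        return done;
}
```

With a budget of 64, variant A performs one comparison per poll and
variant B performs 64. Both end up with the same pool nid, which is why
the per-packet cost needs to be measured before it is justified.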
