netdev.vger.kernel.org archive mirror
From: Saeed Mahameed <saeedm@mellanox.com>
To: "ilias.apalodimas@linaro.org" <ilias.apalodimas@linaro.org>,
	Li Rongqing <lirongqing@baidu.com>
Cc: "mhocko@kernel.org" <mhocko@kernel.org>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"linyunsheng@huawei.com" <linyunsheng@huawei.com>,
	"jonathan.lemon@gmail.com" <jonathan.lemon@gmail.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"brouer@redhat.com" <brouer@redhat.com>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"bjorn.topel@intel.com" <bjorn.topel@intel.com>
Subject: Re: Reply: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE condition
Date: Tue, 17 Dec 2019 19:38:44 +0000
Message-ID: <3f2d88fdcb00b6cc2925d5a2fab38e50d43d8a52.camel@mellanox.com>
In-Reply-To: <20191216101350.GA6939@apalos.home>

On Mon, 2019-12-16 at 12:13 +0200, Ilias Apalodimas wrote:
> On Mon, Dec 16, 2019 at 04:02:04AM +0000, Li,Rongqing wrote:
> > 
> > > -----Original Message-----
> > > From: Yunsheng Lin [mailto:linyunsheng@huawei.com]
> > > Sent: December 16, 2019 9:51
> > > To: Jesper Dangaard Brouer <brouer@redhat.com>
> > > Cc: Li,Rongqing <lirongqing@baidu.com>; Saeed Mahameed
> > > <saeedm@mellanox.com>; ilias.apalodimas@linaro.org;
> > > jonathan.lemon@gmail.com; netdev@vger.kernel.org; mhocko@kernel.org;
> > > peterz@infradead.org; Greg Kroah-Hartman <gregkh@linuxfoundation.org>;
> > > bhelgaas@google.com; linux-kernel@vger.kernel.org; Björn Töpel
> > > <bjorn.topel@intel.com>
> > > Subject: Re: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE
> > > condition
> > > 
> > > On 2019/12/13 16:48, Jesper Dangaard Brouer wrote:
> > > > You are basically saying that the NUMA check should be moved to
> > > > allocation time, as it is running the RX-CPU (NAPI).  And eventually
> > > > after some time the pages will come from correct NUMA node.
> > > > 
> > > > I think we can do that, and only affect the semi-fast-path.
> > > > We just need to handle that pages in the ptr_ring that are recycled
> > > > can be from the wrong NUMA node.  In __page_pool_get_cached() when
> > > > consuming pages from the ptr_ring (__ptr_ring_consume_batched), then
> > > > we can evict pages from wrong NUMA node.
> > > 
> > > Yes, that's workable.
> > > 
> > > > For the pool->alloc.cache we either accept, that it will eventually
> > > > after some time be emptied (it is only in a 100% XDP_DROP workload that
> > > > it will continue to reuse same pages).   Or we simply clear the
> > > > pool->alloc.cache when calling page_pool_update_nid().
> > > 
> > > Simply clearing the pool->alloc.cache when calling page_pool_update_nid()
> > > seems better.
> > > 
> > 
> > How about the code below? The driver can configure p.nid to any value,
> > and it will be adjusted in NAPI polling. IRQ migration will not be a
> > problem, but it adds a check to the hot path.
> 
> We'll have to check the impact on some high speed (i.e. 100gbit) interface
> before doing anything like that. Saeed's current patch runs once per NAPI.
> This runs once per packet. The load might be measurable.
> The READ_ONCE is needed in case all producers/consumers run on the same CPU,
> right?
> 

I agree with Ilias, and as I explained, this will bias the pool toward
memory close to the CPU only, and we want to avoid this.

Li, can you please check if this fixes your issue:

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index a6aefe989043..00c99282a306 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -28,6 +28,9 @@ static int page_pool_init(struct page_pool *pool,
 
        memcpy(&pool->p, params, sizeof(pool->p));
 
+       /* overwrite to allow recycling.. */
+       if (pool->p.nid == NUMA_NO_NODE) 
+               pool->p.nid = numa_mem_id(); 
+

If the user wants dev_to_node(), they can pass dev_to_node() at pool
initialization rather than NUMA_NO_NODE.
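
A minimal userspace sketch of the idea in the patch above, not the kernel
code itself. `struct pool_params`, `struct pool`, `pool_init()` and
`local_node()` are hypothetical stand-ins for `struct page_pool_params`,
`struct page_pool`, `page_pool_init()` and `numa_mem_id()`; the fixed node
number returned by `local_node()` is an assumption made only so the example
is deterministic. It shows NUMA_NO_NODE being resolved once at init, while an
explicit node (for example what a driver would get from dev_to_node()) is
left untouched.

```c
#define NUMA_NO_NODE (-1)

/* Hypothetical stand-in for struct page_pool_params; only nid is modeled. */
struct pool_params {
	int nid;
};

/* Hypothetical stand-in for struct page_pool. */
struct pool {
	struct pool_params p;
};

/* Hypothetical stand-in for numa_mem_id(): pretend we always run on node 1. */
static int local_node(void)
{
	return 1;
}

/* Models the init-time fixup: NUMA_NO_NODE is resolved to the local node. */
static void pool_init(struct pool *pool, const struct pool_params *params)
{
	pool->p = *params;

	/* overwrite to allow recycling */
	if (pool->p.nid == NUMA_NO_NODE)
		pool->p.nid = local_node();
}
```

With this model, a caller that passes NUMA_NO_NODE ends up with nid 1 (the
pretend local node), and a caller that passes an explicit node such as 0
keeps it.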


> Thanks
> /Ilias
> > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > index a6aefe989043..4374a6239d17 100644
> > --- a/net/core/page_pool.c
> > +++ b/net/core/page_pool.c
> > @@ -108,6 +108,10 @@ static struct page *__page_pool_get_cached(struct page_pool *pool)
> >                 if (likely(pool->alloc.count)) {
> >                         /* Fast-path */
> >                         page = pool->alloc.cache[--pool->alloc.count];
> > +
> > +                       if (unlikely(READ_ONCE(pool->p.nid) != numa_mem_id()))
> > +                               WRITE_ONCE(pool->p.nid, numa_mem_id());
> > +
> >                         return page;
> >                 }
> >                 refill = true;
> > @@ -155,6 +159,10 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
> >         if (pool->p.order)
> >                 gfp |= __GFP_COMP;
> >  
> > +
> > +       if (unlikely(READ_ONCE(pool->p.nid) != numa_mem_id()))
> > +               WRITE_ONCE(pool->p.nid, numa_mem_id());
> > +
> > +
> >         /* FUTURE development:
> >          *
> >          * Current slow-path essentially falls back to single page
> > Thanks
> > 
> > -Li
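
For comparison, the alternative Jesper describes in the quoted text (evicting
wrong-node pages while refilling from the ptr_ring, so the check stays out of
the per-packet fast path) could look roughly like the userspace model below.
This is only a sketch under assumptions: `struct fake_page` and
`refill_from_ring()` are hypothetical names, a plain array stands in for the
ptr_ring, and a counter stands in for actually releasing the page.

```c
#include <stddef.h>

/* Hypothetical page model: only the NUMA node of the page is tracked. */
struct fake_page {
	int nid;
};

/*
 * Copy pages from ring[] into cache[], keeping only pages on want_nid.
 * Returns the number of pages kept; *evicted counts the pages dropped.
 * Stops early once the cache is full.
 */
static size_t refill_from_ring(const struct fake_page *ring, size_t ring_len,
			       struct fake_page *cache, size_t cache_cap,
			       int want_nid, size_t *evicted)
{
	size_t kept = 0;
	size_t i;

	*evicted = 0;
	for (i = 0; i < ring_len && kept < cache_cap; i++) {
		if (ring[i].nid == want_nid)
			cache[kept++] = ring[i];
		else
			(*evicted)++;	/* a real pool would release the page here */
	}
	return kept;
}
```

The node comparison here runs once per page consumed from the ring during a
refill, not on every allocation served from the cache.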
