From: "Li,Rongqing" <lirongqing@baidu.com>
To: Yunsheng Lin <linyunsheng@huawei.com>,
	Saeed Mahameed <saeedm@mellanox.com>,
	"jonathan.lemon@gmail.com" <jonathan.lemon@gmail.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"brouer@redhat.com" <brouer@redhat.com>,
	"ilias.apalodimas@linaro.org" <ilias.apalodimas@linaro.org>
Cc: "ivan.khoronzhuk@linaro.org" <ivan.khoronzhuk@linaro.org>,
	"grygorii.strashko@ti.com" <grygorii.strashko@ti.com>
Subject: Re: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE condition
Date: Mon, 9 Dec 2019 03:47:50 +0000
Message-ID: <96bc5e8351a54adc8f00c18a61e2555d@baidu.com>
In-Reply-To: <e724cb64-776d-176e-f55b-3c328d7c5298@huawei.com>

Cc: Grygorii Strashko, Ivan Khoronzhuk

I see that cpsw uses NUMA_NO_NODE when initializing its page pool.

> On 2019/12/7 11:52, Saeed Mahameed wrote:
> > On Fri, 2019-12-06 at 17:32 +0800, Li RongQing wrote:
> >> Some drivers use page pool but do not require pages to be allocated
> >> from a bound node, or simply assign pool.p.nid to NUMA_NO_NODE; the
> >> commit d5394610b1ba ("page_pool:
> >> Don't recycle non-reusable pages") blocks recycling for this kind of
> >> driver.
> >>
> >> So take a page as reusable when it belongs to the current memory node
> >> and nid is NUMA_NO_NODE.
> >>
> >> v1-->v2: add the numa_mem_id() check suggested by Yunsheng
> >>
> >> Fixes: d5394610b1ba ("page_pool: Don't recycle non-reusable pages")
> >> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> >> Suggested-by: Yunsheng Lin <linyunsheng@huawei.com>
> >> Cc: Saeed Mahameed <saeedm@mellanox.com>
> >> ---
> >>  net/core/page_pool.c | 7 ++++++-
> >>  1 file changed, 6 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> >> index a6aefe989043..3c8b51ccd1c1 100644
> >> --- a/net/core/page_pool.c
> >> +++ b/net/core/page_pool.c
> >> @@ -312,12 +312,17 @@ static bool __page_pool_recycle_direct(struct page *page,
> >>  /* page is NOT reusable when:
> >>   * 1) allocated when system is under some pressure. (page_is_pfmemalloc)
> >>   * 2) belongs to a different NUMA node than pool->p.nid.
> >> + * 3) belongs to a different memory node than current context
> >> + * if pool->p.nid is NUMA_NO_NODE
> >>   *
> >>   * To update pool->p.nid users must call page_pool_update_nid.
> >>   */
> >>  static bool pool_page_reusable(struct page_pool *pool, struct page *page)
> >>  {
> >> -	return !page_is_pfmemalloc(page) && page_to_nid(page) == pool->p.nid;
> >> +	return !page_is_pfmemalloc(page) &&
> >> +		(page_to_nid(page) == pool->p.nid ||
> >> +		(pool->p.nid == NUMA_NO_NODE &&
> >> +		page_to_nid(page) == numa_mem_id()));
> >>  }
> >>
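For reference, the v2 predicate can be exercised outside the kernel with a small userspace model; struct page, page_to_nid(), numa_mem_id() and NUMA_NO_NODE below are simplified stand-ins for the kernel definitions, not the real API:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in definitions: only the fields the predicate reads are modeled. */
#define NUMA_NO_NODE (-1)

struct page { int nid; bool pfmemalloc; };
struct page_pool { struct { int nid; } p; };

static int current_node; /* models numa_mem_id() for the executing CPU */

static bool page_is_pfmemalloc(const struct page *page) { return page->pfmemalloc; }
static int page_to_nid(const struct page *page) { return page->nid; }
static int numa_mem_id(void) { return current_node; }

/* The v2 check: a NUMA_NO_NODE pool only recycles pages that sit on the
 * node local to the current context. */
static bool pool_page_reusable(struct page_pool *pool, struct page *page)
{
	return !page_is_pfmemalloc(page) &&
		(page_to_nid(page) == pool->p.nid ||
		(pool->p.nid == NUMA_NO_NODE &&
		page_to_nid(page) == numa_mem_id()));
}
```

With this model, a NUMA_NO_NODE pool running on node 0 recycles node-0 pages but drops node-1 pages and pfmemalloc pages.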
> >
> > Cc'ed Jesper, Ilias & Jonathan.
> >
> > I don't think it is correct to check that the page nid is same as
> > numa_mem_id() if pool is NUMA_NO_NODE. In such case we should allow
> > all pages to recycle, because you can't assume where pages are
> > allocated from and where they are being handled.
> >
> > I suggest the following:
> >
> > return !page_is_pfmemalloc(page) &&
> > 	(page_to_nid(page) == pool->p.nid || pool->p.nid == NUMA_NO_NODE);
> >
> > 1) never recycle emergency pages, regardless of pool nid.
> > 2) always recycle if pool is NUMA_NO_NODE.
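Saeed's alternative differs from the v2 patch only in the NUMA_NO_NODE branch. Modeled the same way (stand-in types, and assuming the `page_is_pfmemalloc(page)` spelling he presumably intended):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in definitions; only what the predicate reads is modeled. */
#define NUMA_NO_NODE (-1)

struct page { int nid; bool pfmemalloc; };
struct page_pool { struct { int nid; } p; };

static bool page_is_pfmemalloc(const struct page *page) { return page->pfmemalloc; }
static int page_to_nid(const struct page *page) { return page->nid; }

/* Saeed's suggestion: a NUMA_NO_NODE pool recycles any non-emergency
 * page, regardless of which node it was allocated on. */
static bool pool_page_reusable(struct page_pool *pool, struct page *page)
{
	return !page_is_pfmemalloc(page) &&
		(page_to_nid(page) == pool->p.nid ||
		 pool->p.nid == NUMA_NO_NODE);
}
```

Under this variant a remote-node page is recycled by a NUMA_NO_NODE pool, whereas the v2 patch would drop it; only pfmemalloc pages are always rejected.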
> 
> As I can see, below are the cases that the pool->p.nid could be
> NUMA_NO_NODE:
> 
> 1. kernel built with the CONFIG_NUMA being off.
> 
> 2. kernel built with the CONFIG_NUMA being on, but FW/BIOS does not provide
>    a valid node id through ACPI/DT, and it has the below cases:
> 
>    a). the hardware itself is a single numa node system, so maybe it is
>        valid not to provide a valid node for the device.
>    b). the hardware itself is a multi numa node system, and the FW/BIOS is
>        broken and does not provide a valid one.
> 
> 3. kernel built with the CONFIG_NUMA being on, and FW/BIOS does provide a
>    valid node id through ACPI/DT, but the driver does not pass the valid
>    node id when calling page_pool_init().
> 
> I am not sure which case this patch is trying to fix; maybe Rongqing can
> help clarify.
> 
> For case 1 and case 2 (a), I suppose checking pool->p.nid being
> NUMA_NO_NODE is enough.
> 
> For case 2 (b), there may be an argument that it should be fixed in the
> BIOS/FW, but it can also be argued that drivers not yet using page pool
> already do the numa_mem_id() check when deciding whether to recycle, see
> [1]. If I understand correctly, recycling is normally done in NAPI
> polling, which has the same affinity as the rx interrupt, and the rx
> interrupt affinity is not changed very often. If it does change to a
> different memory node, maybe it makes sense not to recycle pages
> belonging to the old node?
> 
> 
> For case 3, I am not sure if any driver is doing that, or if the page pool
> API even allows that?
> 

I think pool_page_reusable should support NUMA_NO_NODE in all of these cases.


And recycling is normally done in NAPI polling; NUMA_NO_NODE hints to use the
local node unless memory usage is unbalanced, so I added the check that the
page nid is the same as numa_mem_id() when the pool is NUMA_NO_NODE.
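The affinity-move scenario can be sketched with a small userspace model of the check: once numa_mem_id() starts reporting the new node, pages from the old node stop being recycled. The types and helpers below are stand-ins, not the kernel API:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in definitions for the kernel types and helpers. */
#define NUMA_NO_NODE (-1)

struct page { int nid; bool pfmemalloc; };
struct page_pool { struct { int nid; } p; };

static int current_node; /* models numa_mem_id(); changes when NAPI moves nodes */

static bool page_is_pfmemalloc(const struct page *page) { return page->pfmemalloc; }
static int page_to_nid(const struct page *page) { return page->nid; }
static int numa_mem_id(void) { return current_node; }

/* The patched predicate: for a NUMA_NO_NODE pool, recycle only pages on
 * the node local to the current (NAPI polling) context. */
static bool pool_page_reusable(struct page_pool *pool, struct page *page)
{
	return !page_is_pfmemalloc(page) &&
		(page_to_nid(page) == pool->p.nid ||
		(pool->p.nid == NUMA_NO_NODE &&
		page_to_nid(page) == numa_mem_id()));
}
```

While NAPI stays on node 0, node-0 pages keep being recycled; after the rx interrupt (and thus polling) moves to node 1, the same pages are rejected and fresh local pages get allocated instead.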

-Li


> [1] https://elixir.bootlin.com/linux/latest/ident/numa_mem_id
> 
> >
> > the above change should not add any overhead, a modest branch
> > predictor will handle this with no effort.
> >
> > Jesper et al. what do you think?
> >
> > -Saeed.
> >



Thread overview: 44+ messages
2019-12-06  9:32 [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE condition Li RongQing
2019-12-07  3:52 ` Saeed Mahameed
2019-12-09  1:31   ` Yunsheng Lin
2019-12-09  3:47     ` Li,Rongqing [this message]
2019-12-09  9:30       ` Re: " Ilias Apalodimas
2019-12-09 10:37         ` Re: " Li,Rongqing
2019-12-09 12:14   ` Jesper Dangaard Brouer
2019-12-09 23:34     ` Saeed Mahameed
2019-12-10  1:31       ` Yunsheng Lin
2019-12-10  9:39         ` Re: " Li,Rongqing
2019-12-10 14:52           ` Ilias Apalodimas
2019-12-10 19:56           ` Saeed Mahameed
2019-12-10 19:45         ` Saeed Mahameed
2019-12-11  3:01           ` Yunsheng Lin
2019-12-11  3:06             ` Yunsheng Lin
2019-12-11 20:57             ` Saeed Mahameed
2019-12-12  1:04               ` Yunsheng Lin
2019-12-10 15:02       ` Ilias Apalodimas
2019-12-10 20:02         ` Saeed Mahameed
2019-12-10 20:10           ` Ilias Apalodimas
2019-12-11 18:49   ` Jesper Dangaard Brouer
2019-12-11 21:24     ` Saeed Mahameed
2019-12-12  1:34       ` Yunsheng Lin
2019-12-12 10:18         ` Jesper Dangaard Brouer
2019-12-13  3:40           ` Yunsheng Lin
2019-12-13  6:27             ` Re: " Li,Rongqing
2019-12-13  6:53               ` Yunsheng Lin
2019-12-13  8:48                 ` Jesper Dangaard Brouer
2019-12-16  1:51                   ` Yunsheng Lin
2019-12-16  4:02                     ` Re: " Li,Rongqing
2019-12-16 10:13                       ` Ilias Apalodimas
2019-12-16 10:16                         ` Ilias Apalodimas
2019-12-16 10:57                           ` Re: " Li,Rongqing
2019-12-17 19:38                         ` Saeed Mahameed
2019-12-17 19:35             ` Saeed Mahameed
2019-12-17 19:27           ` Saeed Mahameed
2019-12-16 12:15         ` Michal Hocko
2019-12-16 12:34           ` Ilias Apalodimas
2019-12-16 13:08             ` Michal Hocko
2019-12-16 13:21               ` Ilias Apalodimas
2019-12-17  2:11                 ` Yunsheng Lin
2019-12-17  9:11                   ` Michal Hocko
2019-12-19  2:09                     ` Yunsheng Lin
2019-12-19 11:53                       ` Michal Hocko
