From: Yunsheng Lin <linyunsheng@huawei.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: <davem@davemloft.net>, <kuba@kernel.org>, <pabeni@redhat.com>,
	<netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	Lorenzo Bianconi <lorenzo@kernel.org>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Ilias Apalodimas <ilias.apalodimas@linaro.org>,
	Eric Dumazet <edumazet@google.com>
Subject: Re: [PATCH net-next 2/2] page_pool: support non-frag page for page_pool_alloc_frag()
Date: Sat, 27 May 2023 17:51:36 +0800	[thread overview]
Message-ID: <f264188a-7562-8031-3fec-f84683002f9a@huawei.com> (raw)
In-Reply-To: <CAKgT0UfuNqY240nfhBoZfYoL5uZ5hSqPOafKY1=4kz6v0MsWxw@mail.gmail.com>

On 2023/5/26 23:16, Alexander Duyck wrote:
> On Fri, May 26, 2023 at 2:28 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>
>> There is a performance penalty to using the page frag support when
>> the user requests a larger frag size and a page can only hold one
>> frag user, see [1].
>>
>> It seems the user may request different frag sizes depending on
>> the MTU and packet size, so provide an option to allocate a
>> non-frag page when a whole page cannot hold two frags. That way the
>> user has a unified memory allocation interface with the least
>> memory waste and performance penalty.
>>
>> 1. https://lore.kernel.org/netdev/ZEU+vospFdm08IeE@localhost.localdomain/
>>
>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
>> CC: Lorenzo Bianconi <lorenzo@kernel.org>
>> CC: Alexander Duyck <alexander.duyck@gmail.com>
> 
> The way I see it there are several problems with this approach.
> 
> First, why not just increase the page order rather than trying to
> essentially make page_pool_alloc_frag into an analog for
> page_pool_alloc_pages? I know for the skb allocator we are working
> with an order 3 page. You could likely do something similar here to
> achieve the better performance you are looking for.

I suppose "an order 3 page" refers to __page_frag_cache_refill()
trying to allocate an order-3 page and, if that fails, falling back
to an order-0 page?
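
Something like this sketch, paraphrasing the refill logic (details
trimmed, not the exact mm code):

	page = alloc_pages_node(NUMA_NO_NODE,
				gfp_mask | __GFP_COMP | __GFP_NOWARN |
				__GFP_NORETRY | __GFP_NOMEMALLOC,
				PAGE_FRAG_CACHE_MAX_ORDER);
	if (!page)	/* fall back to order-0 under memory pressure */
		page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, 0);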

As the page pool allocates and recycles pages of a single order
through pool->alloc and pool->ring, we cannot do the above fallback
trick in the page pool.

We could add a separate pool->alloc/pool->ring pair for each page
order, for example pool->alloc_order_0/pool->ring_order_0 for order-0
pages and pool->alloc_order_3/pool->ring_order_3 for order-3 pages,
but that would be complicated and possibly waste more memory, as
sketched below.
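
Something like this (a hypothetical layout, just to show the
duplication; not proposed code):

	struct page_pool {
		/* ... existing fields ... */
		struct pp_alloc_cache alloc_order_0;	/* order-0 fast cache */
		struct ptr_ring ring_order_0;		/* order-0 recycle ring */
		struct pp_alloc_cache alloc_order_3;	/* order-3 fast cache */
		struct ptr_ring ring_order_3;		/* order-3 recycle ring */
	};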

We could also create the page pool with pool->p.order set to 3, and
then the optimization in this patch still applies.
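
For reference, a sketch of creating such a pool (the field values are
only examples and 'dev' stands in for the driver's device; error
handling omitted):

	struct page_pool_params pp_params = {
		.flags	   = PP_FLAG_PAGE_FRAG,
		.order	   = 3,		/* 32KB chunks with 4K PAGE_SIZE */
		.pool_size = 256,
		.nid	   = NUMA_NO_NODE,
		.dev	   = dev,
		.dma_dir   = DMA_FROM_DEVICE,
	};
	struct page_pool *pool = page_pool_create(&pp_params);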

> 
> Second, I am not a fan of these changes as they seem to be wasteful
> for drivers that might make use of a mix of large and small

I suppose 'wasteful' refers to memory usage, right?
I am not sure yet how drivers would make use of a mix of large and
small allocations, other than mlx5 handling the frag count itself via
page_pool_defrag_page(), as sketched below. If PAGE_SIZE <<
pool->p.order is not an integral multiple of the frag size, some
waste seems unavoidable either way: with a 4K page and 1500-byte
frags, for example, only two frags fit and 1096 bytes are left over.
Does handling the frag count in the driver save more memory than
handling it in the page pool? If not, handling the frag count in the
page pool seems more reasonable.
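
The driver-side accounting I mean is roughly the following (the
rx_buf struct and the helper name are made up for illustration):

	struct rx_buf {
		struct page *page;
		unsigned int offset;
	};

	static int rx_buf_fill(struct page_pool *pool, struct rx_buf *buf,
			       long nr_frags)
	{
		buf->page = page_pool_dev_alloc_pages(pool);
		if (!buf->page)
			return -ENOMEM;

		/* take nr_frags references up front; the driver releases
		 * one per consumed frag via page_pool_defrag_page()
		 */
		page_pool_fragment_page(buf->page, nr_frags);
		buf->offset = 0;
		return 0;
	}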

> allocations. If we aren't going to use fragments then we should
> probably just wrap the call to this function in an inline wrapper that
> checks the size and just automatically pulls the larger sizes off into
> the non-frag allocation path. Look at something such as
> __netdev_alloc_skb as an example.

Do you mean adding a new inline wrapper like the one below?

if (len > xxx) {
	*offset = 0;
	return page_pool_alloc_pages(pool, gfp);
}

return page_pool_alloc_frag(pool, offset, len, gfp);

The above seems essentially the same as what this patch does, except
that this patch avoids introducing another interface and adds a small
optimization on top: for the 'len > xxx' case above, it still allows
a frag allocation if the remaining size of the current page is bigger
than 'len', which I think helps for 64K page size or order-3 pages.

Also, do you have something in mind for 'xxx' here? This patch
chooses ((PAGE_SIZE << pool->p.order) / 2) as 'xxx'.
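
For example, with 4K PAGE_SIZE and pool->p.order == 0, 'xxx' is 2048:
a 1500-byte request can share the page with a second frag, while a
3000-byte request falls through to the whole-page path.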

> .
> 

Thread overview: 9+ messages
2023-05-26  9:26 [PATCH net-next 0/2] support non-frag page for page_pool_alloc_frag() Yunsheng Lin
2023-05-26  9:26 ` [PATCH net-next 1/2] page_pool: unify frag page and non-frag page handling Yunsheng Lin
2023-05-26 12:03   ` Ilias Apalodimas
2023-05-26 12:35     ` Yunsheng Lin
2023-05-26 15:38       ` Ilias Apalodimas
2023-05-27  8:18         ` Yunsheng Lin
2023-05-26  9:26 ` [PATCH net-next 2/2] page_pool: support non-frag page for page_pool_alloc_frag() Yunsheng Lin
2023-05-26 15:16   ` Alexander Duyck
2023-05-27  9:51     ` Yunsheng Lin [this message]
