netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Duyck <alexander.duyck@gmail.com>
To: Yunsheng Lin <linyunsheng@huawei.com>
Cc: David Miller <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Russell King - ARM Linux <linux@armlinux.org.uk>,
	Marcin Wojtas <mw@semihalf.com>,
	linuxarm@openeuler.org, yisen.zhuang@huawei.com,
	Salil Mehta <salil.mehta@huawei.com>,
	thomas.petazzoni@bootlin.com, hawk@kernel.org,
	Ilias Apalodimas <ilias.apalodimas@linaro.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	John Fastabend <john.fastabend@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Will Deacon <will@kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	fenghua.yu@intel.com, guro@fb.com, Peter Xu <peterx@redhat.com>,
	Feng Tang <feng.tang@intel.com>, Jason Gunthorpe <jgg@ziepe.ca>,
	Matteo Croce <mcroce@microsoft.com>,
	Hugh Dickins <hughd@google.com>,
	Jonathan Lemon <jonathan.lemon@gmail.com>,
	Alexander Lobakin <alobakin@pm.me>,
	Willem de Bruijn <willemb@google.com>,
	wenxu@ucloud.cn, Cong Wang <cong.wang@bytedance.com>,
	Kevin Hao <haokexin@gmail.com>,
	nogikh@google.com, Marco Elver <elver@google.com>,
	Yonghong Song <yhs@fb.com>,
	kpsingh@kernel.org, andrii@kernel.org,
	Martin KaFai Lau <kafai@fb.com>,
	songliubraving@fb.com, Netdev <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>, bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH rfc v6 2/4] page_pool: add interface to manipulate frag count in page pool
Date: Thu, 22 Jul 2021 08:18:50 -0700	[thread overview]
Message-ID: <CAKgT0UcBgo0Ex=x514qGeLvppJr-0vqx9ZngAFDTwugjtKUrOA@mail.gmail.com> (raw)
In-Reply-To: <fffae41f-b0a3-3c43-491f-096d31ba94ca@huawei.com>

> >
> >> You are right that that may cover up the reference count errors. How about
> >> something like below:
> >>
> >> static inline long page_pool_atomic_sub_frag_count_return(struct page *page,
> >>                                                           long nr)
> >> {
> >> #ifdef CONFIG_DEBUG_PAGE_REF
> >>         long ret = atomic_long_sub_return(nr, &page->pp_frag_count);
> >>
> >>         WARN_ON(ret < 0);
> >>
> >>         return ret;
> >> #else
> >>         if (atomic_long_read(&page->pp_frag_count) == nr)
> >>                 return 0;
> >>
> >>         return atomic_long_sub_return(nr, &page->pp_frag_count);
> >> #end
> >> }
> >>
> >> Or any better suggestion?
> >
> > So the one thing I might change would be to make it so that you only
> > do the atomic_long_read if nr is a constant via __builtin_constant_p.
> > That way you would be performing the comparison in
> > __page_pool_put_page and in the cases of freeing or draining the
> > page_frags you would be using the atomic_long_sub_return which should
> > be paths where you would not expect it to match or that are slowpath
> > anyway.
> >
> > Also I would keep the WARN_ON in both paths just to be on the safe side.
>
> If I understand it correctly, we should change it as below, right?
>
> static inline long page_pool_atomic_sub_frag_count_return(struct page *page,
>                                                           long nr)
> {
>         long ret;
>
>         /* As suggested by Alexander, atomic_long_read() may cover up the
>          * reference count errors, so avoid calling atomic_long_read() in
>          * the cases of freeing or draining the page_frags, where we would
>          * not expect it to match or that are slowpath anyway.
>          */
>         if (__builtin_constant_p(nr) &&
>             atomic_long_read(&page->pp_frag_count) == nr)
>                 return 0;
>
>         ret = atomic_long_sub_return(nr, &page->pp_frag_count);
>         WARN_ON(ret < 0);
>         return ret;
> }

Yes, that is what I had in mind.

One thought I had for a future optimization is that we could look at
reducing the count by 1 so that we could essentially combine the
non-frag and frag cases.Then instead of testing for 1 we would test
for 0 at thee start of the function and test for < 0 to decide if we
want to free it or not instead of testing for 0. With that we can
essentially reduce the calls to the WARN_ON since we should only have
one case where we actually return a value < 0, and we can then check
to see if we overshot -1 which would be the WARN_ON case.

With that a value of 0 instead of 1 would indicate page frag is not in
use for the page *AND/OR* that the page has reached the state where
there are no other frags present so the page can be recycled. In
effect it would allow us to mix page frags and no frags within the
same pool. The added bonus would be we could get rid of the check for
PP_FLAG_PAGE_FRAG flag check in the __page_pool_put_page function and
replace it with a check for PAGE_POOL_DMA_USE_PP_FRAG_COUNT since we
cannot read frag_count in that case.

  reply	other threads:[~2021-07-22 15:21 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-20  3:35 [PATCH rfc v6 0/4] add frag page support in page pool Yunsheng Lin
2021-07-20  3:35 ` [PATCH rfc v6 1/4] page_pool: keep pp info as long as page pool owns the page Yunsheng Lin
2021-07-20  3:35 ` [PATCH rfc v6 2/4] page_pool: add interface to manipulate frag count in page pool Yunsheng Lin
2021-07-20 15:43   ` Alexander Duyck
2021-07-21  8:15     ` Yunsheng Lin
2021-07-21 14:06       ` Alexander Duyck
2021-07-22  8:07         ` Yunsheng Lin
2021-07-22 15:18           ` Alexander Duyck [this message]
2021-07-23 11:12             ` Yunsheng Lin
2021-07-23 16:08               ` Alexander Duyck
2021-07-24 13:07                 ` Yunsheng Lin
2021-07-25 16:49                   ` Alexander Duyck
2021-07-27  7:54                     ` Yunsheng Lin
2021-07-27 18:38                       ` Alexander Duyck
2021-08-02  9:17                         ` Yunsheng Lin
2021-07-20  3:35 ` [PATCH rfc v6 3/4] page_pool: add frag page recycling support " Yunsheng Lin
2021-07-20  3:35 ` [PATCH rfc v6 4/4] net: hns3: support skb's frag page recycling based on " Yunsheng Lin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAKgT0UcBgo0Ex=x514qGeLvppJr-0vqx9ZngAFDTwugjtKUrOA@mail.gmail.com' \
    --to=alexander.duyck@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=alobakin@pm.me \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=cong.wang@bytedance.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=elver@google.com \
    --cc=feng.tang@intel.com \
    --cc=fenghua.yu@intel.com \
    --cc=guro@fb.com \
    --cc=haokexin@gmail.com \
    --cc=hawk@kernel.org \
    --cc=hughd@google.com \
    --cc=ilias.apalodimas@linaro.org \
    --cc=jgg@ziepe.ca \
    --cc=john.fastabend@gmail.com \
    --cc=jonathan.lemon@gmail.com \
    --cc=kafai@fb.com \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=linuxarm@openeuler.org \
    --cc=linyunsheng@huawei.com \
    --cc=mcroce@microsoft.com \
    --cc=mw@semihalf.com \
    --cc=netdev@vger.kernel.org \
    --cc=nogikh@google.com \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=salil.mehta@huawei.com \
    --cc=songliubraving@fb.com \
    --cc=thomas.petazzoni@bootlin.com \
    --cc=vbabka@suse.cz \
    --cc=wenxu@ucloud.cn \
    --cc=will@kernel.org \
    --cc=willemb@google.com \
    --cc=willy@infradead.org \
    --cc=yhs@fb.com \
    --cc=yisen.zhuang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).