From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>,
Ilias Apalodimas <ilias.apalodimas@linaro.org>,
netdev@vger.kernel.org, Eric Dumazet <eric.dumazet@gmail.com>,
linux-mm@kvack.org, Mel Gorman <mgorman@techsingularity.net>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>,
lorenzo@kernel.org, linyunsheng@huawei.com, bpf@vger.kernel.org,
"David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
willy@infradead.org
Subject: Re: [PATCH RFC net-next/mm V4 2/2] page_pool: Remove workqueue in new shutdown scheme
Date: Tue, 23 May 2023 18:16:15 +0200
Message-ID: <87h6s3nhv4.fsf@toke.dk>
In-Reply-To: <168485357834.2849279.8073426325295894331.stgit@firesoul>
> void page_pool_destroy(struct page_pool *pool)
> {
> + unsigned int flags;
> + u32 release_cnt;
> + u32 hold_cnt;
> +
> if (!pool)
> return;
>
> @@ -868,11 +894,45 @@ void page_pool_destroy(struct page_pool *pool)
> if (!page_pool_release(pool))
> return;
>
> - pool->defer_start = jiffies;
> - pool->defer_warn = jiffies + DEFER_WARN_INTERVAL;
> + /* PP have pages inflight, thus cannot immediately release memory.
> + * Enter into shutdown phase, depending on remaining in-flight PP
> + * pages to trigger shutdown process (on concurrent CPUs) and last
> + * page will free pool instance.
> + *
> + * There exist two race conditions here, we need to take into
> + * account in the following code.
> + *
> + * 1. Before setting PP_FLAG_SHUTDOWN another CPU released the last
> + * pages into the ptr_ring. Thus, it missed triggering shutdown
> + * process, which can then be stalled forever.
> + *
> + * 2. After setting PP_FLAG_SHUTDOWN another CPU released the last
> + * page, which triggered shutdown process and freed pool
> + * instance. Thus, its not safe to dereference *pool afterwards.
> + *
> + * Handling races by holding a fake in-flight count, via artificially
> + * bumping pages_state_hold_cnt, which assures pool isn't freed under
> + * us. Use RCU Grace-Periods to guarantee concurrent CPUs will
> + * transition safely into the shutdown phase.
> + *
> + * After safely transition into this state the races are resolved. For
> + * race(1) its safe to recheck and empty ptr_ring (it will not free
> + * pool). Race(2) cannot happen, and we can release fake in-flight count
> + * as last step.
> + */
> + hold_cnt = READ_ONCE(pool->pages_state_hold_cnt) + 1;
> + WRITE_ONCE(pool->pages_state_hold_cnt, hold_cnt);
> + synchronize_rcu();
> +
> + flags = READ_ONCE(pool->p.flags) | PP_FLAG_SHUTDOWN;
> + WRITE_ONCE(pool->p.flags, flags);
> + synchronize_rcu();
Hmm, synchronize_rcu() can be quite expensive; why do we need two of
them? It should be fine to do just one after those two writes, as long as
the writes are ordered correctly (which WRITE_ONCE should ensure)?
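
For illustration, the single-grace-period ordering asked about above could
look roughly like the userspace model below. It is only a sketch under
stated assumptions, not the patch's code: struct page_pool_model,
page_pool_enter_shutdown() and synchronize_rcu_stub() are hypothetical
stand-ins, the PP_FLAG_SHUTDOWN bit value is a placeholder, and the
READ_ONCE/WRITE_ONCE macros are simplified volatile accessors. Note that in
the kernel WRITE_ONCE() constrains the compiler rather than the CPU, so
whether an explicit barrier (smp_wmb() or a release store) is also needed
between the two writes is left open here.

```c
/* Userspace model only: kernel primitives are replaced by stand-ins. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PP_FLAG_SHUTDOWN (1u << 8)	/* placeholder bit value (assumption) */

/* Simplified stand-ins for the kernel macros: plain volatile accesses. */
#define READ_ONCE(x)     (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical model of the two fields touched in the quoted hunk. */
struct page_pool_model {
	unsigned int flags;		/* models pool->p.flags */
	uint32_t pages_state_hold_cnt;	/* models pool->pages_state_hold_cnt */
};

static unsigned int grace_periods;	/* number of stubbed grace periods */

/* Stub: a real synchronize_rcu() blocks until a grace period elapses. */
static void synchronize_rcu_stub(void)
{
	grace_periods++;
}

/* Take the fake in-flight hold, set the flag, then wait exactly once. */
static void page_pool_enter_shutdown(struct page_pool_model *pool)
{
	uint32_t hold_cnt;
	unsigned int flags;

	assert(pool != NULL);

	/* First write: fake in-flight count, so the pool is not freed. */
	hold_cnt = READ_ONCE(pool->pages_state_hold_cnt) + 1;
	WRITE_ONCE(pool->pages_state_hold_cnt, hold_cnt);

	/* Second write: mark the pool as shutting down. */
	flags = READ_ONCE(pool->flags) | PP_FLAG_SHUTDOWN;
	WRITE_ONCE(pool->flags, flags);

	/* One wait after both writes, instead of one wait per write. */
	synchronize_rcu_stub();
}
```

The model only shows the order of operations and that a single wait follows
both writes; it says nothing about whether that is sufficient for the two
races described in the patch comment.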
Also, if we're adding this (blocking) operation to the teardown path, we
risk adding latency to that path (network interface removal,
BPF_PROG_RUN syscall, etc.), so I'm not sure this actually ends up being an
improvement anymore, as opposed to just keeping the workqueue but
dropping the warning?

-Toke