From: Jean-Philippe Brucker <jean-philippe@linaro.org>
To: Yunsheng Lin <linyunsheng@huawei.com>
Cc: linuxarm@openeuler.org, ilias.apalodimas@linaro.org,
	salil.mehta@huawei.com, netdev@vger.kernel.org,
	moyufeng@huawei.com, alexanderduyck@fb.com, brouer@redhat.com,
	kuba@kernel.org
Subject: Re: [PATCH net-next v2 4/4] net: hns3: support skb's frag page recycling based on page pool
Date: Thu, 3 Feb 2022 09:48:07 +0000
Message-ID: <Yfuk11on6XiaB6Di@myrica>
In-Reply-To: <ff54ec37-cb69-cc2f-7ee7-7974f244d843@huawei.com>

On Sat, Jan 29, 2022 at 04:44:34PM +0800, Yunsheng Lin wrote:
> >> My initial thinking is to track if the reference counting or pp_frag_count of
> >> the page is manipulated correctly.
> > 
> > It looks like pp_frag_count is dropped too many times: after (1),
> > pp_frag_count only has 1 ref, so (2) drops it to 0 and (3) results in
> > underflow. I turned page_pool_atomic_sub_frag_count_return() into
> > "atomic_long_sub_return(nr, &page->pp_frag_count)" to make sure the
> > underflow shows up (the atomic_long_read() fast path normally hides it).
> > I wasn't entirely sure if this is expected behavior, though.
> 
> Are you sure the above steps 1~3 are happening for the same page?

Yes, they happen on the same page. What I did was save the backtrace of
each call to page_pool_atomic_sub_frag_count_return() and, when an
underflow error happens on a page, print out the history of that page
only.
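
A minimal sketch of that kind of instrumentation, for reference (this is not
the exact patch: the pp_hist_* names are made up, it only tracks a single
page, and there is no locking). The idea is to call pp_hist_note() from
page_pool_atomic_sub_frag_count_return() with the value returned by
atomic_long_sub_return(), record a stack trace on every update, and dump the
history once the count goes negative:

#include <linux/stacktrace.h>
#include <linux/printk.h>
#include <linux/mm_types.h>

#define PP_HIST_DEPTH	16	/* stack frames kept per event */
#define PP_HIST_EVENTS	8	/* last N updates kept for the tracked page */

struct pp_hist_event {
	long nr;			/* amount subtracted */
	long ret;			/* resulting pp_frag_count */
	unsigned int nr_entries;
	unsigned long entries[PP_HIST_DEPTH];
};

static struct {
	struct page *page;		/* only one page tracked, for brevity */
	unsigned int next;
	struct pp_hist_event ev[PP_HIST_EVENTS];
} pp_hist;

static void pp_hist_note(struct page *page, long nr, long ret)
{
	struct pp_hist_event *e;
	unsigned int i;

	/* Latch onto the first page seen and ignore all others */
	if (pp_hist.page && pp_hist.page != page)
		return;
	pp_hist.page = page;

	e = &pp_hist.ev[pp_hist.next++ % PP_HIST_EVENTS];
	e->nr = nr;
	e->ret = ret;
	e->nr_entries = stack_trace_save(e->entries, PP_HIST_DEPTH, 1);

	if (ret >= 0)
		return;

	/* Underflow: print the history of this page only */
	pr_err("pp_frag_count underflow on page %px\n", page);
	for (i = 0; i < PP_HIST_EVENTS; i++) {
		e = &pp_hist.ev[i];
		if (!e->nr_entries)
			continue;
		pr_err("  drop %ld, pp_frag_count now %ld\n", e->nr, e->ret);
		stack_trace_print(e->entries, e->nr_entries, 4);
	}
}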

My report was not quite right, though: I forgot to save the backtrace for
pp_frag_count==0. There are actually two refs on the page. It goes like
this:

  (1) T-1535, drop BIAS_MAX - 2, pp_frag_count now 2
     page_pool_alloc_frag+0x128/0x240
     hns3_alloc_and_map_buffer+0x30/0x170
     hns3_nic_alloc_rx_buffers+0x9c/0x170
     hns3_clean_rx_ring+0x864/0x960
     hns3_nic_common_poll+0xa0/0x218
     __napi_poll+0x38/0x188
     net_rx_action+0xe8/0x248
     __do_softirq+0x120/0x284

  (2) T-4, drop 1, pp_frag_count now 1
     page_pool_put_page+0x98/0x338
     page_pool_return_skb_page+0x48/0x60
     skb_release_data+0x170/0x190
     skb_release_all+0x28/0x38
     kfree_skb_reason+0x30/0x90
     packet_rcv+0x58/0x430
     __netif_receive_skb_list_core+0x1f4/0x218
     netif_receive_skb_list_internal+0x18c/0x2a8
  
  (3) T-1, drop 1, pp_frag_count now 0
     page_pool_put_page+0x98/0x338
     page_pool_return_skb_page+0x48/0x60
     skb_release_data+0x170/0x190
     skb_release_all+0x28/0x38
     __kfree_skb+0x18/0x30
     __sk_defer_free_flush+0x44/0x58
     tcp_recvmsg+0x94/0x1b8
     inet_recvmsg+0x50/0x128
  
  (4) T, drop 1, pp_frag_count now -1 (underflow)
     page_pool_put_page+0x2d0/0x338
     hns3_clean_rx_ring+0x74c/0x960
     hns3_nic_common_poll+0xa0/0x218
     __napi_poll+0x38/0x188
     net_rx_action+0xe8/0x248

> If it is the same page, there must be something wrong here.
> 
> Normally there are 1024 BDs in an rx ring:
> 
> BD_0 BD_1 BD_2 BD_3 BD_4 .... BD_1020 BD_1021  BD_1022  BD_1023
>            ^         ^
>          head       tail
> 
> Suppose head is manipulated by the driver, and tail is manipulated by
> hw.
> 
> The driver allocates a buffer for the BD pointed to by head; with the frag
> page recycling introduced in this patch, BD_0's and BD_1's buffers may
> point to the same page (4K page size, and each BD only needs a 2K buffer).
> hw DMAs the data into the buffer pointed to by tail when a packet is
> received.
> 
> So step 1 normally happens for the BD pointed to by head,
> and steps 2 & 3 normally happen for the BD pointed to by tail.
> And normally there are at least (1024 - RCB_NOF_ALLOC_RX_BUFF_ONCE) BDs
> between head and tail, so it is unlikely that head's and tail's BD buffers
> point to the same page.

I think a new page is allocated at step 1, no?  The driver calls
page_pool_alloc_frag() when refilling the rx ring, and since the current
pool->frag_page (P1) is still in use by BD_0 and BD_1,
page_pool_drain_frag() drops (BIAS_MAX - 2) references and
page_pool_alloc_frag() replaces frag_page with a new page, P2. Later, when
head points to BD_1, the remaining 2 references to P1 can be dropped in
steps 2 and 3, and P1 can be unmapped and freed/recycled.
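
For reference, the sequence above looks roughly like this from the driver
side, using the frag API added by this series (the 2K size and GFP_ATOMIC
are just assumptions, and all the hns3-specific plumbing is omitted):

#include <linux/gfp.h>
#include <net/page_pool.h>

/* Rough sketch of the flow described above, not actual hns3 code */
static void frag_flow_example(struct page_pool *pool)
{
	unsigned int offset;
	struct page *p1, *p2;

	/* BD_0 and BD_1 each get a 2K frag of the same page P1 */
	p1 = page_pool_alloc_frag(pool, &offset, 2048, GFP_ATOMIC); /* offset 0    */
	p1 = page_pool_alloc_frag(pool, &offset, 2048, GFP_ATOMIC); /* offset 2048 */

	/*
	 * Next refill: P1 is full, so page_pool_alloc_frag() drains the
	 * deferred bias, dropping (BIAS_MAX - 2) references from P1 (step 1
	 * above), and switches pool->frag_page to a fresh page P2.
	 */
	p2 = page_pool_alloc_frag(pool, &offset, 2048, GFP_ATOMIC);

	/*
	 * The two remaining references on P1 are dropped when the skbs
	 * built from BD_0 and BD_1 are released (steps 2 and 3), via
	 * page_pool_return_skb_page() -> page_pool_put_page(); after that
	 * P1 can be unmapped and recycled. A fourth drop on P1, as in
	 * step 4, is what underflows pp_frag_count.
	 */
}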

What I don't get is which of steps 2, 3 and 4 is the wrong one. It could be
2 or 3, because the device is evidently still doing DMA to the page after
it's released, but it could also be that the driver doesn't properly clear
the BD, in which case step 4 is the wrong one. I'll try to find out which
fragment gets dropped twice.

Thanks,
Jean

