bpf.vger.kernel.org archive mirror
* Re: [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr
       [not found] <1622458734.837168-1-xuanzhuo@linux.alibaba.com>
@ 2021-06-01  3:03 ` Jason Wang
  0 siblings, 0 replies; 5+ messages in thread
From: Jason Wang @ 2021-06-01  3:03 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: Michael S. Tsirkin, David S. Miller, Jakub Kicinski,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, virtualization, bpf, netdev


On 2021/5/31 6:58 PM, Xuan Zhuo wrote:
> On Mon, 31 May 2021 14:10:55 +0800, Jason Wang <jasowang@redhat.com> wrote:
>> On 2021/5/14 11:16 PM, Xuan Zhuo wrote:
>>> In the case of merge, the page passed into page_to_skb() may be a head
>>> page, not the page where the current data is located.
>>
>> I don't get how this can happen?
>>
>> Maybe you can explain a little bit more?
>>
>> receive_mergeable() call page_to_skb() in two places:
>>
>> 1) XDP_PASS for linearized page , in this case we use xdp_page
>> 2) page_to_skb() for "normal" page, in this case the page contains the data
> The offset may be greater than PAGE_SIZE, because page is obtained by
> virt_to_head_page(), not the page where buf is located. And "offset" is the offset
> of buf relative to page.
>
> 	tailroom = truesize - len - offset;
>
> In this case, the tailroom must be less than 0. Although there may be enough
> content on this page to save skb_shared_info.


Interesting, I think we don't use compound pages for virtio-net. (We 
don't define SKB_FRAG_PAGE_ORDER).

Am I wrong?

Thanks


>
> Thanks.
>
>> Thanks
>>
>>
>>> So when trying to
>>> get the buf where the data is located, you should directly use the
>>> pointer(p) to get the address corresponding to the page.
>>>
>>> At the same time, the offset of the data in the page should also be
>>> obtained using offset_in_page().
>>>
>>> This patch solves this problem. But if you don’t use this patch, the
>>> original code can also run, because if the page is not the page of the
>>> current data, the calculated tailroom will be less than 0, and will not
>>> enter the logic of build_skb() . The significance of this patch is to
>>> modify this logical problem, allowing more situations to use
>>> build_skb().
>>>
>>> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>>> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>>> ---
>>>    drivers/net/virtio_net.c | 8 ++++++--
>>>    1 file changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>> index 3e46c12dde08..073fec4c0df1 100644
>>> --- a/drivers/net/virtio_net.c
>>> +++ b/drivers/net/virtio_net.c
>>> @@ -407,8 +407,12 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
>>>    		 * see add_recvbuf_mergeable() + get_mergeable_buf_len()
>>>    		 */
>>>    		truesize = PAGE_SIZE;
>>> -		tailroom = truesize - len - offset;
>>> -		buf = page_address(page);
>>> +
>>> +		/* page maybe head page, so we should get the buf by p, not the
>>> +		 * page
>>> +		 */
>>> +		tailroom = truesize - len - offset_in_page(p);
>>> +		buf = (char *)((unsigned long)p & PAGE_MASK);
>>>    	} else {
>>>    		tailroom = truesize - len;
>>>    		buf = p;



* Re: [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr
       [not found] <1622527177.2087624-1-xuanzhuo@linux.alibaba.com>
@ 2021-06-01  6:17 ` Jason Wang
  0 siblings, 0 replies; 5+ messages in thread
From: Jason Wang @ 2021-06-01  6:17 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: Michael S. Tsirkin, David S. Miller, Jakub Kicinski,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, virtualization, bpf, netdev


On 2021/6/1 1:59 PM, Xuan Zhuo wrote:
> On Tue, 1 Jun 2021 11:27:12 +0800, Jason Wang <jasowang@redhat.com> wrote:
>> On 2021/6/1 11:08 AM, Xuan Zhuo wrote:
>>> On Tue, 1 Jun 2021 11:03:37 +0800, Jason Wang <jasowang@redhat.com> wrote:
>>>> On 2021/5/31 6:58 PM, Xuan Zhuo wrote:
>>>>> On Mon, 31 May 2021 14:10:55 +0800, Jason Wang <jasowang@redhat.com> wrote:
>>>>>> On 2021/5/14 11:16 PM, Xuan Zhuo wrote:
>>>>>>> In the case of merge, the page passed into page_to_skb() may be a head
>>>>>>> page, not the page where the current data is located.
>>>>>> I don't get how this can happen?
>>>>>>
>>>>>> Maybe you can explain a little bit more?
>>>>>>
>>>>>> receive_mergeable() call page_to_skb() in two places:
>>>>>>
>>>>>> 1) XDP_PASS for linearized page , in this case we use xdp_page
>>>>>> 2) page_to_skb() for "normal" page, in this case the page contains the data
>>>>> The offset may be greater than PAGE_SIZE, because page is obtained by
>>>>> virt_to_head_page(), not the page where buf is located. And "offset" is the offset
>>>>> of buf relative to page.
>>>>>
>>>>> 	tailroom = truesize - len - offset;
>>>>>
>>>>> In this case, the tailroom must be less than 0. Although there may be enough
>>>>> content on this page to save skb_shared_info.
>>>> Interesting, I think we don't use compound pages for virtio-net. (We
>>>> don't define SKB_FRAG_PAGE_ORDER).
>>>>
>>>> Am I wrong?
>>> It seems to me that this is a fixed setting, not something we can
>>> configure independently.
>>
>> Looks like you are right.
>>
>> See comments below.
>>
>>
>>> Thanks.
>>>
>>> ==========================================
>>>
>>> net/sock.c
>>>
>>> #define SKB_FRAG_PAGE_ORDER	get_order(32768)
>>> DEFINE_STATIC_KEY_FALSE(net_high_order_alloc_disable_key);
>>>
>>> /**
>>>    * skb_page_frag_refill - check that a page_frag contains enough room
>>>    * @sz: minimum size of the fragment we want to get
>>>    * @pfrag: pointer to page_frag
>>>    * @gfp: priority for memory allocation
>>>    *
>>>    * Note: While this allocator tries to use high order pages, there is
>>>    * no guarantee that allocations succeed. Therefore, @sz MUST be
>>>    * less or equal than PAGE_SIZE.
>>>    */
>>> bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t gfp)
>>> {
>>> 	if (pfrag->page) {
>>> 		if (page_ref_count(pfrag->page) == 1) {
>>> 			pfrag->offset = 0;
>>> 			return true;
>>> 		}
>>> 		if (pfrag->offset + sz <= pfrag->size)
>>> 			return true;
>>> 		put_page(pfrag->page);
>>> 	}
>>>
>>> 	pfrag->offset = 0;
>>> 	if (SKB_FRAG_PAGE_ORDER &&
>>> 	    !static_branch_unlikely(&net_high_order_alloc_disable_key)) {
>>> 		/* Avoid direct reclaim but allow kswapd to wake */
>>> 		pfrag->page = alloc_pages((gfp & ~__GFP_DIRECT_RECLAIM) |
>>> 					  __GFP_COMP | __GFP_NOWARN |
>>> 					  __GFP_NORETRY,
>>> 					  SKB_FRAG_PAGE_ORDER);
>>> 		if (likely(pfrag->page)) {
>>> 			pfrag->size = PAGE_SIZE << SKB_FRAG_PAGE_ORDER;
>>> 			return true;
>>> 		}
>>> 	}
>>> 	pfrag->page = alloc_page(gfp);
>>> 	if (likely(pfrag->page)) {
>>> 		pfrag->size = PAGE_SIZE;
>>> 		return true;
>>> 	}
>>> 	return false;
>>> }
>>> EXPORT_SYMBOL(skb_page_frag_refill);
>>>
>>>
>>>> Thanks
>>>>
>>>>
>>>>> Thanks.
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>> So when trying to
>>>>>>> get the buf where the data is located, you should directly use the
>>>>>>> pointer(p) to get the address corresponding to the page.
>>>>>>>
>>>>>>> At the same time, the offset of the data in the page should also be
>>>>>>> obtained using offset_in_page().
>>>>>>>
>>>>>>> This patch solves this problem. But if you don’t use this patch, the
>>>>>>> original code can also run, because if the page is not the page of the
>>>>>>> current data, the calculated tailroom will be less than 0, and will not
>>>>>>> enter the logic of build_skb() . The significance of this patch is to
>>>>>>> modify this logical problem, allowing more situations to use
>>>>>>> build_skb().
>>>>>>>
>>>>>>> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>>>>>>> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>>>>>>> ---
>>>>>>>      drivers/net/virtio_net.c | 8 ++++++--
>>>>>>>      1 file changed, 6 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>>>>> index 3e46c12dde08..073fec4c0df1 100644
>>>>>>> --- a/drivers/net/virtio_net.c
>>>>>>> +++ b/drivers/net/virtio_net.c
>>>>>>> @@ -407,8 +407,12 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
>>>>>>>      		 * see add_recvbuf_mergeable() + get_mergeable_buf_len()
>>>>>>>      		 */
>>>>>>>      		truesize = PAGE_SIZE;
>>>>>>> -		tailroom = truesize - len - offset;
>>>>>>> -		buf = page_address(page);
>>>>>>> +
>>>>>>> +		/* page maybe head page, so we should get the buf by p, not the
>>>>>>> +		 * page
>>>>>>> +		 */
>>>>>>> +		tailroom = truesize - len - offset_in_page(p);
>>
>> I wonder why offset_in_page(p) is correct? I guess it should be:
>>
>> tailroom = truesize - len - headroom;
>>
>> The reason is that the buffer is not necessarily allocated at the page
>> boundary.
> In my understanding, offset_in_page(p) is the offset of p in the page where it
> is located. In this case, the two should be equal.


I think not. If the frag is not page aligned, offset_in_page(p) does not
equal headroom.

Consider the case where the frag starts at page offset 1500.
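
To make the offset-1500 case concrete, here is a minimal userspace sketch (not kernel code). It assumes 4 KiB pages, and every name in it (SKETCH_PAGE_SIZE, sketch_offset_in_page(), sketch_data_ptr()) is hypothetical, as is the headroom value of 256 used in the note below. It only models the address arithmetic: a frag that starts frag_off bytes into a page and reserves headroom bytes before the data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumption: 4 KiB pages */

/* Userspace stand-in for offset_in_page(): the low bits of the address. */
static size_t sketch_offset_in_page(uintptr_t addr)
{
	return (size_t)(addr & (SKETCH_PAGE_SIZE - 1));
}

/*
 * Address of the data pointer for a frag that starts frag_off bytes into
 * the page at page_base and reserves `headroom` bytes before the data.
 */
static uintptr_t sketch_data_ptr(uintptr_t page_base, size_t frag_off,
				 size_t headroom)
{
	/* page_base must be page aligned for the arithmetic to make sense */
	assert((page_base & (SKETCH_PAGE_SIZE - 1)) == 0);
	return page_base + frag_off + headroom;
}
```

With frag_off = 1500 and headroom = 256, sketch_offset_in_page() yields 1756 rather than 256. The two values only coincide when the frag happens to start on a page boundary (frag_off = 0).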


>   This has nothing to do with
> which page is allocated.
>
> Of course I think using headroom is a good idea, and it is semantically better.
>
> Thanks.


Thanks


>
>> Thanks
>>
>>
>>>>>>> +		buf = (char *)((unsigned long)p & PAGE_MASK);
>>>>>>>      	} else {
>>>>>>>      		tailroom = truesize - len;
>>>>>>>      		buf = p;



* Re: [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr
       [not found] <1622516885.7439268-1-xuanzhuo@linux.alibaba.com>
@ 2021-06-01  3:27 ` Jason Wang
  0 siblings, 0 replies; 5+ messages in thread
From: Jason Wang @ 2021-06-01  3:27 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: Michael S. Tsirkin, David S. Miller, Jakub Kicinski,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, virtualization, bpf, netdev


On 2021/6/1 11:08 AM, Xuan Zhuo wrote:
> On Tue, 1 Jun 2021 11:03:37 +0800, Jason Wang <jasowang@redhat.com> wrote:
>> On 2021/5/31 6:58 PM, Xuan Zhuo wrote:
>>> On Mon, 31 May 2021 14:10:55 +0800, Jason Wang <jasowang@redhat.com> wrote:
>>>> On 2021/5/14 11:16 PM, Xuan Zhuo wrote:
>>>>> In the case of merge, the page passed into page_to_skb() may be a head
>>>>> page, not the page where the current data is located.
>>>> I don't get how this can happen?
>>>>
>>>> Maybe you can explain a little bit more?
>>>>
>>>> receive_mergeable() call page_to_skb() in two places:
>>>>
>>>> 1) XDP_PASS for linearized page , in this case we use xdp_page
>>>> 2) page_to_skb() for "normal" page, in this case the page contains the data
>>> The offset may be greater than PAGE_SIZE, because page is obtained by
>>> virt_to_head_page(), not the page where buf is located. And "offset" is the offset
>>> of buf relative to page.
>>>
>>> 	tailroom = truesize - len - offset;
>>>
>>> In this case, the tailroom must be less than 0. Although there may be enough
>>> content on this page to save skb_shared_info.
>>
>> Interesting, I think we don't use compound pages for virtio-net. (We
>> don't define SKB_FRAG_PAGE_ORDER).
>>
>> Am I wrong?
>
> It seems to me that this is a fixed setting, not something we can
> configure independently.


Looks like you are right.

See comments below.


>
> Thanks.
>
> ==========================================
>
> net/sock.c
>
> #define SKB_FRAG_PAGE_ORDER	get_order(32768)
> DEFINE_STATIC_KEY_FALSE(net_high_order_alloc_disable_key);
>
> /**
>   * skb_page_frag_refill - check that a page_frag contains enough room
>   * @sz: minimum size of the fragment we want to get
>   * @pfrag: pointer to page_frag
>   * @gfp: priority for memory allocation
>   *
>   * Note: While this allocator tries to use high order pages, there is
>   * no guarantee that allocations succeed. Therefore, @sz MUST be
>   * less or equal than PAGE_SIZE.
>   */
> bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t gfp)
> {
> 	if (pfrag->page) {
> 		if (page_ref_count(pfrag->page) == 1) {
> 			pfrag->offset = 0;
> 			return true;
> 		}
> 		if (pfrag->offset + sz <= pfrag->size)
> 			return true;
> 		put_page(pfrag->page);
> 	}
>
> 	pfrag->offset = 0;
> 	if (SKB_FRAG_PAGE_ORDER &&
> 	    !static_branch_unlikely(&net_high_order_alloc_disable_key)) {
> 		/* Avoid direct reclaim but allow kswapd to wake */
> 		pfrag->page = alloc_pages((gfp & ~__GFP_DIRECT_RECLAIM) |
> 					  __GFP_COMP | __GFP_NOWARN |
> 					  __GFP_NORETRY,
> 					  SKB_FRAG_PAGE_ORDER);
> 		if (likely(pfrag->page)) {
> 			pfrag->size = PAGE_SIZE << SKB_FRAG_PAGE_ORDER;
> 			return true;
> 		}
> 	}
> 	pfrag->page = alloc_page(gfp);
> 	if (likely(pfrag->page)) {
> 		pfrag->size = PAGE_SIZE;
> 		return true;
> 	}
> 	return false;
> }
> EXPORT_SYMBOL(skb_page_frag_refill);
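
As a rough illustration of what the quoted allocator implies, the following userspace sketch computes the order and backing size for SKB_FRAG_PAGE_ORDER. It assumes 4 KiB pages; sketch_get_order() is a hypothetical approximation of the kernel's get_order(), and sketch_frag_size() mirrors the `PAGE_SIZE << SKB_FRAG_PAGE_ORDER` expression above.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/*
 * Approximation of get_order(): the smallest order such that
 * (PAGE_SIZE << order) is at least `size` bytes.
 */
static unsigned int sketch_get_order(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Size in bytes of an allocation of the given order. */
static size_t sketch_frag_size(unsigned int order)
{
	return SKETCH_PAGE_SIZE << order;
}
```

Under that assumption, sketch_get_order(32768) is 3, so a successful high-order allocation spans eight pages. Because the allocation is made with __GFP_COMP, a buffer carved from any of the seven tail pages presumably resolves to the head page through virt_to_head_page(), which is how an offset larger than PAGE_SIZE can arise.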
>
>
>> Thanks
>>
>>
>>> Thanks.
>>>
>>>> Thanks
>>>>
>>>>
>>>>> So when trying to
>>>>> get the buf where the data is located, you should directly use the
>>>>> pointer(p) to get the address corresponding to the page.
>>>>>
>>>>> At the same time, the offset of the data in the page should also be
>>>>> obtained using offset_in_page().
>>>>>
>>>>> This patch solves this problem. But if you don’t use this patch, the
>>>>> original code can also run, because if the page is not the page of the
>>>>> current data, the calculated tailroom will be less than 0, and will not
>>>>> enter the logic of build_skb() . The significance of this patch is to
>>>>> modify this logical problem, allowing more situations to use
>>>>> build_skb().
>>>>>
>>>>> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>>>>> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>>>>> ---
>>>>>     drivers/net/virtio_net.c | 8 ++++++--
>>>>>     1 file changed, 6 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>>> index 3e46c12dde08..073fec4c0df1 100644
>>>>> --- a/drivers/net/virtio_net.c
>>>>> +++ b/drivers/net/virtio_net.c
>>>>> @@ -407,8 +407,12 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
>>>>>     		 * see add_recvbuf_mergeable() + get_mergeable_buf_len()
>>>>>     		 */
>>>>>     		truesize = PAGE_SIZE;
>>>>> -		tailroom = truesize - len - offset;
>>>>> -		buf = page_address(page);
>>>>> +
>>>>> +		/* page maybe head page, so we should get the buf by p, not the
>>>>> +		 * page
>>>>> +		 */
>>>>> +		tailroom = truesize - len - offset_in_page(p);


I wonder why offset_in_page(p) is correct? I guess it should be:

tailroom = truesize - len - headroom;

The reason is that the buffer is not necessarily allocated at the page 
boundary.

Thanks


>>>>> +		buf = (char *)((unsigned long)p & PAGE_MASK);
>>>>>     	} else {
>>>>>     		tailroom = truesize - len;
>>>>>     		buf = p;



* Re: [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr
  2021-05-14 15:16 ` [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr Xuan Zhuo
@ 2021-05-31  6:10   ` Jason Wang
  0 siblings, 0 replies; 5+ messages in thread
From: Jason Wang @ 2021-05-31  6:10 UTC (permalink / raw)
  To: Xuan Zhuo, netdev
  Cc: Michael S. Tsirkin, David S. Miller, Jakub Kicinski,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, virtualization, bpf


On 2021/5/14 11:16 PM, Xuan Zhuo wrote:
> In the case of merge, the page passed into page_to_skb() may be a head
> page, not the page where the current data is located.


I don't get how this can happen?

Maybe you can explain a little bit more?

receive_mergeable() calls page_to_skb() in two places:

1) XDP_PASS for a linearized page; in this case we use xdp_page
2) page_to_skb() for a "normal" page; in this case the page contains the data

Thanks


> So when trying to
> get the buf where the data is located, you should directly use the
> pointer(p) to get the address corresponding to the page.
>
> At the same time, the offset of the data in the page should also be
> obtained using offset_in_page().
>
> This patch solves this problem. But if you don’t use this patch, the
> original code can also run, because if the page is not the page of the
> current data, the calculated tailroom will be less than 0, and will not
> enter the logic of build_skb() . The significance of this patch is to
> modify this logical problem, allowing more situations to use
> build_skb().
>
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>   drivers/net/virtio_net.c | 8 ++++++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 3e46c12dde08..073fec4c0df1 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -407,8 +407,12 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
>   		 * see add_recvbuf_mergeable() + get_mergeable_buf_len()
>   		 */
>   		truesize = PAGE_SIZE;
> -		tailroom = truesize - len - offset;
> -		buf = page_address(page);
> +
> +		/* page maybe head page, so we should get the buf by p, not the
> +		 * page
> +		 */
> +		tailroom = truesize - len - offset_in_page(p);
> +		buf = (char *)((unsigned long)p & PAGE_MASK);
>   	} else {
>   		tailroom = truesize - len;
>   		buf = p;



* [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr
  2021-05-14 15:16 [PATCH net 0/2] virtio-net: fix for build_skb() Xuan Zhuo
@ 2021-05-14 15:16 ` Xuan Zhuo
  2021-05-31  6:10   ` Jason Wang
  0 siblings, 1 reply; 5+ messages in thread
From: Xuan Zhuo @ 2021-05-14 15:16 UTC (permalink / raw)
  To: netdev
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Jakub Kicinski,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Xuan Zhuo, virtualization, bpf

In the case of merge, the page passed into page_to_skb() may be a head
page, not the page where the current data is located. So when trying to
get the buf where the data is located, you should directly use the
pointer(p) to get the address corresponding to the page.

At the same time, the offset of the data in the page should also be
obtained using offset_in_page().

This patch solves this problem. Without this patch the original code
still runs, because if the page is not the page holding the current
data, the calculated tailroom is less than 0 and the build_skb() path
is not taken. The point of this patch is to fix this logic error so
that more cases can use build_skb().
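
The difference between the two tailroom computations can be sketched in userspace as follows. This is an illustration only, assuming 4 KiB pages; the function names are hypothetical, and offset_from_head stands for the distance of the data pointer from the start of the head page. Since the head page address is page aligned, offset_in_page(p) reduces to that distance modulo the page size.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096L /* assumption: 4 KiB pages */

/* Original computation: the offset is measured from the head page. */
static long sketch_tailroom_old(long len, long offset_from_head)
{
	return SKETCH_PAGE_SIZE - len - offset_from_head;
}

/*
 * Computation as posted in this patch: the offset is measured inside the
 * page that actually holds the data, as offset_in_page(p) would do.
 */
static long sketch_tailroom_patched(long len, long offset_from_head)
{
	assert(offset_from_head >= 0);
	return SKETCH_PAGE_SIZE - len - (offset_from_head % SKETCH_PAGE_SIZE);
}
```

For a 1000-byte packet located 256 bytes into the third page of a compound page (offset_from_head = 8448), the original formula gives -5352 and the posted one gives 2840. When the data sits in the head page itself, both agree.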

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/net/virtio_net.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 3e46c12dde08..073fec4c0df1 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -407,8 +407,12 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
 		 * see add_recvbuf_mergeable() + get_mergeable_buf_len()
 		 */
 		truesize = PAGE_SIZE;
-		tailroom = truesize - len - offset;
-		buf = page_address(page);
+
+		/* page maybe head page, so we should get the buf by p, not the
+		 * page
+		 */
+		tailroom = truesize - len - offset_in_page(p);
+		buf = (char *)((unsigned long)p & PAGE_MASK);
 	} else {
 		tailroom = truesize - len;
 		buf = p;
-- 
2.31.0



end of thread, other threads:[~2021-06-01  6:17 UTC | newest]

Thread overview: 5+ messages
     [not found] <1622458734.837168-1-xuanzhuo@linux.alibaba.com>
2021-06-01  3:03 ` [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr Jason Wang
     [not found] <1622527177.2087624-1-xuanzhuo@linux.alibaba.com>
2021-06-01  6:17 ` Jason Wang
     [not found] <1622516885.7439268-1-xuanzhuo@linux.alibaba.com>
2021-06-01  3:27 ` Jason Wang
2021-05-14 15:16 [PATCH net 0/2] virtio-net: fix for build_skb() Xuan Zhuo
2021-05-14 15:16 ` [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr Xuan Zhuo
2021-05-31  6:10   ` Jason Wang
