* Re: [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr [not found] <1622527177.2087624-1-xuanzhuo@linux.alibaba.com> @ 2021-06-01 6:17 ` Jason Wang 0 siblings, 0 replies; 5+ messages in thread From: Jason Wang @ 2021-06-01 6:17 UTC (permalink / raw) To: Xuan Zhuo Cc: Michael S. Tsirkin, David S. Miller, Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, virtualization, bpf, netdev 在 2021/6/1 下午1:59, Xuan Zhuo 写道: > On Tue, 1 Jun 2021 11:27:12 +0800, Jason Wang <jasowang@redhat.com> wrote: >> 在 2021/6/1 上午11:08, Xuan Zhuo 写道: >>> On Tue, 1 Jun 2021 11:03:37 +0800, Jason Wang <jasowang@redhat.com> wrote: >>>> 在 2021/5/31 下午6:58, Xuan Zhuo 写道: >>>>> On Mon, 31 May 2021 14:10:55 +0800, Jason Wang <jasowang@redhat.com> wrote: >>>>>> 在 2021/5/14 下午11:16, Xuan Zhuo 写道: >>>>>>> In the case of merge, the page passed into page_to_skb() may be a head >>>>>>> page, not the page where the current data is located. >>>>>> I don't get how this can happen? >>>>>> >>>>>> Maybe you can explain a little bit more? >>>>>> >>>>>> receive_mergeable() call page_to_skb() in two places: >>>>>> >>>>>> 1) XDP_PASS for linearized page , in this case we use xdp_page >>>>>> 2) page_to_skb() for "normal" page, in this case the page contains the data >>>>> The offset may be greater than PAGE_SIZE, because page is obtained by >>>>> virt_to_head_page(), not the page where buf is located. And "offset" is the offset >>>>> of buf relative to page. >>>>> >>>>> tailroom = truesize - len - offset; >>>>> >>>>> In this case, the tailroom must be less than 0. Although there may be enough >>>>> content on this page to save skb_shared_info. >>>> Interesting, I think we don't use compound pages for virtio-net. (We >>>> don't define SKB_FRAG_PAGE_ORDER). >>>> >>>> Am I wrong? >>> It seems to me that it seems to be a fixed setting, not for us to configure >>> independently >> >> Looks like you are right. >> >> See comments below. >> >> >>> Thanks. >>> >>> ========================================== >>> >>> net/sock.c >>> >>> #define SKB_FRAG_PAGE_ORDER get_order(32768) >>> DEFINE_STATIC_KEY_FALSE(net_high_order_alloc_disable_key); >>> >>> /** >>> * skb_page_frag_refill - check that a page_frag contains enough room >>> * @sz: minimum size of the fragment we want to get >>> * @pfrag: pointer to page_frag >>> * @gfp: priority for memory allocation >>> * >>> * Note: While this allocator tries to use high order pages, there is >>> * no guarantee that allocations succeed. Therefore, @sz MUST be >>> * less or equal than PAGE_SIZE. >>> */ >>> bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t gfp) >>> { >>> if (pfrag->page) { >>> if (page_ref_count(pfrag->page) == 1) { >>> pfrag->offset = 0; >>> return true; >>> } >>> if (pfrag->offset + sz <= pfrag->size) >>> return true; >>> put_page(pfrag->page); >>> } >>> >>> pfrag->offset = 0; >>> if (SKB_FRAG_PAGE_ORDER && >>> !static_branch_unlikely(&net_high_order_alloc_disable_key)) { >>> /* Avoid direct reclaim but allow kswapd to wake */ >>> pfrag->page = alloc_pages((gfp & ~__GFP_DIRECT_RECLAIM) | >>> __GFP_COMP | __GFP_NOWARN | >>> __GFP_NORETRY, >>> SKB_FRAG_PAGE_ORDER); >>> if (likely(pfrag->page)) { >>> pfrag->size = PAGE_SIZE << SKB_FRAG_PAGE_ORDER; >>> return true; >>> } >>> } >>> pfrag->page = alloc_page(gfp); >>> if (likely(pfrag->page)) { >>> pfrag->size = PAGE_SIZE; >>> return true; >>> } >>> return false; >>> } >>> EXPORT_SYMBOL(skb_page_frag_refill); >>> >>> >>>> Thanks >>>> >>>> >>>>> Thanks. >>>>> >>>>>> Thanks >>>>>> >>>>>> >>>>>>> So when trying to >>>>>>> get the buf where the data is located, you should directly use the >>>>>>> pointer(p) to get the address corresponding to the page. >>>>>>> >>>>>>> At the same time, the offset of the data in the page should also be >>>>>>> obtained using offset_in_page(). >>>>>>> >>>>>>> This patch solves this problem. But if you don’t use this patch, the >>>>>>> original code can also run, because if the page is not the page of the >>>>>>> current data, the calculated tailroom will be less than 0, and will not >>>>>>> enter the logic of build_skb() . The significance of this patch is to >>>>>>> modify this logical problem, allowing more situations to use >>>>>>> build_skb(). >>>>>>> >>>>>>> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> >>>>>>> Acked-by: Michael S. Tsirkin <mst@redhat.com> >>>>>>> --- >>>>>>> drivers/net/virtio_net.c | 8 ++++++-- >>>>>>> 1 file changed, 6 insertions(+), 2 deletions(-) >>>>>>> >>>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c >>>>>>> index 3e46c12dde08..073fec4c0df1 100644 >>>>>>> --- a/drivers/net/virtio_net.c >>>>>>> +++ b/drivers/net/virtio_net.c >>>>>>> @@ -407,8 +407,12 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi, >>>>>>> * see add_recvbuf_mergeable() + get_mergeable_buf_len() >>>>>>> */ >>>>>>> truesize = PAGE_SIZE; >>>>>>> - tailroom = truesize - len - offset; >>>>>>> - buf = page_address(page); >>>>>>> + >>>>>>> + /* page maybe head page, so we should get the buf by p, not the >>>>>>> + * page >>>>>>> + */ >>>>>>> + tailroom = truesize - len - offset_in_page(p); >> >> I wonder why offset_in_page(p) is correct? I guess it should be: >> >> tailroom = truesize - len - headroom; >> >> The reason is that the buffer is not necessarily allocated at the page >> boundary. > In my understanding, offset_in_page(p) is the offset of p in the page where it > is located. In this case, the two should be equal. I think not, if the frag is not page aligned. offset_in_page(p) doesn't equal to headroom. Consider the case that the frag start from page offset 1500. > This has nothing to do with > which page is allocated. > > Of course I think using headroom is a good idea, and it is semantically better. > > Thanks. Thanks > >> Thanks >> >> >>>>>>> + buf = (char *)((unsigned long)p & PAGE_MASK); >>>>>>> } else { >>>>>>> tailroom = truesize - len; >>>>>>> buf = p; ^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <1622516885.7439268-1-xuanzhuo@linux.alibaba.com>]
* Re: [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr [not found] <1622516885.7439268-1-xuanzhuo@linux.alibaba.com> @ 2021-06-01 3:27 ` Jason Wang 0 siblings, 0 replies; 5+ messages in thread From: Jason Wang @ 2021-06-01 3:27 UTC (permalink / raw) To: Xuan Zhuo Cc: Michael S. Tsirkin, David S. Miller, Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, virtualization, bpf, netdev 在 2021/6/1 上午11:08, Xuan Zhuo 写道: > On Tue, 1 Jun 2021 11:03:37 +0800, Jason Wang <jasowang@redhat.com> wrote: >> 在 2021/5/31 下午6:58, Xuan Zhuo 写道: >>> On Mon, 31 May 2021 14:10:55 +0800, Jason Wang <jasowang@redhat.com> wrote: >>>> 在 2021/5/14 下午11:16, Xuan Zhuo 写道: >>>>> In the case of merge, the page passed into page_to_skb() may be a head >>>>> page, not the page where the current data is located. >>>> I don't get how this can happen? >>>> >>>> Maybe you can explain a little bit more? >>>> >>>> receive_mergeable() call page_to_skb() in two places: >>>> >>>> 1) XDP_PASS for linearized page , in this case we use xdp_page >>>> 2) page_to_skb() for "normal" page, in this case the page contains the data >>> The offset may be greater than PAGE_SIZE, because page is obtained by >>> virt_to_head_page(), not the page where buf is located. And "offset" is the offset >>> of buf relative to page. >>> >>> tailroom = truesize - len - offset; >>> >>> In this case, the tailroom must be less than 0. Although there may be enough >>> content on this page to save skb_shared_info. >> >> Interesting, I think we don't use compound pages for virtio-net. (We >> don't define SKB_FRAG_PAGE_ORDER). >> >> Am I wrong? > > It seems to me that it seems to be a fixed setting, not for us to configure > independently Looks like you are right. See comments below. > > Thanks. > > ========================================== > > net/sock.c > > #define SKB_FRAG_PAGE_ORDER get_order(32768) > DEFINE_STATIC_KEY_FALSE(net_high_order_alloc_disable_key); > > /** > * skb_page_frag_refill - check that a page_frag contains enough room > * @sz: minimum size of the fragment we want to get > * @pfrag: pointer to page_frag > * @gfp: priority for memory allocation > * > * Note: While this allocator tries to use high order pages, there is > * no guarantee that allocations succeed. Therefore, @sz MUST be > * less or equal than PAGE_SIZE. > */ > bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t gfp) > { > if (pfrag->page) { > if (page_ref_count(pfrag->page) == 1) { > pfrag->offset = 0; > return true; > } > if (pfrag->offset + sz <= pfrag->size) > return true; > put_page(pfrag->page); > } > > pfrag->offset = 0; > if (SKB_FRAG_PAGE_ORDER && > !static_branch_unlikely(&net_high_order_alloc_disable_key)) { > /* Avoid direct reclaim but allow kswapd to wake */ > pfrag->page = alloc_pages((gfp & ~__GFP_DIRECT_RECLAIM) | > __GFP_COMP | __GFP_NOWARN | > __GFP_NORETRY, > SKB_FRAG_PAGE_ORDER); > if (likely(pfrag->page)) { > pfrag->size = PAGE_SIZE << SKB_FRAG_PAGE_ORDER; > return true; > } > } > pfrag->page = alloc_page(gfp); > if (likely(pfrag->page)) { > pfrag->size = PAGE_SIZE; > return true; > } > return false; > } > EXPORT_SYMBOL(skb_page_frag_refill); > > >> Thanks >> >> >>> Thanks. >>> >>>> Thanks >>>> >>>> >>>>> So when trying to >>>>> get the buf where the data is located, you should directly use the >>>>> pointer(p) to get the address corresponding to the page. >>>>> >>>>> At the same time, the offset of the data in the page should also be >>>>> obtained using offset_in_page(). >>>>> >>>>> This patch solves this problem. But if you don’t use this patch, the >>>>> original code can also run, because if the page is not the page of the >>>>> current data, the calculated tailroom will be less than 0, and will not >>>>> enter the logic of build_skb() . The significance of this patch is to >>>>> modify this logical problem, allowing more situations to use >>>>> build_skb(). >>>>> >>>>> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> >>>>> Acked-by: Michael S. Tsirkin <mst@redhat.com> >>>>> --- >>>>> drivers/net/virtio_net.c | 8 ++++++-- >>>>> 1 file changed, 6 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c >>>>> index 3e46c12dde08..073fec4c0df1 100644 >>>>> --- a/drivers/net/virtio_net.c >>>>> +++ b/drivers/net/virtio_net.c >>>>> @@ -407,8 +407,12 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi, >>>>> * see add_recvbuf_mergeable() + get_mergeable_buf_len() >>>>> */ >>>>> truesize = PAGE_SIZE; >>>>> - tailroom = truesize - len - offset; >>>>> - buf = page_address(page); >>>>> + >>>>> + /* page maybe head page, so we should get the buf by p, not the >>>>> + * page >>>>> + */ >>>>> + tailroom = truesize - len - offset_in_page(p); I wonder why offset_in_page(p) is correct? I guess it should be: tailroom = truesize - len - headroom; The reason is that the buffer is not necessarily allocated at the page boundary. Thanks >>>>> + buf = (char *)((unsigned long)p & PAGE_MASK); >>>>> } else { >>>>> tailroom = truesize - len; >>>>> buf = p; ^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <1622458734.837168-1-xuanzhuo@linux.alibaba.com>]
* Re: [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr [not found] <1622458734.837168-1-xuanzhuo@linux.alibaba.com> @ 2021-06-01 3:03 ` Jason Wang 0 siblings, 0 replies; 5+ messages in thread From: Jason Wang @ 2021-06-01 3:03 UTC (permalink / raw) To: Xuan Zhuo Cc: Michael S. Tsirkin, David S. Miller, Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, virtualization, bpf, netdev 在 2021/5/31 下午6:58, Xuan Zhuo 写道: > On Mon, 31 May 2021 14:10:55 +0800, Jason Wang <jasowang@redhat.com> wrote: >> 在 2021/5/14 下午11:16, Xuan Zhuo 写道: >>> In the case of merge, the page passed into page_to_skb() may be a head >>> page, not the page where the current data is located. >> >> I don't get how this can happen? >> >> Maybe you can explain a little bit more? >> >> receive_mergeable() call page_to_skb() in two places: >> >> 1) XDP_PASS for linearized page , in this case we use xdp_page >> 2) page_to_skb() for "normal" page, in this case the page contains the data > The offset may be greater than PAGE_SIZE, because page is obtained by > virt_to_head_page(), not the page where buf is located. And "offset" is the offset > of buf relative to page. > > tailroom = truesize - len - offset; > > In this case, the tailroom must be less than 0. Although there may be enough > content on this page to save skb_shared_info. Interesting, I think we don't use compound pages for virtio-net. (We don't define SKB_FRAG_PAGE_ORDER). Am I wrong? Thanks > > Thanks. > >> Thanks >> >> >>> So when trying to >>> get the buf where the data is located, you should directly use the >>> pointer(p) to get the address corresponding to the page. >>> >>> At the same time, the offset of the data in the page should also be >>> obtained using offset_in_page(). >>> >>> This patch solves this problem. But if you don’t use this patch, the >>> original code can also run, because if the page is not the page of the >>> current data, the calculated tailroom will be less than 0, and will not >>> enter the logic of build_skb() . The significance of this patch is to >>> modify this logical problem, allowing more situations to use >>> build_skb(). >>> >>> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> >>> Acked-by: Michael S. Tsirkin <mst@redhat.com> >>> --- >>> drivers/net/virtio_net.c | 8 ++++++-- >>> 1 file changed, 6 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c >>> index 3e46c12dde08..073fec4c0df1 100644 >>> --- a/drivers/net/virtio_net.c >>> +++ b/drivers/net/virtio_net.c >>> @@ -407,8 +407,12 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi, >>> * see add_recvbuf_mergeable() + get_mergeable_buf_len() >>> */ >>> truesize = PAGE_SIZE; >>> - tailroom = truesize - len - offset; >>> - buf = page_address(page); >>> + >>> + /* page maybe head page, so we should get the buf by p, not the >>> + * page >>> + */ >>> + tailroom = truesize - len - offset_in_page(p); >>> + buf = (char *)((unsigned long)p & PAGE_MASK); >>> } else { >>> tailroom = truesize - len; >>> buf = p; ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH net 0/2] virtio-net: fix for build_skb() @ 2021-05-14 15:16 Xuan Zhuo 2021-05-14 15:16 ` [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr Xuan Zhuo 0 siblings, 1 reply; 5+ messages in thread From: Xuan Zhuo @ 2021-05-14 15:16 UTC (permalink / raw) To: netdev Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, Xuan Zhuo, virtualization, bpf #1 Fixed a serious error. #2 Fixed a logical error, but this error did not cause any serious consequences. The logic of this piece is really messy. Fortunately, my refactored patch can be completed with a small amount of testing. Thanks. Xuan Zhuo (2): virtio-net: fix for unable to handle page fault for address virtio-net: get build_skb() buf by data ptr drivers/net/virtio_net.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) -- 2.31.0 ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr 2021-05-14 15:16 [PATCH net 0/2] virtio-net: fix for build_skb() Xuan Zhuo @ 2021-05-14 15:16 ` Xuan Zhuo 2021-05-31 6:10 ` Jason Wang 0 siblings, 1 reply; 5+ messages in thread From: Xuan Zhuo @ 2021-05-14 15:16 UTC (permalink / raw) To: netdev Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, Xuan Zhuo, virtualization, bpf In the case of merge, the page passed into page_to_skb() may be a head page, not the page where the current data is located. So when trying to get the buf where the data is located, you should directly use the pointer(p) to get the address corresponding to the page. At the same time, the offset of the data in the page should also be obtained using offset_in_page(). This patch solves this problem. But if you don’t use this patch, the original code can also run, because if the page is not the page of the current data, the calculated tailroom will be less than 0, and will not enter the logic of build_skb() . The significance of this patch is to modify this logical problem, allowing more situations to use build_skb(). Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> --- drivers/net/virtio_net.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 3e46c12dde08..073fec4c0df1 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -407,8 +407,12 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi, * see add_recvbuf_mergeable() + get_mergeable_buf_len() */ truesize = PAGE_SIZE; - tailroom = truesize - len - offset; - buf = page_address(page); + + /* page maybe head page, so we should get the buf by p, not the + * page + */ + tailroom = truesize - len - offset_in_page(p); + buf = (char *)((unsigned long)p & PAGE_MASK); } else { tailroom = truesize - len; buf = p; -- 2.31.0 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr 2021-05-14 15:16 ` [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr Xuan Zhuo @ 2021-05-31 6:10 ` Jason Wang 0 siblings, 0 replies; 5+ messages in thread From: Jason Wang @ 2021-05-31 6:10 UTC (permalink / raw) To: Xuan Zhuo, netdev Cc: Michael S. Tsirkin, David S. Miller, Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, virtualization, bpf 在 2021/5/14 下午11:16, Xuan Zhuo 写道: > In the case of merge, the page passed into page_to_skb() may be a head > page, not the page where the current data is located. I don't get how this can happen? Maybe you can explain a little bit more? receive_mergeable() call page_to_skb() in two places: 1) XDP_PASS for linearized page , in this case we use xdp_page 2) page_to_skb() for "normal" page, in this case the page contains the data Thanks > So when trying to > get the buf where the data is located, you should directly use the > pointer(p) to get the address corresponding to the page. > > At the same time, the offset of the data in the page should also be > obtained using offset_in_page(). > > This patch solves this problem. But if you don’t use this patch, the > original code can also run, because if the page is not the page of the > current data, the calculated tailroom will be less than 0, and will not > enter the logic of build_skb() . The significance of this patch is to > modify this logical problem, allowing more situations to use > build_skb(). > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > Acked-by: Michael S. Tsirkin <mst@redhat.com> > --- > drivers/net/virtio_net.c | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index 3e46c12dde08..073fec4c0df1 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -407,8 +407,12 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi, > * see add_recvbuf_mergeable() + get_mergeable_buf_len() > */ > truesize = PAGE_SIZE; > - tailroom = truesize - len - offset; > - buf = page_address(page); > + > + /* page maybe head page, so we should get the buf by p, not the > + * page > + */ > + tailroom = truesize - len - offset_in_page(p); > + buf = (char *)((unsigned long)p & PAGE_MASK); > } else { > tailroom = truesize - len; > buf = p; ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-06-01 6:17 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <1622527177.2087624-1-xuanzhuo@linux.alibaba.com> 2021-06-01 6:17 ` [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr Jason Wang [not found] <1622516885.7439268-1-xuanzhuo@linux.alibaba.com> 2021-06-01 3:27 ` Jason Wang [not found] <1622458734.837168-1-xuanzhuo@linux.alibaba.com> 2021-06-01 3:03 ` Jason Wang 2021-05-14 15:16 [PATCH net 0/2] virtio-net: fix for build_skb() Xuan Zhuo 2021-05-14 15:16 ` [PATCH net 2/2] virtio-net: get build_skb() buf by data ptr Xuan Zhuo 2021-05-31 6:10 ` Jason Wang
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).