On Sep 16, Jakub Kicinski wrote: > On Fri, 10 Sep 2021 18:14:16 +0200 Lorenzo Bianconi wrote: > > From: Eelco Chaudron > > > > This change adds support for tail growing and shrinking for XDP multi-buff. > > > > When called on a multi-buffer packet with a grow request, it will always > > work on the last fragment of the packet. So the maximum grow size is the > > last fragments tailroom, i.e. no new buffer will be allocated. > > > > When shrinking, it will work from the last fragment, all the way down to > > the base buffer depending on the shrinking size. It's important to mention > > that once you shrink down the fragment(s) are freed, so you can not grow > > again to the original size. > > > > Acked-by: John Fastabend > > Co-developed-by: Lorenzo Bianconi > > Signed-off-by: Lorenzo Bianconi > > Signed-off-by: Eelco Chaudron > > > +static inline unsigned int xdp_get_frag_tailroom(const skb_frag_t *frag) > > +{ > > + struct page *page = skb_frag_page(frag); > > + > > + return page_size(page) - skb_frag_size(frag) - skb_frag_off(frag); > > +} > > How do we deal with NICs which can pack multiple skbs into a page frag? > skb_shared_info field to mark the end of last fragment? Just want to make > sure there is a path to supporting such designs. I guess here, intead of using page_size(page) we can rely on xdp_buff->frame_sz or even on skb_shared_info()->xdp_frag_truesize (assuming all fragments from a given hw have the same truesize, but I think this is something we can rely on) static inline unsigned int xdp_get_frag_tailroom(struct xdp_buff *xdp, const skb_frag_t *frag) { return xdp->frame_sz - skb_frag_size(frag) - skb_frag_off(frag); } what do you think? > > > +static int bpf_xdp_mb_adjust_tail(struct xdp_buff *xdp, int offset) > > +{ > > + struct skb_shared_info *sinfo; > > + > > + sinfo = xdp_get_shared_info_from_buff(xdp); > > + if (offset >= 0) { > > + skb_frag_t *frag = &sinfo->frags[sinfo->nr_frags - 1]; > > + int size; > > + > > + if (unlikely(offset > xdp_get_frag_tailroom(frag))) > > + return -EINVAL; > > + > > + size = skb_frag_size(frag); > > + memset(skb_frag_address(frag) + size, 0, offset); > > + skb_frag_size_set(frag, size + offset); > > + sinfo->xdp_frags_size += offset; > > + } else { > > + int i, n_frags_free = 0, len_free = 0, tlen_free = 0; > > + > > + offset = abs(offset); > > + if (unlikely(offset > ((int)(xdp->data_end - xdp->data) + > > + sinfo->xdp_frags_size - ETH_HLEN))) > > + return -EINVAL; > > + > > + for (i = sinfo->nr_frags - 1; i >= 0 && offset > 0; i--) { > > + skb_frag_t *frag = &sinfo->frags[i]; > > + int size = skb_frag_size(frag); > > + int shrink = min_t(int, offset, size); > > + > > + len_free += shrink; > > + offset -= shrink; > > + > > + if (unlikely(size == shrink)) { > > + struct page *page = skb_frag_page(frag); > > + > > + __xdp_return(page_address(page), &xdp->rxq->mem, > > + false, NULL); > > + tlen_free += page_size(page); > > + n_frags_free++; > > + } else { > > + skb_frag_size_set(frag, size - shrink); > > + break; > > + } > > + } > > + sinfo->nr_frags -= n_frags_free; > > + sinfo->xdp_frags_size -= len_free; > > + sinfo->xdp_frags_truesize -= tlen_free; > > + > > + if (unlikely(offset > 0)) { > > + xdp_buff_clear_mb(xdp); > > + xdp->data_end -= offset; > > + } > > + } > > + > > + return 0; > > +} > > nit: most of this function is indented, situation is ripe for splitting > it into two sure, I will fix it. Regards, Lorenzo