> Lorenzo Bianconi wrote: > > > Lorenzo Bianconi wrote: > > > > From: Eelco Chaudron > > > > > > > > This change adds support for tail growing and shrinking for XDP multi-buff. > > > > > > > > When called on a multi-buffer packet with a grow request, it will work > > > > on the last fragment of the packet. So the maximum grow size is the > > > > last fragments tailroom, i.e. no new buffer will be allocated. > > > > A XDP mb capable driver is expected to set frag_size in xdp_rxq_info data > > > > structure to notify the XDP core the fragment size. frag_size set to 0 is > > > > interpreted by the XDP core as tail growing is not allowed. > > > > Introduce __xdp_rxq_info_reg utility routine to initialize frag_size field. > > > > > > > > When shrinking, it will work from the last fragment, all the way down to > > > > the base buffer depending on the shrinking size. It's important to mention > > > > that once you shrink down the fragment(s) are freed, so you can not grow > > > > again to the original size. > > > > > > > > Acked-by: Jakub Kicinski > > > > Co-developed-by: Lorenzo Bianconi > > > > Signed-off-by: Lorenzo Bianconi > > > > Signed-off-by: Eelco Chaudron > > > > --- > > [...] > > pasting full function here to help following along. > > + > +static int bpf_xdp_mb_shrink_tail(struct xdp_buff *xdp, int offset) > +{ > + struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp); > + int i, n_frags_free = 0, len_free = 0; > + > + if (unlikely(offset > (int)xdp_get_buff_len(xdp) - ETH_HLEN)) > + return -EINVAL; > + > + for (i = sinfo->nr_frags - 1; i >= 0 && offset > 0; i--) { > + skb_frag_t *frag = &sinfo->frags[i]; > + int size = skb_frag_size(frag); > + int shrink = min_t(int, offset, size); > + > + len_free += shrink; > + offset -= shrink; > + > + if (unlikely(size == shrink)) { > + struct page *page = skb_frag_page(frag); > + > + __xdp_return(page_address(page), &xdp->rxq->mem, > + false, NULL); > + n_frags_free++; > + } else { > + skb_frag_size_set(frag, size - shrink); > + break; > + } > + } > + sinfo->nr_frags -= n_frags_free; > + sinfo->xdp_frags_size -= len_free; > + > + if (unlikely(offset > 0)) { > + xdp_buff_clear_mb(xdp); > + xdp->data_end -= offset; > + } > + > + return 0; > +} > + > > > > > > > hmm whats the case for offset to != 0? Seems with initial unlikely > > > check and shrinking while walking backwards through the frags it > > > should be zero? Maybe a comment would help? > > > > Looking at the code, offset can be > 0 here whenever we reduce the mb frame to > > a legacy frame (so whenever offset will move the boundary into the linear > > area). > > But still missing if we need to clear the mb bit or not when we shrink down > to a single frag. I think its fine, but worth double checking. As an example > consider I shrink 2k from a 3k pkt with two frags, one full 2k and another > 1k extra, > > On the first run through, > > i = 1; > offset = 2k > > + for (i = sinfo->nr_frags - 1; i >= 0 && offset > 0; i--) { > + skb_frag_t *frag = &sinfo->frags[i]; > + int size = skb_frag_size(frag); > + int shrink = min_t(int, offset, size); > > shrink = 1k; // min_t(int, offset, size) -> size > > + > + len_free += shrink; > + offset -= shrink; > > offset = 1k > > + if (unlikely(size == shrink)) { > + struct page *page = skb_frag_page(frag); > + > + __xdp_return(page_address(page), &xdp->rxq->mem, > + false, NULL); > + n_frags_free++; > > Will free the frag > > Then next run through > > i = 0; > offset = 1k; > > + skb_frag_t *frag = &sinfo->frags[i]; > + int size = skb_frag_size(frag); > + int shrink = min_t(int, offset, size); > > shrink = 1k; // min_t(int, offset, size) -> offset > > + > + len_free += shrink; > + offset -= shrink; > > offset = 0k > > + > + if (unlikely(size == shrink)) { ... > + } else { > + skb_frag_size_set(frag, size - shrink); > + break; > + } > > Then later there is the check 'if (unlikely(offset > 0) { ...}', but that > wont hit this case and we shrunk it back to a single frag. Did we want > to clear the mb in this case? I'm not seeing how it harms things to have > the mb bit set just trying to follow code here. If I followed correctly your example, we will have sinfo->nr_frags = 1 at the end of the processing (since the first fragment has 2k size), right? If so mb bit must be set to 1. Am I missing something? Re-looking at the code I guess we should clear mb bit using sinfo->nr_frags instead: if (!sinfo->nr_frags) xdp_buff_clear_mb(xdp); Agree? Regards, Lorenzo > > Would offset > 0 indicate we weren't able to shrink the xdp buff enough > for some reason. Need some coffee perhaps. > > Thanks, > John >