From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Duyck Subject: Re: [net-next PATCH v2 5/8] udp: Partially unroll handling of first segment and last segment Date: Sat, 5 May 2018 10:13:17 -0700 Message-ID: References: <20180504182537.5194.72775.stgit@localhost.localdomain> <20180504183019.5194.47789.stgit@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Cc: Network Development , Willem de Bruijn , David Miller To: Willem de Bruijn Return-path: Received: from mail-ot0-f194.google.com ([74.125.82.194]:42399 "EHLO mail-ot0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751183AbeEERNS (ORCPT ); Sat, 5 May 2018 13:13:18 -0400 Received: by mail-ot0-f194.google.com with SMTP id l13-v6so27833889otk.9 for ; Sat, 05 May 2018 10:13:18 -0700 (PDT) In-Reply-To: Sender: netdev-owner@vger.kernel.org List-ID: On Sat, May 5, 2018 at 1:37 AM, Willem de Bruijn wrote: > On Fri, May 4, 2018 at 8:30 PM, Alexander Duyck > wrote: >> From: Alexander Duyck >> >> This patch allows us to take care of unrolling the first segment and the >> last segment > > Only the last segment needs to be unrolled, right? I need the first segment as I have to get access to the UDP header to recompute what the checksum should actually be for the first frame and all the frames in between. >> of the loop for processing the segmented skb. Part of the >> motivation for this is that it makes it easier to process the fact that the >> first fame and all of the frames in between should be mostly identical >> in terms of header data, and the last frame has differences in the length >> and partial checksum. >> >> In addition I am dropping the header length calculation since we don't >> really need it for anything but the last frame and it can be easily >> obtained by just pulling the data_len and offset of tail from the transport >> header. >> >> Signed-off-by: Alexander Duyck >> --- >> >> v2: New break-out patch based on one patch from earlier series >> >> net/ipv4/udp_offload.c | 35 ++++++++++++++++++++--------------- >> 1 file changed, 20 insertions(+), 15 deletions(-) >> >> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c >> index 5c4bb8b9e83e..946d06d2aa0c 100644 >> --- a/net/ipv4/udp_offload.c >> +++ b/net/ipv4/udp_offload.c >> @@ -193,7 +193,6 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb, >> struct sk_buff *seg, *segs = ERR_PTR(-EINVAL); >> struct sock *sk = gso_skb->sk; >> unsigned int sum_truesize = 0; >> - unsigned int hdrlen; >> struct udphdr *uh; >> unsigned int mss; >> __sum16 check; >> @@ -206,7 +205,6 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb, >> if (!pskb_may_pull(gso_skb, sizeof(*uh))) >> goto out; >> >> - hdrlen = gso_skb->data - skb_mac_header(gso_skb); >> skb_pull(gso_skb, sizeof(*uh)); >> >> /* clear destructor to avoid skb_segment assigning it to tail */ >> @@ -219,7 +217,8 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb, >> return segs; >> } >> >> - uh = udp_hdr(segs); >> + seg = segs; >> + uh = udp_hdr(seg); >> >> /* compute checksum adjustment based on old length versus new */ >> newlen = htons(sizeof(*uh) + mss); >> @@ -227,25 +226,31 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb, >> ((__force u32)uh->len ^ 0xFFFF) + >> (__force u32)newlen)); >> >> - for (seg = segs; seg; seg = seg->next) { >> - uh = udp_hdr(seg); >> + for (;;) { >> + seg->destructor = sock_wfree; >> + seg->sk = sk; >> + sum_truesize += seg->truesize; >> >> - /* last packet can be partial gso_size */ >> - if (!seg->next) { >> - newlen = htons(seg->len - hdrlen); >> - check = ~csum_fold((__force __wsum)((__force u32)uh->check + >> - ((__force u32)uh->len ^ 0xFFFF) + >> - (__force u32)newlen)); >> - } >> + if (!seg->next) >> + break; > > Not critical, but I find replacing > > for (seg = segs; seg; seg = seg->next) { > uh = udp_hdr(seg); > ... > } > > with > > uh = udp_hdr(seg); > for (;;) { > ... > if (!seg->next) > break; > > uh = udp_hdr(seg); > } > > much less easy to parse and it inflates patch size. How about just > > - for (seg = segs; seg; seg = seg->next) > + for( seg = segs; seg->next; seg = seg->next) > > The problem is I need access to the UDP header before I start the loop. That was why I pulled seg = segs and the "uh = udp_hdr(seg)" out seperately >> >> uh->len = newlen; >> uh->check = check; >> >> - seg->destructor = sock_wfree; >> - seg->sk = sk; >> - sum_truesize += seg->truesize; >> + seg = seg->next; >> + uh = udp_hdr(seg); >> } >> >> + /* last packet can be partial gso_size, account for that in checksum */ >> + newlen = htons(skb_tail_pointer(seg) - skb_transport_header(seg) + >> + seg->data_len); >> + check = ~csum_fold((__force __wsum)((__force u32)uh->check + >> + ((__force u32)uh->len ^ 0xFFFF) + >> + (__force u32)newlen)); >> + uh->len = newlen; >> + uh->check = check; >> + >> + /* update refcount for the packet */ >> refcount_add(sum_truesize - gso_skb->truesize, &sk->sk_wmem_alloc); >> out: >> return segs; >>