Re: [PATCH net] net: Add check for csum_start in skb_partial_csum_set()

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: "luwei (O)" <luwei32@huawei.com>, Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
	davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
	asml.silence@gmail.com, imagedong@tencent.com, brouer@redhat.com,
	keescook@chromium.org, jbenc@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH net] net: Add check for csum_start in skb_partial_csum_set()
Date: Wed, 12 Apr 2023 09:44:26 -0400
Message-ID: <6436b5ba5c005_41e2294dd@willemb.c.googlers.com.notmuch>
In-Reply-To: <c90abe8c-ffa0-f986-11eb-bde65c84d18b@huawei.com>

luwei (O) wrote:
> 
> On 2023/4/11 4:13 PM, Eric Dumazet wrote:
> > On Tue, Apr 11, 2023 at 4:33 AM luwei (O) <luwei32@huawei.com> wrote:
> >>
> >> On 2023/4/11 1:30 AM, Willem de Bruijn wrote:
> >>
> >> Eric Dumazet wrote:
> >>
> >> On Mon, Apr 10, 2023 at 4:22 AM Lu Wei <luwei32@huawei.com> wrote:
> >>
> >> If an AF_PACKET socket is used to send packets through an L3 mode ipvlan
> >> and a vnet header is set via setsockopt() with the option name of
> >> PACKET_VNET_HDR, the value of offset will be negative in function
> >> skb_checksum_help() and trigger the following warning:
> >>
> >> WARNING: CPU: 3 PID: 2023 at net/core/dev.c:3262
> >> skb_checksum_help+0x2dc/0x390
> >> ......
> >> Call Trace:
> >>   <TASK>
> >>   ip_do_fragment+0x63d/0xd00
> >>   ip_fragment.constprop.0+0xd2/0x150
> >>   __ip_finish_output+0x154/0x1e0
> >>   ip_finish_output+0x36/0x1b0
> >>   ip_output+0x134/0x240
> >>   ip_local_out+0xba/0xe0
> >>   ipvlan_process_v4_outbound+0x26d/0x2b0
> >>   ipvlan_xmit_mode_l3+0x44b/0x480
> >>   ipvlan_queue_xmit+0xd6/0x1d0
> >>   ipvlan_start_xmit+0x32/0xa0
> >>   dev_hard_start_xmit+0xdf/0x3f0
> >>   packet_snd+0xa7d/0x1130
> >>   packet_sendmsg+0x7b/0xa0
> >>   sock_sendmsg+0x14f/0x160
> >>   __sys_sendto+0x209/0x2e0
> >>   __x64_sys_sendto+0x7d/0x90
> >>
> >> The root cause is:
> >> 1. skb->csum_start is set in packet_snd() according to vnet_hdr:
> >>     skb->csum_start = skb_headroom(skb) + (u32)start;
> >>
> >>     'start' is the offset from skb->data, and mac header has been
> >>     set at this moment.
> >>
> >> 2. when this skb arrives at ipvlan_process_outbound(), the mac header
> >>     is unset and skb_pull is called to expand the skb headroom.
> >>
> >> 3. In function skb_checksum_help(), the variable offset is calculated
> >>     as:
> >>        offset = skb->csum_start - skb_headroom(skb);
> >>
> >>     since skb headroom is expanded in step 2, offset is negative, and it
> >>     is converted to an unsigned integer when compared with skb_headlen,
> >>     which triggers the warning.
> >>
> >> Not sure why it is negative ? This seems like the real problem...
> >>
> >> csum_start is relative to skb->head, regardless of pull operations.
> >>
> >> whatever set csum_start to a too small value should be tracked and fixed.
> >>
> >> Right. The only way I could see it go negative is if something does
> >> the equivalent of pskb_expand_head with positive nhead, and without
> >> calling skb_headers_offset_update.
> >>
> >> Perhaps the cause can be found by instrumenting all the above
> >> functions in the trace to report skb_headroom and csum_start.
> >> And also virtio_net_hdr_to_skb.
> >>
> >> Hi, Eric and Willem, sorry for not describing this issue clearly enough. Here is the detailed data path:
> >>
> >> 1. Users call sendmsg() to send a message on an AF_PACKET domain, SOCK_RAW type socket.
> >> Since vnet_hdr is set, csum_start is calculated as:
> >>
> >>                        skb->csum_start = skb_headroom(skb) + (u32)start;     // see the following code.
> >>
> >> the variable "start" is passed from user data; in my case it is 5 and skb_headroom is 2, so skb->csum_start is 7.
> >>
> > I think you are rephrasing, but you did not address my feedback.
> >
> > Namely, "csum_start < skb->network_header" does not look sensical to me.
> >
> > csum_start should be related to the transport header, not network header.
> 
>      csum_start is calculated in packet_snd() as:
> 
>                 skb->csum_start = skb_headroom(skb) + (u32)start;
> 
>     the variable "start" is passed from user data via vnet_hdr as follows:
> 
>      packet_snd()
>      ...	
> 	if (po->has_vnet_hdr) {
> 		err = packet_snd_vnet_parse(msg, &len, &vnet_hdr);   // get vnet_hdr which includes start
> 		if (err)
> 		    goto out_unlock;
> 		has_vnet_hdr = true;
> 	}
>      ...
> 
>    csum_start should be at the transport header but users may pass an incorrect value.

Thanks for the clarification.

So this is another bogus packet socket packet, with csum_start set
somewhere in the L2 header, and that gets popped by ipvlan, correct?
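
To make the arithmetic concrete, here is a minimal user-space sketch of
that scenario. It uses the values reported earlier in the thread
(start = 5, headroom = 2, so csum_start = 7). The struct and the
model_* helpers are hypothetical stand-ins, not kernel code, and the
14-byte pull is an assumption (an Ethernet-sized L2 header); the actual
amount ipvlan pulls was not stated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the few sk_buff fields involved. */
struct model_skb {
	uint32_t headroom;   /* models skb_headroom(): skb->data - skb->head */
	uint32_t headlen;    /* models skb_headlen(): linear bytes from skb->data */
	uint16_t csum_start; /* offset from skb->head, as in the kernel */
};

/* Models the assignment quoted from packet_snd(). */
static void model_set_csum_start(struct model_skb *skb, uint16_t start)
{
	skb->csum_start = (uint16_t)(skb->headroom + start);
}

/* Models skb_pull(): data moves forward, so headroom grows. */
static void model_pull(struct model_skb *skb, uint32_t len)
{
	skb->headroom += len;
	skb->headlen -= len;
}

/* Models the offset computation quoted from skb_checksum_help(). */
static int model_csum_offset(const struct model_skb *skb)
{
	return (int)skb->csum_start - (int)skb->headroom;
}

/* True when the offset, compared as unsigned, is outside the linear data. */
static bool model_offset_out_of_range(const struct model_skb *skb)
{
	return (uint32_t)model_csum_offset(skb) >= skb->headlen;
}
```

With headroom 2 and start 5 the offset is 5 and in range. After an
assumed 14-byte pull the headroom is 16, so the offset becomes
7 - 16 = -9, which wraps to a very large value in the unsigned
comparison against the head length.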

Do you have the exact packet and the virtio_net_hdr that caused this,
perhaps?

skb_partial_csum_set in virtio_net_hdr_to_skb has some basic bounds
tests for csum_start, csum_off and csum_end. But that does not
preclude an offset in the L2 header, from what I can tell.
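
As a rough illustration of that gap, the sketch below models the kind
of bounds test described above in user space. It is written from
memory of the checks and the kernel code may differ in detail. The
model_* names, MODEL_ETH_HLEN, and the extra L2 test are all
hypothetical; the off value used in the note below is likewise made up
for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_ETH_HLEN 14u /* assumption: fixed-length Ethernet L2 header */

/*
 * Models the basic bounds tests: csum_start must fit in 16 bits and the
 * 2-byte checksum field must end inside the linear data.
 */
static bool model_partial_csum_bounds_ok(uint32_t headroom, uint32_t headlen,
					 uint16_t start, uint16_t off)
{
	uint32_t csum_end = (uint32_t)start + (uint32_t)off + sizeof(uint16_t);
	uint32_t csum_start = headroom + (uint32_t)start;

	return csum_start < UINT16_MAX && csum_end <= headlen;
}

/*
 * Hypothetical additional test: reject a checksum start that lies inside
 * the link-layer header. l2_len is only easy to know for fixed-length
 * link layers, which is the complication for variable-length ones.
 */
static bool model_csum_start_past_l2(uint16_t start, uint32_t l2_len)
{
	return start >= l2_len;
}
```

With start = 5 and an illustrative off = 2, the basic bounds tests pass
for any reasonably sized packet, while the hypothetical L2 test rejects
the value because 5 is smaller than the assumed 14-byte header.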

Conceivably this can be added, though it is a bit complex for
devices with variable length link layer headers. And it would have
to happen not only for packet sockets, but all users of
virtio_net_hdr.