From: Anton Ivanov
Subject: Re: BUG:af_packet fails to TX TSO frames
Date: Thu, 12 Oct 2017 16:44:26 +0100
To: Willem de Bruijn
Cc: Network Development, David Miller
References: <844d6e0f-6ee7-74bc-b961-faa77b240303@cambridgegreys.com>
 <23ace6d6-afa7-9a3a-aa61-1245ee6c0498@kot-begemot.co.uk>
 <1e9c4f8b-2eb6-24cf-764f-b0a98aa0d044@kot-begemot.co.uk>
 <40e87e75-742f-3542-b79e-1e7fee9b4485@cambridgegreys.com>
 <2f973588-e193-86c1-a645-7a158b17ebdc@cambridgegreys.com>
 <529ff560-8230-c799-2948-fae711de382e@kot-begemot.co.uk>

Found it.

Two bugs canceling each other.

The bind sequence in psock_txring_vnet.c is wrong.

It does the following:

addr.sll_protocol = htons(ETH_P_IP);

before calling bind.

If you set addr.sll_protocol to ETH_P_ALL, where it should have been in
the first place, the test program blows up with -ENOBUFS.

I think what is happening is that this value is taken into account when
deciding "what should I use to segment it with" in skb_mac_gso_segment,
which is invoked at the end of the verification chain that starts in
packet_direct_xmit in af_packet.c.

I have not tried the other test cases, like setting it to ETH_P_IP and
giving it IPv6 traffic or the opposite, but my guess is that these will
fail too if they need GSO to be applied.

A.

On 10/12/17 15:12, Anton Ivanov wrote:
>
>
> On 10/12/17 14:39, Willem de Bruijn wrote:
>>> If I produce a real vnet frame out of a live kernel frame using
>>> virtio_net_hdr_from_skb() and try to send it it fails on the check in
>>> af_packet, while succeeding for tap.
>>> If I remove the af_packet check the
>>> frame is accepted by the hardware too.
>>>
>>> If I produce it a synthetic frame + vnet header using the test
>>> program - it works. Go figure.
>> Besides looking at the raw frame bytes, also compare the setup
>> of virtio_net_header, as well as the tcp checksum field. The stack
>> expects the pseudo header to have already been calculated.
>
> I am feeding it a skb which is coming up in the tx routine of a User
> Mode Linux device which is marked as NETIF_F_HW_CSUM and SG - that
> results in a skb with csum-ed headers, body set up for CSUM_PARTIAL
> and multiple fragments (always at least 1 more frag besides the TCP
> head).
>
> That has everything in order as expected by virtio_net_hdr_from_skb
> and this is what I use to generate the vnet header. It works correctly
> for csum and GRO with af_packet and it works correctly for everything
> using a tap device. It fails only on GSO + af_packet TX.
>
> What I am doing is the same thing virtio_net does - it just takes the
> output of virtio_net_hdr_from_skb and does nothing more. There should
> be no need to do anything more :(
>
> It should just work.
>
> Unless there is a gremlin somewhere in the machinery and that gremlin
> needs some light to be flushed out.
>>
>>> I am going to continue digging into it.
>>>
>>> At the very least I now have a positive test case which uses the same
>>> semantics as my code so I have something to compare to.
>> Glad to hear that the test is helpful. I wrote it because I
>> have run into these exact same issues in the past.
>
> It is. I have changes ready for it so it also supports vector IO, need
> to finish fighting with it.
>
> A.
>
>>
>

-- 
Anton R. Ivanov
Cambridge Greys Limited, England and Wales company No 10273661
http://www.cambridgegreys.com/
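
For reference, a minimal sketch of the bind sequence discussed at the top of this message, so the two protocol settings can be compared side by side. This is not the code from psock_txring_vnet.c: the helper names fill_bind_addr and bind_packet_socket are hypothetical, and the sketch assumes an already-open AF_PACKET socket and a known interface index. It only shows where sll_protocol is set; it does not reproduce the -ENOBUFS failure.

```c
#include <arpa/inet.h>        /* htons */
#include <assert.h>
#include <linux/if_ether.h>   /* ETH_P_IP, ETH_P_ALL */
#include <linux/if_packet.h>  /* struct sockaddr_ll */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>       /* bind, AF_PACKET */

/* Hypothetical helper: build the link-layer address handed to bind().
 * proto is in host byte order (ETH_P_IP, ETH_P_ALL, ...) and is stored
 * in network byte order, as sll_protocol requires. */
void fill_bind_addr(struct sockaddr_ll *addr, int ifindex, uint16_t proto)
{
    assert(addr != NULL);
    memset(addr, 0, sizeof(*addr));
    addr->sll_family   = AF_PACKET;
    addr->sll_ifindex  = ifindex;
    addr->sll_protocol = htons(proto);
}

/* Hypothetical helper: bind an already-open packet socket to an interface.
 * Returns whatever bind() returns: 0 on success, -1 with errno set. */
int bind_packet_socket(int fd, int ifindex, uint16_t proto)
{
    struct sockaddr_ll addr;

    fill_bind_addr(&addr, ifindex, proto);
    return bind(fd, (struct sockaddr *)&addr, sizeof(addr));
}
```

Calling bind_packet_socket(fd, ifindex, ETH_P_IP) corresponds to what the test program does today, and bind_packet_socket(fd, ifindex, ETH_P_ALL) to the variant that fails for me. Opening the socket itself needs CAP_NET_RAW, so that part is left out of the sketch.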