Kernel Newbies archive on lore.kernel.org
From: Ramana Reddy <gtvrreddy@gmail.com>
To: kernelnewbies@kernelnewbies.org
Subject: gso packet is failing with af_packet socket with packet_vnet_hdr
Date: Mon, 4 Nov 2019 01:23:16 +0530
Message-ID: <CAL2CrsOH-EJ6372pUBjSd6T+T=Hi+WT4-cGt=3bMVAWDtiNf-w@mail.gmail.com>

Hi,
I am wondering if anyone can help me with this. I am having trouble
sending a TSO/GSO packet through an AF_PACKET socket with PACKET_VNET_HDR
enabled (using virtio_net_hdr) over a VXLAN tunnel in OVS.
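
For readers unfamiliar with this setup: once PACKET_VNET_HDR is enabled on
the packet socket (setsockopt with SOL_PACKET / PACKET_VNET_HDR), every
frame passed to send() is prefixed with a struct virtio_net_hdr describing
the requested offloads. Below is a minimal sketch of filling that header
for a TCP/IPv4 frame. The function name and parameters are hypothetical,
the fields are assumed to be in host byte order, and this is not the exact
code used in the test described here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <linux/virtio_net.h>   /* struct virtio_net_hdr, VIRTIO_NET_HDR_* */

/*
 * Fill the virtio_net_hdr that precedes a TCP/IPv4 frame on an AF_PACKET
 * socket with PACKET_VNET_HDR enabled (hypothetical helper).
 *
 *   l2_len, l3_len, l4_len: Ethernet, IPv4 and TCP header lengths in bytes
 *   mss: TCP payload bytes per segment; the kernel stores it as gso_size
 */
static void fill_vnet_hdr_tcp4(struct virtio_net_hdr *h,
                               uint16_t l2_len, uint16_t l3_len,
                               uint16_t l4_len, uint16_t mss)
{
    assert(h != NULL);
    memset(h, 0, sizeof(*h));

    /* Ask the kernel to finish the TCP checksum and to segment as TCPv4. */
    h->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
    h->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;

    /* Total length of the headers that are replicated per segment. */
    h->hdr_len = (uint16_t)(l2_len + l3_len + l4_len);
    h->gso_size = mss;

    /* Checksum starts at the TCP header; 16 is the offset of the
     * checksum field inside the TCP header. */
    h->csum_start = (uint16_t)(l2_len + l3_len);
    h->csum_offset = 16;
}
```

For example, fill_vnet_hdr_tcp4(&hdr, 14, 20, 32, 1448) describes a frame
with a 14-byte Ethernet header, a 20-byte IPv4 header, a 32-byte TCP header
and a gso_size of 1448, the value that appears in the log further down.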

What I observed is that the following function (net/core/skbuff.c) is
eventually reached and returns false, hence the packet is dropped.
static inline bool skb_gso_size_check(const struct sk_buff *skb,
                                      unsigned int seg_len,
                                      unsigned int max_len) {
        const struct skb_shared_info *shinfo = skb_shinfo(skb);
        const struct sk_buff *iter;
        if (shinfo->gso_size != GSO_BY_FRAGS)
                return seg_len <= max_len;
        ..........
}
[  678.756673] ip_finish_output_gso:235 packet_length:2762 (here packet_length = skb->len - skb_inner_network_offset(skb))
[  678.756678] ip_fragment:510 packet length:1500
[  678.756715] ip_fragment:510 packet length:1314
[  678.956889] skb_gso_size_check:4474 and seg_len:1550 and max_len:1500 and shinfo->gso_size:1448 and GSO_BY_FRAGS:65535

Observation:
When we send the large packet (packet_length:2762 in this example), the log
shows seg_len (1550) > max_len (1500), so the statement
return seg_len <= max_len returns false.
Because of this, the code never reaches ip_finish_output2(sk, skb);
instead, ip_fragment calls icmp_send(skb, ICMP_DEST_UNREACH,
ICMP_FRAG_NEEDED, htonl(mtu)). The relevant function in
net/ipv4/ip_output.c is given below:

static int ip_finish_output_gso(struct sock *sk, struct sk_buff *skb,
                                unsigned int mtu)
{
        netdev_features_t features;
        struct sk_buff *segs;
        int ret = 0;

        /* common case: seglen is <= mtu */
        if (skb_gso_validate_mtu(skb, mtu))
                return ip_finish_output2(sk, skb);
        ...........
        err = ip_fragment(sk, segs, mtu, ip_finish_output2);
        ...........
}

But when we send normal iperf traffic (GSO/TSO traffic) over VXLAN,
skb_gso_size_check returns true and ip_finish_output2 is executed.
Here are the values for normal iperf traffic over VXLAN:

[ 1041.400537] skb_gso_size_check:4477 and seg_len:1500 and max_len:1500 and shinfo->gso_size:1398 and GSO_BY_FRAGS:65535
[ 1041.400587] skb_gso_size_check:4477 and seg_len:1450 and max_len:1450 and shinfo->gso_size:1398 and GSO_BY_FRAGS:65535
[ 1041.400594] skb_gso_size_check:4477 and seg_len:1500 and max_len:1500 and shinfo->gso_size:1398 and GSO_BY_FRAGS:65535
[ 1041.400732] skb_gso_size_check:4477 and seg_len:1450 and max_len:1450 and shinfo->gso_size:1398 and GSO_BY_FRAGS:65535
[ 1041.400741] skb_gso_size_check:4477 and seg_len:1450 and max_len:1450 and shinfo->gso_size:1398 and GSO_BY_FRAGS:65535
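
The seg_len values in both logs can be reproduced with simple arithmetic.
This is a minimal sketch, assuming seg_len is gso_size plus every header
from the network header onward, standard IPv4/UDP/VXLAN/Ethernet header
sizes, and a 32-byte inner TCP header (20 bytes plus 12 bytes of options).
The TCP header size is an assumption and is not printed in the logs. All
names below are illustrative.

```c
#include <assert.h>

/* Header sizes in bytes. INNER_TCP_HDR is an assumption (20 + 12 bytes of
 * options); the other values are the standard minimum header sizes. */
enum {
    OUTER_IPV4_HDR = 20,
    OUTER_UDP_HDR  = 8,
    VXLAN_HDR      = 8,
    INNER_ETH_HDR  = 14,
    INNER_IPV4_HDR = 20,
    INNER_TCP_HDR  = 32
};

/* Bytes that VXLAN encapsulation adds in front of the inner IPv4 header. */
static unsigned int vxlan_overhead(void)
{
    return OUTER_IPV4_HDR + OUTER_UDP_HDR + VXLAN_HDR + INNER_ETH_HDR;
}

/* Length of one segment measured from the inner IPv4 header. */
static unsigned int inner_seg_len(unsigned int gso_size)
{
    return INNER_IPV4_HDR + INNER_TCP_HDR + gso_size;
}

/* Length of one segment measured from the outer IPv4 header. */
static unsigned int encap_seg_len(unsigned int gso_size)
{
    return vxlan_overhead() + inner_seg_len(gso_size);
}

/* Same comparison as the non-GSO_BY_FRAGS branch of skb_gso_size_check. */
static int fits(unsigned int seg_len, unsigned int max_len)
{
    return seg_len <= max_len;
}

/* Compare the model with the values printed in the two logs. */
static void check_against_logs(void)
{
    assert(encap_seg_len(1448) == 1550);   /* af_packet case */
    assert(!fits(encap_seg_len(1448), 1500));
    assert(encap_seg_len(1398) == 1500);   /* iperf case, outer check */
    assert(inner_seg_len(1398) == 1450);   /* iperf case, inner check */
    assert(fits(encap_seg_len(1398), 1500));
}
```

Under these assumptions the two gso_size values differ by 1448 - 1398 = 50
bytes, which equals vxlan_overhead(): the iperf traffic uses a gso_size
that leaves room for the encapsulation, while the af_packet traffic uses a
gso_size sized for an unencapsulated 1500-byte MTU.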

Can someone help me figure out what is missing, and where I should modify
the code, in OVS or outside of OVS, so that this works as expected?

Thanks in advance.

Some more info:
[root@xx ~]# uname -r
3.10.0-1062.4.1.el7.x86_64
[root@xx ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.7 (Maipo)

[root@xx]# ovs-vsctl --version
ovs-vsctl (Open vSwitch) 2.9.0
DB Schema 7.15.1

And dump_stack output with af_packet:
[ 4833.637460]  <IRQ>  [<ffffffff81979612>] dump_stack+0x19/0x1b
[ 4833.637474]  [<ffffffff8197c3ca>] ip_fragment.constprop.55+0xc3/0x141
[ 4833.637481]  [<ffffffff8189dd84>] ip_finish_output+0x314/0x350
[ 4833.637484]  [<ffffffff8189eb83>] ip_output+0xb3/0x130
[ 4833.637490]  [<ffffffff8189da70>] ? ip_do_fragment+0x910/0x910
[ 4833.637493]  [<ffffffff8189cac9>] ip_local_out_sk+0xf9/0x180
[ 4833.637497]  [<ffffffff818e6f6c>] iptunnel_xmit+0x18c/0x220
[ 4833.637505]  [<ffffffffc073b2e7>] udp_tunnel_xmit_skb+0x117/0x130 [udp_tunnel]
[ 4833.637538]  [<ffffffffc074585a>] vxlan_xmit_one+0xb6a/0xb70 [vxlan]
[ 4833.637545]  [<ffffffff8129dad9>] ? vprintk_default+0x29/0x40
[ 4833.637551]  [<ffffffffc074765e>] vxlan_xmit+0xc9e/0xef0 [vxlan]
[ 4833.637555]  [<ffffffff818356e7>] ? kfree_skbmem+0x37/0x90
[ 4833.637559]  [<ffffffff81836c24>] ? consume_skb+0x34/0x90
[ 4833.637564]  [<ffffffff819547bc>] ? packet_rcv+0x4c/0x3e0
[ 4833.637570]  [<ffffffff8184d346>] dev_hard_start_xmit+0x246/0x3b0
[ 4833.637574]  [<ffffffff81850339>] __dev_queue_xmit+0x519/0x650
[ 4833.637580]  [<ffffffff812d9df0>] ? try_to_wake_up+0x190/0x390
[ 4833.637585]  [<ffffffff81850480>] dev_queue_xmit+0x10/0x20
[ 4833.637592]  [<ffffffffc0724316>] ovs_vport_send+0xa6/0x180 [openvswitch]
[ 4833.637599]  [<ffffffffc07150fe>] do_output+0x4e/0xd0 [openvswitch]
[ 4833.637604]  [<ffffffffc0716699>] do_execute_actions+0xa29/0xa40 [openvswitch]
[ 4833.637610]  [<ffffffff812d24d2>] ? __wake_up_common+0x82/0x120
[ 4833.637615]  [<ffffffffc0716aac>] ovs_execute_actions+0x4c/0x140 [openvswitch]
[ 4833.637621]  [<ffffffffc071a824>] ovs_dp_process_packet+0x84/0x120 [openvswitch]
[ 4833.637627]  [<ffffffffc0725404>] ? ovs_ct_update_key+0xc4/0x150 [openvswitch]
[ 4833.637633]  [<ffffffffc0724213>] ovs_vport_receive+0x73/0xd0 [openvswitch]
[ 4833.637638]  [<ffffffff812d666f>] ? ttwu_do_activate+0x6f/0x80
[ 4833.637642]  [<ffffffff812d9df0>] ? try_to_wake_up+0x190/0x390
[ 4833.637646]  [<ffffffff812da0c2>] ? default_wake_function+0x12/0x20
[ 4833.637651]  [<ffffffff812c61eb>] ? autoremove_wake_function+0x2b/0x40
[ 4833.637657]  [<ffffffff812d24d2>] ? __wake_up_common+0x82/0x120
[ 4833.637661]  [<ffffffff812e3ae9>] ? update_cfs_shares+0xa9/0xf0
[ 4833.637665]  [<ffffffff812e3696>] ? update_curr+0x86/0x1e0
[ 4833.637669]  [<ffffffff812dee88>] ? __enqueue_entity+0x78/0x80
[ 4833.637677]  [<ffffffffc0724cbe>] netdev_frame_hook+0xde/0x180 [openvswitch]
[ 4833.637682]  [<ffffffff8184d6aa>] __netif_receive_skb_core+0x1fa/0xa10
[ 4833.637688]  [<ffffffffc0724be0>] ? vport_netdev_free+0x30/0x30 [openvswitch]
[ 4833.637692]  [<ffffffff812d6539>] ? ttwu_do_wakeup+0x19/0xe0
[ 4833.637697]  [<ffffffff8184ded8>] __netif_receive_skb+0x18/0x60
[ 4833.637703]  [<ffffffff8184ee9e>] process_backlog+0xae/0x180
[ 4833.637707]  [<ffffffff8184e57f>] net_rx_action+0x26f/0x390
[ 4833.637713]  [<ffffffff812a41e5>] __do_softirq+0xf5/0x280
[ 4833.637719]  [<ffffffff8199042c>] call_softirq+0x1c/0x30
[ 4833.637723]  <EOI>  [<ffffffff8122f675>] do_softirq+0x65/0xa0
[ 4833.637730]  [<ffffffff812a363b>] __local_bh_enable_ip+0x9b/0xb0
[ 4833.637735]  [<ffffffff812a3667>] local_bh_enable+0x17/0x20
[ 4833.637741]  [<ffffffff81850065>] __dev_queue_xmit+0x245/0x650
[ 4833.637746]  [<ffffffff81972e28>] ? printk+0x60/0x77
[ 4833.637752]  [<ffffffff81850480>] dev_queue_xmit+0x10/0x20
[ 4833.637757]  [<ffffffff81957a75>] packet_sendmsg+0xf65/0x1210
[ 4833.637761]  [<ffffffff813d7524>] ? shmem_fault+0x84/0x1f0
[ 4833.637768]  [<ffffffff8182d3a6>] sock_sendmsg+0xb6/0xf0
[ 4833.637772]  [<ffffffff812e3696>] ? update_curr+0x86/0x1e0
[ 4833.637777]  [<ffffffff812e3ae9>] ? update_cfs_shares+0xa9/0xf0
[ 4833.637781]  [<ffffffff8122b621>] ? __switch_to+0x151/0x580
[ 4833.637786]  [<ffffffff8182dad1>] SYSC_sendto+0x121/0x1c0
[ 4833.637793]  [<ffffffff812c8d10>] ? hrtimer_get_res+0x50/0x50
[ 4833.637797]  [<ffffffff8197e54b>] ? do_nanosleep+0x5b/0x100
[ 4833.637802]  [<ffffffff8182f5ee>] SyS_sendto+0xe/0x10
[ 4833.637806]  [<ffffffff8198cede>] system_call_fastpath+0x25/0x2a

Looking forward to your reply.

Regards,
Ramana
