From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Saeed Mahameed <saeedm@mellanox.com>
Cc: David Miller <davem@davemloft.net>,
	Network Development <netdev@vger.kernel.org>,
	ogerlitz@mellanox.com, borisp@mellanox.com, yossiku@mellanox.com,
	Alexander Duyck <alexander.duyck@gmail.com>
Subject: Re: [net-next 01/12] net/mlx5e: Add UDP GSO support
Date: Fri, 29 Jun 2018 18:19:02 -0400	[thread overview]
Message-ID: <CAF=yD-LHN6GS7aGBm3enQDJNrvrUp6aGGMcr7bHOR0PNrLqKJA@mail.gmail.com> (raw)
In-Reply-To: <20180628215103.9141-2-saeedm@mellanox.com>

On Fri, Jun 29, 2018 at 2:24 AM Saeed Mahameed <saeedm@mellanox.com> wrote:
>
> From: Boris Pismenny <borisp@mellanox.com>
>
> This patch enables UDP GSO support. We enable it by using two WQEs:
> the first is a UDP LSO WQE for all segments of equal length, and the
> second is for the last segment in case it has a different length.
> Due to a HW limitation, we must adjust the packet length fields before sending.
>
> We measure performance between two Intel(R) Xeon(R) CPU E5-2643 v2 @3.50GHz
> machines connected back-to-back with ConnectX-4 Lx (40 Gbps) NICs.
> We compare single-stream UDP, UDP GSO, and UDP GSO with offload.
> Performance:
>                 | MSS (bytes)   | Throughput (Gbps)     | CPU utilization (%)
> UDP GSO offload | 1472          | 35.6                  | 8%
> UDP GSO         | 1472          | 25.5                  | 17%
> UDP             | 1472          | 10.2                  | 17%
> UDP GSO offload | 1024          | 35.6                  | 8%
> UDP GSO         | 1024          | 19.2                  | 17%
> UDP             | 1024          | 5.7                   | 17%
> UDP GSO offload | 512           | 33.8                  | 16%
> UDP GSO         | 512           | 10.4                  | 17%
> UDP             | 512           | 3.5                   | 17%

Very nice results :)

> +static void mlx5e_udp_gso_prepare_last_skb(struct sk_buff *skb,
> +                                          struct sk_buff *nskb,
> +                                          int remaining)
> +{
> +       int bytes_needed = remaining, remaining_headlen, remaining_page_offset;
> +       int headlen = skb_transport_offset(skb) + sizeof(struct udphdr);
> +       int payload_len = remaining + sizeof(struct udphdr);
> +       int k = 0, i, j;
> +
> +       skb_copy_bits(skb, 0, nskb->data, headlen);
> +       nskb->dev = skb->dev;
> +       skb_reset_mac_header(nskb);
> +       skb_set_network_header(nskb, skb_network_offset(skb));
> +       skb_set_transport_header(nskb, skb_transport_offset(skb));
> +       skb_set_tail_pointer(nskb, headlen);
> +
> +       /* How many frags do we need? */
> +       for (i = skb_shinfo(skb)->nr_frags - 1; i >= 0; i--) {
> +               bytes_needed -= skb_frag_size(&skb_shinfo(skb)->frags[i]);
> +               k++;
> +               if (bytes_needed <= 0)
> +                       break;
> +       }
> +
> +       /* Fill the first frag and split it if necessary */
> +       j = skb_shinfo(skb)->nr_frags - k;
> +       remaining_page_offset = -bytes_needed;
> +       skb_fill_page_desc(nskb, 0,
> +                          skb_shinfo(skb)->frags[j].page.p,
> +                          skb_shinfo(skb)->frags[j].page_offset + remaining_page_offset,
> +                          skb_shinfo(skb)->frags[j].size - remaining_page_offset);
> +
> +       skb_frag_ref(skb, j);
> +
> +       /* Fill the rest of the frags */
> +       for (i = 1; i < k; i++) {
> +               j = skb_shinfo(skb)->nr_frags - k + i;
> +
> +               skb_fill_page_desc(nskb, i,
> +                                  skb_shinfo(skb)->frags[j].page.p,
> +                                  skb_shinfo(skb)->frags[j].page_offset,
> +                                  skb_shinfo(skb)->frags[j].size);
> +               skb_frag_ref(skb, j);
> +       }
> +       skb_shinfo(nskb)->nr_frags = k;
> +
> +       remaining_headlen = remaining - skb->data_len;
> +
> +       /* headlen contains remaining data? */
> +       if (remaining_headlen > 0)
> +               skb_copy_bits(skb, skb->len - remaining, nskb->data + headlen,
> +                             remaining_headlen);
> +       nskb->len = remaining + headlen;
> +       nskb->data_len =  payload_len - sizeof(struct udphdr) +
> +               max_t(int, 0, remaining_headlen);
> +       nskb->protocol = skb->protocol;
> +       if (nskb->protocol == htons(ETH_P_IP)) {
> +               ip_hdr(nskb)->id = htons(ntohs(ip_hdr(nskb)->id) +
> +                                        skb_shinfo(skb)->gso_segs);
> +               ip_hdr(nskb)->tot_len =
> +                       htons(payload_len + sizeof(struct iphdr));
> +       } else {
> +               ipv6_hdr(nskb)->payload_len = htons(payload_len);
> +       }
> +       udp_hdr(nskb)->len = htons(payload_len);
> +       skb_shinfo(nskb)->gso_size = 0;
> +       nskb->ip_summed = skb->ip_summed;
> +       nskb->csum_start = skb->csum_start;
> +       nskb->csum_offset = skb->csum_offset;
> +       nskb->queue_mapping = skb->queue_mapping;
> +}
> +
> +/* might send skbs and update wqe and pi */
> +struct sk_buff *mlx5e_udp_gso_handle_tx_skb(struct net_device *netdev,
> +                                           struct mlx5e_txqsq *sq,
> +                                           struct sk_buff *skb,
> +                                           struct mlx5e_tx_wqe **wqe,
> +                                           u16 *pi)
> +{
> +       int payload_len = skb_shinfo(skb)->gso_size + sizeof(struct udphdr);
> +       int headlen = skb_transport_offset(skb) + sizeof(struct udphdr);
> +       int remaining = (skb->len - headlen) % skb_shinfo(skb)->gso_size;
> +       struct sk_buff *nskb;
> +
> +       if (skb->protocol == htons(ETH_P_IP))
> +               ip_hdr(skb)->tot_len = htons(payload_len + sizeof(struct iphdr));
> +       else
> +               ipv6_hdr(skb)->payload_len = htons(payload_len);
> +       udp_hdr(skb)->len = htons(payload_len);
> +       if (!remaining)
> +               return skb;
> +
> +       nskb = alloc_skb(max_t(int, headlen, headlen + remaining - skb->data_len), GFP_ATOMIC);
> +       if (unlikely(!nskb)) {
> +               sq->stats->dropped++;
> +               return NULL;
> +       }
> +
> +       mlx5e_udp_gso_prepare_last_skb(skb, nskb, remaining);
> +
> +       skb_shinfo(skb)->gso_segs--;
> +       pskb_trim(skb, skb->len - remaining);
> +       mlx5e_sq_xmit(sq, skb, *wqe, *pi);
> +       mlx5e_sq_fetch_wqe(sq, wqe, pi);
> +       return nskb;
> +}

The device driver seems to be implementing the packet split here,
similar to NETIF_F_GSO_PARTIAL. When advertising the right flag, the
stack should be able to do that split for you and pass two packets to
the driver.
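For reference, a rough sketch of what advertising that might look like in
the driver's init path (feature bits modeled on other GSO_PARTIAL drivers,
not taken from this patch):

```c
/* Sketch only: with NETIF_F_GSO_PARTIAL advertised, the core stack
 * itself carves off the odd-sized trailing segment, so the driver only
 * ever sees equal-length segments plus one separate tail skb. */
netdev->hw_features          |= NETIF_F_GSO_UDP_L4;
netdev->gso_partial_features |= NETIF_F_GSO_UDP_L4;
netdev->hw_features          |= NETIF_F_GSO_PARTIAL;
netdev->features             |= NETIF_F_GSO_UDP_L4;
```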
