From: David Miller <davem@davemloft.net>
To: gerlitz.or@gmail.com
Cc: brouer@redhat.com, tom@herbertland.com, eric.dumazet@gmail.com,
	edumazet@google.com, netdev@vger.kernel.org,
	alexander.duyck@gmail.com, alexei.starovoitov@gmail.com,
	borkmann@iogearbox.net, marek@cloudflare.com,
	hannes@stressinduktion.org, fw@strlen.de, pabeni@redhat.com,
	john.r.fastabend@intel.com, amirva@gmail.com,
	matanb@mellanox.com
Subject: Re: Optimizing instruction-cache, more packets at each stage
Date: Thu, 21 Jan 2016 10:56:08 -0800 (PST)
Message-ID: <20160121.105608.180935122763399438.davem@davemloft.net>
In-Reply-To: <CAJ3xEMgPi-P6R39LOkBM67SehQE0MLxDXRB7qxCV5Y4WLORpXA@mail.gmail.com>

From: Or Gerlitz <gerlitz.or@gmail.com>
Date: Thu, 21 Jan 2016 14:49:25 +0200

> On Thu, Jan 21, 2016 at 1:27 PM, Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
>> On Wed, 20 Jan 2016 15:27:38 -0800 Tom Herbert <tom@herbertland.com> wrote:
>>
>>> eth_type_trans touches headers
>>
>> True, the eth_type_trans() call in the driver is a major bottleneck,
>> because it touches the packet header and happens very early in the driver.
>>
>> In my experiments, I extract several packets before calling
>> napi_gro_receive(), and I also delay calling eth_type_trans().  Most of
>> my speedup comes from this trick, as the prefetch() now has enough
>> time.
>>
>>  while ((skb = __skb_dequeue(&rx_skb_list)) != NULL) {
>>         skb->protocol = eth_type_trans(skb, rq->netdev);
>>         napi_gro_receive(cq->napi, skb);
>>  }
>>
>> What if the HW could provide the info we need in the descriptor?
>>
>>
>> eth_type_trans() does two things:
>>
>> 1) determine skb->protocol
>> 2) set up skb->pkt_type = PACKET_{BROADCAST,MULTICAST,OTHERHOST}
>>
>> Could the HW descriptor deliver the "proto", or perhaps just some bits
>> for the most common protos?
>>
>> The skb->pkt_type doesn't need many bits.  And I bet the HW already has
>> the information.  The BROADCAST and MULTICAST indications are easy.  The
>> PACKET_OTHERHOST case can be turned around by instead setting a PACKET_HOST
>> indication if eth->h_dest matches the device's dev->dev_addr (else a
>> SW compare is required).
>>
>> Is that doable in hardware?
> 
> As I wrote earlier, for determination of the eth-type HWs can do what you ask
> here and more.
> 
> Whether the protocol is IP or not (and only then you look in the data) you
> could get, I guess, from many NICs, e.g. if the NIC sets PKT_HASH_TYPE_L4
> or PKT_HASH_TYPE_L3 then we know it's an IP packet, and only if
> we don't see this indication do we look into the data.

This doesn't differentiate IPv4 from IPv6, which is critical here, so this
mechanism is not sufficient.

We must know the exact ETH_P_* value.
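
To make the point concrete, here is a minimal user-space sketch, not driver
code. The struct rx_desc_hint layout and the resolve_protocol() and
ethertype_from_frame() helpers are hypothetical names invented for this
illustration; only the ETH_P_* values and the pkt_hash_types names match the
kernel. For simplicity the protocol is kept in host byte order, whereas
skb->protocol is big-endian, and the 802.3 length and VLAN handling that
eth_type_trans() performs is left out. A hash type of L3 or L4 says nothing
about IPv4 vs. IPv6, so the fast path below only triggers when the descriptor
carries the exact EtherType.

```c
#include <stddef.h>
#include <stdint.h>

/* Real EtherType values, as defined in linux/if_ether.h. */
#define ETH_P_IP    0x0800
#define ETH_P_IPV6  0x86DD
#define ETH_HLEN    14

/* Mirrors the names in the kernel's enum pkt_hash_types. */
enum pkt_hash_types {
	PKT_HASH_TYPE_NONE,
	PKT_HASH_TYPE_L2,
	PKT_HASH_TYPE_L3,
	PKT_HASH_TYPE_L4,
};

/* HYPOTHETICAL RX descriptor fields, for illustration only. */
struct rx_desc_hint {
	enum pkt_hash_types hash_type; /* L3/L4 implies "some IP", not which one */
	uint16_t eth_proto;            /* exact ETH_P_* (host order), 0 if absent */
};

/* Slow path: read the EtherType from the frame, which touches packet data. */
static uint16_t ethertype_from_frame(const uint8_t *frame, size_t len)
{
	if (len < ETH_HLEN)
		return 0;
	return (uint16_t)((frame[12] << 8) | frame[13]);
}

/*
 * Use the descriptor only when it reports the exact protocol.
 * hash_type is deliberately ignored: it cannot tell ETH_P_IP from ETH_P_IPV6.
 */
static uint16_t resolve_protocol(const struct rx_desc_hint *d,
				 const uint8_t *frame, size_t len)
{
	if (d->eth_proto)
		return d->eth_proto;
	return ethertype_from_frame(frame, len);
}
```

With only hash_type set, resolve_protocol() still has to read the header, which
is exactly the cache miss the descriptor hint was supposed to avoid.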
