From: Olivier MATZ <olivier.matz@6wind.com>
To: "Chilikin, Andrey" <andrey.chilikin@intel.com>,
	"Liang, Cunming" <cunming.liang@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Subject: Re: [PATCH 05/18] mbuf: add function to get packet type from data
Date: Wed, 6 Jul 2016 14:08:48 +0200
Message-ID: <f378ba10-262d-87d3-7b66-cc6c043d5b5e@6wind.com>
In-Reply-To: <AAC06825A3B29643AF5372F5E0DDF0536446F464@IRSMSX106.ger.corp.intel.com>

Hi Andrey,

On 07/06/2016 01:59 PM, Chilikin, Andrey wrote:
> Hi Olivier,
> 
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier MATZ
>> Sent: Wednesday, July 6, 2016 8:43 AM
>> To: Liang, Cunming <cunming.liang@intel.com>; dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH 05/18] mbuf: add function to get packet type
>> from data
>>
>> Hi Cunming,
>>
>> On 07/06/2016 08:44 AM, Liang, Cunming wrote:
>>> Hi Olivier,
>>>
>>> On 7/5/2016 11:41 PM, Olivier Matz wrote:
>>>> Introduce the function rte_pktmbuf_get_ptype() that parses an mbuf and
>>>> returns its packet type. For now, the following packet types are
>>>> parsed:
>>>>     L2: Ether
>>>>     L3: IPv4, IPv6
>>>>     L4: TCP, UDP, SCTP
>>>>
>>>> The goal here is to provide a reference implementation for packet
>>>> type parsing. This function will be used by testpmd in the next commits,
>>>> making it possible to compare its result with the value given by the
>>>> hardware.
>>>>
>>>> This function will also be useful when implementing Rx offload
>>>> support in the virtio pmd. Indeed, the virtio protocol gives the csum
>>>> start and offset, but it does not give the L4 protocol, nor does it tell
>>>> whether the checksum is relevant for the inner or outer header. This
>>>> information has to be known to properly set the ol_flags in the mbuf.
>>>>
>>>> Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
>>>> Signed-off-by: Jean Dao <jean.dao@6wind.com>
>>>> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
>>>> ---
>>>>   doc/guides/rel_notes/release_16_11.rst |   5 +
>>>>   lib/librte_mbuf/Makefile               |   5 +-
>>>>   lib/librte_mbuf/rte_mbuf_ptype.c       | 234 +++++++++++++++++++++++++++++++++
>>>>   lib/librte_mbuf/rte_mbuf_ptype.h       |  43 ++++++
>>>>   lib/librte_mbuf/rte_mbuf_version.map   |   1 +
>>>>   5 files changed, 286 insertions(+), 2 deletions(-)
>>>>   create mode 100644 lib/librte_mbuf/rte_mbuf_ptype.c
>>>>
>>>> [...]
>>>> +
>>>> +/* parse mbuf data to get packet type */
>>>> +uint32_t rte_pktmbuf_get_ptype(const struct rte_mbuf *m,
>>>> +    struct rte_mbuf_hdr_lens *hdr_lens)
>>>> +{
>>>> +    struct rte_mbuf_hdr_lens local_hdr_lens;
>>>> +    const struct ether_hdr *eh;
>>>> +    struct ether_hdr eh_copy;
>>>> +    uint32_t pkt_type = RTE_PTYPE_L2_ETHER;
>>>> +    uint32_t off = 0;
>>>> +    uint16_t proto;
>>>> +
>>>> +    if (hdr_lens == NULL)
>>>> +        hdr_lens = &local_hdr_lens;
>>>> +
>>>> +    eh = rte_pktmbuf_read(m, off, sizeof(*eh), &eh_copy);
>>>> +    if (unlikely(eh == NULL))
>>>> +        return 0;
>>>> +    proto = eh->ether_type;
>>>> +    off = sizeof(*eh);
>>>> +    hdr_lens->l2_len = off;
>>>> +
>>>> +    if (proto == rte_cpu_to_be_16(ETHER_TYPE_IPv4)) {
>>>> +        const struct ipv4_hdr *ip4h;
>>>> +        struct ipv4_hdr ip4h_copy;
>>>> +
>>>> +        ip4h = rte_pktmbuf_read(m, off, sizeof(*ip4h), &ip4h_copy);
>>>> +        if (unlikely(ip4h == NULL))
>>>> +            return pkt_type;
>>>> +
>>>> +        pkt_type |= ptype_l3_ip(ip4h->version_ihl);
>>>> +        hdr_lens->l3_len = ip4_hlen(ip4h);
>>>> +        off += hdr_lens->l3_len;
>>>> +        if (ip4h->fragment_offset &
>>>> +                rte_cpu_to_be_16(IPV4_HDR_OFFSET_MASK |
>>>> +                    IPV4_HDR_MF_FLAG)) {
>>>> +            pkt_type |= RTE_PTYPE_L4_FRAG;
>>>> +            hdr_lens->l4_len = 0;
>>>> +            return pkt_type;
>>>> +        }
>>>> +        proto = ip4h->next_proto_id;
>>>> +        pkt_type |= ptype_l4(proto);
>>>> +    } else if (proto == rte_cpu_to_be_16(ETHER_TYPE_IPv6)) {
>>>> +        const struct ipv6_hdr *ip6h;
>>>> +        struct ipv6_hdr ip6h_copy;
>>>> +        int frag = 0;
>>>> +
>>>> +        ip6h = rte_pktmbuf_read(m, off, sizeof(*ip6h), &ip6h_copy);
>>>> +        if (unlikely(ip6h == NULL))
>>>> +            return pkt_type;
>>>> +
>>>> +        proto = ip6h->proto;
>>>> +        hdr_lens->l3_len = sizeof(*ip6h);
>>>> +        off += hdr_lens->l3_len;
>>>> +        pkt_type |= ptype_l3_ip6(proto);
>>>> +        if ((pkt_type & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV6_EXT) {
>>>> +            proto = skip_ip6_ext(proto, m, &off, &frag);
>>>> +            hdr_lens->l3_len = off - hdr_lens->l2_len;
>>>> +        }
>>>> +        if (proto == 0)
>>>> +            return pkt_type;
>>>> +        if (frag) {
>>>> +            pkt_type |= RTE_PTYPE_L4_FRAG;
>>>> +            hdr_lens->l4_len = 0;
>>>> +            return pkt_type;
>>>> +        }
>>>> +        pkt_type |= ptype_l4(proto);
>>>> +    }
>>>> +
>>>> +    if ((pkt_type & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_UDP) {
>>>> +        hdr_lens->l4_len = sizeof(struct udp_hdr);
>>>> +    } else if ((pkt_type & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP) {
>>>> +        const struct tcp_hdr *th;
>>>> +        struct tcp_hdr th_copy;
>>>> +
>>>> +        th = rte_pktmbuf_read(m, off, sizeof(*th), &th_copy);
>>>> +        if (unlikely(th == NULL))
>>>> +            return pkt_type & (RTE_PTYPE_L2_MASK |
>>>> +                RTE_PTYPE_L3_MASK);
>>>> +        hdr_lens->l4_len = (th->data_off & 0xf0) >> 2;
>>>> +    } else if ((pkt_type & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_SCTP) {
>>>> +        hdr_lens->l4_len = sizeof(struct sctp_hdr);
>>>> +    } else {
>>>> +        hdr_lens->l4_len = 0;
>>>> +    }
>>>> +
>>>> +    return pkt_type;
>>>> +}
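
The header-length arithmetic in the function above can be checked in
isolation. The following is a minimal standalone sketch, not code from the
patch: tcp_hdr_len() mirrors the expression applied to th->data_off, while
ipv4_hdr_len() is only an assumption about what the elided ip4_hlen()
helper computes. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * TCP header length in bytes, computed as in the patch. The upper nibble
 * of data_off counts 32-bit words, so (x >> 4) * 4 == (x & 0xf0) >> 2.
 */
static uint8_t tcp_hdr_len(uint8_t data_off)
{
	return (uint8_t)((data_off & 0xf0) >> 2);
}

/*
 * ASSUMPTION: the elided ip4_hlen() helper is taken to scale the IHL
 * field (lower nibble of version_ihl, in 32-bit words) to bytes.
 */
static uint8_t ipv4_hdr_len(uint8_t version_ihl)
{
	return (uint8_t)((version_ihl & 0x0f) * 4);
}

/* Spot checks for the common and the maximum header sizes. */
static void hdr_len_selfcheck(void)
{
	assert(tcp_hdr_len(0x50) == 20);  /* 5 words, no TCP options */
	assert(tcp_hdr_len(0xf0) == 60);  /* 15 words, maximum */
	assert(ipv4_hdr_len(0x45) == 20); /* IPv4, IHL 5, no IP options */
	assert(ipv4_hdr_len(0x4f) == 60); /* IPv4, IHL 15, maximum */
}
```

Both results top out at 60 bytes, so each fits in a uint8_t length field.
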
>>>> diff --git a/lib/librte_mbuf/rte_mbuf_ptype.h b/lib/librte_mbuf/rte_mbuf_ptype.h
>>>> index 4a34678..f468520 100644
>>>> --- a/lib/librte_mbuf/rte_mbuf_ptype.h
>>>> +++ b/lib/librte_mbuf/rte_mbuf_ptype.h
>>>> @@ -545,6 +545,49 @@ extern "C" {
>>>>           RTE_PTYPE_INNER_L3_MASK |                \
>>>>           RTE_PTYPE_INNER_L4_MASK))
>>>>
>>>> +struct rte_mbuf;
>>>> +
>>>> +/**
>>>> + * Structure containing header lengths associated to a packet.
>>>> + */
>>>> +struct rte_mbuf_hdr_lens {
>>>> +    uint8_t l2_len;
>>>> +    uint8_t l3_len;
>>>> +    uint8_t l4_len;
>>>> +    uint8_t tunnel_len;
>>>> +    uint8_t inner_l2_len;
>>>> +    uint8_t inner_l3_len;
>>>> +    uint8_t inner_l4_len;
>>>> +};
>>> [LC] The header parsing graph is usually not unique. The definition
>>> may be fine for basic IP and L4 tunnels.
>>> However, it can't scale to other cases, e.g. QinQ, MAC-in-MAC, MPLS
>>> L2/L3 tunnels.
>>> The parsing logic of "rte_pktmbuf_get_ptype()" and the definition of
>>> "struct rte_mbuf_hdr_lens" constitute a pair for one specific parser scheme.
>>> In this case, the fixed function supports the following:
>>>
>>> + * Supported packet types are:
>>> + *   L2: Ether
>>> + *   L3: IPv4, IPv6
>>> + *   L4: TCP, UDP, SCTP
>>>
>>> Of course, more packet type detection logic can be added in the future.
>>> But the more that is supported, the higher the cost.
>>>
>>> One alternative is to allow registering a parser pair. The application
>>> chooses either the predefined scheme (provided by the DPDK library) or
>>> defines its own parsing logic.
>>> In this way, a scheme can make assumptions for its specific case
>>> and skip useless graph detection.
>>> In addition, besides the SW parser, the HW parser (identified by
>>> packet_type in mbuf) can be turned on/off in the same manner.
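
To make the "parser pair" idea above concrete, here is a minimal
standalone sketch of what such a registration could look like. Nothing in
it is DPDK API: struct ptype_parser, ptype_parser_register(), ptype_parse()
and the PTYPE_L2_ETHER stand-in value are all hypothetical, and the packet
is a plain byte buffer rather than an mbuf. Each scheme pairs a parse
callback with the header-lengths layout it fills in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header-lengths layout owned by one scheme. */
struct hdr_lens {
	uint8_t l2_len;
	uint8_t l3_len;
	uint8_t l4_len;
};

/* Parse callback: returns a packet type, fills the scheme's own lens. */
typedef uint32_t (*ptype_parse_fn)(const uint8_t *data, size_t len,
	void *lens);

struct ptype_parser {
	const char *name;
	ptype_parse_fn parse;
};

#define MAX_PARSERS 8
static struct ptype_parser parsers[MAX_PARSERS];
static unsigned int nb_parsers;

/* Register a scheme; returns its handle, or -1 if the table is full. */
static int ptype_parser_register(const char *name, ptype_parse_fn parse)
{
	assert(parse != NULL);
	if (nb_parsers == MAX_PARSERS)
		return -1;
	parsers[nb_parsers].name = name;
	parsers[nb_parsers].parse = parse;
	return (int)nb_parsers++;
}

/* Run the scheme selected by the application; 0 means unknown. */
static uint32_t ptype_parse(int id, const uint8_t *data, size_t len,
	void *lens)
{
	if (id < 0 || (unsigned int)id >= nb_parsers)
		return 0;
	return parsers[id].parse(data, len, lens);
}

/* Example scheme that only recognizes an Ethernet header. */
#define PTYPE_L2_ETHER 0x00000001u /* stand-in value for illustration */
static uint32_t parse_ether_only(const uint8_t *data, size_t len, void *lens)
{
	struct hdr_lens *hl = lens;

	(void)data; /* this trivial scheme never inspects the bytes */
	if (len < 14)
		return 0;
	hl->l2_len = 14;
	return PTYPE_L2_ETHER;
}
```

An application would call ptype_parser_register("ether-only",
parse_ether_only) once at startup and pass the returned handle to
ptype_parse() per packet. The indirect call is the cost of this
flexibility compared to a single static function.
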
>>
>> Sorry, I'm not sure I fully understand what you are saying. If I understand
>> correctly, you would like something more flexible that supports registering
>> the protocols to be recognized?
>>
>> I'm not sure a function with a dynamic registration method would really
>> increase performance compared to a complete static function.
>> Actually, we will never support tons of protocols, since each layer's packet
>> type is 4 bits, and since it requires that at least one hw supports it.
> 
> This patch will be very useful as a reference implementation, but it also highlights an issue with the current implementation of packet type reporting by HW and SW: as you just mentioned, there are only 4 bits per layer. As these 4 bits are used as an enumeration, it is impossible to report multiple headers located on the same layer. MPLS is one example: different packets can have different numbers of MPLS labels, but this is impossible to report using the current packet_type structure.
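
The limitation described above can be illustrated with a small standalone
sketch. The three masks follow the 4-bits-per-layer layout discussed in
this thread; the PT_L2_PROTO_* enumerators and the helper names are
hypothetical and chosen only for illustration. Because each field is an
enumeration rather than a bitmap, OR-ing two values of the same layer does
not mean "both headers present"; it silently yields a third enumerator.

```c
#include <assert.h>
#include <stdint.h>

/* One 4-bit enumerated field per layer. */
#define PT_L2_MASK 0x0000000fu
#define PT_L3_MASK 0x000000f0u
#define PT_L4_MASK 0x00000f00u

/* Hypothetical L2 enumerators, for illustration only. */
#define PT_L2_PROTO_A 0x1u
#define PT_L2_PROTO_B 0x2u
#define PT_L2_PROTO_C 0x3u

/* Extract the single L2 value carried by a packet type. */
static uint32_t pt_l2(uint32_t ptype)
{
	return ptype & PT_L2_MASK;
}

/* Replace the L2 value; a second L2 header cannot be recorded. */
static uint32_t pt_set_l2(uint32_t ptype, uint32_t l2)
{
	assert((l2 & ~PT_L2_MASK) == 0);
	return (ptype & ~PT_L2_MASK) | l2;
}
```

Here pt_l2(PT_L2_PROTO_A | PT_L2_PROTO_B) equals PT_L2_PROTO_C, so a stack
of same-layer headers (such as several MPLS labels) cannot be expressed
without dedicating one enumerator to every combination.
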
> 
> It is possible, however, to program HW to report user (application) specific packet types. For example, for IPoMPLS with one MPLS label, HW will report packet type A, but for IPoMPLS with two MPLS labels, HW will report packet type B. In this case, instead of defining and supporting tons of statically defined (or enumerated) combinations of protocol headers, the application will register the packet types it expects from HW in addition to the standard packet types. At the moment, the high bits of packet_type are reserved, so one possible solution would be to use the highest bit to indicate a user-defined packet_type, specific to the application. It could then be used with HW and with the SW parser. For example, packet_type 0x8000000A is IPoMPLS with one MPLS label, 0x8000000B is IPoMPLS with two MPLS labels, and so on.
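
A minimal sketch of the highest-bit scheme suggested above, using the two
example values from the text. The macro and function names
(PTYPE_USER_FLAG, ptype_is_user(), mpls_label_count() and so on) are
hypothetical; no such definitions are implied to exist in DPDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: highest bit marks an application-defined packet type. */
#define PTYPE_USER_FLAG 0x80000000u

/* Example values taken from the description above. */
#define PTYPE_IP_O_MPLS_1 0x8000000Au /* IPoMPLS, one MPLS label */
#define PTYPE_IP_O_MPLS_2 0x8000000Bu /* IPoMPLS, two MPLS labels */

/* True if the value is application-defined, not a standard ptype. */
static bool ptype_is_user(uint32_t ptype)
{
	return (ptype & PTYPE_USER_FLAG) != 0;
}

/* Build a user-defined packet type from an application-chosen id. */
static uint32_t ptype_user_make(uint32_t id)
{
	assert((id & PTYPE_USER_FLAG) == 0);
	return PTYPE_USER_FLAG | id;
}

/* Recover the application-chosen id. */
static uint32_t ptype_user_id(uint32_t ptype)
{
	return ptype & ~PTYPE_USER_FLAG;
}

/* Application-side decoding: the meaning of each id is private to it. */
static unsigned int mpls_label_count(uint32_t ptype)
{
	switch (ptype) {
	case PTYPE_IP_O_MPLS_1:
		return 1;
	case PTYPE_IP_O_MPLS_2:
		return 2;
	default:
		return 0; /* standard or unknown packet type */
	}
}
```

With the flag set, the remaining 31 bits no longer follow the per-layer
layout, so code that applies the layer masks would first need to check
ptype_is_user().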

Thank you for the explanation. From your description, I wonder if the
flow director API recently proposed by Adrien [1] wouldn't solve this issue?

[1] http://dpdk.org/ml/archives/dev/2016-July/043365.html

Regards,
Olivier
