From: Simon Horman
Subject: Re: [PATCH v2 00/25] Generic flow API (rte_flow)
Date: Wed, 4 Jan 2017 10:53:50 +0100
Message-ID: <20170104095347.GA24762@penelope.horms.nl>
In-Reply-To: <20161222124804.GD10340@6wind.com>
References: <20161221161914.GA14515@penelope.horms.nl>
 <20161222124804.GD10340@6wind.com>
To: Adrien Mazarguil
Cc: dev@dpdk.org

On Thu, Dec 22, 2016 at 01:48:04PM +0100, Adrien Mazarguil wrote:
> On Wed, Dec 21, 2016 at 05:19:16PM +0100, Simon Horman wrote:
> > On Fri, Dec 16, 2016 at 05:24:57PM +0100, Adrien Mazarguil wrote:
> > > As previously discussed in RFC v1 [1], RFC v2 [2], with changes
> > > described in [3] (also pasted below), here is the first non-draft
> > > series for this new API.
> > >
> > > Its capabilities are so generic that its name had to be vague; it may
> > > be called "Generic flow API", "Generic flow interface" (possibly
> > > shortened as "GFI") to refer to the name of the new filter type, or
> > > "rte_flow" from the prefix used for its public symbols. I personally
> > > favor the latter.
> > >
> > > While it is currently meant to supersede existing filter types in
> > > order for all PMDs to expose a common filtering/classification
> > > interface, it may eventually evolve to cover the following ideas as
> > > well:
> > >
> > > - Rx/Tx offloads configuration through automatic offloads for
> > >   specific packets, e.g. performing checksum on TCP packets could be
> > >   expressed with an egress rule with a TCP pattern and a kind of
> > >   checksum action.
> > >
> > > - RSS configuration (already defined actually). Could be global or
> > >   per rule depending on hardware capabilities.
> > >
> > > - Switching configuration for devices with many physical ports; rules
> > >   doing both ingress and egress could even be used to completely
> > >   bypass software if supported by hardware.

Hi Adrien,

apologies for taking some time to reply; I was away on winter vacation.

> Hi Simon,
>
> > Hi Adrien,
> >
> > thanks for this valuable work.
> >
> > I would like to ask some high level questions on the proposal.
> > I apologise in advance if any of these questions are based on a
> > misunderstanding on my part.
> >
> > * I am wondering about provisions for actions to modify packet data or
> >   metadata. I do see support for marking packets. Is the implication of
> >   this that the main focus is to provide a mechanism for classification
> >   with the assumption that any actions - other than drop and variants
> >   of output - would be performed elsewhere?
>
> I'm not sure I understand what you mean by "elsewhere" here. Packet
> marking as currently defined is a purely ingress action, i.e. HW matches
> some packet and returns a user-defined tag in related meta-data that the
> PMD copies to the appropriate mbuf structure field before returning it
> to the application.

By elsewhere I meant in the application, sorry for being unclear.
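To check my understanding of the above, the sketch below (untested, written
against the structures in this series as I read them) is how I would expect
an application to install a MARK rule and pick the tag up again from the
mbuf. The pattern, mark id and queue index are arbitrary, and the
PKT_RX_FDIR_ID / hash.fdir.hi delivery is my reading of the documentation
rather than something I have verified:

#include <stdio.h>
#include <rte_mbuf.h>
#include <rte_flow.h>

/* Tag ingress UDP packets with id 42 and deliver them to queue 3. */
static struct rte_flow *
mark_udp_rule(uint8_t port_id, struct rte_flow_error *error)
{
    struct rte_flow_attr attr = { .ingress = 1 };
    struct rte_flow_item pattern[] = {
        { .type = RTE_FLOW_ITEM_TYPE_ETH },
        { .type = RTE_FLOW_ITEM_TYPE_IPV4 },
        { .type = RTE_FLOW_ITEM_TYPE_UDP },
        { .type = RTE_FLOW_ITEM_TYPE_END },
    };
    struct rte_flow_action_mark mark = { .id = 42 };
    struct rte_flow_action_queue queue = { .index = 3 };
    struct rte_flow_action actions[] = {
        { .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &mark },
        { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };

    return rte_flow_create(port_id, &attr, pattern, actions, error);
}

/* Application side: the PMD is expected to have copied the tag into the
 * mbuf before the packet is returned by rte_eth_rx_burst(). */
static void
handle_mark(struct rte_mbuf *m)
{
    if (m->ol_flags & PKT_RX_FDIR_ID)
        printf("mark id %u\n", (unsigned int)m->hash.fdir.hi);
}

Please correct me if the mbuf side is not how the tag is meant to be
delivered.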
> There is provision for egress rules and I wrote down a few ideas
> describing how they could be useful (as above), however they remain to
> be defined.
>
> > If so I would observe that this seems somewhat limiting in the case of
> > hardware that can perform a richer set of actions. And it seems
> > particularly limiting on egress as there doesn't seem to be anywhere
> > else that other actions could be performed after classification is
> > performed by this API.
>
> A single flow rule may contain any number of distinct actions. For
> egress, it means you could wrap matching packets in VLAN and VXLAN at
> once.
>
> If you wanted to perform the same action twice on matching packets,
> you'd have to provide two rules with defined priorities and use a
> non-terminating action for the first one:
>
> - Rule with priority 0: match UDP -> add VLAN 42, passthrough
> - Rule with priority 1: match UDP -> add VLAN 64, terminating
>
> This is how automatic QinQ would be defined for outgoing UDP packets.

Ok, understood. I have two follow-up questions:

1. Is the "add VLAN" action included at this time? I was not able to
   find it.
2. Was consideration given to allowing the same action to be used more
   than once in a single rule? I see there would be some advantage to
   that if classification is expensive.

> > * I am curious to know what considerations have been given to
> >   supporting tunnelling (encapsulation and decapsulation of e.g.
> >   VXLAN), tagging (pushing and popping e.g. VLANs), and labels
> >   (pushing or popping e.g. MPLS).
> >
> >   Such features would seem useful for application of this work in a
> >   variety of situations including overlay networks and VNFs.
>
> This is also what I had in mind and we'd only have to define specific
> ingress/egress actions for these. Currently rte_flow only implements a
> basic set of existing features from the legacy filtering framework, but
> is meant to be extended.

Thanks. I think that answers most of my questions: what I see as missing
in terms of actions can be added.

> > * I am wondering if any thought has gone into supporting matching on
> >   the n-th instance of a field that may appear more than once:
> >   e.g. VLAN tag.
>
> Sure, please see the latest documentation [1] and testpmd examples [2].
> Pattern items being stacked in the same order as protocol layers,
> matching specific QinQ traffic and redirecting it to some queue could be
> expressed with something like:
>
> testpmd> flow create 0 ingress pattern eth / vlan vid is 64 / vlan vid is 42 / end
>    actions queue 6 / end
>
> Such a rule is translated as-is to rte_flow pattern items and action
> structures.

Thanks, I will look over that.

> > With the above questions in mind I am curious to know what use-cases
> > the proposal is targeted at.
>
> Well, it should be easier to answer if you have a specific use-case in
> mind you would like to support but that cannot be expressed with the API
> as defined in [1], in which case please share it with the community.

A use-case would be implementing OvS DPIF flow offload using this API.

> [1] http://dpdk.org/ml/archives/dev/2016-December/052954.html
> [2] http://dpdk.org/ml/archives/dev/2016-December/052975.html
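As an aside, to make sure I am reading the pattern and action structures
correctly, this is how I would expect the QinQ testpmd rule above to look
when expressed directly through the API. It is an untested sketch; the tci
field name, the network byte order of the spec values and the 0x0fff VID
mask are my reading of the v2 series and may well be wrong:

#include <rte_byteorder.h>
#include <rte_flow.h>

/* Match outer VLAN VID 64 / inner VLAN VID 42 and send to queue 6. */
static struct rte_flow *
qinq_to_queue_6(uint8_t port_id, struct rte_flow_error *error)
{
    struct rte_flow_attr attr = { .ingress = 1 };
    /* VID is the low 12 bits of the TCI, hence the 0x0fff mask. */
    struct rte_flow_item_vlan outer = { .tci = rte_cpu_to_be_16(64) };
    struct rte_flow_item_vlan inner = { .tci = rte_cpu_to_be_16(42) };
    struct rte_flow_item_vlan vid_mask = { .tci = rte_cpu_to_be_16(0x0fff) };
    struct rte_flow_item pattern[] = {
        { .type = RTE_FLOW_ITEM_TYPE_ETH },
        { .type = RTE_FLOW_ITEM_TYPE_VLAN,
          .spec = &outer, .mask = &vid_mask },
        { .type = RTE_FLOW_ITEM_TYPE_VLAN,
          .spec = &inner, .mask = &vid_mask },
        { .type = RTE_FLOW_ITEM_TYPE_END },
    };
    struct rte_flow_action_queue queue = { .index = 6 };
    struct rte_flow_action actions[] = {
        { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };

    return rte_flow_create(port_id, &attr, pattern, actions, error);
}

If that is roughly right then I think I understand the stacking model.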