From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Chandran, Sugesh" <sugesh.chandran@intel.com>
Subject: Re: [PATCH 01/22] ethdev: introduce generic flow API
Date: Tue, 6 Dec 2016 18:11:38 +0000
Message-ID: <2EF2F5C0CC56984AA024D0B180335FCB13EC330D@IRSMSX102.ger.corp.intel.com>
References: <cover.1471632644.git.adrien.mazarguil@6wind.com>
 <cover.1479309719.git.adrien.mazarguil@6wind.com>
 <1c8a8e4fec73ed33836f1da9525b1b8b53048518.1479309720.git.adrien.mazarguil@6wind.com>
 <59393e58-6c85-d2e5-1aab-a721fe9c933e@redhat.com>
 <20161201083652.GI10340@6wind.com>
 <7f65ba09-e6fe-d97a-6ab5-97e84a828a81@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Cc: "dev@dpdk.org" <dev@dpdk.org>, Thomas Monjalon
 <thomas.monjalon@6wind.com>, "De Lara Guarch, Pablo"
 <pablo.de.lara.guarch@intel.com>, Olivier Matz <olivier.matz@6wind.com>,
 "sugesh.chandran@intel.comn" <sugesh.chandran@intel.comn>
To: Kevin Traynor <ktraynor@redhat.com>, Adrien Mazarguil
 <adrien.mazarguil@6wind.com>
Return-path: <dev-bounces@dpdk.org>
Received: from mga14.intel.com (mga14.intel.com [192.55.52.115])
 by dpdk.org (Postfix) with ESMTP id 3F02E2A66
 for <dev@dpdk.org>; Tue,  6 Dec 2016 19:17:46 +0100 (CET)
In-Reply-To: <7f65ba09-e6fe-d97a-6ab5-97e84a828a81@redhat.com>
Content-Language: en-US
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

Hi Adrien,
Thanks for sending out the patches,

Please find few comments below,


Regards
_Sugesh


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Kevin Traynor
> Sent: Friday, December 2, 2016 9:07 PM
> To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas.monjalon@6wind.com>; De
> Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Olivier Matz
> <olivier.matz@6wind.com>; sugesh.chandran@intel.comn
> Subject: Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
>=20
>>>>>>Snipp
> >>> + *
> >>> + * Attaches a 32 bit value to packets.
> >>> + *
> >>> + * This value is arbitrary and application-defined. For
> >>> +compatibility with
> >>> + * FDIR it is returned in the hash.fdir.hi mbuf field.
> >>> +PKT_RX_FDIR_ID is
> >>> + * also set in ol_flags.
> >>> + */
> >>> +struct rte_flow_action_mark {
> >>> +	uint32_t id; /**< 32 bit value to return with packets. */ };
> >>
> >> One use case I thought we would be able to do for OVS is
> >> classification in hardware and the unique flow id is sent with the pac=
ket to
> software.
> >> But in OVS the ufid is 128 bits, so it means we can't and there is
> >> still the miniflow extract overhead. I'm not sure if there is a
> >> practical way around this.
> >>
> >> Sugesh (cc'd) has looked at this before and may be able to comment or
> >> correct me.
> >
> > Yes, we settled on 32 bit because currently no known hardware
> > implementation supports more than this. If that changes, another
> > action with a larger type shall be provided (no ABI breakage).
> >
> > Also since even 64 bit would not be enough for the use case you
> > mention, there is no choice but use this as an indirect value (such as
> > an array or hash table index/value).
>=20
> ok, cool. I think Sugesh has other ideas anyway!
[Sugesh] It should be fine with 32 bit . we can manage it in OVS accordingl=
y.
>=20
> >
> > [...]
> >>> +/**
> >>> + * RTE_FLOW_ACTION_TYPE_RSS
> >>> + *
> >>> +
> >>> + *
> >>> + * Terminating by default.
> >>> + */
> >>> +struct rte_flow_action_vf {
> >>> +	uint32_t original:1; /**< Use original VF ID if possible. */
> >>> +	uint32_t reserved:31; /**< Reserved, must be zero. */
> >>> +	uint32_t id; /**< VF ID to redirect packets to. */ };
> > [...]
> >>> +/**
> >>> + * Check whether a flow rule can be created on a given port.
> >>> + *
> >>> + * While this function has no effect on the target device, the flow
> >>> +rule is
> >>> + * validated against its current configuration state and the
> >>> +returned value
> >>> + * should be considered valid by the caller for that state only.
> >>> + *
> >>> + * The returned value is guaranteed to remain valid only as long as
> >>> +no
> >>> + * successful calls to rte_flow_create() or rte_flow_destroy() are
> >>> +made in
> >>> + * the meantime and no device parameter affecting flow rules in any
> >>> +way are
> >>> + * modified, due to possible collisions or resource limitations
> >>> +(although in
> >>> + * such cases EINVAL should not be returned).
> >>> + *
> >>> + * @param port_id
> >>> + *   Port identifier of Ethernet device.
> >>> + * @param[in] attr
> >>> + *   Flow rule attributes.
> >>> + * @param[in] pattern
> >>> + *   Pattern specification (list terminated by the END pattern item)=
.
> >>> + * @param[in] actions
> >>> + *   Associated actions (list terminated by the END action).
> >>> + * @param[out] error
> >>> + *   Perform verbose error reporting if not NULL.
> >>> + *
> >>> + * @return
> >>> + *   0 if flow rule is valid and can be created. A negative errno va=
lue
> >>> + *   otherwise (rte_errno is also set), the following errors are def=
ined:
> >>> + *
> >>> + *   -ENOSYS: underlying device does not support this functionality.
> >>> + *
> >>> + *   -EINVAL: unknown or invalid rule specification.
> >>> + *
> >>> + *   -ENOTSUP: valid but unsupported rule specification (e.g. partia=
l
> >>> + *   bit-masks are unsupported).
> >>> + *
> >>> + *   -EEXIST: collision with an existing rule.
> >>> + *
> >>> + *   -ENOMEM: not enough resources.
> >>> + *
> >>> + *   -EBUSY: action cannot be performed due to busy device resources=
,
> may
> >>> + *   succeed if the affected queues or even the entire port are in a
> stopped
> >>> + *   state (see rte_eth_dev_rx_queue_stop() and
> rte_eth_dev_stop()).
> >>> + */
> >>> +int
> >>> +rte_flow_validate(uint8_t port_id,
> >>> +		  const struct rte_flow_attr *attr,
> >>> +		  const struct rte_flow_item pattern[],
> >>> +		  const struct rte_flow_action actions[],
> >>> +		  struct rte_flow_error *error);
> >>
> >> Why not just use rte_flow_create() and get an error? Is it less
> >> disruptive to do a validate and find the rule cannot be created, than
> >> using a create directly?
> >
> > The rationale can be found in the original RFC, which I'll convert to
> > actual documentation in v2. In short:
> >
> > - Calling rte_flow_validate() before rte_flow_create() is useless since
> >   rte_flow_create() also performs validation.
> >
> > - We cannot possibly express a full static set of allowed flow rules, e=
ven
> >   if we could, it usually depends on the current hardware configuration
> >   therefore would not be static.
> >
> > - rte_flow_validate() is thus provided as a replacement for capability
> >   flags. It can be used to determine during initialization if the under=
lying
> >   device can support the typical flow rules an application might want t=
o
> >   provide later and do something useful with that information (e.g. alw=
ays
> >   use software fallback due to HW limitations).
> >
> > - rte_flow_validate() being a subset of rte_flow_create(), it is essent=
ially
> >   free to expose.
>=20
> make sense now, thanks.
[Sugesh] : We had this discussion earlier at the design stage about the tim=
e taken for programming the hardware,
and how to make it deterministic. How about having a timeout parameter as w=
ell for the rte_flow_*
If the hardware flow insert is timed out, error out than waiting indefinite=
ly, so that application have some control over
The time to program the flow. It can be another set of APIs something like,=
 rte_flow_create_timeout()

Are you going to provide any control over the initialization of NIC  to def=
ine the capability matrices
For eg; To operate in a L3 router mode,  software wanted to initialize the =
NIC port only to consider the L2 and L3 fields.
I assume the initialization is done based on the first rules that are progr=
ammed into the NIC.?
>=20
> >
> >>> +
> >>> +/**
> >>> + * Create a flow rule on a given port.
> >>> + *
> >>> + * @param port_id
> >>> + *   Port identifier of Ethernet device.
> >>> + * @param[in] attr
> >>> + *   Flow rule attributes.
> >>> + * @param[in] pattern
> >>> + *   Pattern specification (list terminated by the END pattern item)=
.
> >>> + * @param[in] actions
> >>> + *   Associated actions (list terminated by the END action).
> >>> + * @param[out] error
> >>> + *   Perform verbose error reporting if not NULL.
> >>> + *
> >>> + * @return
> >>> + *   A valid handle in case of success, NULL otherwise and rte_errno=
 is
> set
> >>> + *   to the positive version of one of the error codes defined for
> >>> + *   rte_flow_validate().
> >>> + */
> >>> +struct rte_flow *
> >>> +rte_flow_create(uint8_t port_id,
> >>> +		const struct rte_flow_attr *attr,
> >>> +		const struct rte_flow_item pattern[],
> >>> +		const struct rte_flow_action actions[],
> >>> +		struct rte_flow_error *error);
> >>
> >> General question - are these functions threadsafe? In the OVS example
> >> you could have several threads wanting to create flow rules at the
> >> same time for same or different ports.
> >
> > No they aren't, applications have to perform their own locking. The
> > RFC (to be converted to actual documentation in v2) says that:
> >
> > - API operations are synchronous and blocking (``EAGAIN`` cannot be
> >   returned).
> >
> > - There is no provision for reentrancy/multi-thread safety, although
> nothing
> >   should prevent different devices from being configured at the same
> >   time. PMDs may protect their control path functions accordingly.
>=20
> other comment above wrt locking.
>=20
> >