All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bruce Richardson <bruce.richardson@intel.com>
To: Jan Blunck <jblunck@infradead.org>
Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>,
	Thomas Monjalon <thomas.monjalon@6wind.com>,
	Aaron Conole <aconole@redhat.com>,
	dev@dpdk.org, Stephen Hemminger <stephen@networkplumber.org>
Subject: Re: [PATCH 25/25] rte_eal_init: add info about rte_errno codes
Date: Wed, 1 Feb 2017 14:18:40 +0000	[thread overview]
Message-ID: <20170201141839.GA162640@bricha3-MOBL3.ger.corp.intel.com> (raw)
In-Reply-To: <CALe+Z02jX22VEoH55JTKuNPJbe+99-1oWdA_goZkz_y4cZGs=Q@mail.gmail.com>

On Wed, Feb 01, 2017 at 01:06:03PM +0100, Jan Blunck wrote:
> On Wed, Feb 1, 2017 at 11:54 AM, Adrien Mazarguil
> <adrien.mazarguil@6wind.com> wrote:
> > On Mon, Jan 30, 2017 at 09:19:29PM +0100, Thomas Monjalon wrote:
> >> 2017-01-30 13:38, Aaron Conole:
> >> > Stephen Hemminger <stephen@networkplumber.org> writes:
> >> > > Bruce Richardson <bruce.richardson@intel.com> wrote:
> >> > >> On Fri, Jan 27, 2017 at 08:33:46AM -0800, Stephen Hemminger wrote:
> >> > >> > Why use rte_errno?
> >> > >> > Most DPDK calls just return negative value on error which
> >> > >> > corresponds to error number.
> >> > >> > Are you trying to keep ABI compatibility? Doesn't make sense
> >> > >> > because before all these
> >> > >> > errors were panic's no working application is going to care.
> >> > >>
> >> > >> Either will work, but I actually prefer this way. I view using rte_errno
> >> > >> to be better as it can work in just about all cases, including with
> >> > >> functions which return pointers. This allows you to have a standard
> >> > >> method across all functions for returning error codes, and it only
> >> > >> requires a single sentinal value to indicate error, rather than using a
> >> > >> whole range of values.
> >> > >
> >> > > The problem is DPDK is getting more inconsistent on how this is done.
> >> > > As long as error returns are always same as kernel/glibc errno's it really doesn't
> >> > > matter much which way the value is returned from a technical point of view
> >> > > but the inconsistency is sure to be a usability problem and source of errors.
> >> >
> >> > I am using rte_errno here because I assumed it was the preferred
> >> > method.  In fact, looking at some recently contributed modules (for
> >> > instance pdump), it seems that folks are using it.
> >> >
> >> > I'm not really sure the purpose of having rte_errno if it isn't used, so
> >> > it'd be helpful to know if there's some consensus on reflecting errors
> >> > via this variable, or on returning error codes.  Whichever is the more
> >> > consistent with the way the DPDK project does things, I'm game :).
> >>
> >> I think we can use both return value and rte_errno.
> >> We could try to enforce rte_errno as mandatory everywhere.
> >>
> >> Adrien did the recent rte_flow API.
> >> Please Adrien, could you give your thought?
> >
> > Sure, actually as already pointed out in this thread, both approaches have
> > pros and cons depending on the use-case.
> >
> > Through return value:
> >
> > Pros
> > ----
> >
> > - Most common approach used in DPPK today.
> > - Used internally by the Linux kernel (negative errno) and in the pthreads
> >   library (positive errno).
> > - Avoids the need to access an external, global variable requiring its own
> >   thread-local storage.
> > - Inherently thread-safe and reentrant (i.e. safe with signal handlers).
> > - Returned value is also the error code, two facts reported at once.
> 
> Caller can decide to ignore return value if no error handling is wanted.
>
Not always the case. In the case of a rx or tx burst call, if there is a
negative error that must be checked for or assigned to zero in some
cases to make other logic in the path work sanely, e.g. updating an
array of stats using the return value.

> >
> > Cons
> > ----
> >
> > - Difficult to use with functions returning anything other than signed
> >   integers with negative values having no other meaning.
> > - The returned value must be assigned to a local variable in order not to
> >   discard it and process it later most of the time.
> 
> I believe this is Pro since the rte_errno even needs to assign to a
> thread-local variable even.

No, it's a con, since for errno the value will be preserved in the
absense of other errors. The application can delay handling the error as
long as it wants, in the absense of causes of subsequent errors.

> 
> > - All function calls must be tested for errors.
> 
> The rte_errno needs to do this too to decide if it needs to assign a
> value to rte_errno.
> 
Thats inside the called function, not the application. See my earlier
comment above about having to check your return value is in the valid
"logical range" expected from the call. Having a negative number of
packets received does not make logical sense, so you have to check the
return value when updating stats etc.


> >
> > Through rte_errno:
> >
> > Pros
> > ----
> >
> > - errno-like, well known behavior defined by the C standard and used
> >   everywhere in the C library.
> > - Testing return values is not mandatory, e.g. rte_errno can be initialized
> >   to zero before calling a group of functions and checking its value
> >   afterward (rte_errno is only updated in case of error).
> > - Assigning a local variable to store its value is not necessary as long as
> >   another function that may affect rte_errno is not called.
> >
> > Cons
> > ----
> >
> > - Not fully reentrant, thread-safety is fine for most purposes but signal
> >   handlers affecting it still cause undefined behavior (they must at least
> >   save and restore its value in case they modify it).
> > - Accessing non-local storage may affect CPU cycle-sensitive functions such
> >   as TX/RX burst.
> 
> Actually testing for errors mean you also have to reset the rte_errno
> variable before. That also means you have to access thread-local
> storage twice.
> 
Not true. Your return value still indicates an error via a single
sentinal value. Only in that case do you (the app) access the global value,
to find out the exact error reason.

> Besides that the problem of rte_errno is that you do error handling
> twice because the implementation still needs to check for the error
> condition before assigning a meaningful error value to rte_errno.
> After that again the user code needs to check for the return value to
> decide if looking at rte_errno makes any sense.
> 
Yes, in the case of an error occuring there will be an extra write to a
global variable, and a subsequent read from that value (which should not
be a problem, as the write will have occurred in the same thread).
However, this is irrelevant to normal path processing. Error should be
the exception not the rule.

> 
> > My opinion is that rte_errno is best for control path operations while using
> > the return value makes more sense in the data path. The major issue being
> > that function returning anything other than int (e.g. TX/RX burst) cannot
> > describe any kind of error to the application.
> >
> > I went with both in rte_flow (return + rte_errno) mostly due to the return
> > type of a few functions (e.g. rte_flow_create()) and wanted to keep the API
> > consistent while maintaining compatibility with other DPDK APIs. Note there
> > is little overhead for API functions to set rte_errno _and_ return its
> > value, it's mostly free
+1, and error cases should be rare, even if there is a small cost.
.
> >
> > I think using both is best also because it leaves applications the choice of
> > error-handling method, however if I had to pick one I'd go with rte_errno
> > and standardize on -1 as the default error value (as in the C library).
> >
+1
though I think the sentinal value will vary depending on each case. I would
look to keep the standard packet rx/tx functions and ones like them
returning a zero on any error, to simplify programming logic, and also
because in many cases the only real causes of error they can produce is
from bad parameters.
Functions returning pointers obviously will use NULL as error value.


> > Below are a bunch of use-case examples to illustrate how rte_errno could
> > be convenient to applications.
> >
> > Easily creating many flow rules during init in a all-or-nothing fashion:
> >
> >  rte_errno = 0;
> >  for (i = 0; i != num; ++i)
> >      rule[i] = rte_flow_create(port, ...);
> >  if (unlikely(rte_errno)) {
> >      rte_flow_flush(port);
> >      return -1;
> >  }
> >
> > Complete TX packet burst failure with explanation (could also detect partial
> > failures by initializing rte_errno to 0):
> >
> >  sent = rte_eth_tx_burst(...);
> >  if (unlikely(!sent)) {
> >      switch (rte_errno) {
> >          case E2BIG:
> >              // too many packets in burst
> >          ...
> >          case EMSGSIZE:
> >              // first packet is too large
> >          ...
> >          case ENOBUFS:
> >              // TX queue is full
> >          ...
> >      }
> >  }
> >
> > TX burst functions in PMDs could be modified as follows with minimal impact
> > on their performance and no ABI change:
> >
> >      uint16_t sent = 0;
> >      int error; // new variable
> >
> >      [process burst]
> >      if (unlikely([something went wrong])) { // this check already exists
> >          error = EPROBLEM; // new assignment
> >          goto error; // instead of "return sent"
> >      }
> >      [process burst]
> >      return sent;
> >  error:
> >      rte_errno = error;
> >      return sent;
> >
> > --
> > Adrien Mazarguil
> > 6WIND

  reply	other threads:[~2017-02-01 14:18 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-27 14:56 [PATCH 00/24] linux/eal: Remove most causes of panic on init Aaron Conole
2017-01-27 14:56 ` [PATCH 01/25] eal: CPU init will no longer panic Aaron Conole
2017-01-27 14:56 ` [PATCH 02/25] eal: return error instead of panic for cpu init Aaron Conole
2017-01-27 14:56 ` [PATCH 03/25] eal: No panic on hugepages info init Aaron Conole
2017-01-27 14:56 ` [PATCH 04/25] eal: do not panic on failed hugepage query Aaron Conole
2017-01-27 14:56 ` [PATCH 05/25] eal: failure to parse args returns error Aaron Conole
2017-01-27 14:56 ` [PATCH 06/25] eal-common: introduce a way to query cpu support Aaron Conole
2017-01-27 14:56 ` [PATCH 07/25] eal: Signal error when CPU isn't supported Aaron Conole
2017-01-27 16:27   ` Stephen Hemminger
2017-01-30 16:50     ` Aaron Conole
2017-01-27 14:56 ` [PATCH 08/25] eal: do not panic on memzone initialization fails Aaron Conole
2017-01-27 14:56 ` [PATCH 09/25] eal: set errno when exiting for already called Aaron Conole
2017-01-27 14:56 ` [PATCH 10/25] eal: Do not panic on log failures Aaron Conole
2017-01-27 14:56 ` [PATCH 11/25] eal: Do not panic on pci-probe Aaron Conole
2017-01-27 14:56 ` [PATCH 12/25] eal: do not panic on vfio failure Aaron Conole
2017-01-27 14:56 ` [PATCH 13/25] eal: do not panic on memory init Aaron Conole
2017-01-27 14:56 ` [PATCH 14/25] eal: do not panic on tailq init Aaron Conole
2017-01-27 16:30   ` Stephen Hemminger
2017-01-30 16:51     ` Aaron Conole
2017-01-27 14:56 ` [PATCH 15/25] eal: do not panic on alarm init Aaron Conole
2017-01-27 16:31   ` Stephen Hemminger
2017-01-27 16:42     ` Bruce Richardson
2017-01-30 16:52       ` Aaron Conole
2017-01-27 14:56 ` [PATCH 16/25] eal: convert timer_init not to call panic Aaron Conole
2017-01-27 14:56 ` [PATCH 17/25] eal: change the private pipe call to reflect errno Aaron Conole
2017-01-27 14:56 ` [PATCH 18/25] eal: Do not panic on interrupt thread init Aaron Conole
2017-01-27 14:56 ` [PATCH 19/25] eal: do not error if plugins fail to init Aaron Conole
2017-01-27 14:56 ` [PATCH 20/25] eal_pci: Continue probing even on failures Aaron Conole
2017-01-27 14:56 ` [PATCH 21/25] eal: do not panic on failed PCI probe Aaron Conole
2017-01-27 14:57 ` [PATCH 22/25] eal_common_dev: continue initializing vdevs Aaron Conole
2017-01-27 14:57 ` [PATCH 23/25] eal: do not panic (or abort) if vdev init fails Aaron Conole
2017-01-27 14:57 ` [PATCH 24/25] eal: do not panic when bus probe fails Aaron Conole
2017-01-27 14:57 ` [PATCH 25/25] rte_eal_init: add info about rte_errno codes Aaron Conole
2017-01-27 16:33   ` Stephen Hemminger
2017-01-27 16:47     ` Bruce Richardson
2017-01-27 17:37       ` Stephen Hemminger
2017-01-30 18:38         ` Aaron Conole
2017-01-30 20:19           ` Thomas Monjalon
2017-02-01 10:54             ` Adrien Mazarguil
2017-02-01 12:06               ` Jan Blunck
2017-02-01 14:18                 ` Bruce Richardson [this message]
2017-01-31  9:33           ` Bruce Richardson
2017-01-31 16:56             ` Stephen Hemminger
2017-01-27 15:48 ` [PATCH 00/24] linux/eal: Remove most causes of panic on init Aaron Conole

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170201141839.GA162640@bricha3-MOBL3.ger.corp.intel.com \
    --to=bruce.richardson@intel.com \
    --cc=aconole@redhat.com \
    --cc=adrien.mazarguil@6wind.com \
    --cc=dev@dpdk.org \
    --cc=jblunck@infradead.org \
    --cc=stephen@networkplumber.org \
    --cc=thomas.monjalon@6wind.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.