From: Thomas Monjalon <thomas@monjalon.net>
To: fengchengwen <fengchengwen@huawei.com>
Cc: Ferruh Yigit <ferruh.yigit@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] Question about hardware error handling policy
Date: Thu, 22 Jul 2021 17:46:12 +0200
Message-ID: <4435152.k7BQ785f6v@thomas>
In-Reply-To: <0bc940bb-65e6-1acb-d026-7a2a08a0ad8b@huawei.com>

22/07/2021 15:50, fengchengwen:
> Hi, all
> 
>     I notice that ethdev supports a dev_reset op, which can be used to recover
> from errors, but only 13+ drivers implement it.
>     There is also a reset event, RTE_ETH_EVENT_INTR_RESET, but only 6 drivers
> support it (most of them are VF drivers).
> 
>     This provides users with two ways to handle hardware errors:
>     a. the driver reports RTE_ETH_EVENT_INTR_RESET, and the application
>     performs the reset.
>     b. the application detects errors (the detection method is unclear) and
>     calls the reset op to recover.
> 
>     According to the design of this API, error handling is assigned to the
> application, and the driver is only responsible for reporting events. This
> simplifies the driver design (for example, the driver does not need to maintain
> mutex locks).
> 
>     As we know, many modern NICs come with firmware, have PCIe interfaces, and
> support SR-IOV. Their hardware errors can include firmware reboot, PF reset,
> VF reset, and FLR, but these errors (particularly firmware/PF) are not
> addressed in most drivers.
> 
>     Question 1: what do we think of these errors (particularly firmware/PF)? Do
> we think that their probability is very low and that there is no need to deal
> with them?

Even rare errors must be managed.

>     Question 2: I prefer to put error handling in the application layer,
> because doing it in the driver can make the driver complex, but no application
> currently registers an INTR_RESET event handler. I think we can build a
> standard handler in testpmd. What do you think?

Absolutely. As with any ethdev API, it must be tested with testpmd.
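
To make option (a) from the quoted message concrete, here is a minimal,
self-contained sketch of an application-side reset handler. It is not testpmd
code and does not link against DPDK: mock_dev_reset() is a stand-in for
rte_eth_dev_reset(), MOCK_EVENT_INTR_RESET stands in for
RTE_ETH_EVENT_INTR_RESET, and MAX_PORTS, on_reset_event() and
service_pending_resets() are hypothetical names. Deferring the reset from the
event handler to the main loop is an assumption of this sketch, chosen so that
the handler itself stays short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 4 /* hypothetical limit for this sketch */

/* Mock event type; stands in for DPDK's enum rte_eth_event_type. */
enum mock_event { MOCK_EVENT_INTR_RESET, MOCK_EVENT_OTHER };

/* Per-port state owned by the application. A real multi-threaded
 * application would need atomics or a lock here, not plain variables. */
static bool reset_pending[MAX_PORTS];
static unsigned int reset_count[MAX_PORTS];

/* Mock stand-in for rte_eth_dev_reset(); it only counts calls. */
static int mock_dev_reset(uint16_t port_id)
{
    assert(port_id < MAX_PORTS);
    reset_count[port_id]++;
    return 0;
}

/* Event handler the application would register for the reset event.
 * It only records that a reset is needed and returns immediately. */
static int on_reset_event(uint16_t port_id, enum mock_event ev, void *arg)
{
    (void)arg;
    if (ev != MOCK_EVENT_INTR_RESET || port_id >= MAX_PORTS)
        return -1;
    reset_pending[port_id] = true;
    return 0;
}

/* Called from the application's main loop, outside the event handler.
 * Returns the number of ports that were reset successfully. */
static int service_pending_resets(void)
{
    int done = 0;

    for (uint16_t p = 0; p < MAX_PORTS; p++) {
        if (!reset_pending[p])
            continue;
        reset_pending[p] = false;
        /* A real application would stop Rx/Tx on the port first, and
         * reconfigure and restart the port after the reset. */
        if (mock_dev_reset(p) == 0)
            done++;
    }
    return done;
}
```

In a real application the handler would be registered through the ethdev
callback API for RTE_ETH_EVENT_INTR_RESET, and service_pending_resets() would
run periodically from the main loop.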


