netdev.vger.kernel.org archive mirror
From: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
To: Jiri Pirko <jiri@resnulli.us>
Cc: Michael Chan <michael.chan@broadcom.com>,
	David Miller <davem@davemloft.net>,
	Netdev <netdev@vger.kernel.org>, Ray Jui <ray.jui@broadcom.com>,
	Jiri Pirko <jiri@mellanox.com>,
	ayal@mellanox.com, Moshe Shemesh <moshe@mellanox.com>
Subject: Re: [PATCH net-next v2 14/22] bnxt_en: Add new FW devlink_health_reporter
Date: Wed, 9 Oct 2019 10:25:45 +0530
Message-ID: <CAACQVJrBLsdnQKcOzWD5UNydFGoBHus1V_2Xxm=yL1zMb_KBQA@mail.gmail.com>
In-Reply-To: <20191007095623.GA2326@nanopsycho>

On Mon, Oct 7, 2019 at 3:26 PM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Fri, Aug 30, 2019 at 05:54:57AM CEST, michael.chan@broadcom.com wrote:
> >From: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
> >
> >Create new FW devlink_health_reporter, to know the current health
> >status of FW.
> >
> >Command example and output:
> >$ devlink health show pci/0000:af:00.0 reporter fw
> >
> >pci/0000:af:00.0:
> >  name fw
> >    state healthy error 0 recover 0
> >
> > FW status: Healthy; Reset count: 1
>
> I'm puzzled how did you get this output, since you put "FW status" into
> "diagnose" callback fmsg and that is called upon "devlink health diagnose".
>
> [...]
Jiri, you are right: the last line is the output of the diagnose command.
The command itself is missing from the commit message.

$ devlink health diagnose pci/0000:af:00.0 reporter fw
 FW status: Healthy; Reset count: 0

>
> >+static int bnxt_fw_reporter_diagnose(struct devlink_health_reporter *reporter,
> >+                                   struct devlink_fmsg *fmsg)
> >+{
> >+      struct bnxt *bp = devlink_health_reporter_priv(reporter);
> >+      struct bnxt_fw_health *health = bp->fw_health;
> >+      u32 val, health_status;
> >+      int rc;
> >+
> >+      if (!health || test_bit(BNXT_STATE_IN_FW_RESET, &bp->state))
> >+              return 0;
> >+
> >+      val = bnxt_fw_health_readl(bp, BNXT_FW_HEALTH_REG);
> >+      health_status = val & 0xffff;
> >+
> >+      if (health_status == BNXT_FW_STATUS_HEALTHY) {
> >+              rc = devlink_fmsg_string_pair_put(fmsg, "FW status",
> >+                                                "Healthy;");
>
> First of all, the ";" is just wrong. You should put plain string if
> anything. You are trying to format user output here. Don't do that
> please.
>
> Please see json output:
> $ devlink health show pci/0000:af:00.0 reporter fw -j -p
>
> Please remove ";" from the strings.
Okay, I will send a patch to remove the ";" from the strings.

>
>
> Second, I do not understand why you need this "FW status" at all. The
> reporter itself has state healthy/error:
> pci/0000:af:00.0:
>   name fw
>     state healthy error 0 recover 0
>           ^^^^^^^
>
> "FW" is redundant of course as the reporter name is "fw".
>
> Please remove "FW status" and replace with some pair indicating the
> actual error state.
Okay, I can rename it to "Status description" so that the "FW" name is
not repeated.
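
A minimal userspace sketch of the status classification from the quoted
patch, with the ";" dropped from the strings. It is not driver code: the
helper name fw_status_description() and the value used for
FW_STATUS_HEALTHY are assumptions made for illustration, and the real
callback would pass the string to devlink_fmsg_string_pair_put().

```c
#include <stdint.h>

/* Assumed value, for illustration only; the driver defines the real one. */
#define FW_STATUS_HEALTHY 0x8000u

/*
 * Hypothetical helper: map the low 16 bits of the health register to a
 * plain description string. No trailing punctuation is added, so the
 * same string is usable for both plain and JSON output.
 */
static const char *fw_status_description(uint32_t reg_val)
{
	uint32_t health_status = reg_val & 0xffff;

	if (health_status == FW_STATUS_HEALTHY)
		return "Healthy";
	if (health_status < FW_STATUS_HEALTHY)
		return "Not yet completed initialization";
	return "Encountered fatal error and cannot recover";
}
```

The upper 16 bits are masked off before the comparison, so an error code
in the high half does not change which description is chosen.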

>
> In mlx5 they call it "Description".
>
>
> >+              if (rc)
> >+                      return rc;
> >+      } else if (health_status < BNXT_FW_STATUS_HEALTHY) {
> >+              rc = devlink_fmsg_string_pair_put(fmsg, "FW status",
> >+                                                "Not yet completed initialization;");
> >+              if (rc)
> >+                      return rc;
> >+      } else if (health_status > BNXT_FW_STATUS_HEALTHY) {
> >+              rc = devlink_fmsg_string_pair_put(fmsg, "FW status",
> >+                                                "Encountered fatal error and cannot recover;");
> >+              if (rc)
> >+                      return rc;
> >+      }
> >+
> >+      if (val >> 16) {
> >+              rc = devlink_fmsg_u32_pair_put(fmsg, "Error", val >> 16);
>
> Perhaps rather call this "Error code"?
Okay.
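
For reference, a small sketch of how the value behind the "Error code"
pair is obtained from the register in the quoted patch: it is the upper
16 bits of the health register, reported only when non-zero. The helper
name fw_error_code() is hypothetical and the code runs in userspace.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: extract the upper 16 bits of the health register.
 * Returns true and stores the code only when it is non-zero, mirroring
 * the "if (val >> 16)" check in the quoted patch.
 */
static bool fw_error_code(uint32_t reg_val, uint32_t *code)
{
	uint32_t err = reg_val >> 16;

	if (!err)
		return false;
	*code = err;
	return true;
}
```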

>
>
> >+              if (rc)
> >+                      return rc;
> >+      }
> >+
> >+      val = bnxt_fw_health_readl(bp, BNXT_FW_RESET_CNT_REG);
> >+      rc = devlink_fmsg_u32_pair_put(fmsg, "Reset count", val);
>
> What is this counter counting? Number of recoveries?
> If so, that is also already counted internally by devlink.
"Reset count" is a counter that shows the number of times the firmware
has been reset through different mechanisms, devlink being one of them.
The firmware could also have been reset through other tools.

The driver reads this value from the firmware health register when the
diagnose command is invoked.
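
To illustrate why the two counters can differ, here is a toy model (not
driver code). The struct, enum and function names are hypothetical, and
it assumes devlink's recover counter only advances for recoveries that
go through devlink.

```c
#include <stdint.h>

/* Hypothetical model of the two counters discussed above. */
struct fw_model {
	uint32_t fw_reset_count;  /* kept by firmware: resets from any source */
	uint32_t devlink_recover; /* kept by devlink: its own recoveries only */
};

enum reset_source {
	RESET_VIA_DEVLINK,
	RESET_VIA_OTHER_TOOL,
};

/* Record one firmware reset triggered by the given source. */
static void fw_model_reset(struct fw_model *m, enum reset_source src)
{
	m->fw_reset_count++;
	if (src == RESET_VIA_DEVLINK)
		m->devlink_recover++;
}
```

After one reset through devlink and one through another tool, the model
holds fw_reset_count == 2 and devlink_recover == 1, which is the gap the
"Reset count" pair is meant to expose.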

>
> [...]

Thread overview: 28+ messages
2019-08-30  3:54 [PATCH net-next v2 00/22] bnxt_en: health and error recovery Michael Chan
2019-08-30  3:54 ` [PATCH net-next v2 01/22] bnxt_en: Use a common function to print the same ethtool -f error message Michael Chan
2019-08-30  3:54 ` [PATCH net-next v2 02/22] bnxt_en: Remove the -1 error return code from bnxt_hwrm_do_send_msg() Michael Chan
2019-08-30  3:54 ` [PATCH net-next v2 03/22] bnxt_en: Convert error code in firmware message response to standard code Michael Chan
2019-08-30  3:54 ` [PATCH net-next v2 04/22] bnxt_en: Simplify error checking in the SR-IOV message forwarding functions Michael Chan
2019-08-30  3:54 ` [PATCH net-next v2 05/22] bnxt_en: Suppress all error messages in hwrm_do_send_msg() in silent mode Michael Chan
2019-08-30  3:54 ` [PATCH net-next v2 06/22] bnxt_en: Prepare bnxt_init_one() to be called multiple times Michael Chan
2019-08-30  3:54 ` [PATCH net-next v2 07/22] bnxt_en: Refactor bnxt_sriov_enable() Michael Chan
2019-08-30  3:54 ` [PATCH net-next v2 08/22] bnxt_en: Register buffers for VFs before reserving resources Michael Chan
2019-08-30  3:54 ` [PATCH net-next v2 09/22] bnxt_en: Handle firmware reset status during IF_UP Michael Chan
2019-08-30  3:54 ` [PATCH net-next v2 10/22] bnxt_en: Discover firmware error recovery capabilities Michael Chan
2019-08-30  3:54 ` [PATCH net-next v2 11/22] bnxt_en: Pre-map the firmware health monitoring registers Michael Chan
2019-08-30  3:54 ` [PATCH net-next v2 12/22] bnxt_en: Enable health monitoring Michael Chan
2019-08-30  3:54 ` [PATCH net-next v2 13/22] bnxt_en: Add BNXT_STATE_IN_FW_RESET state Michael Chan
2019-08-30  3:54 ` [PATCH net-next v2 14/22] bnxt_en: Add new FW devlink_health_reporter Michael Chan
2019-10-07  9:56   ` Jiri Pirko
2019-10-09  4:55     ` Vasundhara Volam [this message]
2019-10-09  7:04       ` Jiri Pirko
2019-08-30  3:54 ` [PATCH net-next v2 15/22] bnxt_en: Handle RESET_NOTIFY async event from firmware Michael Chan
2019-08-30  3:54 ` [PATCH net-next v2 16/22] bnxt_en: Handle firmware reset Michael Chan
2019-08-30 17:56   ` kbuild test robot
2019-08-30  3:55 ` [PATCH net-next v2 17/22] bnxt_en: Add devlink health reset reporter Michael Chan
2019-08-30  3:55 ` [PATCH net-next v2 18/22] bnxt_en: Retain user settings on a VF after RESET_NOTIFY event Michael Chan
2019-08-30  3:55 ` [PATCH net-next v2 19/22] bnxt_en: Do not send firmware messages if firmware is in error state Michael Chan
2019-08-30  3:55 ` [PATCH net-next v2 20/22] bnxt_en: Add RESET_FW state logic to bnxt_fw_reset_task() Michael Chan
2019-08-30  3:55 ` [PATCH net-next v2 21/22] bnxt_en: Add bnxt_fw_exception() to handle fatal firmware errors Michael Chan
2019-08-30  3:55 ` [PATCH net-next v2 22/22] bnxt_en: Add FW fatal devlink_health_reporter Michael Chan
2019-08-30 21:02 ` [PATCH net-next v2 00/22] bnxt_en: health and error recovery David Miller
