From: Rajat Jain <rajatxjain@gmail.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Rajat Jain <rajatja@google.com>,
Bjorn Helgaas <bhelgaas@google.com>,
Jonathan Corbet <corbet@lwn.net>,
Philippe Ombredanne <pombredanne@nexb.com>,
Kate Stewart <kstewart@linuxfoundation.org>,
Thomas Gleixner <tglx@linutronix.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Frederick Lawler <fred@fredlawl.com>,
Oza Pawandeep <poza@codeaurora.org>,
Keith Busch <keith.busch@intel.com>,
Alexandru Gagniuc <mr.nuke.me@gmail.com>,
Thomas Tai <thomas.tai@oracle.com>,
"Steven Rostedt (VMware)" <rostedt@goodmis.org>,
linux-pci <linux-pci@vger.kernel.org>,
linux-doc <linux-doc@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Jes Sorensen <jsorensen@fb.com>, Kyle McMartin <jkkm@fb.com>,
Tyler Baicar <tbaicar@codeaurora.org>
Subject: Re: [PATCH v5 3/5] PCI/AER: Add sysfs attributes to provide breakdown of AERs
Date: Thu, 21 Jun 2018 14:25:07 -0700
Message-ID: <CAA93t1rz6YGmvvORDB0ehr7Gwp-W+0HDYxEDgjtgYmGm3qMgCw@mail.gmail.com>
In-Reply-To: <20180621184822.GB14136@bhelgaas-glaptop.roam.corp.google.com>
On Thu, Jun 21, 2018 at 11:48 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> [+cc Tyler for AER dmesg decoding]
>
> I really like this idea a lot; thanks for putting it together!
>
> On Wed, Jun 20, 2018 at 04:41:45PM -0700, Rajat Jain wrote:
>> Add sysfs attributes to provide breakdown of the AERs seen,
>> into different type of correctable or uncorrectable errors:
>>
>> dev_breakdown_correctable
>> dev_breakdown_uncorrectable
>
> - Can you include a more complete sysfs path here in the commit log,
> as well as a snippet of the contents? From the doc patch, I think
> it is currently:
>
> /sys/bus/pci/devices/<dev>/aer_stats/dev_breakdown_correctable
> /sys/bus/pci/devices/<dev>/aer_stats/dev_breakdown_uncorrectable
>
> - I'm not sure it's worth making a new subdirectory. What if you
> simply added these?
It's your call. We're going to be creating 6 files for aer_stats (I'll
be following your suggestion below), and I think that may clutter the
directory. In my next patch I'll remove the subdirectory, but we can
add it back later if you prefer.
>
> /sys/bus/pci/devices/<dev>/aer_correctable
> /sys/bus/pci/devices/<dev>/aer_uncorrectable
>
> or perhaps, since you split the "total" files into
> cor/nonfatal/fatal, these could match?
>
> /sys/bus/pci/devices/<dev>/aer_correctable
> /sys/bus/pci/devices/<dev>/aer_nonfatal
> /sys/bus/pci/devices/<dev>/aer_fatal
This sounds like a better idea.
>
> I think the nonfatal/fatal distinction might be worth exposing
> because some of those are configurable and the kernel handling is
> significantly different. So I think it would make this more
> approachable if the "remove/re-enumerate" situations that will be
> obvious in dmesg logs were clearly connected with "aer_fatal"
> statistics, as opposed to being connected to some subset of what's
> in "aer_uncorrectable".
Agreed. Note, however, that in theory the classification of
uncorrectable errors into fatal or non-fatal can be programmed /
changed (by whom?), so the same type of error may end up with some
instances counted as fatal and some as non-fatal (depending on whether
those bits were set while handling ERR_FATAL or ERR_NONFATAL
respectively). I don't think there is anything wrong with this; I just
thought I'd mention it.
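
To illustrate the point, here is a minimal user-space sketch. It is not
the actual aer.c code: struct aer_split_stats and aer_split_count() are
made-up names, and it assumes the counter array is picked by the
severity of the message being handled, so the same status bit can land
in either array:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AER_MAX_BITS 32	/* one counter per bit of a 32-bit status word */

/* Hypothetical per-device counters, split by severity. */
struct aer_split_stats {
	uint64_t fatal[AER_MAX_BITS];
	uint64_t nonfatal[AER_MAX_BITS];
};

/*
 * Account one uncorrectable error event. 'status' holds the status
 * bits that were set; 'fatal' is nonzero when the event was handled
 * as ERR_FATAL and zero when it was handled as ERR_NONFATAL.
 */
static void aer_split_count(struct aer_split_stats *s, uint32_t status,
			    int fatal)
{
	uint64_t *counter;
	unsigned int i;

	assert(s != NULL);

	counter = fatal ? s->fatal : s->nonfatal;
	for (i = 0; i < AER_MAX_BITS; i++)
		if (status & (UINT32_C(1) << i))
			counter[i]++;
}
```

If the severity programming changes between two occurrences of the same
error, one call increments fatal[i] and the next increments nonfatal[i]
for the same bit i.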
>
> - Possibly the totals that you currently have in dev_total_cor_errs
> could even be added to the bottom of these? Not sure what direction
> would be best, and as you say, there's the potential for confusion
> because the individual items won't add up to the totals. If they
> were in the same file, maybe that could be addressed in the label.
Agree, this also sounds good.
>
> - Can you include the related doc update in the same patch? That way
> the doc update is more likely to be backported along with the patch.
Will do.
>
> - I was going to ask whether these should all be in a single file or
> whether they should be split up so there's a separate file for each
> type or error, each containing a single number. But
> Documentation/filesystems/sysfs.txt says either is OK and
> /sys/devices/system/node/node0/vmstat is an example of a similar
> situation in an existing file, so I think what you did is perfect.
Thank you. I initially thought of having a separate file for each
error, but it looked like we'd end up with many more files, enough
to overwhelm user space.
Thanks,
Rajat
>
>> Signed-off-by: Rajat Jain <rajatja@google.com>
>> ---
>> v5: Fix the signature
>> v4: use "%llu" in place of "%llx"
>> v3: Merge everything in aer.c
>>
>> drivers/pci/pcie/aer.c | 28 ++++++++++++++++++++++++++++
>> 1 file changed, 28 insertions(+)
>>
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index ce0d675d7bd3..c989bb5bb6f1 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -587,10 +587,38 @@ aer_stats_aggregate_attr(dev_total_cor_errs);
>> aer_stats_aggregate_attr(dev_total_fatal_errs);
>> aer_stats_aggregate_attr(dev_total_nonfatal_errs);
>>
>> +#define aer_stats_breakdown_attr(field, stats_array, strings_array) \
>> + static ssize_t \
>> + field##_show(struct device *dev, struct device_attribute *attr, \
>> + char *buf) \
>> +{ \
>> + unsigned int i; \
>> + char *str = buf; \
>> + struct pci_dev *pdev = to_pci_dev(dev); \
>> + u64 *stats = pdev->aer_stats->stats_array; \
>
> Nit: add a blank line here.
Will do.
>
>> + for (i = 0; i < ARRAY_SIZE(strings_array); i++) { \
>> + if (strings_array[i]) \
>> + str += sprintf(str, "%s = 0x%llu\n", \
>> + strings_array[i], stats[i]); \
>> + else if (stats[i]) \
>> + str += sprintf(str, #stats_array "bit[%d] = 0x%llu\n",\
>> + i, stats[i]); \
>
> - I like the way this uses the same text as used in dmesg
> (aer_correctable_error_string[] and
> aer_uncorrectable_error_string[]).
>
> - I think this incorrectly prints a "0x" prefix for a decimal number
> (probably an artifact of your v4 change).
Will do; I'll drop the stray "0x" prefix.
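
For reference, a minimal user-space sketch of the line format with a
plain decimal count. format_counter() is a hypothetical helper, not
something in the patch, and it keeps the " = " separator used in v5:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format one counter line as "<name> = <decimal count>\n".
 * Returns what snprintf() returns: the length the full line needs,
 * not counting the terminating NUL.
 */
static int format_counter(char *buf, size_t len, const char *name,
			  unsigned long long count)
{
	assert(buf && name);

	/* "%llu" is decimal, so there is no "0x" prefix in front of it. */
	return snprintf(buf, len, "%s = %llu\n", name, count);
}
```

With this, a count of 16 is printed as "16", where the v5 format string
would have printed the misleading "0x16".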
>
> - Tyler posted a patch [1] to update those dmesg strings so they match
> the way lspci decodes them. I really liked that update, but we
> never quite finished it. If we're going to do that, it would be
> nice to do it first, so we don't publish new sysfs files, then
> immediately change the labels used in them.
Sure, I guess you can push them in the right order.
>
> - IIRC, Tyler's patch had the nice property of changing the strings so
> each error name had no spaces, which would make it a little easier
> to parse this sysfs file: each line would be a single identifier
> followed by a single number (I would probably remove the "=" from
> the middle).
Will do.
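
A rough sketch of what user-space parsing could then look like, assuming
each line is one identifier without spaces followed by one decimal
number. parse_counter_line() and the "RxErr" sample name are only
illustrative:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse one "<identifier> <decimal count>" line.
 * 'name' must point to a buffer of at least 64 bytes.
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_counter_line(const char *line, char *name,
			      unsigned long long *count)
{
	assert(line && name && count);

	/* %63s stops at the first whitespace, so names cannot contain spaces. */
	return sscanf(line, "%63s %llu", name, count) == 2 ? 0 : -1;
}
```

A line such as "RxErr 7" parses with a single sscanf(), while a name
containing a space (for example "Receiver Error 7") is rejected by this
simple format and would need a smarter parser.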
>
> [1] https://lkml.kernel.org/r/1518034285-3543-1-git-send-email-tbaicar@codeaurora.org
>
>> + } \
>> + return str-buf; \
>> +} \
>> +static DEVICE_ATTR_RO(field)
>> +
>> +aer_stats_breakdown_attr(dev_breakdown_correctable, dev_cor_errs,
>> + aer_correctable_error_string);
>> +aer_stats_breakdown_attr(dev_breakdown_uncorrectable, dev_uncor_errs,
>> + aer_uncorrectable_error_string);
>> +
>> static struct attribute *aer_stats_attrs[] __ro_after_init = {
>> &dev_attr_dev_total_cor_errs.attr,
>> &dev_attr_dev_total_fatal_errs.attr,
>> &dev_attr_dev_total_nonfatal_errs.attr,
>> + &dev_attr_dev_breakdown_correctable.attr,
>> + &dev_attr_dev_breakdown_uncorrectable.attr,
>> NULL
>> };
>>
>> --
>> 2.18.0.rc1.244.gcf134e6275-goog
>>