* [PATCH v5 0/2] ACPI: APEI: Add support to notify the vendor specific HW errors
@ 2020-03-25 13:55 Shiju Jose
From: Shiju Jose @ 2020-03-25 13:55 UTC
  To: linux-acpi, linux-pci, linux-kernel, rjw, helgaas, lenb, bp,
	james.morse, tony.luck, gregkh, zhangliguang, tglx
  Cc: linuxarm, jonathan.cameron, tanxiaofei, yangyicong

Presently, vendor drivers cannot recover from vendor-specific
recoverable HW errors that are reported to the APEI driver in
vendor-defined sections, because the APEI driver does not support
reporting them to the vendor drivers.

This patch set:
1. Adds an interface to the APEI driver that lets vendor drivers
register event handling functions for the corresponding vendor-specific
HW errors, and reports those errors to the vendor driver.

2. Adds a driver to handle HiSilicon hip08 PCIe controller errors,
    which is an example application of the above APEI interface.
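
As a rough illustration of the registration pattern described above (not
the code from this series), the sketch below models a handler chain in
plain userspace C. Every name in it (vendor_notifier, vendor_register,
vendor_report, demo_handler, demo_sec_type) is hypothetical, and it omits
the locking and priority ordering that a real kernel notifier chain
provides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEC_TYPE_LEN 16 /* CPER section types are 16-byte GUIDs */

/* Hypothetical model of one registered vendor handler. */
struct vendor_notifier {
	/* Returns 1 if the handler recognised and handled the error, else 0. */
	int (*handler)(struct vendor_notifier *nb,
		       const unsigned char *sec_type, void *err_data);
	struct vendor_notifier *next;
};

static struct vendor_notifier *vendor_chain; /* head of the chain */

/* Add a handler to the front of the chain (no locking in this sketch). */
static void vendor_register(struct vendor_notifier *nb)
{
	assert(nb && nb->handler);
	nb->next = vendor_chain;
	vendor_chain = nb;
}

/* Offer the error to every handler; return how many handled it. */
static int vendor_report(const unsigned char *sec_type, void *err_data)
{
	int handled = 0;
	struct vendor_notifier *nb;

	for (nb = vendor_chain; nb; nb = nb->next)
		handled += nb->handler(nb, sec_type, err_data);
	return handled;
}

/* Made-up section type for the demo handler below. */
static const unsigned char demo_sec_type[SEC_TYPE_LEN] = {
	0xde, 0xad, 0xbe, 0xef
};

static int demo_handler(struct vendor_notifier *nb,
			const unsigned char *sec_type, void *err_data)
{
	(void)nb;
	(void)err_data;
	/* Ignore sections that belong to some other vendor driver. */
	return memcmp(sec_type, demo_sec_type, SEC_TYPE_LEN) == 0;
}
```

In this model every handler sees every vendor-defined section and is
expected to filter on the section type itself; the count returned by
vendor_report() stands in for the "error handled" status mentioned in the
changelog.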

Changes:

V5:
1. Fix comments from James Morse.
1.1 Changed the notification method to use the atomic_notifier_chain.
1.2 Added the error handled status for user space.

V4:
1. Fix for the smatch warning in the PCIe error driver:
    warn: should '((((1))) << (9 + i))' be a 64 bit type?
    if (err->val_bits & BIT(HISI_PCIE_LOCAL_VALID_ERR_MISC + i))
	^^^ This should be BIT_ULL() because it goes up to 9 + 32.

V3:
1. Fix the comments from Bjorn Helgaas.

V2:
1. Changes in the HiSilicon PCIe controller's error handling driver
    for the comments from Bjorn Helgaas.

2. Changes in the APEI interface to support reporting vendor errors
    for a module with multiple devices that use the same section type.
    The error handler uses the socket id, sub module id, etc. to
    distinguish the device.

V1:
1. Fix comments from James Morse.

2. add driver to handle HiSilicon hip08 PCIe controller's errors,
    which is an application of the above interface.
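
For readers unfamiliar with the smatch warning quoted under V4, here is a
small userspace sketch of the fixed check. It assumes val_bits is a
64-bit field and takes the value 9 for HISI_PCIE_LOCAL_VALID_ERR_MISC
from the warning text; the helper name err_misc_valid and the local
BIT_ULL definition are stand-ins, not the driver's code. The kernel's
BIT() shifts an unsigned long, which is only 32 bits wide on 32-bit
builds, so a shift count of 32 or more needs the unsigned long long
variant.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT_ULL(): a 64-bit single-bit mask. */
#define BIT_ULL(n) (1ULL << (n))

/* Value taken from the smatch warning text ("9 + i"). */
#define HISI_PCIE_LOCAL_VALID_ERR_MISC 9

/*
 * Returns nonzero if ERR_MISC register i is flagged valid in val_bits.
 * The shift count can exceed 31, so the mask must be built as 64 bits wide.
 */
static int err_misc_valid(uint64_t val_bits, unsigned int i)
{
	/* Shifting a 64-bit value by 64 or more is undefined behaviour. */
	assert(HISI_PCIE_LOCAL_VALID_ERR_MISC + i < 64);
	return (val_bits & BIT_ULL(HISI_PCIE_LOCAL_VALID_ERR_MISC + i)) != 0;
}
```

With a 32-bit mask the upper valid bits could never be tested, which is
the situation the warning points at.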

Shiju Jose (1):
   APEI: Add support to notify the vendor specific HW errors

Yicong Yang (1):
   PCI: HIP: Add handling of HiSilicon HIP PCIe controller errors

  drivers/acpi/apei/ghes.c                 |  35 ++-
  drivers/pci/controller/Kconfig           |   8 +
  drivers/pci/controller/Makefile          |   1 +
  drivers/pci/controller/pcie-hisi-error.c | 357 +++++++++++++++++++++++
  drivers/ras/ras.c                        |   5 +-
  include/acpi/ghes.h                      |  28 ++
  include/linux/ras.h                      |   6 +-
  include/ras/ras_event.h                  |   7 +-
  8 files changed, 440 insertions(+), 7 deletions(-)
  create mode 100644 drivers/pci/controller/pcie-hisi-error.c

-- 
2.17.1

* Re: [PATCH v5 0/2] ACPI: APEI: Add support to notify the vendor specific HW errors
  2020-03-25 13:55 [PATCH v5 0/2] ACPI: APEI: Add support to notify the vendor specific HW errors Shiju Jose
@ 2020-03-25 15:22 ` Bjorn Helgaas
From: Bjorn Helgaas @ 2020-03-25 15:22 UTC
  To: Shiju Jose
  Cc: linux-acpi, linux-pci, linux-kernel, rjw, lenb, bp, james.morse,
	tony.luck, gregkh, zhangliguang, tglx, linuxarm,
	jonathan.cameron, tanxiaofei, yangyicong

1) If you can post things as a series, i.e., with patch 1/2 and patch
2/2 being responses to the 0/2 cover letter, that makes things easier.
It looks like you did this for the previous postings.

2) When applying these, "git am" complained (but they did apply
cleanly):

  warning: Patch sent with format=flowed; space at the end of lines might be lost.
  Applying: APEI: Add support to notify the vendor specific HW errors
  warning: Patch sent with format=flowed; space at the end of lines might be lost.
  Applying: PCI: HIP: Add handling of HiSilicon HIP PCIe controller errors

3) drivers/pci/controller/pcie-hisi-error.c should be next to
drivers/pci/controller/dwc/pcie-hisi.c, shouldn't it?

4) Your subject lines don't match the convention.  "git log --oneline
drivers/acpi/apei" says:

  011077d8fbfe ("APEI: Add support to notify the vendor specific HW errors")
  cea79e7e2f24 ("apei/ghes: Do not delay GHES polling")
  933ca4e323de ("acpi: Use pr_warn instead of pr_warning")
  6abc7622271d ("ACPI / APEI: Release resources if gen_pool_add() fails")
  bb100b64763c ("ACPI / APEI: Get rid of NULL_UUID_LE constant")
  371b86897d01 ("ACPI / APEI: Remove needless __ghes_check_estatus() calls")

and "git log --oneline --follow drivers/pci/controller/dwc/pcie-hisi*"
says:

  6e0832fa432e ("PCI: Collect all native drivers under drivers/pci/controller/")
  8cfab3cf63cf ("PCI: Add SPDX GPL-2.0 to replace GPL v2 boilerplate")
  5a4751680189 ("PCI: hisi: Constify dw_pcie_host_ops structure")
  b379d385bbaa ("PCI: hisi: Remove unused variable driver")
  a5f40e8098fe ("PCI: Don't allow unbinding host controllers that aren't prepared")
  e313a447e735 ("PCI: hisi: Update PCI config space remap function")
  b9c1153f7a9c ("PCI: hisi: Fix DT binding (hisi-pcie-almost-ecam)")

So your subject lines should be:

  ACPI / APEI: ...
  PCI: hisi: ...


* RE: [PATCH v5 0/2] ACPI: APEI: Add support to notify the vendor specific HW errors
  2020-03-25 15:22 ` Bjorn Helgaas
@ 2020-03-25 16:27   ` Shiju Jose
From: Shiju Jose @ 2020-03-25 16:27 UTC
  To: Bjorn Helgaas
  Cc: linux-acpi, linux-pci, linux-kernel, rjw, lenb, bp, james.morse,
	tony.luck, gregkh, zhangliguang, tglx, Linuxarm,
	Jonathan Cameron, tanxiaofei, yangyicong

Hi Bjorn,

>1) If you can post things as a series, i.e., with patch 1/2 and patch
>2/2 being responses to the 0/2 cover letter, that makes things easier.
>It looks like you did this for the previous postings.
I will send the patches as a series after fixing the patch subject lines.

>
>2) When applying these, "git am" complained (but they did apply
>cleanly):
>
>  warning: Patch sent with format=flowed; space at the end of lines might be lost.
>  Applying: APEI: Add support to notify the vendor specific HW errors
>  warning: Patch sent with format=flowed; space at the end of lines might be lost.
>  Applying: PCI: HIP: Add handling of HiSilicon HIP PCIe controller errors
>
>3) drivers/pci/controller/pcie-hisi-error.c should be next to
>drivers/pci/controller/dwc/pcie-hisi.c, shouldn't it?
Our hip PCIe controller doesn't use DWC ip.

>
>4) Your subject lines don't match the convention.  "git log --oneline
>drivers/acpi/apei" says:
>
>  011077d8fbfe ("APEI: Add support to notify the vendor specific HW errors")
>  cea79e7e2f24 ("apei/ghes: Do not delay GHES polling")
>  933ca4e323de ("acpi: Use pr_warn instead of pr_warning")
>  6abc7622271d ("ACPI / APEI: Release resources if gen_pool_add() fails")
>  bb100b64763c ("ACPI / APEI: Get rid of NULL_UUID_LE constant")
>  371b86897d01 ("ACPI / APEI: Remove needless __ghes_check_estatus() calls")
>
>and "git log --oneline --follow drivers/pci/controller/dwc/pcie-hisi*"
>says:
>
>  6e0832fa432e ("PCI: Collect all native drivers under drivers/pci/controller/")
>  8cfab3cf63cf ("PCI: Add SPDX GPL-2.0 to replace GPL v2 boilerplate")
>  5a4751680189 ("PCI: hisi: Constify dw_pcie_host_ops structure")
>  b379d385bbaa ("PCI: hisi: Remove unused variable driver")
>  a5f40e8098fe ("PCI: Don't allow unbinding host controllers that aren't prepared")
>  e313a447e735 ("PCI: hisi: Update PCI config space remap function")
>  b9c1153f7a9c ("PCI: hisi: Fix DT binding (hisi-pcie-almost-ecam)")
>
>So your subject lines should be:
>
>  ACPI / APEI: ...
Sure. I will fix this.

>  PCI: hisi: ...
Can we use "PCI: hip" instead, since this driver is for the HIP hardware devices?

[...]
>> --
>> 2.17.1

Thanks,
Shiju

* Re: [PATCH v5 0/2] ACPI: APEI: Add support to notify the vendor specific HW errors
  2020-03-25 16:27   ` Shiju Jose
@ 2020-03-25 18:31     ` Bjorn Helgaas
From: Bjorn Helgaas @ 2020-03-25 18:31 UTC
  To: Shiju Jose
  Cc: linux-acpi, linux-pci, linux-kernel, rjw, lenb, bp, james.morse,
	tony.luck, gregkh, zhangliguang, tglx, Linuxarm,
	Jonathan Cameron, tanxiaofei, yangyicong

On Wed, Mar 25, 2020 at 04:27:15PM +0000, Shiju Jose wrote:
> >-----Original Message-----
> >From: Bjorn Helgaas [mailto:helgaas@kernel.org]

> >3) drivers/pci/controller/pcie-hisi-error.c should be next to
> >drivers/pci/controller/dwc/pcie-hisi.c, shouldn't it?
>
> Our hip PCIe controller doesn't use DWC ip.

Ah, I was assuming this pcie-hisi-error.c driver was for the same
device claimed by pcie-hisi.c.

Error drivers like this will have some device-specific knowledge
(e.g., which registers to dump), but I guess they'll always be
used with the generic acpi/pci_root.c driver, right?

It looks like this driver has little or nothing to do with the PCI
core directly.  It does include drivers/pci/pci.h, but I'm not sure it
really needs it.

Maybe drivers/pci/controller/ is the best place for it, but I'm not
sure.  It's a little confusing because it's not really like the other
things there.

There are some vaguely similar things in drivers/acpi/apei/ and
drivers/acpi/nfit/.  And of course there are .acpi_match_table uses
all over the drivers/ tree.  Maybe we need a new subdirectory under
drivers/pci?  drivers/pci/controller/apei/?

Any thoughts, Rafael?

Bjorn
