All of lore.kernel.org
 help / color / mirror / Atom feed
From: Borislav Petkov <bp@alien8.de>
To: Shiju Jose <shiju.jose@huawei.com>
Cc: linux-acpi@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org, rjw@rjwysocki.net,
	helgaas@kernel.org, lenb@kernel.org, james.morse@arm.com,
	tony.luck@intel.com, gregkh@linuxfoundation.org,
	zhangliguang@linux.alibaba.com, tglx@linutronix.de,
	linuxarm@huawei.com, jonathan.cameron@huawei.com,
	tanxiaofei@huawei.com, yangyicong@hisilicon.com
Subject: Re: [PATCH v6 1/2] ACPI / APEI: Add support to notify the vendor specific HW errors
Date: Fri, 27 Mar 2020 19:22:14 +0100	[thread overview]
Message-ID: <20200327182214.GD8015@zn.tnic> (raw)
In-Reply-To: <20200325164223.650-2-shiju.jose@huawei.com>

On Wed, Mar 25, 2020 at 04:42:22PM +0000, Shiju Jose wrote:
> Presently APEI does not support reporting the vendor specific
> HW errors, received in the vendor defined table entries, to the
> vendor drivers for any recovery.
> 
> This patch adds the support to register and unregister the

Avoid having "This patch" or "This commit" in the commit message. It is
tautologically useless.

Also, do

$ git grep 'This patch' Documentation/process

for more details.

> error handling function for the vendor specific HW errors and
> notify the registered kernel driver.
> 
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
> ---
>  drivers/acpi/apei/ghes.c | 35 ++++++++++++++++++++++++++++++++++-
>  drivers/ras/ras.c        |  5 +++--
>  include/acpi/ghes.h      | 28 ++++++++++++++++++++++++++++
>  include/linux/ras.h      |  6 ++++--
>  include/ras/ras_event.h  |  7 +++++--
>  5 files changed, 74 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 24c9642e8fc7..d83f0b1aad0d 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -490,6 +490,32 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
>  #endif
>  }
>  
> +static ATOMIC_NOTIFIER_HEAD(ghes_event_notify_list);
> +
> +/**
> + * ghes_register_event_notifier - register an event notifier
> + * for the non-fatal HW errors.
> + * @nb: pointer to the notifier_block structure of the event handler.
> + *
> + * return 0 : SUCCESS, non-zero : FAIL
> + */
> +int ghes_register_event_notifier(struct notifier_block *nb)
> +{
> +	return atomic_notifier_chain_register(&ghes_event_notify_list, nb);
> +}
> +EXPORT_SYMBOL_GPL(ghes_register_event_notifier);
> +
> +/**
> + * ghes_unregister_event_notifier - unregister the previously
> + * registered event notifier.
> + * @nb: pointer to the notifier_block structure of the event handler.
> + */
> +void ghes_unregister_event_notifier(struct notifier_block *nb)
> +{
> +	atomic_notifier_chain_unregister(&ghes_event_notify_list, nb);
> +}
> +EXPORT_SYMBOL_GPL(ghes_unregister_event_notifier);
> +
>  static void ghes_do_proc(struct ghes *ghes,
>  			 const struct acpi_hest_generic_status *estatus)
>  {
> @@ -526,10 +552,17 @@ static void ghes_do_proc(struct ghes *ghes,
>  			log_arm_hw_error(err);
>  		} else {
>  			void *err = acpi_hest_get_payload(gdata);
> +			u8 error_handled = false;
> +			int ret;
> +
> +			ret = atomic_notifier_call_chain(&ghes_event_notify_list, 0, gdata);

Well, this is a notifier with standard name for a non-standard event.
Not optimal.

Why does only this event need a notifier? Because your driver is
interested in only those events?

> +			if (ret & NOTIFY_OK)
> +				error_handled = true;
>  
>  			log_non_standard_event(sec_type, fru_id, fru_text,
>  					       sec_sev, err,
> -					       gdata->error_data_length);
> +					       gdata->error_data_length,
> +					       error_handled);

What's that error_handled thing for? That's just silly.

Your notifier returns NOTIFY_STOP when it has queued the error. If you
don't want to log it, just test == NOTIFY_STOP and do not log it then.

Then your notifier callback is queuing the error into a kfifo for
whatever reason and then scheduling a workqueue to handle it in user
context...

So I'm thinking that it would be better if you:

* make that kfifo generic and part of ghes.c and queue all types of
error records into it in ghes_do_proc() - not just the non-standard
ones.

* then, when you're done queuing, you kick a workqueue.

* that workqueue runs a normal, blocking notifier to which drivers
register.

Your driver can register to that notifier too and do the normal handling
then and not have this ad-hoc, semi-generic, semi-vendor-specific thing.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

  reply	other threads:[~2020-03-27 18:22 UTC|newest]

Thread overview: 101+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <Shiju Jose>
2019-06-17 14:28 ` [PATCH 0/6] rasdaemon:add logging of HiSilicon HIP08 non-standard H/W errors and changes in the error decoding code Shiju Jose
2019-06-17 14:28   ` [PATCH 1/6] rasdaemon:print non-standard error data if not decoded Shiju Jose
2019-06-17 14:28   ` [PATCH 2/6] rasdaemon: rearrange HiSilicon HIP07 decoding function table Shiju Jose
2019-06-17 14:28   ` [PATCH 3/6] rasdaemon: update iteration logic for the non-standard error decoding functions Shiju Jose
2019-06-17 14:28   ` [PATCH 4/6] rasdaemon:add logging HiSilicon HIP08 H/W errors reported in the OEM format1 Shiju Jose
2019-06-17 14:28   ` [PATCH 5/6] rasdaemon:add logging HiSilicon HIP08 H/W errors reported in the OEM format2 Shiju Jose
2019-06-17 14:28   ` [PATCH 6/6] rasdaemon:add logging HiSilicon HIP08 PCIe local errors Shiju Jose
2019-06-21 18:42   ` [PATCH 0/6] rasdaemon:add logging of HiSilicon HIP08 non-standard H/W errors and changes in the error decoding code Mauro Carvalho Chehab
2019-08-12 10:11 ` [PATCH RFC 0/4] ACPI: APEI: Add support to notify the vendor specific HW errors Shiju Jose
2019-08-12 10:11   ` [PATCH RFC 1/4] " Shiju Jose
2019-08-21 17:23     ` James Morse
2019-08-22 16:57       ` Shiju Jose
2019-08-12 10:11   ` [PATCH RFC 2/4] ACPI: APEI: Add ghes_handle_memory_failure to the new notification method Shiju Jose
2019-08-21 17:22     ` James Morse
2019-08-22 16:57       ` Shiju Jose
2019-08-12 10:11   ` [PATCH RFC 3/4] ACPI: APEI: Add ghes_handle_aer " Shiju Jose
2019-08-12 10:11   ` [PATCH RFC 4/4] ACPI: APEI: Add log_arm_hw_error " Shiju Jose
2019-08-21 17:22   ` [PATCH RFC 0/4] ACPI: APEI: Add support to notify the vendor specific HW errors James Morse
2019-08-22 16:56     ` Shiju Jose
2019-10-03 17:21       ` James Morse
2019-10-16 16:33 ` [PATCH 0/7] rasdaemon: add fixes, database closure and signal handling Shiju Jose
2019-10-16 16:33   ` [PATCH 1/7] rasdaemon: fix cleanup issues in ras-events.c:read_ras_event_all_cpus() Shiju Jose
2019-10-16 16:33   ` [PATCH 2/7] rasdaemon: fix memory leak in ras-events.c:handle_ras_events() Shiju Jose
2019-10-16 16:33   ` [PATCH 3/7] rasdaemon: fix missing fclose in ras-events.c:select_tracing_timestamp() Shiju Jose
2019-10-16 16:33   ` [PATCH 4/7] rasdaemon: fix memory leak in ras-events.c:add_event_handler() Shiju Jose
2019-10-16 16:33   ` [PATCH 5/7] rasdaemon: delete multiple definitions of ARRAY_SIZE Shiju Jose
2019-10-16 16:34   ` [PATCH 6/7] rasdaemon: add closure and cleanups for the database Shiju Jose
2019-10-16 16:34   ` [PATCH 7/7] rasdaemon: add signal handling for the cleanup Shiju Jose
2019-11-13 16:38   ` [PATCH 0/7] rasdaemon: add fixes, database closure and signal handling Shiju Jose
2019-11-20  4:37     ` Mauro Carvalho Chehab
2019-11-13 16:31 ` [PATCH rasdaemon 0/2] rasdaemon: add fix for the sql table Shiju Jose
2019-11-13 16:31   ` [PATCH rasdaemon 1/2] rasdaemon: fix for the ras-record.c:ras_mc_prepare_stmt() failure when new fields added to " Shiju Jose
2019-11-13 16:31   ` [PATCH rasdaemon 2/2] rasdaemon: store PCIe dev name and TLP header for the aer event Shiju Jose
2020-01-15 11:01 ` [RFC PATCH 0/2] ACPI: APEI: Add support to notify the vendor specific HW errors Shiju Jose
2020-01-15 11:01   ` [RFC PATCH 1/2] " Shiju Jose
2020-01-18 15:18     ` kbuild test robot
2020-01-15 11:01   ` [RFC PATCH 2/2] PCI:hip08:Add driver to handle HiSilicon hip08 PCIe controller's errors Shiju Jose
2020-01-15 14:13     ` Bjorn Helgaas
2020-01-17  9:40       ` Shiju Jose
2020-01-24 12:39 ` [PATCH v2 0/2] ACPI: APEI: Add support to notify the vendor specific HW errors Shiju Jose
2020-01-24 12:39   ` [PATCH v2 1/2] " Shiju Jose
2020-01-24 12:39   ` [PATCH v2 2/2] PCI: hip: Add handling of HiSilicon hip PCIe controller's errors Shiju Jose
2020-01-24 14:30     ` Bjorn Helgaas
2020-01-26 18:12     ` kbuild test robot
2020-01-26 18:12       ` kbuild test robot
2020-01-26 18:12     ` [RFC PATCH] PCI: hip: hisi_pcie_sec_type can be static kbuild test robot
2020-01-26 18:12       ` kbuild test robot
2020-02-03 16:51 ` [PATCH v3 0/2] ACPI: APEI: Add support to notify the vendor specific HW errors Shiju Jose
2020-02-03 16:51   ` [PATCH v3 1/2] " Shiju Jose
2020-02-03 16:51   ` [PATCH v3 2/2] PCI: HIP: Add handling of HiSilicon HIP PCIe controller's errors Shiju Jose
2020-02-04 14:31     ` Dan Carpenter
2020-02-04 14:31       ` Dan Carpenter
2020-02-04 14:31       ` Dan Carpenter
2020-02-07 10:31 ` [PATCH v4 0/2] ACPI: APEI: Add support to notify the vendor specific HW errors Shiju Jose
2020-02-07 10:31   ` [PATCH v4 1/2] " Shiju Jose
2020-03-11 17:29     ` James Morse
2020-03-12 12:10       ` Shiju Jose
2020-03-13 15:17         ` James Morse
2020-03-13 17:08           ` Shiju Jose
2020-02-07 10:31   ` [PATCH v4 2/2] PCI: HIP: Add handling of HiSilicon HIP PCIe controller errors Shiju Jose
2020-03-09  9:23   ` [PATCH v4 0/2] ACPI: APEI: Add support to notify the vendor specific HW errors Shiju Jose
2020-03-11 17:27     ` James Morse
2020-03-25 16:42 ` [PATCH v6 0/2] ACPI / " Shiju Jose
2020-03-25 16:42   ` [PATCH v6 1/2] " Shiju Jose
2020-03-27 18:22     ` Borislav Petkov [this message]
2020-03-30 10:14       ` Shiju Jose
2020-03-30 10:33         ` Borislav Petkov
2020-03-30 11:55           ` Shiju Jose
2020-03-30 13:42             ` Borislav Petkov
2020-03-30 15:44               ` Shiju Jose
2020-03-31  9:09                 ` Borislav Petkov
2020-04-08  9:20                   ` Shiju Jose
2020-04-08 10:03       ` James Morse
2020-04-21 13:18         ` Shiju Jose
2020-05-11 11:20         ` Shiju Jose
2020-03-25 16:42   ` [PATCH v6 2/2] PCI: hip: Add handling of HiSilicon HIP PCIe controller errors Shiju Jose
2020-03-27 15:07   ` [PATCH v6 0/2] ACPI / APEI: Add support to notify the vendor specific HW errors Bjorn Helgaas
2020-04-07 12:00 ` [v7 PATCH 0/6] ACPI / APEI: Add support to notify non-fatal " Shiju Jose
2020-04-07 12:00   ` [v7 PATCH 1/6] ACPI / APEI: Add support to queuing up the non-fatal HW errors and notify Shiju Jose
2020-04-08 19:41     ` kbuild test robot
2020-04-08 19:41       ` kbuild test robot
2020-04-08 19:41     ` [RFC PATCH] ACPI / APEI: ghes_gdata_pool_init() can be static kbuild test robot
2020-04-08 19:41       ` kbuild test robot
2020-04-07 12:00   ` [v7 PATCH 2/6] ACPI / APEI: Add callback for memory errors to the GHES notifier Shiju Jose
2020-04-07 12:00   ` [v7 PATCH 3/6] ACPI / APEI: Add callback for AER " Shiju Jose
2020-04-07 12:00   ` [v7 PATCH 4/6] ACPI / APEI: Add callback for ARM HW errors " Shiju Jose
2020-04-07 12:00   ` [v7 PATCH 5/6] ACPI / APEI: Add callback for non-standard " Shiju Jose
2020-04-07 12:00   ` [v7 PATCH 6/6] PCI: hip: Add handling of HiSilicon HIP PCIe controller errors Shiju Jose
2020-04-07 22:03     ` kbuild test robot
2020-04-07 22:03       ` kbuild test robot
2020-04-21 13:21 ` [RESEND PATCH v7 0/6] ACPI / APEI: Add support to notify non-fatal HW errors Shiju Jose
2020-04-21 13:21   ` [RESEND PATCH v7 1/6] ACPI / APEI: Add support to queuing up the non-fatal HW errors and notify Shiju Jose
2020-04-21 14:12     ` Dan Carpenter
2020-04-21 13:21   ` [RESEND PATCH v7 2/6] ACPI / APEI: Add callback for memory errors to the GHES notifier Shiju Jose
2020-04-21 13:21   ` [RESEND PATCH v7 3/6] ACPI / APEI: Add callback for AER " Shiju Jose
2020-04-21 13:21   ` [RESEND PATCH v7 4/6] ACPI / APEI: Add callback for ARM HW errors " Shiju Jose
2020-04-21 14:14     ` Dan Carpenter
2020-04-21 15:18       ` Shiju Jose
2020-04-21 13:21   ` [RESEND PATCH v7 5/6] ACPI / APEI: Add callback for non-standard " Shiju Jose
2020-04-21 13:21   ` [RESEND PATCH v7 6/6] PCI: hip: Add handling of HiSilicon HIP PCIe controller errors Shiju Jose
2020-04-21 14:20     ` Dan Carpenter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200327182214.GD8015@zn.tnic \
    --to=bp@alien8.de \
    --cc=gregkh@linuxfoundation.org \
    --cc=helgaas@kernel.org \
    --cc=james.morse@arm.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=rjw@rjwysocki.net \
    --cc=shiju.jose@huawei.com \
    --cc=tanxiaofei@huawei.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=yangyicong@hisilicon.com \
    --cc=zhangliguang@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.