From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Baicar, Tyler" Subject: Re: [PATCH V4 01/10] acpi: apei: read ack upon ghes record consumption Date: Mon, 24 Oct 2016 14:28:09 -0600 Message-ID: <768254df-7b4a-65a4-81bf-665f63780073@codeaurora.org> References: <1477071013-29563-1-git-send-email-tbaicar@codeaurora.org> <1477071013-29563-2-git-send-email-tbaicar@codeaurora.org> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: kvmarm-bounces@lists.cs.columbia.edu Sender: kvmarm-bounces@lists.cs.columbia.edu To: Suzuki K Poulose , marc.zyngier@arm.com, pbonzini@redhat.com, rkrcmar@redhat.com, linux@armlinux.org.uk, catalin.marinas@arm.com, will.deacon@arm.com, rjw@rjwysocki.net, lenb@kernel.org, matt@codeblueprint.co.uk, robert.moore@intel.com, lv.zheng@intel.com, nkaje@codeaurora.org, zjzhang@codeaurora.org, mark.rutland@arm.com, james.morse@arm.com, akpm@linux-foundation.org, eun.taik.lee@samsung.com, sandeepa.s.prabhu@gmail.com, shijie.huang@arm.com, rruigrok@codeaurora.org, paul.gortmaker@windriver.com, tomasz.nowicki@linaro.org, fu.wei@linaro.org, rostedt@goodmis.org, bristot@redhat.com, linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org, linux-efi@vger.kernel.org, punit.agrawal@arm.com, astone@redhat.com, harba@codeaurora.org, hanjun.guo@linaro.org List-Id: linux-efi@vger.kernel.org On 10/24/2016 2:51 AM, Suzuki K Poulose wrote: > On 21/10/16 18:30, Tyler Baicar wrote: >> A RAS (Reliability, Availability, Serviceability) controller >> may be a separate processor running in parallel with OS >> execution, and may generate error records for consumption by >> the OS. If the RAS controller produces multiple error records, >> then they may be overwritten before the OS has consumed them. >> >> The Generic Hardware Error Source (GHES) v2 structure >> introduces the capability for the OS to acknowledge the >> consumption of the error record generated by the RAS >> controller. A RAS controller supporting GHESv2 shall wait for >> the acknowledgment before writing a new error record, thus >> eliminating the race condition. >> >> Signed-off-by: Jonathan (Zhixiong) Zhang >> Signed-off-by: Richard Ruigrok >> Signed-off-by: Tyler Baicar >> Signed-off-by: Naveen Kaje >> --- >> drivers/acpi/apei/ghes.c | 42 >> ++++++++++++++++++++++++++++++++++++++++++ >> drivers/acpi/apei/hest.c | 7 +++++-- >> include/acpi/ghes.h | 5 ++++- >> 3 files changed, 51 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c >> index 60746ef..7d020b0 100644 >> --- a/drivers/acpi/apei/ghes.c >> +++ b/drivers/acpi/apei/ghes.c >> @@ -45,6 +45,7 @@ >> #include >> #include >> >> +#include >> #include >> #include >> #include >> @@ -79,6 +80,10 @@ >> ((struct acpi_hest_generic_status *) \ >> ((struct ghes_estatus_node *)(estatus_node) + 1)) >> >> +#define HEST_TYPE_GENERIC_V2(ghes) \ >> + ((struct acpi_hest_header *)ghes->generic)->type == \ >> + ACPI_HEST_TYPE_GENERIC_ERROR_V2 >> + >> /* >> * This driver isn't really modular, however for the time being, >> * continuing to use module_param is the easiest way to remain >> @@ -248,7 +253,15 @@ static struct ghes *ghes_new(struct >> acpi_hest_generic *generic) >> ghes = kzalloc(sizeof(*ghes), GFP_KERNEL); >> if (!ghes) >> return ERR_PTR(-ENOMEM); >> + >> ghes->generic = generic; >> + if (HEST_TYPE_GENERIC_V2(ghes)) { >> + rc = apei_map_generic_address( >> + &ghes->generic_v2->read_ack_register); >> + if (rc) >> + goto err_unmap; > > I think should be goto err_free, see more below. > >> + } >> + >> rc = apei_map_generic_address(&generic->error_status_address); >> if (rc) >> goto err_free; >> @@ -270,6 +283,9 @@ static struct ghes *ghes_new(struct >> acpi_hest_generic *generic) >> >> err_unmap: >> apei_unmap_generic_address(&generic->error_status_address); >> + if (HEST_TYPE_GENERIC_V2(ghes)) >> + apei_unmap_generic_address( >> + &ghes->generic_v2->read_ack_register); > > We might end up trying to unmap (error_status_address) which is not > mapped > if we hit the error in mapping read_ack_register. The > read_ack_register unmap > hunk should be moved below to err_free. > This needs to be changed, I'll add a separate label for unmapping read_ack_register and error_status_address for the case that the read_ack_register map succeeds but the error_status_address map fails. err_unmap_status_addr: apei_unmap_generic_address(&generic->error_status_address); err_unmap_read_ack_addr: if (HEST_TYPE_GENERIC_V2(ghes)) apei_unmap_generic_address( &ghes->generic_v2->read_ack_register); err_free: kfree(ghes); return ERR_PTR(rc); If mapping read_ack_register fails, goto err_free. If mapping read_ack_register is successful but mapping error_status_address fails, goto err_unmap_read_ack_addr. And if both mappings succeed but the kmalloc fails, then goto err_unmap_status_addr. > >> err_free: >> kfree(ghes); >> return ERR_PTR(rc); >> @@ -279,6 +295,9 @@ static void ghes_fini(struct ghes *ghes) >> { >> kfree(ghes->estatus); >> apei_unmap_generic_address(&ghes->generic->error_status_address); >> + if (HEST_TYPE_GENERIC_V2(ghes)) >> + apei_unmap_generic_address( >> + &ghes->generic_v2->read_ack_register); >> } >> >> static inline int ghes_severity(int severity) >> @@ -648,6 +667,23 @@ static void ghes_estatus_cache_add( >> rcu_read_unlock(); >> } >> > >> +static int ghes_do_read_ack(struct acpi_hest_generic_v2 *generic_v2) > > nit: We are actually writing something to the read_ack_register. The > names > read_ack_register (which may be as per standard) and more importantly the > function name (ghes_do_read_ack) sounds a bit misleading. It is called "Read Ack Register" in the spec (ACPI 6.1 table 18-344), but I agree the function name can be improved. Maybe ghes_acknowledge_error or ghes_ack_error. Thanks, Tyler > > Rest looks fine to me. > > Suzuki > -- Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.