linux-edac.vger.kernel.org archive mirror
From: Yazen Ghannam <yazen.ghannam@amd.com>
To: Muralidhara M K <muralimk@amd.com>,
	linux-edac@vger.kernel.org, x86@kernel.org
Cc: yazen.ghannam@amd.com, linux-kernel@vger.kernel.org,
	bp@alien8.de, mingo@redhat.com, mchehab@kernel.org,
	nchatrad@amd.com, Muralidhara M K <muralidhara.mk@amd.com>,
	Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com>
Subject: Re: [PATCH 6/7] EDAC/amd64: Add error instance get_err_info() to pvt->ops
Date: Fri, 21 Jul 2023 10:47:57 -0400
Message-ID: <bf9fbac7-995e-79da-daf6-76cdfaef0e1c@amd.com>
In-Reply-To: <20230720125425.3735538-7-muralimk@amd.com>

On 7/20/2023 8:54 AM, Muralidhara M K wrote:
> From: Muralidhara M K <muralidhara.mk@amd.com>
> 
> On CPUs, the data fabric ID of an instance is equal to the UMC
> number. Since the UMC number and channel are equal on CPU nodes,
> the channel can be used as the data fabric ID of the instance.
> 
> GPU node has 'X' number of PHYs and 'Y' number of channels.
> This results in 'X*Y' number of instances in the data fabric.
> Therefore, the data fabric ID of an instance in a GPU is computed as below:
>    df_inst_id = 'X' * number of channels per PHY + 'Y'
> 
> Co-developed-by: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com>
> Signed-off-by: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com>
> Signed-off-by: Muralidhara M K <muralidhara.mk@amd.com>
> ---
>   drivers/edac/amd64_edac.c | 36 +++++++++++++++++++++++++++++++++++-
>   drivers/edac/amd64_edac.h |  2 ++
>   2 files changed, 37 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index 45d8093c117a..74b2b47cc22a 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -3047,6 +3047,17 @@ static inline void decode_bus_error(int node_id, struct mce *m)
>   	__log_ecc_error(mci, &err, ecc_type);
>   }
>   
> +/*
> + * On CPUs, The data fabric ID of an instance is equal to the UMC number.
> + * and since the UMC number and channel are equal in CPU nodes, the channel can be
> + * used as the data fabric ID of the instance.
> + */
> +static int umc_inst_id(struct mem_ctl_info *mci, struct amd64_pvt *pvt,
> +		       struct err_info *err)
> +{
> +	return err->channel;
> +}
> +
>   /*
>    * To find the UMC channel represented by this bank we need to match on its
>    * instance_id. The instance_id of a bank is held in the lower 32 bits of its
> @@ -3071,6 +3082,7 @@ static void decode_umc_error(int node_id, struct mce *m)
>   	struct mem_ctl_info *mci;
>   	struct amd64_pvt *pvt;
>   	struct err_info err;
> +	u8 df_inst_id;
>   	u64 sys_addr;
>   
>   	node_id = fixup_node_id(node_id, m);
> @@ -3101,8 +3113,9 @@ static void decode_umc_error(int node_id, struct mce *m)
>   	}
>   
>   	pvt->ops->get_err_info(m, &err);
> +	df_inst_id = pvt->ops->get_inst_id(mci, pvt, &err);
>   
> -	if (umc_normaddr_to_sysaddr(m->addr, pvt->mc_node_id, err.channel, &sys_addr)) {
> +	if (umc_normaddr_to_sysaddr(m->addr, pvt->mc_node_id, df_inst_id, &sys_addr)) {
>   		err.err_code = ERR_NORM_ADDR;
>   		goto log_error;
>   	}

This patch is not useful until the address translation is updated. So
let's drop this for now. These changes can be included as part of the
address translation updates.

Thanks,
Yazen
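
For reference, the two instance-ID mappings described in the quoted commit
message can be sketched as below. This is only an illustration under stated
assumptions, not code from the patch: the struct and both function names are
hypothetical, and the GPU formula is read here as "PHY index times channels
per PHY, plus channel index within the PHY", which is one interpretation of
the 'X'/'Y' wording above.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the driver's struct err_info. Only the
 * fields needed for this sketch are present; the real structure differs.
 */
struct err_info_sketch {
	uint8_t phy;     /* assumed: PHY index within a GPU node */
	uint8_t channel; /* CPU: UMC number; GPU: channel index within the PHY */
};

/* CPU node: the UMC number equals the channel, so the channel is the ID. */
static inline uint8_t cpu_df_inst_id(const struct err_info_sketch *err)
{
	return err->channel;
}

/*
 * GPU node (assumed reading of the commit message formula):
 *   df_inst_id = phy * channels_per_phy + channel
 */
static inline uint8_t gpu_df_inst_id(const struct err_info_sketch *err,
				     uint8_t channels_per_phy)
{
	return (uint8_t)(err->phy * channels_per_phy + err->channel);
}
```

With 8 channels per PHY (an arbitrary value chosen for the example), PHY 2
and channel 3 map to instance 2 * 8 + 3 = 19, while the CPU mapping simply
returns the channel.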

