linux-edac.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [v5,10/16] x86/mce: enable Hygon support to MCE infrastructure
@ 2018-09-06 11:40 Pu Wen
  0 siblings, 0 replies; 9+ messages in thread
From: Pu Wen @ 2018-09-06 11:40 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: tglx, mingo, hpa, x86, thomas.lendacky, pbonzini, tony.luck,
	linux-kernel, linux-arch, linux-edac

On 2018/9/6 16:39, Borislav Petkov wrote:
> On Thu, Sep 06, 2018 at 11:52:42AM +0800, Pu Wen wrote:
>> In most of the normal use cases MCE is necessary. Rarely in some cases
>> such as for test purpose MCE may be unselected.
>
> Test with MCE disabled? Please elaborate.

The first is the compilation test. Test the kernel with both MCE selected
and unselected to see if is the compiling process successful or not.
The other test is functionality test. For example, during the MCE BIOS
development, to see if the MCE functions OK, it may need to select and
unselect MCE in kernel for double checking.

Actually in normal use scenario and in real product the MCE should be
always selected. To ensure this, I think there are two ways as below:
- Select X86_MCE_AMD in CPU_SUP_HYGON config entry, it also cater to the
  first test scenario, but meanwhile lacks flexibility.
- The linux distros(Ubuntu, CentOS, etc) ensure that X86_MCE_AMD is
  selected to the default config file, and indeed they do.
Which way is better?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [v5,10/16] x86/mce: enable Hygon support to MCE infrastructure
@ 2018-09-06 15:47 Pu Wen
  0 siblings, 0 replies; 9+ messages in thread
From: Pu Wen @ 2018-09-06 15:47 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: tglx, mingo, hpa, x86, thomas.lendacky, pbonzini, tony.luck,
	linux-kernel, linux-arch, linux-edac

On 2018/9/6 20:29, Borislav Petkov wrote:
> Say what now?! What testing do you do if you boot a kernel which doesn't
> even use the functionality you're testing?!

I'm sorry, this case is not good to demonstrate the useage. And I think
there are no other people will do the test without MCE in kernel. So
with MCE enabled in kernel is the main case to support.

> The first way but change that dependency to CPU_SUP_AMD because there
> are people building their own kernels and don't run distro configs. And
> you need CPU_SUP_AMD because you're using their code.

That's good, will select CPU_SUP_AMD for CPU_SUP_HYGON to solve the
dependency issues in kernel.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [v5,10/16] x86/mce: enable Hygon support to MCE infrastructure
@ 2018-09-06 12:29 Borislav Petkov
  0 siblings, 0 replies; 9+ messages in thread
From: Borislav Petkov @ 2018-09-06 12:29 UTC (permalink / raw)
  To: Pu Wen
  Cc: tglx, mingo, hpa, x86, thomas.lendacky, pbonzini, tony.luck,
	linux-kernel, linux-arch, linux-edac

On Thu, Sep 06, 2018 at 07:40:52PM +0800, Pu Wen wrote:
> The first is the compilation test. Test the kernel with both MCE selected
> and unselected to see if is the compiling process successful or not.

This is a kernel compile test - it has nothing to do with actual machine
testing.

> The other test is functionality test. For example, during the MCE BIOS
> development, to see if the MCE functions OK, it may need to select and
> unselect MCE in kernel for double checking.

Say what now?! What testing do you do if you boot a kernel which doesn't
even use the functionality you're testing?!

> Actually in normal use scenario and in real product the MCE should be
> always selected.

It better be!

> To ensure this, I think there are two ways as below:
> - Select X86_MCE_AMD in CPU_SUP_HYGON config entry, it also cater to the
>   first test scenario, but meanwhile lacks flexibility.
> - The linux distros(Ubuntu, CentOS, etc) ensure that X86_MCE_AMD is
>   selected to the default config file, and indeed they do.
> Which way is better?

The first way but change that dependency to CPU_SUP_AMD because there
are people building their own kernels and don't run distro configs. And
you need CPU_SUP_AMD because you're using their code.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [v5,10/16] x86/mce: enable Hygon support to MCE infrastructure
@ 2018-09-06  8:39 Borislav Petkov
  0 siblings, 0 replies; 9+ messages in thread
From: Borislav Petkov @ 2018-09-06  8:39 UTC (permalink / raw)
  To: Pu Wen
  Cc: tglx, mingo, hpa, x86, thomas.lendacky, pbonzini, tony.luck,
	linux-kernel, linux-arch, linux-edac

On Thu, Sep 06, 2018 at 11:52:42AM +0800, Pu Wen wrote:
> In most of the normal use cases MCE is necessary. Rarely in some cases
> such as for test purpose MCE may be unselected.

Test with MCE disabled? Please elaborate.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [v5,10/16] x86/mce: enable Hygon support to MCE infrastructure
@ 2018-09-06  3:52 Pu Wen
  0 siblings, 0 replies; 9+ messages in thread
From: Pu Wen @ 2018-09-06  3:52 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: tglx, mingo, hpa, x86, thomas.lendacky, pbonzini, tony.luck,
	linux-kernel, linux-arch, linux-edac

On 2018/9/5 21:15, Borislav Petkov wrote:
> On Wed, Sep 05, 2018 at 08:59:24PM +0800, Pu Wen wrote:
>> If CONFIG_X86_MCE_AMD=n, mce_hygon_feature_init will call the other
>> one mce_amd_feature_init which is a null function and located in the
>> else branch of "#ifdef CONFIG_X86_MCE_AMD". The compilation is OK and
>> the kernel will run without MCE support.
> 
> So my question was rather ironic but I guess irony can't travel through
> mail. So let me paraphrase: is that a use case you wanna support?
> Because I'd advise very strongly against !MCE kernels.

In most of the normal use cases MCE is necessary. Rarely in some cases
such as for test purpose MCE may be unselected.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [v5,10/16] x86/mce: enable Hygon support to MCE infrastructure
@ 2018-09-05 13:15 Borislav Petkov
  0 siblings, 0 replies; 9+ messages in thread
From: Borislav Petkov @ 2018-09-05 13:15 UTC (permalink / raw)
  To: Pu Wen
  Cc: tglx, mingo, hpa, x86, thomas.lendacky, pbonzini, tony.luck,
	linux-kernel, linux-arch, linux-edac

On Wed, Sep 05, 2018 at 08:59:24PM +0800, Pu Wen wrote:
> If CONFIG_X86_MCE_AMD=n, mce_hygon_feature_init will call the other
> one mce_amd_feature_init which is a null function and located in the
> else branch of "#ifdef CONFIG_X86_MCE_AMD". The compilation is OK and
> the kernel will run without MCE support.

So my question was rather ironic but I guess irony can't travel through
mail. So let me paraphrase: is that a use case you wanna support?
Because I'd advise very strongly against !MCE kernels.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [v5,10/16] x86/mce: enable Hygon support to MCE infrastructure
@ 2018-09-05 12:59 Pu Wen
  0 siblings, 0 replies; 9+ messages in thread
From: Pu Wen @ 2018-09-05 12:59 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: tglx, mingo, hpa, x86, thomas.lendacky, pbonzini, tony.luck,
	linux-kernel, linux-arch, linux-edac

On 2018/9/5 17:22, Borislav Petkov wrote:
>> @@ -214,6 +214,11 @@ static inline void mce_amd_feature_init(struct cpuinfo_x86 *c) { }
>>   #endif
>>   
>> +static inline void mce_hygon_feature_init(struct cpuinfo_x86 *c)
>> +{
>> +	return mce_amd_feature_init(c);
>> +}
>> +
> 
> What happens if CONFIG_X86_MCE_AMD=n ?

If CONFIG_X86_MCE_AMD=n, mce_hygon_feature_init will call the other
one mce_amd_feature_init which is a null function and located in the
else branch of "#ifdef CONFIG_X86_MCE_AMD". The compilation is OK and
the kernel will run without MCE support.

> In general, since you're using AMD code, you need to make CPU_SUP_HYGON
> depend on a bunch of AMD config items like CONFIG_X86_MCE_AMD,
> CONFIG_CPU_SUP_AMD, CONFIG_AMD_NB,... Audit your code and see what other
> config items need to be selected.

All right, will check all the config items which are necessary for
CONFIG_CPU_SUP_HYGON.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [v5,10/16] x86/mce: enable Hygon support to MCE infrastructure
@ 2018-09-05  9:22 Borislav Petkov
  0 siblings, 0 replies; 9+ messages in thread
From: Borislav Petkov @ 2018-09-05  9:22 UTC (permalink / raw)
  To: Pu Wen
  Cc: tglx, mingo, hpa, x86, thomas.lendacky, pbonzini, tony.luck,
	linux-kernel, linux-arch, linux-edac

On Wed, Aug 29, 2018 at 08:44:54PM +0800, Pu Wen wrote:
> Hygon machine check arch is similar to AMD family 17h. To enable the MCE
> infrastructure support, add CPU vendor check for Hygon to share the code
> path of AMD.
> 
> Add hygon mce init function mce_hygon_feature_init() to minimize further
> maintenance effort.
> 
> Signed-off-by: Pu Wen <puwen@hygon.cn>
> ---
>  arch/x86/include/asm/mce.h                |  5 +++++
>  arch/x86/kernel/cpu/mcheck/mce-severity.c |  3 ++-
>  arch/x86/kernel/cpu/mcheck/mce.c          | 21 +++++++++++++++------
>  3 files changed, 22 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
> index 3a17107..12357aa 100644
> --- a/arch/x86/include/asm/mce.h
> +++ b/arch/x86/include/asm/mce.h
> @@ -214,6 +214,11 @@ static inline void mce_amd_feature_init(struct cpuinfo_x86 *c) { }
>  static inline int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) { return -EINVAL; };
>  #endif
>  
> +static inline void mce_hygon_feature_init(struct cpuinfo_x86 *c)
> +{
> +	return mce_amd_feature_init(c);
> +}
> +

What happens if CONFIG_X86_MCE_AMD=n ?

In general, since you're using AMD code, you need to make CPU_SUP_HYGON
depend on a bunch of AMD config items like CONFIG_X86_MCE_AMD,
CONFIG_CPU_SUP_AMD, CONFIG_AMD_NB,... Audit your code and see what other
config items need to be selected.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [v5,10/16] x86/mce: enable Hygon support to MCE infrastructure
@ 2018-08-29 12:44 Pu Wen
  0 siblings, 0 replies; 9+ messages in thread
From: Pu Wen @ 2018-08-29 12:44 UTC (permalink / raw)
  To: tglx, mingo, hpa, x86, thomas.lendacky, bp, pbonzini, tony.luck
  Cc: linux-kernel, linux-arch, linux-edac, Pu Wen

Hygon machine check arch is similar to AMD family 17h. To enable the MCE
infrastructure support, add CPU vendor check for Hygon to share the code
path of AMD.

Add hygon mce init function mce_hygon_feature_init() to minimize further
maintenance effort.

Signed-off-by: Pu Wen <puwen@hygon.cn>
---
 arch/x86/include/asm/mce.h                |  5 +++++
 arch/x86/kernel/cpu/mcheck/mce-severity.c |  3 ++-
 arch/x86/kernel/cpu/mcheck/mce.c          | 21 +++++++++++++++------
 3 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 3a17107..12357aa 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -214,6 +214,11 @@ static inline void mce_amd_feature_init(struct cpuinfo_x86 *c) { }
 static inline int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) { return -EINVAL; };
 #endif
 
+static inline void mce_hygon_feature_init(struct cpuinfo_x86 *c)
+{
+	return mce_amd_feature_init(c);
+}
+
 int mce_available(struct cpuinfo_x86 *c);
 bool mce_is_memory_error(struct mce *m);
 
diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index f34d89c..44396d5 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -336,7 +336,8 @@ int (*mce_severity)(struct mce *m, int tolerant, char **msg, bool is_excp) =
 
 void __init mcheck_vendor_init_severity(void)
 {
-	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
+	    boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
 		mce_severity = mce_severity_amd;
 }
 
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 953b3ce..12408a1 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -270,7 +270,8 @@ static void print_mce(struct mce *m)
 {
 	__print_mce(m);
 
-	if (m->cpuvendor != X86_VENDOR_AMD)
+	if (m->cpuvendor != X86_VENDOR_AMD &&
+	    m->cpuvendor != X86_VENDOR_HYGON)
 		pr_emerg_ratelimited(HW_ERR "Run the above through 'mcelog --ascii'\n");
 }
 
@@ -508,9 +509,9 @@ static int mce_usable_address(struct mce *m)
 
 bool mce_is_memory_error(struct mce *m)
 {
-	if (m->cpuvendor == X86_VENDOR_AMD) {
+	if (m->cpuvendor == X86_VENDOR_AMD ||
+	    m->cpuvendor == X86_VENDOR_HYGON) {
 		return amd_mce_is_memory_error(m);
-
 	} else if (m->cpuvendor == X86_VENDOR_INTEL) {
 		/*
 		 * Intel SDM Volume 3B - 15.9.2 Compound Error Codes
@@ -536,7 +537,9 @@ EXPORT_SYMBOL_GPL(mce_is_memory_error);
 
 static bool mce_is_correctable(struct mce *m)
 {
-	if (m->cpuvendor == X86_VENDOR_AMD && m->status & MCI_STATUS_DEFERRED)
+	if ((m->cpuvendor == X86_VENDOR_AMD ||
+	     m->cpuvendor == X86_VENDOR_HYGON) &&
+	    (m->status & MCI_STATUS_DEFERRED))
 		return false;
 
 	if (m->status & MCI_STATUS_UC)
@@ -1705,7 +1708,8 @@ static int __mcheck_cpu_ancient_init(struct cpuinfo_x86 *c)
  */
 static void __mcheck_cpu_init_early(struct cpuinfo_x86 *c)
 {
-	if (c->x86_vendor == X86_VENDOR_AMD) {
+	if (c->x86_vendor == X86_VENDOR_AMD ||
+	    c->x86_vendor == X86_VENDOR_HYGON) {
 		mce_flags.overflow_recov = !!cpu_has(c, X86_FEATURE_OVERFLOW_RECOV);
 		mce_flags.succor	 = !!cpu_has(c, X86_FEATURE_SUCCOR);
 		mce_flags.smca		 = !!cpu_has(c, X86_FEATURE_SMCA);
@@ -1746,6 +1750,9 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
 		mce_amd_feature_init(c);
 		break;
 		}
+	case X86_VENDOR_HYGON:
+		mce_hygon_feature_init(c);
+		break;
 	case X86_VENDOR_CENTAUR:
 		mce_centaur_feature_init(c);
 		break;
@@ -1971,12 +1978,14 @@ static void mce_disable_error_reporting(void)
 static void vendor_disable_error_reporting(void)
 {
 	/*
-	 * Don't clear on Intel or AMD CPUs. Some of these MSRs are socket-wide.
+	 * Don't clear on Intel or AMD or Hygon CPUs. Some of these MSRs
+	 * are socket-wide.
 	 * Disabling them for just a single offlined CPU is bad, since it will
 	 * inhibit reporting for all shared resources on the socket like the
 	 * last level cache (LLC), the integrated memory controller (iMC), etc.
 	 */
 	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL ||
+	    boot_cpu_data.x86_vendor == X86_VENDOR_HYGON ||
 	    boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
 		return;
 

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-09-06 15:47 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-06 11:40 [v5,10/16] x86/mce: enable Hygon support to MCE infrastructure Pu Wen
  -- strict thread matches above, loose matches on Subject: below --
2018-09-06 15:47 Pu Wen
2018-09-06 12:29 Borislav Petkov
2018-09-06  8:39 Borislav Petkov
2018-09-06  3:52 Pu Wen
2018-09-05 13:15 Borislav Petkov
2018-09-05 12:59 Pu Wen
2018-09-05  9:22 Borislav Petkov
2018-08-29 12:44 Pu Wen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).