All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU
@ 2014-05-20  2:11 Chen Yucong
  2014-05-20 17:33 ` Borislav Petkov
  2014-05-21  1:40 ` Hidetoshi Seto
  0 siblings, 2 replies; 15+ messages in thread
From: Chen Yucong @ 2014-05-20  2:11 UTC (permalink / raw)
  To: tony.luck
  Cc: bp, ak, ying.huang, seto.hidetoshi, linux-kernel, linux-edac,
	Chen Yucong

mces_seen is a Per-CPU variable which should only be accessed by Per-CPU as possible. So the
clear operation of mces_seen should also be lcoal to Per-CPU rather than monarch CPU.

Meanwhile, there is also a potential risk that mces_seen will not be be cleared if a timeout
occors in mce_end for monarch CPU. As a reuslt, the stale value of mces_seen will reappear
on the next mce.

Based on the above reasons, this patch distirbute the clear operation of mces_seen to Per-CPU
rather than only monarch CPU.

* From v1
 * Add Reviewed-by: Andi Kleen
 * Put the clear operation of mces_seen before the MCG_STATUS write.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Chen Yucong <slaoub@gmail.com>
---
 arch/x86/kernel/cpu/mcheck/mce.c |   18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 68317c8..966a5f5 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -785,13 +785,6 @@ static void mce_reign(void)
 	 */
 	if (global_worst <= MCE_KEEP_SEVERITY && mca_cfg.tolerant < 3)
 		mce_panic("Machine check from unknown source", NULL, NULL);
-
-	/*
-	 * Now clear all the mces_seen so that they don't reappear on
-	 * the next mce.
-	 */
-	for_each_possible_cpu(cpu)
-		memset(&per_cpu(mces_seen, cpu), 0, sizeof(struct mce));
 }
 
 static atomic_t global_nwo;
@@ -1137,9 +1130,6 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 		}
 	}
 
-	/* mce_clear_state will clear *final, save locally for use later */
-	m = *final;
-
 	if (!no_way_out)
 		mce_clear_state(toclear);
 
@@ -1161,7 +1151,7 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 			mce_panic("Fatal machine check on current CPU", &m, msg);
 		if (worst == MCE_AR_SEVERITY) {
 			/* schedule action before return to userland */
-			mce_save_info(m.addr, m.mcgstatus & MCG_STATUS_RIPV);
+			mce_save_info(final->addr, final->mcgstatus & MCG_STATUS_RIPV);
 			set_thread_flag(TIF_MCE_NOTIFY);
 		} else if (kill_it) {
 			force_sig(SIGBUS, current);
@@ -1170,6 +1160,12 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 
 	if (worst > 0)
 		mce_report_event(regs);
+
+	/*
+	 * Now clear the mces_seen of current CPU -*final - so that it does not
+	 * reappear on the next mce.
+	 */
+	memset(final, 0, sizeof(struct mce));
 	mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
 out:
 	atomic_dec(&mce_entry);
-- 
1.7.10.4


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU
  2014-05-20  2:11 [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU Chen Yucong
@ 2014-05-20 17:33 ` Borislav Petkov
  2014-05-21  0:48   ` Chen Yucong
  2014-05-21  1:33   ` Chen Yucong
  2014-05-21  1:40 ` Hidetoshi Seto
  1 sibling, 2 replies; 15+ messages in thread
From: Borislav Petkov @ 2014-05-20 17:33 UTC (permalink / raw)
  To: Chen Yucong
  Cc: tony.luck, ak, ying.huang, seto.hidetoshi, linux-kernel, linux-edac

On Tue, May 20, 2014 at 10:11:25AM +0800, Chen Yucong wrote:
> mces_seen is a Per-CPU variable which should only be accessed by
> Per-CPU as possible. So the clear operation of mces_seen should also
> be lcoal to Per-CPU rather than monarch CPU.
>
> Meanwhile, there is also a potential risk that mces_seen will not
> be be cleared if a timeout occors in mce_end for monarch CPU. As a
> reuslt, the stale value of mces_seen will reappear on the next mce.

I don't know how many times I have to tell you this already: if we reach
the timeout, we have a much bigger friggin' problem!

What you could do instead is make the machine panic in the tolerant==1,
i.e., the default case, in mce_timed_out().

Basically, in the case any core is stuck and we reach a timeout, we want
to panic the whole box immediately. There's a very little chance we can
recover so panic is the only sane thing left to do.

Ok?

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU
  2014-05-20 17:33 ` Borislav Petkov
@ 2014-05-21  0:48   ` Chen Yucong
  2014-05-21  1:33   ` Chen Yucong
  1 sibling, 0 replies; 15+ messages in thread
From: Chen Yucong @ 2014-05-21  0:48 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: tony.luck, ak, ying.huang, seto.hidetoshi, linux-kernel, linux-edac

On Tue, 2014-05-20 at 19:33 +0200, Borislav Petkov wrote:
> On Tue, May 20, 2014 at 10:11:25AM +0800, Chen Yucong wrote:
> > mces_seen is a Per-CPU variable which should only be accessed by
> > Per-CPU as possible. So the clear operation of mces_seen should also
> > be lcoal to Per-CPU rather than monarch CPU.
> >
> > Meanwhile, there is also a potential risk that mces_seen will not
> > be be cleared if a timeout occors in mce_end for monarch CPU. As a
> > reuslt, the stale value of mces_seen will reappear on the next mce.
> 
> I don't know how many times I have to tell you this already: if we reach
> the timeout, we have a much bigger friggin' problem!

Even if we do not take into account timeout, we should distribute the
clear operation of mces_seen to Per-CPU rather then monarch CPU.
mce_regin, which is only called by monarch CPU, can be used for system
panics as quickly as possible if there is a truly data corrupting error.
But Monarch CPU don't have to help all other CPU to clean mces_clean.
One advantage of Per-CPU is the isolation of errors propagation, being
so, why do not we clean mces_seen by Per-CPU?

You say, "you need to do the cleaning in mce_reign because the monarch
cpu has to run last after all other cpus have scanned their mce banks."
But this is not an adequate explanation.

thx!
cyc

> 
> What you could do instead is make the machine panic in the tolerant==1,
> i.e., the default case, in mce_timed_out().
> 
> Basically, in the case any core is stuck and we reach a timeout, we want
> to panic the whole box immediately. There's a very little chance we can
> recover so panic is the only sane thing left to do.
> 
> Ok?
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU
  2014-05-20 17:33 ` Borislav Petkov
  2014-05-21  0:48   ` Chen Yucong
@ 2014-05-21  1:33   ` Chen Yucong
  1 sibling, 0 replies; 15+ messages in thread
From: Chen Yucong @ 2014-05-21  1:33 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: tony.luck, ak, ying.huang, seto.hidetoshi, linux-kernel, linux-edac

On Tue, 2014-05-20 at 19:33 +0200, Borislav Petkov wrote:
> On Tue, May 20, 2014 at 10:11:25AM +0800, Chen Yucong wrote:
> > mces_seen is a Per-CPU variable which should only be accessed by
> > Per-CPU as possible. So the clear operation of mces_seen should also
> > be lcoal to Per-CPU rather than monarch CPU.
> >
> > Meanwhile, there is also a potential risk that mces_seen will not
> > be be cleared if a timeout occors in mce_end for monarch CPU. As a
> > reuslt, the stale value of mces_seen will reappear on the next mce.
> 
> I don't know how many times I have to tell you this already: if we reach
> the timeout, we have a much bigger friggin' problem!
> 
> What you could do instead is make the machine panic in the tolerant==1,
> i.e., the default case, in mce_timed_out().
> 
> Basically, in the case any core is stuck and we reach a timeout, we want
> to panic the whole box immediately. There's a very little chance we can
> recover so panic is the only sane thing left to do.
> 
> Ok?

Do you have any strong reasons to claim that a timeout is raised by any
core which is stuck? In other word, what is the probability that an
timeout is caused by stuck cores?

thx!
cyc 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU
  2014-05-20  2:11 [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU Chen Yucong
  2014-05-20 17:33 ` Borislav Petkov
@ 2014-05-21  1:40 ` Hidetoshi Seto
  2014-05-21  2:03   ` Chen Yucong
  1 sibling, 1 reply; 15+ messages in thread
From: Hidetoshi Seto @ 2014-05-21  1:40 UTC (permalink / raw)
  To: Chen Yucong, tony.luck; +Cc: bp, ak, ying.huang, linux-kernel, linux-edac

(2014/05/20 11:11), Chen Yucong wrote:
> mces_seen is a Per-CPU variable which should only be accessed by Per-CPU as possible. So the
> clear operation of mces_seen should also be lcoal to Per-CPU rather than monarch CPU.

I don't think it should be local.
Originally what we want to have here is memory to save mces_seen for each online cpus,
such as a global array like mces_seen[cpus]. But at same time we don't want to preallocate
big array enough for max possible cpus. So we use per-cpu store instead. 

> 
> Meanwhile, there is also a potential risk that mces_seen will not be be cleared if a timeout
> occors in mce_end for monarch CPU. As a reuslt, the stale value of mces_seen will reappear
> on the next mce.

In case if we decide panic in mce path, uncleared mces_seen can be referred from memory dump.
I suppose it will help trouble investigation.

> 
> Based on the above reasons, this patch distirbute the clear operation of mces_seen to Per-CPU
> rather than only monarch CPU.

"local" "occurs" "result" "distribute"
I recommend you to use spell-checker for your postings...


Thanks,
H.Seto


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU
  2014-05-21  1:40 ` Hidetoshi Seto
@ 2014-05-21  2:03   ` Chen Yucong
  2014-05-21  2:43     ` Hidetoshi Seto
  0 siblings, 1 reply; 15+ messages in thread
From: Chen Yucong @ 2014-05-21  2:03 UTC (permalink / raw)
  To: Hidetoshi Seto; +Cc: tony.luck, bp, ak, ying.huang, linux-kernel, linux-edac

On Wed, 2014-05-21 at 10:40 +0900, Hidetoshi Seto wrote:
> (2014/05/20 11:11), Chen Yucong wrote:
> > mces_seen is a Per-CPU variable which should only be accessed by Per-CPU as possible. So the
> > clear operation of mces_seen should also be lcoal to Per-CPU rather than monarch CPU.
> 
> I don't think it should be local.
> Originally what we want to have here is memory to save mces_seen for each online cpus,
> such as a global array like mces_seen[cpus]. But at same time we don't want to preallocate
> big array enough for max possible cpus. So we use per-cpu store instead. 
> 
But mces_seen will just be updated by Per-CPU rather than monarch CPU.
It is only read by monarch CPU.
> > 
> > Meanwhile, there is also a potential risk that mces_seen will not be be cleared if a timeout
> > occors in mce_end for monarch CPU. As a reuslt, the stale value of mces_seen will reappear
> > on the next mce.
> 
> In case if we decide panic in mce path, uncleared mces_seen can be referred from memory dump.
> I suppose it will help trouble investigation.
> 
> > 
> > Based on the above reasons, this patch distirbute the clear operation of mces_seen to Per-CPU
> > rather than only monarch CPU.
> 
> "local" "occurs" "result" "distribute"
> I recommend you to use spell-checker for your postings...
Thanks very much for your suggestion, I will use a spell-cheker for my
carelessness. 

thx!
cyc


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU
  2014-05-21  2:03   ` Chen Yucong
@ 2014-05-21  2:43     ` Hidetoshi Seto
  2014-05-21  3:19       ` Chen Yucong
  0 siblings, 1 reply; 15+ messages in thread
From: Hidetoshi Seto @ 2014-05-21  2:43 UTC (permalink / raw)
  To: Chen Yucong; +Cc: tony.luck, bp, ak, ying.huang, linux-kernel, linux-edac

(2014/05/21 11:03), Chen Yucong wrote:
> On Wed, 2014-05-21 at 10:40 +0900, Hidetoshi Seto wrote:
>> (2014/05/20 11:11), Chen Yucong wrote:
>>> mces_seen is a Per-CPU variable which should only be accessed by Per-CPU as possible. So the
>>> clear operation of mces_seen should also be lcoal to Per-CPU rather than monarch CPU.
>>
>> I don't think it should be local.
>> Originally what we want to have here is memory to save mces_seen for each online cpus,
>> such as a global array like mces_seen[cpus]. But at same time we don't want to preallocate
>> big array enough for max possible cpus. So we use per-cpu store instead. 
>>
> But mces_seen will just be updated by Per-CPU rather than monarch CPU.
> It is only read by monarch CPU.

Because mce status registers are per-cpu and monarch cannot access subjects' registers
directly, all subjects read it's status for monarch, store the status to memory for monarch,
and then monarch gather all status to make decision for all.

At last monarch kindly clear gathered status for all.
It will be one of important steps to ready for next mce events.

I think you should clarify why "distributing the clear operation" is required here.
What is the benefit?


Thanks,
H.Seto


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU
  2014-05-21  2:43     ` Hidetoshi Seto
@ 2014-05-21  3:19       ` Chen Yucong
  2014-05-21  3:36         ` Hidetoshi Seto
  0 siblings, 1 reply; 15+ messages in thread
From: Chen Yucong @ 2014-05-21  3:19 UTC (permalink / raw)
  To: Hidetoshi Seto; +Cc: tony.luck, bp, ak, ying.huang, linux-kernel, linux-edac

On Wed, 2014-05-21 at 11:43 +0900, Hidetoshi Seto wrote:
> (2014/05/21 11:03), Chen Yucong wrote:
> > On Wed, 2014-05-21 at 10:40 +0900, Hidetoshi Seto wrote:
> >> (2014/05/20 11:11), Chen Yucong wrote:
> >>> mces_seen is a Per-CPU variable which should only be accessed by Per-CPU as possible. So the
> >>> clear operation of mces_seen should also be lcoal to Per-CPU rather than monarch CPU.
> >>
> >> I don't think it should be local.
> >> Originally what we want to have here is memory to save mces_seen for each online cpus,
> >> such as a global array like mces_seen[cpus]. But at same time we don't want to preallocate
> >> big array enough for max possible cpus. So we use per-cpu store instead. 
> >>
> > But mces_seen will just be updated by Per-CPU rather than monarch CPU.
> > It is only read by monarch CPU.
> 
> Because mce status registers are per-cpu and monarch cannot access subjects' registers
> directly,
Right. This is one reason why we need to distribute the clear operation
to Per-CPU. And in fact it exactly assigns per-cpu property to
mces_seen.

>  all subjects read it's status for monarch, store the status to memory for monarch,
> and then monarch gather all status to make decision for all.

mce_regin, which is only called by monarch CPU, can be used for system
panics as quickly as possible if there is a truly data corrupting error.
But Monarch CPU don't have to help all other CPU to clean mces_clean.
One advantage of Per-CPU is the isolation of errors propagation, being
so, why do not we clean mces_seen by Per-CPU?

thx!
cyc
> 
> At last monarch kindly clear gathered status for all.
> It will be one of important steps to ready for next mce events.
> 
> I think you should clarify why "distributing the clear operation" is required here.
> What is the benefit?
> 




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU
  2014-05-21  3:19       ` Chen Yucong
@ 2014-05-21  3:36         ` Hidetoshi Seto
  2014-05-21 21:09           ` Luck, Tony
  0 siblings, 1 reply; 15+ messages in thread
From: Hidetoshi Seto @ 2014-05-21  3:36 UTC (permalink / raw)
  To: Chen Yucong; +Cc: tony.luck, bp, ak, ying.huang, linux-kernel, linux-edac

(2014/05/21 12:19), Chen Yucong wrote:
> On Wed, 2014-05-21 at 11:43 +0900, Hidetoshi Seto wrote:
>> (2014/05/21 11:03), Chen Yucong wrote:
>>> On Wed, 2014-05-21 at 10:40 +0900, Hidetoshi Seto wrote:
>>>> (2014/05/20 11:11), Chen Yucong wrote:
>>>>> mces_seen is a Per-CPU variable which should only be accessed by Per-CPU as possible. So the
>>>>> clear operation of mces_seen should also be lcoal to Per-CPU rather than monarch CPU.
>>>>
>>>> I don't think it should be local.
>>>> Originally what we want to have here is memory to save mces_seen for each online cpus,
>>>> such as a global array like mces_seen[cpus]. But at same time we don't want to preallocate
>>>> big array enough for max possible cpus. So we use per-cpu store instead. 
>>>>
>>> But mces_seen will just be updated by Per-CPU rather than monarch CPU.
>>> It is only read by monarch CPU.
>>
>> Because mce status registers are per-cpu and monarch cannot access subjects' registers
>> directly,
> Right. This is one reason why we need to distribute the clear operation
> to Per-CPU. And in fact it exactly assigns per-cpu property to
> mces_seen.
> 
>>  all subjects read it's status for monarch, store the status to memory for monarch,
>> and then monarch gather all status to make decision for all.
> 
> mce_regin, which is only called by monarch CPU, can be used for system
> panics as quickly as possible if there is a truly data corrupting error.
> But Monarch CPU don't have to help all other CPU to clean mces_clean.
> One advantage of Per-CPU is the isolation of errors propagation, being
> so, why do not we clean mces_seen by Per-CPU?

What kind of error propagations are you expecting/concerning here?
Could you explain the problem more in detail?


Thanks,
H.Seto


^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU
  2014-05-21  3:36         ` Hidetoshi Seto
@ 2014-05-21 21:09           ` Luck, Tony
  2014-05-23  1:32             ` Chen Yucong
  0 siblings, 1 reply; 15+ messages in thread
From: Luck, Tony @ 2014-05-21 21:09 UTC (permalink / raw)
  To: Hidetoshi Seto, Chen Yucong; +Cc: bp, ak, Huang, Ying, linux-kernel, linux-edac

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1768 bytes --]

>> mce_regin, which is only called by monarch CPU, can be used for system
>> panics as quickly as possible if there is a truly data corrupting error.
>> But Monarch CPU don't have to help all other CPU to clean mces_clean.
>> One advantage of Per-CPU is the isolation of errors propagation, being
>> so, why do not we clean mces_seen by Per-CPU?
>
> What kind of error propagations are you expecting/concerning here?
> Could you explain the problem more in detail?

Please do give us more detail on the scenario that you see that would
make your new version behave better.

I'm sure the current code has no races w.r.t. clearing mces_seen. The
monarch clears them all in mce_reign() before clearing mce_executing
at the foot of mce_end() and allowing the others to run again.

Your code has the monarch release all the other cpus from the spinloop
in mce_end() so they will all rush together through the final lines of
do_machine_check().  Some of them will have work to do if they saw
errors - they may have to send signals, or log the error. Others can
fly directly to the end of do_machine_check() and clear MCG_STATUS
and return to executing whatever code was interrupted.

So it is possible that some processors will be out doing things that can
generate another machine check, before others have finished their
tasks and got to the point to clear mces_seen.(*)

-Tony

(*) maybe that doesn't matter because they haven't zeroed MCG_STATUS
yet - so this second machine check will force those cpus to shutdown. See MCIP
description in "15.3.1.2 IA32_MCG_STATUS_MSR" section of software
developer manual.

ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU
  2014-05-21 21:09           ` Luck, Tony
@ 2014-05-23  1:32             ` Chen Yucong
  2014-05-23  9:10               ` Borislav Petkov
  2014-05-23 21:50               ` Tony Luck
  0 siblings, 2 replies; 15+ messages in thread
From: Chen Yucong @ 2014-05-23  1:32 UTC (permalink / raw)
  To: Luck, Tony; +Cc: Hidetoshi Seto, bp, ak, Huang, Ying, linux-kernel, linux-edac

On Wed, 2014-05-21 at 21:09 +0000, Luck, Tony wrote:

> Please do give us more detail on the scenario that you see that would
> make your new version behave better.
> 
> I'm sure the current code has no races w.r.t. clearing mces_seen. The
> monarch clears them all in mce_reign() before clearing mce_executing
> at the foot of mce_end() and allowing the others to run again.
> 
Right. There are not races for cleaning mces_seen. But, if a timeout
occurs in monarch, mces_seen will be not cleaned. It will affect all
other CPUs.
As Borislav Petkov says, if we reach a timeout, there is very little
chance for recovering. Thought. the probability for this situation to
happen is very slight, it's not impossible. Indeed, it's hard to know
the precise causes for timeout.

As Naoya Horiguchi says, this patch also have a small benefit that it
can reduce the processing time of monarch CPU. 
> Your code has the monarch release all the other cpus from the spinloop
> in mce_end() so they will all rush together through the final lines of
> do_machine_check().  
No. My code just distribute cleaning operation to Per-CPU. And all other
CPUs still have to wait for clearing mce_executing by monarch. 

In fact, mces_seen is just used for system panics as quickly as possible
if there is a truly data corrupting error. So there is not advantage for
cleaning mces_seen in the monarch. 
> Some of them will have work to do if they saw
> errors - they may have to send signals, or log the error. Others can
> fly directly to the end of do_machine_check() and clear MCG_STATUS
> and return to executing whatever code was interrupted.
> 
> So it is possible that some processors will be out doing things that can
> generate another machine check, before others have finished their
> tasks and got to the point to clear mces_seen.(*)
> 
> -Tony
> 
> (*) maybe that doesn't matter because they haven't zeroed MCG_STATUS
> yet - so this second machine check will force those cpus to shutdown. See MCIP
> description in "15.3.1.2 IA32_MCG_STATUS_MSR" section of software
> developer manual.
Right. This I also know. This is the reason why you can find the
following snippet in my code:

       /*
        * Now clear the mces_seen of current CPU -*final - so that it
does not
        * reappear on the next mce.
        */
       memset(final, 0, sizeof(struct mce));
       mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);

---
Thanks very much for your reply.
cyc





^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU
  2014-05-23  1:32             ` Chen Yucong
@ 2014-05-23  9:10               ` Borislav Petkov
  2014-05-23 11:57                 ` Chen Yucong
  2014-05-23 21:50               ` Tony Luck
  1 sibling, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2014-05-23  9:10 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Chen Yucong, Hidetoshi Seto, ak, Huang, Ying, linux-kernel, linux-edac

On Fri, May 23, 2014 at 09:32:19AM +0800, Chen Yucong wrote:
> ...if we reach a timeout, there is very little
> chance for recovering. Thought. the probability for this situation to
> happen is very slight, it's not impossible. Indeed, it's hard to know
> the precise causes for timeout.

Ok, enough talking, let's close that hole and get on with our lives:

---
From: Borislav Petkov <bp@suse.de>
Date: Fri, 23 May 2014 11:06:35 +0200
Subject: [PATCH] mce: Panic when a core has reached a timeout

There is very little and maybe practically nothing we can do to recover
from a system where at least one core has reached a timeout during the
whole monarch cores gathering. So panic when that happens.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mcheck/mce.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index bfde4871848f..529ccc488f5a 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -704,8 +704,7 @@ static int mce_timed_out(u64 *t)
 	if (!mca_cfg.monarch_timeout)
 		goto out;
 	if ((s64)*t < SPINUNIT) {
-		/* CHECKME: Make panic default for 1 too? */
-		if (mca_cfg.tolerant < 1)
+		if (mca_cfg.tolerant <= 1)
 			mce_panic("Timeout synchronizing machine check over CPUs",
 				  NULL, NULL);
 		cpu_missing = 1;
-- 
1.9.0

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU
  2014-05-23  9:10               ` Borislav Petkov
@ 2014-05-23 11:57                 ` Chen Yucong
  2014-05-23 22:40                   ` Tony Luck
  0 siblings, 1 reply; 15+ messages in thread
From: Chen Yucong @ 2014-05-23 11:57 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Luck, Tony, Hidetoshi Seto, ak, Huang, Ying, linux-kernel, linux-edac

On Fri, 2014-05-23 at 11:10 +0200, Borislav Petkov wrote:
> On Fri, May 23, 2014 at 09:32:19AM +0800, Chen Yucong wrote:
> > ...if we reach a timeout, there is very little
> > chance for recovering. Thought. the probability for this situation to
> > happen is very slight, it's not impossible. Indeed, it's hard to know
> > the precise causes for timeout.
OK, we can exclude the timeout. 
Why can not we distribute the clear operations of mces_seen to Per-CPU?
Why must monarch need to help all other CPUs to clean mces_seen? What's
the advantage for this?
Why do we have to discard the property of Per-CPU variable?
Why can not we reduce the processing time of monarch CPU?
... 
> 
> Ok, enough talking, let's close that hole and get on with our lives:
You can safely ignore all messages about this talking.

> 
> There is very little and maybe practically nothing we can do to recover
> from a system where at least one core has reached a timeout during the
> whole monarch cores gathering. So panic when that happens.
> 
Why do you prefer to use "very little" and "maybe practically"?
Do you still not sure about that?


>  	if ((s64)*t < SPINUNIT) {
> -		/* CHECKME: Make panic default for 1 too? */
> -		if (mca_cfg.tolerant < 1)
> +		if (mca_cfg.tolerant <= 1)
If (mca_cfg.tolerant == 2 || mce_cfg.tolerant == 3), what can you do for
it?
>  			mce_panic("Timeout synchronizing machine check over CPUs",
>  				  NULL, NULL);
>  		cpu_missing = 1;
> -- 
> 1.9.0
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU
  2014-05-23  1:32             ` Chen Yucong
  2014-05-23  9:10               ` Borislav Petkov
@ 2014-05-23 21:50               ` Tony Luck
  1 sibling, 0 replies; 15+ messages in thread
From: Tony Luck @ 2014-05-23 21:50 UTC (permalink / raw)
  To: Chen Yucong; +Cc: Hidetoshi Seto, bp, ak, Huang, Ying, linux-kernel, linux-edac

On Thu, May 22, 2014 at 6:32 PM, Chen Yucong <slaoub@gmail.com> wrote:
> As Naoya Horiguchi says, this patch also have a small benefit that it
> can reduce the processing time of monarch CPU.

This is indeed a benefit - but I'm not super worried about performance
of machine check handler.

>        /*
>         * Now clear the mces_seen of current CPU -*final - so that it
> does not
>         * reappear on the next mce.
>         */
>        memset(final, 0, sizeof(struct mce));
>        mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);

But if the monarch hasn't managed to clean mces_seen, then
it certainly hasn't cleared MCG_STATUS ... so there can't be
a "next" mce that would see these old values. Any extra MCE
will result in system reset.

So we are not arguing that your patch is wrong - it just doesn't seem
to be any better that what we have now (except for an unimportant
small performance improvement).

-Tony

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU
  2014-05-23 11:57                 ` Chen Yucong
@ 2014-05-23 22:40                   ` Tony Luck
  0 siblings, 0 replies; 15+ messages in thread
From: Tony Luck @ 2014-05-23 22:40 UTC (permalink / raw)
  To: Chen Yucong
  Cc: Borislav Petkov, Hidetoshi Seto, ak, Huang, Ying, linux-kernel,
	linux-edac

On Fri, May 23, 2014 at 4:57 AM, Chen Yucong <slaoub@gmail.com> wrote:
> If (mca_cfg.tolerant == 2 || mce_cfg.tolerant == 3), what can you do for
> it?

Maybe we need to look again at the effects of "tolerant" - and maybe
specify what happens at various levels,  There are some obvious
silly bits of code (picking one that is my fault):
        if (cfg->tolerant < 3) {
                if (no_way_out)
                        mce_panic("Fatal machine check on current
CPU", &m, msg);
                if (worst == MCE_AR_SEVERITY) {
                        /* schedule action before return to userland */
                        mce_save_info(m.addr, m.mcgstatus & MCG_STATUS_RIPV);
                        set_thread_flag(TIF_MCE_NOTIFY);
                } else if (kill_it) {
                        force_sig(SIGBUS, current);
                }
        }

Why is the MCE_AR_SEVERITY recovery code not even attempted
if tolerant is >=3?  That block of code dates back to before there were
any recoverable cases ... so the insane option of just ignoring the error
and hoping that the end result wasn't too bad made some sort of sense
when compared against a machine crash and not getting any answer at
all.

Or one that Andi pointed out years ago (and had a fix in a tree for):

        if (order == 1) {
                /* CHECKME: Can this race with a parallel hotplug? */
                int cpus = num_online_cpus();

                /*
                 * Monarch: Wait for everyone to go through their scanning
                 * loops.
                 */
                while (atomic_read(&mce_executing) <= cpus) {

What if some cpus were offline when this machine check arrived?
Our "offline" code doesn't do anything to the h/w to prevent those
cpus from joining in the machine check fun. So we'll see more than
num_online_cpus() processors arrive to process the machine check.
Andi's fix was in the start of do_machine_check() and just had each
cpu that showed up check whether it was listed as "online" by Linux.
If not, it just cleared MCG_STATUS and returned.  I didn't apply it
because I thought we needed to be a bit more robust (what if the offline
cpu actually did have a problem? ... we should at least check that
MCG_STATUS.RIPV=1 before rashly returning ... perhaps even more
tests are needed if the cpu had never been online at all).

So I'm happy that you are taking an interest in machine check code.
I think there are places where it can be made a lot better. I don't
think that moving where mces_seen gets cleared is one of those
places.

-Tony

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2014-05-23 22:41 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-20  2:11 [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU Chen Yucong
2014-05-20 17:33 ` Borislav Petkov
2014-05-21  0:48   ` Chen Yucong
2014-05-21  1:33   ` Chen Yucong
2014-05-21  1:40 ` Hidetoshi Seto
2014-05-21  2:03   ` Chen Yucong
2014-05-21  2:43     ` Hidetoshi Seto
2014-05-21  3:19       ` Chen Yucong
2014-05-21  3:36         ` Hidetoshi Seto
2014-05-21 21:09           ` Luck, Tony
2014-05-23  1:32             ` Chen Yucong
2014-05-23  9:10               ` Borislav Petkov
2014-05-23 11:57                 ` Chen Yucong
2014-05-23 22:40                   ` Tony Luck
2014-05-23 21:50               ` Tony Luck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.