linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
@ 2021-01-15  7:55 Aili Yao
  2021-01-15  8:49 ` Oscar Salvador
  0 siblings, 1 reply; 21+ messages in thread
From: Aili Yao @ 2021-01-15  7:55 UTC (permalink / raw)
  To: naoya.horiguchi; +Cc: linux-mm, yangfeng1

Hello,From 740148005f1333150ee1e119ed8a34454abb7c1d Mon Sep 17 00:00:00 2001
From: Aili Yao <yaoaili@kingsoft.com>
Date: Fri, 15 Jan 2021 14:56:06 +0800
Subject: [PATCH] mm,hwpoison: non-current task should be checked early_kill
 for force_early

Other process may also care the error info with the early-kill flag set.
when force_early is set, and if tsk is not current, leave it to the
early-kill flag check.

Signed-off-by: Aili Yao <yaoaili@kingsoft.com>
---
 mm/memory-failure.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 5a38e9eade94..5e9e591c0929 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -457,8 +457,6 @@ static struct task_struct *task_early_kill(struct task_struct *tsk,
 		 */
 		if (tsk->mm == current->mm)
 			return current;
-		else
-			return NULL;
 	}
 	return find_early_kill_thread(tsk);
 }
-- 
2.25.1



-- 
Best Regards!

Aili Yao


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-15  7:55 [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early Aili Yao
@ 2021-01-15  8:49 ` Oscar Salvador
  2021-01-15  9:26   ` Aili Yao
  0 siblings, 1 reply; 21+ messages in thread
From: Oscar Salvador @ 2021-01-15  8:49 UTC (permalink / raw)
  To: Aili Yao; +Cc: naoya.horiguchi, linux-mm, yangfeng1

On Fri, Jan 15, 2021 at 03:55:06PM +0800, Aili Yao wrote:
> Hello,From 740148005f1333150ee1e119ed8a34454abb7c1d Mon Sep 17 00:00:00 2001
> From: Aili Yao <yaoaili@kingsoft.com>
> Date: Fri, 15 Jan 2021 14:56:06 +0800
> Subject: [PATCH] mm,hwpoison: non-current task should be checked early_kill
>  for force_early
> 
> Other process may also care the error info with the early-kill flag set.
> when force_early is set, and if tsk is not current, leave it to the
> early-kill flag check.

I am having a hard time trying to grasp what are you trying to achieve here.
Could you elaborate some more? Ideally stating what is the problem you are
fixing here.

I was also a bit confused about task_early_kill.
So, if force_early is true (aka. MF_ACTION_REQUIRED was set), we only
return a task struct if we find the task or a thread belonging to the task
(sharing the mm).

If it is not set by the caller of memory_failure, we still want to check the
task/threads' MCE policy to check whether PF_MCE_KILL_EARLY it was set with
prctl, right?

That makes sense to me.

So, back to your change, if force_early was set, but the main task does not
match, you still want to check whether any thread belonging to the main task
has the policy set?

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-15  8:49 ` Oscar Salvador
@ 2021-01-15  9:26   ` Aili Yao
  2021-01-15  9:31     ` Aili Yao
                       ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Aili Yao @ 2021-01-15  9:26 UTC (permalink / raw)
  To: Oscar Salvador; +Cc: naoya.horiguchi, linux-mm, yangfeng1

On Fri, 15 Jan 2021 09:49:24 +0100
Oscar Salvador <osalvador@suse.de> wrote:

> I am having a hard time trying to grasp what are you trying to achieve here.
> Could you elaborate some more? Ideally stating what is the problem you are
> fixing here.
> 
Sorry for confusion, example: there are four process A,B,C,D,which map the same file into
there process space, which set there PF_MCE_KILL_EARLY flag to TRUE, if process A trigger one
UE with  MF_ACTION_REQUIRED set, in current code, only process A will be killed, B,C,D remain
alive, but for the PF_MCE_KILL_EARLY we set, we want B,C,D also be killed.
this is what I want to fix.

> If it is not set by the caller of memory_failure, we still want to check the
> task/threads' MCE policy to check whether PF_MCE_KILL_EARLY it was set with
> prctl, right?
>
Yes, right! 

> That makes sense to me.
> 
> So, back to your change, if force_early was set, but the main task does not
> match, you still want to check whether any thread belonging to the main task
> has the policy set?
> 
Yes.

-- 
Best Regards!

Aili Yao


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-15  9:26   ` Aili Yao
@ 2021-01-15  9:31     ` Aili Yao
  2021-01-15  9:40       ` Oscar Salvador
  2021-01-15 10:31     ` Oscar Salvador
  2021-01-18  5:15     ` HORIGUCHI NAOYA(堀口 直也)
  2 siblings, 1 reply; 21+ messages in thread
From: Aili Yao @ 2021-01-15  9:31 UTC (permalink / raw)
  To: Oscar Salvador; +Cc: naoya.horiguchi, linux-mm, yangfeng1

On Fri, 15 Jan 2021 17:26:22 +0800
Aili Yao <yaoaili@kingsoft.com> wrote:

> > So, back to your change, if force_early was set, but the main task does not
> > match, you still want to check whether any thread belonging to the main task
> > has the policy set?
> >   
> Yes.
> 
Sorry, the code only loop processes, not thread.


-- 
Best Regards!

Aili Yao


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-15  9:31     ` Aili Yao
@ 2021-01-15  9:40       ` Oscar Salvador
  2021-01-15  9:53         ` Aili Yao
  0 siblings, 1 reply; 21+ messages in thread
From: Oscar Salvador @ 2021-01-15  9:40 UTC (permalink / raw)
  To: Aili Yao; +Cc: naoya.horiguchi, linux-mm, yangfeng1

On Fri, Jan 15, 2021 at 05:31:25PM +0800, Aili Yao wrote:
> On Fri, 15 Jan 2021 17:26:22 +0800
> Aili Yao <yaoaili@kingsoft.com> wrote:
> 
> > > So, back to your change, if force_early was set, but the main task does not
> > > match, you still want to check whether any thread belonging to the main task
> > > has the policy set?
> > >   
> > Yes.
> > 
> Sorry, the code only loop processes, not thread.

The loop in task_early_kill goes through processes, while the one in
find_early_kill_thread goes through threads, right?

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-15  9:40       ` Oscar Salvador
@ 2021-01-15  9:53         ` Aili Yao
  0 siblings, 0 replies; 21+ messages in thread
From: Aili Yao @ 2021-01-15  9:53 UTC (permalink / raw)
  To: Oscar Salvador; +Cc: naoya.horiguchi, linux-mm, yangfeng1

On Fri, 15 Jan 2021 10:40:37 +0100
Oscar Salvador <osalvador@suse.de> wrote:

> On Fri, Jan 15, 2021 at 05:31:25PM +0800, Aili Yao wrote:
> > On Fri, 15 Jan 2021 17:26:22 +0800
> > Aili Yao <yaoaili@kingsoft.com> wrote:
> >   
> > > > So, back to your change, if force_early was set, but the main task does not
> > > > match, you still want to check whether any thread belonging to the main task
> > > > has the policy set?
> > > >     
> > > Yes.
> > >   
> > Sorry, the code only loop processes, not thread.  
> 
> The loop in task_early_kill goes through processes, while the one in
> find_early_kill_thread goes through threads, right?
> 
I think you are right, to early_kill, the code wants pick the right thread of one process.


-- 
Best Regards!

Aili Yao


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-15  9:26   ` Aili Yao
  2021-01-15  9:31     ` Aili Yao
@ 2021-01-15 10:31     ` Oscar Salvador
  2021-01-18  5:15     ` HORIGUCHI NAOYA(堀口 直也)
  2 siblings, 0 replies; 21+ messages in thread
From: Oscar Salvador @ 2021-01-15 10:31 UTC (permalink / raw)
  To: Aili Yao; +Cc: naoya.horiguchi, linux-mm, yangfeng1

On Fri, Jan 15, 2021 at 05:26:22PM +0800, Aili Yao wrote:
> On Fri, 15 Jan 2021 09:49:24 +0100
> Oscar Salvador <osalvador@suse.de> wrote:
> 
> > I am having a hard time trying to grasp what are you trying to achieve here.
> > Could you elaborate some more? Ideally stating what is the problem you are
> > fixing here.
> > 
> Sorry for confusion, example: there are four process A,B,C,D,which map the same file into
> there process space, which set there PF_MCE_KILL_EARLY flag to TRUE, if process A trigger one
> UE with  MF_ACTION_REQUIRED set, in current code, only process A will be killed, B,C,D remain
> alive, but for the PF_MCE_KILL_EARLY we set, we want B,C,D also be killed.
> this is what I want to fix.

This should really be part of the changelog. (what is it fixing? how? why?)
All this information should be there.

And the title should be renamed as well as it is bit confusing as is.

"Do not bail out in task_early_kill when force_early is set"

or something along those lines.

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-15  9:26   ` Aili Yao
  2021-01-15  9:31     ` Aili Yao
  2021-01-15 10:31     ` Oscar Salvador
@ 2021-01-18  5:15     ` HORIGUCHI NAOYA(堀口 直也)
  2021-01-18  5:57       ` Aili Yao
  2 siblings, 1 reply; 21+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2021-01-18  5:15 UTC (permalink / raw)
  To: Aili Yao; +Cc: Oscar Salvador, linux-mm, yangfeng1

Hi Aili,

On Fri, Jan 15, 2021 at 05:26:22PM +0800, Aili Yao wrote:
> On Fri, 15 Jan 2021 09:49:24 +0100
> Oscar Salvador <osalvador@suse.de> wrote:
> 
> > I am having a hard time trying to grasp what are you trying to achieve here.
> > Could you elaborate some more? Ideally stating what is the problem you are
> > fixing here.
> > 
> Sorry for confusion, example: there are four process A,B,C,D,which map the same file into
> there process space, which set there PF_MCE_KILL_EARLY flag to TRUE, if process A trigger one
> UE with  MF_ACTION_REQUIRED set, in current code, only process A will be killed, B,C,D remain
> alive, but for the PF_MCE_KILL_EARLY we set, we want B,C,D also be killed.

This behavior seems not to me what PF_MCE_KILL_EARLY intends.  This flag
controls whether memory error handler kills processes immediately or not,
and it only affects action optional cases (i.e. called without
MF_ACTION_REQUIRED).  In MF_ACTION_REQUIRED case, we have no such choice
and affected processes should be always killed immediately.

We may also need to consider the difference in context of these two cases.
Action optional case is called asynchronously by background process like
memory scrubbing, so all processes mapping the error memory are the affected
ones.  Action required event is more synchronous, and is called when a
process experiences memory access errors on data load and instruction fetch
instructions.  So the affected process in this case is only the process.
So I still think the this background justifies the current behavior.

But my knowledge might be old, if you have newer hardwares which define
other type of memory error and that doesn't fit with current implementation,
I'd like to extend code to support the new cases, so please let me know.

Thanks,
Naoya Horiguchi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-18  5:15     ` HORIGUCHI NAOYA(堀口 直也)
@ 2021-01-18  5:57       ` Aili Yao
  2021-01-18  6:50         ` HORIGUCHI NAOYA(堀口 直也)
  0 siblings, 1 reply; 21+ messages in thread
From: Aili Yao @ 2021-01-18  5:57 UTC (permalink / raw)
  To: HORIGUCHI NAOYA(堀口 直也)
  Cc: Oscar Salvador, linux-mm, yangfeng1

On Mon, 18 Jan 2021 05:15:55 +0000
HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@nec.com> wrote:

> Hi Aili,
> 
> On Fri, Jan 15, 2021 at 05:26:22PM +0800, Aili Yao wrote:
> > On Fri, 15 Jan 2021 09:49:24 +0100
> > Oscar Salvador <osalvador@suse.de> wrote:
> >   
> > > I am having a hard time trying to grasp what are you trying to achieve here.
> > > Could you elaborate some more? Ideally stating what is the problem you are
> > > fixing here.
> > >   
> > Sorry for confusion, example: there are four process A,B,C,D,which map the same file into
> > there process space, which set there PF_MCE_KILL_EARLY flag to TRUE, if process A trigger one
> > UE with  MF_ACTION_REQUIRED set, in current code, only process A will be killed, B,C,D remain
> > alive, but for the PF_MCE_KILL_EARLY we set, we want B,C,D also be killed.  
> 
> This behavior seems not to me what PF_MCE_KILL_EARLY intends.  This flag
> controls whether memory error handler kills processes immediately or not,
> and it only affects action optional cases (i.e. called without
> MF_ACTION_REQUIRED).  In MF_ACTION_REQUIRED case, we have no such choice
> and affected processes should be always killed immediately.
> 
> We may also need to consider the difference in context of these two cases.
> Action optional case is called asynchronously by background process like
> memory scrubbing, so all processes mapping the error memory are the affected
> ones.  Action required event is more synchronous, and is called when a
> process experiences memory access errors on data load and instruction fetch
> instructions.  So the affected process in this case is only the process.
> So I still think the this background justifies the current behavior.
> 
> But my knowledge might be old, if you have newer hardwares which define
> other type of memory error and that doesn't fit with current implementation,
> I'd like to extend code to support the new cases, so please let me know.
> 
Sorry, I don't fully get your concern.

For Action optional cases, It's may from CE storm or patrol scrub, when the process want to process this condition,
it will set PF_MCE_KILL_EARLY, and it will be signaled for such case.
For Action Required cases,we must do something, I think it's more urgent and serious, In the current code, the process triggered the Error
Should be signaled. but the process with PF_MCE_KILL_EARLY won't get signaled, just because PF_MCE_KILL_EARLY is for action optional case?

Action Required is for current we must handle, the same Action Required issue is Action optional for non-current processes, Right?
I don't think Action Required is for all processes, For current processes , it may be AR, for other process, it may be AO, and they should also
be signaled, I think this behavior its reasonable. 

And we can't determine which error will be triggered, the PF_MCE_KILL_EARLY fLAG is meant to handle memory error gracefully and won't be restricted
to explicitly declared AO errors.

Thanks!

-- 
Best Regards!

Aili Yao


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-18  5:57       ` Aili Yao
@ 2021-01-18  6:50         ` HORIGUCHI NAOYA(堀口 直也)
  2021-01-18  7:16           ` Aili Yao
  2021-01-18  8:15           ` Aili Yao
  0 siblings, 2 replies; 21+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2021-01-18  6:50 UTC (permalink / raw)
  To: Aili Yao; +Cc: Oscar Salvador, linux-mm, yangfeng1

On Mon, Jan 18, 2021 at 01:57:44PM +0800, Aili Yao wrote:
> On Mon, 18 Jan 2021 05:15:55 +0000
> HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@nec.com> wrote:
> 
> > Hi Aili,
> > 
> > On Fri, Jan 15, 2021 at 05:26:22PM +0800, Aili Yao wrote:
> > > On Fri, 15 Jan 2021 09:49:24 +0100
> > > Oscar Salvador <osalvador@suse.de> wrote:
> > >   
> > > > I am having a hard time trying to grasp what are you trying to achieve here.
> > > > Could you elaborate some more? Ideally stating what is the problem you are
> > > > fixing here.
> > > >   
> > > Sorry for confusion, example: there are four process A,B,C,D,which map the same file into
> > > there process space, which set there PF_MCE_KILL_EARLY flag to TRUE, if process A trigger one
> > > UE with  MF_ACTION_REQUIRED set, in current code, only process A will be killed, B,C,D remain
> > > alive, but for the PF_MCE_KILL_EARLY we set, we want B,C,D also be killed.  
> > 
> > This behavior seems not to me what PF_MCE_KILL_EARLY intends.  This flag
> > controls whether memory error handler kills processes immediately or not,
> > and it only affects action optional cases (i.e. called without
> > MF_ACTION_REQUIRED).  In MF_ACTION_REQUIRED case, we have no such choice
> > and affected processes should be always killed immediately.
> > 
> > We may also need to consider the difference in context of these two cases.
> > Action optional case is called asynchronously by background process like
> > memory scrubbing, so all processes mapping the error memory are the affected
> > ones.  Action required event is more synchronous, and is called when a
> > process experiences memory access errors on data load and instruction fetch
> > instructions.  So the affected process in this case is only the process.
> > So I still think the this background justifies the current behavior.
> > 
> > But my knowledge might be old, if you have newer hardwares which define
> > other type of memory error and that doesn't fit with current implementation,
> > I'd like to extend code to support the new cases, so please let me know.
> > 
> Sorry, I don't fully get your concern.
> 
> For Action optional cases, It's may from CE storm or patrol scrub, ...

hwpoison is not about corrected errors, but about uncorrected errors. CE storm
should be handled by CMCI and userspace tool like mcelog, although it seems not
current main topic, sorry for nitpick.

> when the process want to process this condition,
> it will set PF_MCE_KILL_EARLY, and it will be signaled for such case.
> For Action Required cases,we must do something, I think it's more urgent and serious, In the current code, the process triggered the Error
> Should be signaled. but the process with PF_MCE_KILL_EARLY won't get signaled, just because PF_MCE_KILL_EARLY is for action optional case?

I don't use PF_MCE_KILL_EARLY to justify current code. Let me explain more.

For action optional cases, one error event kills *only one* process. If an
error page are shared by multiple processes, these processes will be killed
by separate error events, each of which is triggered when each process tries
to access the error memory.  So these processes would be killed immediately
when accessing the error, but you don't have to kill all at the same time
(or actually you might not even have to kill it at all if the process exits
finally without accessing the error later).

Maybe the function variable "force_early" is named confusingly (it sounds
that it's related to PF_MCE_KILL_EARLY flag, but that's incorrect).
I'll submit a fix later.  (I'll add your "Reported-by" because you made me
find it, thank you.)

> 
> Action Required is for current we must handle, the same Action Required issue is Action optional for non-current processes, Right?

Right.

> I don't think Action Required is for all processes, For current processes , it may be AR, for other process, it may be AO, and they should also
> be signaled, I think this behavior its reasonable. 
> 
> And we can't determine which error will be triggered, the PF_MCE_KILL_EARLY fLAG is meant to handle memory error gracefully and won't be restricted
> to explicitly declared AO errors.
> 
> Thanks!

Thank you, too.

- Naoya

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-18  6:50         ` HORIGUCHI NAOYA(堀口 直也)
@ 2021-01-18  7:16           ` Aili Yao
  2021-01-18  8:15           ` Aili Yao
  1 sibling, 0 replies; 21+ messages in thread
From: Aili Yao @ 2021-01-18  7:16 UTC (permalink / raw)
  To: HORIGUCHI NAOYA(堀口 直也)
  Cc: Oscar Salvador, linux-mm, yangfeng1

On Mon, 18 Jan 2021 06:50:54 +0000
HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@nec.com> wrote:

> On Mon, Jan 18, 2021 at 01:57:44PM +0800, Aili Yao wrote:
> > On Mon, 18 Jan 2021 05:15:55 +0000
> > HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@nec.com> wrote:
> >   
> > > Hi Aili,
> > > 
> > > On Fri, Jan 15, 2021 at 05:26:22PM +0800, Aili Yao wrote:  
> > > > On Fri, 15 Jan 2021 09:49:24 +0100
> > > > Oscar Salvador <osalvador@suse.de> wrote:
> > > >     
> > > > > I am having a hard time trying to grasp what are you trying to achieve here.
> > > > > Could you elaborate some more? Ideally stating what is the problem you are
> > > > > fixing here.
> > > > >     
> > > > Sorry for confusion, example: there are four process A,B,C,D,which map the same file into
> > > > there process space, which set there PF_MCE_KILL_EARLY flag to TRUE, if process A trigger one
> > > > UE with  MF_ACTION_REQUIRED set, in current code, only process A will be killed, B,C,D remain
> > > > alive, but for the PF_MCE_KILL_EARLY we set, we want B,C,D also be killed.    
> > > 
> > > This behavior seems not to me what PF_MCE_KILL_EARLY intends.  This flag
> > > controls whether memory error handler kills processes immediately or not,
> > > and it only affects action optional cases (i.e. called without
> > > MF_ACTION_REQUIRED).  In MF_ACTION_REQUIRED case, we have no such choice
> > > and affected processes should be always killed immediately.
> > > 
> > > We may also need to consider the difference in context of these two cases.
> > > Action optional case is called asynchronously by background process like
> > > memory scrubbing, so all processes mapping the error memory are the affected
> > > ones.  Action required event is more synchronous, and is called when a
> > > process experiences memory access errors on data load and instruction fetch
> > > instructions.  So the affected process in this case is only the process.
> > > So I still think the this background justifies the current behavior.
> > > 
> > > But my knowledge might be old, if you have newer hardwares which define
> > > other type of memory error and that doesn't fit with current implementation,
> > > I'd like to extend code to support the new cases, so please let me know.
> > >   
> > Sorry, I don't fully get your concern.
> > 
> > For Action optional cases, It's may from CE storm or patrol scrub, ...  
> 
> hwpoison is not about corrected errors, but about uncorrected errors. CE storm
> should be handled by CMCI and userspace tool like mcelog, although it seems not
> current main topic, sorry for nitpick.
> 

When hard page offline is configured, CE will also call memory-failure

> > when the process want to process this condition,
> > it will set PF_MCE_KILL_EARLY, and it will be signaled for such case.
> > For Action Required cases,we must do something, I think it's more urgent and serious, In the current code, the process triggered the Error
> > Should be signaled. but the process with PF_MCE_KILL_EARLY won't get signaled, just because PF_MCE_KILL_EARLY is for action optional case?  
> 
> I don't use PF_MCE_KILL_EARLY to justify current code. Let me explain more.
> 
> For action optional cases, one error event kills *only one* process. If an
> error page are shared by multiple processes, these processes will be killed
> by separate error events, each of which is triggered when each process tries
> to access the error memory.  So these processes would be killed immediately
> when accessing the error, but you don't have to kill all at the same time
> (or actually you might not even have to kill it at all if the process exits
> finally without accessing the error later).
> 
It's not the way PF_MCE_KILL_EARLY want, normally one action optional without PF_MCE_KILL_EARLY will
be signaled when it really access it, when PF_MCE_KILL_EARLY set, we may not just want be killed,
wo may capture the signal and do some thing more.

> Maybe the function variable "force_early" is named confusingly (it sounds
> that it's related to PF_MCE_KILL_EARLY flag, but that's incorrect).
> I'll submit a fix later.  (I'll add your "Reported-by" because you made me
> find it, thank you.)
> 
not related to force_early, this is about the memory action we take for error , but if you have a better one, that's will be good.

> > 
> > Action Required is for current we must handle, the same Action Required issue is Action optional for non-current processes, Right?  
> 
> Right.
> 
> > I don't think Action Required is for all processes, For current processes , it may be AR, for other process, it may be AO, and they should also
> > be signaled, I think this behavior its reasonable. 
> > 
> > And we can't determine which error will be triggered, the PF_MCE_KILL_EARLY fLAG is meant to handle memory error gracefully and won't be restricted
> > to explicitly declared AO errors.
> > 

Thanks


-- 
Best Regards!

Aili Yao


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-18  6:50         ` HORIGUCHI NAOYA(堀口 直也)
  2021-01-18  7:16           ` Aili Yao
@ 2021-01-18  8:15           ` Aili Yao
  2021-01-18  8:57             ` HORIGUCHI NAOYA(堀口 直也)
  1 sibling, 1 reply; 21+ messages in thread
From: Aili Yao @ 2021-01-18  8:15 UTC (permalink / raw)
  To: HORIGUCHI NAOYA(堀口 直也)
  Cc: Oscar Salvador, linux-mm, yangfeng1

On Mon, 18 Jan 2021 06:50:54 +0000
HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@nec.com> wrote:

> 
> For action optional cases, one error event kills *only one* process. If an
> error page are shared by multiple processes, these processes will be killed
> by separate error events, each of which is triggered when each process tries
> to access the error memory.  So these processes would be killed immediately
> when accessing the error, but you don't have to kill all at the same time
> (or actually you might not even have to kill it at all if the process exits
> finally without accessing the error later).
> 
> Maybe the function variable "force_early" is named confusingly (it sounds
> that it's related to PF_MCE_KILL_EARLY flag, but that's incorrect).
> I'll submit a fix later.  (I'll add your "Reported-by" because you made me
> find it, thank you.)
> 
I think we should do more for non current process error case, we should mark it AO for processes to be signaled
or we may take wrong action.


-- 
Best Regards!

Aili Yao


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-18  8:15           ` Aili Yao
@ 2021-01-18  8:57             ` HORIGUCHI NAOYA(堀口 直也)
  2021-01-18  9:09               ` Aili Yao
                                 ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2021-01-18  8:57 UTC (permalink / raw)
  To: Aili Yao; +Cc: Oscar Salvador, linux-mm, yangfeng1

On Mon, Jan 18, 2021 at 04:15:12PM +0800, Aili Yao wrote:
> On Mon, 18 Jan 2021 06:50:54 +0000
> HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@nec.com> wrote:
> 
> > 
> > For action optional cases, one error event kills *only one* process. If an
> > error page are shared by multiple processes, these processes will be killed
> > by separate error events, each of which is triggered when each process tries
> > to access the error memory.  So these processes would be killed immediately
> > when accessing the error, but you don't have to kill all at the same time
> > (or actually you might not even have to kill it at all if the process exits
> > finally without accessing the error later).
> > 
> > Maybe the function variable "force_early" is named confusingly (it sounds
> > that it's related to PF_MCE_KILL_EARLY flag, but that's incorrect).
> > I'll submit a fix later.  (I'll add your "Reported-by" because you made me
> > find it, thank you.)
> > 
> I think we should do more for non current process error case, we should mark it AO for processes to be signaled
> or we may take wrong action.

I'm not sure what you mean by "non current process error case" and "we
should mark it AO", so could you explain more specifically about your error
scenario?  Especially I'd like to know about who triggers hard offline on
what hardware events and what "wrong action" could happen.  Maybe just
"calling memory_failure() with MF_ACTION_REQUIRED" is not enough, because
it's not enough for us to see that your scenario is possible. Current
implementation implicitly assumes some hardware behavior, and does not work
for the case which never happens under the assumption.

Do you have some test cases to reproduce any specific issue (like data lost)
on your system? (If yes, please share it.) Or your concern is from code review?

Thanks,
Naoya Horiguchi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-18  8:57             ` HORIGUCHI NAOYA(堀口 直也)
@ 2021-01-18  9:09               ` Aili Yao
  2021-01-19  5:25                 ` HORIGUCHI NAOYA(堀口 直也)
  2021-01-18  9:24               ` Oscar Salvador
  2021-01-19  4:21               ` Aili Yao
  2 siblings, 1 reply; 21+ messages in thread
From: Aili Yao @ 2021-01-18  9:09 UTC (permalink / raw)
  To: HORIGUCHI NAOYA(堀口 直也)
  Cc: Oscar Salvador, linux-mm, yangfeng1

On Mon, 18 Jan 2021 08:57:47 +0000
HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@nec.com> wrote:

> > > 
> > > For action optional cases, one error event kills *only one* process. If an
> > > error page are shared by multiple processes, these processes will be killed
> > > by separate error events, each of which is triggered when each process tries
> > > to access the error memory.  So these processes would be killed immediately
> > > when accessing the error, but you don't have to kill all at the same time
> > > (or actually you might not even have to kill it at all if the process exits
> > > finally without accessing the error later).
> > > 
> > > Maybe the function variable "force_early" is named confusingly (it sounds
> > > that it's related to PF_MCE_KILL_EARLY flag, but that's incorrect).
> > > I'll submit a fix later.  (I'll add your "Reported-by" because you made me
> > > find it, thank you.)
> > >   
> > I think we should do more for non current process error case, we should mark it AO for processes to be signaled
> > or we may take wrong action.  
> 
> I'm not sure what you mean by "non current process error case" and "we
> should mark it AO", so could you explain more specifically about your error
> scenario?  
  I will share my test code and i will submit another patch to this scenario.
  please give me some time, thanks!
  And I think you are right, AR is only current process.

> Especially I'd like to know about who triggers hard offline on
> what hardware events and what "wrong action" could happen.  Maybe just
> "calling memory_failure() with MF_ACTION_REQUIRED" is not enough, because
> it's not enough for us to see that your scenario is possible. Current
> implementation implicitly assumes some hardware behavior, and does not work
> for the case which never happens under the assumption.
> 
  This action is from mcelog daemon, normally softpage offlie is default, but we can configure
hardpage offline for CE storms, to get related processes signaled.

> Do you have some test cases to reproduce any specific issue (like data lost)
> on your system? (If yes, please share it.) Or your concern is from code review?
>
  I will make it clean, get it shared

Thanks
-- 
Best Regards!

Aili Yao


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-18  8:57             ` HORIGUCHI NAOYA(堀口 直也)
  2021-01-18  9:09               ` Aili Yao
@ 2021-01-18  9:24               ` Oscar Salvador
  2021-01-18  9:38                 ` Aili Yao
  2021-01-19  4:21               ` Aili Yao
  2 siblings, 1 reply; 21+ messages in thread
From: Oscar Salvador @ 2021-01-18  9:24 UTC (permalink / raw)
  To: HORIGUCHI NAOYA(堀口 直也)
  Cc: Aili Yao, linux-mm, yangfeng1

On Mon, Jan 18, 2021 at 08:57:47AM +0000, HORIGUCHI NAOYA(堀口 直也) wrote:
> I'm not sure what you mean by "non current process error case" and "we
> should mark it AO", so could you explain more specifically about your error
> scenario?  Especially I'd like to know about who triggers hard offline on
> what hardware events and what "wrong action" could happen.  Maybe just
> "calling memory_failure() with MF_ACTION_REQUIRED" is not enough, because
> it's not enough for us to see that your scenario is possible. Current
> implementation implicitly assumes some hardware behavior, and does not work
> for the case which never happens under the assumption.

So, the scenario case is a multithread application with the same page mapped.
And PF_MCE_KILL_EARLY flag was set.

IIUIC, Aili Yao concern is that when the MCE machinery calls memory_failure
which MF_ACTION_REQUIRED, only the process that triggered the MCE exception
will receive a SIGBUG, and not the other threads that might have PF_MCE_EARLY.
Aili Yao would like memory_failure() to also signal those threads who might
have the flag set, in case they want to do something with that information.

But reading the code, I do not think that is what the code expects.
Looking at the comment above find_early_kill_thread:

"/*
 * Find a dedicated thread which is supposed to handle SIGBUS(BUS_MCEERR_AO)
 * on behalf of the thread group. Return task_struct of the (first found)
 * dedicated thread if found, and return NULL otherwise.
 *
 * We already hold read_lock(&tasklist_lock) in the caller, so we don't
 * have to call rcu_read_lock/unlock() in this function.
 */"

What I understand from that is:

"
 If memory_failure() was not triggered by any concrete process (aka: no one was
 trying to manipulate the corrupted area), we need to find the main thread who
 might have set the MCE policy by pcrtl and see if they want to be signaled
 __before__ they access the corrupted area.
 
"

Note that if the PF_MCE policy was not set, we check the global knob
sysctm_memory_early_kill.
And if that is not set either, we defer the signaling till later when a process
actually tries to operate the corrupted area.

Does that makes sense?

Actually, unless I am mistaken, if a multithread process receives a signal,
all threads belonging to the process will receive the signal as well:

"The signal disposition is a per-process attribute: in a
multithreaded application, the disposition of a particular signal
is the same for all threads."

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-18  9:24               ` Oscar Salvador
@ 2021-01-18  9:38                 ` Aili Yao
  2021-01-18 10:09                   ` Oscar Salvador
  0 siblings, 1 reply; 21+ messages in thread
From: Aili Yao @ 2021-01-18  9:38 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: HORIGUCHI NAOYA(堀口 直也),
	linux-mm, yangfeng1

On Mon, 18 Jan 2021 10:24:23 +0100
Oscar Salvador <osalvador@suse.de> wrote:

> 
> So, the scenario case is a multithread application with the same page mapped.
> And PF_MCE_KILL_EARLY flag was set.
> 
  The scenario is not related multithread application;
  it's about multiprocess application which share the same page

> IIUIC, Aili Yao concern is that when the MCE machinery calls memory_failure
> which MF_ACTION_REQUIRED, only the process that triggered the MCE exception
> will receive a SIGBUG, and not the other threads that might have PF_MCE_EARLY.
> Aili Yao would like memory_failure() to also signal those threads who might
> have the flag set, in case they want to do something with that information.
> 
 For the processes who care the memory with early flag set, they may specify one thread
to process related signal such as signal bus, when the flag set, the thread want to handle the error
gracefully and not kill the process all, and may do something more.

> But reading the code, I do not think that is what the code expects.
> Looking at the comment above find_early_kill_thread:
> 
> "/*
>  * Find a dedicated thread which is supposed to handle SIGBUS(BUS_MCEERR_AO)
>  * on behalf of the thread group. Return task_struct of the (first found)
>  * dedicated thread if found, and return NULL otherwise.
>  *
>  * We already hold read_lock(&tasklist_lock) in the caller, so we don't
>  * have to call rcu_read_lock/unlock() in this function.
>  */"
> 
> What I understand from that is:
> 
> "
>  If memory_failure() was not triggered by any concrete process (aka: no one was
>  trying to manipulate the corrupted area), we need to find the main thread who
>  might have set the MCE policy by pcrtl and see if they want to be signaled
>  __before__ they access the corrupted area.

yes, it doesn't just want to be killed all.  

Thanks
-- 
Best Regards!

Aili Yao


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-18  9:38                 ` Aili Yao
@ 2021-01-18 10:09                   ` Oscar Salvador
  0 siblings, 0 replies; 21+ messages in thread
From: Oscar Salvador @ 2021-01-18 10:09 UTC (permalink / raw)
  To: Aili Yao
  Cc: HORIGUCHI NAOYA(堀口 直也),
	linux-mm, yangfeng1

On Mon, Jan 18, 2021 at 05:38:52PM +0800, Aili Yao wrote:
> On Mon, 18 Jan 2021 10:24:23 +0100
> Oscar Salvador <osalvador@suse.de> wrote:
> 
> > 
> > So, the scenario case is a multithread application with the same page mapped.
> > And PF_MCE_KILL_EARLY flag was set.
> > 
>   The scenario is not related multithread application;
>   it's about multiprocess application which share the same page

Ok, I misunderstood that part.

Then, I do not fully understand the concern.
I guess you are kind of memory scrubbing with mcelog defaulting to
memory_failure instead of soft_offline.

E.g: collect_procs_anon will pick the processes which have the MCE policy
set and have the page mapped. That is all we should need.

So, if you have processes A, B, C and D with X page mapped, and you set
the MCE policy on those four, they should get picked up by your scrubbing
when handling X page by memory_failure.

Is not that right?

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-18  8:57             ` HORIGUCHI NAOYA(堀口 直也)
  2021-01-18  9:09               ` Aili Yao
  2021-01-18  9:24               ` Oscar Salvador
@ 2021-01-19  4:21               ` Aili Yao
  2 siblings, 0 replies; 21+ messages in thread
From: Aili Yao @ 2021-01-19  4:21 UTC (permalink / raw)
  To: HORIGUCHI NAOYA(堀口 直也)
  Cc: Oscar Salvador, linux-mm, yangfeng1

[-- Attachment #1: Type: text/plain, Size: 2128 bytes --]

On Mon, 18 Jan 2021 08:57:47 +0000
HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@nec.com> wrote:

> On Mon, Jan 18, 2021 at 04:15:12PM +0800, Aili Yao wrote:
> > On Mon, 18 Jan 2021 06:50:54 +0000
> > HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@nec.com> wrote:
> >   
> > > 
> > > For action optional cases, one error event kills *only one* process. If an
> > > error page are shared by multiple processes, these processes will be killed
> > > by separate error events, each of which is triggered when each process tries
> > > to access the error memory.  So these processes would be killed immediately
> > > when accessing the error, but you don't have to kill all at the same time
> > > (or actually you might not even have to kill it at all if the process exits
> > > finally without accessing the error later).
> > > 
> > > Maybe the function variable "force_early" is named confusingly (it sounds
> > > that it's related to PF_MCE_KILL_EARLY flag, but that's incorrect).
> > > I'll submit a fix later.  (I'll add your "Reported-by" because you made me
> > > find it, thank you.)
> > >   
> > I think we should do more for non current process error case, we should mark it AO for processes to be signaled
> > or we may take wrong action.  
> 
> I'm not sure what you mean by "non current process error case" and "we
> should mark it AO", so could you explain more specifically about your error
> scenario?  Especially I'd like to know about who triggers hard offline on
> what hardware events and what "wrong action" could happen.  Maybe just
> "calling memory_failure() with MF_ACTION_REQUIRED" is not enough, because
> it's not enough for us to see that your scenario is possible. Current
> implementation implicitly assumes some hardware behavior, and does not work
> for the case which never happens under the assumption.
> 
> Do you have some test cases to reproduce any specific issue (like data lost)
> on your system? (If yes, please share it.) Or your concern is from code review?

Hope this draft test code will be helpful, Thanks


-- 
Best Regards!

Aili Yao

[-- Attachment #2: multitprocess_test_share.c --]
[-- Type: text/x-c++src, Size: 5790 bytes --]

#define _GNU_SOURCE
#include <pthread.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <ctype.h>
#include <hugetlbfs.h>
#include <semaphore.h>
#include <dirent.h>
#include <sys/mman.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/prctl.h>
#include <setjmp.h>
#include <signal.h>
#include <sched.h>


char empty[1024*1024*2] = {1};
sigjmp_buf recover;

/*
 * Set early/late kill mode for hwpoison memory corruption.
 * This influences when the process gets killed on a memory corruption.
 */
#define PR_MCE_KILL     33
# define PR_MCE_KILL_CLEAR   0
# define PR_MCE_KILL_SET     1

# define PR_MCE_KILL_LATE    0
# define PR_MCE_KILL_EARLY   1
# define PR_MCE_KILL_DEFAULT 2

#define PR_MCE_KILL_GET 34

sem_t sem_id; 
sem_t sem_id_triggrer; 
int global_fd = -1;
int * main_addr = NULL;


#define PAGE_SHIFT 21
#define PAGEMAP_LENGTH 8

#define handle_error_en(en, msg) \
       do { errno = en; perror(msg); exit(EXIT_FAILURE); } while (0)

#define handle_error(msg) \
       do { perror(msg); exit(EXIT_FAILURE); } while (0)

struct thread_info {    /* Used as argument to thread_start() */
   pthread_t thread_id;        /* ID returned by pthread_create() */
   int       thread_num;       /* Application-defined thread # */
   char     *argv_string;      /* From command-line argument */
};

unsigned long get_page_frame_number_of_address(void *addr) {
   // Open the pagemap file for the current process
   FILE *pagemap = fopen("/proc/self/pagemap", "rb");

   // Seek to the page that the buffer is on
   unsigned long offset = (unsigned long)addr / getpagesize() * PAGEMAP_LENGTH;
   if(fseek(pagemap, (unsigned long)offset, SEEK_SET) != 0) {
      fprintf(stderr, "Failed to seek pagemap to proper location\n");
      exit(1);
   }

   // The page frame number is in bits 0-54 so read the first 7 bytes and clear the 55th bit
   unsigned long page_frame_number = 0;
   fread(&page_frame_number, 1, PAGEMAP_LENGTH-1, pagemap);

   page_frame_number &= 0x7FFFFFFFFFFFFF;

   fclose(pagemap);

   return page_frame_number;
}

void sighandler(int sig, siginfo_t *si, void *arg)
{
	printf("signal %d code %d addr %p\n", sig, si->si_code, si->si_addr);
	siglongjmp(recover, 1);
}

void *alloc_filebacked_page(char *filepath, int private, int *fd)
{
	int i, sum,mapflag = MAP_SHARED;
	int *addr;

	if(*fd == -1){
		if ((*fd = open(filepath, O_RDWR|O_CREAT, 0644)) < 0) {
			perror("open");
			return NULL;
		}
	}

	if(private)
		write(*fd, empty, sizeof(empty));

	if ((addr = mmap(0, getpagesize(),PROT_READ|PROT_WRITE, mapflag, *fd, 0)) == MAP_FAILED) {
		perror("mmap");
		return NULL;
	}
	for (i = 0; i < getpagesize()/4; i = i + 4)
	{
		sum = addr[i];
	}
	return addr;
}

void *alloc_anony_page(void)
{
	int i;
	void *addr;

	if ((addr = mmap(0, getpagesize(),PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_LOCKED, 0, 0)) == MAP_FAILED) {
		perror("mmap");
		//unlink(filepath);
		return NULL;
	}
	memcpy(addr, empty, 4096);
	return addr;
}

struct sigaction sa = {
	.sa_sigaction = sighandler,
	.sa_flags = SA_SIGINFO
};

static void * thread_start(void *arg)
{
	struct thread_info *tinfo = arg;
	char *uargv;
	char filepath[32] = {0};
	int i,sum,fd = -1;
	int *addr = main_addr;
	unsigned int page_frame_number;
	pid_t pid = getpid();
	pid_t ppid = getppid();
	char file_name[20] ={0};	
	
	printf("Thread %d %d; argv_string=%s\n",pid, ppid , tinfo->argv_string);
	uargv = strdup(tinfo->argv_string);
	if (uargv == NULL)
		handle_error("strdup");
	
	if(!strcmp(uargv,"test"))
	{
		printf("test : Page frame: 0x%x \n",addr);
		while(1)
		{
			for (i = 0; i < getpagesize()/4; i+=4)
			{
				sum = ((volatile int *)addr)[i];
				((volatile int *)addr)[i] = i;
			}
			usleep(5000);
		}
	}else{
		printf("%s Page frame: 0x%x \n", uargv, addr);
		sigaction(SIGBUS, &sa, NULL);
		if(sigsetjmp(recover,1)){
			exit;
		}
		while(1)
			sleep(1);
	}
	return uargv;
}

int main(int argc, char *argv[])
{
	int i,s, opt, num_threads;
	size_t stack_size;
	void *res;
	int main_fd = -1;
	unsigned int main_page_frame_number;
	pid_t pid,ppid;
    cpu_set_t mask;
    CPU_ZERO(&mask);

	/* The "-s" option specifies a stack size for our threads */

	stack_size = -1;
	while ((opt = getopt(argc, argv, "s:")) != -1) {
		switch (opt) {
		case 's':
			if (prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0, 0) < 0)
				err("PR_MCE_KILL_SET early");
		break;
		default:
		fprintf(stderr, "Usage: %s [-s stack-size] arg...\n", argv[0]);
		exit(EXIT_FAILURE);
		}
	}

	num_threads = argc - optind;

	/* Allocate memory for pthread_create() arguments */
	struct thread_info *tinfo = calloc(num_threads, sizeof(*tinfo));
	if (tinfo == NULL)
		handle_error("calloc");
	
	main_addr = alloc_filebacked_page("./main_test", 1, &global_fd);
	//main_addr = create_buffer();
	pid = getpid();
	ppid = getppid();
	
#if 1

#endif


	main_page_frame_number = get_page_frame_number_of_address((void *)(&(main_addr[0])));
	printf("pid:%d ppid:%d  Page frame: 0x%x 0x%x \n", pid,ppid, main_page_frame_number, (void *)(&(main_addr[0])));

	for (int tnum = 0; tnum < num_threads; tnum++) {
		tinfo[tnum].thread_num = tnum + 1;
		tinfo[tnum].argv_string = argv[optind + tnum];
		pid = fork();  
		
		if(pid<0)	  
			printf("error in fork!\n"); 	
		
		else if(pid == 0)	  
		{	  
			prctl(PR_SET_NAME, tinfo[tnum].argv_string);
			CPU_SET(tinfo[tnum].thread_num, &mask);
			if (sched_setaffinity(0, sizeof(mask), &mask) < 0)
		    {
		        return -1;
		    }
			thread_start(&tinfo[tnum]);
		}	  
		else   
		{	  
			//printf("I am the parent process,ID is %d\n",getpid());   
		}	  

	}

	while(1)
		sleep(1);
	
	free(tinfo);
	exit(EXIT_SUCCESS);
}

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-18  9:09               ` Aili Yao
@ 2021-01-19  5:25                 ` HORIGUCHI NAOYA(堀口 直也)
  2021-01-19  6:04                   ` Aili Yao
  0 siblings, 1 reply; 21+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2021-01-19  5:25 UTC (permalink / raw)
  To: Aili Yao; +Cc: Oscar Salvador, linux-mm, yangfeng1

On Mon, Jan 18, 2021 at 05:09:00PM +0800, Aili Yao wrote:
> On Mon, 18 Jan 2021 08:57:47 +0000
> HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@nec.com> wrote:
> 
> > > > 
> > > > For action optional cases, one error event kills *only one* process. If an
> > > > error page are shared by multiple processes, these processes will be killed
> > > > by separate error events, each of which is triggered when each process tries
> > > > to access the error memory.  So these processes would be killed immediately
> > > > when accessing the error, but you don't have to kill all at the same time
> > > > (or actually you might not even have to kill it at all if the process exits
> > > > finally without accessing the error later).
> > > > 
> > > > Maybe the function variable "force_early" is named confusingly (it sounds
> > > > that it's related to PF_MCE_KILL_EARLY flag, but that's incorrect).
> > > > I'll submit a fix later.  (I'll add your "Reported-by" because you made me
> > > > find it, thank you.)
> > > >   
> > > I think we should do more for non current process error case, we should mark it AO for processes to be signaled
> > > or we may take wrong action.  
> > 
> > I'm not sure what you mean by "non current process error case" and "we
> > should mark it AO", so could you explain more specifically about your error
> > scenario?  
>   I will share my test code and i will submit another patch to this scenario.
>   please give me some time, thanks!
>   And I think you are right, AR is only current process.
> 
> > Especially I'd like to know about who triggers hard offline on
> > what hardware events and what "wrong action" could happen.  Maybe just
> > "calling memory_failure() with MF_ACTION_REQUIRED" is not enough, because
> > it's not enough for us to see that your scenario is possible. Current
> > implementation implicitly assumes some hardware behavior, and does not work
> > for the case which never happens under the assumption.
> > 
>   This action is from mcelog daemon, normally softpage offlie is default, but we can configure
> hardpage offline for CE storms, to get related processes signaled.

Thanks, so which interface did you use for error injection? I guess first
you used /sys/devices/system/memory/hard_offline_page, but if it's true,
then the error event should be action optional (no MF_ACTION_REQUIRED set).
So now I'm wondering why you are observing action required events?
My another guess is that you might have used mce-inject tool, if that's true,
please use hard_offline_page, then current kernel code should properly send
SIGBUS to dedicated process.

- Naoya

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-19  5:25                 ` HORIGUCHI NAOYA(堀口 直也)
@ 2021-01-19  6:04                   ` Aili Yao
  2021-01-19  7:33                     ` HORIGUCHI NAOYA(堀口 直也)
  0 siblings, 1 reply; 21+ messages in thread
From: Aili Yao @ 2021-01-19  6:04 UTC (permalink / raw)
  To: HORIGUCHI NAOYA(堀口 直也)
  Cc: Oscar Salvador, linux-mm, yangfeng1


> Thanks, so which interface did you use for error injection? I guess first
> you used /sys/devices/system/memory/hard_offline_page, but if it's true,
> then the error event should be action optional (no MF_ACTION_REQUIRED set).
> So now I'm wondering why you are observing action required events?
> My another guess is that you might have used mce-inject tool, if that's true,
> please use hard_offline_page, then current kernel code should properly send
> SIGBUS to dedicated process.
> 
The test has no relation to ce and hard_offline_page, sorry for misleading.

if the test code will compiled to my_test bin, here is my script:
./my_test hola salut servus test haaa     --- this case no early-kill flag set
or
./my_test -s hola salut servus test haaa  --- this case early-kill is set.

there must be a process names "test" which will trigger the UE;

when the my_test start, the shared page's physical address will be printed;
In another console, I will use einj module to inject the 0X10 LEVEL error to this
physical address.

After that, the test process trigger the memory-failure process, and then the log
should be observed.

Thanks.

-- 
Best Regards!

Aili Yao


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
  2021-01-19  6:04                   ` Aili Yao
@ 2021-01-19  7:33                     ` HORIGUCHI NAOYA(堀口 直也)
  0 siblings, 0 replies; 21+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2021-01-19  7:33 UTC (permalink / raw)
  To: Aili Yao; +Cc: Oscar Salvador, linux-mm, yangfeng1

On Tue, Jan 19, 2021 at 02:04:18PM +0800, Aili Yao wrote:
> 
> > Thanks, so which interface did you use for error injection? I guess first
> > you used /sys/devices/system/memory/hard_offline_page, but if it's true,
> > then the error event should be action optional (no MF_ACTION_REQUIRED set).
> > So now I'm wondering why you are observing action required events?
> > My another guess is that you might have used mce-inject tool, if that's true,
> > please use hard_offline_page, then current kernel code should properly send
> > SIGBUS to dedicated process.
> > 
> The test has no relation to ce and hard_offline_page, sorry for misleading.
> 
> if the test code will compiled to my_test bin, here is my script:
> ./my_test hola salut servus test haaa     --- this case no early-kill flag set
> or
> ./my_test -s hola salut servus test haaa  --- this case early-kill is set.
> 
> there must be a process names "test" which will trigger the UE;
> 
> when the my_test start, the shared page's physical address will be printed;
> In another console, I will use einj module to inject the 0X10 LEVEL error to this
> physical address.

Ah, OK, so the problem is becoming clearer, thanks.  I'm feeling positive
to change code to fall back to find_early_kill_thread().
So I'll take a look on v2.

Thanks,
Naoya Horiguchi

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2021-01-19  7:34 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-15  7:55 [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early Aili Yao
2021-01-15  8:49 ` Oscar Salvador
2021-01-15  9:26   ` Aili Yao
2021-01-15  9:31     ` Aili Yao
2021-01-15  9:40       ` Oscar Salvador
2021-01-15  9:53         ` Aili Yao
2021-01-15 10:31     ` Oscar Salvador
2021-01-18  5:15     ` HORIGUCHI NAOYA(堀口 直也)
2021-01-18  5:57       ` Aili Yao
2021-01-18  6:50         ` HORIGUCHI NAOYA(堀口 直也)
2021-01-18  7:16           ` Aili Yao
2021-01-18  8:15           ` Aili Yao
2021-01-18  8:57             ` HORIGUCHI NAOYA(堀口 直也)
2021-01-18  9:09               ` Aili Yao
2021-01-19  5:25                 ` HORIGUCHI NAOYA(堀口 直也)
2021-01-19  6:04                   ` Aili Yao
2021-01-19  7:33                     ` HORIGUCHI NAOYA(堀口 直也)
2021-01-18  9:24               ` Oscar Salvador
2021-01-18  9:38                 ` Aili Yao
2021-01-18 10:09                   ` Oscar Salvador
2021-01-19  4:21               ` Aili Yao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).