All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mm, oom: show process exiting information in __oom_kill_process()
@ 2020-07-19 13:53 Yafang Shao
  2020-07-19 23:20 ` Tetsuo Handa
  2020-07-20  7:16 ` Michal Hocko
  0 siblings, 2 replies; 13+ messages in thread
From: Yafang Shao @ 2020-07-19 13:53 UTC (permalink / raw)
  To: rientjes, mhocko, penguin-kernel, akpm; +Cc: linux-mm, Yafang Shao

When the OOM killer finding a victim and trying to kill it, if the victim
is already exiting, the task mm will be NULL and no process will be killed.
But the dump_header() has been already executed, so it will be strange to
dump so many information without killing a process. We'd better show some
helpful information to indicate why this happens.

Suggested-by: David Rientjes <rientjes@google.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
---
 mm/oom_kill.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 6e94962..0480dde 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -863,9 +863,13 @@ static void __oom_kill_process(struct task_struct *victim, const char *message)
 
 	p = find_lock_task_mm(victim);
 	if (!p) {
+		pr_info("Process %d (%s) is already exiting\n",
+			task_pid_nr(victim), victim->comm);
 		put_task_struct(victim);
 		return;
-	} else if (victim != p) {
+	}
+
+	if (victim != p) {
 		get_task_struct(p);
 		put_task_struct(victim);
 		victim = p;
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] mm, oom: show process exiting information in __oom_kill_process()
  2020-07-19 13:53 [PATCH] mm, oom: show process exiting information in __oom_kill_process() Yafang Shao
@ 2020-07-19 23:20 ` Tetsuo Handa
  2020-07-20  1:43   ` Yafang Shao
  2020-07-20  7:16 ` Michal Hocko
  1 sibling, 1 reply; 13+ messages in thread
From: Tetsuo Handa @ 2020-07-19 23:20 UTC (permalink / raw)
  To: Yafang Shao; +Cc: rientjes, mhocko, akpm, linux-mm

On 2020/07/19 22:53, Yafang Shao wrote:
>  	p = find_lock_task_mm(victim);
>  	if (!p) {
> +		pr_info("Process %d (%s) is already exiting\n",
> +			task_pid_nr(victim), victim->comm);

Maybe

+               pr_info("%s: Process %d (%s) is already exiting\n",
+                       message, task_pid_nr(victim), victim->comm);

line is easier to compare with

        pr_err("%s: Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB, UID:%u pgtables:%lukB oom_score_adj:%hd\n",
                message, task_pid_nr(victim), victim->comm, K(mm->total_vm),

line?

>  		put_task_struct(victim);
>  		return;


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] mm, oom: show process exiting information in __oom_kill_process()
  2020-07-19 23:20 ` Tetsuo Handa
@ 2020-07-20  1:43   ` Yafang Shao
  0 siblings, 0 replies; 13+ messages in thread
From: Yafang Shao @ 2020-07-20  1:43 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: David Rientjes, Michal Hocko, Andrew Morton, Linux MM

On Mon, Jul 20, 2020 at 7:20 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/07/19 22:53, Yafang Shao wrote:
> >       p = find_lock_task_mm(victim);
> >       if (!p) {
> > +             pr_info("Process %d (%s) is already exiting\n",
> > +                     task_pid_nr(victim), victim->comm);
>
> Maybe
>
> +               pr_info("%s: Process %d (%s) is already exiting\n",
> +                       message, task_pid_nr(victim), victim->comm);
>
> line is easier to compare with
>
>         pr_err("%s: Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB, UID:%u pgtables:%lukB oom_score_adj:%hd\n",
>                 message, task_pid_nr(victim), victim->comm, K(mm->total_vm),
>
> line?
>

Seems so.
Thanks for the suggestion.

> >               put_task_struct(victim);
> >               return;



-- 
Thanks
Yafang


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] mm, oom: show process exiting information in __oom_kill_process()
  2020-07-19 13:53 [PATCH] mm, oom: show process exiting information in __oom_kill_process() Yafang Shao
  2020-07-19 23:20 ` Tetsuo Handa
@ 2020-07-20  7:16 ` Michal Hocko
  2020-07-20 10:36   ` Yafang Shao
  1 sibling, 1 reply; 13+ messages in thread
From: Michal Hocko @ 2020-07-20  7:16 UTC (permalink / raw)
  To: Yafang Shao; +Cc: rientjes, penguin-kernel, akpm, linux-mm

On Sun 19-07-20 09:53:15, Yafang Shao wrote:
> When the OOM killer finding a victim and trying to kill it, if the victim
> is already exiting, the task mm will be NULL and no process will be killed.
> But the dump_header() has been already executed, so it will be strange to
> dump so many information without killing a process. We'd better show some
> helpful information to indicate why this happens.
> 
> Suggested-by: David Rientjes <rientjes@google.com>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
> ---
>  mm/oom_kill.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 6e94962..0480dde 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -863,9 +863,13 @@ static void __oom_kill_process(struct task_struct *victim, const char *message)
>  
>  	p = find_lock_task_mm(victim);
>  	if (!p) {
> +		pr_info("Process %d (%s) is already exiting\n",
> +			task_pid_nr(victim), victim->comm);
>  		put_task_struct(victim);

I do agree that a silent bail out is not the best thing to do. The above
message would be more useful if it also explained what the oom killer
does (or does not):

	"OOM victim %d (%s) is already exiting. Skip killing the task\n"

>  		return;
> -	} else if (victim != p) {
> +	}
> +
> +	if (victim != p) {

Why do we need this?

>  		get_task_struct(p);
>  		put_task_struct(victim);
>  		victim = p;
> -- 
> 1.8.3.1

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] mm, oom: show process exiting information in __oom_kill_process()
  2020-07-20  7:16 ` Michal Hocko
@ 2020-07-20 10:36   ` Yafang Shao
  2020-07-20 11:06     ` Tetsuo Handa
  2020-07-20 13:35     ` Michal Hocko
  0 siblings, 2 replies; 13+ messages in thread
From: Yafang Shao @ 2020-07-20 10:36 UTC (permalink / raw)
  To: Michal Hocko; +Cc: David Rientjes, Tetsuo Handa, Andrew Morton, Linux MM

On Mon, Jul 20, 2020 at 3:16 PM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Sun 19-07-20 09:53:15, Yafang Shao wrote:
> > When the OOM killer finding a victim and trying to kill it, if the victim
> > is already exiting, the task mm will be NULL and no process will be killed.
> > But the dump_header() has been already executed, so it will be strange to
> > dump so many information without killing a process. We'd better show some
> > helpful information to indicate why this happens.
> >
> > Suggested-by: David Rientjes <rientjes@google.com>
> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > Cc: Michal Hocko <mhocko@kernel.org>
> > Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
> > ---
> >  mm/oom_kill.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 6e94962..0480dde 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -863,9 +863,13 @@ static void __oom_kill_process(struct task_struct *victim, const char *message)
> >
> >       p = find_lock_task_mm(victim);
> >       if (!p) {
> > +             pr_info("Process %d (%s) is already exiting\n",
> > +                     task_pid_nr(victim), victim->comm);
> >               put_task_struct(victim);
>
> I do agree that a silent bail out is not the best thing to do. The above
> message would be more useful if it also explained what the oom killer
> does (or does not):
>
>         "OOM victim %d (%s) is already exiting. Skip killing the task\n"
>

Sure.

> >               return;
> > -     } else if (victim != p) {
> > +     }
> > +
> > +     if (victim != p) {
>
> Why do we need this?
>

Because I don't  like that code style.
But it is not a big problem, I will not change it in the next version.

> >               get_task_struct(p);
> >               put_task_struct(victim);
> >               victim = p;
> > --
> > 1.8.3.1
>
> --
> Michal Hocko
> SUSE Labs



-- 
Thanks
Yafang


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] mm, oom: show process exiting information in __oom_kill_process()
  2020-07-20 10:36   ` Yafang Shao
@ 2020-07-20 11:06     ` Tetsuo Handa
  2020-07-20 12:19       ` Yafang Shao
  2020-07-20 13:41       ` Michal Hocko
  2020-07-20 13:35     ` Michal Hocko
  1 sibling, 2 replies; 13+ messages in thread
From: Tetsuo Handa @ 2020-07-20 11:06 UTC (permalink / raw)
  To: Yafang Shao, Michal Hocko; +Cc: David Rientjes, Andrew Morton, Linux MM

On 2020/07/20 19:36, Yafang Shao wrote:
> On Mon, Jul 20, 2020 at 3:16 PM Michal Hocko <mhocko@kernel.org> wrote:
>> I do agree that a silent bail out is not the best thing to do. The above
>> message would be more useful if it also explained what the oom killer
>> does (or does not):
>>
>>         "OOM victim %d (%s) is already exiting. Skip killing the task\n"
>>
> 
> Sure.

This path is rarely hit because find_lock_task_mm() in oom_badness() from
select_bad_process() in the next round of OOM killer will skip this task.

Since we don't wake up the OOM reaper when hitting this path, unless __mmput()
for this task itself immediately reclaims memory and updates the statistics
counter, we just get two chunks of dump_header() messages and one OOM victim.

Current synchronous printk() gives __mmput() some time for reclaiming memory
and updating the statistics counter. But when printk() becomes asynchronous,
there might be quite small time. People might wonder "why killed message
follows immediately after skipped killing message"... Wouldn't the skip
message confuse people?


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] mm, oom: show process exiting information in __oom_kill_process()
  2020-07-20 11:06     ` Tetsuo Handa
@ 2020-07-20 12:19       ` Yafang Shao
  2020-07-20 13:11         ` Tetsuo Handa
  2020-07-20 13:41       ` Michal Hocko
  1 sibling, 1 reply; 13+ messages in thread
From: Yafang Shao @ 2020-07-20 12:19 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Michal Hocko, David Rientjes, Andrew Morton, Linux MM

On Mon, Jul 20, 2020 at 7:06 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/07/20 19:36, Yafang Shao wrote:
> > On Mon, Jul 20, 2020 at 3:16 PM Michal Hocko <mhocko@kernel.org> wrote:
> >> I do agree that a silent bail out is not the best thing to do. The above
> >> message would be more useful if it also explained what the oom killer
> >> does (or does not):
> >>
> >>         "OOM victim %d (%s) is already exiting. Skip killing the task\n"
> >>
> >
> > Sure.
>
> This path is rarely hit because find_lock_task_mm() in oom_badness() from
> select_bad_process() in the next round of OOM killer will skip this task.
>
> Since we don't wake up the OOM reaper when hitting this path, unless __mmput()
> for this task itself immediately reclaims memory and updates the statistics
> counter, we just get two chunks of dump_header() messages and one OOM victim.
>

Could you pls. explain more specifically why we will get two chunks of
dump_header()?
My understanding is the free_mm() happens between select_bad_process()
and __oom_kill_process() as bellow,

P1
             Victim
select_bad_process()
    oom_badness()
        p = find_lock_task_mm()  # p isn't NULL

                __mmput()

                    free_mm()
dump_header()  # dump once
__oom_kill_process()
    p = find_lock_task_mm(victim); # p is NULL now

So where is another dump_header() ?


> Current synchronous printk() gives __mmput() some time for reclaiming memory
> and updating the statistics counter. But when printk() becomes asynchronous,
> there might be quite small time. People might wonder "why killed message
> follows immediately after skipped killing message"... Wouldn't the skip
> message confuse people?



-- 
Thanks
Yafang


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] mm, oom: show process exiting information in __oom_kill_process()
  2020-07-20 12:19       ` Yafang Shao
@ 2020-07-20 13:11         ` Tetsuo Handa
  2020-07-20 13:59           ` Yafang Shao
  0 siblings, 1 reply; 13+ messages in thread
From: Tetsuo Handa @ 2020-07-20 13:11 UTC (permalink / raw)
  To: Yafang Shao; +Cc: Michal Hocko, David Rientjes, Andrew Morton, Linux MM

On 2020/07/20 21:19, Yafang Shao wrote:
> On Mon, Jul 20, 2020 at 7:06 PM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>
>> On 2020/07/20 19:36, Yafang Shao wrote:
>>> On Mon, Jul 20, 2020 at 3:16 PM Michal Hocko <mhocko@kernel.org> wrote:
>>>> I do agree that a silent bail out is not the best thing to do. The above
>>>> message would be more useful if it also explained what the oom killer
>>>> does (or does not):
>>>>
>>>>         "OOM victim %d (%s) is already exiting. Skip killing the task\n"
>>>>
>>>
>>> Sure.
>>
>> This path is rarely hit because find_lock_task_mm() in oom_badness() from
>> select_bad_process() in the next round of OOM killer will skip this task.
>>
>> Since we don't wake up the OOM reaper when hitting this path, unless __mmput()
>> for this task itself immediately reclaims memory and updates the statistics
>> counter, we just get two chunks of dump_header() messages and one OOM victim.
>>
> 
> Could you pls. explain more specifically why we will get two chunks of
> dump_header()?
> My understanding is the free_mm() happens between select_bad_process()
> and __oom_kill_process() as bellow,
> 
> P1
>              Victim
> select_bad_process()
>     oom_badness()
>         p = find_lock_task_mm()  # p isn't NULL
> 
>                 __mmput()
> 
>                     free_mm()
> dump_header()  # dump once
> __oom_kill_process()
>     p = find_lock_task_mm(victim); # p is NULL now
> 
> So where is another dump_header() ?
> 

Start of __mmput() does not guarantee that memory is reclaimed immediately.
Moreover, even __mmput() might not have started by the moment second chunk of
dump_header() happens. The "OOM victim %d (%s) is already exiting." case only
indicates that victim's mm became NULL; there is no guarantee that memory is
reclaimed (in order to avoid OOM kill) by the moment next round hits.

P1                                Victim1                              Victim2

out_of_memory() {
  select_bad_process() {
    oom_badness() {
      p = find_lock_task_mm() {
        task_lock(victim);       // finds Victim1 because Victim1->mm != NULL.
      }
      get_task_struct(p);
      task_unlock(p);
    }
  }
  oom_kill_process() {
    task_lock(victim);
    task_unlock(victim);
                                  do_exit() {
    dump_header(oc, victim); // first dump_header() with Victim1 and Victim2
    __oom_kill_process(victim, message) {
                                    exit_mm() {
                                      task_lock(current);
                                      current->mm = NULL;
                                      task_unlock(current);
        p = find_lock_task_mm(victim);
        put_task_struct(victim); // without killing Victim1 because p == NULL.
      }
    }
  }
}
out_of_memory() {
  select_bad_process() {
    oom_badness() {
      p = find_lock_task_mm() {
        task_lock(victim);       // finds Victim2 because Victim2->mm != NULL.
      }
      get_task_struct(p);
      task_unlock(p);
    }
  }
                                      mmput() {
                                        __mmput() {
                                          uprobe_clear_state() {
                                            // Might wait for delayed_uprobe_lock.
                                          }
  oom_kill_process() {
    task_lock(victim);
    task_unlock(victim);
    dump_header(oc, victim); // second dump_header() with Victim2
    __oom_kill_process(victim, message) {
      p = find_lock_task_mm(victim);
      pr_err("%s: Killed process %d (%s) "...); // first kill message.
      put_task_struct(p);
    }
  }
}
                                          exit_mmap(); // Which frees memory.
                                        }
                                      }
                                    }
                                  }

Maybe the better behavior is to restart out_of_memory() without dump_header()
(we can remember whether we already called dump_header() into "struct oom_control"),
with last second watermark check before select_bad_process() and after dump_header().


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] mm, oom: show process exiting information in __oom_kill_process()
  2020-07-20 10:36   ` Yafang Shao
  2020-07-20 11:06     ` Tetsuo Handa
@ 2020-07-20 13:35     ` Michal Hocko
  1 sibling, 0 replies; 13+ messages in thread
From: Michal Hocko @ 2020-07-20 13:35 UTC (permalink / raw)
  To: Yafang Shao; +Cc: David Rientjes, Tetsuo Handa, Andrew Morton, Linux MM

On Mon 20-07-20 18:36:53, Yafang Shao wrote:
> On Mon, Jul 20, 2020 at 3:16 PM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Sun 19-07-20 09:53:15, Yafang Shao wrote:
> > > When the OOM killer finding a victim and trying to kill it, if the victim
> > > is already exiting, the task mm will be NULL and no process will be killed.
> > > But the dump_header() has been already executed, so it will be strange to
> > > dump so many information without killing a process. We'd better show some
> > > helpful information to indicate why this happens.
> > >
> > > Suggested-by: David Rientjes <rientjes@google.com>
> > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > > Cc: Michal Hocko <mhocko@kernel.org>
> > > Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
> > > ---
> > >  mm/oom_kill.c | 6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > > index 6e94962..0480dde 100644
> > > --- a/mm/oom_kill.c
> > > +++ b/mm/oom_kill.c
> > > @@ -863,9 +863,13 @@ static void __oom_kill_process(struct task_struct *victim, const char *message)
> > >
> > >       p = find_lock_task_mm(victim);
> > >       if (!p) {
> > > +             pr_info("Process %d (%s) is already exiting\n",
> > > +                     task_pid_nr(victim), victim->comm);
> > >               put_task_struct(victim);
> >
> > I do agree that a silent bail out is not the best thing to do. The above
> > message would be more useful if it also explained what the oom killer
> > does (or does not):
> >
> >         "OOM victim %d (%s) is already exiting. Skip killing the task\n"
> >
> 
> Sure.
> 
> > >               return;
> > > -     } else if (victim != p) {
> > > +     }
> > > +
> > > +     if (victim != p) {
> >
> > Why do we need this?
> >
> 
> Because I don't  like that code style.

We usually prefer to keep unrelated changes separate. Minor coding
style changes are usually questionable because many people have
different sense for the code.

> But it is not a big problem, I will not change it in the next version.

Thanks
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] mm, oom: show process exiting information in __oom_kill_process()
  2020-07-20 11:06     ` Tetsuo Handa
  2020-07-20 12:19       ` Yafang Shao
@ 2020-07-20 13:41       ` Michal Hocko
  2020-07-20 14:03         ` Tetsuo Handa
  1 sibling, 1 reply; 13+ messages in thread
From: Michal Hocko @ 2020-07-20 13:41 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Yafang Shao, David Rientjes, Andrew Morton, Linux MM

On Mon 20-07-20 20:06:17, Tetsuo Handa wrote:
> On 2020/07/20 19:36, Yafang Shao wrote:
> > On Mon, Jul 20, 2020 at 3:16 PM Michal Hocko <mhocko@kernel.org> wrote:
> >> I do agree that a silent bail out is not the best thing to do. The above
> >> message would be more useful if it also explained what the oom killer
> >> does (or does not):
> >>
> >>         "OOM victim %d (%s) is already exiting. Skip killing the task\n"
> >>
> > 
> > Sure.
> 
> This path is rarely hit because find_lock_task_mm() in oom_badness() from
> select_bad_process() in the next round of OOM killer will skip this task.

Agreed!

> Since we don't wake up the OOM reaper when hitting this path, unless __mmput()
> for this task itself immediately reclaims memory and updates the statistics
> counter, we just get two chunks of dump_header() messages and one OOM victim.
> 
> Current synchronous printk() gives __mmput() some time for reclaiming memory
> and updating the statistics counter. But when printk() becomes asynchronous,
> there might be quite small time. People might wonder "why killed message
> follows immediately after skipped killing message"... Wouldn't the skip
> message confuse people?

I would ask other way around. Wouldn't that give us a better clue that
the first oom invocation and the back off was a suboptimal decision? If
we learn about more of those, maybe we want to reconsider this heuristic
and rather retry the victim selection instead.

I do not really see how this message would be harmful TBH.

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] mm, oom: show process exiting information in __oom_kill_process()
  2020-07-20 13:11         ` Tetsuo Handa
@ 2020-07-20 13:59           ` Yafang Shao
  0 siblings, 0 replies; 13+ messages in thread
From: Yafang Shao @ 2020-07-20 13:59 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Michal Hocko, David Rientjes, Andrew Morton, Linux MM

On Mon, Jul 20, 2020 at 9:11 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/07/20 21:19, Yafang Shao wrote:
> > On Mon, Jul 20, 2020 at 7:06 PM Tetsuo Handa
> > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >>
> >> On 2020/07/20 19:36, Yafang Shao wrote:
> >>> On Mon, Jul 20, 2020 at 3:16 PM Michal Hocko <mhocko@kernel.org> wrote:
> >>>> I do agree that a silent bail out is not the best thing to do. The above
> >>>> message would be more useful if it also explained what the oom killer
> >>>> does (or does not):
> >>>>
> >>>>         "OOM victim %d (%s) is already exiting. Skip killing the task\n"
> >>>>
> >>>
> >>> Sure.
> >>
> >> This path is rarely hit because find_lock_task_mm() in oom_badness() from
> >> select_bad_process() in the next round of OOM killer will skip this task.
> >>
> >> Since we don't wake up the OOM reaper when hitting this path, unless __mmput()
> >> for this task itself immediately reclaims memory and updates the statistics
> >> counter, we just get two chunks of dump_header() messages and one OOM victim.
> >>
> >
> > Could you pls. explain more specifically why we will get two chunks of
> > dump_header()?
> > My understanding is the free_mm() happens between select_bad_process()
> > and __oom_kill_process() as bellow,
> >
> > P1
> >              Victim
> > select_bad_process()
> >     oom_badness()
> >         p = find_lock_task_mm()  # p isn't NULL
> >
> >                 __mmput()
> >
> >                     free_mm()
> > dump_header()  # dump once
> > __oom_kill_process()
> >     p = find_lock_task_mm(victim); # p is NULL now
> >
> > So where is another dump_header() ?
> >
>
> Start of __mmput() does not guarantee that memory is reclaimed immediately.
> Moreover, even __mmput() might not have started by the moment second chunk of
> dump_header() happens. The "OOM victim %d (%s) is already exiting." case only
> indicates that victim's mm became NULL; there is no guarantee that memory is
> reclaimed (in order to avoid OOM kill) by the moment next round hits.
>
> P1                                Victim1                              Victim2
>
> out_of_memory() {
>   select_bad_process() {
>     oom_badness() {
>       p = find_lock_task_mm() {
>         task_lock(victim);       // finds Victim1 because Victim1->mm != NULL.
>       }
>       get_task_struct(p);
>       task_unlock(p);
>     }
>   }
>   oom_kill_process() {
>     task_lock(victim);
>     task_unlock(victim);
>                                   do_exit() {
>     dump_header(oc, victim); // first dump_header() with Victim1 and Victim2
>     __oom_kill_process(victim, message) {
>                                     exit_mm() {
>                                       task_lock(current);
>                                       current->mm = NULL;
>                                       task_unlock(current);
>         p = find_lock_task_mm(victim);
>         put_task_struct(victim); // without killing Victim1 because p == NULL.
>       }
>     }
>   }
> }
> out_of_memory() {
>   select_bad_process() {
>     oom_badness() {
>       p = find_lock_task_mm() {
>         task_lock(victim);       // finds Victim2 because Victim2->mm != NULL.
>       }
>       get_task_struct(p);
>       task_unlock(p);
>     }
>   }
>                                       mmput() {
>                                         __mmput() {
>                                           uprobe_clear_state() {
>                                             // Might wait for delayed_uprobe_lock.
>                                           }
>   oom_kill_process() {
>     task_lock(victim);
>     task_unlock(victim);
>     dump_header(oc, victim); // second dump_header() with Victim2
>     __oom_kill_process(victim, message) {
>       p = find_lock_task_mm(victim);
>       pr_err("%s: Killed process %d (%s) "...); // first kill message.
>       put_task_struct(p);
>     }
>   }
> }
>                                           exit_mmap(); // Which frees memory.
>                                         }
>                                       }
>                                     }
>                                   }
>
> Maybe the better behavior is to restart out_of_memory() without dump_header()
> (we can remember whether we already called dump_header() into "struct oom_control"),
> with last second watermark check before select_bad_process() and after dump_header().

I understand what you mean now.
But I agree with Michal that this output won't be harmful in your case.
And for your case, I think Michal's suggestion that retry the victim
selection would be better.

-- 
Thanks
Yafang


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] mm, oom: show process exiting information in __oom_kill_process()
  2020-07-20 13:41       ` Michal Hocko
@ 2020-07-20 14:03         ` Tetsuo Handa
  2020-07-20 14:23           ` Michal Hocko
  0 siblings, 1 reply; 13+ messages in thread
From: Tetsuo Handa @ 2020-07-20 14:03 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Yafang Shao, David Rientjes, Andrew Morton, Linux MM

On 2020/07/20 22:41, Michal Hocko wrote:
>> Since we don't wake up the OOM reaper when hitting this path, unless __mmput()
>> for this task itself immediately reclaims memory and updates the statistics
>> counter, we just get two chunks of dump_header() messages and one OOM victim.
>>
>> Current synchronous printk() gives __mmput() some time for reclaiming memory
>> and updating the statistics counter. But when printk() becomes asynchronous,
>> there might be quite small time. People might wonder "why killed message
>> follows immediately after skipped killing message"... Wouldn't the skip
>> message confuse people?
> 
> I would ask other way around. Wouldn't that give us a better clue that
> the first oom invocation and the back off was a suboptimal decision? If
> we learn about more of those, maybe we want to reconsider this heuristic
> and rather retry the victim selection instead.

I've just suggested

  Maybe the better behavior is to restart out_of_memory() without dump_header()
  (we can remember whether we already called dump_header() into "struct oom_control"),
  with last second watermark check before select_bad_process() and after dump_header().

at https://lkml.kernel.org/r/7f58363a-db1a-5502-e2b4-ee4b9fa31824@i-love.sakura.ne.jp .


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] mm, oom: show process exiting information in __oom_kill_process()
  2020-07-20 14:03         ` Tetsuo Handa
@ 2020-07-20 14:23           ` Michal Hocko
  0 siblings, 0 replies; 13+ messages in thread
From: Michal Hocko @ 2020-07-20 14:23 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Yafang Shao, David Rientjes, Andrew Morton, Linux MM

On Mon 20-07-20 23:03:46, Tetsuo Handa wrote:
> On 2020/07/20 22:41, Michal Hocko wrote:
> >> Since we don't wake up the OOM reaper when hitting this path, unless __mmput()
> >> for this task itself immediately reclaims memory and updates the statistics
> >> counter, we just get two chunks of dump_header() messages and one OOM victim.
> >>
> >> Current synchronous printk() gives __mmput() some time for reclaiming memory
> >> and updating the statistics counter. But when printk() becomes asynchronous,
> >> there might be quite small time. People might wonder "why killed message
> >> follows immediately after skipped killing message"... Wouldn't the skip
> >> message confuse people?
> > 
> > I would ask other way around. Wouldn't that give us a better clue that
> > the first oom invocation and the back off was a suboptimal decision? If
> > we learn about more of those, maybe we want to reconsider this heuristic
> > and rather retry the victim selection instead.
> 
> I've just suggested
> 
>   Maybe the better behavior is to restart out_of_memory() without dump_header()
>   (we can remember whether we already called dump_header() into "struct oom_control"),
>   with last second watermark check before select_bad_process() and after dump_header().
> 
> at https://lkml.kernel.org/r/7f58363a-db1a-5502-e2b4-ee4b9fa31824@i-love.sakura.ne.jp .

As soon as we learn about more of those as mentioned above.

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-07-20 14:27 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-19 13:53 [PATCH] mm, oom: show process exiting information in __oom_kill_process() Yafang Shao
2020-07-19 23:20 ` Tetsuo Handa
2020-07-20  1:43   ` Yafang Shao
2020-07-20  7:16 ` Michal Hocko
2020-07-20 10:36   ` Yafang Shao
2020-07-20 11:06     ` Tetsuo Handa
2020-07-20 12:19       ` Yafang Shao
2020-07-20 13:11         ` Tetsuo Handa
2020-07-20 13:59           ` Yafang Shao
2020-07-20 13:41       ` Michal Hocko
2020-07-20 14:03         ` Tetsuo Handa
2020-07-20 14:23           ` Michal Hocko
2020-07-20 13:35     ` Michal Hocko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.