All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: rientjes@google.com, akpm@linux-foundation.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: Re: [patch] mm, oom: prevent additional oom kills before memoryis freed
Date: Fri, 16 Jun 2017 16:42:37 +0200	[thread overview]
Message-ID: <20170616144237.GP30580@dhcp22.suse.cz> (raw)
In-Reply-To: <201706162326.IEJ52125.JFFtMVQOSLHOFO@I-love.SAKURA.ne.jp>

On Fri 16-06-17 23:26:20, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Fri 16-06-17 19:27:19, Tetsuo Handa wrote:
> > > Michal Hocko wrote:
> > > > On Fri 16-06-17 09:54:34, Tetsuo Handa wrote:
> > > > [...]
> > > > > And the patch you proposed is broken.
> > > > 
> > > > Thanks for your testing!
> > > >  
> > > > > ----------
> > > > > [  161.846202] Out of memory: Kill process 6331 (a.out) score 999 or sacrifice child
> > > > > [  161.850327] Killed process 6331 (a.out) total-vm:4172kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB
> > > > > [  161.858503] ------------[ cut here ]------------
> > > > > [  161.861512] kernel BUG at mm/memory.c:1381!
> > > > 
> > > > BUG_ON(addr >= end) suggests our vma has trimmed. I guess I see what is
> > > > going on here.
> > > > __oom_reap_task_mm				exit_mmap
> > > > 						  free_pgtables
> > > > 						  up_write(mm->mmap_sem)
> > > >   down_read_trylock(&mm->mmap_sem)
> > > >   						  remove_vma
> > > >     unmap_page_range
> > > > 
> > > > So we need to extend the mmap_sem coverage. See the updated diff (not
> > > > the full proper patch yet).
> > > 
> > > That diff is still wrong. We need to prevent __oom_reap_task_mm() from calling
> > > unmap_page_range() when __mmput() already called exit_mm(), by setting/checking
> > > MMF_OOM_SKIP like shown below.
> > 
> > Care to explain why?
> 
> I don't know. Your updated diff is causing below oops.
> 
> ----------
> [   90.621890] Out of memory: Kill process 2671 (a.out) score 999 or sacrifice child
> [   90.624636] Killed process 2671 (a.out) total-vm:4172kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB
> [   90.861308] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> [   90.863695] Modules linked in: coretemp pcspkr sg vmw_vmci shpchp i2c_piix4 sd_mod ata_generic pata_acpi serio_raw vmwgfx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mptspi scsi_transport_spi mptscsih ahci mptbase libahci drm e1000 ata_piix i2c_core libata ipv6
> [   90.870672] CPU: 2 PID: 47 Comm: oom_reaper Not tainted 4.12.0-rc5+ #128
> [   90.872929] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
> [   90.875995] task: ffff88007b6cd2c0 task.stack: ffff88007b6d0000
> [   90.878290] RIP: 0010:__oom_reap_task_mm+0xa1/0x160

What does this dissassemble to on your kernel? Care to post addr2line?

[...]
 
> It is you who should explain why.

I can definitely try but it was really impossible to deduce that you
have seen an oops from your previous email...

> I found my patch via trial and error.
> 
> > [...]
> >  
> > > Since the OOM reaper does not reap hugepages, khugepaged_exit() part could be
> > > safe.
> > 
> > I think you are mixing hugetlb and THP pages here. khugepaged_exit is
> > about later and we do unmap those.
> 
> OK.
> 
> > 
> > > But ksm_exit() part might interfere.
> > 
> > How?
> 
> Why you think it does not interfere?

Because it doesn't modify address space in any way.

> Please explain it in your patch description because your patch is
> trying to do a tricky thing. I'm not a MM person. I just suspect
> what you think no problem.

yeah, poking holes into a patch is a reasonable approach but if you make
a statement that "ksm_exit() part might interfere." then you should back
it by an argument.
 
> > > If it is guaranteed to be safe,
> > > what will go wrong if we move uprobe_clear_state()/exit_aio()/ksm_exit() etc.
> > > to just before mmdrop() (i.e. after setting MMF_OOM_SKIP) ?
> > 
> > I do not see why those matter and why they should be any special. Unless
> > I miss anything we really do only care about page table tear down and
> > the address space modification. They do none of that.
> 
> I think the patch I posted at
> http://lkml.kernel.org/r/201706162122.ACE95321.tOFLOOVFFHMSJQ@I-love.SAKURA.ne.jp
> will be safer, and you agree that a solution which is fully contained inside
> the oom proper would be preferable. Thus, let's start checking that patch.

Yes I will keep thinking about your approach some more but it indeed
seems easier and less tricky.

-- 
Michal Hocko
SUSE Labs

WARNING: multiple messages have this Message-ID (diff)
From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: rientjes@google.com, akpm@linux-foundation.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: Re: [patch] mm, oom: prevent additional oom kills before memoryis freed
Date: Fri, 16 Jun 2017 16:42:37 +0200	[thread overview]
Message-ID: <20170616144237.GP30580@dhcp22.suse.cz> (raw)
In-Reply-To: <201706162326.IEJ52125.JFFtMVQOSLHOFO@I-love.SAKURA.ne.jp>

On Fri 16-06-17 23:26:20, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Fri 16-06-17 19:27:19, Tetsuo Handa wrote:
> > > Michal Hocko wrote:
> > > > On Fri 16-06-17 09:54:34, Tetsuo Handa wrote:
> > > > [...]
> > > > > And the patch you proposed is broken.
> > > > 
> > > > Thanks for your testing!
> > > >  
> > > > > ----------
> > > > > [  161.846202] Out of memory: Kill process 6331 (a.out) score 999 or sacrifice child
> > > > > [  161.850327] Killed process 6331 (a.out) total-vm:4172kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB
> > > > > [  161.858503] ------------[ cut here ]------------
> > > > > [  161.861512] kernel BUG at mm/memory.c:1381!
> > > > 
> > > > BUG_ON(addr >= end) suggests our vma has trimmed. I guess I see what is
> > > > going on here.
> > > > __oom_reap_task_mm				exit_mmap
> > > > 						  free_pgtables
> > > > 						  up_write(mm->mmap_sem)
> > > >   down_read_trylock(&mm->mmap_sem)
> > > >   						  remove_vma
> > > >     unmap_page_range
> > > > 
> > > > So we need to extend the mmap_sem coverage. See the updated diff (not
> > > > the full proper patch yet).
> > > 
> > > That diff is still wrong. We need to prevent __oom_reap_task_mm() from calling
> > > unmap_page_range() when __mmput() already called exit_mm(), by setting/checking
> > > MMF_OOM_SKIP like shown below.
> > 
> > Care to explain why?
> 
> I don't know. Your updated diff is causing below oops.
> 
> ----------
> [   90.621890] Out of memory: Kill process 2671 (a.out) score 999 or sacrifice child
> [   90.624636] Killed process 2671 (a.out) total-vm:4172kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB
> [   90.861308] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> [   90.863695] Modules linked in: coretemp pcspkr sg vmw_vmci shpchp i2c_piix4 sd_mod ata_generic pata_acpi serio_raw vmwgfx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mptspi scsi_transport_spi mptscsih ahci mptbase libahci drm e1000 ata_piix i2c_core libata ipv6
> [   90.870672] CPU: 2 PID: 47 Comm: oom_reaper Not tainted 4.12.0-rc5+ #128
> [   90.872929] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
> [   90.875995] task: ffff88007b6cd2c0 task.stack: ffff88007b6d0000
> [   90.878290] RIP: 0010:__oom_reap_task_mm+0xa1/0x160

What does this dissassemble to on your kernel? Care to post addr2line?

[...]
 
> It is you who should explain why.

I can definitely try but it was really impossible to deduce that you
have seen an oops from your previous email...

> I found my patch via trial and error.
> 
> > [...]
> >  
> > > Since the OOM reaper does not reap hugepages, khugepaged_exit() part could be
> > > safe.
> > 
> > I think you are mixing hugetlb and THP pages here. khugepaged_exit is
> > about later and we do unmap those.
> 
> OK.
> 
> > 
> > > But ksm_exit() part might interfere.
> > 
> > How?
> 
> Why you think it does not interfere?

Because it doesn't modify address space in any way.

> Please explain it in your patch description because your patch is
> trying to do a tricky thing. I'm not a MM person. I just suspect
> what you think no problem.

yeah, poking holes into a patch is a reasonable approach but if you make
a statement that "ksm_exit() part might interfere." then you should back
it by an argument.
 
> > > If it is guaranteed to be safe,
> > > what will go wrong if we move uprobe_clear_state()/exit_aio()/ksm_exit() etc.
> > > to just before mmdrop() (i.e. after setting MMF_OOM_SKIP) ?
> > 
> > I do not see why those matter and why they should be any special. Unless
> > I miss anything we really do only care about page table tear down and
> > the address space modification. They do none of that.
> 
> I think the patch I posted at
> http://lkml.kernel.org/r/201706162122.ACE95321.tOFLOOVFFHMSJQ@I-love.SAKURA.ne.jp
> will be safer, and you agree that a solution which is fully contained inside
> the oom proper would be preferable. Thus, let's start checking that patch.

Yes I will keep thinking about your approach some more but it indeed
seems easier and less tricky.

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-06-16 14:42 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-14 23:43 [patch] mm, oom: prevent additional oom kills before memory is freed David Rientjes
2017-06-14 23:43 ` David Rientjes
2017-06-15 10:39 ` Michal Hocko
2017-06-15 10:39   ` Michal Hocko
2017-06-15 10:53   ` Tetsuo Handa
2017-06-15 10:53     ` Tetsuo Handa
2017-06-15 11:01     ` Michal Hocko
2017-06-15 11:01       ` Michal Hocko
2017-06-15 11:32       ` Tetsuo Handa
2017-06-15 11:32         ` Tetsuo Handa
2017-06-15 12:03         ` Michal Hocko
2017-06-15 12:03           ` Michal Hocko
2017-06-15 12:13           ` Michal Hocko
2017-06-15 12:13             ` Michal Hocko
2017-06-15 13:01             ` Tetsuo Handa
2017-06-15 13:01               ` Tetsuo Handa
2017-06-15 13:22               ` Michal Hocko
2017-06-15 13:22                 ` Michal Hocko
2017-06-15 21:43                 ` Tetsuo Handa
2017-06-15 21:43                   ` Tetsuo Handa
2017-06-15 21:37               ` David Rientjes
2017-06-15 21:37                 ` David Rientjes
2017-06-15 12:20       ` Michal Hocko
2017-06-15 12:20         ` Michal Hocko
2017-06-15 21:26   ` David Rientjes
2017-06-15 21:26     ` David Rientjes
2017-06-15 21:41     ` Michal Hocko
2017-06-15 21:41       ` Michal Hocko
2017-06-15 22:03       ` David Rientjes
2017-06-15 22:03         ` David Rientjes
2017-06-15 22:12         ` Michal Hocko
2017-06-15 22:12           ` Michal Hocko
2017-06-15 22:42           ` David Rientjes
2017-06-15 22:42             ` David Rientjes
2017-06-16  8:06             ` Michal Hocko
2017-06-16  8:06               ` Michal Hocko
2017-06-16  0:54           ` Tetsuo Handa
2017-06-16  0:54             ` Tetsuo Handa
2017-06-16  4:00             ` Tetsuo Handa
2017-06-16  4:00               ` Tetsuo Handa
2017-06-16  8:39             ` Michal Hocko
2017-06-16  8:39               ` Michal Hocko
2017-06-16 10:27               ` Tetsuo Handa
2017-06-16 10:27                 ` Tetsuo Handa
2017-06-16 11:02                 ` Michal Hocko
2017-06-16 11:02                   ` Michal Hocko
2017-06-16 14:26                   ` Re: [patch] mm, oom: prevent additional oom kills before memoryis freed Tetsuo Handa
2017-06-16 14:26                     ` Tetsuo Handa
2017-06-16 14:42                     ` Michal Hocko [this message]
2017-06-16 14:42                       ` Michal Hocko
2017-06-17 13:30                       ` Re: [patch] mm, oom: prevent additional oom kills before memory is freed Tetsuo Handa
2017-06-17 13:30                         ` Tetsuo Handa
2017-06-23 12:38                         ` Michal Hocko
2017-06-23 12:38                           ` Michal Hocko
2017-06-16 12:22       ` Tetsuo Handa
2017-06-16 12:22         ` Tetsuo Handa
2017-06-16 14:12         ` Michal Hocko
2017-06-16 14:12           ` Michal Hocko
2017-06-17  5:17           ` [PATCH] mm,oom_kill: Close race window of needlessly selecting new victims Tetsuo Handa
2017-06-17  5:17             ` Tetsuo Handa
2017-06-20 22:12             ` David Rientjes
2017-06-20 22:12               ` David Rientjes
2017-06-21  2:17               ` Tetsuo Handa
2017-06-21 20:31                 ` David Rientjes
2017-06-21 20:31                   ` David Rientjes
2017-06-22  0:53                   ` Tetsuo Handa
2017-06-23 12:45                     ` Michal Hocko
2017-06-23 12:45                       ` Michal Hocko
2017-06-21 13:18               ` Michal Hocko
2017-06-21 13:18                 ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170616144237.GP30580@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.