* [PATCH 0/2] THP mlock fix
@ 2015-12-29 20:46 Kirill A. Shutemov
  2015-12-29 20:46 ` [PATCH 1/2] mm, oom: skip mlocked VMAs in __oom_reap_vmas() Kirill A. Shutemov
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Kirill A. Shutemov @ 2015-12-29 20:46 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Sasha Levin, linux-mm, Kirill A. Shutemov

Hi Andrew,

There are two patches below. I believe either of them would fix the bug
reported by Sasha, but it is worth applying both.

Sasha, as I cannot trigger the bug, I would like to have your Tested-by.

Kirill A. Shutemov (2):
  mm, oom: skip mlocked VMAs in __oom_reap_vmas()
  mm, thp: clear PG_mlocked when last mapping gone

 mm/oom_kill.c | 7 +++++++
 mm/rmap.c     | 3 +++
 2 files changed, 10 insertions(+)

-- 
2.6.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* [PATCH 1/2] mm, oom: skip mlocked VMAs in __oom_reap_vmas()
  2015-12-29 20:46 [PATCH 0/2] THP mlock fix Kirill A. Shutemov
@ 2015-12-29 20:46 ` Kirill A. Shutemov
  2016-01-05 12:47   ` Michal Hocko
  2016-01-05 13:33   ` Michal Hocko
  2015-12-29 20:46 ` [PATCH 2/2] mm, thp: clear PG_mlocked when last mapping gone Kirill A. Shutemov
  2015-12-29 20:52 ` [PATCH 0/2] THP mlock fix Sasha Levin
  2 siblings, 2 replies; 14+ messages in thread
From: Kirill A. Shutemov @ 2015-12-29 20:46 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Sasha Levin, linux-mm, Kirill A. Shutemov, Michal Hocko

As far as I can see, we explicitly munlock pages everywhere before unmapping
them. The only case where we don't do that is the OOM reaper.

I don't think we should bother with munlocking in this case, we can just
skip the locked VMA.

I think this patch would fix this crash:
 http://lkml.kernel.org/r/5661FBB6.6050307@oracle.com

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
---
 mm/oom_kill.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 4b0a5d8b92e1..fe58d76c1215 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -447,6 +447,13 @@ static bool __oom_reap_vmas(struct mm_struct *mm)
 			continue;
 
 		/*
+		 * mlocked VMAs require explicit munlocking before unmap.
+		 * Let's keep it simple here and skip such VMAs.
+		 */
+		if (vma->vm_flags & VM_LOCKED)
+			continue;
+
+		/*
 		 * Only anonymous pages have a good chance to be dropped
 		 * without additional steps which we cannot afford as we
 		 * are OOM already.
-- 
2.6.4
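
As a rough illustration of the check added above, here is a userspace sketch of the reaper loop. It is not kernel code: `struct vma_model`, `reap_vmas_model()` and the flag values are hypothetical stand-ins, and setting `reaped` only marks the spot where the kernel would call unmap_page_range().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values only; the real ones are defined by the kernel. */
#define VM_SHARED 0x1u
#define VM_LOCKED 0x2u

/* Hypothetical stand-in for struct vm_area_struct. */
struct vma_model {
	unsigned int vm_flags;
	bool anonymous;	/* models vma_is_anonymous() */
	bool reaped;	/* set where the kernel would unmap the range */
};

/* Walk the VMAs the way the patched loop does; return how many were reaped. */
static size_t reap_vmas_model(struct vma_model *vmas, size_t n)
{
	size_t reaped = 0;

	assert(vmas != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct vma_model *vma = &vmas[i];

		/* The check from patch 1/2: leave mlocked VMAs alone. */
		if (vma->vm_flags & VM_LOCKED)
			continue;

		/* Only anonymous or private mappings are torn down. */
		if (vma->anonymous || !(vma->vm_flags & VM_SHARED)) {
			vma->reaped = true;
			reaped++;
		}
	}
	return reaped;
}
```

With one plain anonymous VMA, one mlocked anonymous VMA and one shared file mapping, only the first is reaped in this model.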


* [PATCH 2/2] mm, thp: clear PG_mlocked when last mapping gone
  2015-12-29 20:46 [PATCH 0/2] THP mlock fix Kirill A. Shutemov
  2015-12-29 20:46 ` [PATCH 1/2] mm, oom: skip mlocked VMAs in __oom_reap_vmas() Kirill A. Shutemov
@ 2015-12-29 20:46 ` Kirill A. Shutemov
  2016-01-05  9:37   ` Vlastimil Babka
  2015-12-29 20:52 ` [PATCH 0/2] THP mlock fix Sasha Levin
  2 siblings, 1 reply; 14+ messages in thread
From: Kirill A. Shutemov @ 2015-12-29 20:46 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Sasha Levin, linux-mm, Kirill A. Shutemov

I missed clear_page_mlock() in page_remove_anon_compound_rmap().
It usually shouldn't cause any problems since we munlock pages
explicitly, but in conjunction with missed munlock in __oom_reap_vmas()
it causes problems:
 http://lkml.kernel.org/r/5661FBB6.6050307@oracle.com

Let's put it in place and mirror the behaviour for small pages.

NOTE: I'm not entirely sure why we ever need clear_page_mlock() in
page_remove_rmap() codepath. It looks redundant to me as we munlock
pages anyway. But this is out of scope of the patch.

The patch can be folded into
 "thp: allow mlocked THP again"

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
---
 mm/rmap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/rmap.c b/mm/rmap.c
index 384516fb7495..68af2e32f7ed 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1356,6 +1356,9 @@ static void page_remove_anon_compound_rmap(struct page *page)
 		nr = HPAGE_PMD_NR;
 	}
 
+	if (unlikely(PageMlocked(page)))
+		clear_page_mlock(page);
+
 	if (nr) {
 		__mod_zone_page_state(page_zone(page), NR_ANON_PAGES, -nr);
 		deferred_split_huge_page(page);
-- 
2.6.4


* Re: [PATCH 0/2] THP mlock fix
  2015-12-29 20:46 [PATCH 0/2] THP mlock fix Kirill A. Shutemov
  2015-12-29 20:46 ` [PATCH 1/2] mm, oom: skip mlocked VMAs in __oom_reap_vmas() Kirill A. Shutemov
  2015-12-29 20:46 ` [PATCH 2/2] mm, thp: clear PG_mlocked when last mapping gone Kirill A. Shutemov
@ 2015-12-29 20:52 ` Sasha Levin
  2 siblings, 0 replies; 14+ messages in thread
From: Sasha Levin @ 2015-12-29 20:52 UTC (permalink / raw)
  To: Kirill A. Shutemov, Andrew Morton; +Cc: linux-mm

On 12/29/2015 03:46 PM, Kirill A. Shutemov wrote:
> Hi Andrew,
> 
> There are two patches below. I believe either of them would fix the bug
> reported by Sasha, but it is worth applying both.
> 
> Sasha, as I cannot trigger the bug, I would like to have your Tested-by.
> 
> Kirill A. Shutemov (2):
>   mm, oom: skip mlocked VMAs in __oom_reap_vmas()
>   mm, thp: clear PG_mlocked when last mapping gone
> 
>  mm/oom_kill.c | 7 +++++++
>  mm/rmap.c     | 3 +++
>  2 files changed, 10 insertions(+)
> 

Fixed for me.

	Tested-by: Sasha Levin <sasha.levin@oracle.com>


Thanks,
Sasha


* Re: [PATCH 2/2] mm, thp: clear PG_mlocked when last mapping gone
  2015-12-29 20:46 ` [PATCH 2/2] mm, thp: clear PG_mlocked when last mapping gone Kirill A. Shutemov
@ 2016-01-05  9:37   ` Vlastimil Babka
  2016-01-05 14:37     ` Kirill A. Shutemov
  0 siblings, 1 reply; 14+ messages in thread
From: Vlastimil Babka @ 2016-01-05  9:37 UTC (permalink / raw)
  To: Kirill A. Shutemov, Andrew Morton; +Cc: Sasha Levin, linux-mm

On 12/29/2015 09:46 PM, Kirill A. Shutemov wrote:
> I missed clear_page_mlock() in page_remove_anon_compound_rmap().
> It usually shouldn't cause any problems since we munlock pages
> explicitly, but in conjunction with missed munlock in __oom_reap_vmas()
> it causes problems:
>   http://lkml.kernel.org/r/5661FBB6.6050307@oracle.com
>
> Let's put it in place and mirror the behaviour for small pages.
>
> NOTE: I'm not entirely sure why we ever need clear_page_mlock() in
> page_remove_rmap() codepath. It looks redundant to me as we munlock
> pages anyway. But this is out of scope of the patch.

Git blame actually quickly points to commit e6c509f854550 which explains 
it :)

>
> The patch can be folded into
>   "thp: allow mlocked THP again"
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Ack.

> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> ---
>   mm/rmap.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 384516fb7495..68af2e32f7ed 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1356,6 +1356,9 @@ static void page_remove_anon_compound_rmap(struct page *page)
>   		nr = HPAGE_PMD_NR;
>   	}
>
> +	if (unlikely(PageMlocked(page)))
> +		clear_page_mlock(page);
> +
>   	if (nr) {
>   		__mod_zone_page_state(page_zone(page), NR_ANON_PAGES, -nr);
>   		deferred_split_huge_page(page);
>


* Re: [PATCH 1/2] mm, oom: skip mlocked VMAs in __oom_reap_vmas()
  2015-12-29 20:46 ` [PATCH 1/2] mm, oom: skip mlocked VMAs in __oom_reap_vmas() Kirill A. Shutemov
@ 2016-01-05 12:47   ` Michal Hocko
  2016-01-05 13:10     ` Kirill A. Shutemov
  2016-01-05 13:33   ` Michal Hocko
  1 sibling, 1 reply; 14+ messages in thread
From: Michal Hocko @ 2016-01-05 12:47 UTC (permalink / raw)
  To: Kirill A. Shutemov; +Cc: Andrew Morton, Sasha Levin, linux-mm

On Tue 29-12-15 23:46:29, Kirill A. Shutemov wrote:
> As far as I can see, we explicitly munlock pages everywhere before unmapping
> them. The only case where we don't do that is the OOM reaper.

Very well spotted!

> I don't think we should bother with munlocking in this case, we can just
> skip the locked VMA.

Why cannot we simply munlock them here for the private mappings?

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 4b0a5d8b92e1..25dd7cd6fb5e 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -456,9 +456,12 @@ static bool __oom_reap_vmas(struct mm_struct *mm)
 		 * we do not want to block exit_mmap by keeping mm ref
 		 * count elevated without a good reason.
 		 */
-		if (vma_is_anonymous(vma) || !(vma->vm_flags & VM_SHARED))
+		if (vma_is_anonymous(vma) || !(vma->vm_flags & VM_SHARED)) {
+			if (vma->vm_flags & VM_LOCKED)
+				munlock_vma_pages_all(vma);
 			unmap_page_range(&tlb, vma, vma->vm_start, vma->vm_end,
 					 &details);
+		}
 	}
 	tlb_finish_mmu(&tlb, 0, -1);
 	up_read(&mm->mmap_sem);
-- 
Michal Hocko
SUSE Labs
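
To make the difference between the two approaches concrete, here is a userspace sketch of the alternative above: munlock first, then unmap private mappings. Everything in it is an assumption for illustration. `struct locked_vma_model`, `munlock_all_model()` and `reap_private_model()` are hypothetical names, the flag values are made up, and the model only counts pages instead of touching real page state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values only; the real ones are defined by the kernel. */
#define VM_SHARED 0x1u
#define VM_LOCKED 0x2u

/* Hypothetical stand-in for a VMA plus the pages it maps. */
struct locked_vma_model {
	unsigned int vm_flags;
	bool anonymous;		/* models vma_is_anonymous() */
	size_t mlocked_pages;	/* mapped pages still marked mlocked */
	size_t mapped_pages;	/* pages currently mapped by this VMA */
};

/* Models munlock_vma_pages_all(): no page stays marked mlocked. */
static void munlock_all_model(struct locked_vma_model *vma)
{
	vma->mlocked_pages = 0;
}

/*
 * Tear down anonymous or private mappings. Returns the number of pages
 * that were unmapped while still marked mlocked, which is the condition
 * this thread is trying to avoid.
 */
static size_t reap_private_model(struct locked_vma_model *vmas, size_t n,
				 bool munlock_first)
{
	size_t unmapped_while_mlocked = 0;

	assert(vmas != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct locked_vma_model *vma = &vmas[i];

		if (!(vma->anonymous || !(vma->vm_flags & VM_SHARED)))
			continue;

		if (munlock_first && (vma->vm_flags & VM_LOCKED))
			munlock_all_model(vma);

		unmapped_while_mlocked += vma->mlocked_pages;
		vma->mapped_pages = 0;	/* models unmap_page_range() */
	}
	return unmapped_while_mlocked;
}
```

In this model, calling reap_private_model() with munlock_first set reports zero pages unmapped while mlocked, and calling it without reports every page of the locked VMA.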


* Re: [PATCH 1/2] mm, oom: skip mlocked VMAs in __oom_reap_vmas()
  2016-01-05 12:47   ` Michal Hocko
@ 2016-01-05 13:10     ` Kirill A. Shutemov
  2016-01-05 13:31       ` Michal Hocko
  0 siblings, 1 reply; 14+ messages in thread
From: Kirill A. Shutemov @ 2016-01-05 13:10 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Kirill A. Shutemov, Andrew Morton, Sasha Levin, linux-mm

On Tue, Jan 05, 2016 at 01:47:35PM +0100, Michal Hocko wrote:
> On Tue 29-12-15 23:46:29, Kirill A. Shutemov wrote:
> > As far as I can see, we explicitly munlock pages everywhere before unmapping
> > them. The only case where we don't do that is the OOM reaper.
> 
> Very well spotted!
> 
> > I don't think we should bother with munlocking in this case, we can just
> > skip the locked VMA.
> 
> Why cannot we simply munlock them here for the private mappings?

It's probably the right thing to do, but I wanted to fix the bug first.
And I wasn't ready to investigate the context the reaper works in to check
if it's safe to munlock there. For instance, munlock would take the page lock,
and I'm not sure at the moment whether that can lead to a deadlock in
some scenario. So I chose the safer fix.

If calling munlock is always safe where unmap happens, why not move it inside
unmap?

> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 4b0a5d8b92e1..25dd7cd6fb5e 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -456,9 +456,12 @@ static bool __oom_reap_vmas(struct mm_struct *mm)
>  		 * we do not want to block exit_mmap by keeping mm ref
>  		 * count elevated without a good reason.
>  		 */
> -		if (vma_is_anonymous(vma) || !(vma->vm_flags & VM_SHARED))
> +		if (vma_is_anonymous(vma) || !(vma->vm_flags & VM_SHARED)) {
> +			if (vma->vm_flags & VM_LOCKED)
> +				munlock_vma_pages_all(vma);
>  			unmap_page_range(&tlb, vma, vma->vm_start, vma->vm_end,
>  					 &details);
> +		}
>  	}
>  	tlb_finish_mmu(&tlb, 0, -1);
>  	up_read(&mm->mmap_sem);
-- 
 Kirill A. Shutemov


* Re: [PATCH 1/2] mm, oom: skip mlocked VMAs in __oom_reap_vmas()
  2016-01-05 13:10     ` Kirill A. Shutemov
@ 2016-01-05 13:31       ` Michal Hocko
  2016-01-05 15:03         ` Kirill A. Shutemov
  0 siblings, 1 reply; 14+ messages in thread
From: Michal Hocko @ 2016-01-05 13:31 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Kirill A. Shutemov, Andrew Morton, Sasha Levin, linux-mm

On Tue 05-01-16 15:10:39, Kirill A. Shutemov wrote:
> On Tue, Jan 05, 2016 at 01:47:35PM +0100, Michal Hocko wrote:
> > On Tue 29-12-15 23:46:29, Kirill A. Shutemov wrote:
> > > As far as I can see, we explicitly munlock pages everywhere before unmapping
> > > them. The only case where we don't do that is the OOM reaper.
> > 
> > Very well spotted!
> > 
> > > I don't think we should bother with munlocking in this case, we can just
> > > skip the locked VMA.
> > 
> > Why cannot we simply munlock them here for the private mappings?
> 
> It's probably the right thing to do, but I wanted to fix the bug first.

Fair enough. It is surely simpler, although I think we should tear
private mappings down even when mlocked. I can cook up a separate patch
on top of yours which is obviously correct and can be folded into the
original one.

> And I wasn't ready to investigate the context the reaper works in to check
> if it's safe to munlock there. For instance, munlock would take the page lock,
> and I'm not sure at the moment whether that can lead to a deadlock in
> some scenario. So I chose the safer fix.

The reaper is a flat kernel thread context which doesn't sit on any locks
(except for mmap_sem for read, taken on the way), so I do not immediately
see any potential for a deadlock. If the original context which
wakes it up depends on the page lock to move on, then we would be screwed
already, because we can end up doing exit_mmap in that context
and so end up doing munlock as well.

> If calling munlock is always safe where unmap happens, why not move it inside
> unmap?

This would be less error prone for sure. I would rather see it as a
separate patch which explains why it is safe in all cases though.
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 1/2] mm, oom: skip mlocked VMAs in __oom_reap_vmas()
  2015-12-29 20:46 ` [PATCH 1/2] mm, oom: skip mlocked VMAs in __oom_reap_vmas() Kirill A. Shutemov
  2016-01-05 12:47   ` Michal Hocko
@ 2016-01-05 13:33   ` Michal Hocko
  2016-01-05 15:03     ` Kirill A. Shutemov
  1 sibling, 1 reply; 14+ messages in thread
From: Michal Hocko @ 2016-01-05 13:33 UTC (permalink / raw)
  To: Kirill A. Shutemov, Sasha Levin; +Cc: Andrew Morton, linux-mm

On Tue 29-12-15 23:46:29, Kirill A. Shutemov wrote:
> As far as I can see, we explicitly munlock pages everywhere before unmapping
> them. The only case where we don't do that is the OOM reaper.
> 
> I don't think we should bother with munlocking in this case, we can just
> skip the locked VMA.
> 
> I think this patch would fix this crash:
>  http://lkml.kernel.org/r/5661FBB6.6050307@oracle.com

Btw, do you happen to have the full log here? The OOM reaper can only
interfere if the OOM killer was invoked.
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 2/2] mm, thp: clear PG_mlocked when last mapping gone
  2016-01-05  9:37   ` Vlastimil Babka
@ 2016-01-05 14:37     ` Kirill A. Shutemov
  0 siblings, 0 replies; 14+ messages in thread
From: Kirill A. Shutemov @ 2016-01-05 14:37 UTC (permalink / raw)
  To: Vlastimil Babka, Hugh Dickins
  Cc: Kirill A. Shutemov, Andrew Morton, Sasha Levin, linux-mm

On Tue, Jan 05, 2016 at 10:37:18AM +0100, Vlastimil Babka wrote:
> On 12/29/2015 09:46 PM, Kirill A. Shutemov wrote:
> >I missed clear_page_mlock() in page_remove_anon_compound_rmap().
> >It usually shouldn't cause any problems since we munlock pages
> >explicitly, but in conjunction with missed munlock in __oom_reap_vmas()
> >it causes problems:
> >  http://lkml.kernel.org/r/5661FBB6.6050307@oracle.com
> >
> >Let's put it in place and mirror the behaviour for small pages.
> >
> >NOTE: I'm not entirely sure why we ever need clear_page_mlock() in
> >page_remove_rmap() codepath. It looks redundant to me as we munlock
> >pages anyway. But this is out of scope of the patch.
> 
> Git blame actually quickly points to commit e6c509f854550 which explains it
> :)

Okay, it explains the situation somewhat.

The thing which still makes me a bit uncomfortable with the situation is
that we remove PG_mlocked only when the last mapping of the page is gone.
It's not necessarily the mapping which was VM_LOCKED. It means we can rely
on the clear_page_mlock() inside page_remove_rmap() only when we remove all
page mappings at once (like in the truncate case).

The clear_page_mlock() also helps hide real mlock leak bugs, like the one fixed
by patch 1/2: we silently munlock the page when the last mapping is gone, even
if the VMA was never mlocked in the first place.

That's kinda suboptimal.
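
A tiny userspace sketch of the behaviour described above, under the assumption that the flag is dropped only when the mapping count reaches zero. `struct page_model` and `remove_rmap_model()` are hypothetical names used for illustration, not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parts of struct page that matter here. */
struct page_model {
	int mapcount;	/* number of mappings of the page */
	bool mlocked;	/* models PG_mlocked */
};

/*
 * Models page_remove_rmap() with the clear_page_mlock() call in it:
 * drop one mapping and clear the flag only if that was the last one.
 * Returns true if this call cleared the flag.
 */
static bool remove_rmap_model(struct page_model *page)
{
	assert(page->mapcount > 0);

	if (--page->mapcount > 0)
		return false;	/* other mappings remain; flag untouched */

	if (page->mlocked) {
		page->mlocked = false;
		return true;
	}
	return false;
}
```

For a page mapped twice, removing the first mapping leaves the flag set no matter which of the two VMAs was VM_LOCKED. Only the second removal clears it.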

> 
> >
> >The patch can be folded into
> >  "thp: allow mlocked THP again"
> >
> >Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> 
> Ack.
> 
> >Reported-by: Sasha Levin <sasha.levin@oracle.com>
> >---
> >  mm/rmap.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> >diff --git a/mm/rmap.c b/mm/rmap.c
> >index 384516fb7495..68af2e32f7ed 100644
> >--- a/mm/rmap.c
> >+++ b/mm/rmap.c
> >@@ -1356,6 +1356,9 @@ static void page_remove_anon_compound_rmap(struct page *page)
> >  		nr = HPAGE_PMD_NR;
> >  	}
> >
> >+	if (unlikely(PageMlocked(page)))
> >+		clear_page_mlock(page);
> >+
> >  	if (nr) {
> >  		__mod_zone_page_state(page_zone(page), NR_ANON_PAGES, -nr);
> >  		deferred_split_huge_page(page);
> >
> 

-- 
 Kirill A. Shutemov


* Re: [PATCH 1/2] mm, oom: skip mlocked VMAs in __oom_reap_vmas()
  2016-01-05 13:31       ` Michal Hocko
@ 2016-01-05 15:03         ` Kirill A. Shutemov
  2016-01-05 15:45           ` Michal Hocko
  0 siblings, 1 reply; 14+ messages in thread
From: Kirill A. Shutemov @ 2016-01-05 15:03 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Kirill A. Shutemov, Andrew Morton, Sasha Levin, linux-mm

On Tue, Jan 05, 2016 at 02:31:23PM +0100, Michal Hocko wrote:
> On Tue 05-01-16 15:10:39, Kirill A. Shutemov wrote:
> > On Tue, Jan 05, 2016 at 01:47:35PM +0100, Michal Hocko wrote:
> > > On Tue 29-12-15 23:46:29, Kirill A. Shutemov wrote:
> > > > As far as I can see, we explicitly munlock pages everywhere before unmapping
> > > > them. The only case where we don't do that is the OOM reaper.
> > > 
> > > Very well spotted!
> > > 
> > > > I don't think we should bother with munlocking in this case, we can just
> > > > skip the locked VMA.
> > > 
> > > Why cannot we simply munlock them here for the private mappings?
> > 
> > It's probably the right thing to do, but I wanted to fix the bug first.
> 
> Fair enough. It is surely simpler, although I think we should tear
> private mappings down even when mlocked. I can cook up a separate patch
> on top of yours which is obviously correct and can be folded into the
> original one.

I prefer it not to be folded, so that we are able to revert it if something goes wrong.

> > And I wasn't ready to investigate the context the reaper works in to check
> > if it's safe to munlock there. For instance, munlock would take the page lock,
> > and I'm not sure at the moment whether that can lead to a deadlock in
> > some scenario. So I chose the safer fix.
> 
> The reaper is a flat kernel thread context which doesn't sit on any locks
> (except for mmap_sem for read, taken on the way), so I do not immediately
> see any potential for a deadlock. If the original context which
> wakes it up depends on the page lock to move on, then we would be screwed
> already, because we can end up doing exit_mmap in that context
> and so end up doing munlock as well.

Can the target process hold the page lock? Or a process in direct reclaim?
Basically, I don't know what I'm talking about ;-P

> > If calling munlock is always safe where unmap happens, why not move it inside
> > unmap?
> 
> This would be less error prone for sure. I would rather see it as a
> separate patch which explains why it is safe in all cases though.

I haven't subscribed to implementing this just yet :)

-- 
 Kirill A. Shutemov


* Re: [PATCH 1/2] mm, oom: skip mlocked VMAs in __oom_reap_vmas()
  2016-01-05 13:33   ` Michal Hocko
@ 2016-01-05 15:03     ` Kirill A. Shutemov
  2016-01-05 15:44       ` Sasha Levin
  0 siblings, 1 reply; 14+ messages in thread
From: Kirill A. Shutemov @ 2016-01-05 15:03 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Kirill A. Shutemov, Sasha Levin, Andrew Morton, linux-mm

On Tue, Jan 05, 2016 at 02:33:38PM +0100, Michal Hocko wrote:
> On Tue 29-12-15 23:46:29, Kirill A. Shutemov wrote:
> > As far as I can see, we explicitly munlock pages everywhere before unmapping
> > them. The only case where we don't do that is the OOM reaper.
> > 
> > I don't think we should bother with munlocking in this case, we can just
> > skip the locked VMA.
> > 
> > I think this patch would fix this crash:
> >  http://lkml.kernel.org/r/5661FBB6.6050307@oracle.com
> 
> Btw, do you happen to have the full log here? The OOM reaper can only
> interfere if the OOM killer was invoked.

No, I don't. Sasha?

-- 
 Kirill A. Shutemov


* Re: [PATCH 1/2] mm, oom: skip mlocked VMAs in __oom_reap_vmas()
  2016-01-05 15:03     ` Kirill A. Shutemov
@ 2016-01-05 15:44       ` Sasha Levin
  0 siblings, 0 replies; 14+ messages in thread
From: Sasha Levin @ 2016-01-05 15:44 UTC (permalink / raw)
  To: Kirill A. Shutemov, Michal Hocko
  Cc: Kirill A. Shutemov, Andrew Morton, linux-mm

On 01/05/2016 10:03 AM, Kirill A. Shutemov wrote:
> On Tue, Jan 05, 2016 at 02:33:38PM +0100, Michal Hocko wrote:
>> > On Tue 29-12-15 23:46:29, Kirill A. Shutemov wrote:
>>> > > As far as I can see, we explicitly munlock pages everywhere before unmapping
>>> > > them. The only case where we don't do that is the OOM reaper.
>>> > > 
>>> > > I don't think we should bother with munlocking in this case, we can just
>>> > > skip the locked VMA.
>>> > > 
>>> > > I think this patch would fix this crash:
>>> > >  http://lkml.kernel.org/r/5661FBB6.6050307@oracle.com
>> > 
>> > Btw, do you happen to have the full log here? The OOM reaper can only
>> > interfere if the OOM killer was invoked.
> No, I don't. Sasha?

I don't have the log, but my setup does invoke the OOM killer occasionally.


Thanks,
Sasha


* Re: [PATCH 1/2] mm, oom: skip mlocked VMAs in __oom_reap_vmas()
  2016-01-05 15:03         ` Kirill A. Shutemov
@ 2016-01-05 15:45           ` Michal Hocko
  0 siblings, 0 replies; 14+ messages in thread
From: Michal Hocko @ 2016-01-05 15:45 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Kirill A. Shutemov, Andrew Morton, Sasha Levin, linux-mm

On Tue 05-01-16 17:03:12, Kirill A. Shutemov wrote:
> On Tue, Jan 05, 2016 at 02:31:23PM +0100, Michal Hocko wrote:
> > On Tue 05-01-16 15:10:39, Kirill A. Shutemov wrote:
> > > On Tue, Jan 05, 2016 at 01:47:35PM +0100, Michal Hocko wrote:
> > > > On Tue 29-12-15 23:46:29, Kirill A. Shutemov wrote:
> > > > > As far as I can see, we explicitly munlock pages everywhere before unmapping
> > > > > them. The only case where we don't do that is the OOM reaper.
> > > > 
> > > > Very well spotted!
> > > > 
> > > > > I don't think we should bother with munlocking in this case, we can just
> > > > > skip the locked VMA.
> > > > 
> > > > Why cannot we simply munlock them here for the private mappings?
> > > 
> > > It's probably the right thing to do, but I wanted to fix the bug first.
> > 
> > Fair enough. It is surely simpler, although I think we should tear
> > private mappings down even when mlocked. I can cook up a separate patch
> > on top of yours which is obviously correct and can be folded into the
> > original one.
> 
> I prefer it not to be folded, so that we are able to revert it if something goes wrong.

Sorry, I meant that your fixup should be folded. The one allowing munlock
would be a separate fix.
> 
> > > And I wasn't ready to investigate the context the reaper works in to check
> > > if it's safe to munlock there. For instance, munlock would take the page lock,
> > > and I'm not sure at the moment whether that can lead to a deadlock in
> > > some scenario. So I chose the safer fix.
> > 
> > The reaper is a flat kernel thread context which doesn't sit on any locks
> > (except for mmap_sem for read, taken on the way), so I do not immediately
> > see any potential for a deadlock. If the original context which
> > wakes it up depends on the page lock to move on, then we would be screwed
> > already, because we can end up doing exit_mmap in that context
> > and so end up doing munlock as well.
> 
> Can the target process hold the page lock? Or a process in direct reclaim?

The target process is the OOM victim. It can be doing anything,
including holding the page lock. But it cannot be holding the page lock
while doing the allocation or invoking the OOM killer, because that would
be a deadlock already, even without the OOM reaper in the picture.

-- 
Michal Hocko
SUSE Labs


Thread overview: 14+ messages
2015-12-29 20:46 [PATCH 0/2] THP mlock fix Kirill A. Shutemov
2015-12-29 20:46 ` [PATCH 1/2] mm, oom: skip mlocked VMAs in __oom_reap_vmas() Kirill A. Shutemov
2016-01-05 12:47   ` Michal Hocko
2016-01-05 13:10     ` Kirill A. Shutemov
2016-01-05 13:31       ` Michal Hocko
2016-01-05 15:03         ` Kirill A. Shutemov
2016-01-05 15:45           ` Michal Hocko
2016-01-05 13:33   ` Michal Hocko
2016-01-05 15:03     ` Kirill A. Shutemov
2016-01-05 15:44       ` Sasha Levin
2015-12-29 20:46 ` [PATCH 2/2] mm, thp: clear PG_mlocked when last mapping gone Kirill A. Shutemov
2016-01-05  9:37   ` Vlastimil Babka
2016-01-05 14:37     ` Kirill A. Shutemov
2015-12-29 20:52 ` [PATCH 0/2] THP mlock fix Sasha Levin
