linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH 2/2] mm, oom_reaper: do not mmput synchronously from the oom reaper context
@ 2016-05-20  1:30 Minchan Kim
  2016-05-20  6:16 ` Michal Hocko
  0 siblings, 1 reply; 6+ messages in thread
From: Minchan Kim @ 2016-05-20  1:30 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Tetsuo Handa, David Rientjes, linux-mm, LKML,
	Michal Hocko

Forking new thread because my comment is not related to this patch's
purpose but found a thing during reading this patch.

On Tue, Apr 26, 2016 at 04:04:30PM +0200, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> Tetsuo has properly noted that mmput slow path might get blocked waiting
> for another party (e.g. exit_aio waits for an IO). If that happens the
> oom_reaper would be put out of the way and will not be able to process
> next oom victim. We should strive for making this context as reliable
> and independent on other subsystems as much as possible.
> 
> Introduce mmput_async which will perform the slow path from an async
> (WQ) context. This will delay the operation but that shouldn't be a
> problem because the oom_reaper has reclaimed the victim's address space
> for most cases as much as possible and the remaining context shouldn't
> bind too much memory anymore. The only exception is when mmap_sem
> trylock has failed which shouldn't happen too often.
> 
> The issue is only theoretical but not impossible.

The mmput_async is used for only OOM reaper which is enabled on CONFIG_MMU.
So until someone who want to use mmput_async in !CONFIG_MMU come out,
we could save sizeof(struct work_struct) per mm in !CONFIG_MMU.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] mm, oom_reaper: do not mmput synchronously from the oom reaper context
  2016-05-20  1:30 [PATCH 2/2] mm, oom_reaper: do not mmput synchronously from the oom reaper context Minchan Kim
@ 2016-05-20  6:16 ` Michal Hocko
  2016-05-20  7:12   ` Minchan Kim
  0 siblings, 1 reply; 6+ messages in thread
From: Michal Hocko @ 2016-05-20  6:16 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, Tetsuo Handa, David Rientjes, linux-mm, LKML

On Fri 20-05-16 10:30:53, Minchan Kim wrote:
> Forking new thread because my comment is not related to this patch's
> purpose but found a thing during reading this patch.
> 
> On Tue, Apr 26, 2016 at 04:04:30PM +0200, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > Tetsuo has properly noted that mmput slow path might get blocked waiting
> > for another party (e.g. exit_aio waits for an IO). If that happens the
> > oom_reaper would be put out of the way and will not be able to process
> > next oom victim. We should strive for making this context as reliable
> > and independent on other subsystems as much as possible.
> > 
> > Introduce mmput_async which will perform the slow path from an async
> > (WQ) context. This will delay the operation but that shouldn't be a
> > problem because the oom_reaper has reclaimed the victim's address space
> > for most cases as much as possible and the remaining context shouldn't
> > bind too much memory anymore. The only exception is when mmap_sem
> > trylock has failed which shouldn't happen too often.
> > 
> > The issue is only theoretical but not impossible.
> 
> The mmput_async is used for only OOM reaper which is enabled on CONFIG_MMU.
> So until someone who want to use mmput_async in !CONFIG_MMU come out,
> we could save sizeof(struct work_struct) per mm in !CONFIG_MMU.

You are right. What about the following?
---
>From 8f8a34bf00882bfc0b557ed79e0e9e956ac9d217 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Fri, 20 May 2016 08:14:39 +0200
Subject: [PATCH] mmotm:
 mm-oom_reaper-do-not-mmput-synchronously-from-the-oom-reaper-context-fix

mmput_async is currently used only from the oom_reaper which is defined
only for CONFIG_MMU. We can save work_struct in mm_struct for
!CONFIG_MMU.

Reported-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/linux/mm_types.h | 2 ++
 include/linux/sched.h    | 2 ++
 kernel/fork.c            | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index ab142ace96f3..a16dcb2efca4 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -514,7 +514,9 @@ struct mm_struct {
 #ifdef CONFIG_HUGETLB_PAGE
 	atomic_long_t hugetlb_usage;
 #endif
+#ifdef CONFIG_MMU
 	struct work_struct async_put_work;
+#endif
 };
 
 static inline void mm_init_cpumask(struct mm_struct *mm)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index df8778e72211..11b31ded65cf 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2604,10 +2604,12 @@ static inline void mmdrop(struct mm_struct * mm)
 
 /* mmput gets rid of the mappings and all user-space */
 extern void mmput(struct mm_struct *);
+#ifdef CONFIG_MMU
 /* same as above but performs the slow path from the async kontext. Can
  * be called from the atomic context as well
  */
 extern void mmput_async(struct mm_struct *);
+#endif
 
 /* Grab a reference to a task's mm, if it is not already going away */
 extern struct mm_struct *get_task_mm(struct task_struct *task);
diff --git a/kernel/fork.c b/kernel/fork.c
index e1dc6b02ac8b..1e3dc3af6845 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -732,6 +732,7 @@ void mmput(struct mm_struct *mm)
 }
 EXPORT_SYMBOL_GPL(mmput);
 
+#ifdef CONFIG_MMU
 static void mmput_async_fn(struct work_struct *work)
 {
 	struct mm_struct *mm = container_of(work, struct mm_struct, async_put_work);
@@ -745,6 +746,7 @@ void mmput_async(struct mm_struct *mm)
 		schedule_work(&mm->async_put_work);
 	}
 }
+#endif
 
 /**
  * set_mm_exe_file - change a reference to the mm's executable file
-- 
2.8.1


-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] mm, oom_reaper: do not mmput synchronously from the oom reaper context
  2016-05-20  6:16 ` Michal Hocko
@ 2016-05-20  7:12   ` Minchan Kim
  0 siblings, 0 replies; 6+ messages in thread
From: Minchan Kim @ 2016-05-20  7:12 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Andrew Morton, Tetsuo Handa, David Rientjes, linux-mm, LKML

On Fri, May 20, 2016 at 08:16:59AM +0200, Michal Hocko wrote:
> On Fri 20-05-16 10:30:53, Minchan Kim wrote:
> > Forking new thread because my comment is not related to this patch's
> > purpose but found a thing during reading this patch.
> > 
> > On Tue, Apr 26, 2016 at 04:04:30PM +0200, Michal Hocko wrote:
> > > From: Michal Hocko <mhocko@suse.com>
> > > 
> > > Tetsuo has properly noted that mmput slow path might get blocked waiting
> > > for another party (e.g. exit_aio waits for an IO). If that happens the
> > > oom_reaper would be put out of the way and will not be able to process
> > > next oom victim. We should strive for making this context as reliable
> > > and independent on other subsystems as much as possible.
> > > 
> > > Introduce mmput_async which will perform the slow path from an async
> > > (WQ) context. This will delay the operation but that shouldn't be a
> > > problem because the oom_reaper has reclaimed the victim's address space
> > > for most cases as much as possible and the remaining context shouldn't
> > > bind too much memory anymore. The only exception is when mmap_sem
> > > trylock has failed which shouldn't happen too often.
> > > 
> > > The issue is only theoretical but not impossible.
> > 
> > The mmput_async is used for only OOM reaper which is enabled on CONFIG_MMU.
> > So until someone who want to use mmput_async in !CONFIG_MMU come out,
> > we could save sizeof(struct work_struct) per mm in !CONFIG_MMU.
> 
> You are right. What about the following?
> ---
> From 8f8a34bf00882bfc0b557ed79e0e9e956ac9d217 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Fri, 20 May 2016 08:14:39 +0200
> Subject: [PATCH] mmotm:
>  mm-oom_reaper-do-not-mmput-synchronously-from-the-oom-reaper-context-fix
> 
> mmput_async is currently used only from the oom_reaper which is defined
> only for CONFIG_MMU. We can save work_struct in mm_struct for
> !CONFIG_MMU.
> 
> Reported-by: Minchan Kim <minchan@kernel.org>
> Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Minchan Kim <minchan@kernel.org>

Found a typo below although it was not caused by this patch.
My brand new glasses are really good for me.

> ---
>  include/linux/mm_types.h | 2 ++
>  include/linux/sched.h    | 2 ++
>  kernel/fork.c            | 2 ++
>  3 files changed, 6 insertions(+)
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index ab142ace96f3..a16dcb2efca4 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -514,7 +514,9 @@ struct mm_struct {
>  #ifdef CONFIG_HUGETLB_PAGE
>  	atomic_long_t hugetlb_usage;
>  #endif
> +#ifdef CONFIG_MMU
>  	struct work_struct async_put_work;
> +#endif
>  };
>  
>  static inline void mm_init_cpumask(struct mm_struct *mm)
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index df8778e72211..11b31ded65cf 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2604,10 +2604,12 @@ static inline void mmdrop(struct mm_struct * mm)
>  
>  /* mmput gets rid of the mappings and all user-space */
>  extern void mmput(struct mm_struct *);
> +#ifdef CONFIG_MMU
>  /* same as above but performs the slow path from the async kontext. Can

                                                              c

>   * be called from the atomic context as well
>   */
>  extern void mmput_async(struct mm_struct *);
> +#endif
>  
>  /* Grab a reference to a task's mm, if it is not already going away */
>  extern struct mm_struct *get_task_mm(struct task_struct *task);
> diff --git a/kernel/fork.c b/kernel/fork.c
> index e1dc6b02ac8b..1e3dc3af6845 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -732,6 +732,7 @@ void mmput(struct mm_struct *mm)
>  }
>  EXPORT_SYMBOL_GPL(mmput);
>  
> +#ifdef CONFIG_MMU
>  static void mmput_async_fn(struct work_struct *work)
>  {
>  	struct mm_struct *mm = container_of(work, struct mm_struct, async_put_work);
> @@ -745,6 +746,7 @@ void mmput_async(struct mm_struct *mm)
>  		schedule_work(&mm->async_put_work);
>  	}
>  }
> +#endif
>  
>  /**
>   * set_mm_exe_file - change a reference to the mm's executable file
> -- 
> 2.8.1
> 
> 
> -- 
> Michal Hocko
> SUSE Labs

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] mm, oom_reaper: do not mmput synchronously from the oom reaper context
  2016-04-26 14:18   ` kbuild test robot
@ 2016-04-26 14:58     ` Michal Hocko
  0 siblings, 0 replies; 6+ messages in thread
From: Michal Hocko @ 2016-04-26 14:58 UTC (permalink / raw)
  To: kbuild test robot, Andrew Morton
  Cc: kbuild-all, Tetsuo Handa, David Rientjes, linux-mm, LKML

On Tue 26-04-16 22:18:22, kbuild test robot wrote:
> >> include/linux/mm_types.h:516:21: error: field 'async_put_work' has incomplete type
>      struct work_struct async_put_work;

My bad. We need to include <linux/workqueue.h> because we rely on the
include only indirectly which happened to work fine for most of my
configs - not so for allnoconfig, though. Please fold this into the
original patch or let me know and I will repost the full patch again.
---
>From 368e90e7640a1eb5f0e621c7ccb08bc7ef2d272b Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Tue, 26 Apr 2016 16:48:13 +0200
Subject: [PATCH] mm-fix: mm, oom_reaper: do not mmput synchronously from the
 oom reaper context

In file included from include/linux/sched.h:27:0,
	    from include/linux/oom.h:5,
	    from mm/oom_kill.c:20:
>> include/linux/mm_types.h:516:21: error: field 'async_put_work' has incomplete type
	struct work_struct async_put_work;

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/linux/mm_types.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index f4f477244679..ab142ace96f3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -12,6 +12,7 @@
 #include <linux/cpumask.h>
 #include <linux/uprobes.h>
 #include <linux/page-flags-layout.h>
+#include <linux/workqueue.h>
 #include <asm/page.h>
 #include <asm/mmu.h>
 
-- 
2.8.0.rc3

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] mm, oom_reaper: do not mmput synchronously from the oom reaper context
  2016-04-26 14:04 ` [PATCH 2/2] mm, oom_reaper: do not mmput synchronously from the oom reaper context Michal Hocko
@ 2016-04-26 14:18   ` kbuild test robot
  2016-04-26 14:58     ` Michal Hocko
  0 siblings, 1 reply; 6+ messages in thread
From: kbuild test robot @ 2016-04-26 14:18 UTC (permalink / raw)
  To: Michal Hocko
  Cc: kbuild-all, Andrew Morton, Tetsuo Handa, David Rientjes,
	linux-mm, LKML, Michal Hocko

[-- Attachment #1: Type: text/plain, Size: 1376 bytes --]

Hi,

[auto build test ERROR on next-20160426]
[cannot apply to tip/sched/core v4.6-rc5 v4.6-rc4 v4.6-rc3 v4.6-rc5]
[if your patch is applied to the wrong git tree, please drop us a note to help improving the system]

url:    https://github.com/0day-ci/linux/commits/Michal-Hocko/mm-oom_reaper-hide-oom-reaped-tasks-from-OOM-killer-more-carefully/20160426-220841
config: x86_64-randconfig-x010-201617 (attached as .config)
compiler: 
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   In file included from include/linux/sched.h:27:0,
                    from include/linux/oom.h:5,
                    from mm/oom_kill.c:20:
>> include/linux/mm_types.h:516:21: error: field 'async_put_work' has incomplete type
     struct work_struct async_put_work;
                        ^

vim +/async_put_work +516 include/linux/mm_types.h

   510		/* address of the bounds directory */
   511		void __user *bd_addr;
   512	#endif
   513	#ifdef CONFIG_HUGETLB_PAGE
   514		atomic_long_t hugetlb_usage;
   515	#endif
 > 516		struct work_struct async_put_work;
   517	};
   518	
   519	static inline void mm_init_cpumask(struct mm_struct *mm)

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 33900 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 2/2] mm, oom_reaper: do not mmput synchronously from the oom reaper context
  2016-04-26 14:04 [PATCH 0/2] last pile of oom_reaper patches for now Michal Hocko
@ 2016-04-26 14:04 ` Michal Hocko
  2016-04-26 14:18   ` kbuild test robot
  0 siblings, 1 reply; 6+ messages in thread
From: Michal Hocko @ 2016-04-26 14:04 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Tetsuo Handa, David Rientjes, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

Tetsuo has properly noted that mmput slow path might get blocked waiting
for another party (e.g. exit_aio waits for an IO). If that happens the
oom_reaper would be put out of the way and will not be able to process
next oom victim. We should strive for making this context as reliable
and independent on other subsystems as much as possible.

Introduce mmput_async which will perform the slow path from an async
(WQ) context. This will delay the operation but that shouldn't be a
problem because the oom_reaper has reclaimed the victim's address space
for most cases as much as possible and the remaining context shouldn't
bind too much memory anymore. The only exception is when mmap_sem
trylock has failed which shouldn't happen too often.

The issue is only theoretical but not impossible.

Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/linux/mm_types.h |  1 +
 include/linux/sched.h    |  5 +++++
 kernel/fork.c            | 50 +++++++++++++++++++++++++++++++++---------------
 mm/oom_kill.c            |  8 ++++++--
 4 files changed, 47 insertions(+), 17 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index bf4f3ac5110c..f4f477244679 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -513,6 +513,7 @@ struct mm_struct {
 #ifdef CONFIG_HUGETLB_PAGE
 	atomic_long_t hugetlb_usage;
 #endif
+	struct work_struct async_put_work;
 };
 
 static inline void mm_init_cpumask(struct mm_struct *mm)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 7bd0fa9db199..df8778e72211 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2604,6 +2604,11 @@ static inline void mmdrop(struct mm_struct * mm)
 
 /* mmput gets rid of the mappings and all user-space */
 extern void mmput(struct mm_struct *);
+/* same as above but performs the slow path from the async kontext. Can
+ * be called from the atomic context as well
+ */
+extern void mmput_async(struct mm_struct *);
+
 /* Grab a reference to a task's mm, if it is not already going away */
 extern struct mm_struct *get_task_mm(struct task_struct *task);
 /*
diff --git a/kernel/fork.c b/kernel/fork.c
index accb7221d547..10b0f771d795 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -696,6 +696,26 @@ void __mmdrop(struct mm_struct *mm)
 }
 EXPORT_SYMBOL_GPL(__mmdrop);
 
+static inline void __mmput(struct mm_struct *mm)
+{
+	VM_BUG_ON(atomic_read(&mm->mm_users));
+
+	uprobe_clear_state(mm);
+	exit_aio(mm);
+	ksm_exit(mm);
+	khugepaged_exit(mm); /* must run before exit_mmap */
+	exit_mmap(mm);
+	set_mm_exe_file(mm, NULL);
+	if (!list_empty(&mm->mmlist)) {
+		spin_lock(&mmlist_lock);
+		list_del(&mm->mmlist);
+		spin_unlock(&mmlist_lock);
+	}
+	if (mm->binfmt)
+		module_put(mm->binfmt->module);
+	mmdrop(mm);
+}
+
 /*
  * Decrement the use count and release all resources for an mm.
  */
@@ -703,24 +723,24 @@ void mmput(struct mm_struct *mm)
 {
 	might_sleep();
 
+	if (atomic_dec_and_test(&mm->mm_users))
+		__mmput(mm);
+}
+EXPORT_SYMBOL_GPL(mmput);
+
+static void mmput_async_fn(struct work_struct *work)
+{
+	struct mm_struct *mm = container_of(work, struct mm_struct, async_put_work);
+	__mmput(mm);
+}
+
+void mmput_async(struct mm_struct *mm)
+{
 	if (atomic_dec_and_test(&mm->mm_users)) {
-		uprobe_clear_state(mm);
-		exit_aio(mm);
-		ksm_exit(mm);
-		khugepaged_exit(mm); /* must run before exit_mmap */
-		exit_mmap(mm);
-		set_mm_exe_file(mm, NULL);
-		if (!list_empty(&mm->mmlist)) {
-			spin_lock(&mmlist_lock);
-			list_del(&mm->mmlist);
-			spin_unlock(&mmlist_lock);
-		}
-		if (mm->binfmt)
-			module_put(mm->binfmt->module);
-		mmdrop(mm);
+		INIT_WORK(&mm->async_put_work, mmput_async_fn);
+		schedule_work(&mm->async_put_work);
 	}
 }
-EXPORT_SYMBOL_GPL(mmput);
 
 /**
  * set_mm_exe_file - change a reference to the mm's executable file
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index c0376efa79ec..c0e37dd1422f 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -446,7 +446,6 @@ static DECLARE_WAIT_QUEUE_HEAD(oom_reaper_wait);
 static struct task_struct *oom_reaper_list;
 static DEFINE_SPINLOCK(oom_reaper_lock);
 
-
 static bool __oom_reap_task(struct task_struct *tsk)
 {
 	struct mmu_gather tlb;
@@ -520,7 +519,12 @@ static bool __oom_reap_task(struct task_struct *tsk)
 	 */
 	set_bit(MMF_OOM_REAPED, &mm->flags);
 out:
-	mmput(mm);
+	/*
+	 * Drop our reference but make sure the mmput slow path is called from a
+	 * different context because we shouldn't risk we get stuck there and
+	 * put the oom_reaper out of the way.
+	 */
+	mmput_async(mm);
 	return ret;
 }
 
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-05-20  7:12 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-20  1:30 [PATCH 2/2] mm, oom_reaper: do not mmput synchronously from the oom reaper context Minchan Kim
2016-05-20  6:16 ` Michal Hocko
2016-05-20  7:12   ` Minchan Kim
  -- strict thread matches above, loose matches on Subject: below --
2016-04-26 14:04 [PATCH 0/2] last pile of oom_reaper patches for now Michal Hocko
2016-04-26 14:04 ` [PATCH 2/2] mm, oom_reaper: do not mmput synchronously from the oom reaper context Michal Hocko
2016-04-26 14:18   ` kbuild test robot
2016-04-26 14:58     ` Michal Hocko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).