All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch 1/4] mm, oom: ensure sysrq+f always passes valid zonelist
@ 2012-11-14  9:15 ` David Rientjes
  0 siblings, 0 replies; 38+ messages in thread
From: David Rientjes @ 2012-11-14  9:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Greg Kroah-Hartman, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Michal Hocko, linux-kernel, linux-mm

With hotpluggable and memoryless nodes, it's possible that node 0 will
not be online, so use the first online node's zonelist rather than
hardcoding node 0 to pass a zonelist with all zones to the oom killer.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 drivers/tty/sysrq.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -346,7 +346,8 @@ static struct sysrq_key_op sysrq_term_op = {
 
 static void moom_callback(struct work_struct *ignored)
 {
-	out_of_memory(node_zonelist(0, GFP_KERNEL), GFP_KERNEL, 0, NULL, true);
+	out_of_memory(node_zonelist(first_online_node, GFP_KERNEL), GFP_KERNEL,
+		      0, NULL, true);
 }
 
 static DECLARE_WORK(moom_work, moom_callback);

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [patch 1/4] mm, oom: ensure sysrq+f always passes valid zonelist
@ 2012-11-14  9:15 ` David Rientjes
  0 siblings, 0 replies; 38+ messages in thread
From: David Rientjes @ 2012-11-14  9:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Greg Kroah-Hartman, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Michal Hocko, linux-kernel, linux-mm

With hotpluggable and memoryless nodes, it's possible that node 0 will
not be online, so use the first online node's zonelist rather than
hardcoding node 0 to pass a zonelist with all zones to the oom killer.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 drivers/tty/sysrq.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -346,7 +346,8 @@ static struct sysrq_key_op sysrq_term_op = {
 
 static void moom_callback(struct work_struct *ignored)
 {
-	out_of_memory(node_zonelist(0, GFP_KERNEL), GFP_KERNEL, 0, NULL, true);
+	out_of_memory(node_zonelist(first_online_node, GFP_KERNEL), GFP_KERNEL,
+		      0, NULL, true);
 }
 
 static DECLARE_WORK(moom_work, moom_callback);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [patch 2/4] mm, oom: cleanup pagefault oom handler
  2012-11-14  9:15 ` David Rientjes
@ 2012-11-14  9:15   ` David Rientjes
  -1 siblings, 0 replies; 38+ messages in thread
From: David Rientjes @ 2012-11-14  9:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, KOSAKI Motohiro, Michal Hocko, linux-kernel, linux-mm

To lock the entire system from parallel oom killing, it's possible to
pass in a zonelist with all zones rather than using
for_each_populated_zone() for the iteration.  This obsoletes
try_set_system_oom() and clear_system_oom() so that they can be removed.

Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |   49 +++++++------------------------------------------
 1 files changed, 7 insertions(+), 42 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -591,43 +591,6 @@ void clear_zonelist_oom(struct zonelist *zonelist, gfp_t gfp_mask)
 	spin_unlock(&zone_scan_lock);
 }
 
-/*
- * Try to acquire the oom killer lock for all system zones.  Returns zero if a
- * parallel oom killing is taking place, otherwise locks all zones and returns
- * non-zero.
- */
-static int try_set_system_oom(void)
-{
-	struct zone *zone;
-	int ret = 1;
-
-	spin_lock(&zone_scan_lock);
-	for_each_populated_zone(zone)
-		if (zone_is_oom_locked(zone)) {
-			ret = 0;
-			goto out;
-		}
-	for_each_populated_zone(zone)
-		zone_set_flag(zone, ZONE_OOM_LOCKED);
-out:
-	spin_unlock(&zone_scan_lock);
-	return ret;
-}
-
-/*
- * Clears ZONE_OOM_LOCKED for all system zones so that failed allocation
- * attempts or page faults may now recall the oom killer, if necessary.
- */
-static void clear_system_oom(void)
-{
-	struct zone *zone;
-
-	spin_lock(&zone_scan_lock);
-	for_each_populated_zone(zone)
-		zone_clear_flag(zone, ZONE_OOM_LOCKED);
-	spin_unlock(&zone_scan_lock);
-}
-
 /**
  * out_of_memory - kill the "best" process when we run out of memory
  * @zonelist: zonelist pointer
@@ -708,15 +671,17 @@ out:
 
 /*
  * The pagefault handler calls here because it is out of memory, so kill a
- * memory-hogging task.  If a populated zone has ZONE_OOM_LOCKED set, a parallel
- * oom killing is already in progress so do nothing.  If a task is found with
- * TIF_MEMDIE set, it has been killed so do nothing and allow it to exit.
+ * memory-hogging task.  If any populated zone has ZONE_OOM_LOCKED set, a
+ * parallel oom killing is already in progress so do nothing.
  */
 void pagefault_out_of_memory(void)
 {
-	if (try_set_system_oom()) {
+	struct zonelist *zonelist = node_zonelist(first_online_node,
+						  GFP_KERNEL);
+
+	if (try_set_zonelist_oom(zonelist, GFP_KERNEL)) {
 		out_of_memory(NULL, 0, 0, NULL, false);
-		clear_system_oom();
+		clear_zonelist_oom(zonelist, GFP_KERNEL);
 	}
 	schedule_timeout_killable(1);
 }

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [patch 2/4] mm, oom: cleanup pagefault oom handler
@ 2012-11-14  9:15   ` David Rientjes
  0 siblings, 0 replies; 38+ messages in thread
From: David Rientjes @ 2012-11-14  9:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, KOSAKI Motohiro, Michal Hocko, linux-kernel, linux-mm

To lock the entire system from parallel oom killing, it's possible to
pass in a zonelist with all zones rather than using
for_each_populated_zone() for the iteration.  This obsoletes
try_set_system_oom() and clear_system_oom() so that they can be removed.

Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |   49 +++++++------------------------------------------
 1 files changed, 7 insertions(+), 42 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -591,43 +591,6 @@ void clear_zonelist_oom(struct zonelist *zonelist, gfp_t gfp_mask)
 	spin_unlock(&zone_scan_lock);
 }
 
-/*
- * Try to acquire the oom killer lock for all system zones.  Returns zero if a
- * parallel oom killing is taking place, otherwise locks all zones and returns
- * non-zero.
- */
-static int try_set_system_oom(void)
-{
-	struct zone *zone;
-	int ret = 1;
-
-	spin_lock(&zone_scan_lock);
-	for_each_populated_zone(zone)
-		if (zone_is_oom_locked(zone)) {
-			ret = 0;
-			goto out;
-		}
-	for_each_populated_zone(zone)
-		zone_set_flag(zone, ZONE_OOM_LOCKED);
-out:
-	spin_unlock(&zone_scan_lock);
-	return ret;
-}
-
-/*
- * Clears ZONE_OOM_LOCKED for all system zones so that failed allocation
- * attempts or page faults may now recall the oom killer, if necessary.
- */
-static void clear_system_oom(void)
-{
-	struct zone *zone;
-
-	spin_lock(&zone_scan_lock);
-	for_each_populated_zone(zone)
-		zone_clear_flag(zone, ZONE_OOM_LOCKED);
-	spin_unlock(&zone_scan_lock);
-}
-
 /**
  * out_of_memory - kill the "best" process when we run out of memory
  * @zonelist: zonelist pointer
@@ -708,15 +671,17 @@ out:
 
 /*
  * The pagefault handler calls here because it is out of memory, so kill a
- * memory-hogging task.  If a populated zone has ZONE_OOM_LOCKED set, a parallel
- * oom killing is already in progress so do nothing.  If a task is found with
- * TIF_MEMDIE set, it has been killed so do nothing and allow it to exit.
+ * memory-hogging task.  If any populated zone has ZONE_OOM_LOCKED set, a
+ * parallel oom killing is already in progress so do nothing.
  */
 void pagefault_out_of_memory(void)
 {
-	if (try_set_system_oom()) {
+	struct zonelist *zonelist = node_zonelist(first_online_node,
+						  GFP_KERNEL);
+
+	if (try_set_zonelist_oom(zonelist, GFP_KERNEL)) {
 		out_of_memory(NULL, 0, 0, NULL, false);
-		clear_system_oom();
+		clear_zonelist_oom(zonelist, GFP_KERNEL);
 	}
 	schedule_timeout_killable(1);
 }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [patch 3/4] mm, oom: remove redundant sleep in pagefault oom handler
  2012-11-14  9:15 ` David Rientjes
@ 2012-11-14  9:15   ` David Rientjes
  -1 siblings, 0 replies; 38+ messages in thread
From: David Rientjes @ 2012-11-14  9:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, KOSAKI Motohiro, Michal Hocko, linux-kernel, linux-mm

out_of_memory() will already cause current to schedule if it has not been
killed, so doing it again in pagefault_out_of_memory() is redundant.
Remove it.

Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -683,5 +683,4 @@ void pagefault_out_of_memory(void)
 		out_of_memory(NULL, 0, 0, NULL, false);
 		clear_zonelist_oom(zonelist, GFP_KERNEL);
 	}
-	schedule_timeout_killable(1);
 }

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [patch 3/4] mm, oom: remove redundant sleep in pagefault oom handler
@ 2012-11-14  9:15   ` David Rientjes
  0 siblings, 0 replies; 38+ messages in thread
From: David Rientjes @ 2012-11-14  9:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, KOSAKI Motohiro, Michal Hocko, linux-kernel, linux-mm

out_of_memory() will already cause current to schedule if it has not been
killed, so doing it again in pagefault_out_of_memory() is redundant.
Remove it.

Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -683,5 +683,4 @@ void pagefault_out_of_memory(void)
 		out_of_memory(NULL, 0, 0, NULL, false);
 		clear_zonelist_oom(zonelist, GFP_KERNEL);
 	}
-	schedule_timeout_killable(1);
 }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [patch 4/4] mm, oom: remove statically defined arch functions of same name
  2012-11-14  9:15 ` David Rientjes
  (?)
  (?)
@ 2012-11-14  9:15   ` David Rientjes
  -1 siblings, 0 replies; 38+ messages in thread
From: David Rientjes @ 2012-11-14  9:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Greg Kroah-Hartman, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Michal Hocko, Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Benjamin Herrenschmidt, Paul Mackerras, Paul Mundt, x86,
	linuxppc-dev, linux-sh, linux-kernel, linux-mm

out_of_memory() is a globally defined function to call the oom killer.
x86, sh, and powerpc all use a function of the same name within file
scope in their respective fault.c unnecessarily.  Inline the functions
into the pagefault handlers to clean the code up.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 arch/powerpc/mm/fault.c |   27 ++++++++++++---------------
 arch/sh/mm/fault.c      |   19 +++++++------------
 arch/x86/mm/fault.c     |   23 ++++++++---------------
 3 files changed, 27 insertions(+), 42 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -113,19 +113,6 @@ static int store_updates_sp(struct pt_regs *regs)
 #define MM_FAULT_CONTINUE	-1
 #define MM_FAULT_ERR(sig)	(sig)
 
-static int out_of_memory(struct pt_regs *regs)
-{
-	/*
-	 * We ran out of memory, or some other thing happened to us that made
-	 * us unable to handle the page fault gracefully.
-	 */
-	up_read(&current->mm->mmap_sem);
-	if (!user_mode(regs))
-		return MM_FAULT_ERR(SIGKILL);
-	pagefault_out_of_memory();
-	return MM_FAULT_RETURN;
-}
-
 static int do_sigbus(struct pt_regs *regs, unsigned long address)
 {
 	siginfo_t info;
@@ -169,8 +156,18 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault)
 		return MM_FAULT_CONTINUE;
 
 	/* Out of memory */
-	if (fault & VM_FAULT_OOM)
-		return out_of_memory(regs);
+	if (fault & VM_FAULT_OOM) {
+		up_read(&current->mm->mmap_sem);
+
+		/*
+		 * We ran out of memory, or some other thing happened to us that
+		 * made us unable to handle the page fault gracefully.
+		 */
+		if (!user_mode(regs))
+			return MM_FAULT_ERR(SIGKILL);
+		pagefault_out_of_memory();
+		return MM_FAULT_RETURN;
+	}
 
 	/* Bus error. x86 handles HWPOISON here, we'll add this if/when
 	 * we support the feature in HW
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -301,17 +301,6 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
 	__bad_area(regs, error_code, address, SEGV_ACCERR);
 }
 
-static void out_of_memory(void)
-{
-	/*
-	 * We ran out of memory, call the OOM killer, and return the userspace
-	 * (which will retry the fault, or kill us if we got oom-killed):
-	 */
-	up_read(&current->mm->mmap_sem);
-
-	pagefault_out_of_memory();
-}
-
 static void
 do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address)
 {
@@ -353,8 +342,14 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 			no_context(regs, error_code, address);
 			return 1;
 		}
+		up_read(&current->mm->mmap_sem);
 
-		out_of_memory();
+		/*
+		 * We ran out of memory, call the OOM killer, and return the
+		 * userspace (which will retry the fault, or kill us if we got
+		 * oom-killed):
+		 */
+		pagefault_out_of_memory();
 	} else {
 		if (fault & VM_FAULT_SIGBUS)
 			do_sigbus(regs, error_code, address);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -803,20 +803,6 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
 	__bad_area(regs, error_code, address, SEGV_ACCERR);
 }
 
-/* TODO: fixup for "mm-invoke-oom-killer-from-page-fault.patch" */
-static void
-out_of_memory(struct pt_regs *regs, unsigned long error_code,
-	      unsigned long address)
-{
-	/*
-	 * We ran out of memory, call the OOM killer, and return the userspace
-	 * (which will retry the fault, or kill us if we got oom-killed):
-	 */
-	up_read(&current->mm->mmap_sem);
-
-	pagefault_out_of_memory();
-}
-
 static void
 do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
 	  unsigned int fault)
@@ -879,7 +865,14 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 			return 1;
 		}
 
-		out_of_memory(regs, error_code, address);
+		up_read(&current->mm->mmap_sem);
+
+		/*
+		 * We ran out of memory, call the OOM killer, and return the
+		 * userspace (which will retry the fault, or kill us if we got
+		 * oom-killed):
+		 */
+		pagefault_out_of_memory();
 	} else {
 		if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|
 			     VM_FAULT_HWPOISON_LARGE))

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [patch 4/4] mm, oom: remove statically defined arch functions of same name
@ 2012-11-14  9:15   ` David Rientjes
  0 siblings, 0 replies; 38+ messages in thread
From: David Rientjes @ 2012-11-14  9:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Greg Kroah-Hartman, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Michal Hocko, Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Benjamin Herrenschmidt, Paul Mackerras, Paul Mundt, x86,
	linuxppc-dev, linux-sh, linux-kernel, linux-mm

out_of_memory() is a globally defined function to call the oom killer.
x86, sh, and powerpc all use a function of the same name within file
scope in their respective fault.c unnecessarily.  Inline the functions
into the pagefault handlers to clean the code up.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 arch/powerpc/mm/fault.c |   27 ++++++++++++---------------
 arch/sh/mm/fault.c      |   19 +++++++------------
 arch/x86/mm/fault.c     |   23 ++++++++---------------
 3 files changed, 27 insertions(+), 42 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -113,19 +113,6 @@ static int store_updates_sp(struct pt_regs *regs)
 #define MM_FAULT_CONTINUE	-1
 #define MM_FAULT_ERR(sig)	(sig)
 
-static int out_of_memory(struct pt_regs *regs)
-{
-	/*
-	 * We ran out of memory, or some other thing happened to us that made
-	 * us unable to handle the page fault gracefully.
-	 */
-	up_read(&current->mm->mmap_sem);
-	if (!user_mode(regs))
-		return MM_FAULT_ERR(SIGKILL);
-	pagefault_out_of_memory();
-	return MM_FAULT_RETURN;
-}
-
 static int do_sigbus(struct pt_regs *regs, unsigned long address)
 {
 	siginfo_t info;
@@ -169,8 +156,18 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault)
 		return MM_FAULT_CONTINUE;
 
 	/* Out of memory */
-	if (fault & VM_FAULT_OOM)
-		return out_of_memory(regs);
+	if (fault & VM_FAULT_OOM) {
+		up_read(&current->mm->mmap_sem);
+
+		/*
+		 * We ran out of memory, or some other thing happened to us that
+		 * made us unable to handle the page fault gracefully.
+		 */
+		if (!user_mode(regs))
+			return MM_FAULT_ERR(SIGKILL);
+		pagefault_out_of_memory();
+		return MM_FAULT_RETURN;
+	}
 
 	/* Bus error. x86 handles HWPOISON here, we'll add this if/when
 	 * we support the feature in HW
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -301,17 +301,6 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
 	__bad_area(regs, error_code, address, SEGV_ACCERR);
 }
 
-static void out_of_memory(void)
-{
-	/*
-	 * We ran out of memory, call the OOM killer, and return the userspace
-	 * (which will retry the fault, or kill us if we got oom-killed):
-	 */
-	up_read(&current->mm->mmap_sem);
-
-	pagefault_out_of_memory();
-}
-
 static void
 do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address)
 {
@@ -353,8 +342,14 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 			no_context(regs, error_code, address);
 			return 1;
 		}
+		up_read(&current->mm->mmap_sem);
 
-		out_of_memory();
+		/*
+		 * We ran out of memory, call the OOM killer, and return the
+		 * userspace (which will retry the fault, or kill us if we got
+		 * oom-killed):
+		 */
+		pagefault_out_of_memory();
 	} else {
 		if (fault & VM_FAULT_SIGBUS)
 			do_sigbus(regs, error_code, address);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -803,20 +803,6 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
 	__bad_area(regs, error_code, address, SEGV_ACCERR);
 }
 
-/* TODO: fixup for "mm-invoke-oom-killer-from-page-fault.patch" */
-static void
-out_of_memory(struct pt_regs *regs, unsigned long error_code,
-	      unsigned long address)
-{
-	/*
-	 * We ran out of memory, call the OOM killer, and return the userspace
-	 * (which will retry the fault, or kill us if we got oom-killed):
-	 */
-	up_read(&current->mm->mmap_sem);
-
-	pagefault_out_of_memory();
-}
-
 static void
 do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
 	  unsigned int fault)
@@ -879,7 +865,14 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 			return 1;
 		}
 
-		out_of_memory(regs, error_code, address);
+		up_read(&current->mm->mmap_sem);
+
+		/*
+		 * We ran out of memory, call the OOM killer, and return the
+		 * userspace (which will retry the fault, or kill us if we got
+		 * oom-killed):
+		 */
+		pagefault_out_of_memory();
 	} else {
 		if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|
 			     VM_FAULT_HWPOISON_LARGE))

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [patch 4/4] mm, oom: remove statically defined arch functions of same name
@ 2012-11-14  9:15   ` David Rientjes
  0 siblings, 0 replies; 38+ messages in thread
From: David Rientjes @ 2012-11-14  9:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Greg Kroah-Hartman, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Michal Hocko, Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Benjamin Herrenschmidt, Paul Mackerras, Paul Mundt, x86,
	linuxppc-dev, linux-sh, linux-kernel, linux-mm

out_of_memory() is a globally defined function to call the oom killer.
x86, sh, and powerpc all use a function of the same name within file
scope in their respective fault.c unnecessarily.  Inline the functions
into the pagefault handlers to clean the code up.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 arch/powerpc/mm/fault.c |   27 ++++++++++++---------------
 arch/sh/mm/fault.c      |   19 +++++++------------
 arch/x86/mm/fault.c     |   23 ++++++++---------------
 3 files changed, 27 insertions(+), 42 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -113,19 +113,6 @@ static int store_updates_sp(struct pt_regs *regs)
 #define MM_FAULT_CONTINUE	-1
 #define MM_FAULT_ERR(sig)	(sig)
 
-static int out_of_memory(struct pt_regs *regs)
-{
-	/*
-	 * We ran out of memory, or some other thing happened to us that made
-	 * us unable to handle the page fault gracefully.
-	 */
-	up_read(&current->mm->mmap_sem);
-	if (!user_mode(regs))
-		return MM_FAULT_ERR(SIGKILL);
-	pagefault_out_of_memory();
-	return MM_FAULT_RETURN;
-}
-
 static int do_sigbus(struct pt_regs *regs, unsigned long address)
 {
 	siginfo_t info;
@@ -169,8 +156,18 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault)
 		return MM_FAULT_CONTINUE;
 
 	/* Out of memory */
-	if (fault & VM_FAULT_OOM)
-		return out_of_memory(regs);
+	if (fault & VM_FAULT_OOM) {
+		up_read(&current->mm->mmap_sem);
+
+		/*
+		 * We ran out of memory, or some other thing happened to us that
+		 * made us unable to handle the page fault gracefully.
+		 */
+		if (!user_mode(regs))
+			return MM_FAULT_ERR(SIGKILL);
+		pagefault_out_of_memory();
+		return MM_FAULT_RETURN;
+	}
 
 	/* Bus error. x86 handles HWPOISON here, we'll add this if/when
 	 * we support the feature in HW
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -301,17 +301,6 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
 	__bad_area(regs, error_code, address, SEGV_ACCERR);
 }
 
-static void out_of_memory(void)
-{
-	/*
-	 * We ran out of memory, call the OOM killer, and return the userspace
-	 * (which will retry the fault, or kill us if we got oom-killed):
-	 */
-	up_read(&current->mm->mmap_sem);
-
-	pagefault_out_of_memory();
-}
-
 static void
 do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address)
 {
@@ -353,8 +342,14 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 			no_context(regs, error_code, address);
 			return 1;
 		}
+		up_read(&current->mm->mmap_sem);
 
-		out_of_memory();
+		/*
+		 * We ran out of memory, call the OOM killer, and return the
+		 * userspace (which will retry the fault, or kill us if we got
+		 * oom-killed):
+		 */
+		pagefault_out_of_memory();
 	} else {
 		if (fault & VM_FAULT_SIGBUS)
 			do_sigbus(regs, error_code, address);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -803,20 +803,6 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
 	__bad_area(regs, error_code, address, SEGV_ACCERR);
 }
 
-/* TODO: fixup for "mm-invoke-oom-killer-from-page-fault.patch" */
-static void
-out_of_memory(struct pt_regs *regs, unsigned long error_code,
-	      unsigned long address)
-{
-	/*
-	 * We ran out of memory, call the OOM killer, and return the userspace
-	 * (which will retry the fault, or kill us if we got oom-killed):
-	 */
-	up_read(&current->mm->mmap_sem);
-
-	pagefault_out_of_memory();
-}
-
 static void
 do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
 	  unsigned int fault)
@@ -879,7 +865,14 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 			return 1;
 		}
 
-		out_of_memory(regs, error_code, address);
+		up_read(&current->mm->mmap_sem);
+
+		/*
+		 * We ran out of memory, call the OOM killer, and return the
+		 * userspace (which will retry the fault, or kill us if we got
+		 * oom-killed):
+		 */
+		pagefault_out_of_memory();
 	} else {
 		if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|
 			     VM_FAULT_HWPOISON_LARGE))

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [patch 4/4] mm, oom: remove statically defined arch functions of same name
@ 2012-11-14  9:15   ` David Rientjes
  0 siblings, 0 replies; 38+ messages in thread
From: David Rientjes @ 2012-11-14  9:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Paul Mundt, linux-sh, Greg Kroah-Hartman, x86, linux-kernel,
	Michal Hocko, linux-mm, Ingo Molnar, Paul Mackerras,
	KOSAKI Motohiro, H. Peter Anvin, Thomas Gleixner, linuxppc-dev,
	KAMEZAWA Hiroyuki

out_of_memory() is a globally defined function to call the oom killer.
x86, sh, and powerpc all use a function of the same name within file
scope in their respective fault.c unnecessarily.  Inline the functions
into the pagefault handlers to clean the code up.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 arch/powerpc/mm/fault.c |   27 ++++++++++++---------------
 arch/sh/mm/fault.c      |   19 +++++++------------
 arch/x86/mm/fault.c     |   23 ++++++++---------------
 3 files changed, 27 insertions(+), 42 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -113,19 +113,6 @@ static int store_updates_sp(struct pt_regs *regs)
 #define MM_FAULT_CONTINUE	-1
 #define MM_FAULT_ERR(sig)	(sig)
 
-static int out_of_memory(struct pt_regs *regs)
-{
-	/*
-	 * We ran out of memory, or some other thing happened to us that made
-	 * us unable to handle the page fault gracefully.
-	 */
-	up_read(&current->mm->mmap_sem);
-	if (!user_mode(regs))
-		return MM_FAULT_ERR(SIGKILL);
-	pagefault_out_of_memory();
-	return MM_FAULT_RETURN;
-}
-
 static int do_sigbus(struct pt_regs *regs, unsigned long address)
 {
 	siginfo_t info;
@@ -169,8 +156,18 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault)
 		return MM_FAULT_CONTINUE;
 
 	/* Out of memory */
-	if (fault & VM_FAULT_OOM)
-		return out_of_memory(regs);
+	if (fault & VM_FAULT_OOM) {
+		up_read(&current->mm->mmap_sem);
+
+		/*
+		 * We ran out of memory, or some other thing happened to us that
+		 * made us unable to handle the page fault gracefully.
+		 */
+		if (!user_mode(regs))
+			return MM_FAULT_ERR(SIGKILL);
+		pagefault_out_of_memory();
+		return MM_FAULT_RETURN;
+	}
 
 	/* Bus error. x86 handles HWPOISON here, we'll add this if/when
 	 * we support the feature in HW
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -301,17 +301,6 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
 	__bad_area(regs, error_code, address, SEGV_ACCERR);
 }
 
-static void out_of_memory(void)
-{
-	/*
-	 * We ran out of memory, call the OOM killer, and return the userspace
-	 * (which will retry the fault, or kill us if we got oom-killed):
-	 */
-	up_read(&current->mm->mmap_sem);
-
-	pagefault_out_of_memory();
-}
-
 static void
 do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address)
 {
@@ -353,8 +342,14 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 			no_context(regs, error_code, address);
 			return 1;
 		}
+		up_read(&current->mm->mmap_sem);
 
-		out_of_memory();
+		/*
+		 * We ran out of memory, call the OOM killer, and return the
+		 * userspace (which will retry the fault, or kill us if we got
+		 * oom-killed):
+		 */
+		pagefault_out_of_memory();
 	} else {
 		if (fault & VM_FAULT_SIGBUS)
 			do_sigbus(regs, error_code, address);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -803,20 +803,6 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
 	__bad_area(regs, error_code, address, SEGV_ACCERR);
 }
 
-/* TODO: fixup for "mm-invoke-oom-killer-from-page-fault.patch" */
-static void
-out_of_memory(struct pt_regs *regs, unsigned long error_code,
-	      unsigned long address)
-{
-	/*
-	 * We ran out of memory, call the OOM killer, and return the userspace
-	 * (which will retry the fault, or kill us if we got oom-killed):
-	 */
-	up_read(&current->mm->mmap_sem);
-
-	pagefault_out_of_memory();
-}
-
 static void
 do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
 	  unsigned int fault)
@@ -879,7 +865,14 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 			return 1;
 		}
 
-		out_of_memory(regs, error_code, address);
+		up_read(&current->mm->mmap_sem);
+
+		/*
+		 * We ran out of memory, call the OOM killer, and return the
+		 * userspace (which will retry the fault, or kill us if we got
+		 * oom-killed):
+		 */
+		pagefault_out_of_memory();
 	} else {
 		if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|
 			     VM_FAULT_HWPOISON_LARGE))

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 1/4] mm, oom: ensure sysrq+f always passes valid zonelist
  2012-11-14  9:15 ` David Rientjes
@ 2012-11-14 10:50   ` Michal Hocko
  -1 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2012-11-14 10:50 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Greg Kroah-Hartman, KAMEZAWA Hiroyuki,
	KOSAKI Motohiro, linux-kernel, linux-mm

On Wed 14-11-12 01:15:19, David Rientjes wrote:
> With hotpluggable and memoryless nodes, it's possible that node 0 will
> not be online, so use the first online node's zonelist rather than
> hardcoding node 0 to pass a zonelist with all zones to the oom killer.

Makes sense although I haven't seen a machine with no 0 node yet.
According to 13808910 this is indeed possible.

> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: David Rientjes <rientjes@google.com>

Reviewed-by: Michal Hocko <mhocko@suse.cz>

> ---
>  drivers/tty/sysrq.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
> --- a/drivers/tty/sysrq.c
> +++ b/drivers/tty/sysrq.c
> @@ -346,7 +346,8 @@ static struct sysrq_key_op sysrq_term_op = {
>  
>  static void moom_callback(struct work_struct *ignored)
>  {
> -	out_of_memory(node_zonelist(0, GFP_KERNEL), GFP_KERNEL, 0, NULL, true);
> +	out_of_memory(node_zonelist(first_online_node, GFP_KERNEL), GFP_KERNEL,
> +		      0, NULL, true);
>  }
>  
>  static DECLARE_WORK(moom_work, moom_callback);
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 1/4] mm, oom: ensure sysrq+f always passes valid zonelist
@ 2012-11-14 10:50   ` Michal Hocko
  0 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2012-11-14 10:50 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Greg Kroah-Hartman, KAMEZAWA Hiroyuki,
	KOSAKI Motohiro, linux-kernel, linux-mm

On Wed 14-11-12 01:15:19, David Rientjes wrote:
> With hotpluggable and memoryless nodes, it's possible that node 0 will
> not be online, so use the first online node's zonelist rather than
> hardcoding node 0 to pass a zonelist with all zones to the oom killer.

Makes sense although I haven't seen a machine with no 0 node yet.
According to 13808910 this is indeed possible.

> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: David Rientjes <rientjes@google.com>

Reviewed-by: Michal Hocko <mhocko@suse.cz>

> ---
>  drivers/tty/sysrq.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
> --- a/drivers/tty/sysrq.c
> +++ b/drivers/tty/sysrq.c
> @@ -346,7 +346,8 @@ static struct sysrq_key_op sysrq_term_op = {
>  
>  static void moom_callback(struct work_struct *ignored)
>  {
> -	out_of_memory(node_zonelist(0, GFP_KERNEL), GFP_KERNEL, 0, NULL, true);
> +	out_of_memory(node_zonelist(first_online_node, GFP_KERNEL), GFP_KERNEL,
> +		      0, NULL, true);
>  }
>  
>  static DECLARE_WORK(moom_work, moom_callback);
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 1/4] mm, oom: ensure sysrq+f always passes valid zonelist
  2012-11-14 10:50   ` Michal Hocko
@ 2012-11-14 11:03     ` David Rientjes
  -1 siblings, 0 replies; 38+ messages in thread
From: David Rientjes @ 2012-11-14 11:03 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Greg Kroah-Hartman, KAMEZAWA Hiroyuki,
	KOSAKI Motohiro, linux-kernel, linux-mm

On Wed, 14 Nov 2012, Michal Hocko wrote:

> > With hotpluggable and memoryless nodes, it's possible that node 0 will
> > not be online, so use the first online node's zonelist rather than
> > hardcoding node 0 to pass a zonelist with all zones to the oom killer.
> 
> Makes sense although I haven't seen a machine with no 0 node yet.

We routinely do testing with them, actually, just by physically removing 
all memory described by the SRAT that maps to node 0.  You could do the 
same thing by making all pxms that map to node 0 to be hotpluggable in 
your memory affinity structure.  I've been bit by it one too many times so 
I always keep in mind that no single node id is guaranteed to be online 
(although at least one node is always online); hence, first_online_node is 
the solution.

> According to 13808910 this is indeed possible.
> 
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Signed-off-by: David Rientjes <rientjes@google.com>
> 
> Reviewed-by: Michal Hocko <mhocko@suse.cz>
> 

Thanks!

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 1/4] mm, oom: ensure sysrq+f always passes valid zonelist
@ 2012-11-14 11:03     ` David Rientjes
  0 siblings, 0 replies; 38+ messages in thread
From: David Rientjes @ 2012-11-14 11:03 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Greg Kroah-Hartman, KAMEZAWA Hiroyuki,
	KOSAKI Motohiro, linux-kernel, linux-mm

On Wed, 14 Nov 2012, Michal Hocko wrote:

> > With hotpluggable and memoryless nodes, it's possible that node 0 will
> > not be online, so use the first online node's zonelist rather than
> > hardcoding node 0 to pass a zonelist with all zones to the oom killer.
> 
> Makes sense although I haven't seen a machine with no 0 node yet.

We routinely do testing with them, actually, just by physically removing 
all memory described by the SRAT that maps to node 0.  You could do the 
same thing by making all pxms that map to node 0 to be hotpluggable in 
your memory affinity structure.  I've been bit by it one too many times so 
I always keep in mind that no single node id is guaranteed to be online 
(although at least one node is always online); hence, first_online_node is 
the solution.

> According to 13808910 this is indeed possible.
> 
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Signed-off-by: David Rientjes <rientjes@google.com>
> 
> Reviewed-by: Michal Hocko <mhocko@suse.cz>
> 

Thanks!

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 1/4] mm, oom: ensure sysrq+f always passes valid zonelist
  2012-11-14 11:03     ` David Rientjes
@ 2012-11-14 13:31       ` Michal Hocko
  -1 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2012-11-14 13:31 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Greg Kroah-Hartman, KAMEZAWA Hiroyuki,
	KOSAKI Motohiro, linux-kernel, linux-mm

On Wed 14-11-12 03:03:02, David Rientjes wrote:
> On Wed, 14 Nov 2012, Michal Hocko wrote:
> 
> > > With hotpluggable and memoryless nodes, it's possible that node 0 will
> > > not be online, so use the first online node's zonelist rather than
> > > hardcoding node 0 to pass a zonelist with all zones to the oom killer.
> > 
> > Makes sense although I haven't seen a machine with no 0 node yet.
> 
> We routinely do testing with them, actually, just by physically removing 
> all memory described by the SRAT that maps to node 0.  You could do the 
> same thing by making all pxms that map to node 0 to be hotpluggable in 
> your memory affinity structure.  I've been bit by it one too many times so 
> I always keep in mind that no single node id is guaranteed to be online 
> (although at least one node is always online); hence, first_online_node is 
> the solution.

I thought that a boot cpu would be bound to a node0 or something similar.
Thanks for the clarification!

> > According to 13808910 this is indeed possible.
> > 
> > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > Signed-off-by: David Rientjes <rientjes@google.com>
> > 
> > Reviewed-by: Michal Hocko <mhocko@suse.cz>
> > 
> 
> Thanks!

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 1/4] mm, oom: ensure sysrq+f always passes valid zonelist
@ 2012-11-14 13:31       ` Michal Hocko
  0 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2012-11-14 13:31 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Greg Kroah-Hartman, KAMEZAWA Hiroyuki,
	KOSAKI Motohiro, linux-kernel, linux-mm

On Wed 14-11-12 03:03:02, David Rientjes wrote:
> On Wed, 14 Nov 2012, Michal Hocko wrote:
> 
> > > With hotpluggable and memoryless nodes, it's possible that node 0 will
> > > not be online, so use the first online node's zonelist rather than
> > > hardcoding node 0 to pass a zonelist with all zones to the oom killer.
> > 
> > Makes sense although I haven't seen a machine with no 0 node yet.
> 
> We routinely do testing with them, actually, just by physically removing 
> all memory described by the SRAT that maps to node 0.  You could do the 
> same thing by making all pxms that map to node 0 to be hotpluggable in 
> your memory affinity structure.  I've been bit by it one too many times so 
> I always keep in mind that no single node id is guaranteed to be online 
> (although at least one node is always online); hence, first_online_node is 
> the solution.

I thought that a boot cpu would be bound to a node0 or something similar.
Thanks for the clarification!

> > According to 13808910 this is indeed possible.
> > 
> > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > Signed-off-by: David Rientjes <rientjes@google.com>
> > 
> > Reviewed-by: Michal Hocko <mhocko@suse.cz>
> > 
> 
> Thanks!

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 2/4] mm, oom: cleanup pagefault oom handler
  2012-11-14  9:15   ` David Rientjes
@ 2012-11-14 13:32     ` Michal Hocko
  -1 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2012-11-14 13:32 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, KAMEZAWA Hiroyuki, KOSAKI Motohiro, linux-kernel,
	linux-mm

On Wed 14-11-12 01:15:22, David Rientjes wrote:
> To lock the entire system from parallel oom killing, it's possible to
> pass in a zonelist with all zones rather than using
> for_each_populated_zone() for the iteration.  This obsoletes
> try_set_system_oom() and clear_system_oom() so that they can be removed.
> 
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Signed-off-by: David Rientjes <rientjes@google.com>

The only _potential_ problem I can see with this is that if we ever have
a HW which requires that a node zonelist doesn't contain others nodes'
zones then this wouldn't work. I do not think such a HW exists. Such a HW
would need more changes in the code anyway.

so
Reviewed-by: Michal Hocko <mhocko@suse.cz>

> ---
>  mm/oom_kill.c |   49 +++++++------------------------------------------
>  1 files changed, 7 insertions(+), 42 deletions(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -591,43 +591,6 @@ void clear_zonelist_oom(struct zonelist *zonelist, gfp_t gfp_mask)
>  	spin_unlock(&zone_scan_lock);
>  }
>  
> -/*
> - * Try to acquire the oom killer lock for all system zones.  Returns zero if a
> - * parallel oom killing is taking place, otherwise locks all zones and returns
> - * non-zero.
> - */
> -static int try_set_system_oom(void)
> -{
> -	struct zone *zone;
> -	int ret = 1;
> -
> -	spin_lock(&zone_scan_lock);
> -	for_each_populated_zone(zone)
> -		if (zone_is_oom_locked(zone)) {
> -			ret = 0;
> -			goto out;
> -		}
> -	for_each_populated_zone(zone)
> -		zone_set_flag(zone, ZONE_OOM_LOCKED);
> -out:
> -	spin_unlock(&zone_scan_lock);
> -	return ret;
> -}
> -
> -/*
> - * Clears ZONE_OOM_LOCKED for all system zones so that failed allocation
> - * attempts or page faults may now recall the oom killer, if necessary.
> - */
> -static void clear_system_oom(void)
> -{
> -	struct zone *zone;
> -
> -	spin_lock(&zone_scan_lock);
> -	for_each_populated_zone(zone)
> -		zone_clear_flag(zone, ZONE_OOM_LOCKED);
> -	spin_unlock(&zone_scan_lock);
> -}
> -
>  /**
>   * out_of_memory - kill the "best" process when we run out of memory
>   * @zonelist: zonelist pointer
> @@ -708,15 +671,17 @@ out:
>  
>  /*
>   * The pagefault handler calls here because it is out of memory, so kill a
> - * memory-hogging task.  If a populated zone has ZONE_OOM_LOCKED set, a parallel
> - * oom killing is already in progress so do nothing.  If a task is found with
> - * TIF_MEMDIE set, it has been killed so do nothing and allow it to exit.
> + * memory-hogging task.  If any populated zone has ZONE_OOM_LOCKED set, a
> + * parallel oom killing is already in progress so do nothing.
>   */
>  void pagefault_out_of_memory(void)
>  {
> -	if (try_set_system_oom()) {
> +	struct zonelist *zonelist = node_zonelist(first_online_node,
> +						  GFP_KERNEL);
> +
> +	if (try_set_zonelist_oom(zonelist, GFP_KERNEL)) {
>  		out_of_memory(NULL, 0, 0, NULL, false);
> -		clear_system_oom();
> +		clear_zonelist_oom(zonelist, GFP_KERNEL);
>  	}
>  	schedule_timeout_killable(1);
>  }

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 2/4] mm, oom: cleanup pagefault oom handler
@ 2012-11-14 13:32     ` Michal Hocko
  0 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2012-11-14 13:32 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, KAMEZAWA Hiroyuki, KOSAKI Motohiro, linux-kernel,
	linux-mm

On Wed 14-11-12 01:15:22, David Rientjes wrote:
> To lock the entire system from parallel oom killing, it's possible to
> pass in a zonelist with all zones rather than using
> for_each_populated_zone() for the iteration.  This obsoletes
> try_set_system_oom() and clear_system_oom() so that they can be removed.
> 
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Signed-off-by: David Rientjes <rientjes@google.com>

The only _potential_ problem I can see with this is that if we ever have
a HW which requires that a node zonelist doesn't contain others nodes'
zones then this wouldn't work. I do not think such a HW exists. Such a HW
would need more changes in the code anyway.

so
Reviewed-by: Michal Hocko <mhocko@suse.cz>

> ---
>  mm/oom_kill.c |   49 +++++++------------------------------------------
>  1 files changed, 7 insertions(+), 42 deletions(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -591,43 +591,6 @@ void clear_zonelist_oom(struct zonelist *zonelist, gfp_t gfp_mask)
>  	spin_unlock(&zone_scan_lock);
>  }
>  
> -/*
> - * Try to acquire the oom killer lock for all system zones.  Returns zero if a
> - * parallel oom killing is taking place, otherwise locks all zones and returns
> - * non-zero.
> - */
> -static int try_set_system_oom(void)
> -{
> -	struct zone *zone;
> -	int ret = 1;
> -
> -	spin_lock(&zone_scan_lock);
> -	for_each_populated_zone(zone)
> -		if (zone_is_oom_locked(zone)) {
> -			ret = 0;
> -			goto out;
> -		}
> -	for_each_populated_zone(zone)
> -		zone_set_flag(zone, ZONE_OOM_LOCKED);
> -out:
> -	spin_unlock(&zone_scan_lock);
> -	return ret;
> -}
> -
> -/*
> - * Clears ZONE_OOM_LOCKED for all system zones so that failed allocation
> - * attempts or page faults may now recall the oom killer, if necessary.
> - */
> -static void clear_system_oom(void)
> -{
> -	struct zone *zone;
> -
> -	spin_lock(&zone_scan_lock);
> -	for_each_populated_zone(zone)
> -		zone_clear_flag(zone, ZONE_OOM_LOCKED);
> -	spin_unlock(&zone_scan_lock);
> -}
> -
>  /**
>   * out_of_memory - kill the "best" process when we run out of memory
>   * @zonelist: zonelist pointer
> @@ -708,15 +671,17 @@ out:
>  
>  /*
>   * The pagefault handler calls here because it is out of memory, so kill a
> - * memory-hogging task.  If a populated zone has ZONE_OOM_LOCKED set, a parallel
> - * oom killing is already in progress so do nothing.  If a task is found with
> - * TIF_MEMDIE set, it has been killed so do nothing and allow it to exit.
> + * memory-hogging task.  If any populated zone has ZONE_OOM_LOCKED set, a
> + * parallel oom killing is already in progress so do nothing.
>   */
>  void pagefault_out_of_memory(void)
>  {
> -	if (try_set_system_oom()) {
> +	struct zonelist *zonelist = node_zonelist(first_online_node,
> +						  GFP_KERNEL);
> +
> +	if (try_set_zonelist_oom(zonelist, GFP_KERNEL)) {
>  		out_of_memory(NULL, 0, 0, NULL, false);
> -		clear_system_oom();
> +		clear_zonelist_oom(zonelist, GFP_KERNEL);
>  	}
>  	schedule_timeout_killable(1);
>  }

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 3/4] mm, oom: remove redundant sleep in pagefault oom handler
  2012-11-14  9:15   ` David Rientjes
@ 2012-11-14 13:45     ` Michal Hocko
  -1 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2012-11-14 13:45 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, KAMEZAWA Hiroyuki, KOSAKI Motohiro, linux-kernel,
	linux-mm

On Wed 14-11-12 01:15:25, David Rientjes wrote:
> out_of_memory() will already cause current to schedule if it has not been
> killed, so doing it again in pagefault_out_of_memory() is redundant.
> Remove it.
> 
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Signed-off-by: David Rientjes <rientjes@google.com>

Reviewed-by: Michal Hocko <mhocko@suse.cz>

> ---
>  mm/oom_kill.c |    1 -
>  1 files changed, 0 insertions(+), 1 deletions(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -683,5 +683,4 @@ void pagefault_out_of_memory(void)
>  		out_of_memory(NULL, 0, 0, NULL, false);
>  		clear_zonelist_oom(zonelist, GFP_KERNEL);
>  	}
> -	schedule_timeout_killable(1);
>  }

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 3/4] mm, oom: remove redundant sleep in pagefault oom handler
@ 2012-11-14 13:45     ` Michal Hocko
  0 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2012-11-14 13:45 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, KAMEZAWA Hiroyuki, KOSAKI Motohiro, linux-kernel,
	linux-mm

On Wed 14-11-12 01:15:25, David Rientjes wrote:
> out_of_memory() will already cause current to schedule if it has not been
> killed, so doing it again in pagefault_out_of_memory() is redundant.
> Remove it.
> 
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Signed-off-by: David Rientjes <rientjes@google.com>

Reviewed-by: Michal Hocko <mhocko@suse.cz>

> ---
>  mm/oom_kill.c |    1 -
>  1 files changed, 0 insertions(+), 1 deletions(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -683,5 +683,4 @@ void pagefault_out_of_memory(void)
>  		out_of_memory(NULL, 0, 0, NULL, false);
>  		clear_zonelist_oom(zonelist, GFP_KERNEL);
>  	}
> -	schedule_timeout_killable(1);
>  }

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 4/4] mm, oom: remove statically defined arch functions of same name
  2012-11-14  9:15   ` David Rientjes
  (?)
  (?)
@ 2012-11-14 13:47     ` Michal Hocko
  -1 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2012-11-14 13:47 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Greg Kroah-Hartman, KAMEZAWA Hiroyuki,
	KOSAKI Motohiro, Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Benjamin Herrenschmidt, Paul Mackerras, Paul Mundt, x86,
	linuxppc-dev, linux-sh, linux-kernel, linux-mm

On Wed 14-11-12 01:15:28, David Rientjes wrote:
> out_of_memory() is a globally defined function to call the oom killer.
> x86, sh, and powerpc all use a function of the same name within file
> scope in their respective fault.c unnecessarily.  Inline the functions
> into the pagefault handlers to clean the code up.

Yes I like it. It is really confusing to have a local function with the
same name.

> 
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Paul Mundt <lethal@linux-sh.org>
> Signed-off-by: David Rientjes <rientjes@google.com>

Reviewed-by: Michal Hocko <mhocko@suse.cz>

> ---
>  arch/powerpc/mm/fault.c |   27 ++++++++++++---------------
>  arch/sh/mm/fault.c      |   19 +++++++------------
>  arch/x86/mm/fault.c     |   23 ++++++++---------------
>  3 files changed, 27 insertions(+), 42 deletions(-)
> 
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -113,19 +113,6 @@ static int store_updates_sp(struct pt_regs *regs)
>  #define MM_FAULT_CONTINUE	-1
>  #define MM_FAULT_ERR(sig)	(sig)
>  
> -static int out_of_memory(struct pt_regs *regs)
> -{
> -	/*
> -	 * We ran out of memory, or some other thing happened to us that made
> -	 * us unable to handle the page fault gracefully.
> -	 */
> -	up_read(&current->mm->mmap_sem);
> -	if (!user_mode(regs))
> -		return MM_FAULT_ERR(SIGKILL);
> -	pagefault_out_of_memory();
> -	return MM_FAULT_RETURN;
> -}
> -
>  static int do_sigbus(struct pt_regs *regs, unsigned long address)
>  {
>  	siginfo_t info;
> @@ -169,8 +156,18 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault)
>  		return MM_FAULT_CONTINUE;
>  
>  	/* Out of memory */
> -	if (fault & VM_FAULT_OOM)
> -		return out_of_memory(regs);
> +	if (fault & VM_FAULT_OOM) {
> +		up_read(&current->mm->mmap_sem);
> +
> +		/*
> +		 * We ran out of memory, or some other thing happened to us that
> +		 * made us unable to handle the page fault gracefully.
> +		 */
> +		if (!user_mode(regs))
> +			return MM_FAULT_ERR(SIGKILL);
> +		pagefault_out_of_memory();
> +		return MM_FAULT_RETURN;
> +	}
>  
>  	/* Bus error. x86 handles HWPOISON here, we'll add this if/when
>  	 * we support the feature in HW
> diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
> --- a/arch/sh/mm/fault.c
> +++ b/arch/sh/mm/fault.c
> @@ -301,17 +301,6 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
>  	__bad_area(regs, error_code, address, SEGV_ACCERR);
>  }
>  
> -static void out_of_memory(void)
> -{
> -	/*
> -	 * We ran out of memory, call the OOM killer, and return the userspace
> -	 * (which will retry the fault, or kill us if we got oom-killed):
> -	 */
> -	up_read(&current->mm->mmap_sem);
> -
> -	pagefault_out_of_memory();
> -}
> -
>  static void
>  do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address)
>  {
> @@ -353,8 +342,14 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
>  			no_context(regs, error_code, address);
>  			return 1;
>  		}
> +		up_read(&current->mm->mmap_sem);
>  
> -		out_of_memory();
> +		/*
> +		 * We ran out of memory, call the OOM killer, and return the
> +		 * userspace (which will retry the fault, or kill us if we got
> +		 * oom-killed):
> +		 */
> +		pagefault_out_of_memory();
>  	} else {
>  		if (fault & VM_FAULT_SIGBUS)
>  			do_sigbus(regs, error_code, address);
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -803,20 +803,6 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
>  	__bad_area(regs, error_code, address, SEGV_ACCERR);
>  }
>  
> -/* TODO: fixup for "mm-invoke-oom-killer-from-page-fault.patch" */
> -static void
> -out_of_memory(struct pt_regs *regs, unsigned long error_code,
> -	      unsigned long address)
> -{
> -	/*
> -	 * We ran out of memory, call the OOM killer, and return the userspace
> -	 * (which will retry the fault, or kill us if we got oom-killed):
> -	 */
> -	up_read(&current->mm->mmap_sem);
> -
> -	pagefault_out_of_memory();
> -}
> -
>  static void
>  do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
>  	  unsigned int fault)
> @@ -879,7 +865,14 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
>  			return 1;
>  		}
>  
> -		out_of_memory(regs, error_code, address);
> +		up_read(&current->mm->mmap_sem);
> +
> +		/*
> +		 * We ran out of memory, call the OOM killer, and return the
> +		 * userspace (which will retry the fault, or kill us if we got
> +		 * oom-killed):
> +		 */
> +		pagefault_out_of_memory();
>  	} else {
>  		if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|
>  			     VM_FAULT_HWPOISON_LARGE))

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 4/4] mm, oom: remove statically defined arch functions of same name
@ 2012-11-14 13:47     ` Michal Hocko
  0 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2012-11-14 13:47 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Greg Kroah-Hartman, KAMEZAWA Hiroyuki,
	KOSAKI Motohiro, Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Benjamin Herrenschmidt, Paul Mackerras, Paul Mundt, x86,
	linuxppc-dev, linux-sh, linux-kernel, linux-mm

On Wed 14-11-12 01:15:28, David Rientjes wrote:
> out_of_memory() is a globally defined function to call the oom killer.
> x86, sh, and powerpc all use a function of the same name within file
> scope in their respective fault.c unnecessarily.  Inline the functions
> into the pagefault handlers to clean the code up.

Yes I like it. It is really confusing to have a local function with the
same name.

> 
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Paul Mundt <lethal@linux-sh.org>
> Signed-off-by: David Rientjes <rientjes@google.com>

Reviewed-by: Michal Hocko <mhocko@suse.cz>

> ---
>  arch/powerpc/mm/fault.c |   27 ++++++++++++---------------
>  arch/sh/mm/fault.c      |   19 +++++++------------
>  arch/x86/mm/fault.c     |   23 ++++++++---------------
>  3 files changed, 27 insertions(+), 42 deletions(-)
> 
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -113,19 +113,6 @@ static int store_updates_sp(struct pt_regs *regs)
>  #define MM_FAULT_CONTINUE	-1
>  #define MM_FAULT_ERR(sig)	(sig)
>  
> -static int out_of_memory(struct pt_regs *regs)
> -{
> -	/*
> -	 * We ran out of memory, or some other thing happened to us that made
> -	 * us unable to handle the page fault gracefully.
> -	 */
> -	up_read(&current->mm->mmap_sem);
> -	if (!user_mode(regs))
> -		return MM_FAULT_ERR(SIGKILL);
> -	pagefault_out_of_memory();
> -	return MM_FAULT_RETURN;
> -}
> -
>  static int do_sigbus(struct pt_regs *regs, unsigned long address)
>  {
>  	siginfo_t info;
> @@ -169,8 +156,18 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault)
>  		return MM_FAULT_CONTINUE;
>  
>  	/* Out of memory */
> -	if (fault & VM_FAULT_OOM)
> -		return out_of_memory(regs);
> +	if (fault & VM_FAULT_OOM) {
> +		up_read(&current->mm->mmap_sem);
> +
> +		/*
> +		 * We ran out of memory, or some other thing happened to us that
> +		 * made us unable to handle the page fault gracefully.
> +		 */
> +		if (!user_mode(regs))
> +			return MM_FAULT_ERR(SIGKILL);
> +		pagefault_out_of_memory();
> +		return MM_FAULT_RETURN;
> +	}
>  
>  	/* Bus error. x86 handles HWPOISON here, we'll add this if/when
>  	 * we support the feature in HW
> diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
> --- a/arch/sh/mm/fault.c
> +++ b/arch/sh/mm/fault.c
> @@ -301,17 +301,6 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
>  	__bad_area(regs, error_code, address, SEGV_ACCERR);
>  }
>  
> -static void out_of_memory(void)
> -{
> -	/*
> -	 * We ran out of memory, call the OOM killer, and return the userspace
> -	 * (which will retry the fault, or kill us if we got oom-killed):
> -	 */
> -	up_read(&current->mm->mmap_sem);
> -
> -	pagefault_out_of_memory();
> -}
> -
>  static void
>  do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address)
>  {
> @@ -353,8 +342,14 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
>  			no_context(regs, error_code, address);
>  			return 1;
>  		}
> +		up_read(&current->mm->mmap_sem);
>  
> -		out_of_memory();
> +		/*
> +		 * We ran out of memory, call the OOM killer, and return the
> +		 * userspace (which will retry the fault, or kill us if we got
> +		 * oom-killed):
> +		 */
> +		pagefault_out_of_memory();
>  	} else {
>  		if (fault & VM_FAULT_SIGBUS)
>  			do_sigbus(regs, error_code, address);
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -803,20 +803,6 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
>  	__bad_area(regs, error_code, address, SEGV_ACCERR);
>  }
>  
> -/* TODO: fixup for "mm-invoke-oom-killer-from-page-fault.patch" */
> -static void
> -out_of_memory(struct pt_regs *regs, unsigned long error_code,
> -	      unsigned long address)
> -{
> -	/*
> -	 * We ran out of memory, call the OOM killer, and return the userspace
> -	 * (which will retry the fault, or kill us if we got oom-killed):
> -	 */
> -	up_read(&current->mm->mmap_sem);
> -
> -	pagefault_out_of_memory();
> -}
> -
>  static void
>  do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
>  	  unsigned int fault)
> @@ -879,7 +865,14 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
>  			return 1;
>  		}
>  
> -		out_of_memory(regs, error_code, address);
> +		up_read(&current->mm->mmap_sem);
> +
> +		/*
> +		 * We ran out of memory, call the OOM killer, and return the
> +		 * userspace (which will retry the fault, or kill us if we got
> +		 * oom-killed):
> +		 */
> +		pagefault_out_of_memory();
>  	} else {
>  		if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|
>  			     VM_FAULT_HWPOISON_LARGE))

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 4/4] mm, oom: remove statically defined arch functions of same name
@ 2012-11-14 13:47     ` Michal Hocko
  0 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2012-11-14 13:47 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Greg Kroah-Hartman, KAMEZAWA Hiroyuki,
	KOSAKI Motohiro, Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Benjamin Herrenschmidt, Paul Mackerras, Paul Mundt, x86,
	linuxppc-dev, linux-sh, linux-kernel, linux-mm

On Wed 14-11-12 01:15:28, David Rientjes wrote:
> out_of_memory() is a globally defined function to call the oom killer.
> x86, sh, and powerpc all use a function of the same name within file
> scope in their respective fault.c unnecessarily.  Inline the functions
> into the pagefault handlers to clean the code up.

Yes I like it. It is really confusing to have a local function with the
same name.

> 
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Paul Mundt <lethal@linux-sh.org>
> Signed-off-by: David Rientjes <rientjes@google.com>

Reviewed-by: Michal Hocko <mhocko@suse.cz>

> ---
>  arch/powerpc/mm/fault.c |   27 ++++++++++++---------------
>  arch/sh/mm/fault.c      |   19 +++++++------------
>  arch/x86/mm/fault.c     |   23 ++++++++---------------
>  3 files changed, 27 insertions(+), 42 deletions(-)
> 
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -113,19 +113,6 @@ static int store_updates_sp(struct pt_regs *regs)
>  #define MM_FAULT_CONTINUE	-1
>  #define MM_FAULT_ERR(sig)	(sig)
>  
> -static int out_of_memory(struct pt_regs *regs)
> -{
> -	/*
> -	 * We ran out of memory, or some other thing happened to us that made
> -	 * us unable to handle the page fault gracefully.
> -	 */
> -	up_read(&current->mm->mmap_sem);
> -	if (!user_mode(regs))
> -		return MM_FAULT_ERR(SIGKILL);
> -	pagefault_out_of_memory();
> -	return MM_FAULT_RETURN;
> -}
> -
>  static int do_sigbus(struct pt_regs *regs, unsigned long address)
>  {
>  	siginfo_t info;
> @@ -169,8 +156,18 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault)
>  		return MM_FAULT_CONTINUE;
>  
>  	/* Out of memory */
> -	if (fault & VM_FAULT_OOM)
> -		return out_of_memory(regs);
> +	if (fault & VM_FAULT_OOM) {
> +		up_read(&current->mm->mmap_sem);
> +
> +		/*
> +		 * We ran out of memory, or some other thing happened to us that
> +		 * made us unable to handle the page fault gracefully.
> +		 */
> +		if (!user_mode(regs))
> +			return MM_FAULT_ERR(SIGKILL);
> +		pagefault_out_of_memory();
> +		return MM_FAULT_RETURN;
> +	}
>  
>  	/* Bus error. x86 handles HWPOISON here, we'll add this if/when
>  	 * we support the feature in HW
> diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
> --- a/arch/sh/mm/fault.c
> +++ b/arch/sh/mm/fault.c
> @@ -301,17 +301,6 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
>  	__bad_area(regs, error_code, address, SEGV_ACCERR);
>  }
>  
> -static void out_of_memory(void)
> -{
> -	/*
> -	 * We ran out of memory, call the OOM killer, and return the userspace
> -	 * (which will retry the fault, or kill us if we got oom-killed):
> -	 */
> -	up_read(&current->mm->mmap_sem);
> -
> -	pagefault_out_of_memory();
> -}
> -
>  static void
>  do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address)
>  {
> @@ -353,8 +342,14 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
>  			no_context(regs, error_code, address);
>  			return 1;
>  		}
> +		up_read(&current->mm->mmap_sem);
>  
> -		out_of_memory();
> +		/*
> +		 * We ran out of memory, call the OOM killer, and return the
> +		 * userspace (which will retry the fault, or kill us if we got
> +		 * oom-killed):
> +		 */
> +		pagefault_out_of_memory();
>  	} else {
>  		if (fault & VM_FAULT_SIGBUS)
>  			do_sigbus(regs, error_code, address);
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -803,20 +803,6 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
>  	__bad_area(regs, error_code, address, SEGV_ACCERR);
>  }
>  
> -/* TODO: fixup for "mm-invoke-oom-killer-from-page-fault.patch" */
> -static void
> -out_of_memory(struct pt_regs *regs, unsigned long error_code,
> -	      unsigned long address)
> -{
> -	/*
> -	 * We ran out of memory, call the OOM killer, and return the userspace
> -	 * (which will retry the fault, or kill us if we got oom-killed):
> -	 */
> -	up_read(&current->mm->mmap_sem);
> -
> -	pagefault_out_of_memory();
> -}
> -
>  static void
>  do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
>  	  unsigned int fault)
> @@ -879,7 +865,14 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
>  			return 1;
>  		}
>  
> -		out_of_memory(regs, error_code, address);
> +		up_read(&current->mm->mmap_sem);
> +
> +		/*
> +		 * We ran out of memory, call the OOM killer, and return the
> +		 * userspace (which will retry the fault, or kill us if we got
> +		 * oom-killed):
> +		 */
> +		pagefault_out_of_memory();
>  	} else {
>  		if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|
>  			     VM_FAULT_HWPOISON_LARGE))

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 4/4] mm, oom: remove statically defined arch functions of same name
@ 2012-11-14 13:47     ` Michal Hocko
  0 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2012-11-14 13:47 UTC (permalink / raw)
  To: David Rientjes
  Cc: Paul Mundt, linux-sh, Greg Kroah-Hartman, x86, linux-kernel,
	linux-mm, Ingo Molnar, Paul Mackerras, KOSAKI Motohiro,
	H. Peter Anvin, Andrew Morton, linuxppc-dev, Thomas Gleixner,
	KAMEZAWA Hiroyuki

On Wed 14-11-12 01:15:28, David Rientjes wrote:
> out_of_memory() is a globally defined function to call the oom killer.
> x86, sh, and powerpc all use a function of the same name within file
> scope in their respective fault.c unnecessarily.  Inline the functions
> into the pagefault handlers to clean the code up.

Yes I like it. It is really confusing to have a local function with the
same name.

> 
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Paul Mundt <lethal@linux-sh.org>
> Signed-off-by: David Rientjes <rientjes@google.com>

Reviewed-by: Michal Hocko <mhocko@suse.cz>

> ---
>  arch/powerpc/mm/fault.c |   27 ++++++++++++---------------
>  arch/sh/mm/fault.c      |   19 +++++++------------
>  arch/x86/mm/fault.c     |   23 ++++++++---------------
>  3 files changed, 27 insertions(+), 42 deletions(-)
> 
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -113,19 +113,6 @@ static int store_updates_sp(struct pt_regs *regs)
>  #define MM_FAULT_CONTINUE	-1
>  #define MM_FAULT_ERR(sig)	(sig)
>  
> -static int out_of_memory(struct pt_regs *regs)
> -{
> -	/*
> -	 * We ran out of memory, or some other thing happened to us that made
> -	 * us unable to handle the page fault gracefully.
> -	 */
> -	up_read(&current->mm->mmap_sem);
> -	if (!user_mode(regs))
> -		return MM_FAULT_ERR(SIGKILL);
> -	pagefault_out_of_memory();
> -	return MM_FAULT_RETURN;
> -}
> -
>  static int do_sigbus(struct pt_regs *regs, unsigned long address)
>  {
>  	siginfo_t info;
> @@ -169,8 +156,18 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault)
>  		return MM_FAULT_CONTINUE;
>  
>  	/* Out of memory */
> -	if (fault & VM_FAULT_OOM)
> -		return out_of_memory(regs);
> +	if (fault & VM_FAULT_OOM) {
> +		up_read(&current->mm->mmap_sem);
> +
> +		/*
> +		 * We ran out of memory, or some other thing happened to us that
> +		 * made us unable to handle the page fault gracefully.
> +		 */
> +		if (!user_mode(regs))
> +			return MM_FAULT_ERR(SIGKILL);
> +		pagefault_out_of_memory();
> +		return MM_FAULT_RETURN;
> +	}
>  
>  	/* Bus error. x86 handles HWPOISON here, we'll add this if/when
>  	 * we support the feature in HW
> diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
> --- a/arch/sh/mm/fault.c
> +++ b/arch/sh/mm/fault.c
> @@ -301,17 +301,6 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
>  	__bad_area(regs, error_code, address, SEGV_ACCERR);
>  }
>  
> -static void out_of_memory(void)
> -{
> -	/*
> -	 * We ran out of memory, call the OOM killer, and return the userspace
> -	 * (which will retry the fault, or kill us if we got oom-killed):
> -	 */
> -	up_read(&current->mm->mmap_sem);
> -
> -	pagefault_out_of_memory();
> -}
> -
>  static void
>  do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address)
>  {
> @@ -353,8 +342,14 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
>  			no_context(regs, error_code, address);
>  			return 1;
>  		}
> +		up_read(&current->mm->mmap_sem);
>  
> -		out_of_memory();
> +		/*
> +		 * We ran out of memory, call the OOM killer, and return the
> +		 * userspace (which will retry the fault, or kill us if we got
> +		 * oom-killed):
> +		 */
> +		pagefault_out_of_memory();
>  	} else {
>  		if (fault & VM_FAULT_SIGBUS)
>  			do_sigbus(regs, error_code, address);
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -803,20 +803,6 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
>  	__bad_area(regs, error_code, address, SEGV_ACCERR);
>  }
>  
> -/* TODO: fixup for "mm-invoke-oom-killer-from-page-fault.patch" */
> -static void
> -out_of_memory(struct pt_regs *regs, unsigned long error_code,
> -	      unsigned long address)
> -{
> -	/*
> -	 * We ran out of memory, call the OOM killer, and return the userspace
> -	 * (which will retry the fault, or kill us if we got oom-killed):
> -	 */
> -	up_read(&current->mm->mmap_sem);
> -
> -	pagefault_out_of_memory();
> -}
> -
>  static void
>  do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
>  	  unsigned int fault)
> @@ -879,7 +865,14 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
>  			return 1;
>  		}
>  
> -		out_of_memory(regs, error_code, address);
> +		up_read(&current->mm->mmap_sem);
> +
> +		/*
> +		 * We ran out of memory, call the OOM killer, and return the
> +		 * userspace (which will retry the fault, or kill us if we got
> +		 * oom-killed):
> +		 */
> +		pagefault_out_of_memory();
>  	} else {
>  		if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|
>  			     VM_FAULT_HWPOISON_LARGE))

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 1/4] mm, oom: ensure sysrq+f always passes valid zonelist
  2012-11-14  9:15 ` David Rientjes
@ 2012-11-15  8:41   ` Kamezawa Hiroyuki
  -1 siblings, 0 replies; 38+ messages in thread
From: Kamezawa Hiroyuki @ 2012-11-15  8:41 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Greg Kroah-Hartman, KOSAKI Motohiro, Michal Hocko,
	linux-kernel, linux-mm

(2012/11/14 18:15), David Rientjes wrote:
> With hotpluggable and memoryless nodes, it's possible that node 0 will
> not be online, so use the first online node's zonelist rather than
> hardcoding node 0 to pass a zonelist with all zones to the oom killer.
>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: David Rientjes <rientjes@google.com>

Thank you.
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 1/4] mm, oom: ensure sysrq+f always passes valid zonelist
@ 2012-11-15  8:41   ` Kamezawa Hiroyuki
  0 siblings, 0 replies; 38+ messages in thread
From: Kamezawa Hiroyuki @ 2012-11-15  8:41 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Greg Kroah-Hartman, KOSAKI Motohiro, Michal Hocko,
	linux-kernel, linux-mm

(2012/11/14 18:15), David Rientjes wrote:
> With hotpluggable and memoryless nodes, it's possible that node 0 will
> not be online, so use the first online node's zonelist rather than
> hardcoding node 0 to pass a zonelist with all zones to the oom killer.
>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: David Rientjes <rientjes@google.com>

Thank you.
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 2/4] mm, oom: cleanup pagefault oom handler
  2012-11-14  9:15   ` David Rientjes
@ 2012-11-15  8:45     ` Kamezawa Hiroyuki
  -1 siblings, 0 replies; 38+ messages in thread
From: Kamezawa Hiroyuki @ 2012-11-15  8:45 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, KOSAKI Motohiro, Michal Hocko, linux-kernel, linux-mm

(2012/11/14 18:15), David Rientjes wrote:
> To lock the entire system from parallel oom killing, it's possible to
> pass in a zonelist with all zones rather than using
> for_each_populated_zone() for the iteration.  This obsoletes
> try_set_system_oom() and clear_system_oom() so that they can be removed.
>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Signed-off-by: David Rientjes <rientjes@google.com>

I'm sorry if I missed something...

> ---
>   mm/oom_kill.c |   49 +++++++------------------------------------------
>   1 files changed, 7 insertions(+), 42 deletions(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -591,43 +591,6 @@ void clear_zonelist_oom(struct zonelist *zonelist, gfp_t gfp_mask)
>   	spin_unlock(&zone_scan_lock);
>   }
>
> -/*
> - * Try to acquire the oom killer lock for all system zones.  Returns zero if a
> - * parallel oom killing is taking place, otherwise locks all zones and returns
> - * non-zero.
> - */
> -static int try_set_system_oom(void)
> -{
> -	struct zone *zone;
> -	int ret = 1;
> -
> -	spin_lock(&zone_scan_lock);
> -	for_each_populated_zone(zone)
> -		if (zone_is_oom_locked(zone)) {
> -			ret = 0;
> -			goto out;
> -		}
> -	for_each_populated_zone(zone)
> -		zone_set_flag(zone, ZONE_OOM_LOCKED);
> -out:
> -	spin_unlock(&zone_scan_lock);
> -	return ret;
> -}
> -
> -/*
> - * Clears ZONE_OOM_LOCKED for all system zones so that failed allocation
> - * attempts or page faults may now recall the oom killer, if necessary.
> - */
> -static void clear_system_oom(void)
> -{
> -	struct zone *zone;
> -
> -	spin_lock(&zone_scan_lock);
> -	for_each_populated_zone(zone)
> -		zone_clear_flag(zone, ZONE_OOM_LOCKED);
> -	spin_unlock(&zone_scan_lock);
> -}
> -
>   /**
>    * out_of_memory - kill the "best" process when we run out of memory
>    * @zonelist: zonelist pointer
> @@ -708,15 +671,17 @@ out:
>
>   /*
>    * The pagefault handler calls here because it is out of memory, so kill a
> - * memory-hogging task.  If a populated zone has ZONE_OOM_LOCKED set, a parallel
> - * oom killing is already in progress so do nothing.  If a task is found with
> - * TIF_MEMDIE set, it has been killed so do nothing and allow it to exit.
> + * memory-hogging task.  If any populated zone has ZONE_OOM_LOCKED set, a
> + * parallel oom killing is already in progress so do nothing.
>    */
>   void pagefault_out_of_memory(void)
>   {
> -	if (try_set_system_oom()) {
> +	struct zonelist *zonelist = node_zonelist(first_online_node,
> +						  GFP_KERNEL);


why GFP_KERNEL ? not GFP_HIGHUSER_MOVABLE ?

Thanks,
-Kame

> +
> +	if (try_set_zonelist_oom(zonelist, GFP_KERNEL)) {
>   		out_of_memory(NULL, 0, 0, NULL, false);
> -		clear_system_oom();
> +		clear_zonelist_oom(zonelist, GFP_KERNEL);
>   	}
>   	schedule_timeout_killable(1);
>   }
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 2/4] mm, oom: cleanup pagefault oom handler
@ 2012-11-15  8:45     ` Kamezawa Hiroyuki
  0 siblings, 0 replies; 38+ messages in thread
From: Kamezawa Hiroyuki @ 2012-11-15  8:45 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, KOSAKI Motohiro, Michal Hocko, linux-kernel, linux-mm

(2012/11/14 18:15), David Rientjes wrote:
> To lock the entire system from parallel oom killing, it's possible to
> pass in a zonelist with all zones rather than using
> for_each_populated_zone() for the iteration.  This obsoletes
> try_set_system_oom() and clear_system_oom() so that they can be removed.
>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Signed-off-by: David Rientjes <rientjes@google.com>

I'm sorry if I missed something...

> ---
>   mm/oom_kill.c |   49 +++++++------------------------------------------
>   1 files changed, 7 insertions(+), 42 deletions(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -591,43 +591,6 @@ void clear_zonelist_oom(struct zonelist *zonelist, gfp_t gfp_mask)
>   	spin_unlock(&zone_scan_lock);
>   }
>
> -/*
> - * Try to acquire the oom killer lock for all system zones.  Returns zero if a
> - * parallel oom killing is taking place, otherwise locks all zones and returns
> - * non-zero.
> - */
> -static int try_set_system_oom(void)
> -{
> -	struct zone *zone;
> -	int ret = 1;
> -
> -	spin_lock(&zone_scan_lock);
> -	for_each_populated_zone(zone)
> -		if (zone_is_oom_locked(zone)) {
> -			ret = 0;
> -			goto out;
> -		}
> -	for_each_populated_zone(zone)
> -		zone_set_flag(zone, ZONE_OOM_LOCKED);
> -out:
> -	spin_unlock(&zone_scan_lock);
> -	return ret;
> -}
> -
> -/*
> - * Clears ZONE_OOM_LOCKED for all system zones so that failed allocation
> - * attempts or page faults may now recall the oom killer, if necessary.
> - */
> -static void clear_system_oom(void)
> -{
> -	struct zone *zone;
> -
> -	spin_lock(&zone_scan_lock);
> -	for_each_populated_zone(zone)
> -		zone_clear_flag(zone, ZONE_OOM_LOCKED);
> -	spin_unlock(&zone_scan_lock);
> -}
> -
>   /**
>    * out_of_memory - kill the "best" process when we run out of memory
>    * @zonelist: zonelist pointer
> @@ -708,15 +671,17 @@ out:
>
>   /*
>    * The pagefault handler calls here because it is out of memory, so kill a
> - * memory-hogging task.  If a populated zone has ZONE_OOM_LOCKED set, a parallel
> - * oom killing is already in progress so do nothing.  If a task is found with
> - * TIF_MEMDIE set, it has been killed so do nothing and allow it to exit.
> + * memory-hogging task.  If any populated zone has ZONE_OOM_LOCKED set, a
> + * parallel oom killing is already in progress so do nothing.
>    */
>   void pagefault_out_of_memory(void)
>   {
> -	if (try_set_system_oom()) {
> +	struct zonelist *zonelist = node_zonelist(first_online_node,
> +						  GFP_KERNEL);


why GFP_KERNEL ? not GFP_HIGHUSER_MOVABLE ?

Thanks,
-Kame

> +
> +	if (try_set_zonelist_oom(zonelist, GFP_KERNEL)) {
>   		out_of_memory(NULL, 0, 0, NULL, false);
> -		clear_system_oom();
> +		clear_zonelist_oom(zonelist, GFP_KERNEL);
>   	}
>   	schedule_timeout_killable(1);
>   }
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 3/4] mm, oom: remove redundant sleep in pagefault oom handler
  2012-11-14  9:15   ` David Rientjes
@ 2012-11-15  8:46     ` Kamezawa Hiroyuki
  -1 siblings, 0 replies; 38+ messages in thread
From: Kamezawa Hiroyuki @ 2012-11-15  8:46 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, KOSAKI Motohiro, Michal Hocko, linux-kernel, linux-mm

(2012/11/14 18:15), David Rientjes wrote:
> out_of_memory() will already cause current to schedule if it has not been
> killed, so doing it again in pagefault_out_of_memory() is redundant.
> Remove it.
>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Signed-off-by: David Rientjes <rientjes@google.com>

Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 3/4] mm, oom: remove redundant sleep in pagefault oom handler
@ 2012-11-15  8:46     ` Kamezawa Hiroyuki
  0 siblings, 0 replies; 38+ messages in thread
From: Kamezawa Hiroyuki @ 2012-11-15  8:46 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, KOSAKI Motohiro, Michal Hocko, linux-kernel, linux-mm

(2012/11/14 18:15), David Rientjes wrote:
> out_of_memory() will already cause current to schedule if it has not been
> killed, so doing it again in pagefault_out_of_memory() is redundant.
> Remove it.
>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Signed-off-by: David Rientjes <rientjes@google.com>

Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 4/4] mm, oom: remove statically defined arch functions of same name
  2012-11-14  9:15   ` David Rientjes
  (?)
  (?)
@ 2012-11-15  8:48     ` Kamezawa Hiroyuki
  -1 siblings, 0 replies; 38+ messages in thread
From: Kamezawa Hiroyuki @ 2012-11-15  8:48 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Greg Kroah-Hartman, KOSAKI Motohiro, Michal Hocko,
	Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Benjamin Herrenschmidt, Paul Mackerras, Paul Mundt, x86,
	linuxppc-dev, linux-sh, linux-kernel, linux-mm

(2012/11/14 18:15), David Rientjes wrote:
> out_of_memory() is a globally defined function to call the oom killer.
> x86, sh, and powerpc all use a function of the same name within file
> scope in their respective fault.c unnecessarily.  Inline the functions
> into the pagefault handlers to clean the code up.
>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Paul Mundt <lethal@linux-sh.org>
> Signed-off-by: David Rientjes <rientjes@google.com>

I think this is good.

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 4/4] mm, oom: remove statically defined arch functions of same name
@ 2012-11-15  8:48     ` Kamezawa Hiroyuki
  0 siblings, 0 replies; 38+ messages in thread
From: Kamezawa Hiroyuki @ 2012-11-15  8:48 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Greg Kroah-Hartman, KOSAKI Motohiro, Michal Hocko,
	Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Benjamin Herrenschmidt, Paul Mackerras, Paul Mundt, x86,
	linuxppc-dev, linux-sh, linux-kernel, linux-mm

(2012/11/14 18:15), David Rientjes wrote:
> out_of_memory() is a globally defined function to call the oom killer.
> x86, sh, and powerpc all use a function of the same name within file
> scope in their respective fault.c unnecessarily.  Inline the functions
> into the pagefault handlers to clean the code up.
>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Paul Mundt <lethal@linux-sh.org>
> Signed-off-by: David Rientjes <rientjes@google.com>

I think this is good.

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 4/4] mm, oom: remove statically defined arch functions of same name
@ 2012-11-15  8:48     ` Kamezawa Hiroyuki
  0 siblings, 0 replies; 38+ messages in thread
From: Kamezawa Hiroyuki @ 2012-11-15  8:48 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Greg Kroah-Hartman, KOSAKI Motohiro, Michal Hocko,
	Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Benjamin Herrenschmidt, Paul Mackerras, Paul Mundt, x86,
	linuxppc-dev, linux-sh, linux-kernel, linux-mm

(2012/11/14 18:15), David Rientjes wrote:
> out_of_memory() is a globally defined function to call the oom killer.
> x86, sh, and powerpc all use a function of the same name within file
> scope in their respective fault.c unnecessarily.  Inline the functions
> into the pagefault handlers to clean the code up.
>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Paul Mundt <lethal@linux-sh.org>
> Signed-off-by: David Rientjes <rientjes@google.com>

I think this is good.

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 4/4] mm, oom: remove statically defined arch functions of same name
@ 2012-11-15  8:48     ` Kamezawa Hiroyuki
  0 siblings, 0 replies; 38+ messages in thread
From: Kamezawa Hiroyuki @ 2012-11-15  8:48 UTC (permalink / raw)
  To: David Rientjes
  Cc: Paul Mundt, linux-sh, Greg Kroah-Hartman, x86, linux-kernel,
	Michal Hocko, linux-mm, Ingo Molnar, Paul Mackerras,
	KOSAKI Motohiro, H. Peter Anvin, Andrew Morton, linuxppc-dev,
	Thomas Gleixner

(2012/11/14 18:15), David Rientjes wrote:
> out_of_memory() is a globally defined function to call the oom killer.
> x86, sh, and powerpc all use a function of the same name within file
> scope in their respective fault.c unnecessarily.  Inline the functions
> into the pagefault handlers to clean the code up.
>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Paul Mundt <lethal@linux-sh.org>
> Signed-off-by: David Rientjes <rientjes@google.com>

I think this is good.

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 2/4] mm, oom: cleanup pagefault oom handler
  2012-11-15  8:45     ` Kamezawa Hiroyuki
@ 2012-11-15  9:02       ` Michal Hocko
  -1 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2012-11-15  9:02 UTC (permalink / raw)
  To: Kamezawa Hiroyuki
  Cc: David Rientjes, Andrew Morton, KOSAKI Motohiro, linux-kernel, linux-mm

On Thu 15-11-12 17:45:18, KAMEZAWA Hiroyuki wrote:
> (2012/11/14 18:15), David Rientjes wrote:
[...]
> >@@ -708,15 +671,17 @@ out:
> >
> >  /*
> >   * The pagefault handler calls here because it is out of memory, so kill a
> >- * memory-hogging task.  If a populated zone has ZONE_OOM_LOCKED set, a parallel
> >- * oom killing is already in progress so do nothing.  If a task is found with
> >- * TIF_MEMDIE set, it has been killed so do nothing and allow it to exit.
> >+ * memory-hogging task.  If any populated zone has ZONE_OOM_LOCKED set, a
> >+ * parallel oom killing is already in progress so do nothing.
> >   */
> >  void pagefault_out_of_memory(void)
> >  {
> >-	if (try_set_system_oom()) {
> >+	struct zonelist *zonelist = node_zonelist(first_online_node,
> >+						  GFP_KERNEL);
> 
> 
> why GFP_KERNEL ? not GFP_HIGHUSER_MOVABLE ?

I was wondering about the same but gfp_zonelist cares only about
__GFP_THISNODE so GFP_HIGHUSER_MOVABLE doesn't do any difference.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 2/4] mm, oom: cleanup pagefault oom handler
@ 2012-11-15  9:02       ` Michal Hocko
  0 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2012-11-15  9:02 UTC (permalink / raw)
  To: Kamezawa Hiroyuki
  Cc: David Rientjes, Andrew Morton, KOSAKI Motohiro, linux-kernel, linux-mm

On Thu 15-11-12 17:45:18, KAMEZAWA Hiroyuki wrote:
> (2012/11/14 18:15), David Rientjes wrote:
[...]
> >@@ -708,15 +671,17 @@ out:
> >
> >  /*
> >   * The pagefault handler calls here because it is out of memory, so kill a
> >- * memory-hogging task.  If a populated zone has ZONE_OOM_LOCKED set, a parallel
> >- * oom killing is already in progress so do nothing.  If a task is found with
> >- * TIF_MEMDIE set, it has been killed so do nothing and allow it to exit.
> >+ * memory-hogging task.  If any populated zone has ZONE_OOM_LOCKED set, a
> >+ * parallel oom killing is already in progress so do nothing.
> >   */
> >  void pagefault_out_of_memory(void)
> >  {
> >-	if (try_set_system_oom()) {
> >+	struct zonelist *zonelist = node_zonelist(first_online_node,
> >+						  GFP_KERNEL);
> 
> 
> why GFP_KERNEL ? not GFP_HIGHUSER_MOVABLE ?

I was wondering about the same but gfp_zonelist cares only about
__GFP_THISNODE so GFP_HIGHUSER_MOVABLE doesn't do any difference.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 2/4] mm, oom: cleanup pagefault oom handler
  2012-11-15  8:45     ` Kamezawa Hiroyuki
@ 2012-11-15 21:01       ` David Rientjes
  -1 siblings, 0 replies; 38+ messages in thread
From: David Rientjes @ 2012-11-15 21:01 UTC (permalink / raw)
  To: Kamezawa Hiroyuki
  Cc: Andrew Morton, KOSAKI Motohiro, Michal Hocko, linux-kernel, linux-mm

On Thu, 15 Nov 2012, Kamezawa Hiroyuki wrote:

> > @@ -708,15 +671,17 @@ out:
> > 
> >   /*
> >    * The pagefault handler calls here because it is out of memory, so kill a
> > - * memory-hogging task.  If a populated zone has ZONE_OOM_LOCKED set, a
> > parallel
> > - * oom killing is already in progress so do nothing.  If a task is found
> > with
> > - * TIF_MEMDIE set, it has been killed so do nothing and allow it to exit.
> > + * memory-hogging task.  If any populated zone has ZONE_OOM_LOCKED set, a
> > + * parallel oom killing is already in progress so do nothing.
> >    */
> >   void pagefault_out_of_memory(void)
> >   {
> > -	if (try_set_system_oom()) {
> > +	struct zonelist *zonelist = node_zonelist(first_online_node,
> > +						  GFP_KERNEL);
> 
> 
> why GFP_KERNEL ? not GFP_HIGHUSER_MOVABLE ?
> 

The usual way to get a zonelist consisting of all zones ordered by 
preferring a node is node_zonelist(nid, GFP_KERNEL), but there's no 
difference between using GFP_KERNEL, GFP_HIGHUSER_MOVABLE, or even 0.  I 
simply duplicated what the sysrq trigger was doing, but you could also do 
&first_online_pgdat->node_zonelists[0], it's really just a matter of 
preference.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [patch 2/4] mm, oom: cleanup pagefault oom handler
@ 2012-11-15 21:01       ` David Rientjes
  0 siblings, 0 replies; 38+ messages in thread
From: David Rientjes @ 2012-11-15 21:01 UTC (permalink / raw)
  To: Kamezawa Hiroyuki
  Cc: Andrew Morton, KOSAKI Motohiro, Michal Hocko, linux-kernel, linux-mm

On Thu, 15 Nov 2012, Kamezawa Hiroyuki wrote:

> > @@ -708,15 +671,17 @@ out:
> > 
> >   /*
> >    * The pagefault handler calls here because it is out of memory, so kill a
> > - * memory-hogging task.  If a populated zone has ZONE_OOM_LOCKED set, a
> > parallel
> > - * oom killing is already in progress so do nothing.  If a task is found
> > with
> > - * TIF_MEMDIE set, it has been killed so do nothing and allow it to exit.
> > + * memory-hogging task.  If any populated zone has ZONE_OOM_LOCKED set, a
> > + * parallel oom killing is already in progress so do nothing.
> >    */
> >   void pagefault_out_of_memory(void)
> >   {
> > -	if (try_set_system_oom()) {
> > +	struct zonelist *zonelist = node_zonelist(first_online_node,
> > +						  GFP_KERNEL);
> 
> 
> why GFP_KERNEL ? not GFP_HIGHUSER_MOVABLE ?
> 

The usual way to get a zonelist consisting of all zones ordered by 
preferring a node is node_zonelist(nid, GFP_KERNEL), but there's no 
difference between using GFP_KERNEL, GFP_HIGHUSER_MOVABLE, or even 0.  I 
simply duplicated what the sysrq trigger was doing, but you could also do 
&first_online_pgdat->node_zonelists[0], it's really just a matter of 
preference.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2012-11-15 21:01 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-14  9:15 [patch 1/4] mm, oom: ensure sysrq+f always passes valid zonelist David Rientjes
2012-11-14  9:15 ` David Rientjes
2012-11-14  9:15 ` [patch 2/4] mm, oom: cleanup pagefault oom handler David Rientjes
2012-11-14  9:15   ` David Rientjes
2012-11-14 13:32   ` Michal Hocko
2012-11-14 13:32     ` Michal Hocko
2012-11-15  8:45   ` Kamezawa Hiroyuki
2012-11-15  8:45     ` Kamezawa Hiroyuki
2012-11-15  9:02     ` Michal Hocko
2012-11-15  9:02       ` Michal Hocko
2012-11-15 21:01     ` David Rientjes
2012-11-15 21:01       ` David Rientjes
2012-11-14  9:15 ` [patch 3/4] mm, oom: remove redundant sleep in " David Rientjes
2012-11-14  9:15   ` David Rientjes
2012-11-14 13:45   ` Michal Hocko
2012-11-14 13:45     ` Michal Hocko
2012-11-15  8:46   ` Kamezawa Hiroyuki
2012-11-15  8:46     ` Kamezawa Hiroyuki
2012-11-14  9:15 ` [patch 4/4] mm, oom: remove statically defined arch functions of same name David Rientjes
2012-11-14  9:15   ` David Rientjes
2012-11-14  9:15   ` David Rientjes
2012-11-14  9:15   ` David Rientjes
2012-11-14 13:47   ` Michal Hocko
2012-11-14 13:47     ` Michal Hocko
2012-11-14 13:47     ` Michal Hocko
2012-11-14 13:47     ` Michal Hocko
2012-11-15  8:48   ` Kamezawa Hiroyuki
2012-11-15  8:48     ` Kamezawa Hiroyuki
2012-11-15  8:48     ` Kamezawa Hiroyuki
2012-11-15  8:48     ` Kamezawa Hiroyuki
2012-11-14 10:50 ` [patch 1/4] mm, oom: ensure sysrq+f always passes valid zonelist Michal Hocko
2012-11-14 10:50   ` Michal Hocko
2012-11-14 11:03   ` David Rientjes
2012-11-14 11:03     ` David Rientjes
2012-11-14 13:31     ` Michal Hocko
2012-11-14 13:31       ` Michal Hocko
2012-11-15  8:41 ` Kamezawa Hiroyuki
2012-11-15  8:41   ` Kamezawa Hiroyuki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.