All of lore.kernel.org
* [PATCH 1/2] Implement lru_add_drain_all_async()
@ 2009-10-06  2:40 ` KOSAKI Motohiro
  0 siblings, 0 replies; 16+ messages in thread
From: KOSAKI Motohiro @ 2009-10-06  2:40 UTC (permalink / raw)
  To: LKML, linux-mm, Andrew Morton, Peter Zijlstra, Oleg Nesterov,
	Christoph Lameter
  Cc: kosaki.motohiro


===================================================================
Implement asynchronous lru_add_drain_all()

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 include/linux/swap.h |    1 +
 mm/swap.c            |   24 ++++++++++++++++++++++++
 2 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 4ec9001..1f5772a 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -204,6 +204,7 @@ extern void activate_page(struct page *);
 extern void mark_page_accessed(struct page *);
 extern void lru_add_drain(void);
 extern int lru_add_drain_all(void);
+extern int lru_add_drain_all_async(void);
 extern void rotate_reclaimable_page(struct page *page);
 extern void swap_setup(void);
 
diff --git a/mm/swap.c b/mm/swap.c
index 308e57d..e16cd40 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -38,6 +38,7 @@ int page_cluster;
 
 static DEFINE_PER_CPU(struct pagevec[NR_LRU_LISTS], lru_add_pvecs);
 static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
+static DEFINE_PER_CPU(struct work_struct, lru_drain_work);
 
 /*
  * This path almost never happens for VM activity - pages are normally
@@ -312,6 +313,24 @@ int lru_add_drain_all(void)
 }
 
 /*
+ * Returns 0 for success
+ */
+int lru_add_drain_all_async(void)
+{
+	int cpu;
+
+	get_online_cpus();
+	for_each_online_cpu(cpu) {
+		struct work_struct *work = &per_cpu(lru_drain_work, cpu);
+		schedule_work_on(cpu, work);
+	}
+	put_online_cpus();
+
+	return 0;
+}
+
+
+/*
  * Batched page_cache_release().  Decrement the reference count on all the
  * passed pages.  If it fell to zero then remove the page from the LRU and
  * free it.
@@ -497,6 +516,7 @@ EXPORT_SYMBOL(pagevec_lookup_tag);
 void __init swap_setup(void)
 {
 	unsigned long megs = totalram_pages >> (20 - PAGE_SHIFT);
+	int cpu;
 
 #ifdef CONFIG_SWAP
 	bdi_init(swapper_space.backing_dev_info);
@@ -511,4 +531,8 @@ void __init swap_setup(void)
 	 * Right now other parts of the system means that we
 	 * _really_ don't want to cluster much more
 	 */
+
+	for_each_possible_cpu(cpu) {
+		INIT_WORK(&per_cpu(lru_drain_work, cpu), lru_add_drain_per_cpu);
+	}
 }
-- 
1.6.2.5




^ permalink raw reply related	[flat|nested] 16+ messages in thread


* [PATCH 2/2] mlock use lru_add_drain_all_async()
  2009-10-06  2:40 ` KOSAKI Motohiro
@ 2009-10-06  2:41   ` KOSAKI Motohiro
  -1 siblings, 0 replies; 16+ messages in thread
From: KOSAKI Motohiro @ 2009-10-06  2:41 UTC (permalink / raw)
  To: LKML
  Cc: kosaki.motohiro, linux-mm, Andrew Morton, Peter Zijlstra,
	Oleg Nesterov, Christoph Lameter


Recently, Peter Zijlstra reported that an RT task can prevent mlock()
from completing for a very long time.

  Suppose you have 2 cpus, cpu1 is busy doing a SCHED_FIFO-99 while(1),
  cpu0 does mlock()->lru_add_drain_all(), which does
  schedule_on_each_cpu(), which then waits for all cpus to complete the
  work. Except that cpu1, which is busy with the RT task, will never run
  keventd until the RT load goes away.

  This is not so much an actual deadlock as a serious starvation case.

Actually, mlock() doesn't need to wait for lru_add_drain_all() to finish.
Thus, this patch replaces it with lru_add_drain_all_async().

Cc: Oleg Nesterov <onestero@redhat.com>
Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> 
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 mm/mlock.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/mlock.c b/mm/mlock.c
index 22041aa..46a016f 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -458,7 +458,7 @@ SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
 	if (!can_do_mlock())
 		return -EPERM;
 
-	lru_add_drain_all();	/* flush pagevec */
+	lru_add_drain_all_async();	/* flush pagevec */
 
 	down_write(&current->mm->mmap_sem);
 	len = PAGE_ALIGN(len + (start & ~PAGE_MASK));
@@ -526,7 +526,7 @@ SYSCALL_DEFINE1(mlockall, int, flags)
 	if (!can_do_mlock())
 		goto out;
 
-	lru_add_drain_all();	/* flush pagevec */
+	lru_add_drain_all_async();	/* flush pagevec */
 
 	down_write(&current->mm->mmap_sem);
 
-- 
1.6.2.5




^ permalink raw reply related	[flat|nested] 16+ messages in thread


* Re: [PATCH 2/2] mlock use lru_add_drain_all_async()
  2009-10-06  2:41   ` KOSAKI Motohiro
@ 2009-10-06  9:08     ` Peter Zijlstra
  -1 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2009-10-06  9:08 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: LKML, linux-mm, Andrew Morton, Oleg Nesterov, Christoph Lameter

On Tue, 2009-10-06 at 11:41 +0900, KOSAKI Motohiro wrote:
> Recently, Peter Zijlstra reported RT-task can lead to prevent mlock
> very long time.
> 
>   Suppose you have 2 cpus, cpu1 is busy doing a SCHED_FIFO-99 while(1),
>   cpu0 does mlock()->lru_add_drain_all(), which does
>   schedule_on_each_cpu(), which then waits for all cpus to complete the
>   work. Except that cpu1, which is busy with the RT task, will never run
>   keventd until the RT load goes away.
> 
>   This is not so much an actual deadlock as a serious starvation case.
> 
> Actually, mlock() doesn't need to wait to finish lru_add_drain_all().
> Thus, this patch replace it with lru_add_drain_all_async().
> 
> Cc: Oleg Nesterov <onestero@redhat.com>
> Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> 
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

It was actually Mike Galbraith who brought it to my attention.

Patch looks sane enough, although I'm not sure I'd have split it in two
like you did (it leaves the first patch without a real changelog, too).




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] mlock use lru_add_drain_all_async()
@ 2009-10-06  9:08     ` Peter Zijlstra
  0 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2009-10-06  9:08 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: LKML, linux-mm, Andrew Morton, Oleg Nesterov, Christoph Lameter

On Tue, 2009-10-06 at 11:41 +0900, KOSAKI Motohiro wrote:
> Recently, Peter Zijlstra reported RT-task can lead to prevent mlock
> very long time.
> 
>   Suppose you have 2 cpus, cpu1 is busy doing a SCHED_FIFO-99 while(1),
>   cpu0 does mlock()->lru_add_drain_all(), which does
>   schedule_on_each_cpu(), which then waits for all cpus to complete the
>   work. Except that cpu1, which is busy with the RT task, will never run
>   keventd until the RT load goes away.
> 
>   This is not so much an actual deadlock as a serious starvation case.
> 
> Actually, mlock() doesn't need to wait to finish lru_add_drain_all().
> Thus, this patch replace it with lru_add_drain_all_async().
> 
> Cc: Oleg Nesterov <onestero@redhat.com>
> Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> 
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

It was actually Mike Galbraith who brought it to my attention.

Patch looks sane enough, altough I'm not sure I'd have split it in two
like you did (leaves the first without a real changelog too).



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] mlock use lru_add_drain_all_async()
  2009-10-06  9:08     ` Peter Zijlstra
@ 2009-10-06 10:11       ` KOSAKI Motohiro
  -1 siblings, 0 replies; 16+ messages in thread
From: KOSAKI Motohiro @ 2009-10-06 10:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: kosaki.motohiro, LKML, linux-mm, Andrew Morton, Oleg Nesterov,
	Christoph Lameter

> On Tue, 2009-10-06 at 11:41 +0900, KOSAKI Motohiro wrote:
> > Recently, Peter Zijlstra reported RT-task can lead to prevent mlock
> > very long time.
> > 
> >   Suppose you have 2 cpus, cpu1 is busy doing a SCHED_FIFO-99 while(1),
> >   cpu0 does mlock()->lru_add_drain_all(), which does
> >   schedule_on_each_cpu(), which then waits for all cpus to complete the
> >   work. Except that cpu1, which is busy with the RT task, will never run
> >   keventd until the RT load goes away.
> > 
> >   This is not so much an actual deadlock as a serious starvation case.
> > 
> > Actually, mlock() doesn't need to wait to finish lru_add_drain_all().
> > Thus, this patch replace it with lru_add_drain_all_async().
> > 
> > Cc: Oleg Nesterov <onestero@redhat.com>
> > Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> 
> > Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> 
> It was actually Mike Galbraith who brought it to my attention.
> 
> Patch looks sane enough, altough I'm not sure I'd have split it in two
> like you did (leaves the first without a real changelog too).

Ah, yes. They should be folded. Thanks.
In my local patch queue, this patch series has another two callers:

  - lumpy reclaim: currently, the PCP often causes large-order allocation failures.
  - page migration: in almost all cases, the PCP doesn't hold the migration target page.

but they are still under testing.






^ permalink raw reply	[flat|nested] 16+ messages in thread


* Re: [PATCH 2/2] mlock use lru_add_drain_all_async()
  2009-10-06  2:41   ` KOSAKI Motohiro
@ 2009-10-06 16:27     ` Christoph Lameter
  -1 siblings, 0 replies; 16+ messages in thread
From: Christoph Lameter @ 2009-10-06 16:27 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: LKML, linux-mm, Andrew Morton, Peter Zijlstra, Oleg Nesterov

On Tue, 6 Oct 2009, KOSAKI Motohiro wrote:

>   Suppose you have 2 cpus, cpu1 is busy doing a SCHED_FIFO-99 while(1),
>   cpu0 does mlock()->lru_add_drain_all(), which does
>   schedule_on_each_cpu(), which then waits for all cpus to complete the
>   work. Except that cpu1, which is busy with the RT task, will never run
>   keventd until the RT load goes away.
>
>   This is not so much an actual deadlock as a serious starvation case.
>
> Actually, mlock() doesn't need to wait to finish lru_add_drain_all().
> Thus, this patch replace it with lru_add_drain_all_async().

Ok, so this will queue up lots of events for the cpu running an RT task. If
the RT task runs continuously, will they be queued there forever?

^ permalink raw reply	[flat|nested] 16+ messages in thread


* Re: [PATCH 2/2] mlock use lru_add_drain_all_async()
  2009-10-06  2:41   ` KOSAKI Motohiro
@ 2009-10-06 22:59     ` Ying Han
  -1 siblings, 0 replies; 16+ messages in thread
From: Ying Han @ 2009-10-06 22:59 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: LKML, linux-mm, Andrew Morton, Peter Zijlstra, Oleg Nesterov,
	Christoph Lameter

Hello KOSAKI-san,

A few questions on lru_add_drain_all_async(). If I understand
correctly, the reason we have lru_add_drain_all() in the mlock()
call is to isolate mlocked pages onto the separate LRU in case they
are sitting in a pagevec.

I also understand the RT use case you describe in the patch
description. My question is: do we have a race after applying the
patch? For example, if the RT task has not given up the cpu by the
time mlock() returns, there are pages left in the pagevec that have
not been drained back to the LRU list. Is that a problem?

--Ying

On Mon, Oct 5, 2009 at 7:41 PM, KOSAKI Motohiro
<kosaki.motohiro@jp.fujitsu.com> wrote:
>
> Recently, Peter Zijlstra reported RT-task can lead to prevent mlock
> very long time.
>
>  Suppose you have 2 cpus, cpu1 is busy doing a SCHED_FIFO-99 while(1),
>  cpu0 does mlock()->lru_add_drain_all(), which does
>  schedule_on_each_cpu(), which then waits for all cpus to complete the
>  work. Except that cpu1, which is busy with the RT task, will never run
>  keventd until the RT load goes away.
>
>  This is not so much an actual deadlock as a serious starvation case.
>
> Actually, mlock() doesn't need to wait to finish lru_add_drain_all().
> Thus, this patch replace it with lru_add_drain_all_async().
>
> Cc: Oleg Nesterov <onestero@redhat.com>
> Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> ---
>  mm/mlock.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/mlock.c b/mm/mlock.c
> index 22041aa..46a016f 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -458,7 +458,7 @@ SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
>        if (!can_do_mlock())
>                return -EPERM;
>
> -       lru_add_drain_all();    /* flush pagevec */
> +       lru_add_drain_all_async();      /* flush pagevec */
>
>        down_write(&current->mm->mmap_sem);
>        len = PAGE_ALIGN(len + (start & ~PAGE_MASK));
> @@ -526,7 +526,7 @@ SYSCALL_DEFINE1(mlockall, int, flags)
>        if (!can_do_mlock())
>                goto out;
>
> -       lru_add_drain_all();    /* flush pagevec */
> +       lru_add_drain_all_async();      /* flush pagevec */
>
>        down_write(&current->mm->mmap_sem);
>
> --
> 1.6.2.5
>
>
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread


* Re: [PATCH 2/2] mlock use lru_add_drain_all_async()
  2009-10-06 16:27     ` Christoph Lameter
@ 2009-10-07  3:37       ` KOSAKI Motohiro
  -1 siblings, 0 replies; 16+ messages in thread
From: KOSAKI Motohiro @ 2009-10-07  3:37 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: LKML, linux-mm, Andrew Morton, Peter Zijlstra, Oleg Nesterov

2009/10/7 Christoph Lameter <cl@linux-foundation.org>:
> On Tue, 6 Oct 2009, KOSAKI Motohiro wrote:
>
>>   Suppose you have 2 cpus, cpu1 is busy doing a SCHED_FIFO-99 while(1),
>>   cpu0 does mlock()->lru_add_drain_all(), which does
>>   schedule_on_each_cpu(), which then waits for all cpus to complete the
>>   work. Except that cpu1, which is busy with the RT task, will never run
>>   keventd until the RT load goes away.
>>
>>   This is not so much an actual deadlock as a serious starvation case.
>>
>> Actually, mlock() doesn't need to wait to finish lru_add_drain_all().
>> Thus, this patch replace it with lru_add_drain_all_async().
>
> Ok so this will queue up lots of events for the cpu doing a RT task. If
> the RT task is continuous then they will be queued there forever?

Yes, this patch solves only a very specific issue.
In the original bug-report case, the system has two cpusets and the RT
task monopolizes one of them. Thus, the case you are worried about
doesn't occur.

Perhaps we need a complete solution, but I don't think this patch has
a bad side effect. So, I hope to push it into mainline.

^ permalink raw reply	[flat|nested] 16+ messages in thread


* Re: [PATCH 2/2] mlock use lru_add_drain_all_async()
  2009-10-06 22:59     ` Ying Han
@ 2009-10-07  3:48       ` KOSAKI Motohiro
  -1 siblings, 0 replies; 16+ messages in thread
From: KOSAKI Motohiro @ 2009-10-07  3:48 UTC (permalink / raw)
  To: Ying Han
  Cc: LKML, linux-mm, Andrew Morton, Peter Zijlstra, Oleg Nesterov,
	Christoph Lameter

Hi

> Hello KOSAKI-san,
>
> Few questions on the lru_add_drain_all_async(). If i understand
> correctly, the reason that we have lru_add_drain_all() in the mlock()
> call is to isolate mlocked pages into the separate LRU in case they
> are sitting in pagevec.
>
> And I also understand the RT use cases you put in the patch
> description, now my questions is that do we have race after applying
> the patch? For example that if the RT task not giving up the cpu by
> the time mlock returns, you have pages left in the pagevec which not
> being drained back to the lru list. Do we have problem with that?

This patch doesn't introduce a new race; the current code already has
the following race:

1. call mlock()
2. lru_add_drain_all()
3. another cpu grabs the page into its pagevec
4. actual PG_mlocked processing

I'd like to explain why this code works anyway. Linux has VM_LOCKED in
the vma and PG_mlocked in the page. If we fail to turn on PG_mlocked,
we can recover it at vmscan time via VM_LOCKED.

So the effects of this patch are:
  - increases the race window a bit
  - decreases the RT-task starvation risk
Thanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread


end of thread, other threads:[~2009-10-07  3:49 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-10-06  2:40 [PATCH 1/2] Implement lru_add_drain_all_async() KOSAKI Motohiro
2009-10-06  2:41 ` [PATCH 2/2] mlock use lru_add_drain_all_async() KOSAKI Motohiro
2009-10-06  9:08   ` Peter Zijlstra
2009-10-06 10:11     ` KOSAKI Motohiro
2009-10-06 16:27   ` Christoph Lameter
2009-10-07  3:37     ` KOSAKI Motohiro
2009-10-06 22:59   ` Ying Han
2009-10-07  3:48     ` KOSAKI Motohiro
