linux-kernel.vger.kernel.org archive mirror
* [PATCH] mm: do not drain pagevecs for mlock
@ 2011-12-30  6:36 Tao Ma
  2011-12-30  8:11 ` KOSAKI Motohiro
  0 siblings, 1 reply; 28+ messages in thread
From: Tao Ma @ 2011-12-30  6:36 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, David Rientjes, Minchan Kim, KOSAKI Motohiro,
	Mel Gorman, Johannes Weiner, Andrew Morton

In our tests of mlock, we have found a severe performance regression
in it. Further investigation shows that mlock is blocked heavily
by lru_add_drain_all, which calls schedule_on_each_cpu and flushes the
work queue; that is very slow if we have several cpus.
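
Roughly, the slow path looks like this (simplified from mm/swap.c and
kernel/workqueue.c of this era; the exact code differs between versions):

int lru_add_drain_all(void)
{
	/* queue lru_add_drain_per_cpu() on every cpu and wait for all of them */
	return schedule_on_each_cpu(lru_add_drain_per_cpu);
}

int schedule_on_each_cpu(work_func_t func)
{
	int cpu;
	struct work_struct __percpu *works = alloc_percpu(struct work_struct);

	if (!works)
		return -ENOMEM;

	get_online_cpus();
	for_each_online_cpu(cpu) {
		struct work_struct *work = per_cpu_ptr(works, cpu);

		INIT_WORK(work, func);
		schedule_work_on(cpu, work);
	}
	for_each_online_cpu(cpu)
		flush_work(per_cpu_ptr(works, cpu));	/* synchronous wait, per cpu */
	put_online_cpus();

	free_percpu(works);
	return 0;
}

So every mlock pays for one work item per online cpu plus a synchronous
wait for each of them, even when most cpus have nothing in their pagevecs.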

So we have tried 2 ways to solve it:
1. Add a per-cpu counter for all the pagevecs so that we don't schedule
   and flush the lru_drain work if the cpu doesn't have any pagevecs (I
   have finished the code already).
2. Remove the lru_add_drain_all.

The first one has some problems since in our production system all the cpus
are busy, so I guess there is very little chance for a cpu to have 0 pagevecs
unless you run several consecutive mlocks.

From the commit log which added this function (8891d6da), it seems that we
don't have to call it. So the 2nd one seems to be both easy and workable,
and hence this patch.

Thanks
Tao

From 8cdf7f7ed236367e85151db65ae06f781aca7d77 Mon Sep 17 00:00:00 2001
From: Tao Ma <boyu.mt@taobao.com>
Date: Fri, 30 Dec 2011 14:20:08 +0800
Subject: [PATCH] mm: do not drain pagevecs for mlock

In commit 8891d6da, lru_add_drain_all was added to mlock to flush all the
per-cpu pagevecs. It makes this system call run much slower than its
predecessor (for a 16-core Xeon E5620, around 20 times slower). And the
more cores we have, the bigger the performance penalty, because of the
costly call to schedule_on_each_cpu.

From the commit log of 8891d6da we can see that "it isn't must.  but it
reduce the failure of moving to unevictable list.  its failure can rescue
in vmscan later."  Christoph Lameter removed the call for mlockall(MCL_FUTURE),
so this patch just removes all the calls from mlock/mlockall.

Without this patch:
time ./test_mlock -c 100000

real    0m20.566s
user    0m0.074s
sys     0m12.759s

With this patch:
time ./test_mlock -c 100000

real	0m1.675s
user	0m0.049s
sys	0m1.622s

Cc: David Rientjes <rientjes@google.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
---
 mm/mlock.c |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/mm/mlock.c b/mm/mlock.c
index 4f4f53b..bb5fc42 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -487,8 +487,6 @@ SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
 	if (!can_do_mlock())
 		return -EPERM;
 
-	lru_add_drain_all();	/* flush pagevec */
-
 	down_write(&current->mm->mmap_sem);
 	len = PAGE_ALIGN(len + (start & ~PAGE_MASK));
 	start &= PAGE_MASK;
@@ -557,9 +555,6 @@ SYSCALL_DEFINE1(mlockall, int, flags)
 	if (!can_do_mlock())
 		goto out;
 
-	if (flags & MCL_CURRENT)
-		lru_add_drain_all();	/* flush pagevec */
-
 	down_write(&current->mm->mmap_sem);
 
 	lock_limit = rlimit(RLIMIT_MEMLOCK);
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH] mm: do not drain pagevecs for mlock
  2011-12-30  6:36 [PATCH] mm: do not drain pagevecs for mlock Tao Ma
@ 2011-12-30  8:11 ` KOSAKI Motohiro
  2011-12-30  8:48   ` Tao Ma
  0 siblings, 1 reply; 28+ messages in thread
From: KOSAKI Motohiro @ 2011-12-30  8:11 UTC (permalink / raw)
  To: Tao Ma
  Cc: linux-mm, linux-kernel, David Rientjes, Minchan Kim, Mel Gorman,
	Johannes Weiner, Andrew Morton

2011/12/30 Tao Ma <tm@tao.ma>:
> In our test of mlock, we have found some severe performance regression
> in it. Some more investigations show that mlocked is blocked heavily
> by lur_add_drain_all which calls schedule_on_each_cpu and flush the work
> queue which is very slower if we have several cpus.
>
> So we have tried 2 ways to solve it:
> 1. Add a per cpu counter for all the pagevecs so that we don't schedule
>   and flush the lru_drain work if the cpu doesn't have any pagevecs(I
>   have finished the codes already).
> 2. Remove the lru_add_drain_all.
>
> The first one has some problems since in our product system, all the cpus
> are busy, so I guess there is very little chance for a cpu to have 0 pagevecs
> except that you run several consecutive mlocks.
>
> From the commit log which added this function(8891d6da), it seems that we
> don't have to call it. So the 2nd one seems to be both easy and workable and
> comes this patch.

Could you please show us your system environment and benchmark programs?
Usually lru_drain_** is much faster than the mlock() body, because the
latter does plenty of memset(page).

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] mm: do not drain pagevecs for mlock
  2011-12-30  8:11 ` KOSAKI Motohiro
@ 2011-12-30  8:48   ` Tao Ma
  2011-12-30  9:31     ` KOSAKI Motohiro
  0 siblings, 1 reply; 28+ messages in thread
From: Tao Ma @ 2011-12-30  8:48 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: linux-mm, linux-kernel, David Rientjes, Minchan Kim, Mel Gorman,
	Johannes Weiner, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 1336 bytes --]

On 12/30/2011 04:11 PM, KOSAKI Motohiro wrote:
> 2011/12/30 Tao Ma <tm@tao.ma>:
>> In our test of mlock, we have found some severe performance regression
>> in it. Some more investigations show that mlocked is blocked heavily
>> by lur_add_drain_all which calls schedule_on_each_cpu and flush the work
>> queue which is very slower if we have several cpus.
>>
>> So we have tried 2 ways to solve it:
>> 1. Add a per cpu counter for all the pagevecs so that we don't schedule
>>   and flush the lru_drain work if the cpu doesn't have any pagevecs(I
>>   have finished the codes already).
>> 2. Remove the lru_add_drain_all.
>>
>> The first one has some problems since in our product system, all the cpus
>> are busy, so I guess there is very little chance for a cpu to have 0 pagevecs
>> except that you run several consecutive mlocks.
>>
>> From the commit log which added this function(8891d6da), it seems that we
>> don't have to call it. So the 2nd one seems to be both easy and workable and
>> comes this patch.
> 
> Could you please show us your system environment and benchmark programs?
> Usually lru_drain_** is very fast than mlock() body because it makes
> plenty memset(page).
The system environment is: 16 core Xeon E5620. 24G memory.

I have attached the program. It is very simple and just uses mlock/munlock.

Thanks
Tao

[-- Attachment #2: test_mlock.c --]
[-- Type: text/x-csrc, Size: 981 bytes --]

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <time.h>
#include <sys/time.h>
#include <sys/mman.h>

#define MM_SZ1 24
#define MM_SZ2 56
#define MM_SZ3 4168

void mlock_test()
{
	char ptr1[MM_SZ1];
	char ptr2[MM_SZ2];
	char ptr3[MM_SZ3];

	if(0 != mlock(ptr1, MM_SZ1) )
		perror("mlock MM_SZ1\n");
	if(0 != mlock(ptr2, MM_SZ2) )
		perror("mlock MM_SZ2\n");
	if(0 != mlock(ptr3, MM_SZ3) )
		perror("mlock MM_SZ3\n");

	if(0 != munlock(ptr1, MM_SZ1) )
		perror("munlock MM_SZ1\n");       
	if(0 != munlock(ptr2, MM_SZ2) )
		perror("munlock MM_SZ2\n");
	if(0 != munlock(ptr3, MM_SZ3) )
		perror("munlock MM_SZ3\n");
}

int main(int argc, char *argv[])
{
	int opt;
	int i, cnt = 0;	/* default to zero iterations if -c is not given */

	while((opt = getopt(argc, argv, "c:")) != -1 )
	{
		switch(opt){
		case 'c':
			cnt = atoi(optarg);
			break;
		default:
			printf("Usage: %s [-c count] arg...\n", argv[0]);
			exit(EXIT_FAILURE);
		}
	}

	for(i = 0; i < cnt; i++)
		mlock_test();

	return 0;
}

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] mm: do not drain pagevecs for mlock
  2011-12-30  8:48   ` Tao Ma
@ 2011-12-30  9:31     ` KOSAKI Motohiro
  2011-12-30  9:45       ` Tao Ma
  0 siblings, 1 reply; 28+ messages in thread
From: KOSAKI Motohiro @ 2011-12-30  9:31 UTC (permalink / raw)
  To: Tao Ma
  Cc: linux-mm, linux-kernel, David Rientjes, Minchan Kim, Mel Gorman,
	Johannes Weiner, Andrew Morton

2011/12/30 Tao Ma <tm@tao.ma>:
> On 12/30/2011 04:11 PM, KOSAKI Motohiro wrote:
>> 2011/12/30 Tao Ma <tm@tao.ma>:
>>> In our test of mlock, we have found some severe performance regression
>>> in it. Some more investigations show that mlocked is blocked heavily
>>> by lur_add_drain_all which calls schedule_on_each_cpu and flush the work
>>> queue which is very slower if we have several cpus.
>>>
>>> So we have tried 2 ways to solve it:
>>> 1. Add a per cpu counter for all the pagevecs so that we don't schedule
>>>   and flush the lru_drain work if the cpu doesn't have any pagevecs(I
>>>   have finished the codes already).
>>> 2. Remove the lru_add_drain_all.
>>>
>>> The first one has some problems since in our product system, all the cpus
>>> are busy, so I guess there is very little chance for a cpu to have 0 pagevecs
>>> except that you run several consecutive mlocks.
>>>
>>> From the commit log which added this function(8891d6da), it seems that we
>>> don't have to call it. So the 2nd one seems to be both easy and workable and
>>> comes this patch.
>>
>> Could you please show us your system environment and benchmark programs?
>> Usually lru_drain_** is very fast than mlock() body because it makes
>> plenty memset(page).
> The system environment is: 16 core Xeon E5620. 24G memory.
>
> I have attached the program. It is very simple and just uses mlock/munlock.

Because your test program is too artificial. 20sec/100000times =
200usec. And your program repeats mlock and munlock on the exact same
addresses. So, yes, if lru_add_drain_all() is removed, it becomes nearly
a no-op, but that is a worthless comparison; no practical program uses
mlock in such a strange way.

But 200usec is more than I measured before. I'll dig into it a bit more.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] mm: do not drain pagevecs for mlock
  2011-12-30  9:31     ` KOSAKI Motohiro
@ 2011-12-30  9:45       ` Tao Ma
  2011-12-30 10:07         ` KOSAKI Motohiro
  2011-12-30 10:14         ` KOSAKI Motohiro
  0 siblings, 2 replies; 28+ messages in thread
From: Tao Ma @ 2011-12-30  9:45 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: linux-mm, linux-kernel, David Rientjes, Minchan Kim, Mel Gorman,
	Johannes Weiner, Andrew Morton

On 12/30/2011 05:31 PM, KOSAKI Motohiro wrote:
> 2011/12/30 Tao Ma <tm@tao.ma>:
>> On 12/30/2011 04:11 PM, KOSAKI Motohiro wrote:
>>> 2011/12/30 Tao Ma <tm@tao.ma>:
>>>> In our test of mlock, we have found some severe performance regression
>>>> in it. Some more investigations show that mlocked is blocked heavily
>>>> by lur_add_drain_all which calls schedule_on_each_cpu and flush the work
>>>> queue which is very slower if we have several cpus.
>>>>
>>>> So we have tried 2 ways to solve it:
>>>> 1. Add a per cpu counter for all the pagevecs so that we don't schedule
>>>>   and flush the lru_drain work if the cpu doesn't have any pagevecs(I
>>>>   have finished the codes already).
>>>> 2. Remove the lru_add_drain_all.
>>>>
>>>> The first one has some problems since in our product system, all the cpus
>>>> are busy, so I guess there is very little chance for a cpu to have 0 pagevecs
>>>> except that you run several consecutive mlocks.
>>>>
>>>> From the commit log which added this function(8891d6da), it seems that we
>>>> don't have to call it. So the 2nd one seems to be both easy and workable and
>>>> comes this patch.
>>>
>>> Could you please show us your system environment and benchmark programs?
>>> Usually lru_drain_** is very fast than mlock() body because it makes
>>> plenty memset(page).
>> The system environment is: 16 core Xeon E5620. 24G memory.
>>
>> I have attached the program. It is very simple and just uses mlock/munlock.
> 
> Because your test program is too artificial. 20sec/100000times =
> 200usec. And your
> program repeat mlock and munlock the exact same address. so, yes, if
> lru_add_drain_all() is removed, it become near no-op. but it's
> worthless comparision.
> none of any practical program does such strange mlock usage.
yes, I should say it is artificial. But mlock did cause the problem in
our production system, and perf shows that mlock uses much more system
time than anything else. That's the reason we created this program, to
test whether mlock really is that slow. And we compared the result with
rhel5 (2.6.18), which runs much, much faster.

And from the commit log you described, we can remove lru_add_drain_all
safely here, so why was it added? At least removing it makes mlock much
faster compared to the vanilla kernel.

> 
> But, 200usec is much than I measured before. I'll dig it a bit more.
Thanks for the help.

Tao

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] mm: do not drain pagevecs for mlock
  2011-12-30  9:45       ` Tao Ma
@ 2011-12-30 10:07         ` KOSAKI Motohiro
  2012-01-01  7:30           ` [PATCH 1/2] mm,mlock: drain pagevecs asynchronously kosaki.motohiro
                             ` (3 more replies)
  2011-12-30 10:14         ` KOSAKI Motohiro
  1 sibling, 4 replies; 28+ messages in thread
From: KOSAKI Motohiro @ 2011-12-30 10:07 UTC (permalink / raw)
  To: Tao Ma
  Cc: linux-mm, linux-kernel, David Rientjes, Minchan Kim, Mel Gorman,
	Johannes Weiner, Andrew Morton

>> Because your test program is too artificial. 20sec/100000times =
>> 200usec. And your
>> program repeat mlock and munlock the exact same address. so, yes, if
>> lru_add_drain_all() is removed, it become near no-op. but it's
>> worthless comparision.
>> none of any practical program does such strange mlock usage.
> yes, I should say it is artificial. But mlock did cause the problem in
> our product system and perf shows that the mlock uses the system time
> much more than others. That's the reason we created this program to test
> whether mlock really sucks. And we compared the result with
> rhel5(2.6.18) which runs much much faster.
>
> And from the commit log you described, we can remove lru_add_drain_all
> safely here, so why add it? At least removing it makes mlock much faster
> compared to the vanilla kernel.

If we remove it, we lose a way to test mlock: the "Mlocked" field of
/proc/meminfo can very easily show an inaccurate number. So, if the
200usec is unavoidable, I'll ack you. But I'm not convinced yet.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] mm: do not drain pagevecs for mlock
  2011-12-30  9:45       ` Tao Ma
  2011-12-30 10:07         ` KOSAKI Motohiro
@ 2011-12-30 10:14         ` KOSAKI Motohiro
  1 sibling, 0 replies; 28+ messages in thread
From: KOSAKI Motohiro @ 2011-12-30 10:14 UTC (permalink / raw)
  To: Tao Ma
  Cc: linux-mm, linux-kernel, David Rientjes, Minchan Kim, Mel Gorman,
	Johannes Weiner, Andrew Morton

One more thing.

>> Because your test program is too artificial. 20sec/100000times =
>> 200usec. And your
>> program repeat mlock and munlock the exact same address. so, yes, if
>> lru_add_drain_all() is removed, it become near no-op. but it's
>> worthless comparision.
>> none of any practical program does such strange mlock usage.
> yes, I should say it is artificial. But mlock did cause the problem in
> our product system and perf shows that the mlock uses the system time
> much more than others. That's the reason we created this program to test
> whether mlock really sucks. And we compared the result with
> rhel5(2.6.18) which runs much much faster.

rhel5 is faster because it doesn't have a proper mlock implementation, and
consequently it has a lot of mlock-related performance issues. We optimized
the PG_mlocked handling for the typical workload: mlock calls are rare and
memory reclaim happens a lot. That's the reason why we haven't had any
complaints about the mlock code.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/2] mm,mlock: drain pagevecs asynchronously
  2011-12-30 10:07         ` KOSAKI Motohiro
@ 2012-01-01  7:30           ` kosaki.motohiro
  2012-01-04  1:17             ` Minchan Kim
                               ` (2 more replies)
  2012-01-01  7:30           ` [PATCH 2/2] sysvshm: SHM_LOCK use lru_add_drain_all_async() kosaki.motohiro
                             ` (2 subsequent siblings)
  3 siblings, 3 replies; 28+ messages in thread
From: kosaki.motohiro @ 2012-01-01  7:30 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: KOSAKI Motohiro, David Rientjes, Minchan Kim, Mel Gorman,
	Johannes Weiner, Andrew Morton

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

Tao Ma reported that the current mlock is much slower than the old 2.6.18
kernel, because lru_add_drain_all() spends much time. The problems are two:
1) lru_add_drain_all() broadcasts a worker to all cpus unconditionally, so
the performance penalty grows in proportion to the number of cpus.
2) lru_add_drain_all() waits for the workers to finish unnecessarily, which
makes the penalty even bigger.

This patch adds lru_add_drain_all_async() and changes mlock/mlockall to use it.

Technical side notes:
 - has_pages_lru_pvecs() checks the pagevecs locklessly. Of course it's racy,
   but that doesn't matter because the asynchronous worker itself is racy too;
   no lock can close that race.
 - We now drain the pagevecs at the end of mlock instead of the beginning,
   because the page drain function (____pagevec_lru_add_fn) is PG_mlocked
   aware now. That is safe and closes more of the race.

Without the patch:
% time ./test_mlock -c 100000

 real   1m13.608s
 user   0m0.204s
 sys    0m40.115s

 i.e. 200usec per mlock

With the patch:
% time ./test_mlock -c 100000
 real    0m3.939s
 user    0m0.060s
 sys     0m3.868s

 i.e. 13usec per mlock

test_mlock.c
==========================================
 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
 #include <errno.h>
 #include <time.h>
 #include <sys/time.h>
 #include <sys/mman.h>

 #define MM_SZ1 24
 #define MM_SZ2 56
 #define MM_SZ3 4168

void mlock_test()
{
	char ptr1[MM_SZ1];
	char ptr2[MM_SZ2];
	char ptr3[MM_SZ3];

	if(0 != mlock(ptr1, MM_SZ1) )
		perror("mlock MM_SZ1\n");
	if(0 != mlock(ptr2, MM_SZ2) )
		perror("mlock MM_SZ2\n");
	if(0 != mlock(ptr3, MM_SZ3) )
		perror("mlock MM_SZ3\n");

	if(0 != munlock(ptr1, MM_SZ1) )
		perror("munlock MM_SZ1\n");
	if(0 != munlock(ptr2, MM_SZ2) )
		perror("munlock MM_SZ2\n");
	if(0 != munlock(ptr3, MM_SZ3) )
		perror("munlock MM_SZ3\n");
}

int main(int argc, char *argv[])
{
	int opt;
	int i, cnt = 0;	/* default to zero iterations if -c is not given */

	while((opt = getopt(argc, argv, "c:")) != -1 )
	{
		switch(opt){
		case 'c':
			cnt = atoi(optarg);
			break;
		default:
			printf("Usage: %s [-c count] arg...\n", argv[0]);
			exit(EXIT_FAILURE);
		}
	}

	for(i = 0; i < cnt; i++)
		mlock_test();

	return 0;
}
===========================================

Reported-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 include/linux/swap.h |    1 +
 mm/mlock.c           |    7 +----
 mm/swap.c            |   64 ++++++++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 63 insertions(+), 9 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 1e22e12..11ad301 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -223,6 +223,7 @@ extern void activate_page(struct page *);
 extern void mark_page_accessed(struct page *);
 extern void lru_add_drain(void);
 extern int lru_add_drain_all(void);
+extern void lru_add_drain_all_async(void);
 extern void rotate_reclaimable_page(struct page *page);
 extern void deactivate_page(struct page *page);
 extern void swap_setup(void);
diff --git a/mm/mlock.c b/mm/mlock.c
index 4f4f53b..08f5b6b 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -487,8 +487,6 @@ SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
 	if (!can_do_mlock())
 		return -EPERM;
 
-	lru_add_drain_all();	/* flush pagevec */
-
 	down_write(&current->mm->mmap_sem);
 	len = PAGE_ALIGN(len + (start & ~PAGE_MASK));
 	start &= PAGE_MASK;
@@ -505,6 +503,7 @@ SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
 	up_write(&current->mm->mmap_sem);
 	if (!error)
 		error = do_mlock_pages(start, len, 0);
+	lru_add_drain_all_async();
 	return error;
 }
 
@@ -557,9 +556,6 @@ SYSCALL_DEFINE1(mlockall, int, flags)
 	if (!can_do_mlock())
 		goto out;
 
-	if (flags & MCL_CURRENT)
-		lru_add_drain_all();	/* flush pagevec */
-
 	down_write(&current->mm->mmap_sem);
 
 	lock_limit = rlimit(RLIMIT_MEMLOCK);
@@ -573,6 +569,7 @@ SYSCALL_DEFINE1(mlockall, int, flags)
 	if (!ret && (flags & MCL_CURRENT)) {
 		/* Ignore errors */
 		do_mlock_pages(0, TASK_SIZE, 1);
+		lru_add_drain_all_async();
 	}
 out:
 	return ret;
diff --git a/mm/swap.c b/mm/swap.c
index a91caf7..2690f04 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -569,6 +569,49 @@ int lru_add_drain_all(void)
 	return schedule_on_each_cpu(lru_add_drain_per_cpu);
 }
 
+static DEFINE_PER_CPU(struct work_struct, lru_drain_work);
+
+static int __init lru_drain_work_init(void)
+{
+	struct work_struct *work;
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		work = &per_cpu(lru_drain_work, cpu);
+		INIT_WORK(work, &lru_add_drain_per_cpu);
+	}
+
+	return 0;
+}
+core_initcall(lru_drain_work_init);
+
+static bool has_pages_lru_pvecs(int cpu)
+{
+	struct pagevec *pvecs = per_cpu(lru_add_pvecs, cpu);
+	struct pagevec *pvec;
+	int lru;
+
+	for_each_lru(lru) {
+		pvec = &pvecs[lru - LRU_BASE];
+		if (pagevec_count(pvec))
+			return true;
+	}
+
+	return false;
+}
+
+void lru_add_drain_all_async(void)
+{
+	int cpu;
+
+	for_each_online_cpu(cpu) {
+		struct work_struct *work = &per_cpu(lru_drain_work, cpu);
+
+		if (has_pages_lru_pvecs(cpu))
+			schedule_work_on(cpu, work);
+	}
+}
+
 /*
  * Batched page_cache_release().  Decrement the reference count on all the
  * passed pages.  If it fell to zero then remove the page from the LRU and
@@ -704,10 +747,23 @@ static void ____pagevec_lru_add_fn(struct page *page, void *arg)
 	VM_BUG_ON(PageLRU(page));
 
 	SetPageLRU(page);
-	if (active)
-		SetPageActive(page);
-	update_page_reclaim_stat(zone, page, file, active);
-	add_page_to_lru_list(zone, page, lru);
+ redo:
+	if (page_evictable(page, NULL)) {
+		if (active)
+			SetPageActive(page);
+		update_page_reclaim_stat(zone, page, file, active);
+		add_page_to_lru_list(zone, page, lru);
+	} else {
+		SetPageUnevictable(page);
+		add_page_to_lru_list(zone, page, LRU_UNEVICTABLE);
+		smp_mb();
+
+		if (page_evictable(page, NULL)) {
+			del_page_from_lru_list(zone, page, LRU_UNEVICTABLE);
+			ClearPageUnevictable(page);
+			goto redo;
+		}
+	}
 }
 
 /*
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/2] sysvshm: SHM_LOCK use lru_add_drain_all_async()
  2011-12-30 10:07         ` KOSAKI Motohiro
  2012-01-01  7:30           ` [PATCH 1/2] mm,mlock: drain pagevecs asynchronously kosaki.motohiro
@ 2012-01-01  7:30           ` kosaki.motohiro
  2012-01-04  1:51             ` Hugh Dickins
  2012-01-06  6:13           ` [PATCH] mm: do not drain pagevecs for mlock Tao Ma
  2012-01-09  7:25           ` Tao Ma
  3 siblings, 1 reply; 28+ messages in thread
From: kosaki.motohiro @ 2012-01-01  7:30 UTC (permalink / raw)
  To: linux-mm, linux-kernel; +Cc: KOSAKI Motohiro

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

shmctl also doesn't need a synchronous pagevec drain. This patch replaces it
with lru_add_drain_all_async().

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 ipc/shm.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/ipc/shm.c b/ipc/shm.c
index 02ecf2c..1eb25f0 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -872,8 +872,6 @@ SYSCALL_DEFINE3(shmctl, int, shmid, int, cmd, struct shmid_ds __user *, buf)
 	{
 		struct file *uninitialized_var(shm_file);
 
-		lru_add_drain_all();  /* drain pagevecs to lru lists */
-
 		shp = shm_lock_check(ns, shmid);
 		if (IS_ERR(shp)) {
 			err = PTR_ERR(shp);
@@ -911,6 +909,8 @@ SYSCALL_DEFINE3(shmctl, int, shmid, int, cmd, struct shmid_ds __user *, buf)
 			shp->mlock_user = NULL;
 		}
 		shm_unlock(shp);
+		/* prevent user visible mismatch of unevictable accounting */
+		lru_add_drain_all_async();
 		goto out;
 	}
 	case IPC_RMID:
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] mm,mlock: drain pagevecs asynchronously
  2012-01-01  7:30           ` [PATCH 1/2] mm,mlock: drain pagevecs asynchronously kosaki.motohiro
@ 2012-01-04  1:17             ` Minchan Kim
  2012-01-04  2:38               ` KOSAKI Motohiro
  2012-01-04  2:56             ` Hugh Dickins
  2012-01-04 22:05             ` Andrew Morton
  2 siblings, 1 reply; 28+ messages in thread
From: Minchan Kim @ 2012-01-04  1:17 UTC (permalink / raw)
  To: kosaki.motohiro
  Cc: linux-mm, linux-kernel, KOSAKI Motohiro, David Rientjes,
	Minchan Kim, Mel Gorman, Johannes Weiner, Andrew Morton

Hi KOSAKI,

On Sun, Jan 01, 2012 at 02:30:24AM -0500, kosaki.motohiro@gmail.com wrote:
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> 
> Tao Ma reported current mlock is much slower than old 2.6.18 kernel. Because
> lru_add_drain_all() spent much time. The problem are two. 1) lru_add_drain_all()
> broadcast a worker thread to all cpus unconditionally. then, the performance
> penalty is increased in proportion to number of cpus. 2) lru_add_drain_all()
> wait the worker finished unnecessary. It makes bigger penalty.
> 
> This patch makes lru_add_drain_all_async() and changes mlock/mlockall use it.
> 
> Technical side note:
>  - has_pages_lru_pvecs() checks pagevecs locklessly. Of course, it's racy.
>    But it's no matter because asynchronous worker itself is also racy.
>    any lock can't close a race.
>  - Now, we drain pagevec at last of mlock instead of beginning. because
>    a page drain function (____pagevec_lru_add_fn) is PG_mlocked aware now.
>    Then it's safe and it close more race.
> 
> Without the patch:
> % time ./test_mlock -c 100000
> 
>  real   1m13.608s
>  user   0m0.204s
>  sys    0m40.115s
> 
>  i.e. 200usec per mlock
> 
> With the patch:
> % time ./test_mlock -c 100000
>  real    0m3.939s
>  user    0m0.060s
>  sys     0m3.868s
> 
>  i.e. 13usec per mlock
> 
> test_mlock.c
> ==========================================
>  #include <stdio.h>
>  #include <stdlib.h>
>  #include <unistd.h>
>  #include <errno.h>
>  #include <time.h>
>  #include <sys/time.h>
>  #include <sys/mman.h>
> 
>  #define MM_SZ1 24
>  #define MM_SZ2 56
>  #define MM_SZ3 4168
> 
> void mlock_test()
> {
> 	char ptr1[MM_SZ1];
> 	char ptr2[MM_SZ2];
> 	char ptr3[MM_SZ3];
> 
> 	if(0 != mlock(ptr1, MM_SZ1) )
> 		perror("mlock MM_SZ1\n");
> 	if(0 != mlock(ptr2, MM_SZ2) )
> 		perror("mlock MM_SZ2\n");
> 	if(0 != mlock(ptr3, MM_SZ3) )
> 		perror("mlock MM_SZ3\n");
> 
> 	if(0 != munlock(ptr1, MM_SZ1) )
> 		perror("munlock MM_SZ1\n");
> 	if(0 != munlock(ptr2, MM_SZ2) )
> 		perror("munlock MM_SZ2\n");
> 	if(0 != munlock(ptr3, MM_SZ3) )
> 		perror("munlock MM_SZ3\n");
> }
> 
> int main(int argc, char *argv[])
> {
> 	int ret, opt;
> 	int i,cnt;
> 
> 	while((opt = getopt(argc, argv, "c:")) != -1 )
> 	{
> 		switch(opt){
> 		case 'c':
> 			cnt = atoi(optarg);
> 			break;
> 		default:
> 			printf("Usage: %s [-c count] arg...\n", argv[0]);
> 			exit(EXIT_FAILURE);
> 		}
> 	}
> 
> 	for(i = 0; i < cnt; i++)
> 		mlock_test();
> 
> 	return 0;
> }
> ===========================================
> 
> Reported-by: Tao Ma <boyu.mt@taobao.com>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Minchan Kim <minchan.kim@gmail.com>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Mel Gorman <mel@csn.ul.ie>
> Cc: Johannes Weiner <jweiner@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> ---
>  include/linux/swap.h |    1 +
>  mm/mlock.c           |    7 +----
>  mm/swap.c            |   64 ++++++++++++++++++++++++++++++++++++++++++++++---
>  3 files changed, 63 insertions(+), 9 deletions(-)
> 
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 1e22e12..11ad301 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -223,6 +223,7 @@ extern void activate_page(struct page *);
>  extern void mark_page_accessed(struct page *);
>  extern void lru_add_drain(void);
>  extern int lru_add_drain_all(void);
> +extern void lru_add_drain_all_async(void);
>  extern void rotate_reclaimable_page(struct page *page);
>  extern void deactivate_page(struct page *page);
>  extern void swap_setup(void);
> diff --git a/mm/mlock.c b/mm/mlock.c
> index 4f4f53b..08f5b6b 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -487,8 +487,6 @@ SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
>  	if (!can_do_mlock())
>  		return -EPERM;
>  
> -	lru_add_drain_all();	/* flush pagevec */
> -
>  	down_write(&current->mm->mmap_sem);
>  	len = PAGE_ALIGN(len + (start & ~PAGE_MASK));
>  	start &= PAGE_MASK;
> @@ -505,6 +503,7 @@ SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
>  	up_write(&current->mm->mmap_sem);
>  	if (!error)
>  		error = do_mlock_pages(start, len, 0);
> +	lru_add_drain_all_async();

"flush pagevec" was not a good comment.
While we are at it, we should add a better comment explaining why we need
to drain here.

>  	return error;
>  }
>  
> @@ -557,9 +556,6 @@ SYSCALL_DEFINE1(mlockall, int, flags)
>  	if (!can_do_mlock())
>  		goto out;
>  
> -	if (flags & MCL_CURRENT)
> -		lru_add_drain_all();	/* flush pagevec */
> -
>  	down_write(&current->mm->mmap_sem);
>  
>  	lock_limit = rlimit(RLIMIT_MEMLOCK);
> @@ -573,6 +569,7 @@ SYSCALL_DEFINE1(mlockall, int, flags)
>  	if (!ret && (flags & MCL_CURRENT)) {
>  		/* Ignore errors */
>  		do_mlock_pages(0, TASK_SIZE, 1);
> +		lru_add_drain_all_async();
>  	}
>  out:
>  	return ret;
> diff --git a/mm/swap.c b/mm/swap.c
> index a91caf7..2690f04 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -569,6 +569,49 @@ int lru_add_drain_all(void)
>  	return schedule_on_each_cpu(lru_add_drain_per_cpu);
>  }
>  
> +static DEFINE_PER_CPU(struct work_struct, lru_drain_work);
> +
> +static int __init lru_drain_work_init(void)
> +{
> +	struct work_struct *work;
> +	int cpu;
> +
> +	for_each_possible_cpu(cpu) {
> +		work = &per_cpu(lru_drain_work, cpu);
> +		INIT_WORK(work, &lru_add_drain_per_cpu);
> +	}
> +
> +	return 0;
> +}
> +core_initcall(lru_drain_work_init);
> +
> +static bool has_pages_lru_pvecs(int cpu)
> +{
> +	struct pagevec *pvecs = per_cpu(lru_add_pvecs, cpu);
> +	struct pagevec *pvec;
> +	int lru;
> +
> +	for_each_lru(lru) {
> +		pvec = &pvecs[lru - LRU_BASE];
> +		if (pagevec_count(pvec))
> +			return true;
> +	}
> +
> +	return false;
> +}
> +
> +void lru_add_drain_all_async(void)
> +{
> +	int cpu;
> +
> +	for_each_online_cpu(cpu) {
> +		struct work_struct *work = &per_cpu(lru_drain_work, cpu);
> +
> +		if (has_pages_lru_pvecs(cpu))
> +			schedule_work_on(cpu, work);
> +	}
> +}
> +
>  /*
>   * Batched page_cache_release().  Decrement the reference count on all the
>   * passed pages.  If it fell to zero then remove the page from the LRU and
> @@ -704,10 +747,23 @@ static void ____pagevec_lru_add_fn(struct page *page, void *arg)
>  	VM_BUG_ON(PageLRU(page));
>  
>  	SetPageLRU(page);
> -	if (active)
> -		SetPageActive(page);
> -	update_page_reclaim_stat(zone, page, file, active);
> -	add_page_to_lru_list(zone, page, lru);
> + redo:
> +	if (page_evictable(page, NULL)) {
> +		if (active)
> +			SetPageActive(page);
> +		update_page_reclaim_stat(zone, page, file, active);
> +		add_page_to_lru_list(zone, page, lru);
> +	} else {
> +		SetPageUnevictable(page);
> +		add_page_to_lru_list(zone, page, LRU_UNEVICTABLE);
> +		smp_mb();

Why do we need a barrier here? Please add a comment explaining it.

> +
> +		if (page_evictable(page, NULL)) {
> +			del_page_from_lru_list(zone, page, LRU_UNEVICTABLE);
> +			ClearPageUnevictable(page);
> +			goto redo;
> +		}
> +	}

I am not sure it's a good idea.
mlock is a very rare event, but ____pagevec_lru_add_fn is called frequently,
and we are adding more overhead to ____pagevec_lru_add_fn.
Is it worth it?

>  }
>  
>  /*
> -- 
> 1.7.1
> 

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/2] sysvshm: SHM_LOCK use lru_add_drain_all_async()
  2012-01-01  7:30           ` [PATCH 2/2] sysvshm: SHM_LOCK use lru_add_drain_all_async() kosaki.motohiro
@ 2012-01-04  1:51             ` Hugh Dickins
  2012-01-04  2:19               ` KOSAKI Motohiro
  0 siblings, 1 reply; 28+ messages in thread
From: Hugh Dickins @ 2012-01-04  1:51 UTC (permalink / raw)
  To: kosaki.motohiro; +Cc: linux-mm, linux-kernel, KOSAKI Motohiro

On Sun, 1 Jan 2012, kosaki.motohiro@gmail.com wrote:
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> 
> shmctl also don't need synchrounous pagevec drain. This patch replace it with
> lru_add_drain_all_async().
> 
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

Let me answer this 2/2 first since it's easier.

I'm going to thank you for bringing this lru_add_drain_all()
to my attention, I had not noticed it; but Nak the patch itself.

The reason being, that particular lru_add_drain_all() serves no
useful purpose, so let's delete it instead of replacing it.  I believe
that it serves no purpose for SHM_LOCK and no purpose for SHM_UNLOCK.

I'm dabbling in this area myself, since you so cogently pointed out that
I'd tried to add a cond_resched() to scan_mapping_unevictable_pages()
(which is a helper for SHM_UNLOCK here) while it's under spinlock.

In testing my fix for that, I find that there has been no attempt to
keep the Unevictable count accurate on SysVShm: SHM_LOCK pages get
marked unevictable lazily later as memory pressure discovers them -
which perhaps mirrors the way in which SHM_LOCK makes no attempt to
instantiate pages, unlike mlock.

Since nobody has complained about that in the two years since we've
had an Unevictable count in /proc/meminfo, I don't see any need to
add code (it would need more than just your change here; would need
more even than calling scan_mapping_unevictable_pages() at SHM_LOCK
time - though perhaps along with your 1/2 that could handle it) and
overhead to satisfy a need that nobody has.

I'll delete that lru_add_drain_all() in my patch, okay?

(But in writing this, realize I still don't quite understand why
the Unevictable count takes a second or two to get back to 0 after
SHM_UNLOCK: perhaps I've more to discover.)

Hugh

> ---
>  ipc/shm.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/ipc/shm.c b/ipc/shm.c
> index 02ecf2c..1eb25f0 100644
> --- a/ipc/shm.c
> +++ b/ipc/shm.c
> @@ -872,8 +872,6 @@ SYSCALL_DEFINE3(shmctl, int, shmid, int, cmd, struct shmid_ds __user *, buf)
>  	{
>  		struct file *uninitialized_var(shm_file);
>  
> -		lru_add_drain_all();  /* drain pagevecs to lru lists */
> -
>  		shp = shm_lock_check(ns, shmid);
>  		if (IS_ERR(shp)) {
>  			err = PTR_ERR(shp);
> @@ -911,6 +909,8 @@ SYSCALL_DEFINE3(shmctl, int, shmid, int, cmd, struct shmid_ds __user *, buf)
>  			shp->mlock_user = NULL;
>  		}
>  		shm_unlock(shp);
> +		/* prevent user visible mismatch of unevictable accounting */
> +		lru_add_drain_all_async();
>  		goto out;
>  	}
>  	case IPC_RMID:
> -- 
> 1.7.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/2] sysvshm: SHM_LOCK use lru_add_drain_all_async()
  2012-01-04  1:51             ` Hugh Dickins
@ 2012-01-04  2:19               ` KOSAKI Motohiro
  2012-01-04  5:17                 ` Hugh Dickins
  0 siblings, 1 reply; 28+ messages in thread
From: KOSAKI Motohiro @ 2012-01-04  2:19 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: linux-mm, linux-kernel, KOSAKI Motohiro

(1/3/12 8:51 PM), Hugh Dickins wrote:
> On Sun, 1 Jan 2012, kosaki.motohiro@gmail.com wrote:
>> From: KOSAKI Motohiro<kosaki.motohiro@jp.fujitsu.com>
>>
>> shmctl also don't need synchrounous pagevec drain. This patch replace it with
>> lru_add_drain_all_async().
>>
>> Signed-off-by: KOSAKI Motohiro<kosaki.motohiro@jp.fujitsu.com>
>
> Let me answer this 2/2 first since it's easier.
>
> I'm going to thank you for bringing this lru_add_drain_all()
> to my attention, I had not noticed it; but Nak the patch itself.
>
> The reason being, that particular lru_add_drain_all() serves no
> useful purpose, so let's delete it instead of replacing it.  I believe
> that it serves no purpose for SHM_LOCK and no purpose for SHM_UNLOCK.
>
> I'm dabbling in this area myself, since you so cogently pointed out that
> I'd tried to add a cond_resched() to scan_mapping_unevictable_pages()
> (which is a helper for SHM_UNLOCK here) while it's under spinlock.
>
> In testing my fix for that, I find that there has been no attempt to
> keep the Unevictable count accurate on SysVShm: SHM_LOCK pages get
> marked unevictable lazily later as memory pressure discovers them -
> which perhaps mirrors the way in which SHM_LOCK makes no attempt to
> instantiate pages, unlike mlock.

Ugh, you are right. My memory is coming back gradually. Lee
implemented immediate off-LRU logic at first and I killed it
to close a race. I completely forgot. So, yes, now SHM_LOCK makes no
attempt to instantiate pages. I'm ashamed.


>
> Since nobody has complained about that in the two years since we've
> had an Unevictable count in /proc/meminfo, I don't see any need to
> add code (it would need more than just your change here; would need
> more even than calling scan_mapping_unevictable_pages() at SHM_LOCK
> time - though perhaps along with your 1/2 that could handle it) and
> overhead to satisfy a need that nobody has.
>
> I'll delete that lru_add_drain_all() in my patch, okay?

Sure thing. :)


> (But in writing this, realize I still don't quite understand why
> the Unevictable count takes a second or two to get back to 0 after
> SHM_UNLOCK: perhaps I've more to discover.)

Interesting. I'm looking at this too.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] mm,mlock: drain pagevecs asynchronously
  2012-01-04  1:17             ` Minchan Kim
@ 2012-01-04  2:38               ` KOSAKI Motohiro
  2012-01-10  8:53                 ` Tao Ma
  0 siblings, 1 reply; 28+ messages in thread
From: KOSAKI Motohiro @ 2012-01-04  2:38 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-mm, linux-kernel, KOSAKI Motohiro, David Rientjes,
	Minchan Kim, Mel Gorman, Johannes Weiner, Andrew Morton


>> @@ -704,10 +747,23 @@ static void ____pagevec_lru_add_fn(struct page *page, void *arg)
>>   	VM_BUG_ON(PageLRU(page));
>>
>>   	SetPageLRU(page);
>> -	if (active)
>> -		SetPageActive(page);
>> -	update_page_reclaim_stat(zone, page, file, active);
>> -	add_page_to_lru_list(zone, page, lru);
>> + redo:
>> +	if (page_evictable(page, NULL)) {
>> +		if (active)
>> +			SetPageActive(page);
>> +		update_page_reclaim_stat(zone, page, file, active);
>> +		add_page_to_lru_list(zone, page, lru);
>> +	} else {
>> +		SetPageUnevictable(page);
>> +		add_page_to_lru_list(zone, page, LRU_UNEVICTABLE);
>> +		smp_mb();
>
> Why do we need barrier in here? Please comment it.

Is cut-and-pasting the comment from putback_lru_page() a good idea? :)

+               /*
+                * When racing with an mlock clearing (page is
+                * unlocked), make sure that if the other thread does
+                * not observe our setting of PG_lru and fails
+                * isolation, we see PG_mlocked cleared below and move
+                * the page back to the evictable list.
+                *
+                * The other side is TestClearPageMlocked().
+                */
+               smp_mb();



>> +		if (page_evictable(page, NULL)) {
>> +			del_page_from_lru_list(zone, page, LRU_UNEVICTABLE);
>> +			ClearPageUnevictable(page);
>> +			goto redo;
>> +		}
>> +	}
>
> I am not sure it's a good idea.
> mlock is very rare event but ____pagevec_lru_add_fn is called frequently.
> We are adding more overhead in ____pagevec_lru_add_fn.
> Is it valuable?

Dunno.

Personally, I think Tao's case is too artificial, and I haven't observed
any real world application doing such crazy repeated mlock/munlock. But
he said he has such an application.

If I remember correctly, ltp or some other test suite depends on the
current meminfo syncing behavior, so I'm afraid simply removing it would
bring us new annoying bug reports.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] mm,mlock: drain pagevecs asynchronously
  2012-01-01  7:30           ` [PATCH 1/2] mm,mlock: drain pagevecs asynchronously kosaki.motohiro
  2012-01-04  1:17             ` Minchan Kim
@ 2012-01-04  2:56             ` Hugh Dickins
  2012-01-04 22:05             ` Andrew Morton
  2 siblings, 0 replies; 28+ messages in thread
From: Hugh Dickins @ 2012-01-04  2:56 UTC (permalink / raw)
  To: kosaki.motohiro
  Cc: linux-mm, linux-kernel, KOSAKI Motohiro, David Rientjes,
	Minchan Kim, Mel Gorman, Johannes Weiner, Andrew Morton,
	Michel Lespinasse

On Sun, 1 Jan 2012, kosaki.motohiro@gmail.com wrote:
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> 
> Tao Ma reported current mlock is much slower than old 2.6.18 kernel. Because
> lru_add_drain_all() spent much time. The problem are two. 1) lru_add_drain_all()
> broadcast a worker thread to all cpus unconditionally. then, the performance
> penalty is increased in proportion to number of cpus. 2) lru_add_drain_all()
> wait the worker finished unnecessary. It makes bigger penalty.
> 
> This patch makes lru_add_drain_all_async() and changes mlock/mlockall use it.
> 
> Technical side note:
>  - has_pages_lru_pvecs() checks pagevecs locklessly. Of course, it's racy.
>    But it's no matter because asynchronous worker itself is also racy.
>    any lock can't close a race.
>  - Now, we drain pagevec at last of mlock instead of beginning. because
>    a page drain function (____pagevec_lru_add_fn) is PG_mlocked aware now.
>    Then it's safe and it close more race.
> 
> Without the patch:
> % time ./test_mlock -c 100000
> 
>  real   1m13.608s
>  user   0m0.204s
>  sys    0m40.115s
> 
>  i.e. 200usec per mlock
> 
> With the patch:
> % time ./test_mlock -c 100000
>  real    0m3.939s
>  user    0m0.060s
>  sys     0m3.868s
> 
>  i.e. 13usec per mlock

That's very nice; but it is an artificial test of the codepath you
have clearly speeded up, without any results for the codepaths you
have probably slowed down - I see Minchan is worried about that too.

> 
> test_mlock.c
> ==========================================
>  #include <stdio.h>
>  #include <stdlib.h>
>  #include <unistd.h>
>  #include <errno.h>
>  #include <time.h>
>  #include <sys/time.h>
>  #include <sys/mman.h>
> 
>  #define MM_SZ1 24
>  #define MM_SZ2 56
>  #define MM_SZ3 4168
> 
> void mlock_test()
> {
> 	char ptr1[MM_SZ1];
> 	char ptr2[MM_SZ2];
> 	char ptr3[MM_SZ3];
> 
> 	if(0 != mlock(ptr1, MM_SZ1) )
> 		perror("mlock MM_SZ1\n");
> 	if(0 != mlock(ptr2, MM_SZ2) )
> 		perror("mlock MM_SZ2\n");
> 	if(0 != mlock(ptr3, MM_SZ3) )
> 		perror("mlock MM_SZ3\n");
> 
> 	if(0 != munlock(ptr1, MM_SZ1) )
> 		perror("munlock MM_SZ1\n");
> 	if(0 != munlock(ptr2, MM_SZ2) )
> 		perror("munlock MM_SZ2\n");
> 	if(0 != munlock(ptr3, MM_SZ3) )
> 		perror("munlock MM_SZ3\n");
> }
> 
> int main(int argc, char *argv[])
> {
> 	int ret, opt;
> 	int i,cnt;
> 
> 	while((opt = getopt(argc, argv, "c:")) != -1 )
> 	{
> 		switch(opt){
> 		case 'c':
> 			cnt = atoi(optarg);
> 			break;
> 		default:
> 			printf("Usage: %s [-c count] arg...\n", argv[0]);
> 			exit(EXIT_FAILURE);
> 		}
> 	}
> 
> 	for(i = 0; i < cnt; i++)
> 		mlock_test();
> 
> 	return 0;
> }
> ===========================================
> 
> Reported-by: Tao Ma <boyu.mt@taobao.com>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Minchan Kim <minchan.kim@gmail.com>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Mel Gorman <mel@csn.ul.ie>
> Cc: Johannes Weiner <jweiner@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> ---
>  include/linux/swap.h |    1 +
>  mm/mlock.c           |    7 +----
>  mm/swap.c            |   64 ++++++++++++++++++++++++++++++++++++++++++++++---
>  3 files changed, 63 insertions(+), 9 deletions(-)
> 
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 1e22e12..11ad301 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -223,6 +223,7 @@ extern void activate_page(struct page *);
>  extern void mark_page_accessed(struct page *);
>  extern void lru_add_drain(void);
>  extern int lru_add_drain_all(void);
> +extern void lru_add_drain_all_async(void);
>  extern void rotate_reclaimable_page(struct page *page);
>  extern void deactivate_page(struct page *page);
>  extern void swap_setup(void);
> diff --git a/mm/mlock.c b/mm/mlock.c
> index 4f4f53b..08f5b6b 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -487,8 +487,6 @@ SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
>  	if (!can_do_mlock())
>  		return -EPERM;
>  
> -	lru_add_drain_all();	/* flush pagevec */
> -
>  	down_write(&current->mm->mmap_sem);
>  	len = PAGE_ALIGN(len + (start & ~PAGE_MASK));
>  	start &= PAGE_MASK;
> @@ -505,6 +503,7 @@ SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
>  	up_write(&current->mm->mmap_sem);
>  	if (!error)
>  		error = do_mlock_pages(start, len, 0);
> +	lru_add_drain_all_async();
>  	return error;
>  }
>  
> @@ -557,9 +556,6 @@ SYSCALL_DEFINE1(mlockall, int, flags)
>  	if (!can_do_mlock())
>  		goto out;
>  
> -	if (flags & MCL_CURRENT)
> -		lru_add_drain_all();	/* flush pagevec */
> -
>  	down_write(&current->mm->mmap_sem);
>  
>  	lock_limit = rlimit(RLIMIT_MEMLOCK);
> @@ -573,6 +569,7 @@ SYSCALL_DEFINE1(mlockall, int, flags)
>  	if (!ret && (flags & MCL_CURRENT)) {
>  		/* Ignore errors */
>  		do_mlock_pages(0, TASK_SIZE, 1);
> +		lru_add_drain_all_async();
>  	}
>  out:
>  	return ret;
> diff --git a/mm/swap.c b/mm/swap.c
> index a91caf7..2690f04 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -569,6 +569,49 @@ int lru_add_drain_all(void)
>  	return schedule_on_each_cpu(lru_add_drain_per_cpu);
>  }
>  
> +static DEFINE_PER_CPU(struct work_struct, lru_drain_work);
> +
> +static int __init lru_drain_work_init(void)
> +{
> +	struct work_struct *work;
> +	int cpu;
> +
> +	for_each_possible_cpu(cpu) {
> +		work = &per_cpu(lru_drain_work, cpu);
> +		INIT_WORK(work, &lru_add_drain_per_cpu);
> +	}
> +
> +	return 0;
> +}
> +core_initcall(lru_drain_work_init);
> +
> +static bool has_pages_lru_pvecs(int cpu)

I couldn't make sense of that name!  Though, to be fair, I never made
sense of "lru_add_drain" either: I always imagine a plumber adding a
drain.

pages_are_pending_lru_add(cpu)?

> +{
> +	struct pagevec *pvecs = per_cpu(lru_add_pvecs, cpu);
> +	struct pagevec *pvec;
> +	int lru;
> +
> +	for_each_lru(lru) {
> +		pvec = &pvecs[lru - LRU_BASE];

Hmm, yes, I see other such loops say "- LRU_BASE" too.
Good thing LRU_BASE is 0!  

> +		if (pagevec_count(pvec))
> +			return true;
> +	}
> +
> +	return false;
> +}
> +
> +void lru_add_drain_all_async(void)
> +{
> +	int cpu;
> +
> +	for_each_online_cpu(cpu) {
> +		struct work_struct *work = &per_cpu(lru_drain_work, cpu);
> +
> +		if (has_pages_lru_pvecs(cpu))
> +			schedule_work_on(cpu, work);
> +	}
> +}

You'll be peeking at a lot of remote cpus here, rather than sending them
all an IPI.  My guess is that it is more efficient your way, but that's
not as obvious as I'd like.

I haven't added Gilad Ben-Yossef to the Cc list, but be aware that he's
particularly interested in reducing IPIs, and has some patches on lkml.

I have added Michel Lespinasse to the Cc list, he's good to Cc on
mlock matters.  He thought that if the pages_are_pending check is
a win, then you should do it in lru_add_drain_all() too.  But more,
he wondered if we couldn't do a better job without all this, by
doing just lru_add_drain() first, and then doing the extra work
(of prodding other cpus) only if we come across a page which needs it.
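
Roughly, I imagine the idea being something like this (just a sketch of
the approach, not tested at all):

	lru_add_drain();	/* drain only the local cpu up front */
	...
	/* later, while faulting in / isolating the pages being mlocked */
	if (!PageLRU(page)) {
		/*
		 * The page is probably still sitting in some other
		 * cpu's pagevec: only now pay the broadcast cost.
		 */
		lru_add_drain_all();
		/* ...and retry the isolation */
	}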

> +
>  /*
>   * Batched page_cache_release().  Decrement the reference count on all the
>   * passed pages.  If it fell to zero then remove the page from the LRU and
> @@ -704,10 +747,23 @@ static void ____pagevec_lru_add_fn(struct page *page, void *arg)
>  	VM_BUG_ON(PageLRU(page));
>  
>  	SetPageLRU(page);
> -	if (active)
> -		SetPageActive(page);
> -	update_page_reclaim_stat(zone, page, file, active);
> -	add_page_to_lru_list(zone, page, lru);
> + redo:
> +	if (page_evictable(page, NULL)) {
> +		if (active)
> +			SetPageActive(page);
> +		update_page_reclaim_stat(zone, page, file, active);
> +		add_page_to_lru_list(zone, page, lru);
> +	} else {
> +		SetPageUnevictable(page);
> +		add_page_to_lru_list(zone, page, LRU_UNEVICTABLE);
> +		smp_mb();
> +
> +		if (page_evictable(page, NULL)) {
> +			del_page_from_lru_list(zone, page, LRU_UNEVICTABLE);
> +			ClearPageUnevictable(page);
> +			goto redo;
> +		}
> +	}
>  }

Right, this is where the overhead comes, in everybody's pagevec_lru_add_fn.

Now, I do like that you're resolving the un/evictable issue here, and
if the overhead really doesn't hurt, then this is a nice way to go.

But I think you should go further to reduce your overhead.  For one
thing, certainly take that silly almost-always-NULL vma arg out of
page_evictable, and test vma directly in the one or two places that
want it; maybe make page_evictable(page) an mm_inline function.

And update those various places (many or all in vmscan.c) which check
page_evictable() in order to avoid using a pagevec for unevictable
pages, because they know it doesn't work: you are now making it work.

Hah, I see my own putback_lru(or inactive)_pages patch is open to the
same criticism: if I'm now using a page_list there, it could handle the
unevictable pages directly instead of having to avoid them.  I think.

(I've never noticed before, it worries me now that page_evictable()
is referring to page mapping flags at a point when we generally have
no hold on the mapping at all; but I've never heard of that being a
problem, and I guess if the mapping is stale, the page will very soon
be freed, so no matter if it went on the Unevictable list for a moment.)

Hugh

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/2] sysvshm: SHM_LOCK use lru_add_drain_all_async()
  2012-01-04  2:19               ` KOSAKI Motohiro
@ 2012-01-04  5:17                 ` Hugh Dickins
  2012-01-04  8:34                   ` KOSAKI Motohiro
  0 siblings, 1 reply; 28+ messages in thread
From: Hugh Dickins @ 2012-01-04  5:17 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Michel Lespinasse

On Tue, 3 Jan 2012, KOSAKI Motohiro wrote:
> (1/3/12 8:51 PM), Hugh Dickins wrote:
> > 
> > In testing my fix for that, I find that there has been no attempt to
> > keep the Unevictable count accurate on SysVShm: SHM_LOCK pages get
> > marked unevictable lazily later as memory pressure discovers them -
> > which perhaps mirrors the way in which SHM_LOCK makes no attempt to
> > instantiate pages, unlike mlock.
> 
> Ugh, you are right. I'm recovering my remember gradually. Lee implemented
> immediate lru off logic at first and I killed it
> to close a race. I completely forgot. So, yes, now SHM_LOCK has no attempt to
> instantiate pages. I'm ashamed.

Why ashamed?  The shmctl man-page documents "The caller must fault in any
pages that are required to be present after locking is enabled."  That's
just how it behaves.

> > (But in writing this, realize I still don't quite understand why
> > the Unevictable count takes a second or two to get back to 0 after
> > SHM_UNLOCK: perhaps I've more to discover.)
> 
> Interesting. I'm looking at this too.

In case you got distracted before you found it, mm/vmstat.c's

static DEFINE_PER_CPU(struct delayed_work, vmstat_work);
int sysctl_stat_interval __read_mostly = HZ;

static void vmstat_update(struct work_struct *w)
{
	refresh_cpu_vm_stats(smp_processor_id());
	schedule_delayed_work(&__get_cpu_var(vmstat_work),
		round_jiffies_relative(sysctl_stat_interval));
}

would be why, I think.  And that implies to me that your
lru_add_drain_all_async() is not necessary, you'd get just as good
an effect, more cheaply, by doing a local lru_add_drain() before the
refresh in vmstat_update().
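
That is, something like this (sketch only, untested):

static void vmstat_update(struct work_struct *w)
{
	/* drain this cpu's pagevecs before the counters are refreshed */
	lru_add_drain();
	refresh_cpu_vm_stats(smp_processor_id());
	schedule_delayed_work(&__get_cpu_var(vmstat_work),
		round_jiffies_relative(sysctl_stat_interval));
}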

But it would still require your changes to ____pagevec_lru_add_fn(),
if those turn out to help more than they hurt.

Hugh

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/2] sysvshm: SHM_LOCK use lru_add_drain_all_async()
  2012-01-04  5:17                 ` Hugh Dickins
@ 2012-01-04  8:34                   ` KOSAKI Motohiro
  0 siblings, 0 replies; 28+ messages in thread
From: KOSAKI Motohiro @ 2012-01-04  8:34 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: linux-mm, linux-kernel, Michel Lespinasse

2012/1/4 Hugh Dickins <hughd@google.com>:
> On Tue, 3 Jan 2012, KOSAKI Motohiro wrote:
>> (1/3/12 8:51 PM), Hugh Dickins wrote:
>> >
>> > In testing my fix for that, I find that there has been no attempt to
>> > keep the Unevictable count accurate on SysVShm: SHM_LOCK pages get
>> > marked unevictable lazily later as memory pressure discovers them -
>> > which perhaps mirrors the way in which SHM_LOCK makes no attempt to
>> > instantiate pages, unlike mlock.
>>
>> Ugh, you are right. I'm recovering my remember gradually. Lee implemented
>> immediate lru off logic at first and I killed it
>> to close a race. I completely forgot. So, yes, now SHM_LOCK has no attempt to
>> instantiate pages. I'm ashamed.
>
> Why ashamed?  The shmctl man-page documents "The caller must fault in any
> pages that are required to be present after locking is enabled."  That's
> just how it behaves.

hehe, I have a bad reputation among my friends for my poor memory.
I should have remembered what I implemented. ;-)



>> > (But in writing this, realize I still don't quite understand why
>> > the Unevictable count takes a second or two to get back to 0 after
>> > SHM_UNLOCK: perhaps I've more to discover.)
>>
>> Interesting. I'm looking at this too.
>
> In case you got distracted before you found it, mm/vmstat.c's
>
> static DEFINE_PER_CPU(struct delayed_work, vmstat_work);
> int sysctl_stat_interval __read_mostly = HZ;
>
> static void vmstat_update(struct work_struct *w)
> {
>        refresh_cpu_vm_stats(smp_processor_id());
>        schedule_delayed_work(&__get_cpu_var(vmstat_work),
>                round_jiffies_relative(sysctl_stat_interval));
> }
>
> would be why, I think.  And that implies to me that your
> lru_add_drain_all_async() is not necessary, you'd get just as good
> an effect, more cheaply, by doing a local lru_add_drain() before the
> refresh in vmstat_update().

When I implemented lru_add_drain_all_async(), I considered this idea too.
I don't dislike either approach. But if we take the vmstat_update() one,
I think we need more tricks: the pcp draining in refresh_cpu_vm_stats()
is delayed up to 3 seconds. Why? round_jiffies_relative() doesn't naively
round to the HZ boundary; instead, it adds a small unique offset for each
cpu, so the 3 seconds mean up to 3000 cpus avoid creating zone {lru_}lock
contention at the same time. Pagevec draining would need the same trick
to rescue SGI UV. It might be a too pessimistic concern, but
vmstat_update() shouldn't create observable lock contention.


> But it would still require your changes to ____pagevec_lru_add_fn(),
> if those turn out to help more than they hurt.

I agree.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] mm,mlock: drain pagevecs asynchronously
  2012-01-01  7:30           ` [PATCH 1/2] mm,mlock: drain pagevecs asynchronously kosaki.motohiro
  2012-01-04  1:17             ` Minchan Kim
  2012-01-04  2:56             ` Hugh Dickins
@ 2012-01-04 22:05             ` Andrew Morton
  2012-01-04 23:33               ` KOSAKI Motohiro
  2 siblings, 1 reply; 28+ messages in thread
From: Andrew Morton @ 2012-01-04 22:05 UTC (permalink / raw)
  To: kosaki.motohiro
  Cc: linux-mm, linux-kernel, KOSAKI Motohiro, David Rientjes,
	Minchan Kim, Mel Gorman, Johannes Weiner

On Sun,  1 Jan 2012 02:30:24 -0500
kosaki.motohiro@gmail.com wrote:

> Because lru_add_drain_all() spent much time.

Those LRU pagevecs are horrid things.  They add high code and
conceptual complexity, they add pointless uniprocessor overhead and the
way in which they leave LRU pages floating around not on an LRU is
rather maddening.

So the best way to fix all of this as well as this problem we're
observing is, I hope, to completely remove them.

They've been in there for ~10 years and at the time they were quite
beneficial in reducing lru_lock contention, hold times, acquisition
frequency, etc.

The approach to take here is to prepare the patches which eliminate
lru_*_pvecs then identify the problems which occur as a result, via
code inspection and runtime testing.  Then fix those up.

Many sites which take lru_lock are already batching the operation. 
It's a matter of hunting down those sites which take the lock
once-per-page and, if they have high frequency, batch them up.

Converting readahead to batch the locking will be pretty simple
(read_pages(), mpage_readpages(), others).  That will fix pagefaults
too.  

rotate_reclaimable_page() can be batched by batching
end_page_writeback(): a bio contains many pages already.

deactivate_page() can be batched too - invalidate_mapping_pages() is
already working on large chunks of pages.

Those three cases are fairly simple - we just didn't try, because the
lru_*_pvecs were there to do the work for us.
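
As an illustration of the per-page vs. batched pattern (hypothetical helper,
not code from any posted patch; it assumes every page in the batch belongs to
the same zone and is already isolated and refcounted):

static void add_pages_to_lru_batched(struct zone *zone,
                                     struct page **pages, int nr_pages)
{
        int i;

        spin_lock_irq(&zone->lru_lock);         /* one acquisition per batch */
        for (i = 0; i < nr_pages; i++) {
                struct page *page = pages[i];

                SetPageLRU(page);
                add_page_to_lru_list(zone, page, page_lru(page));
        }
        spin_unlock_irq(&zone->lru_lock);       /* instead of once per page */
}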

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] mm,mlock: drain pagevecs asynchronously
  2012-01-04 22:05             ` Andrew Morton
@ 2012-01-04 23:33               ` KOSAKI Motohiro
  2012-01-05  0:19                 ` Hugh Dickins
  0 siblings, 1 reply; 28+ messages in thread
From: KOSAKI Motohiro @ 2012-01-04 23:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, KOSAKI Motohiro, David Rientjes,
	Minchan Kim, Mel Gorman, Johannes Weiner

(1/4/12 5:05 PM), Andrew Morton wrote:
> On Sun,  1 Jan 2012 02:30:24 -0500
> kosaki.motohiro@gmail.com wrote:
>
>> Because lru_add_drain_all() spent much time.
>
> Those LRU pagevecs are horrid things.  They add high code and
> conceptual complexity, they add pointless uniprocessor overhead and the
> way in which they leave LRU pages floating around not on an LRU is
> rather maddening.
>
> So the best way to fix all of this as well as this problem we're
> observing is, I hope, to completely remove them.
>
> They've been in there for ~10 years and at the time they were quite
> beneficial in reducing lru_lock contention, hold times, acquisition
> frequency, etc.
>
> The approach to take here is to prepare the patches which eliminate
> lru_*_pvecs then identify the problems which occur as a result, via
> code inspection and runtime testing.  Then fix those up.
>
> Many sites which take lru_lock are already batching the operation.
> It's a matter of hunting down those sites which take the lock
> once-per-page and, if they have high frequency, batch them up.
>
> Converting readahead to batch the locking will be pretty simple
> (read_pages(), mpage_readpages(), others).  That will fix pagefaults
> too.
>
> rotate_reclaimable_page() can be batched by batching
> end_page_writeback(): a bio contains many pages already.
>
> deactivate_page() can be batched too - invalidate_mapping_pages() is
> already working on large chunks of pages.
>
> Those three cases are fairly simple - we just didn't try, because the
> lru_*_pvecs were there to do the work for us.

Got it. So let's wait for the next spin of Hugh's "mm: take pagevecs off
reclaim stack" and make the patches on top of it.



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] mm,mlock: drain pagevecs asynchronously
  2012-01-04 23:33               ` KOSAKI Motohiro
@ 2012-01-05  0:19                 ` Hugh Dickins
  0 siblings, 0 replies; 28+ messages in thread
From: Hugh Dickins @ 2012-01-05  0:19 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Andrew Morton, linux-mm, linux-kernel, KOSAKI Motohiro,
	David Rientjes, Minchan Kim, Mel Gorman, Johannes Weiner

On Wed, 4 Jan 2012, KOSAKI Motohiro wrote:
> (1/4/12 5:05 PM), Andrew Morton wrote:
> > On Sun,  1 Jan 2012 02:30:24 -0500
> > kosaki.motohiro@gmail.com wrote:
> > 
> > > Because lru_add_drain_all() spent much time.
> > 
> > Those LRU pagevecs are horrid things.  They add high code and
> > conceptual complexity, they add pointless uniprocessor overhead and the
> > way in which they leave LRU pages floating around not on an LRU is
> > rather maddening.

Yes, we continue to have difficulties with not-quite-PageLRU-yet pages.

> > 
> > So the best way to fix all of this as well as this problem we're
> > observing is, I hope, to completely remove them.

Nice aim, sounds like a dirty job.  I wonder how far we could get using
lru_add_drain, avoiding lru_add_drain_all, and flushing the pvec when
preempted.

> ...
> 
> got it. so, let's wait hugh's "mm: take pagevecs off reclaim stack" next spin
> and make the patches on top of it.

Don't wait on me, I wasn't intending another spin, with Andrew's last
word on it today:

> If we already have the latency problem at the isolate_lru_pages() stage
> then I suppose we can assume that nobody is noticing it, so we'll
> probably be OK.
> 
> For a while.  Someone will complain at some stage and we'll probably
> end up busting this work into chunks.

and mm-commits does presently have my
mm-rearrange-putback-inactive-pages.patch in on top of it.

Besides, these free_hot_cold_page_list() users are already avoiding
the lru add pagevecs which Andrew is nudging towards removing above,
so there shouldn't be much overlap.

Or maybe you're thinking of my observation that it could avoid the
!page_evictable putback_lru_page special case now: yes, I'd like to
make that change sometime, but moved away to other things for now.

Hugh

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] mm: do not drain pagevecs for mlock
  2011-12-30 10:07         ` KOSAKI Motohiro
  2012-01-01  7:30           ` [PATCH 1/2] mm,mlock: drain pagevecs asynchronously kosaki.motohiro
  2012-01-01  7:30           ` [PATCH 2/2] sysvshm: SHM_LOCK use lru_add_drain_all_async() kosaki.motohiro
@ 2012-01-06  6:13           ` Tao Ma
  2012-01-06  6:18             ` KOSAKI Motohiro
  2012-01-09  7:25           ` Tao Ma
  3 siblings, 1 reply; 28+ messages in thread
From: Tao Ma @ 2012-01-06  6:13 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: linux-mm, linux-kernel, David Rientjes, Minchan Kim, Mel Gorman,
	Johannes Weiner, Andrew Morton

Hi Kosaki,
On 12/30/2011 06:07 PM, KOSAKI Motohiro wrote:
>>> Because your test program is too artificial. 20sec/100000times =
>>> 200usec. And your
>>> program repeat mlock and munlock the exact same address. so, yes, if
>>> lru_add_drain_all() is removed, it become near no-op. but it's
>>> worthless comparision.
>>> none of any practical program does such strange mlock usage.
>> yes, I should say it is artificial. But mlock did cause the problem in
>> our product system and perf shows that the mlock uses the system time
>> much more than others. That's the reason we created this program to test
>> whether mlock really sucks. And we compared the result with
>> rhel5(2.6.18) which runs much much faster.
>>
>> And from the commit log you described, we can remove lru_add_drain_all
>> safely here, so why add it? At least removing it makes mlock much faster
>> compared to the vanilla kernel.
> 
> If we remove it, we lose to a test way of mlock. "Memlocked" field of
> /proc/meminfo
> show inaccurate number very easily. So, if 200usec is no avoidable,
> I'll ack you.
> But I'm not convinced yet.
Have you found anything new on this?

Thanks
Tao

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] mm: do not drain pagevecs for mlock
  2012-01-06  6:13           ` [PATCH] mm: do not drain pagevecs for mlock Tao Ma
@ 2012-01-06  6:18             ` KOSAKI Motohiro
  2012-01-06  6:30               ` Tao Ma
  0 siblings, 1 reply; 28+ messages in thread
From: KOSAKI Motohiro @ 2012-01-06  6:18 UTC (permalink / raw)
  To: Tao Ma
  Cc: linux-mm, linux-kernel, David Rientjes, Minchan Kim, Mel Gorman,
	Johannes Weiner, Andrew Morton

2012/1/6 Tao Ma <tm@tao.ma>:
> Hi Kosaki,
> On 12/30/2011 06:07 PM, KOSAKI Motohiro wrote:
>>>> Because your test program is too artificial. 20sec/100000times =
>>>> 200usec. And your
>>>> program repeat mlock and munlock the exact same address. so, yes, if
>>>> lru_add_drain_all() is removed, it become near no-op. but it's
>>>> worthless comparision.
>>>> none of any practical program does such strange mlock usage.
>>> yes, I should say it is artificial. But mlock did cause the problem in
>>> our product system and perf shows that the mlock uses the system time
>>> much more than others. That's the reason we created this program to test
>>> whether mlock really sucks. And we compared the result with
>>> rhel5(2.6.18) which runs much much faster.
>>>
>>> And from the commit log you described, we can remove lru_add_drain_all
>>> safely here, so why add it? At least removing it makes mlock much faster
>>> compared to the vanilla kernel.
>>
>> If we remove it, we lose to a test way of mlock. "Memlocked" field of
>> /proc/meminfo
>> show inaccurate number very easily. So, if 200usec is no avoidable,
>> I'll ack you.
>> But I'm not convinced yet.
> Do you find something new for this?

No.

Or more exactly, 200usec was my calculation mistake: your program calls mlock
3 times per iteration (20sec / 100000 iterations = 200usec per iteration,
divided by 3 calls), so the correct cost is about 66usec per call.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] mm: do not drain pagevecs for mlock
  2012-01-06  6:18             ` KOSAKI Motohiro
@ 2012-01-06  6:30               ` Tao Ma
  2012-01-06  6:33                 ` KOSAKI Motohiro
  0 siblings, 1 reply; 28+ messages in thread
From: Tao Ma @ 2012-01-06  6:30 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: linux-mm, linux-kernel, David Rientjes, Minchan Kim, Mel Gorman,
	Johannes Weiner, Andrew Morton

On 01/06/2012 02:18 PM, KOSAKI Motohiro wrote:
> 2012/1/6 Tao Ma <tm@tao.ma>:
>> Hi Kosaki,
>> On 12/30/2011 06:07 PM, KOSAKI Motohiro wrote:
>>>>> Because your test program is too artificial. 20sec/100000times =
>>>>> 200usec. And your
>>>>> program repeat mlock and munlock the exact same address. so, yes, if
>>>>> lru_add_drain_all() is removed, it become near no-op. but it's
>>>>> worthless comparision.
>>>>> none of any practical program does such strange mlock usage.
>>>> yes, I should say it is artificial. But mlock did cause the problem in
>>>> our product system and perf shows that the mlock uses the system time
>>>> much more than others. That's the reason we created this program to test
>>>> whether mlock really sucks. And we compared the result with
>>>> rhel5(2.6.18) which runs much much faster.
>>>>
>>>> And from the commit log you described, we can remove lru_add_drain_all
>>>> safely here, so why add it? At least removing it makes mlock much faster
>>>> compared to the vanilla kernel.
>>>
>>> If we remove it, we lose to a test way of mlock. "Memlocked" field of
>>> /proc/meminfo
>>> show inaccurate number very easily. So, if 200usec is no avoidable,
>>> I'll ack you.
>>> But I'm not convinced yet.
>> Do you find something new for this?
> 
> No.
> 
> Or more exactly, 200usec is my calculation mistake. your program call mlock
> 3 times per each iteration. so, correct cost is 66usec.
yes, so mlock can only do about 15000 calls/s; that is even slower than a
whole I/O on a not-very-fast SSD, and I don't think that is acceptable. I
guess we should remove it, right? Or do you have any other suggestion that
I can try?

Thanks
Tao

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] mm: do not drain pagevecs for mlock
  2012-01-06  6:30               ` Tao Ma
@ 2012-01-06  6:33                 ` KOSAKI Motohiro
  2012-01-06  6:46                   ` Tao Ma
  0 siblings, 1 reply; 28+ messages in thread
From: KOSAKI Motohiro @ 2012-01-06  6:33 UTC (permalink / raw)
  To: Tao Ma
  Cc: KOSAKI Motohiro, linux-mm, linux-kernel, David Rientjes,
	Minchan Kim, Mel Gorman, Johannes Weiner, Andrew Morton

(1/6/12 1:30 AM), Tao Ma wrote:
> On 01/06/2012 02:18 PM, KOSAKI Motohiro wrote:
>> 2012/1/6 Tao Ma<tm@tao.ma>:
>>> Hi Kosaki,
>>> On 12/30/2011 06:07 PM, KOSAKI Motohiro wrote:
>>>>>> Because your test program is too artificial. 20sec/100000times =
>>>>>> 200usec. And your
>>>>>> program repeat mlock and munlock the exact same address. so, yes, if
>>>>>> lru_add_drain_all() is removed, it become near no-op. but it's
>>>>>> worthless comparision.
>>>>>> none of any practical program does such strange mlock usage.
>>>>> yes, I should say it is artificial. But mlock did cause the problem in
>>>>> our product system and perf shows that the mlock uses the system time
>>>>> much more than others. That's the reason we created this program to test
>>>>> whether mlock really sucks. And we compared the result with
>>>>> rhel5(2.6.18) which runs much much faster.
>>>>>
>>>>> And from the commit log you described, we can remove lru_add_drain_all
>>>>> safely here, so why add it? At least removing it makes mlock much faster
>>>>> compared to the vanilla kernel.
>>>>
>>>> If we remove it, we lose to a test way of mlock. "Memlocked" field of
>>>> /proc/meminfo
>>>> show inaccurate number very easily. So, if 200usec is no avoidable,
>>>> I'll ack you.
>>>> But I'm not convinced yet.
>>> Do you find something new for this?
>>
>> No.
>>
>> Or more exactly, 200usec is my calculation mistake. your program call mlock
>> 3 times per each iteration. so, correct cost is 66usec.
> yes, so mlock can do 15000/s, it is even slower than the whole i/o time
> for some not very fast ssd disk and I don't think it is endurable. I
> guess we should remove it, right? Or you have another other suggestion
> that I can try for it?

read the whole thread.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] mm: do not drain pagevecs for mlock
  2012-01-06  6:33                 ` KOSAKI Motohiro
@ 2012-01-06  6:46                   ` Tao Ma
  2012-01-09 23:58                     ` KOSAKI Motohiro
  0 siblings, 1 reply; 28+ messages in thread
From: Tao Ma @ 2012-01-06  6:46 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: KOSAKI Motohiro, linux-mm, linux-kernel, David Rientjes,
	Minchan Kim, Mel Gorman, Johannes Weiner, Andrew Morton

On 01/06/2012 02:33 PM, KOSAKI Motohiro wrote:
> (1/6/12 1:30 AM), Tao Ma wrote:
>> On 01/06/2012 02:18 PM, KOSAKI Motohiro wrote:
>>> 2012/1/6 Tao Ma<tm@tao.ma>:
>>>> Hi Kosaki,
>>>> On 12/30/2011 06:07 PM, KOSAKI Motohiro wrote:
>>>>>>> Because your test program is too artificial. 20sec/100000times =
>>>>>>> 200usec. And your
>>>>>>> program repeat mlock and munlock the exact same address. so, yes, if
>>>>>>> lru_add_drain_all() is removed, it become near no-op. but it's
>>>>>>> worthless comparision.
>>>>>>> none of any practical program does such strange mlock usage.
>>>>>> yes, I should say it is artificial. But mlock did cause the
>>>>>> problem in
>>>>>> our product system and perf shows that the mlock uses the system time
>>>>>> much more than others. That's the reason we created this program
>>>>>> to test
>>>>>> whether mlock really sucks. And we compared the result with
>>>>>> rhel5(2.6.18) which runs much much faster.
>>>>>>
>>>>>> And from the commit log you described, we can remove
>>>>>> lru_add_drain_all
>>>>>> safely here, so why add it? At least removing it makes mlock much
>>>>>> faster
>>>>>> compared to the vanilla kernel.
>>>>>
>>>>> If we remove it, we lose to a test way of mlock. "Memlocked" field of
>>>>> /proc/meminfo
>>>>> show inaccurate number very easily. So, if 200usec is no avoidable,
>>>>> I'll ack you.
>>>>> But I'm not convinced yet.
>>>> Do you find something new for this?
>>>
>>> No.
>>>
>>> Or more exactly, 200usec is my calculation mistake. your program call
>>> mlock
>>> 3 times per each iteration. so, correct cost is 66usec.
>> yes, so mlock can do 15000/s, it is even slower than the whole i/o time
>> for some not very fast ssd disk and I don't think it is endurable. I
>> guess we should remove it, right? Or you have another other suggestion
>> that I can try for it?
> 
> read whole thread.
I have read the whole thread, and you have only described the test case as
artificial; there is no suggestion or patch for how to resolve it. As I said,
it is very time-consuming, and the more cpu cores, the bigger the penalty; a
single I/O on an SSD can be faster than it. So do you think 66 usec is OK for
a memory operation?

Thanks
Tao

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] mm: do not drain pagevecs for mlock
  2011-12-30 10:07         ` KOSAKI Motohiro
                             ` (2 preceding siblings ...)
  2012-01-06  6:13           ` [PATCH] mm: do not drain pagevecs for mlock Tao Ma
@ 2012-01-09  7:25           ` Tao Ma
  3 siblings, 0 replies; 28+ messages in thread
From: Tao Ma @ 2012-01-09  7:25 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: linux-mm, linux-kernel, David Rientjes, Minchan Kim, Mel Gorman,
	Johannes Weiner, Andrew Morton

Hi KOSAKI,
On 12/30/2011 06:07 PM, KOSAKI Motohiro wrote:
>>> Because your test program is too artificial. 20sec/100000times =
>>> 200usec. And your
>>> program repeat mlock and munlock the exact same address. so, yes, if
>>> lru_add_drain_all() is removed, it become near no-op. but it's
>>> worthless comparision.
>>> none of any practical program does such strange mlock usage.
>> yes, I should say it is artificial. But mlock did cause the problem in
>> our product system and perf shows that the mlock uses the system time
>> much more than others. That's the reason we created this program to test
>> whether mlock really sucks. And we compared the result with
>> rhel5(2.6.18) which runs much much faster.
>>
>> And from the commit log you described, we can remove lru_add_drain_all
>> safely here, so why add it? At least removing it makes mlock much faster
>> compared to the vanilla kernel.
> 
> If we remove it, we lose to a test way of mlock. "Memlocked" field of
> /proc/meminfo
> show inaccurate number very easily. So, if 200usec is no avoidable,
> I'll ack you.
> But I'm not convinced yet.
As you don't think removing lru_add_drain_all is appropriate, I have
created another patch set to resolve it. I add a new per-cpu counter that
records the number of pages sitting in that cpu's pagevecs, so if the
counter is 0 we don't drain the corresponding cpu (rough sketch below).
Does it make sense to you?
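
A rough sketch of the idea (hypothetical names, not the actual patch set;
the counter would be bumped wherever pages are added to the per-cpu lru
pagevecs and reset when they are drained):

static DEFINE_PER_CPU(int, pagevec_count);

static void lru_add_drain_all_lazy(void)
{
        int cpu;

        /* Only poke cpus whose pagevecs actually hold pages. */
        for_each_online_cpu(cpu) {
                if (per_cpu(pagevec_count, cpu))
                        schedule_work_on(cpu, &per_cpu(lru_drain_work, cpu));
        }
        /* ...then flush the scheduled works, as schedule_on_each_cpu() does. */
}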

Thanks
Tao

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] mm: do not drain pagevecs for mlock
  2012-01-06  6:46                   ` Tao Ma
@ 2012-01-09 23:58                     ` KOSAKI Motohiro
  2012-01-10  2:08                       ` Tao Ma
  0 siblings, 1 reply; 28+ messages in thread
From: KOSAKI Motohiro @ 2012-01-09 23:58 UTC (permalink / raw)
  To: Tao Ma
  Cc: KOSAKI Motohiro, linux-mm, linux-kernel, David Rientjes,
	Minchan Kim, Mel Gorman, Johannes Weiner, Andrew Morton

(1/6/12 1:46 AM), Tao Ma wrote:
> On 01/06/2012 02:33 PM, KOSAKI Motohiro wrote:
>> (1/6/12 1:30 AM), Tao Ma wrote:
>>> On 01/06/2012 02:18 PM, KOSAKI Motohiro wrote:
>>>> 2012/1/6 Tao Ma<tm@tao.ma>:
>>>>> Hi Kosaki,
>>>>> On 12/30/2011 06:07 PM, KOSAKI Motohiro wrote:
>>>>>>>> Because your test program is too artificial. 20sec/100000times =
>>>>>>>> 200usec. And your
>>>>>>>> program repeat mlock and munlock the exact same address. so, yes, if
>>>>>>>> lru_add_drain_all() is removed, it become near no-op. but it's
>>>>>>>> worthless comparision.
>>>>>>>> none of any practical program does such strange mlock usage.
>>>>>>> yes, I should say it is artificial. But mlock did cause the
>>>>>>> problem in
>>>>>>> our product system and perf shows that the mlock uses the system time
>>>>>>> much more than others. That's the reason we created this program
>>>>>>> to test
>>>>>>> whether mlock really sucks. And we compared the result with
>>>>>>> rhel5(2.6.18) which runs much much faster.
>>>>>>>
>>>>>>> And from the commit log you described, we can remove
>>>>>>> lru_add_drain_all
>>>>>>> safely here, so why add it? At least removing it makes mlock much
>>>>>>> faster
>>>>>>> compared to the vanilla kernel.
>>>>>>
>>>>>> If we remove it, we lose to a test way of mlock. "Memlocked" field of
>>>>>> /proc/meminfo
>>>>>> show inaccurate number very easily. So, if 200usec is no avoidable,
>>>>>> I'll ack you.
>>>>>> But I'm not convinced yet.
>>>>> Do you find something new for this?
>>>>
>>>> No.
>>>>
>>>> Or more exactly, 200usec is my calculation mistake. your program call
>>>> mlock
>>>> 3 times per each iteration. so, correct cost is 66usec.
>>> yes, so mlock can do 15000/s, it is even slower than the whole i/o time
>>> for some not very fast ssd disk and I don't think it is endurable. I
>>> guess we should remove it, right? Or you have another other suggestion
>>> that I can try for it?
>>
>> read whole thread.
> I have read the whole thread, and you just described that the test case
> is artificial and there is no suggestion or patch about how to resolve
> it. As I have said that it is very time-consuming and with more cpu
> cores, the more penalty, and an i/o time for a ssd can be faster than
> it. So do you think 66 usec is OK for a memory operation?

I don't think you've read the thread at all. Please read akpm's comment:

http://www.spinics.net/lists/linux-mm/msg28290.html





^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] mm: do not drain pagevecs for mlock
  2012-01-09 23:58                     ` KOSAKI Motohiro
@ 2012-01-10  2:08                       ` Tao Ma
  0 siblings, 0 replies; 28+ messages in thread
From: Tao Ma @ 2012-01-10  2:08 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: KOSAKI Motohiro, linux-mm, linux-kernel, David Rientjes,
	Minchan Kim, Mel Gorman, Johannes Weiner, Andrew Morton

On 01/10/2012 07:58 AM, KOSAKI Motohiro wrote:
> (1/6/12 1:46 AM), Tao Ma wrote:
>> On 01/06/2012 02:33 PM, KOSAKI Motohiro wrote:
>>> (1/6/12 1:30 AM), Tao Ma wrote:
>>>> On 01/06/2012 02:18 PM, KOSAKI Motohiro wrote:
>>>>> 2012/1/6 Tao Ma<tm@tao.ma>:
>>>>>> Hi Kosaki,
>>>>>> On 12/30/2011 06:07 PM, KOSAKI Motohiro wrote:
>>>>>>>>> Because your test program is too artificial. 20sec/100000times =
>>>>>>>>> 200usec. And your
>>>>>>>>> program repeat mlock and munlock the exact same address. so,
>>>>>>>>> yes, if
>>>>>>>>> lru_add_drain_all() is removed, it become near no-op. but it's
>>>>>>>>> worthless comparision.
>>>>>>>>> none of any practical program does such strange mlock usage.
>>>>>>>> yes, I should say it is artificial. But mlock did cause the
>>>>>>>> problem in
>>>>>>>> our product system and perf shows that the mlock uses the system
>>>>>>>> time
>>>>>>>> much more than others. That's the reason we created this program
>>>>>>>> to test
>>>>>>>> whether mlock really sucks. And we compared the result with
>>>>>>>> rhel5(2.6.18) which runs much much faster.
>>>>>>>>
>>>>>>>> And from the commit log you described, we can remove
>>>>>>>> lru_add_drain_all
>>>>>>>> safely here, so why add it? At least removing it makes mlock much
>>>>>>>> faster
>>>>>>>> compared to the vanilla kernel.
>>>>>>>
>>>>>>> If we remove it, we lose to a test way of mlock. "Memlocked"
>>>>>>> field of
>>>>>>> /proc/meminfo
>>>>>>> show inaccurate number very easily. So, if 200usec is no avoidable,
>>>>>>> I'll ack you.
>>>>>>> But I'm not convinced yet.
>>>>>> Do you find something new for this?
>>>>>
>>>>> No.
>>>>>
>>>>> Or more exactly, 200usec is my calculation mistake. your program call
>>>>> mlock
>>>>> 3 times per each iteration. so, correct cost is 66usec.
>>>> yes, so mlock can do 15000/s, it is even slower than the whole i/o time
>>>> for some not very fast ssd disk and I don't think it is endurable. I
>>>> guess we should remove it, right? Or you have another other suggestion
>>>> that I can try for it?
>>>
>>> read whole thread.
>> I have read the whole thread, and you just described that the test case
>> is artificial and there is no suggestion or patch about how to resolve
>> it. As I have said that it is very time-consuming and with more cpu
>> cores, the more penalty, and an i/o time for a ssd can be faster than
>> it. So do you think 66 usec is OK for a memory operation?
> 
> I don't think you've read the thread at all. please read akpm's commnet.
> 
> http://www.spinics.net/lists/linux-mm/msg28290.html
Oh, your patch set wasn't cc'ed to me, so my mail filter moved it to
another directory.
Sorry, I will read the whole thread. Thanks again for your time.

Thanks
Tao

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/2] mm,mlock: drain pagevecs asynchronously
  2012-01-04  2:38               ` KOSAKI Motohiro
@ 2012-01-10  8:53                 ` Tao Ma
  0 siblings, 0 replies; 28+ messages in thread
From: Tao Ma @ 2012-01-10  8:53 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Minchan Kim, linux-mm, linux-kernel, KOSAKI Motohiro,
	David Rientjes, Minchan Kim, Mel Gorman, Johannes Weiner,
	Andrew Morton

Hi KOSAKI,
On 01/04/2012 10:38 AM, KOSAKI Motohiro wrote:
> 
>>> @@ -704,10 +747,23 @@ static void ____pagevec_lru_add_fn(struct page
>>> *page, void *arg)
>>>       VM_BUG_ON(PageLRU(page));
>>>
>>>       SetPageLRU(page);
>>> -    if (active)
>>> -        SetPageActive(page);
>>> -    update_page_reclaim_stat(zone, page, file, active);
>>> -    add_page_to_lru_list(zone, page, lru);
>>> + redo:
>>> +    if (page_evictable(page, NULL)) {
>>> +        if (active)
>>> +            SetPageActive(page);
>>> +        update_page_reclaim_stat(zone, page, file, active);
>>> +        add_page_to_lru_list(zone, page, lru);
>>> +    } else {
>>> +        SetPageUnevictable(page);
>>> +        add_page_to_lru_list(zone, page, LRU_UNEVICTABLE);
>>> +        smp_mb();
>>
>> Why do we need barrier in here? Please comment it.
> 
> To cut-n-paste a comment from putback_lru_page() is good idea? :)
> 
> +               /*
> +                * When racing with an mlock clearing (page is
> +                * unlocked), make sure that if the other thread does
> +                * not observe our setting of PG_lru and fails
> +                * isolation, we see PG_mlocked cleared below and move
> +                * the page back to the evictable list.
> +                *
> +                * The other side is TestClearPageMlocked().
> +                */
> +               smp_mb();
> 
> 
> 
>>> +        if (page_evictable(page, NULL)) {
>>> +            del_page_from_lru_list(zone, page, LRU_UNEVICTABLE);
>>> +            ClearPageUnevictable(page);
>>> +            goto redo;
>>> +        }
>>> +    }
>>
>> I am not sure it's a good idea.
>> mlock is very rare event but ____pagevec_lru_add_fn is called frequently.
>> We are adding more overhead in ____pagevec_lru_add_fn.
>> Is it valuable?
> 
> dunno.
> 
> Personally, I think tao's case is too artificial and I haven't observed
> any real world application do such crazy mlock/munlock repeatness. But
> he said he has a such application.
ok, I will talk more about our application here.
It is the backend of a PHP service. For every user request we have to
call libmcrypt (http://sourceforge.net/projects/mcrypt/) several times to
encrypt some information, and libmcrypt uses mlock/munlock. Since a
server finishes many requests in one second, the total mlock/munlock
count adds up to around 2000, and that really matters for us.
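
Roughly, each of those encryption calls does something like this
(illustrative userspace sketch of the pattern, not libmcrypt's actual code):

        #include <sys/mman.h>

        unsigned char key[32];                  /* hypothetical key buffer */

        mlock(key, sizeof(key));                /* pin the key material */
        /* ... derive the key and encrypt the request data ... */
        munlock(key, sizeof(key));              /* unpin it when done */

so every one of those calls goes through the lru_add_drain_all() in mlock().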
> 
> If my remember is correct, ltp or some test suite depend on current
> meminfo synching behavior. then I'm afraid simple removing bring us
> new annoying bug report.
So this is the only side effect of removing the lru_add_drain_all from
mlock/mlockall, right? Are there any other known issues?

I have read Andrew's comment, and if we have decided to remove all this
lru_* stuff, it seems we have a long way to go before this issue can be
completely resolved. So I will remove it from our production kernel first
and wait for your final cleanup. Many thanks for your time and kind help.

Thanks
Tao

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2012-01-10  8:53 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-12-30  6:36 [PATCH] mm: do not drain pagevecs for mlock Tao Ma
2011-12-30  8:11 ` KOSAKI Motohiro
2011-12-30  8:48   ` Tao Ma
2011-12-30  9:31     ` KOSAKI Motohiro
2011-12-30  9:45       ` Tao Ma
2011-12-30 10:07         ` KOSAKI Motohiro
2012-01-01  7:30           ` [PATCH 1/2] mm,mlock: drain pagevecs asynchronously kosaki.motohiro
2012-01-04  1:17             ` Minchan Kim
2012-01-04  2:38               ` KOSAKI Motohiro
2012-01-10  8:53                 ` Tao Ma
2012-01-04  2:56             ` Hugh Dickins
2012-01-04 22:05             ` Andrew Morton
2012-01-04 23:33               ` KOSAKI Motohiro
2012-01-05  0:19                 ` Hugh Dickins
2012-01-01  7:30           ` [PATCH 2/2] sysvshm: SHM_LOCK use lru_add_drain_all_async() kosaki.motohiro
2012-01-04  1:51             ` Hugh Dickins
2012-01-04  2:19               ` KOSAKI Motohiro
2012-01-04  5:17                 ` Hugh Dickins
2012-01-04  8:34                   ` KOSAKI Motohiro
2012-01-06  6:13           ` [PATCH] mm: do not drain pagevecs for mlock Tao Ma
2012-01-06  6:18             ` KOSAKI Motohiro
2012-01-06  6:30               ` Tao Ma
2012-01-06  6:33                 ` KOSAKI Motohiro
2012-01-06  6:46                   ` Tao Ma
2012-01-09 23:58                     ` KOSAKI Motohiro
2012-01-10  2:08                       ` Tao Ma
2012-01-09  7:25           ` Tao Ma
2011-12-30 10:14         ` KOSAKI Motohiro
