linux-kernel.vger.kernel.org archive mirror
* [PATCH] mm: mlock: remove lru_add_drain_all()
@ 2017-10-18 23:17 Shakeel Butt
  2017-10-19  3:18 ` Balbir Singh
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Shakeel Butt @ 2017-10-18 23:17 UTC (permalink / raw)
  To: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Michal Hocko,
	Joonsoo Kim, Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen,
	Hugh Dickins
  Cc: linux-mm, linux-kernel, Shakeel Butt

Recently we observed high mlock() latency in our generic library and
noticed that users had started using tmpfs files even without swap;
the latency turned out to be due to expensive remote LRU cache
draining.

Is lru_add_drain_all() required by mlock()? The answer is no, and the
reason it is still in mlock() is to rapidly move mlocked pages to the
unevictable LRU. Without lru_add_drain_all(), mlocked pages which
were on a pagevec at mlock() time will be moved to the evictable LRUs
but will eventually be moved back to the unevictable LRU by reclaim.
So, we can safely remove lru_add_drain_all() from mlock(). Also, there
is no need for a local lru_add_drain() as it will be called deep
inside __mm_populate() (in follow_page_pte()).

Signed-off-by: Shakeel Butt <shakeelb@google.com>
---
 mm/mlock.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/mm/mlock.c b/mm/mlock.c
index dfc6f1912176..3ceb2935d1e0 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -669,8 +669,6 @@ static __must_check int do_mlock(unsigned long start, size_t len, vm_flags_t fla
 	if (!can_do_mlock())
 		return -EPERM;
 
-	lru_add_drain_all();	/* flush pagevec */
-
 	len = PAGE_ALIGN(len + (offset_in_page(start)));
 	start &= PAGE_MASK;
 
@@ -797,9 +795,6 @@ SYSCALL_DEFINE1(mlockall, int, flags)
 	if (!can_do_mlock())
 		return -EPERM;
 
-	if (flags & MCL_CURRENT)
-		lru_add_drain_all();	/* flush pagevec */
-
 	lock_limit = rlimit(RLIMIT_MEMLOCK);
 	lock_limit >>= PAGE_SHIFT;
 
-- 
2.15.0.rc1.287.g2b38de12cc-goog

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] mm: mlock: remove lru_add_drain_all()
  2017-10-18 23:17 [PATCH] mm: mlock: remove lru_add_drain_all() Shakeel Butt
@ 2017-10-19  3:18 ` Balbir Singh
  2017-10-19 20:12   ` Shakeel Butt
  2017-10-19  6:24 ` Anshuman Khandual
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Balbir Singh @ 2017-10-19  3:18 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Michal Hocko,
	Joonsoo Kim, Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen,
	Hugh Dickins, linux-mm, linux-kernel

On Wed, 18 Oct 2017 16:17:30 -0700
Shakeel Butt <shakeelb@google.com> wrote:

> Recently we have observed high latency in mlock() in our generic
> library and noticed that users have started using tmpfs files even
> without swap and the latency was due to expensive remote LRU cache
> draining.
> 
> Is lru_add_drain_all() required by mlock()? The answer is no and the
> reason it is still in mlock() is to rapidly move mlocked pages to
> unevictable LRU. Without lru_add_drain_all() the mlocked pages which
> were on pagevec at mlock() time will be moved to evictable LRUs but
> will eventually be moved back to unevictable LRU by reclaim. So, we
> can safely remove lru_add_drain_all() from mlock(). Also there is no
> need for local lru_add_drain() as it will be called deep inside
> __mm_populate() (in follow_page_pte()).
> 
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> ---

Does this perturb statistics around LRU pages in cgroups and meminfo
about where the pages actually belong?

Balbir Singh.


* Re: [PATCH] mm: mlock: remove lru_add_drain_all()
  2017-10-18 23:17 [PATCH] mm: mlock: remove lru_add_drain_all() Shakeel Butt
  2017-10-19  3:18 ` Balbir Singh
@ 2017-10-19  6:24 ` Anshuman Khandual
  2017-10-19 19:19   ` Shakeel Butt
  2017-10-19 10:18 ` Kirill A. Shutemov
  2017-10-19 12:32 ` Michal Hocko
  3 siblings, 1 reply; 14+ messages in thread
From: Anshuman Khandual @ 2017-10-19  6:24 UTC (permalink / raw)
  To: Shakeel Butt, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Michal Hocko, Joonsoo Kim, Minchan Kim, Yisheng Xie, Ingo Molnar,
	Greg Thelen, Hugh Dickins
  Cc: linux-mm, linux-kernel

On 10/19/2017 04:47 AM, Shakeel Butt wrote:
> Recently we have observed high latency in mlock() in our generic
> library and noticed that users have started using tmpfs files even
> without swap and the latency was due to expensive remote LRU cache
> draining.

With and without this patch I don't see much difference in the number
of instructions executed in the kernel for the mlock() system call on
a POWER8 platform just after reboot (though all the pagevecs might
not have been filled by then). There is an improvement, but it is
very small.

Could you share your latency numbers and explain how this patch
improves them?

> 
> Is lru_add_drain_all() required by mlock()? The answer is no and the
> reason it is still in mlock() is to rapidly move mlocked pages to
> unevictable LRU. Without lru_add_drain_all() the mlocked pages which
> were on pagevec at mlock() time will be moved to evictable LRUs but
> will eventually be moved back to unevictable LRU by reclaim. So, we

Won't this affect performance during reclaim?

> can safely remove lru_add_drain_all() from mlock(). Also there is no
> need for local lru_add_drain() as it will be called deep inside
> __mm_populate() (in follow_page_pte()).

The following commit which originally added lru_add_drain_all()
during mlock() and mlockall() has similar explanation.

8891d6da ("mm: remove lru_add_drain_all() from the munlock path")

"In addition, this patch add lru_add_drain_all() to sys_mlock()
and sys_mlockall().  it isn't must.  but it reduce the failure
of moving to unevictable list.  its failure can rescue in
vmscan later.  but reducing is better."

Which sounds like either we have to handle the active-to-inactive
LRU movement during reclaim, or it can be done here to speed up
reclaim later on.


* Re: [PATCH] mm: mlock: remove lru_add_drain_all()
  2017-10-18 23:17 [PATCH] mm: mlock: remove lru_add_drain_all() Shakeel Butt
  2017-10-19  3:18 ` Balbir Singh
  2017-10-19  6:24 ` Anshuman Khandual
@ 2017-10-19 10:18 ` Kirill A. Shutemov
  2017-10-19 19:19   ` Shakeel Butt
  2017-10-19 12:32 ` Michal Hocko
  3 siblings, 1 reply; 14+ messages in thread
From: Kirill A. Shutemov @ 2017-10-19 10:18 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Michal Hocko,
	Joonsoo Kim, Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen,
	Hugh Dickins, linux-mm, linux-kernel

On Wed, Oct 18, 2017 at 04:17:30PM -0700, Shakeel Butt wrote:
> Recently we have observed high latency in mlock() in our generic
> library and noticed that users have started using tmpfs files even
> without swap and the latency was due to expensive remote LRU cache
> draining.

Hm. Isn't the point of mlock() to pay the price upfront and make
execution smoother afterwards?

With this you shift latency onto reclaim (and future memory allocation).

I'm not sure if it's a win.

-- 
 Kirill A. Shutemov


* Re: [PATCH] mm: mlock: remove lru_add_drain_all()
  2017-10-18 23:17 [PATCH] mm: mlock: remove lru_add_drain_all() Shakeel Butt
                   ` (2 preceding siblings ...)
  2017-10-19 10:18 ` Kirill A. Shutemov
@ 2017-10-19 12:32 ` Michal Hocko
  2017-10-19 19:19   ` Shakeel Butt
  3 siblings, 1 reply; 14+ messages in thread
From: Michal Hocko @ 2017-10-19 12:32 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Joonsoo Kim,
	Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen, Hugh Dickins,
	linux-mm, linux-kernel

On Wed 18-10-17 16:17:30, Shakeel Butt wrote:
> Recently we have observed high latency in mlock() in our generic
> library and noticed that users have started using tmpfs files even
> without swap and the latency was due to expensive remote LRU cache
> draining.

some numbers would be really nice
 
> Is lru_add_drain_all() required by mlock()? The answer is no and the
> reason it is still in mlock() is to rapidly move mlocked pages to
> unevictable LRU.

Is this really true? lru_add_drain_all will flush the previously cached
LRU pages. We are not flushing after the pages have been faulted in so
this might not do anything wrt. mlocked pages, right?

> Without lru_add_drain_all() the mlocked pages which
> were on pagevec at mlock() time will be moved to evictable LRUs but
> will eventually be moved back to unevictable LRU by reclaim. So, we
> can safely remove lru_add_drain_all() from mlock(). Also there is no
> need for local lru_add_drain() as it will be called deep inside
> __mm_populate() (in follow_page_pte()).

Anyway, I do agree that lru_add_drain_all here is pointless. Either we
should drain after the memory has been faulted in and mlocked or not at
all. So the patch looks good to me I am just not sure about the
changelog.
 
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> ---
>  mm/mlock.c | 5 -----
>  1 file changed, 5 deletions(-)
> 
> diff --git a/mm/mlock.c b/mm/mlock.c
> index dfc6f1912176..3ceb2935d1e0 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -669,8 +669,6 @@ static __must_check int do_mlock(unsigned long start, size_t len, vm_flags_t fla
>  	if (!can_do_mlock())
>  		return -EPERM;
>  
> -	lru_add_drain_all();	/* flush pagevec */
> -
>  	len = PAGE_ALIGN(len + (offset_in_page(start)));
>  	start &= PAGE_MASK;
>  
> @@ -797,9 +795,6 @@ SYSCALL_DEFINE1(mlockall, int, flags)
>  	if (!can_do_mlock())
>  		return -EPERM;
>  
> -	if (flags & MCL_CURRENT)
> -		lru_add_drain_all();	/* flush pagevec */
> -
>  	lock_limit = rlimit(RLIMIT_MEMLOCK);
>  	lock_limit >>= PAGE_SHIFT;
>  
> -- 
> 2.15.0.rc1.287.g2b38de12cc-goog
> 

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] mm: mlock: remove lru_add_drain_all()
  2017-10-19  6:24 ` Anshuman Khandual
@ 2017-10-19 19:19   ` Shakeel Butt
  0 siblings, 0 replies; 14+ messages in thread
From: Shakeel Butt @ 2017-10-19 19:19 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Michal Hocko,
	Joonsoo Kim, Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen,
	Hugh Dickins, Linux MM, LKML

On Wed, Oct 18, 2017 at 11:24 PM, Anshuman Khandual
<khandual@linux.vnet.ibm.com> wrote:
> On 10/19/2017 04:47 AM, Shakeel Butt wrote:
>> Recently we have observed high latency in mlock() in our generic
>> library and noticed that users have started using tmpfs files even
>> without swap and the latency was due to expensive remote LRU cache
>> draining.
>
> With and without this I patch I dont see much difference in number
> of instructions executed in the kernel for mlock() system call on
> POWER8 platform just after reboot (all the pagevecs might not been
> filled by then though). There is an improvement but its very less.
>
> Could you share your latency numbers and how this patch is making
> them better.
>

The latency is very dependent on the workload and the number of cores
on the machine. On a production workload, customers were complaining
that a single mlock() call was taking around 10 seconds on tmpfs
files which were already in memory.

>>
>> Is lru_add_drain_all() required by mlock()? The answer is no and the
>> reason it is still in mlock() is to rapidly move mlocked pages to
>> unevictable LRU. Without lru_add_drain_all() the mlocked pages which
>> were on pagevec at mlock() time will be moved to evictable LRUs but
>> will eventually be moved back to unevictable LRU by reclaim. So, we
>
> Wont this affect the performance during reclaim ?
>

Yes, but reclaim is already a slow path, and to seriously impact
reclaim we would need a very antagonistic workload which is very hard
to trigger (i.e. for each mlock() on a CPU, the pages being mlocked
happen to be in the caches of other CPUs).

>> can safely remove lru_add_drain_all() from mlock(). Also there is no
>> need for local lru_add_drain() as it will be called deep inside
>> __mm_populate() (in follow_page_pte()).
>
> The following commit which originally added lru_add_drain_all()
> during mlock() and mlockall() has similar explanation.
>
> 8891d6da ("mm: remove lru_add_drain_all() from the munlock path")
>
> "In addition, this patch add lru_add_drain_all() to sys_mlock()
> and sys_mlockall().  it isn't must.  but it reduce the failure
> of moving to unevictable list.  its failure can rescue in
> vmscan later.  but reducing is better."
>
> Which sounds like either we have to handle the active to inactive
> LRU movement during reclaim or it can be done here to speed up
> reclaim later on.
>


* Re: [PATCH] mm: mlock: remove lru_add_drain_all()
  2017-10-19 10:18 ` Kirill A. Shutemov
@ 2017-10-19 19:19   ` Shakeel Butt
  0 siblings, 0 replies; 14+ messages in thread
From: Shakeel Butt @ 2017-10-19 19:19 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Michal Hocko,
	Joonsoo Kim, Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen,
	Hugh Dickins, Linux MM, LKML

On Thu, Oct 19, 2017 at 3:18 AM, Kirill A. Shutemov
<kirill@shutemov.name> wrote:
> On Wed, Oct 18, 2017 at 04:17:30PM -0700, Shakeel Butt wrote:
>> Recently we have observed high latency in mlock() in our generic
>> library and noticed that users have started using tmpfs files even
>> without swap and the latency was due to expensive remote LRU cache
>> draining.
>
> Hm. Isn't the point of mlock() to pay price upfront and make execution
> smoother after this?
>
> With this you shift latency onto reclaim (and future memory allocation).
>
> I'm not sure if it's a win.
>

It will not shift latency to fast-path memory allocation. Reclaim
happens on the slow path and only reclaim may see more mlocked pages.
Please note that the very antagonistic workload (i.e. for each
mlock() on a CPU, the pages being mlocked happen to be in the caches
of other CPUs) is very hard to trigger.


* Re: [PATCH] mm: mlock: remove lru_add_drain_all()
  2017-10-19 12:32 ` Michal Hocko
@ 2017-10-19 19:19   ` Shakeel Butt
  2017-10-19 19:35     ` Michal Hocko
  0 siblings, 1 reply; 14+ messages in thread
From: Shakeel Butt @ 2017-10-19 19:19 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Joonsoo Kim,
	Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen, Hugh Dickins,
	Linux MM, LKML

On Thu, Oct 19, 2017 at 5:32 AM, Michal Hocko <mhocko@kernel.org> wrote:
> On Wed 18-10-17 16:17:30, Shakeel Butt wrote:
>> Recently we have observed high latency in mlock() in our generic
>> library and noticed that users have started using tmpfs files even
>> without swap and the latency was due to expensive remote LRU cache
>> draining.
>
> some numbers would be really nice
>

On a production workload, customers complained that a single mlock()
call took around 10 seconds on mapped tmpfs files, and the perf
profile showed lru_add_drain_all as the culprit.

I wasn't able to replicate the workload on my test machine, but a
simple workload of calling mlock() many times on an idle machine
shows a significant difference. Other than the workload, the machine
size (number of cores) also matters.

>> Is lru_add_drain_all() required by mlock()? The answer is no and the
>> reason it is still in mlock() is to rapidly move mlocked pages to
>> unevictable LRU.
>
> Is this really true? lru_add_drain_all will flush the previously cached
> LRU pages. We are not flushing after the pages have been faulted in so
> this might not do anything wrt. mlocked pages, right?
>

Sorry for the confusion. I wanted to say that if the pages being
mlocked are in the caches of remote CPUs, then lru_add_drain_all will
move them to their corresponding LRUs, and then the remaining mlock
machinery will move them again from their evictable LRUs to the
unevictable LRU.

>> Without lru_add_drain_all() the mlocked pages which
>> were on pagevec at mlock() time will be moved to evictable LRUs but
>> will eventually be moved back to unevictable LRU by reclaim. So, we
>> can safely remove lru_add_drain_all() from mlock(). Also there is no
>> need for local lru_add_drain() as it will be called deep inside
>> __mm_populate() (in follow_page_pte()).
>
> Anyway, I do agree that lru_add_drain_all here is pointless. Either we
> should drain after the memory has been faulted in and mlocked or not at
> all. So the patch looks good to me I am just not sure about the
> changelog.
>
>> Signed-off-by: Shakeel Butt <shakeelb@google.com>
>> ---
>>  mm/mlock.c | 5 -----
>>  1 file changed, 5 deletions(-)
>>
>> diff --git a/mm/mlock.c b/mm/mlock.c
>> index dfc6f1912176..3ceb2935d1e0 100644
>> --- a/mm/mlock.c
>> +++ b/mm/mlock.c
>> @@ -669,8 +669,6 @@ static __must_check int do_mlock(unsigned long start, size_t len, vm_flags_t fla
>>       if (!can_do_mlock())
>>               return -EPERM;
>>
>> -     lru_add_drain_all();    /* flush pagevec */
>> -
>>       len = PAGE_ALIGN(len + (offset_in_page(start)));
>>       start &= PAGE_MASK;
>>
>> @@ -797,9 +795,6 @@ SYSCALL_DEFINE1(mlockall, int, flags)
>>       if (!can_do_mlock())
>>               return -EPERM;
>>
>> -     if (flags & MCL_CURRENT)
>> -             lru_add_drain_all();    /* flush pagevec */
>> -
>>       lock_limit = rlimit(RLIMIT_MEMLOCK);
>>       lock_limit >>= PAGE_SHIFT;
>>
>> --
>> 2.15.0.rc1.287.g2b38de12cc-goog
>>
>
> --
> Michal Hocko
> SUSE Labs


* Re: [PATCH] mm: mlock: remove lru_add_drain_all()
  2017-10-19 19:19   ` Shakeel Butt
@ 2017-10-19 19:35     ` Michal Hocko
  2017-10-19 19:46       ` Shakeel Butt
  0 siblings, 1 reply; 14+ messages in thread
From: Michal Hocko @ 2017-10-19 19:35 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Joonsoo Kim,
	Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen, Hugh Dickins,
	Linux MM, LKML

On Thu 19-10-17 12:19:26, Shakeel Butt wrote:
> On Thu, Oct 19, 2017 at 5:32 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Wed 18-10-17 16:17:30, Shakeel Butt wrote:
> >> Recently we have observed high latency in mlock() in our generic
> >> library and noticed that users have started using tmpfs files even
> >> without swap and the latency was due to expensive remote LRU cache
> >> draining.
> >
> > some numbers would be really nice
> >
> 
> On a production workload, customers complained that single mlock()
> call took around 10 seconds on mapped tmpfs files and the perf profile
> showed lru_add_drain_all as culprit.

Draining can take some time, but I wouldn't expect it to take on the
order of seconds, so perf data would definitely be helpful in the
changelog.

[...]
> > Is this really true? lru_add_drain_all will flush the previously cached
> > LRU pages. We are not flushing after the pages have been faulted in so
> > this might not do anything wrt. mlocked pages, right?
> >
> 
> Sorry for the confusion. I wanted to say that if the pages which are
> being mlocked are on caches of remote cpus then lru_add_drain_all will
> move them to their corresponding LRUs and then remaining functionality
> of mlock will move them again from their evictable LRUs to unevictable
> LRU.

yes, but the point is that we are draining pages which might not be
directly related to the pages which _will_ be mlocked by the syscall.
In fact those will stay in the cache. This is the primary reason why
this draining doesn't make much sense.
 
Or am I still misunderstanding what you are saying here?
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] mm: mlock: remove lru_add_drain_all()
  2017-10-19 19:35     ` Michal Hocko
@ 2017-10-19 19:46       ` Shakeel Butt
  2017-10-19 20:13         ` Michal Hocko
  0 siblings, 1 reply; 14+ messages in thread
From: Shakeel Butt @ 2017-10-19 19:46 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Joonsoo Kim,
	Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen, Hugh Dickins,
	Linux MM, LKML

> [...]
>>
>> Sorry for the confusion. I wanted to say that if the pages which are
>> being mlocked are on caches of remote cpus then lru_add_drain_all will
>> move them to their corresponding LRUs and then remaining functionality
>> of mlock will move them again from their evictable LRUs to unevictable
>> LRU.
>
> yes, but the point is that we are draining pages which might be not
> directly related to pages which _will_ be mlocked by the syscall. In
> fact those will stay on the cache. This is the primary reason why this
> draining doesn't make much sense.
>
> Or am I still misunderstanding what you are saying here?
>

lru_add_drain_all() will drain everything, irrespective of whether
those pages are being mlocked or not.


* Re: [PATCH] mm: mlock: remove lru_add_drain_all()
  2017-10-19  3:18 ` Balbir Singh
@ 2017-10-19 20:12   ` Shakeel Butt
  0 siblings, 0 replies; 14+ messages in thread
From: Shakeel Butt @ 2017-10-19 20:12 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Michal Hocko,
	Joonsoo Kim, Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen,
	Hugh Dickins, Linux MM, LKML

On Wed, Oct 18, 2017 at 8:18 PM, Balbir Singh <bsingharora@gmail.com> wrote:
> On Wed, 18 Oct 2017 16:17:30 -0700
> Shakeel Butt <shakeelb@google.com> wrote:
>
>> Recently we have observed high latency in mlock() in our generic
>> library and noticed that users have started using tmpfs files even
>> without swap and the latency was due to expensive remote LRU cache
>> draining.
>>
>> Is lru_add_drain_all() required by mlock()? The answer is no and the
>> reason it is still in mlock() is to rapidly move mlocked pages to
>> unevictable LRU. Without lru_add_drain_all() the mlocked pages which
>> were on pagevec at mlock() time will be moved to evictable LRUs but
>> will eventually be moved back to unevictable LRU by reclaim. So, we
>> can safely remove lru_add_drain_all() from mlock(). Also there is no
>> need for local lru_add_drain() as it will be called deep inside
>> __mm_populate() (in follow_page_pte()).
>>
>> Signed-off-by: Shakeel Butt <shakeelb@google.com>
>> ---
>
> Does this perturb statistics around LRU pages in cgroups and meminfo
> about where the pages actually belong?
>

Yes, it would, because a page can sit on an evictable LRU until
reclaim moves it back to the unevictable LRU. However, even with the
draining there is a chance the same thing can happen: for example,
after mlock drains all caches and before it takes mmap_sem, another
CPU swaps in a page which the mlock syscall wants to mlock. Without
the draining the chance of this scenario increases, and in the worst
case mlock() can fail to move at most PAGEVEC_SIZE *
(number of CPUs - 1) pages to the unevictable LRU.


* Re: [PATCH] mm: mlock: remove lru_add_drain_all()
  2017-10-19 19:46       ` Shakeel Butt
@ 2017-10-19 20:13         ` Michal Hocko
  2017-10-19 20:14           ` Shakeel Butt
  0 siblings, 1 reply; 14+ messages in thread
From: Michal Hocko @ 2017-10-19 20:13 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Joonsoo Kim,
	Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen, Hugh Dickins,
	Linux MM, LKML

On Thu 19-10-17 12:46:50, Shakeel Butt wrote:
> > [...]
> >>
> >> Sorry for the confusion. I wanted to say that if the pages which are
> >> being mlocked are on caches of remote cpus then lru_add_drain_all will
> >> move them to their corresponding LRUs and then remaining functionality
> >> of mlock will move them again from their evictable LRUs to unevictable
> >> LRU.
> >
> > yes, but the point is that we are draining pages which might be not
> > directly related to pages which _will_ be mlocked by the syscall. In
> > fact those will stay on the cache. This is the primary reason why this
> > draining doesn't make much sense.
> >
> > Or am I still misunderstanding what you are saying here?
> >
> 
> lru_add_drain_all() will drain everything irrespective if those pages
> are being mlocked or not.

yes, let me be more specific. lru_add_drain_all will drain everything
that has been cached at the time mlock is called. And that is not really
related to the memory which will be faulted in (and cached) and mlocked
by the syscall itself. Does it make more sense now?
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] mm: mlock: remove lru_add_drain_all()
  2017-10-19 20:13         ` Michal Hocko
@ 2017-10-19 20:14           ` Shakeel Butt
  2017-10-19 20:53             ` Michal Hocko
  0 siblings, 1 reply; 14+ messages in thread
From: Shakeel Butt @ 2017-10-19 20:14 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Joonsoo Kim,
	Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen, Hugh Dickins,
	Linux MM, LKML

On Thu, Oct 19, 2017 at 1:13 PM, Michal Hocko <mhocko@kernel.org> wrote:
> On Thu 19-10-17 12:46:50, Shakeel Butt wrote:
>> > [...]
>> >>
>> >> Sorry for the confusion. I wanted to say that if the pages which are
>> >> being mlocked are on caches of remote cpus then lru_add_drain_all will
>> >> move them to their corresponding LRUs and then remaining functionality
>> >> of mlock will move them again from their evictable LRUs to unevictable
>> >> LRU.
>> >
>> > yes, but the point is that we are draining pages which might be not
>> > directly related to pages which _will_ be mlocked by the syscall. In
>> > fact those will stay on the cache. This is the primary reason why this
>> > draining doesn't make much sense.
>> >
>> > Or am I still misunderstanding what you are saying here?
>> >
>>
>> lru_add_drain_all() will drain everything irrespective if those pages
>> are being mlocked or not.
>
> yes, let me be more specific. lru_add_drain_all will drain everything
> that has been cached at the time mlock is called. And that is not really
> related to the memory which will be faulted in (and cached) and mlocked
> by the syscall itself. Does it make more sense now?
>

Yes, you are absolutely right. Sorry for the confusion.


* Re: [PATCH] mm: mlock: remove lru_add_drain_all()
  2017-10-19 20:14           ` Shakeel Butt
@ 2017-10-19 20:53             ` Michal Hocko
  0 siblings, 0 replies; 14+ messages in thread
From: Michal Hocko @ 2017-10-19 20:53 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Joonsoo Kim,
	Minchan Kim, Yisheng Xie, Ingo Molnar, Greg Thelen, Hugh Dickins,
	Linux MM, LKML

On Thu 19-10-17 13:14:52, Shakeel Butt wrote:
> On Thu, Oct 19, 2017 at 1:13 PM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Thu 19-10-17 12:46:50, Shakeel Butt wrote:
> >> > [...]
> >> >>
> >> >> Sorry for the confusion. I wanted to say that if the pages which are
> >> >> being mlocked are on caches of remote cpus then lru_add_drain_all will
> >> >> move them to their corresponding LRUs and then remaining functionality
> >> >> of mlock will move them again from their evictable LRUs to unevictable
> >> >> LRU.
> >> >
> >> > yes, but the point is that we are draining pages which might be not
> >> > directly related to pages which _will_ be mlocked by the syscall. In
> >> > fact those will stay on the cache. This is the primary reason why this
> >> > draining doesn't make much sense.
> >> >
> >> > Or am I still misunderstanding what you are saying here?
> >> >
> >>
> >> lru_add_drain_all() will drain everything irrespective if those pages
> >> are being mlocked or not.
> >
> > yes, let me be more specific. lru_add_drain_all will drain everything
> > that has been cached at the time mlock is called. And that is not really
> > related to the memory which will be faulted in (and cached) and mlocked
> > by the syscall itself. Does it make more sense now?
> >
> 
> Yes, you are absolutely right. Sorry for the confusion.

So I think it would be much better to justify this change by arguing
that paying a random overhead for something that doesn't relate to the
work to be done is simply wrong.

-- 
Michal Hocko
SUSE Labs

