* [PATCH] mm/rmap.c: remove useless checking to child vma->vm_prev in anon_vma_clone
@ 2020-01-06 6:37 Li Xinhai
2020-01-06 10:43 ` Konstantin Khlebnikov
0 siblings, 1 reply; 4+ messages in thread
From: Li Xinhai @ 2020-01-06 6:37 UTC (permalink / raw)
To: linux-mm; +Cc: Wei Yang, Konstantin Khlebnikov, Kirill A. Shutemov
For the fork case, dst->vm_prev is always the same as src->vm_prev when
anon_vma_clone() is called. Remove the assignment of
dst->vm_prev->anon_vma to dst->anon_vma, and instead assign explicitly
from the anon_vma shared by the parent vmas.
Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
mm/rmap.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index b3e3819..3c912a6c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -269,10 +269,10 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
 {
 	struct anon_vma_chain *avc, *pavc;
 	struct anon_vma *root = NULL;
-	struct vm_area_struct *prev = dst->vm_prev, *pprev = src->vm_prev;
+	struct vm_area_struct *pprev = src->vm_prev;
 
 	/*
-	 * If parent share anon_vma with its vm_prev, keep this sharing in in
+	 * If parent share anon_vma with its vm_prev, keep this sharing in
 	 * child.
 	 *
 	 * 1. Parent has vm_prev, which implies we have vm_prev.
@@ -280,8 +280,7 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
 	 */
 	if (!dst->anon_vma && src->anon_vma &&
 	    pprev && pprev->anon_vma == src->anon_vma)
-		dst->anon_vma = prev->anon_vma;
-
+		dst->anon_vma = pprev->anon_vma;
 
 	list_for_each_entry_reverse(pavc, &src->anon_vma_chain, same_vma) {
 		struct anon_vma *anon_vma;
--
1.8.3.1
* Re: [PATCH] mm/rmap.c: remove useless checking to child vma->vm_prev in anon_vma_clone
2020-01-06 6:37 [PATCH] mm/rmap.c: remove useless checking to child vma->vm_prev in anon_vma_clone Li Xinhai
@ 2020-01-06 10:43 ` Konstantin Khlebnikov
2020-01-06 13:28 ` lixinhai.lxh
0 siblings, 1 reply; 4+ messages in thread
From: Konstantin Khlebnikov @ 2020-01-06 10:43 UTC (permalink / raw)
To: Li Xinhai, linux-mm; +Cc: Wei Yang, Kirill A. Shutemov
On 06/01/2020 09.37, Li Xinhai wrote:
> For the fork case, dst->vm_prev is always the same as src->vm_prev when
> anon_vma_clone() is called. Remove the assignment of
> dst->vm_prev->anon_vma to dst->anon_vma, and instead assign explicitly
> from the anon_vma shared by the parent vmas.
This doesn't sound right.
I see that dst->vm_prev is set after anon_vma_fork(), so at this point it still
points to the parent's prev. So this logic doesn't work the way it is supposed to.
I expect this logic: if the parent's SRC1 SRC2 .. SRCn share ANON0,
then the related DST1 DST2 .. DSTn in the child should fork and share ANON1:
forking DST1 creates a new ANON1, and then DST2 and the following VMAs share it.
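The expected logic can be sketched as a toy user-space model. This is only an illustration, not kernel code: struct toy_anon_vma, struct toy_vma and toy_fork() are made-up names, and the real anon_vma_fork()/anon_vma_clone() also maintain anon_vma chains, reference counts and locking that are left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct anon_vma and struct vm_area_struct. */
struct toy_anon_vma {
	struct toy_anon_vma *parent;
};

struct toy_vma {
	struct toy_anon_vma *anon_vma;
};

/*
 * Fork n parent VMAs (src) into n child VMAs (dst).
 * If src[i] shares its anon_vma with src[i - 1], dst[i] reuses the
 * anon_vma of dst[i - 1]; otherwise a new anon_vma is allocated whose
 * parent is src[i]'s anon_vma.
 * Returns 0 on success, -1 if an allocation fails.
 * Allocations are never freed, to keep the sketch short.
 */
int toy_fork(const struct toy_vma *src, struct toy_vma *dst, size_t n)
{
	assert(src && dst);

	for (size_t i = 0; i < n; i++) {
		struct toy_anon_vma *av;

		dst[i].anon_vma = NULL;
		if (!src[i].anon_vma)
			continue;

		/* Keep the sharing that exists between adjacent parent VMAs. */
		if (i > 0 && src[i - 1].anon_vma == src[i].anon_vma) {
			dst[i].anon_vma = dst[i - 1].anon_vma;
			continue;
		}

		av = malloc(sizeof(*av));
		if (!av)
			return -1;
		av->parent = src[i].anon_vma;
		dst[i].anon_vma = av;
	}
	return 0;
}
```

With three parent VMAs that all point at ANON0, this model gives the three child VMAs one common new anon_vma whose parent is ANON0.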
Also, this assumption is wrong:
> Parent has vm_prev, which implies we have vm_prev.
If the prev VMA in the parent has VM_DONTCOPY, then the prev VMA in the child
will not correspond to pprev, or could even be NULL if it was the first VMA in the mm.
See patch:
https://lore.kernel.org/lkml/157830736034.8148.7070851958306750616.stgit@buzz/T/#u
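The VM_DONTCOPY case can be illustrated with another toy user-space model. Again this is only a sketch under stated assumptions: struct dc_vma and dc_dup_mmap() are invented names that mimic how fork skips VM_DONTCOPY VMAs (from user space the flag is set with madvise(MADV_DONTFORK)) while linking the copied VMAs to each other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vm_area_struct. */
struct dc_vma {
	bool dontcopy;        /* models VM_DONTCOPY */
	struct dc_vma *prev;  /* models vm_prev */
};

/*
 * Copy the parent's VMA array into the child, skipping VMAs marked
 * dontcopy, and link each copied VMA to the previously copied one.
 * child must have room for n entries.
 * Returns the number of VMAs in the child.
 */
size_t dc_dup_mmap(const struct dc_vma *parent, size_t n, struct dc_vma *child)
{
	struct dc_vma *prev = NULL;
	size_t count = 0;

	assert(parent && child);

	for (size_t i = 0; i < n; i++) {
		if (parent[i].dontcopy)
			continue;
		child[count].dontcopy = false;
		child[count].prev = prev;
		prev = &child[count];
		count++;
	}
	return count;
}
```

If the first parent VMA is marked dontcopy, the second parent VMA has a prev while its copy in the child has prev == NULL, so "parent has vm_prev" does not imply "child has vm_prev".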
I've tested it using this:
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -847,6 +847,12 @@ static int show_smap(struct seq_file *m, void *v)
 		seq_printf(m, "ProtectionKey: %8u\n", vma_pkey(vma));
 	show_smap_vma_flags(m, vma);
 
+	if (vma->anon_vma)
+		seq_printf(m, "AnonVMA: %p %p %d\n",
+			   vma->anon_vma,
+			   vma->anon_vma->parent,
+			   vma->anon_vma->degree);
+
 	m_cache_vma(m, vma);
 
 	return 0;
---
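#include <sys/mman.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

int main(int argc, char **argv) {
	char *ptr;
	char buf[100];

	/* Three anonymous pages that initially form a single VMA. */
	ptr = mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	if (ptr == MAP_FAILED)
		return 1;
	/* Touch the pages so that the VMA gets an anon_vma. */
	memset(ptr, 0, 0x3000);
	/* Split the VMA into three VMAs that share one anon_vma. */
	mprotect(ptr + 0x1000, 0x1000, PROT_READ);

	/* Dump the parent's mappings. */
	snprintf(buf, sizeof(buf), "cat /proc/%d/smaps", getpid());
	system(buf);

	if (fork()) {
		wait(NULL);
	} else {
		/* Dump the child's mappings for comparison. */
		printf("\n\n\n");
		fflush(stdout);
		snprintf(buf, sizeof(buf), "cat /proc/%d/smaps", getpid());
		system(buf);
	}
	return 0;
}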
---
* Re: [PATCH] mm/rmap.c: remove useless checking to child vma->vm_prev in anon_vma_clone
2020-01-06 10:43 ` Konstantin Khlebnikov
@ 2020-01-06 13:28 ` lixinhai.lxh
2020-01-06 20:20 ` Konstantin Khlebnikov
0 siblings, 1 reply; 4+ messages in thread
From: lixinhai.lxh @ 2020-01-06 13:28 UTC (permalink / raw)
To: khlebnikov, linux-mm; +Cc: richardw.yang, kirill.shutemov
On 2020-01-06 at 18:43 Konstantin Khlebnikov wrote:
>On 06/01/2020 09.37, Li Xinhai wrote:
>> For the fork case, dst->vm_prev is always the same as src->vm_prev when
>> anon_vma_clone() is called. Remove the assignment of
>> dst->vm_prev->anon_vma to dst->anon_vma, and instead assign explicitly
>> from the anon_vma shared by the parent vmas.
>
>This doesn't sound right.
>
>I see that dst->vm_prev is set after anon_vma_fork(), so at this point it still
>points to the parent's prev. So this logic doesn't work the way it is supposed to.
>
>I expect this logic: if the parent's SRC1 SRC2 .. SRCn share ANON0,
>then the related DST1 DST2 .. DSTn in the child should fork and share ANON1:
>forking DST1 creates a new ANON1, and then DST2 and the following VMAs share it.
This logic was not fully clarified in
https://lore.kernel.org/linux-mm/20191011072256.16275-2-richardw.yang@linux.intel.com/
I had assumed that the purpose of that patch was for the child vma to share the
parent vma's anon_vma, and that it intentionally wanted the first child vma to have
its own new anon_vma (not shared, unlike the other child vmas).
* Re: [PATCH] mm/rmap.c: remove useless checking to child vma->vm_prev in anon_vma_clone
2020-01-06 13:28 ` lixinhai.lxh
@ 2020-01-06 20:20 ` Konstantin Khlebnikov
0 siblings, 0 replies; 4+ messages in thread
From: Konstantin Khlebnikov @ 2020-01-06 20:20 UTC (permalink / raw)
To: lixinhai.lxh; +Cc: khlebnikov, linux-mm, richardw.yang, kirill.shutemov
On Mon, Jan 6, 2020 at 4:28 PM lixinhai.lxh@gmail.com
<lixinhai.lxh@gmail.com> wrote:
>
> On 2020-01-06 at 18:43 Konstantin Khlebnikov wrote:
> >On 06/01/2020 09.37, Li Xinhai wrote:
> >> For the fork case, dst->vm_prev is always the same as src->vm_prev when
> >> anon_vma_clone() is called. Remove the assignment of
> >> dst->vm_prev->anon_vma to dst->anon_vma, and instead assign explicitly
> >> from the anon_vma shared by the parent vmas.
> >
> >This doesn't sound right.
> >
> >I see that dst->vm_prev is set after anon_vma_fork(), so at this point it still
> >points to the parent's prev. So this logic doesn't work the way it is supposed to.
> >
> >I expect this logic: if the parent's SRC1 SRC2 .. SRCn share ANON0,
> >then the related DST1 DST2 .. DSTn in the child should fork and share ANON1:
> >forking DST1 creates a new ANON1, and then DST2 and the following VMAs share it.
>
> This logic was not fully clarified in
> https://lore.kernel.org/linux-mm/20191011072256.16275-2-richardw.yang@linux.intel.com/
> I had assumed that the purpose of that patch was for the child vma to share the
> parent vma's anon_vma, and that it intentionally wanted the first child vma to have
> its own new anon_vma (not shared, unlike the other child vmas).
Well, this more or less follows from the original design.
A page's anon-vma, together with the page offset, limits the set of vmas
scanned by rmap: it skips vmas where the page certainly cannot be mapped.
If vmas in one process share an anon-vma, then they likely have
non-overlapping offsets, so there is no reason to fork a personal anon-vma
for each of them when the process forks.
But it is good to fork one new anon-vma for all of them together: then rmap
can skip scanning the parent's vmas for pages allocated or COWed in the
child process. Together they act like one big vma.
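A rough way to picture the filtering is the user-space sketch below. It is an assumption-laden model, not the kernel implementation: struct toy_range and toy_rmap_candidates() are invented names, and the kernel looks the offset up in an interval tree per anon_vma rather than walking an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one vma attached to an anon_vma. */
struct toy_range {
	unsigned long pgoff;   /* first page offset covered, like vm_pgoff */
	unsigned long npages;  /* number of pages covered */
};

/*
 * Count the vmas attached to one anon_vma that could map the page at
 * page offset pgoff. Only these vmas would need to be visited by rmap;
 * all others are skipped because the page cannot be mapped there.
 */
size_t toy_rmap_candidates(const struct toy_range *vmas, size_t n,
			   unsigned long pgoff)
{
	size_t hits = 0;

	assert(vmas || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (pgoff >= vmas[i].pgoff &&
		    pgoff - vmas[i].pgoff < vmas[i].npages)
			hits++;
	}
	return hits;
}
```

For three one-page vmas at offsets 0, 1 and 2 on the child's own anon-vma, a page at offset 1 has a single candidate. If the parent's three vmas were attached to the same anon-vma, the same lookup would return two candidates, which is the extra scanning that a separate child anon-vma avoids.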