* [PATCH 1/1] mm: lock VMAs skipped by a failed queue_pages_range()
From: Suren Baghdasaryan @ 2023-09-18 21:16 UTC
  To: akpm
  Cc: willy, hughd, shy828301, mhocko, vbabka, surenb, syzkaller-bugs,
	linux-mm, linux-kernel, syzbot+b591856e0f0139f83023

When queue_pages_range() encounters an unmovable page, it terminates
its page walk. This walk, among other things, locks the VMAs in the range.
The early termination might leave some VMAs in the range unlocked after
queue_pages_range() completes. Since do_mbind() continues to operate on
these VMAs despite the failure from queue_pages_range(), it can encounter
an unlocked VMA.

This mbind() behavior has been modified several times before and might
need further changes, either to finish the page walk even in the presence
of unmovable pages or to error out immediately on a failure from
queue_pages_range(). However, that requires more discussion, so to
fix the immediate issue, explicitly lock the VMAs in the range if
queue_pages_range() failed. The added condition does not save much
but serves to document when this extra locking is needed.
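
For reference, the control flow in question, condensed from do_mbind()
(error handling and unrelated details trimmed):

	ret = queue_pages_range(mm, start, end, nmask,
				flags | MPOL_MF_INVERT, &pagelist);
	if (ret < 0) {
		/* hard error: bail out before touching any VMAs */
		err = ret;
		goto up_out;
	}
	/*
	 * ret > 0 means the walk found an unmovable page and stopped
	 * early; VMAs past that point were never write-locked, yet the
	 * loop below still iterates over all of them.
	 */
	vma_iter_init(&vmi, mm, start);
	prev = vma_prev(&vmi);
	for_each_vma_range(vmi, vma, end) {
		err = mbind_range(&vmi, vma, &prev, start, end, new);
		if (err)
			break;
	}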

Fixes: 49b0638502da ("mm: enable page walking API to lock vmas during the walk")
Reported-by: syzbot+b591856e0f0139f83023@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/000000000000f392a60604a65085@google.com/
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 mm/mempolicy.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 42b5567e3773..cbc584e9b6ca 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1342,6 +1342,9 @@ static long do_mbind(unsigned long start, unsigned long len,
 	vma_iter_init(&vmi, mm, start);
 	prev = vma_prev(&vmi);
 	for_each_vma_range(vmi, vma, end) {
+		/* If queue_pages_range failed then not all VMAs might be locked */
+		if (ret)
+			vma_start_write(vma);
 		err = mbind_range(&vmi, vma, &prev, start, end, new);
 		if (err)
 			break;
-- 
2.42.0.459.ge4e396fd5e-goog



* Re: [PATCH 1/1] mm: lock VMAs skipped by a failed queue_pages_range()
From: Hugh Dickins @ 2023-09-18 23:30 UTC
  To: Suren Baghdasaryan
  Cc: akpm, willy, hughd, shy828301, mhocko, vbabka, syzkaller-bugs,
	linux-mm, linux-kernel, syzbot+b591856e0f0139f83023

On Mon, 18 Sep 2023, Suren Baghdasaryan wrote:

> When queue_pages_range() encounters an unmovable page, it terminates
> its page walk. This walk, among other things, locks the VMAs in the range.
> The early termination might leave some VMAs in the range unlocked after
> queue_pages_range() completes. Since do_mbind() continues to operate on
> these VMAs despite the failure from queue_pages_range(), it can encounter
> an unlocked VMA.
>
> This mbind() behavior has been modified several times before and might
> need further changes, either to finish the page walk even in the presence
> of unmovable pages or to error out immediately on a failure from
> queue_pages_range(). However, that requires more discussion, so to
> fix the immediate issue, explicitly lock the VMAs in the range if
> queue_pages_range() failed. The added condition does not save much
> but serves to document when this extra locking is needed.
> 
> Fixes: 49b0638502da ("mm: enable page walking API to lock vmas during the walk")
> Reported-by: syzbot+b591856e0f0139f83023@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/000000000000f392a60604a65085@google.com/
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>

Acked-by: Hugh Dickins <hughd@google.com>

> ---
>  mm/mempolicy.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 42b5567e3773..cbc584e9b6ca 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1342,6 +1342,9 @@ static long do_mbind(unsigned long start, unsigned long len,
>  	vma_iter_init(&vmi, mm, start);
>  	prev = vma_prev(&vmi);
>  	for_each_vma_range(vmi, vma, end) {
> +		/* If queue_pages_range failed then not all VMAs might be locked */
> +		if (ret)
> +			vma_start_write(vma);
>  		err = mbind_range(&vmi, vma, &prev, start, end, new);
>  		if (err)
>  			break;
> -- 
> 2.42.0.459.ge4e396fd5e-goog


* Re: [PATCH 1/1] mm: lock VMAs skipped by a failed queue_pages_range()
From: Michal Hocko @ 2023-09-19  8:52 UTC
  To: Suren Baghdasaryan
  Cc: akpm, willy, hughd, shy828301, vbabka, syzkaller-bugs, linux-mm,
	linux-kernel, syzbot+b591856e0f0139f83023

On Mon 18-09-23 14:16:08, Suren Baghdasaryan wrote:
> When queue_pages_range() encounters an unmovable page, it terminates
> its page walk. This walk, among other things, locks the VMAs in the range.
> The early termination might leave some VMAs in the range unlocked after
> queue_pages_range() completes. Since do_mbind() continues to operate on
> these VMAs despite the failure from queue_pages_range(), it can encounter
> an unlocked VMA.
>
> This mbind() behavior has been modified several times before and might
> need further changes, either to finish the page walk even in the presence
> of unmovable pages or to error out immediately on a failure from
> queue_pages_range(). However, that requires more discussion, so to
> fix the immediate issue, explicitly lock the VMAs in the range if
> queue_pages_range() failed. The added condition does not save much
> but serves to document when this extra locking is needed.

The semantics of the walk in this case are as clear as mud. I was
trying to reconstruct the whole picture and it really hurts... Then I
found http://lkml.kernel.org/r/CAHbLzkrmTaqBRmHVdE2kyW57Uoghqd_E+jAXC9cB5ofkhL-uvw@mail.gmail.com
and that helped a lot. Let's keep it as a reference, at least here in
the email thread, for the future.

> Fixes: 49b0638502da ("mm: enable page walking API to lock vmas during the walk")
> Reported-by: syzbot+b591856e0f0139f83023@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/000000000000f392a60604a65085@google.com/
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>

I cannot say I like the patch (it looks like potential double locking
unless you realize this lock is special), but considering this might be
just temporary, I do not mind.
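
For anyone else puzzled by the apparent double locking: vma_start_write()
is effectively idempotent within a single mmap_lock write section.
Roughly, paraphrased from include/linux/mm.h at this base:

	static inline void vma_start_write(struct vm_area_struct *vma)
	{
		int mm_lock_seq;

		/* no-op if already write-locked under this mmap_lock */
		if (__is_vma_write_locked(vma, &mm_lock_seq))
			return;

		down_write(&vma->vm_lock->lock);
		/* pairs with the lockless check in vma_start_read() */
		WRITE_ONCE(vma->vm_lock_seq, mm_lock_seq);
		up_write(&vma->vm_lock->lock);
	}

so write-locking an already locked VMA just returns early.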

Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

> ---
>  mm/mempolicy.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 42b5567e3773..cbc584e9b6ca 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1342,6 +1342,9 @@ static long do_mbind(unsigned long start, unsigned long len,
>  	vma_iter_init(&vmi, mm, start);
>  	prev = vma_prev(&vmi);
>  	for_each_vma_range(vmi, vma, end) {
> +		/* If queue_pages_range failed then not all VMAs might be locked */
> +		if (ret)
> +			vma_start_write(vma);
>  		err = mbind_range(&vmi, vma, &prev, start, end, new);
>  		if (err)
>  			break;
> -- 
> 2.42.0.459.ge4e396fd5e-goog

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 1/1] mm: lock VMAs skipped by a failed queue_pages_range()
From: Yang Shi @ 2023-09-19 21:09 UTC
  To: Michal Hocko
  Cc: Suren Baghdasaryan, akpm, willy, hughd, vbabka, syzkaller-bugs,
	linux-mm, linux-kernel, syzbot+b591856e0f0139f83023

On Tue, Sep 19, 2023 at 1:53 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Mon 18-09-23 14:16:08, Suren Baghdasaryan wrote:
> > When queue_pages_range() encounters an unmovable page, it terminates
> > its page walk. This walk, among other things, locks the VMAs in the range.
> > The early termination might leave some VMAs in the range unlocked after
> > queue_pages_range() completes. Since do_mbind() continues to operate on
> > these VMAs despite the failure from queue_pages_range(), it can encounter
> > an unlocked VMA.
> >
> > This mbind() behavior has been modified several times before and might
> > need further changes, either to finish the page walk even in the presence
> > of unmovable pages or to error out immediately on a failure from
> > queue_pages_range(). However, that requires more discussion, so to
> > fix the immediate issue, explicitly lock the VMAs in the range if
> > queue_pages_range() failed. The added condition does not save much
> > but serves to document when this extra locking is needed.
>
> The semantics of the walk in this case are as clear as mud. I was
> trying to reconstruct the whole picture and it really hurts... Then I
> found http://lkml.kernel.org/r/CAHbLzkrmTaqBRmHVdE2kyW57Uoghqd_E+jAXC9cB5ofkhL-uvw@mail.gmail.com
> and that helped a lot. Let's keep it as a reference, at least here in
> the email thread, for the future.

FYI, I'm working on a fix for the regression mentioned in that series,
and Hugh has some cleanups and enhancements for that too.

>
> > Fixes: 49b0638502da ("mm: enable page walking API to lock vmas during the walk")
> > Reported-by: syzbot+b591856e0f0139f83023@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/all/000000000000f392a60604a65085@google.com/
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
>
> I cannot say I like the patch (it looks like potential double locking
> unless you realize this lock is special), but considering this might be
> just temporary, I do not mind.
>
> Acked-by: Michal Hocko <mhocko@suse.com>
>
> Thanks!
>
> > ---
> >  mm/mempolicy.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index 42b5567e3773..cbc584e9b6ca 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -1342,6 +1342,9 @@ static long do_mbind(unsigned long start, unsigned long len,
> >       vma_iter_init(&vmi, mm, start);
> >       prev = vma_prev(&vmi);
> >       for_each_vma_range(vmi, vma, end) {
> > +             /* If queue_pages_range failed then not all VMAs might be locked */
> > +             if (ret)
> > +                     vma_start_write(vma);
> >               err = mbind_range(&vmi, vma, &prev, start, end, new);
> >               if (err)
> >                       break;
> > --
> > 2.42.0.459.ge4e396fd5e-goog
>
> --
> Michal Hocko
> SUSE Labs

