linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Joel Fernandes <joel@joelfernandes.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	 linux-mm@kvack.org, Shuah Khan <shuah@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,  Michal Hocko <mhocko@suse.com>,
	Lorenzo Stoakes <lstoakes@gmail.com>,
	 Kirill A Shutemov <kirill@shutemov.name>,
	"Liam R. Howlett" <liam.howlett@oracle.com>,
	 "Paul E. McKenney" <paulmck@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>
Subject: Re: [PATCH v2 1/4] mm/mremap: Optimize the start addresses in move_page_tables()
Date: Fri, 19 May 2023 18:52:28 -0400	[thread overview]
Message-ID: <CAEXW_YQ4wdGVa5M6jZfi5d-pdJOp1Nu7qTBvWYS=255AnYWZCw@mail.gmail.com> (raw)
In-Reply-To: <CAHk-=whoajP4bZMbZC_VYmBhmhCpXsBesszwWUH0i6SpK_dAtw@mail.gmail.com>

Hi Linus,

On Fri, May 19, 2023 at 3:21 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Fri, May 19, 2023 at 12:09 PM Joel Fernandes (Google)
> <joel@joelfernandes.org> wrote:
> >
> > +static bool check_addr_in_prev(struct vm_area_struct *vma, unsigned long addr,
> > +                              unsigned long mask)
> > +{
> > +       int addr_masked = addr & mask;
> > +       struct vm_area_struct *prev = NULL, *cur = NULL;
> > +
> > +       /* If the masked address is within vma, there is no prev mapping of concern. */
> > +       if (vma->vm_start <= addr_masked)
> > +               return false;
>
> Hmm.
>
> I should have caught this last time, but I didn't.
>
> That test smells bad to me. Or maybe it's just the comment.
>
> I *suspect* that the test is literally just for the stack movement
> case by execve, where it catches the case where we're doing the
> movement entirely within the one vma we set up.

Yes that's right, the test is only for the stack movement case. For
the regular mremap case, I don't think there is a way for it to
trigger.

> But in the *general* case I think the above is horribly wrong: if you
> want to move pages within an existing mapping, the page moving code
> can't just randomly say "I'll expand the area you wanted to move".
> Again, in that general mremap() case (as opposed to the special stack
> moving case for execve), I *think* that the caller has already split
> the vma's at the point of the move, and this test simply cannot ever
> trigger.

Yes, the test simply cannot ever trigger for mremap() but we still
need the test for the stack case. That's why I had considered adding
it and I had indeed reviewed the stack case when adding the test. I
could update the comment to mention that, if you want.

> So I think the _code_ works, but I think the comment in particular is
> questionable, and I'm a bit surprised about the code too,. because I
> thought execve() only expanded to exactly the moving area.

It expands to cover both the new start and the old end of the stack AFAICS:
          /*
           * cover the whole range: [new_start, old_end)
           */
          if (vma_expand(&vmi, vma, new_start, old_end, vma->vm_pgoff, NULL))
                  return -ENOMEM;

In this case, it will trigger for the stack because this same expanded
vma is passed as both the new vma and the old vma to
move_page_tables().

           */
          if (length != move_page_tables(vma, old_start,
                                         vma, new_start, length, false))
                  return -ENOMEM;

So AFAICS, it is possible that old_start will start later than this
newly expanded VMA. So for such a situation, old_start can be
realigned to PMD and the test allows that by saying it need not worry
about aligning down to an existing VMA.

> End result: I think the patch on the whole looks nice, and smaller
> than I expected. I also suspect it works in practice, but I'd like
> that test clarified. Does it *actually* trigger for the stack moving
> case? Because I think it must (never* trigger for the mremap case?

You are right that the test will not trigger for the mremap case. But
from a correctness standpoint, I thought of leaving it there for the
stack (and who knows for what other future reasons the test may be needed).
I can update the comment describing the stack if you like.

> And maybe I'm the one confused here, and all I really need is an
> explanation with small words and simple grammar starting with "No,
> Linus, this is for case xyz"

Hopefully it is clear now and you agree. Let me know if you want me to
do something more. I can make some time next week to trace the stack
case a bit more if you like and report back on any behaviors, however
the mremap tests I did are looking good and working as expected.

thanks,

 - Joel


  reply	other threads:[~2023-05-19 22:52 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-19 19:09 [PATCH v2 0/4] Optimize mremap during mutual alignment within PMD Joel Fernandes (Google)
2023-05-19 19:09 ` [PATCH v2 1/4] mm/mremap: Optimize the start addresses in move_page_tables() Joel Fernandes (Google)
2023-05-19 19:21   ` Linus Torvalds
2023-05-19 22:52     ` Joel Fernandes [this message]
2023-05-20  2:34       ` Linus Torvalds
2023-05-20  3:17         ` Joel Fernandes
2023-05-20  3:56           ` Joel Fernandes
2023-05-20  4:01             ` Linus Torvalds
2023-05-20  4:14               ` Joel Fernandes
2023-05-20  4:22                 ` Joel Fernandes
2023-05-20  5:04               ` Joel Fernandes
2023-05-19 19:09 ` [PATCH v2 2/4] selftests: mm: Fix failure case when new remap region was not found Joel Fernandes (Google)
2023-05-19 19:09 ` [PATCH v2 3/4] selftests: mm: Add a test for mutually aligned moves > PMD size Joel Fernandes (Google)
2023-05-19 19:09 ` [PATCH v2 4/4] selftests: mm: Add a test for remapping to area immediately after existing mapping Joel Fernandes (Google)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAEXW_YQ4wdGVa5M6jZfi5d-pdJOp1Nu7qTBvWYS=255AnYWZCw@mail.gmail.com' \
    --to=joel@joelfernandes.org \
    --cc=kirill@shutemov.name \
    --cc=liam.howlett@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lstoakes@gmail.com \
    --cc=mhocko@suse.com \
    --cc=paulmck@kernel.org \
    --cc=shuah@kernel.org \
    --cc=surenb@google.com \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).