All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mina Almasry <almasrymina@google.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>,
	Peter Xu <peterx@redhat.com>, Linux-MM <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm, hugetlb: fix resv_huge_pages underflow on UFFDIO_COPY
Date: Thu, 20 May 2021 12:18:24 -0700	[thread overview]
Message-ID: <CAHS8izN3+DwCMnVotW7UoiROKEpBh=i+n2jb+oMJQwbKeegx7Q@mail.gmail.com> (raw)
In-Reply-To: <09dc0712-48e8-8ba2-f170-4c2febcfff83@oracle.com>

On Thu, May 13, 2021 at 5:14 PM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 5/13/21 4:49 PM, Mina Almasry wrote:
> > On Thu, May 13, 2021 at 4:43 PM Mina Almasry <almasrymina@google.com> wrote:
> >>
> >> When hugetlb_mcopy_atomic_pte() is called with:
> >> - mode==MCOPY_ATOMIC_NORMAL and,
> >> - we already have a page in the page cache corresponding to the
> >> associated address,
> >>
> >> We will allocate a huge page from the reserves, and then fail to insert it
> >> into the cache and return -EEXIST. In this case, we need to return -EEXIST
> >> without allocating a new page as the page already exists in the cache.
> >> Allocating the extra page causes the resv_huge_pages to underflow temporarily
> >> until the extra page is freed.
> >>
> >> To fix this we check if a page exists in the cache, and allocate it and
> >> insert it in the cache immediately while holding the lock. After that we
> >> copy the contents into the page.
> >>
> >> As a side effect of this, pages may exist in the cache for which the
> >> copy failed and for these pages PageUptodate(page) == false. Modify code
> >> that query the cache to handle this correctly.
> >>
> >
> > To be honest, I'm not sure I've done this bit correctly. Please take a
> > look and let me know what you think. It may be too overly complicated
> > to have !PageUptodate() pages in the cache and ask the rest of the
> > code to handle that edge case correctly, but I'm not sure how else to
> > fix this issue.
> >
>
> I think you just moved the underflow from hugetlb_mcopy_atomic_pte to
> hugetlb_no_page.  Why?
>
> Consider the case where there is only one reserve left and someone does
> the MCOPY_ATOMIC_NORMAL for the address.  We will allocate the page and
> consume the reserve (reserve count == 0) and insert the page into the
> cache.  Now, if the copy_huge_page_from_user fails we must drop the
> locks/fault mutex to do the copy.  While locks are dropped, someone
> faults on the address and ends up in hugetlb_no_page.  The page is in
> the cache but not up to date, so we go down the allocate new page path
> and will decrement the reserve count again to cause underflow.
>
> How about this approach?
> - Keep the check for hugetlbfs_pagecache_present in hugetlb_mcopy_atomic_pte
>   that you added.  That will catch the race where the page was added to
>   the cache before entering the routine.
> - With the above check in place, we only need to worry about the case
>   where copy_huge_page_from_user fails and we must drop locks.  In this
>   case we:
>   - Free the page previously allocated.
>   - Allocate a 'temporary' huge page without consuming reserves.  I'm
>     thinking of something similar to page migration.
>   - Drop the locks and let the copy_huge_page_from_user be done to the
>     temporary page.
>   - When reentering hugetlb_mcopy_atomic_pte after dropping locks (the
>     *pagep case) we need to once again check
>     hugetlbfs_pagecache_present.
>   - We then try to allocate the huge page which will consume the
>     reserve.  If successful, copy contents of temporary page to newly
>     allocated page.  Free temporary page.
>
> There may be issues with this, and I have not given it deep thought.  It
> does abuse the temporary huge page concept, but perhaps no more than
> page migration.  Things do slow down if the extra page allocation and
> copy is required, but that would only be the case if copy_huge_page_from_user
> needs to be done without locks.  Not sure, but hoping that is rare.

Just following up this a bit: I've implemented this approach locally,
and with it it's passing the test as-is. When I hack the code such
that the copy in hugetlb_mcopy_atomic_pte() always fails, I run into
this edge case, which causes resv_huge_pages to underflow again (this
time permemantly):

- hugetlb_no_page() is called on an index and a page is allocated and
inserted into the cache consuming the reservation.
- remove_huge_page() is called on this index and the page is removed from cache.
- hugetlb_mcopy_atomic_pte() is called on this index, we do not find
the page in the cache and we trigger this code patch and the copy
fails.
- The allocations in this code path seem to double consume the
reservation and resv_huge_pages underflows.

I'm looking at this edge case to understand why a prior
remove_huge_page() causes my code to underflow resv_huge_pages.

> --
> Mike Kravetz

  parent reply	other threads:[~2021-05-20 19:18 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-13 23:43 [PATCH] mm, hugetlb: fix resv_huge_pages underflow on UFFDIO_COPY Mina Almasry
2021-05-13 23:43 ` Mina Almasry
2021-05-13 23:49 ` Mina Almasry
2021-05-13 23:49   ` Mina Almasry
2021-05-14  0:14   ` Mike Kravetz
2021-05-14  0:23     ` Mina Almasry
2021-05-14  0:23       ` Mina Almasry
2021-05-14  4:02       ` Mike Kravetz
2021-05-14 12:31         ` Peter Xu
2021-05-14 17:56           ` Mike Kravetz
2021-05-14 18:30             ` Axel Rasmussen
2021-05-14 18:30               ` Axel Rasmussen
2021-05-14 19:16             ` Peter Xu
2021-05-20 19:18     ` Mina Almasry [this message]
2021-05-20 19:18       ` Mina Almasry
2021-05-20 19:21       ` Mina Almasry
2021-05-20 19:21         ` Mina Almasry
2021-05-20 20:00         ` Mike Kravetz
2021-05-20 20:31           ` Mina Almasry
2021-05-20 20:31             ` Mina Almasry
2021-05-21  2:05             ` Mina Almasry
2021-05-21  2:05               ` Mina Almasry
  -- strict thread matches above, loose matches on Subject: below --
2021-05-12  3:06 resv_huge_page underflow with userfaultfd test Mike Kravetz
     [not found] ` <20210512065813.89270-1-almasrymina@google.com>
2021-05-12  7:44   ` [PATCH] mm, hugetlb: fix resv_huge_pages underflow on UFFDIO_COPY Mina Almasry
2021-05-12  7:44     ` Mina Almasry
     [not found]   ` <CAJHvVch0ZMapPVEc0Ge5V4KDgNDNhECbqwDi0y9XxsxFXQZ-gg@mail.gmail.com>
     [not found]     ` <c455d241-11f6-95a6-eb29-0ddd94eedbd7@oracle.com>
2021-05-12 19:42       ` Mina Almasry
2021-05-12 19:42         ` Mina Almasry
2021-05-12 20:14         ` Peter Xu
2021-05-12 21:31           ` Mike Kravetz
2021-05-12 21:52             ` Mina Almasry
2021-05-12 21:52               ` Mina Almasry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHS8izN3+DwCMnVotW7UoiROKEpBh=i+n2jb+oMJQwbKeegx7Q@mail.gmail.com' \
    --to=almasrymina@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=axelrasmussen@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=peterx@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.