From: Mina Almasry <almasrymina@google.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Linux-MM <linux-mm@kvack.org>,
	 open list <linux-kernel@vger.kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	 Peter Xu <peterx@redhat.com>
Subject: Re: [PATCH] mm, hugetlb: fix resv_huge_pages underflow on UFFDIO_COPY
Date: Wed, 12 May 2021 00:44:57 -0700
Message-ID: <CAHS8izODzgEOCrorUmNjQZzOgAM3Kbv=DPbVpoDyrk0iKtRYMQ@mail.gmail.com>
In-Reply-To: <20210512065813.89270-1-almasrymina@google.com>

On Tue, May 11, 2021 at 11:58 PM Mina Almasry <almasrymina@google.com> wrote:
>
> When hugetlb_mcopy_atomic_pte() is called with:
> - mode==MCOPY_ATOMIC_NORMAL and,
> - we already have a page in the page cache corresponding to the
> associated address,
>
> We will allocate a huge page from the reserves, and then fail to insert it
> into the cache and return -EEXIST.
>
> In this case, we need to return -EEXIST without allocating a new page as
> the page already exists in the cache. Allocating the extra page causes
> the resv_huge_pages to underflow temporarily until the extra page is
> freed.
>
> Also, add the warning so that future similar instances of resv_huge_pages
> underflowing will be caught.
>
> Also, minor drive-by cleanups to this code path:
> - pagep is an out param never set by calling code, so delete code
> assuming there could be a valid page in there.
> - use hugetlbfs_pagecache_page() instead of repeating its
> implementation.
>
> Tested using:
> ./tools/testing/selftests/vm/userfaultfd hugetlb_shared 1024 200 \
> /mnt/huge
>
> Test passes, and dmesg shows no underflow warnings.
>
> Signed-off-by: Mina Almasry <almasrymina@google.com>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Axel Rasmussen <axelrasmussen@google.com>
> Cc: Peter Xu <peterx@redhat.com>
>
> ---
>  mm/hugetlb.c | 33 ++++++++++++++++++++-------------
>  1 file changed, 20 insertions(+), 13 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 629aa4c2259c..40f4ad1bca29 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1165,6 +1165,7 @@ static struct page *dequeue_huge_page_vma(struct hstate *h,
>         page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask);
>         if (page && !avoid_reserve && vma_has_reserves(vma, chg)) {
>                 SetHPageRestoreReserve(page);
> +               WARN_ON_ONCE(!h->resv_huge_pages);
>                 h->resv_huge_pages--;
>         }
>
> @@ -4868,30 +4869,39 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>                             struct page **pagep)
>  {
>         bool is_continue = (mode == MCOPY_ATOMIC_CONTINUE);
> -       struct address_space *mapping;
> -       pgoff_t idx;
> +       struct hstate *h = hstate_vma(dst_vma);
> +       struct address_space *mapping = dst_vma->vm_file->f_mapping;
> +       pgoff_t idx = vma_hugecache_offset(h, dst_vma, dst_addr);
>         unsigned long size;
>         int vm_shared = dst_vma->vm_flags & VM_SHARED;
> -       struct hstate *h = hstate_vma(dst_vma);
>         pte_t _dst_pte;
>         spinlock_t *ptl;
> -       int ret;
> +       int ret = -ENOMEM;
>         struct page *page;
>         int writable;
>
> -       mapping = dst_vma->vm_file->f_mapping;
> -       idx = vma_hugecache_offset(h, dst_vma, dst_addr);
> +       /* Out parameter. */
> +       WARN_ON(*pagep);
>
>         if (is_continue) {
>                 ret = -EFAULT;
> -               page = find_lock_page(mapping, idx);
> +               page = hugetlbfs_pagecache_page(h, dst_vma, dst_addr);
>                 if (!page)
>                         goto out;
> -       } else if (!*pagep) {
> -               ret = -ENOMEM;
> +       } else {
> +               /* If a page already exists, then it's UFFDIO_COPY for
> +                * a non-missing case. Return -EEXIST.
> +                */
> +               if (hugetlbfs_pagecache_present(h, dst_vma, dst_addr)) {
> +                       ret = -EEXIST;
> +                       goto out;
> +               }
> +
>                 page = alloc_huge_page(dst_vma, dst_addr, 0);
> -               if (IS_ERR(page))
> +               if (IS_ERR(page)) {
> +                       ret = -ENOMEM;
>                         goto out;
> +               }
>
>                 ret = copy_huge_page_from_user(page,
>                                                 (const void __user *) src_addr,
> @@ -4904,9 +4914,6 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>                         /* don't free the page */
>                         goto out;
>                 }
> -       } else {
> -               page = *pagep;
> -               *pagep = NULL;
>         }
>
>         /*
> --
> 2.31.1.607.g51e8a6a459-goog

I just realized I forgot to CC Andrew and the mailing lists on this
patch. I'll collect review comments and send a v2 to the correct
people and mailing lists.
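
To illustrate the ordering problem described in the commit message, here
is a minimal user-space sketch. It is not kernel code: struct
model_hstate and every model_* function are hypothetical names made up
for this example, and the model assumes the reservation for the page
already in the cache has been consumed, so the counter starts at zero.
It only shows why checking the cache before allocating avoids the
transient underflow that the new WARN_ON_ONCE() is meant to catch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one hstate field of interest. */
struct model_hstate {
	unsigned long resv_huge_pages;	/* reserved, not yet allocated */
	bool underflowed;		/* set where the kernel would warn */
};

/* Models taking a page from the reserves. */
static void model_take_reserve(struct model_hstate *h)
{
	if (!h->resv_huge_pages)
		h->underflowed = true;	/* analogous to WARN_ON_ONCE() */
	h->resv_huge_pages--;		/* unsigned: wraps around at zero */
}

/* Models freeing a page that had consumed a reservation. */
static void model_return_reserve(struct model_hstate *h)
{
	h->resv_huge_pages++;
}

/* Old ordering: allocate first, notice the cached page afterwards. */
static int model_copy_alloc_first(struct model_hstate *h, bool in_cache)
{
	model_take_reserve(h);
	if (in_cache) {
		model_return_reserve(h);	/* extra page is freed again */
		return -EEXIST;
	}
	return 0;
}

/* Patched ordering: bail out before touching the reserves. */
static int model_copy_check_first(struct model_hstate *h, bool in_cache)
{
	if (in_cache)
		return -EEXIST;
	model_take_reserve(h);
	return 0;
}

/* Small demonstration of both orderings with an empty reserve. */
void model_demo(void)
{
	struct model_hstate old_order = { 0, false };
	struct model_hstate new_order = { 0, false };

	assert(model_copy_alloc_first(&old_order, true) == -EEXIST);
	assert(old_order.underflowed);		/* dipped below zero */
	assert(old_order.resv_huge_pages == 0);	/* but only temporarily */

	assert(model_copy_check_first(&new_order, true) == -EEXIST);
	assert(!new_order.underflowed);		/* reserves never touched */
}
```

Both orderings return -EEXIST and leave the counter at its starting
value; the difference is only that the old ordering passes through the
wrapped-around state in between.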


