* [PATCH] userfaultfd: hugetlbfs: fix new flag usage in error path
@ 2021-05-21 23:39 Mike Kravetz
2021-05-22 19:41 ` Mina Almasry
2021-05-23 2:45 ` [External] " Muchun Song
0 siblings, 2 replies; 3+ messages in thread
From: Mike Kravetz @ 2021-05-21 23:39 UTC (permalink / raw)
To: linux-mm, linux-kernel
Cc: Oscar Salvador, Michal Hocko, Muchun Song, Naoya Horiguchi,
David Hildenbrand, Matthew Wilcox, Miaohe Lin, Mina Almasry,
Andrew Morton, Mike Kravetz, stable
In commit d6995da31122 ("hugetlb: use page.private for hugetlb specific
page flags") the use of PagePrivate to indicate a reservation count
should be restored at free time was changed to the hugetlb specific flag
HPageRestoreReserve. Changes to a userfaultfd error path as well as a
VM_BUG_ON() in remove_inode_hugepages() were overlooked.
Users could see incorrect hugetlb reserve counts if they experience an
error with a UFFDIO_COPY operation. Specifically, this would be the
result of an unlikely copy_huge_page_from_user error. There is not an
increased chance of hitting the VM_BUG_ON.
Fixes: d6995da31122 ("hugetlb: use page.private for hugetlb specific page flags")
Cc: <stable@vger.kernel.org>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
fs/hugetlbfs/inode.c | 2 +-
mm/userfaultfd.c | 28 ++++++++++++++--------------
2 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 9d9e0097c1d3..55efd3dd04f6 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -529,7 +529,7 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
* the subpool and global reserve usage count can need
* to be adjusted.
*/
- VM_BUG_ON(PagePrivate(page));
+ VM_BUG_ON(HPageRestoreReserve(page));
remove_huge_page(page);
freed++;
if (!truncate_op) {
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 5508f7d9e2dc..9ce5a3793ad4 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -432,38 +432,38 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
* If a reservation for the page existed in the reservation
* map of a private mapping, the map was modified to indicate
* the reservation was consumed when the page was allocated.
- * We clear the PagePrivate flag now so that the global
+ * We clear the HPageRestoreReserve flag now so that the global
* reserve count will not be incremented in free_huge_page.
* The reservation map will still indicate the reservation
* was consumed and possibly prevent later page allocation.
* This is better than leaking a global reservation. If no
- * reservation existed, it is still safe to clear PagePrivate
- * as no adjustments to reservation counts were made during
- * allocation.
+ * reservation existed, it is still safe to clear
+ * HPageRestoreReserve as no adjustments to reservation counts
+ * were made during allocation.
*
* The reservation map for shared mappings indicates which
* pages have reservations. When a huge page is allocated
* for an address with a reservation, no change is made to
- * the reserve map. In this case PagePrivate will be set
- * to indicate that the global reservation count should be
+ * the reserve map. In this case HPageRestoreReserve will be
+ * set to indicate that the global reservation count should be
* incremented when the page is freed. This is the desired
* behavior. However, when a huge page is allocated for an
* address without a reservation a reservation entry is added
- * to the reservation map, and PagePrivate will not be set.
- * When the page is freed, the global reserve count will NOT
- * be incremented and it will appear as though we have leaked
- * reserved page. In this case, set PagePrivate so that the
- * global reserve count will be incremented to match the
- * reservation map entry which was created.
+ * to the reservation map, and HPageRestoreReserve will not be
+ * set. When the page is freed, the global reserve count will
+ * NOT be incremented and it will appear as though we have
+ * leaked reserved page. In this case, set HPageRestoreReserve
+ * so that the global reserve count will be incremented to
+ * match the reservation map entry which was created.
*
* Note that vm_alloc_shared is based on the flags of the vma
* for which the page was originally allocated. dst_vma could
* be different or NULL on error.
*/
if (vm_alloc_shared)
- SetPagePrivate(page);
+ SetHPageRestoreReserve(page);
else
- ClearPagePrivate(page);
+ ClearHPageRestoreReserve(page);
put_page(page);
}
BUG_ON(copied < 0);
--
2.31.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] userfaultfd: hugetlbfs: fix new flag usage in error path
2021-05-21 23:39 [PATCH] userfaultfd: hugetlbfs: fix new flag usage in error path Mike Kravetz
@ 2021-05-22 19:41 ` Mina Almasry
2021-05-23 2:45 ` [External] " Muchun Song
1 sibling, 0 replies; 3+ messages in thread
From: Mina Almasry @ 2021-05-22 19:41 UTC (permalink / raw)
To: Mike Kravetz
Cc: Linux-MM, open list, Oscar Salvador, Michal Hocko, Muchun Song,
Naoya Horiguchi, David Hildenbrand, Matthew Wilcox, Miaohe Lin,
Andrew Morton, stable
On Fri, May 21, 2021 at 4:40 PM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> In commit d6995da31122 ("hugetlb: use page.private for hugetlb specific
> page flags") the use of PagePrivate to indicate a reservation count
> should be restored at free time was changed to the hugetlb specific flag
> HPageRestoreReserve. Changes to a userfaultfd error path as well as a
> VM_BUG_ON() in remove_inode_hugepages() were overlooked.
>
> Users could see incorrect hugetlb reserve counts if they experience an
> error with a UFFDIO_COPY operation. Specifically, this would be the
> result of an unlikely copy_huge_page_from_user error. There is not an
> increased chance of hitting the VM_BUG_ON.
>
> Fixes: d6995da31122 ("hugetlb: use page.private for hugetlb specific page flags")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> ---
> fs/hugetlbfs/inode.c | 2 +-
> mm/userfaultfd.c | 28 ++++++++++++++--------------
> 2 files changed, 15 insertions(+), 15 deletions(-)
>
Reviewed-by: Mina Almasry <almasry.mina@google.com>
I'm guessing this may interact with my patch in review "[PATCH v3] mm,
hugetlb: fix resv_huge_pages underflow on UFFDIO_COPY". I'll rebase my
patch and re-test.
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 9d9e0097c1d3..55efd3dd04f6 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -529,7 +529,7 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
> * the subpool and global reserve usage count can need
> * to be adjusted.
> */
> - VM_BUG_ON(PagePrivate(page));
> + VM_BUG_ON(HPageRestoreReserve(page));
> remove_huge_page(page);
> freed++;
> if (!truncate_op) {
> diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> index 5508f7d9e2dc..9ce5a3793ad4 100644
> --- a/mm/userfaultfd.c
> +++ b/mm/userfaultfd.c
> @@ -432,38 +432,38 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
> * If a reservation for the page existed in the reservation
> * map of a private mapping, the map was modified to indicate
> * the reservation was consumed when the page was allocated.
> - * We clear the PagePrivate flag now so that the global
> + * We clear the HPageRestoreReserve flag now so that the global
> * reserve count will not be incremented in free_huge_page.
> * The reservation map will still indicate the reservation
> * was consumed and possibly prevent later page allocation.
> * This is better than leaking a global reservation. If no
> - * reservation existed, it is still safe to clear PagePrivate
> - * as no adjustments to reservation counts were made during
> - * allocation.
> + * reservation existed, it is still safe to clear
> + * HPageRestoreReserve as no adjustments to reservation counts
> + * were made during allocation.
> *
> * The reservation map for shared mappings indicates which
> * pages have reservations. When a huge page is allocated
> * for an address with a reservation, no change is made to
> - * the reserve map. In this case PagePrivate will be set
> - * to indicate that the global reservation count should be
> + * the reserve map. In this case HPageRestoreReserve will be
> + * set to indicate that the global reservation count should be
> * incremented when the page is freed. This is the desired
> * behavior. However, when a huge page is allocated for an
> * address without a reservation a reservation entry is added
> - * to the reservation map, and PagePrivate will not be set.
> - * When the page is freed, the global reserve count will NOT
> - * be incremented and it will appear as though we have leaked
> - * reserved page. In this case, set PagePrivate so that the
> - * global reserve count will be incremented to match the
> - * reservation map entry which was created.
> + * to the reservation map, and HPageRestoreReserve will not be
> + * set. When the page is freed, the global reserve count will
> + * NOT be incremented and it will appear as though we have
> + * leaked reserved page. In this case, set HPageRestoreReserve
> + * so that the global reserve count will be incremented to
> + * match the reservation map entry which was created.
> *
> * Note that vm_alloc_shared is based on the flags of the vma
> * for which the page was originally allocated. dst_vma could
> * be different or NULL on error.
> */
> if (vm_alloc_shared)
> - SetPagePrivate(page);
> + SetHPageRestoreReserve(page);
> else
> - ClearPagePrivate(page);
> + ClearHPageRestoreReserve(page);
> put_page(page);
> }
> BUG_ON(copied < 0);
> --
> 2.31.1
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [External] [PATCH] userfaultfd: hugetlbfs: fix new flag usage in error path
2021-05-21 23:39 [PATCH] userfaultfd: hugetlbfs: fix new flag usage in error path Mike Kravetz
2021-05-22 19:41 ` Mina Almasry
@ 2021-05-23 2:45 ` Muchun Song
1 sibling, 0 replies; 3+ messages in thread
From: Muchun Song @ 2021-05-23 2:45 UTC (permalink / raw)
To: Mike Kravetz
Cc: Linux Memory Management List, LKML, Oscar Salvador, Michal Hocko,
Naoya Horiguchi, David Hildenbrand, Matthew Wilcox, Miaohe Lin,
Mina Almasry, Andrew Morton, linux- stable
On Sat, May 22, 2021 at 7:40 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> In commit d6995da31122 ("hugetlb: use page.private for hugetlb specific
> page flags") the use of PagePrivate to indicate a reservation count
> should be restored at free time was changed to the hugetlb specific flag
> HPageRestoreReserve. Changes to a userfaultfd error path as well as a
> VM_BUG_ON() in remove_inode_hugepages() were overlooked.
>
> Users could see incorrect hugetlb reserve counts if they experience an
> error with a UFFDIO_COPY operation. Specifically, this would be the
> result of an unlikely copy_huge_page_from_user error. There is not an
> increased chance of hitting the VM_BUG_ON.
>
> Fixes: d6995da31122 ("hugetlb: use page.private for hugetlb specific page flags")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Thanks Mike.
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2021-05-23 2:46 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-21 23:39 [PATCH] userfaultfd: hugetlbfs: fix new flag usage in error path Mike Kravetz
2021-05-22 19:41 ` Mina Almasry
2021-05-23 2:45 ` [External] " Muchun Song
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).