[PATCH] hugetlbfs: fix description about atomic allocation of vmemmap pages when free huge page

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] hugetlbfs: fix description about atomic allocation of vmemmap pages when free huge page
@ 2022-03-15  4:23 luofei
  2022-03-15 13:29 ` Muchun Song
  0 siblings, 1 reply; 6+ messages in thread
From: luofei @ 2022-03-15  4:23 UTC (permalink / raw)
  To: mike.kravetz, akpm; +Cc: linux-mm, linux-kernel, luofei

No matter what context update_and_free_page() is called in,
the flag for allocating the vmemmap page is fixed
(GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE), and no atomic
allocation is involved, so the description of atomicity here
is somewhat inappropriate.

and the atomic parameter naming of update_and_free_page() is
somewhat misleading.

Signed-off-by: luofei <luofei@unicloud.com>
---
 mm/hugetlb.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f8ca7cca3c1a..239ef82b7897 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1570,8 +1570,8 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
 
 /*
  * As update_and_free_page() can be called under any context, so we cannot
- * use GFP_KERNEL to allocate vmemmap pages. However, we can defer the
- * actual freeing in a workqueue to prevent from using GFP_ATOMIC to allocate
+ * use GFP_ATOMIC to allocate vmemmap pages. However, we can defer the
+ * actual freeing in a workqueue to prevent waits caused by allocating
  * the vmemmap pages.
  *
  * free_hpage_workfn() locklessly retrieves the linked list of pages to be
@@ -1617,16 +1617,14 @@ static inline void flush_free_hpage_work(struct hstate *h)
 }
 
 static void update_and_free_page(struct hstate *h, struct page *page,
-				 bool atomic)
+				 bool delay)
 {
-	if (!HPageVmemmapOptimized(page) || !atomic) {
+	if (!HPageVmemmapOptimized(page) || !delay) {
 		__update_and_free_page(h, page);
 		return;
 	}
 
 	/*
-	 * Defer freeing to avoid using GFP_ATOMIC to allocate vmemmap pages.
-	 *
 	 * Only call schedule_work() if hpage_freelist is previously
 	 * empty. Otherwise, schedule_work() had been called but the workfn
 	 * hasn't retrieved the list yet.
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] hugetlbfs: fix description about atomic allocation of vmemmap pages when free huge page
  2022-03-15  4:23 [PATCH] hugetlbfs: fix description about atomic allocation of vmemmap pages when free huge page luofei
@ 2022-03-15 13:29 ` Muchun Song
  2022-03-15 21:16   ` Mike Kravetz
  2022-03-16  1:40   ` 罗飞
  0 siblings, 2 replies; 6+ messages in thread
From: Muchun Song @ 2022-03-15 13:29 UTC (permalink / raw)
  To: luofei; +Cc: Mike Kravetz, Andrew Morton, Linux Memory Management List, LKML

On Tue, Mar 15, 2022 at 12:24 PM luofei <luofei@unicloud.com> wrote:
>
> No matter what context update_and_free_page() is called in,
> the flag for allocating the vmemmap page is fixed
> (GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE), and no atomic
> allocation is involved, so the description of atomicity here
> is somewhat inappropriate.
>
> and the atomic parameter naming of update_and_free_page() is
> somewhat misleading.
>
> Signed-off-by: luofei <luofei@unicloud.com>
> ---
>  mm/hugetlb.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f8ca7cca3c1a..239ef82b7897 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1570,8 +1570,8 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
>
>  /*
>   * As update_and_free_page() can be called under any context, so we cannot
> - * use GFP_KERNEL to allocate vmemmap pages. However, we can defer the
> - * actual freeing in a workqueue to prevent from using GFP_ATOMIC to allocate
> + * use GFP_ATOMIC to allocate vmemmap pages. However, we can defer the
> + * actual freeing in a workqueue to prevent waits caused by allocating
>   * the vmemmap pages.
>   *
>   * free_hpage_workfn() locklessly retrieves the linked list of pages to be
> @@ -1617,16 +1617,14 @@ static inline void flush_free_hpage_work(struct hstate *h)
>  }
>
>  static void update_and_free_page(struct hstate *h, struct page *page,
> -                                bool atomic)
> +                                bool delay)

Hi luofei,

At least, I don't agree with this change.  The "atomic" means if the
caller is under atomic context instead of whether using atomic
GFP_MASK.  The "delay" seems to tell the caller that it can undelay
the allocation even if it is under atomic context (actually, it has no
choice).  But "atomic" can indicate the user is being asked to tell us
if it is under atomic context.

Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] hugetlbfs: fix description about atomic allocation of vmemmap pages when free huge page
  2022-03-15 13:29 ` Muchun Song
@ 2022-03-15 21:16   ` Mike Kravetz
  2022-03-16  0:42     ` Muchun Song
  2022-03-16  2:26     ` 答复: " 罗飞
  2022-03-16  1:40   ` 罗飞
  1 sibling, 2 replies; 6+ messages in thread
From: Mike Kravetz @ 2022-03-15 21:16 UTC (permalink / raw)
  To: Muchun Song, luofei; +Cc: Andrew Morton, Linux Memory Management List, LKML

On 3/15/22 06:29, Muchun Song wrote:
> On Tue, Mar 15, 2022 at 12:24 PM luofei <luofei@unicloud.com> wrote:
>>
>> No matter what context update_and_free_page() is called in,
>> the flag for allocating the vmemmap page is fixed
>> (GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE), and no atomic
>> allocation is involved, so the description of atomicity here
>> is somewhat inappropriate.
>>
>> and the atomic parameter naming of update_and_free_page() is
>> somewhat misleading.
>>
>> Signed-off-by: luofei <luofei@unicloud.com>
>> ---
>>  mm/hugetlb.c | 10 ++++------
>>  1 file changed, 4 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index f8ca7cca3c1a..239ef82b7897 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1570,8 +1570,8 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
>>
>>  /*
>>   * As update_and_free_page() can be called under any context, so we cannot
>> - * use GFP_KERNEL to allocate vmemmap pages. However, we can defer the
>> - * actual freeing in a workqueue to prevent from using GFP_ATOMIC to allocate
>> + * use GFP_ATOMIC to allocate vmemmap pages. However, we can defer the
>> + * actual freeing in a workqueue to prevent waits caused by allocating
>>   * the vmemmap pages.
>>   *
>>   * free_hpage_workfn() locklessly retrieves the linked list of pages to be
>> @@ -1617,16 +1617,14 @@ static inline void flush_free_hpage_work(struct hstate *h)
>>  }
>>
>>  static void update_and_free_page(struct hstate *h, struct page *page,
>> -                                bool atomic)
>> +                                bool delay)
> 
> Hi luofei,
> 
> At least, I don't agree with this change.  The "atomic" means if the
> caller is under atomic context instead of whether using atomic
> GFP_MASK.  The "delay" seems to tell the caller that it can undelay
> the allocation even if it is under atomic context (actually, it has no
> choice).  But "atomic" can indicate the user is being asked to tell us
> if it is under atomic context.

There may be some confusion since GFP_ATOMIC is mentioned in the comments
and GFP_ATOMIC is not used in the allocation of vmemmap pages.  IIRC,
the use of GFP_ATOMIC was discussed at one time but dismissed because of
undesired side effects such as dipping into "atomic reserves".

How about an update to the comments as follows (sorry mailer may mess up
formatting)?

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f8ca7cca3c1a..6a4d27e24b21 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1569,10 +1569,12 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
 }
 
 /*
- * As update_and_free_page() can be called under any context, so we cannot
- * use GFP_KERNEL to allocate vmemmap pages. However, we can defer the
- * actual freeing in a workqueue to prevent from using GFP_ATOMIC to allocate
- * the vmemmap pages.
+ * Freeing hugetlb pages in done in update_and_free_page().  When freeing a
+ * hugetlb page, vmemmap pages may need to be allocated.  The routine
+ * alloc_huge_page_vmemmap() can possibly sleep as it uses GFP_KERNEL.
+ * However, update_and_free_page() can be called under any context.  To
+ * avoid the possibility of sleeping in a context where sleeping is not
+ * allowed, defer the actual freeing in a workqueue where sleeping is allowed.
  *
  * free_hpage_workfn() locklessly retrieves the linked list of pages to be
  * freed and frees them one-by-one. As the page->mapping pointer is going
@@ -1616,6 +1618,10 @@ static inline void flush_free_hpage_work(struct hstate *h)
 		flush_work(&free_hpage_work);
 }
 
+/*
+ * atomic == true indicates called from a context where sleeping is
+ * not allowed.
+ */
 static void update_and_free_page(struct hstate *h, struct page *page,
 				 bool atomic)
 {
@@ -1625,7 +1631,8 @@ static void update_and_free_page(struct hstate *h, struct page *page,
 	}
 
 	/*
-	 * Defer freeing to avoid using GFP_ATOMIC to allocate vmemmap pages.
+	 * Defer freeing to avoid possible sleeping when allocating
+	 * vmemmap pages.
 	 *
 	 * Only call schedule_work() if hpage_freelist is previously
 	 * empty. Otherwise, schedule_work() had been called but the workfn

-- 
Mike Kravetz

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] hugetlbfs: fix description about atomic allocation of vmemmap pages when free huge page
  2022-03-15 21:16   ` Mike Kravetz
@ 2022-03-16  0:42     ` Muchun Song
  2022-03-16  2:26     ` 答复: " 罗飞
  1 sibling, 0 replies; 6+ messages in thread
From: Muchun Song @ 2022-03-16  0:42 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: luofei, Andrew Morton, Linux Memory Management List, LKML

On Wed, Mar 16, 2022 at 5:16 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 3/15/22 06:29, Muchun Song wrote:
> > On Tue, Mar 15, 2022 at 12:24 PM luofei <luofei@unicloud.com> wrote:
> >>
> >> No matter what context update_and_free_page() is called in,
> >> the flag for allocating the vmemmap page is fixed
> >> (GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE), and no atomic
> >> allocation is involved, so the description of atomicity here
> >> is somewhat inappropriate.
> >>
> >> and the atomic parameter naming of update_and_free_page() is
> >> somewhat misleading.
> >>
> >> Signed-off-by: luofei <luofei@unicloud.com>
> >> ---
> >>  mm/hugetlb.c | 10 ++++------
> >>  1 file changed, 4 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> >> index f8ca7cca3c1a..239ef82b7897 100644
> >> --- a/mm/hugetlb.c
> >> +++ b/mm/hugetlb.c
> >> @@ -1570,8 +1570,8 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
> >>
> >>  /*
> >>   * As update_and_free_page() can be called under any context, so we cannot
> >> - * use GFP_KERNEL to allocate vmemmap pages. However, we can defer the
> >> - * actual freeing in a workqueue to prevent from using GFP_ATOMIC to allocate
> >> + * use GFP_ATOMIC to allocate vmemmap pages. However, we can defer the
> >> + * actual freeing in a workqueue to prevent waits caused by allocating
> >>   * the vmemmap pages.
> >>   *
> >>   * free_hpage_workfn() locklessly retrieves the linked list of pages to be
> >> @@ -1617,16 +1617,14 @@ static inline void flush_free_hpage_work(struct hstate *h)
> >>  }
> >>
> >>  static void update_and_free_page(struct hstate *h, struct page *page,
> >> -                                bool atomic)
> >> +                                bool delay)
> >
> > Hi luofei,
> >
> > At least, I don't agree with this change.  The "atomic" means if the
> > caller is under atomic context instead of whether using atomic
> > GFP_MASK.  The "delay" seems to tell the caller that it can undelay
> > the allocation even if it is under atomic context (actually, it has no
> > choice).  But "atomic" can indicate the user is being asked to tell us
> > if it is under atomic context.
>
> There may be some confusion since GFP_ATOMIC is mentioned in the comments
> and GFP_ATOMIC is not used in the allocation of vmemmap pages.  IIRC,
> the use of GFP_ATOMIC was discussed at one time but dismissed because of
> undesired side effects such as dipping into "atomic reserves".
>
> How about an update to the comments as follows (sorry mailer may mess up
> formatting)?
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f8ca7cca3c1a..6a4d27e24b21 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1569,10 +1569,12 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
>  }
>
>  /*
> - * As update_and_free_page() can be called under any context, so we cannot
> - * use GFP_KERNEL to allocate vmemmap pages. However, we can defer the
> - * actual freeing in a workqueue to prevent from using GFP_ATOMIC to allocate
> - * the vmemmap pages.
> + * Freeing hugetlb pages in done in update_and_free_page().  When freeing a
> + * hugetlb page, vmemmap pages may need to be allocated.  The routine
> + * alloc_huge_page_vmemmap() can possibly sleep as it uses GFP_KERNEL.
> + * However, update_and_free_page() can be called under any context.  To
> + * avoid the possibility of sleeping in a context where sleeping is not
> + * allowed, defer the actual freeing in a workqueue where sleeping is allowed.
>   *
>   * free_hpage_workfn() locklessly retrieves the linked list of pages to be
>   * freed and frees them one-by-one. As the page->mapping pointer is going
> @@ -1616,6 +1618,10 @@ static inline void flush_free_hpage_work(struct hstate *h)
>                 flush_work(&free_hpage_work);
>  }
>
> +/*
> + * atomic == true indicates called from a context where sleeping is
> + * not allowed.
> + */
>  static void update_and_free_page(struct hstate *h, struct page *page,
>                                  bool atomic)
>  {
> @@ -1625,7 +1631,8 @@ static void update_and_free_page(struct hstate *h, struct page *page,
>         }
>
>         /*
> -        * Defer freeing to avoid using GFP_ATOMIC to allocate vmemmap pages.
> +        * Defer freeing to avoid possible sleeping when allocating
> +        * vmemmap pages.
>          *
>          * Only call schedule_work() if hpage_freelist is previously
>          * empty. Otherwise, schedule_work() had been called but the workfn
>

LGTM. Thanks Mike.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* 答复: [PATCH] hugetlbfs: fix description about atomic allocation of vmemmap pages when free huge page
  2022-03-15 13:29 ` Muchun Song
  2022-03-15 21:16   ` Mike Kravetz
@ 2022-03-16  1:40   ` 罗飞
  1 sibling, 0 replies; 6+ messages in thread
From: 罗飞 @ 2022-03-16  1:40 UTC (permalink / raw)
  To: Muchun Song
  Cc: Mike Kravetz, Andrew Morton, Linux Memory Management List, LKML

>> No matter what context update_and_free_page() is called in,
>> the flag for allocating the vmemmap page is fixed
>> (GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE), and no atomic
>> allocation is involved, so the description of atomicity here
>> is somewhat inappropriate.
>>
>> and the atomic parameter naming of update_and_free_page() is
>> somewhat misleading.
>>
>> Signed-off-by: luofei <luofei@unicloud.com>
>> ---
>>  mm/hugetlb.c | 10 ++++------
>>  1 file changed, 4 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index f8ca7cca3c1a..239ef82b7897 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1570,8 +1570,8 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
>>
>>  /*
>>   * As update_and_free_page() can be called under any context, so we cannot
>> - * use GFP_KERNEL to allocate vmemmap pages. However, we can defer the
>> - * actual freeing in a workqueue to prevent from using GFP_ATOMIC to allocate
>> + * use GFP_ATOMIC to allocate vmemmap pages. However, we can defer the
>> + * actual freeing in a workqueue to prevent waits caused by allocating
>>   * the vmemmap pages.
>>   *
>>   * free_hpage_workfn() locklessly retrieves the linked list of pages to be
>> @@ -1617,16 +1617,14 @@ static inline void flush_free_hpage_work(struct hstate *h)
>>  }
>>
>>  static void update_and_free_page(struct hstate *h, struct page *page,
>> -                                bool atomic)
>> +                                bool delay)
>
>Hi luofei,
>
>At least, I don't agree with this change.  The "atomic" means if the
>caller is under atomic context instead of whether using atomic
>GFP_MASK.  The "delay" seems to tell the caller that it can undelay
>the allocation even if it is under atomic context (actually, it has no
>choice).  But "atomic" can indicate the user is being asked to tell us
>if it is under atomic context.

Based on the comments related to avoiding atomic allocation pages,
I would have thought that atomic variable had something to do with
this, it seems I misunderstood the meaning of the variable.

Thanks:)
________________________________________
发件人: Muchun Song <songmuchun@bytedance.com>
发送时间: 2022年3月15日 21:29:27
收件人: 罗飞
抄送: Mike Kravetz; Andrew Morton; Linux Memory Management List; LKML
主题: Re: [PATCH] hugetlbfs: fix description about atomic allocation of vmemmap pages when free huge page

On Tue, Mar 15, 2022 at 12:24 PM luofei <luofei@unicloud.com> wrote:
>
> No matter what context update_and_free_page() is called in,
> the flag for allocating the vmemmap page is fixed
> (GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE), and no atomic
> allocation is involved, so the description of atomicity here
> is somewhat inappropriate.
>
> and the atomic parameter naming of update_and_free_page() is
> somewhat misleading.
>
> Signed-off-by: luofei <luofei@unicloud.com>
> ---
>  mm/hugetlb.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f8ca7cca3c1a..239ef82b7897 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1570,8 +1570,8 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
>
>  /*
>   * As update_and_free_page() can be called under any context, so we cannot
> - * use GFP_KERNEL to allocate vmemmap pages. However, we can defer the
> - * actual freeing in a workqueue to prevent from using GFP_ATOMIC to allocate
> + * use GFP_ATOMIC to allocate vmemmap pages. However, we can defer the
> + * actual freeing in a workqueue to prevent waits caused by allocating
>   * the vmemmap pages.
>   *
>   * free_hpage_workfn() locklessly retrieves the linked list of pages to be
> @@ -1617,16 +1617,14 @@ static inline void flush_free_hpage_work(struct hstate *h)
>  }
>
>  static void update_and_free_page(struct hstate *h, struct page *page,
> -                                bool atomic)
> +                                bool delay)

Hi luofei,

At least, I don't agree with this change.  The "atomic" means if the
caller is under atomic context instead of whether using atomic
GFP_MASK.  The "delay" seems to tell the caller that it can undelay
the allocation even if it is under atomic context (actually, it has no
choice).  But "atomic" can indicate the user is being asked to tell us
if it is under atomic context.

Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* 答复: [PATCH] hugetlbfs: fix description about atomic allocation of vmemmap pages when free huge page
  2022-03-15 21:16   ` Mike Kravetz
  2022-03-16  0:42     ` Muchun Song
@ 2022-03-16  2:26     ` 罗飞
  1 sibling, 0 replies; 6+ messages in thread
From: 罗飞 @ 2022-03-16  2:26 UTC (permalink / raw)
  To: Mike Kravetz, Muchun Song
  Cc: Andrew Morton, Linux Memory Management List, LKML

>>>
>>> No matter what context update_and_free_page() is called in,
>>> the flag for allocating the vmemmap page is fixed
>>> (GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE), and no atomic
>>> allocation is involved, so the description of atomicity here
>>> is somewhat inappropriate.
>>>
>>> and the atomic parameter naming of update_and_free_page() is
>>> somewhat misleading.
>>>
>>> Signed-off-by: luofei <luofei@unicloud.com>
>>> ---
>>>  mm/hugetlb.c | 10 ++++------
>>>  1 file changed, 4 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>> index f8ca7cca3c1a..239ef82b7897 100644
>>> --- a/mm/hugetlb.c
>>> +++ b/mm/hugetlb.c
>>> @@ -1570,8 +1570,8 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
>>>
>>>  /*
>>>   * As update_and_free_page() can be called under any context, so we cannot
>>> - * use GFP_KERNEL to allocate vmemmap pages. However, we can defer the
>>> - * actual freeing in a workqueue to prevent from using GFP_ATOMIC to allocate
>>> + * use GFP_ATOMIC to allocate vmemmap pages. However, we can defer the
>>> + * actual freeing in a workqueue to prevent waits caused by allocating
>>>   * the vmemmap pages.
>>>   *
>>>   * free_hpage_workfn() locklessly retrieves the linked list of pages to be
>>> @@ -1617,16 +1617,14 @@ static inline void flush_free_hpage_work(struct hstate *h)
>>>  }
>>>
>>>  static void update_and_free_page(struct hstate *h, struct page *page,
>>> -                                bool atomic)
>>> +                                bool delay)
>>
>> Hi luofei,
>>
>> At least, I don't agree with this change.  The "atomic" means if the
>> caller is under atomic context instead of whether using atomic
>> GFP_MASK.  The "delay" seems to tell the caller that it can undelay
>> the allocation even if it is under atomic context (actually, it has no
>> choice).  But "atomic" can indicate the user is being asked to tell us
>> if it is under atomic context.
>
>There may be some confusion since GFP_ATOMIC is mentioned in the comments
>and GFP_ATOMIC is not used in the allocation of vmemmap pages.  IIRC,
>the use of GFP_ATOMIC was discussed at one time but dismissed because of
>undesired side effects such as dipping into "atomic reserves".
>
>How about an update to the comments as follows (sorry mailer may mess up
>formatting)?
>
>diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>index f8ca7cca3c1a..6a4d27e24b21 100644
>--- a/mm/hugetlb.c
>+++ b/mm/hugetlb.c
>@@ -1569,10 +1569,12 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
>}
>
>/*
>- * As update_and_free_page() can be called under any context, so we cannot
>- * use GFP_KERNEL to allocate vmemmap pages. However, we can defer the
>- * actual freeing in a workqueue to prevent from using GFP_ATOMIC to allocate
>- * the vmemmap pages.
>+ * Freeing hugetlb pages in done in update_and_free_page().  When freeing a
>+ * hugetlb page, vmemmap pages may need to be allocated.  The routine
>+ * alloc_huge_page_vmemmap() can possibly sleep as it uses GFP_KERNEL.
>+ * However, update_and_free_page() can be called under any context.  To
>+ * avoid the possibility of sleeping in a context where sleeping is not
>+ * allowed, defer the actual freeing in a workqueue where sleeping is allowed.
> *
>  * free_hpage_workfn() locklessly retrieves the linked list of pages to be
>  * freed and frees them one-by-one. As the page->mapping pointer is going
>@@ -1616,6 +1618,10 @@ static inline void flush_free_hpage_work(struct hstate *h)
>                 flush_work(&free_hpage_work);
> }
>
>+/*
>+ * atomic == true indicates called from a context where sleeping is
>+ * not allowed.
>+ */
> static void update_and_free_page(struct hstate *h, struct page *page,
>                                 bool atomic)
> {
>@@ -1625,7 +1631,8 @@ static void update_and_free_page(struct hstate *h, struct page *page,
>         }
>
>        /*
>-        * Defer freeing to avoid using GFP_ATOMIC to allocate vmemmap pages.
>+        * Defer freeing to avoid possible sleeping when allocating
>+        * vmemmap pages.
>          *
>          * Only call schedule_work() if hpage_freelist is previously
>          * empty. Otherwise, schedule_work() had been called but the workfn

The comments made it clearer for me. I will revise the patch. Thanks :)
________________________________________
发件人: Mike Kravetz <mike.kravetz@oracle.com>
发送时间: 2022年3月16日 5:16:14
收件人: Muchun Song; 罗飞
抄送: Andrew Morton; Linux Memory Management List; LKML
主题: Re: [PATCH] hugetlbfs: fix description about atomic allocation of vmemmap pages when free huge page

On 3/15/22 06:29, Muchun Song wrote:
> On Tue, Mar 15, 2022 at 12:24 PM luofei <luofei@unicloud.com> wrote:
>>
>> No matter what context update_and_free_page() is called in,
>> the flag for allocating the vmemmap page is fixed
>> (GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE), and no atomic
>> allocation is involved, so the description of atomicity here
>> is somewhat inappropriate.
>>
>> and the atomic parameter naming of update_and_free_page() is
>> somewhat misleading.
>>
>> Signed-off-by: luofei <luofei@unicloud.com>
>> ---
>>  mm/hugetlb.c | 10 ++++------
>>  1 file changed, 4 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index f8ca7cca3c1a..239ef82b7897 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1570,8 +1570,8 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
>>
>>  /*
>>   * As update_and_free_page() can be called under any context, so we cannot
>> - * use GFP_KERNEL to allocate vmemmap pages. However, we can defer the
>> - * actual freeing in a workqueue to prevent from using GFP_ATOMIC to allocate
>> + * use GFP_ATOMIC to allocate vmemmap pages. However, we can defer the
>> + * actual freeing in a workqueue to prevent waits caused by allocating
>>   * the vmemmap pages.
>>   *
>>   * free_hpage_workfn() locklessly retrieves the linked list of pages to be
>> @@ -1617,16 +1617,14 @@ static inline void flush_free_hpage_work(struct hstate *h)
>>  }
>>
>>  static void update_and_free_page(struct hstate *h, struct page *page,
>> -                                bool atomic)
>> +                                bool delay)
>
> Hi luofei,
>
> At least, I don't agree with this change.  The "atomic" means if the
> caller is under atomic context instead of whether using atomic
> GFP_MASK.  The "delay" seems to tell the caller that it can undelay
> the allocation even if it is under atomic context (actually, it has no
> choice).  But "atomic" can indicate the user is being asked to tell us
> if it is under atomic context.

There may be some confusion since GFP_ATOMIC is mentioned in the comments
and GFP_ATOMIC is not used in the allocation of vmemmap pages.  IIRC,
the use of GFP_ATOMIC was discussed at one time but dismissed because of
undesired side effects such as dipping into "atomic reserves".

How about an update to the comments as follows (sorry mailer may mess up
formatting)?

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f8ca7cca3c1a..6a4d27e24b21 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1569,10 +1569,12 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
 }

 /*
- * As update_and_free_page() can be called under any context, so we cannot
- * use GFP_KERNEL to allocate vmemmap pages. However, we can defer the
- * actual freeing in a workqueue to prevent from using GFP_ATOMIC to allocate
- * the vmemmap pages.
+ * Freeing hugetlb pages in done in update_and_free_page().  When freeing a
+ * hugetlb page, vmemmap pages may need to be allocated.  The routine
+ * alloc_huge_page_vmemmap() can possibly sleep as it uses GFP_KERNEL.
+ * However, update_and_free_page() can be called under any context.  To
+ * avoid the possibility of sleeping in a context where sleeping is not
+ * allowed, defer the actual freeing in a workqueue where sleeping is allowed.
  *
  * free_hpage_workfn() locklessly retrieves the linked list of pages to be
  * freed and frees them one-by-one. As the page->mapping pointer is going
@@ -1616,6 +1618,10 @@ static inline void flush_free_hpage_work(struct hstate *h)
                flush_work(&free_hpage_work);
 }

+/*
+ * atomic == true indicates called from a context where sleeping is
+ * not allowed.
+ */
 static void update_and_free_page(struct hstate *h, struct page *page,
                                 bool atomic)
 {
@@ -1625,7 +1631,8 @@ static void update_and_free_page(struct hstate *h, struct page *page,
        }

        /*
-        * Defer freeing to avoid using GFP_ATOMIC to allocate vmemmap pages.
+        * Defer freeing to avoid possible sleeping when allocating
+        * vmemmap pages.
         *
         * Only call schedule_work() if hpage_freelist is previously
         * empty. Otherwise, schedule_work() had been called but the workfn

--
Mike Kravetz

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-03-16  2:27 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-15  4:23 [PATCH] hugetlbfs: fix description about atomic allocation of vmemmap pages when free huge page luofei
2022-03-15 13:29 ` Muchun Song
2022-03-15 21:16   ` Mike Kravetz
2022-03-16  0:42     ` Muchun Song
2022-03-16  2:26     ` 答复: " 罗飞
2022-03-16  1:40   ` 罗飞

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).