linux-kernel.vger.kernel.org archive mirror
* [PATCH v2] mm: prohibit the last subpage from reusing the entire large folio
From: Barry Song @ 2024-03-08  9:27 UTC
  To: akpm, linux-mm
  Cc: minchan, fengwei.yin, linux-kernel, mhocko, peterx, ryan.roberts,
	shy828301, songmuchun, wangkefeng.wang, xiehuan09, zokeefe,
	chrisl, yuzhao, Barry Song, David Hildenbrand, Lance Yang

From: Barry Song <v-songbaohua@oppo.com>

In a Copy-on-Write (CoW) scenario, the last subpage will reuse the entire
large folio, resulting in the waste of (nr_pages - 1) pages. This wasted
memory remains allocated until it is either unmapped or memory
reclamation occurs.

The following small program demonstrates this behavior:

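 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>

 #define SIZE (1024 * 1024 * 1024UL)

 int main(void)
 {
         void *p = malloc(SIZE);

         if (!p)
                 return 1;
         memset(p, 0x11, SIZE);
         if (fork() == 0)
                 _exit(0);
         memset(p, 0x12, SIZE);
         printf("done\n");
         while (1)
                 ;
 }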

For example, using a 1024KiB mTHP by:
 echo always > /sys/kernel/mm/transparent_hugepage/hugepages-1024kB/enabled

(1) w/o the patch, it takes 2GiB,

Before running the test program,
 / # free -m
                total        used        free      shared  buff/cache   available
 Mem:            5754          84        5692           0          17        5669
 Swap:              0           0           0

 / # /a.out &
 / # done

After running the test program,
 / # free -m
                 total        used        free      shared  buff/cache   available
 Mem:            5754        2149        3627           0          19        3605
 Swap:              0           0           0

(2) w/ the patch, it takes 1GiB only,

Before running the test program,
 / # free -m
                 total        used        free      shared  buff/cache   available
 Mem:            5754          89        5687           0          17        5664
 Swap:              0           0           0

 / # /a.out &
 / # done

After running the test program,
 / # free -m
                total        used        free      shared  buff/cache   available
 Mem:            5754        1122        4655           0          17        4632
 Swap:              0           0           0

This patch migrates the last subpage to a small folio and immediately
returns the large folio to the system. It benefits both memory availability
and anti-fragmentation.

Cc: David Hildenbrand <david@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Lance Yang <ioworker0@gmail.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
 -v2:
  * return early, at the very beginning, for a large folio, as suggested
    by David, thanks!
 -v1:
 https://lore.kernel.org/linux-mm/20240308085653.124180-1-21cnbao@gmail.com/

 mm/memory.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index e17669d4f72f..f2bc6dd15eb8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3498,6 +3498,16 @@ static vm_fault_t wp_page_shared(struct vm_fault *vmf, struct folio *folio)
 static bool wp_can_reuse_anon_folio(struct folio *folio,
 				    struct vm_area_struct *vma)
 {
+	/*
+	 * We could currently only reuse a subpage of a large folio if no
+	 * other subpages of the large folios are still mapped. However,
+	 * let's just consistently not reuse subpages even if we could
+	 * reuse in that scenario, and give back a large folio a bit
+	 * sooner.
+	 */
+	if (folio_test_large(folio))
+		return false;
+
 	/*
 	 * We have to verify under folio lock: these early checks are
 	 * just an optimization to avoid locking the folio and freeing
-- 
2.34.1


* Re: [PATCH v2] mm: prohibit the last subpage from reusing the entire large folio
From: David Hildenbrand @ 2024-03-08  9:34 UTC
  To: Barry Song, akpm, linux-mm
  Cc: minchan, fengwei.yin, linux-kernel, mhocko, peterx, ryan.roberts,
	shy828301, songmuchun, wangkefeng.wang, xiehuan09, zokeefe,
	chrisl, yuzhao, Barry Song, Lance Yang

On 08.03.24 10:27, Barry Song wrote:
> This patch migrates the last subpage to a small folio and immediately
> returns the large folio to the system. It benefits both memory availability
> and anti-fragmentation.

It might be a controversial optimization and, as Ryan said, there are
likely other cases where we'd want to migrate off of a THP earlier, if
possible.

But I like that it just handles large folios now in a consistent way for 
the time being.

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb


* Re: [PATCH v2] mm: prohibit the last subpage from reusing the entire large folio
From: Ryan Roberts @ 2024-03-08 12:50 UTC
  To: David Hildenbrand, Barry Song, akpm, linux-mm
  Cc: minchan, fengwei.yin, linux-kernel, mhocko, peterx, shy828301,
	songmuchun, wangkefeng.wang, xiehuan09, zokeefe, chrisl, yuzhao,
	Barry Song, Lance Yang

On 08/03/2024 09:34, David Hildenbrand wrote:
> On 08.03.24 10:27, Barry Song wrote:
>> This patch migrates the last subpage to a small folio and immediately
>> returns the large folio to the system. It benefits both memory availability
>> and anti-fragmentation.
> 
> It might be a controversial optimization and, as Ryan said, there are likely
> other cases where we'd want to migrate off of a THP earlier, if possible.

Personally, I think there might also be cases where you want to copy/reuse the
entire large folio. If your application is using 16K THPs perhaps it's a
bigger win to just treat it like a base page? I expect the cost/benefit will
change as the THP size increases?

I know we have previously talked about using a khugepaged-like mechanism to
re-collapse after CoW, but for the smaller sizes maybe that's just a lot more
effort?

> 
> But I like that it just handles large folios now in a consistent way for the
> time being.

Yes agreed.

> 
> Acked-by: David Hildenbrand <david@redhat.com>
> 


* Re: [PATCH v2] mm: prohibit the last subpage from reusing the entire large folio
From: David Hildenbrand @ 2024-03-08 13:24 UTC
  To: Ryan Roberts, Barry Song, akpm, linux-mm
  Cc: minchan, fengwei.yin, linux-kernel, mhocko, peterx, shy828301,
	songmuchun, wangkefeng.wang, xiehuan09, zokeefe, chrisl, yuzhao,
	Barry Song, Lance Yang

>>> This patch migrates the last subpage to a small folio and immediately
>>> returns the large folio to the system. It benefits both memory availability
>>> and anti-fragmentation.
>>
>> It might be a controversial optimization and, as Ryan said, there are likely
>> other cases where we'd want to migrate off of a THP earlier, if possible.
>
> Personally, I think there might also be cases where you want to copy/reuse the
> entire large folio. If your application is using 16K THPs perhaps it's a
> bigger win to just treat it like a base page? I expect the cost/benefit will
> change as the THP size increases?

Yes, I think for small folios (e.g., 16KiB) it will be rather easy to
make a decision. The larger the folio, the larger the page fault latency
due to scanning, copying and modifying, which can easily become undesirable.

At least when it comes to page reuse, I have some simple backup plans 
for small folios if I won't be able to make progress with my other 
approach. For larger folios, it won't really work/be desirable, though.

-- 
Cheers,

David / dhildenb


* Re: [PATCH v2] mm: prohibit the last subpage from reusing the entire large folio
From: Ryan Roberts @ 2024-03-08 13:45 UTC
  To: David Hildenbrand, Barry Song, akpm, linux-mm
  Cc: minchan, fengwei.yin, linux-kernel, mhocko, peterx, shy828301,
	songmuchun, wangkefeng.wang, xiehuan09, zokeefe, chrisl, yuzhao,
	Barry Song, Lance Yang

On 08/03/2024 13:24, David Hildenbrand wrote:
>>>> This patch migrates the last subpage to a small folio and immediately
>>>> returns the large folio to the system. It benefits both memory availability
>>>> and anti-fragmentation.
>>>
>>> It might be a controversial optimization and, as Ryan said, there are likely
>>> other cases where we'd want to migrate off of a THP earlier, if possible.
>>
>> Personally, I think there might also be cases where you want to copy/reuse the
>> entire large folio. If your application is using 16K THPs perhaps it's a
>> bigger win to just treat it like a base page? I expect the cost/benefit will
>> change as the THP size increases?
>
> Yes, I think for small folios (e.g., 16KiB) it will be rather easy to make a
> decision. The larger the folio, the larger the page fault latency due to
> scanning, copying, modifying, which can easily turn undesirable.
> 
> At least when it comes to page reuse, I have some simple backup plans for small
> folios if I won't be able to make progress with my other approach. 

Do you mean "small large folios" here? i.e. order >= 1? If so, great!


> For larger folios, it won't really work/be desirable, though.
> 


* Re: [PATCH v2] mm: prohibit the last subpage from reusing the entire large folio
From: David Hildenbrand @ 2024-03-08 13:46 UTC
  To: Ryan Roberts, Barry Song, akpm, linux-mm
  Cc: minchan, fengwei.yin, linux-kernel, mhocko, peterx, shy828301,
	songmuchun, wangkefeng.wang, xiehuan09, zokeefe, chrisl, yuzhao,
	Barry Song, Lance Yang

On 08.03.24 14:45, Ryan Roberts wrote:
> On 08/03/2024 13:24, David Hildenbrand wrote:
>>>>> This patch migrates the last subpage to a small folio and immediately
>>>>> returns the large folio to the system. It benefits both memory availability
>>>>> and anti-fragmentation.
>>>>
>>>> It might be a controversial optimization and, as Ryan said, there are likely
>>>> other cases where we'd want to migrate off of a THP earlier, if possible.
>>>
>>> Personally, I think there might also be cases where you want to copy/reuse the
>>> entire large folio. If your application is using 16K THPs perhaps it's a
>>> bigger win to just treat it like a base page? I expect the cost/benefit will
>>> change as the THP size increases?
>>
>> Yes, I think for small folios (e.g., 16KiB) it will be rather easy to make a
>> decision. The larger the folio, the larger the page fault latency due to
>> scanning, copying, modifying, which can easily turn undesirable.
>>
>> At least when it comes to page reuse, I have some simple backup plans for small
>> folios if I won't be able to make progress with my other approach.
> 
> Do you mean "small large folios" here? i.e. order >= 1? If so, great!

*smaller*, yes :)

-- 
Cheers,

David / dhildenb

