* [RFC PATCH] mm: shm: round up tmpfs size to huge page size when huge=always
@ 2017-10-06 20:22 ` Yang Shi
  0 siblings, 0 replies; 16+ messages in thread
From: Yang Shi @ 2017-10-06 20:22 UTC (permalink / raw)
  To: kirill.shutemov, hughd, mhocko, akpm; +Cc: Yang Shi, linux-mm, linux-kernel

When tmpfs is mounted with the "huge=always" option, THP is supposed to
be allocated whenever it can fit. But when the available space is smaller
than the size of a THP (2MB on x86), the shmem fault handler still tries
to allocate a huge page every time, then falls back to regular 4K page
allocation, e.g.:

	# mount -t tmpfs -o huge=always,size=3000k tmpfs /tmp
	# dd if=/dev/zero of=/tmp/test bs=1k count=2048
	# dd if=/dev/zero of=/tmp/test1 bs=1k count=2048

The last dd command goes through the page fault handler 952 times, then
exits with -ENOSPC.

Round up the tmpfs size to the THP size in order to use THP with "always"
more efficiently. This will not waste too much memory (just 511 extra
pages in the worst case).

Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
---
 mm/shmem.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/shmem.c b/mm/shmem.c
index 07a1d22..b2b595d 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3567,6 +3567,11 @@ static int shmem_parse_options(char *options, struct shmem_sb_info *sbinfo,
 		}
 	}
 	sbinfo->mpol = mpol;
+#ifdef CONFIG_TRANSPARENT_HUGE_PAGECACHE
+	/* Round up tmpfs size to huge page size */
+	if (sbinfo->max_blocks && sbinfo->huge == SHMEM_HUGE_ALWAYS)
+		sbinfo->max_blocks = round_up(sbinfo->max_blocks, HPAGE_PMD_NR);
+#endif
 	return 0;
 
 bad_val:
-- 
1.8.3.1
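
For readers following the numbers in the patch description, here is a
minimal userspace sketch of the rounding arithmetic. It is not kernel
code: round_up_pow2() and kb_to_blocks() are hypothetical stand-ins, and
the constants assume x86-64 with 4K base pages and a 2MB THP (so
HPAGE_PMD_NR is taken to be 512).

```c
#include <assert.h>

/* Assumed values for x86-64: 4K base pages, 2MB PMD-sized THP. */
#define PAGE_SIZE_KB	4UL
#define HPAGE_PMD_NR	512UL	/* base pages per huge page */

/*
 * Hypothetical userspace stand-in for the kernel's round_up().
 * Like the kernel macro, it only works for power-of-two alignments.
 */
static inline unsigned long round_up_pow2(unsigned long x, unsigned long align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Convert a size= value given in KB to tmpfs blocks (base pages), rounding up. */
static inline unsigned long kb_to_blocks(unsigned long kb)
{
	return (kb + PAGE_SIZE_KB - 1) / PAGE_SIZE_KB;
}
```

With size=3000k this gives 750 blocks, which round up to 1024 blocks
(two huge pages, 274 extra base pages). The worst case is a size one
block above a multiple of 512, which gains 511 blocks. Note also that
3000k - 2048k leaves 952k after the first file, which would match the
952 1k writes that the second dd gets through before -ENOSPC.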

* Re: [RFC PATCH] mm: shm: round up tmpfs size to huge page size when huge=always
  2017-10-06 20:22 ` Yang Shi
@ 2017-10-08 12:56   ` Kirill A. Shutemov
  -1 siblings, 0 replies; 16+ messages in thread
From: Kirill A. Shutemov @ 2017-10-08 12:56 UTC (permalink / raw)
  To: Yang Shi; +Cc: kirill.shutemov, hughd, mhocko, akpm, linux-mm, linux-kernel

On Sat, Oct 07, 2017 at 04:22:10AM +0800, Yang Shi wrote:
> When passing "huge=always" option for mounting tmpfs, THP is supposed to
> be allocated all the time when it can fit, but when the available space is
> smaller than the size of THP (2MB on x86), shmem fault handler still tries
> to allocate huge page every time, then fallback to regular 4K page
> allocation, i.e.:
> 
> 	# mount -t tmpfs -o huge,size=3000k tmpfs /tmp
> 	# dd if=/dev/zero of=/tmp/test bs=1k count=2048
> 	# dd if=/dev/zero of=/tmp/test1 bs=1k count=2048
> 
> The last dd command will handle 952 times page fault handler, then exit
> with -ENOSPC.
> 
> Rounding up tmpfs size to THP size in order to use THP with "always"
> more efficiently. And, it will not wast too much memory (just allocate
> 511 extra pages in worst case).

Hm. I don't think it's a good idea to silently increase the size of the fs.

Maybe it's better to just refuse to mount with huge=always if the fs is
too small?

-- 
 Kirill A. Shutemov

* Re: [RFC PATCH] mm: shm: round up tmpfs size to huge page size when huge=always
  2017-10-08 12:56   ` Kirill A. Shutemov
@ 2017-10-08 19:51     ` Yang Shi
  -1 siblings, 0 replies; 16+ messages in thread
From: Yang Shi @ 2017-10-08 19:51 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: kirill.shutemov, hughd, mhocko, akpm, linux-mm, linux-kernel



On 10/8/17 5:56 AM, Kirill A. Shutemov wrote:
> On Sat, Oct 07, 2017 at 04:22:10AM +0800, Yang Shi wrote:
>> When passing "huge=always" option for mounting tmpfs, THP is supposed to
>> be allocated all the time when it can fit, but when the available space is
>> smaller than the size of THP (2MB on x86), shmem fault handler still tries
>> to allocate huge page every time, then fallback to regular 4K page
>> allocation, i.e.:
>>
>> 	# mount -t tmpfs -o huge,size=3000k tmpfs /tmp
>> 	# dd if=/dev/zero of=/tmp/test bs=1k count=2048
>> 	# dd if=/dev/zero of=/tmp/test1 bs=1k count=2048
>>
>> The last dd command will handle 952 times page fault handler, then exit
>> with -ENOSPC.
>>
>> Rounding up tmpfs size to THP size in order to use THP with "always"
>> more efficiently. And, it will not wast too much memory (just allocate
>> 511 extra pages in worst case).
> 
> Hm. I don't think it's good idea to silently increase size of fs.

How about printing a warning to say the filesystem is resized?

> 
> Maybe better just refuse to mount with huge=always for too small fs?

That sounds fine too. When mounting tmpfs with "huge=always", if the size
is not aligned to the THP size, the kernel can simply refuse to mount and
print a warning about the alignment restriction.

Thanks,
Yang
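
To make the "refuse to mount" alternative concrete, here is a minimal
userspace sketch of such an alignment check. Nothing like it was posted
in this thread: check_huge_always_size() is a hypothetical helper, and
HPAGE_PMD_NR is assumed to be 512 (4K base pages, 2MB THP on x86-64).

```c
#include <errno.h>
#include <stdbool.h>

#define HPAGE_PMD_NR	512UL	/* assumed: 4K base pages, 2MB THP */

/*
 * Hypothetical helper: reject a huge=always mount whose size limit
 * (in base pages) is not a multiple of the huge page size.
 */
static inline int check_huge_always_size(unsigned long max_blocks,
					 bool huge_always)
{
	/* max_blocks == 0 means no size limit, so there is nothing to align */
	if (!huge_always || max_blocks == 0)
		return 0;

	if (max_blocks % HPAGE_PMD_NR != 0)
		return -EINVAL;	/* caller would also print the warning */

	return 0;
}
```

Under these assumptions, size=3000k (750 blocks) would be rejected with
-EINVAL, while size=4096k (1024 blocks) would be accepted.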

> 

* Re: [RFC PATCH] mm: shm: round up tmpfs size to huge page size when huge=always
  2017-10-08 19:51     ` Yang Shi
@ 2017-10-09  4:03       ` Kirill A. Shutemov
  -1 siblings, 0 replies; 16+ messages in thread
From: Kirill A. Shutemov @ 2017-10-09  4:03 UTC (permalink / raw)
  To: Yang Shi; +Cc: kirill.shutemov, hughd, mhocko, akpm, linux-mm, linux-kernel

On Mon, Oct 09, 2017 at 03:51:06AM +0800, Yang Shi wrote:
> 
> 
> On 10/8/17 5:56 AM, Kirill A. Shutemov wrote:
> > On Sat, Oct 07, 2017 at 04:22:10AM +0800, Yang Shi wrote:
> > > When passing "huge=always" option for mounting tmpfs, THP is supposed to
> > > be allocated all the time when it can fit, but when the available space is
> > > smaller than the size of THP (2MB on x86), shmem fault handler still tries
> > > to allocate huge page every time, then fallback to regular 4K page
> > > allocation, i.e.:
> > > 
> > > 	# mount -t tmpfs -o huge,size=3000k tmpfs /tmp
> > > 	# dd if=/dev/zero of=/tmp/test bs=1k count=2048
> > > 	# dd if=/dev/zero of=/tmp/test1 bs=1k count=2048
> > > 
> > > The last dd command will handle 952 times page fault handler, then exit
> > > with -ENOSPC.
> > > 
> > > Rounding up tmpfs size to THP size in order to use THP with "always"
> > > more efficiently. And, it will not wast too much memory (just allocate
> > > 511 extra pages in worst case).
> > 
> > Hm. I don't think it's good idea to silently increase size of fs.
> 
> How about printing a warning to say the filesystem is resized?
> 
> > 
> > Maybe better just refuse to mount with huge=always for too small fs?
> 
> It sounds fine too. When mounting tmpfs with "huge=always", if the size is
> not THP size aligned, it just can refuse to mount, then show warning about
> alignment restriction.

Honestly, I wouldn't bother.

Using a filesystem at near-full capacity is not best practice. A
performance penalty for doing so is fair enough.

And forcing alignment doesn't really fix anything: the user is still
allowed to fill the filesystem with files smaller than 2M.

-- 
 Kirill A. Shutemov

* Re: [RFC PATCH] mm: shm: round up tmpfs size to huge page size when huge=always
  2017-10-08 12:56   ` Kirill A. Shutemov
@ 2017-10-09  6:48     ` Michal Hocko
  -1 siblings, 0 replies; 16+ messages in thread
From: Michal Hocko @ 2017-10-09  6:48 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Yang Shi, kirill.shutemov, hughd, akpm, linux-mm, linux-kernel

On Sun 08-10-17 15:56:51, Kirill A. Shutemov wrote:
> On Sat, Oct 07, 2017 at 04:22:10AM +0800, Yang Shi wrote:
> > When passing "huge=always" option for mounting tmpfs, THP is supposed to
> > be allocated all the time when it can fit, but when the available space is
> > smaller than the size of THP (2MB on x86), shmem fault handler still tries
> > to allocate huge page every time, then fallback to regular 4K page
> > allocation, i.e.:
> > 
> > 	# mount -t tmpfs -o huge,size=3000k tmpfs /tmp
> > 	# dd if=/dev/zero of=/tmp/test bs=1k count=2048
> > 	# dd if=/dev/zero of=/tmp/test1 bs=1k count=2048
> > 
> > The last dd command will handle 952 times page fault handler, then exit
> > with -ENOSPC.
> > 
> > Rounding up tmpfs size to THP size in order to use THP with "always"
> > more efficiently. And, it will not wast too much memory (just allocate
> > 511 extra pages in worst case).
> 
> Hm. I don't think it's good idea to silently increase size of fs.

Agreed!

> Maybe better just refuse to mount with huge=always for too small fs?

Why cannot we simply have the remaining pages !THP? What is the actual
problem?
-- 
Michal Hocko
SUSE Labs

* Re: [RFC PATCH] mm: shm: round up tmpfs size to huge page size when huge=always
  2017-10-09  6:48     ` Michal Hocko
@ 2017-10-09 16:43       ` Yang Shi
  -1 siblings, 0 replies; 16+ messages in thread
From: Yang Shi @ 2017-10-09 16:43 UTC (permalink / raw)
  To: Michal Hocko, Kirill A. Shutemov
  Cc: kirill.shutemov, hughd, akpm, linux-mm, linux-kernel



On 10/8/17 11:48 PM, Michal Hocko wrote:
> On Sun 08-10-17 15:56:51, Kirill A. Shutemov wrote:
>> On Sat, Oct 07, 2017 at 04:22:10AM +0800, Yang Shi wrote:
>>> When passing "huge=always" option for mounting tmpfs, THP is supposed to
>>> be allocated all the time when it can fit, but when the available space is
>>> smaller than the size of THP (2MB on x86), shmem fault handler still tries
>>> to allocate huge page every time, then fallback to regular 4K page
>>> allocation, i.e.:
>>>
>>> 	# mount -t tmpfs -o huge,size=3000k tmpfs /tmp
>>> 	# dd if=/dev/zero of=/tmp/test bs=1k count=2048
>>> 	# dd if=/dev/zero of=/tmp/test1 bs=1k count=2048
>>>
>>> The last dd command will handle 952 times page fault handler, then exit
>>> with -ENOSPC.
>>>
>>> Rounding up tmpfs size to THP size in order to use THP with "always"
>>> more efficiently. And, it will not wast too much memory (just allocate
>>> 511 extra pages in worst case).
>>
>> Hm. I don't think it's good idea to silently increase size of fs.
> 
> Agreed!
> 
>> Maybe better just refuse to mount with huge=always for too small fs?
> 
> We cannot we simply have the remaining page !THP? What is the actual
> problem?

The remaining pages can be !THP; allocation falls back to regular 4K
pages when the available space is less than the THP size.

I just think it doesn't make sense to *not* mount tmpfs with a THP-aligned
size when "huge=always" is passed.

I guess some users would assume that every allocation in a tmpfs mounted
with "huge=always" is a THP. They might not be fully aware that THP is not
used in some corner cases, for example when the remaining space is less
than the THP size, so they may see an unexpected performance degradation.

So why not do the mount correctly in the first place? It could be left to
the administrator, but it would be better to give some hint from the
kernel side.

Thanks,
Yang


> 

* Re: [RFC PATCH] mm: shm: round up tmpfs size to huge page size when huge=always
  2017-10-09 16:43       ` Yang Shi
@ 2017-10-09 17:26         ` Michal Hocko
  -1 siblings, 0 replies; 16+ messages in thread
From: Michal Hocko @ 2017-10-09 17:26 UTC (permalink / raw)
  To: Yang Shi
  Cc: Kirill A. Shutemov, kirill.shutemov, hughd, akpm, linux-mm, linux-kernel

On Tue 10-10-17 00:43:31, Yang Shi wrote:
> 
> 
> On 10/8/17 11:48 PM, Michal Hocko wrote:
> > On Sun 08-10-17 15:56:51, Kirill A. Shutemov wrote:
> > > On Sat, Oct 07, 2017 at 04:22:10AM +0800, Yang Shi wrote:
> > > > When passing "huge=always" option for mounting tmpfs, THP is supposed to
> > > > be allocated all the time when it can fit, but when the available space is
> > > > smaller than the size of THP (2MB on x86), shmem fault handler still tries
> > > > to allocate huge page every time, then fallback to regular 4K page
> > > > allocation, i.e.:
> > > > 
> > > > 	# mount -t tmpfs -o huge,size=3000k tmpfs /tmp
> > > > 	# dd if=/dev/zero of=/tmp/test bs=1k count=2048
> > > > 	# dd if=/dev/zero of=/tmp/test1 bs=1k count=2048
> > > > 
> > > > The last dd command will handle 952 times page fault handler, then exit
> > > > with -ENOSPC.
> > > > 
> > > > Rounding up tmpfs size to THP size in order to use THP with "always"
> > > > more efficiently. And, it will not wast too much memory (just allocate
> > > > 511 extra pages in worst case).
> > > 
> > > Hm. I don't think it's good idea to silently increase size of fs.
> > 
> > Agreed!
> > 
> > > Maybe better just refuse to mount with huge=always for too small fs?
> > 
> > We cannot we simply have the remaining page !THP? What is the actual

ups s@We@Why@

> > problem?
> 
> The remaining pages can be !THP, it will fall back to regular 4k pages when
> the available space is less than THP size.
> 
> I just wonder it sounds not make sense to *not* mount tmpfs with THP size
> alignment when "huge=always" is passed.

Yes, failing the mount seems an overly excessive reaction to me.

> I guess someone would like to assume all allocation in tmpfs with
> "huge=always" should be THP.

Nobody can assume that, because THP pages can be broken up at any point
in time. We have hugetlb to provide a guarantee.

> But, they might not be fully aware of in some
> corner cases THP might be not used, for example, the remaining space is less
> then THP size, then some unexpected performance degrade might be perceived.
> 
> So, why not we do the mount correctly at the first place. It could be
> delegated to the administrator, but it should be better to give some hint
> from kernel side.

Because we are not trying to be more clever than the user. To be honest,
I still do not see what actual problem you are trying to fix.
-- 
Michal Hocko
SUSE Labs

* Re: [RFC PATCH] mm: shm: round up tmpfs size to huge page size when huge=always
  2017-10-09 17:26         ` Michal Hocko
@ 2017-10-09 17:54           ` Yang Shi
  -1 siblings, 0 replies; 16+ messages in thread
From: Yang Shi @ 2017-10-09 17:54 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Kirill A. Shutemov, kirill.shutemov, hughd, akpm, linux-mm, linux-kernel



On 10/9/17 10:26 AM, Michal Hocko wrote:
> On Tue 10-10-17 00:43:31, Yang Shi wrote:
>>
>>
>> On 10/8/17 11:48 PM, Michal Hocko wrote:
>>> On Sun 08-10-17 15:56:51, Kirill A. Shutemov wrote:
>>>> On Sat, Oct 07, 2017 at 04:22:10AM +0800, Yang Shi wrote:
>>>>> When passing "huge=always" option for mounting tmpfs, THP is supposed to
>>>>> be allocated all the time when it can fit, but when the available space is
>>>>> smaller than the size of THP (2MB on x86), shmem fault handler still tries
>>>>> to allocate huge page every time, then fallback to regular 4K page
>>>>> allocation, i.e.:
>>>>>
>>>>> 	# mount -t tmpfs -o huge,size=3000k tmpfs /tmp
>>>>> 	# dd if=/dev/zero of=/tmp/test bs=1k count=2048
>>>>> 	# dd if=/dev/zero of=/tmp/test1 bs=1k count=2048
>>>>>
>>>>> The last dd command will handle 952 times page fault handler, then exit
>>>>> with -ENOSPC.
>>>>>
>>>>> Rounding up tmpfs size to THP size in order to use THP with "always"
>>>>> more efficiently. And, it will not wast too much memory (just allocate
>>>>> 511 extra pages in worst case).
>>>>
>>>> Hm. I don't think it's good idea to silently increase size of fs.
>>>
>>> Agreed!
>>>
>>>> Maybe better just refuse to mount with huge=always for too small fs?
>>>
>>> We cannot we simply have the remaining page !THP? What is the actual
> 
> ups s@We@Why@
> 
>>> problem?
>>
>> The remaining pages can be !THP, it will fall back to regular 4k pages when
>> the available space is less than THP size.
>>
>> I just wonder it sounds not make sense to *not* mount tmpfs with THP size
>> alignment when "huge=always" is passed.
> 
> yes failure seems overly excessive reaction to me.
> 
>> I guess someone would like to assume all allocation in tmpfs with
>> "huge=always" should be THP.
> 
> Nobody can assume that because THP pages can be broken up at any point
> in time. We have hugetlb to provide a guarantee
> 
>> But, they might not be fully aware of in some
>> corner cases THP might be not used, for example, the remaining space is less
>> then THP size, then some unexpected performance degrade might be perceived.
>>
>> So, why not we do the mount correctly at the first place. It could be
>> delegated to the administrator, but it should be better to give some hint
>> from kernel side.
> 
> Because we are not trying to be more clever than the user. I still do
> not see what is the actual problem you are trying to fix to be honest.

I'm just trying to give users a warning or hint that it is better to
mount tmpfs with a THP-aligned size when "huge=always" is passed, to avoid
unexpected performance degradation.

Resizing or failing the mount might be overkill; documenting it might be
good enough.

Thanks,
Yang

> 

^ permalink raw reply	[flat|nested] 16+ messages in thread


end of thread, other threads:[~2017-10-09 17:54 UTC | newest]

Thread overview: 16+ messages
2017-10-06 20:22 [RFC PATCH] mm: shm: round up tmpfs size to huge page size when huge=always Yang Shi
2017-10-06 20:22 ` Yang Shi
2017-10-08 12:56 ` Kirill A. Shutemov
2017-10-08 12:56   ` Kirill A. Shutemov
2017-10-08 19:51   ` Yang Shi
2017-10-08 19:51     ` Yang Shi
2017-10-09  4:03     ` Kirill A. Shutemov
2017-10-09  4:03       ` Kirill A. Shutemov
2017-10-09  6:48   ` Michal Hocko
2017-10-09  6:48     ` Michal Hocko
2017-10-09 16:43     ` Yang Shi
2017-10-09 16:43       ` Yang Shi
2017-10-09 17:26       ` Michal Hocko
2017-10-09 17:26         ` Michal Hocko
2017-10-09 17:54         ` Yang Shi
2017-10-09 17:54           ` Yang Shi
