* [patch] mm, thp: fix defrag setting if newline is not used
@ 2020-01-15  1:58 ` David Rientjes
  0 siblings, 0 replies; 14+ messages in thread
From: David Rientjes @ 2020-01-15  1:58 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Mel Gorman, Vlastimil Babka, linux-kernel, linux-mm

If thp defrag setting "defer" is used and a newline is *not* used when
writing to the sysfs file, this is interpreted as the "defer+madvise"
option.

This is because we do prefix matching and if five characters are written
without a newline, the current code ends up comparing to the first five
bytes of the "defer+madvise" option and using that instead.

Find the length of what the user is writing and use that to guide our
decision on which string comparison to do.

Fixes: 21440d7eb904 ("mm, thp: add new defer+madvise defrag option")
Signed-off-by: David Rientjes <rientjes@google.com>
---
 This can be done in *many* different ways including extracting logic to
 a helper function.  If someone would like this to be implemented
 differently, please suggest it.

 mm/huge_memory.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -250,32 +250,33 @@ static ssize_t defrag_store(struct kobject *kobj,
 			    struct kobj_attribute *attr,
 			    const char *buf, size_t count)
 {
-	if (!memcmp("always", buf,
-		    min(sizeof("always")-1, count))) {
+	size_t len = count;
+
+	/* For prefix matching, find the length of interest */
+	if (buf[len-1] == '\n')
+		len--;
+
+	if (len == sizeof("always")-1 && !memcmp("always", buf, len)) {
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG, &transparent_hugepage_flags);
 		set_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
-	} else if (!memcmp("defer+madvise", buf,
-		    min(sizeof("defer+madvise")-1, count))) {
+	} else if (len == sizeof("defer+madvise")-1 && !memcmp("defer+madvise", buf, len)) {
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG, &transparent_hugepage_flags);
 		set_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags);
-	} else if (!memcmp("defer", buf,
-		    min(sizeof("defer")-1, count))) {
+	} else if (len == sizeof("defer")-1 && !memcmp("defer", buf, len)) {
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG, &transparent_hugepage_flags);
 		set_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags);
-	} else if (!memcmp("madvise", buf,
-			   min(sizeof("madvise")-1, count))) {
+	} else if (len == sizeof("madvise")-1 && !memcmp("madvise", buf, len)) {
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags);
 		set_bit(TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG, &transparent_hugepage_flags);
-	} else if (!memcmp("never", buf,
-			   min(sizeof("never")-1, count))) {
+	} else if (len == sizeof("never")-1 && !memcmp("never", buf, len)) {
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags);

* Re: [patch] mm, thp: fix defrag setting if newline is not used
  2020-01-15  1:58 ` David Rientjes
  (?)
@ 2020-01-15 12:45 ` Vlastimil Babka
  -1 siblings, 0 replies; 14+ messages in thread
From: Vlastimil Babka @ 2020-01-15 12:45 UTC (permalink / raw)
  To: David Rientjes, Andrew Morton; +Cc: Mel Gorman, linux-kernel, linux-mm

On 1/15/20 2:58 AM, David Rientjes wrote:
> If thp defrag setting "defer" is used and a newline is *not* used when
> writing to the sysfs file, this is interpreted as the "defer+madvise"
> option.
> 
> This is because we do prefix matching and if five characters are written
> without a newline, the current code ends up comparing to the first five
> bytes of the "defer+madvise" option and using that instead.
> 
> Find the length of what the user is writing and use that to guide our
> decision on which string comparison to do.
> 
> Fixes: 21440d7eb904 ("mm, thp: add new defer+madvise defrag option")
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
>  This can be done in *many* different ways including extracting logic to
>  a helper function.  If someone would like this to be implemented
>  differently, please suggest it.

I've come up with this:

diff --git mm/huge_memory.c mm/huge_memory.c
index 41a0fbddc96b..f36b93334874 100644
--- mm/huge_memory.c
+++ mm/huge_memory.c
@@ -256,7 +256,7 @@ static ssize_t defrag_store(struct kobject *kobj,
                clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags);
                clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG, &transparent_hugepage_flags);
                set_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
-       } else if (!memcmp("defer+madvise", buf,
+       } else if (count > sizeof("defer")-1 && !memcmp("defer+madvise", buf,
                    min(sizeof("defer+madvise")-1, count))) {
                clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
                clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags);

It's smaller, but more hacky. But it doesn't add new restrictions.
E.g. this still works:

# echo -n 'alw' > /sys/kernel/mm/transparent_hugepage/defrag
# cat /sys/kernel/mm/transparent_hugepage/defrag
[always] defer defer+madvise madvise never

But whether anyone does that, I don't know (it doesn't work without -n).
Also this still works:

# echo -n  'defer   ' > /sys/kernel/mm/transparent_hugepage/defrag
# cat /sys/kernel/mm/transparent_hugepage/defrag
always [defer] defer+madvise madvise never

Ideally we would have had strict matching as you propose (no matching of prefixes)
since the beginning and use e.g. strstrip() to remove all whitespace from buffer
first. But it's 'const char *' and I'm not sure if it's null-terminated.
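
The effect of the extra count check can be tried in userspace.  The sketch
below mirrors the memcmp()/min() chain with that guard added.  min_sz,
prefix_eq and defrag_match_guarded are hypothetical names used only for
illustration, and the flag updates are replaced by returning the name of the
option that would be selected.

```c
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel's min() macro. */
size_t min_sz(size_t a, size_t b)
{
	return a < b ? a : b;
}

/* True if the first min(strlen(opt), count) bytes of opt and buf are equal. */
int prefix_eq(const char *opt, const char *buf, size_t count)
{
	return memcmp(opt, buf, min_sz(strlen(opt), count)) == 0;
}

/*
 * Hypothetical helper: name of the option the guarded chain would select,
 * or NULL if nothing matches.
 */
const char *defrag_match_guarded(const char *buf, size_t count)
{
	if (prefix_eq("always", buf, count))
		return "always";
	/* Guard: only consider "defer+madvise" if more than 5 bytes were written. */
	if (count > sizeof("defer") - 1 && prefix_eq("defer+madvise", buf, count))
		return "defer+madvise";
	if (prefix_eq("defer", buf, count))
		return "defer";
	if (prefix_eq("madvise", buf, count))
		return "madvise";
	if (prefix_eq("never", buf, count))
		return "never";
	return NULL;
}
```

With this sketch, defrag_match_guarded("alw", 3) selects "always", and both
defrag_match_guarded("defer", 5) and defrag_match_guarded("defer   ", 8)
select "defer", matching the shell examples above.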

* Re: [patch] mm, thp: fix defrag setting if newline is not used
  2020-01-15  1:58 ` David Rientjes
  (?)
  (?)
@ 2020-01-17  3:16 ` Andrew Morton
  2020-01-17  8:24   ` Vlastimil Babka
  -1 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2020-01-17  3:16 UTC (permalink / raw)
  To: David Rientjes; +Cc: Mel Gorman, Vlastimil Babka, linux-kernel, linux-mm

On Tue, 14 Jan 2020 17:58:36 -0800 (PST) David Rientjes <rientjes@google.com> wrote:

> If thp defrag setting "defer" is used and a newline is *not* used when
> writing to the sysfs file, this is interpreted as the "defer+madvise"
> option.
> 
> This is because we do prefix matching and if five characters are written
> without a newline, the current code ends up comparing to the first five
> bytes of the "defer+madvise" option and using that instead.
> 
> Find the length of what the user is writing and use that to guide our
> decision on which string comparison to do.

Gee, why is this code so complicated?  Can't we just do

	if (sysfs_streq(buf, "always")) {
		...
	} else if (sysfs_streq(buf, "defer+madvise")) {
		...
	}
	...
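
For readers unfamiliar with sysfs_streq(): it compares two NUL-terminated
strings and treats a single trailing newline as equivalent to the end of the
string.  Below is a minimal userspace sketch of that behaviour, assuming both
inputs are NUL-terminated.  streq_nl is a hypothetical name used for
illustration here, not the kernel symbol.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the comparison: equal strings match,
 * and so do strings that differ only by one trailing '\n'.
 */
bool streq_nl(const char *s1, const char *s2)
{
	/* Skip the common prefix. */
	while (*s1 && *s1 == *s2) {
		s1++;
		s2++;
	}

	/* Both strings ended at the same point. */
	if (*s1 == *s2)
		return true;
	/* s1 ended and s2 has exactly one trailing newline left. */
	if (!*s1 && *s2 == '\n' && !s2[1])
		return true;
	/* s2 ended and s1 has exactly one trailing newline left. */
	if (*s1 == '\n' && !s1[1] && !*s2)
		return true;
	return false;
}
```

Under this comparison "defer" and "defer\n" both match "defer", while a
partial word such as "alw" does not match "always" and "defer   " (trailing
spaces) does not match "defer".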




* Re: [patch] mm, thp: fix defrag setting if newline is not used
  2020-01-17  3:16 ` Andrew Morton
@ 2020-01-17  8:24   ` Vlastimil Babka
  2020-01-17  9:43       ` David Rientjes
  0 siblings, 1 reply; 14+ messages in thread
From: Vlastimil Babka @ 2020-01-17  8:24 UTC (permalink / raw)
  To: Andrew Morton, David Rientjes; +Cc: Mel Gorman, linux-kernel, linux-mm

On 1/17/20 4:16 AM, Andrew Morton wrote:
> On Tue, 14 Jan 2020 17:58:36 -0800 (PST) David Rientjes <rientjes@google.com> wrote:
> 
>> If thp defrag setting "defer" is used and a newline is *not* used when
>> writing to the sysfs file, this is interpreted as the "defer+madvise"
>> option.
>>
>> This is because we do prefix matching and if five characters are written
>> without a newline, the current code ends up comparing to the first five
>> bytes of the "defer+madvise" option and using that instead.
>>
>> Find the length of what the user is writing and use that to guide our
>> decision on which string comparison to do.
> 
> Gee, why is this code so complicated?  Can't we just do
> 
> 	if (sysfs_streq(buf, "always")) {
> 		...
> 	} else if (sysfs_streq(buf, "defer+madvise")) {
> 		...
> 	}
> 	...

Yeah, if we knew this existed :)

We would lose the prefix matching but hopefully nobody will complain.


* Re: [patch] mm, thp: fix defrag setting if newline is not used
  2020-01-17  8:24   ` Vlastimil Babka
@ 2020-01-17  9:43       ` David Rientjes
  0 siblings, 0 replies; 14+ messages in thread
From: David Rientjes @ 2020-01-17  9:43 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: Andrew Morton, Mel Gorman, linux-kernel, linux-mm

On Fri, 17 Jan 2020, Vlastimil Babka wrote:

> >> If thp defrag setting "defer" is used and a newline is *not* used when
> >> writing to the sysfs file, this is interpreted as the "defer+madvise"
> >> option.
> >>
> >> This is because we do prefix matching and if five characters are written
> >> without a newline, the current code ends up comparing to the first five
> >> bytes of the "defer+madvise" option and using that instead.
> >>
> >> Find the length of what the user is writing and use that to guide our
> >> decision on which string comparison to do.
> > 
> > Gee, why is this code so complicated?  Can't we just do
> > 
> > 	if (sysfs_streq(buf, "always")) {
> > 		...
> > 	} else if (sysfs_streq(buf, "defer+madvise")) {
> > 		...
> > 	}
> > 	...
> 
> Yeah, if we knew this existed :)
> 
> We would lose the prefix matching but hopefully nobody will complain.
> 

I tested Vlastimil's patch and it works as intended so I was about to 
modify the changelog and send his patch and ask for a sign-off line 
because I think I agree the *partial* prefix matching has ~0.1% chance of 
breaking userspace and that 0.1% chance outweighs my desire to make the 
code consistent for all options.

But if userspace were broken by this, then at least it was already broken 
for "defer" depending on newline vs no newline.  (What we do know is that 
nobody has used "defer" for the past couple years without a newline :).

If nobody objects, I'll test and send Andrew's version with the changelog 
because I think we all agree the risk of breakage here is very minimal and 
actually fixes the case for defer.  

* Re: [patch] mm, thp: fix defrag setting if newline is not used
  2020-01-17  9:43       ` David Rientjes
  (?)
@ 2020-01-17 10:12       ` Vlastimil Babka
  2020-01-17 22:11           ` David Rientjes
  -1 siblings, 1 reply; 14+ messages in thread
From: Vlastimil Babka @ 2020-01-17 10:12 UTC (permalink / raw)
  To: David Rientjes; +Cc: Andrew Morton, Mel Gorman, linux-kernel, linux-mm

On 1/17/20 10:43 AM, David Rientjes wrote:
> On Fri, 17 Jan 2020, Vlastimil Babka wrote:
> 
>>>> If thp defrag setting "defer" is used and a newline is *not* used when
>>>> writing to the sysfs file, this is interpreted as the "defer+madvise"
>>>> option.
>>>>
>>>> This is because we do prefix matching and if five characters are written
>>>> without a newline, the current code ends up comparing to the first five
>>>> bytes of the "defer+madvise" option and using that instead.
>>>>
>>>> Find the length of what the user is writing and use that to guide our
>>>> decision on which string comparison to do.
>>>
>>> Gee, why is this code so complicated?  Can't we just do
>>>
>>> 	if (sysfs_streq(buf, "always")) {
>>> 		...
>>> 	} else if (sysfs_streq(buf, "defer+madvise")) {
>>> 		...
>>> 	}
>>> 	...
>>
>> Yeah, if we knew this existed :)
>>
>> We would lose the prefix matching but hopefully nobody will complain.
>>
> 
> I tested Vlastimil's patch and it works as intended so I was about to 
> modify the changelog and send his patch and ask for a sign-off line 
> because I think I agree the *partial* prefix matching has ~0.1% chance of 
> breaking userspace and that 0.1% chance outweighs my desire to make the 
> code consistent for all options.

If prefix matching worked with "echo alw > /sys..." then I would expect
some script out there relies on it, but since it only works with "echo
-n alw > /..." then perhaps there's no such script :)

> But if userspace were broken by this, then at least it was already broken 
> for "defer" depending on newline vs no newline.  (What we do know is that 
> nobody has used "defer" for the past couple years without a newline :).
> 
> If nobody objects, I'll test and send Andrew's version with the changelog 
> because I think we all agree the risk of breakage here is very minimal and 
> actually fixes the case for defer.  

Agreed.

* [patch v2] mm, thp: fix defrag setting if newline is not used
  2020-01-17 10:12       ` Vlastimil Babka
@ 2020-01-17 22:11           ` David Rientjes
  0 siblings, 0 replies; 14+ messages in thread
From: David Rientjes @ 2020-01-17 22:11 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Vlastimil Babka, Mel Gorman, linux-kernel, linux-mm

If thp defrag setting "defer" is used and a newline is *not* used when
writing to the sysfs file, this is interpreted as the "defer+madvise"
option.

This is because we do prefix matching and if five characters are written
without a newline, the current code ends up comparing to the first five
bytes of the "defer+madvise" option and using that instead.

Use the more appropriate sysfs_streq() that handles the trailing newline
for us.  Since this doubles as a nice cleanup, do it in enabled_store()
as well.

Fixes: 21440d7eb904 ("mm, thp: add new defer+madvise defrag option")
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 Latest 5.5-rc6 doesn't boot for me, something to be debugged separately,
 so this was tested on 5.4.  No changes in this area, however, between
 the two kernels.

 mm/huge_memory.c | 24 ++++++++----------------
 1 file changed, 8 insertions(+), 16 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 13cc93785006..1c61dea937bc 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -177,16 +177,13 @@ static ssize_t enabled_store(struct kobject *kobj,
 {
 	ssize_t ret = count;
 
-	if (!memcmp("always", buf,
-		    min(sizeof("always")-1, count))) {
+	if (sysfs_streq(buf, "always")) {
 		clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
 		set_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
-	} else if (!memcmp("madvise", buf,
-			   min(sizeof("madvise")-1, count))) {
+	} else if (sysfs_streq(buf, "madvise")) {
 		clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
 		set_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
-	} else if (!memcmp("never", buf,
-			   min(sizeof("never")-1, count))) {
+	} else if (sysfs_streq(buf, "never")) {
 		clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
 	} else
@@ -250,32 +247,27 @@ static ssize_t defrag_store(struct kobject *kobj,
 			    struct kobj_attribute *attr,
 			    const char *buf, size_t count)
 {
-	if (!memcmp("always", buf,
-		    min(sizeof("always")-1, count))) {
+	if (sysfs_streq(buf, "always")) {
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG, &transparent_hugepage_flags);
 		set_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
-	} else if (!memcmp("defer+madvise", buf,
-		    min(sizeof("defer+madvise")-1, count))) {
+	} else if (sysfs_streq(buf, "defer+madvise")) {
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG, &transparent_hugepage_flags);
 		set_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags);
-	} else if (!memcmp("defer", buf,
-		    min(sizeof("defer")-1, count))) {
+	} else if (sysfs_streq(buf, "defer")) {
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG, &transparent_hugepage_flags);
 		set_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags);
-	} else if (!memcmp("madvise", buf,
-			   min(sizeof("madvise")-1, count))) {
+	} else if (sysfs_streq(buf, "madvise")) {
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags);
 		set_bit(TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG, &transparent_hugepage_flags);
-	} else if (!memcmp("never", buf,
-			   min(sizeof("never")-1, count))) {
+	} else if (sysfs_streq(buf, "never")) {
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags);
 		clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags);

* Re: [patch v2] mm, thp: fix defrag setting if newline is not used
  2020-01-17 22:11           ` David Rientjes
  (?)
@ 2020-01-18 10:54           ` Vlastimil Babka
  -1 siblings, 0 replies; 14+ messages in thread
From: Vlastimil Babka @ 2020-01-18 10:54 UTC (permalink / raw)
  To: David Rientjes, Andrew Morton; +Cc: Mel Gorman, linux-kernel, linux-mm

On 1/17/20 11:11 PM, David Rientjes wrote:
> If thp defrag setting "defer" is used and a newline is *not* used when
> writing to the sysfs file, this is interpreted as the "defer+madvise"
> option.
> 
> This is because we do prefix matching and if five characters are written
> without a newline, the current code ends up comparing to the first five
> bytes of the "defer+madvise" option and using that instead.
> 
> Use the more appropriate sysfs_streq() that handles the trailing newline
> for us.  Since this doubles as a nice cleanup, do it in enabled_store()
> as well.
> 
> Fixes: 21440d7eb904 ("mm, thp: add new defer+madvise defrag option")
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: David Rientjes <rientjes@google.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>
Thanks.

* Re: [patch v2] mm, thp: fix defrag setting if newline is not used
  2020-01-17 22:11           ` David Rientjes
  (?)
  (?)
@ 2020-01-19  1:04           ` Andrew Morton
  2020-01-19 21:57               ` David Rientjes
  -1 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2020-01-19  1:04 UTC (permalink / raw)
  To: David Rientjes; +Cc: Vlastimil Babka, Mel Gorman, linux-kernel, linux-mm

On Fri, 17 Jan 2020 14:11:48 -0800 (PST) David Rientjes <rientjes@google.com> wrote:

> If thp defrag setting "defer" is used and a newline is *not* used when
> writing to the sysfs file, this is interpreted as the "defer+madvise"
> option.
> 
> This is because we do prefix matching and if five characters are written
> without a newline, the current code ends up comparing to the first five
> bytes of the "defer+madvise" option and using that instead.
> 
> Use the more appropriate sysfs_streq() that handles the trailing newline
> for us.  Since this doubles as a nice cleanup, do it in enabled_store()
> as well.

I can't say I really understand this prefix-matching thing that
we're taking away.  Documentation/admin-guide/mm/transhuge.rst doesn't
appear to mention it.  Could we please add a paragraph to the changelog
to spell all this out.  Bonus points for formally describing the
behaviour which we're removing!

Thanks.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch v2] mm, thp: fix defrag setting if newline is not used
  2020-01-19  1:04           ` Andrew Morton
@ 2020-01-19 21:57               ` David Rientjes
  0 siblings, 0 replies; 14+ messages in thread
From: David Rientjes @ 2020-01-19 21:57 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Vlastimil Babka, Mel Gorman, linux-kernel, linux-mm

On Sat, 18 Jan 2020, Andrew Morton wrote:

> > If thp defrag setting "defer" is used and a newline is *not* used when
> > writing to the sysfs file, this is interpreted as the "defer+madvise"
> > option.
> > 
> > This is because we do prefix matching and if five characters are written
> > without a newline, the current code ends up comparing to the first five
> > bytes of the "defer+madvise" option and using that instead.
> > 
> > Use the more appropriate sysfs_streq() that handles the trailing newline
> > for us.  Since this doubles as a nice cleanup, do it in enabled_store()
> > as well.
> 
> I can't really understand this prefix-matching thing that
> we're taking away.  Documentation/admin-guide/mm/transhuge.rst doesn't
> appear to mention it.  Could we please add a paragraph to the changelog
> to spell all this out.  Bonus points for formally describing the
> behaviour which we're removing!
> 

The current implementation relies on prefix matching: the number of bytes
compared is the smaller of the number of bytes written and the length of
the option being compared.  With a newline, "defer\n" does not match
"defer+madvise"; without a newline, however, "defer" is considered to
match "defer+madvise" (prefix matching is only comparing the first five
bytes).  End result is that writing "defer" is broken unless it has an
additional trailing character.

This means that writing "madv" in the past would match and set "madvise".  
With strict checking, that no longer is the case but it is unlikely 
anybody is currently doing this.
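
To make the difference concrete, here is a small userspace sketch, not the
kernel code itself.  All helper names (old_prefix_match(),
streq_ignoring_newline(), old_pick(), strict_pick()) are made up for
illustration.  streq_ignoring_newline() is only a stand-in that is assumed
to behave like sysfs_streq(), and the order of the options table is an
assumption taken from the description above ("defer+madvise" is tried
before "defer").

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed check order: "defer+madvise" is tried before "defer". */
static const char *const options[] = {
	"always", "defer+madvise", "defer", "madvise", "never",
};
#define NR_OPTIONS (sizeof(options) / sizeof(options[0]))

/* Old behaviour: compare only min(strlen(option), count) bytes. */
static bool old_prefix_match(const char *option, const char *buf, size_t count)
{
	size_t optlen = strlen(option);
	size_t n = optlen < count ? optlen : count;

	return memcmp(option, buf, n) == 0;
}

/*
 * Userspace stand-in for sysfs_streq(): the strings are equal when they
 * match exactly, or differ only by a single trailing newline.
 */
static bool streq_ignoring_newline(const char *s1, const char *s2)
{
	while (*s1 && *s1 == *s2) {
		s1++;
		s2++;
	}
	if (*s1 == *s2)
		return true;
	if (!*s1 && *s2 == '\n' && !s2[1])
		return true;
	if (*s1 == '\n' && !s1[1] && !*s2)
		return true;
	return false;
}

/* First option selected by the old prefix matching, or NULL. */
static const char *old_pick(const char *buf, size_t count)
{
	for (size_t i = 0; i < NR_OPTIONS; i++)
		if (old_prefix_match(options[i], buf, count))
			return options[i];
	return NULL;
}

/* First option selected by strict comparison, or NULL. */
static const char *strict_pick(const char *buf)
{
	for (size_t i = 0; i < NR_OPTIONS; i++)
		if (streq_ignoring_newline(buf, options[i]))
			return options[i];
	return NULL;
}
```

With these helpers, old_pick("defer", 5) selects "defer+madvise" while
old_pick("defer\n", 6) selects "defer".  strict_pick() selects "defer" for
both inputs and rejects the abbreviation "madv".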

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-01-19 21:57 UTC | newest]

Thread overview: 14+ messages
2020-01-15  1:58 [patch] mm, thp: fix defrag setting if newline is not used David Rientjes
2020-01-15  1:58 ` David Rientjes
2020-01-15 12:45 ` Vlastimil Babka
2020-01-17  3:16 ` Andrew Morton
2020-01-17  8:24   ` Vlastimil Babka
2020-01-17  9:43     ` David Rientjes
2020-01-17  9:43       ` David Rientjes
2020-01-17 10:12       ` Vlastimil Babka
2020-01-17 22:11         ` [patch v2] " David Rientjes
2020-01-17 22:11           ` David Rientjes
2020-01-18 10:54           ` Vlastimil Babka
2020-01-19  1:04           ` Andrew Morton
2020-01-19 21:57             ` David Rientjes
2020-01-19 21:57               ` David Rientjes
