linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4] string.h: Add str_has_prefix() helper function
@ 2018-12-22  4:19 Steven Rostedt
  2018-12-22  4:42 ` Steven Rostedt
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Steven Rostedt @ 2018-12-22  4:19 UTC (permalink / raw)
  To: LKML
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Greg Kroah-Hartman,
	Joe Perches, Namhyung Kim, Masami Hiramatsu, Tom Zanussi,
	Andreas Schwab


From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

A discussion came up in the trace triggers thread about converting a
bunch of:

 strncmp(str, "const", sizeof("const") - 1)

use cases into a helper macro. It started with:

	strncmp(str, const, sizeof(const) - 1)

But then Joe Perches mentioned that if a const is not used, the
sizeof() will be the size of a pointer, which can be bad. And that
gcc will optimize strlen("const") into "sizeof("const") - 1".

Thinking about this more, a quick grep in the kernel tree found several
(thousands!) of cases that use this construct. A quick grep also
revealed that there's probably several bugs in that use case. Some are
that people forgot the "- 1" (which I found) and others could be that
the constant for the sizeof is different than the constant (although, I
haven't found any of those, but I also didn't look hard).

I figured the best thing to do is to create a helper macro and place it
into include/linux/string.h. And go around and fix all the open coded
versions of it later.

Note, gcc appears to optimize this when we make it into an always_inline
static function, which removes a lot of issues that a macro produces.

Link: http://lkml.kernel.org/r/e3e754f2bd18e56eaa8baf79bee619316ebf4cfc.1545161087.git.tom.zanussi@linux.intel.com
Link: http://lkml.kernel.org/r/20181219211615.2298e781@gandalf.local.home
Link: http://lkml.kernel.org/r/CAHk-=wg_sR-UEC1ggmkZpypOUYanL5CMX4R7ceuaV4QMf5jBtg@mail.gmail.com

Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggestions-by: Joe Perches <joe@perches.com>
Suggestions-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---

Since I haven't heard anything since Linus said he preferred the len
to be the return value, I'm posting this last version before running
it through my tests.

Changes since v3:

  - Use size_t instead of int (Joe Perches)

 include/linux/string.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/include/linux/string.h b/include/linux/string.h
index 27d0482e5e05..7927b875f80c 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -456,4 +456,24 @@ static inline void memcpy_and_pad(void *dest, size_t dest_len,
 		memcpy(dest, src, dest_len);
 }
 
+/**
+ * str_has_prefix - Test if a string has a given prefix
+ * @str: The string to test
+ * @prefix: The string to see if @str starts with
+ *
+ * A common way to test a prefix of a string is to do:
+ *  strncmp(str, prefix, sizeof(prefix) - 1)
+ *
+ * But this can lead to bugs due to typos, or if prefix is a pointer
+ * and not a constant. Instead use str_has_prefix().
+ *
+ * Returns: 0 if @str does not start with @prefix
+         strlen(@prefix) if @str does start with @prefix
+ */
+static __always_inline size_t str_has_prefix(const char *str, const char *prefix)
+{
+	size_t len = strlen(prefix);
+	return strncmp(str, prefix, len) == 0 ? len : 0;
+}
+
 #endif /* _LINUX_STRING_H_ */
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] string.h: Add str_has_prefix() helper function
  2018-12-22  4:19 [PATCH v4] string.h: Add str_has_prefix() helper function Steven Rostedt
@ 2018-12-22  4:42 ` Steven Rostedt
  2018-12-22  9:33 ` Namhyung Kim
  2019-01-11  8:10 ` [utility perl script] strncmp() -> str_has_prefix() conversions Joe Perches
  2 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2018-12-22  4:42 UTC (permalink / raw)
  To: LKML
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Greg Kroah-Hartman,
	Joe Perches, Namhyung Kim, Masami Hiramatsu, Tom Zanussi,
	Andreas Schwab

On Fri, 21 Dec 2018 23:19:24 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:


> Cc: Tom Zanussi <zanussi@kernel.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>

This should have been:

Suggestions-by: Linus Torvalds ...

But I went back to a further reflog when doing a reset --hard to get
back to returning length from str_has_prefix, which didn't have this
update.

-- Steve
 

> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Suggestions-by: Joe Perches <joe@perches.com>
> Suggestions-by: Andreas Schwab <schwab@linux-m68k.org>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] string.h: Add str_has_prefix() helper function
  2018-12-22  4:19 [PATCH v4] string.h: Add str_has_prefix() helper function Steven Rostedt
  2018-12-22  4:42 ` Steven Rostedt
@ 2018-12-22  9:33 ` Namhyung Kim
  2018-12-22 12:24   ` Steven Rostedt
  2019-01-11  8:10 ` [utility perl script] strncmp() -> str_has_prefix() conversions Joe Perches
  2 siblings, 1 reply; 15+ messages in thread
From: Namhyung Kim @ 2018-12-22  9:33 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Greg Kroah-Hartman, Joe Perches, Masami Hiramatsu, Tom Zanussi,
	Andreas Schwab, kernel-team

On Fri, Dec 21, 2018 at 11:19:24PM -0500, Steven Rostedt wrote:
> 
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> A discussion came up in the trace triggers thread about converting a
> bunch of:
> 
>  strncmp(str, "const", sizeof("const") - 1)
> 
> use cases into a helper macro. It started with:
> 
> 	strncmp(str, const, sizeof(const) - 1)
> 
> But then Joe Perches mentioned that if a const is not used, the
> sizeof() will be the size of a pointer, which can be bad. And that
> gcc will optimize strlen("const") into "sizeof("const") - 1".
> 
> Thinking about this more, a quick grep in the kernel tree found several
> (thousands!) of cases that use this construct. A quick grep also
> revealed that there's probably several bugs in that use case. Some are
> that people forgot the "- 1" (which I found) and others could be that
> the constant for the sizeof is different than the constant (although, I
> haven't found any of those, but I also didn't look hard).
> 
> I figured the best thing to do is to create a helper macro and place it
> into include/linux/string.h. And go around and fix all the open coded
> versions of it later.
> 
> Note, gcc appears to optimize this when we make it into an always_inline
> static function, which removes a lot of issues that a macro produces.
> 
> Link: http://lkml.kernel.org/r/e3e754f2bd18e56eaa8baf79bee619316ebf4cfc.1545161087.git.tom.zanussi@linux.intel.com
> Link: http://lkml.kernel.org/r/20181219211615.2298e781@gandalf.local.home
> Link: http://lkml.kernel.org/r/CAHk-=wg_sR-UEC1ggmkZpypOUYanL5CMX4R7ceuaV4QMf5jBtg@mail.gmail.com
> 
> Cc: Tom Zanussi <zanussi@kernel.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Suggestions-by: Joe Perches <joe@perches.com>
> Suggestions-by: Andreas Schwab <schwab@linux-m68k.org>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
> 
> Since I haven't heard anything since Linus said he preferred the len
> to be the return value, I'm posting this last version before running
> it through my tests.
> 
> Changes since v3:
> 
>   - Use size_t instead of int (Joe Perches)
> 
>  include/linux/string.h | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/include/linux/string.h b/include/linux/string.h
> index 27d0482e5e05..7927b875f80c 100644
> --- a/include/linux/string.h
> +++ b/include/linux/string.h
> @@ -456,4 +456,24 @@ static inline void memcpy_and_pad(void *dest, size_t dest_len,
>  		memcpy(dest, src, dest_len);
>  }
>  
> +/**
> + * str_has_prefix - Test if a string has a given prefix
> + * @str: The string to test
> + * @prefix: The string to see if @str starts with
> + *
> + * A common way to test a prefix of a string is to do:
> + *  strncmp(str, prefix, sizeof(prefix) - 1)
> + *
> + * But this can lead to bugs due to typos, or if prefix is a pointer
> + * and not a constant. Instead use str_has_prefix().
> + *
> + * Returns: 0 if @str does not start with @prefix
> +         strlen(@prefix) if @str does start with @prefix
> + */
> +static __always_inline size_t str_has_prefix(const char *str, const char *prefix)
> +{
> +	size_t len = strlen(prefix);
> +	return strncmp(str, prefix, len) == 0 ? len : 0;

As it already knows the length (and it needs to use it for return
value), isn't it (slightly) better using memcmp() instead?

Thanks,
Namhyung


> +}
> +
>  #endif /* _LINUX_STRING_H_ */
> -- 
> 2.19.1
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] string.h: Add str_has_prefix() helper function
  2018-12-22  9:33 ` Namhyung Kim
@ 2018-12-22 12:24   ` Steven Rostedt
  2018-12-22 14:24     ` Namhyung Kim
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2018-12-22 12:24 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Greg Kroah-Hartman, Joe Perches, Masami Hiramatsu, Tom Zanussi,
	Andreas Schwab, kernel-team

On Sat, 22 Dec 2018 18:33:46 +0900
Namhyung Kim <namhyung@kernel.org> wrote:

> > +static __always_inline size_t str_has_prefix(const char *str, const char *prefix)
> > +{
> > +	size_t len = strlen(prefix);
> > +	return strncmp(str, prefix, len) == 0 ? len : 0;  
> 
> As it already knows the length (and it needs to use it for return
> value), isn't it (slightly) better using memcmp() instead?

No, because we don't know the length of str.


	[ str = "h\0[bad memory]" ]


	str_has_prefix(str, "TEST THIS BIG STRING AT FRONT")


If we use memcmp(), then we are testing way after str has ended, and
that can cause a memory fault.

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] string.h: Add str_has_prefix() helper function
  2018-12-22 12:24   ` Steven Rostedt
@ 2018-12-22 14:24     ` Namhyung Kim
  2018-12-22 15:12       ` Steven Rostedt
  0 siblings, 1 reply; 15+ messages in thread
From: Namhyung Kim @ 2018-12-22 14:24 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Greg Kroah-Hartman, Joe Perches, Masami Hiramatsu, Tom Zanussi,
	Andreas Schwab, kernel-team

On Sat, Dec 22, 2018 at 07:24:04AM -0500, Steven Rostedt wrote:
> On Sat, 22 Dec 2018 18:33:46 +0900
> Namhyung Kim <namhyung@kernel.org> wrote:
> 
> > > +static __always_inline size_t str_has_prefix(const char *str, const char *prefix)
> > > +{
> > > +	size_t len = strlen(prefix);
> > > +	return strncmp(str, prefix, len) == 0 ? len : 0;  
> > 
> > As it already knows the length (and it needs to use it for return
> > value), isn't it (slightly) better using memcmp() instead?
> 
> No, because we don't know the length of str.
> 
> 
> 	[ str = "h\0[bad memory]" ]
> 
> 
> 	str_has_prefix(str, "TEST THIS BIG STRING AT FRONT")
> 
> 
> If we use memcmp(), then we are testing way after str has ended, and
> that can cause a memory fault.

I don't know what's the bad memory causing memory fault but anyway
memcpy() should stop at the NUL character first as it's different, no?

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] string.h: Add str_has_prefix() helper function
  2018-12-22 14:24     ` Namhyung Kim
@ 2018-12-22 15:12       ` Steven Rostedt
  2018-12-22 16:16         ` Steven Rostedt
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2018-12-22 15:12 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Greg Kroah-Hartman, Joe Perches, Masami Hiramatsu, Tom Zanussi,
	Andreas Schwab, kernel-team

On Sat, 22 Dec 2018 23:24:11 +0900
Namhyung Kim <namhyung@kernel.org> wrote:

> > No, because we don't know the length of str.
> > 
> > 
> > 	[ str = "h\0[bad memory]" ]
> > 
> > 
> > 	str_has_prefix(str, "TEST THIS BIG STRING AT FRONT")
> > 
> > 
> > If we use memcmp(), then we are testing way after str has ended, and
> > that can cause a memory fault.  
> 
> I don't know what's the bad memory causing memory fault but anyway
> memcpy() should stop at the NUL character first as it's different, no?

No, that's the difference between memcpy() and strncpy(), memcpy()
doesn't care about nul characters. It's copying memory not strings.

-- Steve


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] string.h: Add str_has_prefix() helper function
  2018-12-22 15:12       ` Steven Rostedt
@ 2018-12-22 16:16         ` Steven Rostedt
  2018-12-22 16:46           ` Namhyung Kim
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2018-12-22 16:16 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Greg Kroah-Hartman, Joe Perches, Masami Hiramatsu, Tom Zanussi,
	Andreas Schwab, kernel-team

On Sat, 22 Dec 2018 10:12:44 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Sat, 22 Dec 2018 23:24:11 +0900
> Namhyung Kim <namhyung@kernel.org> wrote:
> 
> > > No, because we don't know the length of str.
> > > 
> > > 
> > > 	[ str = "h\0[bad memory]" ]


> > 
> > I don't know what's the bad memory causing memory fault but anyway

What I meant by that is if a string is allocated at a end of a page,
and the next page is marked as not present. A read into that page will
cause a page fault, and since memcmp() does not stop at the '\0' it
will read into that not-present memory and trigger a fault, and that
read wont be in the exception table, and it will then BUG.

> > memcpy() should stop at the NUL character first as it's different, no?  
> 
> No, that's the difference between memcpy() and strncpy(), memcpy()
> doesn't care about nul characters. It's copying memory not strings.

I think we both meant s/cpy/cmp/ ;-)

-- Steve


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] string.h: Add str_has_prefix() helper function
  2018-12-22 16:16         ` Steven Rostedt
@ 2018-12-22 16:46           ` Namhyung Kim
  2018-12-22 17:19             ` Steven Rostedt
  0 siblings, 1 reply; 15+ messages in thread
From: Namhyung Kim @ 2018-12-22 16:46 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Greg Kroah-Hartman, Joe Perches, Masami Hiramatsu, Tom Zanussi,
	Andreas Schwab, kernel-team

On Sat, Dec 22, 2018 at 11:16:30AM -0500, Steven Rostedt wrote:
> On Sat, 22 Dec 2018 10:12:44 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Sat, 22 Dec 2018 23:24:11 +0900
> > Namhyung Kim <namhyung@kernel.org> wrote:
> > 
> > > > No, because we don't know the length of str.
> > > > 
> > > > 
> > > > 	[ str = "h\0[bad memory]" ]
> 
> 
> > > 
> > > I don't know what's the bad memory causing memory fault but anyway
> 
> What I meant by that is if a string is allocated at a end of a page,
> and the next page is marked as not present. A read into that page will
> cause a page fault, and since memcmp() does not stop at the '\0' it
> will read into that not-present memory and trigger a fault, and that
> read wont be in the exception table, and it will then BUG.

Why it doesn't stop at the '\0' if one has it and the other doesn't?
It's not because it's '\0', it's because they are different.  The '\0'
should be in the prev page (otherwise it's already a BUG) so it should
be detected and stopped before going to next page IMHO.


> 
> > > memcpy() should stop at the NUL character first as it's different, no?  
> > 
> > No, that's the difference between memcpy() and strncpy(), memcpy()
> > doesn't care about nul characters. It's copying memory not strings.
> 
> I think we both meant s/cpy/cmp/ ;-)

Sure..  sorry about that.

I know the difference between memcpy() and strcpy().  But this is
comparing so it should stop.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] string.h: Add str_has_prefix() helper function
  2018-12-22 16:46           ` Namhyung Kim
@ 2018-12-22 17:19             ` Steven Rostedt
  2018-12-22 17:23               ` Steven Rostedt
  2018-12-23  3:05               ` Namhyung Kim
  0 siblings, 2 replies; 15+ messages in thread
From: Steven Rostedt @ 2018-12-22 17:19 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Greg Kroah-Hartman, Joe Perches, Masami Hiramatsu, Tom Zanussi,
	Andreas Schwab, kernel-team

On Sun, 23 Dec 2018 01:46:05 +0900
Namhyung Kim <namhyung@kernel.org> wrote:


> > What I meant by that is if a string is allocated at a end of a page,
> > and the next page is marked as not present. A read into that page will
> > cause a page fault, and since memcmp() does not stop at the '\0' it
> > will read into that not-present memory and trigger a fault, and that
> > read wont be in the exception table, and it will then BUG.  
> 
> Why it doesn't stop at the '\0' if one has it and the other doesn't?
> It's not because it's '\0', it's because they are different.  The '\0'
> should be in the prev page (otherwise it's already a BUG) so it should
> be detected and stopped before going to next page IMHO.
> 

Because memcmp() isn't required to test byte by byte. In fact, most
implementations don't which is why memcmp is faster than strcncmp.

It can be checking in 8 byte chunks or more (although perhaps not
likely). Perhaps there's an arch command that lets you compare 32 bytes
at a time, if the size passed to memcmp is 32 or more, the
implementation is allowed to read both src and dst of 32 bytes at a
time. If there was a '\0' followed by not present memory, you will
still get that fault.

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] string.h: Add str_has_prefix() helper function
  2018-12-22 17:19             ` Steven Rostedt
@ 2018-12-22 17:23               ` Steven Rostedt
  2018-12-22 17:24                 ` Steven Rostedt
  2018-12-23  3:05               ` Namhyung Kim
  1 sibling, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2018-12-22 17:23 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Greg Kroah-Hartman, Joe Perches, Masami Hiramatsu, Tom Zanussi,
	Andreas Schwab, kernel-team

On Sat, 22 Dec 2018 12:19:11 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> Because memcmp() isn't required to test byte by byte. In fact, most
> implementations don't which is why memcmp is faster than strcncmp.

In fact, if memcmp() was safe to use if we only knew the size of one of
the parameters, then there would be no reason for strncmp to exist.

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] string.h: Add str_has_prefix() helper function
  2018-12-22 17:23               ` Steven Rostedt
@ 2018-12-22 17:24                 ` Steven Rostedt
  2018-12-23  3:13                   ` Namhyung Kim
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2018-12-22 17:24 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Greg Kroah-Hartman, Joe Perches, Masami Hiramatsu, Tom Zanussi,
	Andreas Schwab, kernel-team

On Sat, 22 Dec 2018 12:23:35 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Sat, 22 Dec 2018 12:19:11 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > Because memcmp() isn't required to test byte by byte. In fact, most
> > implementations don't which is why memcmp is faster than strcncmp.  
> 
> In fact, if memcmp() was safe to use if we only knew the size of one of
> the parameters, then there would be no reason for strncmp to exist.
>

Also, I believe there are some memcmp implementations that start at the
end of the memory locations, not the beginning. That is, it compares
backwards. Which is also legit for memcmp to do.

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] string.h: Add str_has_prefix() helper function
  2018-12-22 17:19             ` Steven Rostedt
  2018-12-22 17:23               ` Steven Rostedt
@ 2018-12-23  3:05               ` Namhyung Kim
  1 sibling, 0 replies; 15+ messages in thread
From: Namhyung Kim @ 2018-12-23  3:05 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Greg Kroah-Hartman, Joe Perches, Masami Hiramatsu, Tom Zanussi,
	Andreas Schwab, kernel-team

On Sat, Dec 22, 2018 at 12:19:11PM -0500, Steven Rostedt wrote:
> On Sun, 23 Dec 2018 01:46:05 +0900
> Namhyung Kim <namhyung@kernel.org> wrote:
> 
> 
> > > What I meant by that is if a string is allocated at a end of a page,
> > > and the next page is marked as not present. A read into that page will
> > > cause a page fault, and since memcmp() does not stop at the '\0' it
> > > will read into that not-present memory and trigger a fault, and that
> > > read wont be in the exception table, and it will then BUG.  
> > 
> > Why it doesn't stop at the '\0' if one has it and the other doesn't?
> > It's not because it's '\0', it's because they are different.  The '\0'
> > should be in the prev page (otherwise it's already a BUG) so it should
> > be detected and stopped before going to next page IMHO.
> > 
> 
> Because memcmp() isn't required to test byte by byte. In fact, most
> implementations don't which is why memcmp is faster than strcncmp.
> 
> It can be checking in 8 byte chunks or more (although perhaps not
> likely). Perhaps there's an arch command that lets you compare 32 bytes
> at a time, if the size passed to memcmp is 32 or more, the
> implementation is allowed to read both src and dst of 32 bytes at a
> time. If there was a '\0' followed by not present memory, you will
> still get that fault.

I thought such implementation would check the alignment and not cross
the page boundary in a single read.  But it's implementation's choice
and I found that glibc's default implementation for misaligned pointer
reads next chunk as well to form an aligned chunk using shifts.  So
for the safety it'd be better to use strcmp()..

Thanks for your time and the explanation,
Namhyung

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] string.h: Add str_has_prefix() helper function
  2018-12-22 17:24                 ` Steven Rostedt
@ 2018-12-23  3:13                   ` Namhyung Kim
  2018-12-23  3:23                     ` Steven Rostedt
  0 siblings, 1 reply; 15+ messages in thread
From: Namhyung Kim @ 2018-12-23  3:13 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Greg Kroah-Hartman, Joe Perches, Masami Hiramatsu, Tom Zanussi,
	Andreas Schwab, kernel-team

On Sat, Dec 22, 2018 at 12:24:54PM -0500, Steven Rostedt wrote:
> On Sat, 22 Dec 2018 12:23:35 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Sat, 22 Dec 2018 12:19:11 -0500
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > Because memcmp() isn't required to test byte by byte. In fact, most
> > > implementations don't which is why memcmp is faster than strcncmp.  
> > 
> > In fact, if memcmp() was safe to use if we only knew the size of one of
> > the parameters, then there would be no reason for strncmp to exist.
> >
> 
> Also, I believe there are some memcmp implementations that start at the
> end of the memory locations, not the beginning. That is, it compares
> backwards. Which is also legit for memcmp to do.

I'm not sure, the man page says:

    RETURN VALUE
       The memcmp() function returns an integer less than, equal to,
       or greater than zero if the first n bytes of s1 is found,
       respectively, to be less than, to match, or be greater than
       the first n bytes of s2.

       For a nonzero return value, the sign is determined by the sign
       of the difference between the first pair of bytes (interpreted
       as unsigned char) that differ in s1 and s2.

       If n is zero, the return value is zero.


It should return difference in the first pair of bytes that differ so
I guess implementations should compare from the beginning.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] string.h: Add str_has_prefix() helper function
  2018-12-23  3:13                   ` Namhyung Kim
@ 2018-12-23  3:23                     ` Steven Rostedt
  0 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2018-12-23  3:23 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Greg Kroah-Hartman, Joe Perches, Masami Hiramatsu, Tom Zanussi,
	Andreas Schwab, kernel-team

On Sun, 23 Dec 2018 12:13:43 +0900
Namhyung Kim <namhyung@kernel.org> wrote:

> > Also, I believe there are some memcmp implementations that start at the
> > end of the memory locations, not the beginning. That is, it compares
> > backwards. Which is also legit for memcmp to do.  
> 
> I'm not sure, the man page says:
> 
>     RETURN VALUE
>        The memcmp() function returns an integer less than, equal to,
>        or greater than zero if the first n bytes of s1 is found,
>        respectively, to be less than, to match, or be greater than
>        the first n bytes of s2.
> 
>        For a nonzero return value, the sign is determined by the sign
>        of the difference between the first pair of bytes (interpreted
>        as unsigned char) that differ in s1 and s2.
> 
>        If n is zero, the return value is zero.
> 
> 
> It should return difference in the first pair of bytes that differ so
> I guess implementations should compare from the beginning.

Ah, makes sense. I think I'm thinking of memcpy() which can start at
the end, or maybe even the deprecated bcmp(). It's been a long time
since I had to deal with the implementations of these.

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [utility perl script] strncmp() -> str_has_prefix() conversions
  2018-12-22  4:19 [PATCH v4] string.h: Add str_has_prefix() helper function Steven Rostedt
  2018-12-22  4:42 ` Steven Rostedt
  2018-12-22  9:33 ` Namhyung Kim
@ 2019-01-11  8:10 ` Joe Perches
  2 siblings, 0 replies; 15+ messages in thread
From: Joe Perches @ 2019-01-11  8:10 UTC (permalink / raw)
  To: Steven Rostedt, LKML; +Cc: Julia Lawall

[-- Attachment #1: Type: text/plain, Size: 2342 bytes --]

On Fri, 2018-12-21 at 23:19 -0500, Steven Rostedt wrote:
> str_has_prefix

A coccinelle script could be more thorough but here
is a trivial perl script that can do most of the
	strncmp() -> str_has_prefix()
conversions where there is a constant string as one of
the first two arguments of strncmp like any of:

	strncmp(foo, "bar", 3)
	strncmp(foo, "bar", strlen("bar"))
	strncmp(foo, "bar", sizeof("bar") - 1)
	strncmp("foo", bar, 3)
	strncmp("foo", bar, strlen("foo"))
	strncmp("foo", bar",
sizeof("foo") - 1)

It could be used with a particular path or file:

$ git grep -w --name-only strncmp <path> | \
  grep -vP '^(tools|scripts)' | \
  while read file ; do \
    echo $file ; \
    perl -i ./strncmp.perl $file ; \
  done

It mostly works, but there are a few uses that
are not converted properly when the non const
string argument to strncmp is an expression like

	strncmp(a+b, "foo", 3)

There are also strncmp uses that remain after
this script is run where strncmp should just
be converted to strcmp instead like:

	strncmp(p, "foo", sizeof(foo))
and
	strncmp(p, "foo", 4)

The script converts the most common cases:

    ## counted length of string
    # strncmp(arg, string, counted length of string) == 0
    # strncmp(arg, string, counted length of string) != 0
    # !strncmp(arg, string, counted length of string)
    # strncmp(arg, string, counted length of string)

    ## Reversed string/arg counted length of string uses
    # strncmp(string, arg, counted length of string) == 0
    # strncmp(string, arg, counted length of string) != 0
    # !strncmp(string, arg, counted length of string)
    # strncmp(string, arg, counted length of string)

    ## strlen uses
    # strncmp(arg, string, strlen(string)) == 0
    # !strncmp(arg, string, strlen(string))

    ## reversed string/arg strlen uses
    # strncmp(string, arg, strlen(string)) == 0
    # !strncmp(string, arg, strlen(string))

    ## 'sizeof(string) - 1' uses
    # strncmp(arg, string, sizeof(string) - 1) == 0
    # !strncmp(arg, string, sizeof(string) - 1)


On linux-next, running the script below

$ git grep -w --name-only strncmp | \
  grep -vP '^(tools|scripts)' | \
  while read file ; do \
    echo $file ; \
    perl -i ./strncmp.perl $file ; \
  done

produces:

$ git diff --shortstat
 437 files changed, 1483 insertions(+), 1500 deletions(-)


[-- Attachment #2: strncmp.perl --]
[-- Type: application/x-perl, Size: 6945 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2019-01-11  8:10 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-22  4:19 [PATCH v4] string.h: Add str_has_prefix() helper function Steven Rostedt
2018-12-22  4:42 ` Steven Rostedt
2018-12-22  9:33 ` Namhyung Kim
2018-12-22 12:24   ` Steven Rostedt
2018-12-22 14:24     ` Namhyung Kim
2018-12-22 15:12       ` Steven Rostedt
2018-12-22 16:16         ` Steven Rostedt
2018-12-22 16:46           ` Namhyung Kim
2018-12-22 17:19             ` Steven Rostedt
2018-12-22 17:23               ` Steven Rostedt
2018-12-22 17:24                 ` Steven Rostedt
2018-12-23  3:13                   ` Namhyung Kim
2018-12-23  3:23                     ` Steven Rostedt
2018-12-23  3:05               ` Namhyung Kim
2019-01-11  8:10 ` [utility perl script] strncmp() -> str_has_prefix() conversions Joe Perches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).