u-boot.lists.denx.de archive mirror
 help / color / mirror / Atom feed
* [PATCH] lib: fix buggy strcmp and strncmp
@ 2022-10-05  9:09 Rasmus Villemoes
  2022-10-25 23:33 ` Rasmus Villemoes
  2022-10-27 14:54 ` Tom Rini
  0 siblings, 2 replies; 3+ messages in thread
From: Rasmus Villemoes @ 2022-10-05  9:09 UTC (permalink / raw)
  To: u-boot; +Cc: Simon Glass, Tom Rini, Rasmus Villemoes

There are two problems with both strcmp and strncmp:

(1) The C standard is clear that the contents should be compared as
"unsigned char":

  The sign of a nonzero value returned by the comparison functions
  memcmp, strcmp, and strncmp is determined by the sign of the
  difference between the values of the first pair of characters (both
  interpreted as unsigned char) that differ in the objects being
  compared.

(2) The difference between two char (or unsigned char) values can
range from -255 to +255; so that's (due to integer promotion) the
range of values we could get in the *cs-*ct expressions, but when that
is then shoe-horned into an 8-bit quantity the sign may of course
change.

The impact is somewhat limited by the way these functions
are used in practice:

- Most of the time, one is only interested in equality (or for
  strncmp, "starts with"), and the existing functions do correctly
  return 0 if and only if the strings are equal [for strncmp, up to
  the given bound].

- Also most of the time, the strings being compared only consist of
  ASCII characters, i.e. have values in the range [0, 127], and in
  that case it doesn't matter if they are interpreted as signed or
  unsigned char, and the possible difference range is bounded to
  [-127, 127] which does fit the signed char.

For size, one could implement strcmp() in terms of strncmp() - just
make it "return strncmp(a, b, (size_t)-1);". However, performance of
strcmp() does matter somewhat, since it is used all over when parsing
and matching DT nodes and properties, so let's find some other place
to save those ~30 bytes.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---

Please double- and triple-check before applying. I've tested these
against my libc's strcmp() and strncmp() for thousands of random
strings, but I may very well have messed up when copy-pasting back to
lib/string.c.

 lib/string.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/lib/string.c b/lib/string.c
index 78bd65c413..ecea755f40 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -206,16 +206,20 @@ size_t strlcat(char *dest, const char *src, size_t size)
  * @cs: One string
  * @ct: Another string
  */
-int strcmp(const char * cs,const char * ct)
+int strcmp(const char *cs, const char *ct)
 {
-	register signed char __res;
+	int ret;
 
 	while (1) {
-		if ((__res = *cs - *ct++) != 0 || !*cs++)
+		unsigned char a = *cs++;
+		unsigned char b = *ct++;
+
+		ret = a - b;
+		if (ret || !b)
 			break;
 	}
 
-	return __res;
+	return ret;
 }
 #endif
 
@@ -226,17 +230,20 @@ int strcmp(const char * cs,const char * ct)
  * @ct: Another string
  * @count: The maximum number of bytes to compare
  */
-int strncmp(const char * cs,const char * ct,size_t count)
+int strncmp(const char *cs, const char *ct, size_t count)
 {
-	register signed char __res = 0;
+	int ret = 0;
+
+	while (count--) {
+		unsigned char a = *cs++;
+		unsigned char b = *ct++;
 
-	while (count) {
-		if ((__res = *cs - *ct++) != 0 || !*cs++)
+		ret = a - b;
+		if (ret || !b)
 			break;
-		count--;
 	}
 
-	return __res;
+	return ret;
 }
 #endif
 
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] lib: fix buggy strcmp and strncmp
  2022-10-05  9:09 [PATCH] lib: fix buggy strcmp and strncmp Rasmus Villemoes
@ 2022-10-25 23:33 ` Rasmus Villemoes
  2022-10-27 14:54 ` Tom Rini
  1 sibling, 0 replies; 3+ messages in thread
From: Rasmus Villemoes @ 2022-10-25 23:33 UTC (permalink / raw)
  To: u-boot; +Cc: Simon Glass, Tom Rini

On 05/10/2022 11.09, Rasmus Villemoes wrote:
> There are two problems with both strcmp and strncmp:
> 

Hi Tom

Could you consider applying this now to give it ample time to soak
before release? I'm pretty confident it's correct.

Rasmus

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] lib: fix buggy strcmp and strncmp
  2022-10-05  9:09 [PATCH] lib: fix buggy strcmp and strncmp Rasmus Villemoes
  2022-10-25 23:33 ` Rasmus Villemoes
@ 2022-10-27 14:54 ` Tom Rini
  1 sibling, 0 replies; 3+ messages in thread
From: Tom Rini @ 2022-10-27 14:54 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: u-boot, Simon Glass

[-- Attachment #1: Type: text/plain, Size: 1905 bytes --]

On Wed, Oct 05, 2022 at 11:09:25AM +0200, Rasmus Villemoes wrote:

> There are two problems with both strcmp and strncmp:
> 
> (1) The C standard is clear that the contents should be compared as
> "unsigned char":
> 
>   The sign of a nonzero value returned by the comparison functions
>   memcmp, strcmp, and strncmp is determined by the sign of the
>   difference between the values of the first pair of characters (both
>   interpreted as unsigned char) that differ in the objects being
>   compared.
> 
> (2) The difference between two char (or unsigned char) values can
> range from -255 to +255; so that's (due to integer promotion) the
> range of values we could get in the *cs-*ct expressions, but when that
> is then shoe-horned into an 8-bit quantity the sign may of course
> change.
> 
> The impact is somewhat limited by the way these functions
> are used in practice:
> 
> - Most of the time, one is only interested in equality (or for
>   strncmp, "starts with"), and the existing functions do correctly
>   return 0 if and only if the strings are equal [for strncmp, up to
>   the given bound].
> 
> - Also most of the time, the strings being compared only consist of
>   ASCII characters, i.e. have values in the range [0, 127], and in
>   that case it doesn't matter if they are interpreted as signed or
>   unsigned char, and the possible difference range is bounded to
>   [-127, 127] which does fit the signed char.
> 
> For size, one could implement strcmp() in terms of strncmp() - just
> make it "return strncmp(a, b, (size_t)-1);". However, performance of
> strcmp() does matter somewhat, since it is used all over when parsing
> and matching DT nodes and properties, so let's find some other place
> to save those ~30 bytes.
> 
> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>

Applied to u-boot/master, thanks!

-- 
Tom

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 659 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-10-27 14:55 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-05  9:09 [PATCH] lib: fix buggy strcmp and strncmp Rasmus Villemoes
2022-10-25 23:33 ` Rasmus Villemoes
2022-10-27 14:54 ` Tom Rini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).