linux-kernel.vger.kernel.org archive mirror
* [PATCH] vsscanf() in lib/vsprintf.c
@ 2021-05-04 19:19 Stefan Kanthak
  2021-05-05 10:49 ` David Laight
  2021-05-05 14:35 ` Rasmus Villemoes
  0 siblings, 2 replies; 4+ messages in thread
From: Stefan Kanthak @ 2021-05-04 19:19 UTC
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1517 bytes --]

Hi @ll,

both <https://www.kernel.org/doc/htmldocs/kernel-api/API-sscanf.html>
and <https://www.kernel.org/doc/htmldocs/kernel-api/API-vsscanf.html>
are rather terse and fail to specify the supported arguments and their
conversion specifiers/modifiers.

<https://www.kernel.org/doc/htmldocs/kernel-api/libc.html#id-1.4.3>
tells OTOH:

| The behaviour of these functions may vary slightly from those
| defined by ANSI, and these deviations are noted in the text.

There is, however, no such text (see above), despite multiple deviations
from ANSI C:

<https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/lib/vsprintf.c?h=v5.12>

|  /* '%*[' not yet supported, invalid format */
...
|  /*
|   * Warning: This implementation of the '[' conversion specifier
|   * deviates from its glibc counterpart in the following ways:
...

More deviations (just from reading the source):

1. no support for %p
2. no support for conversion modifiers j and t
3. no support for multibyte characters and strings, i.e. %<width>c
   and %<width>s may split UTF-8 codepoints
4. accepts %[<width>]<modifier>[c|s], but ignores all conversion
   modifiers
5. treats %<width><modifier>% (and combinations) as %%
6. accepts %<width><modifier>n (and combinations)
7. doesn't scan the input for %[...]n
8. uses simple_strto[u]l for the conversion modifier z, i.e. assigns
   uint32_t to size_t, resulting in truncation
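
To illustrate point 3, here is a minimal userspace sketch. It uses libc
sscanf() as a stand-in, on the assumption that the kernel's sscanf() counts
bytes for %<width>s in the same way; scan_one_byte() is a hypothetical
helper name, not something from lib/vsprintf.c.

```c
#include <stdio.h>
#include <string.h>

/*
 * Userspace stand-in (assumption: the kernel's %<width>s also counts
 * bytes, not characters).
 *
 * Scans "é" (UTF-8: 0xC3 0xA9) with "%1s" and returns the number of
 * bytes stored in out, excluding the terminating NUL.
 */
size_t scan_one_byte(char out[2])
{
	const char *input = "\xc3\xa9";	/* one code point, two bytes */

	out[0] = '\0';
	if (sscanf(input, "%1s", out) != 1)
		return 0;
	return strlen(out);	/* only the lead byte 0xC3 was stored */
}
```

The buffer ends up holding the lead byte without its continuation byte,
i.e. an incomplete UTF-8 sequence.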

Is this intended?
If not: a patch that fixes 5. and 6. and simplifies the qualifier
        handling is attached.

Stefan Kanthak

[-- Attachment #2: vsprintf.patch --]
[-- Type: application/octet-stream, Size: 1782 bytes --]

--- -/lib/vsprintf.c
+++ +/lib/vsprintf.c
@@ -3287,17 +3287,25 @@
 			str = skip_spaces(str);
 		}
 
+		if (!*fmt)
+			break;
+
 		/* anything that is not a conversion must match exactly */
-		if (*fmt != '%' && *fmt) {
+		if (*fmt != '%') {
 			if (*fmt++ != *str++)
 				break;
 			continue;
 		}
 
-		if (!*fmt)
-			break;
 		++fmt;
 
+		/* %% must match % */
+		if (*fmt == '%') {
+			if (*fmt++ != *str++)
+				break;
+			continue;
+		}
+
 		/* skip this conversion.
 		 * advance both strings to next white space
 		 */
@@ -3315,6 +3323,13 @@
 			continue;
 		}
 
+		if (*fmt == 'n') {
+			/* return number of characters read so far */
+			*va_arg(args, int *) = str - buf;
+			++fmt;
+			continue;
+		}
+
 		/* get field width */
 		field_width = -1;
 		if (isdigit(*fmt)) {
@@ -3325,30 +3340,19 @@
 
 		/* get conversion qualifier */
 		qualifier = -1;
-		if (*fmt == 'h' || _tolower(*fmt) == 'l' ||
-		    *fmt == 'z') {
+		if (*fmt == 'z' || *fmt == 'L')
 			qualifier = *fmt++;
+		else if (*fmt == 'h' || *fmt == 'l') {
+			qualifier = *fmt++;
 			if (unlikely(qualifier == *fmt)) {
-				if (qualifier == 'h') {
-					qualifier = 'H';
-					fmt++;
-				} else if (qualifier == 'l') {
-					qualifier = 'L';
-					fmt++;
-				}
+				qualifier = _toupper(qualifier);
+				fmt++;
 			}
 		}
 
 		if (!*fmt)
 			break;
 
-		if (*fmt == 'n') {
-			/* return number of characters read so far */
-			*va_arg(args, int *) = str - buf;
-			++fmt;
-			continue;
-		}
-
 		if (!*str)
 			break;
 
@@ -3450,11 +3453,6 @@
 			fallthrough;
 		case 'u':
 			break;
-		case '%':
-			/* looking for '%' in str */
-			if (*str++ != '%')
-				return num;
-			continue;
 		default:
 			/* invalid format; stop here */
 			return num;


* RE: [PATCH] vsscanf() in lib/vsprintf.c
  2021-05-04 19:19 [PATCH] vsscanf() in lib/vsprintf.c Stefan Kanthak
@ 2021-05-05 10:49 ` David Laight
  2021-05-05 14:35 ` Rasmus Villemoes
  1 sibling, 0 replies; 4+ messages in thread
From: David Laight @ 2021-05-05 10:49 UTC
  To: 'Stefan Kanthak', linux-kernel

It is so stupendously hard to use scanf() safely
the best thing is probably to just delete it :-)

	David




* Re: [PATCH] vsscanf() in lib/vsprintf.c
  2021-05-04 19:19 [PATCH] vsscanf() in lib/vsprintf.c Stefan Kanthak
  2021-05-05 10:49 ` David Laight
@ 2021-05-05 14:35 ` Rasmus Villemoes
  2021-05-05 16:41   ` Stefan Kanthak
  1 sibling, 1 reply; 4+ messages in thread
From: Rasmus Villemoes @ 2021-05-05 14:35 UTC
  To: Stefan Kanthak, linux-kernel

On 04/05/2021 21.19, Stefan Kanthak wrote:
> Hi @ll,
> 
> both <https://www.kernel.org/doc/htmldocs/kernel-api/API-sscanf.html>
> and <https://www.kernel.org/doc/htmldocs/kernel-api/API-vsscanf.html>
> are rather terse and fail to specify the supported arguments and their
> conversion specifiers/modifiers.
> 
> <https://www.kernel.org/doc/htmldocs/kernel-api/libc.html#id-1.4.3>
> tells OTOH:
> 
> | The behaviour of these functions may vary slightly from those
> | defined by ANSI, and these deviations are noted in the text.
> 
> There is, however, no such text (see above), despite multiple deviations
> from ANSI C:
> 
> <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/lib/vsprintf.c?h=v5.12>
> 
> |  /* '%*[' not yet supported, invalid format */
> ...
> |  /*
> |   * Warning: This implementation of the '[' conversion specifier
> |   * deviates from its glibc counterpart in the following ways:
> ...
> 
> More deviations (just from reading the source):
> 
> 1. no support for %p

What on earth good would that do in the kernel?

> 2. no support for conversion modifiers j and t

Could be added, but do you have a user?

> 3. no support for multibyte characters and strings, i.e. %<width>c
>    and %<width>s may split UTF-8 codepoints

So what? The kernel doesn't do a lot of text processing and wchar_t stuff.

> 4. accepts %[<width>]<modifier>[c|s], but ignores all conversion
>    modifiers

Yeah, %ls is technically accepted and treated as %s, that's mostly for
ease of parsing it seems. Do you have a use case where you'd want wchar_ts?

> 5. treats %<width><modifier>% (and combinations) as %%

What would you expect it to do? Seems to be a non-issue, gcc flags that
nonsense just fine

vs.c: In function ‘v’:
vs.c:5:18: warning: conversion lacks type at end of format [-Wformat=]
    5 |  x = sscanf(s, "%l% %d", &y);
      |                  ^
vs.c:5:20: warning: unknown conversion type character ‘ ’ in format [-Wformat=]
    5 |  x = sscanf(s, "%l% %d", &y);
      |                    ^

> 6. accepts %<width><modifier>n (and combinations)

Again, non-issue (warning: field width used with ‘%n’ gnu_scanf format)

> 7. doesn't scan the input for %[...]n

What's that supposed to mean?

> 8. uses simple_strto[u]l for the conversion modifier z, i.e. assigns
>    uint32_t to size_t, resulting in truncation

Where do you see uint32_t? The code is

                       val.u = qualifier != 'L' ?
                                simple_strtoul(str, &next, base) :
                                simple_strtoull(str, &next, base);

                case 'z':
                        *va_arg(args, size_t *) = val.u;
                        break;

so the conversion is done with simple_strtoul which returns "unsigned
long". And size_t is either a typedef for "unsigned long" or "unsigned
int", so yes, of course a truncation may happen, but if the value
actually fits in a size_t, it also fits in unsigned long (as returned
from simple_strtoul) and unsigned long long (as stored in val.u).
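
A small userspace sketch of that flow, for illustration only: strtoul() is
assumed here as the counterpart of simple_strtoul(), and parse_size() is a
hypothetical helper, not kernel code. The static assertion encodes the
assumption that size_t is no wider than unsigned long, which holds on the
ILP32 and LP64 data models but would fail on LLP64.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumption: ILP32 or LP64, so unsigned long can hold any size_t. */
_Static_assert(sizeof(size_t) <= sizeof(unsigned long),
	       "size_t wider than unsigned long (LLP64?)");

/*
 * Convert with strtoul(), keep the result in an unsigned long long
 * (like val.u above), then store it as size_t.
 */
size_t parse_size(const char *s)
{
	unsigned long long u = strtoul(s, NULL, 10);	/* widened, no loss */

	return (size_t)u;	/* lossless if the value fits in size_t */
}
```

For example, parse_size("123456") yields 123456 on either data model.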

Rasmus


* Re: [PATCH] vsscanf() in lib/vsprintf.c
  2021-05-05 14:35 ` Rasmus Villemoes
@ 2021-05-05 16:41   ` Stefan Kanthak
  0 siblings, 0 replies; 4+ messages in thread
From: Stefan Kanthak @ 2021-05-05 16:41 UTC
  To: linux-kernel, Rasmus Villemoes

Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
> On 04/05/2021 21.19, Stefan Kanthak wrote:
>> Hi @ll,
>>
>> both <https://www.kernel.org/doc/htmldocs/kernel-api/API-sscanf.html>
>> and <https://www.kernel.org/doc/htmldocs/kernel-api/API-vsscanf.html>
>> are rather terse and fail to specify the supported arguments and their
>> conversion specifiers/modifiers.
>>
>> <https://www.kernel.org/doc/htmldocs/kernel-api/libc.html#id-1.4.3>
>> tells OTOH:
>>
>> | The behaviour of these functions may vary slightly from those
>> | defined by ANSI, and these deviations are noted in the text.
>>
>> There is, however, no such text (see above), despite multiple deviations
>> from ANSI C:
>>
>> <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/lib/vsprintf.c?h=v5.12>
>>
>> |  /* '%*[' not yet supported, invalid format */
>> ...
>> |  /*
>> |   * Warning: This implementation of the '[' conversion specifier
>> |   * deviates from its glibc counterpart in the following ways:
>> ...
>>
>> More deviations (just from reading the source):
>>
>> 1. no support for %p
>
> What on earth good would that do in the kernel?

| The behaviour of these functions may vary slightly from those
| defined by ANSI, and these deviations are noted in the text.
                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> 2. no support for conversion modifiers j and t
>
> Could be added, but do you have a user?

Just fix your documentation.

>> 3. no support for multibyte characters and strings, i.e. %<width>c
>>    and %<width>s may split UTF-8 codepoints
>
> So what?

It's a BUG!

> The kernel doesn't do a lot of text processing and wchar_t stuff.

Nobody will ever feed a UTF-8 string to the kernel?

>> 4. accepts %[<width>]<modifier>[c|s], but ignores all conversion
>>    modifiers
>
> Yeah, %ls is technically accepted and treated as %s,

just like %Ls and %Hs and %hhs and %zs ..., which the documentation
fails to mention: just fix it.

> that's mostly for ease of parsing it seems. Do you have a use
> case where you'd want wchar_ts?

>> 5. treats %<width><modifier>% (and combinations) as %%
>
> What would you expect it to do?

See the patch: stop and return the number of converted items, like
an ANSI/ISO conformant scanf()
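
For reference, a userspace sketch of how libc sscanf() handles a plain "%%"
(scan_percent() is a hypothetical helper; this only shows the well-defined
"%%" case, not the %<width><modifier>% combinations, which are not shown
here):

```c
#include <stdio.h>

/*
 * "%%" matches one literal '%' in the input and assigns nothing.
 * Returns the number of converted items, as sscanf() does.
 */
int scan_percent(const char *s, int *a, int *b)
{
	return sscanf(s, "%d%% %d", a, b);
}
```

With the input "50% 7" both integers are converted and 2 is returned; with
"50 7" the literal '%' fails to match, scanning stops, and 1 is returned.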

> Seems to be a non-issue, gcc flags that nonsense just fine

Nobody will ever feed a non-constant format string to [v]sscanf()?

>> 6. accepts %<width><modifier>n (and combinations)
>
> Again, non-issue (warning: field width used with ‘%n’ gnu_scanf format)

How does gnu_scanf() handle %0Ln etc.?
Does a warning stop compilation of the kernel?

See above: it's undocumented, and it's not flagged in calls with
a non-constant format string.
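
As a point of comparison, a userspace sketch of plain %n with libc sscanf()
(scan_with_count() is a hypothetical helper; no width or modifier is used,
so this is only the well-defined case):

```c
#include <stdio.h>

/*
 * Scans one integer and records, via %n, how many input characters
 * have been consumed so far. %n does not count as a converted item.
 */
int scan_with_count(const char *s, int *value, int *consumed)
{
	*consumed = -1;	/* stays -1 if scanning stops before %n */
	return sscanf(s, "%d%n", value, consumed);
}
```

For "42 abc" this returns 1 with *value == 42 and *consumed == 2; for "abc"
it returns 0 and *consumed is left at -1.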

>> 7. doesn't scan the input for %[...]n
>
> ? What's that supposed to mean.

Argh, my fault: should have been %*

>> 8. uses simple_strto[u]l for the conversion modifier z, i.e. assigns
>>    uint32_t to size_t, resulting in truncation
>
> Where do you see uint32_t?

I was thinking of LLP64 (32-bit long, 64-bit size_t) rather than LP64,
so my last point is invalid.

Stefan


