All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v3 0/4] Make sscanf() stricter
@ 2023-06-12 11:59 Alexey Dobriyan
  2023-06-12 20:25 ` Demi Marie Obenour
  0 siblings, 1 reply; 20+ messages in thread
From: Alexey Dobriyan @ 2023-06-12 11:59 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Petr Mladek, Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko

> +	bool _placeholder;
> +	return simple_strntoull(cp, INT_MAX, endp, base, &_placeholder);

This can be done without introducing dummy variables:

	void f(bool *b)
	{
	}

	f((bool[1]){});

> > lib/vsprintf.c:3727:26: error: unknown conversion type character ‘!’ in format [-Werror=format=]
> So NAK.

Yeah, ! should go after format specifier like it does for %p.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-12 11:59 [PATCH v3 0/4] Make sscanf() stricter Alexey Dobriyan
@ 2023-06-12 20:25 ` Demi Marie Obenour
  2023-06-12 21:00   ` Andy Shevchenko
  0 siblings, 1 reply; 20+ messages in thread
From: Demi Marie Obenour @ 2023-06-12 20:25 UTC (permalink / raw)
  To: Alexey Dobriyan
  Cc: linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Petr Mladek, Steven Rostedt, Sergey Senozhatsky, Andy Shevchenko

[-- Attachment #1: Type: text/plain, Size: 737 bytes --]

On Mon, Jun 12, 2023 at 02:59:38PM +0300, Alexey Dobriyan wrote:
> > +	bool _placeholder;
> > +	return simple_strntoull(cp, INT_MAX, endp, base, &_placeholder);
> 
> This can be done without introducing dummy variables:
> 
> 	void f(bool *b)
> 	{
> 	}
> 
> 	f((bool[1]){});

This is more consise, but (at least to me) significantly less readable.

> > > lib/vsprintf.c:3727:26: error: unknown conversion type character ‘!’ in format [-Werror=format=]
> > So NAK.
> 
> Yeah, ! should go after format specifier like it does for %p.

I hadn't considered that.  Is the typical approach in Linux to use e.g.
%d%[!] if one wants a literal '!'?
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-12 20:25 ` Demi Marie Obenour
@ 2023-06-12 21:00   ` Andy Shevchenko
  2023-06-12 21:23     ` Demi Marie Obenour
  0 siblings, 1 reply; 20+ messages in thread
From: Andy Shevchenko @ 2023-06-12 21:00 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Alexey Dobriyan, linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Petr Mladek, Steven Rostedt, Sergey Senozhatsky

On Mon, Jun 12, 2023 at 04:25:01PM -0400, Demi Marie Obenour wrote:
> On Mon, Jun 12, 2023 at 02:59:38PM +0300, Alexey Dobriyan wrote:
> > > +	bool _placeholder;
> > > +	return simple_strntoull(cp, INT_MAX, endp, base, &_placeholder);
> > 
> > This can be done without introducing dummy variables:
> > 
> > 	void f(bool *b)
> > 	{
> > 	}
> > 
> > 	f((bool[1]){});
> 
> This is more consise, but (at least to me) significantly less readable.
> 
> > > > lib/vsprintf.c:3727:26: error: unknown conversion type character ‘!’ in format [-Werror=format=]
> > > So NAK.
> > 
> > Yeah, ! should go after format specifier like it does for %p.
> 
> I hadn't considered that.  Is the typical approach in Linux to use e.g.
> %d%[!] if one wants a literal '!'?

It might be that the cleanest way we have is to create %p-like extensions to
sscanf(). %p takes alnum as parameter and that is usually works since it makes
a little sense to attach alnum suffix to the pointer.

(I don't like to have %dX, where X is alnum as we expanding our hack to
 something which people don't expect to be altered even in the kernelm, you may
 refer to the discussion about %de for printing errors)


-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-12 21:00   ` Andy Shevchenko
@ 2023-06-12 21:23     ` Demi Marie Obenour
  2023-06-12 22:16       ` Andy Shevchenko
  2023-06-13 13:02       ` David Laight
  0 siblings, 2 replies; 20+ messages in thread
From: Demi Marie Obenour @ 2023-06-12 21:23 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Alexey Dobriyan, linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Petr Mladek, Steven Rostedt, Sergey Senozhatsky

[-- Attachment #1: Type: text/plain, Size: 1818 bytes --]

On Tue, Jun 13, 2023 at 12:00:44AM +0300, Andy Shevchenko wrote:
> On Mon, Jun 12, 2023 at 04:25:01PM -0400, Demi Marie Obenour wrote:
> > On Mon, Jun 12, 2023 at 02:59:38PM +0300, Alexey Dobriyan wrote:
> > > > +	bool _placeholder;
> > > > +	return simple_strntoull(cp, INT_MAX, endp, base, &_placeholder);
> > > 
> > > This can be done without introducing dummy variables:
> > > 
> > > 	void f(bool *b)
> > > 	{
> > > 	}
> > > 
> > > 	f((bool[1]){});
> > 
> > This is more consise, but (at least to me) significantly less readable.
> > 
> > > > > lib/vsprintf.c:3727:26: error: unknown conversion type character ‘!’ in format [-Werror=format=]
> > > > So NAK.
> > > 
> > > Yeah, ! should go after format specifier like it does for %p.
> > 
> > I hadn't considered that.  Is the typical approach in Linux to use e.g.
> > %d%[!] if one wants a literal '!'?
> 
> It might be that the cleanest way we have is to create %p-like extensions to
> sscanf(). %p takes alnum as parameter and that is usually works since it makes
> a little sense to attach alnum suffix to the pointer.
> 
> (I don't like to have %dX, where X is alnum as we expanding our hack to
>  something which people don't expect to be altered even in the kernelm, you may
>  refer to the discussion about %de for printing errors)

Personally I’m not too worried about compatibility with userspace
sscanf(), except to the extent that -Werror=format can keep working.
Userspace sscanf() is almost useless: it has undefined behavior on
integer overflow and swallows spaces that should usually be rejected.
I typically either use strto*l() or (as I am currently doing for Xen’s
toolstack) just write my own parsing functions from scratch.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-12 21:23     ` Demi Marie Obenour
@ 2023-06-12 22:16       ` Andy Shevchenko
  2023-06-13 13:02       ` David Laight
  1 sibling, 0 replies; 20+ messages in thread
From: Andy Shevchenko @ 2023-06-12 22:16 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Alexey Dobriyan, linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Petr Mladek, Steven Rostedt, Sergey Senozhatsky

On Mon, Jun 12, 2023 at 05:23:18PM -0400, Demi Marie Obenour wrote:
> On Tue, Jun 13, 2023 at 12:00:44AM +0300, Andy Shevchenko wrote:
> > On Mon, Jun 12, 2023 at 04:25:01PM -0400, Demi Marie Obenour wrote:
> > > On Mon, Jun 12, 2023 at 02:59:38PM +0300, Alexey Dobriyan wrote:
> > > > > +	bool _placeholder;
> > > > > +	return simple_strntoull(cp, INT_MAX, endp, base, &_placeholder);
> > > > 
> > > > This can be done without introducing dummy variables:
> > > > 
> > > > 	void f(bool *b)
> > > > 	{
> > > > 	}
> > > > 
> > > > 	f((bool[1]){});
> > > 
> > > This is more consise, but (at least to me) significantly less readable.
> > > 
> > > > > > lib/vsprintf.c:3727:26: error: unknown conversion type character ‘!’ in format [-Werror=format=]
> > > > > So NAK.
> > > > 
> > > > Yeah, ! should go after format specifier like it does for %p.
> > > 
> > > I hadn't considered that.  Is the typical approach in Linux to use e.g.
> > > %d%[!] if one wants a literal '!'?
> > 
> > It might be that the cleanest way we have is to create %p-like extensions to
> > sscanf(). %p takes alnum as parameter and that is usually works since it makes
> > a little sense to attach alnum suffix to the pointer.
> > 
> > (I don't like to have %dX, where X is alnum as we expanding our hack to
> >  something which people don't expect to be altered even in the kernelm, you may
> >  refer to the discussion about %de for printing errors)
> 
> Personally I’m not too worried about compatibility with userspace
> sscanf(), except to the extent that -Werror=format can keep working.
> Userspace sscanf() is almost useless: it has undefined behavior on
> integer overflow and swallows spaces that should usually be rejected.
> I typically either use strto*l() or (as I am currently doing for Xen’s
> toolstack) just write my own parsing functions from scratch.

`man sscanf` tells about %p, and currently we have no use (if I'm not mistaken)
for %pj in printf(), so that can be used for %pj in sscanf() to avoid ambiguity
with possible extensions to actually parse our %p extension-like strings.

Not sure if others support the idea.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-12 21:23     ` Demi Marie Obenour
  2023-06-12 22:16       ` Andy Shevchenko
@ 2023-06-13 13:02       ` David Laight
  2023-06-13 15:35         ` Demi Marie Obenour
  1 sibling, 1 reply; 20+ messages in thread
From: David Laight @ 2023-06-13 13:02 UTC (permalink / raw)
  To: 'Demi Marie Obenour', Andy Shevchenko
  Cc: Alexey Dobriyan, linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Petr Mladek, Steven Rostedt, Sergey Senozhatsky

From: Demi Marie Obenour
> Sent: 12 June 2023 22:23
....
> sscanf(), except to the extent that -Werror=format can keep working.
> Userspace sscanf() is almost useless: it has undefined behavior on
> integer overflow and swallows spaces that should usually be rejected.

scanf() is designed for parsing space separated data.
Eating spaces it part of its job description.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-13 13:02       ` David Laight
@ 2023-06-13 15:35         ` Demi Marie Obenour
  2023-06-14  8:23           ` David Laight
  0 siblings, 1 reply; 20+ messages in thread
From: Demi Marie Obenour @ 2023-06-13 15:35 UTC (permalink / raw)
  To: David Laight, Andy Shevchenko
  Cc: Alexey Dobriyan, linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Petr Mladek, Steven Rostedt, Sergey Senozhatsky

[-- Attachment #1: Type: text/plain, Size: 896 bytes --]

On Tue, Jun 13, 2023 at 01:02:59PM +0000, David Laight wrote:
> From: Demi Marie Obenour
> > Sent: 12 June 2023 22:23
> ....
> > sscanf(), except to the extent that -Werror=format can keep working.
> > Userspace sscanf() is almost useless: it has undefined behavior on
> > integer overflow and swallows spaces that should usually be rejected.
> 
> scanf() is designed for parsing space separated data.
> Eating spaces it part of its job description.
> 
> 	David

In this case I would prefer to have two versions: one that eats spaces
and one that does not.  For instance, I don’t think any user of
xenbus_scanf() wants the space-swallowing behavior.  This can be worked
around in xenbus_scanf(), of course, by having it reject strings with
spaces (as determened by isspace()) before calling vsscanf().
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-13 15:35         ` Demi Marie Obenour
@ 2023-06-14  8:23           ` David Laight
  2023-06-14 20:08             ` Demi Marie Obenour
  0 siblings, 1 reply; 20+ messages in thread
From: David Laight @ 2023-06-14  8:23 UTC (permalink / raw)
  To: 'Demi Marie Obenour', Andy Shevchenko
  Cc: Alexey Dobriyan, linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Petr Mladek, Steven Rostedt, Sergey Senozhatsky

From: Demi Marie Obenour
> Sent: 13 June 2023 16:35
> 
> On Tue, Jun 13, 2023 at 01:02:59PM +0000, David Laight wrote:
> > From: Demi Marie Obenour
> > > Sent: 12 June 2023 22:23
> > ....
> > > sscanf(), except to the extent that -Werror=format can keep working.
> > > Userspace sscanf() is almost useless: it has undefined behavior on
> > > integer overflow and swallows spaces that should usually be rejected.
> >
> > scanf() is designed for parsing space separated data.
> > Eating spaces it part of its job description.
> >
> > 	David
> 
> In this case I would prefer to have two versions: one that eats spaces
> and one that does not.  For instance, I don’t think any user of
> xenbus_scanf() wants the space-swallowing behavior.  This can be worked
> around in xenbus_scanf(), of course, by having it reject strings with
> spaces (as determened by isspace()) before calling vsscanf().

What sort of formats and data are being used?
The "%s" format terminates on whitespace.
Even stroul() (and friends) will skip leading whitespace.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-14  8:23           ` David Laight
@ 2023-06-14 20:08             ` Demi Marie Obenour
  2023-06-15  8:06               ` David Laight
  0 siblings, 1 reply; 20+ messages in thread
From: Demi Marie Obenour @ 2023-06-14 20:08 UTC (permalink / raw)
  To: David Laight, Andy Shevchenko
  Cc: Alexey Dobriyan, linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Petr Mladek, Steven Rostedt, Sergey Senozhatsky

[-- Attachment #1: Type: text/plain, Size: 1453 bytes --]

On Wed, Jun 14, 2023 at 08:23:56AM +0000, David Laight wrote:
> From: Demi Marie Obenour
> > Sent: 13 June 2023 16:35
> > 
> > On Tue, Jun 13, 2023 at 01:02:59PM +0000, David Laight wrote:
> > > From: Demi Marie Obenour
> > > > Sent: 12 June 2023 22:23
> > > ....
> > > > sscanf(), except to the extent that -Werror=format can keep working.
> > > > Userspace sscanf() is almost useless: it has undefined behavior on
> > > > integer overflow and swallows spaces that should usually be rejected.
> > >
> > > scanf() is designed for parsing space separated data.
> > > Eating spaces it part of its job description.
> > >
> > > 	David
> > 
> > In this case I would prefer to have two versions: one that eats spaces
> > and one that does not.  For instance, I don’t think any user of
> > xenbus_scanf() wants the space-swallowing behavior.  This can be worked
> > around in xenbus_scanf(), of course, by having it reject strings with
> > spaces (as determened by isspace()) before calling vsscanf().
> 
> What sort of formats and data are being used?

Base-10 or base-16 integers, with whitespace never being valid.

> The "%s" format terminates on whitespace.
> Even stroul() (and friends) will skip leading whitespace.

Yes, which is a reason that strto*l() are just broken IMO.  I’m trying
to replace their uses in Xen with custom parsing code.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-14 20:08             ` Demi Marie Obenour
@ 2023-06-15  8:06               ` David Laight
  2023-06-15 11:23                 ` Andy Shevchenko
  0 siblings, 1 reply; 20+ messages in thread
From: David Laight @ 2023-06-15  8:06 UTC (permalink / raw)
  To: 'Demi Marie Obenour', Andy Shevchenko
  Cc: Alexey Dobriyan, linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Petr Mladek, Steven Rostedt, Sergey Senozhatsky

From: Demi Marie Obenour
> Sent: 14 June 2023 21:09
....
> > What sort of formats and data are being used?
> 
> Base-10 or base-16 integers, with whitespace never being valid.

In which case sscanf() really isn't what you are looking for.

> > The "%s" format terminates on whitespace.
> > Even stroul() (and friends) will skip leading whitespace.
> 
> Yes, which is a reason that strto*l() are just broken IMO.

They are not 'broken', that is what is useful most of the time.
The usual problem is that "020" is treated as octal.

> I’m trying to replace their uses in Xen with custom parsing code.

Then write a custom parser :-)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-15  8:06               ` David Laight
@ 2023-06-15 11:23                 ` Andy Shevchenko
  2023-06-15 11:38                   ` David Laight
  2023-06-20 13:34                   ` Petr Mladek
  0 siblings, 2 replies; 20+ messages in thread
From: Andy Shevchenko @ 2023-06-15 11:23 UTC (permalink / raw)
  To: David Laight
  Cc: 'Demi Marie Obenour',
	Alexey Dobriyan, linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Petr Mladek, Steven Rostedt, Sergey Senozhatsky

On Thu, Jun 15, 2023 at 08:06:46AM +0000, David Laight wrote:
> From: Demi Marie Obenour
> > Sent: 14 June 2023 21:09

...

> > > What sort of formats and data are being used?
> > 
> > Base-10 or base-16 integers, with whitespace never being valid.
> 
> In which case sscanf() really isn't what you are looking for.
> 
> > > The "%s" format terminates on whitespace.
> > > Even stroul() (and friends) will skip leading whitespace.
> > 
> > Yes, which is a reason that strto*l() are just broken IMO.
> 
> They are not 'broken', that is what is useful most of the time.
> The usual problem is that "020" is treated as octal.
> 
> > I’m trying to replace their uses in Xen with custom parsing code.
> 
> Then write a custom parser :-)

Hmm... Usually we are against zillion implementations of the same with zillion
bugs hidden (each buggy implementation with its own bugs).

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-15 11:23                 ` Andy Shevchenko
@ 2023-06-15 11:38                   ` David Laight
  2023-06-20 13:34                   ` Petr Mladek
  1 sibling, 0 replies; 20+ messages in thread
From: David Laight @ 2023-06-15 11:38 UTC (permalink / raw)
  To: 'Andy Shevchenko'
  Cc: 'Demi Marie Obenour',
	Alexey Dobriyan, linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Petr Mladek, Steven Rostedt, Sergey Senozhatsky

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Sent: 15 June 2023 12:24
> ...
> 
> > > > What sort of formats and data are being used?
> > >
> > > Base-10 or base-16 integers, with whitespace never being valid.
> >
> > In which case sscanf() really isn't what you are looking for.
> >
> > > > The "%s" format terminates on whitespace.
> > > > Even stroul() (and friends) will skip leading whitespace.
> > >
> > > Yes, which is a reason that strto*l() are just broken IMO.
> >
> > They are not 'broken', that is what is useful most of the time.
> > The usual problem is that "020" is treated as octal.
> >
> > > I’m trying to replace their uses in Xen with custom parsing code.
> >
> > Then write a custom parser :-)
> 
> Hmm... Usually we are against zillion implementations of the same with zillion
> bugs hidden (each buggy implementation with its own bugs).

Using strtoull() for the actual numeric conversion removes most of that.

I am intrigued about why you want to error leading whitespace.
I bet you'll find something that is passing space-padded input.
It might be against the letter of the spec - but it doesn't mean
it doesn't happen.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-15 11:23                 ` Andy Shevchenko
  2023-06-15 11:38                   ` David Laight
@ 2023-06-20 13:34                   ` Petr Mladek
  2023-06-20 13:52                     ` Andy Shevchenko
  2023-06-21  0:56                     ` Demi Marie Obenour
  1 sibling, 2 replies; 20+ messages in thread
From: Petr Mladek @ 2023-06-20 13:34 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: David Laight, 'Demi Marie Obenour',
	Alexey Dobriyan, linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Steven Rostedt, Sergey Senozhatsky

On Thu 2023-06-15 14:23:59, Andy Shevchenko wrote:
> On Thu, Jun 15, 2023 at 08:06:46AM +0000, David Laight wrote:
> > From: Demi Marie Obenour
> > > Sent: 14 June 2023 21:09
> 
> ...
> 
> > > > What sort of formats and data are being used?
> > > 
> > > Base-10 or base-16 integers, with whitespace never being valid.
> > 
> > In which case sscanf() really isn't what you are looking for.
> > 
> > > > The "%s" format terminates on whitespace.
> > > > Even stroul() (and friends) will skip leading whitespace.
> > > 
> > > Yes, which is a reason that strto*l() are just broken IMO.
> >
> > They are not 'broken', that is what is useful most of the time.
> > The usual problem is that "020" is treated as octal.

I do not know how many users depend on this behavior. But I believe
that there are such users. And breaking compatibility with userspace
implementation would make more harm then good in this case.

> > > I’m trying to replace their uses in Xen with custom parsing code.
> > 
> > Then write a custom parser :-)

Honestly, I dislike any sscanf() modification which have been suggested
so far:

  + %!d is not acceptable because it produces compiler errors

  + %d! is not acceptable because "use 64!" is a realistic string.
    We could not be sure that "<number>!" will never be parsed
    in kernel.

  + %d%[!] produces compiler error either. It is hard to parse by eyes.
    Also the meaning of such a format would be far from obvious.

  + %pj or another %p modifiers would be hard to understand either.

    Yes, we have %pe but I think that only few people really use it.
    And it is kind of self-explanatory because it is typically
    used together with ERR_PTR() and with variables called
    "err" or "ret".


> Hmm... Usually we are against zillion implementations of the same with zillion
> bugs hidden (each buggy implementation with its own bugs).

I would really like to see the code depending on it. The cover letter
suggests that there already is a patch with such a custom parser.
I am sorry if it has already been mentioned. There were so many threads.

Sure, we do not want two full featured sscanf() implementations. But a
wrapper checking for leading whitespace and using kstrto<foo>
family does not sound too complex.

There should always be a good reason to introduce an incompatibility
between the kernel and the userspace implementation of a commonly
used API.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-20 13:34                   ` Petr Mladek
@ 2023-06-20 13:52                     ` Andy Shevchenko
  2023-06-20 13:54                       ` Andy Shevchenko
  2023-06-20 14:57                       ` Petr Mladek
  2023-06-21  0:56                     ` Demi Marie Obenour
  1 sibling, 2 replies; 20+ messages in thread
From: Andy Shevchenko @ 2023-06-20 13:52 UTC (permalink / raw)
  To: Petr Mladek
  Cc: David Laight, 'Demi Marie Obenour',
	Alexey Dobriyan, linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Steven Rostedt, Sergey Senozhatsky

On Tue, Jun 20, 2023 at 03:34:09PM +0200, Petr Mladek wrote:
> On Thu 2023-06-15 14:23:59, Andy Shevchenko wrote:
> > On Thu, Jun 15, 2023 at 08:06:46AM +0000, David Laight wrote:

...

>   + %pj or another %p modifiers would be hard to understand either.
> 
>     Yes, we have %pe but I think that only few people really use it.
>     And it is kind of self-explanatory because it is typically
>     used together with ERR_PTR() and with variables called
>     "err" or "ret".

j, besides the luck of no (yet) use in the kernel's printf(), is
described for printf(3)

  j   A  following integer conversion corresponds to an intmax_t or uintmax_t
      argument, or a following n conversion corresponds to a pointer to an
      intmax_t argument.

So, I this among all proposals, this one is the best (while all of them may
sound not good).

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-20 13:52                     ` Andy Shevchenko
@ 2023-06-20 13:54                       ` Andy Shevchenko
  2023-06-20 14:57                       ` Petr Mladek
  1 sibling, 0 replies; 20+ messages in thread
From: Andy Shevchenko @ 2023-06-20 13:54 UTC (permalink / raw)
  To: Petr Mladek
  Cc: David Laight, 'Demi Marie Obenour',
	Alexey Dobriyan, linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Steven Rostedt, Sergey Senozhatsky

On Tue, Jun 20, 2023 at 04:52:43PM +0300, Andy Shevchenko wrote:
> On Tue, Jun 20, 2023 at 03:34:09PM +0200, Petr Mladek wrote:
> > On Thu 2023-06-15 14:23:59, Andy Shevchenko wrote:
> > > On Thu, Jun 15, 2023 at 08:06:46AM +0000, David Laight wrote:

...

> >   + %pj or another %p modifiers would be hard to understand either.
> > 
> >     Yes, we have %pe but I think that only few people really use it.
> >     And it is kind of self-explanatory because it is typically
> >     used together with ERR_PTR() and with variables called
> >     "err" or "ret".
> 
> j, besides the luck of no (yet) use in the kernel's printf(), is
> described for printf(3)
> 
>   j   A  following integer conversion corresponds to an intmax_t or uintmax_t
>       argument, or a following n conversion corresponds to a pointer to an
>       intmax_t argument.
> 
> So, I this among all proposals, this one is the best (while all of them may

s/this/think

> sound not good).

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-20 13:52                     ` Andy Shevchenko
  2023-06-20 13:54                       ` Andy Shevchenko
@ 2023-06-20 14:57                       ` Petr Mladek
  2023-06-20 15:05                         ` Andy Shevchenko
  1 sibling, 1 reply; 20+ messages in thread
From: Petr Mladek @ 2023-06-20 14:57 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: David Laight, 'Demi Marie Obenour',
	Alexey Dobriyan, linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Steven Rostedt, Sergey Senozhatsky

On Tue 2023-06-20 16:52:42, Andy Shevchenko wrote:
> On Tue, Jun 20, 2023 at 03:34:09PM +0200, Petr Mladek wrote:
> > On Thu 2023-06-15 14:23:59, Andy Shevchenko wrote:
> > > On Thu, Jun 15, 2023 at 08:06:46AM +0000, David Laight wrote:
> 
> ...
> 
> >   + %pj or another %p modifiers would be hard to understand either.
> > 
> >     Yes, we have %pe but I think that only few people really use it.
> >     And it is kind of self-explanatory because it is typically
> >     used together with ERR_PTR() and with variables called
> >     "err" or "ret".
> 
> j, besides the luck of no (yet) use in the kernel's printf(), is
> described for printf(3)
> 
>   j   A  following integer conversion corresponds to an intmax_t or uintmax_t
>       argument, or a following n conversion corresponds to a pointer to an
>       intmax_t argument.

I see, I have missed this coincidence. And we would really need to use %pj.
%jd requires intmax_t variable. Otherwise, the compiler produces:

kernel/lib/test.c:10:17: error: format ‘%jd’ expects argument of type ‘intmax_t *’, but argument 3 has type ‘int *’ [-Werror=format=]
  sscanf(str, "%jd hello.", &tmp);

Hmm, %pj might even make some sense for sscanf() which requires pointers anyway.
But still, we would lose the compiler check of the size of the passed
buffer.

This is actually my concern with many other %p modifiers. The compiler
is not able to check that we pass the right pointer. I know that this
might happen even with wrong buffer passed to %s or so. But number
types is another category.


> So, I think among all proposals, this one is the best (while all of them may
> sound not good).

I still prefer the custom handler when it is not too complex.

Or if there are many users, we could create sscanf_strict() or so.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-20 14:57                       ` Petr Mladek
@ 2023-06-20 15:05                         ` Andy Shevchenko
  0 siblings, 0 replies; 20+ messages in thread
From: Andy Shevchenko @ 2023-06-20 15:05 UTC (permalink / raw)
  To: Petr Mladek
  Cc: David Laight, 'Demi Marie Obenour',
	Alexey Dobriyan, linux-kernel, Rasmus Villemoes, Hans de Goede,
	Mauro Carvalho Chehab, Sakari Ailus, Greg Kroah-Hartman,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Lee Jones, Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino,
	Steven Rostedt, Sergey Senozhatsky

On Tue, Jun 20, 2023 at 04:57:55PM +0200, Petr Mladek wrote:
> On Tue 2023-06-20 16:52:42, Andy Shevchenko wrote:
> > On Tue, Jun 20, 2023 at 03:34:09PM +0200, Petr Mladek wrote:
> > > On Thu 2023-06-15 14:23:59, Andy Shevchenko wrote:
> > > > On Thu, Jun 15, 2023 at 08:06:46AM +0000, David Laight wrote:

...

> > >   + %pj or another %p modifiers would be hard to understand either.
> > > 
> > >     Yes, we have %pe but I think that only few people really use it.
> > >     And it is kind of self-explanatory because it is typically
> > >     used together with ERR_PTR() and with variables called
> > >     "err" or "ret".
> > 
> > j, besides the luck of no (yet) use in the kernel's printf(), is
> > described for printf(3)
> > 
> >   j   A  following integer conversion corresponds to an intmax_t or uintmax_t
> >       argument, or a following n conversion corresponds to a pointer to an
> >       intmax_t argument.
> 
> I see, I have missed this coincidence. And we would really need to use %pj.
> %jd requires intmax_t variable. Otherwise, the compiler produces:
> 
> kernel/lib/test.c:10:17: error: format ‘%jd’ expects argument of type ‘intmax_t *’, but argument 3 has type ‘int *’ [-Werror=format=]
>   sscanf(str, "%jd hello.", &tmp);
> 
> Hmm, %pj might even make some sense for sscanf() which requires pointers anyway.
> But still, we would lose the compiler check of the size of the passed
> buffer.
> 
> This is actually my concern with many other %p modifiers. The compiler
> is not able to check that we pass the right pointer. I know that this
> might happen even with wrong buffer passed to %s or so. But number
> types is another category.

Yeah, it was a discussion IIRC for the compiler plugin to support %p
extensions, but I have no idea where it's now.

> > So, I think among all proposals, this one is the best (while all of them may
> > sound not good).
> 
> I still prefer the custom handler when it is not too complex.
> 
> Or if there are many users, we could create sscanf_strict() or so.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-20 13:34                   ` Petr Mladek
  2023-06-20 13:52                     ` Andy Shevchenko
@ 2023-06-21  0:56                     ` Demi Marie Obenour
  1 sibling, 0 replies; 20+ messages in thread
From: Demi Marie Obenour @ 2023-06-21  0:56 UTC (permalink / raw)
  To: Petr Mladek, Andy Shevchenko
  Cc: David Laight, Alexey Dobriyan, linux-kernel, Rasmus Villemoes,
	Hans de Goede, Mauro Carvalho Chehab, Sakari Ailus,
	Greg Kroah-Hartman, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko, Lee Jones, Andy Lutomirski,
	Thomas Gleixner, Vincenzo Frascino, Steven Rostedt,
	Sergey Senozhatsky

[-- Attachment #1: Type: text/plain, Size: 3056 bytes --]

On Tue, Jun 20, 2023 at 03:34:09PM +0200, Petr Mladek wrote:
> On Thu 2023-06-15 14:23:59, Andy Shevchenko wrote:
> > On Thu, Jun 15, 2023 at 08:06:46AM +0000, David Laight wrote:
> > > From: Demi Marie Obenour
> > > > Sent: 14 June 2023 21:09
> > 
> > ...
> > 
> > > > > What sort of formats and data are being used?
> > > > 
> > > > Base-10 or base-16 integers, with whitespace never being valid.
> > > 
> > > In which case sscanf() really isn't what you are looking for.
> > > 
> > > > > The "%s" format terminates on whitespace.
> > > > > Even stroul() (and friends) will skip leading whitespace.
> > > > 
> > > > Yes, which is a reason that strto*l() are just broken IMO.
> > >
> > > They are not 'broken', that is what is useful most of the time.
> > > The usual problem is that "020" is treated as octal.
> 
> I do not know how many users depend on this behavior. But I believe
> that there are such users. And breaking compatibility with userspace
> implementation would make more harm then good in this case.
> 
> > > > I’m trying to replace their uses in Xen with custom parsing code.
> > > 
> > > Then write a custom parser :-)
> 
> Honestly, I dislike any sscanf() modification which have been suggested
> so far:
> 
>   + %!d is not acceptable because it produces compiler errors
> 
>   + %d! is not acceptable because "use 64!" is a realistic string.
>     We could not be sure that "<number>!" will never be parsed
>     in kernel.
> 
>   + %d%[!] produces compiler error either. It is hard to parse by eyes.
>     Also the meaning of such a format would be far from obvious.
> 
>   + %pj or another %p modifiers would be hard to understand either.
> 
>     Yes, we have %pe but I think that only few people really use it.
>     And it is kind of self-explanatory because it is typically
>     used together with ERR_PTR() and with variables called
>     "err" or "ret".
> 
> 
> > Hmm... Usually we are against zillion implementations of the same with zillion
> > bugs hidden (each buggy implementation with its own bugs).
> 
> I would really like to see the code depending on it. The cover letter
> suggests that there already is a patch with such a custom parser.
> I am sorry if it has already been mentioned. There were so many threads.
> 
> Sure, we do not want two full featured sscanf() implementations. But a
> wrapper checking for leading whitespace and using kstrto<foo>
> family does not sound too complex.
> 
> There should always be a good reason to introduce an incompatibility
> between the kernel and the userspace implementation of a commonly
> used API.
> 
> Best Regards,
> Petr

I strongly believe that overflow should be forbidden by default, but it
turns out that I do not have time to advance this patch further.  My
understanding is that Xen never wants to allow spaces in Xenstore
entries, but that is easy to ensure via an explicit check prior to
calling vsscanf().
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 0/4] Make sscanf() stricter
  2023-06-10 20:40 Demi Marie Obenour
@ 2023-06-12 15:34 ` Andy Shevchenko
  0 siblings, 0 replies; 20+ messages in thread
From: Andy Shevchenko @ 2023-06-12 15:34 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Hans de Goede, Mauro Carvalho Chehab, Sakari Ailus,
	Greg Kroah-Hartman, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko, Lee Jones, Andy Lutomirski,
	Thomas Gleixner, Vincenzo Frascino, Petr Mladek, Steven Rostedt,
	Sergey Senozhatsky, Rasmus Villemoes, linux-media, linux-staging,
	linux-kernel, xen-devel

On Sat, Jun 10, 2023 at 04:40:40PM -0400, Demi Marie Obenour wrote:
> Roger Pau Monné suggested making xenbus_scanf() stricter instead of
> using a custom parser.  Christoph Hellwig asked why the normal vsscanf()
> cannot be made stricter.  Richard Weinberger mentioned Linus Torvalds’s
> suggestion of using ! to allow overflow.

As Rasmus articulated, NAK w.o. test cases being added to all parts where your
changes touch.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 0/4] Make sscanf() stricter
@ 2023-06-10 20:40 Demi Marie Obenour
  2023-06-12 15:34 ` Andy Shevchenko
  0 siblings, 1 reply; 20+ messages in thread
From: Demi Marie Obenour @ 2023-06-10 20:40 UTC (permalink / raw)
  To: Hans de Goede, Mauro Carvalho Chehab, Sakari Ailus,
	Greg Kroah-Hartman, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko, Lee Jones, Andy Lutomirski,
	Thomas Gleixner, Vincenzo Frascino, Petr Mladek, Steven Rostedt,
	Sergey Senozhatsky, Andy Shevchenko, Rasmus Villemoes
  Cc: Demi Marie Obenour, linux-media, linux-staging, linux-kernel, xen-devel

Roger Pau Monné suggested making xenbus_scanf() stricter instead of
using a custom parser.  Christoph Hellwig asked why the normal vsscanf()
cannot be made stricter.  Richard Weinberger mentioned Linus Torvalds’s
suggestion of using ! to allow overflow.

Changes since v2:

- Better commit messages.
- Fix a compile error in simple_strtoll() (found by 0day bot).
- Fix an uninitialized variable (found by Dan Carpenter).

Changes since v1:

- Better commit messages.
- Use ! to explicitly opt-in to allowing overflow.
- Treat overflow as a conversion failure instead of returning ERANGE.
- Drop the first patch (removal of simple_strtoll()) as it breaks
  bcache.
- Stop skipping spaces in vsscanf() instead of adding a separate
  vsscanf_strict() function.

Demi Marie Obenour (4):
  limits.h: add UCHAR_MAX, SCHAR_MAX, and SCHAR_MIN
  vsscanf(): Integer overflow is a conversion failure
  vsscanf(): do not skip spaces
  Reject NUL bytes in xenstore nodes

 .../hive_isp_css_include/platform_support.h   |  1 -
 drivers/xen/xenbus/xenbus_xs.c                | 17 +++-
 include/linux/limits.h                        |  1 +
 include/linux/mfd/wl1273-core.h               |  3 -
 include/vdso/limits.h                         |  3 +
 lib/vsprintf.c                                | 98 +++++++++++++------
 6 files changed, 86 insertions(+), 37 deletions(-)

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-06-21  0:56 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-12 11:59 [PATCH v3 0/4] Make sscanf() stricter Alexey Dobriyan
2023-06-12 20:25 ` Demi Marie Obenour
2023-06-12 21:00   ` Andy Shevchenko
2023-06-12 21:23     ` Demi Marie Obenour
2023-06-12 22:16       ` Andy Shevchenko
2023-06-13 13:02       ` David Laight
2023-06-13 15:35         ` Demi Marie Obenour
2023-06-14  8:23           ` David Laight
2023-06-14 20:08             ` Demi Marie Obenour
2023-06-15  8:06               ` David Laight
2023-06-15 11:23                 ` Andy Shevchenko
2023-06-15 11:38                   ` David Laight
2023-06-20 13:34                   ` Petr Mladek
2023-06-20 13:52                     ` Andy Shevchenko
2023-06-20 13:54                       ` Andy Shevchenko
2023-06-20 14:57                       ` Petr Mladek
2023-06-20 15:05                         ` Andy Shevchenko
2023-06-21  0:56                     ` Demi Marie Obenour
  -- strict thread matches above, loose matches on Subject: below --
2023-06-10 20:40 Demi Marie Obenour
2023-06-12 15:34 ` Andy Shevchenko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.