linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
To: Alexey Dobriyan <adobriyan@gmail.com>
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 01/10] Add parse_integer() (replacement for simple_strto*())
Date: Mon, 04 May 2015 15:24:28 +0200	[thread overview]
Message-ID: <87h9rsiigj.fsf@rasmusvillemoes.dk> (raw)
In-Reply-To: <20150502004714.GA21655@p183.telecom.by> (Alexey Dobriyan's message of "Sat, 2 May 2015 03:47:14 +0300")

On Sat, May 02 2015, Alexey Dobriyan <adobriyan@gmail.com> wrote:

> kstrto*() and kstrto*_from_user() family of functions were added
> to help with parsing one integer written as string to proc/sysfs/debugfs
> files. But they have a limitation: string passed must end with \0 or \n\0.
> There are enough places where kstrto*() functions can't be used because of
> this limitation. Trivial example: major:minor "%u:%u".
>
> Currently the only way to parse everything is simple_strto*() functions.
> But they are suboptimal:
> * they do not detect overflow (can be fixed, but no one bothered since ~0.99.11),
> * there are only 4 of them -- long and "long long" versions,
>   This leads to silent truncation in the most simple case:
>
> 	val = strtoul(s, NULL, 0);
>
> * half of the people think that "char **endp" argument is necessary and
>   add unnecessary variable.
>
> OpenBSD people, fed up with how complex correct integer parsing is, added
> strtonum(3) to fixup for deficiencies of libc-style integer parsing:
> http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man3/strtonum.3?query=strtonum&arch=i386
>
> It'd be OK to copy that but it relies on "errno" and fixed strings as
> error reporting channel which I think is not OK for kernel.
> strtonum() also doesn't report number of characted consumed.
>
> What to do?
>
> Enter parse_integer().
>
> 	int parse_integer(const char *s, unsigned int base, T *val);
>

I like the general idea. Few nits below (and in reply to other patches).

First: Could you tell me what tree I can commit this on top of, to see
what gcc makes of it.

> Rationale:
>
> * parse_integer() is exactly 1 (one) interface not 4 or
>   many,one for each type.
>
> * parse_integer() reports -E errors reliably in a canonical kernel way:
>
> 	rv = parse_integer(str, 10, &val);
> 	if (rv < 0)
> 		return rv;
>
> * parse_integer() writes result only if there were no errors, at least
>   one digit has to be consumed,
>
> * parse_integer doesn't mix error reporting channel and value channel,
>   it does mix error and number-of-character-consumed channel, though.
>
> * parse_integer() reports number of characters consumed, makes parsing
>   multiple values easy:

I agree that these are desirable properties, and the way it should be done.

> There are several deficiencies in parse_integer() compared to strto*():
> * can't be used in initializers:
>
> 	const T x = strtoul();
>
> * can't be used with bitfields,
> * can't be used in expressions:
>
> 	x = strtol() * 1024;
> 	x = y = strtol() * 1024;
> 	x += strtol();

It's nice that you list these, but...

> In defense of parse_integer() all I can say, is that using strtol() in
> expression or initializer promotes no error checking and thus probably
> should not be encouraged in C,

...agreed - I actually think it is a _good_ thing that the parse_integer
interface makes it impossible to use in these cases.

> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -10,6 +10,7 @@
>  #include <linux/bitops.h>
>  #include <linux/log2.h>
>  #include <linux/typecheck.h>
> +#include <linux/parse-integer.h>
>  #include <linux/printk.h>
>  #include <linux/dynamic_debug.h>
>  #include <asm/byteorder.h>
> @@ -387,8 +388,10 @@ static inline int __must_check kstrtos32_from_user(const char __user *s, size_t
>  	return kstrtoint_from_user(s, count, base, res);
>  }
>  
> -/* Obsolete, do not use.  Use kstrto<foo> instead */
> -
> +/*
> + * Obsolete, do not use.
> + * Use parse_integer(), kstrto*(), kstrto*_from_user(), sscanf().
> + */
>  extern unsigned long simple_strtoul(const char *,char **,unsigned int);
>  extern long simple_strtol(const char *,char **,unsigned int);
>  extern unsigned long long simple_strtoull(const char *,char **,unsigned int);
> --- /dev/null
> +++ b/include/linux/parse-integer.h
> @@ -0,0 +1,79 @@
> +#ifndef _PARSE_INTEGER_H
> +#define _PARSE_INTEGER_H
> +#include <linux/compiler.h>
> +#include <linux/types.h>
> +
> +/*
> + * int parse_integer(const char *s, unsigned int base, T *val);
> + *
> + * Convert integer string representation to an integer.
> + * Range of accepted values equals to that of type T.
> + *
> + * Conversion to unsigned integer accepts sign "+".
> + * Conversion to signed integer accepts sign "+" and sign "-".
> + *
> + * Radix 0 means autodetection: leading "0x" implies radix 16,
> + * leading "0" implies radix 8, otherwise radix is 10.
> + * Autodetection hint works after optional sign, but not before.
> + *
> + * Return number of characters parsed or -E.
> + *
> + * "T=char" case is not supported because -f{un,}signed-char can silently
> + * change range of accepted values.
> + */
> +#define parse_integer(s, base, val)	\
> +({					\
> +	const char *_s = (s);		\
> +	unsigned int _base = (base);	\
> +	typeof(val) _val = (val);	\
> +					\
> +	__builtin_choose_expr(						\
> +	__builtin_types_compatible_p(typeof(_val), signed char *),	\
> +	_parse_integer_sc(_s, _base, (void *)_val),
> \

Why the (void*) cast? Isn't _val supposed to have precisely the type
expected by _parse_integer_sc at this point?

> +	__builtin_choose_expr(						\
> +	__builtin_types_compatible_p(typeof(_val), long *) && sizeof(long) == 4,\
> +	_parse_integer_i(_s, _base, (void *)_val),			\
> +	__builtin_choose_expr(						\
> +	__builtin_types_compatible_p(typeof(_val), long *) && sizeof(long) == 8,\
> +	_parse_integer_ll(_s, _base, (void *)_val),			\
> +	__builtin_choose_expr(						\
> +	__builtin_types_compatible_p(typeof(_val), unsigned long *) && sizeof(unsigned long) == 4,\
> +	_parse_integer_u(_s, _base, (void *)_val),			\
> +	__builtin_choose_expr(						\
> +	__builtin_types_compatible_p(typeof(_val), unsigned long *) && sizeof(unsigned long) == 8,\
> +	_parse_integer_ull(_s, _base, (void *)_val),			\

Ah, I see. In these cases, one probably has to do a cast to pass a
(long*) as either (int*) or (long long*) - but why not cast to the type
actually expected by _parse_integer_* instead of the launder-anything (void*)?


Another thing: It may be slightly confusing that this can't be used with
an array passed as val. This won't work:

long x[1];
rv = parse_integer(s, 0, x);

One could argue that one should pass &x[0] instead, but since this is a
macro, gcc doesn't really give a very helpful error (I just get "error:
invalid initializer"). I think it can be fixed simply by declaring _val
using typeof(&val[0]).

> +int _parse_integer_ull(const char *s, unsigned int base, unsigned long long *val)
> +{
> +	int rv;
> +
> +	if (*s == '-') {
> +		return -EINVAL;
> +	} else if (*s == '+') {
> +		rv = __parse_integer(s + 1, base, val);
> +		if (rv < 0)
> +			return rv;
> +		return rv + 1;
> +	} else
> +		return __parse_integer(s, base, val);
> +}
> +EXPORT_SYMBOL(_parse_integer_ull);
> +
> +int _parse_integer_ll(const char *s, unsigned int base, long long *val)
> +{
> +	unsigned long long tmp;
> +	int rv;
> +
> +	if (*s == '-') {
> +		rv = __parse_integer(s + 1, base, &tmp);
> +		if (rv < 0)
> +			return rv;
> +		if ((long long)-tmp >= 0)
> +			return -ERANGE;

Is there any reason to disallow "-0"?

Rasmus

  parent reply	other threads:[~2015-05-04 13:24 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-02  0:47 Alexey Dobriyan
2015-05-02  0:48 ` [PATCH 02/10] parse_integer: rewrite kstrto*() Alexey Dobriyan
2015-05-02  0:50 ` [PATCH 03/10] parse_integer: convert sscanf() Alexey Dobriyan
2015-05-02  1:10   ` [PATCH CORRECT " Alexey Dobriyan
2015-05-02  0:51 ` [PATCH 04/10] sscanf: fix overflow Alexey Dobriyan
2015-05-05  9:51   ` Rasmus Villemoes
2015-05-05 11:10     ` Alexey Dobriyan
2015-05-06  7:49       ` Rasmus Villemoes
2015-05-02  0:53 ` [PATCH 05/10] parse_integer: convert lib/ Alexey Dobriyan
2015-05-04 14:10   ` Rasmus Villemoes
2015-05-04 14:57     ` Alexey Dobriyan
2015-05-02  0:55 ` [PATCH 06/10] parse_integer: convert mm/ Alexey Dobriyan
2015-05-04 14:33   ` Rasmus Villemoes
2015-05-04 15:09     ` Alexey Dobriyan
2015-05-02  0:56 ` [PATCH 07/10] parse_integer: convert misc fs/ code Alexey Dobriyan
2015-05-02  0:59 ` [PATCH 08/10] fs/cachefiles/: convert to parse_integer() Alexey Dobriyan
2015-05-02  1:01 ` [PATCH 09/10] ocfs2: convert to parse_integer()/kstrto*() Alexey Dobriyan
2015-05-02  1:03 ` [PATCH 10/10] ext2, ext3, ext4: " Alexey Dobriyan
2015-05-04 13:24 ` Rasmus Villemoes [this message]
2015-05-04 14:32   ` [PATCH 01/10] Add parse_integer() (replacement for simple_strto*()) Alexey Dobriyan
2015-05-04 16:44     ` Rasmus Villemoes
2015-05-04 19:54       ` Alexey Dobriyan
2015-05-04 21:48         ` Rasmus Villemoes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h9rsiigj.fsf@rasmusvillemoes.dk \
    --to=linux@rasmusvillemoes.dk \
    --cc=adobriyan@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --subject='Re: [PATCH 01/10] Add parse_integer() (replacement for simple_strto*())' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).