linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexey Dobriyan <adobriyan@gmail.com>
To: akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH 01/10] Add parse_integer() (replacement for simple_strto*())
Date: Sat, 2 May 2015 03:47:14 +0300	[thread overview]
Message-ID: <20150502004714.GA21655@p183.telecom.by> (raw)

kstrto*() and kstrto*_from_user() family of functions were added
to help with parsing one integer written as string to proc/sysfs/debugfs
files. But they have a limitation: string passed must end with \0 or \n\0.
There are enough places where kstrto*() functions can't be used because of
this limitation. Trivial example: major:minor "%u:%u".

Currently the only way to parse everything is simple_strto*() functions.
But they are suboptimal:
* they do not detect overflow (can be fixed, but no one bothered since ~0.99.11),
* there are only 4 of them -- long and "long long" versions,
  This leads to silent truncation in the most simple case:

	val = strtoul(s, NULL, 0);

* half of the people think that "char **endp" argument is necessary and
  add unnecessary variable.

OpenBSD people, fed up with how complex correct integer parsing is, added
strtonum(3) to fixup for deficiencies of libc-style integer parsing:
http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man3/strtonum.3?query=strtonum&arch=i386

It'd be OK to copy that but it relies on "errno" and fixed strings as
error reporting channel which I think is not OK for kernel.
strtonum() also doesn't report number of characted consumed.

What to do?

Enter parse_integer().

	int parse_integer(const char *s, unsigned int base, T *val);

Rationale:

* parse_integer() is exactly 1 (one) interface not 4 or
  many,one for each type.

* parse_integer() reports -E errors reliably in a canonical kernel way:

	rv = parse_integer(str, 10, &val);
	if (rv < 0)
		return rv;

* parse_integer() writes result only if there were no errors, at least
  one digit has to be consumed,

* parse_integer doesn't mix error reporting channel and value channel,
  it does mix error and number-of-character-consumed channel, though.

* parse_integer() reports number of characters consumed, makes parsing
  multiple values easy:

	rv = parse_integer(str, 0, &val1);
	if (rv < 0)
		return rv;
	str += rv;
	if (*str++ != ':')
		return -EINVAL;
	rv = parse_integer(str, 0, &val2);
	if (rv < 0)
		return rv;
	if (str[rv] != '\0')
		return -EINVAL;

There are several deficiencies in parse_integer() compared to strto*():
* can't be used in initializers:

	const T x = strtoul();

* can't be used with bitfields,
* can't be used in expressions:

	x = strtol() * 1024;
	x = y = strtol() * 1024;
	x += strtol();

* currently there is no support for _Bool, and at least one place
  where simple_strtoul() is directly assigned to _Bool variable.
  It is trivial to add, but not clear if it should only accept "0" and "1",
  because, say, module param code accepts 0, 1, y, n, Y and N
  (which I personally think is stupid).

In defense of parse_integer() all I can say, is that using strtol() in
expression or initializer promotes no error checking and thus probably
should not be encouraged in C, language with no built-in error checking
anyway.

The amount of "x = y = strtol()" expressions in kernel is very small.
The amount of direct usage in expression is not small, but can be
counted as an acceptable loss.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 include/linux/kernel.h        |    7 +
 include/linux/parse-integer.h |   79 ++++++++++++++++
 lib/Makefile                  |    1 
 lib/kstrtox.c                 |   76 ++++-----------
 lib/kstrtox.h                 |    1 
 lib/parse-integer.c           |  203 ++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 310 insertions(+), 57 deletions(-)

--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -10,6 +10,7 @@
 #include <linux/bitops.h>
 #include <linux/log2.h>
 #include <linux/typecheck.h>
+#include <linux/parse-integer.h>
 #include <linux/printk.h>
 #include <linux/dynamic_debug.h>
 #include <asm/byteorder.h>
@@ -387,8 +388,10 @@ static inline int __must_check kstrtos32_from_user(const char __user *s, size_t
 	return kstrtoint_from_user(s, count, base, res);
 }
 
-/* Obsolete, do not use.  Use kstrto<foo> instead */
-
+/*
+ * Obsolete, do not use.
+ * Use parse_integer(), kstrto*(), kstrto*_from_user(), sscanf().
+ */
 extern unsigned long simple_strtoul(const char *,char **,unsigned int);
 extern long simple_strtol(const char *,char **,unsigned int);
 extern unsigned long long simple_strtoull(const char *,char **,unsigned int);
--- /dev/null
+++ b/include/linux/parse-integer.h
@@ -0,0 +1,79 @@
+#ifndef _PARSE_INTEGER_H
+#define _PARSE_INTEGER_H
+#include <linux/compiler.h>
+#include <linux/types.h>
+
+/*
+ * int parse_integer(const char *s, unsigned int base, T *val);
+ *
+ * Convert integer string representation to an integer.
+ * Range of accepted values equals to that of type T.
+ *
+ * Conversion to unsigned integer accepts sign "+".
+ * Conversion to signed integer accepts sign "+" and sign "-".
+ *
+ * Radix 0 means autodetection: leading "0x" implies radix 16,
+ * leading "0" implies radix 8, otherwise radix is 10.
+ * Autodetection hint works after optional sign, but not before.
+ *
+ * Return number of characters parsed or -E.
+ *
+ * "T=char" case is not supported because -f{un,}signed-char can silently
+ * change range of accepted values.
+ */
+#define parse_integer(s, base, val)	\
+({					\
+	const char *_s = (s);		\
+	unsigned int _base = (base);	\
+	typeof(val) _val = (val);	\
+					\
+	__builtin_choose_expr(						\
+	__builtin_types_compatible_p(typeof(_val), signed char *),	\
+	_parse_integer_sc(_s, _base, (void *)_val),			\
+	__builtin_choose_expr(						\
+	__builtin_types_compatible_p(typeof(_val), unsigned char *),	\
+	_parse_integer_uc(_s, _base, (void *)_val),			\
+	__builtin_choose_expr(						\
+	__builtin_types_compatible_p(typeof(_val), short *),		\
+	_parse_integer_s(_s, _base, (void *)_val),			\
+	__builtin_choose_expr(						\
+	__builtin_types_compatible_p(typeof(_val), unsigned short *),	\
+	_parse_integer_us(_s, _base, (void *)_val),			\
+	__builtin_choose_expr(						\
+	__builtin_types_compatible_p(typeof(_val), int *),		\
+	_parse_integer_i(_s, _base, (void *)_val),			\
+	__builtin_choose_expr(						\
+	__builtin_types_compatible_p(typeof(_val), unsigned int *),	\
+	_parse_integer_u(_s, _base, (void *)_val),			\
+	__builtin_choose_expr(						\
+	__builtin_types_compatible_p(typeof(_val), long *) && sizeof(long) == 4,\
+	_parse_integer_i(_s, _base, (void *)_val),			\
+	__builtin_choose_expr(						\
+	__builtin_types_compatible_p(typeof(_val), long *) && sizeof(long) == 8,\
+	_parse_integer_ll(_s, _base, (void *)_val),			\
+	__builtin_choose_expr(						\
+	__builtin_types_compatible_p(typeof(_val), unsigned long *) && sizeof(unsigned long) == 4,\
+	_parse_integer_u(_s, _base, (void *)_val),			\
+	__builtin_choose_expr(						\
+	__builtin_types_compatible_p(typeof(_val), unsigned long *) && sizeof(unsigned long) == 8,\
+	_parse_integer_ull(_s, _base, (void *)_val),			\
+	__builtin_choose_expr(						\
+	__builtin_types_compatible_p(typeof(_val), long long *),	\
+	_parse_integer_ll(_s, _base, (void *)_val),			\
+	__builtin_choose_expr(						\
+	__builtin_types_compatible_p(typeof(_val), unsigned long long *),\
+	_parse_integer_ull(_s, _base, (void *)_val),			\
+	_parse_integer_link_time_error()))))))))))));	\
+})
+/* internal, do not use */
+int _parse_integer_sc(const char *s, unsigned int base, signed char *val);
+int _parse_integer_uc(const char *s, unsigned int base, unsigned char *val);
+int _parse_integer_s(const char *s, unsigned int base, short *val);
+int _parse_integer_us(const char *s, unsigned int base, unsigned short *val);
+int _parse_integer_i(const char *s, unsigned int base, int *val);
+int _parse_integer_u(const char *s, unsigned int base, unsigned int *val);
+int _parse_integer_ll(const char *s, unsigned int base, long long *val);
+int _parse_integer_ull(const char *s, unsigned int base, unsigned long long *val);
+void _parse_integer_link_time_error(void);
+const char *_parse_integer_fixup_radix(const char *s, unsigned int *base);
+#endif
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -32,6 +32,7 @@ obj-$(CONFIG_TEST_STRING_HELPERS) += test-string_helpers.o
 obj-y += hexdump.o
 obj-$(CONFIG_TEST_HEXDUMP) += test-hexdump.o
 obj-y += kstrtox.o
+obj-y += parse-integer.o
 obj-$(CONFIG_TEST_BPF) += test_bpf.o
 obj-$(CONFIG_TEST_FIRMWARE) += test_firmware.o
 obj-$(CONFIG_TEST_KASAN) += test_kasan.o
--- a/lib/kstrtox.c
+++ b/lib/kstrtox.c
@@ -20,22 +20,6 @@
 #include <asm/uaccess.h>
 #include "kstrtox.h"
 
-const char *_parse_integer_fixup_radix(const char *s, unsigned int *base)
-{
-	if (*base == 0) {
-		if (s[0] == '0') {
-			if (_tolower(s[1]) == 'x' && isxdigit(s[2]))
-				*base = 16;
-			else
-				*base = 8;
-		} else
-			*base = 10;
-	}
-	if (*base == 16 && s[0] == '0' && _tolower(s[1]) == 'x')
-		s += 2;
-	return s;
-}
-
 /*
  * Convert non-negative integer string representation in explicitly given radix
  * to an integer.
@@ -83,26 +67,6 @@ unsigned int _parse_integer(const char *s, unsigned int base, unsigned long long
 	return rv;
 }
 
-static int _kstrtoull(const char *s, unsigned int base, unsigned long long *res)
-{
-	unsigned long long _res;
-	unsigned int rv;
-
-	s = _parse_integer_fixup_radix(s, &base);
-	rv = _parse_integer(s, base, &_res);
-	if (rv & KSTRTOX_OVERFLOW)
-		return -ERANGE;
-	if (rv == 0)
-		return -EINVAL;
-	s += rv;
-	if (*s == '\n')
-		s++;
-	if (*s)
-		return -EINVAL;
-	*res = _res;
-	return 0;
-}
-
 /**
  * kstrtoull - convert a string to an unsigned long long
  * @s: The start of the string. The string must be null-terminated, and may also
@@ -121,9 +85,19 @@ static int _kstrtoull(const char *s, unsigned int base, unsigned long long *res)
  */
 int kstrtoull(const char *s, unsigned int base, unsigned long long *res)
 {
-	if (s[0] == '+')
+	unsigned long long _res;
+	int rv;
+
+	rv = parse_integer(s, base, &_res);
+	if (rv < 0)
+		return rv;
+	s += rv;
+	if (*s == '\n')
 		s++;
-	return _kstrtoull(s, base, res);
+	if (*s)
+		return -EINVAL;
+	*res = _res;
+	return 0;
 }
 EXPORT_SYMBOL(kstrtoull);
 
@@ -145,24 +119,18 @@ EXPORT_SYMBOL(kstrtoull);
  */
 int kstrtoll(const char *s, unsigned int base, long long *res)
 {
-	unsigned long long tmp;
+	long long _res;
 	int rv;
 
-	if (s[0] == '-') {
-		rv = _kstrtoull(s + 1, base, &tmp);
-		if (rv < 0)
-			return rv;
-		if ((long long)(-tmp) >= 0)
-			return -ERANGE;
-		*res = -tmp;
-	} else {
-		rv = kstrtoull(s, base, &tmp);
-		if (rv < 0)
-			return rv;
-		if ((long long)tmp < 0)
-			return -ERANGE;
-		*res = tmp;
-	}
+	rv = parse_integer(s, base, &_res);
+	if (rv < 0)
+		return rv;
+	s += rv;
+	if (*s == '\n')
+		s++;
+	if (*s)
+		return -EINVAL;
+	*res = _res;
 	return 0;
 }
 EXPORT_SYMBOL(kstrtoll);
--- a/lib/kstrtox.h
+++ b/lib/kstrtox.h
@@ -2,7 +2,6 @@
 #define _LIB_KSTRTOX_H
 
 #define KSTRTOX_OVERFLOW	(1U << 31)
-const char *_parse_integer_fixup_radix(const char *s, unsigned int *base);
 unsigned int _parse_integer(const char *s, unsigned int base, unsigned long long *res);
 
 #endif
--- /dev/null
+++ b/lib/parse-integer.c
@@ -0,0 +1,203 @@
+/*
+ * See parse_integer().
+ *
+ * Individual dispatch functions in this file aren't supposed to be used
+ * directly and thus aren't advertised and documented despited being exported.
+ *
+ * Do not use any function in this file for any reason.
+ */
+#include <linux/ctype.h>
+#include <linux/errno.h>
+#include <linux/export.h>
+#include <linux/kernel.h>
+#include <linux/math64.h>
+#include <linux/parse-integer.h>
+#include <asm/bug.h>
+
+const char *_parse_integer_fixup_radix(const char *s, unsigned int *base)
+{
+	if (*base == 0) {
+		if (s[0] == '0') {
+			if (_tolower(s[1]) == 'x' && isxdigit(s[2]))
+				*base = 16;
+			else
+				*base = 8;
+		} else
+			*base = 10;
+	}
+	if (*base == 16 && s[0] == '0' && _tolower(s[1]) == 'x')
+		s += 2;
+	BUG_ON(*base < 2 || *base > 16);
+	return s;
+}
+
+static int __parse_integer(const char *s, unsigned int base, unsigned long long *val)
+{
+	const char *s0 = s, *sd;
+	unsigned long long acc;
+
+	s = sd = _parse_integer_fixup_radix(s0, &base);
+	acc = 0;
+	while (*s != '\0') {
+		unsigned int d;
+
+		if ('0' <= *s && *s <= '9')
+			d = *s - '0';
+		else if ('a' <= _tolower(*s) && _tolower(*s) <= 'f')
+			d = _tolower(*s) - 'a' + 10;
+		else
+			break;
+		if (d >= base)
+			break;
+		/* Overflow can't happen early enough. */
+		if ((acc >> 60) && acc > div_u64(ULLONG_MAX - d, base))
+			return -ERANGE;
+		acc = acc * base + d;
+		s++;
+	}
+	/* At least one digit has to be converted. */
+	if (s == sd)
+		return -EINVAL;
+	*val = acc;
+	/* Radix 1 is not supported otherwise returned length can overflow. */
+	return s - s0;
+}
+
+int _parse_integer_ull(const char *s, unsigned int base, unsigned long long *val)
+{
+	int rv;
+
+	if (*s == '-') {
+		return -EINVAL;
+	} else if (*s == '+') {
+		rv = __parse_integer(s + 1, base, val);
+		if (rv < 0)
+			return rv;
+		return rv + 1;
+	} else
+		return __parse_integer(s, base, val);
+}
+EXPORT_SYMBOL(_parse_integer_ull);
+
+int _parse_integer_ll(const char *s, unsigned int base, long long *val)
+{
+	unsigned long long tmp;
+	int rv;
+
+	if (*s == '-') {
+		rv = __parse_integer(s + 1, base, &tmp);
+		if (rv < 0)
+			return rv;
+		if ((long long)-tmp >= 0)
+			return -ERANGE;
+		*val = -tmp;
+		return rv + 1;
+	} else if (*s == '+') {
+		rv = __parse_integer(s + 1, base, &tmp);
+		if (rv < 0)
+			return rv;
+		if ((long long)tmp < 0)
+			return -ERANGE;
+		*val = tmp;
+		return rv + 1;
+	} else {
+		rv = __parse_integer(s, base, &tmp);
+		if (rv < 0)
+			return rv;
+		if ((long long)tmp < 0)
+			return -ERANGE;
+		*val = tmp;
+		return rv;
+	}
+}
+EXPORT_SYMBOL(_parse_integer_ll);
+
+int _parse_integer_u(const char *s, unsigned int base, unsigned int *val)
+{
+	unsigned long long tmp;
+	int rv;
+
+	rv = _parse_integer_ull(s, base, &tmp);
+	if (rv < 0)
+		return rv;
+	if (tmp != (unsigned int)tmp)
+		return -ERANGE;
+	*val = tmp;
+	return rv;
+}
+EXPORT_SYMBOL(_parse_integer_u);
+
+int _parse_integer_i(const char *s, unsigned int base, int *val)
+{
+	long long tmp;
+	int rv;
+
+	rv = _parse_integer_ll(s, base, &tmp);
+	if (rv < 0)
+		return rv;
+	if (tmp != (int)tmp)
+		return -ERANGE;
+	*val = tmp;
+	return rv;
+}
+EXPORT_SYMBOL(_parse_integer_i);
+
+int _parse_integer_us(const char *s, unsigned int base, unsigned short *val)
+{
+	unsigned long long tmp;
+	int rv;
+
+	rv = _parse_integer_ull(s, base, &tmp);
+	if (rv < 0)
+		return rv;
+	if (tmp != (unsigned short)tmp)
+		return -ERANGE;
+	*val = tmp;
+	return rv;
+}
+EXPORT_SYMBOL(_parse_integer_us);
+
+int _parse_integer_s(const char *s, unsigned int base, short *val)
+{
+	long long tmp;
+	int rv;
+
+	rv = _parse_integer_ll(s, base, &tmp);
+	if (rv < 0)
+		return rv;
+	if (tmp != (short)tmp)
+		return -ERANGE;
+	*val = tmp;
+	return rv;
+}
+EXPORT_SYMBOL(_parse_integer_s);
+
+int _parse_integer_uc(const char *s, unsigned int base, unsigned char *val)
+{
+	unsigned long long tmp;
+	int rv;
+
+	rv = _parse_integer_ull(s, base, &tmp);
+	if (rv < 0)
+		return rv;
+	if (tmp != (unsigned char)tmp)
+		return -ERANGE;
+	*val = tmp;
+	return rv;
+}
+EXPORT_SYMBOL(_parse_integer_uc);
+
+int _parse_integer_sc(const char *s, unsigned int base, signed char *val)
+{
+	long long tmp;
+	int rv;
+
+	rv = _parse_integer_ll(s, base, &tmp);
+	if (rv < 0)
+		return rv;
+	if (tmp != (signed char)tmp)
+		return -ERANGE;
+	*val = tmp;
+	return rv;
+}
+EXPORT_SYMBOL(_parse_integer_sc);

             reply	other threads:[~2015-05-02  0:47 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-02  0:47 Alexey Dobriyan [this message]
2015-05-02  0:48 ` [PATCH 02/10] parse_integer: rewrite kstrto*() Alexey Dobriyan
2015-05-02  0:50 ` [PATCH 03/10] parse_integer: convert sscanf() Alexey Dobriyan
2015-05-02  1:10   ` [PATCH CORRECT " Alexey Dobriyan
2015-05-02  0:51 ` [PATCH 04/10] sscanf: fix overflow Alexey Dobriyan
2015-05-05  9:51   ` Rasmus Villemoes
2015-05-05 11:10     ` Alexey Dobriyan
2015-05-06  7:49       ` Rasmus Villemoes
2015-05-02  0:53 ` [PATCH 05/10] parse_integer: convert lib/ Alexey Dobriyan
2015-05-04 14:10   ` Rasmus Villemoes
2015-05-04 14:57     ` Alexey Dobriyan
2015-05-02  0:55 ` [PATCH 06/10] parse_integer: convert mm/ Alexey Dobriyan
2015-05-04 14:33   ` Rasmus Villemoes
2015-05-04 15:09     ` Alexey Dobriyan
2015-05-02  0:56 ` [PATCH 07/10] parse_integer: convert misc fs/ code Alexey Dobriyan
2015-05-02  0:59 ` [PATCH 08/10] fs/cachefiles/: convert to parse_integer() Alexey Dobriyan
2015-05-02  1:01 ` [PATCH 09/10] ocfs2: convert to parse_integer()/kstrto*() Alexey Dobriyan
2015-05-02  1:03 ` [PATCH 10/10] ext2, ext3, ext4: " Alexey Dobriyan
2015-05-04 13:24 ` [PATCH 01/10] Add parse_integer() (replacement for simple_strto*()) Rasmus Villemoes
2015-05-04 14:32   ` Alexey Dobriyan
2015-05-04 16:44     ` Rasmus Villemoes
2015-05-04 19:54       ` Alexey Dobriyan
2015-05-04 21:48         ` Rasmus Villemoes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150502004714.GA21655@p183.telecom.by \
    --to=adobriyan@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --subject='Re: [PATCH 01/10] Add parse_integer() (replacement for simple_strto*())' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).