All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] Add lib/glob.c
       [not found] <20140313121032.GA9981@htj.dyndns.org>
@ 2014-05-10  3:13 ` George Spelvin
  2014-05-10  3:14   ` [PATCH 2/2] libata: Use glob_match from lib/glob.c George Spelvin
                     ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: George Spelvin @ 2014-05-10  3:13 UTC (permalink / raw)
  To: linux, tj; +Cc: akpm, linux-ide, linux-kernel, mingo, torvalds

This is a helper function from drivers/ata/libata_core.c, where it is used
to blacklist particular device models.  It's being moved to lib/ so other
drivers may use it for the same purpose.

This implementation in non-recursive, so is safe for the kernel stack.

Signed-off-by: George Spelvin <linux@horizon.com>
---
Finally rescued this from the back burner.  The code size will go back
down in the second patch which removes the old implementation, although
the number of source line reflects more comments and a test driver.

The infrastructure parts of this, such as the module name and Kconfig
declarations, are something I haven't done before, and comments are
appreciated.

Tejun Heo <htejun@gmail.com> wrote:
> On Wed, Mar 12, 2014 at 06:52:44PM -0400, George Spelvin wrote:
>> Sure; I'll prepare some patches.  May I feed it through you, or
>> is there a lib/ maintainer I need to go through?
>
> Please keep me cc'd but you'd probably also want to cc Linus, Andrew
> Morton and Ingo.

 include/linux/glob.h |  10 +++
 lib/Kconfig          |  14 ++++
 lib/Makefile         |   2 +
 lib/glob.c           | 225 +++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 251 insertions(+)
 create mode 100644 include/linux/glob.h
 create mode 100644 lib/glob.c

diff --git a/include/linux/glob.h b/include/linux/glob.h
new file mode 100644
index 0000000..c990b7f
--- /dev/null
+++ b/include/linux/glob.h
@@ -0,0 +1,10 @@
+#ifndef _LINUX_GLOB_H
+#define _LINUX_GLOB_H
+
+#include <linux/types.h>	/* For bool */
+#include <linux/compiler.h>	/* For __pure */
+
+bool __pure glob_match(char const *pat, char const *str);
+
+#endif	/* _LINUX_GLOB_H */
+
diff --git a/lib/Kconfig b/lib/Kconfig
index 991c98b..5333d10 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -373,6 +373,20 @@ config CPU_RMAP
 config DQL
 	bool
 
+config GLOB
+	tristate
+#	(Prompt disabled to reduce kbuild clutter until someone needs it.)
+#	prompt "glob_match() function"
+	help
+	  This option provides a glob_match function for performing simple
+	  text pattern matching.  It is primarily used by the ATA code
+	  to blacklist particular drive models, but other device drivers
+	  may need similar functionality.
+
+	  All in-kernel drivers that require this function automatically
+	  select this option.  Say N unless you are compiling an out-of
+	  tree driver which tells you it depend on it.
+
 #
 # Netlink attribute parsing support is select'ed if needed
 #
diff --git a/lib/Makefile b/lib/Makefile
index 48140e3..c859f81 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -133,6 +133,8 @@ obj-$(CONFIG_CORDIC) += cordic.o
 
 obj-$(CONFIG_DQL) += dynamic_queue_limits.o
 
+obj-$(CONFIG_GLOB) += glob.o
+
 obj-$(CONFIG_MPILIB) += mpi/
 obj-$(CONFIG_SIGNATURE) += digsig.o
 
diff --git a/lib/glob.c b/lib/glob.c
new file mode 100644
index 0000000..b71ca59
--- /dev/null
+++ b/lib/glob.c
@@ -0,0 +1,225 @@
+#ifdef UNITTEST
+/* To do a basic sanity test, "cc -DUNITTEST glob.c" and run a.out. */
+
+#include <stdbool.h>
+#define __pure __attribute__((pure))
+#define NOP(x)
+#define EXPORT_SYMBOL NOP	/* Two stages to avoid checkpatch complaints */
+
+#else
+
+#include <linux/module.h>
+#include <linux/glob.h>
+
+MODULE_DESCRIPTION("glob(7) matching");
+MODULE_LICENSE("Dual MIT/GPL");
+
+#endif
+
+/**
+ * glob_match - Shell-style pattern matching, like !fnmatch(pat, str, 0)
+ * @pat: Pattern to match.  Metacharacters are ?, *, [ and \.
+ *       (And, inside character classes, !, - and ].)
+ * @str: String to match.  the pattern must match the entire string.
+ *
+ * Perform shell-style glob matching, returning true (1) if the match
+ * succeeds, or false (0) if it fails.
+ *
+ * This is small and simple implementation intended for device blacklists
+ * where a string is matched against a number of patterns.  Thus, it
+ * does not preprocess the patterns.  It is non-recursive, and run-time
+ * is at most quadratic: strlen(@str)*strlen(@pat).
+ *
+ * An example of the worst case is glob_match("*aaaaa", "aaaaaaaaaa");
+ * it takes 6 passes over the pattern before matching the string.
+ *
+ * Like !fnmatch(@pat, @str, 0) and unlike the shell, this does NOT
+ * treat / or leading . specially; it isn't actually used for pathnames.
+ *
+ * Note that according to glob(7) (and unlike bash), character classes
+ * are complemented by a leading !; this does not support the regex-style
+ * [^a-z] syntax.
+ *
+ * An opening bracket without a matching close is matched literally.
+ */
+bool __pure
+glob_match(char const *pat, char const *str)
+{
+	/*
+	 * Backtrack to previous * on mismatch and retry starting one
+	 * character later in the string.  Because * matches all characters
+	 * (no exception for /), it can be easily proved that there's
+	 * never a need to backtrack multiple levels.
+	 */
+	char const *back_pat = 0, *back_str = back_str;
+	/*
+	 * Loop over each token (character or class) in pat, matching
+	 * it against the remaining unmatched tail of str.  Return false
+	 * on mismatch, or true after matching the trailing nul bytes.
+	 */
+	for (;;) {
+		unsigned char c = *str++;
+		unsigned char d = *pat++;
+
+		switch (d) {
+		case '?':	/* Wildcard: anything but nul */
+			if (c == '\0')
+				return false;
+			break;
+		case '*':	/* Any-length wildcard */
+			if (*pat == '\0')	/* Optimize trailing * case */
+				return true;
+			back_pat = pat;
+			back_str = --str;	/* Allow zero-length match */
+			break;
+		case '[': {	/* Character class */
+			bool match = false, inverted = (*pat == '!');
+			char const *class = pat + inverted;
+			unsigned char a = *class++;
+
+			/*
+			 * Iterate over each span in the character class.
+			 * A span is either a single character a, or a
+			 * range a-b.
+			 */
+			do {
+				unsigned char b = a;
+
+				if (a == '\0')	/* Malformed */
+					goto literal;
+
+				if (class[0] == '-' && class[1] != ']') {
+					b = class[1];
+
+					if (b == '\0')
+						goto literal;
+
+					class += 2;
+					/* Any special action if a > b? */
+				}
+				match |= (a <= c && c <= b);
+			} while ((a = *class++) != ']');
+
+			if (match == inverted)
+				goto backtrack;
+			pat = class;
+			}
+			break;
+		case '\\':
+			d = *pat++;
+			/* FALLLTHROUGH */
+		default:	/* Literal character */
+literal:
+			if (c == d) {
+				if (d == '\0')
+					return true;
+				break;
+			}
+backtrack:
+			if (c == '\0' || !back_pat)
+				return false;	/* No point continuing */
+			/* Try again from last *, one character later in str. */
+			pat = back_pat;
+			str = ++back_str;
+			break;
+		}
+	}
+}
+EXPORT_SYMBOL(glob_match);
+
+
+#ifdef UNITTEST
+
+/* Test code */
+#include <stdio.h>
+#include <stdlib.h>
+struct glob_test {
+	char const *pat, *str;
+	bool expected;
+};
+
+static void
+test(struct glob_test const *g)
+{
+	bool match = glob_match(g->pat, g->str);
+
+	printf("\"%s\" vs. \"%s\": %s %s\n", g->pat, g->str,
+		match ? "match" : "mismatch",
+		match == g->expected ? "OK" : "*** ERROR ***");
+	if (match != g->expected)
+		exit(1);
+}
+
+static struct glob_test const tests[] = {
+	{ "a", "a", true },
+	{ "a", "b", false },
+	{ "a", "aa", false },
+	{ "a", "", false },
+	{ "", "", true },
+	{ "", "a", false },
+	{ "[a]", "a", true },
+	{ "[!a]", "a", false },
+	{ "[!a]", "b", true },
+	{ "[ab]", "a", true },
+	{ "[ab]", "b", true },
+	{ "[ab]", "c", false },
+	{ "[!ab]", "c", true },
+	{ "[a-c]", "b", true },
+	{ "[a-c]", "d", false },
+	{ "[]a-ceg-ik[]", "a", true },
+	{ "[]a-ceg-ik[]", "]", true },
+	{ "[]a-ceg-ik[]", "[", true },
+	{ "[]a-ceg-ik[]", "h", true },
+	{ "[]a-ceg-ik[]", "f", false },
+	{ "[!]a-ceg-ik[]", "h", false },
+	{ "[!]a-ceg-ik[]", "]", false },
+	{ "[!]a-ceg-ik[]", "f", true },
+	{ "[a-b-]", "-", true },
+	{ "?", "a", true },
+	{ "?", "aa", false },
+	{ "??", "a", false },
+	{ "?x?", "axb", true },
+	{ "?x?", "abx", false },
+	{ "?x?", "xab", false },
+	{ "*??", "a", false },
+	{ "*??", "ab", true },
+	{ "*??", "abc", true },
+	{ "*??", "abcd", true },
+	{ "??*", "a", false },
+	{ "??*", "ab", true },
+	{ "??*", "abc", true },
+	{ "??*", "abcd", true },
+	{ "?*?", "a", false },
+	{ "?*?", "ab", true },
+	{ "?*?", "abc", true },
+	{ "?*?", "abcd", true },
+	{ "*b", "b", true },
+	{ "*b", "ab", true },
+	{ "*b", "aab", true },
+	{ "*bc", "abbc", true },
+	{ "*bc", "bc", true },
+	{ "*bc", "bbc", true },
+	{ "*bc", "bcbc", true },
+	{ "*ac*", "abacadaeafag", true },
+	{ "*ac*ae*ag*", "abacadaeafag", true },
+	{ "*a*b*[bc]*[ef]*g*", "abacadaeafag", true },
+	{ "*a*b*[ef]*[cd]*g*", "abacadaeafag", false },
+	{ "*abcd*", "abcabcabcabcdefg", true },
+	{ "*ab*cd*", "abcabcabcabcdefg", true },
+	{ "*abcd*abcdef*", "abcabcdabcdeabcdefg", true },
+	{ "*abcd*", "abcabcabcabcefg", false },
+	{ "*ab*cd*", "abcabcabcabcefg", false }
+};
+
+int
+main(void)
+{
+	size_t i;
+
+	for (i = 0; i < sizeof(tests)/sizeof(*tests); i++)
+		test(tests + i);
+
+	return 0;
+}
+
+#endif	/* UNITTEST */
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] libata: Use glob_match from lib/glob.c
  2014-05-10  3:13 ` [PATCH 1/2] Add lib/glob.c George Spelvin
@ 2014-05-10  3:14   ` George Spelvin
  2014-05-10 12:21   ` [PATCH 1/2] Add lib/glob.c Tejun Heo
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: George Spelvin @ 2014-05-10  3:14 UTC (permalink / raw)
  To: linux, tj; +Cc: akpm, linux-ide, linux-kernel, mingo, torvalds

The function may be useful for other drivers, so export it.
(Suggested by Tejun Heo.)

Note that I inverted the return value of glob_match; returning
true on match seemed to make more sense.

Signed-off-by: George Spelvin <linux@horizon.com>
---
 drivers/ata/Kconfig       |  1 +
 drivers/ata/libata-core.c | 72 ++---------------------------------------------
 2 files changed, 4 insertions(+), 69 deletions(-)

diff --git a/drivers/ata/Kconfig b/drivers/ata/Kconfig
index 868429a..f121fa0 100644
--- a/drivers/ata/Kconfig
+++ b/drivers/ata/Kconfig
@@ -16,6 +16,7 @@ menuconfig ATA
 	depends on BLOCK
 	depends on !(M32R || M68K || S390) || BROKEN
 	select SCSI
+	select GLOB
 	---help---
 	  If you want to use a ATA hard disk, ATA tape drive, ATA CD-ROM or
 	  any other ATA device under Linux, say Y and make sure that you know
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 8cb2522..e40ca97 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -59,6 +59,7 @@
 #include <linux/async.h>
 #include <linux/log2.h>
 #include <linux/slab.h>
+#include <linux/glob.h>
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
 #include <scsi/scsi_host.h>
@@ -4248,73 +4249,6 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
 	{ }
 };
 
-/**
- *	glob_match - match a text string against a glob-style pattern
- *	@text: the string to be examined
- *	@pattern: the glob-style pattern to be matched against
- *
- *	Either/both of text and pattern can be empty strings.
- *
- *	Match text against a glob-style pattern, with wildcards and simple sets:
- *
- *		?	matches any single character.
- *		*	matches any run of characters.
- *		[xyz]	matches a single character from the set: x, y, or z.
- *		[a-d]	matches a single character from the range: a, b, c, or d.
- *		[a-d0-9] matches a single character from either range.
- *
- *	The special characters ?, [, -, or *, can be matched using a set, eg. [*]
- *	Behaviour with malformed patterns is undefined, though generally reasonable.
- *
- *	Sample patterns:  "SD1?",  "SD1[0-5]",  "*R0",  "SD*1?[012]*xx"
- *
- *	This function uses one level of recursion per '*' in pattern.
- *	Since it calls _nothing_ else, and has _no_ explicit local variables,
- *	this will not cause stack problems for any reasonable use here.
- *
- *	RETURNS:
- *	0 on match, 1 otherwise.
- */
-static int glob_match (const char *text, const char *pattern)
-{
-	do {
-		/* Match single character or a '?' wildcard */
-		if (*text == *pattern || *pattern == '?') {
-			if (!*pattern++)
-				return 0;  /* End of both strings: match */
-		} else {
-			/* Match single char against a '[' bracketed ']' pattern set */
-			if (!*text || *pattern != '[')
-				break;  /* Not a pattern set */
-			while (*++pattern && *pattern != ']' && *text != *pattern) {
-				if (*pattern == '-' && *(pattern - 1) != '[')
-					if (*text > *(pattern - 1) && *text < *(pattern + 1)) {
-						++pattern;
-						break;
-					}
-			}
-			if (!*pattern || *pattern == ']')
-				return 1;  /* No match */
-			while (*pattern && *pattern++ != ']');
-		}
-	} while (*++text && *pattern);
-
-	/* Match any run of chars against a '*' wildcard */
-	if (*pattern == '*') {
-		if (!*++pattern)
-			return 0;  /* Match: avoid recursion at end of pattern */
-		/* Loop to handle additional pattern chars after the wildcard */
-		while (*text) {
-			if (glob_match(text, pattern) == 0)
-				return 0;  /* Remainder matched */
-			++text;  /* Absorb (match) this char and try again */
-		}
-	}
-	if (!*text && !*pattern)
-		return 0;  /* End of both strings: match */
-	return 1;  /* No match */
-}
-
 static unsigned long ata_dev_blacklisted(const struct ata_device *dev)
 {
 	unsigned char model_num[ATA_ID_PROD_LEN + 1];
@@ -4325,10 +4259,10 @@ static unsigned long ata_dev_blacklisted(const struct ata_device *dev)
 	ata_id_c_string(dev->id, model_rev, ATA_ID_FW_REV, sizeof(model_rev));
 
 	while (ad->model_num) {
-		if (!glob_match(model_num, ad->model_num)) {
+		if (glob_match(model_num, ad->model_num)) {
 			if (ad->model_rev == NULL)
 				return ad->horkage;
-			if (!glob_match(model_rev, ad->model_rev))
+			if (glob_match(model_rev, ad->model_rev))
 				return ad->horkage;
 		}
 		ad++;
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] Add lib/glob.c
  2014-05-10  3:13 ` [PATCH 1/2] Add lib/glob.c George Spelvin
  2014-05-10  3:14   ` [PATCH 2/2] libata: Use glob_match from lib/glob.c George Spelvin
@ 2014-05-10 12:21   ` Tejun Heo
  2014-05-11  6:02     ` George Spelvin
  2014-05-12 23:03     ` Andrew Morton
  2014-05-10 12:23   ` Tejun Heo
  2014-05-10 17:29   ` Randy Dunlap
  3 siblings, 2 replies; 15+ messages in thread
From: Tejun Heo @ 2014-05-10 12:21 UTC (permalink / raw)
  To: George Spelvin; +Cc: akpm, linux-ide, linux-kernel, mingo, torvalds

Hello,

On Fri, May 09, 2014 at 11:13:56PM -0400, George Spelvin wrote:
> +config GLOB
> +	tristate
> +#	(Prompt disabled to reduce kbuild clutter until someone needs it.)
> +#	prompt "glob_match() function"
> +	help
> +	  This option provides a glob_match function for performing simple
> +	  text pattern matching.  It is primarily used by the ATA code
> +	  to blacklist particular drive models, but other device drivers
> +	  may need similar functionality.
> +
> +	  All in-kernel drivers that require this function automatically
> +	  select this option.  Say N unless you are compiling an out-of
> +	  tree driver which tells you it depend on it.

Just adding glob.o to lib-y should be enough.  It will be excluded
from linking if unused.

> +#ifdef UNITTEST
> +/* To do a basic sanity test, "cc -DUNITTEST glob.c" and run a.out. */
> +
> +#include <stdbool.h>
> +#define __pure __attribute__((pure))
> +#define NOP(x)
> +#define EXPORT_SYMBOL NOP	/* Two stages to avoid checkpatch complaints */

These things tend to bitrot.  Let's please keep testing harness out of
tree.

> +#else
> +
> +#include <linux/module.h>
> +#include <linux/glob.h>
> +
> +MODULE_DESCRIPTION("glob(7) matching");
> +MODULE_LICENSE("Dual MIT/GPL");

Do we make library routines separate modules usually?

...
> +bool __pure
> +glob_match(char const *pat, char const *str)

The whole thing fits in a single 80 column line, right?

bool __pure glob_match(char const *pat, char const *str)

> +{
> +	/*
> +	 * Backtrack to previous * on mismatch and retry starting one
> +	 * character later in the string.  Because * matches all characters
> +	 * (no exception for /), it can be easily proved that there's
> +	 * never a need to backtrack multiple levels.
> +	 */
> +	char const *back_pat = 0, *back_str = back_str;

Blank line here.

I haven't delved into the actual implementation.  Looks sane on the
first glance.

> +#ifdef UNITTEST
> +
> +/* Test code */
> +#include <stdio.h>
> +#include <stdlib.h>
> +struct glob_test {
> +	char const *pat, *str;
> +	bool expected;
> +};
> +
> +static void
> +test(struct glob_test const *g)
> +{
> +	bool match = glob_match(g->pat, g->str);
> +
> +	printf("\"%s\" vs. \"%s\": %s %s\n", g->pat, g->str,
> +		match ? "match" : "mismatch",
> +		match == g->expected ? "OK" : "*** ERROR ***");
> +	if (match != g->expected)
> +		exit(1);
> +}
> +
> +static struct glob_test const tests[] = {
> +	{ "a", "a", true },
...
> +	{ "*ab*cd*", "abcabcabcabcefg", false }
> +};
> +
> +int
> +main(void)
> +{
> +	size_t i;
> +
> +	for (i = 0; i < sizeof(tests)/sizeof(*tests); i++)
> +		test(tests + i);
> +
> +	return 0;
> +}
> +
> +#endif	/* UNITTEST */

Again, I don't really think the userland testing code belongs here.
If you wanna keep them, please make it in-kernel selftesting.  We
don't really wanna keep code which can't get built and tested in
kernel tree proper.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] Add lib/glob.c
  2014-05-10  3:13 ` [PATCH 1/2] Add lib/glob.c George Spelvin
  2014-05-10  3:14   ` [PATCH 2/2] libata: Use glob_match from lib/glob.c George Spelvin
  2014-05-10 12:21   ` [PATCH 1/2] Add lib/glob.c Tejun Heo
@ 2014-05-10 12:23   ` Tejun Heo
  2014-05-10 14:03     ` George Spelvin
  2014-05-10 17:29   ` Randy Dunlap
  3 siblings, 1 reply; 15+ messages in thread
From: Tejun Heo @ 2014-05-10 12:23 UTC (permalink / raw)
  To: George Spelvin; +Cc: akpm, linux-ide, linux-kernel, mingo, torvalds

Ooh, one more nitpick.

On Fri, May 09, 2014 at 11:13:56PM -0400, George Spelvin wrote:
> +/**
> + * glob_match - Shell-style pattern matching, like !fnmatch(pat, str, 0)
> + * @pat: Pattern to match.  Metacharacters are ?, *, [ and \.
> + *       (And, inside character classes, !, - and ].)

@ARG lines should be contained in a single line.  Just "Pattern to
match." should do.  With detailed description in the body.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] Add lib/glob.c
  2014-05-10 12:23   ` Tejun Heo
@ 2014-05-10 14:03     ` George Spelvin
  2014-05-10 17:22       ` Randy Dunlap
  0 siblings, 1 reply; 15+ messages in thread
From: George Spelvin @ 2014-05-10 14:03 UTC (permalink / raw)
  To: linux, tj; +Cc: akpm, linux-ide, linux-kernel, mingo, torvalds

Thanks a lot for the feedback!

> On Fri, May 09, 2014 at 11:13:56PM -0400, George Spelvin wrote:
>> +/**
>> + * glob_match - Shell-style pattern matching, like !fnmatch(pat, str, 0)
>> + * @pat: Pattern to match.  Metacharacters are ?, *, [ and \.
>> + *       (And, inside character classes, !, - and ].)

> @ARG lines should be contained in a single line.  Just "Pattern to
> match." should do.  With detailed description in the body.

Huh, Documentation/kernel-doc-nano-HOWTO.txt (lines 57-59, to be precise)
implies otherwise pretty strongly.  But I can certainly change it.

> Just adding glob.o to lib-y should be enough.  It will be excluded
> from linking if unused.

Will that work right if the caller is a module?  What will it get linked
into, the main kernel binary or the module?

A significant and very helpful simplification; I just want to be sure
it works right.

>> +#ifdef UNITTEST
>> +/* To do a basic sanity test, "cc -DUNITTEST glob.c" and run a.out. */
>> +
>> +#include <stdbool.h>
>> +#define __pure __attribute__((pure))
>> +#define NOP(x)
>> +#define EXPORT_SYMBOL NOP	/* Two stages to avoid checkpatch complaints */

> These things tend to bitrot.  Let's please keep testing harness out of
> tree.

Damn, when separated it bitrots a lot faster.  That's *is* my testing
harness, and I wanted to keep it close so it has a chance on hell of
being used by someone who updates it.

Especially given that the function's interface is quite rigidly defined,
do you really think there will be a lot of rot?

> Do we make library routines separate modules usually?

A large number of files in lib/ are implemented that way (lib/crc-ccitt.c,
just for one example), and that's what I copied.  But if I just do the
obj-y thing, all that goes away

>> +bool __pure
>> +glob_match(char const *pat, char const *str)
>
> The whole thing fits in a single 80 column line, right?
> 
> bool __pure glob_match(char const *pat, char const *str)

Whoops, a residue of my personal code style.  (I like to left-align
function names in definitions so they're easy to search for with ^func.)
But it's not kernel style.  Will fix.

>> +{
>> +	/*
>> +	 * Backtrack to previous * on mismatch and retry starting one
>> +	 * character later in the string.  Because * matches all characters
>> +	 * (no exception for /), it can be easily proved that there's
>> +	 * never a need to backtrack multiple levels.
>> +	 */
>> +	char const *back_pat = 0, *back_str = back_str;

> Blank line here.

I had considered the "/*" start of the following block comment as visually
enough separation between variable declarations and statements, but sure,
I can add one.

> I haven't delved into the actual implementation.  Looks sane on the
> first glance.

That's the part I'm least worried about, actually.

> Again, I don't really think the userland testing code belongs here.
> If you want to keep them, please make it in-kernel selftesting.  We
> don't really want to keep code which can't get built and tested in
> kernel tree proper.

I'll see if I can figure out how to do that.  Simple as it is, I hate to
throw away regression tests.

Thank you very much.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] Add lib/glob.c
  2014-05-10 14:03     ` George Spelvin
@ 2014-05-10 17:22       ` Randy Dunlap
  0 siblings, 0 replies; 15+ messages in thread
From: Randy Dunlap @ 2014-05-10 17:22 UTC (permalink / raw)
  To: George Spelvin, tj; +Cc: akpm, linux-ide, linux-kernel, mingo, torvalds

On 05/10/2014 07:03 AM, George Spelvin wrote:
> Thanks a lot for the feedback!
> 
>> On Fri, May 09, 2014 at 11:13:56PM -0400, George Spelvin wrote:
>>> +/**
>>> + * glob_match - Shell-style pattern matching, like !fnmatch(pat, str, 0)
>>> + * @pat: Pattern to match.  Metacharacters are ?, *, [ and \.
>>> + *       (And, inside character classes, !, - and ].)
> 
>> @ARG lines should be contained in a single line.  Just "Pattern to
>> match." should do.  With detailed description in the body.

That's old/historical, not current.

> Huh, Documentation/kernel-doc-nano-HOWTO.txt (lines 57-59, to be precise)
> implies otherwise pretty strongly.  But I can certainly change it.

Either way should be OK.

>> Just adding glob.o to lib-y should be enough.  It will be excluded
>> from linking if unused.
> 
> Will that work right if the caller is a module?  What will it get linked
> into, the main kernel binary or the module?

and sometimes we have to use obj-y instead of lib-y.

> A significant and very helpful simplification; I just want to be sure
> it works right.
> 
>>> +#ifdef UNITTEST
>>> +/* To do a basic sanity test, "cc -DUNITTEST glob.c" and run a.out. */
>>> +
>>> +#include <stdbool.h>
>>> +#define __pure __attribute__((pure))
>>> +#define NOP(x)
>>> +#define EXPORT_SYMBOL NOP	/* Two stages to avoid checkpatch complaints */
> 
>> These things tend to bitrot.  Let's please keep testing harness out of
>> tree.
> 
> Damn, when separated it bitrots a lot faster.  That's *is* my testing
> harness, and I wanted to keep it close so it has a chance on hell of
> being used by someone who updates it.
> 
> Especially given that the function's interface is quite rigidly defined,
> do you really think there will be a lot of rot?
> 
>> Do we make library routines separate modules usually?
> 
> A large number of files in lib/ are implemented that way (lib/crc-ccitt.c,
> just for one example), and that's what I copied.  But if I just do the
> obj-y thing, all that goes away
> 
>>> +bool __pure
>>> +glob_match(char const *pat, char const *str)
>>
>> The whole thing fits in a single 80 column line, right?
>>
>> bool __pure glob_match(char const *pat, char const *str)
> 
> Whoops, a residue of my personal code style.  (I like to left-align
> function names in definitions so they're easy to search for with ^func.)
> But it's not kernel style.  Will fix.
> 
>>> +{
>>> +	/*
>>> +	 * Backtrack to previous * on mismatch and retry starting one
>>> +	 * character later in the string.  Because * matches all characters
>>> +	 * (no exception for /), it can be easily proved that there's
>>> +	 * never a need to backtrack multiple levels.
>>> +	 */
>>> +	char const *back_pat = 0, *back_str = back_str;
> 
>> Blank line here.
> 
> I had considered the "/*" start of the following block comment as visually
> enough separation between variable declarations and statements, but sure,
> I can add one.
> 
>> I haven't delved into the actual implementation.  Looks sane on the
>> first glance.
> 
> That's the part I'm least worried about, actually.
> 
>> Again, I don't really think the userland testing code belongs here.
>> If you want to keep them, please make it in-kernel selftesting.  We
>> don't really want to keep code which can't get built and tested in
>> kernel tree proper.
> 
> I'll see if I can figure out how to do that.  Simple as it is, I hate to
> throw away regression tests.


-- 
~Randy

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] Add lib/glob.c
  2014-05-10  3:13 ` [PATCH 1/2] Add lib/glob.c George Spelvin
                     ` (2 preceding siblings ...)
  2014-05-10 12:23   ` Tejun Heo
@ 2014-05-10 17:29   ` Randy Dunlap
  2014-05-11 12:36     ` Tejun Heo
  3 siblings, 1 reply; 15+ messages in thread
From: Randy Dunlap @ 2014-05-10 17:29 UTC (permalink / raw)
  To: George Spelvin, tj; +Cc: akpm, linux-ide, linux-kernel, mingo, torvalds

On 05/09/2014 08:13 PM, George Spelvin wrote:
> This is a helper function from drivers/ata/libata_core.c, where it is used
> to blacklist particular device models.  It's being moved to lib/ so other
> drivers may use it for the same purpose.
> 
> This implementation in non-recursive, so is safe for the kernel stack.
> 
> Signed-off-by: George Spelvin <linux@horizon.com>
> ---
> Finally rescued this from the back burner.  The code size will go back
> down in the second patch which removes the old implementation, although
> the number of source line reflects more comments and a test driver.
> 
> The infrastructure parts of this, such as the module name and Kconfig
> declarations, are something I haven't done before, and comments are
> appreciated.
> 
> Tejun Heo <htejun@gmail.com> wrote:
>> On Wed, Mar 12, 2014 at 06:52:44PM -0400, George Spelvin wrote:
>>> Sure; I'll prepare some patches.  May I feed it through you, or
>>> is there a lib/ maintainer I need to go through?
>>
>> Please keep me cc'd but you'd probably also want to cc Linus, Andrew
>> Morton and Ingo.
> 
>  include/linux/glob.h |  10 +++
>  lib/Kconfig          |  14 ++++
>  lib/Makefile         |   2 +
>  lib/glob.c           | 225 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 251 insertions(+)
>  create mode 100644 include/linux/glob.h
>  create mode 100644 lib/glob.c

> diff --git a/lib/Kconfig b/lib/Kconfig
> index 991c98b..5333d10 100644
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -373,6 +373,20 @@ config CPU_RMAP
>  config DQL
>  	bool
>  
> +config GLOB
> +	tristate
> +#	(Prompt disabled to reduce kbuild clutter until someone needs it.)
> +#	prompt "glob_match() function"
> +	help
> +	  This option provides a glob_match function for performing simple
> +	  text pattern matching.  It is primarily used by the ATA code
> +	  to blacklist particular drive models, but other device drivers
> +	  may need similar functionality.
> +
> +	  All in-kernel drivers that require this function automatically

	I would drop "automatically". It has to be coded.

> +	  select this option.  Say N unless you are compiling an out-of
> +	  tree driver which tells you it depend on it.

	To support out-of-tree drivers, I'm pretty sure that you will need
	to use obj- instead of lib-.  lib- drops unused code, like Tejun said.

> +
>  #
>  # Netlink attribute parsing support is select'ed if needed
>  #


-- 
~Randy

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] Add lib/glob.c
  2014-05-10 12:21   ` [PATCH 1/2] Add lib/glob.c Tejun Heo
@ 2014-05-11  6:02     ` George Spelvin
  2014-05-12 23:03     ` Andrew Morton
  1 sibling, 0 replies; 15+ messages in thread
From: George Spelvin @ 2014-05-11  6:02 UTC (permalink / raw)
  To: linux, tj; +Cc: akpm, linux-ide, linux-kernel, mingo, torvalds

On Sat, 10 May 2014 08:21:38 -0400, Tejun Heo wrote:
> On Fri, May 09, 2014 at 11:13:56PM -0400, George Spelvin wrote:
>> +config GLOB
>> +	tristate
>> +#	(Prompt disabled to reduce kbuild clutter until someone needs it.)
>> +#	prompt "glob_match() function"
>> +	help
>> +	  This option provides a glob_match function for performing simple
>> +	  text pattern matching.  It is primarily used by the ATA code
>> +	  to blacklist particular drive models, but other device drivers
>> +	  may need similar functionality.
>> +
>> +	  All in-kernel drivers that require this function automatically
>> +	  select this option.  Say N unless you are compiling an out-of
>> +	  tree driver which tells you it depend on it.

> Just adding glob.o to lib-y should be enough.  It will be excluded
> from linking if unused.

But, I just confirmed, it will also be excluded from linking if
CONFIG_ATA=m.  See for example commit b4d3ba3346f0 where some helpers
had to be moved from lib-y to obj-y to fix module load errors.
(Thanks to Randy Dunlap for this example.)

>> +#else
>> +
>> +#include <linux/module.h>
>> +#include <linux/glob.h>
>> +
>> +MODULE_DESCRIPTION("glob(7) matching");
>> +MODULE_LICENSE("Dual MIT/GPL");

> Do we make library routines separate modules usually?

There are basically three options I can see:
1) Make it non-configurable and always include the code in the kernel
   using oby-y, and just accept the wasted space if CONFIG_ATA=n
2) Make it a boolean option, so CONFIG_ATA=m will force CONFIG_GLOB=y and
   it will be included in the static kernel but unused until the module
   is loaded.
3) Make it a tristate option, which compiles into the kernel if
   CONFIG_ATA=y (the common case), and is its own module if CONFIG_ATA=m.
   An awful lot of the files in lib/ take this approach.

Do you have a preference?  Option 3 is the current status.

(Revised patch will come when I have the self-test converted.)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] Add lib/glob.c
  2014-05-10 17:29   ` Randy Dunlap
@ 2014-05-11 12:36     ` Tejun Heo
  2014-06-07  2:44       ` [PATCH v2 1/3] " George Spelvin
  0 siblings, 1 reply; 15+ messages in thread
From: Tejun Heo @ 2014-05-11 12:36 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: George Spelvin, akpm, linux-ide, linux-kernel, mingo, torvalds

On Sat, May 10, 2014 at 10:29:18AM -0700, Randy Dunlap wrote:
> > +	  select this option.  Say N unless you are compiling an out-of
> > +	  tree driver which tells you it depend on it.
> 
> 	To support out-of-tree drivers, I'm pretty sure that you will need
> 	to use obj- instead of lib-.  lib- drops unused code, like Tejun said.

I don't think we need to worry about out-of-tree drivers from the
beginning.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] Add lib/glob.c
  2014-05-10 12:21   ` [PATCH 1/2] Add lib/glob.c Tejun Heo
  2014-05-11  6:02     ` George Spelvin
@ 2014-05-12 23:03     ` Andrew Morton
  1 sibling, 0 replies; 15+ messages in thread
From: Andrew Morton @ 2014-05-12 23:03 UTC (permalink / raw)
  To: Tejun Heo; +Cc: George Spelvin, linux-ide, linux-kernel, mingo, torvalds

On Sat, 10 May 2014 08:21:38 -0400 Tejun Heo <tj@kernel.org> wrote:

> > +int
> > +main(void)
> > +{
> > +	size_t i;
> > +
> > +	for (i = 0; i < sizeof(tests)/sizeof(*tests); i++)
> > +		test(tests + i);
> > +
> > +	return 0;
> > +}
> > +
> > +#endif	/* UNITTEST */
> 
> Again, I don't really think the userland testing code belongs here.
> If you wanna keep them, please make it in-kernel selftesting.  We
> don't really wanna keep code which can't get built and tested in
> kernel tree proper.

If the tests don't measurably increase boot times we could run them
unconditionally at boot.  Mark everything __init/__initdata and
the costs are very low.  lib/plist.c does this.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 1/3] Add lib/glob.c
  2014-05-11 12:36     ` Tejun Heo
@ 2014-06-07  2:44       ` George Spelvin
  2014-06-07  2:49         ` [PATCH v2 2/3] lib: glob.c: Add CONFIG_GLOB_SELFTEST George Spelvin
  2014-06-07  2:50         ` [PATCH v2 3/3] libata: Use glob_match from lib/glob.c George Spelvin
  0 siblings, 2 replies; 15+ messages in thread
From: George Spelvin @ 2014-06-07  2:44 UTC (permalink / raw)
  To: rdunlap, tj; +Cc: akpm, linux-ide, linux-kernel, linux, mingo

This is a helper function from drivers/ata/libata_core.c, where it is used
to blacklist particular device models.  It's being moved to lib/ so other
drivers may use it for the same purpose.

This implementation in non-recursive, so is safe for the kernel stack.

Signed-off-by: George Spelvin <linux@horizon.com>
---
Sorry for the delay; while researching ways to organize the self-test
code I found lots of other code to fix up.

Per previous discussion, support for out-of-tree users is currently
disabled, although the machinery necessary to add it easily is
still there.

If nothing else, it provides a good place to put the self-test code.
(Which is now split out into a separate patch.)

I believe I've addressed all the comments made last time.  Any more?
(Acked-by: and Reviewed-by: particularly appreciated!)

 include/linux/glob.h |  10 +++++
 lib/Kconfig          |  19 ++++++++
 lib/Makefile         |   2 +
 lib/glob.c           | 123 +++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 154 insertions(+)
 create mode 100644 include/linux/glob.h
 create mode 100644 lib/glob.c

diff --git a/include/linux/glob.h b/include/linux/glob.h
new file mode 100644
index 00000000..c990b7fb
--- /dev/null
+++ b/include/linux/glob.h
@@ -0,0 +1,10 @@
+#ifndef _LINUX_GLOB_H
+#define _LINUX_GLOB_H
+
+#include <linux/types.h>	/* For bool */
+#include <linux/compiler.h>	/* For __pure */
+
+bool __pure glob_match(char const *pat, char const *str);
+
+#endif	/* _LINUX_GLOB_H */
+
diff --git a/lib/Kconfig b/lib/Kconfig
index 4771fb3f..4a607036 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -382,6 +382,25 @@ config CPU_RMAP
 config DQL
 	bool
 
+config GLOB
+	bool
+#	This actually supports modular compilation, but the module overhead
+#	is ridiculous for the amount of code involved.	Until an out-of-tree
+#	driver asks for it, we'll just link it directly it into the kernel
+#	when required.  Since we're ignoring out-of-tree users,	there's also
+#	no need bother prompting for a manual decision:
+#	prompt "glob_match() function"
+	help
+	  This option provides a glob_match function for performing
+	  simple text pattern matching.  It originated in the ATA code
+	  to blacklist particular drive models, but other device drivers
+	  may need similar functionality.
+
+	  All drivers in the Linux kernel tree that require this function
+	  should automatically select this option.  Say N unless you
+	  are compiling an out-of tree driver which tells you that it
+	  depends on this.
+
 #
 # Netlink attribute parsing support is select'ed if needed
 #
diff --git a/lib/Makefile b/lib/Makefile
index 0cd7b68e..a2ae6fe2 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -134,6 +134,8 @@ obj-$(CONFIG_CORDIC) += cordic.o
 
 obj-$(CONFIG_DQL) += dynamic_queue_limits.o
 
+obj-$(CONFIG_GLOB) += glob.o
+
 obj-$(CONFIG_MPILIB) += mpi/
 obj-$(CONFIG_SIGNATURE) += digsig.o
 
diff --git a/lib/glob.c b/lib/glob.c
new file mode 100644
index 00000000..05beb470
--- /dev/null
+++ b/lib/glob.c
@@ -0,0 +1,123 @@
+#include <linux/module.h>
+#include <linux/glob.h>
+
+/*
+ * The only reason this code can be compiled as a module is because the
+ * ATA code that depends on it can be as well.  In practice, they're
+ * both usually compiled in and the module overhead goes away.
+ */
+MODULE_DESCRIPTION("glob(7) matching");
+MODULE_LICENSE("Dual MIT/GPL");
+
+/**
+ * glob_match - Shell-style pattern matching, like !fnmatch(pat, str, 0)
+ * @pat: Shell-style pattern to match, e.g. "*.[ch]".
+ * @str: String to match.  The pattern must match the entire string.
+ *
+ * Perform shell-style glob matching, returning true (1) if the match
+ * succeeds, or false (0) if it fails.  Equivalent to !fnmatch(@pat, @str, 0).
+ *
+ * Pattern metacharacters are ?, *, [ and \.
+ * (And, inside character classes, !, - and ].)
+ *
+ * This is small and simple implementation intended for device blacklists
+ * where a string is matched against a number of patterns.  Thus, it
+ * does not preprocess the patterns.  It is non-recursive, and run-time
+ * is at most quadratic: strlen(@str)*strlen(@pat).
+ *
+ * An example of the worst case is glob_match("*aaaaa", "aaaaaaaaaa");
+ * it takes 6 passes over the pattern before matching the string.
+ *
+ * Like !fnmatch(@pat, @str, 0) and unlike the shell, this does NOT
+ * treat / or leading . specially; it isn't actually used for pathnames.
+ *
+ * Note that according to glob(7) (and unlike bash), character classes
+ * are complemented by a leading !; this does not support the regex-style
+ * [^a-z] syntax.
+ *
+ * An opening bracket without a matching close is matched literally.
+ */
+bool __pure glob_match(char const *pat, char const *str)
+{
+	/*
+	 * Backtrack to previous * on mismatch and retry starting one
+	 * character later in the string.  Because * matches all characters
+	 * (no exception for /), it can be easily proved that there's
+	 * never a need to backtrack multiple levels.
+	 */
+	char const *back_pat = 0, *back_str = back_str;
+
+	/*
+	 * Loop over each token (character or class) in pat, matching
+	 * it against the remaining unmatched tail of str.  Return false
+	 * on mismatch, or true after matching the trailing nul bytes.
+	 */
+	for (;;) {
+		unsigned char c = *str++;
+		unsigned char d = *pat++;
+
+		switch (d) {
+		case '?':	/* Wildcard: anything but nul */
+			if (c == '\0')
+				return false;
+			break;
+		case '*':	/* Any-length wildcard */
+			if (*pat == '\0')	/* Optimize trailing * case */
+				return true;
+			back_pat = pat;
+			back_str = --str;	/* Allow zero-length match */
+			break;
+		case '[': {	/* Character class */
+			bool match = false, inverted = (*pat == '!');
+			char const *class = pat + inverted;
+			unsigned char a = *class++;
+
+			/*
+			 * Iterate over each span in the character class.
+			 * A span is either a single character a, or a
+			 * range a-b.  The first span may begin with ']'.
+			 */
+			do {
+				unsigned char b = a;
+
+				if (a == '\0')	/* Malformed */
+					goto literal;
+
+				if (class[0] == '-' && class[1] != ']') {
+					b = class[1];
+
+					if (b == '\0')
+						goto literal;
+
+					class += 2;
+					/* Any special action if a > b? */
+				}
+				match |= (a <= c && c <= b);
+			} while ((a = *class++) != ']');
+
+			if (match == inverted)
+				goto backtrack;
+			pat = class;
+			}
+			break;
+		case '\\':
+			d = *pat++;
+			/*FALLTHROUGH*/
+		default:	/* Literal character */
+literal:
+			if (c == d) {
+				if (d == '\0')
+					return true;
+				break;
+			}
+backtrack:
+			if (c == '\0' || !back_pat)
+				return false;	/* No point continuing */
+			/* Try again from last *, one character later in str. */
+			pat = back_pat;
+			str = ++back_str;
+			break;
+		}
+	}
+}
+EXPORT_SYMBOL(glob_match);
-- 
2.0.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 2/3] lib: glob.c: Add CONFIG_GLOB_SELFTEST
  2014-06-07  2:44       ` [PATCH v2 1/3] " George Spelvin
@ 2014-06-07  2:49         ` George Spelvin
  2014-06-11 23:04           ` Andrew Morton
  2014-06-07  2:50         ` [PATCH v2 3/3] libata: Use glob_match from lib/glob.c George Spelvin
  1 sibling, 1 reply; 15+ messages in thread
From: George Spelvin @ 2014-06-07  2:49 UTC (permalink / raw)
  To: linux, rdunlap, tj; +Cc: akpm, linux-ide, linux-kernel, mingo

This was useful during development, and is retained for future regression
testing.

GCC appears to have no way to place string literals in a particular
section; adding __initconst to a char pointer leaves the string itself
in the default string section, where it will not be thrown away after
module load.

Thus all string constants are kept in explicitly declared and named
arrays.  Sorry this makes printk a bit harder to read.  At least the
tests are more compact.

Signed-off-by: George Spelvin <linux@horizon.com>
---
Persuading GCC to throw away *all* the self-test data after running
it was surprisingly annoying.

The one thing I'm not really sure about is what to do if the self-test
fails.  For now, I make the module_init function fail too.  Opinions?

 lib/Kconfig |  14 ++++++
 lib/glob.c  | 164 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 178 insertions(+)

diff --git a/lib/Kconfig b/lib/Kconfig
index 4a607036..49592008 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -401,6 +401,20 @@ config GLOB
 	  are compiling an out-of tree driver which tells you that it
 	  depends on this.
 
+config GLOB_SELFTEST
+	bool "glob self-test on init"
+	default n
+	depends on GLOB
+	help
+	  This option enables a simple self-test of the glob_match
+	  function on startup.	It is primarily useful for people
+	  working on the code to ensure they haven't introduced any
+	  regressions.
+
+	  It only adds a little bit of code and slows kernel boot (or
+	  module load) by a small amount, so you're welcome to play with
+	  it, but you probably don't need it.
+
 #
 # Netlink attribute parsing support is select'ed if needed
 #
diff --git a/lib/glob.c b/lib/glob.c
index 05beb470..671f80d3 100644
--- a/lib/glob.c
+++ b/lib/glob.c
@@ -121,3 +121,167 @@ backtrack:
 	}
 }
 EXPORT_SYMBOL(glob_match);
+
+
+#ifdef CONFIG_GLOB_SELFTEST
+
+#include <linux/printk.h>
+#include <linux/moduleparam.h>
+
+/* Boot with "glob.verbose=1" to show successful tests, too */
+static bool verbose = false;
+module_param(verbose, bool, 0);
+
+struct glob_test {
+	char const *pat, *str;
+	bool expected;
+};
+
+static bool __pure __init test(char const *pat, char const *str, bool expected)
+{
+	bool match = glob_match(pat, str);
+	bool success = match == expected;
+
+	/* Can't get string literals into a particular section, so... */
+	static char const msg_error[] __initconst =
+		KERN_ERR "glob: \"%s\" vs. \"%s\": %s *** ERROR ***\n";
+	static char const msg_ok[] __initconst =
+		KERN_DEBUG "glob: \"%s\" vs. \"%s\": %s OK\n";
+	static char const mismatch[] __initconst = "mismatch";
+	char const *message;
+
+	if (!success)
+		message = msg_error;
+	else if (verbose)
+		message = msg_ok;
+	else
+		return success;
+
+	printk(message, pat, str, mismatch + 3*match);
+	return success;
+}
+
+/*
+ * The tests are all jammed together in one array to make it simpler
+ * to place that array in the .init.rodata section.  The obvious
+ * "array of structures containing char *" has no way to force the
+ * pointed-to strings to be in a particular section.
+ *
+ * Anyway, a test consists of:
+ * 1. Expected glob_match result: '1' or '0'.
+ * 2. Pattern to match: null-terminated string
+ * 3. String to match against: null-terminated string
+ *
+ * The list of tests is terminated with a final '\0' instead of
+ * a glob_match result character.
+ */
+static char const glob_tests[] __initconst = 
+	/* Some basic tests */
+	"1" "a\0" "a\0"
+	"0" "a\0" "b\0"
+	"0" "a\0" "aa\0"
+	"0" "a\0" "\0"
+	"1" "\0" "\0"
+	"0" "\0" "a\0"
+	/* Simple character class tests */
+	"1" "[a]\0" "a\0"
+	"0" "[a]\0" "b\0"
+	"0" "[!a]\0" "a\0"
+	"1" "[!a]\0" "b\0"
+	"1" "[ab]\0" "a\0"
+	"1" "[ab]\0" "b\0"
+	"0" "[ab]\0" "c\0"
+	"1" "[!ab]\0" "c\0"
+	"1" "[a-c]\0" "b\0"
+	"0" "[a-c]\0" "d\0"
+	/* Corner cases in character class parsing */
+	"1" "[a-c-e-g]\0" "-\0"
+	"0" "[a-c-e-g]\0" "d\0"
+	"1" "[a-c-e-g]\0" "f\0"
+	"1" "[]a-ceg-ik[]\0" "a\0"
+	"1" "[]a-ceg-ik[]\0" "]\0"
+	"1" "[]a-ceg-ik[]\0" "[\0"
+	"1" "[]a-ceg-ik[]\0" "h\0"
+	"0" "[]a-ceg-ik[]\0" "f\0"
+	"0" "[!]a-ceg-ik[]\0" "h\0"
+	"0" "[!]a-ceg-ik[]\0" "]\0"
+	"1" "[!]a-ceg-ik[]\0" "f\0"
+	/* Simple wild cards */
+	"1" "?\0" "a\0"
+	"0" "?\0" "aa\0"
+	"0" "??\0" "a\0"
+	"1" "?x?\0" "axb\0"
+	"0" "?x?\0" "abx\0"
+	"0" "?x?\0" "xab\0"
+	/* Asterisk wild cards (backtracking) */
+	"0" "*??\0" "a\0"
+	"1" "*??\0" "ab\0"
+	"1" "*??\0" "abc\0"
+	"1" "*??\0" "abcd\0"
+	"0" "??*\0" "a\0"
+	"1" "??*\0" "ab\0"
+	"1" "??*\0" "abc\0"
+	"1" "??*\0" "abcd\0"
+	"0" "?*?\0" "a\0"
+	"1" "?*?\0" "ab\0"
+	"1" "?*?\0" "abc\0"
+	"1" "?*?\0" "abcd\0"
+	"1" "*b\0" "b\0"
+	"1" "*b\0" "ab\0"
+	"0" "*b\0" "ba\0"
+	"1" "*b\0" "bb\0"
+	"1" "*b\0" "abb\0"
+	"1" "*b\0" "bab\0"
+	"1" "*bc\0" "abbc\0"
+	"1" "*bc\0" "bc\0"
+	"1" "*bc\0" "bbc\0"
+	"1" "*bc\0" "bcbc\0"
+	/* Multiple asterisks (complex backtracking) */
+	"1" "*ac*\0" "abacadaeafag\0"
+	"1" "*ac*ae*ag*\0" "abacadaeafag\0"
+	"1" "*a*b*[bc]*[ef]*g*\0" "abacadaeafag\0"
+	"0" "*a*b*[ef]*[cd]*g*\0" "abacadaeafag\0"
+	"1" "*abcd*\0" "abcabcabcabcdefg\0"
+	"1" "*ab*cd*\0" "abcabcabcabcdefg\0"
+	"1" "*abcd*abcdef*\0" "abcabcdabcdeabcdefg\0"
+	"0" "*abcd*\0" "abcabcabcabcefg\0"
+	"0" "*ab*cd*\0" "abcabcabcabcefg\0";
+
+static int __init glob_init(void)
+{
+	unsigned successes = 0;
+	unsigned n = 0;
+	char const *p = glob_tests;
+	static char const message[] __initconst =
+		KERN_INFO "glob: %u self-tests passed, %u failed\n";
+
+	/*
+	 * Tests are jammed together in a string.  The first byte is '1'
+	 * or '0' to indicate the expected outcome, or '\0' to indicate the
+	 * end of the tests.  Then come two null-terminated strings: the
+	 * pattern and the string to match it against.
+	 */
+	while (*p) {
+		bool expected = *p++ & 1;
+		char const *pat = p;
+
+		p += strlen(p) + 1;
+		successes += test(pat, p, expected);
+		p += strlen(p) + 1;
+		n++;
+	}
+
+	n -= successes;
+	printk(message, successes, n);
+
+	/* What's the errno for "kernel bug detected"?  Guess... */
+	return n ? -ECANCELED : 0;
+}
+
+/* We need a dummy exit function to allow unload */
+static void __exit glob_fini(void) { }
+
+module_init(glob_init);
+module_exit(glob_fini);
+
+#endif /* CONFIG_GLOB_SELFTEST */
-- 
2.0.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 3/3] libata: Use glob_match from lib/glob.c
  2014-06-07  2:44       ` [PATCH v2 1/3] " George Spelvin
  2014-06-07  2:49         ` [PATCH v2 2/3] lib: glob.c: Add CONFIG_GLOB_SELFTEST George Spelvin
@ 2014-06-07  2:50         ` George Spelvin
  1 sibling, 0 replies; 15+ messages in thread
From: George Spelvin @ 2014-06-07  2:50 UTC (permalink / raw)
  To: linux, rdunlap, tj; +Cc: akpm, linux-ide, linux-kernel, mingo

The function may be useful for other drivers, so export it.
(Suggested by Tejun Heo.)

Note that I inverted the return value of glob_match; returning
true on match seemed to make more sense.

Signed-off-by: George Spelvin <linux@horizon.com>
---
This is unchanged vrom v1.

 drivers/ata/Kconfig       |  1 +
 drivers/ata/libata-core.c | 72 ++---------------------------------------------
 2 files changed, 4 insertions(+), 69 deletions(-)

diff --git a/drivers/ata/Kconfig b/drivers/ata/Kconfig
index 0033fafc..0a5e4556 100644
--- a/drivers/ata/Kconfig
+++ b/drivers/ata/Kconfig
@@ -16,6 +16,7 @@ menuconfig ATA
 	depends on BLOCK
 	depends on !(M32R || M68K || S390) || BROKEN
 	select SCSI
+	select GLOB
 	---help---
 	  If you want to use an ATA hard disk, ATA tape drive, ATA CD-ROM or
 	  any other ATA device under Linux, say Y and make sure that you know
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index ea83828b..b75695ff 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -59,6 +59,7 @@
 #include <linux/async.h>
 #include <linux/log2.h>
 #include <linux/slab.h>
+#include <linux/glob.h>
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
 #include <scsi/scsi_host.h>
@@ -4250,73 +4251,6 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
 	{ }
 };
 
-/**
- *	glob_match - match a text string against a glob-style pattern
- *	@text: the string to be examined
- *	@pattern: the glob-style pattern to be matched against
- *
- *	Either/both of text and pattern can be empty strings.
- *
- *	Match text against a glob-style pattern, with wildcards and simple sets:
- *
- *		?	matches any single character.
- *		*	matches any run of characters.
- *		[xyz]	matches a single character from the set: x, y, or z.
- *		[a-d]	matches a single character from the range: a, b, c, or d.
- *		[a-d0-9] matches a single character from either range.
- *
- *	The special characters ?, [, -, or *, can be matched using a set, eg. [*]
- *	Behaviour with malformed patterns is undefined, though generally reasonable.
- *
- *	Sample patterns:  "SD1?",  "SD1[0-5]",  "*R0",  "SD*1?[012]*xx"
- *
- *	This function uses one level of recursion per '*' in pattern.
- *	Since it calls _nothing_ else, and has _no_ explicit local variables,
- *	this will not cause stack problems for any reasonable use here.
- *
- *	RETURNS:
- *	0 on match, 1 otherwise.
- */
-static int glob_match (const char *text, const char *pattern)
-{
-	do {
-		/* Match single character or a '?' wildcard */
-		if (*text == *pattern || *pattern == '?') {
-			if (!*pattern++)
-				return 0;  /* End of both strings: match */
-		} else {
-			/* Match single char against a '[' bracketed ']' pattern set */
-			if (!*text || *pattern != '[')
-				break;  /* Not a pattern set */
-			while (*++pattern && *pattern != ']' && *text != *pattern) {
-				if (*pattern == '-' && *(pattern - 1) != '[')
-					if (*text > *(pattern - 1) && *text < *(pattern + 1)) {
-						++pattern;
-						break;
-					}
-			}
-			if (!*pattern || *pattern == ']')
-				return 1;  /* No match */
-			while (*pattern && *pattern++ != ']');
-		}
-	} while (*++text && *pattern);
-
-	/* Match any run of chars against a '*' wildcard */
-	if (*pattern == '*') {
-		if (!*++pattern)
-			return 0;  /* Match: avoid recursion at end of pattern */
-		/* Loop to handle additional pattern chars after the wildcard */
-		while (*text) {
-			if (glob_match(text, pattern) == 0)
-				return 0;  /* Remainder matched */
-			++text;  /* Absorb (match) this char and try again */
-		}
-	}
-	if (!*text && !*pattern)
-		return 0;  /* End of both strings: match */
-	return 1;  /* No match */
-}
-
 static unsigned long ata_dev_blacklisted(const struct ata_device *dev)
 {
 	unsigned char model_num[ATA_ID_PROD_LEN + 1];
@@ -4327,10 +4261,10 @@ static unsigned long ata_dev_blacklisted(const struct ata_device *dev)
 	ata_id_c_string(dev->id, model_rev, ATA_ID_FW_REV, sizeof(model_rev));
 
 	while (ad->model_num) {
-		if (!glob_match(model_num, ad->model_num)) {
+		if (glob_match(model_num, ad->model_num)) {
 			if (ad->model_rev == NULL)
 				return ad->horkage;
-			if (!glob_match(model_rev, ad->model_rev))
+			if (glob_match(model_rev, ad->model_rev))
 				return ad->horkage;
 		}
 		ad++;
-- 
2.0.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/3] lib: glob.c: Add CONFIG_GLOB_SELFTEST
  2014-06-07  2:49         ` [PATCH v2 2/3] lib: glob.c: Add CONFIG_GLOB_SELFTEST George Spelvin
@ 2014-06-11 23:04           ` Andrew Morton
  2014-06-12  1:38             ` George Spelvin
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2014-06-11 23:04 UTC (permalink / raw)
  To: George Spelvin; +Cc: rdunlap, tj, linux-ide, linux-kernel, mingo

On 6 Jun 2014 22:49:28 -0400 "George Spelvin" <linux@horizon.com> wrote:

> This was useful during development, and is retained for future regression
> testing.
> 
> GCC appears to have no way to place string literals in a particular
> section; adding __initconst to a char pointer leaves the string itself
> in the default string section, where it will not be thrown away after
> module load.
> 
> Thus all string constants are kept in explicitly declared and named
> arrays.  Sorry this makes printk a bit harder to read.  At least the
> tests are more compact.
> 
> Signed-off-by: George Spelvin <linux@horizon.com>
> ---
> Persuading GCC to throw away *all* the self-test data after running
> it was surprisingly annoying.

Yeah.  Props for making the attempt.

> The one thing I'm not really sure about is what to do if the self-test
> fails.  For now, I make the module_init function fail too.  Opinions?

The printk should suffice - someone will notice it eventually.

Using KERN_ERR to report a failure might help draw attention to it.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/3] lib: glob.c: Add CONFIG_GLOB_SELFTEST
  2014-06-11 23:04           ` Andrew Morton
@ 2014-06-12  1:38             ` George Spelvin
  0 siblings, 0 replies; 15+ messages in thread
From: George Spelvin @ 2014-06-12  1:38 UTC (permalink / raw)
  To: akpm, linux; +Cc: linux-ide, linux-kernel, mingo, rdunlap, tj

>> Persuading GCC to throw away *all* the self-test data after running
>> it was surprisingly annoying.
>
> Yeah.  Props for making the attempt.

*Whew*.  I was worried I'd get upbraided for overoptimziation.

>> The one thing I'm not really sure about is what to do if the self-test
>> fails.  For now, I make the module_init function fail too.  Opinions?
>
> The printk should suffice - someone will notice it eventually.
> 
> Using KERN_ERR to report a failure might help draw attention to it.

I'm not sure what you mean by "might"; I already *do* report it as KERN_ERR.

If you think failing the module load is a bad idea, feel free to
modify the patch.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2014-06-12  1:38 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20140313121032.GA9981@htj.dyndns.org>
2014-05-10  3:13 ` [PATCH 1/2] Add lib/glob.c George Spelvin
2014-05-10  3:14   ` [PATCH 2/2] libata: Use glob_match from lib/glob.c George Spelvin
2014-05-10 12:21   ` [PATCH 1/2] Add lib/glob.c Tejun Heo
2014-05-11  6:02     ` George Spelvin
2014-05-12 23:03     ` Andrew Morton
2014-05-10 12:23   ` Tejun Heo
2014-05-10 14:03     ` George Spelvin
2014-05-10 17:22       ` Randy Dunlap
2014-05-10 17:29   ` Randy Dunlap
2014-05-11 12:36     ` Tejun Heo
2014-06-07  2:44       ` [PATCH v2 1/3] " George Spelvin
2014-06-07  2:49         ` [PATCH v2 2/3] lib: glob.c: Add CONFIG_GLOB_SELFTEST George Spelvin
2014-06-11 23:04           ` Andrew Morton
2014-06-12  1:38             ` George Spelvin
2014-06-07  2:50         ` [PATCH v2 3/3] libata: Use glob_match from lib/glob.c George Spelvin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.