All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] Add some string_list-related functions
@ 2012-09-09  5:53 Michael Haggerty
  2012-09-09  5:53 ` [PATCH 1/4] Add a new function, string_list_split_in_place() Michael Haggerty
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Michael Haggerty @ 2012-09-09  5:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty

This patch series adds a few functions to the string_list API.  They
will be used in two upcoming patch series.  Unfortunately, both of the
series (which are otherwise logically independent) need the same
function; therefore, I am submitting these string-list enhancements as
a separate series on which the other two can depend.

This patch series applies to current master.

Michael Haggerty (4):
  Add a new function, string_list_split_in_place()
  Add a new function, filter_string_list()
  Add a new function, string_list_remove_duplicates()
  Add a function string_list_longest_prefix()

 .gitignore                                  |  1 +
 Documentation/technical/api-string-list.txt | 32 ++++++++++
 Makefile                                    |  1 +
 string-list.c                               | 77 ++++++++++++++++++++++++
 string-list.h                               | 41 +++++++++++++
 t/t0063-string-list.sh                      | 93 +++++++++++++++++++++++++++++
 test-string-list.c                          | 47 +++++++++++++++
 7 files changed, 292 insertions(+)
 create mode 100755 t/t0063-string-list.sh
 create mode 100644 test-string-list.c

-- 
1.7.11.3

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/4] Add a new function, string_list_split_in_place()
  2012-09-09  5:53 [PATCH 0/4] Add some string_list-related functions Michael Haggerty
@ 2012-09-09  5:53 ` Michael Haggerty
  2012-09-09  9:35   ` Junio C Hamano
  2012-09-09  5:53 ` [PATCH 2/4] Add a new function, filter_string_list() Michael Haggerty
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 23+ messages in thread
From: Michael Haggerty @ 2012-09-09  5:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty

Split a string into a string_list on a separator character.

This is similar to the strbuf_split_*() functions except that it works
with the more powerful string_list interface.  If strdup_strings is
false, it reuses the memory from the input string (thereby needing no
string memory allocations, though of course allocations are still
needed for the string_list_items array).

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---

In the tests, I use here documents to specify the expected output.  Is
this OK?  (It is certainly convenient.)

 .gitignore                                  |  1 +
 Documentation/technical/api-string-list.txt | 12 ++++++
 Makefile                                    |  1 +
 string-list.c                               | 23 +++++++++++
 string-list.h                               | 19 +++++++++
 t/t0063-string-list.sh                      | 63 +++++++++++++++++++++++++++++
 test-string-list.c                          | 25 ++++++++++++
 7 files changed, 144 insertions(+)
 create mode 100755 t/t0063-string-list.sh
 create mode 100644 test-string-list.c

diff --git a/.gitignore b/.gitignore
index bb5c91e..0ca7df8 100644
--- a/.gitignore
+++ b/.gitignore
@@ -193,6 +193,7 @@
 /test-run-command
 /test-sha1
 /test-sigchain
+/test-string-list
 /test-subprocess
 /test-svn-fe
 /common-cmds.h
diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt
index 5a0c14f..3b959a2 100644
--- a/Documentation/technical/api-string-list.txt
+++ b/Documentation/technical/api-string-list.txt
@@ -124,6 +124,18 @@ counterpart for sorted lists, which performs a binary search.
 	is set. The third parameter controls if the `util` pointer of the
 	items should be freed or not.
 
+`string_list_split_in_place`::
+
+	Split string into substrings on character delim and append the
+	substrings to a string_list.  The delimiter characters in
+	string are overwritten with NULs in the process.  If maxsplit
+	is a positive integer, then split at most maxsplit times.  If
+	list.strdup_strings is not set, then the new string_list_items
+	point into string, which therefore must not be modified or
+	freed while the string_list is in use.  Return the number of
+	substrings appended to the list.
+
+
 Data structures
 ---------------
 
diff --git a/Makefile b/Makefile
index 66e8216..ebbb381 100644
--- a/Makefile
+++ b/Makefile
@@ -501,6 +501,7 @@ TEST_PROGRAMS_NEED_X += test-run-command
 TEST_PROGRAMS_NEED_X += test-scrap-cache-tree
 TEST_PROGRAMS_NEED_X += test-sha1
 TEST_PROGRAMS_NEED_X += test-sigchain
+TEST_PROGRAMS_NEED_X += test-string-list
 TEST_PROGRAMS_NEED_X += test-subprocess
 TEST_PROGRAMS_NEED_X += test-svn-fe
 
diff --git a/string-list.c b/string-list.c
index d9810ab..110449c 100644
--- a/string-list.c
+++ b/string-list.c
@@ -194,3 +194,26 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
 	list->items[i] = list->items[list->nr-1];
 	list->nr--;
 }
+
+int string_list_split_in_place(struct string_list *list, char *string,
+			       int delim, int maxsplit)
+{
+	int count = 0;
+	char *p = string, *end;
+	for (;;) {
+		count++;
+		if (maxsplit > 0 && count > maxsplit) {
+			string_list_append(list, p);
+			return count;
+		}
+		end = strchr(p, delim);
+		if (end) {
+			*end = '\0';
+			string_list_append(list, p);
+			p = end + 1;
+		} else {
+			string_list_append(list, p);
+			return count;
+		}
+	}
+}
diff --git a/string-list.h b/string-list.h
index 0684cb7..7e51d03 100644
--- a/string-list.h
+++ b/string-list.h
@@ -45,4 +45,23 @@ int unsorted_string_list_has_string(struct string_list *list, const char *string
 struct string_list_item *unsorted_string_list_lookup(struct string_list *list,
 						     const char *string);
 void unsorted_string_list_delete_item(struct string_list *list, int i, int free_util);
+
+/*
+ * Split string into substrings on character delim and append the
+ * substrings to list.  The delimiter characters in string are
+ * overwritten with NULs in the process.  If maxsplit is a positive
+ * integer, then split at most maxsplit times.  If list.strdup_strings
+ * is not set, then the new string_list_items point into string, which
+ * therefore must not be modified or freed while the string_list
+ * is in use.  Return the number of substrings appended to list.
+ *
+ * Examples:
+ *   string_list_split_in_place(l, "foo:bar:baz", ':', -1) -> ["foo", "bar", "baz"]
+ *   string_list_split_in_place(l, "foo:bar:baz", ':', 1) -> ["foo", "bar:baz"]
+ *   string_list_split_in_place(l, "foo:bar:", ':', -1) -> ["foo", "bar", ""]
+ *   string_list_split_in_place(l, "", ':', -1) -> [""]
+ *   string_list_split_in_place(l, ":", ':', -1) -> ["", ""]
+ */
+int string_list_split_in_place(struct string_list *list, char *string,
+			       int delim, int maxsplit);
 #endif /* STRING_LIST_H */
diff --git a/t/t0063-string-list.sh b/t/t0063-string-list.sh
new file mode 100755
index 0000000..0eede83
--- /dev/null
+++ b/t/t0063-string-list.sh
@@ -0,0 +1,63 @@
+#!/bin/sh
+#
+# Copyright (c) 2012 Michael Haggerty
+#
+
+test_description='Test string list functionality'
+
+. ./test-lib.sh
+
+string_list_split_in_place() {
+	cat >split-expected &&
+	test_expect_success "split $1 $2 $3" "
+		test-string-list split_in_place '$1' '$2' '$3' >split-actual &&
+		test_cmp split-expected split-actual
+	"
+}
+
+string_list_split_in_place "foo:bar:baz" ":" "-1" <<EOF
+3
+[0]: "foo"
+[1]: "bar"
+[2]: "baz"
+EOF
+
+string_list_split_in_place "foo:bar:baz" ":" "0" <<EOF
+3
+[0]: "foo"
+[1]: "bar"
+[2]: "baz"
+EOF
+
+string_list_split_in_place "foo:bar:baz" ":" "1" <<EOF
+2
+[0]: "foo"
+[1]: "bar:baz"
+EOF
+
+string_list_split_in_place "foo:bar:baz" ":" "2" <<EOF
+3
+[0]: "foo"
+[1]: "bar"
+[2]: "baz"
+EOF
+
+string_list_split_in_place "foo:bar:" ":" "-1" <<EOF
+3
+[0]: "foo"
+[1]: "bar"
+[2]: ""
+EOF
+
+string_list_split_in_place "" ":" "-1" <<EOF
+1
+[0]: ""
+EOF
+
+string_list_split_in_place ":" ":" "-1" <<EOF
+2
+[0]: ""
+[1]: ""
+EOF
+
+test_done
diff --git a/test-string-list.c b/test-string-list.c
new file mode 100644
index 0000000..f08d3cc
--- /dev/null
+++ b/test-string-list.c
@@ -0,0 +1,25 @@
+#include "cache.h"
+#include "string-list.h"
+
+int main(int argc, char **argv)
+{
+	if ((argc == 4 || argc == 5) && !strcmp(argv[1], "split_in_place")) {
+		struct string_list list = STRING_LIST_INIT_NODUP;
+		int i;
+		char *s = xstrdup(argv[2]);
+		int delim = *argv[3];
+		int maxsplit = (argc == 5) ? atoi(argv[4]) : -1;
+
+		i = string_list_split_in_place(&list, s, delim, maxsplit);
+		printf("%d\n", i);
+		for (i = 0; i < list.nr; i++)
+			printf("[%d]: \"%s\"\n", i, list.items[i].string);
+		string_list_clear(&list, 0);
+		free(s);
+		return 0;
+	}
+
+	fprintf(stderr, "%s: unknown function name: %s\n", argv[0],
+		argv[1] ? argv[1] : "(there was none)");
+	return 1;
+}
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/4] Add a new function, filter_string_list()
  2012-09-09  5:53 [PATCH 0/4] Add some string_list-related functions Michael Haggerty
  2012-09-09  5:53 ` [PATCH 1/4] Add a new function, string_list_split_in_place() Michael Haggerty
@ 2012-09-09  5:53 ` Michael Haggerty
  2012-09-09  9:40   ` Junio C Hamano
  2012-09-09  5:53 ` [PATCH 3/4] Add a new function, string_list_remove_duplicates() Michael Haggerty
  2012-09-09  5:53 ` [PATCH 4/4] Add a function string_list_longest_prefix() Michael Haggerty
  3 siblings, 1 reply; 23+ messages in thread
From: Michael Haggerty @ 2012-09-09  5:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty


Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 Documentation/technical/api-string-list.txt |  8 ++++++++
 string-list.c                               | 17 +++++++++++++++++
 string-list.h                               |  9 +++++++++
 3 files changed, 34 insertions(+)

diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt
index 3b959a2..15b8072 100644
--- a/Documentation/technical/api-string-list.txt
+++ b/Documentation/technical/api-string-list.txt
@@ -60,6 +60,14 @@ Functions
 
 * General ones (works with sorted and unsorted lists as well)
 
+`filter_string_list`::
+
+	Apply a function to each item in a list, retaining only the
+	items for which the function returns true.  If free_util is
+	true, call free() on the util members of any items that have
+	to be deleted.  Preserve the order of the items that are
+	retained.
+
 `print_string_list`::
 
 	Dump a string_list to stdout, useful mainly for debugging purposes. It
diff --git a/string-list.c b/string-list.c
index 110449c..72610ce 100644
--- a/string-list.c
+++ b/string-list.c
@@ -102,6 +102,23 @@ int for_each_string_list(struct string_list *list,
 	return ret;
 }
 
+void filter_string_list(struct string_list *list, int free_util,
+			string_list_each_func_t fn, void *cb_data)
+{
+	int src, dst = 0;
+	for (src = 0; src < list->nr; src++) {
+		if (fn(&list->items[src], cb_data)) {
+			list->items[dst++] = list->items[src];
+		} else {
+			if (list->strdup_strings)
+				free(list->items[src].string);
+			if (free_util)
+				free(list->items[src].util);
+		}
+	}
+	list->nr = dst;
+}
+
 void string_list_clear(struct string_list *list, int free_util)
 {
 	if (list->items) {
diff --git a/string-list.h b/string-list.h
index 7e51d03..84996aa 100644
--- a/string-list.h
+++ b/string-list.h
@@ -29,6 +29,15 @@ int for_each_string_list(struct string_list *list,
 #define for_each_string_list_item(item,list) \
 	for (item = (list)->items; item < (list)->items + (list)->nr; ++item)
 
+/*
+ * Apply fn to each item in list, retaining only the ones for which
+ * the function returns true.  If free_util is true, call free() on
+ * the util members of any items that have to be deleted.  Preserve
+ * the order of the items that are retained.
+ */
+void filter_string_list(struct string_list *list, int free_util,
+			string_list_each_func_t fn, void *cb_data);
+
 /* Use these functions only on sorted lists: */
 int string_list_has_string(const struct string_list *list, const char *string);
 int string_list_find_insert_index(const struct string_list *list, const char *string,
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/4] Add a new function, string_list_remove_duplicates()
  2012-09-09  5:53 [PATCH 0/4] Add some string_list-related functions Michael Haggerty
  2012-09-09  5:53 ` [PATCH 1/4] Add a new function, string_list_split_in_place() Michael Haggerty
  2012-09-09  5:53 ` [PATCH 2/4] Add a new function, filter_string_list() Michael Haggerty
@ 2012-09-09  5:53 ` Michael Haggerty
  2012-09-09  9:45   ` Junio C Hamano
  2012-09-09  5:53 ` [PATCH 4/4] Add a function string_list_longest_prefix() Michael Haggerty
  3 siblings, 1 reply; 23+ messages in thread
From: Michael Haggerty @ 2012-09-09  5:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty


Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 Documentation/technical/api-string-list.txt |  4 ++++
 string-list.c                               | 17 +++++++++++++++++
 string-list.h                               |  5 +++++
 3 files changed, 26 insertions(+)

diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt
index 15b8072..9206f8f 100644
--- a/Documentation/technical/api-string-list.txt
+++ b/Documentation/technical/api-string-list.txt
@@ -104,6 +104,10 @@ write `string_list_insert(...)->util = ...;`.
 	Look up a given string in the string_list, returning the containing
 	string_list_item. If the string is not found, NULL is returned.
 
+`string_list_remove_duplicates`::
+
+	Remove all but the first entry that has a given string value.
+
 * Functions for unsorted lists only
 
 `string_list_append`::
diff --git a/string-list.c b/string-list.c
index 72610ce..bfef6cf 100644
--- a/string-list.c
+++ b/string-list.c
@@ -92,6 +92,23 @@ struct string_list_item *string_list_lookup(struct string_list *list, const char
 	return list->items + i;
 }
 
+void string_list_remove_duplicates(struct string_list *list, int free_util)
+{
+	if (list->nr > 1) {
+		int src, dst;
+		for (src = dst = 1; src < list->nr; src++) {
+			if (!strcmp(list->items[dst - 1].string, list->items[src].string)) {
+				if (list->strdup_strings)
+					free(list->items[src].string);
+				if (free_util)
+					free(list->items[src].util);
+			} else
+				list->items[dst++] = list->items[src];
+		}
+		list->nr = dst;
+	}
+}
+
 int for_each_string_list(struct string_list *list,
 			 string_list_each_func_t fn, void *cb_data)
 {
diff --git a/string-list.h b/string-list.h
index 84996aa..c4dc659 100644
--- a/string-list.h
+++ b/string-list.h
@@ -38,6 +38,7 @@ int for_each_string_list(struct string_list *list,
 void filter_string_list(struct string_list *list, int free_util,
 			string_list_each_func_t fn, void *cb_data);
 
+
 /* Use these functions only on sorted lists: */
 int string_list_has_string(const struct string_list *list, const char *string);
 int string_list_find_insert_index(const struct string_list *list, const char *string,
@@ -47,6 +48,10 @@ struct string_list_item *string_list_insert_at_index(struct string_list *list,
 						     int insert_at, const char *string);
 struct string_list_item *string_list_lookup(struct string_list *list, const char *string);
 
+/* Remove all but the first entry that has a given string value. */
+void string_list_remove_duplicates(struct string_list *list, int free_util);
+
+
 /* Use these functions only on unsorted lists: */
 struct string_list_item *string_list_append(struct string_list *list, const char *string);
 void sort_string_list(struct string_list *list);
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 4/4] Add a function string_list_longest_prefix()
  2012-09-09  5:53 [PATCH 0/4] Add some string_list-related functions Michael Haggerty
                   ` (2 preceding siblings ...)
  2012-09-09  5:53 ` [PATCH 3/4] Add a new function, string_list_remove_duplicates() Michael Haggerty
@ 2012-09-09  5:53 ` Michael Haggerty
  2012-09-09  9:54   ` Junio C Hamano
  3 siblings, 1 reply; 23+ messages in thread
From: Michael Haggerty @ 2012-09-09  5:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty


Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 Documentation/technical/api-string-list.txt |  8 ++++++++
 string-list.c                               | 20 +++++++++++++++++++
 string-list.h                               |  8 ++++++++
 t/t0063-string-list.sh                      | 30 +++++++++++++++++++++++++++++
 test-string-list.c                          | 22 +++++++++++++++++++++
 5 files changed, 88 insertions(+)

diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt
index 9206f8f..291ac4c 100644
--- a/Documentation/technical/api-string-list.txt
+++ b/Documentation/technical/api-string-list.txt
@@ -68,6 +68,14 @@ Functions
 	to be deleted.  Preserve the order of the items that are
 	retained.
 
+`string_list_longest_prefix`::
+
+	Return the longest string within a string_list that is a
+	prefix (in the sense of prefixcmp()) of the specified string,
+	or NULL if no such prefix exists.  This function does not
+	require the string_list to be sorted (it does a linear
+	search).
+
 `print_string_list`::
 
 	Dump a string_list to stdout, useful mainly for debugging purposes. It
diff --git a/string-list.c b/string-list.c
index bfef6cf..043f6c4 100644
--- a/string-list.c
+++ b/string-list.c
@@ -136,6 +136,26 @@ void filter_string_list(struct string_list *list, int free_util,
 	list->nr = dst;
 }
 
+char *string_list_longest_prefix(const struct string_list *prefixes,
+				 const char *string)
+{
+	int i, max_len = -1;
+	char *retval = NULL;
+
+	for (i = 0; i < prefixes->nr; i++) {
+		char *prefix = prefixes->items[i].string;
+		if (!prefixcmp(string, prefix)) {
+			int len = strlen(prefix);
+			if (len > max_len) {
+				retval = prefix;
+				max_len = len;
+			}
+		}
+	}
+
+	return retval;
+}
+
 void string_list_clear(struct string_list *list, int free_util)
 {
 	if (list->items) {
diff --git a/string-list.h b/string-list.h
index c4dc659..680916c 100644
--- a/string-list.h
+++ b/string-list.h
@@ -38,6 +38,14 @@ int for_each_string_list(struct string_list *list,
 void filter_string_list(struct string_list *list, int free_util,
 			string_list_each_func_t fn, void *cb_data);
 
+/*
+ * Return the longest string in prefixes that is a prefix (in the
+ * sense of prefixcmp()) of string, or NULL if no such prefix exists.
+ * This function does not require the string_list to be sorted (it
+ * does a linear search).
+ */
+char *string_list_longest_prefix(const struct string_list *prefixes, const char *string);
+
 
 /* Use these functions only on sorted lists: */
 int string_list_has_string(const struct string_list *list, const char *string);
diff --git a/t/t0063-string-list.sh b/t/t0063-string-list.sh
index 0eede83..fa96eba 100755
--- a/t/t0063-string-list.sh
+++ b/t/t0063-string-list.sh
@@ -15,6 +15,14 @@ string_list_split_in_place() {
 	"
 }
 
+longest_prefix() {
+	test "$(test-string-list longest_prefix "$1" "$2")" = "$3"
+}
+
+no_longest_prefix() {
+	test_must_fail test-string-list longest_prefix "$1" "$2"
+}
+
 string_list_split_in_place "foo:bar:baz" ":" "-1" <<EOF
 3
 [0]: "foo"
@@ -60,4 +68,26 @@ string_list_split_in_place ":" ":" "-1" <<EOF
 [1]: ""
 EOF
 
+test_expect_success "test longest_prefix" '
+	no_longest_prefix - '' &&
+	no_longest_prefix - x &&
+	longest_prefix "" x "" &&
+	longest_prefix x x x &&
+	longest_prefix "" foo "" &&
+	longest_prefix : foo "" &&
+	longest_prefix f foo f &&
+	longest_prefix foo foobar foo &&
+	longest_prefix foo foo foo &&
+	no_longest_prefix bar foo &&
+	no_longest_prefix bar:bar foo &&
+	no_longest_prefix foobar foo &&
+	longest_prefix foo:bar foo foo &&
+	longest_prefix foo:bar bar bar &&
+	longest_prefix foo::bar foo foo &&
+	longest_prefix foo:foobar foo foo &&
+	longest_prefix foobar:foo foo foo &&
+	longest_prefix foo: bar "" &&
+	longest_prefix :foo bar ""
+'
+
 test_done
diff --git a/test-string-list.c b/test-string-list.c
index f08d3cc..c7e71f2 100644
--- a/test-string-list.c
+++ b/test-string-list.c
@@ -19,6 +19,28 @@ int main(int argc, char **argv)
 		return 0;
 	}
 
+	if (argc == 4 && !strcmp(argv[1], "longest_prefix")) {
+		/* arguments: <colon-separated-prefixes>|- <string> */
+		struct string_list prefixes = STRING_LIST_INIT_NODUP;
+		int retval;
+		char *prefix_string = xstrdup(argv[2]);
+		char *string = argv[3];
+		char *match;
+
+		if (strcmp(prefix_string, "-"))
+			string_list_split_in_place(&prefixes, prefix_string, ':', -1);
+		match = string_list_longest_prefix(&prefixes, string);
+		if (match) {
+			printf("%s\n", match);
+			retval = 0;
+		}
+		else
+			retval = 1;
+		string_list_clear(&prefixes, 0);
+		free(prefix_string);
+		return retval;
+	}
+
 	fprintf(stderr, "%s: unknown function name: %s\n", argv[0],
 		argv[1] ? argv[1] : "(there was none)");
 	return 1;
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/4] Add a new function, string_list_split_in_place()
  2012-09-09  5:53 ` [PATCH 1/4] Add a new function, string_list_split_in_place() Michael Haggerty
@ 2012-09-09  9:35   ` Junio C Hamano
  2012-09-10  4:45     ` Michael Haggerty
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2012-09-09  9:35 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> Split a string into a string_list on a separator character.
>
> This is similar to the strbuf_split_*() functions except that it works
> with the more powerful string_list interface.  If strdup_strings is
> false, it reuses the memory from the input string (thereby needing no
> string memory allocations, though of course allocations are still
> needed for the string_list_items array).
>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>
> In the tests, I use here documents to specify the expected output.  Is
> this OK?  (It is certainly convenient.)

I offhand do not have an objection to that style, but it looked
funny to see the helper test function "string_list_split_in_place"
named without "test_" prefix.  Maybe it's just me.

> diff --git a/.gitignore b/.gitignore
> index bb5c91e..0ca7df8 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -193,6 +193,7 @@
>  /test-run-command
>  /test-sha1
>  /test-sigchain
> +/test-string-list
>  /test-subprocess
>  /test-svn-fe
>  /common-cmds.h
> diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt
> index 5a0c14f..3b959a2 100644
> --- a/Documentation/technical/api-string-list.txt
> +++ b/Documentation/technical/api-string-list.txt
> @@ -124,6 +124,18 @@ counterpart for sorted lists, which performs a binary search.
>  	is set. The third parameter controls if the `util` pointer of the
>  	items should be freed or not.
>  
> +`string_list_split_in_place`::
> +
> +	Split string into substrings on character delim and append the
> +	substrings to a string_list.  The delimiter characters in
> +	string are overwritten with NULs in the process.  If maxsplit
> +	is a positive integer, then split at most maxsplit times.  If

So passing 0 is the natural way to say "pay attention to all the
delimiters", which matches my intuition.

> +	list.strdup_strings is not set, then the new string_list_items
> +	point into string, which therefore must not be modified or
> +	freed while the string_list is in use.  Return the number of
> +	substrings appended to the list.

I am not sure about this strdup_strings business; it smells somewhat
fishy from the API design point of view.

If you are not mucking with the input string and not splitting in
place, it would not be possible to do this without strdup_strings,
but if you are doing the in-place splitting, is there any reason for
the caller to ask for strdup_strings?

In such a case, the reason the caller cannot promise that the input
string will not go away to the string_list (hence it has to ask to
make a copy) is because it does not own the string in the first
place, and in such a case, I would imagine it cannot allow the
delimiters in the string to be overwritten with NULs.

I would sort-of-kind-of understand a function "string_list_split"
that bases its decision to do an in-place splitting or not on the
strdup_strings flag in the string list that was passed in.  But it
would make the use of the function a bit limited (e.g. you cannot
sanely mix and match tokens from different kind of strings).

> diff --git a/string-list.c b/string-list.c
> index d9810ab..110449c 100644
> --- a/string-list.c
> +++ b/string-list.c
> @@ -194,3 +194,26 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
>  	list->items[i] = list->items[list->nr-1];
>  	list->nr--;
>  }
> +
> +int string_list_split_in_place(struct string_list *list, char *string,
> +			       int delim, int maxsplit)
> +{
> +	int count = 0;
> +	char *p = string, *end;

Blank line here.

> +	for (;;) {
> +		count++;
> +		if (maxsplit > 0 && count > maxsplit) {
> +			string_list_append(list, p);
> +			return count;
> +		}
> +		end = strchr(p, delim);
> +		if (end) {
> +			*end = '\0';
> +			string_list_append(list, p);
> +			p = end + 1;
> +		} else {
> +			string_list_append(list, p);
> +			return count;
> +		}
> +	}
> +}
> diff --git a/string-list.h b/string-list.h
> index 0684cb7..7e51d03 100644
> --- a/string-list.h
> +++ b/string-list.h
> @@ -45,4 +45,23 @@ int unsorted_string_list_has_string(struct string_list *list, const char *string
>  struct string_list_item *unsorted_string_list_lookup(struct string_list *list,
>  						     const char *string);
>  void unsorted_string_list_delete_item(struct string_list *list, int i, int free_util);
> +
> +/*
> + * Split string into substrings on character delim and append the
> + * substrings to list.  The delimiter characters in string are
> + * overwritten with NULs in the process.  If maxsplit is a positive
> + * integer, then split at most maxsplit times.  If list.strdup_strings
> + * is not set, then the new string_list_items point into string, which
> + * therefore must not be modified or freed while the string_list
> + * is in use.  Return the number of substrings appended to list.
> + *
> + * Examples:
> + *   string_list_split_in_place(l, "foo:bar:baz", ':', -1) -> ["foo", "bar", "baz"]
> + *   string_list_split_in_place(l, "foo:bar:baz", ':', 1) -> ["foo", "bar:baz"]
> + *   string_list_split_in_place(l, "foo:bar:", ':', -1) -> ["foo", "bar", ""]

I would find it more natural to see a sentinel value against
"positive" to be 0, not -1.  "-1" gives an impression as if "-2"
might do something different from "-1", but Zero is a lot more
special.

> diff --git a/t/t0063-string-list.sh b/t/t0063-string-list.sh
> new file mode 100755
> index 0000000..0eede83
> --- /dev/null
> +++ b/t/t0063-string-list.sh
> @@ -0,0 +1,63 @@
> +#!/bin/sh
> +#
> +# Copyright (c) 2012 Michael Haggerty
> +#
> +
> +test_description='Test string list functionality'
> +
> +. ./test-lib.sh
> +
> +string_list_split_in_place() {

SP before ()?

> +	cat >split-expected &&
> +	test_expect_success "split $1 $2 $3" "

"split '$1' at '$2', max $3", or something?

> +		test-string-list split_in_place '$1' '$2' '$3' >split-actual &&
> +		test_cmp split-expected split-actual
> +	"
> +}

What is it buying us to use "split-" before expected and actual to
make things "unusual" from many other tests?

> +string_list_split_in_place "foo:bar:baz" ":" "-1" <<EOF

Likewise about "-1" (I think your test helper does not even require
"-1" to be spelled out here).

> +3
> +[0]: "foo"
> +[1]: "bar"
> +[2]: "baz"
> +EOF
> +
> +string_list_split_in_place "foo:bar:baz" ":" "0" <<EOF
> +3
> +[0]: "foo"
> +[1]: "bar"
> +[2]: "baz"
> +EOF
> +
> +string_list_split_in_place "foo:bar:baz" ":" "1" <<EOF
> +2
> +[0]: "foo"
> +[1]: "bar:baz"
> +EOF
> +
> +string_list_split_in_place "foo:bar:baz" ":" "2" <<EOF
> +3
> +[0]: "foo"
> +[1]: "bar"
> +[2]: "baz"
> +EOF
> +
> +string_list_split_in_place "foo:bar:" ":" "-1" <<EOF
> +3
> +[0]: "foo"
> +[1]: "bar"
> +[2]: ""
> +EOF
> +
> +string_list_split_in_place "" ":" "-1" <<EOF
> +1
> +[0]: ""
> +EOF
> +
> +string_list_split_in_place ":" ":" "-1" <<EOF
> +2
> +[0]: ""
> +[1]: ""
> +EOF
> +
> +test_done
> diff --git a/test-string-list.c b/test-string-list.c
> new file mode 100644
> index 0000000..f08d3cc
> --- /dev/null
> +++ b/test-string-list.c
> @@ -0,0 +1,25 @@
> +#include "cache.h"
> +#include "string-list.h"
> +
> +int main(int argc, char **argv)
> +{
> +	if ((argc == 4 || argc == 5) && !strcmp(argv[1], "split_in_place")) {
> +		struct string_list list = STRING_LIST_INIT_NODUP;
> +		int i;
> +		char *s = xstrdup(argv[2]);
> +		int delim = *argv[3];
> +		int maxsplit = (argc == 5) ? atoi(argv[4]) : -1;

Likewise on "-1".

> +		i = string_list_split_in_place(&list, s, delim, maxsplit);
> +		printf("%d\n", i);
> +		for (i = 0; i < list.nr; i++)
> +			printf("[%d]: \"%s\"\n", i, list.items[i].string);
> +		string_list_clear(&list, 0);
> +		free(s);
> +		return 0;
> +	}
> +
> +	fprintf(stderr, "%s: unknown function name: %s\n", argv[0],
> +		argv[1] ? argv[1] : "(there was none)");
> +	return 1;
> +}

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/4] Add a new function, filter_string_list()
  2012-09-09  5:53 ` [PATCH 2/4] Add a new function, filter_string_list() Michael Haggerty
@ 2012-09-09  9:40   ` Junio C Hamano
  2012-09-10  8:58     ` Michael Haggerty
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2012-09-09  9:40 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>  Documentation/technical/api-string-list.txt |  8 ++++++++
>  string-list.c                               | 17 +++++++++++++++++
>  string-list.h                               |  9 +++++++++
>  3 files changed, 34 insertions(+)
>
> diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt
> index 3b959a2..15b8072 100644
> --- a/Documentation/technical/api-string-list.txt
> +++ b/Documentation/technical/api-string-list.txt
> @@ -60,6 +60,14 @@ Functions
>  
>  * General ones (works with sorted and unsorted lists as well)
>  
> +`filter_string_list`::
> +
> +	Apply a function to each item in a list, retaining only the
> +	items for which the function returns true.  If free_util is
> +	true, call free() on the util members of any items that have
> +	to be deleted.  Preserve the order of the items that are
> +	retained.

In other words, this can safely be used on both sorted and unsorted
string list.  Good.

>  `print_string_list`::
>  
>  	Dump a string_list to stdout, useful mainly for debugging purposes. It
> diff --git a/string-list.c b/string-list.c
> index 110449c..72610ce 100644
> --- a/string-list.c
> +++ b/string-list.c
> @@ -102,6 +102,23 @@ int for_each_string_list(struct string_list *list,
>  	return ret;
>  }
>  
> +void filter_string_list(struct string_list *list, int free_util,
> +			string_list_each_func_t fn, void *cb_data)
> +{
> +	int src, dst = 0;
> +	for (src = 0; src < list->nr; src++) {
> +		if (fn(&list->items[src], cb_data)) {
> +			list->items[dst++] = list->items[src];
> +		} else {
> +			if (list->strdup_strings)
> +				free(list->items[src].string);
> +			if (free_util)
> +				free(list->items[src].util);
> +		}
> +	}
> +	list->nr = dst;
> +}
> +
>  void string_list_clear(struct string_list *list, int free_util)
>  {
>  	if (list->items) {
> diff --git a/string-list.h b/string-list.h
> index 7e51d03..84996aa 100644
> --- a/string-list.h
> +++ b/string-list.h
> @@ -29,6 +29,15 @@ int for_each_string_list(struct string_list *list,
>  #define for_each_string_list_item(item,list) \
>  	for (item = (list)->items; item < (list)->items + (list)->nr; ++item)
>  
> +/*
> + * Apply fn to each item in list, retaining only the ones for which
> + * the function returns true.  If free_util is true, call free() on
> + * the util members of any items that have to be deleted.  Preserve
> + * the order of the items that are retained.
> + */
> +void filter_string_list(struct string_list *list, int free_util,
> +			string_list_each_func_t fn, void *cb_data);
> +
>  /* Use these functions only on sorted lists: */
>  int string_list_has_string(const struct string_list *list, const char *string);
>  int string_list_find_insert_index(const struct string_list *list, const char *string,

Having seen that the previous patch introduced a new test helper for
unit testing (which is a very good idea) and dedicated a new test
number, I would have expected to see a new test for filtering
here.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/4] Add a new function, string_list_remove_duplicates()
  2012-09-09  5:53 ` [PATCH 3/4] Add a new function, string_list_remove_duplicates() Michael Haggerty
@ 2012-09-09  9:45   ` Junio C Hamano
  2012-09-10  9:15     ` Michael Haggerty
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2012-09-09  9:45 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>  Documentation/technical/api-string-list.txt |  4 ++++
>  string-list.c                               | 17 +++++++++++++++++
>  string-list.h                               |  5 +++++
>  3 files changed, 26 insertions(+)
>
> diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt
> index 15b8072..9206f8f 100644
> --- a/Documentation/technical/api-string-list.txt
> +++ b/Documentation/technical/api-string-list.txt
> @@ -104,6 +104,10 @@ write `string_list_insert(...)->util = ...;`.
>  	Look up a given string in the string_list, returning the containing
>  	string_list_item. If the string is not found, NULL is returned.
>  
> +`string_list_remove_duplicates`::
> +
> +	Remove all but the first entry that has a given string value.

Unlike the previous patch, free_util is not documented?

It is kind of shame that the string list must be sorted for this
function to work, but I guess you do not have a need for a version
that works on both sorted and unsorted (yet).  We can introduce a
variant with "unsorted_" prefix later when it becomes necessary, so
OK.

>  * Functions for unsorted lists only
>  
>  `string_list_append`::
> diff --git a/string-list.c b/string-list.c
> index 72610ce..bfef6cf 100644
> --- a/string-list.c
> +++ b/string-list.c
> @@ -92,6 +92,23 @@ struct string_list_item *string_list_lookup(struct string_list *list, const char
>  	return list->items + i;
>  }
>  
> +void string_list_remove_duplicates(struct string_list *list, int free_util)
> +{
> +	if (list->nr > 1) {
> +		int src, dst;
> +		for (src = dst = 1; src < list->nr; src++) {
> +			if (!strcmp(list->items[dst - 1].string, list->items[src].string)) {
> +				if (list->strdup_strings)
> +					free(list->items[src].string);
> +				if (free_util)
> +					free(list->items[src].util);
> +			} else
> +				list->items[dst++] = list->items[src];
> +		}
> +		list->nr = dst;
> +	}
> +}
> +
>  int for_each_string_list(struct string_list *list,
>  			 string_list_each_func_t fn, void *cb_data)
>  {
> diff --git a/string-list.h b/string-list.h
> index 84996aa..c4dc659 100644
> --- a/string-list.h
> +++ b/string-list.h
> @@ -38,6 +38,7 @@ int for_each_string_list(struct string_list *list,
>  void filter_string_list(struct string_list *list, int free_util,
>  			string_list_each_func_t fn, void *cb_data);
>  
> +
>  /* Use these functions only on sorted lists: */
>  int string_list_has_string(const struct string_list *list, const char *string);
>  int string_list_find_insert_index(const struct string_list *list, const char *string,
> @@ -47,6 +48,10 @@ struct string_list_item *string_list_insert_at_index(struct string_list *list,
>  						     int insert_at, const char *string);
>  struct string_list_item *string_list_lookup(struct string_list *list, const char *string);
>  
> +/* Remove all but the first entry that has a given string value. */
> +void string_list_remove_duplicates(struct string_list *list, int free_util);
> +
> +
>  /* Use these functions only on unsorted lists: */
>  struct string_list_item *string_list_append(struct string_list *list, const char *string);
>  void sort_string_list(struct string_list *list);

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/4] Add a function string_list_longest_prefix()
  2012-09-09  5:53 ` [PATCH 4/4] Add a function string_list_longest_prefix() Michael Haggerty
@ 2012-09-09  9:54   ` Junio C Hamano
  2012-09-10 10:01     ` Michael Haggerty
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2012-09-09  9:54 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>  Documentation/technical/api-string-list.txt |  8 ++++++++
>  string-list.c                               | 20 +++++++++++++++++++
>  string-list.h                               |  8 ++++++++
>  t/t0063-string-list.sh                      | 30 +++++++++++++++++++++++++++++
>  test-string-list.c                          | 22 +++++++++++++++++++++
>  5 files changed, 88 insertions(+)
>
> diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt
> index 9206f8f..291ac4c 100644
> --- a/Documentation/technical/api-string-list.txt
> +++ b/Documentation/technical/api-string-list.txt
> @@ -68,6 +68,14 @@ Functions
>  	to be deleted.  Preserve the order of the items that are
>  	retained.
>  
> +`string_list_longest_prefix`::
> +
> +	Return the longest string within a string_list that is a
> +	prefix (in the sense of prefixcmp()) of the specified string,
> +	or NULL if no such prefix exists.  This function does not
> +	require the string_list to be sorted (it does a linear
> +	search).
> +
>  `print_string_list`::

This may feel like outside the scope of this series, but since this
series will be the main culprit for adding many new functions to
this API in the recent history...

 - We may want to name things a bit more consistently so that people
   can tell which ones can be called on any string list, which ones
   are sorted list only, and which ones are unsorted one only.

   In addition, the last category _may_ need a bit more thought.
   Calling unsorted_string_list_lookup() on an already sorted list
   is not a crime---it is just a stupid thing to do.

 - Why are these new functions described at the top, not appended at
   the bottom?  I would have expected either an alphabetical, or a
   more generic ones first (i.e. print and clear are a lot "easier"
   ones compared to filter and prefix that are very much more
   specialized).

> diff --git a/string-list.c b/string-list.c
> index bfef6cf..043f6c4 100644
> --- a/string-list.c
> +++ b/string-list.c
> @@ -136,6 +136,26 @@ void filter_string_list(struct string_list *list, int free_util,
>  	list->nr = dst;
>  }
>  
> +char *string_list_longest_prefix(const struct string_list *prefixes,
> +				 const char *string)
> +{
> +	int i, max_len = -1;
> +	char *retval = NULL;
> +
> +	for (i = 0; i < prefixes->nr; i++) {
> +		char *prefix = prefixes->items[i].string;
> +		if (!prefixcmp(string, prefix)) {
> +			int len = strlen(prefix);
> +			if (len > max_len) {
> +				retval = prefix;
> +				max_len = len;
> +			}
> +		}
> +	}
> +
> +	return retval;
> +}
> +
>  void string_list_clear(struct string_list *list, int free_util)
>  {
>  	if (list->items) {
> diff --git a/string-list.h b/string-list.h
> index c4dc659..680916c 100644
> --- a/string-list.h
> +++ b/string-list.h
> @@ -38,6 +38,14 @@ int for_each_string_list(struct string_list *list,
>  void filter_string_list(struct string_list *list, int free_util,
>  			string_list_each_func_t fn, void *cb_data);
>  
> +/*
> + * Return the longest string in prefixes that is a prefix (in the
> + * sense of prefixcmp()) of string, or NULL if no such prefix exists.
> + * This function does not require the string_list to be sorted (it
> + * does a linear search).
> + */
> +char *string_list_longest_prefix(const struct string_list *prefixes, const char *string);
> +
>  
>  /* Use these functions only on sorted lists: */
>  int string_list_has_string(const struct string_list *list, const char *string);
> diff --git a/t/t0063-string-list.sh b/t/t0063-string-list.sh
> index 0eede83..fa96eba 100755
> --- a/t/t0063-string-list.sh
> +++ b/t/t0063-string-list.sh
> @@ -15,6 +15,14 @@ string_list_split_in_place() {
>  	"
>  }
>  
> +longest_prefix() {
> +	test "$(test-string-list longest_prefix "$1" "$2")" = "$3"
> +}
> +
> +no_longest_prefix() {
> +	test_must_fail test-string-list longest_prefix "$1" "$2"
> +}
> +
>  string_list_split_in_place "foo:bar:baz" ":" "-1" <<EOF
>  3
>  [0]: "foo"
> @@ -60,4 +68,26 @@ string_list_split_in_place ":" ":" "-1" <<EOF
>  [1]: ""
>  EOF
>  
> +test_expect_success "test longest_prefix" '
> +	no_longest_prefix - '' &&
> +	no_longest_prefix - x &&
> +	longest_prefix "" x "" &&
> +	longest_prefix x x x &&
> +	longest_prefix "" foo "" &&
> +	longest_prefix : foo "" &&
> +	longest_prefix f foo f &&
> +	longest_prefix foo foobar foo &&
> +	longest_prefix foo foo foo &&
> +	no_longest_prefix bar foo &&
> +	no_longest_prefix bar:bar foo &&
> +	no_longest_prefix foobar foo &&
> +	longest_prefix foo:bar foo foo &&
> +	longest_prefix foo:bar bar bar &&
> +	longest_prefix foo::bar foo foo &&
> +	longest_prefix foo:foobar foo foo &&
> +	longest_prefix foobar:foo foo foo &&
> +	longest_prefix foo: bar "" &&
> +	longest_prefix :foo bar ""
> +'
> +
>  test_done
> diff --git a/test-string-list.c b/test-string-list.c
> index f08d3cc..c7e71f2 100644
> --- a/test-string-list.c
> +++ b/test-string-list.c
> @@ -19,6 +19,28 @@ int main(int argc, char **argv)
>  		return 0;
>  	}
>  
> +	if (argc == 4 && !strcmp(argv[1], "longest_prefix")) {
> +		/* arguments: <colon-separated-prefixes>|- <string> */
> +		struct string_list prefixes = STRING_LIST_INIT_NODUP;
> +		int retval;
> +		char *prefix_string = xstrdup(argv[2]);
> +		char *string = argv[3];
> +		char *match;
> +
> +		if (strcmp(prefix_string, "-"))
> +			string_list_split_in_place(&prefixes, prefix_string, ':', -1);
> +		match = string_list_longest_prefix(&prefixes, string);
> +		if (match) {
> +			printf("%s\n", match);
> +			retval = 0;
> +		}
> +		else
> +			retval = 1;
> +		string_list_clear(&prefixes, 0);
> +		free(prefix_string);
> +		return retval;
> +	}
> +
>  	fprintf(stderr, "%s: unknown function name: %s\n", argv[0],
>  		argv[1] ? argv[1] : "(there was none)");
>  	return 1;

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/4] Add a new function, string_list_split_in_place()
  2012-09-09  9:35   ` Junio C Hamano
@ 2012-09-10  4:45     ` Michael Haggerty
  2012-09-10  5:47       ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Michael Haggerty @ 2012-09-10  4:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 09/09/2012 11:35 AM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> Split a string into a string_list on a separator character.
>>
>> This is similar to the strbuf_split_*() functions except that it works
>> with the more powerful string_list interface.  If strdup_strings is
>> false, it reuses the memory from the input string (thereby needing no
>> string memory allocations, though of course allocations are still
>> needed for the string_list_items array).
>>
>> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
>> ---
>>
>> In the tests, I use here documents to specify the expected output.  Is
>> this OK?  (It is certainly convenient.)
> 
> I offhand do not have an objection to that style, but it looked
> funny to see the helper test function "string_list_split_in_place"
> named without "test_" prefix.  Maybe it's just me.

I renamed the test function to "test_split_in_place".  (The
"string_list" part can be dispensed with in a test called
t0063-string-list.sh, I think.)

>> diff --git a/.gitignore b/.gitignore
>> index bb5c91e..0ca7df8 100644
>> --- a/.gitignore
>> +++ b/.gitignore
>> @@ -193,6 +193,7 @@
>>  /test-run-command
>>  /test-sha1
>>  /test-sigchain
>> +/test-string-list
>>  /test-subprocess
>>  /test-svn-fe
>>  /common-cmds.h
>> diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt
>> index 5a0c14f..3b959a2 100644
>> --- a/Documentation/technical/api-string-list.txt
>> +++ b/Documentation/technical/api-string-list.txt
>> @@ -124,6 +124,18 @@ counterpart for sorted lists, which performs a binary search.
>>  	is set. The third parameter controls if the `util` pointer of the
>>  	items should be freed or not.
>>  
>> +`string_list_split_in_place`::
>> +
>> +	Split string into substrings on character delim and append the
>> +	substrings to a string_list.  The delimiter characters in
>> +	string are overwritten with NULs in the process.  If maxsplit
>> +	is a positive integer, then split at most maxsplit times.  If
> 
> So passing 0 is the natural way to say "pay attention to all the
> delimiters", which matches my intuition.

See below.

>> +	list.strdup_strings is not set, then the new string_list_items
>> +	point into string, which therefore must not be modified or
>> +	freed while the string_list is in use.  Return the number of
>> +	substrings appended to the list.
> 
> I am not sure about this strdup_strings business; it smells somewhat
> fishy from the API design point of view.
> 
> If you are not mucking with the input string and not splitting in
> place, it would not be possible to do this without strdup_strings,
> but if you are doing the in-place splitting, is there any reason for
> the caller to ask for strdup_strings?
>
> In such a case, the reason the caller cannot promise that the input
> string will not go away to the string_list (hence it has to ask to
> make a copy) is because it does not own the string in the first
> place, and in such a case, I would imagine it cannot allow the
> delimiters in the string to be overwritten with NULs.

On one level it is immaterial whether this is a sensible usage; the
caller *can* set strdup_strings on any string_list and pass that
string_list into the function.  In that case the function *has* to
strdup the strings because when strdup_strings is set, the string_list
functions are allowed to free() a string whenever they want.

But there are situations when this usage is also convenient; namely when
the lifetime of the string being split is shorter than the lifetime of
the string_list.  Consider something like

    struct string_list *split_file_into_words(FILE *f)
    {
        char buf[1024];
        struct string_list *list = new string list;
        list->strdup_strings = 1;
        while (not EOF) {
            read_line_into_buf();
            string_list_split_in_place(list, buf, ' ', -1);
        }
        return list;
    }

Consider also that the string_list passed to
string_list_split_in_place() might already have contents from elsewhere.
 Since a string_list has a single allocation policy covering all of its
entries, it might be more convenient for the caller to allow the split
strings to be copied than to change the allocation policy of the
pre-inserted strings.

> I would sort-of-kind-of understand a function "string_list_split"
> that bases its decision to do an in-place splitting or not on the
> strdup_strings flag in the string list that was passed in.  But it
> would make the use of the function a bit limited (e.g. you cannot
> sanely mix and match tokens from different kind of strings).

Here is my thinking:

I want to have a very low overhead way to use string_list, without a
memory allocation per string (so that people don't have an excuse to
avoid the API).  Thus the "in-place" option.

It would be easy (and have no run-time cost) to change the split
function to *not* modify the input string if strdup_strings is set on
the string_list.  But (1) I don't think it is a good idea for the
handling of the string argument to depend on an option buried in another
parameters, and (2) the string argument could not be declared "const",
so many situations when this variant were useful would require const to
be cast away.

So if/when such a need arises, I think it would be better to invent a
new string_list_split() variant that *never* overwrites its input string
(though this function could *only* work correctly if strdup_strings is set).

>> diff --git a/string-list.c b/string-list.c
>> index d9810ab..110449c 100644
>> --- a/string-list.c
>> +++ b/string-list.c
>> @@ -194,3 +194,26 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
>>  	list->items[i] = list->items[list->nr-1];
>>  	list->nr--;
>>  }
>> +
>> +int string_list_split_in_place(struct string_list *list, char *string,
>> +			       int delim, int maxsplit)
>> +{
>> +	int count = 0;
>> +	char *p = string, *end;
> 
> Blank line here.

Thanks.

>> +	for (;;) {
>> +		count++;
>> +		if (maxsplit > 0 && count > maxsplit) {
>> +			string_list_append(list, p);
>> +			return count;
>> +		}
>> +		end = strchr(p, delim);
>> +		if (end) {
>> +			*end = '\0';
>> +			string_list_append(list, p);
>> +			p = end + 1;
>> +		} else {
>> +			string_list_append(list, p);
>> +			return count;
>> +		}
>> +	}
>> +}
>> diff --git a/string-list.h b/string-list.h
>> index 0684cb7..7e51d03 100644
>> --- a/string-list.h
>> +++ b/string-list.h
>> @@ -45,4 +45,23 @@ int unsorted_string_list_has_string(struct string_list *list, const char *string
>>  struct string_list_item *unsorted_string_list_lookup(struct string_list *list,
>>  						     const char *string);
>>  void unsorted_string_list_delete_item(struct string_list *list, int i, int free_util);
>> +
>> +/*
>> + * Split string into substrings on character delim and append the
>> + * substrings to list.  The delimiter characters in string are
>> + * overwritten with NULs in the process.  If maxsplit is a positive
>> + * integer, then split at most maxsplit times.  If list.strdup_strings
>> + * is not set, then the new string_list_items point into string, which
>> + * therefore must not be modified or freed while the string_list
>> + * is in use.  Return the number of substrings appended to list.
>> + *
>> + * Examples:
>> + *   string_list_split_in_place(l, "foo:bar:baz", ':', -1) -> ["foo", "bar", "baz"]
>> + *   string_list_split_in_place(l, "foo:bar:baz", ':', 1) -> ["foo", "bar:baz"]
>> + *   string_list_split_in_place(l, "foo:bar:", ':', -1) -> ["foo", "bar", ""]
> 
> I would find it more natural to see a sentinel value against
> "positive" to be 0, not -1.  "-1" gives an impression as if "-2"
> might do something different from "-1", but Zero is a lot more
> special.

You have raised a good point and I think there is a flaw in the API, but
I'm not sure I agree with you what the flaw is...

The "maxsplit" argument limits the number of times the string should be
split.  I.e., if maxsplit is set, then the output will have at most
(maxsplit + 1) strings.

When I used "-1" as the special value for "unlimited", I was
subconsciously thinking that "0" is an imaginable edge case; it could
mean "don't split string at all"; i.e., return the whole input as the
single string in the output string_list.  This would be a kindof silly
usage, but might perhaps remove the need to handle 0 specially at a
caller if "maxsplit" is derived from another source.

But of course my definition of string_list_split_in_place() *doesn't*
treat 0 this way.  I think *that* was my mistake.  So I propose to
change the meaning of maxsplit to

    If maxsplit is a non-negative integer, then split at most maxsplit
    times.

and to continue using -1 as the "special" value for unlimited splits.
Are you OK with that?

>> diff --git a/t/t0063-string-list.sh b/t/t0063-string-list.sh
>> new file mode 100755
>> index 0000000..0eede83
>> --- /dev/null
>> +++ b/t/t0063-string-list.sh
>> @@ -0,0 +1,63 @@
>> +#!/bin/sh
>> +#
>> +# Copyright (c) 2012 Michael Haggerty
>> +#
>> +
>> +test_description='Test string list functionality'
>> +
>> +. ./test-lib.sh
>> +
>> +string_list_split_in_place() {
> 
> SP before ()?

Thanks.

>> +	cat >split-expected &&
>> +	test_expect_success "split $1 $2 $3" "
> 
> "split '$1' at '$2', max $3", or something?

Good idea.

>> +		test-string-list split_in_place '$1' '$2' '$3' >split-actual &&
>> +		test_cmp split-expected split-actual
>> +	"
>> +}
> 
> What is it buying us to use "split-" before expected and actual to
> make things "unusual" from many other tests?

Nothing, I guess.  Changed.

>> +string_list_split_in_place "foo:bar:baz" ":" "-1" <<EOF
> 
> Likewise about "-1" (I think your test helper does not even require
> "-1" to be spelled out here).

Since we're testing the C API I think it's best that the tests be
explicit about the value to be passed to the function.

Therefore I guess I will change the test program to *require* the
maxsplit option, since testing is its only purpose.

>> +3
>> +[0]: "foo"
>> +[1]: "bar"
>> +[2]: "baz"
>> +EOF
>> +
>> +string_list_split_in_place "foo:bar:baz" ":" "0" <<EOF
>> +3
>> +[0]: "foo"
>> +[1]: "bar"
>> +[2]: "baz"
>> +EOF
>> +
>> +string_list_split_in_place "foo:bar:baz" ":" "1" <<EOF
>> +2
>> +[0]: "foo"
>> +[1]: "bar:baz"
>> +EOF
>> +
>> +string_list_split_in_place "foo:bar:baz" ":" "2" <<EOF
>> +3
>> +[0]: "foo"
>> +[1]: "bar"
>> +[2]: "baz"
>> +EOF
>> +
>> +string_list_split_in_place "foo:bar:" ":" "-1" <<EOF
>> +3
>> +[0]: "foo"
>> +[1]: "bar"
>> +[2]: ""
>> +EOF
>> +
>> +string_list_split_in_place "" ":" "-1" <<EOF
>> +1
>> +[0]: ""
>> +EOF
>> +
>> +string_list_split_in_place ":" ":" "-1" <<EOF
>> +2
>> +[0]: ""
>> +[1]: ""
>> +EOF
>> +
>> +test_done
>> diff --git a/test-string-list.c b/test-string-list.c
>> new file mode 100644
>> index 0000000..f08d3cc
>> --- /dev/null
>> +++ b/test-string-list.c
>> @@ -0,0 +1,25 @@
>> +#include "cache.h"
>> +#include "string-list.h"
>> +
>> +int main(int argc, char **argv)
>> +{
>> +	if ((argc == 4 || argc == 5) && !strcmp(argv[1], "split_in_place")) {
>> +		struct string_list list = STRING_LIST_INIT_NODUP;
>> +		int i;
>> +		char *s = xstrdup(argv[2]);
>> +		int delim = *argv[3];
>> +		int maxsplit = (argc == 5) ? atoi(argv[4]) : -1;
> 
> Likewise on "-1".
> 
>> +		i = string_list_split_in_place(&list, s, delim, maxsplit);
>> +		printf("%d\n", i);
>> +		for (i = 0; i < list.nr; i++)
>> +			printf("[%d]: \"%s\"\n", i, list.items[i].string);
>> +		string_list_clear(&list, 0);
>> +		free(s);
>> +		return 0;
>> +	}
>> +
>> +	fprintf(stderr, "%s: unknown function name: %s\n", argv[0],
>> +		argv[1] ? argv[1] : "(there was none)");
>> +	return 1;
>> +}

Thanks for your helpful feedback!  I will send a revised patch series as
soon as I can.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/4] Add a new function, string_list_split_in_place()
  2012-09-10  4:45     ` Michael Haggerty
@ 2012-09-10  5:47       ` Junio C Hamano
  2012-09-10 11:48         ` Michael Haggerty
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2012-09-10  5:47 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> ...  Consider something like
>
>     struct string_list *split_file_into_words(FILE *f)
>     {
>         char buf[1024];
>         struct string_list *list = new string list;
>         list->strdup_strings = 1;
>         while (not EOF) {
>             read_line_into_buf();
>             string_list_split_in_place(list, buf, ' ', -1);
>         }
>         return list;
>     }

That is a prime example to argue that string_list_split() would make
more sense, no?  The caller does _not_ mind if the function mucks
with buf, but the resulting list is not allowed to point into buf.

In such a case, the caller shouldn't have to _care_ if it wants to
allow buf to be mucked with; it is already asking that the resulting
list _not_ point into buf by setting strdup_strings (by the way,
that is part of the function input, so think of it like various *opt
variables passed into functions to tweak their behaviour).  If the
implementation can do so without sacrificing performance (and in
this case, as you said, it can), it should take "const char *buf".

The above caller shouldn't have to choose between sl_split() and
sl_split_in_place(), in other words.

So it appears to me that sl_split_in_place(), if implemented, should
be kept as a special case for performance-minded callers that have
full control of the lifetime rules of the variables they use, can
set strdup_strings to false, and can let buf modified in place, and
can accept list that point into buf.

>>> + * Examples:
>>> + *   string_list_split_in_place(l, "foo:bar:baz", ':', -1) -> ["foo", "bar", "baz"]
>>> + *   string_list_split_in_place(l, "foo:bar:baz", ':', 1) -> ["foo", "bar:baz"]
>>> + *   string_list_split_in_place(l, "foo:bar:", ':', -1) -> ["foo", "bar", ""]
>> 
>> I would find it more natural to see a sentinel value against
>> "positive" to be 0, not -1.  "-1" gives an impression as if "-2"
>> might do something different from "-1", but Zero is a lot more
>> special.
>
> You have raised a good point and I think there is a flaw in the API, but
> I'm not sure I agree with you what the flaw is...
>
> The "maxsplit" argument limits the number of times the string should be
> split.  I.e., if maxsplit is set, then the output will have at most
> (maxsplit + 1) strings.

So "do not split, just give me the whole thing" would be maxsplit == 0
to split into (maxsplit+1) == 1 string.  I think we are in agreement
that your "-1" does not make any sense, and your documentation that
said "positive" is the saner thing to do, no?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/4] Add a new function, filter_string_list()
  2012-09-09  9:40   ` Junio C Hamano
@ 2012-09-10  8:58     ` Michael Haggerty
  0 siblings, 0 replies; 23+ messages in thread
From: Michael Haggerty @ 2012-09-10  8:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 09/09/2012 11:40 AM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
>> ---
>>  Documentation/technical/api-string-list.txt |  8 ++++++++
>>  string-list.c                               | 17 +++++++++++++++++
>>  string-list.h                               |  9 +++++++++
>>  3 files changed, 34 insertions(+)
>>
>> diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt
>> index 3b959a2..15b8072 100644
>> --- a/Documentation/technical/api-string-list.txt
>> +++ b/Documentation/technical/api-string-list.txt
>> @@ -60,6 +60,14 @@ Functions
>>  
>>  * General ones (works with sorted and unsorted lists as well)
>>  
>> +`filter_string_list`::
>> +
>> +	Apply a function to each item in a list, retaining only the
>> +	items for which the function returns true.  If free_util is
>> +	true, call free() on the util members of any items that have
>> +	to be deleted.  Preserve the order of the items that are
>> +	retained.
> 
> In other words, this can safely be used on both sorted and unsorted
> string list.  Good.

Preserving order (while retaining performance) is the main reason for
this function.  Otherwise, unsorted_string_list_delete_item() could be
used in a loop.

>>  `print_string_list`::
>>  
>>  	Dump a string_list to stdout, useful mainly for debugging purposes. It
>> diff --git a/string-list.c b/string-list.c
>> index 110449c..72610ce 100644
>> --- a/string-list.c
>> +++ b/string-list.c
>> @@ -102,6 +102,23 @@ int for_each_string_list(struct string_list *list,
>>  	return ret;
>>  }
>>  
>> +void filter_string_list(struct string_list *list, int free_util,
>> +			string_list_each_func_t fn, void *cb_data)
>> +{
>> +	int src, dst = 0;
>> +	for (src = 0; src < list->nr; src++) {
>> +		if (fn(&list->items[src], cb_data)) {
>> +			list->items[dst++] = list->items[src];
>> +		} else {
>> +			if (list->strdup_strings)
>> +				free(list->items[src].string);
>> +			if (free_util)
>> +				free(list->items[src].util);
>> +		}
>> +	}
>> +	list->nr = dst;
>> +}
>> +
>>  void string_list_clear(struct string_list *list, int free_util)
>>  {
>>  	if (list->items) {
>> diff --git a/string-list.h b/string-list.h
>> index 7e51d03..84996aa 100644
>> --- a/string-list.h
>> +++ b/string-list.h
>> @@ -29,6 +29,15 @@ int for_each_string_list(struct string_list *list,
>>  #define for_each_string_list_item(item,list) \
>>  	for (item = (list)->items; item < (list)->items + (list)->nr; ++item)
>>  
>> +/*
>> + * Apply fn to each item in list, retaining only the ones for which
>> + * the function returns true.  If free_util is true, call free() on
>> + * the util members of any items that have to be deleted.  Preserve
>> + * the order of the items that are retained.
>> + */
>> +void filter_string_list(struct string_list *list, int free_util,
>> +			string_list_each_func_t fn, void *cb_data);
>> +
>>  /* Use these functions only on sorted lists: */
>>  int string_list_has_string(const struct string_list *list, const char *string);
>>  int string_list_find_insert_index(const struct string_list *list, const char *string,
> 
> Having seen that the previous patch introduced a new test helper for
> unit testing (which is a very good idea) and dedicated a new test
> number, I would have expected to see a new test for filtering
> here.

I thought that the code was too trivial to warrant a test, especially
considering that the memory handling aspect of the function can't be
tested very well.  But you've correctly shamed me into adding tests for
this and also for patch 3/4, string_list_remove_duplicates().

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/4] Add a new function, string_list_remove_duplicates()
  2012-09-09  9:45   ` Junio C Hamano
@ 2012-09-10  9:15     ` Michael Haggerty
  0 siblings, 0 replies; 23+ messages in thread
From: Michael Haggerty @ 2012-09-10  9:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 09/09/2012 11:45 AM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
>> ---
>>  Documentation/technical/api-string-list.txt |  4 ++++
>>  string-list.c                               | 17 +++++++++++++++++
>>  string-list.h                               |  5 +++++
>>  3 files changed, 26 insertions(+)
>>
>> diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt
>> index 15b8072..9206f8f 100644
>> --- a/Documentation/technical/api-string-list.txt
>> +++ b/Documentation/technical/api-string-list.txt
>> @@ -104,6 +104,10 @@ write `string_list_insert(...)->util = ...;`.
>>  	Look up a given string in the string_list, returning the containing
>>  	string_list_item. If the string is not found, NULL is returned.
>>  
>> +`string_list_remove_duplicates`::
>> +
>> +	Remove all but the first entry that has a given string value.
> 
> Unlike the previous patch, free_util is not documented?

Fixed.

> It is kind of shame that the string list must be sorted for this
> function to work, but I guess you do not have a need for a version
> that works on both sorted and unsorted (yet).  We can introduce a
> variant with "unsorted_" prefix later when it becomes necessary, so
> OK.

Not only that; for an unsorted list it is not quite as obvious what a
caller would want.  Often lists are used as a poor man's set, in which
case the caller would probably not mind sorting the list anyway.  There
are two operations that one might conceive of for unsorted lists: (1)
remove duplicates while preserving the order of the remaining entries,
or (2) remove duplicates while not worrying about the order of the
remaining entries.  (Admittedly the first is not much more expensive
than the second.)  These are more complicated to program, require
temporary space, and are of less obvious utility than removing
duplicates from a sorted list.

>>  * Functions for unsorted lists only
>>  
>>  `string_list_append`::
>> diff --git a/string-list.c b/string-list.c
>> index 72610ce..bfef6cf 100644
>> --- a/string-list.c
>> +++ b/string-list.c
>> @@ -92,6 +92,23 @@ struct string_list_item *string_list_lookup(struct string_list *list, const char
>>  	return list->items + i;
>>  }
>>  
>> +void string_list_remove_duplicates(struct string_list *list, int free_util)
>> +{
>> +	if (list->nr > 1) {
>> +		int src, dst;
>> +		for (src = dst = 1; src < list->nr; src++) {
>> +			if (!strcmp(list->items[dst - 1].string, list->items[src].string)) {
>> +				if (list->strdup_strings)
>> +					free(list->items[src].string);
>> +				if (free_util)
>> +					free(list->items[src].util);
>> +			} else
>> +				list->items[dst++] = list->items[src];
>> +		}
>> +		list->nr = dst;
>> +	}
>> +}
>> +
>>  int for_each_string_list(struct string_list *list,
>>  			 string_list_each_func_t fn, void *cb_data)
>>  {
>> diff --git a/string-list.h b/string-list.h
>> index 84996aa..c4dc659 100644
>> --- a/string-list.h
>> +++ b/string-list.h
>> @@ -38,6 +38,7 @@ int for_each_string_list(struct string_list *list,
>>  void filter_string_list(struct string_list *list, int free_util,
>>  			string_list_each_func_t fn, void *cb_data);
>>  
>> +
>>  /* Use these functions only on sorted lists: */
>>  int string_list_has_string(const struct string_list *list, const char *string);
>>  int string_list_find_insert_index(const struct string_list *list, const char *string,
>> @@ -47,6 +48,10 @@ struct string_list_item *string_list_insert_at_index(struct string_list *list,
>>  						     int insert_at, const char *string);
>>  struct string_list_item *string_list_lookup(struct string_list *list, const char *string);
>>  
>> +/* Remove all but the first entry that has a given string value. */
>> +void string_list_remove_duplicates(struct string_list *list, int free_util);
>> +
>> +
>>  /* Use these functions only on unsorted lists: */
>>  struct string_list_item *string_list_append(struct string_list *list, const char *string);
>>  void sort_string_list(struct string_list *list);


-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/4] Add a function string_list_longest_prefix()
  2012-09-09  9:54   ` Junio C Hamano
@ 2012-09-10 10:01     ` Michael Haggerty
  2012-09-10 16:24       ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Michael Haggerty @ 2012-09-10 10:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 09/09/2012 11:54 AM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> [...]
>> diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt
>> index 9206f8f..291ac4c 100644
>> --- a/Documentation/technical/api-string-list.txt
>> +++ b/Documentation/technical/api-string-list.txt
>> @@ -68,6 +68,14 @@ Functions
>>  	to be deleted.  Preserve the order of the items that are
>>  	retained.
>>  
>> +`string_list_longest_prefix`::
>> +
>> +	Return the longest string within a string_list that is a
>> +	prefix (in the sense of prefixcmp()) of the specified string,
>> +	or NULL if no such prefix exists.  This function does not
>> +	require the string_list to be sorted (it does a linear
>> +	search).
>> +
>>  `print_string_list`::
> 
> This may feel like outside the scope of this series, but since this
> series will be the main culprit for adding many new functions to
> this API in the recent history...
> 
>  - We may want to name things a bit more consistently so that people
>    can tell which ones can be called on any string list, which ones
>    are sorted list only, and which ones are unsorted one only.
> 
>    In addition, the last category _may_ need a bit more thought.
>    Calling unsorted_string_list_lookup() on an already sorted list
>    is not a crime---it is just a stupid thing to do.

Yes, this could be clearer.  Though I'm skeptical that a naming
convention can capture all of the variation without being too cumbersome.

Another idea: in string-list.h, one could name parameters "sorted_list"
when they must be sorted as a precondition of the function.

But before getting too hung up on finery, the effort might be better
invested adding documentation for functions that are totally lacking it,
like

    string_list_clear_func()
    for_each_string_list()
    for_each_string_list_item()
    string_list_find_insert_index()
    string_list_insert_at_index()

While we're on the subject, it seems to me that documenting APIs like
these in separate files under Documentation/technical rather than in the
header files themselves

- makes the documentation for a particular function harder to find,

- makes it easier for the documentation to get out of sync with the
actual collection of functions (e.g., the 5 undocumented functions
listed above).

- makes it awkward for the documentation to refer to particular function
parameters by name.

While it is nice to have a high-level prose description of an API, I am
often frustrated by the lack of "docstrings" in the header file where a
function is declared.  The high-level description of an API could be put
at the top of the header file.

Also, better documentation in header files could enable the automatic
generation of API docs (e.g., via doxygen).

Is there some reason for the current documentation policy or is it
historical and just needs somebody to put in the work to change it?

>  - Why are these new functions described at the top, not appended at
>    the bottom?  I would have expected either an alphabetical, or a
>    more generic ones first (i.e. print and clear are a lot "easier"
>    ones compared to filter and prefix that are very much more
>    specialized).

The order seemed logical to me at the time (given the constraint that
functions are grouped by sorted/unsorted/don't-care):
print_string_list() is only useful for debugging, so it seemed to belong
below the "production" functions.  string_list_clear() was already below
print_string_list() (which I guessed was because it is logically used
last in the life of a string_list) so I left it at the end of its
section.  My preference would probably have been to move
print_string_list() below string_list_clear(), but somebody else made
the opposite choice so I decided to respect it.

That being said, I don't have anything against a different order.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/4] Add a new function, string_list_split_in_place()
  2012-09-10  5:47       ` Junio C Hamano
@ 2012-09-10 11:48         ` Michael Haggerty
  2012-09-10 16:09           ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Michael Haggerty @ 2012-09-10 11:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 09/10/2012 07:47 AM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> ...  Consider something like
>>
>>     struct string_list *split_file_into_words(FILE *f)
>>     {
>>         char buf[1024];
>>         struct string_list *list = new string list;
>>         list->strdup_strings = 1;
>>         while (not EOF) {
>>             read_line_into_buf();
>>             string_list_split_in_place(list, buf, ' ', -1);
>>         }
>>         return list;
>>     }
> 
> That is a prime example to argue that string_list_split() would make
> more sense, no?  The caller does _not_ mind if the function mucks
> with buf, but the resulting list is not allowed to point into buf.
> 
> In such a case, the caller shouldn't have to _care_ if it wants to
> allow buf to be mucked with; it is already asking that the resulting
> list _not_ point into buf by setting strdup_strings (by the way,
> that is part of the function input, so think of it like various *opt
> variables passed into functions to tweak their behaviour).  If the
> implementation can do so without sacrificing performance (and in
> this case, as you said, it can), it should take "const char *buf".

You're right, I was thinking that a caller of
string_list_split_in_place() could choose to remain ignorant of whether
strdup_strings is set, but this is incorrect because it needs to know
whether to do its own memory management of the strings that it passes
into string_list_append().

> So it appears to me that sl_split_in_place(), if implemented, should
> be kept as a special case for performance-minded callers that have
> full control of the lifetime rules of the variables they use, can
> set strdup_strings to false, and can let buf modified in place, and
> can accept list that point into buf.

OK, so the bottom line would be to have two versions of the function.
One takes a (const char *) and *requires* strdup_strings to be set on
its input list:

int string_list_split(struct string_list *list, const char *string,
		      int delim, int maxsplit)
{
	assert(list->strdup_strings);
	...
}

The other takes a (char *) and modifies it in-place, and maybe even
requires strdup_strings to be false on its input list:

int string_list_split_in_place(struct string_list *list, char *string,
			       int delim, int maxsplit)
{
	/* not an error per se but a strong suggestion of one: */
	assert(!list->strdup_strings);
	...
}

(The latter (modulo assert) is the one that I have implemented, but it
might not be needed immediately.)  Do you agree?

>>>> + * Examples:
>>>> + *   string_list_split_in_place(l, "foo:bar:baz", ':', -1) -> ["foo", "bar", "baz"]
>>>> + *   string_list_split_in_place(l, "foo:bar:baz", ':', 1) -> ["foo", "bar:baz"]
>>>> + *   string_list_split_in_place(l, "foo:bar:", ':', -1) -> ["foo", "bar", ""]
>>>
>>> I would find it more natural to see a sentinel value against
>>> "positive" to be 0, not -1.  "-1" gives an impression as if "-2"
>>> might do something different from "-1", but Zero is a lot more
>>> special.
>>
>> You have raised a good point and I think there is a flaw in the API, but
>> I'm not sure I agree with you what the flaw is...
>>
>> The "maxsplit" argument limits the number of times the string should be
>> split.  I.e., if maxsplit is set, then the output will have at most
>> (maxsplit + 1) strings.
> 
> So "do not split, just give me the whole thing" would be maxsplit == 0
> to split into (maxsplit+1) == 1 string.  I think we are in agreement
> that your "-1" does not make any sense, and your documentation that
> said "positive" is the saner thing to do, no?

No.  I think that my handling of maxsplit=0 was incorrect but that we
should continue using -1 as the special value.

I see maxsplit=0 as a legitimate degenerate case meaning "split into 1
string".  Granted, it would only be useful in specialized situations
[1].  Moreover, "-1" makes a much more obvious special value than "0";
somebody reading code with maxsplit=-1 knows immediately that this is a
special value, whereas the handling of maxsplit=0 isn't quite so
blindingly obvious unless the reader knows the outcome of our current
discussion :-)

Therefore I still prefer treating only negative values of maxsplit to
mean "unlimited" and fixing maxsplit=0 as described.  But if you insist
on the other convention, let me know and I will change it.

Michael

[1] A case I can think of would be parsing a format like

    NUMPARENTS [PARENT...] SUMMARY

where "string_list_split(list, rest_of_line, ' ', numparents)" does the
right thing even if numparents==0.

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/4] Add a new function, string_list_split_in_place()
  2012-09-10 11:48         ` Michael Haggerty
@ 2012-09-10 16:09           ` Junio C Hamano
  0 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2012-09-10 16:09 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> OK, so the bottom line would be to have two versions of the function.
> One takes a (const char *) and *requires* strdup_strings to be set on
> its input list:
>
> int string_list_split(struct string_list *list, const char *string,
> 		      int delim, int maxsplit)
> {
> 	assert(list->strdup_strings);
> 	...
> }
>
> The other takes a (char *) and modifies it in-place, and maybe even
> requires strdup_strings to be false on its input list:
>
> int string_list_split_in_place(struct string_list *list, char *string,
> 			       int delim, int maxsplit)
> {
> 	/* not an error per se but a strong suggestion of one: */
> 	assert(!list->strdup_strings);
> 	...
> }
>
> (The latter (modulo assert) is the one that I have implemented, but it
> might not be needed immediately.)  Do you agree?

OK; I do not offhand know which one you immediately needed, but I
think that is a sensible way to structure the API.

> [1] A case I can think of would be parsing a format like
>
>     NUMPARENTS [PARENT...] SUMMARY
>
> where "string_list_split(list, rest_of_line, ' ', numparents)" does the
> right thing even if numparents==0.

OK.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/4] Add a function string_list_longest_prefix()
  2012-09-10 10:01     ` Michael Haggerty
@ 2012-09-10 16:24       ` Junio C Hamano
  2012-09-10 16:33         ` Jeff King
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2012-09-10 16:24 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> Another idea: in string-list.h, one could name parameters "sorted_list"
> when they must be sorted as a precondition of the function.

That sounds like a very sensible thing to do.

> But before getting too hung up on finery, the effort might be better
> invested adding documentation for functions that are totally lacking it,
> like
>
>     string_list_clear_func()
>     for_each_string_list()
>     for_each_string_list_item()
>     string_list_find_insert_index()
>     string_list_insert_at_index()
>
> While we're on the subject, it seems to me that documenting APIs like
> these in separate files under Documentation/technical rather than in the
> header files themselves
>
> - makes the documentation for a particular function harder to find,
>
> - makes it easier for the documentation to get out of sync with the
> actual collection of functions (e.g., the 5 undocumented functions
> listed above).
>
> - makes it awkward for the documentation to refer to particular function
> parameters by name.
>
> While it is nice to have a high-level prose description of an API, I am
> often frustrated by the lack of "docstrings" in the header file where a
> function is declared.  The high-level description of an API could be put
> at the top of the header file.
>
> Also, better documentation in header files could enable the automatic
> generation of API docs (e.g., via doxygen).

Yeah, perhaps you may want to look into doing an automated
generation of Documentation/technical/api-*.txt files out of the
headers.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/4] Add a function string_list_longest_prefix()
  2012-09-10 16:24       ` Junio C Hamano
@ 2012-09-10 16:33         ` Jeff King
  2012-09-10 17:48           ` Andreas Ericsson
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2012-09-10 16:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git

On Mon, Sep 10, 2012 at 09:24:17AM -0700, Junio C Hamano wrote:

> > While we're on the subject, it seems to me that documenting APIs like
> > these in separate files under Documentation/technical rather than in the
> > header files themselves
> >
> > - makes the documentation for a particular function harder to find,
> >
> > - makes it easier for the documentation to get out of sync with the
> > actual collection of functions (e.g., the 5 undocumented functions
> > listed above).
> >
> > - makes it awkward for the documentation to refer to particular function
> > parameters by name.
> >
> > While it is nice to have a high-level prose description of an API, I am
> > often frustrated by the lack of "docstrings" in the header file where a
> > function is declared.  The high-level description of an API could be put
> > at the top of the header file.
> >
> > Also, better documentation in header files could enable the automatic
> > generation of API docs (e.g., via doxygen).
> 
> Yeah, perhaps you may want to look into doing an automated
> generation of Documentation/technical/api-*.txt files out of the
> headers.

I was just documenting something in technical/api-* the other day, and
had the same feeling. I'd be very happy if we moved to some kind of
literate-programming system. I have no idea which ones are good or bad,
though. I have used doxygen, but all I remember is it being painfully
baroque. I'd much rather have something simple and lightweight, with an
easy markup format. For example, this:

  http://tomdoc.org/

Looks much nicer to me than most doxygen I've seen. But again, it's been
a while, so maybe doxygen is nicer than I remember.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/4] Add a function string_list_longest_prefix()
  2012-09-10 16:33         ` Jeff King
@ 2012-09-10 17:48           ` Andreas Ericsson
  2012-09-10 19:21             ` Using doxygen (or something similar) to generate API docs [was [PATCH 4/4] Add a function string_list_longest_prefix()] Michael Haggerty
  0 siblings, 1 reply; 23+ messages in thread
From: Andreas Ericsson @ 2012-09-10 17:48 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Michael Haggerty, git

On 09/10/2012 06:33 PM, Jeff King wrote:
> On Mon, Sep 10, 2012 at 09:24:17AM -0700, Junio C Hamano wrote:
> 
>>> While we're on the subject, it seems to me that documenting APIs like
>>> these in separate files under Documentation/technical rather than in the
>>> header files themselves
>>>
>>> - makes the documentation for a particular function harder to find,
>>>
>>> - makes it easier for the documentation to get out of sync with the
>>> actual collection of functions (e.g., the 5 undocumented functions
>>> listed above).
>>>
>>> - makes it awkward for the documentation to refer to particular function
>>> parameters by name.
>>>
>>> While it is nice to have a high-level prose description of an API, I am
>>> often frustrated by the lack of "docstrings" in the header file where a
>>> function is declared.  The high-level description of an API could be put
>>> at the top of the header file.
>>>
>>> Also, better documentation in header files could enable the automatic
>>> generation of API docs (e.g., via doxygen).
>>
>> Yeah, perhaps you may want to look into doing an automated
>> generation of Documentation/technical/api-*.txt files out of the
>> headers.
> 
> I was just documenting something in technical/api-* the other day, and
> had the same feeling. I'd be very happy if we moved to some kind of
> literate-programming system. I have no idea which ones are good or bad,
> though. I have used doxygen, but all I remember is it being painfully
> baroque. I'd much rather have something simple and lightweight, with an
> easy markup format. For example, this:
> 
>    http://tomdoc.org/
> 
> Looks much nicer to me than most doxygen I've seen. But again, it's been
> a while, so maybe doxygen is nicer than I remember.
> 

Doxygen has a the very nifty feature of being able to generate
callgraphs though. We use it extensively at $dayjob, so if you need a
hand building something sensible out of git's headers, I'd be happy
to help.

libgit2 uses doxygen btw, and has done since the start. If we ever
merge the two, it would be neat to use the same.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Using doxygen (or something similar) to generate API docs [was [PATCH 4/4] Add a function string_list_longest_prefix()]
  2012-09-10 17:48           ` Andreas Ericsson
@ 2012-09-10 19:21             ` Michael Haggerty
  2012-09-10 21:56               ` Jeff King
  0 siblings, 1 reply; 23+ messages in thread
From: Michael Haggerty @ 2012-09-10 19:21 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Jeff King, Junio C Hamano, git

I'm renaming this thread so that the bikeshedding can get over ASAP.

On 09/10/2012 07:48 PM, Andreas Ericsson wrote:
> On 09/10/2012 06:33 PM, Jeff King wrote:
>> On Mon, Sep 10, 2012 at 09:24:17AM -0700, Junio C Hamano wrote:
>>> Michael Haggerty <mhagger@alum.mit.edu> writes:
>>>> Also, better documentation in header files could enable the automatic
>>>> generation of API docs (e.g., via doxygen).
>>>
>>> Yeah, perhaps you may want to look into doing an automated
>>> generation of Documentation/technical/api-*.txt files out of the
>>> headers.
>>
>> I was just documenting something in technical/api-* the other day, and
>> had the same feeling. I'd be very happy if we moved to some kind of
>> literate-programming system. I have no idea which ones are good or bad,
>> though. I have used doxygen, but all I remember is it being painfully
>> baroque. I'd much rather have something simple and lightweight, with an
>> easy markup format. For example, this:
>>
>>    http://tomdoc.org/
>>
>> Looks much nicer to me than most doxygen I've seen. But again, it's been
>> a while, so maybe doxygen is nicer than I remember.

I don't have a personal preference for what system is used.  I mentioned
doxygen only because it seems to be a well-known example.

>From a glance at the URL you mentioned, it looks like TomDoc is only
applicable to Ruby code.

> Doxygen has a the very nifty feature of being able to generate
> callgraphs though. We use it extensively at $dayjob, so if you need a
> hand building something sensible out of git's headers, I'd be happy
> to help.

My plate is full.  If you are able to work on this, it would be awesome.
 As far as I'm concerned, you are the new literate documentation czar :-)

Most importantly, having a convenient system of converting header
comments into documentation would hopefully motivate other people to add
better header comments in the first place, and motivate reviewers to
insist on them.  It's shocking (to me) how few functions are documented,
and how often I have to read masses of C code to figure out what a
function is for, its pre- and post-conditions, its memory policy, etc.
Often I find myself having to read functions three layers down the call
tree to figure out the behavior of the top-layer function.  I try to
document things as I go, but it's only a drop in the bucket.

> libgit2 uses doxygen btw, and has done since the start. If we ever
> merge the two, it would be neat to use the same.

That would be a nice bonus.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Using doxygen (or something similar) to generate API docs [was [PATCH 4/4] Add a function string_list_longest_prefix()]
  2012-09-10 19:21             ` Using doxygen (or something similar) to generate API docs [was [PATCH 4/4] Add a function string_list_longest_prefix()] Michael Haggerty
@ 2012-09-10 21:56               ` Jeff King
  2012-09-10 22:09                 ` Michael Haggerty
  2012-09-11  1:01                 ` Andreas Ericsson
  0 siblings, 2 replies; 23+ messages in thread
From: Jeff King @ 2012-09-10 21:56 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Andreas Ericsson, Junio C Hamano, git

On Mon, Sep 10, 2012 at 09:21:12PM +0200, Michael Haggerty wrote:

> I'm renaming this thread so that the bikeshedding can get over ASAP.

Thanks. :)

> >>    http://tomdoc.org/
> >>
> >> Looks much nicer to me than most doxygen I've seen. But again, it's been
> >> a while, so maybe doxygen is nicer than I remember.
> 
> I don't have a personal preference for what system is used.  I mentioned
> doxygen only because it seems to be a well-known example.
> 
> From a glance at the URL you mentioned, it looks like TomDoc is only
> applicable to Ruby code.

Yeah, sorry, I should have been more clear; tomdoc is not an option
because it doesn't do C. But what I like about it is the more
natural markup syntax. I was wondering if there were other similar
solutions. Looks like "NaturalDocs" is one:

  http://www.naturaldocs.org/documenting.html

On the other hand, doxygen is well-known among open source folks, which
counts for something.  And from what I've read, recent versions support
Markdown, but I'm not sure of the details. So maybe it is a lot better
than I remember.

> > Doxygen has a the very nifty feature of being able to generate
> > callgraphs though. We use it extensively at $dayjob, so if you need a
> > hand building something sensible out of git's headers, I'd be happy
> > to help.

It has been over a decade since I seriously used doxygen for anything,
and then it was a medium-sized project. So take my opinion with a grain
of salt. But I remember the callgraph feature being one of those things
that _sounded_ really cool, but in practice was not all that useful.

> My plate is full.  If you are able to work on this, it would be awesome.
>  As far as I'm concerned, you are the new literate documentation czar :-)

Lucky me? :)

I think I'll leave it for the moment, and next time I start to add some
api-level documentation I'll take a look at doxygen-ating them and see
how I like it. And I'd invite anyone else to do the same (in doxygen, or
whatever system you like -- the best way to evaluate a tool like this is
to see how your real work would look).

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Using doxygen (or something similar) to generate API docs [was [PATCH 4/4] Add a function string_list_longest_prefix()]
  2012-09-10 21:56               ` Jeff King
@ 2012-09-10 22:09                 ` Michael Haggerty
  2012-09-11  1:01                 ` Andreas Ericsson
  1 sibling, 0 replies; 23+ messages in thread
From: Michael Haggerty @ 2012-09-10 22:09 UTC (permalink / raw)
  To: Jeff King; +Cc: Andreas Ericsson, Junio C Hamano, git

On 09/10/2012 11:56 PM, Jeff King wrote:
> On Mon, Sep 10, 2012 at 09:21:12PM +0200, Michael Haggerty wrote:
>> My plate is full.  If you are able to work on this, it would be awesome.
>>  As far as I'm concerned, you are the new literate documentation czar :-)
> 
> Lucky me? :)

I was nominating Andreas, who rashly volunteered to help.  But don't
feel left out; there's enough work to go around :-)

> I think I'll leave it for the moment, and next time I start to add some
> api-level documentation I'll take a look at doxygen-ating them and see
> how I like it. And I'd invite anyone else to do the same (in doxygen, or
> whatever system you like -- the best way to evaluate a tool like this is
> to see how your real work would look).

I agree with that.  A very good start would be to mark up a single API
and build the docs (by hand if need be) using a proposed tool.  This
will let people get a feel for (1) what the markup has to look like and
(2) what they get out of it.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Using doxygen (or something similar) to generate API docs [was [PATCH 4/4] Add a function string_list_longest_prefix()]
  2012-09-10 21:56               ` Jeff King
  2012-09-10 22:09                 ` Michael Haggerty
@ 2012-09-11  1:01                 ` Andreas Ericsson
  1 sibling, 0 replies; 23+ messages in thread
From: Andreas Ericsson @ 2012-09-11  1:01 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, Junio C Hamano, git

On 09/10/2012 11:56 PM, Jeff King wrote:
> On Mon, Sep 10, 2012 at 09:21:12PM +0200, Michael Haggerty wrote:
> 
>> I'm renaming this thread so that the bikeshedding can get over ASAP.
> 
> Thanks. :)
> 
>>>>     http://tomdoc.org/
>>>>
>>>> Looks much nicer to me than most doxygen I've seen. But again, it's been
>>>> a while, so maybe doxygen is nicer than I remember.
>>
>> I don't have a personal preference for what system is used.  I mentioned
>> doxygen only because it seems to be a well-known example.
>>
>>  From a glance at the URL you mentioned, it looks like TomDoc is only
>> applicable to Ruby code.
> 
> Yeah, sorry, I should have been more clear; tomdoc is not an option
> because it doesn't do C. But what I like about it is the more
> natural markup syntax. I was wondering if there were other similar
> solutions. Looks like "NaturalDocs" is one:
> 
>    http://www.naturaldocs.org/documenting.html
> 
> On the other hand, doxygen is well-known among open source folks, which
> counts for something.  And from what I've read, recent versions support
> Markdown, but I'm not sure of the details. So maybe it is a lot better
> than I remember.
> 

Markdown is supported, yes. There aren't really any details to it.
I don't particularly like markdown, but my colleagues tend to use
it for howto's and whatnot and it can be mixed with other doxygen
styles without problem.


>>> Doxygen has a the very nifty feature of being able to generate
>>> callgraphs though. We use it extensively at $dayjob, so if you need a
>>> hand building something sensible out of git's headers, I'd be happy
>>> to help.
> 
> It has been over a decade since I seriously used doxygen for anything,
> and then it was a medium-sized project. So take my opinion with a grain
> of salt. But I remember the callgraph feature being one of those things
> that _sounded_ really cool, but in practice was not all that useful.
> 

It's like all tools; Once you're used to it, it's immensely useful. I
tend to prefer using it to find either code in dire need of refactoring
(where the graph is too large), or engines and exit points. For those
purposes, it's pretty hard to beat a good callgraph.

>> My plate is full.  If you are able to work on this, it would be awesome.
>>   As far as I'm concerned, you are the new literate documentation czar :-)
> 
> Lucky me? :)
> 

I think he was talking to me, but since you seem to have volunteered... ;)

> I think I'll leave it for the moment, and next time I start to add some
> api-level documentation I'll take a look at doxygen-ating them and see
> how I like it. And I'd invite anyone else to do the same (in doxygen, or
> whatever system you like -- the best way to evaluate a tool like this is
> to see how your real work would look).
> 

That's one of the problems. People follow what's already there, and there
are no comments there now so there won't be any added in the future :-/

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2012-09-11  1:01 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-09  5:53 [PATCH 0/4] Add some string_list-related functions Michael Haggerty
2012-09-09  5:53 ` [PATCH 1/4] Add a new function, string_list_split_in_place() Michael Haggerty
2012-09-09  9:35   ` Junio C Hamano
2012-09-10  4:45     ` Michael Haggerty
2012-09-10  5:47       ` Junio C Hamano
2012-09-10 11:48         ` Michael Haggerty
2012-09-10 16:09           ` Junio C Hamano
2012-09-09  5:53 ` [PATCH 2/4] Add a new function, filter_string_list() Michael Haggerty
2012-09-09  9:40   ` Junio C Hamano
2012-09-10  8:58     ` Michael Haggerty
2012-09-09  5:53 ` [PATCH 3/4] Add a new function, string_list_remove_duplicates() Michael Haggerty
2012-09-09  9:45   ` Junio C Hamano
2012-09-10  9:15     ` Michael Haggerty
2012-09-09  5:53 ` [PATCH 4/4] Add a function string_list_longest_prefix() Michael Haggerty
2012-09-09  9:54   ` Junio C Hamano
2012-09-10 10:01     ` Michael Haggerty
2012-09-10 16:24       ` Junio C Hamano
2012-09-10 16:33         ` Jeff King
2012-09-10 17:48           ` Andreas Ericsson
2012-09-10 19:21             ` Using doxygen (or something similar) to generate API docs [was [PATCH 4/4] Add a function string_list_longest_prefix()] Michael Haggerty
2012-09-10 21:56               ` Jeff King
2012-09-10 22:09                 ` Michael Haggerty
2012-09-11  1:01                 ` Andreas Ericsson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.