All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1] Git config cache & special querying api utilizing the cache
@ 2014-06-09 12:49 Tanay Abhra
  2014-06-09 12:49 ` [PATCH v1] config: Add hashtable for config parsing & retrival Tanay Abhra
  2014-06-11 14:01 ` [PATCH v1] Git config cache & special querying api utilizing the cache Matthieu Moy
  0 siblings, 2 replies; 11+ messages in thread
From: Tanay Abhra @ 2014-06-09 12:49 UTC (permalink / raw)
  To: git
  Cc: Tanay Abhra, Ramkumar Ramachandra, Matthieu Moy, Eric Sunshine,
	Jeff King, Torsten Bogershausen

Hi,

I am taking this patch out of RFC.

[PATCH V1]:Most of the invaluable suggestions by Eric Sunshine, Torsten Bogershausen and
	Jeff King has been implemented[1]. Complete rewrite of config_cache*() family
	using git_config() as hook as suggested by Jeff. Thanks for the review.

[RFC V2]: Improved according to the suggestions by Eric Sunshine and Torsten Bogershausen.
	Added cache invalidation when config file is changed.[2]
	I am using git_config_set_multivar_in_file() as an update hook.

This is my first patch for this year's GSoC. My project is
"Git Config API improvements". The link of my proposal is appended below [3].

The aim of this patch series is to generate a cache for querying values from
the config files in a non-callback manner as the current method reads and
parses the config files every time a value is queried for.

The cache is generated from hooking the update_cache function to the current
parsing and callback mechanism in config.c. It is implemented as an hashmap
using the hashmap-api with variables and its corresponding values list as
its members. The values in the list are sorted in order of increasing priority.
The cache is initialised the first time when any of the new query functions is
called. It is invalidated by using git_config_set_multivar_in_file() as an
update hook.

We add two new functions to the config-api, git_config_get_string() and
git_config_get_string_multi() for querying in a non callback manner from
the cache.

I have run the tests and debug the code using custom functions and it works
fine.

[1] http://marc.info/?t=140172066200006&r=1&w=2
[2] http://git.661346.n2.nabble.com/RFC-PATCH-0-2-Git-config-cache-amp-special-querying-api-utilizing-the-cache-td7611691.html
[3] https://drive.google.com/file/d/0B4suZ-aHqDcnSUZJRXVTTnZUN1E/edit?usp=sharing

Cheers,
Tanay Abhra.

Tanay Abhra (1):
  config: Add hashtable for config parsing & retrival

 Documentation/technical/api-config.txt |  18 +++++
 cache.h                                |   2 +
 config.c                               | 122 +++++++++++++++++++++++++++++++++
 3 files changed, 142 insertions(+)

-- 
1.9.0.GIT

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v1] config: Add hashtable for config parsing & retrival
  2014-06-09 12:49 [PATCH v1] Git config cache & special querying api utilizing the cache Tanay Abhra
@ 2014-06-09 12:49 ` Tanay Abhra
  2014-06-09 14:24   ` Matthieu Moy
                     ` (3 more replies)
  2014-06-11 14:01 ` [PATCH v1] Git config cache & special querying api utilizing the cache Matthieu Moy
  1 sibling, 4 replies; 11+ messages in thread
From: Tanay Abhra @ 2014-06-09 12:49 UTC (permalink / raw)
  To: git
  Cc: Tanay Abhra, Ramkumar Ramachandra, Matthieu Moy, Eric Sunshine,
	Jeff King, Torsten Bogershausen

Add a hash table to cache all key-value pairs read from config files
(repo specific .git/config, user wide ~/.gitconfig and the global
/etc/gitconfig). Add two external functions `git_config_get_string` and
`git_config_get_string_multi` for querying in a non-callback manner from the
hash table.

Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
---
 Documentation/technical/api-config.txt |  18 +++++
 cache.h                                |   2 +
 config.c                               | 122 +++++++++++++++++++++++++++++++++
 3 files changed, 142 insertions(+)

diff --git a/Documentation/technical/api-config.txt b/Documentation/technical/api-config.txt
index 230b3a0..5b6e376 100644
--- a/Documentation/technical/api-config.txt
+++ b/Documentation/technical/api-config.txt
@@ -77,6 +77,24 @@ To read a specific file in git-config format, use
 `git_config_from_file`. This takes the same callback and data parameters
 as `git_config`.
 
+Querying For Specific Variables
+-------------------------------
+
+For programs wanting to query for specific variables in a non-callback
+manner, the config API provides two functions `git_config_get_string`
+and `git_config_get_string_multi`. They both take a single parameter,
+
+- a key string in canonical flat form for which the corresponding values
+  will be retrieved and returned.
+
+They both read values from an internal cache generated previously from
+reading the config files. `git_config_get_string` returns the value with
+the highest priority(i.e. for the same variable value in the repo config
+will be preferred over value in user wide config).
+
+`git_config_get_string_multi` returns a `string_list` containing all the
+values for that particular key, sorted in order of increasing priority.
+
 Value Parsing Helpers
 ---------------------
 
diff --git a/cache.h b/cache.h
index 107ac61..2038150 100644
--- a/cache.h
+++ b/cache.h
@@ -1271,6 +1271,8 @@ extern int check_repository_format_version(const char *var, const char *value, v
 extern int git_env_bool(const char *, int);
 extern int git_config_system(void);
 extern int config_error_nonbool(const char *);
+extern const char *git_config_get_string(const char *);
+extern const struct string_list *git_config_get_string_multi(const char *);
 #if defined(__GNUC__) && ! defined(__clang__)
 #define config_error_nonbool(s) (config_error_nonbool(s), -1)
 #endif
diff --git a/config.c b/config.c
index a30cb5c..6f6b04e 100644
--- a/config.c
+++ b/config.c
@@ -9,6 +9,8 @@
 #include "exec_cmd.h"
 #include "strbuf.h"
 #include "quote.h"
+#include "hashmap.h"
+#include "string-list.h"
 
 struct config_source {
 	struct config_source *prev;
@@ -37,6 +39,120 @@ static struct config_source *cf;
 
 static int zlib_compression_seen;
 
+struct config_cache_entry {
+	struct hashmap_entry ent;
+	char *key;
+	struct string_list *value_list;
+};
+
+static int hashmap_is_init;
+
+static int config_cache_set_value(const char *key, const char *value);
+
+static int config_cache_entry_cmp_case(const struct config_cache_entry *e1,
+				 const struct config_cache_entry *e2, const void *unused)
+{
+	return strcmp(e1->key, e2->key);
+}
+
+static void config_cache_init(struct hashmap *config_cache)
+{
+	hashmap_init(config_cache, (hashmap_cmp_fn) config_cache_entry_cmp_case, 0);
+}
+
+static int config_cache_callback(const char *key, const char *value, void *unused)
+{
+	config_cache_set_value(key, value);
+	return 0;
+}
+
+static struct hashmap *get_config_cache(void)
+{
+	static struct hashmap config_cache;
+	if (!hashmap_is_init) {
+		config_cache_init(&config_cache);
+		hashmap_is_init = 1;
+		git_config(config_cache_callback, NULL);
+		return &config_cache;
+	}
+	return &config_cache;
+}
+
+static void config_cache_free(void)
+{
+	struct hashmap *config_cache;
+	struct config_cache_entry *entry;
+	struct hashmap_iter iter;
+	config_cache = get_config_cache();
+	hashmap_iter_init(config_cache, &iter);
+	while ((entry = hashmap_iter_next(&iter))) {
+		free(entry->key);
+		string_list_clear(entry->value_list, 0);
+	}
+	hashmap_free(config_cache, 1);
+	hashmap_is_init = 0;
+}
+
+static struct config_cache_entry *config_cache_find_entry(const char *key)
+{
+	struct hashmap *config_cache;
+	struct config_cache_entry k;
+	char *normalized_key;
+	int baselength = 0, ret;
+	config_cache = get_config_cache();
+	ret = git_config_parse_key(key, &normalized_key, &baselength);
+
+	if (ret)
+		return NULL;
+
+	hashmap_entry_init(&k, strhash(normalized_key));
+	k.key = (char*) key;
+	return hashmap_get(config_cache, &k, NULL);
+}
+
+static struct string_list *config_cache_get_value(const char *key)
+{
+	struct config_cache_entry *e = config_cache_find_entry(key);
+	return e ? e->value_list : NULL;
+}
+
+
+static int config_cache_set_value(const char *key, const char *value)
+{
+	struct hashmap *config_cache;
+	struct config_cache_entry *e;
+
+	config_cache = get_config_cache();
+	e = config_cache_find_entry(key);
+	if (!e) {
+		e = xmalloc(sizeof(*e));
+		hashmap_entry_init(e, strhash(key));
+		e->key = xstrdup(key);
+		e->value_list = xcalloc(sizeof(struct string_list), 1);
+		e->value_list->strdup_strings = 1;
+		string_list_append(e->value_list, value);
+		hashmap_add(config_cache, e);
+	} else {
+		string_list_append(e->value_list, value);
+	}
+	return 0;
+}
+
+extern const char *git_config_get_string(const char *key)
+{
+	struct string_list *values;
+	values = config_cache_get_value(key);
+	if (!values)
+		return NULL;
+	assert(values->nr > 0);
+	return values->items[values->nr - 1].string;
+}
+
+extern const struct string_list *git_config_get_string_multi(const char *key)
+{
+	return config_cache_get_value(key);
+}
+
 static int config_file_fgetc(struct config_source *conf)
 {
 	return fgetc(conf->u.file);
@@ -1700,6 +1816,12 @@ int git_config_set_multivar_in_file(const char *config_filename,
 	lock = NULL;
 	ret = 0;
 
+	/* contents of config file has changed, so invalidate the
+	 * config cache used by non-callback based query functions.
+	 */
+	if (hashmap_is_init)
+		config_cache_free();
+
 out_free:
 	if (lock)
 		rollback_lock_file(lock);
-- 
1.9.0.GIT

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v1] config: Add hashtable for config parsing & retrival
  2014-06-09 12:49 ` [PATCH v1] config: Add hashtable for config parsing & retrival Tanay Abhra
@ 2014-06-09 14:24   ` Matthieu Moy
  2014-06-10 11:51     ` Eric Sunshine
  2014-06-10 11:27   ` Eric Sunshine
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Matthieu Moy @ 2014-06-09 14:24 UTC (permalink / raw)
  To: Tanay Abhra
  Cc: git, Ramkumar Ramachandra, Eric Sunshine, Jeff King,
	Torsten Bogershausen

Tanay Abhra <tanayabh@gmail.com> writes:

> +the highest priority(i.e. for the same variable value in the repo config
                       ^
missing space.

> +struct config_cache_entry {
> +	struct hashmap_entry ent;
> +	char *key;
> +	struct string_list *value_list;
> +};

I guess this crossed Eric's remark about the fact that this is a
pointer.

> +static int hashmap_is_init;

I'd call it hashmap_initialized, but that's a matter of taste.

> +static void config_cache_free(void)

I didn't look closely enough to make sure there were no memory leak
remaining, but did you check with valgrind --leak-check that it is the
case in practice?

> +	/* contents of config file has changed, so invalidate the
> +	 * config cache used by non-callback based query functions.
> +	 */

Style: Git usually writes multi-line comments

/*
 * like
 * this
 */

(not always applied, but documented in Documentation/CodingGuidelines)

(no time for a more detailed review, sorry)

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1] config: Add hashtable for config parsing & retrival
  2014-06-09 12:49 ` [PATCH v1] config: Add hashtable for config parsing & retrival Tanay Abhra
  2014-06-09 14:24   ` Matthieu Moy
@ 2014-06-10 11:27   ` Eric Sunshine
  2014-06-10 12:35     ` Tanay Abhra
  2014-06-10 11:45   ` Eric Sunshine
  2014-06-10 23:30   ` Eric Sunshine
  3 siblings, 1 reply; 11+ messages in thread
From: Eric Sunshine @ 2014-06-10 11:27 UTC (permalink / raw)
  To: Tanay Abhra
  Cc: Git List, Ramkumar Ramachandra, Matthieu Moy, Jeff King,
	Torsten Bogershausen

On Mon, Jun 9, 2014 at 8:49 AM, Tanay Abhra <tanayabh@gmail.com> wrote:
> Add a hash table to cache all key-value pairs read from config files
> (repo specific .git/config, user wide ~/.gitconfig and the global
> /etc/gitconfig). Add two external functions `git_config_get_string` and
> `git_config_get_string_multi` for querying in a non-callback manner from the
> hash table.
>
> Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
> ---
> diff --git a/Documentation/technical/api-config.txt b/Documentation/technical/api-config.txt
> index 230b3a0..5b6e376 100644
> --- a/Documentation/technical/api-config.txt
> +++ b/Documentation/technical/api-config.txt
> @@ -77,6 +77,24 @@ To read a specific file in git-config format, use
>  `git_config_from_file`. This takes the same callback and data parameters
>  as `git_config`.
>
> +Querying For Specific Variables
> +-------------------------------
> +
> +For programs wanting to query for specific variables in a non-callback
> +manner, the config API provides two functions `git_config_get_string`
> +and `git_config_get_string_multi`. They both take a single parameter,
> +
> +- a key string in canonical flat form for which the corresponding values
> +  will be retrieved and returned.

It's strange to have a bulleted list for a single item. It can be
stated more naturally as a full sentence, without the list.

More below.

> +They both read values from an internal cache generated previously from
> +reading the config files. `git_config_get_string` returns the value with
> +the highest priority(i.e. for the same variable value in the repo config
> +will be preferred over value in user wide config).
> +
> +`git_config_get_string_multi` returns a `string_list` containing all the
> +values for that particular key, sorted in order of increasing priority.
> +
>  Value Parsing Helpers
>  ---------------------
>
> diff --git a/config.c b/config.c
> index a30cb5c..6f6b04e 100644
> --- a/config.c
> +++ b/config.c
> @@ -9,6 +9,8 @@
>  #include "exec_cmd.h"
>  #include "strbuf.h"
>  #include "quote.h"
> +#include "hashmap.h"
> +#include "string-list.h"
>
>  struct config_source {
>         struct config_source *prev;
> @@ -37,6 +39,120 @@ static struct config_source *cf;
>
>  static int zlib_compression_seen;
>
> +struct config_cache_entry {
> +       struct hashmap_entry ent;
> +       char *key;
> +       struct string_list *value_list;

Same question as in my v1 and v2 reviews [1][2], and reiterated by
Matthieu [3]: Why is 'value_list' a pointer to a structure rather than
just a structure?

This is a genuine question. As a reviewer, I'm not sure how to
interpret your lack of response. I can't tell if you merely overlook
or forget the question multiple times; if you don't understand the
inquiry; or if you have a strong reason to prefer making this a
pointer. If you don't understand the question, it's okay to ask for
clarification. If there is a strong reason for this to be a pointer,
it should be explained. Without feedback from you, reviewers will
continue asking the same question(s) upon each submission.

> +};
> +
> +static int hashmap_is_init;
> +
> +static int config_cache_set_value(const char *key, const char *value);
> +
> +static int config_cache_entry_cmp_case(const struct config_cache_entry *e1,
> +                                const struct config_cache_entry *e2, const void *unused)
> +{
> +       return strcmp(e1->key, e2->key);

As suggested by Peff [4], this comparison is now case-sensitive,
instead of case-insensitive as in the previous version. Rather than
changing the appended '_icase' to '_case', a more typical function
name would be simply config_cache_entry_cmp().

> +}
> +
> +static void config_cache_init(struct hashmap *config_cache)
> +{
> +       hashmap_init(config_cache, (hashmap_cmp_fn) config_cache_entry_cmp_case, 0);

In his review [4], Peff suggested giving config_cache_entry_cmp_case()
the correct hashmap_cmp_fn signature so that this cast can be dropped.
It's not clear whether you disagree with his advice or overlooked or
forgot about it. If you disagree with his suggestion, it's okay to say
so and explain why you feel the way you do, but without feedback from
you, he or another reviewer is going to continue suggesting the same
change.

> +}
> +
> +static int config_cache_callback(const char *key, const char *value, void *unused)
> +{
> +       config_cache_set_value(key, value);
> +       return 0;
> +}
> +
> +static struct hashmap *get_config_cache(void)
> +{
> +       static struct hashmap config_cache;
> +       if (!hashmap_is_init) {
> +               config_cache_init(&config_cache);
> +               hashmap_is_init = 1;
> +               git_config(config_cache_callback, NULL);
> +               return &config_cache;

Why do you 'return' here rather than just falling through to the
'return' outside the conditional?

> +       }
> +       return &config_cache;
> +}
> +
> +static void config_cache_free(void)
> +{
> +       struct hashmap *config_cache;
> +       struct config_cache_entry *entry;
> +       struct hashmap_iter iter;
> +       config_cache = get_config_cache();
> +       hashmap_iter_init(config_cache, &iter);
> +       while ((entry = hashmap_iter_next(&iter))) {
> +               free(entry->key);
> +               string_list_clear(entry->value_list, 0);
> +       }
> +       hashmap_free(config_cache, 1);
> +       hashmap_is_init = 0;
> +}
> +
> +static struct config_cache_entry *config_cache_find_entry(const char *key)
> +{
> +       struct hashmap *config_cache;
> +       struct config_cache_entry k;
> +       char *normalized_key;
> +       int baselength = 0, ret;
> +       config_cache = get_config_cache();
> +       ret = git_config_parse_key(key, &normalized_key, &baselength);

Since you neither care about nor ever reference 'baselength', you
should just pass NULL as the third argument.

> +       if (ret)
> +               return NULL;
> +
> +       hashmap_entry_init(&k, strhash(normalized_key));
> +       k.key = (char*) key;

This looks fishy. You're hashing based upon 'normalized_key' but then
comparing against the unnormalized 'key'. Peff's suggestion [4] was to
store the normalized key in the hash, which means that you should use
'normalized_key' for both hashing and comparing. (See additional
commentary about this issue below in config_cache_set_value().)

Style: (char *)key

> +       return hashmap_get(config_cache, &k, NULL);

You're leaking 'normalized_key', which git_config_parse_key()
allocated on your behalf.

> +}
> +
> +static struct string_list *config_cache_get_value(const char *key)
> +{
> +       struct config_cache_entry *e = config_cache_find_entry(key);
> +       return e ? e->value_list : NULL;
> +}
> +
> +
> +static int config_cache_set_value(const char *key, const char *value)
> +{
> +       struct hashmap *config_cache;
> +       struct config_cache_entry *e;
> +
> +       config_cache = get_config_cache();
> +       e = config_cache_find_entry(key);
> +       if (!e) {
> +               e = xmalloc(sizeof(*e));
> +               hashmap_entry_init(e, strhash(key));

The hash computed to store the key should be the same as the hash to
look it up. In config_cache_find_entry(), you're correctly computing
the hash based upon the normalized key, but here you're doing so from
the unnormalized key.

> +               e->key = xstrdup(key);

Likewise. Normalized keys should be stored in the hash, not unnormalized.

You should be invoking git_config_parse_key() to normalize the key
both for hashing and storing.

Note, also, that call git_config_parse_key() allocates the normalize
key on your behalf, so you do *not* want to xstrdup() it here.

> +               e->value_list = xcalloc(sizeof(struct string_list), 1);
> +               e->value_list->strdup_strings = 1;

This code is perhaps too intimate with the implementation of
string_list. It may work today (because it is safe to initialize all
string_list fields to 0 via xcalloc()), but a future change to the
string_list implementation may invalidate that assumption. The
initialization macros in string-list.h exist to preserve the
abstraction, so you don't have to know the details of initialization.
The macros are not suitable for your use-case, but an initialization
function, such as string_list_init(), would be suitable, and it would
be appropriate to add such a function in a preparatory patch (as
already suggested by [1]).

> +               string_list_append(e->value_list, value);
> +               hashmap_add(config_cache, e);
> +       } else {
> +               string_list_append(e->value_list, value);
> +       }
> +       return 0;
> +}
> +
> +extern const char *git_config_get_string(const char *key)

Drop the 'extern'. The header already declares it such.

> +{
> +       struct string_list *values;
> +       values = config_cache_get_value(key);
> +       if (!values)
> +               return NULL;
> +       assert(values->nr > 0);
> +       return values->items[values->nr - 1].string;
> +}
> +
> +extern const struct string_list *git_config_get_string_multi(const char *key)

Drop the 'extern'. The header already declares it such.

> +{
> +       return config_cache_get_value(key);
> +}
> +
>  static int config_file_fgetc(struct config_source *conf)
>  {
>         return fgetc(conf->u.file);
> @@ -1700,6 +1816,12 @@ int git_config_set_multivar_in_file(const char *config_filename,
>         lock = NULL;
>         ret = 0;
>
> +       /* contents of config file has changed, so invalidate the
> +        * config cache used by non-callback based query functions.
> +        */
> +       if (hashmap_is_init)
> +               config_cache_free();
> +
>  out_free:
>         if (lock)
>                 rollback_lock_file(lock);
> --
> 1.9.0.GIT

[1]: http://git.661346.n2.nabble.com/RFC-PATCH-0-2-Git-config-cache-amp-special-querying-api-utilizing-the-cache-td7611691.html#a7611860
[2]: http://thread.gmane.org/gmane.comp.version-control.git/250566/focus=251058
[3]: http://thread.gmane.org/gmane.comp.version-control.git/251073/focus=251079
[4]: http://thread.gmane.org/gmane.comp.version-control.git/250566/focus=250618

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1] config: Add hashtable for config parsing & retrival
  2014-06-09 12:49 ` [PATCH v1] config: Add hashtable for config parsing & retrival Tanay Abhra
  2014-06-09 14:24   ` Matthieu Moy
  2014-06-10 11:27   ` Eric Sunshine
@ 2014-06-10 11:45   ` Eric Sunshine
  2014-06-10 12:37     ` Tanay Abhra
  2014-06-10 23:30   ` Eric Sunshine
  3 siblings, 1 reply; 11+ messages in thread
From: Eric Sunshine @ 2014-06-10 11:45 UTC (permalink / raw)
  To: Tanay Abhra
  Cc: Git List, Ramkumar Ramachandra, Matthieu Moy, Jeff King,
	Torsten Bogershausen, Brian Gesiak

One additional comment...

On Mon, Jun 9, 2014 at 8:49 AM, Tanay Abhra <tanayabh@gmail.com> wrote:
> +static int config_cache_set_value(const char *key, const char *value)
> +{
> +       struct hashmap *config_cache;
> +       struct config_cache_entry *e;
> +
> +       config_cache = get_config_cache();
> +       e = config_cache_find_entry(key);
> +       if (!e) {
> +               e = xmalloc(sizeof(*e));
> +               hashmap_entry_init(e, strhash(key));
> +               e->key = xstrdup(key);
> +               e->value_list = xcalloc(sizeof(struct string_list), 1);

Order of xcalloc() arguments is incorrect [1].

[1]: http://git.661346.n2.nabble.com/PATCH-00-15-Rearrange-xcalloc-arguments-td7611675.html

> +               e->value_list->strdup_strings = 1;
> +               string_list_append(e->value_list, value);
> +               hashmap_add(config_cache, e);
> +       } else {
> +               string_list_append(e->value_list, value);
> +       }
> +       return 0;
> +}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1] config: Add hashtable for config parsing & retrival
  2014-06-09 14:24   ` Matthieu Moy
@ 2014-06-10 11:51     ` Eric Sunshine
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Sunshine @ 2014-06-10 11:51 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Tanay Abhra, Git List, Ramkumar Ramachandra, Jeff King,
	Torsten Bogershausen

On Mon, Jun 9, 2014 at 10:24 AM, Matthieu Moy
<Matthieu.Moy@grenoble-inp.fr> wrote:
> Tanay Abhra <tanayabh@gmail.com> writes:
>> +struct config_cache_entry {
>> +     struct hashmap_entry ent;
>> +     char *key;
>> +     struct string_list *value_list;
>> +};
>
> I guess this crossed Eric's remark about the fact that this is a
> pointer.
>
>> +static void config_cache_free(void)
>
> I didn't look closely enough to make sure there were no memory leak
> remaining, but did you check with valgrind --leak-check that it is the
> case in practice?

It does leak the config_cache_entry::value_list member which was
xcalloc()'d in config_cache_set_value().  (I didn't mention it in my
reviews because I was expecting 'value_list' to be changed to a plain
structure rather than a pointer to a structure.)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1] config: Add hashtable for config parsing & retrival
  2014-06-10 11:27   ` Eric Sunshine
@ 2014-06-10 12:35     ` Tanay Abhra
  2014-06-10 21:28       ` Eric Sunshine
  0 siblings, 1 reply; 11+ messages in thread
From: Tanay Abhra @ 2014-06-10 12:35 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Git List, Ramkumar Ramachandra, Matthieu Moy, Jeff King,
	Torsten Bogershausen

Hi,

Thanks for the review, Eric. I have replied to your comments below,
I will try to reply early and more promptly now.

On 06/10/2014 04:27 AM, Eric Sunshine wrote:
>> ---
>> diff --git a/Documentation/technical/api-config.txt b/Documentation/technical/api-config.txt
>> index 230b3a0..5b6e376 100644
>> --- a/Documentation/technical/api-config.txt
>> +++ b/Documentation/technical/api-config.txt
>> @@ -77,6 +77,24 @@ To read a specific file in git-config format, use
>>  `git_config_from_file`. This takes the same callback and data parameters
>>  as `git_config`.
>>
>> +Querying For Specific Variables
>> +-------------------------------
>> +
>> +For programs wanting to query for specific variables in a non-callback
>> +manner, the config API provides two functions `git_config_get_string`
>> +and `git_config_get_string_multi`. They both take a single parameter,
>> +
>> +- a key string in canonical flat form for which the corresponding values
>> +  will be retrieved and returned.
> 
> It's strange to have a bulleted list for a single item. It can be
> stated more naturally as a full sentence, without the list.

Point Noted.

>> +They both read values from an internal cache generated previously from
>> +reading the config files. `git_config_get_string` returns the value with
>> +the highest priority(i.e. for the same variable value in the repo config
>> +will be preferred over value in user wide config).
>> +
>> +`git_config_get_string_multi` returns a `string_list` containing all the
>> +values for that particular key, sorted in order of increasing priority.
>> +
>>  Value Parsing Helpers
>>  ---------------------
>>
>> diff --git a/config.c b/config.c
>> index a30cb5c..6f6b04e 100644
>> --- a/config.c
>> +++ b/config.c
>> @@ -9,6 +9,8 @@
>>  #include "exec_cmd.h"
>>  #include "strbuf.h"
>>  #include "quote.h"
>> +#include "hashmap.h"
>> +#include "string-list.h"
>>
>>  struct config_source {
>>         struct config_source *prev;
>> @@ -37,6 +39,120 @@ static struct config_source *cf;
>>
>>  static int zlib_compression_seen;
>>
>> +struct config_cache_entry {
>> +       struct hashmap_entry ent;
>> +       char *key;
>> +       struct string_list *value_list;
> 
> Same question as in my v1 and v2 reviews [1][2], and reiterated by
> Matthieu [3]: Why is 'value_list' a pointer to a structure rather than
> just a structure?
> 

Sorry for the lack of response on this part. I thought choosing a pointer to a
structure or just the structure itself was a stylistic choice. Since most of the
functions just pass the pointer to this struct, I thought it would be more natural to
use it. But I have changed my mind on this one. In the next iteration I will be using
a plane old struct.

> 
>> +};
>> +
>> +static int hashmap_is_init;
>> +
>> +static int config_cache_set_value(const char *key, const char *value);
>> +
>> +static int config_cache_entry_cmp_case(const struct config_cache_entry *e1,
>> +                                const struct config_cache_entry *e2, const void *unused)
>> +{
>> +       return strcmp(e1->key, e2->key);
> 
> As suggested by Peff [4], this comparison is now case-sensitive,
> instead of case-insensitive as in the previous version. Rather than
> changing the appended '_icase' to '_case', a more typical function
> name would be simply config_cache_entry_cmp().

Noted.

>> +}
>> +
>> +static void config_cache_init(struct hashmap *config_cache)
>> +{
>> +       hashmap_init(config_cache, (hashmap_cmp_fn) config_cache_entry_cmp_case, 0);
> 
> In his review [4], Peff suggested giving config_cache_entry_cmp_case()
> the correct hashmap_cmp_fn signature so that this cast can be dropped.
> It's not clear whether you disagree with his advice or overlooked or
> forgot about it. If you disagree with his suggestion, it's okay to say
> so and explain why you feel the way you do, but without feedback from
> you, he or another reviewer is going to continue suggesting the same
> change.

Now on this one, I checked the code thoroughly. Every single hashmap_init()
incantation in git code has a hashmap_cmp_fn cast. I don't think that it is
necessary to do so. Is it?

>> +}
>> +
>> +static int config_cache_callback(const char *key, const char *value, void *unused)
>> +{
>> +       config_cache_set_value(key, value);
>> +       return 0;
>> +}
>> +
>> +static struct hashmap *get_config_cache(void)
>> +{
>> +       static struct hashmap config_cache;
>> +       if (!hashmap_is_init) {
>> +               config_cache_init(&config_cache);
>> +               hashmap_is_init = 1;
>> +               git_config(config_cache_callback, NULL);
>> +               return &config_cache;
> 
> Why do you 'return' here rather than just falling through to the
> 'return' outside the conditional?

Noted.

>> +static struct config_cache_entry *config_cache_find_entry(const char *key)
>> +{
>> +       struct hashmap *config_cache;
>> +       struct config_cache_entry k;
>> +       char *normalized_key;
>> +       int baselength = 0, ret;
>> +       config_cache = get_config_cache();
>> +       ret = git_config_parse_key(key, &normalized_key, &baselength);
> 
> Since you neither care about nor ever reference 'baselength', you
> should just pass NULL as the third argument.
> 

Noted.

>> +       if (ret)
>> +               return NULL;
>> +
>> +       hashmap_entry_init(&k, strhash(normalized_key));
>> +       k.key = (char*) key;
> 
> This looks fishy. You're hashing based upon 'normalized_key' but then
> comparing against the unnormalized 'key'. Peff's suggestion [4] was to
> store the normalized key in the hash, which means that you should use
> 'normalized_key' for both hashing and comparing. (See additional
> commentary about this issue below in config_cache_set_value().)

Ouch, this I had corrected in a future commit. But forgot to include in
this patch.

> Style: (char *)key

Noted. In function arguments the code uses (char*) key. I copied it from there.
:)

>> +       return hashmap_get(config_cache, &k, NULL);
> 
> You're leaking 'normalized_key', which git_config_parse_key()
> allocated on your behalf.
>
Noted. I will now check with valgrind before sending any future series.

>> +}
>> +
>> +static struct string_list *config_cache_get_value(const char *key)
>> +{
>> +       struct config_cache_entry *e = config_cache_find_entry(key);
>> +       return e ? e->value_list : NULL;
>> +}
>> +
>> +
>> +static int config_cache_set_value(const char *key, const char *value)
>> +{
>> +       struct hashmap *config_cache;
>> +       struct config_cache_entry *e;
>> +
>> +       config_cache = get_config_cache();
>> +       e = config_cache_find_entry(key);
>> +       if (!e) {
>> +               e = xmalloc(sizeof(*e));
>> +               hashmap_entry_init(e, strhash(key));
> 
> The hash computed to store the key should be the same as the hash to
> look it up. In config_cache_find_entry(), you're correctly computing
> the hash based upon the normalized key, but here you're doing so from
> the unnormalized key.
> 
>> +               e->key = xstrdup(key);
> 
> Likewise. Normalized keys should be stored in the hash, not unnormalized.
> 
> You should be invoking git_config_parse_key() to normalize the key
> both for hashing and storing.
> 
> Note, also, that call git_config_parse_key() allocates the normalize
> key on your behalf, so you do *not* want to xstrdup() it here.

config_cache_set_value() gets its values from git_config() as it the callback.
git_config() feeds the callback only normalized values by using functions like
get_extended_base_var(), get_base_var() etc. Thus, I didn't use
git_config_parse_key(). Please correct me if I am wrong, but I tested this case
thoroughly.

>> +               e->value_list = xcalloc(sizeof(struct string_list), 1);
>> +               e->value_list->strdup_strings = 1;
> 
> This code is perhaps too intimate with the implementation of
> string_list. It may work today (because it is safe to initialize all
> string_list fields to 0 via xcalloc()), but a future change to the
> string_list implementation may invalidate that assumption. The
> initialization macros in string-list.h exist to preserve the
> abstraction, so you don't have to know the details of initialization.
> The macros are not suitable for your use-case, but an initialization
> function, such as string_list_init(), would be suitable, and it would
> be appropriate to add such a function in a preparatory patch (as
> already suggested by [1]).

Noted. As I am going to use a simple struct for the list, this won't be
a problem.

>> +               string_list_append(e->value_list, value);
>> +               hashmap_add(config_cache, e);
>> +       } else {
>> +               string_list_append(e->value_list, value);
>> +       }
>> +       return 0;
>> +}
>> +
>> +extern const char *git_config_get_string(const char *key)
> 
> Drop the 'extern'. The header already declares it such.
>

Noted.

>> +{
>> +       struct string_list *values;
>> +       values = config_cache_get_value(key);
>> +       if (!values)
>> +               return NULL;
>> +       assert(values->nr > 0);
>> +       return values->items[values->nr - 1].string;
>> +}
>> +
>> +extern const struct string_list *git_config_get_string_multi(const char *key)
> 
> Drop the 'extern'. The header already declares it such.
> 

Noted.

>> +{
>> +       return config_cache_get_value(key);
>> +}
>> +
>>  static int config_file_fgetc(struct config_source *conf)
>>  {
>>         return fgetc(conf->u.file);
>> @@ -1700,6 +1816,12 @@ int git_config_set_multivar_in_file(const char *config_filename,
>>         lock = NULL;
>>         ret = 0;
>>
>> +       /* contents of config file has changed, so invalidate the
>> +        * config cache used by non-callback based query functions.
>> +        */
>> +       if (hashmap_is_init)
>> +               config_cache_free();
>> +
>>  out_free:
>>         if (lock)
>>                 rollback_lock_file(lock);
>> --
>> 1.9.0.GIT
> 
> [1]: http://git.661346.n2.nabble.com/RFC-PATCH-0-2-Git-config-cache-amp-special-querying-api-utilizing-the-cache-td7611691.html#a7611860
> [2]: http://thread.gmane.org/gmane.comp.version-control.git/250566/focus=251058
> [3]: http://thread.gmane.org/gmane.comp.version-control.git/251073/focus=251079
> [4]: http://thread.gmane.org/gmane.comp.version-control.git/250566/focus=250618
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1] config: Add hashtable for config parsing & retrival
  2014-06-10 11:45   ` Eric Sunshine
@ 2014-06-10 12:37     ` Tanay Abhra
  0 siblings, 0 replies; 11+ messages in thread
From: Tanay Abhra @ 2014-06-10 12:37 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Git List, Ramkumar Ramachandra, Matthieu Moy, Jeff King,
	Torsten Bogershausen, Brian Gesiak

On 06/10/2014 04:45 AM, Eric Sunshine wrote:
> One additional comment...
> 
> On Mon, Jun 9, 2014 at 8:49 AM, Tanay Abhra <tanayabh@gmail.com> wrote:
>> +static int config_cache_set_value(const char *key, const char *value)
>> +{
>> +       struct hashmap *config_cache;
>> +       struct config_cache_entry *e;
>> +
>> +       config_cache = get_config_cache();
>> +       e = config_cache_find_entry(key);
>> +       if (!e) {
>> +               e = xmalloc(sizeof(*e));
>> +               hashmap_entry_init(e, strhash(key));
>> +               e->key = xstrdup(key);
>> +               e->value_list = xcalloc(sizeof(struct string_list), 1);
> 
> Order of xcalloc() arguments is incorrect [1].
> 

Noted. Thanks for pointing it out.

> [1]: http://git.661346.n2.nabble.com/PATCH-00-15-Rearrange-xcalloc-arguments-td7611675.html
> 
>> +               e->value_list->strdup_strings = 1;
>> +               string_list_append(e->value_list, value);
>> +               hashmap_add(config_cache, e);
>> +       } else {
>> +               string_list_append(e->value_list, value);
>> +       }
>> +       return 0;
>> +}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1] config: Add hashtable for config parsing & retrival
  2014-06-10 12:35     ` Tanay Abhra
@ 2014-06-10 21:28       ` Eric Sunshine
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Sunshine @ 2014-06-10 21:28 UTC (permalink / raw)
  To: Tanay Abhra
  Cc: Git List, Ramkumar Ramachandra, Matthieu Moy, Jeff King,
	Torsten Bogershausen

On Tue, Jun 10, 2014 at 8:35 AM, Tanay Abhra <tanayabh@gmail.com> wrote:
> Thanks for the review, Eric. I have replied to your comments below,
> I will try to reply early and more promptly now.

Thanks for responding. More below.

> On 06/10/2014 04:27 AM, Eric Sunshine wrote:
>>> ---
>>> diff --git a/config.c b/config.c
>>> index a30cb5c..6f6b04e 100644
>>> --- a/config.c
>>> +++ b/config.c
>>> @@ -9,6 +9,8 @@
>>>  #include "exec_cmd.h"
>>>  #include "strbuf.h"
>>>  #include "quote.h"
>>> +#include "hashmap.h"
>>> +#include "string-list.h"
>>>
>>>  struct config_source {
>>>         struct config_source *prev;
>>> @@ -37,6 +39,120 @@ static struct config_source *cf;
>>>
>>>  static int zlib_compression_seen;
>>>
>>> +struct config_cache_entry {
>>> +       struct hashmap_entry ent;
>>> +       char *key;
>>> +       struct string_list *value_list;
>>
>> Same question as in my v1 and v2 reviews [1][2], and reiterated by
>> Matthieu [3]: Why is 'value_list' a pointer to a structure rather than
>> just a structure?
>
> Sorry for the lack of response on this part. I thought choosing a pointer to a
> structure or just the structure itself was a stylistic choice.

Each point or question raised in a review is meaningful to the
reviewer and deserves some sort of response or consideration, even if
you don't agree with the reviewer's opinion. It's okay to have
stylistic or other preferences (and to argue for them), but unless you
voice them, the reviewer is likely to conclude that his review effort
is wasted.

In this case, it's not a stylistic consideration which prompted the
question, but practical concern. (More on that below.)

> Since most of the
> functions just pass the pointer to this struct, I thought it would be more natural to
> use it. But I have changed my mind on this one. In the next iteration I will be using
> a plane old struct.

There are practical reasons for choosing a structure rather than a
pointer to a structure. Problems with a pointer include:

- Expense of an extra (unnecessary) heap allocation. You allocate both
config_cache_entry and string_list on the heap for each entry being
inserted into the hash, rather than allocating only a
config_cache_entry.

- Potential additional heap fragmentation.

- Increased possibility of leaking memory. The current implementation
of config_cache_free() drives this point home: although it correctly
clears the string_list, it leaks the heap-allocated string_list
itself.

- Potentially misleading readers of the code into thinking that there
is an important reason for the string_list to be allocated separately
from its containing object (for instance, "is the string list shared?"
or "does the string list need to exist beyond the lifetime of the
config_cache_entry?").

>>> +}
>>> +
>>> +static void config_cache_init(struct hashmap *config_cache)
>>> +{
>>> +       hashmap_init(config_cache, (hashmap_cmp_fn) config_cache_entry_cmp_case, 0);
>>
>> In his review [4], Peff suggested giving config_cache_entry_cmp_case()
>> the correct hashmap_cmp_fn signature so that this cast can be dropped.
>> It's not clear whether you disagree with his advice or overlooked or
>> forgot about it. If you disagree with his suggestion, it's okay to say
>> so and explain why you feel the way you do, but without feedback from
>> you, he or another reviewer is going to continue suggesting the same
>> change.
>
> Now on this one, I checked the code thoroughly. Every single hashmap_init()
> incantation in git code has a hashmap_cmp_fn cast.

That's a good argument for explaining to Peff why you did it this way,
but unless you voice this, he will likely feel that you simply ignored
his review comment.

> I don't think that it is necessary to do so. Is it?

There are good reasons for avoiding the cast. Jonathan gives one such
reason here [1]. In that case, he let it slide for the reason you
mention: existing hashmap clients use the cast. (However, the
implication is that it would be nice for someone -- not necessarily
you -- to submit a patch series eliminating the casts.)

Whether Peff will be swayed by the same argument, only he can answer.
(If he doesn't answer, taking [1] into consideration, you'd probably
be on safe ground to continue using the cast.)

[1]: http://thread.gmane.org/gmane.comp.version-control.git/243818/focus=243917
(see his comments about hashmap_init()).

>>> +       if (ret)
>>> +               return NULL;
>>> +
>>> +       hashmap_entry_init(&k, strhash(normalized_key));
>>> +       k.key = (char*) key;
>>
>> This looks fishy. You're hashing based upon 'normalized_key' but then
>> comparing against the unnormalized 'key'. Peff's suggestion [4] was to
>> store the normalized key in the hash, which means that you should use
>> 'normalized_key' for both hashing and comparing. (See additional
>> commentary about this issue below in config_cache_set_value().)
>
> Ouch, this I had corrected in a future commit. But forgot to include in
> this patch.

[Leaving unsnipped since it has relevance below.]

>> Style: (char *)key
>
> Noted. In function arguments the code uses (char*) key. I copied it from there.
> :)

Indeed, there's an unfortunate lack of consistency: in fact, even
config.c already has a mix of "(x*)y", "(x *)y", and "(x *) y", but I
think "(x *)y" is the one preferred and seems to be the most common.

>>> +static int config_cache_set_value(const char *key, const char *value)
>>> +{
>>> +       struct hashmap *config_cache;
>>> +       struct config_cache_entry *e;
>>> +
>>> +       config_cache = get_config_cache();
>>> +       e = config_cache_find_entry(key);
>>> +       if (!e) {
>>> +               e = xmalloc(sizeof(*e));
>>> +               hashmap_entry_init(e, strhash(key));
>>
>> The hash computed to store the key should be the same as the hash to
>> look it up. In config_cache_find_entry(), you're correctly computing
>> the hash based upon the normalized key, but here you're doing so from
>> the unnormalized key.
>>
>>> +               e->key = xstrdup(key);
>>
>> Likewise. Normalized keys should be stored in the hash, not unnormalized.
>>
>> You should be invoking git_config_parse_key() to normalize the key
>> both for hashing and storing.
>>
>> Note, also, that call git_config_parse_key() allocates the normalize
>> key on your behalf, so you do *not* want to xstrdup() it here.
>
> config_cache_set_value() gets its values from git_config() as it the callback.
> git_config() feeds the callback only normalized values by using functions like
> get_extended_base_var(), get_base_var() etc. Thus, I didn't use
> git_config_parse_key(). Please correct me if I am wrong, but I tested this case
> thoroughly.

You're right. Documentation/api-config.txt even states that 'key' is
already normalized here. Thanks for pointing it out. So, the only
thing in need of fixing is the fishy bit in config_cache_find_entry()
where it is incorrectly searching for the unnormalized key.

>>> +               e->value_list = xcalloc(sizeof(struct string_list), 1);
>>> +               e->value_list->strdup_strings = 1;
>>
>> This code is perhaps too intimate with the implementation of
>> string_list. It may work today (because it is safe to initialize all
>> string_list fields to 0 via xcalloc()), but a future change to the
>> string_list implementation may invalidate that assumption. The
>> initialization macros in string-list.h exist to preserve the
>> abstraction, so you don't have to know the details of initialization.
>> The macros are not suitable for your use-case, but an initialization
>> function, such as string_list_init(), would be suitable, and it would
>> be appropriate to add such a function in a preparatory patch (as
>> already suggested by [1]).
>
> Noted. As I am going to use a simple struct for the list, this won't be
> a problem.

Even after 'value_list' becomes a simple struct rather than a pointer
to a struct, the string_list will still require initialization, and
the STRING_LIST_INIT_* macros still don't help your case, so it
remains a problem. The suggested preparatory patch introducing a new
string_list_init() function would allow you to initialize your
string_list while preserving the abstraction.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1] config: Add hashtable for config parsing & retrival
  2014-06-09 12:49 ` [PATCH v1] config: Add hashtable for config parsing & retrival Tanay Abhra
                     ` (2 preceding siblings ...)
  2014-06-10 11:45   ` Eric Sunshine
@ 2014-06-10 23:30   ` Eric Sunshine
  3 siblings, 0 replies; 11+ messages in thread
From: Eric Sunshine @ 2014-06-10 23:30 UTC (permalink / raw)
  To: Tanay Abhra
  Cc: Git List, Ramkumar Ramachandra, Matthieu Moy, Jeff King,
	Torsten Bogershausen

One other minor point...

On Mon, Jun 9, 2014 at 8:49 AM, Tanay Abhra <tanayabh@gmail.com> wrote:
> Subject: config: Add hashtable for config parsing & retrival

s/retrival/retrieval/

> Add a hash table to cache all key-value pairs read from config files
> (repo specific .git/config, user wide ~/.gitconfig and the global
> /etc/gitconfig). Add two external functions `git_config_get_string` and
> `git_config_get_string_multi` for querying in a non-callback manner from the
> hash table.
>
> Signed-off-by: Tanay Abhra <tanayabh@gmail.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1] Git config cache & special querying api utilizing the cache
  2014-06-09 12:49 [PATCH v1] Git config cache & special querying api utilizing the cache Tanay Abhra
  2014-06-09 12:49 ` [PATCH v1] config: Add hashtable for config parsing & retrival Tanay Abhra
@ 2014-06-11 14:01 ` Matthieu Moy
  1 sibling, 0 replies; 11+ messages in thread
From: Matthieu Moy @ 2014-06-11 14:01 UTC (permalink / raw)
  To: Tanay Abhra
  Cc: git, Ramkumar Ramachandra, Eric Sunshine, Jeff King,
	Torsten Bogershausen

Tanay Abhra <tanayabh@gmail.com> writes:

> I have run the tests and debug the code using custom functions and it works
> fine.

I understand that you wrote the custom functions for you, but didn't
include them in the patch, right?

If so, wouldn't it make sense to include these as unit-tests for your
code? See what Git already does with test-*.c, e.g.

$ git grep test-hashmap
.gitignore:/test-hashmap
Documentation/technical/api-hashmap.txt:See test-hashmap.c for an example using arbitrary-length strings as keys.
Makefile:TEST_PROGRAMS_NEED_X += test-hashmap
t/t0011-hashmap.sh:     echo "$1" | test-hashmap $3 > actual &&
t/t0011-hashmap.sh:     cat in | test-hashmap > out &&
test-hashmap.c: * Usage: time echo "perfhashmap method rounds" | test-hashmap

for examples. The resulting binary would be a good candidate for
valgrind --leak-check=full.

The tests are not just for you, they are also:

* For reviewers, to show an example of correct usage of the new API,

* For people of the future, to avoid regressions.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2014-06-11 14:01 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-09 12:49 [PATCH v1] Git config cache & special querying api utilizing the cache Tanay Abhra
2014-06-09 12:49 ` [PATCH v1] config: Add hashtable for config parsing & retrival Tanay Abhra
2014-06-09 14:24   ` Matthieu Moy
2014-06-10 11:51     ` Eric Sunshine
2014-06-10 11:27   ` Eric Sunshine
2014-06-10 12:35     ` Tanay Abhra
2014-06-10 21:28       ` Eric Sunshine
2014-06-10 11:45   ` Eric Sunshine
2014-06-10 12:37     ` Tanay Abhra
2014-06-10 23:30   ` Eric Sunshine
2014-06-11 14:01 ` [PATCH v1] Git config cache & special querying api utilizing the cache Matthieu Moy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.