From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Derrick Stolee <derrickstolee@github.com>
Cc: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
git@vger.kernel.org
Subject: Re: [PATCH 4/5] config: return an empty list, not NULL
Date: Wed, 28 Sep 2022 16:37:55 +0200
Message-ID: <220928.86y1u3wnaz.gmgdl@evledraar.gmail.com>
In-Reply-To: <26b3c9ef-5dd7-18f2-89c4-8d210a409ce4@github.com>
On Wed, Sep 28 2022, Derrick Stolee wrote:
> On 9/27/22 3:18 PM, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Tue, Sep 27 2022, Derrick Stolee wrote:
>>
>>> On 9/27/2022 12:21 PM, Ævar Arnfjörð Bjarmason wrote:
>>>>
>>>> On Tue, Sep 27 2022, Derrick Stolee via GitGitGadget wrote:
>>>
>>>>> /**
>>>>> * Finds and returns the value list, sorted in order of increasing priority
>>>>> * for the configuration variable `key`. When the configuration variable
>>>>> - * `key` is not found, returns NULL. The caller should not free or modify
>>>>> - * the returned pointer, as it is owned by the cache.
>>>>> + * `key` is not found, returns an empty list. The caller should not free or
>>>>> + * modify the returned pointer, as it is owned by the cache.
>>>>> */
>>>>> const struct string_list *git_config_get_value_multi(const char *key);
>>>>
>>>> Aside from the "DWIM API" aspect of this (which I don't mind) I think
>>>> this is really taking the low-level function in the wrong direction, and
>>>> that we should just add a new simple wrapper instead.
>>>>
>>>> I.e. both the pre-image API docs & this series gloss over the fact that
>>>> we'd not just return NULL here if the config wasn't there, but also if
>>>> git_config_parse_key() failed.
>>>>
>>>> So it seems to me that a better direction would be starting with
>>>> something like the WIP below (which doesn't compile the whole code, I
>>>> stopped at config.[ch] and pack-bitmap.c). I.e. the same "int" return
>>>> and "dest" pattern that most other things in the config API have.
>>>
>>> Do you have an example where a caller would benefit from this
>>> distinction? Without such an example, I don't think it is worth
>>> creating such a huge change for purity's sake alone.
>>
>> Not initially, I started poking at this because the CL/series/commits
>> says that we don't care about the case of non-existing keys, without
>> being clear as to why we want to conflate that with other errors we
>> might get from this API.
>>
>> But after some digging I found:
>>
>> $ for k in a a.b. "'x.y"; do ./git for-each-repo --config=$k; echo $?; done
>> error: key does not contain a section: a
>> 0
>> error: key does not contain variable name: a.b.
>> 0
>> error: invalid key: 'x.y
>> 0
>>
>> I.e. the repo_config_get_value_multi() you added in for-each-repo
>> doesn't distinguish between bad keys and non-existing keys, and returns
>> 0 even though it printed an "error".
>
> I can understand wanting to inform the user that they provided an
> invalid key using a nonzero exit code. I can also understand that
> the command does what is asked: it did nothing because the given
> key has no values (because it can't). I think the use of an "error"
> message balances things towards wanting a nonzero exit code.
Right, to be clear I think 6c62f015520 (for-each-repo: do nothing on
empty config, 2021-01-08) is sensible, i.e. we want to return 0 on a
non-existing key.

We just shouldn't conflate that with e.g. these parse errors, which is
what the API imposes on the user by squashing the underlying negative
return values into the "NULL list".
>>> I'm pretty happy that the diff for this series is an overall
>>> reduction in code, while also not being too large in the interim:
>>>
>>> 12 files changed, 39 insertions(+), 57 deletions(-)
>>>
>>> If all callers that use the *_multi() methods would only use the
>>> wrapper, then what is the point of doing the low-level manipulations?
>>
>> I hacked up something that's at least RFC-quality based on this
>> approach, but CI is running etc., so not submitting it
>> now:
>>
>> https://github.com/git/git/compare/master...avar:git:avar/have-git_configset_get_value-use-dest-and-int-pattern
>>
>> I think the resulting diff is more idiomatic API use, i.e. you ended up
>> with:
>>
>> /* submodule.active is set */
>> sl = repo_config_get_value_multi(repo, "submodule.active");
>> - if (sl) {
>> + if (sl && sl->nr) {
>
> You're right that I forgot to change this one to "if (sl->nr)"
> in patch 5.
If I am, I didn't mean to point that out; I was just pointing out the
end-API use. I.e. int return value vs. the "populate dest" pattern, but
yes, in your end-state you'd drop the "sl &&" part.
>> But I ended up doing:
>>
>> /* submodule.active is set */
>> - sl = repo_config_get_value_multi(repo, "submodule.active");
>> - if (sl) {
>> + if (!repo_config_get_const_value_multi(repo, "submodule.active", &sl)) {
>>
>> Note the "const" in the function name, i.e. there's wrappers that handle
>> the case where we have a hardcoded key name, in which case we can BUG()
>> out if we'd return < 0, so all we have left is just "does key exist".
>
> The problem here is that the block actually cares that the list is non-empty
> and should not run if the list is empty. In that case, you would need to add
> "&& sl->nr" to the condition.
>
> I'm of course assuming that an empty list is different from an error. In
> your for-each-repo example, we would not want to return a non-zero exit
> code on an empty list, only on a bad key (or other I/O problem).
>
> If we return a negative value on an error and the number of matches on
> success, then this change could instead be "if (repo_config....() > 0)".
Hrm, I think you're confusing the worldview your series here is
advocating for, and what I'm suggesting as an alternative.

There isn't any way on "master" to have "an empty list", that's a
worldview you're proposing. In particular your 1/5 here removes:

	assert(values->nr > 0);

More generally the config format has no notion of "an empty list", if
you have a valid key-value pair at all you have a list of ".nr >= 1".

The "empty list" is a construct you're introducing in this series,
because you wanted the idiom of passing things to
for_each_string_list_item.

I'm advocating for not going that route, and instead making the *_multi()
method like the rest of the config API, i.e. to use the "return int,
populate dest" pattern.

It's fine if we disagree, but I get the sense that it's not clear what
we're disagreeing *on* :)
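
To make the distinction concrete, here is a minimal self-contained
sketch of the "return int, populate dest" convention. It is not git's
code: the toy_* structs and functions are made up for illustration, and
toy_parse_key() is a much cruder check than git_config_parse_key(). The
only point is the three-way return value: negative for a bad key, 0 for
a key that exists (with "dest" populated by a list of at least one
item), and 1 for a valid key that isn't set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for "struct string_list"; not git's type. */
struct toy_list {
	const char **items;
	size_t nr;
};

/* Hypothetical in-memory "config": one key with its list of values. */
struct toy_entry {
	const char *key;
	struct toy_list values;
};

/* Crude stand-in for key validation: requires "section.name". */
static int toy_parse_key(const char *key)
{
	const char *dot = strrchr(key, '.');

	if (!dot || dot == key)
		return -1; /* no section */
	if (!dot[1])
		return -1; /* no variable name */
	return 0;
}

/*
 * Returns <0 on a bad key, 0 if the key exists (and sets *dest),
 * and 1 if the key is valid but has no values. *dest is left
 * untouched unless 0 is returned.
 */
static int toy_get_value_multi(const struct toy_entry *cfg, size_t nr,
			       const char *key, const struct toy_list **dest)
{
	size_t i;

	if (toy_parse_key(key) < 0)
		return -1;
	for (i = 0; i < nr; i++) {
		if (!strcmp(cfg[i].key, key)) {
			/* A key that exists always has at least one value. */
			assert(cfg[i].values.nr > 0);
			*dest = &cfg[i].values;
			return 0;
		}
	}
	return 1;
}
```

A caller can then treat "err < 0" as a usage error, "err > 0" as "do
nothing", and only look at the list when it gets 0 back, without ever
needing an ".nr == 0" check.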
>> In any case, I'm all for having some simple wrapper for the common cases
> A simple wrapper would be nice, and be exactly the method as it is
> updated in this series. The error-result version could be adopted when
> there is reason to do so.
Well, no :) We ended up with two different "simple wrapper[s]", mine
doesn't have this notion of a "struct string_list *list" with .nr == 0.
>> But I didn't find a single case where we actually needed this "never
>> give me a non-NULL list" behavior, it could just be generalized to
>> "let's have the API tell us if the key exist".
>
> Most cases want to feed the result into the for_each_string_list_item()
> macro. Based on the changes in patch 5, I think the empty list is a
> better pattern and leads to prettier code in almost all cases.
I updated the WIP RFC series I linked to upthread a bit since my initial
reply (the link is still good, I force-pushed), I then rebased your
series here on "master", below is a diff of some select files.
The overall diff is much bigger obviously (API changes and all), but the
below demonstrates some of the API changes (yours is "-", mine is
"+"). I've commented inline on some of it:
diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c
index 635ea5e15fd..16e9a76d04a 100644
--- a/builtin/for-each-repo.c
+++ b/builtin/for-each-repo.c
@@ -29,6 +29,7 @@ int cmd_for_each_repo(int argc, const char **argv, const char *prefix)
static const char *config_key = NULL;
int i, result = 0;
const struct string_list *values;
+ int err;
const struct option options[] = {
OPT_STRING(0, "config", &config_key, N_("config"),
@@ -42,8 +43,13 @@ int cmd_for_each_repo(int argc, const char **argv, const char *prefix)
if (!config_key)
die(_("missing --config=<config>"));
- values = repo_config_get_value_multi(the_repository,
- config_key);
+ err = repo_config_get_value_multi(the_repository, config_key, &values);
+ if (err < 0)
+ usage_msg_optf(_("got bad config --config=%s"),
+ for_each_repo_usage, options, config_key);
+ else if (err)
+ return 0;
+
for (i = 0; !result && i < values->nr; i++)
result = run_command_on_repo(values->items[i].string, argc, argv);
Here we're relaying an error to the user that we couldn't before, because
repo_config_get_value_multi() would return "NULL" for both "key is bad"
and "key doesn't exist". There's a corresponding test modification
below.
diff --git a/builtin/gc.c b/builtin/gc.c
index 1e9ac2ac7e3..94b77a88a99 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1472,9 +1472,7 @@ static int maintenance_register(int argc, const char **argv, const char *prefix)
};
int found = 0;
const char *key = "maintenance.repo";
- char *config_value;
char *maintpath = get_maintpath();
- struct string_list_item *item;
const struct string_list *list;
argc = parse_options(argc, argv, prefix, options,
@@ -1487,18 +1485,11 @@ static int maintenance_register(int argc, const char **argv, const char *prefix)
git_config_set("maintenance.auto", "false");
/* Set maintenance strategy, if unset */
- if (!git_config_get_string("maintenance.strategy", &config_value))
- free(config_value);
- else
+ if (git_config_lookup_value("maintenance.strategy"))
git_config_set("maintenance.strategy", "incremental");
In looking at this I thought we were way overdue for a "does this key
exist?" helper; this and a few other API users can use it.
- list = git_config_get_value_multi(key);
- for_each_string_list_item(item, list) {
- if (!strcmp(maintpath, item->string)) {
- found = 1;
- break;
- }
- }
+ if (!git_config_get_const_value_multi(key, &list))
+ found = unsorted_string_list_has_string(list, maintpath);
So, it turns out that the initial reason you wanted the "pass NULL to
for_each_string_list_item" is actually something we can do with
unsorted_string_list_has_string(), which implements the same loop.
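
As a rough illustration of why the open-coded loop goes away, here is a
self-contained sketch. The toy_* names are hypothetical stand-ins (git's
real helper is unsorted_string_list_has_string() operating on a "struct
string_list"); the sketch assumes the lookup follows the "0 means the
key exists" convention, so the list is only inspected in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for "struct string_list"; not git's type. */
struct toy_list {
	const char **items;
	size_t nr;
};

/* Linear scan over an unsorted list: 1 if "s" is present, else 0. */
static int toy_list_has_string(const struct toy_list *list, const char *s)
{
	size_t i;

	/* Callers only pass a list they got from a successful lookup. */
	assert(list != NULL);
	for (i = 0; i < list->nr; i++)
		if (!strcmp(list->items[i], s))
			return 1;
	return 0;
}

/*
 * "err" is the result of a (hypothetical) multi-value lookup:
 * non-zero means "list" was never populated, so it isn't touched.
 */
static int toy_is_registered(int err, const struct toy_list *list,
			     const char *path)
{
	return !err && toy_list_has_string(list, path);
}
```

With that shape the caller needs neither a "found" loop nor an "item"
iterator variable.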
The difference here is *the* API difference we're discussing. I.e. we'll
never get a NULL "list", we'll instead always get a non-NULL list with
>= 1 item if we can get this key at all.
The "const value" helper is a wrapper that handles the "err < 0" case. In
cases where we hardcode the key it's a BUG() if we get "err < 0". The
wrapper is just:
int err = git_configset_get_value_multi(cs, key, dest);
if (err < 0)
BUG("failed to parse constant key '%s'!", key);
return err;
[...]
@@ -1547,13 +1537,8 @@ static int maintenance_unregister(int argc, const char **argv, const char *prefi
usage_with_options(builtin_maintenance_unregister_usage,
options);
- list = git_config_get_value_multi(key);
- for_each_string_list_item(item, list) {
- if (!strcmp(maintpath, item->string)) {
- found = 1;
- break;
- }
- }
+ if (!git_config_get_const_value_multi(key, &list))
+ found = unsorted_string_list_has_string(list, maintpath);
Ditto the same git_config_get_const_value_multi() &
unsorted_string_list_has_string() pattern.
if (found) {
int rc;
diff --git a/builtin/log.c b/builtin/log.c
index 719ef966045..bdb87f6c42b 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -182,13 +182,15 @@ static void set_default_decoration_filter(struct decoration_filter *decoration_f
int i;
char *value = NULL;
struct string_list *include = decoration_filter->include_ref_pattern;
- struct string_list_item *item;
- const struct string_list *config_exclude =
- git_config_get_value_multi("log.excludeDecoration");
+ const struct string_list *config_exclude;
- for_each_string_list_item(item, config_exclude)
- string_list_append(decoration_filter->exclude_ref_config_pattern,
- item->string);
+ if (!git_config_get_const_value_multi("log.excludeDecoration",
+ &config_exclude)) {
+ struct string_list_item *item;
+ for_each_string_list_item(item, config_exclude)
+ string_list_append(decoration_filter->exclude_ref_config_pattern,
+ item->string);
+ }
Here's a case where we need to use for_each_string_list_item(), I think
it's nice how we can now scope the "item" variable.
/*
* By default, decorate_all is disabled. Enable it if
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 5a8b6120157..b758255f816 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -552,7 +552,7 @@ static int module_init(int argc, const char **argv, const char *prefix)
* If there are no path args and submodule.active is set then,
* by default, only initialize 'active' modules.
*/
- if (!argc && git_config_get_value_multi("submodule.active")->nr)
+ if (!argc && !git_config_lookup_value("submodule.active"))
module_list_active(&list);
You changed these in your 2/5, but they really just wanted the new "does
this key exist?" API. No need to construct the string_list just to throw
it away...
info.prefix = prefix;
@@ -2720,7 +2720,7 @@ static int module_update(int argc, const char **argv, const char *prefix)
* If there are no path args and submodule.active is set then,
* by default, only initialize 'active' modules.
*/
- if (!argc && git_config_get_value_multi("submodule.active")->nr)
+ if (!argc && !git_config_lookup_value("submodule.active"))
module_list_active(&list);
Ditto.
info.prefix = opt.prefix;
@@ -3164,7 +3164,6 @@ static int config_submodule_in_gitmodules(const char *name, const char *var, con
static void configure_added_submodule(struct add_data *add_data)
{
char *key;
- const char *val;
struct child_process add_submod = CHILD_PROCESS_INIT;
struct child_process add_gitmodules = CHILD_PROCESS_INIT;
@@ -3209,7 +3208,7 @@ static void configure_added_submodule(struct add_data *add_data)
* is_submodule_active(), since that function needs to find
* out the value of "submodule.active" again anyway.
*/
- if (!git_config_get_string_tmp("submodule.active", &val)) {
+ if (!git_config_lookup_value("submodule.active")) {
/*
* If the submodule being added isn't already covered by the
* current configured pathspec, set the submodule's active flag
Ditto.
diff --git a/submodule.c b/submodule.c
index 06230961c80..4474cf9ed2d 100644
--- a/submodule.c
+++ b/submodule.c
@@ -274,8 +274,7 @@ int is_tree_submodule_active(struct repository *repo,
free(key);
/* submodule.active is set */
- sl = repo_config_get_value_multi(repo, "submodule.active");
- if (sl && sl->nr) {
+ if (!repo_config_get_const_value_multi(repo, "submodule.active", &sl)) {
struct pathspec ps;
struct strvec args = STRVEC_INIT;
const struct string_list_item *item;
Another "*the* API difference we're discussing". I.e. sure, your end
state would be "if (sl->nr)", but if we make it return "int"...
diff --git a/t/helper/test-config.c b/t/helper/test-config.c
index 90810946783..432ad047537 100644
--- a/t/helper/test-config.c
+++ b/t/helper/test-config.c
@@ -95,8 +95,7 @@ int cmd__config(int argc, const char **argv)
goto exit1;
}
} else if (argc == 3 && !strcmp(argv[1], "get_value_multi")) {
- strptr = git_config_get_value_multi(argv[2]);
- if (strptr->nr) {
+ if (!git_config_get_const_value_multi(argv[2], &strptr)) {
for (i = 0; i < strptr->nr; i++) {
v = strptr->items[i].string;
if (!v)
Ditto, (this one converts away from your preferred API use).
@@ -159,8 +158,7 @@ int cmd__config(int argc, const char **argv)
goto exit2;
}
}
- strptr = git_configset_get_value_multi(&cs, argv[2]);
- if (strptr && strptr->nr) {
+ if (!git_configset_get_const_value_multi(&cs, argv[2], &strptr)) {
for (i = 0; i < strptr->nr; i++) {
v = strptr->items[i].string;
if (!v)
Ditto, sans that you'd presumably want s/strptr && // here.
diff --git a/t/t0068-for-each-repo.sh b/t/t0068-for-each-repo.sh
index 4675e852517..115221c9ca5 100755
--- a/t/t0068-for-each-repo.sh
+++ b/t/t0068-for-each-repo.sh
@@ -33,4 +33,10 @@ test_expect_success 'do nothing on empty config' '
git for-each-repo --config=bogus.config -- help --no-such-option
'
+test_expect_success 'error on bad config keys' '
+ test_expect_code 129 git for-each-repo --config=a &&
+ test_expect_code 129 git for-each-repo --config=a.b. &&
+ test_expect_code 129 git for-each-repo --config="'\''.b"
+'
+
test_done
A test showing behavior change we can implement now that we don't sweep
the "err < 0" under the rug.
That branch also grew to have some other changes we may or may not want.
One thing was to convert the various *_get_*() functions, which
currently normalize any non-zero return value to 1, to pass it through
instead, e.g.:
int git_configset_get_int(struct config_set *cs, const char *key, int *dest)
{
const char *value;
- if (!git_configset_get_value(cs, key, &value)) {
- *dest = git_config_int(key, value);
- return 0;
- } else
- return 1;
+ int err;
+
+ if ((err = git_configset_get_value(cs, key, &value)))
+ return err;
+ *dest = git_config_int(key, value);
+ return 0;
}
No caller currently cares about it, but I think it makes sense generally
not to throw away errors if we can (whether that part is worth the churn
is another topic).
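
A self-contained sketch of what that buys a caller, under loud
assumptions: toy_get_value() is a made-up lookup that knows a single
key, and strtol() stands in for git_config_int() (which does real
validation). The two wrappers differ only in whether the "bad key"
result survives.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical lookup: <0 for a bad key, 0 if found, 1 if not set. */
static int toy_get_value(const char *key, const char **value)
{
	if (!strchr(key, '.'))
		return -1;
	if (!strcmp(key, "core.answer")) {
		*value = "42";
		return 0;
	}
	return 1;
}

/* Old style: every failure, including a bad key, is reported as 1. */
static int toy_get_int_normalized(const char *key, int *dest)
{
	const char *value;

	assert(dest != NULL);
	if (!toy_get_value(key, &value)) {
		*dest = (int)strtol(value, NULL, 10);
		return 0;
	}
	return 1;
}

/* New style: pass the underlying return value through unchanged. */
static int toy_get_int(const char *key, int *dest)
{
	const char *value;
	int err = toy_get_value(key, &value);

	assert(dest != NULL);
	if (err)
		return err;
	*dest = (int)strtol(value, NULL, 10);
	return 0;
}
```

Callers that only test for "!= 0" behave the same with either version;
only a caller that wants to tell "bad key" from "not set" sees a
difference.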
Anyway, the reason I started looking at this RFC to begin with was
because this *_multi() part of the config API has often seemed odd to
me, i.e. I wondered why we couldn't just have it use the "return int,
populate dest" pattern. I'd just never tried to see if I could get that
to work.
It's a bit of one-off churn to get to this point, but I think the end
result of having all the API functions act the same way to signal key
existence vs. validity is worth it.