* [PATCH v4 0/1] receive-pack: optionally deny case clone refs @ 2014-06-11 22:30 David Turner 2014-06-11 22:30 ` [PATCH v4 1/1] " David Turner ` (2 more replies) 0 siblings, 3 replies; 18+ messages in thread From: David Turner @ 2014-06-11 22:30 UTC (permalink / raw) To: git This issue bit us again recently. In talking with some colleagues, I realized that the previous version of this patch, in addition to being potentially slow, was incomplete. Specifically, it didn't handle the case of refs/heads/case/one vs refs/heads/CASE/two; these are case clones even though they strcasecmp different. This new version contains code to prevent this, as well as tests for this case. Also it uses a hashmap to make lookups constant-time. ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v4 1/1] receive-pack: optionally deny case clone refs 2014-06-11 22:30 [PATCH v4 0/1] receive-pack: optionally deny case clone refs David Turner @ 2014-06-11 22:30 ` David Turner 2014-06-13 4:03 ` Torsten Bögershausen 2014-06-12 19:47 ` [PATCH v4 0/1] " Junio C Hamano 2014-08-13 16:20 ` Ronnie Sahlberg 2 siblings, 1 reply; 18+ messages in thread From: David Turner @ 2014-06-11 22:30 UTC (permalink / raw) To: git; +Cc: David Turner It is possible to have two refs which are the same but for case. This works great on the case-sensitive filesystems, but not so well on case-insensitive filesystems. It is fairly typical to have case-insensitive clients (Macs, say) with a case-sensitive server (GNU/Linux). Should a user attempt to pull on a Mac when there are case clone refs with differing contents, they'll get an error message containing something like "Ref refs/remotes/origin/lower is at [sha-of-lowercase-ref] but expected [sha-of-uppercase-ref].... (unable to update local ref)". With a case-insensitive git server, if a branch called capital-M Master (that differs from lowercase-m-master) is pushed, nobody else can push to (lowercase-m) master until the branch is removed. Create the option receive.denycaseclonerefs, which checks pushed refs to ensure that they are not case clones of an existing ref. This setting is turned on by default if core.ignorecase is set, but not otherwise. Signed-off-by: David Turner <dturner@twitter.com> --- Documentation/config.txt | 6 ++ Documentation/git-push.txt | 5 +- Documentation/glossary-content.txt | 9 +++ builtin/receive-pack.c | 147 ++++++++++++++++++++++++++++++++++++- refs.c | 7 +- refs.h | 6 ++ t/t5400-send-pack.sh | 90 +++++++++++++++++++++++ 7 files changed, 261 insertions(+), 9 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 1932e9b..b24b117 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2053,6 +2053,12 @@ receive.unpackLimit:: especially on slow filesystems. If not set, the value of `transfer.unpackLimit` is used instead. +receive.denyCaseCloneRefs:: + If set to true, git-receive-pack will deny a ref update that creates + a ref which is the same but for case as an existing ref. This is + useful when clients are on a case-insensitive filesystem, which + will cause errors when given refs which differ only in case. + receive.denyDeletes:: If set to true, git-receive-pack will deny a ref update that deletes the ref. Use this to prevent such a ref deletion via a push. diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt index 21cd455..c92c3a6 100644 --- a/Documentation/git-push.txt +++ b/Documentation/git-push.txt @@ -323,8 +323,9 @@ remote rejected:: of the following safety options in effect: `receive.denyCurrentBranch` (for pushes to the checked out branch), `receive.denyNonFastForwards` (for forced - non-fast-forward updates), `receive.denyDeletes` or - `receive.denyDeleteCurrent`. See linkgit:git-config[1]. + non-fast-forward updates), `receive.denyDeletes`, + `receive.denyCaseCloneRefs` or `receive.denyDeleteCurrent`. + See linkgit:git-config[1]. remote failure:: The remote end did not report the successful update of the ref, diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt index be0858c..d1f24bf 100644 --- a/Documentation/glossary-content.txt +++ b/Documentation/glossary-content.txt @@ -31,6 +31,15 @@ [[def_cache]]cache:: Obsolete for: <<def_index,index>>. +[[def_case_clone]]case clone:: + Two entities (e.g. filenames or refs) that differ only in case. + These can cause problems on case-insensitive filesystems, and + Git has machinery to prevent these problems in various cases. + Case clone refs also include the situation where one ref has + a path component which is a case-clone of a different ref's path + component in the same directory e.g. refs/heads/case/a is a case + clone of refs/heads/CASE/b + [[def_chain]]chain:: A list of objects, where each <<def_object,object>> in the list contains a reference to its successor (for example, the successor of a diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index c323081..6431758 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -27,6 +27,7 @@ enum deny_action { static int deny_deletes; static int deny_non_fast_forwards; +static int deny_case_clone_refs = DENY_UNCONFIGURED; static enum deny_action deny_current_branch = DENY_UNCONFIGURED; static enum deny_action deny_delete_current = DENY_UNCONFIGURED; static int receive_fsck_objects = -1; @@ -69,6 +70,11 @@ static int receive_pack_config(const char *var, const char *value, void *cb) if (status) return status; + if (strcmp(var, "receive.denycaseclonerefs") == 0) { + deny_case_clone_refs = parse_deny_action(var, value); + return 0; + } + if (strcmp(var, "receive.denydeletes") == 0) { deny_deletes = git_config_bool(var, value); return 0; @@ -468,6 +474,138 @@ static int update_shallow_ref(struct command *cmd, struct shallow_info *si) return 0; } +/* + * This contains not just refs, but ref prefixes -- i.e. not just + * refs/heads/foo/bar, but refs, refs/heads, and refs/heads/foo + */ +struct ref_cache_entry { + struct hashmap_entry ent; + unsigned int count; /* count of refs having this as a component */ + char ref[FLEX_ARRAY]; +}; + +static struct hashmap *ref_case_clone_cache; + +static int ref_cache_entry_cmp(const struct ref_cache_entry *e1, + const struct ref_cache_entry *e2, const char *ref) +{ + return strcasecmp(e1->ref, ref ? ref : e2->ref); +} + +/* + * Insert a ref into the ref cache, as well as all of its ancestor + * directory names -- so if we insert refs/heads/something/other, + * refs, refs/heads, refs/heads/something/other will be included. + */ +static int ref_cache_insert(const char *refname, struct hashmap *map) +{ + int total_len = 0, comp_len; + + while ((comp_len = check_refname_component(refname + total_len, 0)) >= 0) { + struct ref_cache_entry *old; + struct ref_cache_entry *entry = xmalloc(sizeof(*entry) + total_len + comp_len + 1); + total_len += comp_len; + memcpy(entry->ref, refname, total_len); + entry->ref[total_len] = 0; + entry->count = 1; + hashmap_entry_init(entry, memihash(entry->ref, total_len)); + old = hashmap_get(map, entry, entry->ref); + if (old) { + old->count ++; + free(entry); + } else + hashmap_add(map, entry); + total_len ++; + } +} + +/* + * Remove a ref from the ref cache, as well as any of its ancestor + * directory names that no longer contain any refs. + */ +static int ref_cache_delete(const char *refname, struct hashmap *map) +{ + int total_len = 0, comp_len; + + struct ref_cache_entry *entry = xmalloc(sizeof(*entry) + strlen(refname)); + + while ((comp_len = check_refname_component(refname + total_len, 0)) >= 0) { + struct ref_cache_entry *old; + total_len += comp_len; + memcpy(entry->ref, refname, total_len); + entry->ref[total_len] = 0; + hashmap_entry_init(entry, memihash(entry->ref, total_len)); + old = hashmap_get(map, entry, entry->ref); + if (old) { + old->count --; + if (old->count == 0) { + hashmap_remove(map, old, old->ref); + free(old); + } + } else { + warn("Ref cache coherency failure: %s from %s", entry->ref, refname); + break; + } + total_len ++; + } + free(entry); +} + + +static int ref_cache_insert_cb(const char *refname, const unsigned char *sha1, + int flags, void *cb_data) +{ + ref_cache_insert(refname, cb_data); +} + +static void ensure_ref_case_clone_cache(void) +{ + if (ref_case_clone_cache) + return; + ref_case_clone_cache = xmalloc(sizeof(*ref_case_clone_cache)); + hashmap_init(ref_case_clone_cache, + (hashmap_cmp_fn)ref_cache_entry_cmp, 1000); + + for_each_ref(ref_cache_insert_cb, (void *)ref_case_clone_cache); +} + +/* + * Search the ref cache for a ref that is a case clone of this + * incoming ref; this includes prefix case clones so that + * refs/heads/case/clone will conflict with refs/heads/CASE/other + */ +static int ref_is_case_clone(const char *name) { + struct ref_cache_entry key; + struct ref_cache_entry *existing; + int total_len = 0, comp_len; + char *name_so_far = strdup(name); + + while ((comp_len = check_refname_component(name + total_len, 0)) >= 0) { + total_len += comp_len; + name_so_far[total_len] = 0; + hashmap_entry_init(&key, memihash(name_so_far, total_len)); + existing = hashmap_get(ref_case_clone_cache, &key, name_so_far); + if (!existing) + return 0; + if (memcmp(existing->ref, name_so_far, total_len)) + return 1; + name_so_far[total_len] = '/'; + total_len ++; + } + + free(name_so_far); + return 0; +} + +static int ref_is_denied_case_clone(const char *name) +{ + if (!deny_case_clone_refs) + return 0; + ensure_ref_case_clone_cache(); + + return ref_is_case_clone(name); +} + static const char *update(struct command *cmd, struct shallow_info *si) { const char *name = cmd->ref_name; @@ -478,7 +616,8 @@ static const char *update(struct command *cmd, struct shallow_info *si) struct ref_lock *lock; /* only refs/... are allowed */ - if (!starts_with(name, "refs/") || check_refname_format(name + 5, 0)) { + if (!starts_with(name, "refs/") || check_refname_format(name + 5, 0) || + ref_is_denied_case_clone(name)) { rp_error("refusing to create funny ref '%s' remotely", name); return "funny refname"; } @@ -573,6 +712,8 @@ static const char *update(struct command *cmd, struct shallow_info *si) rp_error("failed to delete %s", name); return "failed to delete"; } + if (deny_case_clone_refs) + ref_cache_delete(name, ref_case_clone_cache); return NULL; /* good */ } else { @@ -589,6 +730,8 @@ static const char *update(struct command *cmd, struct shallow_info *si) if (write_ref_sha1(lock, new_sha1, "push")) { return "failed to write"; /* error() already called */ } + if (deny_case_clone_refs) + ref_cache_insert(name, ref_case_clone_cache); return NULL; /* good */ } } @@ -1171,6 +1314,8 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix) die("'%s' does not appear to be a git repository", dir); git_config(receive_pack_config, NULL); + if (deny_case_clone_refs == DENY_UNCONFIGURED) + deny_case_clone_refs = ignore_case; if (0 <= transfer_unpack_limit) unpack_limit = transfer_unpack_limit; diff --git a/refs.c b/refs.c index 28d5eca..7d534cc 100644 --- a/refs.c +++ b/refs.c @@ -29,12 +29,7 @@ static inline int bad_ref_char(int ch) return 0; } -/* - * Try to read one refname component from the front of refname. Return - * the length of the component found, or -1 if the component is not - * legal. - */ -static int check_refname_component(const char *refname, int flags) +int check_refname_component(const char *refname, int flags) { const char *cp; char last = '\0'; diff --git a/refs.h b/refs.h index 87a1a79..38f3272 100644 --- a/refs.h +++ b/refs.h @@ -200,6 +200,12 @@ extern int for_each_reflog(each_ref_fn, void *); * "." or ".."). */ extern int check_refname_format(const char *refname, int flags); +/* + * Try to read one refname component from the front of refname. Return + * the length of the component found, or -1 if the component is not + * legal. + */ +extern int check_refname_component(const char *refname, int flags); extern const char *prettify_refname(const char *refname); extern char *shorten_unambiguous_ref(const char *refname, int strict); diff --git a/t/t5400-send-pack.sh b/t/t5400-send-pack.sh index 0736bcb..de0a88d 100755 --- a/t/t5400-send-pack.sh +++ b/t/t5400-send-pack.sh @@ -129,6 +129,96 @@ test_expect_success 'denyNonFastforwards trumps --force' ' test "$victim_orig" = "$victim_head" ' +test_expect_success 'denyCaseCloneRefs works' ' + ( + cd victim && + git config receive.denyCaseCloneRefs true && + git config receive.denyDeletes false + ) && + git send-pack ./victim HEAD:refs/heads/case/clone && + orig_ver=$(git rev-parse HEAD) && + test_must_fail git send-pack ./victim HEAD^:refs/heads/Case/Clone && + # confirm that this had no effect upstream + ( + cd victim && + ref=$(git for-each-ref --format="%(refname)" refs/heads/Case/Clone) && + echo "$ref" | test_must_fail grep -q Case/Clone && + remote_ver=$(git rev-parse case/clone) && + test "$orig_ver" = "$remote_ver" + ) && + git send-pack ./victim HEAD^:refs/heads/notacase/clone && + test_must_fail git send-pack ./victim :Case/Clone && + # confirm that this had no effect upstream + ( + cd victim && + ref=$(git for-each-ref --format="%(refname)" refs/heads/Case/Clone) && + echo "$ref" | test_must_fail grep -q Case/Clone && + remote_ver=$(git rev-parse case/clone) && + test "$orig_ver" = "$remote_ver" + ) && + git send-pack ./victim :case/clone && + # confirm that this took effect upstream + ( + cd victim && + test_must_fail git rev-parse case/clone + ) && + # check that we can recreate a branch after deleting a + # case-clone of it + case_clone_ver=$(git rev-parse HEAD^) && + git send-pack ./victim HEAD^:refs/heads/Case/Clone && + ( + cd victim && + ref=$(git for-each-ref --format="%(refname)" refs/heads/case/clone) && + echo "$ref" | test_must_fail grep -q case/clone && + remote_ver=$(git rev-parse Case/Clone) && + test "$case_clone_ver" = "$remote_ver" + ) && + # check subdirectory + test_must_fail git send-pack ./victim HEAD:refs/heads/case/clone/morx && + # confirm that this had no effect upstream + ( + cd victim && + ref=$(git for-each-ref --format="%(refname)" refs/heads/case/clone/morx) && + echo "$ref" | test_must_fail grep -q case/clone/morx && + remote_ver=$(git rev-parse Case/Clone) && + test "$case_clone_ver" = "$remote_ver" + ) && + #check parent directory + test_must_fail git send-pack ./victim HEAD:refs/heads/case/other && + # confirm that this had no effect upstream + ( + cd victim && + ref=$(git for-each-ref --format="%(refname)" refs/heads/case/other) && + echo "$ref" | test_must_fail grep -q case/other && + remote_ver=$(git rev-parse Case/Clone) && + test "$case_clone_ver" = "$remote_ver" + ) +' +test_expect_success 'denyCaseCloneRefs works in same send-pack case' ' + ( + cd victim && + git config receive.denyCaseCloneRefs true && + git config receive.denyDeletes false + ) && + test_must_fail git send-pack ./victim HEAD:refs/heads/case2 HEAD~:refs/heads/CASE2 && + ( + cd victim && + ref=$(git for-each-ref --format="%(refname)" refs/heads/CASE2) && + echo "$ref" | test_must_fail grep -q CASE2 && + remote_ver=$(git rev-parse case2) && + test "$orig_ver" = "$remote_ver" + ) && + case_clone_ver=$(git rev-parse HEAD^) && + git send-pack ./victim :refs/heads/case2 HEAD~:refs/heads/CASE2 && + ( + cd victim && + ref=$(git for-each-ref --format="%(refname)" refs/heads/case2) && + echo "$ref" | test_must_fail grep -q case2 && + remote_ver=$(git rev-parse CASE2) && + test "$case_clone_ver" = "$remote_ver" + ) +' + test_expect_success 'push --all excludes remote-tracking hierarchy' ' mkdir parent && ( -- 2.0.0.rc1.24.g0588c94.dirty ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v4 1/1] receive-pack: optionally deny case clone refs 2014-06-11 22:30 ` [PATCH v4 1/1] " David Turner @ 2014-06-13 4:03 ` Torsten Bögershausen 0 siblings, 0 replies; 18+ messages in thread From: Torsten Bögershausen @ 2014-06-13 4:03 UTC (permalink / raw) To: David Turner, git; +Cc: David Turner On 12.06.14 00:30, David Turner wrote: [] Just a general question: > diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c > index c323081..6431758 100644 > --- a/builtin/receive-pack.c > +++ b/builtin/receive-pack.c > @@ -27,6 +27,7 @@ enum deny_action { > > static int deny_deletes; > static int deny_non_fast_forwards; > +static int deny_case_clone_refs = DENY_UNCONFIGURED; > static enum deny_action deny_current_branch = DENY_UNCONFIGURED; > static enum deny_action deny_delete_current = DENY_UNCONFIGURED; > static int receive_fsck_objects = -1; > @@ -69,6 +70,11 @@ static int receive_pack_config(const char *var, const char *value, void *cb) > if (status) > return status; > > + if (strcmp(var, "receive.denycaseclonerefs") == 0) { > + deny_case_clone_refs = parse_deny_action(var, value); _action() : Which action ? May be this is a better name: parse_deny_case_clone_refs() > + return 0; > + } > + > if (strcmp(var, "receive.denydeletes") == 0) { > deny_deletes = git_config_bool(var, value); > return 0; > @@ -468,6 +474,138 @@ static int update_shallow_ref(struct command *cmd, struct shallow_info *si) > return 0; > } > > +/* > + * This contains not just refs, but ref prefixes -- i.e. not just > + * refs/heads/foo/bar, but refs, refs/heads, and refs/heads/foo > + */ > +struct ref_cache_entry { > + struct hashmap_entry ent; > + unsigned int count; /* count of refs having this as a component */ > + char ref[FLEX_ARRAY]; > +}; > + > +static struct hashmap *ref_case_clone_cache; > + > +static int ref_cache_entry_cmp(const struct ref_cache_entry *e1, > + const struct ref_cache_entry *e2, const char *ref) > +{ > + return strcasecmp(e1->ref, ref ? ref : e2->ref); > +} > + > +/* > + * Insert a ref into the ref cache, as well as all of its ancestor > + * directory names -- so if we insert refs/heads/something/other, > + * refs, refs/heads, refs/heads/something/other will be included. > + */ > +static int ref_cache_insert(const char *refname, struct hashmap *map) > +{ > + int total_len = 0, comp_len; > + > + while ((comp_len = check_refname_component(refname + total_len, 0)) >= 0) { > + struct ref_cache_entry *old; > + struct ref_cache_entry *entry = xmalloc(sizeof(*entry) + total_len + comp_len + 1); > + total_len += comp_len; Could the total length can be calculated first and already used in xmalloc() ? That will give 3 lines of code, but the reader is sure we are allocating the right length. > + struct ref_cache_entry; > + total_len += comp_len; > + entry = xmalloc(sizeof(*entry) + total_len + 1); > + memcpy(entry->ref, refname, total_len); > + entry->ref[total_len] = 0; > + entry->count = 1; > + hashmap_entry_init(entry, memihash(entry->ref, total_len)); > + old = hashmap_get(map, entry, entry->ref); > + if (old) { > + old->count ++; > + free(entry); I'm not sure if I read it right: If there is an old entry "old", we anyway create a new one and delete it immediately ? > + } else > + hashmap_add(map, entry); > + total_len ++; > + } > +} > + > +/* > + * Remove a ref from the ref cache, as well as any of its ancestor > + * directory names that no longer contain any refs. > + */ > +static int ref_cache_delete(const char *refname, struct hashmap *map) > +{ > + int total_len = 0, comp_len; > + > + struct ref_cache_entry *entry = xmalloc(sizeof(*entry) + strlen(refname)); > + > + while ((comp_len = check_refname_component(refname + total_len, 0)) >= 0) { > + struct ref_cache_entry *old; > + total_len += comp_len; > + memcpy(entry->ref, refname, total_len); > + entry->ref[total_len] = 0; > + hashmap_entry_init(entry, memihash(entry->ref, total_len)); > + old = hashmap_get(map, entry, entry->ref); > + if (old) { > + old->count --; > + if (old->count == 0) { > + hashmap_remove(map, old, old->ref); > + free(old); > + } > + } else { > + warn("Ref cache coherency failure: %s from %s", entry->ref, refname); > + break; > + } > + total_len ++; > + } > + free(entry); > +} > + > + > +static int ref_cache_insert_cb(const char *refname, const unsigned char *sha1, > + int flags, void *cb_data) > +{ > + ref_cache_insert(refname, cb_data); > +} > + > +static void ensure_ref_case_clone_cache(void) > +{ > + if (ref_case_clone_cache) > + return; > + ref_case_clone_cache = xmalloc(sizeof(*ref_case_clone_cache)); > + hashmap_init(ref_case_clone_cache, > + (hashmap_cmp_fn)ref_cache_entry_cmp, 1000); > + > + for_each_ref(ref_cache_insert_cb, (void *)ref_case_clone_cache); > +} > + > +/* > + * Search the ref cache for a ref that is a case clone of this > + * incoming ref; this includes prefix case clones so that > + * refs/heads/case/clone will conflict with refs/heads/CASE/other > + */ > +static int ref_is_case_clone(const char *name) { > + struct ref_cache_entry key; > + struct ref_cache_entry *existing; > + int total_len = 0, comp_len; > + char *name_so_far = strdup(name); > + > + while ((comp_len = check_refname_component(name + total_len, 0)) >= 0) { > + total_len += comp_len; > + name_so_far[total_len] = 0; > + hashmap_entry_init(&key, memihash(name_so_far, total_len)); > + existing = hashmap_get(ref_case_clone_cache, &key, name_so_far); > + if (!existing) > + return 0; > + if (memcmp(existing->ref, name_so_far, total_len)) > + return 1; > + name_so_far[total_len] = '/'; > + total_len ++; > + } > + > + free(name_so_far); > + return 0; > +} > + > +static int ref_is_denied_case_clone(const char *name) > +{ > + if (!deny_case_clone_refs) > + return 0; > + ensure_ref_case_clone_cache(); > + > + return ref_is_case_clone(name); > +} > + > static const char *update(struct command *cmd, struct shallow_info *si) > { > const char *name = cmd->ref_name; > @@ -478,7 +616,8 @@ static const char *update(struct command *cmd, struct shallow_info *si) > struct ref_lock *lock; > > /* only refs/... are allowed */ > - if (!starts_with(name, "refs/") || check_refname_format(name + 5, 0)) { > + if (!starts_with(name, "refs/") || check_refname_format(name + 5, 0) || > + ref_is_denied_case_clone(name)) { > rp_error("refusing to create funny ref '%s' remotely", name); > return "funny refname"; Not related to this patch: the word "funny" may be not so funny. But related to the patch: If the update is denied because of ref_is_denied_case_clone() says so, the user wants to know this. So that she/he is able to understand it better. Here we may want something like rp_error("refusing to create ref '%s' remotely because of a case insensitive duplicate", name); > > } > @@ -573,6 +712,8 @@ static const char *update(struct command *cmd, struct shallow_info *si) > rp_error("failed to delete %s", name); > return "failed to delete"; > } > + if (deny_case_clone_refs) > + ref_cache_delete(name, ref_case_clone_cache); > return NULL; /* good */ > } > else { > @@ -589,6 +730,8 @@ static const char *update(struct command *cmd, struct shallow_info *si) > if (write_ref_sha1(lock, new_sha1, "push")) { > return "failed to write"; /* error() already called */ > } > + if (deny_case_clone_refs) > + ref_cache_insert(name, ref_case_clone_cache); > return NULL; /* good */ > } > } > @@ -1171,6 +1314,8 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix) > die("'%s' does not appear to be a git repository", dir); > > git_config(receive_pack_config, NULL); > + if (deny_case_clone_refs == DENY_UNCONFIGURED) > + deny_case_clone_refs = ignore_case; > > if (0 <= transfer_unpack_limit) > unpack_limit = transfer_unpack_limit; > diff --git a/refs.c b/refs.c > index 28d5eca..7d534cc 100644 > --- a/refs.c > +++ b/refs.c > @@ -29,12 +29,7 @@ static inline int bad_ref_char(int ch) > return 0; > } > > -/* > - * Try to read one refname component from the front of refname. Return > - * the length of the component found, or -1 if the component is not > - * legal. > - */ > -static int check_refname_component(const char *refname, int flags) > +int check_refname_component(const char *refname, int flags) > { > const char *cp; > char last = '\0'; > diff --git a/refs.h b/refs.h > index 87a1a79..38f3272 100644 > --- a/refs.h > +++ b/refs.h > @@ -200,6 +200,12 @@ extern int for_each_reflog(each_ref_fn, void *); > * "." or ".."). > */ > extern int check_refname_format(const char *refname, int flags); > +/* > + * Try to read one refname component from the front of refname. Return > + * the length of the component found, or -1 if the component is not > + * legal. > + */ > +extern int check_refname_component(const char *refname, int flags); > > extern const char *prettify_refname(const char *refname); > extern char *shorten_unambiguous_ref(const char *refname, int strict); > diff --git a/t/t5400-send-pack.sh b/t/t5400-send-pack.sh > index 0736bcb..de0a88d 100755 > --- a/t/t5400-send-pack.sh > +++ b/t/t5400-send-pack.sh > @@ -129,6 +129,96 @@ test_expect_success 'denyNonFastforwards trumps --force' ' > test "$victim_orig" = "$victim_head" > ' > > +test_expect_success 'denyCaseCloneRefs works' ' > + ( > + cd victim && > + git config receive.denyCaseCloneRefs true && > + git config receive.denyDeletes false > + ) && > + git send-pack ./victim HEAD:refs/heads/case/clone && > + orig_ver=$(git rev-parse HEAD) && > + test_must_fail git send-pack ./victim HEAD^:refs/heads/Case/Clone && > + # confirm that this had no effect upstream > + ( > + cd victim && > + ref=$(git for-each-ref --format="%(refname)" refs/heads/Case/Clone) && > + echo "$ref" | test_must_fail grep -q Case/Clone && > + remote_ver=$(git rev-parse case/clone) && > + test "$orig_ver" = "$remote_ver" > + ) && > + git send-pack ./victim HEAD^:refs/heads/notacase/clone && > + test_must_fail git send-pack ./victim :Case/Clone && > + # confirm that this had no effect upstream > + ( > + cd victim && > + ref=$(git for-each-ref --format="%(refname)" refs/heads/Case/Clone) && > + echo "$ref" | test_must_fail grep -q Case/Clone && I'm not sure if this is the ideal combination: Collect information in a shell variable, echo that to stdout and use grep -q to find out that something is NOT there- especially as I have in the back of my mind the warning "grep -q" is not portable... But grep -q is in POSIX, so it may be that versions of grep being part of busybox have this restriction. I don't have a better suggestion either, just a loose idea: cd victim && cat "this_or_that" >expected && git for-each-ref --format="%(refname)" refs/heads/Case/Clone | sort >actual && test_cmp expect actual ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/1] receive-pack: optionally deny case clone refs 2014-06-11 22:30 [PATCH v4 0/1] receive-pack: optionally deny case clone refs David Turner 2014-06-11 22:30 ` [PATCH v4 1/1] " David Turner @ 2014-06-12 19:47 ` Junio C Hamano 2014-06-12 23:30 ` David Turner 2014-06-13 18:20 ` Ronnie Sahlberg 2014-08-13 16:20 ` Ronnie Sahlberg 2 siblings, 2 replies; 18+ messages in thread From: Junio C Hamano @ 2014-06-12 19:47 UTC (permalink / raw) To: David Turner; +Cc: git, Michael Haggerty, Ronnie Sahlberg, Jonathan Nieder David Turner <dturner@twopensource.com> writes: > This issue bit us again recently. > > In talking with some colleagues, I realized that the previous version > of this patch, in addition to being potentially slow, was incomplete. > Specifically, it didn't handle the case of refs/heads/case/one vs > refs/heads/CASE/two; these are case clones even though they strcasecmp > different. Good catch to realize that two refs that share leading paths that are the same except for cases are also problematic, but that makes this feature even less about "case clones", doesn't it? Also it somehow feels that the patch attempts to solve the issue at a wrong level. On a platform that cannot represent two refs like these (e.g. trying to create "FOO" when "foo" already exists, or trying to create "a/c" when "A/b" already exists---ending up with "A/c" instead, which is not what the user wanted to create), would it be more sensible to fail the ref creation without touching the users of ref API such as receive-pack? That way, you would also catch other uses of refs that are not supported on your system, e.g. "git branch a/c" when there already is a branch called "A/b", no? CC'ing those who are more active in the ref API area recently than I am for their inputs. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/1] receive-pack: optionally deny case clone refs 2014-06-12 19:47 ` [PATCH v4 0/1] " Junio C Hamano @ 2014-06-12 23:30 ` David Turner 2014-06-13 4:03 ` Torsten Bögershausen 2014-06-13 17:08 ` Junio C Hamano 2014-06-13 18:20 ` Ronnie Sahlberg 1 sibling, 2 replies; 18+ messages in thread From: David Turner @ 2014-06-12 23:30 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Michael Haggerty, Ronnie Sahlberg, Jonathan Nieder On Thu, 2014-06-12 at 12:47 -0700, Junio C Hamano wrote: > David Turner <dturner@twopensource.com> writes: > > > This issue bit us again recently. > > > > In talking with some colleagues, I realized that the previous version > > of this patch, in addition to being potentially slow, was incomplete. > > Specifically, it didn't handle the case of refs/heads/case/one vs > > refs/heads/CASE/two; these are case clones even though they strcasecmp > > different. > > Good catch to realize that two refs that share leading paths that > are the same except for cases are also problematic, but that makes > this feature even less about "case clones", doesn't it? I agree: word "clone" is less good now. Maybe "case conflicts"? > Also it somehow feels that the patch attempts to solve the issue at > a wrong level. On a platform that cannot represent two refs like > these (e.g. trying to create "FOO" when "foo" already exists, or > trying to create "a/c" when "A/b" already exists---ending up with > "A/c" instead, which is not what the user wanted to create), would > it be more sensible to fail the ref creation without touching the > users of ref API such as receive-pack? That way, you would also > catch other uses of refs that are not supported on your system, > e.g. "git branch a/c" when there already is a branch called "A/b", > no? So we would change is_refname_available? And to do this, we would change the ref_dir functions to take case into account? This might be somewhat complicated because we could be starting from a repo which already has case conflicts. What should we do about that? I think this is even possible on a case-insensitive filesystem with packed-refs, although I have not checked. The simplest idea is probably to pretend that the first conflicting refname component we find is the one true one, and reject new refs containing versions which are case conflicting with this one until the user cleans up their repo. In other words, if the user has A/b and a/c already, and we find A/b first, then we reject a/d but allow A/d. This is arbitrary, but workable. We could warn about this situation when we load up the refs, too. Does this match what you are suggesting? If so, I think it is possible, and if I don't hear anything back from the other ref folks, I'll see if I have time to implement it. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/1] receive-pack: optionally deny case clone refs 2014-06-12 23:30 ` David Turner @ 2014-06-13 4:03 ` Torsten Bögershausen 2014-06-13 17:12 ` Junio C Hamano 2014-06-13 17:08 ` Junio C Hamano 1 sibling, 1 reply; 18+ messages in thread From: Torsten Bögershausen @ 2014-06-13 4:03 UTC (permalink / raw) To: David Turner, Junio C Hamano Cc: git, Michael Haggerty, Ronnie Sahlberg, Jonathan Nieder On 2014-06-13 01.30, David Turner wrote: > On Thu, 2014-06-12 at 12:47 -0700, Junio C Hamano wrote: >> David Turner <dturner@twopensource.com> writes: >> >>> This issue bit us again recently. >>> >>> In talking with some colleagues, I realized that the previous version >>> of this patch, in addition to being potentially slow, was incomplete. >>> Specifically, it didn't handle the case of refs/heads/case/one vs >>> refs/heads/CASE/two; these are case clones even though they strcasecmp >>> different. >> >> Good catch to realize that two refs that share leading paths that >> are the same except for cases are also problematic, but that makes >> this feature even less about "case clones", doesn't it? > > I agree: word "clone" is less good now. Maybe "case conflicts"? > >> Also it somehow feels that the patch attempts to solve the issue at >> a wrong level. On a platform that cannot represent two refs like >> these (e.g. trying to create "FOO" when "foo" already exists, or >> trying to create "a/c" when "A/b" already exists---ending up with >> "A/c" instead, which is not what the user wanted to create), would >> it be more sensible to fail the ref creation without touching the >> users of ref API such as receive-pack? That way, you would also >> catch other uses of refs that are not supported on your system, >> e.g. "git branch a/c" when there already is a branch called "A/b", >> no? > > So we would change is_refname_available? And to do this, we would > change the ref_dir functions to take case into account? > > This might be somewhat complicated because we could be starting from a > repo which already has case conflicts. What should we do about that? I > think this is even possible on a case-insensitive filesystem with > packed-refs, although I have not checked. It is: user@mac:~/test> git init Initialized empty Git repository in /Users/tb/test/.git/ user@mac:~/test> git checkout -b Branch Switched to a new branch 'Branch' user@mac:~/test> echo a>a && git add a && git commit -m "Add a" [Branch (root-commit) e27e99c] Add a 1 file changed, 1 insertion(+) create mode 100644 a user@mac:~/test> ls .git/refs/heads/ Branch user@mac:~/test> git pack-refs --all user@mac:~/test> ls .git/refs/heads/ user@mac:~/test> git checkout -b BRANCH Switched to a new branch 'BRANCH' user@mac:~/test> ls .git/refs/heads/ BRANCH user@mac:~/test> git branch * BRANCH Branch user@mac:~/test> > The simplest idea is probably > to pretend that the first conflicting refname component we find is the > one true one, and reject new refs containing versions which are case > conflicting with this one until the user cleans up their repo. In other > words, if the user has A/b and a/c already, and we find A/b first, then > we reject a/d but allow A/d. This is arbitrary, but workable. We > could warn about this situation when we load up the refs, too. > > Does this match what you are suggesting? > > If so, I think it is possible, and if I don't hear anything back from > the other ref folks, I'll see if I have time to implement it. I can see 2 ways "forward": Either to try to avoid case insensitve refs at all. If you do that carefully in a project, it may work with the patches you suggest. Or try to have a functionality to always use packed refs, and have a configuration for it: The advantage can be that branch names like "Branch" and "BRANCH" can live together in a project, regardless if you have a case sensitive or insensitve file system. This assumes that e.g. "core.packedrefs" is default true whenever core.ignorecase is true, which we can easily do in init_db.c Another advantage with the "always packed refs" is that you can have branches bugfix and bugfix/next-bug side by side. Today when you create a branch bugfix theh you have a file bugfix, and can not create a directory called bugfix/ to store the file bugfix/next-bug. I don't know if this is scaling well for many refs, some reports say it does not. But I can imagine to mmap .git/packed-refs into memory, throw every line into a hash table and do a lazy check if a ref is "good" whenever it is needed. Or do it in git fsck ? The disadvantege with packed refs is that you need to carefully lock one central file, and that may be a bottleneck on a server. I don't have that many refs in my projects, so I only can test on dummy repos, more or less close to reality. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/1] receive-pack: optionally deny case clone refs 2014-06-13 4:03 ` Torsten Bögershausen @ 2014-06-13 17:12 ` Junio C Hamano 0 siblings, 0 replies; 18+ messages in thread From: Junio C Hamano @ 2014-06-13 17:12 UTC (permalink / raw) To: Torsten Bögershausen Cc: David Turner, git, Michael Haggerty, Ronnie Sahlberg, Jonathan Nieder Torsten Bögershausen <tboegi@web.de> writes: > Or try to have a functionality to always use packed refs, and have a configuration > for it: > > The advantage can be that branch names like "Branch" and "BRANCH" can live together > in a project, regardless if you have a case sensitive or insensitve file system. That is unfortunately *FALSE*. You forgot about how reflogs are stored for these branches. > Another advantage with the "always packed refs" is that you can have branches > bugfix and bugfix/next-bug side by side. Again, you forgot about reflogs. For this case I do not think it even is unfortunate. Having two "bugfix" and "bugfix/a" branches, if you are planning to use the hierarchical names to group things together, does not make much sense, and if you are not using these hierarchical names to group things together you can always use "bugfix" vs "bugfix-a" instead. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/1] receive-pack: optionally deny case clone refs 2014-06-12 23:30 ` David Turner 2014-06-13 4:03 ` Torsten Bögershausen @ 2014-06-13 17:08 ` Junio C Hamano 1 sibling, 0 replies; 18+ messages in thread From: Junio C Hamano @ 2014-06-13 17:08 UTC (permalink / raw) To: David Turner; +Cc: git, Michael Haggerty, Ronnie Sahlberg, Jonathan Nieder David Turner <dturner@twopensource.com> writes: > On Thu, 2014-06-12 at 12:47 -0700, Junio C Hamano wrote: >> David Turner <dturner@twopensource.com> writes: >> >> > This issue bit us again recently. >> > >> > In talking with some colleagues, I realized that the previous version >> > of this patch, in addition to being potentially slow, was incomplete. >> > Specifically, it didn't handle the case of refs/heads/case/one vs >> > refs/heads/CASE/two; these are case clones even though they strcasecmp >> > different. >> >> Good catch to realize that two refs that share leading paths that >> are the same except for cases are also problematic, but that makes >> this feature even less about "case clones", doesn't it? > > I agree: word "clone" is less good now. Maybe "case conflicts"? Sounds better but I'd like to hear from the ref people first, as they have thought about it longer than I have ;-) >> Also it somehow feels that the patch attempts to solve the issue at >> a wrong level. On a platform that cannot represent two refs like >> these (e.g. trying to create "FOO" when "foo" already exists, or >> trying to create "a/c" when "A/b" already exists---ending up with >> "A/c" instead, which is not what the user wanted to create), would >> it be more sensible to fail the ref creation without touching the >> users of ref API such as receive-pack? That way, you would also >> catch other uses of refs that are not supported on your system, >> e.g. "git branch a/c" when there already is a branch called "A/b", >> no? > > So we would change is_refname_available? And to do this, we would > change the ref_dir functions to take case into account? > ... > In other > words, if the user has A/b and a/c already, and we find A/b first, then > we reject a/d but allow A/d. This is arbitrary, but workable. We > could warn about this situation when we load up the refs, too. > > Does this match what you are suggesting? Yes. But again, I'd like to hear from the ref people first. They may have better ideas. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/1] receive-pack: optionally deny case clone refs 2014-06-12 19:47 ` [PATCH v4 0/1] " Junio C Hamano 2014-06-12 23:30 ` David Turner @ 2014-06-13 18:20 ` Ronnie Sahlberg 2014-06-13 19:05 ` Ronnie Sahlberg 2014-06-13 21:25 ` Junio C Hamano 1 sibling, 2 replies; 18+ messages in thread From: Ronnie Sahlberg @ 2014-06-13 18:20 UTC (permalink / raw) To: Junio C Hamano; +Cc: David Turner, git, Michael Haggerty, Jonathan Nieder On Thu, Jun 12, 2014 at 12:47 PM, Junio C Hamano <gitster@pobox.com> wrote: > David Turner <dturner@twopensource.com> writes: > >> This issue bit us again recently. >> >> In talking with some colleagues, I realized that the previous version >> of this patch, in addition to being potentially slow, was incomplete. >> Specifically, it didn't handle the case of refs/heads/case/one vs >> refs/heads/CASE/two; these are case clones even though they strcasecmp >> different. > > Good catch to realize that two refs that share leading paths that > are the same except for cases are also problematic, but that makes > this feature even less about "case clones", doesn't it? > > Also it somehow feels that the patch attempts to solve the issue at > a wrong level. > On a platform that cannot represent two refs like > these (e.g. trying to create "FOO" when "foo" already exists, or > trying to create "a/c" when "A/b" already exists---ending up with > "A/c" instead, which is not what the user wanted to create), would > it be more sensible to fail the ref creation without touching the > users of ref API such as receive-pack? That way, you would also > catch other uses of refs that are not supported on your system, > e.g. "git branch a/c" when there already is a branch called "A/b", > no? It gets even more hairy : If the server has A/a and a/b and you clone it it becomes A/a and A/b locally. Then you push back to the server and you end up with three refs on the server: A/a A/b and a/b. Or what we end up with locally could either be a/a and a/b or A/a and A/b depending on which order the server sends the refs back. Since the ordering I think is not formally defined, is it?, this could mean that we would end up with something but it would be difficult to deteministically decide what the outcome would be. As the refs handling is pretty complex as is I think we want to avoid adding undeterministic residuals after a clone/fetch/* has failed. That then IMHO means that we should wait with implementing something until we have finished the ref API rewrite. If nothing else, having atomic transactions would mean that when things fail due to case collissions we would have a deterministic outcome : transaction failed so nothing was created locally. But I think this is the wrong level to solve this issue at. receive-pack.c is only one out of many place where I think this would come into effect. You also have reflogs, remotes and all sorts of other things that would have to be fixed up. If we want to add this kind check-fails, we should wait until after we have ref transactions because then we will have one single path, transaction_update_sha1(), through which every ref creation/update will have to pass. and thus then we have one single place where such a check-fail could cover all cases of ref creations. > > CC'ing those who are more active in the ref API area recently than I > am for their inputs. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/1] receive-pack: optionally deny case clone refs 2014-06-13 18:20 ` Ronnie Sahlberg @ 2014-06-13 19:05 ` Ronnie Sahlberg 2014-06-13 21:11 ` Junio C Hamano 2014-06-15 7:10 ` David Turner 2014-06-13 21:25 ` Junio C Hamano 1 sibling, 2 replies; 18+ messages in thread From: Ronnie Sahlberg @ 2014-06-13 19:05 UTC (permalink / raw) To: Junio C Hamano; +Cc: David Turner, git, Michael Haggerty, Jonathan Nieder Thinking about it more. I think we want to wait until the ref transaction API work is finished. The ref transactions API is in progress and it aims to add transactions for ref updates as a first step but then it aims to define a public API for all public ref functions. As part of that I will also develop support for pluggable ref backends. The first backend will be the current files based structure but I also will add an optional backend using a TDB database. That means that on mac you could just use "core.refs_backend = tdb" and you would have full support for case sensitive refs. With that backend "refs/heads/A/foo" would no longer be a file but a key to a row in the TDB database. I.e. no need to do anything at all, just wait until my refs work is finished and yourt problem is gone :-) Alternatively, if you don't want a TDB database and you REALLY want to keep storing the refs as files, just as they are today, that would be very easy to do too once the refs work is finished. Each refs backend will be defined by a set of methods. struct ref_be refs_files = { .transaction_create_sha1 = files_transaction_create_sha1, ... You could then very easily create a new backend, say 'struct refs_be refs_files_case_insensitive' where the methods would just convert any refnames to/from a case insensitive encoding before invoking the default methods from the files backend. Perhaps something as simple as converting any upper case characters to/from '%xx' representation when accessing the actual files. On Fri, Jun 13, 2014 at 11:20 AM, Ronnie Sahlberg <sahlberg@google.com> wrote: > On Thu, Jun 12, 2014 at 12:47 PM, Junio C Hamano <gitster@pobox.com> wrote: >> David Turner <dturner@twopensource.com> writes: >> >>> This issue bit us again recently. >>> >>> In talking with some colleagues, I realized that the previous version >>> of this patch, in addition to being potentially slow, was incomplete. >>> Specifically, it didn't handle the case of refs/heads/case/one vs >>> refs/heads/CASE/two; these are case clones even though they strcasecmp >>> different. >> >> Good catch to realize that two refs that share leading paths that >> are the same except for cases are also problematic, but that makes >> this feature even less about "case clones", doesn't it? >> >> Also it somehow feels that the patch attempts to solve the issue at >> a wrong level. >> On a platform that cannot represent two refs like >> these (e.g. trying to create "FOO" when "foo" already exists, or >> trying to create "a/c" when "A/b" already exists---ending up with >> "A/c" instead, which is not what the user wanted to create), would >> it be more sensible to fail the ref creation without touching the >> users of ref API such as receive-pack? That way, you would also >> catch other uses of refs that are not supported on your system, >> e.g. "git branch a/c" when there already is a branch called "A/b", >> no? > > It gets even more hairy : > If the server has A/a and a/b and you clone it it becomes A/a and A/b > locally. Then you push back to the server and you end up with three > refs on the server: A/a A/b and a/b. > Or what we end up with locally could either be a/a and a/b or A/a and > A/b depending on which order the server sends the refs back. > Since the ordering I think is not formally defined, is it?, this could > mean that we would end up with something but it would be difficult to > deteministically decide what the outcome would be. > As the refs handling is pretty complex as is I think we want to avoid > adding undeterministic residuals after a clone/fetch/* has failed. > > That then IMHO means that we should wait with implementing something > until we have finished the ref API rewrite. If nothing else, having > atomic transactions would mean that > when things fail due to case collissions we would have a deterministic > outcome : transaction failed so nothing was created locally. > > > > But I think this is the wrong level to solve this issue at. > receive-pack.c is only one out of many place where I think this would > come into effect. > You also have reflogs, remotes and all sorts of other things that > would have to be fixed up. > > If we want to add this kind check-fails, we should wait until after we > have ref transactions because then we will have one single path, > transaction_update_sha1(), through which every ref creation/update > will have to pass. and thus then we have one single place where such a > check-fail could cover all cases of ref creations. > > > > >> >> CC'ing those who are more active in the ref API area recently than I >> am for their inputs. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/1] receive-pack: optionally deny case clone refs 2014-06-13 19:05 ` Ronnie Sahlberg @ 2014-06-13 21:11 ` Junio C Hamano 2014-06-13 22:24 ` Ronnie Sahlberg 2014-06-15 7:10 ` David Turner 1 sibling, 1 reply; 18+ messages in thread From: Junio C Hamano @ 2014-06-13 21:11 UTC (permalink / raw) To: Ronnie Sahlberg; +Cc: David Turner, git, Michael Haggerty, Jonathan Nieder Ronnie Sahlberg <sahlberg@google.com> writes: > ... The first > backend will be the current files based structure but I also will add > an optional backend using a TDB database. I am assuming that as part of the transactions work, accesses to reflogs will also have their own backends? > You could then very easily create a new backend, say 'struct refs_be > refs_files_case_insensitive' where the methods would just convert any > refnames to/from a case insensitive encoding before invoking the > default methods from the files backend. > Perhaps something as simple as converting any upper case characters > to/from '%xx' representation when accessing the actual files. Hmm... that would work only when the new implementation of Git is the only one that accesses the repository. Other implementations (e.g. Eclipse via egit, Gerrit via jgit, etc.) peeking into the same repository wouldn't know what to do with these encoded refnames. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/1] receive-pack: optionally deny case clone refs 2014-06-13 21:11 ` Junio C Hamano @ 2014-06-13 22:24 ` Ronnie Sahlberg 0 siblings, 0 replies; 18+ messages in thread From: Ronnie Sahlberg @ 2014-06-13 22:24 UTC (permalink / raw) To: Junio C Hamano; +Cc: David Turner, git, Michael Haggerty, Jonathan Nieder On Fri, Jun 13, 2014 at 2:11 PM, Junio C Hamano <gitster@pobox.com> wrote: > Ronnie Sahlberg <sahlberg@google.com> writes: > >> ... The first >> backend will be the current files based structure but I also will add >> an optional backend using a TDB database. > > I am assuming that as part of the transactions work, accesses to > reflogs will also have their own backends? Yes. They will be stored in the same database so that updates to refs and reflogs will become truly atomic. > >> You could then very easily create a new backend, say 'struct refs_be >> refs_files_case_insensitive' where the methods would just convert any >> refnames to/from a case insensitive encoding before invoking the >> default methods from the files backend. >> Perhaps something as simple as converting any upper case characters >> to/from '%xx' representation when accessing the actual files. > > Hmm... that would work only when the new implementation of Git is > the only one that accesses the repository. Other implementations > (e.g. Eclipse via egit, Gerrit via jgit, etc.) peeking into the > same repository wouldn't know what to do with these encoded > refnames. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/1] receive-pack: optionally deny case clone refs 2014-06-13 19:05 ` Ronnie Sahlberg 2014-06-13 21:11 ` Junio C Hamano @ 2014-06-15 7:10 ` David Turner 1 sibling, 0 replies; 18+ messages in thread From: David Turner @ 2014-06-15 7:10 UTC (permalink / raw) To: Ronnie Sahlberg; +Cc: Junio C Hamano, git, Michael Haggerty, Jonathan Nieder On Fri, 2014-06-13 at 12:05 -0700, Ronnie Sahlberg wrote: > Thinking about it more. > > I think we want to wait until the ref transaction API work is > finished. The ref transactions API is in progress and it aims to add > transactions for ref updates as a first step but then it aims to > define a public API for all public ref functions. As part of that I > will also develop support for pluggable ref backends. The first > backend will be the current files based structure but I also will add > an optional backend using a TDB database. OK, in that case I'll shelve this patch for now (except that we'll probably use it internally) If for some reason pluggable ref backends don't work out, we can always revive it. Thanks to all for the reviews. You can find the latest version at https://github.com/dturner-tw/git/tree/case-ref . ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/1] receive-pack: optionally deny case clone refs 2014-06-13 18:20 ` Ronnie Sahlberg 2014-06-13 19:05 ` Ronnie Sahlberg @ 2014-06-13 21:25 ` Junio C Hamano 2014-06-18 11:33 ` Michael Haggerty 1 sibling, 1 reply; 18+ messages in thread From: Junio C Hamano @ 2014-06-13 21:25 UTC (permalink / raw) To: Ronnie Sahlberg; +Cc: David Turner, git, Michael Haggerty, Jonathan Nieder Ronnie Sahlberg <sahlberg@google.com> writes: > It gets even more hairy : > If the server has A/a and a/b and you clone it it becomes A/a and A/b > locally. Then you push back to the server and you end up with three > refs on the server: A/a A/b and a/b. That is part of the transition in deployment. David who wants to forbid A/a and a/b mixed in his project will surely correct the situation at the server end so "somebody fetches A/a and a/b and ends up with A/a and A/b" will not happen. They will fetch A/a and A/b. If a user is with an older Git and he has his own a/c, fetching A/a and A/b from a corrected central repository will still give the user a/a and a/b, but then pushing it back will be forbidden. The user's repository needs to be fixed and installation of Git needs to be updated to the version with an equivalent of David's "optionally deny" feature implemented for the fetching side, so that the user notices the local a/c is bad and not allowed within the context of his project, deletes it and recreates it as A/c before he can fetch A/a and A/b from the central repository. I agree that the transition may be painful, but as long as the desired semantics is "If you have A/a, you are not allowed to have a/a or a/b", it cannot be avoided---in that sense, I view it as a lower priority issue. Having said that, it may indicate that the desired semantics above may not be the optimal one. Perhaps the flag might want to be "on this platform, we cannot do case sensitive refs, so pretend as if all refs are lowercase" instead. I suspect that such a flag may allow smoother transition than what has been proposed. Underlying refs "A/a" and "a/b" can stay as they are in David's central repository, but ref enumeration with the flag enabled will return a/a and a/b, and these are the names that will be fetched by the users. If the user had an existing A/c, then fetching these will still create A/a and A/b locally, but pushing them back will, under that flag enabled, be turned into updates to a/a, a/b, and a/c on the central server side with updated Git. I dunno. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/1] receive-pack: optionally deny case clone refs 2014-06-13 21:25 ` Junio C Hamano @ 2014-06-18 11:33 ` Michael Haggerty 2014-06-18 15:03 ` Ronnie Sahlberg 0 siblings, 1 reply; 18+ messages in thread From: Michael Haggerty @ 2014-06-18 11:33 UTC (permalink / raw) To: Junio C Hamano, Ronnie Sahlberg; +Cc: David Turner, git, Jonathan Nieder On 06/13/2014 11:25 PM, Junio C Hamano wrote: > Ronnie Sahlberg <sahlberg@google.com> writes: > >> It gets even more hairy : >> If the server has A/a and a/b and you clone it it becomes A/a and A/b >> locally. Then you push back to the server and you end up with three >> refs on the server: A/a A/b and a/b. > > That is part of the transition in deployment. David who wants to > forbid A/a and a/b mixed in his project will surely correct the > situation at the server end so "somebody fetches A/a and a/b and > ends up with A/a and A/b" will not happen. They will fetch A/a and > A/b. > > If a user is with an older Git and he has his own a/c, fetching A/a > and A/b from a corrected central repository will still give the user > a/a and a/b, but then pushing it back will be forbidden. The user's > repository needs to be fixed and installation of Git needs to be > updated to the version with an equivalent of David's "optionally > deny" feature implemented for the fetching side, so that the user > notices the local a/c is bad and not allowed within the context of > his project, deletes it and recreates it as A/c before he can fetch > A/a and A/b from the central repository. > > I agree that the transition may be painful, but as long as the > desired semantics is "If you have A/a, you are not allowed to have > a/a or a/b", it cannot be avoided---in that sense, I view it as a > lower priority issue. > > Having said that, it may indicate that the desired semantics above > may not be the optimal one. Perhaps the flag might want to be "on > this platform, we cannot do case sensitive refs, so pretend as if > all refs are lowercase" instead. I suspect that such a flag may > allow smoother transition than what has been proposed. > > Underlying refs "A/a" and "a/b" can stay as they are in David's > central repository, but ref enumeration with the flag enabled will > return a/a and a/b, and these are the names that will be fetched by > the users. If the user had an existing A/c, then fetching these > will still create A/a and A/b locally, but pushing them back will, > under that flag enabled, be turned into updates to a/a, a/b, and a/c > on the central server side with updated Git. The discussion here has made it pretty clear that, given our current loose reference and reflog storage schemes, it is not possible to implement case-sensitive references or even case-insensitive but case-preserving references correctly on a non-case-sensitive filesystem. We would always have spooky non-local conflicts like A/a vs. a/b. I think we *could* correctly implement * case-folded reference names (e.g., all lower-case; I wonder if that would also apply to HEAD etc.?) * case-folded reference names except for the last component, which could be case-insensitive but case-preserving: refs/heads/MyCrazyBranch. I suppose that many mixed-OS projects effectively use this policy nowadays and that is why we don't hear more complaints about this Git deficiency. If we had an option to use only packed references in a repository, I think we could also implement case-insensitive but case-preserving references on a non-case-preserving filesystem. The packed-refs file would be authoritative WRT case, and the case of the reflog directories and files would be ignored. But I would be nervous about promoting this style, because it will likely cause performance problems for repos that have a lot of refs. To support arbitrary refname policies on arbitrary filesystems, we of course need a different way of storing references and reflogs. Michael -- Michael Haggerty mhagger@alum.mit.edu http://softwareswirl.blogspot.com/ ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/1] receive-pack: optionally deny case clone refs 2014-06-18 11:33 ` Michael Haggerty @ 2014-06-18 15:03 ` Ronnie Sahlberg 0 siblings, 0 replies; 18+ messages in thread From: Ronnie Sahlberg @ 2014-06-18 15:03 UTC (permalink / raw) To: Michael Haggerty; +Cc: Junio C Hamano, David Turner, git, Jonathan Nieder On Wed, Jun 18, 2014 at 4:33 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote: > On 06/13/2014 11:25 PM, Junio C Hamano wrote: >> Ronnie Sahlberg <sahlberg@google.com> writes: >> >>> It gets even more hairy : >>> If the server has A/a and a/b and you clone it it becomes A/a and A/b >>> locally. Then you push back to the server and you end up with three >>> refs on the server: A/a A/b and a/b. >> >> That is part of the transition in deployment. David who wants to >> forbid A/a and a/b mixed in his project will surely correct the >> situation at the server end so "somebody fetches A/a and a/b and >> ends up with A/a and A/b" will not happen. They will fetch A/a and >> A/b. >> >> If a user is with an older Git and he has his own a/c, fetching A/a >> and A/b from a corrected central repository will still give the user >> a/a and a/b, but then pushing it back will be forbidden. The user's >> repository needs to be fixed and installation of Git needs to be >> updated to the version with an equivalent of David's "optionally >> deny" feature implemented for the fetching side, so that the user >> notices the local a/c is bad and not allowed within the context of >> his project, deletes it and recreates it as A/c before he can fetch >> A/a and A/b from the central repository. >> >> I agree that the transition may be painful, but as long as the >> desired semantics is "If you have A/a, you are not allowed to have >> a/a or a/b", it cannot be avoided---in that sense, I view it as a >> lower priority issue. >> >> Having said that, it may indicate that the desired semantics above >> may not be the optimal one. Perhaps the flag might want to be "on >> this platform, we cannot do case sensitive refs, so pretend as if >> all refs are lowercase" instead. I suspect that such a flag may >> allow smoother transition than what has been proposed. >> >> Underlying refs "A/a" and "a/b" can stay as they are in David's >> central repository, but ref enumeration with the flag enabled will >> return a/a and a/b, and these are the names that will be fetched by >> the users. If the user had an existing A/c, then fetching these >> will still create A/a and A/b locally, but pushing them back will, >> under that flag enabled, be turned into updates to a/a, a/b, and a/c >> on the central server side with updated Git. > > The discussion here has made it pretty clear that, given our current > loose reference and reflog storage schemes, it is not possible to > implement case-sensitive references or even case-insensitive but > case-preserving references correctly on a non-case-sensitive filesystem. > We would always have spooky non-local conflicts like A/a vs. a/b. > > I think we *could* correctly implement > > * case-folded reference names (e.g., all lower-case; I wonder if > that would also apply to HEAD etc.?) > > * case-folded reference names except for the last component, which > could be case-insensitive but case-preserving: > refs/heads/MyCrazyBranch. I suppose that many mixed-OS projects > effectively use this policy nowadays and that is why we don't hear > more complaints about this Git deficiency. > > If we had an option to use only packed references in a repository, I > think we could also implement case-insensitive but case-preserving > references on a non-case-preserving filesystem. The packed-refs file > would be authoritative WRT case, and the case of the reflog directories > and files would be ignored. But I would be nervous about promoting this > style, because it will likely cause performance problems for repos that > have a lot of refs. > > To support arbitrary refname policies on arbitrary filesystems, we of > course need a different way of storing references and reflogs. > Agree. I think the transaction work can help here for both the cases of refs (which might be solved by a pacxked-refs only setting) and reflogs. My next two (small) series for transactions : https://github.com/rsahlberg/git/tree/ref-transactions-reflog https://github.com/rsahlberg/git/tree/ref-transactions-rename reworks the handling of reflogs and all reflog API callers. With these two series we get to a stage where the reflog API and implementation both shrinks and becomes more well defined. We essentially end up with only these functions for touching reflogs: check if a reflog exists: extern int reflog_exists(const char *refname); create/initialize an empty reflog: extern int create_reflog(const char *refname); delete a reflog: extern int delete_reflog(const char *refname); these two functions iterate/read over all entries for a particular reflog: int for_each_reflog_ent(const char *refname, each_reflog_ent_fn fn, void *cb_data); int for_each_reflog_ent_reverse(const char *refname, each_reflog_ent_fn fn, void *cb_data); iterate over all reflogs: extern int for_each_reflog(each_ref_fn, void *); write/append an entry to a reflog: int transaction_update_reflog(struct ref_transaction *transaction, When we get to that point it will become much easier if we want to change how reflogs are stored. If for example we want to default to something else than " a shadow directory structure under logs/". So I would like to see those two series as both refactoring the existing code as well as defining the API for reflogs and in its extension make it easier to do changes. For my plan with pluggable refs backends I plan to put the reflogs inside a TDB database, but if someone wants to, this would also provide the framework for someone to add a "core.store_reflog_in_xyz = True" which do something to address the problems with case insensitive filesystems. Following those series, I have another small series that has not been out for review yet: https://github.com/rsahlberg/git/tree/ref-transactions-req-packed-refs This particular patch series starts using packed refs as part of a multi ref transaction. The main purpose of this is to make transaction_commit() become more atomic, it particular for better rollbackability but also in order to make the transaction commit become more atomic for an external observer as well. In doing so, for any transactions that affects 2 or more refs it will start reading and writing to the packed refs file instead of loose refs. For example a transaction that will rename a ref, i.e. deleting the old ref and creating a new ref with the same sha1, it will do 1, copy the old ref to the packed refs file and repack the packed refs. 2, delete the old loose ref 3, delete the old ref from the packed refs file and add the new ref and then commit this change to the packed refs file. Or if you are deleting multiple refs : 1, copy all refs to the packed refs file and commit the packed refs 2, delete the loose refs 3, remove the refs from the packed refs file and commit it. etc. The patch series as it is today only do this for multi ref transactions. That is for performance reasons since repacking a huge packed refs file would be heaps slower than just creating/updating a single loose ref for the common case of "git commit, and update a single ref". We could see this series as also lay the groundwork for an option : "core.packed_refs_only = True" which would change transaction_commit() to only operate on packed refs also for the single ref case, and then you would basically have a "refs only exist as packed refs mode" in which case you would have a solution to the case insensitive filesystem as well. So we could see this as, once the series is ready for review and/or use, this would provide almost a solution for a "packed refs only" mode. What I am trying to say is that I think we will get multiple ways to solve the case insensitive filesystem problem as a sideeffect of finishing the ref transaction work. Perhaps not full solutions out of the box but at least we should have ended up with a framework to make it a lot easier to solve these problems than they are today. regards ronnie sahlberg > Michael > > -- > Michael Haggerty > mhagger@alum.mit.edu > http://softwareswirl.blogspot.com/ ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/1] receive-pack: optionally deny case clone refs 2014-06-11 22:30 [PATCH v4 0/1] receive-pack: optionally deny case clone refs David Turner 2014-06-11 22:30 ` [PATCH v4 1/1] " David Turner 2014-06-12 19:47 ` [PATCH v4 0/1] " Junio C Hamano @ 2014-08-13 16:20 ` Ronnie Sahlberg 2014-08-13 19:28 ` David Turner 2 siblings, 1 reply; 18+ messages in thread From: Ronnie Sahlberg @ 2014-08-13 16:20 UTC (permalink / raw) To: David Turner; +Cc: git David, One possible solution can be to use the external database daemon I am working of for ref transactions. Since this makes all refs be stored in a dedicated database instead of the filesystem you no longer are dependent on file system semantics. While not in the official git trees yet I would appreciate any testing and comments you may have it you want to test it. https://github.com/rsahlberg/git/tree/backend-struct-db-2 Not for production use but seeing if it does build and works on your platforms would be great. refsd-tdb: example refs database daemon refsd-tdb.c is a simple reference implementation of a refs daemon. This will eventually be hosted in a separate project but can live in this branch for now. Compile with : gcc refsd-tdb.c -o refsd-tdb -l tdb Run with: ./refsd-tdb /tmp/refsd.socket /tmp /tmp/refsd.log Once the refs daemon is running you can start using it with newly created git repositories by specifying the --db-socket and --db-repo-name arguments to git clone/init-db : git clone --db-repo-name=ROCKy --db-socket=/tmp/refsd.socket <some repo> . This refs daemon is an example. It should be relatively straightforward to modify it to attach to any other kind of data store. regards ronnie sahlberg On Wed, Jun 11, 2014 at 3:30 PM, David Turner <dturner@twopensource.com> wrote: > This issue bit us again recently. > > In talking with some colleagues, I realized that the previous version > of this patch, in addition to being potentially slow, was incomplete. > Specifically, it didn't handle the case of refs/heads/case/one vs > refs/heads/CASE/two; these are case clones even though they strcasecmp > different. > > This new version contains code to prevent this, as well as tests for > this case. > > Also it uses a hashmap to make lookups constant-time. > > -- > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/1] receive-pack: optionally deny case clone refs 2014-08-13 16:20 ` Ronnie Sahlberg @ 2014-08-13 19:28 ` David Turner 0 siblings, 0 replies; 18+ messages in thread From: David Turner @ 2014-08-13 19:28 UTC (permalink / raw) To: Ronnie Sahlberg; +Cc: git On Wed, 2014-08-13 at 09:20 -0700, Ronnie Sahlberg wrote: > David, > > One possible solution can be to use the external database daemon I am > working of for ref transactions. > Since this makes all refs be stored in a dedicated database instead of > the filesystem you no longer are dependent on file system semantics. > > While not in the official git trees yet I would appreciate any testing > and comments you may have it you want to test it. > https://github.com/rsahlberg/git/tree/backend-struct-db-2 > > Not for production use but seeing if it does build and works on your > platforms would be great. Thanks. We ultimately went with a custom hook (which also enforces other business rules), but I'm very excited about the new backend and hope it will improve performance generally. One thing you could do to increase adoption would be to make a Homebrew formula for tdb. Homebrew is the defacto standard Mac packaging system. Sadly, a large number of engineers use Macs, so this is a large barrier -- especially since the tdb build is somewhat baroque. It would also be great if it didn't require a fresh clone; for a large organization with large repos, requiring every engineer to re-clone is somewhat onerous. I assume it's straightforward to do a conversion from an existing set of refs but that it just hasn't been a priority for you yet. Thanks so much for working on this! ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2014-08-13 19:28 UTC | newest] Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-06-11 22:30 [PATCH v4 0/1] receive-pack: optionally deny case clone refs David Turner 2014-06-11 22:30 ` [PATCH v4 1/1] " David Turner 2014-06-13 4:03 ` Torsten Bögershausen 2014-06-12 19:47 ` [PATCH v4 0/1] " Junio C Hamano 2014-06-12 23:30 ` David Turner 2014-06-13 4:03 ` Torsten Bögershausen 2014-06-13 17:12 ` Junio C Hamano 2014-06-13 17:08 ` Junio C Hamano 2014-06-13 18:20 ` Ronnie Sahlberg 2014-06-13 19:05 ` Ronnie Sahlberg 2014-06-13 21:11 ` Junio C Hamano 2014-06-13 22:24 ` Ronnie Sahlberg 2014-06-15 7:10 ` David Turner 2014-06-13 21:25 ` Junio C Hamano 2014-06-18 11:33 ` Michael Haggerty 2014-06-18 15:03 ` Ronnie Sahlberg 2014-08-13 16:20 ` Ronnie Sahlberg 2014-08-13 19:28 ` David Turner
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.