From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS31976 209.132.180.0/23 X-Spam-Status: No, score=-3.8 required=3.0 tests=AWL,BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_HI, T_RP_MATCHES_RCVD shortcircuit=no autolearn=ham autolearn_force=no version=3.4.0 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by dcvr.yhbt.net (Postfix) with ESMTP id A77191F404 for ; Tue, 23 Jan 2018 20:03:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752524AbeAWUDe (ORCPT ); Tue, 23 Jan 2018 15:03:34 -0500 Received: from pb-smtp1.pobox.com ([64.147.108.70]:54580 "EHLO pb-smtp1.pobox.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752519AbeAWUDc (ORCPT ); Tue, 23 Jan 2018 15:03:32 -0500 Received: from pb-smtp1.pobox.com (unknown [127.0.0.1]) by pb-smtp1.pobox.com (Postfix) with ESMTP id CEB20C7B14; Tue, 23 Jan 2018 15:03:31 -0500 (EST) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=from:to:cc :subject:references:date:in-reply-to:message-id:mime-version :content-type; s=sasl; bh=ZjoEjiKPpR9MaJyGl9u8n0Yudzg=; b=PKqyTR ELmFtCFU1i3uUPZ5NEwozq34o/lJRQYgoEZvbddp708CgD+KtKzY2EnQOzwcc4N7 Vw64GRGv1ddoPxvngGrCHtr0iofZyytwpZUtNpbqlYv0RN7mdxBFKK3QeULwveTs eTapRt3gV/py9jlKQ5E6+PJNK0AxxhJ3rzRdg= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=from:to:cc :subject:references:date:in-reply-to:message-id:mime-version :content-type; q=dns; s=sasl; b=FHgPomZcuGCu90Xmpuwd9XeymEeJVDYE flQlMynES3EDJTXvDjpqTfDR1IRN0Snt2G+yrlkTl7POwbYVbQfd0J8D0rqB08kN nEg1GrM3K/bTft0nQlMJWlw+KKbaRKiPf3eM9JLikazewAjA0NdeUviUDmJg2cMg ivTieL27oeg= Received: from pb-smtp1.nyi.icgroup.com (unknown [127.0.0.1]) by pb-smtp1.pobox.com (Postfix) with ESMTP id C69DBC7B13; Tue, 23 Jan 2018 15:03:31 -0500 (EST) Received: from pobox.com (unknown [104.132.0.95]) (using TLSv1.2 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by pb-smtp1.pobox.com (Postfix) with ESMTPSA id 44A7BC7B11; Tue, 23 Jan 2018 15:03:31 -0500 (EST) From: Junio C Hamano To: Johannes Schindelin Cc: git@vger.kernel.org, Jacob Keller Subject: Re: [PATCH 4/8] rebase-helper --make-script: introduce a flag to recreate merges References: Date: Tue, 23 Jan 2018 12:03:30 -0800 In-Reply-To: (Johannes Schindelin's message of "Thu, 18 Jan 2018 16:35:40 +0100 (STD)") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Pobox-Relay-ID: 7C19E50E-0078-11E8-A9EB-8EF31968708C-77302942!pb-smtp1.pobox.com Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Johannes Schindelin writes: > The sequencer just learned a new commands intended to recreate branch s/a //; > structure (similar in spirit to --preserve-merges, but with a > substantially less-broken design). > ... > @@ -2785,6 +2787,335 @@ void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag) > strbuf_release(&sob); > } > > +struct labels_entry { > + struct hashmap_entry entry; > + char label[FLEX_ARRAY]; > +}; > + > +static int labels_cmp(const void *fndata, const struct labels_entry *a, > + const struct labels_entry *b, const void *key) > +{ > + return key ? strcmp(a->label, key) : strcmp(a->label, b->label); > +} label_oid() accesses state->labels hash using strihash() as the hash function, but the final comparison between the entries in the same hash buckets are done with case sensitivity. It is unclear to me if that is what was intended, and why. > +struct string_entry { > + struct oidmap_entry entry; > + char string[FLEX_ARRAY]; > +}; > + > +struct label_state { > + struct oidmap commit2label; > + struct hashmap labels; > + struct strbuf buf; > +}; > + > +static const char *label_oid(struct object_id *oid, const char *label, > + struct label_state *state) > +{ > + struct labels_entry *labels_entry; > + struct string_entry *string_entry; > + struct object_id dummy; > + size_t len; > + int i; > + > + string_entry = oidmap_get(&state->commit2label, oid); > + if (string_entry) > + return string_entry->string; > + > + /* > + * For "uninteresting" commits, i.e. commits that are not to be > + * rebased, and which can therefore not be labeled, we use a unique > + * abbreviation of the commit name. This is slightly more complicated > + * than calling find_unique_abbrev() because we also need to make > + * sure that the abbreviation does not conflict with any other > + * label. > + * > + * We disallow "interesting" commits to be labeled by a string that > + * is a valid full-length hash, to ensure that we always can find an > + * abbreviation for any uninteresting commit's names that does not > + * clash with any other label. > + */ > + if (!label) { > + char *p; > + > + strbuf_reset(&state->buf); > + strbuf_grow(&state->buf, GIT_SHA1_HEXSZ); > + label = p = state->buf.buf; > + > + find_unique_abbrev_r(p, oid->hash, default_abbrev); > + > + /* > + * We may need to extend the abbreviated hash so that there is > + * no conflicting label. > + */ > + if (hashmap_get_from_hash(&state->labels, strihash(p), p)) { > + size_t i = strlen(p) + 1; > + > + oid_to_hex_r(p, oid); > + for (; i < GIT_SHA1_HEXSZ; i++) { > + char save = p[i]; > + p[i] = '\0'; > + if (!hashmap_get_from_hash(&state->labels, > + strihash(p), p)) > + break; > + p[i] = save; > + } > + } If oid->hash required full 40-hex to disambiguate, then find-unique-abbrev would give 40-hex and we'd want the same "-" suffix technique employed below to make it consistently unique. I wonder if organizing the function this way ... if (!label) label = oid-to-hex(oid); if (label already exists or full oid) { make it unambiguous; } ... allows the resulting code easier to understand and manage. A related tangent. Does an auto-label given to "uninteresting" commit need to be visible to end users? I doubted it and that is why I said oid-to-hex in the above, but if it is given to end users, use of find-unique-abbrev-r is perfectly fine. > +static int make_script_with_merges(struct pretty_print_context *pp, > + struct rev_info *revs, FILE *out, > + unsigned flags) > +{ > + ... > + int abbr = flags & TODO_LIST_ABBREVIATE_CMDS; > + const char *p = abbr ? "p" : "pick", *l = abbr ? "l" : "label", > + *t = abbr ? "t" : "reset", *b = abbr ? "b" : "bud", > + *m = abbr ? "m" : "merge"; It would be easier to understand if these short named variables are reserved only for temporary use, not as constants. It is not too much to spell fprintf(out, "%s onto\n", cmd_label); than fprintf(out, "%s onto\n", l); and would save readers from head-scratching, wondering where the last assignment to variable "l" is. > + > + oidmap_init(&commit2todo, 0); > + oidmap_init(&state.commit2label, 0); > + hashmap_init(&state.labels, (hashmap_cmp_fn) labels_cmp, NULL, 0); > + strbuf_init(&state.buf, 32); > + > + if (revs->cmdline.nr && (revs->cmdline.rev[0].flags & BOTTOM)) { > + struct object_id *oid = &revs->cmdline.rev[0].item->oid; > + FLEX_ALLOC_STR(entry, string, "onto"); > + oidcpy(&entry->entry.oid, oid); > + oidmap_put(&state.commit2label, entry); > + } > + > + /* > + * First phase: > + * - get onelines for all commits > + * - gather all branch tips (i.e. 2nd or later parents of merges) > + * - label all branch tips > + */ When an early part of a branch is merged and then the remaining part of the same branch is merged again, "branch tip" and "2nd or later parents of merges" would become different concepts. The 2nd parent of an early merge is not among the branch tips. For the purpose of the "recreate the topology" algorithm, I am imagining that you would need not just the tips but all the 2nd and subsequent parents of merges, and my quick skimming tells me that the following code grabs them correctly.