All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Junio C Hamano via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Han-Wen Nienhuys <hanwenn@gmail.com>, Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v2 4/4] Cleanse reflog messages in the refs.c layer
Date: Fri, 10 Jul 2020 17:19:53 +0000	[thread overview]
Message-ID: <6ca5b99c8d7af1a3f35b3a7d25db284c879a2f10.1594401593.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.669.v2.git.1594401593.gitgitgadget@gmail.com>

From: Junio C Hamano <gitster@pobox.com>

Regarding reflog messages:

 - We expect that a reflog message consists of a single line.  The
   file format used by the files backend may add a LF after the
   message as a delimiter, and output by commands like "git log -g"
   may complete such an incomplete line by adding a LF at the end,
   but philosophically, the terminating LF is not a part of the
   message.

 - We however allow callers of refs API to supply a random sequence
   of NUL terminated bytes.  We cleanse caller-supplied message by
   squashing a run of whitespaces into a SP, and by trimming trailing
   whitespace, before storing the message.  This is how we tolerate,
   instead of erring out, a message with LF in it (be it at the end,
   in the middle, or both).

Currently, the cleansing of the reflog message is done by the files
backend, before the log is written out.  This is sufficient with the
current code, as that is the only backend that writes reflogs.  But
new backends can be added that write reflogs, and we'd want the
resulting log message we would read out of "log -g" the same no
matter what backend is used.

An added benefit is that the "cleansing" function could be updated
later, independent from individual backends, to e.g. allow
multi-line log messages if we wanted to, and when that happens, it
would help a lot to ensure we covered all bases if the cleansing
function (which would be updated) is called from the generic layer.

Side note: I am not interested in supporting multi-line
reflog messages right at the moment (nobody is asking for
it), but I envision that instead of the "squash a run of
whitespaces into a SP and rtrim" cleansing, we can
%urlencode problematic bytes in the message *AND* append a
SP at the end, when a new version of Git that supports
multi-line and/or verbatim reflog messages writes a reflog
record.  The reading side can detect the presense of SP at
the end (which should have been rtrimmed out if it were
written by existing versions of Git) as a signal that
decoding %urlencode recovers the original reflog message.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 refs.c               | 50 ++++++++++++++++++++++++++++++++++++--------
 refs/files-backend.c |  2 +-
 refs/refs-internal.h |  6 ------
 3 files changed, 42 insertions(+), 16 deletions(-)

diff --git a/refs.c b/refs.c
index 639cba93b4..89814c7be4 100644
--- a/refs.c
+++ b/refs.c
@@ -902,7 +902,7 @@ int delete_ref(const char *msg, const char *refname,
 			       old_oid, flags);
 }
 
-void copy_reflog_msg(struct strbuf *sb, const char *msg)
+static void copy_reflog_msg(struct strbuf *sb, const char *msg)
 {
 	char c;
 	int wasspace = 1;
@@ -919,6 +919,15 @@ void copy_reflog_msg(struct strbuf *sb, const char *msg)
 	strbuf_rtrim(sb);
 }
 
+static char *normalize_reflog_message(const char *msg)
+{
+	struct strbuf sb = STRBUF_INIT;
+
+	if (msg && *msg)
+		copy_reflog_msg(&sb, msg);
+	return strbuf_detach(&sb, NULL);
+}
+
 int should_autocreate_reflog(const char *refname)
 {
 	switch (log_all_ref_updates) {
@@ -1124,7 +1133,7 @@ struct ref_update *ref_transaction_add_update(
 		oidcpy(&update->new_oid, new_oid);
 	if (flags & REF_HAVE_OLD)
 		oidcpy(&update->old_oid, old_oid);
-	update->msg = xstrdup_or_null(msg);
+	update->msg = normalize_reflog_message(msg);
 	return update;
 }
 
@@ -1983,9 +1992,14 @@ int refs_create_symref(struct ref_store *refs,
 		       const char *refs_heads_master,
 		       const char *logmsg)
 {
-	return refs->be->create_symref(refs, ref_target,
-				       refs_heads_master,
-				       logmsg);
+	char *msg;
+	int retval;
+
+	msg = normalize_reflog_message(logmsg);
+	retval = refs->be->create_symref(refs, ref_target, refs_heads_master,
+					 msg);
+	free(msg);
+	return retval;
 }
 
 int create_symref(const char *ref_target, const char *refs_heads_master,
@@ -2370,10 +2384,16 @@ int initial_ref_transaction_commit(struct ref_transaction *transaction,
 	return refs->be->initial_transaction_commit(refs, transaction, err);
 }
 
-int refs_delete_refs(struct ref_store *refs, const char *msg,
+int refs_delete_refs(struct ref_store *refs, const char *logmsg,
 		     struct string_list *refnames, unsigned int flags)
 {
-	return refs->be->delete_refs(refs, msg, refnames, flags);
+	char *msg;
+	int retval;
+
+	msg = normalize_reflog_message(logmsg);
+	retval = refs->be->delete_refs(refs, msg, refnames, flags);
+	free(msg);
+	return retval;
 }
 
 int delete_refs(const char *msg, struct string_list *refnames,
@@ -2385,7 +2405,13 @@ int delete_refs(const char *msg, struct string_list *refnames,
 int refs_rename_ref(struct ref_store *refs, const char *oldref,
 		    const char *newref, const char *logmsg)
 {
-	return refs->be->rename_ref(refs, oldref, newref, logmsg);
+	char *msg;
+	int retval;
+
+	msg = normalize_reflog_message(logmsg);
+	retval = refs->be->rename_ref(refs, oldref, newref, msg);
+	free(msg);
+	return retval;
 }
 
 int rename_ref(const char *oldref, const char *newref, const char *logmsg)
@@ -2396,7 +2422,13 @@ int rename_ref(const char *oldref, const char *newref, const char *logmsg)
 int refs_copy_existing_ref(struct ref_store *refs, const char *oldref,
 		    const char *newref, const char *logmsg)
 {
-	return refs->be->copy_ref(refs, oldref, newref, logmsg);
+	char *msg;
+	int retval;
+
+	msg = normalize_reflog_message(logmsg);
+	retval = refs->be->copy_ref(refs, oldref, newref, msg);
+	free(msg);
+	return retval;
 }
 
 int copy_existing_ref(const char *oldref, const char *newref, const char *logmsg)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 6516c7bc8c..e0aba23eb2 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1629,7 +1629,7 @@ static int log_ref_write_fd(int fd, const struct object_id *old_oid,
 
 	strbuf_addf(&sb, "%s %s %s", oid_to_hex(old_oid), oid_to_hex(new_oid), committer);
 	if (msg && *msg)
-		copy_reflog_msg(&sb, msg);
+		strbuf_addstr(&sb, msg);
 	strbuf_addch(&sb, '\n');
 	if (write_in_full(fd, sb.buf, sb.len) < 0)
 		ret = -1;
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 4271362d26..357359a0be 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -96,12 +96,6 @@ enum peel_status {
  */
 enum peel_status peel_object(const struct object_id *name, struct object_id *oid);
 
-/*
- * Copy the reflog message msg to sb while cleaning up the whitespaces.
- * Especially, convert LF to space, because reflog file is one line per entry.
- */
-void copy_reflog_msg(struct strbuf *sb, const char *msg);
-
 /**
  * Information needed for a single ref update. Set new_oid to the new
  * value or to null_oid to delete the ref. To check the old value
-- 
gitgitgadget

  parent reply	other threads:[~2020-07-10 17:20 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-30 12:05 [PATCH 0/4] Preliminary fixes for reftable support Han-Wen Nienhuys via GitGitGadget
2020-06-30 12:05 ` [PATCH 1/4] lib-t6000.sh: write tag using git-update-ref Han-Wen Nienhuys via GitGitGadget
2020-06-30 12:05 ` [PATCH 2/4] t3432: use git-reflog to inspect the reflog for HEAD Han-Wen Nienhuys via GitGitGadget
2020-06-30 12:05 ` [PATCH 3/4] checkout: add '\n' to reflog message Han-Wen Nienhuys via GitGitGadget
2020-06-30 12:05 ` [PATCH 4/4] Treat BISECT_HEAD as a pseudo ref Han-Wen Nienhuys via GitGitGadget
2020-07-07  4:40 ` [PATCH 0/4] Preliminary fixes for reftable support Junio C Hamano
2020-07-10 17:19 ` [PATCH v2 " Han-Wen Nienhuys via GitGitGadget
2020-07-10 17:19   ` [PATCH v2 1/4] lib-t6000.sh: write tag using git-update-ref Han-Wen Nienhuys via GitGitGadget
2020-07-10 17:19   ` [PATCH v2 2/4] t3432: use git-reflog to inspect the reflog for HEAD Han-Wen Nienhuys via GitGitGadget
2020-07-10 17:19   ` [PATCH v2 3/4] Treat BISECT_HEAD as a pseudo ref Han-Wen Nienhuys via GitGitGadget
2020-07-10 17:19   ` Junio C Hamano via GitGitGadget [this message]
2020-07-10 21:06     ` [PATCH v2 4/4] Cleanse reflog messages in the refs.c layer Junio C Hamano
2020-07-10 20:57   ` [PATCH v2 0/4] Preliminary fixes for reftable support Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6ca5b99c8d7af1a3f35b3a7d25db284c879a2f10.1594401593.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=hanwenn@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.